🎯 Patchioner Trace Captioning Demo
This demo showcases the Patch-ioner model, which generates captions for image regions marked with user-drawn traces or bounding boxes. Patch-ioner is a unified zero-shot captioning framework for describing arbitrary image regions. More details about the framework can be found on the official project webpage.
Instructions:
- Choose between Trace or BBox mode
- Upload an image or use one of the provided examples
- Use the appropriate tool to mark areas of interest in the image
- Click "Generate Caption" to get AI-generated descriptions
Tip: Use the Layer tool to generate multiple captions for different traces.
Model Status: ✅ Default model loaded: https://huggingface.co/Ruggero1912/Patch-ioner_talk2dino_decap_COCO_Captions on cuda
📷 Select from example images or upload your own:
🖼️ Image Editor
🖼️ Annotated Image
Generated caption will appear here...
💡 Tips:
- Mode Selection: Switch between trace and bounding box modes based on your needs
- Trace Mode: Draw continuous lines over areas you want to describe
- BBox Mode: Draw rectangular bounding boxes around objects of interest
- Multiple Areas: Switch to a new layer for each object you want to describe; every trace or box drawn on its own layer gets an individual caption
🔧 Technical Details:
- Trace Mode: Converts drawings to normalized (x, y) coordinates
- BBox Mode: Uses bounding box coordinates for region-specific captioning
- Processing: Each trace/bbox is processed separately to generate corresponding captions. Aggregated region representations also attend to the global image context.
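The coordinate handling described above can be sketched as follows. This is a minimal illustration, not the demo's actual code: the function names `normalize_trace` and `patches_in_bbox`, the (x_min, y_min, x_max, y_max) box layout, and the square patch grid are all assumptions introduced here to show how pixel coordinates might be normalized and mapped to image patches.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
BBox = Tuple[float, float, float, float]  # assumed layout: (x_min, y_min, x_max, y_max)


def normalize_trace(points_px: List[Point], width: int, height: int) -> List[Point]:
    """Map pixel (x, y) points to [0, 1] coordinates, clamped to the image bounds."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")

    def clamp(value: float) -> float:
        return min(max(value, 0.0), 1.0)

    return [(clamp(x / width), clamp(y / height)) for x, y in points_px]


def patches_in_bbox(bbox: BBox, grid_size: int) -> List[int]:
    """Return row-major indices of the patches overlapped by a normalized box.

    Assumes the image is split into a grid_size x grid_size patch grid
    (a hypothetical layout chosen for this sketch).
    """
    x0, y0, x1, y1 = bbox
    col0 = min(int(x0 * grid_size), grid_size - 1)
    row0 = min(int(y0 * grid_size), grid_size - 1)
    # The far edge is exclusive, so a box ending exactly on a patch
    # boundary does not pull in the next patch.
    col1 = max(col0, min(math.ceil(x1 * grid_size) - 1, grid_size - 1))
    row1 = max(row0, min(math.ceil(y1 * grid_size) - 1, grid_size - 1))
    return [
        row * grid_size + col
        for row in range(row0, row1 + 1)
        for col in range(col0, col1 + 1)
    ]


if __name__ == "__main__":
    trace = normalize_trace([(0, 0), (320, 240), (640, 480)], width=640, height=480)
    print(trace)
    print(patches_in_bbox((0.0, 0.0, 0.5, 0.5), grid_size=4))
```

Each trace or box would be converted independently in this way, which matches the per-region processing described above; how the selected patch features are then aggregated and captioned is left to the model.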
Use the Patch-ioner framework in your projects:
- Run pip install git+https://github.com/Ruggero1912/Patch-ioner to install the Patch-ioner package
- Check the official project webpage and the GitHub repository for more details