controllable-image-captioning
Caption-Anything is a versatile tool that combines image segmentation, visual captioning, and ChatGPT to generate tailored captions with diverse controls for user preferences. Hugging Face Spaces: https://huggingface.co/spaces/TencentARC/Caption-Anything and https://huggingface.co/spaces/VIPLab/Caption-Anything
A length-controllable and non-autoregressive image captioning model.
PyTorch implementation of a Controllable Image Captioning model with a language-driven mechanism for advancing the region pointer state that keeps it in sync with the state of the language model. Code for the paper Language-Driven Region Pointer Advancement for Controllable Image Captioning (Lindh et al., 2020).
Pipeline model for controllable image captioning with user preference settings. Code and model output for the paper Show, Prefer and Tell: Incorporating User Preferences into Image Captioning (Lindh et al., 2023).