captioning-videos
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
awesome grounding: A curated list of research papers in visual grounding
[CVPR20] Video Object Grounding using Semantic Roles in Language Description (https://arxiv.org/abs/2003.10606)
[CVPR 2023] Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation
[CVPR21] Visual Semantic Role Labeling for Video Understanding (https://arxiv.org/abs/2104.00990)
PyTorch Implementation of Consensus-based Sequence Training for Video Captioning
PyTorch code for: Learning to Generate Grounded Visual Captions without Localization Supervision
A tool for downloading from public image boards (which allow scraping), previewing your images and tags, and editing them. Additional tabs for downloading other code repositories as well as state-of-the-art diffusion and auto-tag/caption models. Custom datasets can be added!
Transcription and annotation interface for recorded audio or video files
Video to Language Challenge (MSR-VTT Challenge 2016)
An image and video description generator using a CNN-RNN-based architecture.
M-VAD Names Dataset. Multimedia Tools and Applications (2019)
Caption generator for live camera feed
Sample app to add captions to an uploaded video. From api.video (https://api.video)
Online professional courses that are captioned and/or subtitled
Video Search using Natural Language
Generate TikTok- and Instagram-tailored captions and hashtags for your videos using the power of some super creative robots up in the clouds ☁️ 🤖 💬 ☁️
Official PyTorch Implementation of 'LAVCap: LLM-based Audio-Visual Captioning using Optimal Transport' (ICASSP 2025)
A multilingual automatic speech recognition and video captioning tool using faster-whisper. Supports real-time translation to English. Runs on a consumer-grade CPU.