#video-language-pretraining
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
Python
2989
1 year ago
A new multi-shot video understanding benchmark Shot2Story with comprehensive video summaries and detailed shot-level captions.
Python
129
3 months ago
Multi-granularity Correspondence Learning from Long-term Noisy Videos [ICLR 2024, Oral]
Python
113
1 year ago
Official Repository of VideoLLaMB: Long Video Understanding with Recurrent Memory Bridges
Python
66
2 months ago
A Survey on video and language understanding.
48
2 years ago
ACM Multimedia 2023 (Oral) - RTQ: Rethinking Video-language Understanding Based on Image-text Model
Python
16
1 year ago
The official GitHub page for the survey paper "Self-Supervised learning for Videos: A survey"
6
2 years ago