#multimodal-pretraining
Emu Series: Generative Multimodal Models from BAAI
Python
1711
7 months ago
The Paper List of Large Multi-Modality Model (Perception, Generation, Unification), Parameter-Efficient Finetuning, Vision-Language Pretraining, Conventional Image-Text Matching for Preliminary Insight.
cross-modal-retrieval, Tutorial, Awesome Lists, image-text-matching, image-text-retrieval, large-language-models, large-vision-language-models, multimodal-pretraining, parameter-efficient-fine-tuning, vision-and-language, multimodal-large-language-models, large-language-model, text-to-image-generation, text-to-image-synthesis, text-to-video-generation
425
4 months ago
Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Pre-training Dataset and Benchmarks
Python
296
1 year ago
mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video (ICML 2023)
Python
226
2 years ago