Repository navigation

#

multimodal-pretraining

Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Pre-training Dataset and Benchmarks

Python
296
1 年前

mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video (ICML 2023)

Python
226
2 年前