multimodal-generation
Official implementation of paper "MiniGPT-5: Interleaved Vision-and-Language Generation via Generative Vokens"
Python
864
4 months ago
🔥🔥🔥 A curated list of papers on LLMs-based multimodal generation (image, video, 3D and audio).
HTML
459
16 days ago
Official implementation of the ICASSP-2022 paper "Text2Poster: Laying Out Stylized Texts on Retrieved Images"
Python
211
1 year ago
[ICLR 2024] Contextualized Diffusion Models for Text-Guided Image and Video Generation
Python
67
1 year ago
HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation
Python
55
2 months ago
A curated list of vision-to-music generation: methods, datasets, evaluation and challenges.
53
6 days ago
[CVPR '23] Unite and Conquer: Plug & Play Multi-Modal Synthesis using Diffusion Models
Python
36
1 year ago
A Survey of Multimodal Retrieval-Augmented Generation
13
3 days ago