multimodal-generation

Official implementation of the paper "MiniGPT-5: Interleaved Vision-and-Language Generation via Generative Vokens"

Python
861
3 months ago

Official implementation of the ICASSP-2022 paper "Text2Poster: Laying Out Stylized Texts on Retrieved Images"

Python
212
2 years ago

[ISMIR 2025] A curated list of vision-to-music generation: methods, datasets, evaluation and challenges.

89
11 days ago

[ICLR 2024] Contextualized Diffusion Models for Text-Guided Image and Video Generation

Python
69
1 year ago

HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation

Python
63
6 months ago