#multimodal-generation

Official implementation of the paper "MiniGPT-5: Interleaved Vision-and-Language Generation via Generative Vokens"

Python
864
4 months ago

Official implementation of the ICASSP-2022 paper "Text2Poster: Laying Out Stylized Texts on Retrieved Images"

Python
211
1 year ago

[ICLR 2024] Contextualized Diffusion Models for Text-Guided Image and Video Generation

Python
67
1 year ago

HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation

Python
55
2 months ago

A curated list of vision-to-music generation: methods, datasets, evaluation and challenges.

53
6 days ago