text-to-image-generation

NVlabs / Sana

SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer

diffusion dit PyTorch sana text-to-image-generation transformers

Python

4537

298

1 day ago

Lightricks / ComfyUI-LTXVideo

LTX-Video Support for ComfyUI

comfyui diffusion-models dit image-to-video image-to-video-generation text-to-image text-to-image-generation

Python

2371

222

3 months ago

adobe-research / custom-diffusion

Custom Diffusion: Multi-Concept Customization of Text-to-Image Diffusion (CVPR 2023)

customization fine-tuning text-to-image-generation computer-vision diffusion-models few-shot PyTorch

Python

1970

140

2 years ago

FoundationVision / Infinity

[CVPR 2025 Oral] Infinity ∞: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis

auto-regressive-model autoregressive-models generative-model gpt gpt-2 image-generation text-to-image text-to-image-generation transformers

Python

1451

3 months ago

muzishen / IMAGDressing

[AAAI 2025]👔IMAGDressing👔: Interactive Modular Apparel Generation for Virtual Dressing. It enables customizable human image generation with flexible garment, pose, and scene control, ensuring high fidelity and garment consistency for virtual dressing.

datasets diffusion-models text-to-image-generation

Python

1288

114

4 days ago

songweige / rich-text-to-image

Rich-Text-to-Image Generation

computer-vision diffusion-models PyTorch rich-text text-to-image-generation

Python

801

2 years ago

AIDC-AI / Awesome-Unified-Multimodal-Models

Awesome Unified Multimodal Models

multimodal-large-language-models text-to-image-generation multimodal-models vision-language-model

764

2 months ago

PKU-YuanGroup / UniWorld-V1

UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation

diffusion high-level-feature image-editing image-understanding low-level-vision text-to-image-generation unify unify-ai vlm

Python

712

2 months ago

FoundationVision / Liquid

Liquid: Language Models are Scalable and Unified Multi-modal Generators

generative generative-ai llm text-to-image text-to-image-generation autoregressive-models large-language-models multimodal-large-language-models

Python

616

6 months ago

donahowe / AutoStudio

AutoStudio: Crafting Consistent Subjects in Multi-turn Interactive Image Generation

image-generation text-to-image-generation

Jupyter Notebook

448

6 months ago

Paranioar / Awesome_Matching_Pretraining_Transfering

The Paper List of Large Multi-Modality Model (Perception, Generation, Unification), Parameter-Efficient Finetuning, Vision-Language Pretraining, Conventional Image-Text Matching for Preliminary Insight.

cross-modal-retrieval tutorial Awesome Lists image-text-matching image-text-retrieval large-language-models large-vision-language-models multimodal-pretraining parameter-efficient-fine-tuning vision-and-language multimodal-large-language-models llm text-to-image-generation text-to-image-synthesis text-to-video-generation

430

9 days ago

ByteVisionLab / TokenFlow

[CVPR 2025] 🔥 Official impl. of "TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation".

large-language-models text-to-image-generation

Python

386

2 months ago

OSU-NLP-Group / MagicBrush

[NeurIPS'23] "MagicBrush: A Manually Annotated Dataset for Instruction-Guided Image Editing".

diffusion-models image-editing image-generation image-synthesis instruction-following text-to-image text-to-image-generation text-to-image-synthesis

Python

377

7 months ago

woctezuma / stable-diffusion-colab

Colab notebook for Stable Diffusion Hyper-SDXL.

colab colab-notebook colaboratory stable-diffusion huggingface-diffusers diffusion diffusion-models text-to-image text-to-image-generation text-to-image-synthesis diffusers google-colab google-colab-notebook image-generation text2image deep-learning

Jupyter Notebook

328

6 months ago

RockeyCoss / SPO

[CVPR 2025] Aesthetic Post-Training Diffusion Models from Generic Preferences with Step-by-step Preference Optimization

diffusion-models dpo sdxl text-to-image text-to-image-generation

Python

252

6 months ago

markfulton / NanoBananaEditor

The most advanced Nano Banana image generator and editor application. Your central hub for AI image generation and revisions. Intuitive UI features reference images, editing with image masks, version history, and more. Powered by Gemini 2.5 Flash images API.

bolt imageeditor text-to-image text-to-image-generation vibecoding

TypeScript

229

18 days ago