vqgan
[NeurIPS 2022] Towards Robust Blind Face Restoration with Codebook Lookup Transformer
PyTorch implementation of Make-A-Scene: Scene-Based Text-to-Image Generation with Human Priors
[ICCV 2025] SimVQ: Addressing Representation Collapse in Vector Quantized Models with One Linear Layer
[CVPR 2023] RIDCP: Revitalizing Real Image Dehazing via High-Quality Codebook Priors
PyTorch code for "Real-World Blind Super-Resolution via Feature Matching with Implicit High-Resolution Priors", ACM MM 2022 (Oral)
Official implementation of Multi-Stage Multi-Codebook (MSMC) TTS
Feed-forward VQGAN-CLIP model that aims to eliminate the need to optimize the VQGAN latent space for each input prompt
Implements VQGAN+CLIP for image and video generation and style transfer, based on text and image prompts. Emphasis on ease of use, documentation, and smooth video creation.
Implementation of the paper "MaskBit: Embedding-free Image Generation from Bit Tokens"
OCR-VQGAN, a discrete image encoder (tokenizer and detokenizer) for figure images in the Paper2Fig100k dataset. Implements an OCR perceptual loss for clear text-within-image generation. Forked from VQGAN in CompVis/taming-transformers.
Towards training VQ-VAE models robustly!
Zero-Shot Text-to-Image Generation VQGAN+CLIP Dockerized
Implementation of Binary Latent Diffusion
[ICLR 2024] DAEFR: Dual Associated Encoder for Face Restoration
NTIRE 2022 - Image Inpainting Challenge
VQ-VAE/GAN implementation in pytorch-lightning
Fast and controllable text-to-image model.