vqvae
A Collection of Variational Autoencoders (VAE) in PyTorch.
Source code for "Taming Visually Guided Sound Generation" (Oral at the BMVC 2021)
[NeurIPS 2024] OmniTokenizer: one model and one weight for image-video joint tokenization.
A toolkit for non-parallel voice conversion based on vector-quantized variational autoencoder
Torchélie is a set of utility functions, layers, losses, models, trainers and other things for PyTorch.
Language Quantized AutoEncoders
Fast and scalable search of whole-slide images via self-supervised deep learning - Nature Biomedical Engineering
(ECCV 2024) SignAvatars: A Large-scale 3D Sign Language Holistic Motion Dataset and Benchmark
Towards training VQ-VAE models robustly!
Voice conversion (VC) investigation using three variants of VAE
This repo implements VQ-VAE on MNIST as well as on a colored version of MNIST images. It also implements a simple LSTM for generating sample digits from the encoder outputs of the trained VQ-VAE.
VQ-VAE/GAN implementation in pytorch-lightning
Inverse DALL-E for Optical Character Recognition
Experimental implementation for a sparse-dictionary based version of the VQ-VAE2 paper
Large-Scale Bidirectional Training for Zero-Shot Image Captioning
Image Generation using VQVAE and GPT Models
Tensorflow Implementation of "Theory and Experiments on Vector Quantized Autoencoders"
Vector-Quantized Generative Adversarial Networks
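The repositories above all build on the same core operation: snapping each continuous encoder output to its nearest entry in a learned codebook. A minimal sketch of that nearest-neighbour lookup, using NumPy purely for illustration (the function and variable names here are hypothetical, not taken from any repo listed above):

```python
import numpy as np

def quantize(z, codebook):
    """Map each latent vector in z to its nearest codebook entry (L2 distance).

    z: (n, d) array of encoder outputs.
    codebook: (k, d) array of embedding vectors.
    Returns (quantized vectors, codebook indices).
    """
    # Pairwise squared distances via broadcasting: (n, k)
    d2 = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    idx = d2.argmin(axis=1)
    return codebook[idx], idx

# Toy example: two codebook entries, two latents.
codebook = np.array([[0.0, 0.0], [1.0, 1.0]])
z = np.array([[0.1, -0.2], [0.9, 1.2]])
zq, idx = quantize(z, codebook)
# idx -> [0, 1]: each latent maps to its closest codebook vector
```

In the full VQ-VAE training loop this lookup is non-differentiable, so implementations typically copy gradients from the quantized output straight through to the encoder (the straight-through estimator) and add commitment/codebook losses.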