text-to-audio

open-mmlab/Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Python
9302
3 months ago

[CVPR 2025] MMAudio: Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis

Python
1793
2 days ago
declare-lab/tango
Python
1189
22 days ago

TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching

Jupyter Notebook
768
22 days ago

PyTorch Implementation of Make-An-Audio (ICML'23) with a Text-to-Audio Generative Model

Python
650
1 year ago

Implementation of NÜWA, a state-of-the-art attention network for text-to-video synthesis, in PyTorch

Python
550
3 years ago

Mustango: Toward Controllable Text-to-Music Generation

Python
373
3 months ago

High-quality Text-to-Audio Generation with Efficient Diffusion Transformer

Python
308
2 months ago

Official code and models for the paper "Auffusion: Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generation"

Jupyter Notebook
187
1 year ago

Word2Wave: a framework for generating short audio samples from a text prompt using WaveGAN and COALA.

Python
119
4 years ago

Subtitle to audio: generates audio from any subtitle file using Coqui-ai TTS and synchronizes the audio timing with the subtitle timestamps.

Python
116
2 years ago
Python
97
5 months ago

PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis

Python
68
4 years ago

Soundstorm is a cutting-edge AI-powered audio manipulation application designed to provide a rich yet simplified experience for sound designers, algorithmic composers, and experimental audio enthusiasts. From sample pack creation and algorithmic composition to AI text-to-audio and onscreen ChatGPT, Soundstorm is a sonic powerhouse.

Python
32
1 year ago

Benchmarking for Audio-Text and Audio-Visual Generation; Supports FAD, FD_VGG, FD_PANNs, FD_PaSST, IS_PaSST, IS_PANNs, KL_PaSST, KL_PANNs, LAION-CLAP, MS-CLAP, DeSync

Python
28
6 months ago