voice-conversion

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Python text-to-speech deep-learning speech PyTorch tts vocoder tacotron glow-tts melgan speaker-encoder hifigan speaker-encodings multi-speaker-tts tts-model speech-synthesis voice-cloning voice-synthesis voice-conversion

Python

42854

5661

1 year ago

RVC-Project / Retrieval-based-Voice-Conversion-WebUI

Easily train a good VC model with voice data <= 10 mins!

change sovits vits voice voice-conversion rvc audio-analysis conversational-ai conversion converter retrieval-model retrieve-data so-vits-svc vc voice-converter voiceconversion

Python

32284

4543

10 months ago

svc-develop-team / so-vits-svc

SoftVC VITS Singing Voice Conversion

artificial-intelligence audio-analysis Generative Adversarial Network singing-voice-conversion so-vits-svc sovits variational-inference vc vits voice voice-conversion voiceconversion voice-changer flow deep-learning PyTorch speech

Python

27646

5055

2 years ago

espnet / espnet

End-to-End Speech Processing Toolkit

deep-learning end-to-end chainer PyTorch kaldi speech-recognition speech-synthesis speech-translation machine-translation voice-conversion speech-enhancement speech-separation singing-voice-synthesis speaker-diarization text-to-speech

Python

9494

2326

3 days ago

open-mmlab / Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

audio-generation audio-synthesis audioldm music-generation naturalspeech2 singing-voice-conversion speech-synthesis text-to-audio text-to-speech vall-e voice-conversion audit fastspeech2 vits emilia maskgct vocoder

Python

9420

760

4 months ago

voicepaw / so-vits-svc-fork

so-vits-svc fork with realtime support, improved interface and more features.

sovits vits voice-conversion so-vits-svc hubert softvc realtime voice-changer deep-learning PyTorch speech-synthesis Generative Adversarial Network lightning pytorch-lightning Hacktoberfest

Python

9138

1220

3 hours ago

abus-aikorea / voice-pro

Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isolation, and multilingual translation.

faster-whisper tts Whisper gradio subtitles transcription translator webui speech-recognition speech-synthesis speech-to-text text-to-speech yt-dlp voice-cloning podcasts audiobook voice-conversion karaoke whisperx

Python

4840

414

2 months ago

Plachtaa / seed-vc

Zero-shot voice conversion and singing voice conversion, with real-time support

voice-conversion singing-voice-conversion

Python

3285

384

5 months ago

zzw922cn / awesome-speech-recognition-speech-synthesis-papers

Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)