Speech & Audio
Speech recognition, speech synthesis, and audio processing frameworks.
Repositories
openai / whisper
Robust Speech Recognition via Large-Scale Weak Supervision
Python
94.2k
ggml-org / whisper.cpp
Port of OpenAI's Whisper model in C/C++
C++
46.5k
RVC-Boss / GPT-SoVITS
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
Python
54.7k
CorentinJ / Real-Time-Voice-Cloning
Clone a voice in 5 seconds to generate arbitrary speech in real-time
Python
59.3k
coqui-ai / TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Python
44.5k
mozilla / DeepSpeech
DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.
C++
26.7k