Speech and Audio
Frameworks for speech recognition, speech synthesis, and audio processing.
Repositories
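Whisper is OpenAI's open-source, general-purpose speech recognition model. Trained on 680,000 hours of multilingual audio data, it supports multilingual speech recognition, speech translation, and language identification, with strong noise robustness and accuracy.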
Python
95.3k · 3 months ago
As little as 1 minute of voice data can be used to train a good TTS model (few-shot voice cloning).
Python
55.4k · 23 days ago
Clone a voice in 5 seconds to generate arbitrary speech in real time.
Python
59.5k · 3 months ago
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Python
44.7k · 2 years ago
DeepSpeech is an open-source embedded (offline, on-device) speech-to-text engine that can run in real time on devices ranging from a Raspberry Pi 4 to high-power GPU servers.
C++
26.7k · 8 months ago