Speech and Audio

Frameworks for speech recognition, speech synthesis, and audio processing.

Repositories

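Whisper is OpenAI's open-source, general-purpose speech recognition model. Trained on 680,000 hours of multilingual audio data, it supports multilingual speech recognition, speech translation, and language identification, and offers strong noise robustness and accuracy.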

Python
95.3k
3 months ago
ggml-org/whisper.cpp

Port of OpenAI's Whisper model in C/C++

C++
47.2k
4 days ago

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Python
55.4k
23 days ago
CorentinJ/Real-Time-Voice-Cloning

Clone a voice in 5 seconds to generate arbitrary speech in real-time

Python
59.5k
3 months ago

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Python
44.7k
2 years ago

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

C++
26.7k
8 months ago