deepspeech
DeepSpeech is an open-source embedded (offline, on-device) speech-to-text engine that can run in real time on devices ranging from a Raspberry Pi 4 to high-power GPU servers.
Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
Examples of how to use or integrate DeepSpeech
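A PaddlePaddle-based speech recognition implementation for Chinese speech recognition. The project is mature and recognition accuracy is good. Supports training and inference on Windows and Linux, and inference on Nvidia Jetson development boards.
A PyTorch implementation of a streaming and non-streaming automatic speech recognition framework, compatible with both online and offline recognition. It currently supports the Conformer, Squeezeformer, and DeepSpeech2 models, along with a variety of data augmentation methods.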
Speech-to-text benchmark framework
A CLI script to generate subtitle files (SRT/VTT/TXT) for any video using either DeepSpeech or Coqui
A Keras CTC implementation of Baidu's DeepSpeech for model experimentation
🎧 Automatic Speech Recognition: DeepSpeech & Seq2Seq (TensorFlow)
A testing server for a speech to text service based on coqui.ai
Golang bindings for Mozilla's DeepSpeech speech-to-text library
Server & client for DeepSpeech using WebSockets for real-time speech recognition in separate environments
Android Speech Recognition Service using Vosk/Kaldi and Mozilla DeepSpeech
Install Mozilla DeepSpeech on a Raspberry Pi 4
Tooling for producing an Italian model (public release available) and text corpus for DeepSpeech
Traditional ASR (Signal & Cepstral Analysis, DTW, HMM) & DNNs (Custom Models + DeepSpeech) on Indian Accent Speech
An editor for speech-to-text transcripts such as AWS Transcribe and Mozilla DeepSpeech
An MXNet implementation of Baidu's DeepSpeech architecture