speech-to-text

ggml-org/whisper.cpp
C++
39336
2 days ago

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

C++
26241
8 months ago

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Python
15039
7 days ago

kaldi-asr/kaldi is the official location of the Kaldi project.

Shell
14779
3 months ago

Translate a video from one language to another and add dubbing. Also supports speech recognition transcription, speech synthesis, and subtitle translation.

Python
12519
12 days ago
Jupyter Notebook
9273
1 month ago

Speech recognition module for Python, supporting several engines and APIs, online and offline.

Python
8691
3 days ago

A Deep-Learning-Based Chinese Speech Recognition System

Python
8093
7 months ago

A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcription.

Python
6750
8 days ago

💬 Speech recognition for your site

JavaScript
6660
8 months ago

Speech-to-text, text-to-speech, speaker diarization, speech enhancement, and VAD using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, x86_64 servers, and websocket server/client, as well as 11 programming languages.

C++
5662
2 days ago

Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple

Jupyter Notebook
5236
2 years ago

JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.

Jupyter Notebook
4585
1 year ago

Open-source, accurate and easy-to-use video speech recognition & clipping tool, with LLM-based AI clipping integrated.

Python
4430
1 month ago

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

Jupyter Notebook
4399
1 month ago