speech-recognition

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

natural-language-processing PyTorch pytorch-transformers transformer model-hub pretrained-models speech-recognition Hacktoberfest Python machine-learning deep-learning audio deepseek gemma glm large-language-model qwen vlm

Python

148515

30084

4 hours ago

ggml-org / whisper.cpp

Port of OpenAI's Whisper model in C/C++

openai speech-to-text transformer Whisper inference speech-recognition

C++

42484

4570

1 day ago

mozilla / DeepSpeech

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

deep-learning machine-learning neural-networks Tensorflow speech-recognition speech-to-text deepspeech embedded on-device offline

C++

26568

4074

2 months ago

SYSTRAN / faster-whisper

Faster Whisper transcription with CTranslate2

deep-learning inference quantization speech-recognition speech-to-text transformer Whisper openai

Python

17662

1449

3 days ago

m-bain / whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

asr speech speech-recognition speech-to-text Whisper

Python

17364

1832

2 months ago

leon-ai / leon

🧠 Leon is your open-source personal assistant.

leon personal-assistant Node.js Python artificial-intelligence speech-to-text text-to-speech speech-recognition speech-synthesis flite assistant virtual-assistant chatbot Bot voice-assistant automation offline privacy ai-assistant

TypeScript

16578

1377

16 days ago

kaldi-asr / kaldi

kaldi-asr/kaldi is the official location of the Kaldi project.

kaldi C++ CUDA Shell speech-recognition speech-to-text speaker-verification speaker-id speech

Shell

15057

5364

1 month ago

NVIDIA / DeepLearningExamples

State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.

computer-vision deep-learning drug-discovery forecasting large-language-models mxnet paddlepaddle PyTorch recommender-systems speech-recognition speech-synthesis Tensorflow tensorflow2 translation natural-language-processing

Jupyter Notebook

14453

3361

1 year ago

alphacep / vosk-api

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

Jupyter Notebook

12984

1550

1 month ago

kmario23 / deep-learning-drizzle

Drench yourself in Deep Learning, Reinforcement Learning, Machine Learning, Computer Vision, and NLP by learning from these exciting lectures!!

machine-learning deep-learning deep-neural-networks pattern-recognition computer-vision optimization visual-recognition reinforcement-learning deep-reinforcement-learning natural-language-processing artificial-neural-networks artificial-intelligence-algorithms bayesian-statistics speech-recognition graph-neural-networks Medical imaging geometric-deep-learning explainable-ai probability

HTML

12660

2960

10 months ago

PaddlePaddle / PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

transformer conformer speech-translation streaming-asr speech-alignment punctuation-restoration streaming-tts speech-synthesis tts asr speech-recognition voice-cloning vocoder voice-recognition self-supervised-learning Whisper

Python

12168

1930

6 days ago

modelscope / FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

conformer PyTorch speech-recognition paraformer punctuation speaker-diarization rnnt audio-visual-speech-recognition pretrained-model voice-activity-detection Whisper dfsmn vad speechgpt speechllm

Python

12098

1220

5 days ago