#vad

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing, etc.

Python
9870
6 days ago
Python
2313
4 months ago

Real-time speech recognition and voice activity detection (VAD) using next-gen Kaldi with ncnn, without an Internet connection. Supports iOS, Android, Linux, macOS, Windows, Raspberry Pi, VisionFive2, LicheePi4A, etc.

C++
1274
4 days ago

Voice activity detection (VAD) toolkit including DNN-, bDNN-, LSTM- and ACAM-based VAD. We also provide our directly recorded dataset.

MATLAB
854
4 years ago

An audio/acoustic activity detection and audio segmentation tool

Python
771
4 months ago

ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore the latest advancements in acoustics, speech and signal processing. Code included. Star the repository to support the advancement of audio and signal processing!

Python
455
3 months ago

An Optimized Speech-to-Text Pipeline for the Whisper Model Supporting Multiple Inference Engines

Jupyter Notebook
394
8 months ago

Runtime Audio Importer plugin for Unreal Engine. Imports audio in various formats at runtime.

C++
382
2 months ago

Integrates WebRTC's VAD, used for splitting audio files

C
341
5 years ago

Android Voice Activity Detection (VAD) library. Supports WebRTC VAD GMM, Silero VAD DNN, and YAMNet VAD DNN models.

C
327
3 months ago

Experimental code: sound file preprocessing to optimize Whisper transcriptions without hallucinated texts

Python
321
5 months ago

On-device voice activity detection (VAD) powered by deep learning

Python
206
5 days ago

Statistical model-based Voice Activity Detection

Jupyter Notebook
192
6 years ago

Python bindings of WebRTC Audio Processing

C++
189
7 months ago

PyTorch implementation of SELF-ATTENTIVE VAD, ICASSP 2021

Python
153
3 years ago

Enumerate user mode shared memory mappings on Windows.

C
120
4 years ago

This is an on-CPU real-time conversational system for two-way speech communication with AI models, utilizing a continuous streaming architecture for fluid conversations with immediate responses and natural interruption handling.

Python
111
3 days ago