voice-activity-detection

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing, etc.

Python
9860
5 days ago

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Jupyter Notebook
7302
4 days ago

Real-time speech recognition and voice activity detection (VAD) using next-gen Kaldi with ncnn, without an Internet connection. Supports iOS, Android, Linux, macOS, Windows, Raspberry Pi, VisionFive2, LicheePi4A, etc.

C++
1274
4 days ago
juanmc2005/diart

A Python package to build AI-powered real-time audio applications

Python
1253
2 months ago

Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM and ACAM based VAD. We also provide our directly recorded dataset.

MATLAB
854
4 years ago

CNN-based audio segmentation toolkit. Detects speech, music, noise, and speaker gender. Designed for large-scale gender equality studies based on speech time per gender.

Python
797
3 months ago

An audio/acoustic activity detection and audio segmentation tool

Python
771
4 months ago

Automatically synchronize and translate subtitles, or create new ones by transcribing, using pre-trained DNNs, Forced Alignments and Transformers. https://subaligner.readthedocs.io/

Python
467
2 months ago

An Optimized Speech-to-Text Pipeline for the Whisper Model Supporting Multiple Inference Engines

Jupyter Notebook
394
8 months ago

Runtime Audio Importer plugin for Unreal Engine. Imports audio in various formats at runtime.

C++
382
2 months ago

Android Voice Activity Detection (VAD) library. Supports WebRTC VAD GMM, Silero VAD DNN, Yamnet VAD DNN models.

C
327
3 months ago