voice-detection

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

voice-detection voice-recognition voice-commands PyTorch onnx voice-activity-detection voice-control onnx-runtime onnxruntime speech speech-processing vad

Python

6996

639

1 month ago

jtkim-kaist / VAD

Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM and ACAM based VAD. We also provide our directly recorded dataset.

vad dnn lstm attention speech data voice-detection speech-recognition voice-activity-detection

MATLAB

864

235

4 years ago

amsehili / auditok

An audio/acoustic activity detection and audio segmentation tool

voice-detection vad voice-activity-detection

Python

818

10 months ago

gkonovalov / android-vad

Android Voice Activity Detection (VAD) library. Supports WebRTC VAD GMM, Silero VAD DNN, Yamnet VAD DNN models.

vad offline real-time audio-processing WebRTC Android dnn on-device-ai silero-vad neural-networks voice-detection deep-neural-networks onnx-models voice-activity-detection

414

3 months ago

gong-io / gecko

Gecko - A Tool for Effective Annotation of Human Conversations

transcription diarization voice-detection

JavaScript

298

3 years ago

eesungkim / Voice_Activity_Detector

A statistical model-based Voice Activity Detection

vad voice-detection voice-activity-detection

Jupyter Notebook

194

7 years ago

isrish / VAD-LTSD

Efficient voice activity detection algorithm using long-term speech information

audio speech voice voice-detection

MATLAB

8 years ago

hernanrazo / human-voice-detection

Binary classification problem that aims to classify human voices from audio recordings. Implemented using PyTorch and Librosa.

PyTorch voice-detection speech-processing deep-learning audio-classification librosa cnn-pytorch

Python

4 years ago

baochuquan / ios-vad

iOS Voice Activity Detection (VAD). Supports WebRTC VAD GMM, Silero VAD DNN, Yamnet VAD DNN models.

audio-processing deep-neural-networks dnn iOS neural-networks offline on-device-ai real-time silero-vad speech-recognition vad voice-activity-detection voice-detection WebRTC

Swift

1 year ago

cosmincatalin / voice-recognition-with-mxnet-and-sagemaker

End to end AWS SageMaker application for detecting the AWS Polly voice in an audio recording using Gluon and MXNet.

machine-learning mxnet gluon sagemaker Jupyter Notebook Amazon Web Services data-science voice-detection audio audio-analysis

Jupyter Notebook

5 years ago

joooooonyoung / Synthetic-Voice-Detection

Spoofing voice detection: 2nd YAICON

deepfake-detection voice-detection

Python

2 years ago

tahangz / Poetry-Practicing

The Poetry Pronunciation Learning App is an interactive AI-powered tool that helps users practice and improve their pronunciation of poems. It uses real-time speech recognition, voice activity detection, and fuzzy word matching to provide instant feedback on spoken verses.

artificial-intelligence CUDA gpu real-time voice-detection voice-recognition Whisper whisper-ai voice-activity-detection

Python

1 month ago

moego0 / custom_KWS

End-to-end pipeline for training a custom keyword detection model with TensorFlow & TFLite export

audio-processing cnn-model deep-learning edge-ai Keras speech-recognition Tensorflow tflite voice-detection

Python

2 months ago

tinahaibodi / creativescreaming

A p5.js experiment that uses voice detection and cursor movement to multiply creative content in a variety of colours

colours voice-detection

JavaScript

7 years ago

sky4689524 / TranscribeTube

TranscribeTube is a Python tool that transcribes and generates subtitles for videos from local files or YouTube links using Hugging Face models. It features an interactive Gradio web interface, allowing users to easily upload videos, select languages, and download subtitles in SRT format.

ai-tools gradio huggingface large-language-models machine-learning multilingual Python speaker-diarization speech-recognition speech-to-text transcription voice-detection whisper-ai yt-dlp

Python

1 year ago