timit

pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by PyTorch, while feature extraction, label computation, and decoding are performed with the Kaldi toolkit.

speech-recognition gru dnn kaldi rnn-model PyTorch timit deep-learning deep-neural-networks recurrent-neural-networks multilayer-perceptron-network lstm speech asr rnn

Python

2390

444

3 years ago

mravanelli / SincNet

SincNet is a neural architecture for efficiently processing raw audio samples.

Python

1188

268

4 years ago

speechbrain / speechbrain.github.io

The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain users can easily create speech processing systems, ranging from speech recognition (both HMM/DNN and end-to-end), speaker recognition, speech enhancement, speech separation, multi-microphone speech processing, and many others.

deep-learning speech-recognition speech-to-text speech speech-processing speaker-recognition speaker-verification speaker-identification speech-separation speechrecognition neural-network neural-networks timit speech-analysis

HTML

370

2 months ago

hirofumi0810 / tensorflow_end2end_speech_recognition

End-to-end speech recognition implementation based on TensorFlow (CTC, attention, and MTL training)

speech-recognition ctc Tensorflow timit timit-dataset attention-mechanism automatic-speech-recognition asr end-to-end end-to-end-learning speech-to-text beam-search

Python

314

119

8 years ago

philipperemy / timit

The DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus.

timit-dataset timit darpa speech

309

149

3 years ago

Diamondfan / CTC_pytorch

CTC end-to-end ASR for the TIMIT and 863 corpora.

ctc PyTorch timit kaldi decoder

Python

220

6 years ago

HawkAaron / RNN-Transducer

MXNet implementation of RNN Transducer (Graves 2012): Sequence Transduction with Recurrent Neural Networks

end-to-end asr transducers mxnet rnn-transducer timit speech-recognition

Python

139

4 years ago

WindQAQ / listen-attend-and-spell

TensorFlow implementation of "Listen, Attend and Spell" by William Chan. This project uses the input pipeline and Estimator API of TensorFlow, which makes training and evaluation truly end-to-end.

speech-recognition Tensorflow timit seq2seq speech-to-text

Python

7 years ago

grausof / keras-sincnet

Keras (TensorFlow) implementation of SincNet (Mirco Ravanelli, Yoshua Bengio - https://github.com/mravanelli/SincNet)

deep-learning audio waveform filtering cnn convolutional-neural-networks speaker-recognition speaker-verification speech-recognition asr audio-processing speech-processing digital-signal-processing neural-network machine-learning artificial-intelligence timit Tensorflow Keras

Python

4 years ago

hirofumi0810 / asr_preprocessing

Python implementation of preprocessing for end-to-end speech recognition

speech-recognition ctc attention-mechanism timit timit-dataset automatic-speech-recognition end-to-end transcription preprocessing dataset

Python

8 years ago

matthijsvk / TIMITspeech

Speech recognition on the TIMIT (or any other) dataset

speech timit neural-network theano phonemes speech-recognition

Python

8 years ago

mravanelli / pytorch_MLP_for_ASR

This code implements a basic MLP for speech recognition. The MLP is trained with PyTorch, while feature extraction, alignments, and decoding are performed with Kaldi. The current implementation supports dropout and batch normalization. An example of phoneme recognition on the standard TIMIT dataset is provided.

PyTorch asr kaldi deep-learning deep-neural-networks speech-recognition timit mlp feedforward-neural-network multilayer-perceptron neural-networks Python CUDA

Perl

8 years ago

AppleHolic / PytorchSR

PyTorch-based phoneme recognition (TIMIT phoneme classification)

PyTorch Bukkit timit speechrecognition

Python

7 years ago

mravanelli / theano-kaldi-rnn

THEANO-KALDI-RNNs is a project implementing various Recurrent Neural Networks (RNNs) for RNN-HMM speech recognition. The Theano code is coupled with the Kaldi decoder.

recurrent-neural-networks kaldi rnn gru timit theano deep-learning deep-neural-networks

Perl

7 years ago

zhaoyu611 / Automatic_Speech_Recognition_with_Multi_Models

A simple Automatic Speech Recognition (ASR) model in TensorFlow that lets you focus only on the deep neural network. It's easy to test popular cells (mostly LSTM and its variants) and models (unidirectional RNN, bidirectional RNN, ResNet, and so on). Moreover, you are welcome to play with self-defined cells or models.

automatic-speech-recognition Tensorflow ctc lstm deep-learning timit rnn

Python

8 years ago