multi-speaker

netease-youdao / EmotiVoice

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

PyTorch speech speech-synthesis tts multi-speaker text-to-speech deep-learning prompt emotivoice artificial-intelligence Python emotion style

Python

8141

710

1 year ago

r9y9 / deepvoice3_pytorch

PyTorch implementation of convolutional neural networks-based text-to-speech synthesis models

tts speech-synthesis end-to-end speech-processing machine-learning PyTorch Python multi-speaker

Python

1979

487

2 years ago

ranchlai / mandarin-tts

Chinese Mandarin text-to-speech (TTS) speech synthesis with FastSpeech 2, implemented in PyTorch, using WaveGlow as the vocoder, with the Biaobei and AISHELL-3 datasets

tts tacotron PyTorch fastspeech2 multi-speaker

Python

479

111

3 years ago

keonlee9420 / Comprehensive-Transformer-TTS

A non-autoregressive, Transformer-based text-to-speech system supporting a family of SOTA transformers with supervised and unsupervised duration modeling. This project grows with the research community, aiming to achieve the ultimate TTS.

text-to-speech unsupervised non-autoregressive multi-speaker tts PyTorch fastspeech transformer neural-tts fastspeech2 hifi-gan sota speech-synthesis deep-learning

Python

325

3 years ago

aishoot / LSTM_PIT_Speech_Separation

Two-talker Speech Separation with LSTM/BLSTM by Permutation Invariant Training method.

speech-separation audio-separation multi-speaker speech-enhancement

Jupyter Notebook

309

4 years ago

DrewThomasson / VoxNovel

VoxNovel: generate audiobooks giving each character a different voice actor.

audiobooks epub generative-ai multi-speaker torch tts voice-cloning Linux macOS Windows m4b

Python

292

2 months ago

keonlee9420 / Comprehensive-E2E-TTS

A non-autoregressive end-to-end text-to-speech system (text-to-wav) supporting a family of SOTA unsupervised duration modeling methods. This project grows with the research community, aiming to achieve the ultimate E2E-TTS.

deep-learning fastspeech2 hifi-gan jets multi-speaker neural-tts non-autoregressive PyTorch sota speech-synthesis text-to-speech tts unsupervised end-to-end

Python

146

3 years ago

Totoketchup / Adaptive-MultiSpeaker-Separation

Adaptive and Focusing Neural Layers for Multi-Speaker Separation Problem

deep-learning audio-separation speech-separation adaptive-learning Tensorflow source-separation multi-speaker

Jupyter Notebook

7 years ago

anton-jeran / MULTI-AUDIODEC

This is the official implementation of our multi-channel multi-speaker multi-spatial neural audio codec architecture.

codec multi-speaker overlap spatial-audio speech-enhancement speech-separation

Python

5 months ago

keonlee9420 / Comprehensive-Tacotron2

PyTorch implementation of Google's Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. This implementation supports both single- and multi-speaker TTS, along with several techniques to improve the robustness and efficiency of the model.

text-to-speech tts tacotron tacotron2 PyTorch speech-synthesis autoregressive multi-speaker robustness efficiency neural-tts hifi-gan deep-learning

Python

2 years ago

hwRG / FastSpeech2-Pytorch-Korean-Multi-Speaker

Multi-speaker FastSpeech2 adapted for Korean, with detailed descriptions of training and synthesis.

fastspeech2 korean multi-speaker PyTorch transfer-learning tts

Python

3 years ago

nikitashvarts / CocktailPartySpeakerRecognition

An Algorithm for Speaker Recognition in a Multi-Speaker Environment

speaker-recognition multi-speaker deep-learning lstm

Python

5 years ago

ZoraizQ / urdu-speech-recognition

Urdu Speech Recognition using Kaldi ASR, by training Triphone Acoustic GMMs using the PRUS dataset.

speech-recognition urdu multi-speaker

Shell

4 years ago

TheSeraphim / scribe-forge-ai

🎵 Complete offline audio transcription system with speaker diarization using OpenAI Whisper and PyAnnote. Features automatic audio cleaning, precise timestamps, multiple output formats (JSON/TXT/Markdown), and support for 20+ audio formats. No external APIs required - works entirely offline.

audio-analysis audio-processing diarization FFmpeg huggingface machine-learning multi-speaker natural-language-processing openai-whisper Python speaker-diarization speech-recognition speech-to-text Whisper

Python

2 months ago

parisimaa / multi_speaker

audio-processing multi-speaker deep-learning

MATLAB

3 years ago