neural-tts

A Non-Autoregressive Transformer-based Text-to-Speech system, supporting a family of SOTA transformers with supervised and unsupervised duration modeling. This project grows with the research community, aiming to achieve the ultimate TTS.

Python
325
3 years ago

Official implementation of Meta-StyleSpeech and StyleSpeech

Python
245
3 years ago

PyTorch implementation of DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (focused on DiffSpeech)

Python
238
3 years ago

PyTorch Implementation of Meta-StyleSpeech: Multi-Speaker Adaptive Text-to-Speech Generation

Python
192
3 years ago

PyTorch Implementation of ByteDance's Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised Training in Text-To-Speech

Python
192
2 years ago

PyTorch Implementation of Google's Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling

Python
191
3 years ago

A Non-Autoregressive End-to-End Text-to-Speech system (text-to-wav), supporting a family of SOTA unsupervised duration modeling methods. This project grows with the research community, aiming to achieve the ultimate E2E-TTS.

Python
146
3 years ago

A cross-platform inference engine for neural TTS models.

Rust
72
5 months ago

PyTorch Implementation of NCSOFT's FastPitchFormant: Source-filter based Decomposed Modeling for Speech Synthesis

Python
72
4 years ago

PyTorch Implementation of VAENAR-TTS: Variational Auto-Encoder based Non-AutoRegressive Text-to-Speech Synthesis.

Python
72
4 years ago

PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis

Python
69
4 years ago

PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis

Python
56
4 years ago

PyTorch Implementation of Google's Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. This implementation supports both single- and multi-speaker TTS, along with several techniques to improve the robustness and efficiency of the model.

Python
48
2 years ago

Babylon.cpp is a C and C++ library for grapheme-to-phoneme conversion and text-to-speech synthesis. Phonemization uses an ONNX Runtime port of the DeepPhonemizer model, and speech synthesis uses VITS models. Piper models are compatible after running a conversion script.

Python
16
8 months ago

A template for non-autoregressive deep-learning-based TTS models (in PyTorch).

Python
15
4 years ago

Let your GNOME desktop speak to you. Reads your desktop notifications or selected text out loud in a human-like voice using Piper. Uses a local LLM to summarize selected text.

JavaScript
8
8 days ago

VS Code extension for multi-language text translation and TTS (text-to-speech) using Azure Cognitive Services. Please [✩Star] if you're using it!

TypeScript
7
4 years ago

A simple Discord bot that synthesizes speech directly into a voice channel via text commands, with support for sound effects.

Python
0
1 year ago