voice-cloning

CorentinJ/Real-Time-Voice-Cloning

Clone a voice in 5 seconds to generate arbitrary speech in real-time

Python
57914
12 days ago

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Python
51314
25 days ago
unslothai/unsloth

Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek-R1, Qwen3, Gemma 3, TTS 2x faster with 70% less VRAM.

Python
46569
1 hour ago

Netflix-level subtitle cutting, translation, alignment, and even dubbing - one-click fully automated AI video subtitle team

Python
15034
5 months ago

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

Python
12266
7 days ago

YuE: Open Full-song Music Generation Foundation Model, something similar to Suno.ai but open

Python
5554
4 months ago
abus-aikorea/voice-pro

Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isolation, and multilingual translation.

Python
4840
2 months ago
Jupyter Notebook
2794
1 year ago
IAHispano/Applio

A simple, high-quality voice conversion tool focused on ease of use and performance.

Python
2628
3 hours ago

A Python/PyTorch app for easily synthesising human voices

Python
1440
10 months ago

AI Podcast Generator for bilingual episodes, Multi Languages, Alternative to NotebookLLM; a realistic-conversation AI podcast generator with multiple languages and multiple voices

TypeScript
1056
3 months ago

A comprehensive ComfyUI integration for Microsoft's VibeVoice text-to-speech model, enabling high-quality single and multi-speaker voice synthesis directly within your ComfyUI workflows.

Python
995
2 days ago

Official MiniMax Model Context Protocol (MCP) server that enables interaction with powerful Text to Speech, image generation and video generation APIs.

Python
964
3 months ago

An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.

Python
837
2 years ago

The code for the bark-voicecloning model. Training and inference.

Python
703
2 years ago