Repository navigation

#

gradio

Python
37541
13 小时前

⚡️HivisionIDPhotos: a lightweight and efficient AI ID photos tools. 一个轻量级的AI证件照制作算法。

Python
15568
15 天前

Convert ebooks to audiobooks with chapters and metadata using dynamic AI models and voice cloning. Supports 1,107+ languages!

Python
9526
12 天前

Easy Docker setup for Stable Diffusion with user-friendly UI

Shell
7122
8 个月前

Open-source, accurate and easy-to-use video speech recognition & clipping tool, LLM based AI clipping intergrated.

Python
4430
1 个月前
abus-aikorea/voice-pro

Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isolation, and multilingual translation.

Python
3605
3 天前

[CVPR'25] Official Implementations for Paper - MagicQuill: An Intelligent Interactive Image Editing System

Python
3298
4 天前

InternGPT (iGPT) is an open source demo platform where you can easily showcase your AI models. Now it supports DragGAN, ChatGPT, ImageBind, multimodal chat like GPT-4, SAM, interactive image editing, etc. Try it at igpt.opengvlab.com (支持DragGAN、ChatGPT、ImageBind、SAM的在线Demo系统)

Python
3215
8 个月前

[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.

Python
3214
3 个月前
rsxdalv/tts-generation-webui

TTS Generation Web UI (Bark, MusicGen + AudioGen, Tortoise, RVC, Vocos, Demucs, SeamlessM4T, MAGNet, StyleTTS2, MMS, Stable Audio, Mars5, F5-TTS, ParlerTTS)

TypeScript
2110
11 小时前
Jupyter Notebook
2104
1 年前

A Web UI for easy subtitle using whisper model.

Python
1899
4 天前

Auto1111 extension implementing text2video diffusion models (like ModelScope or VideoCrafter) using only Auto1111 webui dependencies

Python
1316
9 个月前