audioldm
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
A web UI for various audio-related neural networks
OpenMusic: SOTA Text-to-music (TTM) Generation
(Windows/Linux/macOS) Local WebUI with neural network models (Text, Image, Video, 3D, Audio) in Python (Gradio interface). Translated into 3 languages
Text-prompt-steered synthetic audio generators
[ICASSP'24] Investigating Personalization Methods in Text to Music Generation
A comprehensive, click to install, fully open-source, Video + Audio Generation AIO Toolkit using advanced prompt engineering plus the power of CogVideox + AudioLDM2 + Python!
AudioLDM text-to-audio Colab
Generative AI version of the GeoGuesser game.
Workshop for a multimodal media generator
Enhancing Diffusion-Based Music Generation Performance with LoRA.
In this game, you're given an image to view for a set number of seconds. Then you have to guess where the photo was taken by clicking on any point in the world. NOTICE: This game is INCOMPLETE