Repository navigation

speech-representation

Website
Wikipedia

Self-Supervised Speech Pre-training and Representation Learning Toolkit

speech-representation mockingjay representation-learning apc tera self-supervised-learning speech-pretraining vq-apc wav2vec hubert wavlm

Python

2462

513

4 个月前

jishengpeng / WavTokenizer

[ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second for audio language modeling

acoustic codec gpt4o semantic speech-representation text-to-speech dac

Python

1209

102

7 个月前

ddlBoJack / emotion2vec

[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation

pytorch-implementation speech-representation

Python

959

9 个月前

jishengpeng / WavChat

A Survey of Spoken Dialogue Models (60 pages)

gpt-4o speech speech-representation streaming

310

10 个月前

Ereboas / MagiCodec

A single-layer, streaming codec model providing SOTA audio quality and discrete tokens designed for superior downstream modelability.

codec 大语言模型 PyTorch speech-representation text-to-speech tts

Python

100

4 个月前

mechanicalsea / lighthubert

LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT

neural-architecture-search PyTorch self-supervised-learning speech-representation

Python

3 年前

gyt1145028706 / XY-Tokenizer

This is the code for paper: XY-Tokenizer: Mitigating the Semantic-Acoustic Conflict in Low-Bitrate Speech Codecs. Demos, technical insights and experimental results are presented on

autoencoder automatic-speech-recognition speech-representation

Python

16 天前