#speech-representation

s3prl/s3prl
Python
2462
4 months ago

[ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second for audio language modeling

Python
1209
7 months ago

[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation

Python
959
9 months ago

A Survey of Spoken Dialogue Models (60 pages)

310
10 months ago

A single-layer, streaming codec model providing SOTA audio quality and discrete tokens designed for superior downstream modelability.

Python
100
4 months ago

LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT

Python
74
3 years ago

This is the code for paper: XY-Tokenizer: Mitigating the Semantic-Acoustic Conflict in Low-Bitrate Speech Codecs. Demos, technical insights and experimental results are presented on

Python
74
16 days ago

Ultra-low bitrate speech codec (0.27-1 kbps) with cross-modal alignment and real-time capabilities

Python
65
1 month ago

A mini, simple, and fast end-to-end automatic speech recognition toolkit.

Jupyter Notebook
53
3 years ago

DUSTED: Spoken-Term Discovery using Discrete Speech Units

Jupyter Notebook
18
1 year ago

Causal Speech Enhancement Based on a Two-Branch Nested U-Net Architecture Using Self-Supervised Speech Embeddings

Python
17
4 months ago

Semi-supervised spoken language understanding (SLU) via self-supervised speech and language model pretraining

Python
12
5 years ago