wavlm

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

deep-learning PyTorch speaker-adaptation speech-synthesis text-to-speech tts wavlm diffusion-models latent-diffusion latent-diffusion-models Generative Adversarial Network

Python

5656

536

8 months ago

s3prl / s3prl

Self-Supervised Speech Pre-training and Representation Learning Toolkit

speech-representation mockingjay representation-learning apc tera self-supervised-learning speech-pretraining vq-apc wav2vec hubert wavlm

Python

2372

498

1 month ago

wenet-e2e / wespeaker

Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit

production-ready PyTorch resnet speaker-recognition speaker-verification speaker-diarization repvgg TLS (Transport Layer Security) dino wavlm

Python

879

135

2 months ago

lucadellalib / focalcodec

A low-bitrate single-codebook 16 kHz speech codec based on focal modulation

codec deep-learning PyTorch speech-synthesis wavlm

Python

2 months ago

mjhydri / Singing-Vocal-Beat-Tracking

This repo contains the source code of the first deep learning-based singing voice beat tracking system. It leverages the WavLM and DistilHuBERT pre-trained speech models to create vocal embeddings and trains linear multi-head self-attention layers on top of them to extract vocal beat activations. It then uses an HMM decoder to infer singing beats and tempo.

beat-tracking hubert music music-information-retrieval self-supervised singing-voice wavlm

Python

3 years ago

lucadellalib / discrete-wavlm-codec

A neural speech codec based on discrete WavLM representations

clustering codec hifi-gan PyTorch quantization self-supervised-learning speech-synthesis wavlm

Python

8 months ago

lucadellalib / audiocodecs

A collection of audio codecs with a standardized API

codec dac PyTorch quantization self-supervised-learning speech-synthesis text-to-speech wavlm

Python

7 days ago

Sarasadeghii / Sharif-WavLM

In this repository, the WavLM model is applied to both high-quality and poor-quality data for the speaker verification task, and the PyCM library is used for evaluation.

confusion-matrix speaker-verification wavlm

Jupyter Notebook

2 years ago

alessandropec / data_driven_ai_voice_cloning

This repository contains the code for the main part of my master's thesis in Data Science & Engineering at Politecnico di Torino

artificial-intelligence generative-ai speaker-verification text-to-speech voice-cloning zero-shot-learning deep-learning machine-learning wavlm fastspeech2 tacotron2

Python

2 years ago

theolepage / wavlm_ssl_sv

SOTA method for self-supervised speaker verification leveraging a large-scale pretrained ASR model.

asr dino PyTorch self-supervised-learning speaker-recognition speaker-verification wavlm

Python

2 months ago

bunyaminergen / WavLMMSDD

This repository combines `WavLM`, a powerful speech representation model from Microsoft, with `MSDD` (Multi-Scale Diarization Decoder), a state-of-the-art approach for speaker diarization from Nvidia.

diarization embedding Microsoft speaker-diarization speech wavlm

Jupyter Notebook

1 month ago

sadPororo / UniPool-SV

Universal Pooling Method for Speaker Verification Utilizing Pre-trained Multi-layer Features, 2025 preprint

hubert pretrained-models speaker-recognition speaker-verification wavlm

Python

7 months ago

zhu00121 / Universal-representation-dynamics-of-deepfake-speech

This repo contains code used in the paper "Characterizing the temporal dynamics of universal speech representations for generalizable deepfake detection"

deepfake-detection self-supervised wavlm

Python

2 years ago

bunyaminergen / WavLMRawNetXSVBase

WavLM Large + RawNetX Speaker Verification Base: End-to-End Speaker Verification Architecture

audio feature-extraction speaker-verification speech speech-processing wavlm

Python

1 month ago