wavlm

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Python
5917
1 year ago
s3prl/s3prl
Python
2448
2 months ago

Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit

Python
1001
2 months ago

A low-bitrate single-codebook 16 kHz speech codec based on focal modulation

Python
94
6 months ago

This repo contains the source code of the first deep learning-based singing voice beat tracking system. It leverages the WavLM and DistilHuBERT pre-trained speech models to create vocal embeddings and trains linear multi-head self-attention layers on top of them to extract vocal beat activations. It then uses an HMM decoder to infer singing beats and tempo.

Python
33
3 years ago
Python
29
3 months ago

In this repository, the WavLM model is applied to both good-quality and poor-quality data for the speaker verification task, and the PyCM library is used for evaluation.

Jupyter Notebook
9
2 years ago

SOTA method for self-supervised speaker verification leveraging a large-scale pretrained ASR model.

Python
8
6 months ago

This repository contains the code for the main part of my master's thesis in Data Science & Engineering at Politecnico di Torino.

Python
8
2 years ago

This repository combines `WavLM`, a powerful speech representation model from Microsoft, with `MSDD` (Multi-Scale Diarization Decoder), a state-of-the-art approach to speaker diarization from NVIDIA.

Jupyter Notebook
7
2 months ago

Universal Pooling Method for Speaker Verification Utilizing Pre-trained Multi-layer Features, 2025 preprint

Python
7
1 year ago

kNN-SVC: Robust Zero-Shot Singing Voice Conversion with Additive Synthesis and Concatenation Smoothness Optimization

Python
6
2 months ago

Rethinking Leveraging Pre-Trained Multi-Layer Representations for Speaker Verification, 2025 Interspeech

Python
6
3 months ago

This repo contains the code used in the paper "Characterizing the temporal dynamics of universal speech representations for generalizable deepfake detection".

Python
4
2 years ago

WavLM Large + RawNetX Speaker Verification Base: End-to-End Speaker Verification Architecture

Python
2
5 months ago