kenlm

shibing624/pycorrector

pycorrector is a toolkit for text error correction. It applies models such as KenLM, T5, MacBERT, ChatGLM3, and Qwen2.5 to error-correction scenarios and works out of the box.

Python
5926
4 months ago

State-of-the-art (ranked #1 Aug 2022) German Speech Recognition in 284 lines of C++. This is a 100% private 100% offline 100% free CLI tool.

C
412
3 years ago

CTC+Beam_Search+kenlm is a decoding system for acoustic models that use Chinese characters as the modeling unit

C++
47
7 years ago

Complete instructions for training a Persian spell checker and a language model, based on SymSpell and KenLM respectively, using a Wikipedia dataset.

Python
33
3 years ago

Optical Character Recognition + Instance Segmentation for Russian and English

Jupyter Notebook
30
3 years ago

Targoman SMT framework source code

C++
30
7 years ago

Romanian Automatic Speech Recognition from the ROBIN project

Python
27
3 years ago

A Java JNI wrapper for KenLM: Faster and Smaller Language Model Queries

Java
12
4 years ago

Real-Time ASR with CNN-BiLSTM: End-to-End Live Streaming Using PyTorch Lightning⚡

Python
9
3 months ago

INACTIVE - http://mzl.la/ghe-archive - Generate language models from OSCAR corpora

Python
8
5 years ago

Neural Grammatical Error Correction for Romanian using Transformer

Python
7
7 months ago

An AI tool that automatically generates captions and transcripts for YouTube videos in 67 languages and summarized text in 133 languages.

Python
7
1 year ago

Scripts to train n-gram language models on Wikipedia articles

Python
4
3 years ago

This repo shows how to fine-tune the wav2vec2.0 model, along with its prerequisites.

Jupyter Notebook
3
1 year ago

Basic setup for using the kenlm library in C++

C++
3
5 years ago

Demo of domain corpus bootstrapping using language model perplexity

Jupyter Notebook
2
7 years ago