#ngrams

Modern spell checking library - accurate, fast, multi-language

C++
646
1 year ago

Covid-19 Twitter dataset for non-commercial research use and pre-processing scripts - under active development

Jupyter Notebook
479
2 years ago

Next-token prediction in JavaScript — build fast language and diffusion models.

JavaScript
143
1 year ago

A C++ library providing fast language model queries in compressed space.

C++
132
2 years ago

NLP Functions for amplifying negations, managing elisions, creating ngrams, stems, phonetic codes to tokens and more.

JavaScript
131
1 year ago

A fast and reliable PHP library for detecting languages

PHP
129
2 years ago

Colibri core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipgrams (i.e. patterns with one or more gaps, either of fixed or dynamic size) in a quick and memory-efficient way. At the core is the tool ``colibri-patternmodeller``, which allows you to build, view, manipulate and query pattern models.

C++
128
8 months ago

Python implementation of an N-gram language model with Laplace smoothing and sentence generation.

Python
86
8 years ago

Word/n-gram frequency lists for the Google Books Ngram Corpus (v3, all languages) with Python code

Python
83
2 years ago

This repository contains solutions to the assignments of the Natural Language Processing Specialization from Deeplearning.ai on Coursera, taught by Younes Bensouda Mourri, Łukasz Kaiser, and Eddy Shyu.

Jupyter Notebook
78
2 years ago

A flexible and general-purpose ngrams library written in Ruby. Raingrams supports ngram sizes greater than 1, text/non-text grams, multiple parsing styles and open/closed vocabulary models.

Ruby
69
4 years ago

Poetry generation via natural language Markov models

Python
54
6 months ago

Typing Assistant provides the ability to autocomplete words and suggests predictions for the next word. This makes typing faster, more intelligent and reduces effort.

CSS
54
8 years ago

Next Word Prediction using n-gram Probabilistic Model with various Smoothing Techniques

Jupyter Notebook
36
7 years ago

A deep learning project using fine-tuned RoBERTa to classify mental health sentiments from text, aiming to provide early insights and support. ⚕️❤️

Jupyter Notebook
26
1 month ago

Rust library providing fast language model queries in compressed space

Rust
24
3 years ago