
#ngrams

Modern spell checking library - accurate, fast, multi-language

C++
634
8 months ago

Covid-19 Twitter dataset for non-commercial research use and pre-processing scripts - under active development

Jupyter Notebook
476
2 years ago

Next-token prediction in JavaScript — build fast language and diffusion models.

JavaScript
143
7 months ago

A C++ library providing fast language model queries in compressed space.

C++
129
2 years ago

NLP Functions for amplifying negations, managing elisions, creating ngrams, stems, phonetic codes to tokens and more.

JavaScript
127
1 year ago

Colibri core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipgrams (i.e., patterns with one or more gaps, either of fixed or dynamic size) in a quick and memory-efficient way. At the core is the tool ``colibri-patternmodeller``, which allows you to build, view, manipulate and query pattern models.

C++
126
4 months ago

A fast and reliable PHP library for detecting languages

PHP
125
1 year ago

Python implementation of an N-gram language model with Laplace smoothing and sentence generation.

Python
83
7 years ago

This repository contains solutions to the assignments of the Natural Language Processing Specialization from Deeplearning.ai on Coursera, taught by Younes Bensouda Mourri, Łukasz Kaiser, and Eddy Shyu.

Jupyter Notebook
74
2 years ago

A flexible and general-purpose ngrams library written in Ruby. Raingrams supports ngram sizes greater than 1, text/non-text grams, multiple parsing styles and open/closed vocabulary models.

Ruby
69
4 years ago

Word/n-gram frequency lists for the Google Books Ngram Corpus (v3, all languages) with Python code

Python
64
2 years ago

Poetry generation via natural language Markov models

Python
54
2 months ago

Typing Assistant autocompletes words and suggests predictions for the next word. This makes typing faster and more intelligent, and reduces effort.

CSS
54
7 years ago

Next-word prediction using an n-gram probabilistic model with various smoothing techniques

Jupyter Notebook
36
7 years ago

Rust library providing fast language model queries in compressed space

Rust
24
3 years ago

NGRAMS is a search engine for the Google Books Ngram Dataset. This repository contains documentation, discussions, announcements, and issues.

20
2 years ago