unigram

Build unigram and bigram language models, implement Laplace smoothing and use the models to compute the perplexity of test corpora.

Python
84
8 years ago

Get n-grams from text

JavaScript
79
2 years ago

Next Word Prediction using n-gram Probabilistic Model with various Smoothing Techniques

Jupyter Notebook
36
7 years ago

Fast and versatile tokenizer for language models, compatible with SentencePiece, Tokenizers, Tiktoken and more. Supports BPE, Unigram and WordPiece tokenization in JavaScript, Python and Rust.

Rust
19
1 month ago

Telegram bot that generates random messages using a Markov chain, now for Node.js

TypeScript
8
3 years ago

NLP extra project at AUT Artificial Intelligence course (Fall 2020)

Python
5
4 years ago

All projects from AUT's Principles and Applications of Artificial Intelligence course.

Python
5
4 years ago

Word2Vec using Hierarchical Softmax and Negative Sampling with Unigram & Subsampling

Python
4
5 years ago

Python Web Crawler implementing Iterative Deepening Depth-First Search

Python
3
2 years ago

We have implemented an NLP project to recognize the correct poets. The project uses bigram, unigram, and backoff models for smoothing.

Jupyter Notebook
3
4 years ago

Performance evaluation of sentiment classification on movie reviews

Python
3
7 years ago

COMP6721: Introduction to Artificial Intelligence Assignments (GOFAI, Machine Learning, NLP)

HTML
2
6 years ago

Academic project centered around n-grams and their application in developing a spelling corrector with contextual awareness.

Jupyter Notebook
2
1 year ago

Unigram tokenization implemented from scratch

Jupyter Notebook
2
1 year ago

🧛🏻‍♂️ Dark theme for Unigram

2
3 years ago