Repository navigation

#

subword-units

Unsupervised Word Segmentation for Neural Machine Translation and Text Generation

Python
2230
8 个月前

Morfessor is a tool for unsupervised and semi-supervised morphological segmentation

Python
191
5 年前

Context-sensitive word embeddings with subwords. In Rust.

Rust
87
2 年前

Properly handle position-dependent phones in a subword lexicon FST

Python
31
4 年前
Python
13
6 年前

Finalfusion embeddings in Python

Python
9
3 年前

Semantic role labeling with subwords (character, character-ngram and morphology)

Perl
8
5 年前

A python package to build a corpus vocabulary using the byte pair methodology and also a tokenizer to tokenize input texts based on the built vocab.

Python
2
5 年前

Cognate-aware morphological segmentation

Python
2
6 年前

A tool for generating sub-word (phone or grapheme) level embeddings from an HTK-style MLF ASR corpus

Python
1
6 年前

Morfessor demonstration

Python
0
10 年前

Classified sentences into one of Slovak, Czech, and English. Implemented relevant preprocessing steps, addressed the class imbalance in training set by employing the learned theory of Naive Bayes Models, and implementing subword units.

Smalltalk
0
5 年前

This repository contains the code to learn subword embeddings from the arXiv dataset of 1.7M+ scholarly papers.

Shell
0
4 年前

🕵️ Language Model based on RNN for generating Sherlock Holmes stories.

Python
0
4 年前