corpus-processing

An Integrated Corpus Tool With Multilingual Support for the Study of Language, Literature, and Translation

Python
717
24 days ago

Python scripts for preprocessing the Penn Treebank and Chinese Treebank

Python
161
5 years ago

OpusFilter - Parallel corpus processing toolkit

Python
104
25 days ago

Utilities for Processing the Switchboard Dialogue Act Corpus

Python
68
4 years ago

A Serverless Text Annotation Tool for Corpus Development

JavaScript
55
2 months ago

Utilities for Processing the Meeting Recorder Dialogue Act Corpus

Python
32
4 years ago

A simple collocation-driven recognition of rhymes. Contains pre-trained models for Czech, Dutch, English, French, German, Russian, and Spanish poetry

Python
29
3 years ago

A library of functions enabling complex corpus search in context (KWIC), search aggregation, bag-of-words building & keyphrase extraction.

R
20
6 years ago

Hard-Forked from JuliaText/TextAnalysis.jl

Julia
17
2 years ago

Plotly Dash NLP project. Measures document similarity using Latent Dirichlet Allocation and principal component analysis, followed by k-means clustering, with dynamic visual interaction.

Python
12
3 years ago

Scripts for building a geo-located web corpus using Common Crawl data

Python
11
17 days ago

A set of corpus-based sampling & analysis M4L devices

Max
11
3 years ago

Script that sets up and configures an entire CQPweb server installation

Shell
11
5 years ago