text-data

Toolkit for Machine Learning, Natural Language Processing, and Text Generation, in TensorFlow. This is part of the CASL project: http://casl-project.ai/

Python
2388
4 years ago

Integrating the Best of TF into PyTorch, for Machine Learning, Natural Language Processing, and Text Generation. This is part of the CASL project: http://casl-project.ai/

Python
745
3 years ago

Forte is a flexible and powerful ML workflow builder. This is part of the CASL project: http://casl-project.ai/

Python
244
1 year ago

Conversational Toolkit. An Open-Source Toolkit for Fast Development and Fair Evaluation of Text Generation

Python
127
5 years ago

Tools to uniformly read in text data including semi-structured transcripts

R
74
2 years ago

Tools for reshaping text data

R
50
1 year ago

Question classification for the CogComp QC dataset (http://cogcomp.org/Data/QA/QC/).

Python
29
4 years ago

A Python library that enables smooth keyword extraction from any text using the RAKE (Rapid Automatic Keyword Extraction) algorithm.

Python
29
1 year ago

Visualize large text collections with WebGL

JavaScript
25
7 months ago

Presents an optimized Apache Beam pipeline for generating sentence embeddings (runnable on Cloud Dataflow).

Python
20
3 years ago

Scrape EDGAR filings from https://www.sec.gov/

Julia
14
1 month ago

Old book pages (with ground truth), formerly used for OCR studies. The set comes in several versions that differ in resolution and binarization. Noised and denoised sets (produced by several methods) will eventually be uploaded.

HTML
12
8 years ago

A dataset of 30k+ so-called "self-help" tweets from 100+ authors.

Jupyter Notebook
9
6 years ago

Code for collecting GOM TV subtitle data

R
6
8 years ago

A machine learning model that predicts tags for a given question and body.

Jupyter Notebook
3
4 years ago

Directional Co-clustering with a Conscience (DCC)

R
3
6 years ago