pos-tagging

hankcs/HanLP

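Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, constituency parsing, semantic dependency parsing, semantic role labeling, coreference resolution, style transfer, semantic similarity, new word discovery, keyword and phrase extraction, automatic summarization, text classification and clustering, pinyin and Simplified/Traditional Chinese conversion, natural language processing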

Python
34891
3 months ago

Jcseg is a lightweight NLP framework developed in Java. It provides CJK and English segmentation based on the MMSEG algorithm, along with keyword extraction, key sentence extraction, and summary extraction based on the TextRank algorithm. Jcseg has a built-in HTTP server and search modules for Lucene, Solr, Elasticsearch, and OpenSearch.

Java
922
2 years ago
Go
857
19 days ago

Natural Language Toolkit for Malaysian language, https://malaya.readthedocs.io/

Jupyter Notebook
488
5 days ago

CogComp's Natural Language Processing Libraries and Demos: Modules include lemmatizer, ner, pos, prep-srl, quantifier, question type, relation-extraction, similarity, temporal normalizer, tokenizer, transliteration, verb-sense, and more.

Java
475
2 years ago

A suite of Arabic natural language processing tools developed by the CAMeL Lab at New York University Abu Dhabi.

Python
449
5 days ago

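API of Articut, a Chinese word segmenter that also provides semantic part-of-speech tagging. Word segmentation (斷詞, also called 分詞) is the foundation of Chinese information processing. Articut uses no machine learning and needs no data model; relying only on the grammar rules of modern vernacular Chinese, it achieves an F1-measure above 94% and a recall above 96% on SIGHAN 2005.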

Python
412
18 days ago

A neural network architecture for NLP tasks, using Cython for fast performance. It can currently perform POS tagging, SRL, and dependency parsing.

Python
407
3 years ago

Python version of Sudachi, a Japanese tokenizer.

Python
404
3 years ago
Python
398
10 months ago

Empower Sequence Labeling with Task-Aware Neural Language Model | a PyTorch Tutorial to Sequence Labeling

Python
366
5 years ago

Sudachi in Rust 🦀 and new generation of SudachiPy

Rust
348
3 months ago