chinese-nlp

Python
11156
1 year ago

BELLE: Be Everyone's Large Language model Engine (an open-source Chinese conversational large language model)

HTML
8125
6 months ago

A curated list of resources for Chinese NLP

7861
2 years ago

Language Technology Platform

Python
5084
17 days ago
IDEA-CCNL/Fengshenbang-LM

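Fengshenbang-LM (封神榜大模型) is an open-source large-model ecosystem led by the Cognitive Computing and Natural Language Research Center at the IDEA Research Institute, serving as infrastructure for Chinese AIGC and cognitive intelligence.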

Python
4109
8 months ago
C++
3927
4 years ago

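MNBVC (Massive Never-ending BT Vast Chinese corpus) is a very large-scale Chinese corpus that aims to match the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" (火星文) text. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.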

3825
6 days ago

fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.

Python
3128
2 years ago

Chinese-LLaMA 1&2 and Chinese-Falcon base models; ChatFlow Chinese dialogue model; Chinese OpenLLaMA model; NLP pretraining and instruction fine-tuning datasets

Python
3050
1 year ago

Chinese Named Entity Recognition with IDCNN/biLSTM+CRF, and Relation Extraction with biGRU+2ATT

Python
2256
1 year ago

An Efficient Lexical Analyzer for Chinese

Python
2061
3 years ago

A dependency parsing system open-sourced by Baidu

Python
991
2 years ago

Jcseg is a lightweight NLP framework developed in Java. It provides CJK and English segmentation based on the MMSEG algorithm, along with keyword extraction, key sentence extraction, and summary extraction based on the TextRank algorithm. Jcseg has a built-in HTTP server and search modules for Lucene, Solr, Elasticsearch, and OpenSearch.

Java
922
2 years ago

Collections of Chinese NLP corpus

Python
900
4 years ago

🍀 Another Chinese chatbot implemented in PyTorch, which is a sub-module of an intelligent work-order processing robot. 👩‍🔧

Python
890
9 months ago

An Efficient Lexical Analyzer for Chinese

C++
806
2 years ago

An open-source Chinese-English educational chat model from ICALK, East China Normal University (general-purpose base model, GPU deployment, data cleaning). With thanks to: LLaMA, MOSS, BELLE, Ziya, vLLM

Jupyter Notebook
785
6 months ago