chinese-dataset
Large Scale Chinese Corpus for NLP
A Chinese character dataset, including information about each character such as stroke count, radical, pinyin, and English definitions/synonyms.
Pretrained model for Chinese Scientific Text
A large-scale offline Chinese handwritten signature dataset
[CHABCNet] ABCNet on the Chinese dataset, building on Detectron2 (Facebook AI Research)
🧠️🖥️2️⃣️0️⃣️0️⃣️1️⃣️🔠️🔢️ The linguistic:Chinese-Traditional category for AI2001, containing Chinese (Traditional) language linguistic datasets
🧠️🖥️2️⃣️0️⃣️0️⃣️1️⃣️🔠️🔢️ The linguistic:Chinese-Simplified category for AI2001, containing Chinese (Simplified) language linguistic datasets
Text Data and Data Analysis of Chinese Spring Festival Gala Comedy Sketches Over 40 Years
Top Economics Journals Publications Dataset and Data Analysis: Top 5 English Journals and Top 3 Chinese Journals
Text Data and Data Analysis of Focus Report, a Chinese Investigative TV Program, 2003-2023
Code repository for training Taiwan-ELM models, including data preprocessing, tokenizer development, and model fine-tuning.