chinese-corpus
Large Scale Chinese Corpus for NLP
A Lite BERT for Self-Supervised Learning of Language Representations; large-scale pre-trained Chinese ALBERT models
Language Understanding Evaluation benchmark for Chinese: datasets, baselines, pre-trained models, corpus, and leaderboard
Large-scale Pre-training Corpus for Chinese (100G)
The most complete corpus of modern Chinese-language poetry on Earth: 3k+ poets, 80K+ poems, 15M+ characters
An Implementation of 'Attention is all you need' with Chinese Corpus
Pre-trained Wikipedia corpus by MITIE
Pretrained model for Chinese Scientific Text
Corpus creator for Chinese Wikipedia
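A tool for converting Sogou cell dictionaries to plain text. It extracts vocabulary lists for use in deep learning, for data generation and dictionary features
A 2019 Chinese Wikipedia corpus annotated with a 4-tag scheme, labeled using hanlp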
Predicting Audience’s Response from Sketch Comedy and Crosstalk Scripts (A Corpus Supporting Comedy Writers)
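Names that appeared in Weibo trending searches between 2020-11-24 and 2022-07-10 (mainly entertainment stars, politicians, celebrities, internet influencers, entrepreneurs, etc.)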