pretrain
Large-scale Chinese corpus for NLP
[ICLR'23 Spotlight🔥] The first successful BERT/MAE-style pretraining on any convolutional network; PyTorch implementation of "Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling"
Large-scale pre-training corpus for Chinese (100 GB)
Official implementation of "UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation"
Paper lists, notes, and slides focused on NLP. For summarization, see https://github.com/xcfcode/Summarization-Papers
BERT-CCPoem is a BERT-based pre-trained model specifically for Chinese classical poetry
BERT-based models (BERT, MTB, CP) for relation extraction.
MatDGL is a neural network package that allows researchers to train custom models for crystal modeling tasks, aiming to accelerate research and applications in materials science.
Code for the paper "Masked Frequency Modeling for Self-Supervised Visual Pre-Training" (https://arxiv.org/pdf/2206.07706.pdf)
Official code repository for the paper "Pushing the Limits of Pre-training for Time Series Forecasting in the CloudOps Domain"
[CCIR 2023] Self-supervised learning for Sequential Recommender Systems
ALBERT trained on a Mongolian text corpus
Build your own generative language model from scratch, including pretraining and SFT with LoRA (see the LoRA sketch after this list)
GraphGen: Enhancing Supervised Fine-Tuning for LLMs with Knowledge-Driven Synthetic Data Generation
Parses a Hoyoverse game text corpus from public wikis
This repository provides a code solution for Task 1 of the Data Fusion Contest
Understanding "A Lite BERT". An Transformer approach for learning self-supervised Language Models.