gpt-2

RWKV (pronounced RwaKuv) is an RNN with great LLM performance that can also be trained directly like a GPT transformer (parallelizable). The current version is RWKV-7 "Goose". It combines the best of RNNs and transformers: great performance, linear time, constant space (no kv-cache), fast training, infinite ctx_len, and free sentence embedding.

Python
13523
12 days ago

Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"

Python
11767
4 months ago

This repository contains demos I made with the Transformers library by HuggingFace.

Jupyter Notebook
10733
3 months ago

An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library.

Python
8288
3 years ago

Chinese version of GPT2 training code, using BERT tokenizer.

Python
7556
1 year ago

[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!

Jupyter Notebook
7537
1 month ago

Awesome Pretrained Chinese NLP Models: a collection of high-quality Chinese pretrained models, large models, multimodal models, and large language models

Python
5217
4 days ago
Python
3345
2 years ago
Python
3060
1 year ago

GPT2 for Chinese chitchat (implements the MMI idea from DialoGPT)

Python
3014
1 year ago

Build, customize and control your own LLMs. From data pre-processing to fine-tuning, xTuring provides an easy way to personalize open-source LLMs. Join our discord community: https://discord.gg/TgHXuSJEk6

Python
2642
7 months ago

Kashgari is a production-level NLP transfer learning framework built on top of tf.keras for text labeling and text classification. It includes Word2Vec, BERT, and GPT2 language embeddings.

Python
2391
8 months ago

Toolkit for Machine Learning, Natural Language Processing, and Text Generation, in TensorFlow. This is part of the CASL project: http://casl-project.ai/

Python
2388
4 years ago

A Large-scale Chinese Short-Text Conversation Dataset and Chinese pre-training dialog models

Python
1849
2 years ago

Guide to using pre-trained large language models of source code

Python
1822
9 months ago