gpt-2

RWKV (pronounced RwaKuv) is an RNN with great LLM performance that can also be trained directly like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". It combines the best of RNNs and transformers: great performance, linear time, constant space (no kv-cache), fast training, infinite ctx_len, and free sentence embedding.

Python
13902
10 hours ago

Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"

Python
12560
8 months ago

This repository contains demos I made with the Transformers library by HuggingFace.

Jupyter Notebook
11173
2 months ago

[NeurIPS 2024 Best Paper Award][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!

Jupyter Notebook
8353
3 months ago

An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library.

Python
8292
3 years ago

Chinese version of GPT2 training code, using BERT tokenizer.

Python
7581
1 year ago

Awesome Pretrained Chinese NLP Models: a collection of high-quality Chinese pre-trained models, large models, multimodal models, and large language models

Python
5370
20 hours ago
Python
3401
2 years ago
Python
3074
1 year ago

GPT2 for Chinese chitchat (implements the MMI idea from DialoGPT)

Python
3014
2 years ago

Build, customize and control your own LLMs. From data pre-processing to fine-tuning, xTuring provides an easy way to personalize open-source LLMs. Join our Discord community: https://discord.gg/TgHXuSJEk6

Python
2663
1 year ago

Toolkit for Machine Learning, Natural Language Processing, and Text Generation, in TensorFlow. This is part of the CASL project: http://casl-project.ai/

Python
2388
4 years ago

Kashgari is a production-level NLP transfer learning framework built on top of tf.keras for text labeling and text classification. It includes Word2Vec, BERT, and GPT2 language embeddings.

Python
2387
1 year ago

A Large-scale Chinese Short-Text Conversation Dataset and Chinese pre-training dialog models

Python
1894
2 years ago

Run llama and other large language models offline on iOS and macOS using the GGML library.

C
1852
11 days ago

Guide to using pre-trained large language models of source code

Python
1839
1 year ago