#rlhf

OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and can retrieve information dynamically to do so.

Python
37307
8 months ago
ymcui/Chinese-LLaMA-Alpaca-2

Chinese LLaMA-2 & Alpaca-2 LLMs, phase two of the project, with 64K long-context models

Python
7159
7 months ago

Official release of InternLM series (InternLM, InternLM2, InternLM2.5, InternLM3).

Python
6868
2 months ago
huggingface/alignment-handbook

Robust recipes to align language models with human and AI preferences

Python
5134
5 months ago

A curated list of reinforcement learning with human feedback resources (continually updated)

3889
2 months ago

Fine-tuning ChatGLM-6B with PEFT | Efficient ChatGLM fine-tuning based on PEFT

Python
3697
2 years ago

Align Anything: Training All-modality Model with Feedback

Jupyter Notebook
3405
4 days ago
Kiln-AI/Kiln

The easiest tool for fine-tuning LLM models, synthetic data generation, and collaborating on datasets.

Python
3390
14 hours ago
argilla-io/distilabel

Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.

Python
2640
5 days ago

Open Source Application for Advanced LLM Engineering: interact, train, fine-tune, and evaluate large language models on your own computer.

TypeScript
2001
2 days ago

An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.

Jupyter Notebook
1719
4 months ago

WebGLM: An Efficient Web-enhanced Question Answering System (KDD 2023)

Python
1587
25 days ago

[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences for Text-to-image Generation

Python
1376
3 months ago

Secrets of RLHF in Large Language Models Part I: PPO

Python
1356
1 year ago

Recipes to train reward models for RLHF.

Python
1296
2 months ago