#rlhf

OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and can retrieve information dynamically to do so.

Python
37470
1 year ago
ymcui/Chinese-LLaMA-Alpaca-2

Chinese LLaMA-2 & Alpaca-2 LLMs (phase two of the project) with 64K long-context models

Python
7173
3 months ago

Official release of InternLM series (InternLM, InternLM2, InternLM2.5, InternLM3).

Python
7063
2 months ago
huggingface/alignment-handbook

Robust recipes to align language models with human and AI preferences

Python
5386
1 month ago

Align Anything: Training All-modality Model with Feedback

Jupyter Notebook
4555
1 month ago

Open Source Application for Advanced LLM + Diffusion Engineering: interact, train, fine-tune, and evaluate large language models on your own computer.

TypeScript
4390
1 day ago

A curated list of reinforcement learning with human feedback resources (continually updated)

4151
16 days ago

Efficient fine-tuning of ChatGLM-6B with PEFT

Python
3713
2 years ago
argilla-io/distilabel

Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.

Python
2896
5 days ago

An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models

Python
1996
6 days ago

An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.

Jupyter Notebook
1868
2 months ago

WebGLM: An Efficient Web-enhanced Question Answering System (KDD 2023)

Python
1603
6 months ago

[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences for Text-to-image Generation

Python
1530
8 months ago

Recipes to train reward models for RLHF.

Python
1457
5 months ago