rlhf
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and can retrieve information dynamically in order to do so.
The official GitHub page for the survey paper "A Survey of Large Language Models".
Chinese LLaMA-2 & Alpaca-2 LLMs (phase 2 project), with 64K long-context models
Official release of InternLM series (InternLM, InternLM2, InternLM2.5, InternLM3).
Robust recipes to align language models with human and AI preferences
Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets
A curated list of reinforcement learning with human feedback resources (continually updated)
Fine-tuning ChatGLM-6B with PEFT (efficient ChatGLM fine-tuning based on PEFT)
Align Anything: Training All-modality Model with Feedback
The easiest tool for fine-tuning LLM models, synthetic data generation, and collaborating on datasets.
A Doctor for your data
Distilabel is a framework for synthetic data generation and AI feedback, built for engineers who need fast, reliable, and scalable pipelines based on verified research papers.
Open Source Application for Advanced LLM Engineering: interact, train, fine-tune, and evaluate large language models on your own computer.
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences for Text-to-image Generation