open-r1

Use PEFT or full-parameter training for CPT/SFT/DPO/GRPO of 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen2.5-VL, Qwen2.5-Omni, Qwen2-Audio, InternVL3, Ovis2.5, Llava, GLM4v, Phi4, ...) (AAAI 2025).

large-language-model lora llama sft deploy multimodal peft internvl liger qwen2-vl rft deepseek-r1 embedding grpo open-r1 megatron omni llama4 qwen3 qwen3-moe

Python

9378

825

19 hours ago

IAAR-Shanghai / xVerify

xVerify: Efficient Answer Verifier for Reasoning Model Evaluations

llm-as-a-judge benchmark evaluation regular-expression reliability ChatGPT large-language-model open-r1

Python

127

4 months ago

Exgc / R1V-Free

R1V, trained with AI feedback, answers open-ended visual questions.

open-r1 vlm

Python

4 months ago

Abhisang3 / xVerify

xVerify: Efficient Answer Verifier for Large Language Model Evaluations

benchmark ChatGPT evaluation large-language-model open-r1 reliability

Python

3 hours ago