# preference-alignment

[NeurIPS 2024] SimPO: Simple Preference Optimization with a Reference-Free Reward

Python
873
2 months ago

[Paper][ACL 2024 Findings] Knowledgeable Preference Alignment for LLMs in Domain-specific Question Answering

Python
193
10 months ago

DPO-Shift: Shifting the Distribution of Direct Preference Optimization

Python
47
2 months ago

[NeurIPS 2024] Official code of $\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$

Python
43
6 months ago

Source code for "A Dense Reward View on Aligning Text-to-Image Diffusion with Preference" (ICML'24).

Python
38
1 year ago

[ICLR 2025] Official code of "Towards Robust Alignment of Language Models: Distributionally Robustifying Direct Preference Optimization"

Python
12
1 year ago

Survey of preference alignment algorithms

0
1 year ago

Generate synthetic datasets for instruction tuning and preference alignment using tools such as `distilabel` for efficient, scalable data creation.

Jupyter Notebook
0
3 months ago