#self-rewarding

Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI

Python
1400
1 year ago

SQL-o1: A Self-Reward Heuristic Dynamic Search Method for Text-to-SQL

Python
192
4 months ago

Reinforcement Learning of Vision Language Models with Self Visual Perception Reward

Python
129
12 days ago

An official implementation of "SPARK: Synergistic Policy And Reward Co-Evolving Framework"

Python
20
6 days ago

Consistent Paths Lead to Truth: Self-Rewarding Reinforcement Learning for LLM Reasoning

Python
16
3 months ago

SeRL: Self-Play Reinforcement Learning for Large Language Models with Limited Data

Python
14
1 month ago