Repository navigation

self-rewarding

Implementation of the training framework proposed in Self-Rewarding Language Models, from Meta AI

Python

1400

1 year ago

SQL-o1: A Self-Reward Heuristic Dynamic Search Method for Text-to-SQL

Python

192

4 months ago

Reinforcement Learning of Vision Language Models with Self Visual Perception Reward

Python

129

12 days ago

An official implementation of "SPARK: Synergistic Policy And Reward Co-Evolving Framework"

Python

6 days ago

Consistent Paths Lead to Truth: Self-Rewarding Reinforcement Learning for LLM Reasoning

Python

3 months ago

SeRL: Self-Play Reinforcement Learning for Large Language Models with Limited Data

Python

1 month ago

Class-Conditional self-reward mechanism for improved Text-to-Image models

Jupyter Notebook

1 year ago