#rl

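Llama Chinese community: aggregates the latest Llama learning resources in real time and builds the best open-source ecosystem for Chinese Llama large models; fully open source and available for commercial use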

Python
14678
4 months ago

Dopamine is a research framework for fast prototyping of reinforcement learning algorithms.

Jupyter Notebook
10795
10 months ago
thu-ml/tianshou
Python
8706
14 days ago
OpenPipe/ART

Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen2.5, Qwen3, Llama, and more!

Python
5850
3 minutes ago

An implementation of the AlphaZero algorithm for Gomoku (also called Gobang or Five in a Row)

Python
3504
1 year ago

ELF: a platform for game research with AlphaGoZero/AlphaZero reimplementation

C++
3399
6 years ago
DLR-RM/rl-baselines3-zoo

A training framework for Stable Baselines3 reinforcement learning agents, with hyperparameter optimization and pre-trained agents included.

Python
2530
1 month ago

Reinforcement Learning Coach by Intel AI Lab enables easy experimentation with state of the art Reinforcement Learning algorithms

Python
2349
3 years ago
Python
2322
21 hours ago

Scalable RL solution for advanced reasoning of language models

Python
1681
5 months ago

[ICML 2017] TensorFlow code for Curiosity-driven Exploration for Deep Reinforcement Learning

Python
1448
3 years ago

Latest Advances on System-2 Reasoning

Python
1223
2 months ago
araffin/rl-baselines-zoo

A collection of 100+ pre-trained RL agents using Stable Baselines, training and hyperparameter optimization included.

Python
1179
3 years ago

Understanding R1-Zero-Like Training: A Critical Perspective

Python
1068
1 month ago

Implementation of all RL algorithms in a simpler way

Jupyter Notebook
1023
2 days ago

The open source post-building layer for agents. Our environment data and evals power agent post-training (RL, SFT) and monitoring.

Python
977
12 hours ago