self-play

suragnair / alpha-zero-general

A clean implementation based on AlphaZero for any game in any framework + tutorial + Othello/Gobang/TicTacToe/Connect4 and more

Tensorflow PyTorch Keras gobang alpha-zero alphago-zero alphago reinforcement-learning self-play mcts monte-carlo-tree-search deep-learning alphazero neural-network

Jupyter Notebook

4258

1115

9 months ago

opendilab / DI-engine

OpenDILab Decision AI Engine. The Most Comprehensive Reinforcement Learning Framework B.P.

reinforcement-learning multiagent-reinforcement-learning self-play imitation-learning inverse-reinforcement-learning exploration-exploitation distributed-system Python impala smac atari mujoco r2d2 reinforcement-learning-algorithms pytorch-rl model-based-reinforcement-learning

Python

3524

415

2 months ago

opendilab / LightZero

[NeurIPS 2023 Spotlight] LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios (awesome MCTS)

alphazero atari continuous-control monte-carlo-tree-search muzero PyTorch reinforcement-learning mcts board-game gym self-play

Python

1440

172

2 days ago

opendilab / DI-star

An artificial intelligence platform for StarCraft II with large-scale distributed training and grandmaster-level agents.

reinforcment-learning starcraft2 self-play artificial-intelligence deep-learning league deep-reinforcement-learning

Python

1297

123

7 months ago

The official implementation of Self-Play Fine-Tuning (SPIN)

deep-learning fine-tuning large-language-models self-play

Python

1201

103

1 year ago

The official implementation of Self-Play Preference Optimization (SPPO)

deep-learning fine-tuning large-language-models rlhf self-play

Python

583

47

8 months ago

inspirai / TimeChamber

A Massively Parallel Large Scale Self-Play Framework

deep-reinforcement-learning reinforcement-learning self-play multi-agent

Python

355

38

3 years ago

spiral-rl / spiral

SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning

large-language-models self-play multi-agent-reinforcement-learning reinforcement-learning

Python

151

16

17 days ago

ChuaCheowHuan / gym-continuousDoubleAuction

A custom MARL (multi-agent reinforcement learning) environment where multiple agents trade against one another (self-play) in a zero-sum continuous double auction. Ray [RLlib] is used for training.

multi-agent-reinforcement-learning gym-environment limit-order-book high-frequency-trading ray rllib financial-engineering self-play ppo quantitative-finance quantitative-trading marl lstm

Jupyter Notebook

148

31

1 month ago

Naton1 / osrs-pvp-reinforcement-learning

Train a neural network to PvP in Old School RuneScape using reinforcement learning.

artificial-intelligence deep-learning gym Java machine-learning oldschool-runescape osrs ppo Python PyTorch reinforcement-learning rsps runescape self-play

Java

131

46

2 years ago

blanyal / alpha-zero

AlphaZero implementation for Othello, Connect-Four and Tic-Tac-Toe based on "Mastering the game of Go without human knowledge" and "Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm" by DeepMind.

alphazero alpha-zero alphago-zero Tensorflow reinforcement-learning mcts self-play game deep-learning machine-learning resnet tic-tac-toe deepmind

Python

90

28

7 years ago

seungeunrho / football-paris

The exact code used by the team "liveinparis" in the Kaggle football competition, where it ranked 6th of 1141.

self-play reinforcement-learning PyTorch ppo kaggle

Python

57

12

5 years ago

cestpasphoto / alpha-zero-general

A very fast implementation of AlphaZero, applied to games like Splendor, Santorini, The Little Prince, … Browser version available

alphago alphago-zero alphazero Python PyTorch reinforcement-learning numba self-play

Python

54

18

2 days ago

dellalibera / gym-backgammon

Backgammon OpenAI Gym

gym reinforcement-learning self-play game openai-gym artificial-intelligence

Python

50

17

1 year ago

dellalibera / td-gammon

TD-Gammon implementation

artificial-intelligence reinforcement-learning neural-network PyTorch convolutional-neural-networks self-play game

Python

48

13

2 years ago

Sebastian-Schuchmann / Self-Play-TicTacToe-AI-ML-Agents-

A self-play reinforcement learning agent that learns to play TicTacToe using the ML-Agents framework in Unity.

artificial-intelligence machine-learning reinforcement-learning ml-agents Unity self-play neural-network Tensorflow

C#

37

9

3 years ago

tobiasemrich / SchafkopfRL

AI agents for the Bavarian card game Schafkopf, trained with reinforcement learning

collectible-card-game ppo reinforcement-learning self-play PyTorch

Python

37

6

25 days ago

ShibiHe / Model-Free-Episodic-Control

An implementation of the paper "Model Free Episodic Control"

openai-gym deep knn NumPy self-play game-theory

Python

36

10

6 years ago

arianahejazyan / Athena

A UCI-compatible four-player chess engine powered by deep RL and 256-bit bitboards.

artificial-intelligence chess chess-engine deep-learning neural-networks C++ game-development reinforcement-learning chess-ai deep-rl uci self-play

C++

31

5

3 months ago

sirmammingtonham / alphastone

Using self-play, MCTS, and a deep neural network to create a Hearthstone AI player

alpha-zero self-play monte-carlo-tree-search deep-learning deep-reinforcement-learning PyTorch hearthstone artificial-intelligence

Python

29

7

7 years ago