mcts

hijkzzz / Awesome-LLM-Strawberry

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.

chain-of-thought Code large-language-model math mcts openai-o1 strawberry reinforcement-learning

6684

368

5 days ago

suragnair / alpha-zero-general

A clean implementation based on AlphaZero for any game in any framework + tutorial + Othello/Gobang/TicTacToe/Connect4 and more

Tensorflow PyTorch Keras gobang alpha-zero alphago-zero alphago reinforcement-learning self-play mcts monte-carlo-tree-search deep-learning alphazero neural-network

Jupyter Notebook

4106

1085

4 months ago

junxiaosong / AlphaZero_Gomoku

An implementation of the AlphaZero algorithm for Gomoku (also called Gobang or Five in a Row)

alphazero mcts alphago-zero gobang monte-carlo-tree-search alphago reinforcement-learning rl board-game self-learning PyTorch Tensorflow

Python

3460

986

1 year ago

werner-duvaud / muzero-general

MuZero

muzero reinforcement-learning alphazero PyTorch Python self-learning monte-carlo-tree-search deep-learning deep-reinforcement-learning neural-network rl tensorboard gym mcts alphago machine-learning

Python

2617

639

8 months ago

opendilab / LightZero

[NeurIPS 2023 Spotlight] LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios (awesome MCTS)

alphazero atari continuous-control monte-carlo-tree-search muzero PyTorch reinforcement-learning mcts board-game gym self-play

Python

1342

151

3 days ago

zzli2022 / Awesome-System2-Reasoning-LLM

Latest Advances on System-2 Reasoning

benchmark mcts o1 prm reasoning rl

Python

940

3 days ago

yaotingwangofficial / Awesome-MCoT

Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey

chain-of-thought cot deepseek-r1 instruction-tuning large-vision-language-model multimodal multimodal-chain-of-thought multimodal-large-language-models openai-o1 reasoning survey mcts

472

2 days ago

chauvinSimon / My_Bibliography_for_Research_on_Autonomous_Driving

Personal notes about scientific and research works on "Decision-Making for Autonomous Driving"

reinforcement-learning inverse-reinforcement-learning planning model-based-reinforcement-learning decision-making game-theory mcts prediction bibliography carla imitation-learning end-to-end interaction risk-assessment

453

101

4 years ago

s-casci / tinyzero

Easily train AlphaZero-like agents on any environment you want!

alphazero mcts reinforcement-learning

Python

429

1 year ago

hrpan / tetris_mcts

MCTS project for Tetris

reinforcement-learning mcts tetris deep-learning game tetris-bots

Python

346

6 months ago

dylandjian / SuperGo

A student implementation of Alpha Go Zero

alphago-zero alphago reinforcement-learning PyTorch mcts Python machine-learning

Python

280

7 years ago

DataCanvasIO / Hypernets

A General Automated Machine Learning framework to simplify the development of End-to-end AutoML toolkits in specific domains.

neural-architecture-search hyperparameter-optimization hyperparameter-tuning evolutionary-algorithms monte-carlo-tree-search automl autodl reinforcement-learning mcts nas Keras

Python

267

4 days ago

QueensGambit / CrazyAra

A Deep Learning UCI-Chess Variant Engine written in C++ & Python 🦜

Python chess-engine deep-learning artificial-intelligence convolutional-neural-network mcts alphazero mxnet gluon Open Source machine-learning lichess alphago

Jupyter Notebook

263

25 days ago

vgarciasc / mcts-viz

Visualization of MCTS algorithm applied to Tic-tac-toe.

mcts visualization p5js

JavaScript

234

4 years ago

sungyubkim / Deep_RL_with_pytorch

A PyTorch tutorial for DRL (Deep Reinforcement Learning)

deep-reinforcement-learning PyTorch dqn a2c ppo soft-actor-critic mcts

Jupyter Notebook

211

2 years ago

initial-h / AlphaZero_Gomoku_MPI

An asynchronous/parallel method of AlphaGo Zero algorithm with Gomoku

alphazero parallel Tensorflow alphago mcts tensorlayer tree-search algorithm deep-reinforcement-learning

Python

203

2 months ago

thuxugang / doudizhu

AI for Dou Dizhu (Fight the Landlord)

artificial-intelligence collectible-card-game dqn reinforcement-learning doudizhu mcts

Python

183

7 years ago

kaesve / muzero

A clean implementation of MuZero and AlphaZero following the AlphaZero General framework. Train and Pit both algorithms against each other, and investigate reliability of learned MuZero MDP models.

muzero alphazero reinforcement-learning Tensorflow tensorflow2 mcts tf2 deep-learning deep-reinforcement-learning

Jupyter Notebook

157

4 years ago

zjeffer / chess-deep-rl

Research project: create a chess engine using Deep Reinforcement Learning

chess alphazero reinforcement-learning artificial-intelligence neural-network mcts deep-learning machine-learning deep-reinforcement-learning neural-networks chess-engine

Jupyter Notebook

140

10 months ago

akolishchak / doom-net-pytorch

Reinforcement learning models in ViZDoom environment

PyTorch vizdoom reinforcement-learning doom agent learning ppo mcts behavior-tree

Python

133

3 years ago