joshvarty / alphazerosimple
The absolute most basic example of AlphaZero and Monte Carlo Tree Search I could come up with
License: MIT License
Mr. Varty,
Upon executing "python main.py", I received an error stating, "KeyError: 'num_simulations'". I added "num_simulations: 100" to main.py's argument dictionary, which enabled training. Is this appropriate?
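One way to make a missing key like this fail less abruptly is to merge the caller's settings over a set of defaults before the trainer reads them. The sketch below is only an illustration: `DEFAULT_ARGS`, `build_args`, and the `batch_size` key are assumptions, not the repository's actual configuration. Only `num_simulations` and its value of 100 come from the report above.

```python
# Hypothetical defaults; only 'num_simulations': 100 is taken from the report above.
DEFAULT_ARGS = {
    "num_simulations": 100,  # assumed to be the number of MCTS simulations per search
}

def build_args(overrides=None):
    """Return a new dict holding the defaults, updated by any caller overrides."""
    args = dict(DEFAULT_ARGS)
    args.update(overrides or {})
    return args

# 'batch_size' is an illustrative key, not necessarily one the repository uses.
args = build_args({"batch_size": 64})
print(args["num_simulations"])
```

With this pattern, a lookup such as `args["num_simulations"]` succeeds even when the caller's dictionary omits the key.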
Hey,
First of all, thanks for the video, the post, and the code. I find it really underrated, having gone over almost every resource on AZ. I am still reviewing your code, but I wanted to open this issue as soon as possible. Even though you explain the concepts elsewhere, the code itself is too sparse on explanations; it would be much better if you could add comments that are as detailed as possible (for people like me who occasionally have trouble following along). If not, I might have time in a couple of weeks to do it myself, if you want (though that would still require you to review it, so it is probably better if you add them yourself).
Again, thanks a lot for the effort!
Hi,
From the code, I see that trainer.learn() calls self.exceute_episode(), which in turn calls self.mcts.run, which then calls model.predict(state). Is it intentional to predict the policy and value before training the network?
Hi Josh,
In order to better understand the code, I tested it with plain PyTorch without CUDA, but it failed with AssertionError: Torch not compiled with CUDA enabled, raised from File trainer.py, line 83, in train: boards = boards.contiguous().cuda(). Is it sufficient to simply comment out the three lines ending in .cuda(), starting from boards = boards.contiguous().cuda()?
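A common alternative to commenting out the `.cuda()` calls is to pick a device once and move tensors to it with `.to(device)`. The following is a minimal sketch of that pattern, not the repository's code; the `boards` tensor and its shape are stand-ins chosen for illustration.

```python
import torch

# Use the GPU only when PyTorch can actually see one; otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in for a batch of training boards (the shape here is illustrative only).
boards = torch.zeros(8, 4)

# .to(device) takes the place of the hard-coded .cuda() call.
boards = boards.contiguous().to(device)
print(boards.device.type)
```

For this to work in a training loop, the model and the target tensors would need to be moved to the same device in the same way.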
Mr. Varty,
Would you be willing to provide scripts or ideas for self-play, console-IO play, and/or 2D games?
Tom Lever
1/500
Traceback (most recent call last):
  File "main.py", line 27, in <module>
    trainer.learn()
  File "/home/uu/decy5/inne_nz/AlphaZeroSimple/trainer.py", line 61, in learn
    self.train(train_examples)
  File "/home/uu/decy5/inne_nz/AlphaZeroSimple/trainer.py", line 83, in train
    boards = boards.contiguous().cuda()
  File "/home/uu/.local/lib/python3.8/site-packages/torch/cuda/__init__.py", line 172, in _lazy_init
    torch._C._cuda_init()
RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx
I do not have an NVIDIA GPU.
Setting it up to run on the CPU does not work either.
I've misnamed this function. It should be called Backup().
In your blog, you emphasize: "We record the state and the probabilities produced by the MCTS". Do you mean that we record the board state, the priors, and the values? For reference, Trainer.exceute_episode contains: ret.append((hist_state, hist_action_probs, reward * ((-1) ** (hist_current_player != current_player))))
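The third element of the tuple quoted above can be read in isolation. The sketch below wraps that exact expression in a hypothetical helper, `outcome_for` (a name introduced here for illustration), to show that it yields the final reward with its sign flipped for positions where the other player was to move. Read this way, the stored value looks like a game outcome rather than a network prediction, though that interpretation is inferred from the expression alone.

```python
def outcome_for(hist_current_player, current_player, reward):
    """Flip the sign of the final reward when the recorded player differs."""
    # In Python, True == 1 and False == 0, so (-1) ** True is -1 and (-1) ** False is 1.
    return reward * ((-1) ** (hist_current_player != current_player))

# Suppose the game ends with reward +1 from the perspective of player 1.
print(outcome_for(1, 1, 1))    # same player: sign kept
print(outcome_for(-1, 1, 1))   # other player: sign flipped
```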
We don't need to make the Monte Carlo Search Tree object live inside of the trainer. It can exist as a local inside execute_episode().
Hi there,
I found your project on YouTube and it is such a good explanation of the AlphaZero algorithm. Thank you very much for that! :)
I was wondering if there is some example code showing how to set up the Kaggle environment to play against the agent.
I know you do something like this:
from kaggle_environments import make
# Setup a tictactoe environment.
env = make("tictactoe")
# Basic agent which marks the first available cell.
def my_agent(obs):
    return [c for c in range(len(obs.board)) if obs.board[c] == 0][0]
# Run the basic agent against a default agent which chooses a "random" move.
env.run([my_agent, "random"])
# Render an html ipython replay of the tictactoe game.
env.render(mode="ipython")
But I am currently not 100% sure how to provide the trained connect2 agent, especially with the latest.pth file.
Cheers & thanks again,
Florentin
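One possible shape for such an agent is a small wrapper that turns any function returning per-cell scores into a callable taking `obs`, as in the snippet above. This is a sketch under stated assumptions: `make_agent` and `dummy_policy` are hypothetical names, `dummy_policy` stands in for a trained network, and loading latest.pth into the repository's model is deliberately left out because its model API is not described here.

```python
from types import SimpleNamespace

def make_agent(policy_fn):
    """Wrap a policy function as an agent that takes `obs` and returns a cell index."""
    def agent(obs):
        board = list(obs.board)
        probs = policy_fn(board)  # one score per cell
        legal = [c for c in range(len(board)) if board[c] == 0]
        # Choose the empty cell with the highest score.
        return max(legal, key=lambda c: probs[c])
    return agent

# Hypothetical stand-in for a trained network: prefers the centre cell.
def dummy_policy(board):
    return [0.5 if c == len(board) // 2 else 0.1 for c in range(len(board))]

my_agent = make_agent(dummy_policy)

# Local check with a fake observation, so no Kaggle environment is required.
print(my_agent(SimpleNamespace(board=[0] * 9)))
```

An agent built this way could be passed to `env.run([my_agent, "random"])` in the same manner as the basic agent above, with `dummy_policy` replaced by a function that queries the trained model.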