cestpasphoto / alpha-zero-general
A very fast implementation of AlphaZero, applied to games like Splendor, Santorini, The Little Prince, … Browser version available
License: MIT License
Hi, this looks like a fantastic implementation of AlphaZero for Splendor—thanks for making it. I had a few questions:
Is there any new theory involved, or does the usual MCTS + NN approach work just fine? If the latter, is there anything special you have to do to handle these different gameplay elements?
Thank you!
I have implemented token exchange (all 406 ways except gold return) based on this repository and modified the environment to be more similar to the actual Splendor.
https://github.com/kuboyoo/alpha-zero-general
I would like to compare the strength of the model (model.onnx {cpuct=1.0, fpu=0.1, numMCTSSims=6400}) in this repository with my model re-trained in the modified environment.
Would you be willing to share the .pt file before conversion to .onnx?
I tried to play Splendor using the command from the tutorial (I first changed the package imports):
python ./pit.py splendor/pretrained_2players.pt human -n 1
But I got the following error:
File "D:\programs\Python\Python311\Lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 483, in _create_inference_session
sess.initialize_session(providers, provider_options, disabled_optimizers)
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Node (MatMulBnFusion_Gemm) Op (Gemm) [ShapeInferenceError] First input does not have rank 2
So I figured maybe it's due to the mentioned issue "Ongoing code/features rework, some pretrained networks won't work anymore". So I reverted to the version of 30/1/2024, to no avail. Then I decided to run the training myself first, using the example from the tutorial (I had to add -V 85, though; otherwise it complained about version 1 not existing):
python main.py -m 800 -e 1000 -i 5 -F -c 2.5 -f 0.1 -T 10 -b 32 -l 0.0003 -p 1 -D 0.3 -C ../results/mytest -V 85
But now I got the following error:
File "D:\programs\Python\Python311\Lib\site-packages\torch\nn\modules\module.py", line 1688, in __getattr__
raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
AttributeError: 'SplendorNNet' object has no attribute 'first_layer'
Heya, thank you for your awesome additions to alpha-zero!
I tried to run your code, but unfortunately ran into some errors that look similar to the ones in #3.
First, I updated all the dependencies:
pip3 install -U onnxruntime numba tqdm colorama coloredlogs
pip3 install -U torch torchvision --extra-index-url https://download.pytorch.org/whl/cpu
In particular, I'm using:
colorama 0.4.6
coloredlogs 15.0.1
numba 0.59.1
onnxruntime 1.17.3
torch 2.3.0+cpu
torchvision 0.18.0+cpu
tqdm 4.66.4
And then tried the commands from the readme:
python ./pit.py splendor splendor/pretrained_2players.pt human -n 1
which still printed the initial game-board, but then threw:
Turn 1 Player 0: Traceback (most recent call last):
File "D:\alpha-zero-general\pit.py", line 252, in <module>
main()
File "D:\alpha-zero-general\pit.py", line 246, in main
play(args)
File "D:\alpha-zero-general\pit.py", line 71, in play
result = arena.playGames(args.num_games, initial_state=args.state, verbose=args.display or human)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\alpha-zero-general\Arena.py", line 123, in playGames
gameResult = self.playGame(verbose=verbose, initial_state=initial_state, other_way=not one_vs_two)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\alpha-zero-general\Arena.py", line 74, in playGame
action = players[curPlayer](canonical_board, it)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\alpha-zero-general\pit.py", line 59, in <lambda>
player = lambda x, n: np.argmax(mcts.getActionProb(x, temp=(0.5 if n <= 6 else 0.), force_full_search=True)[0])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\alpha-zero-general\MCTS.py", line 65, in getActionProb
self.search(canonicalBoard, dirichlet_noise=dir_noise, forced_playouts=forced_playouts)
File "D:\alpha-zero-general\MCTS.py", line 144, in search
Ps, v = self.nnet.predict(canonicalBoard, Vs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\alpha-zero-general\GenericNNetWrapper.py", line 100, in predict
self.switch_target('inference')
File "D:\alpha-zero-general\GenericNNetWrapper.py", line 290, in switch_target
self.export_and_load_onnx()
File "D:\alpha-zero-general\GenericNNetWrapper.py", line 338, in export_and_load_onnx
self.ort_session = ort.InferenceSession(temporary_file, sess_options=opts, providers=['CPUExecutionProvider'])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Python\Python312\Lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 419, in __init__
self._create_inference_session(providers, provider_options, disabled_optimizers)
File "C:\Python\Python312\Lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 483, in _create_inference_session
sess.initialize_session(providers, provider_options, disabled_optimizers)
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Node (MatMulBnFusion_Gemm) Op (Gemm) [ShapeInferenceError] First input does not have rank 2
python main.py splendor -m 800 -f 0.1 -l 0.0003 -D 0.3 -C ../results/mytest -V 74
Yielded:
Traceback (most recent call last):
File "C:\Python\Python312\Lib\threading.py", line 1073, in _bootstrap_inner
self.run()
File "C:\Python\Python312\Lib\threading.py", line 1010, in run
self._target(*self._args, **self._kwargs)
File "D:\alpha-zero-general\GenericNNetWrapper.py", line 142, in predict_server
self.switch_target('inference')
File "D:\alpha-zero-general\GenericNNetWrapper.py", line 290, in switch_target
self.export_and_load_onnx()
File "D:\alpha-zero-general\GenericNNetWrapper.py", line 338, in export_and_load_onnx
self.ort_session = ort.InferenceSession(temporary_file, sess_options=opts, providers=['CPUExecutionProvider'])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Python\Python312\Lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 419, in __init__
self._create_inference_session(providers, provider_options, disabled_optimizers)
File "C:\Python\Python312\Lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 483, in _create_inference_session
sess.initialize_session(providers, provider_options, disabled_optimizers)
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Node (MatMulBnFusion_Gemm) Op (Gemm) [ShapeInferenceError] First input does not have rank 2
Unrelated, but I think the game argument splendor is missing in the readme, line 108.
Hi, I have recently stumbled upon this repository and am going through the code to better understand Alpha Zero.
One odd thing I noticed is that a new OneCycleLR scheduler is created each time the model's training function is called. Since this happens at every iteration, the learning rate probably ends up very bumpy. The scheduler was designed with supervised learning in mind, where the training process is more straightforward.
At the same time, the model seems to learn to play very well, so it cannot be that bad.
Do you have any insights on why it works? Or maybe you have a graph of the learning rates throughout training to illustrate what happens?
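To make the concern concrete, the sketch below traces the learning rate when a fresh one-cycle schedule is started at every training iteration. It is a pure-Python approximation of a one-cycle schedule (cosine warm-up followed by cosine decay), not PyTorch's OneCycleLR itself and not this repository's code. The function names, the step counts, and the defaults pct_start=0.3, div_factor=25 and final_div_factor=1e4 are assumptions for illustration; the peak of 0.0003 simply reuses the value passed via -l in the commands above, on the assumption that it acts as the maximum learning rate.

```python
import math

def one_cycle_lr(step, total_steps, max_lr,
                 pct_start=0.3, div_factor=25.0, final_div_factor=1e4):
    """Approximate one-cycle schedule: cosine warm-up to max_lr, then cosine decay."""
    initial_lr = max_lr / div_factor
    min_lr = initial_lr / final_div_factor
    warmup_steps = max(1, int(pct_start * total_steps))
    if step < warmup_steps:
        start, end = initial_lr, max_lr
        pct = step / warmup_steps
    else:
        start, end = max_lr, min_lr
        pct = (step - warmup_steps) / max(1, total_steps - 1 - warmup_steps)
    # Cosine interpolation from start (pct=0) to end (pct=1).
    return end + (start - end) / 2.0 * (math.cos(math.pi * pct) + 1.0)

def lr_trace(num_iterations, steps_per_iteration, max_lr):
    """Learning rates seen when the schedule is re-created for every iteration."""
    trace = []
    for _ in range(num_iterations):
        for step in range(steps_per_iteration):
            trace.append(one_cycle_lr(step, steps_per_iteration, max_lr))
    return trace

trace = lr_trace(num_iterations=3, steps_per_iteration=10, max_lr=0.0003)
for i, lr in enumerate(trace):
    print(f"step {i:2d}  lr={lr:.2e}")
```

Under these assumptions the trace is a sawtooth: each iteration warms up, peaks, decays to a very small value, and then jumps back up when the next iteration begins. That resembles a schedule with warm restarts more than a single smooth cycle over the whole run.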