kevaday / alphazero-general

A fast, generalized, and modified implementation of DeepMind's distinguished AlphaZero in PyTorch.

License: MIT
I'm training gobang (10x10 board, connect 5) and I got this error:
FloatingPointError: underflow encountered in cast
Exception ignored in: 'alphazero.MCTS.MCTS._add_root_noise'
Traceback (most recent call last):
File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
self.run()
Training is not interrupted, but I don't know what will happen. I'm training with Python 3.8.
When I run the command `python -m AlphaZeroGUI.main`, I get this error:
File "/home/zbf/Desktop/git/github/alphazero-general/AlphaZeroGUI/_gui.py", line 86, in __init__
self.setWindowFlags(QtCore.Qt.WindowMinimizeButtonHint | QtCore.Qt.WindowCloseButtonHint)
TypeError: 'PySide2.QtCore.Qt.WindowType' object cannot be interpreted as an integer
The code is:

```python
class Ui_FormMainMenu(QtWidgets.QMainWindow):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.setWindowFlags(QtCore.Qt.WindowMinimizeButtonHint | QtCore.Qt.WindowCloseButtonHint)
        self.setupUi(self)

    def setupUi(self, FormMainMenu):
        FormMainMenu.setObjectName("FormMainMenu")
        FormMainMenu.resize(876, 647)
        # ......
```
I can't fix this bug. I'm running it on Python 3.11.
I keep getting model gating for version 4 even though the win rate against the past model is 114/45.
Is this a bug or is this intended? It seemingly plays well against the MCTS player, with a model-vs-baseline rate of 154/18. Not sure what this means.
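For what it's worth, if gating is a simple win-fraction check (the threshold value below is a hypothetical example, not the repo's actual default), a 114/45 record should comfortably pass:

```python
wins, losses = 114, 45  # record against the past model, from above
win_rate = wins / (wins + losses)  # draws ignored in this sketch
threshold = 0.52  # hypothetical gating threshold
print(round(win_rate, 3), win_rate > threshold)  # prints: 0.717 True
```

So if the new model is still being rejected, gating may be counting draws or using a different statistic.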
I don't have a fancy GPU on this computer. I used these commands to create my environment:
conda create --name agz_kevaday
conda activate agz_kevaday
conda install pytorch torchvision torchaudio cpuonly -c pytorch
conda install -c anaconda numpy cython
conda install -c conda-forge tensorboard tensorboardx choix
I navigated to the main alphazero-general directory and then executed this:
python -m alphazero.envs.tictactoe.train
Here is a names-redacted version of the output:
Because of batching, it can take a long time before any games finish.
------ITER 1------
Warmup: random policy and value
Traceback (most recent call last):
File "C:\Users\_redacted_\Anaconda3\envs\agz_kevaday\lib\runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\_redacted_\Anaconda3\envs\agz_kevaday\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "C:\Users\_redacted_\Documents\Personal\alphazero-general\alphazero\envs\tictactoe\train.py", line 32, in <module>
c.learn()
File "C:\Users\_redacted_\Documents\Personal\alphazero-general\alphazero\Coach.py", line 180, in learn
self.generateSelfPlayAgents()
File "C:\Users\_redacted_\Documents\Personal\alphazero-general\alphazero\Coach.py", line 226, in generateSelfPlayAgents
self.input_tensors[i].pin_memory()
RuntimeError: Pinned memory requires CUDA. PyTorch splits its backend into two shared libraries: a CPU library and a CUDA library; this error has occurred because you are trying to use some CUDA functionality, but the CUDA library has not been loaded by the dynamic linker for some reason. The CUDA library MUST be loaded, EVEN IF you don't directly use any symbols from the CUDA library! One common culprit is a lack of -INCLUDE:?warp_size@cuda@at@@YAHXZ in your link arguments; many dynamic linkers will delete dynamic library dependencies if you don't depend on any of their symbols. You can check if this has occurred by using link on your binary to see if there is a dependency on *_cuda.dll library.
Given the error, I think the code doesn't work for CPU-only setups; it seems to be saying "CUDA required".
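Until that's handled in Coach.py itself, one workaround sketch is to skip pinning when CUDA isn't available (this assumes `pin_memory()` is the only CUDA-dependent call on that path; `maybe_pin` is a name I made up, not a function in the repo):

```python
import torch

def maybe_pin(tensor: torch.Tensor) -> torch.Tensor:
    # pin_memory() needs the CUDA runtime loaded; on CPU-only installs
    # just return the tensor unchanged
    return tensor.pin_memory() if torch.cuda.is_available() else tensor
```

Then `self.input_tensors[i].pin_memory()` would become `maybe_pin(self.input_tensors[i])`.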
Could it be possible to have multiple machines all generate training games over the same network and send them to a "master" machine, which would train a new version of the model on that data, send the new model (after applying gating if enabled) back to the other machines, and start over again? I'm 99% sure I can implement this fairly easily using sockets, but I need to know a few things.
I am reluctant to use Ray for this since it sounds like overkill for a simple task of generation and file transfer. Scheduling would be very straightforward: count how many samples have been transferred and, once they reach a threshold of say 1 million, tell the other machines to stop generating and start training on the master machine. After that, send the (gated) model back and have the machines run baseline testing if needed.
I think this is basically what Lc0 does, but with distributed training as well, which would probably need Ray.
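For the transfer part, a length-prefixed pickle protocol over plain sockets is probably enough. A minimal sketch (the function names and wire format here are my own, not anything in the repo):

```python
import pickle
import socket
import struct

def send_msg(sock: socket.socket, obj) -> None:
    # 4-byte big-endian length prefix, then the pickled payload
    data = pickle.dumps(obj)
    sock.sendall(struct.pack("!I", len(data)) + data)

def recv_exact(sock: socket.socket, n: int) -> bytes:
    # recv() may return fewer bytes than asked for, so loop until done
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed mid-message")
        buf += chunk
    return buf

def recv_msg(sock: socket.socket):
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return pickle.loads(recv_exact(sock, length))
```

The master would accept worker connections, call `recv_msg` in a loop while counting samples, and `send_msg` the gated weights back once the threshold is reached. (Pickle over a socket should only be used between machines you trust, since unpickling is code execution.)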
Dear Kevi Aday,
Thanks for publishing your code. There are some errors when I try to train the connect4 game. Can you help me solve this error?
Thank you in advance!
Albert,
PITTING AGAINST BASELINE: RawMCTSPlayer
Traceback (most recent call last):
  File "D:\ana3\envs\muzero\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "D:\ana3\envs\muzero\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "D:\88Projects\alphazero-general\alphazero\envs\connect4\train.py", line 58, in <module>
    c.learn()
  File "D:\88Projects\alphazero-general\alphazero\Coach.py", line 267, in learn
    self.compareToBaseline(self.model_iter)
  File "D:\88Projects\alphazero-general\alphazero\Coach.py", line 148, in wrapper
    ret = func(self, *args, **kwargs)
  File "D:\88Projects\alphazero-general\alphazero\Coach.py", line 589, in compareToBaseline
    self.arena = Arena(players, self.game_cls, use_batched_mcts=can_process, args=self.args)
  File "alphazero\Arena.pyx", line 66, in alphazero.Arena._set_state.decorator.wrapper
    ret = func(self, *args, **kwargs)
  File "alphazero\Arena.pyx", line 108, in alphazero.Arena.Arena.__init__
    self.players = players
  File "alphazero\Arena.pyx", line 129, in alphazero.Arena.Arena.players
    self.__check_players_valid()
  File "alphazero\Arena.pyx", line 132, in genexpr
    if self.use_batched_mcts and not all(p.player.supports_process() for p in self.players):
AttributeError: 'function' object has no attribute 'supports_process'
In your implementation of the Othello game, there is no logic to handle passing a turn when a player has no valid moves. I added this logic to the win_state method as follows:
```python
def win_state(self) -> np.ndarray:
    result = [False] * (NUM_PLAYERS + 1)
    player = self._player_range()
    has_legal_moves_player = self._board.has_legal_moves(player)
    has_legal_moves_reverse_player = self._board.has_legal_moves(-player)
    if not has_legal_moves_player and has_legal_moves_reverse_player:
        self._update_turn()
        return np.array(result, dtype=np.uint8)
    if not has_legal_moves_player and not has_legal_moves_reverse_player:
        diff = self._board.count_diff(player)
        if diff > 0:
            result[self.player] = True
        elif diff < 0:
            result[self._next_player(self.player)] = True
        else:
            result[NUM_PLAYERS] = True
    return np.array(result, dtype=np.uint8)
```
After adding this logic, I started encountering a division by zero error in the probs method in MCTS.pyx on the line:
probs = (counts / np.sum(counts)) ** (1.0 / temp)
where np.sum(counts) becomes zero.
I tried to fix this by modifying the code as follows:

```python
total_count = np.sum(counts)
if total_count == 0:
    # fall back to a uniform distribution when no child was visited;
    # counts must be a float array here, otherwise full_like would
    # truncate 1.0 / len(counts) to zero
    return np.full_like(counts, 1.0 / len(counts))
probs = (counts / total_count) ** (1.0 / temp)
probs /= np.sum(probs)
return probs
```
However, after this change, I encounter another error in the update_root method in MCTS.pyx:
raise ValueError(f'Invalid action encountered while updating root: {c.a}')
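A guess at the cause (I haven't traced update_root myself): the uniform fallback assigns probability to every action, including invalid ones, and sampling one of those would trigger exactly this ValueError. A sketch that restricts the fallback to the valid-action mask (the helper name is mine, not the repo's API):

```python
import numpy as np

def uniform_over_valid(counts: np.ndarray, valids: np.ndarray) -> np.ndarray:
    # Uniform distribution over valid actions only; invalid actions get 0.
    probs = np.zeros(len(counts), dtype=np.float64)
    valid_idx = np.flatnonzero(valids)
    probs[valid_idx] = 1.0 / len(valid_idx)
    return probs
```

In the probs method this would replace the `np.full_like(...)` fallback, assuming the valid-move mask for the root state is available there.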
Hi,
I'm struggling to run the training on macOS, while on Windows everything works just fine.
The following error is raised when I run `python -m alphazero.envs.tictactoe.train` in a macOS terminal.
Traceback (most recent call last):
File "/Users/User/anaconda3/envs/alphazero/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/Users/User/anaconda3/envs/alphazero/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/Users/User/Projects/alphazero-general/alphazero/envs/tictactoe/train.py", line 37, in <module>
c.learn()
File "/Users/User/Projects/alphazero-general/alphazero/Coach.py", line 250, in learn
self.saveIterationSamples(self.model_iter)
File "/Users/User/Projects/alphazero-general/alphazero/Coach.py", line 146, in wrapper
ret = func(self, *args, **kwargs)
File "/Users/User/Projects/alphazero-general/alphazero/Coach.py", line 365, in saveIterationSamples
num_samples = self.file_queue.qsize()
File "/Users/User/anaconda3/envs/alphazero/lib/python3.9/multiprocessing/queues.py", line 126, in qsize
return self._maxsize - self._sem._semlock._get_value()
NotImplementedError
/Users/User/anaconda3/envs/alphazero/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 10 leaked semaphore objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '
I think the issue is that multiprocessing.Queue.qsize() is not implemented on macOS, because sem_getvalue is unsupported there.
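A workaround sketch that sidesteps qsize() entirely by tracking the size in a shared counter (the class name is mine, not something in the repo):

```python
import multiprocessing as mp

class CountedQueue:
    """Queue wrapper that tracks its size in a shared counter, for
    platforms (like macOS) where Queue.qsize() raises NotImplementedError."""

    def __init__(self):
        self._queue = mp.Queue()
        self._size = mp.Value("i", 0)

    def put(self, item):
        with self._size.get_lock():
            self._size.value += 1
        self._queue.put(item)

    def get(self, *args, **kwargs):
        item = self._queue.get(*args, **kwargs)
        with self._size.get_lock():
            self._size.value -= 1
        return item

    def qsize(self):
        return self._size.value
```

Coach.py's file_queue could use this in place of a plain mp.Queue, so `self.file_queue.qsize()` in saveIterationSamples keeps working. The count is approximate under heavy concurrency (put increments before the item is enqueued), which should be fine for a progress estimate.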
Here is the text for that error:
Traceback (most recent call last):
File "/home/<username>/anaconda3/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/<username>/anaconda3/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/<username>/<useful_directory>/AlphaZeroGUI/main.py", line 2, in <module>
from PySide2.QtWidgets import QApplication, QMessageBox, QInputDialog, QTableWidgetItem, QLineEdit
ImportError: /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2: symbol krb5_ser_context_init version krb5_3_MIT not defined in file libkrb5.so.3 with link time reference
It looks like requirements.txt may need an entry for PySide2.
I spun up a conda environment, manually went through the requirements, and now I get a GUI interface!
Hey, I just pulled your repo and have a custom game interface. I get a segfault after appending agents, with plenty of RAM available, ulimit at 32k, and stack and recursion sizes maxed out. It worked fine on bhasconnect's original repo. Any ideas as to what could be causing it?
Hi,
Attempting to use this as instructed (subclassing GameState) seems to work at first, but during the self-play step the console is filled with the same errors:
Exception ignored in: 'alphazero.MCTS.MCTS.update_root'
Traceback (most recent call last):
File "/home/tyto/miniconda3/envs/torch/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
self.run()
ValueError: Invalid action encountered while updating root: 155
ValueError: Invalid action encountered while updating root: 91