kevaday / alphazero-general

A fast, generalized, and modified implementation of DeepMind's distinguished AlphaZero in PyTorch.

License: MIT
I'm training gobang (10x10 board, connect 5) and I got this error:
FloatingPointError: underflow encountered in cast
Exception ignored in: 'alphazero.MCTS.MCTS._add_root_noise'
Traceback (most recent call last):
File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
self.run()
Training is not interrupted, but I don't know what will happen. I'm training with Python 3.8.
When I run the command `python -m AlphaZeroGUI.main`, I get this error:
File "/home/zbf/Desktop/git/github/alphazero-general/AlphaZeroGUI/_gui.py", line 86, in __init__
self.setWindowFlags(QtCore.Qt.WindowMinimizeButtonHint | QtCore.Qt.WindowCloseButtonHint)
TypeError: 'PySide2.QtCore.Qt.WindowType' object cannot be interpreted as an integer
The code is:

```python
class Ui_FormMainMenu(QtWidgets.QMainWindow):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.setWindowFlags(QtCore.Qt.WindowMinimizeButtonHint | QtCore.Qt.WindowCloseButtonHint)
        self.setupUi(self)

    def setupUi(self, FormMainMenu):
        FormMainMenu.setObjectName("FormMainMenu")
        FormMainMenu.resize(876, 647)
        # ......
```
I can't fix this bug. I'm running it on Python 3.11.
I keep getting model gating for version 4 even though the win rate against the past model is 114/45.
Is this a bug or is this intended? It seemingly plays well against the MCTS player, with a model-vs-baseline rate of 154/18. Not sure what this means.
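For what it's worth, if gating is a simple win-fraction check (the threshold value below is a hypothetical example, not the repo's actual default), a 114/45 record should comfortably pass:

```python
wins, losses = 114, 45  # record against the past model, from above
win_rate = wins / (wins + losses)  # draws ignored in this sketch
threshold = 0.52  # hypothetical gating threshold
print(round(win_rate, 3), win_rate > threshold)  # prints: 0.717 True
```

So if the new model is still being rejected, gating may be counting draws or using a different statistic.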
I don't have a fancy GPU on this computer. I used these commands to create my environment:
conda create --name agz_kevaday
conda activate agz_kevaday
conda install pytorch torchvision torchaudio cpuonly -c pytorch
conda install -c anaconda numpy cython
conda install -c conda-forge tensorboard tensorboardx choix
I navigated to the main alphazero-general directory and then executed this:
python -m alphazero.envs.tictactoe.train
Here is a names-redacted version of the output:
Because of batching, it can take a long time before any games finish.
------ITER 1------
Warmup: random policy and value
Traceback (most recent call last):
File "C:\Users\_redacted_\Anaconda3\envs\agz_kevaday\lib\runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\_redacted_\Anaconda3\envs\agz_kevaday\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "C:\Users\_redacted_\Documents\Personal\alphazero-general\alphazero\envs\tictactoe\train.py", line 32, in <module>
c.learn()
File "C:\Users\_redacted_\Documents\Personal\alphazero-general\alphazero\Coach.py", line 180, in learn
self.generateSelfPlayAgents()
File "C:\Users\_redacted_\Documents\Personal\alphazero-general\alphazero\Coach.py", line 226, in generateSelfPlayAgents
self.input_tensors[i].pin_memory()
RuntimeError: Pinned memory requires CUDA. PyTorch splits its backend into two shared libraries: a CPU library and a CUDA library; this error has occurred because you are trying to use some CUDA functionality, but the CUDA library has not been loaded by the dynamic linker for some reason. The CUDA library MUST be loaded, EVEN IF you don't directly use any symbols from the CUDA library! One common culprit is a lack of -INCLUDE:?warp_size@cuda@at@@YAHXZ in your link arguments; many dynamic linkers will delete dynamic library dependencies if you don't depend on any of their symbols. You can check if this has occurred by using link on your binary to see if there is a dependency on *_cuda.dll library.
Given the error, I think the code doesn't work for CPU-only setups; it seems to be saying "CUDA required".
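Until that's handled in Coach.py itself, one workaround sketch is to skip pinning when CUDA isn't available (this assumes `pin_memory()` is the only CUDA-dependent call on that path; `maybe_pin` is a name I made up, not a function in the repo):

```python
import torch

def maybe_pin(tensor: torch.Tensor) -> torch.Tensor:
    # pin_memory() needs the CUDA runtime loaded; on CPU-only installs
    # just return the tensor unchanged
    return tensor.pin_memory() if torch.cuda.is_available() else tensor
```

Then `self.input_tensors[i].pin_memory()` would become `maybe_pin(self.input_tensors[i])`.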
Could it be possible to have multiple machines all generate training games over the same network and send them to a "master" machine, which would train a new version of the model on that data, send the new model (after applying gating if enabled) back to the other machines, and start over again? I'm 99% sure I can implement this fairly easily using sockets, but I need to know a few things.
I am reluctant to use Ray for this since it sounds like overkill for a simple task of generation and file transfer. Scheduling would be very straightforward: count how many samples have been transferred and, once they reach a threshold of say 1 million, tell the other machines to stop generating and start training on the master machine. After that, send the (gated) model back and have the machines run baseline testing if needed.
I think this is basically what Lc0 does, but with distributed training as well, which would probably need Ray.
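For the transfer part, a length-prefixed pickle protocol over plain sockets is probably enough. A minimal sketch (the function names and wire format here are my own, not anything in the repo):

```python
import pickle
import socket
import struct

def send_msg(sock: socket.socket, obj) -> None:
    # 4-byte big-endian length prefix, then the pickled payload
    data = pickle.dumps(obj)
    sock.sendall(struct.pack("!I", len(data)) + data)

def recv_exact(sock: socket.socket, n: int) -> bytes:
    # recv() may return fewer bytes than asked for, so loop until done
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed mid-message")
        buf += chunk
    return buf

def recv_msg(sock: socket.socket):
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return pickle.loads(recv_exact(sock, length))
```

The master would accept worker connections, call `recv_msg` in a loop while counting samples, and `send_msg` the gated weights back once the threshold is reached. (Pickle over a socket should only be used between machines you trust, since unpickling is code execution.)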
Dear Kevi Aday,
Thanks for publishing your code. There are some errors when I try to train the connect4 game. Can you help me solve this error?
Thank you in advance!
Albert,
PITTING AGAINST BASELINE: RawMCTSPlayer
Traceback (most recent call last):
  File "D:\ana3\envs\muzero\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "D:\ana3\envs\muzero\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "D:\88Projects\alphazero-general\alphazero\envs\connect4\train.py", line 58, in <module>
    c.learn()
  File "D:\88Projects\alphazero-general\alphazero\Coach.py", line 267, in learn
    self.compareToBaseline(self.model_iter)
  File "D:\88Projects\alphazero-general\alphazero\Coach.py", line 148, in wrapper
    ret = func(self, *args, **kwargs)
  File "D:\88Projects\alphazero-general\alphazero\Coach.py", line 589, in compareToBaseline
    self.arena = Arena(players, self.game_cls, use_batched_mcts=can_process, args=self.args)
  File "alphazero\Arena.pyx", line 66, in alphazero.Arena._set_state.decorator.wrapper
    ret = func(self, *args, **kwargs)
  File "alphazero\Arena.pyx", line 108, in alphazero.Arena.Arena.__init__
    self.players = players
  File "alphazero\Arena.pyx", line 129, in alphazero.Arena.Arena.players
    self.__check_players_valid()
  File "alphazero\Arena.pyx", line 132, in genexpr
    if self.use_batched_mcts and not all(p.player.supports_process() for p in self.players):
AttributeError: 'function' object has no attribute 'supports_process'
In your implementation of the Othello game, there is no logic to handle passing a turn when a player has no valid moves. I added this logic to the win_state method as follows:
```python
def win_state(self) -> np.ndarray:
    result = [False] * (NUM_PLAYERS + 1)
    player = self._player_range()
    has_legal_moves_player = self._board.has_legal_moves(player)
    has_legal_moves_reverse_player = self._board.has_legal_moves(-player)
    if not has_legal_moves_player and has_legal_moves_reverse_player:
        self._update_turn()
        return np.array(result, dtype=np.uint8)
    if not has_legal_moves_player and not has_legal_moves_reverse_player:
        diff = self._board.count_diff(player)
        if diff > 0:
            result[self.player] = True
        elif diff < 0:
            result[self._next_player(self.player)] = True
        else:
            result[NUM_PLAYERS] = True
    return np.array(result, dtype=np.uint8)
```
After adding this logic, I started encountering a division by zero error in the probs method in MCTS.pyx on the line:
probs = (counts / np.sum(counts)) ** (1.0 / temp)
where np.sum(counts) becomes zero.
I tried to fix this by modifying the code as follows:

```python
total_count = np.sum(counts)
if total_count == 0:
    # fall back to a uniform distribution when no child was visited;
    # counts must be a float array here, otherwise full_like would
    # truncate 1.0 / len(counts) to zero
    return np.full_like(counts, 1.0 / len(counts))
probs = (counts / total_count) ** (1.0 / temp)
probs /= np.sum(probs)
return probs
```
However, after this change, I encounter another error in the update_root method in MCTS.pyx:
raise ValueError(f'Invalid action encountered while updating root: {c.a}')
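A guess at the cause (I haven't traced update_root myself): the uniform fallback assigns probability to every action, including invalid ones, and sampling one of those would trigger exactly this ValueError. A sketch that restricts the fallback to the valid-action mask (the helper name is mine, not the repo's API):

```python
import numpy as np

def uniform_over_valid(counts: np.ndarray, valids: np.ndarray) -> np.ndarray:
    # Uniform distribution over valid actions only; invalid actions get 0.
    probs = np.zeros(len(counts), dtype=np.float64)
    valid_idx = np.flatnonzero(valids)
    probs[valid_idx] = 1.0 / len(valid_idx)
    return probs
```

In the probs method this would replace the `np.full_like(...)` fallback, assuming the valid-move mask for the root state is available there.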
Hi,
I'm struggling to run the training on macOS, while on Windows everything works just fine.
The following error is raised when I run `python -m alphazero.envs.tictactoe.train` in a macOS terminal.
Traceback (most recent call last):
File "/Users/User/anaconda3/envs/alphazero/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/Users/User/anaconda3/envs/alphazero/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/Users/User/Projects/alphazero-general/alphazero/envs/tictactoe/train.py", line 37, in <module>
c.learn()
File "/Users/User/Projects/alphazero-general/alphazero/Coach.py", line 250, in learn
self.saveIterationSamples(self.model_iter)
File "/Users/User/Projects/alphazero-general/alphazero/Coach.py", line 146, in wrapper
ret = func(self, *args, **kwargs)
File "/Users/User/Projects/alphazero-general/alphazero/Coach.py", line 365, in saveIterationSamples
num_samples = self.file_queue.qsize()
File "/Users/User/anaconda3/envs/alphazero/lib/python3.9/multiprocessing/queues.py", line 126, in qsize
return self._maxsize - self._sem._semlock._get_value()
NotImplementedError
/Users/User/anaconda3/envs/alphazero/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 10 leaked semaphore objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '
I think the issue is that multiprocessing.Queue.qsize() is not implemented on macOS, because sem_getvalue is unsupported there.
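A workaround sketch that sidesteps qsize() entirely by tracking the size in a shared counter (the class name is mine, not something in the repo):

```python
import multiprocessing as mp

class CountedQueue:
    """Queue wrapper that tracks its size in a shared counter, for
    platforms (like macOS) where Queue.qsize() raises NotImplementedError."""

    def __init__(self):
        self._queue = mp.Queue()
        self._size = mp.Value("i", 0)

    def put(self, item):
        with self._size.get_lock():
            self._size.value += 1
        self._queue.put(item)

    def get(self, *args, **kwargs):
        item = self._queue.get(*args, **kwargs)
        with self._size.get_lock():
            self._size.value -= 1
        return item

    def qsize(self):
        return self._size.value
```

Coach.py's file_queue could use this in place of a plain mp.Queue, so `self.file_queue.qsize()` in saveIterationSamples keeps working. The count is approximate under heavy concurrency (put increments before the item is enqueued), which should be fine for a progress estimate.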
Here is the text for that error:
Traceback (most recent call last):
File "/home/<username>/anaconda3/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/<username>/anaconda3/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/<username>/<useful_directory>/AlphaZeroGUI/main.py", line 2, in <module>
from PySide2.QtWidgets import QApplication, QMessageBox, QInputDialog, QTableWidgetItem, QLineEdit
ImportError: /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2: symbol krb5_ser_context_init version krb5_3_MIT not defined in file libkrb5.so.3 with link time reference
It looks like requirements.txt may need an entry for PySide2.
I spun up a conda environment, manually went through the requirements, and now I get a GUI interface!
Hey, I just pulled your repo and have a custom game interface. I get a segfault after appending agents, with plenty of RAM available, ulimit at 32k, and stack and recursion sizes maxed out. It worked fine on bhasconnect's original repo. Any ideas as to what could be causing it?
Hi,
Attempting to use this as instructed (subclassing GameState) seems to work at first, but during the self-play step the console is filled with the same errors:
Exception ignored in: 'alphazero.MCTS.MCTS.update_root'
Traceback (most recent call last):
File "/home/tyto/miniconda3/envs/torch/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
self.run()
ValueError: Invalid action encountered while updating root: 155
ValueError: Invalid action encountered while updating root: 91