Deep MCTS

This repository contains the code and raw data for my master's thesis, "Deep reinforcement learning using Monte-Carlo tree search for Hex and Othello".

Raw data and models

Experiment 1

Due to storage restrictions, the trained models from experiment 1 are not in the repository but can instead be found in OneDrive. Note that only the final models are included, not checkpoints. The models are located in subdirectories of deep_mcts/<game>/saves as files of the form anet-<n>.tar, where n denotes the number of iterations the model has been trained for. They are stored as pickled dictionaries of parameters and can be loaded using the from_path_full method of GameNet subclasses. Many of the training parameters are also included in a parameters.json file in each subdirectory. The post-training evaluations are found as CSV files in deep_mcts/<game>/training in the repository. The header row of each file specifies the format.
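As a rough illustration of the naming scheme above, the following sketch picks the file with the highest iteration count from a saves subdirectory. The helper names iteration_count and latest_model are hypothetical and not part of the repository; only the anet-<n>.tar pattern comes from the description above.

```python
import re
from pathlib import Path
from typing import Optional

# Matches file names such as "anet-100.tar" and captures the iteration count.
_ANET_PATTERN = re.compile(r"^anet-(\d+)\.tar$")


def iteration_count(path: Path) -> Optional[int]:
    """Return n for a file named anet-<n>.tar, or None for any other name."""
    match = _ANET_PATTERN.match(path.name)
    return int(match.group(1)) if match else None


def latest_model(save_dir: Path) -> Optional[Path]:
    """Return the anet-<n>.tar file with the largest n in save_dir, if any.

    Hypothetical helper: other files (e.g. parameters.json) are ignored.
    """
    candidates = [p for p in save_dir.iterdir() if iteration_count(p) is not None]
    return max(candidates, key=iteration_count, default=None)
```

The returned path could then be passed to from_path_full on the relevant GameNet subclass; the exact signature of that method is not documented here, so check the source before relying on it.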

Experiment 2

The evaluations are found as JSON files in deep_mcts/<game>/simple_rollouts, one for each model. The JSON describes a 6x6x3x2 array with the dimensions corresponding to:

  1. Each rollout probability
  2. Each rollout probability it was compared to
  3. Wins, draws and losses for the rollout probability in the first dimension
  4. As the first player, as the second player
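The indexing described above can be sketched as follows. This assumes that the top-level JSON value is the nested array itself and that the entries are game counts; neither assumption is confirmed by the text above, and the function names are hypothetical.

```python
import json
from typing import List

# results[i][j][k][s]: rollout probability i against j,
# k = 0/1/2 for wins/draws/losses, s = 0/1 for first/second player.
Results = List[List[List[List[float]]]]


def load_results(path: str) -> Results:
    """Read one evaluation file; assumes the top-level JSON value is the array."""
    with open(path) as f:
        return json.load(f)


def win_rate(results: Results, i: int, j: int) -> float:
    """Win rate of rollout probability index i against index j, over both seats."""
    # Each of the three outcome rows holds [as first player, as second player].
    wins, draws, losses = (sum(per_seat) for per_seat in results[i][j])
    games = wins + draws + losses
    return wins / games if games else 0.0
```

Summing over the last dimension combines the games played as first and second player; index it directly instead if the seats should be kept apart.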

Experiment 3

The evaluations are found as JSON files in the two subdirectories of deep_mcts/<game>/complex_rollouts. There is one subdirectory for evaluations with a state evaluator and one without, and one file for each model in each folder. The JSON describes an object, where the "results" key corresponds to a 3x2 array with the dimensions corresponding to:

  1. Wins, draws and losses for the policy network rollouts
  2. As the first player, as the second player

Additionally, there are "complex_simulations" and "simple_simulations" keys, corresponding to the number of simulations with and without expansion in each move of each game, for policy network rollouts and random rollouts respectively.
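A minimal sketch of reading the "results" key, assuming the 3x2 array is stored as three rows (wins, draws, losses) of two entries each (first player, second player). The function name summarise is hypothetical.

```python
import json
from typing import Dict, Tuple


def summarise(evaluation: dict) -> Dict[str, Tuple[float, float, float]]:
    """Return (wins, draws, losses) per seat from the "results" key."""
    # Three rows, each of the form [as first player, as second player].
    wins, draws, losses = evaluation["results"]
    return {
        "first_player": (wins[0], draws[0], losses[0]),
        "second_player": (wins[1], draws[1], losses[1]),
    }


def load_evaluation(path: str) -> dict:
    """Read one evaluation file from a complex_rollouts subdirectory."""
    with open(path) as f:
        return json.load(f)
```

The layout of the "complex_simulations" and "simple_simulations" values is not spelled out above, so this sketch leaves them untouched.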

Running

Setup

If using Poetry, run poetry install. If not, run pip install -r requirements.txt.

Experiment 1

If running on a machine with only one GPU, set both train_device and self_play_device in TrainingConfiguration in deep_mcts/train.py to cuda:0.

  1. python -m deep_mcts.<game>.train
  2. python -m deep_mcts.<game>.evaluate_training

Experiment 2

  1. python -m deep_mcts.<game>.evaluate_simple_rollouts

Experiment 3

  1. python -m deep_mcts.<game>.evaluate_complex_rollouts
