reversi-rl's Introduction

Reversi

Overview

An Othello (Reversi) game prepared for reinforcement learning, with the following agents implemented:

  • Value Iteration
  • MCTS
  • SARSA
  • SARSA-Lambda
  • Expected SARSA
  • Q-Learning
  • Double Q-Learning
  • Value Function Approximation
  • Random
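As a point of reference for the temporal-difference agents in the list, here is a minimal sketch of the textbook tabular Q-Learning and SARSA updates. It is not taken from this repository: the function names, the dictionary-based Q-table, and the `alpha` and `gamma` defaults are assumptions chosen for illustration, and the project's agents may be structured differently (for example, in how the opponent's reply is folded into the next state).

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, next_actions,
                      alpha=0.1, gamma=0.9):
    """Off-policy update: bootstrap from the best action in the next state."""
    # A terminal next state has no actions, so its value defaults to 0.
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])


def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.1, gamma=0.9):
    """On-policy update: bootstrap from the action actually chosen next."""
    next_value = 0.0 if next_action is None else q[(next_state, next_action)]
    target = reward + gamma * next_value
    q[(state, action)] += alpha * (target - q[(state, action)])


# Hypothetical states and actions, used only to show the two updates differ.
q = defaultdict(float)
q[("s1", "a")] = 1.0
q[("s1", "b")] = 0.5

q_learning_update(q, "s0", "x", reward=0.0, next_state="s1",
                  next_actions=["a", "b"])
sarsa_update(q, "s0", "y", reward=0.0, next_state="s1", next_action="b")

print(q[("s0", "x")], q[("s0", "y")])
```

The only difference between the two is the bootstrap term: Q-Learning uses the maximum over the next actions, while SARSA uses the value of the action the policy actually picked.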

Usage

From inside the src directory, run python reversi.py --help:

Usage: reversi.py [OPTIONS] [[human|random|value_iter|mcts|sarsa|exp_sarsa|sar
                  sa_lambda|q_learning|dq_learning|value_approx]] [[human|rand
                  om|value_iter|mcts|sarsa|exp_sarsa|sarsa_lambda|q_learning|d
                  q_learning|value_approx]]

  Runs Reversi game of given size, given number of times, with selected
  players, which are learning or not, with or without GUI and returns wins
  count

Options:
  -l1                    Enable learning for first player
  -l2                    Enable learning for second player
  -s, --size INTEGER...  Size of the map
  -n, --number INTEGER   Number of game repeats
  -d, --delay FLOAT      Minimum delay between player moves in ms
  --live / --prepared    Whether use live or prepared backend
  --gui / --nogui        Whether graphical interface should be shown
  --help                 Show this message and exit.
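The options above can be combined into a single invocation. The sketch below assembles one such command line in Python; the helper name `build_command` and its parameters are hypothetical, and passing two integers to `-s` is an assumption based on the `INTEGER...` notation in the help text.

```python
import shlex


def build_command(player1, player2, size=(8, 8), games=1,
                  learn_first=False, gui=False, prepared=True):
    """Build an argument list for reversi.py from the documented options."""
    cmd = ["python", "reversi.py"]
    if learn_first:
        cmd.append("-l1")
    # Assumption: --size takes two integers (both board dimensions).
    cmd += ["-s", str(size[0]), str(size[1])]
    cmd += ["-n", str(games)]
    cmd.append("--prepared" if prepared else "--live")
    cmd.append("--gui" if gui else "--nogui")
    cmd += [player1, player2]
    return cmd


# A learning Q-Learning agent against the random player on a 5x4 map.
command = build_command("q_learning", "random", size=(5, 4),
                        games=1000, learn_first=True)
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run(command, cwd="src")` to launch the games from a script.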

Backends

The backend determines how possible player moves, terminal states, and subsequent game states are calculated. Two backends are implemented:

  • Live - Everything is calculated on the fly, which is relatively slow.
  • Prepared - All states, transitions, and related data are calculated only once, saved to a file, and loaded quickly on subsequent program launches. This requires an initial delay to build everything, but subsequent games are much faster.
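The trade-off between the two backends can be sketched as follows. This is an illustration only, not the project's code: the class names, the board encoding (a tuple of tuples holding 1, -1, or 0), and the pickle file are all assumptions. The sketch also fills its cache lazily, on first request, whereas the Prepared backend described above builds everything up front.

```python
import pickle
import tempfile
from pathlib import Path

DIRECTIONS = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
              (0, 1), (1, -1), (1, 0), (1, 1)]


def flips(board, row, col, player):
    """Return opponent discs flipped if `player` places a disc at (row, col)."""
    if board[row][col] != 0:
        return []
    rows, cols = len(board), len(board[0])
    captured = []
    for dr, dc in DIRECTIONS:
        line = []
        r, c = row + dr, col + dc
        # Walk over a run of opponent discs in this direction.
        while 0 <= r < rows and 0 <= c < cols and board[r][c] == -player:
            line.append((r, c))
            r, c = r + dr, c + dc
        # The run counts only if it ends on one of the player's own discs.
        if line and 0 <= r < rows and 0 <= c < cols and board[r][c] == player:
            captured.extend(line)
    return captured


class LiveBackend:
    """Recomputes the legal moves on every call."""

    def moves(self, board, player):
        return [(r, c)
                for r in range(len(board))
                for c in range(len(board[0]))
                if flips(board, r, c, player)]


class PreparedBackend:
    """Caches results from a live backend and persists them to a file."""

    def __init__(self, live, cache_path):
        self.live = live
        self.cache_path = Path(cache_path)
        self.cache = {}
        if self.cache_path.exists():
            self.cache = pickle.loads(self.cache_path.read_bytes())

    def moves(self, board, player):
        key = (board, player)
        if key not in self.cache:
            self.cache[key] = self.live.moves(board, player)
        return self.cache[key]

    def save(self):
        self.cache_path.write_bytes(pickle.dumps(self.cache))


# Hypothetical 4x4 starting position; 1 and -1 are the two players.
start = ((0, 0, 0, 0),
         (0, -1, 1, 0),
         (0, 1, -1, 0),
         (0, 0, 0, 0))

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "moves.pkl"
    prepared = PreparedBackend(LiveBackend(), path)
    first_moves = prepared.moves(start, 1)
    prepared.save()
    # A later launch loads the saved cache instead of recomputing.
    reloaded_cache = PreparedBackend(LiveBackend(), path).cache

print(first_moves)
```

Because the board is a tuple of tuples, it is hashable and can serve directly as a dictionary key, which is what makes the file-backed lookup straightforward.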

Obtained results

Results, in percent, of 1000 games played against the random player

Map 5x4

Algorithm       Win    Lost   Draw
dq_learning     98.8    0.7    0.5
exp_sarsa       98.3    1.4    0.3
mcts            79.3   15.8    4.9
q_learning      99.7    0.3    0.0
sarsa           99.6    0.4    0.0
sarsa_lambda    99.7    0.2    0.1
value_approx    93.3    6.1    0.6
value_iter      99.5    0.3    0.2

Map 5x5

Algorithm       Win    Lost   Draw
dq_learning     55.6   44.3    0.1
exp_sarsa       61.8   38.0    0.2
mcts            65.1   34.9    0.0
q_learning      59.0   40.7    0.3
sarsa           58.2   41.5    0.3
sarsa_lambda    92.3    7.6    0.1
value_approx    89.4   10.5    0.1

Map 6x6

Algorithm       Win    Lost   Draw
value_approx    79.3   18.0    2.7
mcts            47.7   46.3    6.0

Map 8x8

Algorithm       Win    Lost   Draw
value_approx    85.1   12.1    2.8
sarsa_lambda    49.9   46.1    4.0
q_learning      49.8   46.6    3.6
mcts            51.0   45.5    3.5
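For readers who want to reproduce figures of this kind, the following sketch shows one way to turn a list of game outcomes into win / lost / draw percentages. The function name and the outcome labels are assumptions; the program itself reports win counts, so the conversion step is illustrative. The sample input of 997 wins and 3 losses is what the q_learning row for the 5x4 map implies over 1000 games.

```python
from collections import Counter


def percentages(outcomes):
    """Return (win, lost, draw) as percentages rounded to one decimal."""
    if not outcomes:
        raise ValueError("at least one game outcome is required")
    counts = Counter(outcomes)
    total = len(outcomes)
    return tuple(round(100 * counts[label] / total, 1)
                 for label in ("win", "lost", "draw"))


# 1000 games: 997 wins, 3 losses, no draws.
outcomes = ["win"] * 997 + ["lost"] * 3
result = percentages(outcomes)
print(result)  # (99.7, 0.3, 0.0)
```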

Requirements

  • Python version: 3.10.5
  • Installing requirements: pip install -r requirements.txt
