
qlearning's Introduction

Applying Deep-Q Learning to Connect-4

Code Structure

This repository is split into several components:

  • deepq.py holds the main functionality for all of the Q-learning implementations.
  • filter_visualizer.ipynb has functions to show what the convolutional encoders in the Q networks learned.
  • QNetworks/ contains pre-trained Q-learning agents of varying configurations.
  • Histories/ contains the models' loss over time on the training and validation sets.
  • Example_Boards/ contains 95,000 randomly generated boards that have been evaluated by a Monte-Carlo agent to get an estimate of the Q values.
  • deepq.ipynb contains visualizations of the agents playing a game and of the predicted Q values for any given state.
  • analysis.ipynb contains the analyses done on the different pre-trained agents.

We recommend starting with analysis.ipynb to gain an understanding of the evaluation methods and of how to use the functions defined in deepq.py.
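
As a rough illustration of how a Monte-Carlo agent can label a board with Q-value estimates (the idea behind Example_Boards/), here is a minimal sketch. It is not the repository's code: the board encoding (0 for empty, 1 and -1 for the two players), the 6x7 board size, the uniformly random rollout policy, and every function name are assumptions made for this example, and plain nested lists are used so that it runs without dependencies.

```python
import random

ROWS, COLS = 6, 7  # standard Connect-4 size (assumed, not taken from the repo)

def legal_moves(board):
    """Columns whose top cell is still empty."""
    return [c for c in range(COLS) if board[0][c] == 0]

def drop(board, col, player):
    """Return a new board with `player`'s piece dropped into `col`."""
    new = [row[:] for row in board]
    for r in range(ROWS - 1, -1, -1):  # scan from the bottom row upwards
        if new[r][col] == 0:
            new[r][col] = player
            return new
    raise ValueError("column is full")

def has_won(board, player):
    """True if `player` has four in a row in any direction."""
    for r in range(ROWS):
        for c in range(COLS):
            for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
                cells = [(r + i * dr, c + i * dc) for i in range(4)]
                if all(0 <= rr < ROWS and 0 <= cc < COLS
                       and board[rr][cc] == player for rr, cc in cells):
                    return True
    return False

def rollout(board, to_move, me, rng):
    """Play uniformly random moves to the end: +1 win, -1 loss, 0 draw for `me`."""
    while True:
        moves = legal_moves(board)
        if not moves:
            return 0.0  # board is full with no winner
        board = drop(board, rng.choice(moves), to_move)
        if has_won(board, to_move):
            return 1.0 if to_move == me else -1.0
        to_move = -to_move

def monte_carlo_q(board, player, n_rollouts=50, seed=0):
    """Estimate Q(board, a) for each legal column a by averaging random rollouts."""
    rng = random.Random(seed)
    q = {}
    for col in legal_moves(board):
        after = drop(board, col, player)
        if has_won(after, player):
            q[col] = 1.0  # immediate win, no rollout needed
            continue
        total = sum(rollout(after, -player, player, rng) for _ in range(n_rollouts))
        q[col] = total / n_rollouts
    return q

empty = [[0] * COLS for _ in range(ROWS)]
# Three stacked pieces for player 1 in column 3; a fourth piece there wins.
board = drop(drop(drop(empty, 3, 1), 3, 1), 3, 1)
estimates = monte_carlo_q(board, player=1)
```

Averaging more rollouts per action lowers the variance of each estimate at the cost of more compute per labelled board.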

deepq.py

This file is the bread and butter of this project and contains all of the logic for playing agents against each other, training agents, and creating environments for agents to play in. It consists of several major classes and a set of helper functions, discussed below:

  1. Board
    • This class holds all of the information necessary for any agent that implements BasePlayer's functions to play in a connect-4 environment. It has methods to train agents, whether tabular Q or deep Q, and to play agents against each other for the user to watch (as can be seen in deepq.ipynb).
    • Several settings can be configured when training the agents: the number of games played in a single episode, how often Deep-Q agents copy Q' into Q, how the $\alpha$ value changes as training goes on, how $\epsilon$ for the epsilon-greedy policy changes as training goes on, and how many total games are played in training.
  2. SinglePredictionQPlayer
    • This class is an implementation of a Deep-Q player whose network takes a state and action pair as input and outputs a single Q value for it. Several settings are configured when the class is constructed: the initial epsilon value, the initial alpha value, the discount factor (denoted by gamma), the training batch size, the number of epochs to train on each episode, and the TensorFlow model to use when making predictions.
  3. MultiPredictionQPlayer
    • This class is very similar to the SinglePredictionQPlayer. The main difference is that this agent expects the underlying model to accept only a single state and to output a Q value for every action. It has the same configuration options as the SinglePredictionQPlayer.
  4. Helper Functions
    • There are various helper functions defined in deepq.py. They include computing the transition from a state and action to the new state, checking whether a player has won the game, and evaluating the board with a heuristic function for the reward. They should not be imported outside of deepq.py, with the exception of those designed to help the user understand what is going on inside the game, like the visualize() function, which takes a 2D numpy array and shows it as a connect-4 board.
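
To make the difference between the two player classes and the epsilon schedule more concrete, here is a minimal sketch in plain Python. It does not use the repository's classes or TensorFlow: the function names, the toy Q functions that stand in for trained networks, and the multiplicative decay rule are all hypothetical and only illustrate the idea.

```python
import random

N_ACTIONS = 7  # one action per Connect-4 column (assumed)

def greedy_single(q_sa, state, legal):
    """Single-prediction style: q_sa(state, action) returns one Q value,
    so the model is queried once per legal action."""
    return max(legal, key=lambda a: q_sa(state, a))

def greedy_multi(q_s, state, legal):
    """Multi-prediction style: q_s(state) returns one Q value per action,
    so a single query covers every column."""
    values = q_s(state)
    return max(legal, key=lambda a: values[a])

def epsilon_greedy(greedy_action, legal, epsilon, rng):
    """Explore uniformly with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.choice(legal)
    return greedy_action

def decayed(value, rate, floor):
    """Multiplicative decay towards a floor; one possible schedule
    for epsilon or alpha (an assumption, not the repo's rule)."""
    return max(floor, value * rate)

# Hypothetical stand-ins for trained networks: prefer columns near the centre.
def toy_q_sa(state, action):
    return -abs(action - 3)

def toy_q_s(state):
    return [toy_q_sa(state, a) for a in range(N_ACTIONS)]

rng = random.Random(0)
state = None                 # the toy Q functions ignore the state
legal = [0, 1, 2, 4, 5, 6]   # column 3 is full in this example
epsilon = 1.0
for episode in range(5):
    best = greedy_multi(toy_q_s, state, legal)
    action = epsilon_greedy(best, legal, epsilon, rng)
    epsilon = decayed(epsilon, rate=0.5, floor=0.1)
```

Both greedy functions pick the same action for the same Q function. The practical difference is the number of model evaluations: one per legal action in the single-prediction style, one per state in the multi-prediction style.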

qlearning's People

Contributors

alexshock66
