connect4's Introduction

Graduation thesis

This project was my graduation thesis, so the full documentation is available in Macedonian at this link.

Connect4 agent using Reinforcement Learning

In this repository I built an intelligent agent that learned to play Connect4 through reinforcement learning. I experimented with the DQN, minimax-DQN and DDQN algorithms.

The ConnectX competition on Kaggle motivated me to learn how to build this kind of agent. The most extensively trained agents based on the code in this repository reached the top 15% of the competition ranking.

Algorithm description

In the following part I briefly explain the agent's learning process and the code sections that make up the algorithm.

  • Environment - this object represents the dynamics of the Connect4 game. Each game of Connect4 is one episode in the environment. The state of the game is a 6x7 matrix in which each element has one of three values: 0 - empty position on the board, 1 - mark placed by the first player, 2 - mark placed by the second player. The environment object has two main functions:
    • reset() - resets the environment so that a new game can be played.
    • step(action) - executes the given action in the environment. After each executed action the agent receives the new state and a reward for that action.
  • Model - represents the neural network that decides which actions the agent should execute. The goal of the whole project is to train this network in order for the agent to win more often in the Connect4 game. This network represents the Q function of the DQN algorithm.
  • Experience - this object stores all the games that the agent has played, so it represents the agent's experience of the game. The agent learns from this experience how to get better at the game.
  • Exploration strategy - an implementation of a specific algorithm that handles the trade-off between exploration and exploitation in RL problems. I implemented the simplest one, the Epsilon Greedy Strategy.
  • Connect4 Learner - this object encapsulates the Model, the Experience and the Exploration Strategy. Additionally, it implements minimax-DQN and minimax-DDQN, which are the main algorithms for training the policy_network. This object has one main function, fit(EPOCHS), which trains the policy network on the stored experience for the given number of epochs.
  • Self-play - a function in which the agent plays an episode in the environment against itself. The experience gathered during the episode is stored in the Experience object.
  • Hyperparameters - constants that hold all the hyperparameters.
  • Agent Evaluation - evaluates the agent's performance against a random agent and a negamax agent.
  • Agent submission - creates a submission file for the Kaggle competition. Additionally, it adds a two-step lookahead to improve the agent's performance. The function encodes the weights of the policy network as a string using base64 encoding.
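
The Environment described in the list above can be illustrated with a minimal sketch. It is not the code from this repository: the class name Connect4Env, the valid_actions() helper, the (state, reward, done) return value and the reward of 1.0 for a winning move are all assumptions made for the example. Only the 6x7 board, the 0/1/2 encoding and the reset()/step(action) interface come from the description.

```python
import numpy as np

ROWS, COLS = 6, 7  # board size given in the description


class Connect4Env:
    """Illustrative environment; names and reward values are assumptions."""

    def __init__(self):
        self.reset()

    def reset(self):
        # 0 = empty, 1 = first player's mark, 2 = second player's mark
        self.board = np.zeros((ROWS, COLS), dtype=np.int8)
        self.current_player = 1
        self.done = False
        return self.board.copy()

    def valid_actions(self):
        # A column is playable while its top cell is still empty.
        return [c for c in range(COLS) if self.board[0, c] == 0]

    def step(self, action):
        if self.done or self.board[0, action] != 0:
            raise ValueError("illegal move")
        # The mark falls to the lowest empty row of the chosen column.
        row = max(r for r in range(ROWS) if self.board[r, action] == 0)
        player = self.current_player
        self.board[row, action] = player
        reward = 0.0
        if self._wins(row, action, player):
            reward, self.done = 1.0, True  # assumed reward for a win
        elif not self.valid_actions():
            self.done = True  # board is full: draw
        else:
            self.current_player = 3 - player  # switch between 1 and 2
        return self.board.copy(), reward, self.done

    def _wins(self, row, col, player):
        # Count connected marks through (row, col) along the four line directions.
        for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
            count = 1
            for sign in (1, -1):
                r, c = row + sign * dr, col + sign * dc
                while 0 <= r < ROWS and 0 <= c < COLS and self.board[r, c] == player:
                    count += 1
                    r, c = r + sign * dr, c + sign * dc
            if count >= 4:
                return True
        return False


# Usage: player 1 stacks four marks in column 0 while player 2 plays column 1.
env = Connect4Env()
state = env.reset()
for action in (0, 1, 0, 1, 0, 1, 0):
    state, reward, done = env.step(action)
```

After the seventh move the episode ends with a win for the first player, so a self-play loop would call reset() before starting the next game.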

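The training targets used by the Connect4 Learner are not spelled out here, so the following is only a sketch of one common way to adapt DQN and DDQN targets to a two-player zero-sum game with alternating turns. It assumes that the next state is evaluated from the opponent's point of view, so the opponent's best value is subtracted rather than added. The function names, argument names and the gamma value are hypothetical, and the repository may compute its targets differently.

```python
import numpy as np


def minimax_dqn_targets(rewards, dones, next_q_target, next_valid, gamma=0.99):
    """Negamax-style DQN target (an assumed formulation, not the repository's code).

    rewards:       (B,) reward for the move just played
    dones:         (B,) True where the move ended the game
    next_q_target: (B, A) target-network Q-values of the next state,
                   seen from the opponent's perspective
    next_valid:    (B, A) True for columns that are still playable
    """
    masked = np.where(next_valid, next_q_target, -np.inf)
    best_reply = masked.max(axis=1)
    best_reply = np.where(dones, 0.0, best_reply)  # no bootstrapping at game end
    # Zero-sum game: what is good for the opponent is bad for the mover.
    return rewards - gamma * best_reply


def minimax_ddqn_targets(rewards, dones, next_q_online, next_q_target,
                         next_valid, gamma=0.99):
    """Double-DQN variant: the online network picks the opponent's reply,
    the target network evaluates it."""
    masked_online = np.where(next_valid, next_q_online, -np.inf)
    best_actions = masked_online.argmax(axis=1)
    best_reply = next_q_target[np.arange(len(rewards)), best_actions]
    best_reply = np.where(dones, 0.0, best_reply)
    return rewards - gamma * best_reply


# Tiny batch with three actions to keep the numbers readable.
rewards = np.array([0.0, 1.0])
dones = np.array([False, True])
next_q_target = np.array([[0.2, 0.5, -0.1], [0.3, 0.3, 0.3]])
next_q_online = np.array([[-0.4, 0.9, 0.1], [0.0, 0.0, 0.0]])
next_valid = np.array([[True, False, True], [True, True, True]])

dqn = minimax_dqn_targets(rewards, dones, next_q_target, next_valid, gamma=0.9)
ddqn = minimax_ddqn_targets(rewards, dones, next_q_online, next_q_target,
                            next_valid, gamma=0.9)
```

In the first transition the middle column is full, so it is masked out before the opponent's best reply is chosen. The second transition is terminal, so its target is just the reward.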