Git Product home page Git Product logo

gym-snake's Introduction

Snake Environment (gym-snake)

Build Status

This is a set of OpenAI Gym environments representing variants on the classic Snake game.

Requirements:

  • Python 3.5+
  • OpenAI Gym
  • NumPy
  • PyQT 5 for graphics

Please use this bibtex if you want to cite this repository in your publications:

@misc{gym_snake,
  author = {Correa, Telmo},
  title = {Snake Environment for OpenAI Gym},
  year = {2019},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/telmo-correa/gym-snake}},
}

Installation

You can clone this repository and install all dependencies with pip:

git clone https://github.com/telmo-correa/gym-snake.git
cd gym-snake
pip install -e .

Basic Usage

There is a UI application which allows you to manually control single-agent environments with the arrow keys:

./manual_control.py

The environment being run can be selected with the --env-name option, eg:

./manual_control.py --env-name Snake-Hex-8x8-v0

Design

Inspiration

This environment was created as a study in response to OpenAI's requests for research problems.

Code architecture and renderer are adapted from the excellent gym-minigrid environment (see original license in rendering.py).

Structure of the world

  • The world is a grid of tiles:
    • NxM square tiles, or
    • NxM hex tiles
  • Each tile can contain an apple, part of a snake (a queue of connected tiles), or be empty.
  • Each agent controls its own snake. Until the agent is done, at each step it can take one of three actions, on square tiles, or one of five actions, on hex tiles:
    • move forward
    • turn left and move forward
    • turn right and move forward
    • turn left twice and move forward (hex only)
    • turn right twice and move forward (hex only)
  • If a snake tries to move out of bounds or into the body of a snake, it dies -- the agent is done and all further inputs are ignored.
  • If a snake moves into an apple, the apple is removed, the snake body grows in size by 1, and a new apple is generated in a random empty tile**.
  • All agents are marked as done when a set number of time steps is reached (default 1000).
  • The following rewards are provided to each live agent by default:
    • Eating an apple: +1
    • Collision: -1
    • Timeout: 0
    • Default reward each time step: 0

Rule variants

  • Single apple: snakes die immediately after collecting an apple.
  • No contraction: snake bodies just extend at each step, instead of only when collecting an apple.
    This creates a tron-like ever-growing trail behind each agent.

Observations

The world state is fully observable by all agents.

A rectangular image with one pixel per tile is provided for each agent. Pixels have distinct values for:

  • Empty tile
  • Apple
  • Own agent's snake body
  • Own agent's snake head
  • Another agent's snake body
  • Another agent's snake head
  • Dead/done agent's snake body
  • Dead/done agent's snake head

Multi-agent scenarios

Some of the environments included in this package are multi-agent environments -- more than one snake is being controlled, with potentially competitive rewards.

OpenAI gym environments do not have a standardized interface to represent this.

In this package, they are implememented in the same manner as the one in the Multi-Agent Particle Environments (MPE) presented with the MADDPG paper:

  • env.action_space is a list of action spaces, one for each agent.
  • env.observation_space is a list of observation spaces, one for each agent.
  • env.step expects a list of actions to be executed simultaneously as input, and it returns a tuple observations, rewards, dones, {}, with each element being a list with one element per agent.

Included Environments

The environments below are implemented in the gym_snake/envs directory.

Dead apple

An agent receives a reward and dies whenever it collects an apple.

Environment name # Agents Grid type Grid size Initial snake size
Snake-4x4-DeadApple-v0 1 square 4x4 2
Snake-8x8-DeadApple-v0 1 square 8x8 4
Snake-16x16-DeadApple-v0 1 square 16x16 4
Snake-Hex-4x4-DeadApple-v0 1 hex 4x4 2
Snake-Hex-8x8-DeadApple-v0 1 hex 8x8 4
Snake-Hex-16x16-DeadApple-v0 1 hex 16x16 4
Snake-4x4-DeadApple-2s-v0 2 square 4x4 2
Snake-8x8-DeadApple-2s-v0 2 square 8x8 4
Snake-16x16-DeadApple-2s-v0 2 square 16x16 4
Snake-Hex-4x4-DeadApple-2s-v0 2 hex 4x4 2
Snake-Hex-8x8-DeadApple-2s-v0 2 hex 8x8 4
Snake-Hex-16x16-DeadApple-2s-v0 2 hex 16x16 4
Snake-4x4-DeadApple-3s-v0 3 square 4x4 2
Snake-8x8-DeadApple-3s-v0 3 square 8x8 4
Snake-16x16-DeadApple-3s-v0 3 square 16x16 4
Snake-Hex-4x4-DeadApple-3s-v0 3 hex 4x4 2
Snake-Hex-8x8-DeadApple-3s-v0 3 hex 8x8 4
Snake-Hex-16x16-DeadApple-3s-v0 3 hex 16x16 4

Classic

An agent receives a reward whenever it collects an apple, and a new apple is spawned in an empty location.

Environment name # Agents Grid type Grid size Initial snake size
Snake-4x4-v0 1 square 4x4 2
Snake-8x8-v0 1 square 8x8 4
Snake-16x16-v0 1 square 16x16 4
Snake-Hex-4x4-v0 1 hex 4x4 2
Snake-Hex-8x8-v0 1 hex 8x8 4
Snake-Hex-16x16-v0 1 hex 16x16 4
Snake-4x4-2s-v0 2 square 4x4 2
Snake-8x8-2s-v0 2 square 8x8 4
Snake-16x16-2s-v0 2 square 16x16 4
Snake-Hex-4x4-2s-v0 2 hex 4x4 2
Snake-Hex-8x8-2s-v0 2 hex 8x8 4
Snake-Hex-16x16-2s-v0 2 hex 16x16 4
Snake-4x4-3s-v0 3 square 4x4 2
Snake-8x8-3s-v0 3 square 8x8 4
Snake-16x16-3s-v0 3 square 16x16 4
Snake-Hex-4x4-3s-v0 3 hex 4x4 2
Snake-Hex-8x8-3s-v0 3 hex 8x8 4
Snake-Hex-16x16-3s-v0 3 hex 16x16 4

4 apples

Same as classic, but four apples are present at all times.

Environment name # Agents Grid type Grid size Initial snake size
Snake-4x4-4a-v0 1 square 4x4 2
Snake-8x8-4a-v0 1 square 8x8 4
Snake-16x16-4a-v0 1 square 16x16 4
Snake-Hex-4x4-4a-v0 1 hex 4x4 2
Snake-Hex-8x8-4a-v0 1 hex 8x8 4
Snake-Hex-16x16-4a-v0 1 hex 16x16 4
Snake-4x4-4a-2s-v0 2 square 4x4 2
Snake-8x8-4a-2s-v0 2 square 8x8 4
Snake-16x16-4a-2s-v0 2 square 16x16 4
Snake-Hex-4x4-4a-2s-v0 2 hex 4x4 2
Snake-Hex-8x8-4a-2s-v0 2 hex 8x8 4
Snake-Hex-16x16-4a-2s-v0 2 hex 16x16 4
Snake-4x4-4a-3s-v0 3 square 4x4 2
Snake-8x8-4a-3s-v0 3 square 8x8 4
Snake-16x16-4a-3s-v0 3 square 16x16 4
Snake-Hex-4x4-4a-3s-v0 3 hex 4x4 2
Snake-Hex-8x8-4a-3s-v0 3 hex 8x8 4
Snake-Hex-16x16-4a-3s-v0 3 hex 16x16 4

Expand

No apples, but snakes expand at every action while alive -- and receive a reward for doing so. Equivalent to there being apples in all empty squares.

Environment name # Agents Grid type Grid size Initial snake size
Snake-4x4-Expand-v0 1 square 4x4 2
Snake-8x8-Expand-v0 1 square 8x8 4
Snake-16x16-Expand-v0 1 square 16x16 4
Snake-Hex-4x4-Expand-v0 1 hex 4x4 2
Snake-Hex-8x8-Expand-v0 1 hex 8x8 4
Snake-Hex-16x16-Expand-v0 1 hex 16x16 4
Snake-4x4-Expand-2s-v0 2 square 4x4 2
Snake-8x8-Expand-2s-v0 2 square 8x8 4
Snake-16x16-Expand-2s-v0 2 square 16x16 4
Snake-Hex-4x4-Expand-2s-v0 2 hex 4x4 2
Snake-Hex-8x8-Expand-2s-v0 2 hex 8x8 4
Snake-Hex-16x16-Expand-2s-v0 2 hex 16x16 4
Snake-4x4-Expand-3s-v0 3 square 4x4 2
Snake-8x8-Expand-3s-v0 3 square 8x8 4
Snake-16x16-Expand-3s-v0 3 square 16x16 4
Snake-Hex-4x4-Expand-3s-v0 3 hex 4x4 2
Snake-Hex-8x8-Expand-3s-v0 3 hex 8x8 4
Snake-Hex-16x16-Expand-3s-v0 3 hex 16x16 4

License

Rendering code and some of the wrappers are under the BSD-3 license of their original distribution.

Remaining code is under a new BSD-3 license specific to this repository.

๐Ÿ ๐ŸŽ

gym-snake's People

Contributors

telmo-correa avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.