classic-arcade-game-dataset

A TensorFlow dataset for identifying classic arcade games from sequences of screendumps. All screendumps for a game are typically from the game's attract mode. The sequence could also be from the game being played, if the game has only a single stage/level.

The games are identified using the name of the game's MAME ROM set.

The dataset is available as archive files or as a TensorFlow dataset built with the TFDS CLI.

Note: currently the first part of a sequence is used for training and the last part for testing. This is not ideal; the data should be mixed.

Supported games

The dataset contains data for the following games (as named in MAME):

amidar
depthcho
digdug
dkong
frogger
galagao
invadrmr
missile1
pacman
qix
rallyx

How to use

An example of loading the 16x16 version of the TensorFlow dataset. The first 50% will be used for training, and the second 50% will be used for testing:

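import tensorflow_datasets as tfds

# Importing the dataset module makes "classic_arcade_games" known to TFDS
from dataset import classic_arcade_games

# batch_size=-1 loads each split as one full batch;
# as_supervised=True returns (image, label) tuples
(train_images, train_labels), (test_images, test_labels) = tfds.load(
    "classic_arcade_games/16x16",
    split=['train[:50%]', 'train[50%:]'],
    as_supervised=True,
    batch_size=-1
)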

For a full example, take a look at dataset_demo.py in this repository, as well as this TensorFlow tutorial.
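Because a contiguous 50/50 split puts the first part of the data in training and the last part in testing, one alternative is to load the whole split and mix it yourself. The following is a minimal sketch, not part of this repository: the function name stratified_shuffled_split is hypothetical, and it only assumes that labels is a sequence with one label per image. It shuffles the indices of each game separately so that every game appears in both parts.

```python
import random
from collections import defaultdict


def stratified_shuffled_split(labels, train_fraction=0.5, seed=0):
    """Return (train_indices, test_indices), shuffled within each label.

    Hypothetical helper for illustration; `labels` holds one label per image.
    """
    # Group the image indices by their label (one group per game)
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)

    rng = random.Random(seed)  # Fixed seed makes the split reproducible
    train, test = [], []
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        cut = int(len(indices) * train_fraction)
        train.extend(indices[:cut])
        test.extend(indices[cut:])

    # Mix the games together so neither part is ordered by label
    rng.shuffle(train)
    rng.shuffle(test)
    return train, test
```

The returned index lists can then be used to select images and labels, for example with NumPy indexing or tf.gather, after loading the data with split='train'.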

Data

The data in this dataset has been collected by running MAME and manually creating screendumps from the chosen sequence in the game.

Raw data

Unmodified screendumps are stored in /data/mame/original/<mame_id>/*.png
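To see how many raw screendumps exist for each game, the directory layout above can be walked with pathlib. This is a minimal sketch: the function name count_screendumps is hypothetical, and the relative path data/mame/original assumes the script is run from the root of the repository.

```python
from pathlib import Path


def count_screendumps(original_dir):
    """Map each game directory name (MAME ROM set) to its number of PNG files."""
    counts = {}
    for game_dir in sorted(Path(original_dir).iterdir()):
        if game_dir.is_dir():
            counts[game_dir.name] = len(list(game_dir.glob("*.png")))
    return counts


if __name__ == "__main__":
    # Assumption: run from the repository root
    root = Path("data/mame/original")
    if root.is_dir():
        for mame_id, count in count_screendumps(root).items():
            print(f"{mame_id}: {count}")
```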

Modified data

Squared, grayscale versions in different resolutions are available as .zip archives. These images have been created using the script scale_screndumps.py. The archives are stored at these paths:

data/mame/8x8.zip
data/mame/16x16.zip
data/mame/32x32.zip
data/mame/64x64.zip
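The archives can be inspected without unpacking them. The sketch below counts the PNG files per game in one archive. It is not part of this repository: the function name images_per_game is hypothetical, and it assumes that each image inside the archive sits in a directory named after the game's MAME ROM set, which may not match the actual layout.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def images_per_game(archive):
    """Count PNG entries per parent directory name inside a .zip archive.

    Assumption: images are stored as .../<mame_id>/<file>.png in the archive.
    """
    counts = Counter()
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            path = PurePosixPath(name)  # Zip entries always use "/" separators
            if path.suffix.lower() == ".png":
                counts[path.parent.name] += 1
    return dict(counts)
```

For example, images_per_game("data/mame/16x16.zip") would return a dictionary from game name to image count, provided the layout assumption holds.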

Contributing

Please contribute to this dataset by adding MAME screendumps of game sequences. Open a PR that adds around 100 screendumps as data/original/<name-of-game-rom>/<number>.png. Do not apply any effects to the images.
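Before opening a PR, a contribution directory could be checked with a small script like the one below. This is only a sketch and not a tool provided by this repository: the function name check_contribution is hypothetical. It verifies that every file is named <number>.png and starts with the standard 8-byte PNG signature. It does not check the number of files or whether effects were applied.

```python
from pathlib import Path

# Every valid PNG file starts with these 8 bytes
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def check_contribution(game_dir):
    """Return a list of problems found in a directory of screendumps."""
    problems = []
    files = sorted(p for p in Path(game_dir).iterdir() if p.is_file())
    for path in files:
        # File names are expected to look like 1.png, 2.png, ...
        if path.suffix != ".png" or not path.stem.isdigit():
            problems.append(f"{path.name}: expected a name like <number>.png")
            continue
        with path.open("rb") as handle:
            if handle.read(8) != PNG_SIGNATURE:
                problems.append(f"{path.name}: not a PNG file")
    return problems
```

An empty list means no problems were found by these two checks.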

Illustrations

Examples of what the squared screendumps look like.

Performance

Results for the variations of the dataset with 50% of the data used for training and the other 50% used for testing.
