Video Game Encyclopedia

Visit the Kaggle page for the full game dataset.

Directory Tree

.
|-- Combine_game_info.ipynb
|-- Get_game_id.ipynb
|-- Get_game_info.ipynb
`-- data
    |-- game_id.csv
    |-- game_info.csv
    |-- game_id
    |   |-- 1.json
    |   |-- 2.json
    |   `-- *.json
    `-- game_info
        |-- 1.json
        |-- 2.json
        `-- *.json

How to Start

pip3 install -r requirements.txt

Steps to replicate the dataset:

  1. Run Get_game_id.ipynb. It walks through every page of the game list, starting from https://api.rawg.io/api/games?page=1, and saves one JSON file per page as ./data/game_id/*.json, where * is the page number. When it finishes, it creates ./data/game_id.csv, which summarizes all downloaded JSON files in preparation for the next step.
  2. Run Get_game_info.ipynb. It requests the details of each game from https://api.rawg.io/api/games/ and saves one JSON file per game as ./data/game_info/*.json, where * is the game id.
  3. Run Combine_game_info.ipynb. It goes through each game in ./data/game_info/, combines the data, and saves it as ./data/game_info.csv. game_info.csv contains the final dataset.
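
Step 3 can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function name `combine_game_info`, the assumption that each JSON file holds one flat object, and the assumption that multi-valued fields are stored as lists of strings are all hypothetical.

```python
import csv
import json
from pathlib import Path


def combine_game_info(json_dir, csv_path, columns):
    """Merge one-game-per-file JSON documents into a single CSV.

    Assumptions (not taken from the notebook): each file holds a flat
    JSON object, file names are numeric ids, and multi-valued fields
    such as platforms are lists of strings.
    """
    rows = []
    # Sort numerically by file stem so rows appear in game-id order.
    paths = sorted(Path(json_dir).glob("*.json"), key=lambda p: int(p.stem))
    for path in paths:
        with path.open(encoding="utf-8") as f:
            game = json.load(f)
        row = {}
        for col in columns:
            value = game.get(col)
            if isinstance(value, list):
                # Multi-valued fields are joined with double pipes.
                value = "||".join(str(v) for v in value)
            row[col] = "" if value is None else value
        rows.append(row)

    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=columns)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Calling `combine_game_info("./data/game_info", "./data/game_info.csv", columns)` with the list of wanted column names would then produce one CSV row per JSON file.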

Important Notes:

  • Only the needed information is saved in the JSON files:
    • ./data/game_id holds 17,000 files and takes up ~10 MB.
    • ./data/game_info holds 350,000 files and takes up ~170 MB.
  • To speed up downloading from the RAWG API, Steps 1 and 2 use multithreading.
    • Step 1 takes ~40 minutes with 50 threads
    • Step 2 takes ~100 minutes with 100 threads
    • Step 3 takes ~5 minutes
  • When a thread fails to get data, it skips to the next game or page without any notification. To make sure you get every game in the database, run Step 1 and Step 2 multiple times; files that were already downloaded are skipped.
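
The threaded download with skip-on-rerun behavior described in these notes could look roughly like the sketch below. It is not the notebooks' code: `download_missing` and the `fetch` callable are hypothetical names, and, unlike the silent skipping described above, this variant returns the ids that failed so a rerun can be targeted.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def download_missing(ids, out_dir, fetch, max_workers=50):
    """Fetch every id that has no JSON file yet; return the ids that failed.

    `fetch` is a hypothetical callable that takes an id and returns a
    JSON-serializable dict, for example a thin wrapper around an HTTP GET.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    def worker(item_id):
        target = out_dir / f"{item_id}.json"
        if target.exists():
            return None  # downloaded on a previous run, so skip it
        try:
            data = fetch(item_id)
        except Exception:
            return item_id  # report the failure instead of hiding it
        # Write to a temporary file first so an interrupted run never
        # leaves a partial .json file that a later run would skip.
        tmp = target.with_suffix(".tmp")
        tmp.write_text(json.dumps(data), encoding="utf-8")
        tmp.replace(target)
        return None

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(worker, ids))
    return [r for r in results if r is not None]
```

Running the function again with the same ids only retries the ones that are still missing, which mirrors the "run it multiple times" advice above.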

Limitations:

  • To reduce the size of the downloaded files and the final CSV dataset, not all of the JSON information is downloaded. If you need other fields, change how the JSON data is handled in Step 2.
  • Although multithreading is used, the whole process can take up to ~3 hours because of the large amount of data.
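
As an illustration of the kind of change the first limitation refers to, the sketch below trims a raw game record down to a chosen set of fields before saving it. The helper names `trim_game` and `names_of` are hypothetical, the default field selection is only an example, and the assumption that list fields such as genres arrive as objects with a "name" key should be checked against the actual API response.

```python
def names_of(items):
    """Reduce a list of {"name": ...} objects to a plain list of names."""
    return [item["name"] for item in (items or []) if "name" in item]


def trim_game(raw,
              scalar_fields=("id", "slug", "name", "metacritic", "released"),
              list_fields=("genres",)):
    """Keep only the selected fields of a raw game record.

    Everything not listed in scalar_fields or list_fields is dropped,
    which is what keeps the saved JSON files small.
    """
    trimmed = {key: raw.get(key) for key in scalar_fields}
    for key in list_fields:
        trimmed[key] = names_of(raw.get(key))
    return trimmed
```

Adding a field to the saved data is then a matter of extending `scalar_fields` or `list_fields`.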

Context

This is a game dataset containing 345,667 games on over 50 platforms, including mobile. All game information was obtained using Python and the RAWG API. The dataset was last updated on November 10, 2019. If you are interested in obtaining more recent games, visit the GitHub page for more information.

Content

Each row contains information about one game. Several columns, such as platforms and genres, can hold multiple values; in those columns the values are separated by double pipes (||). All game information was last updated on November 10, 2019.

Column definitions:

  • id: A unique ID identifying this game in the RAWG database
  • slug: A unique slug identifying this game in the RAWG database
  • name: Name of the game
  • metacritic: Rating of the game on Metacritic
  • released: The date the game was released
  • tba: To be announced state
  • updated: The date the game was last updated
  • website: Game Website
  • rating: Rating given by RAWG users
  • rating_top: Maximum rating
  • playtime: Hours needed to complete the game
  • achievements_count: Number of achievements in the game
  • ratings_count: Number of RAWG users who rated the game
  • suggestions_count: Number of RAWG users who suggested the game
  • game_series_count: Number of games in the series
  • reviews_count: Number of RAWG users who reviewed the game
  • platforms: Platforms the game was released on. Separated by ||
  • developers: Game developers. Separated by ||
  • genres: Game genres. Separated by ||
  • publishers: Game publishers. Separated by ||
  • esrb_rating: ESRB ratings
  • added_status_yet: Number of RAWG users who marked the game as "Not played"
  • added_status_owned: Number of RAWG users who marked the game as "Owned"
  • added_status_beaten: Number of RAWG users who marked the game as "Completed"
  • added_status_toplay: Number of RAWG users who marked the game as "To play"
  • added_status_dropped: Number of RAWG users who marked the game as "Played but not beaten"
  • added_status_playing: Number of RAWG users who marked the game as "Playing"
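
Because platforms, developers, genres, and publishers pack several values into one cell, they need to be split on || after loading. The sketch below shows one way to do that with the standard library; the function name `read_games` is hypothetical, and UTF-8 encoding of the CSV file is an assumption.

```python
import csv
import io

# Columns documented above as "Separated by ||".
MULTI_VALUE_COLUMNS = ("platforms", "developers", "genres", "publishers")


def read_games(fileobj, multi=MULTI_VALUE_COLUMNS):
    """Yield one dict per game, with multi-valued columns split into lists."""
    for row in csv.DictReader(fileobj):
        for col in multi:
            cell = row.get(col) or ""
            # An empty cell becomes an empty list rather than [""].
            row[col] = cell.split("||") if cell else []
        yield row


# Small in-memory example standing in for game_info.csv.
sample = io.StringIO(
    "id,name,platforms,developers,genres,publishers\n"
    "1,Foo,PC||Xbox One,DevA,Action||RPG,\n"
)
games = list(read_games(sample))
```

For the real file, pass a handle opened with `open("data/game_info.csv", newline="", encoding="utf-8")` instead of the in-memory sample.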

Acknowledgements

Thanks to RAWG for providing a fast and easy-to-use API.
Icon made by Good Ware from www.flaticon.com

Inspiration

With this data, one can build a game recommendation platform and draw insights about the gaming industry and gaming trends.

Contributors

trung-hn
