amq-ranked-data

tools to collect data from the spreadsheet and load it

ranked data

Spreadsheet with the ranked data JSON files: https://docs.google.com/spreadsheets/d/1g0jW7k-GJiHueQ0ZVYe4WilupnUkBYLVlbB9GEdqQ98/

The zips directory contains monthly/seasonal zip files with all the JSON files for that month/season.

The two scripts are meant to download all the ranked data listed in the spreadsheet and load it conveniently for analysis/usage. The scraping part mostly works well, but some data cleaning/reformatting is necessary.

To download the JSON files, open the Google Sheet and use File -> Download -> CSV. Then pass the saved CSV file to the amq_scraper.py script to get the JSON files. The usage is: amq_scraper.py <sheet csv> <output dir>
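As a rough illustration of the CSV step, the sketch below pulls every URL-looking cell out of an exported sheet. It is not the code in amq_scraper.py: the function name extract_links is hypothetical, and because the sheet's column layout is not described here, it scans every cell instead of assuming a specific column.

```python
import csv
import io


def extract_links(csv_text):
    """Return every cell that looks like an http(s) URL, in row order.

    Hypothetical helper: the real sheet layout is not assumed, so all
    cells are scanned rather than one named column.
    """
    links = []
    for row in csv.reader(io.StringIO(csv_text)):
        for cell in row:
            cell = cell.strip()
            if cell.startswith(("http://", "https://")):
                links.append(cell)
    return links
```

Each returned link could then be downloaded into the output directory, which is the part amq_scraper.py handles.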

Some older files may contain JSON errors that have to be corrected manually. However, if you download the JSON files from this repo rather than scraping them again, those issues should already be taken care of. Another issue is that some links to older JSON files are now dead, but you can still get those files from this repo.

The amq_loader.py script has two functions: one for loading all ranked files from a directory and one for reformatting the data to make it consistent. Documentation for these is in amq_loader.py. The clean_ranked_data function will work the way I intended only if you exclude the files listed below. As of now, I cannot be 100% sure the data is completely ready.
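For readers who only want the general shape of the loading step, here is a minimal sketch. It is not the loader in amq_loader.py: load_json_dir is a hypothetical name, and the sketch assumes only that the ranked files are .json files somewhere under one directory. Files that fail to parse are reported separately, since some older files may contain JSON errors.

```python
import json
from pathlib import Path


def load_json_dir(directory):
    """Load every .json file under `directory` (hypothetical helper).

    Returns (loaded, failed): `loaded` maps file name to parsed data,
    `failed` maps file name to the JSON error message.
    """
    loaded = {}
    failed = {}
    for path in sorted(Path(directory).rglob("*.json")):
        try:
            with open(path, encoding="utf-8") as f:
                loaded[path.name] = json.load(f)
        except json.JSONDecodeError as err:
            # Keep going so one broken file does not hide the rest.
            failed[path.name] = str(err)
    return loaded, failed
```

Anything listed in the failed mapping would need the manual correction mentioned above.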

data to use

Upon further inspection, I decided to drop some of the early data that is missing the total player count. The percentage of players guessing correctly is important to the analysis, so the number of correct players does not mean much without the total number of players. Also, many of these older files are missing video links and only provide one anime name (it appears to be the English one). These were probably collected with older versions of the userscripts.

The files I am dropping are:

  • everything in amq_2019s03 (2019 season 3)
  • everything in amq_2020s01 (2020 season 1)
  • the following in amq_2020s02 (2020 season 2)
    • amq_2020s02_01_2020-01-27_central.json
    • amq_2020s02_01_2020-01-27_west.json
    • amq_2020s02_02_2020-01-28_west.json
    • amq_2020s02_03_2020-01-29_west.json
    • amq_2020s02_04_2020-01-30_west.json
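The exclusion list above can be expressed as a small filter. This is a sketch, not part of amq_loader.py: is_excluded is a hypothetical name, and it assumes that files from the two dropped seasons have names starting with amq_2019s03 or amq_2020s01, in the same way the amq_2020s02 files listed above start with their season prefix.

```python
# Whole seasons to drop (assumes file names start with the season prefix).
EXCLUDED_PREFIXES = ("amq_2019s03", "amq_2020s01")

# Individual files to drop from amq_2020s02.
EXCLUDED_FILES = {
    "amq_2020s02_01_2020-01-27_central.json",
    "amq_2020s02_01_2020-01-27_west.json",
    "amq_2020s02_02_2020-01-28_west.json",
    "amq_2020s02_03_2020-01-29_west.json",
    "amq_2020s02_04_2020-01-30_west.json",
}


def is_excluded(filename):
    """Return True if `filename` is on the drop list (hypothetical helper)."""
    return filename.startswith(EXCLUDED_PREFIXES) or filename in EXCLUDED_FILES
```

Applying the filter before cleaning keeps the dropped files out of the data passed to clean_ranked_data.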

What remains should be consistent enough for the percent-correct analysis I am planning. The only remaining issue I see is that some files do not have all 75 songs (not surprising, because AMQ is often unreliable and you may get disconnected even with a good internet connection). Usually only a few songs are missing, so it should not be a big problem for analysis. The bigger issue is that several ranked matches are missing from older seasons, which I cannot do anything about.
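The two checks described here can be sketched on plain numbers, without assuming anything about the JSON field names, which are not documented in this README. Both function names are hypothetical, and the caller is assumed to have already counted the songs and players for each file.

```python
EXPECTED_SONGS = 75  # songs in a complete ranked match, per the text above


def percent_correct(correct_count, total_players):
    """Percentage of players who guessed correctly, or None if the
    total player count is missing or zero (as in the dropped files)."""
    if not total_players or total_players <= 0:
        return None
    return 100.0 * correct_count / total_players


def find_incomplete(song_counts, expected=EXPECTED_SONGS):
    """Map each file name with fewer than `expected` songs to the
    number of songs it is missing."""
    return {
        name: expected - count
        for name, count in song_counts.items()
        if count < expected
    }
```

For example, percent_correct(30, 120) gives 25.0, and a file with 72 songs is reported as missing 3.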
