RESTpectfulScraper

This simple scraper harvests Yu-Gi-Oh card data from a .csv in the /input directory and turns it into a game-formatted .csv placed in the /output directory.

Sourcing data

The .csv feeding this scraper is generated from the .cdb files that power EDO Pro, the popular Yu-Gi-Oh online simulator. You can extract most of the card list from EDO Pro's program files, but that archive appears to be incomplete and cuts off sometime before the end of 2022. For cards released since then, I had to UNION the EDO Pro "base" .cdb file with .cdb files sourced from ProjectIgnis/DeltaPuppetOfStrings. Specifically, I used cards.delta.cdb because it appeared to contain the collection with all TCG data confirmed by printing rather than by pre-release fan interpretation.

The BabelCDB repo may be better suited to what I want to do. That's something worth investigating.

To manually generate .csv files from .cdb files, use DB Browser for SQLite. Open the main .cdb file with this tool, then attach cards.delta.cdb to the same session by clicking the "attach database" button and following the instructions. The only values we need for our .csv are texts.id and texts.name.
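The same attach-and-export step can also be scripted. What follows is a minimal sketch using Python's standard sqlite3 and csv modules, not the method this repo actually uses. The function name export_card_names, the file paths passed to it, and the output column headers are all assumptions made for illustration; only the texts table and its id and name columns come from the description above.

```python
import csv
import sqlite3


def export_card_names(base_cdb, delta_cdb, out_csv):
    """Write the id and name of every card found in either database to a CSV."""
    conn = sqlite3.connect(base_cdb)
    try:
        # ATTACH is the SQL counterpart of the "attach database" button.
        conn.execute("ATTACH DATABASE ? AS delta", (delta_cdb,))
        # UNION drops rows that are identical in both files.
        rows = conn.execute(
            "SELECT id, name FROM main.texts "
            "UNION "
            "SELECT id, name FROM delta.texts "
            "ORDER BY id"
        ).fetchall()
    finally:
        conn.close()

    with open(out_csv, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["id", "name"])  # header names are an assumption
        writer.writerows(rows)
    return len(rows)
```

Note that UNION only removes exact duplicates. If the same id appears with different names in the two files, both rows end up in the .csv and would need to be reconciled by hand.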


Future goals

  1. Automate the data collection process

    When cards get added to cards.delta.cdb, the EDO Pro maintainers make a commit. If we can trigger an update to our .csv whenever that file changes, we're in business.

  2. Record all intermediate API responses

    As a gut check, it would be great if we could cache the JSON responses from the YGOPRODeck API in the /output directory. There's juicy data in those responses that could be helpful later on.
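The second goal could look roughly like the sketch below, which saves each raw response to disk and reuses it on later runs. This is only an illustration under stated assumptions: the endpoint URL and its id query parameter are assumed and should be checked against the YGOPRODeck API documentation, and the names fetch_card and cached_card and the output/responses directory are hypothetical.

```python
import json
from pathlib import Path
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint; verify against the YGOPRODeck API documentation.
API_URL = "https://db.ygoprodeck.com/api/v7/cardinfo.php"


def fetch_card(card_id):
    """Download the raw JSON text for one card id."""
    with urlopen(f"{API_URL}?{urlencode({'id': card_id})}") as response:
        return response.read().decode("utf-8")


def cached_card(card_id, cache_dir="output/responses", fetch=fetch_card):
    """Return the parsed response, calling the API only on a cache miss."""
    path = Path(cache_dir) / f"{card_id}.json"
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    raw = fetch(card_id)
    data = json.loads(raw)  # fail before writing if the body is not valid JSON
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(raw, encoding="utf-8")
    return data
```

Passing the fetch function as a parameter keeps the cache logic testable without network access, and storing the unparsed response text preserves every field for later use.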
