mir-datasets

This repository exists solely to track MIR datasets and their corresponding metadata in a standardized, structured way. There are multiple consumers of this data.

Importantly, the mir-datasets.yaml file is the One Source of Truth: all other formats and representations of this data (Markdown, HTML, LaTeX tables, what have you) should be derived from this master object. If there is a format for this table that you would like to see, feel free to submit a pull request!

Acknowledgement

This project extends years of effort by Alexander Lerch to maintain a rolling list of MIR datasets on his website. Many thanks and high fives for the diligent work and the foresight to recognize the value this collection has to the ISMIR community and beyond!

Schema

Each record in the mir-datasets.yaml file adheres to one of the following formats:

# Single metadata field
key1:
  url: http://path/to/website
  metadata: tempo
  contents: 123 songs
  audio: no

# Multiple metadata fields
key2:
  url: http://path/to/something.html
  metadata:
   - tempo
   - lyrics: http://my/lyrics/page
  contents: 10s snippets
  audio: yes

Note that when metadata is given as a list, each entry may be either a plain field name or a field name mapped to a URL for that field.
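
Because metadata can be either a single string or a list mixing plain names with name-to-URL mappings, consumers may find it easier to normalize both shapes first. The following is a minimal sketch and not part of this repository: normalize_metadata is a hypothetical helper, and the inputs are written as the Python objects a YAML parser would typically return for the two examples above.

```python
from typing import Dict, List, Optional, Union

MetadataEntry = Union[str, Dict[str, str]]


def normalize_metadata(
    metadata: Union[str, List[MetadataEntry]],
) -> Dict[str, Optional[str]]:
    """Map each metadata field name to its URL, or None if no URL is given."""
    # A single field is written as a bare string; treat it as a one-item list.
    entries = [metadata] if isinstance(metadata, str) else metadata
    normalized: Dict[str, Optional[str]] = {}
    for entry in entries:
        if isinstance(entry, dict):
            # e.g. {"lyrics": "http://my/lyrics/page"}
            normalized.update(entry)
        else:
            normalized[entry] = None
    return normalized


# The metadata values of the two example records, as parsed Python objects.
single = "tempo"
multiple = ["tempo", {"lyrics": "http://my/lyrics/page"}]

print(normalize_metadata(single))    # {'tempo': None}
print(normalize_metadata(multiple))  # {'tempo': None, 'lyrics': 'http://my/lyrics/page'}
```

With both shapes reduced to one mapping, downstream rendering code only has to handle a single case.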

Contributing

You are encouraged to add (or update) dataset listings in this file! When doing so, please try to adhere to the following:

  • Preserve order in the list. While this isn't technically necessary, it'll help others make sense of it.
  • Provide as much info as possible.
  • Make sure that you've added valid YAML. We (will) have tests, and they will (eventually) fail on Travis.
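
As a rough illustration of the kind of check the tests mentioned above might perform, the sketch below inspects one already-parsed record. It is an example under stated assumptions, not this repository's test suite: check_record is hypothetical, and the expected field names are simply taken from the Schema section.

```python
from typing import Any, List

# Field names taken from the Schema section; treating all four as expected
# is an assumption made for this sketch.
EXPECTED_FIELDS = ("url", "metadata", "contents", "audio")


def check_record(key: str, record: Any) -> List[str]:
    """Return a list of human-readable problems found in one record."""
    if not isinstance(record, dict):
        return [f"{key}: record should be a mapping of fields"]
    problems = [
        f"{key}: missing field '{name}'"
        for name in EXPECTED_FIELDS
        if name not in record
    ]
    metadata = record.get("metadata")
    if metadata is not None and not isinstance(metadata, (str, list)):
        problems.append(f"{key}: 'metadata' should be a string or a list")
    return problems


# A record that forgot its 'contents' field.
record = {"url": "http://path/to/website", "metadata": "tempo", "audio": False}
for problem in check_record("key1", record):
    print(problem)  # key1: missing field 'contents'
```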

Scripts

You can render the data as either Markdown or JavaScript by specifying the corresponding output format, for example:

$ ./render_datasets.py mir-datasets.yaml outputs/mir-datasets.md

This produces the outputs/mir-datasets.md file contained in this repository. Note that this output file should not be modified directly:

  • If the information is incorrect, update the source YAML file
  • If the formatting is wrong, please help fix the script (or open an issue)
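
For readers curious what such a rendering step involves, here is a minimal sketch of turning parsed records into a Markdown table. It does not reproduce render_datasets.py: the function name render_markdown, the column choice, and the table layout are all assumptions made for illustration.

```python
from typing import Any, Dict, List


def _metadata_cell(metadata: Any) -> str:
    """Format a metadata value (string or list) for one table cell."""
    entries = [metadata] if isinstance(metadata, str) else list(metadata or [])
    parts: List[str] = []
    for entry in entries:
        if isinstance(entry, dict):
            # A {name: url} entry becomes a Markdown link.
            parts.extend(f"[{name}]({url})" for name, url in entry.items())
        else:
            parts.append(str(entry))
    return ", ".join(parts)


def render_markdown(records: Dict[str, Dict[str, Any]]) -> str:
    """Render records as a Markdown table, one row per dataset."""
    lines = [
        "| dataset | metadata | contents | audio |",
        "| --- | --- | --- | --- |",
    ]
    for key, record in records.items():
        name = f"[{key}]({record['url']})" if record.get("url") else key
        # Accept either a parsed boolean or the literal string "yes".
        audio = "yes" if record.get("audio") in (True, "yes") else "no"
        lines.append(
            f"| {name} | {_metadata_cell(record.get('metadata'))} "
            f"| {record.get('contents', '')} | {audio} |"
        )
    return "\n".join(lines)


records = {
    "key2": {
        "url": "http://path/to/something.html",
        "metadata": ["tempo", {"lyrics": "http://my/lyrics/page"}],
        "contents": "10s snippets",
        "audio": True,
    }
}
print(render_markdown(records))
```

Keeping the rendering logic in a pure function like this means a formatting fix is a change to the script alone, and the YAML source stays untouched.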

Testing

You can verify that the JS table is produced correctly by running a local HTTP server from the repository root, which serves index.html:

$ python -m http.server

Note that the table will inherit the CSS of the page that renders it.

ISMIR-Home

This repository serves the dataset information shown on the ISMIR website. We have made a conscious choice to make this work with static web technologies. While this makes serving easier, it requires that the ISMIR website consume this table as a JS source. This repository achieves this through the following:

  • maintain a markdown table here
  • when updated, render it as JS
  • commit to the repository (manually)
  • serve as an asset via github pages

There is an opportunity to automate this via, e.g., Travis CI, but activity on this repository hasn't been high enough to warrant it yet.
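
The "render it as JS" step above can be pictured with the sketch below, which wraps an HTML table string in a small script that a static page can load with a script tag. This is only one plausible approach under stated assumptions: the function name render_js, the use of document.write, and the sample markup are hypothetical and may differ from what this repository actually emits.

```python
import json


def render_js(table_html: str) -> str:
    """Wrap an HTML fragment in a JavaScript statement that writes it to the page."""
    # json.dumps yields a double-quoted string literal with quotes, backslashes
    # and newlines escaped, which is also a valid JavaScript string literal.
    literal = json.dumps(table_html)
    # Avoid a literal "</script>" inside the script body, which would end the
    # enclosing <script> element early if this code were ever inlined.
    literal = literal.replace("</", "<\\/")
    return f"document.write({literal});\n"


# Hypothetical sample markup standing in for the rendered table.
html = "<table><tr><td>key1</td></tr></table>"
print(render_js(html))
```

Because the table arrives as plain markup written into the host page, it picks up that page's CSS, which matches the behavior noted in the Testing section.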
