sports1m-crawler's Introduction

Sports1M Tools

Requirements

  1. youtube-dl (https://github.com/rg3/youtube-dl/)

Fetch Sports1M

To download all the Sports1M videos, run the following commands:

    mkdir -p "$VIDEO_PATH"
    chmod +x fetch_sports1m_videos.sh
    ./fetch_sports1m_videos.sh "$VIDEO_PATH" all_vid.txt NUM_WORKERS

Here $VIDEO_PATH is the directory where the videos will be stored. If you already have a subset of the videos, pass the directory that contains them. NUM_WORKERS is the number of workers that download the dataset concurrently.
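The contents of fetch_sports1m_videos.sh are not shown here, so the following is only a minimal Python sketch of the idea described above: read the ID list, skip videos that already exist in the target directory, and hand the rest to a pool of workers that each call youtube-dl. The function names (pending_ids, build_command, download_all) and the <id>.<ext> file naming are assumptions made for illustration, not part of the repository.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor


def pending_ids(video_ids, video_dir):
    """Return the IDs that have no file named <id>.<ext> in video_dir yet."""
    # Assumption: finished downloads are stored as <id>.<ext>.
    have = {os.path.splitext(name)[0] for name in os.listdir(video_dir)}
    return [vid for vid in video_ids if vid not in have]


def build_command(video_id, video_dir):
    """Build a youtube-dl command that saves the video as <video_dir>/<id>.<ext>."""
    return [
        "youtube-dl",
        "-o", os.path.join(video_dir, "%(id)s.%(ext)s"),
        "https://www.youtube.com/watch?v=" + video_id,
    ]


def download_all(id_file, video_dir, num_workers):
    """Download every missing video; return a dict of ID -> youtube-dl exit code."""
    with open(id_file) as f:
        ids = [line.strip() for line in f if line.strip()]
    todo = pending_ids(ids, video_dir)

    def run(vid):
        # subprocess.call returns the exit code of the youtube-dl process.
        return subprocess.call(build_command(vid, video_dir))

    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        codes = list(pool.map(run, todo))
    return dict(zip(todo, codes))
```

Threads are sufficient in this sketch because each worker spends its time waiting on an external youtube-dl process. A call such as download_all("all_vid.txt", video_dir, 8) would mirror the three arguments of the shell script.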

Generate all_vid.txt

Download the JSON annotation zip from the project webpage. Extracting it yields two JSON files: sports1m_train.json and sports1m_test.json. Run the following Python script to create all_vid.txt, which contains the YouTube IDs of all Sports1M videos:

    python generate_all_videos_txt_file.py
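For readers who want to see roughly what such a step involves, here is a minimal sketch of collecting IDs from the two annotation files. It is not the code of generate_all_videos_txt_file.py. It assumes that each JSON file holds a list of objects and that each object stores the YouTube ID under an "id" key; both the layout and the function names are assumptions to check against the real files.

```python
import json


def load_entries(paths):
    """Load and concatenate the annotation lists from several JSON files."""
    entries = []
    for path in paths:
        with open(path) as f:
            # Assumption: the top-level JSON value is a list of video records.
            entries.extend(json.load(f))
    return entries


def unique_ids(entries):
    """Return video IDs in first-seen order, without duplicates."""
    seen = set()
    ids = []
    for entry in entries:
        vid = entry["id"]  # Assumption: the YouTube ID lives under "id".
        if vid not in seen:
            seen.add(vid)
            ids.append(vid)
    return ids


def write_ids(ids, out_path):
    """Write one video ID per line."""
    with open(out_path, "w") as f:
        for vid in ids:
            f.write(vid + "\n")
```

Under these assumptions, write_ids(unique_ids(load_entries(["sports1m_train.json", "sports1m_test.json"])), "all_vid.txt") would produce a file with one ID per line.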

We filter out videos with excessive duration, according to the recommendations on the [project page](http://cs.stanford.edu/people/karpathy/deepvideo/) of the dataset, by setting a DURATION THRESHOLD. You can use the following script to visualize a histogram of video durations in the Sports1M dataset:

    python histogram_durations.py
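The filtering and the histogram can be illustrated without any plotting library. The sketch below is not the code of histogram_durations.py. It assumes each annotation record carries a numeric "duration" field in seconds, and the names filter_by_duration, duration_histogram, threshold, and bin_width are hypothetical.

```python
from collections import Counter


def filter_by_duration(entries, threshold):
    """Keep only records whose duration does not exceed the threshold."""
    # Assumption: each record has a numeric "duration" field, in seconds.
    return [e for e in entries if e["duration"] <= threshold]


def duration_histogram(durations, bin_width):
    """Count durations per bin; each key is the lower edge of its bin."""
    counts = Counter(int(d // bin_width) * bin_width for d in durations)
    return dict(sorted(counts.items()))
```

Printing the histogram for a few candidate thresholds is one way to see how many videos a given cutoff would discard before committing to it.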
