Git Product home page Git Product logo

video-preprocessing's Introduction

Video Preprocessing

This repository provides tools for preprocessing videos for TaiChi, VoxCeleb and UvaNemo dataset used in paper.

Dowloading videos and cropping according to precomputed bounding boxes

  1. Instal requirments:
pip install -r requirements.txt
  1. Load youtube-dl:
wget https://yt-dl.org/downloads/latest/youtube-dl -O youtube-dl
chmod a+rx youtube-dl
  1. Run script to download videos, there are 2 formats that can be used for storing videos one is .mp4 and another is folder with .png images. While .png images occupy significantly more space, the format is loss-less and have better i/o performance when training.

Taichi

python load_videos.py --metadata taichi-metadata.csv --format .mp4 --out_folder taichi --workers 8

select number of workers based on number of cpu avaliable. Note .png format take aproximatly 80GB.

VoxCeleb

python load_videos.py --metadata vox-metadata.csv --format .mp4 --out_folder vox --workers 8

Note .png format take aproximatly 300GB.

UvaNemo Since videos is not avaliable on youtube you have to download videos from official website, and run:

python load_videos.py --metadata nemo-metadata.csv --format .mp4 --out_folder nemo --workers 8 --video_folder path/to/original/videos

Note .png format take aproximatly 18GB.

Preprocessing VoxCeleb dataset

If you need to change cropping strategy for VoxCeleb dataset or produce new bounding box annotations folow these steps:

  1. Load vox-celeb1(vox-celeb2) annotations:
wget www.robots.ox.ac.uk/~vgg/data/voxceleb/data/vox1_test_txt.zip
unzip vox1_test_txt.zip

wget www.robots.ox.ac.uk/~vgg/data/voxceleb/data/vox1_dev_txt.zip
unzip vox1_dev_txt.zip
wget www.robots.ox.ac.uk/~vgg/data/voxceleb/data/vox2_test_txt.zip
unzip vox2_test_txt.zip

wget www.robots.ox.ac.uk/~vgg/data/voxceleb/data/vox2_dev_txt.zip
unzip vox2_dev_txt.zip
  1. Load youtube-dl:
wget https://yt-dl.org/downloads/latest/youtube-dl -O youtube-dl
chmod a+rx youtube-dl
  1. Install face-alignment library:
git clone https://github.com/1adrianb/face-alignment
cd face-alignment
pip install -r requirements.txt
python setup.py install
  1. Install ffmpeg
sudo apt-get install ffmpeg
  1. Run preprocessing (assuming 8 gpu, and 5 workers per gpu).
python crop_vox.py --workers 40 --device_ids 0,1,2,3,4,5,6,7 --format .mp4 --dataset_version 2
python crop_vox.py --workers 40 --device_ids 0,1,2,3,4,5,6,7 --format .mp4 --dataset_version 1 --data_range 10000-11252

Preprocessing TaiChi dataset

If you need to change cropping strategy for TaiChi dataset or produce new bounding box annotations folow these steps:

  1. Download videos based on annotations:
python load_videos.py --metadata taichi-metadata.csv --format .mp4 --out_folder taichi --workers 8 --video_folder youtube-taichi --no_crop
  1. Install mask-rcnn benchmark. Follow the instalation guide https://github.com/facebookresearch/maskrcnn-benchmark/blob/master/INSTALL.md

  2. Load youtube-dl:

wget https://yt-dl.org/downloads/latest/youtube-dl -O youtube-dl
chmod a+rx youtube-dl
  1. Run preprocessing (assuming 8 gpu, and 5 workers per gpu).
python crop_taichi.py --workers 40 --device_ids 0,1,2,3,4,5,6,7 --format .mp4

Preprocessing Nemo dataset

If you need to change cropping strategy for Nemo dataset or produce new bounding box annotations folow these steps:

  1. Install face-alignment library:
git clone https://github.com/1adrianb/face-alignment
cd face-alignment
pip install -r requirements.txt
python setup.py install
  1. Download videos from official website, and run:
python crop_nemo.py --in_folder /path/to/videos --out_folder nemo --device_ids 0,1 --workers 8 --format .mp4

Additional notes

Citation:

@InProceedings{Siarohin_2019_NeurIPS,
  author={Siarohin, Aliaksandr and Lathuilière, Stéphane and Tulyakov, Sergey and Ricci, Elisa and Sebe, Nicu},
  title={First Order Motion Model for Image Animation},
  booktitle = {Conference on Neural Information Processing Systems (NeurIPS)},
  month = {December},
  year = {2019}
}

video-preprocessing's People

Contributors

aliaksandrsiarohin avatar abcdea avatar reznet avatar maxopoly avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.