
NSD: Neural Scene Decomposition for Human Motion Capture

CVPR 2019 paper by Helge Rhodin, Victor Constantin, Isinsu Katircioglu, Mathieu Salzmann, Pascal Fua

https://arxiv.org/abs/1903.05684

Please cite the paper in your publications if it helps your research:

@inproceedings{rhodin2019neural,
  author = {Rhodin, Helge and Constantin, Victor and Katircioglu, Isinsu and Salzmann, Mathieu and Fua, Pascal},
  booktitle = {CVPR},
  title = {Neural Scene Decomposition for Multi-Person Motion Capture},
  year = {2019}
}

Features

Learning general image representations has proven key to the success of many computer vision tasks. For example, many approaches to image understanding problems rely on deep networks that were initially trained on ImageNet, mostly because the learned features are a valuable starting point to learn from limited labeled data. However, when it comes to 3D motion capture of multiple people, these features are only of limited use. In this paper, we introduce a self-supervised approach to learning what we call a neural scene decomposition (NSD) that can be exploited for 3D pose estimation. NSD comprises three layers of abstraction to represent human subjects: spatial layout in terms of bounding-boxes and relative depth; a 2D shape representation in terms of an instance segmentation mask; and subject-specific appearance and 3D pose information. By exploiting self-supervision coming from multiview data, our NSD model can be trained end-to-end without any 2D or 3D supervision. In contrast to previous approaches, it works for multiple persons and full-frame images.
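
For orientation, the three abstraction levels can be thought of as the following record (a conceptual sketch only; the field names are hypothetical and not the repo's actual API):

from dataclasses import dataclass
import torch

@dataclass
class NSDRepresentation:
    bboxes: torch.Tensor       # level 1: one bounding box per subject
    depths: torch.Tensor       # level 1: relative depth of each subject
    masks: torch.Tensor        # level 2: instance segmentation masks
    appearance: torch.Tensor   # level 3: subject-specific appearance code
    pose: torch.Tensor         # level 3: latent 3D pose code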

The provided PyTorch implementation includes:

  • Network definition and weights (detector, image encoder, and image decoder)
  • Interactive test code
  • Training code (request our Boxing dataset for training NSD on two persons)

Minimal Dependencies

For testing a pre-trained model, only the following packages are required:

  • PyTorch 0.4 (higher versions might work as well) and torchvision
  • numpy
  • matplotlib
  • pickle
  • imageio

Moreover, you will need a GPU and an X Window System (e.g., XQuartz on macOS) to run the interactive demo.
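
A quick way to verify that the minimal dependencies import and that a GPU is visible (a generic sanity check, not part of the repo):

import torch, torchvision, numpy, matplotlib, imageio

print('torch', torch.__version__)                     # expect 0.4 or newer
print('CUDA available:', torch.cuda.is_available())   # the demo needs a GPU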

Test the pretrained model

A pre-trained model can be tested with

cd NSD/python
python configs/test_detect_encode_decode.py

It outputs synthesized views, detections, instance segmentations, and a relative depth map. Note that this requires an X Window System when executed on a remote server, e.g., via ssh -Y [email protected]. Different view angles can be explored interactively through slider input. It should look like this:

(Screenshot: the NSD interactive viewer)
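
The slider mechanics can be approximated with matplotlib's Slider widget. The sketch below is illustrative only: decode_view is a dummy stand-in for the NSD decoder, whose real logic lives in configs/test_detect_encode_decode.py.

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.widgets import Slider

def decode_view(angle_deg):
    # Dummy stand-in for the NSD decoder's novel-view synthesis.
    x = np.linspace(0, 2 * np.pi, 128) + np.deg2rad(angle_deg)
    return np.tile(np.sin(x), (128, 1))

fig, ax = plt.subplots()
plt.subplots_adjust(bottom=0.2)
im = ax.imshow(decode_view(0.0), cmap='gray')
slider = Slider(plt.axes([0.2, 0.05, 0.6, 0.03]), 'view angle', -180.0, 180.0, valinit=0.0)
slider.on_changed(lambda a: (im.set_data(decode_view(a)), fig.canvas.draw_idle()))
plt.show()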

Training Dependencies

Training your own model requires more dependencies:

  • Ignite (provided in a subdirectory)
  • Visdom (optional, for graphical display of training progress, https://github.com/facebookresearch/visdom; see the launch command below this list)
  • EPFL-Boxing dataset (request it by contacting Helge Rhodin and providing the following information: your full name, your affiliation (send the email from your institutional address), your supervisor (professor), and your intended use case)
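
If you use Visdom, start its server before training and point a browser at http://localhost:8097 (standard Visdom usage, not specific to this repo):

python -m visdom.server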

Self-supervised Representation Learning

After downloading and extracting the dataset, specify the dataset paths in 'configs/config_train_detect_encode_decode.py' by changing the following lines:

'dataset_folder_train' : 'YOUR_PATH_TO_DATASET/EPFL-AmateurBoxingDataset-train',
'dataset_folder_test' : 'YOUR_PATH_TO_DATASET/EPFL-AmateurBoxingDataset-val',

The training is started by executing the following script from within the code root folder.

python configs/train_detect_encode_decode.py

There is quite a bit of debug output. Feel free to remove some of it if you like.

If an error related to parallelization occurs, you can try setting 'num_workers' : 8 in 'configs/config_train_detect_encode_decode.py'.
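
'num_workers' is the standard PyTorch DataLoader parameter; the following minimal illustration (with a dummy dataset standing in for the EPFL-Boxing loader) shows where it takes effect:

import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.randn(100, 3, 128, 128))  # dummy data
# 'num_workers' sets the number of loader worker processes; adjusting it
# (the config above uses 8) can resolve platform-specific multiprocessing errors.
loader = DataLoader(dataset, batch_size=16, num_workers=8, shuffle=True)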

The script will create an "output/trainNVS...." folder to monitor progress (useful in case you don't use Visdom). Every 5k frames, it evaluates on the test set. This and other settings can be changed in 'configs/config_detect_encode_decode.py'.

Test your trained model

You have to set 'network_path' in 'configs/config_test_detect_encode_decode.py' to your "SOME_PATH/output/trainNVS...." folder. The trained model can then be tested as before with

python configs/test_detect_encode_decode.py

You might want to change the test set in configs/test_detect_encode_decode.py to your own dataset.
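
The exact interface NSD expects is defined by the repo's own data loaders; as a generic, hypothetical starting point, a torch-style dataset over a folder of images might look like this:

import glob
import numpy as np
import imageio
import torch
from torch.utils.data import Dataset

class FolderDataset(Dataset):
    """Loads all PNGs from a folder as float tensors in [0, 1]."""
    def __init__(self, folder):
        self.paths = sorted(glob.glob(folder + '/*.png'))
    def __len__(self):
        return len(self.paths)
    def __getitem__(self, i):
        img = np.asarray(imageio.imread(self.paths[i]))   # H x W x 3, uint8
        img = torch.from_numpy(img).permute(2, 0, 1)      # -> C x H x W
        return img.float() / 255.0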
