RGBD-Dog: Predicting Canine Pose from RGBD Sensors

Abstract

The automatic extraction of animal 3D pose from images without markers is of interest in a range of scientific fields. Most work to date predicts animal pose from RGB images, based on 2D labelling of joint positions. However, due to the difficult nature of obtaining training data, no ground truth dataset of 3D animal motion is available to quantitatively evaluate these approaches. In addition, a lack of 3D animal pose data also makes it difficult to train 3D pose-prediction methods in a similar manner to the popular field of body-pose prediction. In our work, we focus on the problem of 3D canine pose estimation from RGBD images, recording a diverse range of dog breeds with several Microsoft Kinect v2s, simultaneously obtaining the 3D ground truth skeleton via a motion capture system. We generate a dataset of synthetic RGBD images from this data. A stacked hourglass network is trained to predict 3D joint locations, which is then constrained using prior models of shape and pose. We evaluate our model on both synthetic and real RGBD images and compare our results to previously published work fitting canine models to images. Finally, despite our training set consisting only of dog data, visual inspection implies that our network can produce good predictions for images of other quadrupeds -- e.g. horses or cats -- when their pose is similar to that contained in our training set.

Link to paper and supplemental material

YouTube

Dataset

Details on accessing the data will be posted in the next few days (as of June 8th 2020)

Contents

Our dataset consists of five similar motions for five dogs:

  • walking in an approximately straight line
  • trotting in an approximately straight line
  • jumping over poles
  • walking over poles
  • stepping/jumping on and off a table approximately 30cm in height.

For each sequence, the dog is accompanied by its handler. This person is not wearing a motion capture suit and no skeleton data of the person is provided.

For each dog, this data is available in the form of:

  • 3D marker locations
  • the solved skeleton joint rotations
  • the neutral mesh of the dog
  • Linear Blend Skinning weights
  • multi-view HD RGB footage recorded at 59.97 fps
  • multi-view RGB and RGB-D images from the Microsoft Kinect recording at approximately 6 fps.

The HD RGB footage is available in 4K resolution on request. The number of cameras used per dog varied between eight and ten for the HD RGB cameras and between five and six for the Kinects.

Note that the first frame of every .bvh file is the neutral pose of the dog. As such, frame F for camera C is frame F+1 in the .bvh/skeleton data.
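The frame offset described above can be captured in a small helper. This is a minimal sketch, not part of the provided source code: the function name `bvh_frame_for_camera_frame` is hypothetical, and it assumes camera frames and .bvh frames are counted with the same indexing convention.

```python
def bvh_frame_for_camera_frame(camera_frame: int) -> int:
    """Return the .bvh frame index that matches a camera frame index.

    The first frame of every .bvh file holds the neutral pose, so the
    motion frames are shifted by one relative to the camera frames.
    """
    if camera_frame < 0:
        raise ValueError("camera_frame must be non-negative")
    return camera_frame + 1
```

When looking up the skeleton pose for an image, apply this offset to the image's frame number before indexing into the .bvh data.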

Layout

Data for each dog is located in its own folder. The structure of this folder is as follows:

  • calibration
    • sony
      • calibFile_CAMERA_ID
      • ...
    • kinect_rgb
      • calibFile_CAMERA_ID
      • ...
    • kinect_depth
      • calibFile_CAMERA_ID
      • ...
  • meta
    • neutralMesh.obj
    • skinningWeights.mat
    • vskSticks.txt
  • motion_MOTION_NAME
    • kinect_depth
      • camera_CAMERA_ID
        • images
        • masks
      • ...
    • kinect_rgb
      • camera_CAMERA_ID
        • images
        • masks
      • ...
    • motion_capture
      • markers.json
      • skeleton.bvh
      • timecodes.json
    • sony
      • camera_CAMERA_ID
        • masks
        • camera_CAMERA_ID_2K.mp4
      • ...
  • motion_MOTION_NAME
  • ...
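The layout above can be traversed with the Python standard library. The following is a minimal sketch rather than code from the "Source" folder: the helper names `list_motions` and `list_cameras` are hypothetical, and it assumes only the `motion_` and `camera_` folder prefixes shown in the listing.

```python
from pathlib import Path

# Sensor sub-folders found inside each motion folder (taken from the layout above).
SENSORS = ("kinect_depth", "kinect_rgb", "sony")


def list_motions(dog_dir):
    """Return the sorted names of the motion_* folders in one dog's folder."""
    return sorted(
        p.name
        for p in Path(dog_dir).iterdir()
        if p.is_dir() and p.name.startswith("motion_")
    )


def list_cameras(dog_dir, motion, sensor):
    """Return the sorted camera_* folder names for one sensor within a motion."""
    if sensor not in SENSORS:
        raise ValueError(f"unknown sensor folder: {sensor!r}")
    sensor_dir = Path(dog_dir) / motion / sensor
    if not sensor_dir.is_dir():
        # Not every sensor folder has to be present in a partial download.
        return []
    return sorted(
        p.name
        for p in sensor_dir.iterdir()
        if p.is_dir() and p.name.startswith("camera_")
    )
```

Calling `list_motions` on a dog's folder and then `list_cameras` for each motion and sensor gives every camera folder without hard-coding camera IDs, which differ between dogs.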

Access

This data is available for academic use. Please have a faculty member complete the form Data_Release_Form_RGBDDog_CVPR_2020.pdf, listed on this GitHub page, and email it to Prof. Darren Cosker. You will then receive details on how to access the data. Companies should also contact Prof. Darren Cosker.

Revisions

Although all efforts were made to ensure that the data provided is correct, the details of any required revisions will be given in this section.

Code

We provide code for visualising the data in both Python and Blender. This code is located in the "Source" folder. We also provide the shape model, structured similarly to the Skinned Multi-Person Linear model (SMPL), the Skinned Multi-Animal Linear model (SMAL), etc. We call this the DynaDog model, and it is located in "DynaDog_model".

Dependencies

All code has been tested using Python 3 on Windows 10. The Blender version used is 2.79.

Python libraries used:

  • numpy
  • scipy
  • pylab
  • cv2
  • matplotlib
  • pickle
  • objloader

Citation

If you find this dataset useful, we would kindly ask you to cite:

@InProceedings{Kearney_2020_CVPR,
author = {Kearney, Sinead and Li, Wenbin and Parsons, Martin and Kim, Kwang In and Cosker, Darren},
title = {RGBD-Dog: Predicting Canine Pose from RGBD Sensors},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2020}
}

Contact

This code is no longer actively maintained.
