virtualpose's Introduction

VirtualPose: Learning Generalizable 3D Human Pose Models from Virtual Data (ECCV 2022)

[Figure: quality result]

Introduction

This is the official PyTorch implementation of VirtualPose: Learning Generalizable 3D Human Pose Models from Virtual Data.

[Figure: overall pipeline]

Installation

pip install -r requirement.txt
python setup.py develop

Data preparation

The directory tree should look like this:

${ROOT}
|-- data
    |-- MSCOCO
    |   |-- annotations
    |   |   |-- person_keypoints_train2017.json
    |   |-- images
    |   |   |-- train2017
    |-- MuCo-3DHP
    |   |-- images
    |   |   |-- augmented_set
    |   |   |-- unaugmented_set
    |   |-- MuCo-3DHP.json
    |-- MuPoTS-3D
    |   |-- cameras.pkl
    |   |-- images
    |   |   |-- TS1
    |   |   |-- ...
    |   |   |-- TS20
    |   |-- MuPoTS-3D.json
    |-- panoptic-toolbox
    |   |-- data
    |   |-- data_hmor
    |   |   |-- 160224_haggling1
    |   |   |-- 160224_mafia1
    |   |   |-- ...
    |   |   |-- train_cam.pkl
    |   |   |-- val_cam.pkl
    |   |-- clean_train.pkl 
    |   |-- clean_valid.pkl 
    |   |-- pack.py
|-- models
    |-- pose_resnet_152_384x288.pth.tar
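Before training, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the repository: the helper name `missing_paths` and the `EXPECTED_PATHS` list are hypothetical, and the list only covers entries named explicitly in the tree (elided entries such as `...` are not checked).

```python
from pathlib import Path

# Paths copied from the directory tree above, relative to ${ROOT}.
# Entries elided with "..." in the tree are intentionally not listed.
EXPECTED_PATHS = [
    "data/MSCOCO/annotations/person_keypoints_train2017.json",
    "data/MSCOCO/images/train2017",
    "data/MuCo-3DHP/images/augmented_set",
    "data/MuCo-3DHP/images/unaugmented_set",
    "data/MuCo-3DHP/MuCo-3DHP.json",
    "data/MuPoTS-3D/cameras.pkl",
    "data/MuPoTS-3D/images",
    "data/MuPoTS-3D/MuPoTS-3D.json",
    "data/panoptic-toolbox/data",
    "data/panoptic-toolbox/data_hmor/train_cam.pkl",
    "data/panoptic-toolbox/data_hmor/val_cam.pkl",
    "data/panoptic-toolbox/clean_train.pkl",
    "data/panoptic-toolbox/clean_valid.pkl",
    "models/pose_resnet_152_384x288.pth.tar",
]


def missing_paths(root):
    """Return the expected paths that do not exist under `root` (hypothetical helper)."""
    root = Path(root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]


# Report anything missing relative to the current directory.
for path in missing_paths("."):
    print("missing:", path)
```

Run it from ${ROOT}; an empty output means every listed file and directory was found.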

Training

We train on 4 NVIDIA V100 GPUs with 32 GB of memory each.

CMU Panoptic dataset

Train the 2D pose estimation and human detection backbone with 2 GPUs:

python run/train_3d.py --cfg configs/coco/backbone_res152_mix_panoptic.yaml --gpus 2

Train the root depth estimator and 3D pose estimator with 4 GPUs:

python run/train_3d.py --cfg configs/panoptic/synthesize_full.yaml --gpus 4

MuCo-3DHP and MuPoTS-3D datasets

Train the 2D pose estimation and human detection backbone with 2 GPUs:

python run/train_3d.py --cfg configs/coco/backbone_res152_mix_muco.yaml --gpus 2

Train the root depth estimator and 3D pose estimator with 4 GPUs:

python run/train_3d.py --cfg configs/muco/synthesize_full.yaml --gpus 4
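The two training stages for each dataset can be chained in a small driver script. The sketch below is an illustration rather than part of the repository: the names `STAGES`, `build_commands`, and `run_training` are hypothetical, and it assumes the stages are meant to run in the order listed above. The config paths and GPU counts are copied from the commands in this section.

```python
import subprocess

# (config file, number of GPUs) per stage, copied from the commands above.
STAGES = {
    "panoptic": [
        ("configs/coco/backbone_res152_mix_panoptic.yaml", 2),
        ("configs/panoptic/synthesize_full.yaml", 4),
    ],
    "muco": [
        ("configs/coco/backbone_res152_mix_muco.yaml", 2),
        ("configs/muco/synthesize_full.yaml", 4),
    ],
}


def build_commands(dataset):
    """Return the training commands for `dataset` as argument lists."""
    return [
        ["python", "run/train_3d.py", "--cfg", cfg, "--gpus", str(gpus)]
        for cfg, gpus in STAGES[dataset]
    ]


def run_training(dataset, dry_run=True):
    """Print each stage's command; execute it only when dry_run is False."""
    for cmd in build_commands(dataset):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the pipeline if a stage exits with an error.
            subprocess.run(cmd, check=True)


# Dry run: only prints the two commands for the CMU Panoptic setup.
run_training("panoptic")
```

Calling `run_training("muco", dry_run=False)` from ${ROOT} would launch both MuCo-3DHP stages in sequence.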

Evaluation

Our pre-trained models are available for download from Google Drive or OneDrive.

CMU Panoptic dataset

Inference with 4 GPUs:

python run/validate_3d.py --cfg configs/panoptic/synthesize_full_inference.yaml --gpus 4

MuCo-3DHP and MuPoTS-3D datasets

Inference with 4 GPUs:

python run/validate_3d.py --cfg configs/muco/synthesize_full_inference.yaml --gpus 4

The results are written to ${ROOT}/mupots_results/. Use the evaluation code provided with the MuPoTS-3D dataset to evaluate them.

Citing

If our code helps your research, please consider citing the following paper:

@inproceedings{su2022virtualpose,
    title={VirtualPose: Learning Generalizable 3D Human Pose Models from Virtual Data},
    author={Su, Jiajun and Wang, Chunyu and Ma, Xiaoxuan and Zeng, Wenjun and Wang, Yizhou},
    booktitle={European Conference on Computer Vision},
    pages={55--71},
    year={2022},
    organization={Springer}
}

Acknowledgement

This repo is built on https://github.com/microsoft/voxelpose-pytorch.

virtualpose's People

Contributors

shirleymaxx, wkom

virtualpose's Issues

MuPoTS-3D cameras.pkl

Hi there,
Thanks for your elaborate work :)

How did you unnormalize the root depths when testing on the MuPoTS-3D dataset?
Specifically, how did you obtain its focal length data?

About Ground Center in MuPoTS-3D evaluation

Hi, thanks for your great work.

I have a question about a part of your code (Link) that I find a bit confusing.
Could you explain the meaning of the "ground center" argument used there for MuPoTS-3D evaluation?

I'm not sure I understand it correctly. To me, its purpose seems to be aligning the person centers between the prediction and the GT, which should not be done for a fair evaluation of PCK_abs.

I would be grateful if you could correct me if I have misunderstood.
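For readers following this question, the distinction it draws can be sketched as follows. This is an illustrative sketch and not the repository's evaluation code: the function name `pck`, the 150 mm threshold, and the toy poses are assumptions. It shows why aligning prediction and ground truth before scoring hides absolute position errors.

```python
import math


def pck(pred, gt, threshold_mm=150.0, root_index=None):
    """Fraction of joints whose 3D error is below threshold_mm.

    pred, gt: lists of (x, y, z) joints in millimetres, camera coordinates.
    root_index: if None, joints are compared in absolute coordinates;
    otherwise both poses are first translated so that joint sits at the origin.
    """
    if root_index is not None:
        pred = [tuple(c - r for c, r in zip(j, pred[root_index])) for j in pred]
        gt = [tuple(c - r for c, r in zip(j, gt[root_index])) for j in gt]
    hits = sum(math.dist(p, g) < threshold_mm for p, g in zip(pred, gt))
    return hits / len(gt)


# Toy example (assumed values): a perfect pose shape, but 400 mm too deep.
gt = [(0.0, 0.0, 3000.0), (0.0, -500.0, 3000.0), (200.0, -500.0, 3000.0)]
pred = [(x, y, z + 400.0) for x, y, z in gt]

print(pck(pred, gt))                # absolute: the depth error counts
print(pck(pred, gt, root_index=0))  # root-aligned: the depth error vanishes
```

In this toy case the absolute score is 0.0 and the root-aligned score is 1.0, which is the concern raised above: any alignment applied before an absolute metric would remove the very error that metric is meant to measure.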

About Custom Images testing

Thank you very much for your excellent work. I have run into an issue while testing several custom images and would appreciate your help. I used the pre-trained model, the config (configs/images/images_inference.yaml), and the run/inference_images.py script that you provided, but the single-person pose network call, single_pose = self.PEN(heatmaps, meta, grid_centers[:, n]), does not seem to execute. Could the threshold processing you set up be the cause?

confirm

Hello, I would like to ask whether the dataset directory structure given in the project documentation differs slightly from the configuration given below. Can we confirm and standardize it?

About a novel training strategy

Thanks for your great work! You mention that “we can even generate virtual training data specifically for the environment”. Can you tell me where the function that generates the virtual training data is? Thanks!

Generalization

Does this method also generalize well to multi-view prediction, i.e., generating heatmaps from several views and training in this manner to directly predict the 3D poses, as in VoxelPose? (The data would be synthetic only: the camera poses are known in advance, but no actual pairs are available.)

MuPoTS Google Drive link and model weights

Hi, thanks for your great work!

I have a question about the Google Drive link for MuPoTS, which returns a 404 Not Found error. Also, could you provide the model weights for evaluation?
