Learning Delicate Local Representations for Multi-Person Pose Estimation

Introduction

This is a PyTorch implementation of the Residual Steps Network (RSN), which won the 2019 COCO Keypoint Challenge and ranks first on both the COCO test-dev and test-challenge datasets, as shown on the COCO leaderboard. The original repo is based on MegBrain, the internal deep learning framework of Megvii Inc.

In this paper, we propose a novel method called Residual Steps Network (RSN). RSN efficiently aggregates features with the same spatial size (intra-level features) to obtain delicate local representations, which retain rich low-level spatial information and result in precise keypoint localization. In addition, we propose an efficient attention mechanism, Pose Refine Machine (PRM), to further refine the keypoint locations. Our approach won 1st place in the COCO Keypoint Challenge 2019 and achieves state-of-the-art results on both the COCO and MPII benchmarks, without using extra training data or pretrained models. Our single model achieves 78.6 on COCO test-dev and 93.0 on the MPII test dataset. Ensembled models achieve 79.2 on COCO test-dev and 77.1 on the COCO test-challenge dataset. The source code is publicly available for further research.

Pipeline of Multi-stage Residual Steps Network

Overview of RSN.

Architecture of the proposed Pose Refine Machine

Some prediction results of our method on the COCO and MPII validation datasets

Prediction Results of COCO-valid.

Prediction Results of MPII-valid.

Results

Results on COCO val dataset

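| Model | Input Size | GFLOPs | AP | AP50 | AP75 | APM | APL | AR |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| RSN-18 | 256x192 | 2.5 | 73.6 | 90.5 | 80.9 | 67.8 | 79.1 | 78.8 |
| RSN-50 | 256x192 | 6.4 | 74.7 | 91.4 | 81.5 | 71.0 | 80.2 | 80.0 |
| RSN-101 | 256x192 | 11.5 | 75.8 | 92.4 | 83.0 | 72.1 | 81.2 | 81.1 |
| 2×RSN-50 | 256x192 | 13.9 | 77.2 | 92.3 | 84.0 | 73.8 | 82.5 | 82.2 |
| 3×RSN-50 | 256x192 | 20.7 | 78.2 | 92.3 | 85.1 | 74.7 | 83.7 | 83.1 |
| 4×RSN-50 | 256x192 | 29.3 | 79.0 | 92.5 | 85.7 | 75.2 | 84.5 | 83.7 |
| 4×RSN-50 | 384x288 | 65.9 | 79.6 | 92.5 | 85.8 | 75.5 | 85.2 | 84.2 |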
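
The AP columns are COCO keypoint metrics, which score a predicted pose against a ground-truth pose by Object Keypoint Similarity (OKS); AP50 and AP75 use OKS thresholds of 0.50 and 0.75, and AP averages over thresholds from 0.50 to 0.95. As a rough illustration of what the numbers measure, the sketch below computes OKS for a single pose pair in plain Python. It is not the evaluation code of this repo: the function name `oks` is ours, and the `sigmas` passed in the example are placeholder values rather than the official per-joint COCO constants.

```python
import math


def oks(pred, gt, visibility, area, sigmas):
    """Object Keypoint Similarity between one predicted and one ground-truth pose.

    pred, gt   : lists of (x, y) keypoint coordinates
    visibility : ground-truth visibility flags (0 means the joint is unlabeled)
    area       : ground-truth object area, used as the scale term
    sigmas     : per-keypoint falloff constants
    """
    total, labeled = 0.0, 0
    for (px, py), (gx, gy), v, sigma in zip(pred, gt, visibility, sigmas):
        if v <= 0:
            continue  # unlabeled joints do not contribute
        d2 = (px - gx) ** 2 + (py - gy) ** 2
        k = 2.0 * sigma
        total += math.exp(-d2 / (2.0 * area * k ** 2))
        labeled += 1
    return total / labeled if labeled else 0.0


# Two labeled joints and one unlabeled joint; sigmas are placeholders.
gt_pose = [(10.0, 10.0), (20.0, 15.0), (0.0, 0.0)]
pred_pose = [(11.0, 10.0), (20.0, 17.0), (5.0, 5.0)]
score = oks(pred_pose, gt_pose, visibility=[2, 1, 0], area=400.0, sigmas=[0.05, 0.05, 0.05])
print(f"OKS = {score:.3f}")
```

A prediction counts as a match at AP50 when its OKS with a ground-truth pose is at least 0.50, so larger objects (larger `area`) tolerate larger pixel errors.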

Results on COCO test-dev dataset

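| Model | Input Size | GFLOPs | AP | AP50 | AP75 | APM | APL | AR |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| RSN-18 | 256x192 | 2.5 | 71.6 | 92.6 | 80.3 | 68.8 | 75.8 | 77.7 |
| RSN-50 | 256x192 | 6.4 | 72.5 | 93.0 | 81.3 | 69.9 | 76.5 | 78.8 |
| 2×RSN-50 | 256x192 | 13.9 | 75.5 | 93.6 | 84.0 | 73.0 | 79.6 | 81.3 |
| 4×RSN-50 | 256x192 | 29.3 | 78.0 | 94.2 | 86.5 | 75.3 | 82.2 | 83.4 |
| 4×RSN-50 | 384x288 | 65.9 | 78.6 | 94.3 | 86.6 | 75.5 | 83.3 | 83.8 |
| 4×RSN-50+ | - | - | 79.2 | 94.4 | 87.1 | 76.1 | 83.8 | 84.1 |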

Results on COCO test-challenge dataset

| Model | Input Size | GFLOPs | AP | AP50 | AP75 | APM | APL | AR |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 4×RSN-50+ | - | - | 77.1 | 93.3 | 83.6 | 72.2 | 83.6 | 82.6 |

Results on MPII dataset

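| Model | Split | Input Size | Head | Shoulder | Elbow | Wrist | Hip | Knee | Ankle | Mean |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 4×RSN-50 | val | 256x256 | 96.7 | 96.7 | 92.3 | 88.2 | 90.3 | 89.0 | 85.3 | 91.6 |
| 4×RSN-50 | test | 256x256 | 98.5 | 97.3 | 93.9 | 89.9 | 92.0 | 90.6 | 86.8 | 93.0 |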

Note

  • + indicates an ensemble of models.
  • All models are trained on 8 V100 GPUs.

Repo Structure

This repo is organized as follows:

$RSN_HOME
|-- cvpack
|
|-- dataset
|   |-- COCO
|   |   |-- det_json
|   |   |-- gt_json
|   |   |-- images
|   |       |-- train2014
|   |       |-- val2014
|   |
|   |-- MPII
|       |-- det_json
|       |-- gt_json
|       |-- images
|   
|-- lib
|   |-- models
|   |-- utils
|
|-- exps
|   |-- exp1
|   |-- exp2
|   |-- ...
|
|-- model_logs
|
|-- README.md
|-- requirements.txt
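
The data and log directories in the tree above have to exist before the downloads described later can be placed in them. A minimal sketch of creating them follows, assuming they are not already present in a fresh clone (the README does not say which directories ship with the repo). The helper name `ensure_data_dirs` is ours, not part of the repo.

```python
import os

# Data and log directories from the tree above, relative to $RSN_HOME.
DATA_DIRS = [
    "dataset/COCO/det_json",
    "dataset/COCO/gt_json",
    "dataset/COCO/images/train2014",
    "dataset/COCO/images/val2014",
    "dataset/MPII/det_json",
    "dataset/MPII/gt_json",
    "dataset/MPII/images",
    "model_logs",
]


def ensure_data_dirs(rsn_home):
    """Create any missing data/log directories and return all their paths."""
    paths = []
    for rel in DATA_DIRS:
        path = os.path.join(rsn_home, *rel.split("/"))
        os.makedirs(path, exist_ok=True)  # no error if it already exists
        paths.append(path)
    return paths


if __name__ == "__main__":
    home = os.environ.get("RSN_HOME")
    if home:
        for p in ensure_data_dirs(home):
            print("ok:", p)
    else:
        print("RSN_HOME is not set; nothing created")
```

Because `os.makedirs` is called with `exist_ok=True`, running the helper again on an existing checkout leaves its contents untouched.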

Quick Start

Installation

  1. Install PyTorch by following the instructions on the PyTorch website.

  2. Clone this repo and set RSN_HOME in /etc/profile or ~/.bashrc, e.g.

export RSN_HOME='/path/of/your/cloned/repo'
export PYTHONPATH=$PYTHONPATH:$RSN_HOME

  3. Install the requirements:

pip3 install -r requirements.txt

  4. Install COCOAPI by following the instructions on the cocoapi website, or run:

git clone https://github.com/cocodataset/cocoapi.git $RSN_HOME/lib/COCOAPI
cd $RSN_HOME/lib/COCOAPI/PythonAPI
make install
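
A common source of import errors after these steps is a shell that has not picked up the new RSN_HOME and PYTHONPATH exports. The following sketch checks both from inside Python. The function name `check_rsn_env` is ours and the checks are only the ones implied by the steps above; it does not verify the PyTorch or COCOAPI installs.

```python
import os
import sys


def check_rsn_env(environ=None, path=None):
    """Return a list of problems; an empty list means the setup looks fine."""
    environ = os.environ if environ is None else environ
    path = sys.path if path is None else path
    home = environ.get("RSN_HOME")
    if not home:
        return ["RSN_HOME is not set"]
    problems = []
    if not os.path.isdir(home):
        problems.append(f"RSN_HOME does not point to a directory: {home}")
    # Entries from PYTHONPATH show up in sys.path when the interpreter starts.
    normalized = {os.path.normpath(p) for p in path if p}
    if os.path.normpath(home) not in normalized:
        problems.append("RSN_HOME is not on the Python import path")
    return problems


if __name__ == "__main__":
    issues = check_rsn_env()
    for issue in issues:
        print("WARNING:", issue)
    if not issues:
        print("RSN_HOME and PYTHONPATH look consistent")
```

If the script reports a problem right after editing ~/.bashrc, opening a new shell or running `source ~/.bashrc` is usually enough.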

Dataset

COCO

  1. Download the images from the COCO website, and put the train2014 and val2014 splits into $RSN_HOME/dataset/COCO/images/.

  2. Download the ground truth from Google Drive or Baidu Drive (code: fc51), and put it into $RSN_HOME/dataset/COCO/gt_json/.

  3. Download the detection results from Google Drive or Baidu Drive (code: fc51), and put them into $RSN_HOME/dataset/COCO/det_json/.

MPII

  1. Download the images from the MPII website, and put them into $RSN_HOME/dataset/MPII/images/.

  2. Download the ground truth from Google Drive or Baidu Drive (code: fc51), and put it into $RSN_HOME/dataset/MPII/gt_json/.

  3. Download the detection results from Google Drive or Baidu Drive (code: fc51), and put them into $RSN_HOME/dataset/MPII/det_json/.
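
Before starting a long training run it can help to confirm that every dataset directory actually received files. The sketch below counts files per directory. It assumes that the ground-truth and detection files end in `.json`, which the directory names suggest but the README does not state; the names `EXPECTED`, `count_files` and `dataset_report` are ours.

```python
import os

# (sub-directory under $RSN_HOME/dataset, required suffix or None for any file)
EXPECTED = [
    ("COCO/images/train2014", None),
    ("COCO/images/val2014", None),
    ("COCO/gt_json", ".json"),   # assumption: ground truth is stored as .json
    ("COCO/det_json", ".json"),  # assumption: detections are stored as .json
    ("MPII/images", None),
    ("MPII/gt_json", ".json"),
    ("MPII/det_json", ".json"),
]


def count_files(directory, suffix=None):
    """Count regular files in directory, optionally filtered by suffix."""
    if not os.path.isdir(directory):
        return 0
    return sum(
        1
        for name in os.listdir(directory)
        if os.path.isfile(os.path.join(directory, name))
        and (suffix is None or name.endswith(suffix))
    )


def dataset_report(dataset_root):
    """Map each expected sub-directory to the number of matching files found."""
    return {
        sub: count_files(os.path.join(dataset_root, *sub.split("/")), suffix)
        for sub, suffix in EXPECTED
    }


if __name__ == "__main__":
    root = os.path.join(os.environ.get("RSN_HOME", "."), "dataset")
    for sub, n in dataset_report(root).items():
        print(f"{'EMPTY' if n == 0 else 'ok':5s} {sub} ({n} files)")
```

A directory reported as EMPTY points to a download that was skipped or unpacked into the wrong location.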

Model

For your convenience, we provide trained models for COCO and MPII on Google Drive or Baidu Drive.

Log

Create a directory to save logs and models:

mkdir $RSN_HOME/model_logs

Train

Go to the directory of the experiment you want to run, e.g.

cd $RSN_HOME/exps/RSN50.coco

and run:

python config.py -log
python -m torch.distributed.launch --nproc_per_node=gpu_num train.py

Here gpu_num is the number of GPUs to use.

Test

python -m torch.distributed.launch --nproc_per_node=gpu_num test.py -i iter_num

Here gpu_num is the number of GPUs to use, and iter_num is the iteration number of the model you want to test.
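
When the train and test commands are driven from a script, it is convenient to build them as argument lists instead of editing gpu_num and iter_num by hand. The following is a minimal sketch that only assembles the two command lines shown above. The helper name `launch_command` is ours, and the iteration value 1000 in the example is a placeholder, not a checkpoint known to exist.

```python
import shlex
import sys


def launch_command(script, gpu_num, extra_args=(), python=sys.executable):
    """Build the torch.distributed.launch command line as an argument list."""
    if gpu_num < 1:
        raise ValueError("gpu_num must be at least 1")
    return [
        python,
        "-m",
        "torch.distributed.launch",
        f"--nproc_per_node={gpu_num}",
        script,
        *extra_args,
    ]


# Training on 8 GPUs, then testing a hypothetical iteration 1000.
train_cmd = launch_command("train.py", gpu_num=8)
test_cmd = launch_command("test.py", gpu_num=8, extra_args=["-i", "1000"])

print(shlex.join(train_cmd))
print(shlex.join(test_cmd))
```

Either list can be passed to `subprocess.run(cmd, check=True)` from inside the experiment directory, which avoids shell quoting issues.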

Citation

Please consider citing our projects in your publications if they help your research.

@misc{cai2020learning,
    title={Learning Delicate Local Representations for Multi-Person Pose Estimation},
    author={Yuanhao Cai and Zhicheng Wang and Zhengxiong Luo and Binyi Yin and Angang Du and Haoqian Wang and Xinyu Zhou and Erjin Zhou and Xiangyu Zhang and Jian Sun},
    year={2020},
    eprint={2003.04030},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

@article{li2019rethinking,
  title={Rethinking on Multi-Stage Networks for Human Pose Estimation},
  author={Li, Wenbo and Wang, Zhicheng and Yin, Binyi and Peng, Qixiang and Du, Yuming and Xiao, Tianzi and Yu, Gang and Lu, Hongtao and Wei, Yichen and Sun, Jian},
  journal={arXiv preprint arXiv:1901.00148},
  year={2019}
}

@inproceedings{chen2018cascaded,
  title={Cascaded pyramid network for multi-person pose estimation},
  author={Chen, Yilun and Wang, Zhicheng and Peng, Yuxiang and Zhang, Zhiqiang and Yu, Gang and Sun, Jian},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={7103--7112},
  year={2018}
}

The code of Cascaded Pyramid Network is also available.

Contact

You can contact us at the email addresses published in our paper.
