Betapose: Estimating 6D Pose From Localizing Designated Surface Keypoints

Please refer to our paper for a detailed explanation. The arXiv link is here. In the following, ROOT refers to the folder containing this README file.

0. News:

Before running this repository, please take a look at this one: https://github.com/sjtuytc/segmentation-driven-pose. It is a better choice because it offers:

  1. RGB-only input.
  2. SOTA precision and speed.
  3. Multi-object support.
  4. End-to-end training instead of two-step training.
  5. Code that is easy to understand and fully open source.

Ⅰ. Installation

  1. All code was tested with Python, CUDA 8.0, and cuDNN 5.1.
  2. Install PyTorch 0.4.0 and the other dependencies.
  3. Download the LineMod dataset here. Only the folders called models and test are needed. Put them in DATAROOT/models and DATAROOT/test, where DATAROOT can be any folder in which you'd like to place the LineMod dataset.
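
Before moving on, it can help to confirm that the dataset folder has the layout described above. The following is a minimal sketch, not part of this repository; the function name missing_linemod_folders is hypothetical, and it only checks that the two folders exist, not their contents.

```python
from pathlib import Path

# The two LineMod subfolders this README says are needed under DATAROOT.
REQUIRED_SUBFOLDERS = ("models", "test")


def missing_linemod_folders(dataroot):
    """Return the required subfolder names that are absent under dataroot.

    Hypothetical helper for illustration only: it checks that
    DATAROOT/models and DATAROOT/test exist as directories.
    """
    root = Path(dataroot)
    return [name for name in REQUIRED_SUBFOLDERS if not (root / name).is_dir()]
```

Calling missing_linemod_folders("DATAROOT") with your actual dataset path returns an empty list when both folders are in place.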

Ⅱ. Designate Keypoints

You can skip this step, since we have provided the designated keypoint files in '$ROOT/1_keypoint_designator/assets/sifts/'.

  1. The related code is in $ROOT/1_keypoint_designator/.
    $ cd ROOT/1_keypoint_designator/
  2. Place the input model file (e.g. DATAROOT/models/obj_01.ply) in ./assets/models/.
  3. Build and run the code:
    $ sh build_and_run.sh
    The output file is written to ./assets/sifts/. It is a PLY file storing the 3D coordinates of the designated keypoints.
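
To inspect the designated keypoints, the PLY output can be read back into a list of 3D points. The sketch below is an illustration under stated assumptions, not code from this repository: it assumes the file is an ASCII PLY whose first element is vertex and whose first three vertex properties are x, y, z. A binary PLY would need a dedicated PLY library instead. The function name read_ply_vertices is hypothetical.

```python
def read_ply_vertices(text):
    """Parse (x, y, z) tuples from the text of an ASCII PLY file.

    Assumptions (not confirmed by the README): ASCII format, the vertex
    element comes first, and x, y, z are its first three properties.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "ply":
        raise ValueError("missing 'ply' magic line")

    vertex_count = None
    body_start = None
    for index, line in enumerate(lines):
        tokens = line.split()
        if tokens[:1] == ["format"] and tokens[1:2] != ["ascii"]:
            raise ValueError("only ASCII PLY is handled in this sketch")
        if tokens[:2] == ["element", "vertex"]:
            vertex_count = int(tokens[2])
        if tokens == ["end_header"]:
            body_start = index + 1
            break
    if vertex_count is None or body_start is None:
        raise ValueError("incomplete PLY header")

    points = []
    for line in lines[body_start:body_start + vertex_count]:
        # Keep only the first three columns; extra properties are ignored.
        x, y, z = (float(value) for value in line.split()[:3])
        points.append((x, y, z))
    return points
```

For a file on disk, pass the result of open(path).read() for whichever keypoint file was written to ./assets/sifts/.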

Ⅲ. Annotate Keypoints

  1. The related code is in $ROOT/2_keypoint_annotator/.
    $ cd ROOT/2_keypoint_annotator/
  2. Run keypoint annotator on one object of LineMod.
    $ python annotate_keypoint.py --obj_id 1 --total_kp_number 50 --output_base ROOT/3_6Dpose_estimator/data --sixd_base DATAROOT
    Run the following to see what each option means.
    $ python annotate_keypoint.py -h
  3. The annotated keypoints are stored in the files annot_train.h5 and annot_eval.h5. The corresponding images are in the folders train and eval.

Ⅳ. Training

Train Object Detector YOLOv3

  1. The related files are located in $ROOT/3_6Dpose_estimator/train_YOLO.
    $ cd ROOT/3_6Dpose_estimator/train_YOLO
  2. Build Darknet (YOLOv3).
    $ make
  3. Prepare the data following the instructions in AlexeyAB/darknet. Refer to the ./scripts folder for more help.
  4. Download the pretrained darknet53 weights here.
  5. Run train_single.sh or train_all.sh to train the network.
  6. Put the trained weights (e.g. 01.weights) in the folder $ROOT/3_6Dpose_estimator/models/yolo/.

Train Keypoint Detector (KPD)

  1. The related code is in $ROOT/3_6Dpose_estimator/train_KPD/.
    $ cd ROOT/3_6Dpose_estimator/train_KPD
  2. Modify lines 19, 21, 39, and 46 of the file ./src/utils/dataset/coco.py to point to the previously annotated dataset. Examples are given in those lines.
  3. Train on the LineMod dataset without DPG.
    $ python src/train.py --trainBatch 28 --expID seq5_Nov_1_1 --optMethod adam
  4. Train on the LineMod dataset with DPG. Add the --addDPG option and load the model trained in the previous step (step 3).
    $ python src/train.py --trainBatch 28 --expID seq5_dpg_Nov_1_1 --optMethod adam --loadModel ./exp/coco/seq5_Nov_1_1/model_100.pkl --addDPG
  5. (Optional) Visualize the training process:
    $ tensorboard --logdir ./
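
The two training stages above are linked by the checkpoint path: the DPG run loads a model saved under the experiment ID of the first run. The sketch below makes that dependency explicit by building both command lines from the flags shown in steps 3 and 4. It is an illustration only; the function name kpd_train_commands is hypothetical, and the ./exp/coco/<expID>/model_100.pkl pattern is inferred from the single example path in step 4, so check it against your own checkpoints.

```python
def kpd_train_commands(exp_id, dpg_exp_id, batch=28, checkpoint="model_100.pkl"):
    """Build the argument lists for the two KPD training stages.

    Uses only the flags shown in this README. The checkpoint location
    ./exp/coco/<exp_id>/<checkpoint> is an assumption inferred from the
    example command, not a documented guarantee.
    """
    base = ["python", "src/train.py", "--trainBatch", str(batch), "--optMethod", "adam"]
    # Stage 1: train without DPG.
    stage_one = base + ["--expID", exp_id]
    # Stage 2: resume from the stage 1 checkpoint and enable DPG.
    stage_two = base + [
        "--expID", dpg_exp_id,
        "--loadModel", f"./exp/coco/{exp_id}/{checkpoint}",
        "--addDPG",
    ]
    return stage_one, stage_two
```

Each returned list can be passed to subprocess.run from the train_KPD folder, for example subprocess.run(stage_one, check=True), with the second stage started only after the first has finished.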

Ⅴ. Evaluate

  1. Move back to the root of the pose estimator.

    $ cd ROOT/3_6Dpose_estimator/
  2. Run the following command.

    $ CUDA_VISIBLE_DEVICES=1 python3 betapose_evaluate.py --nClasses 50 --indir /01/eval --outdir examples/seq1 --sp --profile

    The output JSON file containing the predicted 6D poses will be in examples/seq1.
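
To take a first look at the results, the JSON output can be loaded with the standard library. This README does not specify the name or the schema of the output file, so the sketch below makes no assumption about either: it loads every .json file found in the output folder and leaves interpretation to the reader. The function name load_pose_outputs is hypothetical.

```python
import json
from pathlib import Path


def load_pose_outputs(outdir):
    """Load every *.json file in outdir, keyed by file name.

    Illustration only: the file name and structure of the evaluation
    output are not documented here, so nothing is assumed about them.
    """
    results = {}
    for path in sorted(Path(outdir).glob("*.json")):
        with path.open() as handle:
            results[path.name] = json.load(handle)
    return results
```

For the command above, load_pose_outputs("examples/seq1") returns a dictionary whose values can be inspected with type() or len() to learn how the predicted poses are organized.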
