Our model consists of two stages: a Partial Body Detector and a Pose Estimator. To use the model, please follow the instructions below.
Name | Backbone | [email protected]:0.95 | [email protected] | Download |
---|---|---|---|---|
Partial Body Detector | ResNet101 | 76.67 | 98.63 | model |
Name | Backbone | [email protected]:0.95 | [email protected] | Download |
---|---|---|---|---|
Pose Estimator | ViT-Base | 65.21 | 88.24 | model |
Because pose prediction for the patient takes proposals (bounding boxes) as input, we need to run our trained Partial Body Detector first to obtain these bounding boxes. Follow the steps below to run the detector.
- Linux or macOS with Python ≥ 3.6
- PyTorch ≥ 1.8 and torchvision that matches the PyTorch installation. Install them together at pytorch.org to make sure of this
- OpenCV is optional but needed by demo and visualization
Step 1: Install PyTorch: follow the instructions at https://pytorch.org/ to install the latest version of PyTorch.
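After installation, a quick sanity check (a minimal sketch; run it in the Python environment you just set up) confirms the PyTorch and torchvision versions and whether CUDA is visible:

```python
# Quick sanity check for the PyTorch installation: print the installed
# versions and whether a CUDA device is visible.
import torch
import torchvision

print("PyTorch:", torch.__version__)
print("torchvision:", torchvision.__version__)
print("CUDA available:", torch.cuda.is_available())
```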
Step 2: Clone the code following the repository structure, and run:
cd ./Medical-Partial-Body-Pose-Estimation/
python -m pip install -e detectron2
In order to make the model compatible with your system, you may need to adjust the versions of some packages:
pip install pillow==9.5
pip install opencv-python
pip install xtcocotools
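If you want to confirm that the adjusted packages (and detectron2 from Step 2) resolved correctly before moving on, a small check like the following sketch should import cleanly:

```python
# Verify that the adjusted packages and detectron2 import cleanly.
import PIL
import cv2
import xtcocotools
import detectron2

print("Pillow:", PIL.__version__)          # expected 9.5.x per the pin above
print("OpenCV:", cv2.__version__)
print("detectron2:", detectron2.__version__)
```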
Download the Partial Body Detector weights from the model zoo table above and save them to the ./detectron2/weights folder.
The model takes images as input; if you have a video, you should first split it into images and save them somewhere (please see the third part of this repo for converting between images and video).
Feel free to use our prepared data for testing. You can download it at test_data.
Then get the detection results by running:
python ./detectron2/demo/bbox_detection_medic.py --config-file configs/medic_pose/medic_pose.yaml --input your_path/*.jpg
You will get a dict containing the frame-level predictions, with the following structure:
├── demo
│   ├── bbox_detection_results
│   │   └── vis
│   │       ├── frame1_result.jpg
│   │       ├── frame2_result.jpg
│   │       └── ......
│   └── bbox_detections.json
You will also get the frame-level predictions in the vis folder (see the example in the following Figure) and a JSON file named bbox_detections.json for the subsequent pose estimation.
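The exact schema of bbox_detections.json depends on the detector demo script, so if you want to check what it contains before the second stage, a minimal, schema-agnostic inspection looks like this (the path is a placeholder; point it at the file produced above):

```python
# Peek at bbox_detections.json before running the pose estimator.
import json

with open("demo/bbox_detections.json", "r") as f:  # placeholder path
    detections = json.load(f)

print(type(detections))
# Print one entry to see which fields (e.g. image id, bbox, score) are stored.
sample = detections[0] if isinstance(detections, list) else next(iter(detections.items()))
print(sample)
```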
In order to run the second stage, you should first install some packages.
We use mmcv 1.3.9 for the experiments.
git clone https://github.com/open-mmlab/mmcv.git
cd mmcv
git checkout v1.3.9
pip install -e . -v
cd ..
git clone https://github.com/Dantong88/Medical-Partial-Body-Pose-Estimation
cd ViTPose
pip install -v -e .
After installing the two repos, install timm and einops, i.e.,
pip install timm==0.4.9 einops
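To double-check that the second-stage dependencies resolve to the expected versions, a minimal sketch:

```python
# Confirm the second-stage dependencies are importable and report their versions.
import mmcv
import timm
import einops

print("mmcv:", mmcv.__version__)     # the experiments use 1.3.9
print("timm:", timm.__version__)     # pinned to 0.4.9 above
print("einops:", einops.__version__)
```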
Download the Pose Estimator weights from the model zoo table above and save them to the ./ViTPose/weights folder.
The model takes the images and the partial body detection results as input; feel free to use our test data (the link above) and the pre-generated detection results bbox_detections.json to test.
Then get the pose estimation results by running:
python ./ViTPose/demo/top_down_img_demo.py --json-file your_path_of_detection_results --pose_config ViTPose/configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/coco/ViTPose_base_medic_casualty_256x192.py --img-root the_path_of_input_images --out-img-root the_path_to_save_the_results
You will get the pose estimation results in the following structure:
├── the_path_to_save_the_results
│   ├── frame1_result.jpg
│   ├── frame2_result.jpg
│   └── ......
We give some examples of how the results will look:
Our Partial Body Detector and Pose Estimator both take images as input and output image-level predictions; do not forget to change the paths to where you put/save your video.
If you need to prepare the input from a video, please run:
python ViTPose/demo/process_our_video.py
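If you prefer a standalone alternative to the script above, frame extraction can be sketched with OpenCV as below (the paths and the frame-name pattern are placeholders, not the ones used by the script):

```python
# Standalone alternative: split a video into frames with OpenCV so they can
# be fed to the detector.
import os
import cv2

video_path = "your_path/input_video.mp4"   # placeholder: your source video
out_dir = "your_path/frames"               # placeholder: where frames are written
os.makedirs(out_dir, exist_ok=True)

cap = cv2.VideoCapture(video_path)
idx = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    cv2.imwrite(os.path.join(out_dir, f"frame{idx:06d}.jpg"), frame)
    idx += 1
cap.release()
print(f"Wrote {idx} frames to {out_dir}")
```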
If you need to generate videos from the image-level pose predictions, please run the following (change line 23 and line 29 of the script for the image input path and the video save path):
python ViTPose/demo/img2video.py
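Equivalently, if you would rather not edit the script, the image-to-video step can be sketched with OpenCV as follows (the paths and frame rate are placeholders/assumptions):

```python
# Standalone alternative: assemble the per-frame pose visualizations into a video.
import glob
import cv2

frame_dir = "the_path_to_save_the_results"   # placeholder: pose estimator output folder
out_path = "your_path/pose_demo.mp4"         # placeholder: output video path
fps = 30                                     # assumption: match your source video

frames = sorted(glob.glob(f"{frame_dir}/*.jpg"))
assert frames, "no result images found"

height, width = cv2.imread(frames[0]).shape[:2]
writer = cv2.VideoWriter(out_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, (width, height))
for f in frames:
    writer.write(cv2.imread(f))
writer.release()
print(f"Saved {len(frames)} frames to {out_path}")
```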
Following the steps above, you should get a demo like the demo example.
We acknowledge the excellent implementations of ViTPose and detectron2.