
dekr's People

Contributors

gengzigang, jenhaoyang, wolfeeglick


dekr's Issues

ONNX file generation fails

I am using the script tools/valid.py to export an ONNX file. Here is the error stack.

Total Parameters: 29,561,182
----------------------------------------------------------------------------------------------------------------------------------
Total Multiply Adds (For Convolution and Linear Layers only): 45,385,056,256
----------------------------------------------------------------------------------------------------------------------------------
Number of Layers
Conv2d : 395 layers   BatchNorm2d : 343 layers   ReLU : 306 layers   Bottleneck : 4 layers   BasicBlock : 105 layers   Upsample : 31 layers   HighResolutionModule : 8 layers   DeformConv2d : 34 layers   AdaptBlock : 34 layers   
=> loading model from pose_dekr_hrnetw32_coco.pth
/home/vinay/.local/lib/python3.6/site-packages/torch/onnx/symbolic_helper.py:347: UserWarning: You are trying to export the model with onnx:Upsample for ONNX opset version 9. This operator might cause results to not match the expected results by PyTorch.
ONNX's Upsample/Resize operator did not match Pytorch's Interpolation until opset 11. Attributes to determine how to transform the input were added in onnx:Resize in opset 11 to support Pytorch's behavior (like coordinate_transformation_mode and nearest_mode).
We recommend using opset 11 and above for models using this operator. 
  "" + str(_export_onnx_opset_version) + ". "
Traceback (most recent call last):
  File "tools/valid_hpe.py", line 122, in <module>
    main()
  File "tools/valid_hpe.py", line 119, in main
    torch.onnx.export(model, dump_input, "/home/vinay/Downloads/dekr.onnx", verbose=True)
  File "/home/vinay/.local/lib/python3.6/site-packages/torch/onnx/__init__.py", line 276, in export
    custom_opsets, enable_onnx_checker, use_external_data_format)
  File "/home/vinay/.local/lib/python3.6/site-packages/torch/onnx/utils.py", line 94, in export
    use_external_data_format=use_external_data_format)
  File "/home/vinay/.local/lib/python3.6/site-packages/torch/onnx/utils.py", line 698, in _export
    dynamic_axes=dynamic_axes)
  File "/home/vinay/.local/lib/python3.6/site-packages/torch/onnx/utils.py", line 465, in _model_to_graph
    module=module)
  File "/home/vinay/.local/lib/python3.6/site-packages/torch/onnx/utils.py", line 206, in _optimize_graph
    graph = torch._C._jit_pass_onnx(graph, operator_export_type)
  File "/home/vinay/.local/lib/python3.6/site-packages/torch/onnx/__init__.py", line 309, in _run_symbolic_function
    return utils._run_symbolic_function(*args, **kwargs)
  File "/home/vinay/.local/lib/python3.6/site-packages/torch/onnx/utils.py", line 994, in _run_symbolic_function
    return symbolic_fn(g, *inputs, **attrs)
  File "/home/vinay/.local/lib/python3.6/site-packages/torch/onnx/symbolic_opset9.py", line 1777, in slice
    raise RuntimeError("step!=1 is currently not supported")
RuntimeError: step!=1 is currently not supported

I added the following line to valid.py to export the model.

torch.onnx.export(model, dump_input, "/home/dekr.onnx", verbose=True)

The same input with the above line works fine in other repositories such as HRNet Image Classification and Human Pose Estimation.
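
One hedged workaround: ONNX opset 9 has no step support in Slice (steps were added to Slice in opset 10), and the warning above already recommends opset 11 or higher for Upsample/Resize, so exporting with a newer opset may get past both problems:

    # opset_version is a standard torch.onnx.export argument
    torch.onnx.export(model, dump_input, "/home/dekr.onnx",
                      verbose=True, opset_version=11)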

Questions about the salient regions in figure 1

Hi, thanks for sharing your code. Could you tell me how to generate the salient regions in figure 1 of the paper? I have noticed that the toolbox you cite in the paper includes many methods, such as Grad-CAM and Score-CAM. However, these methods focus on the classification task. Is there any modification needed to make them compatible with your code?

Train custom dataset

Hello, thanks for your great work and for open-sourcing it so selflessly.

I replaced the COCO dataset with my own dataset, which has no segmentation annotations, and got the following error.

  File "/fastdata/computervision/liuxingyu/shared/projects/pose_estimation/DEKR/tools/../lib/dataset/COCOKeypoints.py", line 47, in __getitem__
    mask = self.get_mask(anno, image_info)
  File "/fastdata/computervision/liuxingyu/shared/projects/pose_estimation/DEKR/tools/../lib/dataset/COCOKeypoints.py", line 110, in get_mask
    obj['segmentation'], img_info['height'], img_info['width'])
KeyError: 'segmentation'

Could I replace the segmentation with the bbox annotation for training? Will it hurt the model badly?
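
For what it's worth, a minimal sketch of a bbox fallback for get_mask, assuming COCO-style obj['bbox'] = [x, y, w, h] in pixels (whether the coarser mask hurts training is exactly the open question here):

    import numpy as np

    def bbox_to_mask(bbox, height, width):
        """Rasterize a COCO [x, y, w, h] bbox into a binary mask."""
        mask = np.zeros((height, width), dtype=np.float32)
        x, y, w, h = (int(round(v)) for v in bbox)
        mask[y:y + h, x:x + w] = 1.0
        return mask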

How to run DEKR in real time

By default, DEKR only plays a video clip for pose estimation. I want to modify the code to detect poses in real time, but the program stops responding.
Any way to fix it?
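
A minimal webcam-loop sketch; run_pose below is a hypothetical wrapper around the repo's per-frame inference, not an existing function:

    import cv2

    cap = cv2.VideoCapture(0)              # 0 = default camera
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        vis = run_pose(frame)              # hypothetical per-frame inference
        cv2.imshow('DEKR', vis)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    cap.release()
    cv2.destroyAllWindows()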

Unable to train on my own data

When training on my own data, I modified joint_nums but training still fails, and I'm curious why.
The error says: "the number of joints should be 22" (my data has 21 keypoints).
While debugging, joints_list in COCOKeypoints.py is sometimes 18 and sometimes 22.

set_epoch for DistributedSampler

Describe the bug
The PyTorch example suggests calling the set_epoch function of the DistributedSampler class before each epoch starts. I could not find it anywhere in your code.

https://github.com/pytorch/examples/blob/master/imagenet/main.py
Line 232-234

As can be seen from the DistributedSampler class code (https://github.com/pytorch/pytorch/blob/master/torch/utils/data/distributed.py), the set_epoch function is required to set the seed for each __iter__ call.

Can you confirm if this function has been called on the DistributedSampler (for the training dataset) at some point in your code?

Copyright Claim: I ask the same question as @ananyahjha93 did. Hence I copied and slightly modified his post here: Lightning-AI/pytorch-lightning#224 (comment)
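
For reference, a minimal sketch of the pattern used in the linked PyTorch example (names here are illustrative, not taken from this repo):

    # DistributedSampler shuffles deterministically from a seed derived from
    # the epoch, so set_epoch must be called before iterating each epoch;
    # otherwise every epoch replays the same shuffling order.
    for epoch in range(start_epoch, num_epochs):
        train_sampler.set_epoch(epoch)
        for images, targets in train_loader:
            ...  # usual forward/backward step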

How can I visualize heatmap_avg?

I can't understand the shape of heatmap_avg.
For example, when I feed a 320x240x3 image into hrnetw32_coco.pth, the size of heatmap_avg is 6488064.
How can I visualize heatmap_avg?
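
A minimal sketch for inspecting one channel, assuming heatmap_avg can be reshaped to (C, H, W); C, H, and W must be taken from the model config, so treat them as placeholders:

    import matplotlib.pyplot as plt

    # the flat length of heatmap_avg must equal C * H * W for this reshape
    hm = heatmap_avg.detach().cpu().numpy().reshape(C, H, W)
    plt.imshow(hm[0], cmap='jet')  # channel 0, e.g. one keypoint/center map
    plt.colorbar()
    plt.show()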

RuntimeError: CUDA out of memory when training on the COCO train2017 dataset

When I run python tools/train.py --cfg experiments/coco/w32/w32_4x_reg03_bs10_512_adam_lr1e-3_coco_x140.yaml, I get:
RuntimeError: CUDA out of memory. Tried to allocate 86.00 MiB (GPU 0; 10.91 GiB total capacity; 9.39 GiB already allocated; 125.50 MiB free; 9.48 GiB reserved in total by PyTorch)
I didn't find a batch_size setting in w32_4x_reg03_bs10_512_adam_lr1e-3_coco_x140.yaml. Is there anything else that needs to be modified?

About training loss

The training loss is very small and decreases slowly when I fine-tune the model on the COCO dataset. Is this normal?
And how many epochs did you set?

About AdaptBlock

I notice that the implementation of AdaptBlock at line 126 shows:

        offset = torch.matmul(transform_matrix, self.regular_matrix)
        offset = offset - self.regular_matrix
        offset = offset.transpose(1,2).reshape((N,H,W,18)).permute(0,3,1,2)

Why does the offset need to subtract self.regular_matrix?

About offset loss: difficult to converge?

Hi, I am reproducing your work. I found that the offset loss is very difficult to converge; the offset value is about 1000~2000, which seems very abnormal. I don't know why this happens. Could you help me?

Thank you very much.

JOINT_COCO_LINK_1 and JOINT_COCO_LINK_2

What do these mean? In lib/utils/rescore.py:
JOINT_COCO_LINK_1 = [0, 0, 1, 1, 2, 3, 4, 5, 5, 5, 6, 6, 7, 8, 11, 11, 12, 13, 14]
JOINT_COCO_LINK_2 = [1, 2, 2, 3, 4, 5, 6, 6, 7, 11, 8, 12, 9, 10, 12, 13, 14, 15, 16]
How did you get these parameters?
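
For context, pairing the two lists element-wise yields index pairs that correspond to the standard COCO 17-keypoint skeleton edges (this is my reading, not confirmed by the repo):

    # each (a, b) pair links two COCO keypoint indices, e.g. (5, 6) is
    # left_shoulder-right_shoulder and (11, 13) is left_hip-left_knee
    edges = list(zip(JOINT_COCO_LINK_1, JOINT_COCO_LINK_2))
    print(edges[:3])  # [(0, 1), (0, 2), (1, 2)] -- nose-eye and eye-eye links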

Visualization

Can you upload the code for visualizing the attention map? I don't know how to use the reference tool you mentioned.

Queries regarding inference

Thank you for the amazing work! I wanted to know if there is any instance tracking implemented in the codebase, such as the SORT tracker used with HRNet. Thanks!
@Gengzigang

testing on a custom dataset

Hi!
First, thanks for your work, which shows magnificent results.

I would like to run your model on a custom dataset to see whether it performs well on my challenging pictures. Could you tell me what the different steps are?

Thanks a lot!

KeyError: "There is no item named 'val2017/000000397133.jpg' in the archive"

I have prepared the coco2017 dataset in .zip format.
But when I run the test command
python tools/valid.py --cfg experiments/coco/w32/w32_4x_reg03_bs10_512_adam_lr1e-3_coco_x140.yaml TEST.MODEL_FILE model/pose_coco/pose_dekr_hrnetw32_coco.pth ,
I get the error:
Traceback (most recent call last):
  File "tools/valid.py", line 212, in <module>
    main()
  File "tools/valid.py", line 134, in main
    for i, images in enumerate(data_loader):
  File "/root/anaconda3/envs/dekr/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 517, in __next__
    data = self._next_data()
  File "/root/anaconda3/envs/dekr/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 557, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/root/anaconda3/envs/dekr/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/root/anaconda3/envs/dekr/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/root/lzj/DEKR/tools/../lib/dataset/COCODataset.py", line 104, in __getitem__
    cv2.IMREAD_COLOR | cv2.IMREAD_IGNORE_ORIENTATION
  File "/root/lzj/DEKR/tools/../lib/utils/zipreader.py", line 53, in imread
    data = _im_zfile[-1]['zipfile'].read(path_img)
  File "/root/anaconda3/envs/dekr/lib/python3.7/zipfile.py", line 1465, in read
    with self.open(name, "r", pwd) as fp:
  File "/root/anaconda3/envs/dekr/lib/python3.7/zipfile.py", line 1504, in open
    zinfo = self.getinfo(name)
  File "/root/anaconda3/envs/dekr/lib/python3.7/zipfile.py", line 1431, in getinfo
    'There is no item named %r in the archive' % name)
KeyError: "There is no item named 'val2017/000000397133.jpg' in the archive"

Do you have any idea to solve it?
Thanks a lot.
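
One quick check worth doing: list the entry names the archive actually contains, since the loader looks images up under a 'val2017/<filename>' prefix. A sketch (the zip path below is a guess; adjust it to your layout):

    import zipfile

    # if entries print as e.g. 'images/val2017/...' or bare '000000397133.jpg',
    # the archive layout doesn't match the path the loader is requesting
    with zipfile.ZipFile('data/coco/images/val2017.zip') as zf:
        print(zf.namelist()[:5])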

Replicate experiments for W48

Thanks for sharing your work.

I was able to replicate your results using the w32 weights for COCO, but with the w48 model I can't get above 40% mAP, either with the pre-trained weights or when re-training. Do you have other, correct weights for it, or should I use a different training protocol?

Thank you

Can the code generate 19-channel heatmaps?

Can the code generate 19-channel heatmaps? What parts need to be modified? I trained with the COCO dataset and only modified the DATASET part of the yaml:
DATASET:
  DATASET: coco_kpt
  DATASET_TEST: coco
  DATA_FORMAT: zip
  FLIP: 0.5
  INPUT_SIZE: 512
  OUTPUT_SIZE: 64
  MAX_NUM_PEOPLE: 30
  MAX_ROTATION: 30
  MAX_SCALE: 1.5
  SCALE_TYPE: 'short'
  MAX_TRANSLATE: 40
  MIN_SCALE: 0.75
  NUM_JOINTS: 18
  ROOT: 'data/coco'
  TEST: val2017
  TRAIN: train2017
  OFFSET_RADIUS: 4
  SIGMA: 2.0
  CENTER_SIGMA: 4.0
  BG_WEIGHT: 0.1
The error:
INFO:root:Dataset CocoKeypoints
Number of datapoints: 64115
Root Location: data/coco
Dataset CocoKeypoints
Number of datapoints: 64115
Root Location: data/coco
Traceback (most recent call last):
  File "tools/train.py", line 295, in <module>
    main()
  File "tools/train.py", line 108, in main
    mp.spawn(
  File "/home/ubuntu/miniconda3/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 230, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/home/ubuntu/miniconda3/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 188, in start_processes
    while not context.join():
  File "/home/ubuntu/miniconda3/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 150, in join
    raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "/home/ubuntu/miniconda3/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
    fn(i, *args)
  File "/home/ubuntu/DEKR-main1/tools/train.py", line 258, in main_worker
    do_train(cfg, model, train_loader, loss_factory, optimizer, epoch,
  File "/home/ubuntu/DEKR-main1/tools/../lib/core/trainer.py", line 32, in do_train
    for i, (image, heatmap, mask, offset, offset_w) in enumerate(data_loader):
  File "/home/ubuntu/miniconda3/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 517, in __next__
    data = self._next_data()
  File "/home/ubuntu/miniconda3/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1199, in _next_data
    return self._process_data(data)
  File "/home/ubuntu/miniconda3/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1225, in _process_data
    data.reraise()
  File "/home/ubuntu/miniconda3/lib/python3.8/site-packages/torch/_utils.py", line 429, in reraise
    raise self.exc_type(msg)
ValueError: Caught ValueError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/ubuntu/miniconda3/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 202, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/ubuntu/miniconda3/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/ubuntu/miniconda3/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/ubuntu/DEKR-main1/tools/../lib/dataset/COCOKeypoints.py", line 54, in __getitem__
    joints, area = self.get_joints(anno)
  File "/home/ubuntu/DEKR-main1/tools/../lib/dataset/COCOKeypoints.py", line 82, in get_joints
    joints[i, :self.num_joints, :3] =
ValueError: could not broadcast input array from shape (17,3) into shape (18,3)

installation issue: ncclSystemError: System call (socket, malloc, munmap, etc) failed.

Hi,

thank you for sharing your work.

I'm trying to test DEKR but am facing an NCCL issue. When I run train.py, it returns this error:

RuntimeError: NCCL error in: /pytorch/torch/lib/c10d/ProcessGroupNCCL.cpp:825, unhandled system error, NCCL version 2.7.8

ncclSystemError: System call (socket, malloc, munmap, etc) failed.

Could you give me some tips to overcome this?

Environment:

  CUDA:
    • version: 10.2

  GPU:
    • NVIDIA GTX 1080 Ti PCIE-11GB
    • NVIDIA GTX 1080 Ti PCIE-11GB
    • NVIDIA GTX Titan PCIE-12GB
    • NVIDIA GTX Titan PCIE-12GB

  System:
    • OS: Ubuntu 18.04
    • architecture: 64bit
    • processor: x86_64
    • python: 3.6.9

Missing rescore_dataset_train_coco_kpt when running the rescore validation

When I execute the first command, python tools/valid.py --cfg experiments/coco/rescore_coco.yaml TEST.MODEL_FILE model/pose_coco/pose_dekr_hrnetw32.pth, I can't find '../data/rescore_data/rescore_dataset_train_coco_kpt', and an error is reported after more than 110,000 pictures have been processed. I think '../data/rescore_data/rescore_dataset_train_coco_kpt' needs to be created by myself, so I created such a file and modified the path in 'score/data/rescore_data/recore_dataset_train_coco_kpt', but an error is still reported. I hope you can help me solve it.

In addition, I have another question. I have changed it three times, and each time I have to run through the 110,000 pictures first, which takes a lot of time. Is there any way to avoid re-running these 110,000 pictures?

command: [screenshot]

In rescore_coco.yaml, the path before modification is in the box and the path after modification is in the ellipse: [screenshot]

The file I created myself: [screenshot]

About backbone

Why don't you use Lite-HRNet as your backbone? Is it because of limited performance, or is Lite-HRNet not suitable for this paper's approach?

ValueError: desired inference fps is 10 but video fps is 0.0

When I run
python tools/inference_demo.py --cfg experiments/coco/inference_demo_coco.yaml --videoFile ../multi_people.mp4 --outputDir output --visthre 0.3 TEST.MODEL_FILE model/pose_coco/pose_dekr_hrnetw32.pth
I get the error:
Traceback (most recent call last):
  File "tools/inference_demo.py", line 286, in <module>
    main()
  File "tools/inference_demo.py", line 208, in main
    str(args.inferenceFps)+' but video fps is '+str(fps))
ValueError: desired inference fps is 10 but video fps is 0.0
Can you tell me why this happens and how to solve the problem?
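
For debugging: fps == 0.0 usually means OpenCV could not open or decode the video at all (wrong path, or an OpenCV build without FFMPEG support). A quick sanity check, independent of this repo:

    import cv2

    cap = cv2.VideoCapture('../multi_people.mp4')
    print('opened:', cap.isOpened())            # False -> path/codec problem
    print('fps:', cap.get(cv2.CAP_PROP_FPS))    # 0.0 -> metadata not readable
    cap.release()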

The evaluation result of my run is not good

Hello, thanks for your great work.

I ran the evaluation code with your public trained model following your readme, but the result is different from your paper:

Arch        AP     AP .5  AP .75  AP (M)  AP (L)  AR     AR .5  AR .75  AR (M)  AR (L)
hrnet_dekr  0.365  0.484  0.400   0.335   0.463   0.684  0.844  0.732   0.619   0.778

Could you tell me whether there is any error in the code?
The code is https://github.com/asahiruyoru/dekr_eval

inference_demo.py csv_header little bug

Hi, in inference_demo.py line 268, when DATASET_TEST == 'crowdpose', the csv_header should use CROWDPOSE_KEYPOINT_INDEXES rather than COCO_KEYPOINT_INDEXES. It's a little bug, though the headers don't really matter ;)
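
A minimal sketch of the fix (the surrounding variable names are assumptions; only the two *_KEYPOINT_INDEXES dicts are from the repo):

    # pick the header table that matches the dataset under test
    if cfg.DATASET.DATASET_TEST == 'crowdpose':
        keypoint_indexes = CROWDPOSE_KEYPOINT_INDEXES
    else:
        keypoint_indexes = COCO_KEYPOINT_INDEXES
    csv_headers = ['frame'] + [keypoint_indexes[i] for i in sorted(keypoint_indexes)]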

How to make the GT offset maps?

Hello, could you tell me how to compute the GT offset maps? I couldn't understand this: when you form the regression loss, the prediction is a local offset while the GT seems to be a global offset. Thank you very much.
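
To make the question concrete, here is a minimal sketch of how dense GT offsets are commonly formed for this kind of model (my reading of the paper, not the repo's exact code): for each pixel p within a small radius of a person's center, the target for keypoint k is kpt_k - p, a displacement relative to that pixel rather than a global coordinate.

    import numpy as np

    def make_offset_map(keypoints, center, radius, h, w):
        """GT offsets: for pixels near `center`, store keypoint minus pixel."""
        offset = np.zeros((len(keypoints) * 2, h, w), dtype=np.float32)
        cx, cy = int(center[0]), int(center[1])
        for y in range(max(0, cy - radius), min(h, cy + radius + 1)):
            for x in range(max(0, cx - radius), min(w, cx + radius + 1)):
                for k, (kx, ky) in enumerate(keypoints):
                    offset[2 * k, y, x] = kx - x
                    offset[2 * k + 1, y, x] = ky - y
        return offset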

Is there any comparison between adaptconv and deformable-conv?

As far as I know, deformable conv also learns the shape or appearance of an object and has been proven effective in various vision tasks. I'm not sure about the major difference between adapt conv and deformable conv in terms of design and performance. Looking forward to your reply.

Training custom dataset

Hello, thanks for your excellent work! I have a dataset containing keypoints of a stable structure, and I would like to detect the structure category via the keypoints. Should I format my dataset directly following the COCO format, or must I change the dataloader code to implement the experiment? Thank you!
