Comments (7)

bearpaw commented on July 23, 2024 2

@xmyqsh This is a very comprehensive analysis. Thanks very much!

To @adityaarun1 @weigq @xmyqsh: here is a quick fix (at least it allows you to train at the same speed with fewer workers).

As suggested by @xingyizhou, I missed a speed-up step that is implemented in the original hourglass code: https://github.com/anewell/pose-hg-train/blob/master/src/util/img.lua#L91-L105
(I missed it because I followed the Python code here: https://github.com/anewell/pose-hg-train/blob/master/src/pypose/img.py#L48-L78)

This has been fixed in my latest commit: 88a2294

The original code is a bit difficult for me to understand, so I implemented this part based on my own understanding. The result seems correct (140 epochs of hg-stack2-block1 reach 86.3 PCKh).
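For readers who have not opened the linked Lua code, the following is a minimal sketch of the idea as I understand it: when the crop box is much larger than the output resolution, shrink the whole image first, then crop from the smaller image. It is not the code from commit 88a2294. The function name `prescale_for_crop`, the nearest-neighbour resize, and the assumption that the crop box side is `200 * scale` pixels are illustrative choices.

```python
import numpy as np

def prescale_for_crop(img, center, scale, res):
    """Shrink `img` before cropping when the crop box is much larger than `res`.

    Returns (img, center, scale). `img` is None when the shrunken image would
    be degenerate, in which case the caller can emit a blank crop.
    Assumption: the crop box side is 200 * scale pixels.
    """
    ht, wd = img.shape[:2]
    center = np.asarray(center, dtype=float)
    sf = scale * 200.0 / res  # crop box side divided by output side
    if sf < 2:
        # Not worth shrinking: hand everything back unchanged.
        return img, center, float(scale)

    new_ht = int(np.floor(ht / sf))
    new_wd = int(np.floor(wd / sf))
    if max(new_ht, new_wd) < 2:
        return None, center / sf, scale / sf
    new_ht, new_wd = max(new_ht, 1), max(new_wd, 1)

    # Nearest-neighbour subsampling (a stand-in for a real resize routine).
    rows = (np.arange(new_ht) * ht / new_ht).astype(int)
    cols = (np.arange(new_wd) * wd / new_wd).astype(int)
    small = img[rows][:, cols]

    # Center and scale must shrink by the same factor as the image.
    return small, center / sf, scale / sf
```

With a 1000x800 image, `scale=4.0` and `res=256`, the later crop reads from a 320x256 image instead of the full-resolution one.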

from pytorch-pose.

adityaarun1 commented on July 23, 2024 2

@bearpaw the fix works. 👍
This thread on Twitter might be of interest for further improving the efficiency of the data loader. People from FastAI have developed a multi-processing + thread pool dataloader for PyTorch that works fine, but using it would add an additional dependency to this repo.
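To illustrate only the thread-pool half of that idea without any extra dependency, here is a small sketch using the standard library. It is not the FastAI implementation. `iter_batches` is a hypothetical helper, and `dataset` is assumed to be any object supporting `len()` and integer indexing.

```python
from concurrent.futures import ThreadPoolExecutor

def iter_batches(dataset, batch_size, num_threads=4):
    """Yield lists of samples, loading the samples of each batch in parallel threads."""
    indices = range(len(dataset))
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        for start in range(0, len(dataset), batch_size):
            batch_idx = indices[start:start + batch_size]
            # pool.map preserves the order of batch_idx in its results.
            yield list(pool.map(dataset.__getitem__, batch_idx))
```

Threads mainly help when loading is I/O-bound or when the heavy calls release the GIL; CPU-bound augmentation in pure Python would still need processes.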

xmyqsh commented on July 23, 2024 1

@bearpaw
I used the Torch version, and it has the same problem. Placing the augmentation ahead of the crop operation, as this PyTorch version does, seems more appropriate but is also more time-consuming. I tried removing the augmentation part, and it gave a bit of a speedup.

Also, the np.linalg.inv operation is called 17 times, and image reading is definitely time-consuming. So the two main parts, image reading and image preprocessing, are both time-consuming, and they sometimes wait for each other, as we have seen.
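If the repeated np.linalg.inv calls come from transforming points one at a time with the same matrix (an assumption; I have not traced all 17 calls), the inverse can be computed once and applied to all points together. The sketch below uses a simplified transform without rotation or the 1-based index offsets; `get_transform` and `to_original_coords` here are illustrative, not the repository's exact functions.

```python
import numpy as np

def get_transform(center, scale, res):
    """3x3 matrix mapping original-image (x, y) to crop coordinates (no rotation)."""
    h = 200.0 * scale  # assumed crop box side in pixels
    t = np.eye(3)
    t[0, 0] = res / h
    t[1, 1] = res / h
    t[0, 2] = res * (-center[0] / h + 0.5)
    t[1, 2] = res * (-center[1] / h + 0.5)
    return t

def to_original_coords(points, center, scale, res):
    """Map an (N, 2) array of crop-space points back using a single inverse."""
    t_inv = np.linalg.inv(get_transform(center, scale, res))  # computed once
    pts = np.asarray(points, dtype=float)
    homo = np.hstack([pts, np.ones((len(pts), 1))])  # homogeneous coordinates
    return (homo @ t_inv.T)[:, :2]
```

For example, the middle of a 256x256 crop maps back to the crop center in the original image.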

It would be better to separate the two parts with two producer-consumer stages instead of just one.
For the implementation, I think adding a new dataloader around load_image (mpii.py#L90) and load_annot (mpii.py#L70-#L73) in the training phase should be OK.

For the evaluation phase in the above framework, an extra image_id-to-prediction map should also be added.
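The two-stage producer-consumer idea can be sketched with standard-library queues and threads. This is only an illustration of the structure, not a drop-in replacement for the PyTorch DataLoader. `run_pipeline`, `load_fn`, and `preprocess_fn` are hypothetical names standing in for the image-reading and preprocessing steps. Results are keyed by id, which also covers the image_id-to-result map needed when samples finish out of order.

```python
import queue
import threading

_STOP = object()  # sentinel telling a thread to exit

def run_pipeline(ids, load_fn, preprocess_fn, n_readers=2, n_workers=2, maxsize=8):
    """Stage 1 reads raw items, stage 2 preprocesses them; returns {id: result}."""
    id_q = queue.Queue()
    raw_q = queue.Queue(maxsize=maxsize)  # bounded, so readers cannot run far ahead
    out = {}
    lock = threading.Lock()

    for i in ids:
        id_q.put(i)
    for _ in range(n_readers):
        id_q.put(_STOP)

    def reader():
        while True:
            i = id_q.get()
            if i is _STOP:
                break
            raw_q.put((i, load_fn(i)))

    def worker():
        while True:
            item = raw_q.get()
            if item is _STOP:
                break
            i, raw = item
            result = preprocess_fn(raw)
            with lock:
                out[i] = result

    readers = [threading.Thread(target=reader) for _ in range(n_readers)]
    workers = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in readers + workers:
        t.start()
    for t in readers:
        t.join()
    # All raw items are queued; now tell each worker to stop.
    for _ in range(n_workers):
        raw_q.put(_STOP)
    for t in workers:
        t.join()
    return out
```

Threads are used here for brevity; CPU-bound preprocessing in pure Python would likely need processes to see a real speedup.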

Just my personal opinion; I hope it helps.

weigq commented on July 23, 2024

Here it takes about 15 min per epoch with a 1080 Ti and an E5.
I think your issue may not be caused by the dataloader.
BTW, are you using CUDA and cuDNN?

adityaarun1 commented on July 23, 2024

@weigq Yes, I am using CUDA 8.0 and cuDNN 6. I don't think CUDA/cuDNN is the issue, as all my other code runs just fine.

I have just tested it on another machine with the same configuration, and the slowness is still there. It occurs only during data loading and not during the forward pass (which takes a few milliseconds).

bearpaw commented on July 23, 2024

This is a known issue, and I'm trying to fix it.

Basically, when all the loaded data have been forwarded and the new data are not ready yet, the program has to wait for the dataloader.

I think the augmentation part should be optimized (e.g. pose/datasets/mpii.py). I am also trying to figure out which part is the most time-consuming. If you have suggestions, don't hesitate to create a pull request or leave your comments here.
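One simple way to check whether the loop is waiting on the dataloader is to time the wait for each batch separately from the compute step. The sketch below is a generic helper, not code from this repo. `time_loader` and `step_fn` are hypothetical names, and `loader` can be any iterable of batches.

```python
import time

def time_loader(loader, step_fn):
    """Return (seconds spent waiting for batches, seconds spent in step_fn)."""
    data_time = 0.0
    compute_time = 0.0
    end = time.perf_counter()
    for batch in loader:
        start = time.perf_counter()
        data_time += start - end      # time blocked waiting for this batch
        step_fn(batch)                # forward/backward pass would go here
        end = time.perf_counter()
        compute_time += end - start
    return data_time, compute_time
```

If `data_time` dominates, the loader is the bottleneck. When `step_fn` runs on a GPU, it should synchronize (e.g. with `torch.cuda.synchronize()`) before returning; otherwise asynchronous kernels make the compute time look smaller than it is.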

xmyqsh commented on July 23, 2024

@bearpaw
In other words, downscaling with img = scipy.misc.imresize(img, [new_ht, new_wd]), which is a subsampling operation, is much faster than the copy operation new_img[new_y[0]:new_y[1], new_x[0]:new_x[1]] = img[old_y[0]:old_y[1], old_x[0]:old_x[1]] for the same size.

So the big copy operation on the original image is the bottleneck, right?
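A quick way to test this on a given machine is a micro-benchmark along the following lines. It is a rough sketch: strided subsampling stands in for scipy.misc.imresize (which has been removed from recent SciPy releases), and the image size and factor are arbitrary. `compare_copy_vs_subsample` is a hypothetical helper.

```python
import timeit
import numpy as np

def compare_copy_vs_subsample(side=2000, factor=4, repeats=5):
    """Return (best time of a full-size copy, best time of a strided subsample)."""
    img = np.random.randint(0, 255, (side, side, 3), dtype=np.uint8)
    canvas = np.zeros_like(img)

    def big_copy():
        # Full-resolution copy, like pasting the crop region into new_img.
        canvas[:, :] = img

    def subsample():
        # Force a real copy; a plain strided slice would only be a view.
        return np.ascontiguousarray(img[::factor, ::factor])

    t_copy = min(timeit.repeat(big_copy, number=1, repeat=repeats))
    t_sub = min(timeit.repeat(subsample, number=1, repeat=repeats))
    return t_copy, t_sub
```

The absolute numbers depend on memory bandwidth and image size, so they should be read as an indication rather than a measurement of the actual training pipeline.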
