The data loader seems to be extremely slow for a few batches. After every few batches (like


Data loader is slow about pytorch-pose HOT 7 CLOSED

bearpaw commented on July 23, 2024

Data loader is slow

from pytorch-pose.

Comments (7)

bearpaw commented on July 23, 2024 2

@xmyqsh This is a very comprehensive analysis. Thanks very much!

To @adityaarun1 @weigq @xmyqsh: here is a quick fix (it at least lets you train at the same speed with fewer workers).

As @xingyizhou pointed out, I missed a speed-up step that the original hourglass code implements: https://github.com/anewell/pose-hg-train/blob/master/src/util/img.lua#L91-L105
(I missed it because I followed the Python code here: https://github.com/anewell/pose-hg-train/blob/master/src/pypose/img.py#L48-L78)

This has been fixed in my latest commit: 88a2294

I found the original code a bit difficult to understand, so I implemented this part according to my own understanding. The result seems correct (140 epochs of hg-stack2-block1 reach 86.3 PCKh).
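
A minimal sketch of the speed-up idea discussed in this thread, not the code from the commit: when the crop would be shrunk by a factor of 2 or more anyway, shrink the whole image first so that the crop copies far fewer pixels. The function name `crop_with_prescale`, the `ref_size=200.0` convention, and the strided slicing used in place of a real resize are all assumptions made for illustration.

```python
import numpy as np

def crop_with_prescale(img, center, scale, res, ref_size=200.0):
    """Crop a square patch around `center`, shrinking the source image first.

    Assumed convention: the person box side is `scale * ref_size` pixels
    and the final network input is `res` x `res`.
    """
    sf = scale * ref_size / res            # how much the crop would be shrunk
    if sf >= 2:
        step = int(sf)                     # integer stride: a cheap stand-in for a real resize
        img = img[::step, ::step]
        center = (center[0] / step, center[1] / step)
        scale = scale / step

    half = int(round(scale * ref_size / 2))
    side = 2 * half
    cx, cy = int(round(center[0])), int(round(center[1]))
    patch = np.zeros((side, side) + img.shape[2:], dtype=img.dtype)

    # Intersect the crop window with the image bounds, then copy the overlap.
    x0, y0 = cx - half, cy - half
    sx0, sy0 = max(x0, 0), max(y0, 0)
    sx1, sy1 = min(x0 + side, img.shape[1]), min(y0 + side, img.shape[0])
    if sx1 > sx0 and sy1 > sy0:
        patch[sy0 - y0:sy1 - y0, sx0 - x0:sx1 - x0] = img[sy0:sy1, sx0:sx1]
    return patch
```

With a 1000 x 1200 image, `scale=4.0` and `res=256`, the copied window has side 266 instead of 800. The final resize of the patch to `res` x `res` is left out here and would run on the already small patch.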

from pytorch-pose.

adityaarun1 commented on July 23, 2024 2

@bearpaw the fix works. 👍
This thread on Twitter might help us further improve the efficiency of the data loader. People from FastAI have developed a multi-processing + thread pool dataloader for PyTorch that works well. However, it would add an additional dependency to this repo.

from pytorch-pose.

xmyqsh commented on July 23, 2024 1

@bearpaw
I used the Torch version, and it has the same problem. Placing the augmentation ahead of the crop operation, as this PyTorch version does, seems more appropriate but is also more time-consuming. I tried removing the augmentation, and it gave a small speedup.

Also, the np.linalg.inv operation is called 17 times, and the image reading is definitely time-consuming. So the two main parts, image reading and image preprocessing, are both time-consuming, and they sometimes wait for each other, as we have seen.
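
One way to cut the repeated inversions, sketched under assumptions: build the crop transform once, invert it once, and apply it to all points together. `make_transform` below is a hypothetical, rotation-free stand-in for the repo's transform code, and `ref_size=200.0` is an assumed convention.

```python
import numpy as np

def make_transform(center, scale, res, ref_size=200.0):
    """Hypothetical 3x3 affine map from image pixels to crop pixels (no rotation)."""
    h = ref_size * scale
    t = np.eye(3)
    t[0, 0] = res / h
    t[1, 1] = res / h
    t[0, 2] = res * (-center[0] / h + 0.5)
    t[1, 2] = res * (-center[1] / h + 0.5)
    return t

def to_image_coords(points, center, scale, res):
    """Map an (N, 2) array of crop coordinates back to image coordinates."""
    # One inversion shared by all points, instead of one per point.
    t_inv = np.linalg.inv(make_transform(center, scale, res))
    homo = np.hstack([points, np.ones((len(points), 1))])  # (N, 3) homogeneous coordinates
    return (homo @ t_inv.T)[:, :2]
```

For example, with `center=(100, 120)`, `scale=1.0` and `res=64`, the crop centre `(32, 32)` maps back to `(100, 120)`.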

It would be better to separate the two parts with two producer-consumer stages instead of just one.
For the implementation, I think adding a new dataloader around load_image (mpii.py#L90) and load_annot (mpii.py#L70-#L73) in the training phase should be enough.

For the evaluation phase in this framework, an extra image_id-to-prediction map should also be added.
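
The two-stage idea can be sketched with standard-library threads and bounded queues. This is only an illustration of the structure, not a drop-in replacement for the PyTorch dataloader: `two_stage_pipeline`, `read_fn` and `preprocess_fn` are hypothetical names, and each item carries its `image_id` so predictions can be mapped back during evaluation.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the stream

def _stage(fn, inbox, outbox):
    """Apply fn to each (image_id, payload) item from inbox and push the result to outbox."""
    while True:
        item = inbox.get()
        if item is _DONE:
            outbox.put(_DONE)
            return
        image_id, payload = item
        outbox.put((image_id, fn(payload)))

def two_stage_pipeline(image_ids, read_fn, preprocess_fn, maxsize=8):
    """Yield (image_id, sample) pairs while reading and preprocessing overlap."""
    to_read = queue.Queue(maxsize)
    to_prep = queue.Queue(maxsize)
    ready = queue.Queue(maxsize)

    def feed():
        for image_id in image_ids:
            to_read.put((image_id, image_id))
        to_read.put(_DONE)

    workers = [
        threading.Thread(target=feed, daemon=True),
        threading.Thread(target=_stage, args=(read_fn, to_read, to_prep), daemon=True),
        threading.Thread(target=_stage, args=(preprocess_fn, to_prep, ready), daemon=True),
    ]
    for w in workers:
        w.start()

    while True:
        item = ready.get()
        if item is _DONE:
            break
        yield item
```

With one worker per stage the output order matches the input order. Threads mainly help when reading is I/O-bound; CPU-heavy preprocessing in pure Python would need processes instead, and error handling inside the stages is omitted here.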

Just my personal opinion; I hope it helps.

from pytorch-pose.

weigq commented on July 23, 2024

Here it takes about 15 min per epoch with a 1080 Ti and an E5,
so I think your issue may not be caused by the dataloader.
BTW, are you using CUDA and cuDNN?

from pytorch-pose.

adityaarun1 commented on July 23, 2024

@weigq Yes, I am using CUDA 8.0 and cuDNN 6. I don't think CUDA/cuDNN is the issue, as all my other code runs just fine.

I have just tested it on another machine with the same configuration, and the slowness is still there. It occurs only during data loading, not during the forward pass (which takes a few milliseconds).
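
A minimal sketch of how such a split can be measured, assuming any iterable `loader` and a callable `step_fn` that runs the forward/backward pass (both names are placeholders):

```python
import time

def time_loader_vs_step(loader, step_fn):
    """Return (seconds spent waiting for data, seconds spent in step_fn) over one pass."""
    wait = compute = 0.0
    it = iter(loader)
    while True:
        t0 = time.perf_counter()
        try:
            batch = next(it)          # blocks while the loader is not ready
        except StopIteration:
            break
        t1 = time.perf_counter()
        step_fn(batch)
        t2 = time.perf_counter()
        wait += t1 - t0
        compute += t2 - t1
    return wait, compute
```

On a GPU the kernels run asynchronously, so `step_fn` should end with `torch.cuda.synchronize()` for the compute time to be meaningful.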

from pytorch-pose.

bearpaw commented on July 23, 2024

This is a known issue, and I'm trying to fix it.

Basically, when all the loaded data have been forwarded and the new data are not ready yet, the program has to wait for the dataloader.

I think the augmentation part should be optimized (e.g. pose/datasets/mpii.py). I am also trying to figure out which part is the most time-consuming. If you have suggestions, don't hesitate to create a pull request or leave a comment here.

from pytorch-pose.

xmyqsh commented on July 23, 2024

@bearpaw
In other words, shrinking the image with img = scipy.misc.imresize(img, [new_ht, new_wd]), which is a subsampling operation, is much faster than the copy operation new_img[new_y[0]:new_y[1], new_x[0]:new_x[1]] = img[old_y[0]:old_y[1], old_x[0]:old_x[1]] for the same size.

So the big copy operation on the original image is the bottleneck. Right?

from pytorch-pose.
