Git Product home page Git Product logo

Comments (9)

vmoens avatar vmoens commented on September 16, 2024 1

I believe it's because we use non_blocking=True but don't sync after the tensordict creation, I can fix that!
It won't be the case if you do it the other way (cpu -> cuda) as cuda syncs itself.
Thanks for reporting

from tensordict.

vmoens avatar vmoens commented on September 16, 2024 1

In the meantime you can call

torch.cuda.synchronize()

after instantiating the td!

from tensordict.

vmoens avatar vmoens commented on September 16, 2024

Looking at it, this has already been solved as now non_blocking is an arg of tensordict constructor.
If you set it to True, it will be your responsibility to call synchronize if it's needed.
Hope that makes sense!

from tensordict.

JCBrouwer avatar JCBrouwer commented on September 16, 2024

Is the use of non_blocking=True also your own responsibility with regular tensors? I was under the impression that torch would synchronize for you if it was necessary, but maybe I'm wrong.

I'd say it's pretty dangerous to have hidden responsibilities enabled by default. If indeed this is a user's own responsibility then the default setting should be non_blocking=False in my opinion.

from tensordict.

vmoens avatar vmoens commented on September 16, 2024

Is the use of non_blocking=True also your own responsibility with regular tensors? I was under the impression that torch would synchronize for you if it was necessary, but maybe I'm wrong.

For cpu -> cuda yes, for anything else no

I'd say it's pretty dangerous to have hidden responsibilities enabled by default. If indeed this is a user's own responsibility then the default setting should be non_blocking=False in my opinion.

I agree. I can put a synchronize in the tensordict - the only thing blocking me right now is that we should do it only at the root tensordict, not at every node, and i'm not sure how I should achieve that without adding yet another keyword argument to __init__

from tensordict.

vmoens avatar vmoens commented on September 16, 2024

After further discussion with @albanD we agreed on the following plan:

  • tensordict will always use non_blocking=True during creation
  • if non_blocking=True is passed by the user, we don't call synchronize at the end
  • if non_blocking=False (default) we do call synchronize if device is cpu / mps or any non-cuda

Your calls will be fixed, but you will have the same error if non_blocking is True.

from tensordict.

JCBrouwer avatar JCBrouwer commented on September 16, 2024

Sounds like a well thought-out resolution. Thanks a lot!

from tensordict.

shagunsodhani avatar shagunsodhani commented on September 16, 2024

After further discussion with @albanD we agreed on the following plan:

  • tensordict will always use non_blocking=True during creation
  • if non_blocking=True is passed by the user, we don't call synchronize at the end
  • if non_blocking=False (default) we do call synchronize if device is cpu / mps or any non-cuda

Your calls will be fixed, but you will have the same error if non_blocking is True.

Double checking, did you mean to say tensordict will always use non_blocking=False during creation

from tensordict.

vmoens avatar vmoens commented on September 16, 2024

I meant that under the hood we'll always use non_blocking=True
From a UX perspective, the only different when calling TensorDict(..., non_blocking=True) will be that a call to syncrhonize will not be made after sending the tensor asynchronously to the required device. If False, we call synchronize on cuda (thereby synchronizing all cuda devices) whenever cuda is available, and same for MPS.

Using non_blocking=True will be useful if you want to keep control over which stream is synchronized. For instance, you may have some tensors on cuda:0 that you want to bring to cpu while cuda:1 is busy doing somethings else. In that case, non_blocking=False will dispatch things cuda:0 -> cpu async but then call a synchronize over both cuda:0 and cuda:1, which can be annoying. With non_blocking=True, you can avoid that and call the synchronize over the first stream only.

from tensordict.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.