
Comments (9)

sbulfer commented on June 26, 2024

I figured it out!
A lock file was generated at some point but never cleared. The steps to fix it are the following:
When the Python script hangs, use Ctrl-C to kill the process. A stack trace should be printed. Mine was the following:
  File "", line 1, in <module>
  File "/home/lost/.pyenv/versions/3.9.13/lib/python3.9/site-packages/qtorch/quant/__init__.py", line 1, in <module>
    from .quant_function import *
  File "/home/lost/.pyenv/versions/3.9.13/lib/python3.9/site-packages/qtorch/quant/quant_function.py", line 20, in <module>
    quant_cuda = load(
  File "/home/lost/.pyenv/versions/3.9.13/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1202, in load
    return _jit_compile(
  File "/home/lost/.pyenv/versions/3.9.13/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1439, in _jit_compile
    baton.wait()
  File "/home/lost/.pyenv/versions/3.9.13/lib/python3.9/site-packages/torch/utils/file_baton.py", line 42, in wait
    time.sleep(self.wait_seconds)
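For context, the last two frames (baton.wait() followed by time.sleep) are what a polling loop on a lock file looks like. The following is a minimal sketch of that pattern, not PyTorch's actual code: the class name SimpleBaton and the timeout argument are hypothetical additions for illustration. It shows why a lock file that is never removed makes every later waiter sleep forever.

```python
import os
import time


class SimpleBaton:
    """Hypothetical, simplified lock-file baton (illustration only)."""

    def __init__(self, lock_file_path, wait_seconds=0.1):
        self.lock_file_path = lock_file_path
        self.wait_seconds = wait_seconds
        self.fd = None

    def try_acquire(self):
        # O_CREAT | O_EXCL fails if the file already exists, so only one
        # process can create the lock file and "own" the build.
        try:
            self.fd = os.open(self.lock_file_path, os.O_CREAT | os.O_EXCL)
            return True
        except FileExistsError:
            return False

    def wait(self, timeout=None):
        # Poll until the lock file disappears. With timeout=None this loops
        # forever if the owner died without calling release().
        deadline = None if timeout is None else time.monotonic() + timeout
        while os.path.exists(self.lock_file_path):
            if deadline is not None and time.monotonic() >= deadline:
                return False  # gave up: the lock file is still there
            time.sleep(self.wait_seconds)
        return True

    def release(self):
        # Close our handle and delete the lock file so waiters can proceed.
        if self.fd is not None:
            os.close(self.fd)
            self.fd = None
        os.remove(self.lock_file_path)
```

If the process that created the lock file is killed mid-build, release() never runs, the file stays on disk, and any later wait() without a timeout never returns.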

Analyzing this trace, we see that it is hung waiting on a file lock. I used pdb to debug the program like so:
python3 -m pdb my_file.py

Within pdb, I set a breakpoint at the line where it hangs:
b /home/lost/.pyenv/versions/3.9.13/lib/python3.9/site-packages/torch/utils/file_baton.py:42

Press c to continue until the breakpoint is hit.

I then opened the file lock code and noticed there was an attribute called "self.lock_file_path".
I printed it by typing "self.lock_file_path" in pdb.
Navigate to the directory in this path (that is, the path without the trailing "lock")
and delete the lock file.
Your file should now run again :)
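The manual steps above can also be sketched as a small script. This is only a sketch under stated assumptions: it assumes the JIT build cache lives under the directory named by the TORCH_EXTENSIONS_DIR environment variable, or under ~/.cache/torch_extensions on Linux when that variable is unset, and that the leftover file is literally named "lock". The function names and the 10-minute age threshold are hypothetical. If your pdb session printed a different path, trust that path instead.

```python
import os
import time
from pathlib import Path
from typing import Iterable, List


def default_extensions_root() -> Path:
    # Assumption: TORCH_EXTENSIONS_DIR overrides the default cache location,
    # which on Linux is ~/.cache/torch_extensions.
    env = os.environ.get("TORCH_EXTENSIONS_DIR")
    return Path(env) if env else Path.home() / ".cache" / "torch_extensions"


def find_stale_locks(root: Path, min_age_seconds: float = 600.0) -> List[Path]:
    """Return files named 'lock' under root that are at least min_age_seconds old."""
    if not root.is_dir():
        return []
    now = time.time()
    stale = []
    for path in root.rglob("lock"):
        # Skip recent lock files: a build may genuinely be in progress.
        if path.is_file() and now - path.stat().st_mtime >= min_age_seconds:
            stale.append(path)
    return sorted(stale)


def remove_locks(paths: Iterable[Path], dry_run: bool = True) -> List[Path]:
    """Delete the given lock files; with dry_run=True only print what would go."""
    removed = []
    for path in paths:
        if dry_run:
            print(f"would remove {path}")
        else:
            path.unlink()
            removed.append(path)
    return removed
```

A cautious way to use it is to call remove_locks(find_stale_locks(default_extensions_root())) first, which only prints the candidates, and then repeat with dry_run=False once the list looks right and no other process is compiling the extension.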

from qpytorch.

Tiiiger commented on June 26, 2024

Hi @hasnainnaeem,

What's your environment? Which PyTorch and CUDA versions are you using?


hasnainnaeem commented on June 26, 2024

Environment Details:
Torch: 1.11.0
CUDA: 11.3
Ubuntu: 22.7
Python: 3.8
GCC: 9.3


sbulfer commented on June 26, 2024

I ran into this exact problem. It seems to be hanging during the just-in-time compilation. I am not sure yet how to fix it. It might require reinstalling PyTorch to clear out some cache or something.


hasnainnaeem commented on June 26, 2024

> I ran into this exact problem. It seems to be hanging during the just-in-time compilation. I am not sure yet how to fix it. It might require reinstalling PyTorch to clear out some cache or something.

Unfortunately, that doesn't fix the issue. I tried doing that multiple times, and also reinstalled the Linux subsystem. Then I tried again on dual-booted Ubuntu, but the issue persisted.

Right now I am working on Colab, where the issue does not occur.

I think it has something to do with the graphics card/drivers.


hasnainnaeem commented on June 26, 2024

Awesome! Thanks for letting me know.

I knew it had something to do with some lock file, but I couldn't find the lock file.


sbulfer commented on June 26, 2024

I'm glad I could help :)


RuokaiYin commented on June 26, 2024

Thank you very much for the solution! I have no idea why I suddenly ran into the same situation, but the solution fixed the problem! (The code worked normally for weeks, then suddenly froze...)


Tiiiger commented on June 26, 2024

Hi all on this thread,

Thank you all for sharing the knowledge here. I have become too busy to maintain this repo and have not tested it on more recent environments.

Sorry about this!

Best,
Tianyi

