Git Product home page Git Product logo

Comments (6)

XuehaiPan avatar XuehaiPan commented on July 17, 2024

Hi @marcreichman-pfi, thanks for raising this. I have encountered the same issue before. I think this would be a bug on the upstream (nvidia-ml-py) with the incompatible NVIDIA driver. The nvidia-ml-py returns invalid PIDs.

In [1]: import pynvml

In [2]: pynvml.nvmlInit()

In [3]: handle = pynvml.nvmlDeviceGetHandleByIndex(0)

In [4]: [p.pid for p in pynvml.nvmlDeviceGetComputeRunningProcesses(handle)]
Out[4]:
[1184,
 0,
 4294967295,
 4294967295,
 16040,
 0,
 4294967295,
 4294967295,
 19984,
 0,
 4294967295,
 4294967295,
 20884,
 0,
 4294967295,
 4294967295,
 26308,
 0,
 4294967295,
 4294967295,
 16336,
 0,
 4294967295,
 4294967295,
 5368,
 0,
 4294967295,
 4294967295,
 19828,
 0,
 4294967295]

I haven't found a solution for this yet. This may be due to an internal API change in the NVML library. We may need to wait for the next nvidia-ml-py release.

As a temporary workaround, you could downgrade your NVIDIA driver version.

See also:

from nvitop.

marcreichman-pfi avatar marcreichman-pfi commented on July 17, 2024

Hi @XuehaiPan and thanks for your response and excellent tool!

We cannot downgrade because we need newer CUDA version support, so for now we'll just have to wait for an updated version with the NVML library fix.

from nvitop.

XuehaiPan avatar XuehaiPan commented on July 17, 2024

Hi @marcreichman-pfi, a new release of nvidia-ml-py with version 12.535.77 came out several hours ago. You can upgrade your nvidia-ml-py package with the command:

python3 -m pip install --upgrade nvidia-ml-py

This would resolve the unrecognized PIDs with CUDA 12 drivers.

I would also make a new release of nvitop to resolve CUDA 12 driver support.

from nvitop.

marcreichman-pfi avatar marcreichman-pfi commented on July 17, 2024

Thanks @XuehaiPan - is there a way to do this in the docker version?

from nvitop.

XuehaiPan avatar XuehaiPan commented on July 17, 2024

Thanks @XuehaiPan - is there a way to do this in the docker version?

@marcreichman-pfi You could upgrade nvidia-ml-py in your docker container.

from nvitop.

marcreichman-pfi avatar marcreichman-pfi commented on July 17, 2024

Thanks this did the trick! Here was what I did from your Dockerfile:

$ git diff Dockerfile
diff --git a/Dockerfile b/Dockerfile
index c3194cf..96874da 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -32,6 +32,7 @@ COPY . /nvitop
 WORKDIR /nvitop
 RUN . /venv/bin/activate && \
   python3 -m pip install . && \
+  python3 -m pip install --upgrade nvidia-ml-py && \
   rm -rf /root/.cache

 # Entrypoint

from nvitop.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.