Comments (4)
Hi @johnnynunez, the PyTorch Kineto library calculates the Tensor Core ratio from kernel times.
That requires users to modify their code explicitly to register event callbacks:
import torch

with torch.profiler.profile(
    activities=[
        torch.profiler.ProfilerActivity.CPU,
        torch.profiler.ProfilerActivity.CUDA,
    ]
) as p:
    code_to_profile()
I don't think there is anything we can do in nvitop, which is a monitoring tool rather than a profiler. A profiler needs in-process injection into the user program, whereas nvitop is based on the NVML library and runs in a separate process.
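As a rough illustration of what "calculating the Tensor Core ratio from kernel times" could mean, here is a minimal sketch. It assumes each kernel record already carries a flag saying whether the kernel used Tensor Cores; how a profiler derives that flag is not covered here. `KernelRecord`, `tensor_core_ratio`, and the kernel names are hypothetical and not part of Kineto or nvitop.

```python
from dataclasses import dataclass


@dataclass
class KernelRecord:
    """One GPU kernel launch as a profiler might report it (hypothetical shape)."""

    name: str
    duration_us: float
    uses_tensor_cores: bool  # assumed to be known already


def tensor_core_ratio(kernels):
    """Return the share of total kernel time spent in Tensor Core kernels."""
    total = sum(k.duration_us for k in kernels)
    if total == 0:
        return 0.0  # no kernel time recorded, so no meaningful ratio
    tensor_core_time = sum(k.duration_us for k in kernels if k.uses_tensor_cores)
    return tensor_core_time / total


# Made-up kernel timings, purely for illustration.
kernels = [
    KernelRecord("gemm_fp16", 300.0, True),
    KernelRecord("elementwise_add", 100.0, False),
    KernelRecord("gemm_tf32", 100.0, True),
]
ratio = tensor_core_ratio(kernels)
print(f"Tensor Core time share: {ratio:.0%}")
```

The per-kernel timings only exist inside the profiled process, which is why a separate-process monitor cannot compute this figure.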
from nvitop.
Hi @johnnynunez, nvitop is built on top of the NVIDIA Management Library (NVML), which is available as soon as the NVIDIA driver is installed. The only NVML APIs that report GPU utilization rates are:
Per device:
- nvmlDeviceGetUtilizationRates (%GPU + %MEM bandwidth)
- nvmlDeviceGetEncoderUtilization (%ENC)
- nvmlDeviceGetDecoderUtilization (%DEC)
Per process:
- nvmlDeviceGetProcessUtilization (%SM (GPU) + %MEM bandwidth + %ENC + %DEC)
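For reference, the NVML calls listed above can be tried from Python roughly as follows. This is a sketch, not how nvitop itself is implemented: it assumes the `pynvml` module (from the `nvidia-ml-py` package) is installed, and the helper name `query_nvml_utilization` and the result layout are made up for illustration. It returns `None` when NVML is not usable on the machine.

```python
def query_nvml_utilization():
    """Return per-device utilization dicts, or None if NVML is unavailable."""
    try:
        import pynvml  # assumption: provided by the nvidia-ml-py package
    except ImportError:
        return None
    try:
        pynvml.nvmlInit()
    except pynvml.NVMLError:
        return None  # no NVIDIA driver / NVML library on this machine

    def safe(func, *args):
        # Some queries are unsupported on some GPUs; report None instead.
        try:
            return func(*args)
        except pynvml.NVMLError:
            return None

    results = []
    try:
        for index in range(pynvml.nvmlDeviceGetCount()):
            handle = pynvml.nvmlDeviceGetHandleByIndex(index)
            rates = safe(pynvml.nvmlDeviceGetUtilizationRates, handle)
            enc = safe(pynvml.nvmlDeviceGetEncoderUtilization, handle)
            dec = safe(pynvml.nvmlDeviceGetDecoderUtilization, handle)
            # A timestamp of 0 asks for all samples NVML still has buffered.
            samples = safe(pynvml.nvmlDeviceGetProcessUtilization, handle, 0) or []
            results.append({
                "index": index,
                "gpu": rates.gpu if rates else None,      # %GPU
                "mem": rates.memory if rates else None,   # %MEM bandwidth
                "enc": enc[0] if enc else None,           # %ENC
                "dec": dec[0] if dec else None,           # %DEC
                "processes": [
                    {"pid": s.pid, "sm": s.smUtil, "mem": s.memUtil,
                     "enc": s.encUtil, "dec": s.decUtil}
                    for s in samples
                ],
            })
    except pynvml.NVMLError:
        return None
    finally:
        pynvml.nvmlShutdown()
    return results


result = query_nvml_utilization()
print(result)
```

None of these calls returns a figure specific to Tensor Cores; the finest granularity is the per-process SM utilization.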
nvitop does provide per-process GPU utilization in the %SM column. I found this blog post (figure: the GA100 streaming multiprocessor (SM)), which says:
A100 GPU streaming multiprocessor
The new streaming multiprocessor (SM) in the NVIDIA Ampere architecture-based A100 Tensor Core GPU significantly increases performance, builds upon features introduced in both the Volta and Turing SM architectures, and adds many new capabilities.
Each SM contains multiple Tensor Cores. Does this resolve your request?
NVML can only report SM (streaming multiprocessor) usage in total, not fine-grained details for the Tensor Cores. If you want to profile your program, I think using nvprof (or Nsight Compute on newer GPUs, where nvprof is no longer supported) is the best practice, as NVIDIA documents.
Closing due to inactivity. Please feel free to ask for a reopening.
Hi @XuehaiPan, PyTorch has the capability to report the Tensor Core usage percentage. Is it possible to use that here?