Comments (12)

johncadengo commented on September 26, 2024

I'm experiencing this issue as well

from nvtop.

supernovae commented on September 26, 2024

Same problem with 7900xtx

bymiller@byron-X570:~$ nvtop
No GPU to monitor.

rocm-smi --showproductname

============================ ROCm System Management Interface ============================
====================================== Product Info ======================================
GPU[0] : Card series: 0x744c
GPU[0] : Card model: 0x2422
GPU[0] : Card vendor: Advanced Micro Devices, Inc. [AMD/ATI]
GPU[0] : Card SKU: EXT84765

================================== End of ROCm SMI Log ===================================

qwertychouskie commented on September 26, 2024

It's not surprising this isn't working, given that the card is based on CDNA rather than GCN or RDNA. It's quite possible that kernel APIs are missing, and even if they aren't, I doubt any nvtop dev has a test card available. I would personally be inclined to close this issue as wontfix, but @Syllo would know better than I would whether implementing support is possible.

Syllo commented on September 26, 2024

If I had access to such a card, I could try to add support, provided there is a way to discover these GPUs. If it's not registering through the DRM driver, I'm not surprised it's not showing up in nvtop.

supernovae commented on September 26, 2024

bymiller@byron-X570:~$ sudo dmesg | grep drm
[ 3.645163] ACPI: bus type drm_connector registered
[ 4.921387] [drm] amdgpu kernel modesetting enabled.
[ 4.921389] [drm] amdgpu version: 6.3.6
[ 4.921390] [drm] OS DRM version: 6.5.0
[ 4.935928] [drm] initializing kernel modesetting (IP DISCOVERY 0x1002:0x744C 0x148C:0x2422 0xC8).
[ 4.935939] [drm] register mmio base: 0xFCC00000
[ 4.935940] [drm] register mmio size: 1048576
[ 4.940610] [drm] add ip block number 0 <soc21_common>
[ 4.940612] [drm] add ip block number 1 <gmc_v11_0>
[ 4.940613] [drm] add ip block number 2 <ih_v6_0>
[ 4.940614] [drm] add ip block number 3
[ 4.940615] [drm] add ip block number 4
[ 4.940617] [drm] add ip block number 5
[ 4.940618] [drm] add ip block number 6 <gfx_v11_0>
[ 4.940619] [drm] add ip block number 7 <sdma_v6_0>
[ 4.940620] [drm] add ip block number 8 <vcn_v4_0>
[ 4.940621] [drm] add ip block number 9 <jpeg_v4_0>
[ 4.940622] [drm] add ip block number 10 <mes_v11_0>
[ 4.946689] [drm] VCN(0) encode/decode are enabled in VM mode
[ 4.946691] [drm] VCN(1) encode/decode are enabled in VM mode
[ 4.947687] amdgpu 0000:0a:00.0: [drm:jpeg_v4_0_early_init [amdgpu]] JPEG decode is enabled in VM mode
[ 4.949179] [drm] vm size is 262144 GB, 4 levels, block size is 9-bit, fragment size is 9-bit
[ 4.949194] [drm] Detected VRAM RAM=24560M, BAR=32768M
[ 4.949196] [drm] RAM width 384bits GDDR6
[ 4.949283] [drm] amdgpu: 24560M of VRAM memory ready
[ 4.949284] [drm] amdgpu: 32107M of GTT memory ready.
[ 4.949299] [drm] GART: num cpu pages 131072, num gpu pages 131072
[ 4.949364] [drm] PCIE GART of 512M enabled (table at 0x0000008001300000).
[ 4.949718] [drm] Loading DMUB firmware via PSP: version=0x07002100
[ 4.950212] [drm] Found VCN firmware Version ENC: 1.16 DEC: 5 VEP: 0 Revision: 6
[ 5.020366] [drm] reserve 0x1300000 from 0x85fc000000 for PSP TMR
[ 5.349912] [drm] Display Core v3.2.255 initialized on DCN 3.2
[ 5.349914] [drm] DP-HDMI FRL PCON supported
[ 5.351765] [drm] DMUB hardware initialized: version=0x07002100
[ 5.569780] [drm] kiq ring mec 3 pipe 1 q 0
[ 5.577336] [drm] VCN decode and encode initialized successfully(under DPG Mode).
[ 5.577955] amdgpu 0000:0a:00.0: [drm:jpeg_v4_0_hw_init [amdgpu]] JPEG decode initialized successfully.
[ 5.682527] [drm] ring gfx_32768.1.1 was added
[ 5.682796] [drm] ring compute_32768.2.2 was added
[ 5.683000] [drm] ring sdma_32768.3.3 was added
[ 5.683057] [drm] ring gfx_32768.1.1 ib test pass
[ 5.683107] [drm] ring compute_32768.2.2 ib test pass
[ 5.683132] [drm] ring sdma_32768.3.3 ib test pass
[ 5.684846] [drm] Initialized amdgpu 3.56.0 20150101 for 0000:0a:00.0 on minor 0
[ 5.691939] fbcon: amdgpudrmfb (fb0) is primary device
[ 5.691942] amdgpu 0000:0a:00.0: [drm] fb0: amdgpudrmfb frame buffer device
[ 5.788512] [drm] DSC precompute is not needed.
[ 6.253192] systemd[1]: Starting Load Kernel Module drm...
[ 6.258790] systemd[1]: modprobe@drm.service: Deactivated successfully.
[ 6.258990] systemd[1]: Finished Load Kernel Module drm.

ehartford commented on September 26, 2024

If I had access to such a card, I could try to add support, provided there is a way to discover these GPUs. If it's not registering through the DRM driver, I'm not surprised it's not showing up in nvtop.

Hey, I'm happy to give you access to my server.

numas commented on September 26, 2024

This also happens for me on a Radeon RX 7900 XTX (as well as on a Radeon RX 7900 XT):

====================================== ROCm System Management Interface ======================================
================================================ Concise Info ================================================
Device  [Model : Revision]    Temp    Power  Partitions      SCLK     MCLK   Fan  Perf  PwrCap  VRAM%  GPU%  
        Name (20 chars)       (Edge)  (Avg)  (Mem, Compute)                                                  
==============================================================================================================
0       [0x471e : 0xc8]       30.0°C  69.0W  N/A, N/A        1564Mhz  96Mhz  0%   auto  303.0W    0%   56%   
        0x744c                                                                                               
==============================================================================================================
============================================ End of ROCm SMI Log =============================================

numas commented on September 26, 2024

I added the IDs for the RX 7900 XTX / XT to src/amdgpu_ids.h myself, and it works now: #293

As for the MI100 card, my guess is that the line would be:

{0x0C34, 0x01, "AMD Instinct MI100"},

Umio-Yasuno commented on September 26, 2024

Added the ids for RX 7900 XTX / XT myself to src/amdgpu_ids.h - it works now: #293

Regarding the MI100 card, I would guess the line would be:

{0x0C34, 0x01, "AMD Instinct MI100"},

Really? What you are adding is the SubDeviceID, not the DeviceID, and nvtop doesn't use the SubDeviceID.
Also, amdgpu_ids.h is only used to look up the name.
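
For anyone unsure which value is which: on Linux, the PCI DeviceID and SubDeviceID are exposed as separate sysfs attributes (`device` and `subsystem_device` in the card's device directory). Below is a minimal C sketch for reading one such attribute. `read_hex_attr` is a hypothetical helper written for this illustration, not something from nvtop, and the `card0` path in the note that follows is an assumption that may differ per machine.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not part of nvtop): reads one hexadecimal ID,
 * e.g. "0x744c\n", from a sysfs-style attribute file.
 * Returns 0 on success and -1 if the file is missing or malformed. */
int read_hex_attr(const char *path, unsigned *value) {
  assert(path != NULL && value != NULL);

  FILE *f = fopen(path, "r");
  if (f == NULL)
    return -1;

  unsigned parsed = 0;
  /* %x accepts an optional "0x" prefix, which sysfs includes. */
  int ok = (fscanf(f, "%x", &parsed) == 1);
  fclose(f);

  if (!ok)
    return -1;
  *value = parsed;
  return 0;
}
```

Calling it with `/sys/class/drm/card0/device/device` should yield the DeviceID, and with `/sys/class/drm/card0/device/subsystem_device` the SubDeviceID, assuming the GPU is `card0`.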

numas commented on September 26, 2024

Really? What you are adding is the SubDeviceID, not the DeviceID, and nvtop doesn't use the SubDeviceId. And amdgpu_ids.h is only used to get the name.

Well, nvtop went from "No GPU to monitor" to this:

[Screenshot: nvtop_7900xtx]

This is the information on 7900 XTX from https://gitlab.freedesktop.org/mesa/drm/-/blob/main/data/amdgpu.ids

744C, C8, AMD Radeon RX 7900 XTX

I took 0x471e from rocm-smi, but is this the SubDeviceID? As you can see, some info is missing (N/A), so that may be the reason. I'll try with 0x744c, as I somehow missed that.

It's just weird that the OP couldn't get nvtop to start with the MI100, as the DeviceID in amdgpu_ids.h should be correct.

UPDATE: Changed to DeviceID 0x744C in amdgpu_ids.h and the nvtop output is identical to the screenshot above. Is more code needed to support the 7900 XTX / XT in nvtop than adding the DeviceID?
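
For reference, the amdgpu.ids entry quoted above has three comma-separated fields: DeviceID (hex), revision (hex), and product name. Here is a minimal C sketch of parsing such a line; `struct ids_entry` and `parse_ids_line` are hypothetical names used only for illustration and are not taken from nvtop or libdrm.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for one parsed amdgpu.ids entry. */
struct ids_entry {
  unsigned device_id;
  unsigned revision;
  char name[128];
};

/* Parses a line such as "744C, C8, AMD Radeon RX 7900 XTX".
 * Returns 0 on success, -1 for comments, blank or malformed lines.
 * On failure, *out is left untouched. */
int parse_ids_line(const char *line, struct ids_entry *out) {
  assert(line != NULL && out != NULL);

  unsigned device_id = 0, revision = 0;
  char name[sizeof out->name];

  /* Whitespace in the format matches any run of spaces or tabs. */
  if (sscanf(line, "%x , %x , %127[^\r\n]", &device_id, &revision, name) != 3)
    return -1;

  out->device_id = device_id;
  out->revision = revision;
  strcpy(out->name, name); /* safe: both buffers are 128 bytes */
  return 0;
}
```

Note that the first field here is the DeviceID (0x744C), which is a different value from the 0x471e that rocm-smi printed.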

Umio-Yasuno commented on September 26, 2024

@numas
Hmm, have you tried the unpatched build?
nvtop gets the device name from libdrm_amdgpu and falls back to the list in amdgpu_ids.h when that fails.
If the list does not contain the device either, a generic driver name such as "AMD GPU" is used.
So it seems strange that adding the device name to the list would make nvtop recognize the device.
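
The lookup order described above can be sketched roughly as follows. This is an illustration under stated assumptions, not nvtop's actual code: `resolve_gpu_name`, `struct gpu_id_entry` and `id_table` are hypothetical names, the libdrm result is modeled as a plain string that may be NULL, and matching on both DeviceID and revision is an assumption. The single table row uses the values quoted earlier in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one row of an ID table like src/amdgpu_ids.h. */
struct gpu_id_entry {
  uint32_t device_id;
  uint32_t revision;
  const char *name;
};

static const struct gpu_id_entry id_table[] = {
    {0x744C, 0xC8, "AMD Radeon RX 7900 XTX"},
};

/* Picks a display name in three steps:
 *   1. the name reported by libdrm (passed in here; NULL or "" if unavailable),
 *   2. a match in the static ID table,
 *   3. the generic driver name, e.g. "AMD GPU".
 * The name affects display only; it plays no part in device detection. */
const char *resolve_gpu_name(const char *libdrm_name, uint32_t device_id,
                             uint32_t revision, const char *driver_name) {
  assert(driver_name != NULL);

  if (libdrm_name != NULL && strlen(libdrm_name) > 0)
    return libdrm_name;

  for (size_t i = 0; i < sizeof id_table / sizeof id_table[0]; i++) {
    if (id_table[i].device_id == device_id && id_table[i].revision == revision)
      return id_table[i].name;
  }
  return driver_name;
}
```

Every path returns some name, which is why a missing table entry alone would not be expected to produce "No GPU to monitor".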

numas commented on September 26, 2024

Thank you, @Umio-Yasuno!

You are correct: the unpatched build works (though still with some N/A info). I was comparing against the distro-provided nvtop, which is old (1.2.2 in Ubuntu 22.04), and went straight to hacking instead of checking a clean build first...

Sorry for the noise, I will remove the pull request.
