
Comments (35)

kmittman commented on July 28, 2024

Resolved. There was a timing issue that impacted the precompiled kmod release pipelines

Date        Component    Driver version  Kernel version  Kernel suffix
2024-06-06  kmod-nvidia  550.90.07       4.18.0-553.5.1  el8_10.x86_64
2024-06-05  kernel-core  -               4.18.0-553.5.1  el8_10.x86_64
2024-06-05  dnf-plugin   2.1             -               el8.noarch
2024-06-04  kmod-nvidia  555.42.02       4.18.0-553.5.1  el8_10.x86_64
2024-06-04  kmod-nvidia  535.183.01      4.18.0-553.5.1  el8_10.x86_64
2024-06-04  kmod-nvidia  470.256.02      4.18.0-553.5.1  el8_10.x86_64
2024-05-22  kernel-core  -               4.18.0-553      el8_10.x86_64
$ sudo dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64/cuda-rhel8.repo
$ sudo dnf module list nvidia-driver

$ sudo dnf module install nvidia-driver:latest
Dependencies resolved.
====================================================================================================================================
 Package                                        Architecture     Version                          Repository                   Size
====================================================================================================================================
Installing group/module packages:
 cuda-drivers                                   x86_64           555.42.02-1                      cuda-rhel8-x86_64           8.1 k
 nvidia-driver                                  x86_64           3:555.42.02-1.el8                cuda-rhel8-x86_64           124 M
 nvidia-driver-NVML                             x86_64           3:555.42.02-1.el8                cuda-rhel8-x86_64           600 k
 nvidia-driver-NvFBCOpenGL                      x86_64           3:555.42.02-1.el8                cuda-rhel8-x86_64            77 k
 nvidia-driver-cuda                             x86_64           3:555.42.02-1.el8                cuda-rhel8-x86_64           530 k
 nvidia-driver-cuda-libs                        x86_64           3:555.42.02-1.el8                cuda-rhel8-x86_64            53 M
 nvidia-driver-devel                            x86_64           3:555.42.02-1.el8                cuda-rhel8-x86_64            13 k
 nvidia-driver-libs                             x86_64           3:555.42.02-1.el8                cuda-rhel8-x86_64           135 M
 nvidia-kmod-common                             noarch           3:555.42.02-1.el8                cuda-rhel8-x86_64            13 k
 nvidia-libXNVCtrl                              x86_64           3:555.42.02-1.el8                cuda-rhel8-x86_64            26 k
 nvidia-libXNVCtrl-devel                        x86_64           3:555.42.02-1.el8                cuda-rhel8-x86_64            56 k
 nvidia-modprobe                                x86_64           3:555.42.02-2.el8                cuda-rhel8-x86_64            37 k
 nvidia-persistenced                            x86_64           3:555.42.02-1.el8                cuda-rhel8-x86_64            45 k
 nvidia-settings                                x86_64           3:555.42.02-1.el8                cuda-rhel8-x86_64           841 k
 nvidia-xconfig                                 x86_64           3:555.42.02-2.el8                cuda-rhel8-x86_64           105 k
Installing dependencies:
 dnf-plugin-nvidia                              noarch           2.1-1.el8                        cuda-rhel8-x86_64            14 k
 kmod-nvidia-555.42.02-4.18.0-553.5.1           x86_64           3:555.42.02-3.el8_10             cuda-rhel8-x86_64            41 M
Installing module profiles:
 nvidia-driver/default                                                                                                             
Enabling module streams:
 nvidia-driver                                                   latest                                                            

Transaction Summary
====================================================================================================================================
Install  17 Packages

Total download size: 356 M
Installed size: 799 M
Is this ok [y/N]:  y
$ dnf nvidia-plugin

Installed kmod(s):
  kmod-nvidia-555.42.02-4.18.0-553.5.1-3:555.42.02-3.el8_10.x86_64

Available kernel(s):
  kernel-core-4.18.0-553.5.1.el8_10.x86_64
  kernel-core-4.18.0-553.el8_10.x86_64

Available driver(s):
  nvidia-driver-3:555.42.02-1.el8.x86_64
  nvidia-driver-cuda-3:555.42.02-1.el8.x86_64

Available kmod(s):
[...]
  kmod-nvidia-550.54.15-4.18.0-513.18.1-3:550.54.15-3.el8_9.x86_64
  kmod-nvidia-550.54.15-4.18.0-513.24.1-3:550.54.15-3.el8_9.x86_64
  kmod-nvidia-555.42.02-4.18.0-553.5.1-3:555.42.02-3.el8_10.x86_64
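
For anyone who wants to compare the installed kmods against the available kernels in a script, the sectioned report above can be read into a dictionary. This is only a sketch: the function name `parse_plugin_report` is made up, and it assumes the layout shown here (a header line ending in a colon, followed by indented entries).

```python
def parse_plugin_report(text):
    """Group the indented entries of the report under their section header."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and the "[...]" elision marker.
        if not line or line == "[...]":
            continue
        if line.endswith(":"):
            # A header such as "Installed kmod(s):" starts a new section.
            current = line[:-1]
            sections[current] = []
        elif current is not None:
            # Anything before the first header (the shell prompt) is ignored.
            sections[current].append(line)
    return sections
```

The result maps each header, for example "Installed kmod(s)", to the list of package names printed under it.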

from yum-packaging-precompiled-kmod.

chindokae commented on July 28, 2024

@hhue13 At this point I would try removing the NVIDIA packages and, if you have them, the kmod-kvdo packages as well. On my test machine, removing the kmod-kvdo driver allowed normal patching to proceed after it rebuilt the boot initrd ramdisk. In my case at least, I think kmod-kvdo caused the entire mess. The NVIDIA kmods evidently had issues as well, but I never got to the point of doing an NVIDIA kmod install because the kmod-kvdo kernel module issue was blocking any further installation of kernel modules. I don't run Ceph or host VMs with virtual drives, so I never needed the virtual drive optimizer modules to begin with.

It sounds like your machine has gotten into a state that may not be recoverable, though. If the uninstall routines get confused by the driver/kmod mismatch, they may refuse to run.


michaelbarkdoll commented on July 28, 2024

I'm also experiencing this issue.


hhue13 commented on July 28, 2024

Same here! The missing drivers are obviously blocking the upgrade from RHEL 8.9 to RHEL 8.10.

Running a yum update shows the following warnings:

NOTE: Skipping kernel installation since no kernel module package kmod-nvidia-555.42.02-4.18.0-553 for kernel version 4.18.0-553.el8_10 and NVIDIA driver 550.54.15 could be found
NOTE: Skipping kernel installation since no kernel module package kmod-nvidia-555.42.02-4.18.0-553.5.1 for kernel version 4.18.0-553.5.1.el8_10 and NVIDIA driver 550.54.15 could be found

The page https://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64/precompiled/ does not contain any update since April 2024 either.
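
The package names in the NOTE lines above pack the driver version and the kernel version into one string. A minimal sketch for splitting them apart follows. The regular expression is an assumption inferred only from the names that appear in this thread, and `split_kmod_name` is a hypothetical helper, not part of any NVIDIA tooling.

```python
import re

# Assumed pattern: kmod-nvidia-<driver version>-<kernel version>-<kernel release>,
# e.g. kmod-nvidia-555.42.02-4.18.0-553.5.1 (inferred from this thread only).
KMOD_NAME = re.compile(
    r"^kmod-nvidia-"
    r"(?P<driver>\d+\.\d+(?:\.\d+)?)-"
    r"(?P<kernel>\d+\.\d+\.\d+-\d+(?:\.\d+)*)$"
)


def split_kmod_name(name):
    """Return (driver, kernel) for a precompiled kmod name, or None if it does not match."""
    match = KMOD_NAME.match(name)
    if match is None:
        return None
    return match.group("driver"), match.group("kernel")
```

Names that do not follow this scheme, such as the DKMS variants, simply return None.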


chindokae commented on July 28, 2024

The updates were posted but they are not working for kernel 4.18.0-553.5.1 for RHEL 8.10. Every update is filtered out by modular filtering. They all say "no precompiled modules available for ..."
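
Since the "filtering kernel" output runs to dozens of lines, it can help to parse it and check whether one specific kernel is among those filtered. The sketch below assumes the message wording stays exactly as quoted in this thread; `filtered_kernels` is a hypothetical helper name.

```python
import re

# Matches lines such as:
# NVIDIA driver: filtering kernel 4.18.0-553.el8_10, no precompiled modules available for version 3:555.42.02
FILTER_LINE = re.compile(
    r"^NVIDIA driver: filtering kernel (?P<kernel>[^,\s]+), "
    r"no precompiled modules available for version (?P<version>\S+)$"
)


def filtered_kernels(log_text):
    """Return (kernel, driver_version) pairs for every filter message in the log."""
    pairs = []
    for line in log_text.splitlines():
        match = FILTER_LINE.match(line.strip())
        if match:
            pairs.append((match.group("kernel"), match.group("version")))
    return pairs
```

Comparing the kernels returned here with the output of uname -r shows at a glance whether the running kernel is affected.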


hhue13 commented on July 28, 2024

When I run my updated RHEL 8.10 system with kernel 4.18.0-553.5.1.el8_10.x86_64, the laptop freezes if I don't have an external display attached. I've also discovered that the screensaver seems to freeze the laptop after I unlock the screen.


hhue13 commented on July 28, 2024

Not sure if that's related, but it fits the issue with the NVIDIA driver on RHEL 8 since RHEL 8.10 was released. I tried to run another yum update and here is what I get:

$ sudo yum update
Updating Subscription Management repositories.
Last metadata expiration check: 11:09:29 ago on Thu 13 Jun 2024 08:17:15 PM CEST.
NVIDIA driver: filtering kernel 4.18.0-80.el8, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-80.4.2.el8_0, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-80.1.2.el8_0, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-80.11.2.el8_0, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-147.el8, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-80.11.1.el8_0, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-80.7.2.el8_0, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-80.7.1.el8_0, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-147.8.1.el8_1, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-147.5.1.el8_1, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-147.0.3.el8_1, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-147.3.1.el8_1, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-147.0.2.el8_1, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-193.1.2.el8_2, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-193.19.1.el8_2, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-193.6.3.el8_2, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-193.13.2.el8_2, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-193.14.3.el8_2, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-193.el8, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-240.el8, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-193.28.1.el8_2, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-240.1.1.el8_3, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-240.22.1.el8_3, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-240.8.1.el8_3, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-240.10.1.el8_3, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-305.el8, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-240.15.1.el8_3, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-305.10.2.el8_4, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-305.7.1.el8_4, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-305.12.1.el8_4, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-305.3.1.el8_4, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-348.el8, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-305.19.1.el8_4, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-305.25.1.el8_4, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-305.17.1.el8_4, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-348.20.1.el8_5, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-348.7.1.el8_5, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-348.23.1.el8_5, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-348.2.1.el8_5, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-348.12.2.el8_5, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-372.9.1.el8, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-425.3.1.el8, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-372.16.1.el8_6, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-372.32.1.el8_6, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-372.13.1.el8_6, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-372.19.1.el8_6, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-372.26.1.el8_6, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-425.19.2.el8_7, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-425.13.1.el8_7, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-425.10.1.el8_7, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-477.13.1.el8_8, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-477.15.1.el8_8, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-477.10.1.el8_8, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-477.21.1.el8_8, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-477.27.1.el8_8, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-513.5.1.el8_9, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-513.18.1.el8_9, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-513.11.1.el8_9, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-513.9.1.el8_9, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-513.24.1.el8_9, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-553.el8_10, no precompiled modules available for version 3:555.42.02
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/dnf/plugin.py", line 107, in _caller
    getattr(plugin, method)()
  File "/usr/lib/python3.6/site-packages/dnf-plugins/nvidia.py", line 155, in resolved
    if pkg.name == DRIVER_PKG_NAME:
NameError: name 'DRIVER_PKG_NAME' is not defined

with the following kernel running:

[xxx@xxx ~]$ uname -a
Linux xxx.yyy.zz 4.18.0-553.5.1.el8_10.x86_64 #1 SMP Tue May 21 03:13:04 EDT 2024 x86_64 x86_64 x86_64 GNU/Linux
[xxx@xxx ~]$ cat /etc/redhat-release 
Red Hat Enterprise Linux release 8.10 (Ootpa)

Rerunning sudo yum update updated the package dnf-plugin-nvidia to 2.1-2.el8, and the Python exception is now at least gone.
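
The traceback above shows the plugin printing all of its filter messages and only then failing inside `resolved`. That is consistent with how Python resolves global names: a missing module-level constant raises NameError when the line that uses it runs, not when the module is loaded. The sketch below illustrates the behaviour in isolation. It is not the plugin's code; `FakePkg`, both function names, and the value given to `DRIVER_PKG_NAME` are placeholders.

```python
class FakePkg:
    """Stand-in for a package object that has a name attribute."""

    def __init__(self, name):
        self.name = name


def resolved_broken(pkgs):
    # DRIVER_PKG_NAME_UNDEFINED is deliberately never defined, so this line
    # raises NameError as soon as the comparison runs for the first package.
    return [p for p in pkgs if p.name == DRIVER_PKG_NAME_UNDEFINED]


# Assumed value, for illustration only.
DRIVER_PKG_NAME = "nvidia-driver"


def resolved_fixed(pkgs):
    # With the constant defined at module level, the same comparison works.
    return [p for p in pkgs if p.name == DRIVER_PKG_NAME]
```

With an empty package list `resolved_broken` returns without error, which is why a defect of this kind can go unnoticed until a transaction actually contains packages.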


scaronni-nvidia commented on July 28, 2024

Yes, the Python exception is fixed in 2.1-2.el8. The issue you are seeing is that the precompiled 555 branch is missing precompiled kernel modules for every combination except driver 555.42.02 + kernel 4.18.0-553.5.1, which I see is missing from your list; it stops at 553.

We are looking into it.
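
The behaviour described above, where a kernel is filtered out when no precompiled module exists for the driver/kernel pair, can be sketched roughly as follows. This is an illustration under stated assumptions, not the dnf-plugin-nvidia implementation: the function names are hypothetical, and the rule for stripping the distribution suffix is inferred from the kernel strings in this thread.

```python
import re


def strip_dist(kernel):
    """Drop a trailing dist tag: 4.18.0-553.5.1.el8_10 -> 4.18.0-553.5.1 (assumed rule)."""
    return re.sub(r"\.el\d+(?:_\d+)?$", "", kernel)


def kernels_to_filter(kernels, available_kmods, driver_version):
    """Return the kernels that have no precompiled kmod for driver_version.

    available_kmods is a set of (driver_version, kernel_without_dist_tag) pairs.
    """
    return [
        kernel
        for kernel in kernels
        if (driver_version, strip_dist(kernel)) not in available_kmods
    ]
```

If the only published pair is 555.42.02 with 4.18.0-553.5.1, every other kernel is filtered for the 555 branch, which matches the long lists of messages quoted in this thread.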


scaronni commented on July 28, 2024

A precompiled kernel module package has been posted for driver 555.42.02 and kernel 4.18.0-553.5.1.


kmittman commented on July 28, 2024

Thank you @scaronni, closing.


hhue13 commented on July 28, 2024

Thanks but I'm still getting the same issue:

[me@myhost ~]$ sudo yum update
Updating Subscription Management repositories.
Last metadata expiration check: 0:02:11 ago on Tue 18 Jun 2024 06:59:39 AM CEST.
NVIDIA driver: filtering kernel 4.18.0-80.el8, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-80.4.2.el8_0, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-80.1.2.el8_0, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-80.11.2.el8_0, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-147.el8, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-80.11.1.el8_0, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-80.7.2.el8_0, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-80.7.1.el8_0, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-147.8.1.el8_1, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-147.5.1.el8_1, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-147.0.3.el8_1, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-147.3.1.el8_1, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-147.0.2.el8_1, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-193.1.2.el8_2, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-193.19.1.el8_2, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-193.6.3.el8_2, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-193.13.2.el8_2, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-193.14.3.el8_2, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-193.el8, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-240.el8, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-193.28.1.el8_2, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-240.1.1.el8_3, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-240.22.1.el8_3, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-240.8.1.el8_3, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-240.10.1.el8_3, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-305.el8, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-240.15.1.el8_3, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-305.10.2.el8_4, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-305.7.1.el8_4, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-305.12.1.el8_4, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-305.3.1.el8_4, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-348.el8, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-305.19.1.el8_4, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-305.25.1.el8_4, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-305.17.1.el8_4, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-348.20.1.el8_5, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-348.7.1.el8_5, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-348.23.1.el8_5, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-348.2.1.el8_5, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-348.12.2.el8_5, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-372.9.1.el8, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-425.3.1.el8, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-372.16.1.el8_6, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-372.32.1.el8_6, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-372.13.1.el8_6, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-372.19.1.el8_6, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-372.26.1.el8_6, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-425.19.2.el8_7, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-425.13.1.el8_7, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-425.10.1.el8_7, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-477.13.1.el8_8, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-477.15.1.el8_8, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-477.10.1.el8_8, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-477.21.1.el8_8, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-477.27.1.el8_8, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-513.5.1.el8_9, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-513.18.1.el8_9, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-513.11.1.el8_9, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-513.9.1.el8_9, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-513.24.1.el8_9, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-553.el8_10, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-553.5.1.el8_10, no precompiled modules available for version 3:555.42.02
Dependencies resolved.
Nothing to do.
Complete!
[me@myhost ~]$ uname -a
Linux myhost.hhue.at 4.18.0-553.5.1.el8_10.x86_64 #1 SMP Tue May 21 03:13:04 EDT 2024 x86_64 x86_64 x86_64 GNU/Linux

That is what I get when I try to run a yum update. Is there anything I'm missing to get the updated kernel module package?


chindokae commented on July 28, 2024

Same here. I got 47 new files but there were no kmods. The new files were libvinfer*, python-libvinfer*, libvonnxparsers*, tensorrt*, and a new dnf-plugin.


chindokae commented on July 28, 2024

This is from one of my CUDA dev machines. They are behind corporate firewalls so the repos are air-gapped, hence the non-standard names. We are stuck on kernel 4.18.0-513.24.1 because the update to 4.18.0-553.5.1 remains blocked by the lack of precompiled modules. We did get a new dnf-plugin on yesterday's reposync but we did not get anything that looked like kmods.

The machine has access to these repos:

rhel-8-for-x86_64-baseos-rpms
rhel-8-for-x86_64-appstream-rpms
rhel-8-for-x86_64-supplementary-rpms
codeready-builder-for-rhel-8-x86_64-rpms
rhel-8-for-x86_64-baseos-source-rpms
epel_8
cuda-rhel8-x86_64

I should add that the release of the RHEL 9 kmods took so long that by the time they did appear, Red Hat had released a new kernel, rendering the kmods unusable. I haven't tested the RHEL 9 CUDA repos because we don't have any RHEL 9 machines running with NVIDIA cards yet.

NVIDIA driver: filtering kernel 4.18.0-147.0.2.el8_1, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-147.0.3.el8_1, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-147.3.1.el8_1, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-147.5.1.el8_1, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-147.8.1.el8_1, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-147.el8, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-193.1.2.el8_2, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-193.13.2.el8_2, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-193.14.3.el8_2, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-193.19.1.el8_2, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-193.28.1.el8_2, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-193.6.3.el8_2, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-193.el8, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-240.1.1.el8_3, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-240.10.1.el8_3, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-240.15.1.el8_3, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-240.22.1.el8_3, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-240.8.1.el8_3, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-240.el8, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-305.10.2.el8_4, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-305.12.1.el8_4, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-305.17.1.el8_4, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-305.19.1.el8_4, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-305.25.1.el8_4, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-305.3.1.el8_4, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-305.7.1.el8_4, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-305.el8, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-348.12.2.el8_5, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-348.2.1.el8_5, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-348.20.1.el8_5, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-348.23.1.el8_5, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-348.7.1.el8_5, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-348.el8, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-372.13.1.el8_6, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-372.16.1.el8_6, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-372.19.1.el8_6, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-372.26.1.el8_6, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-372.32.1.el8_6, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-372.9.1.el8, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-425.10.1.el8_7, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-425.13.1.el8_7, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-425.19.2.el8_7, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-425.3.1.el8, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-477.10.1.el8_8, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-477.13.1.el8_8, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-477.15.1.el8_8, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-477.21.1.el8_8, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-477.27.1.el8_8, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-513.11.1.el8_9, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-513.18.1.el8_9, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-513.24.1.el8_9, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-513.5.1.el8_9, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-513.9.1.el8_9, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-553.5.1.el8_10, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-553.el8_10, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-80.1.2.el8_0, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-80.11.1.el8_0, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-80.11.2.el8_0, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-80.4.2.el8_0, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-80.7.1.el8_0, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-80.7.2.el8_0, no precompiled modules available for version 3:550.54.15
NVIDIA driver: filtering kernel 4.18.0-80.el8, no precompiled modules available for version 3:550.54.15
Error:
Problem 1: package kmod-kvdo-6.2.8.7-94.el8.x86_64 from LOCAL-SERVER_BASEOS-REPO requires kernel-modules-uname-r >= 4.18.0-526.el8, but none of the providers can be installed

  • cannot install the best update candidate for package kmod-kvdo-6.2.8.7-92.el8.x86_64
  • package kernel-modules-4.18.0-553.5.1.el8_10.x86_64 from LOCAL-SERVER_BASEOS-REPO is filtered out by exclude filtering
  • package kernel-modules-4.18.0-553.el8_10.x86_64 from LOCAL-SERVER_BASEOS-REPO is filtered out by exclude filtering
    Problem 2: package cuda-drivers-555.42.02-1.x86_64 from LOCAL-SERVER-CUDA-REPO requires nvidia-kmod >= 3:555.42.02, but none of the providers can be installed
  • package kmod-nvidia-550.54.15-4.18.0-513.18.1-3:550.54.15-3.el8_9.x86_64 from @System conflicts with kmod-nvidia-latest-dkms provided by kmod-nvidia-latest-dkms-3:555.42.02-1.el8.x86_64 from LOCAL-SERVER-CUDA-REPO
  • cannot install the best update candidate for package cuda-drivers-550.54.15-1.x86_64
  • problem with installed package kmod-nvidia-550.54.15-4.18.0-513.18.1-3:550.54.15-3.el8_9.x86_64
  • package kmod-nvidia-555.42.02-4.18.0-553.5.1-3:555.42.02-3.el8_10.x86_64 from LOCAL-SERVER-CUDA-REPO is filtered out by modular filtering
  • package kmod-nvidia-open-dkms-3:555.42.02-1.el8.x86_64 from LOCAL-SERVER-CUDA-REPO is filtered out by modular filtering
Problem 3: package nvidia-kmod-common-3:555.42.02-1.el8.noarch from LOCAL-SERVER-CUDA-REPO requires nvidia-kmod = 3:555.42.02, but none of the providers can be installed
  • package kmod-nvidia-550.54.15-4.18.0-513.24.1-3:550.54.15-3.el8_9.x86_64 from @System conflicts with kmod-nvidia-latest-dkms provided by kmod-nvidia-latest-dkms-3:555.42.02-1.el8.x86_64 from LOCAL-SERVER-CUDA-REPO
  • cannot install the best update candidate for package nvidia-kmod-common-3:550.54.15-1.el8.noarch
  • problem with installed package kmod-nvidia-550.54.15-4.18.0-513.24.1-3:550.54.15-3.el8_9.x86_64
  • package kmod-nvidia-555.42.02-4.18.0-553.5.1-3:555.42.02-3.el8_10.x86_64 from LOCAL-SERVER-CUDA-REPO is filtered out by modular filtering
  • package kmod-nvidia-open-dkms-3:555.42.02-1.el8.x86_64 from LOCAL-SERVER-CUDA-REPO is filtered out by modular filtering
(try to add '--allowerasing' to command line to replace conflicting packages or '--skip-broken' to skip uninstallable packages or '--nobest' to use not only best candidate packages)

# uname -a
Linux cuda.local 4.18.0-513.24.1.el8_9.x86_64 #1 SMP Thu Mar 14 14:20:09 EDT 2024 x86_64 x86_64 x86_64 GNU/Linux

from yum-packaging-precompiled-kmod.

scaronni-nvidia avatar scaronni-nvidia commented on July 28, 2024

Hi, yes, the combination 550.54.15 and kernel 4.18.0-553.5.1.el8_10 is missing, but is available for driver version 550.90.07:

https://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64/kmod-nvidia-550.90.07-4.18.0-553.5.1-550.90.07-3.el8_10.x86_64.rpm

It should appear inside the appropriate stream nvidia-driver:550.
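
The file name above encodes both the driver version and the kernel the module was built for. A minimal sketch of splitting such a name in POSIX shell, assuming the layout kmod-nvidia-<driver>-<kernel version>-<kernel release>-<driver>-<package release>.<dist>.<arch>.rpm seen in this thread (the function name parse_kmod_name is made up for illustration):

```shell
# Hypothetical helper: print "<driver> <kernel>" for a precompiled
# kmod-nvidia RPM file name or URL. Assumes the naming layout seen
# in this thread; other layouts are not handled.
parse_kmod_name() {
    name=${1##*/}               # drop any URL or directory prefix
    name=${name%.rpm}           # drop the .rpm extension
    rest=${name#kmod-nvidia-}   # <driver>-<kver>-<krel>-<driver>-<rel>.<dist>.<arch>
    driver=${rest%%-*}
    rest=${rest#*-}
    kver=${rest%%-*}
    rest=${rest#*-}
    krel=${rest%%-*}
    rest=${rest#*-}             # <driver>-<rel>.<dist>.<arch>
    dist_arch=${rest#*-}        # <rel>.<dist>.<arch>
    dist_arch=${dist_arch#*.}   # <dist>.<arch>
    printf '%s %s-%s.%s\n' "$driver" "$kver" "$krel" "$dist_arch"
}
```

The second field is in the same form as the output of uname -r, so it can be compared directly against the running kernel.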

hhue13 avatar hhue13 commented on July 28, 2024

Sorry, but I'm a bit lost now. How can I get the correct driver installed then? So far all I had to do was run a yum update and things got installed on RHEL8.

This is the repo configured on my RHEL 8.10 system:

[cuda-rhel8-x86_64]
name=cuda-rhel8-x86_64
baseurl=https://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64
enabled=1
gpgcheck=1
gpgkey=https://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64/D42D0685.pub

Do I have to modify anything here to get yum update working again? Could I install the downloaded version of https://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64/kmod-nvidia-550.90.07-4.18.0-553.5.1-550.90.07-3.el8_10.x86_64.rpm using yum localinstall?

hhue13 avatar hhue13 commented on July 28, 2024

> Hi, yes, the combination 550.54.15 and kernel 4.18.0-553.5.1.el8_10 is missing, but is available for driver version 550.90.07:
>
> https://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64/kmod-nvidia-550.90.07-4.18.0-553.5.1-550.90.07-3.el8_10.x86_64.rpm
>
> It should appear inside the appropriate stream nvidia-driver:550.

After installing the rpm and booting into kernel 4.18.0-553.5.1.el8_10.x86_64 (#1 SMP Tue May 21 03:13:04 EDT 2024) on xxx.hhue.at, GNOME did not start at all anymore, so I'm not sure whether the module works.

chindokae avatar chindokae commented on July 28, 2024

There is a conflict between Red Hat's kmod-kvdo package and its available kernel modules. kmod-kvdo needs to be removed. After that, yum update still requires --allowerasing to proceed with patching. Even after all that, the kernel stays at 4.18.0-513.24.1.

Once the patching finished and I rebooted, I was able to reinstall kmod-kvdo and that installed kernel 4.18.0-553.5.1.

The kmod-kvdo conflict only exists on machines with existing installations of NVIDIA drivers from this repo. All the other machines without NV drivers patched without error.
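
The sequence described above can be sketched roughly as follows. This is only an outline of what was reported here, not a vetted procedure: the function names and the DRY_RUN switch are invented for illustration, and removing kmod-kvdo is presumably only safe on machines that do not use VDO volumes. On RHEL 8, yum and dnf are the same tool, so dnf is used throughout.

```shell
# run: execute a command, or only print it when DRY_RUN is non-empty
# (DRY_RUN is a made-up switch for this sketch).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Step 1, before the reboot: remove kmod-kvdo, then patch.
patch_without_kvdo() {
    run dnf remove kmod-kvdo &&
    run dnf update --allowerasing
}

# Step 2, after the reboot: put kmod-kvdo back.
restore_kvdo() {
    run dnf install kmod-kvdo
}
```

Setting DRY_RUN=1 first prints the commands instead of running them, which makes it easier to review the steps before putting them into automation.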

chindokae avatar chindokae commented on July 28, 2024

And that update failed to install, with the following error in make.log: FATAL: modpost: GPL-incompatible module nvidia.ko uses GPL-only symbol 'lock_is_held_type'.

After rebooting it came back up with the new kernel and the old 555.42.02 NV driver, but at least it came back up.

This is going to be hard to write into an Ansible playbook.

I've been using this repo since March 2023 and nothing like this has happened before. It has been seamless up to now, as long as I ensured the kernel version in the Red Hat repo matched the listed kernel version on the "RHEL8 precompiled kmod status" page.

Have I missed something? I have every file on the NV repo and every file on the RH8 repo(s) as of an hour ago. I have tried using the NV ./repodata metadata and I have also run createrepo myself. Results are the same. I did the same with the RHEL8 repos and that also made no difference.

Unfortunately I have no machines available to test this on the internet, but AFAIK the air gap method should be OK. It has been for the last 14 months.

Correction: 555.42.02 is a new driver but it installed before the kernel was updated. Prior to that I had a 550 series driver. No idea how I ended up on 555 but it's fine.

scaronni-nvidia avatar scaronni-nvidia commented on July 28, 2024

> After installing the rpm and booting into kernel 4.18.0-553.5.1.el8_10.x86_64 (#1 SMP Tue May 21 03:13:04 EDT 2024) on xxx.hhue.at, GNOME did not start at all anymore, so I'm not sure whether the module works.

Can you check if the module is loaded? From the screen, Ctrl+Alt+F4 (for example) or via ssh.
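
One way to answer this from a text console or an ssh session is to look for the nvidia module in the lsmod listing. A small sketch (nvidia_loaded is a made-up helper that reads lsmod-style output on standard input, so it also works on a saved file):

```shell
# Succeeds (exit status 0) if a module named exactly "nvidia"
# appears in the first column of lsmod-style input.
nvidia_loaded() {
    awk '$1 == "nvidia" { found = 1 } END { exit !found }'
}

# Example use on a live system (not executed here):
#   lsmod | nvidia_loaded && cat /proc/driver/nvidia/version
```

Matching the first column exactly avoids false positives from related modules such as nvidia_drm or nvidia_modeset.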

scaronni-nvidia avatar scaronni-nvidia commented on July 28, 2024

> There is a conflict between Red Hat's kmod-kvdo package and its available kernel modules. kmod-kvdo needs to be removed. After that, yum update still requires --allowerasing to proceed with patching. Even after all that, the kernel stays at 4.18.0-513.24.1.

Kernel version 4.18.0-513.24.1 has modules only for specific versions:

kmod-nvidia-525.147.05-4.18.0-513.24.1-525.147.05-3.el8_9.x86_64.rpm
kmod-nvidia-535.161.08-4.18.0-513.24.1-535.161.08-3.el8_9.x86_64.rpm
kmod-nvidia-545.23.08-4.18.0-513.24.1-545.23.08-3.el8_9.x86_64.rpm
kmod-nvidia-550.54.15-4.18.0-513.24.1-550.54.15-3.el8_9.x86_64.rpm
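
A list like this can be filtered for any kernel, for example in a locally mirrored copy of the repo. A sketch, assuming one package file name per line on standard input; kernel_base and kmods_for_kernel are hypothetical helper names:

```shell
# Strip the dist/arch suffix from a uname -r style string:
# 4.18.0-513.24.1.el8_9.x86_64 -> 4.18.0-513.24.1
kernel_base() {
    printf '%s\n' "${1%.el*}"
}

# Print only the package names that embed the given kernel
# version-release, e.g. 4.18.0-513.24.1. Prints nothing if none match.
kmods_for_kernel() {
    grep -F -e "-$1-" || true
}

# Example use in a directory of downloaded RPMs (not executed here):
#   ls kmod-nvidia-*.rpm | kmods_for_kernel "$(kernel_base "$(uname -r)")"
```

An empty result for the kernel you are about to boot means no precompiled module is available for it yet in that file list.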

> Once the patching finished and I rebooted, I was able to reinstall kmod-kvdo and that installed kernel 4.18.0-553.5.1.
>
> The kmod-kvdo conflict only exists on machines with existing installations of NVIDIA drivers from this repo. All the other machines without NV drivers patched without error.

Yes, because unfortunately we cannot sign and ship kernel modules at the same time that Red Hat ships a new kernel + kmod-kvdo package with signed kernel modules in it.

scaronni-nvidia avatar scaronni-nvidia commented on July 28, 2024

@chindokae what is your target setup? 550 + latest kernel? latest driver regardless of branch (555)?

hhue13 avatar hhue13 commented on July 28, 2024

> > After installing the rpm and booting into kernel 4.18.0-553.5.1.el8_10.x86_64 (#1 SMP Tue May 21 03:13:04 EDT 2024) on xxx.hhue.at, GNOME did not start at all anymore, so I'm not sure whether the module works.
>
> Can you check if the module is loaded? From the screen, Ctrl+Alt+F4 (for example) or via ssh.

cat lsmod.out | grep nvi
nvidia_drm             90112  0
nvidia_modeset       1294336  1 nvidia_drm
nvidia_uvm           4567040  0
nvidia              53403648  2 nvidia_uvm,nvidia_modeset
drm_kms_helper        184320  3 drm_display_helper,nvidia_drm,i915
drm                   602112  8 drm_kms_helper,drm_display_helper,nvidia,drm_buddy,nvidia_drm,i915,ttm
video                  57344  3 thinkpad_acpi,i915,nvidia_modeset

Furthermore, I'm attaching the full lsmod output as well.

lsmod.out.gz

What are the steps required to reinstall kmod-kvdo? Is that really required?

scaronni-nvidia avatar scaronni-nvidia commented on July 28, 2024

> What are the steps required to reinstall kmod-kvdo? Is that really required?

I guess that if you ask you don't need it? It's for deduplicating logical volumes. For RHEL 10 it will be part of the kernel: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html-single/deduplicating_and_compressing_logical_volumes_on_rhel/index

scaronni-nvidia avatar scaronni-nvidia commented on July 28, 2024

@chindokae @hhue13 we found an issue with the repository metadata: the nvidia-driver:latest stream is missing the kernel modules that are in nvidia-driver:555. Let's check again after we regenerate the metadata.

scaronni-nvidia avatar scaronni-nvidia commented on July 28, 2024

Updating Subscription Management repositories.
Last metadata expiration check: 11:09:29 ago on Thu 13 Jun 2024 08:17:15 PM CEST.
NVIDIA driver: filtering kernel 4.18.0-80.el8, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-80.4.2.el8_0, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-80.1.2.el8_0, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-80.11.2.el8_0, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-147.el8, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-80.11.1.el8_0, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-80.7.2.el8_0, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-80.7.1.el8_0, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-147.8.1.el8_1, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-147.5.1.el8_1, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-147.0.3.el8_1, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-147.3.1.el8_1, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-147.0.2.el8_1, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-193.1.2.el8_2, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-193.19.1.el8_2, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-193.6.3.el8_2, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-193.13.2.el8_2, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-193.14.3.el8_2, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-193.el8, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-240.el8, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-193.28.1.el8_2, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-240.1.1.el8_3, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-240.22.1.el8_3, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-240.8.1.el8_3, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-240.10.1.el8_3, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-305.el8, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-240.15.1.el8_3, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-305.10.2.el8_4, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-305.7.1.el8_4, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-305.12.1.el8_4, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-305.3.1.el8_4, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-348.el8, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-305.19.1.el8_4, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-305.25.1.el8_4, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-305.17.1.el8_4, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-348.20.1.el8_5, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-348.7.1.el8_5, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-348.23.1.el8_5, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-348.2.1.el8_5, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-348.12.2.el8_5, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-372.9.1.el8, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-425.3.1.el8, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-372.16.1.el8_6, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-372.32.1.el8_6, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-372.13.1.el8_6, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-372.19.1.el8_6, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-372.26.1.el8_6, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-425.19.2.el8_7, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-425.13.1.el8_7, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-425.10.1.el8_7, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-477.13.1.el8_8, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-477.15.1.el8_8, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-477.10.1.el8_8, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-477.21.1.el8_8, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-477.27.1.el8_8, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-513.5.1.el8_9, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-513.18.1.el8_9, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-513.11.1.el8_9, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-513.9.1.el8_9, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-513.24.1.el8_9, no precompiled modules available for version 3:555.42.02
NVIDIA driver: filtering kernel 4.18.0-553.el8_10, no precompiled modules available for version 3:555.42.02

I will change the plugin's output in a separate update; printing all these lines is too much if you use Satellite and have all the historical content. I think a one-line notice and a suggestion to run dnf nvidia-plugin should be enough.
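
To illustrate the idea of a one-line notice, the filter below collapses the per-kernel messages in a saved dnf log into a single summary line. It is only a post-processing sketch over the message format shown in this thread, not how the plugin itself is or will be implemented; summarize_filtering is a made-up name.

```shell
# Pass all other lines through unchanged, drop the per-kernel
# "filtering kernel" lines, and print one summary line at the end.
summarize_filtering() {
    awk '
        /^NVIDIA driver: filtering kernel / { n++; ver = $NF; next }
        { print }
        END {
            if (n)
                printf "NVIDIA driver: filtered %d kernel(s) without precompiled modules for version %s\n", n, ver
        }
    '
}

# Example use (not executed here):
#   dnf update 2>&1 | summarize_filtering
```

Because the summary is printed when the input ends, it appears after the rest of the dnf output rather than at the position of the original messages.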

kmittman avatar kmittman commented on July 28, 2024

I've rebuilt the RPM repo metadata for rhel8/x86_64, please check if this solves the issues mentioned.
For RHEL9, I've opened #51

chindokae avatar chindokae commented on July 28, 2024

I built a clean RHEL 8.10 machine with a 1080 Ti in it. I patched it once to kernel 4.18.0-553.5.1 and rebooted.

Since it was a clean install there was no preexisting NV driver and this should be the least complex install possible.

This was built on the open Internet. No repo mirroring. No security policy.

[root@r101 user]# uname -a
Linux r101.home.local 4.18.0-553.5.1.el8_10.x86_64 #1 SMP Tue May 21 03:13:04 EDT 2024 x86_64 x86_64 x86_64 GNU/Linux

I added your repos and tried installing, and it failed. Eventually I realized that I needed EPEL to get dkms, which is not something I need to think about at work because I set up an Ansible playbook to distribute my custom local.repo file post-installation, so it is always there.

I ran dnf module install nvidia-driver:latest-dkms and it worked without errors.

The shop is closed tomorrow due to the federal holiday so I won't get to test this in prod until Thursday.

As far as installing on a clean 8.10 machine goes, the repo is good.

I suspect that this might not apply to systems with existing drivers, though.

[user@r101 ~]$ nvidia-smi
Tue Jun 18 22:09:35 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 555.42.02 Driver Version: 555.42.02 CUDA Version: 12.5 |

scaronni-nvidia avatar scaronni-nvidia commented on July 28, 2024

> I ran dnf module install nvidia-driver:latest-dkms and it worked without errors.

I just did a test and I could install, upgrade, switch branch to nvidia-driver:latest and upgrade, etc., on both el8 and el9 with the latest metadata, so the issue of the missing precompiled kernel module in the latest stream is fixed.

hhue13 avatar hhue13 commented on July 28, 2024

> I've rebuilt the RPM repo metadata for rhel8/x86_64, please check if this solves the issues mentioned. For RHEL9, I've opened #51

No change for me :(

scaronni avatar scaronni commented on July 28, 2024

@hhue13 This issue is about dependencies and not being able to resolve precompiled modules via DNF module streams. If the problem from your original comment at #50 (comment) is fixed and your remaining problem is (as it seems) about a display not working, etc., please open a new issue.

Also please post module version and driver version or the nvidia bug report. Based on comment #50 (comment) it seems you are running the drivers on an Optimus laptop, so the display should be driven by your Intel card.

hhue13 avatar hhue13 commented on July 28, 2024

> @hhue13 This issue is about dependencies and not being able to resolve precompiled modules via DNF module streams. If the problem from your original comment at #50 (comment) is fixed and your remaining problem is (as it seems) about a display not working, etc., please open a new issue.
>
> Also please post module version and driver version or the nvidia bug report. Based on comment #50 (comment) it seems you are running the drivers on an Optimus laptop, so the display should be driven by your Intel card.

I'm running on a Lenovo P16 Gen1 laptop, and that used to work without any issues for over 2 years and broke only with the RHEL 8.10 upgrade. Since then I can't get this working.

Running dnf module install nvidia-driver:latest-dkms results in the following error:

Error: 
 Problem: package kmod-nvidia-555.42.02-4.18.0-553.5.1-3:555.42.02-3.el8_10.x86_64 from @System conflicts with kmod-nvidia-latest-dkms provided by kmod-nvidia-latest-dkms-3:555.42.02-1.el8.x86_64 from cuda-rhel8-x86_64
  - conflicting requests
  - problem with installed package kmod-nvidia-555.42.02-4.18.0-553.5.1-3:555.42.02-3.el8_10.x86_64
(try to add '--allowerasing' to command line to replace conflicting packages or '--skip-broken' to skip uninstallable packages or '--nobest' to use not only best candidate packages)

I checked the dmesg output when booting with the RHEL 8.10 kernel and got the following errors:

[  110.121391] NVRM: API mismatch: the client has the version 555.42.02, but
               NVRM: this kernel module has the version 550.90.07.  Please
               NVRM: make sure that this kernel module and all NVIDIA driver
               NVRM: components have the same version.

So it seems that there is still an inconsistency between the driver and the kernel I'm running. How can I get that fixed?
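
The two versions in that NVRM message can be pulled out mechanically, which is handy when checking several machines. A sketch that reads dmesg-style text on standard input (nvrm_mismatch is a hypothetical helper; the pattern is based only on the message wording quoted above):

```shell
# Print "client=<version> module=<version>" if the input contains an
# NVRM "API mismatch" message; print nothing otherwise.
nvrm_mismatch() {
    # The message wraps over several lines, so join them first.
    { tr '\n' ' '; echo; } |
    sed -n 's/.*client has the version \([0-9][0-9.]*[0-9]\), but.*this kernel module has the version \([0-9][0-9.]*[0-9]\).*/client=\1 module=\2/p'
}

# Example use (not executed here):
#   dmesg | nvrm_mismatch
```

Here "client" is the version of the user-space driver components and "module" is the version of the loaded kernel module, as the message itself states.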

hhue13 avatar hhue13 commented on July 28, 2024

> @hhue13 At this point I would try removing the NVIDIA packages, and if you have it, the kmod-kvdo packages as well. On my test machine removing the kmod-kvdo driver allowed normal patching to occur after it rebuilt the boot initrd ramdisk. In my case at least, I think kmod-kvdo caused the entire mess. The NV kmods evidently had issues as well but I never got to the point of doing an NV kmod install because the kmod-kvdo kernel mod issue was blocking any further installation of kernel modules. I don't run Ceph or host VMs with virtual drives, so I never even needed the virtual drive optimizer modules to begin with.
>
> It sounds like your machine has gotten into a state that may not be recoverable, though. If the de-installation routines get confused by the driver/kmod mismatch, they may refuse to do it.

Thanks for the hint on this one! Based on the messages in the attached dmesg output (signal-2024-06-21-090706), I tried switching the nvidia-driver module to 550 (dnf module switch-to nvidia-driver:550), which caused dnf to downgrade the NVIDIA packages, and now at least GNOME comes up again.

What I'm wondering is why that step was required. Until now, yum automatically installed the proper version of the drivers matching the kernel. Why isn't that working anymore? I also wonder what will happen after the next kernel update.

chindokae avatar chindokae commented on July 28, 2024

After doing updates on the RHEL8, EPEL, and CUDA repos, all the other systems were successfully updated. The root cause is hard to determine, but if I had to guess, I'd say that when using modules they all have to work or none of them will work, as far as yum or dnf is concerned. kmods save time and make getting the CUDA stuff right possible, but they have to be perfect. I've seen this once before, and in that case removing the NV driver got rid of the problem. Every month from then to last month worked. The anomaly did not persist.

scaronni-nvidia avatar scaronni-nvidia commented on July 28, 2024

> I'm running on a Lenovo P16 Gen1 laptop, and that used to work without any issues for over 2 years and broke only with the RHEL 8.10 upgrade. Since then I can't get this working.
>
> So it seems that there is still an inconsistency between the driver and the kernel I'm running. How can I get that fixed?

At the beginning of this thread there was an issue with the metadata for the DNF module streams, where all the kmod packages (precompiled and DKMS) were outside the DNF module streams, leading to unpredictable results and confusion about what gets installed. Somehow it used to work as long as the dependencies for nvidia-kmod were satisfied. Once the proper metadata was put in place (kmod packages available in their respective streams), some of these situations surfaced.

In your particular case, you have conflicting packages installed now. Just remove them and continue installing the DNF module stream of your choice.

For example:

rpm -qa kmod-nvidia\*

Then remove all the results:

rpm -e --nodeps <package1> <package2>

And keep on installing the module stream of your choice (dnf module install nvidia-driver:latest-dkms based on your previous comment).
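
Put together, these steps might look like the sketch below. The helper only prints the removal command for review instead of running it; removal_command is a made-up name, and the stream at the end is just the one mentioned above.

```shell
# Read package names (one per line) on standard input and print a
# single "rpm -e --nodeps ..." command covering all of them.
# Prints nothing when no package names are given.
removal_command() {
    pkgs=$(tr '\n' ' ' | sed 's/ *$//')
    if [ -n "$pkgs" ]; then
        printf 'rpm -e --nodeps %s\n' "$pkgs"
    fi
}

# Typical use on the affected machine (not executed here):
#   rpm -qa 'kmod-nvidia*' | removal_command
#   ...review the printed command, run it as root, then:
#   dnf module install nvidia-driver:latest-dkms
```

Printing the command first is a deliberate choice here, since --nodeps bypasses dependency checks and it is worth reading the package list before removing anything.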
