Nvidia driver breaks in most recent update 20240317

The Nvidia driver stopped loading after a regular sudo zypper dup in Tumbleweed.

I cannot load the nvidia driver. Running nvidia-smi reports that it cannot communicate with the driver, and nvtop does not show the graphics card at all. However, other tools, including fastfetch, still recognize the card. In other words, the device is present but the driver will not load.

The most recent kernel version on my machine is 6.7.7-1-default. The nvidia driver (I am using G06) is installed from https://download.nvidia.com/opensuse/tumbleweed, the official zypper repository, and is up to date at version 550.54.14-20.1-x86_64.

The kernel module comes from the same repository: I installed nvidia-driver-G06-kmp-default, version 550.54.14_k6.7.5_1-20.1, which I guess was built for kernel 6.7.5 rather than 6.7.7 and therefore cannot be loaded. Meanwhile, nvidia-open-driver-G06-signed-kmp-default, which does not need a key enrolled into shim, is provided at https://developer.download.nvidia.com/compute/cuda/repos/opensuse15/x86_64/ as version 550.54.14_k6.7.6_1-2.1. I tried it and it worked reasonably, but it has a serious bug: after I suspend the computer, it will not wake up again until I cut the power. The issue is known and not solved yet; several reports are still open, including Black screen (monitor doesn't wake) after waking up from suspend · Issue #450 · NVIDIA/open-gpu-kernel-modules · GitHub.
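
For anyone who wants to check the same suspicion, here is a minimal sketch that compares the kernel version embedded in a KMP package version with the running kernel. The function names are made up for illustration, and the parsing assumes the `_k<version>_<release>-<package release>` layout seen in the two version strings quoted above. A mismatch is only a hint: later replies in this thread say the module gets rebuilt on the local machine, so a differing version in the package name does not by itself prove the module cannot load.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative, not part of any package.

# Extract the kernel version a KMP package was built against from its
# version string, e.g. "550.54.14_k6.7.5_1-20.1" -> "6.7.5-1".
# Assumes the "_k<kernel version>_<kernel release>-<package release>" layout.
kmp_kernel_version() {
    printf '%s\n' "$1" |
        sed -n 's/^.*_k\([0-9][0-9.]*\)_\([0-9][0-9]*\)-.*$/\1-\2/p'
}

# Strip the flavor suffix from a `uname -r` style string,
# e.g. "6.7.7-1-default" -> "6.7.7-1". Uses `uname -r` if no argument is given.
running_kernel_version() {
    printf '%s\n' "${1:-$(uname -r)}" | sed 's/-[a-z][a-z0-9_]*$//'
}

# Succeed only if the KMP version string ($1) names the same kernel
# version as the kernel string ($2, default: the running kernel).
kmp_matches_kernel() {
    [ "$(kmp_kernel_version "$1")" = "$(running_kernel_version "$2")" ]
}
```

With the values from this post, `kmp_matches_kernel 550.54.14_k6.7.5_1-20.1 6.7.7-1-default` fails, which matches the mismatch described above. On a live system the first argument could come from something like `rpm -q --qf '%{VERSION}-%{RELEASE}' nvidia-driver-G06-kmp-default`, assuming the package is installed under that name.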

So I cannot use the open driver at present, and my only choice is the proprietary kmp, which, however, also fails to load. The display itself is fine because I also have an Intel graphics card, and everything else just works. However, I sometimes need the NVIDIA card for video decoding and other multimedia tasks.

Has anyone else met this problem? I have tried uninstalling the driver, rebooting, and reinstalling it. Shim seems to have enrolled the key from /var/lib/nvidia-pubkeys/, but the driver still won’t load.

The latest kernel is kernel-default 6.7.9-1 and works on all my machines which use the G06 driver (550.54.14).
Including Optimus setups.

It seems you messed up stuff by mixing and manually modifying.
https://en.opensuse.org/SDB:NVIDIA_drivers

Kernel modules get built on the user machine and not external…

nvidia-driver-breaks-in-most-recent-update-20240317

I don’t know whether you mean the snapshot or the date. The latest Tumbleweed snapshot is 20240314.
But maybe there is also a time zone difference… here it is still 16 March…

@szw0407 aside from comments by @hui I’m assuming suse-prime in play? Considered using Prime Render Offload for your multimedia tasks?

What hardware is in use? Please post the output of inxi -Gxxz

Well, I meant the current date; I am in UTC+8 and it is now 00:57. Thanks

Graphics:
  Device-1: Intel Raptor Lake-P [Iris Xe Graphics] vendor: ASUSTeK driver: i915 v: kernel
    arch: Gen-13 ports: active: eDP-1 empty: DP-1, DP-2, DP-3, DP-4 bus-ID: 0000:00:02.0
    chip-ID: 8086:a7a0
  Device-2: NVIDIA AD107M [GeForce RTX 4050 Max-Q / Mobile] vendor: ASUSTeK driver: N/A
    arch: Lovelace bus-ID: 0000:01:00.0 chip-ID: 10de:28a1

assuming suse-prime in play?

I am mainly using Wayland, which is not supported by suse-prime at present. According to SDB:NVIDIA PRIME Render Offload - openSUSE Wiki, I was using envycontrol, which worked fine before. Thanks for your advice anyway.

@szw0407 unfortunately suse-prime cares not and always seems to install :wink: So if it is, uninstall and lock so it stays away.

So no nvidia kmp is installed/available(?) Do you have both the open driver and the proprietary one installed?

In Tumbleweed, switcherooctl is supported. You don’t indicate the desktop in use (GNOME has dbus integration), but consider switching to that and starting the service so you can use it as your user.

I do have a dual AMD laptop running Aeon, which is fine, and two desktops with discrete cards, Intel/Intel ARC/Nvidia and Nvidia/Nvidia. The former uses the repo and it’s all good; the latter uses the run file, which is also fine.

EDIT: Can you also post the output from cat /proc/cmdline

BOOT_IMAGE=/boot/vmlinuz-6.7.7-1-default root=/dev/mapper/opensuse_main-system splash=silent silent security=apparmor mitigations=auto rd.shell=0

The nvidia kmp is now installed, and so is kernel-firmware-nvidia-gspx-G06. The open-source kmp has been removed as it does not work properly.

Anyway, thanks for your help and advice. I may later check the docs to see how to use switcherooctl .

@szw0407 Can you reboot, edit the grub entry (temporarily) to add simplefb=1, and press F10 to boot?

If I did it as expected, adding this into grub won’t help and the nvidia-smi command still cannot find the driver running.

I have now uninstalled everything from nvidia repo and perhaps later reinstall them using another method. Maybe they would work then. Thanks for your help.

@szw0407 If you re-install the nvidia packages from the repo, it will rebuild for the new kernel on install.

Don’t worry about nvidia-smi at present; use inxi or /sbin/lspci -nnk | grep -EA3 "VGA|Display|3D" to check which driver is in use.
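
As a rough sketch of that check, the helper below reads `lspci -nnk` style text on stdin and prints the driver bound to the first NVIDIA device, or N/A when none is bound (the same value inxi showed earlier in this thread). The function name is hypothetical, and the parsing assumes the usual layout where each device starts on an unindented line and its "Kernel driver in use:" line is indented beneath it.

```shell
#!/bin/sh
# Hypothetical helper; reads `lspci -nnk` style output from stdin.
# Prints the "Kernel driver in use" value for the first NVIDIA device,
# or "N/A" if that device has no driver bound (or no NVIDIA device exists).
nvidia_driver_in_use() {
    awk '
        # An unindented line starts a new device; remember if it is NVIDIA.
        /^[0-9a-f]/ { in_nvidia = ($0 ~ /NVIDIA/) }
        in_nvidia && /Kernel driver in use:/ {
            sub(/^.*Kernel driver in use:[ \t]*/, "")
            print
            found = 1
            exit
        }
        END { if (!found) print "N/A" }
    '
}
```

Typical use would be `/sbin/lspci -nnk | nvidia_driver_in_use`. An answer of N/A means the card is visible on the bus but neither nvidia nor nouveau has claimed it.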

Well, after a few days trying to deal with this issue I finally found what the cause is.

I am using GNOME 45 on Wayland, which is not happy when NVIDIA is enabled. After installing the NVIDIA drivers I have to manually run systemctl enable nvidia-suspend.service nvidia-resume.service nvidia-hibernate.service, otherwise Wayland is blocked at the login screen.

And I swear that I have NEVER used Bumblebee or bbswitch, but there was a file /usr/lib/modprobe.d/09-nvidia-modprobe-bbswitch-G04.conf which blocked the NVIDIA driver from loading. Meanwhile, the nvidia kernel module package disables the nouveau driver, so no driver at all was available for the NVIDIA card. After removing that file, the driver loads. I am not sure how the file was created, as my driver is G06 rather than G04.
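
For others hunting a similar leftover file, here is a minimal sketch that scans the modprobe.d directories for lines that would keep an nvidia module from loading. The function name is made up, and the pattern is an assumption: the content of the G04 file is not quoted in this thread, so the sketch simply looks for the two common forms, `blacklist nvidia...` and `install nvidia... /bin/false` (or `/bin/true`).

```shell
#!/bin/sh
# Hypothetical helper: list modprobe.d lines that blacklist or disable an
# nvidia module. Arguments are directories to scan; with no arguments the
# usual modprobe.d locations are used.
find_nvidia_blockers() {
    [ "$#" -gt 0 ] || set -- /etc/modprobe.d /usr/lib/modprobe.d /run/modprobe.d
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        # -H prints the file name, -n the line number of each match.
        grep -EHn \
            '^[[:space:]]*(blacklist[[:space:]]+nvidia|install[[:space:]]+nvidia[^[:space:]]*[[:space:]]+/bin/(false|true))' \
            "$dir"/*.conf 2>/dev/null || true
    done
    return 0
}
```

Running `find_nvidia_blockers` with no arguments prints one `file:line:text` entry per suspicious line. Anything it reports from a package you do not use is worth a closer look before deleting it, since a file under /usr/lib/modprobe.d normally belongs to an installed package.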

Anyway, thanks for your help and wish you a nice day.

I saw the same file at SUSEPrime/09-nvidia-modprobe-bbswitch-G04.conf at master · openSUSE/SUSEPrime · GitHub. Perhaps it was created by suse-prime at some point, but I uninstalled suse-prime right after the system upgrade, so I did not realize that it had changed anything on my device. To keep this from happening again, I have locked the package in YaST. Thanks for your advice.
