Kernel 5.12.9 & Nvidia 460.80 drivers: no graphical desktop

After upgrading to a recent TW snapshot with the 5.12.4 kernel, no graphical display started. I then went to the very latest TW snapshot with the 5.12.9 kernel. After rebooting into 5.12.9, the graphical desktop did not start either.

/usr/bin/sddm starts, but the Xorg server does not. Sddm does not log any errors. It is somewhat annoying that sddm believes everything is OK when it is not. Nothing is logged to /var/log/Xorg.0.log.

This has happened before from time to time, too. Back then I usually reverted to the snapshot prior to the upgrade and waited. This time I decided to try to understand and fix the problem. The first part is to find the relevant logs. I highlighted in blue the lines I believe are most meaningful.

The Nvidia drivers are now at version 460.80. The display card is an MSI GeForce GTX 1660.

System journal snippets:

Reboot to 5.12.4-2-default:

Jun 05 17:24:18 localhost systemd-udevd[1580]: modprobe: ERROR: could not insert '**nvidia**': Exec format error 
Jun 05 17:24:18 localhost kernel: **nvidia**: disagrees about version of symbol module_layout 
Jun 05 17:24:18 localhost kernel: **nvidia**_modeset: disagrees about version of symbol module_layout 
Jun 05 17:24:18 localhost systemd-udevd[1618]: modprobe: ERROR: could not insert '**nvidia**': Exec format error 
Jun 05 17:24:18 localhost kernel: **nvidia**: disagrees about version of symbol module_layout 
Jun 05 17:24:18 localhost kernel: input: HDA **NVidia** HDMI/DP,pcm=3 as /devices/pci0000:40/0000:40:03.1/0000:41:00.1/sound/card1/input19 
Jun 05 17:24:18 localhost kernel: input: HDA **NVidia** HDMI/DP,pcm=7 as /devices/pci0000:40/0000:40:03.1/0000:41:00.1/sound/card1/input20 
Jun 05 17:24:18 localhost kernel: input: HDA **NVidia** HDMI/DP,pcm=8 as /devices/pci0000:40/0000:40:03.1/0000:41:00.1/sound/card1/input21 
Jun 05 17:24:18 localhost kernel: input: HDA **NVidia** HDMI/DP,pcm=9 as /devices/pci0000:40/0000:40:03.1/0000:41:00.1/sound/card1/input22 
Jun 05 17:24:18 localhost kernel: input: HDA **NVidia** HDMI/DP,pcm=10 as /devices/pci0000:40/0000:40:03.1/0000:41:00.1/sound/card1/input23 
Jun 05 17:24:18 localhost kernel: input: HDA **NVidia** HDMI/DP,pcm=11 as /devices/pci0000:40/0000:40:03.1/0000:41:00.1/sound/card1/input24 
Jun 05 17:24:18 localhost kernel: input: HDA **NVidia** HDMI/DP,pcm=12 as /devices/pci0000:40/0000:40:03.1/0000:41:00.1/sound/card1/input25 
Jun 05 17:24:19 localhost kernel: **nvidia**-gpu 0000:41:00.3: i2c timeout error e0000000 
Jun 05 17:25:16 video2 systemd[1]: Starting X **Display** Manager... 
Jun 05 17:25:16 video2 **display**-manager[3092]: /etc/vconsole.conf available 
Jun 05 17:25:16 video2 **display**-manager[3092]: KEYMAP: fi-kotoistus 
Jun 05 17:25:16 video2 **display**-manager[3092]: Command: localectl set-keymap fi-kotoistus 
Jun 05 17:25:16 video2 **display**-manager[3092]: I: Using systemd /usr/share/systemd/kbd-model-map mapping 
Jun 05 17:25:16 video2 systemd[1]: Started X **Display** Manager. 
Jun 05 17:30:00 video2 [RPM][15853]: scriptlet %triggerin(**nvidia**-gfxG05-kmp-default-460.80_k5.12.0_2-38.1.x86_64) failure: 2 
Jun 05 17:38:36 video2 systemd[1]: Stopping X **Display** Manager... 
Jun 05 17:38:36 video2 systemd[1]: **display**-manager.service: Succeeded. 
Jun 05 17:38:36 video2 systemd[1]: Stopped X **Display** Manager.

Reboot to 5.12.9-1-default:

Jun 05 17:43:14 localhost kernel: input: HDA **NVidia** HDMI/DP,pcm=3 as /devices/pci0000:40/0000:40:03.1/0000:41:00.1/sound/card1/input19 
Jun 05 17:43:14 localhost kernel: input: HDA **NVidia** HDMI/DP,pcm=7 as /devices/pci0000:40/0000:40:03.1/0000:41:00.1/sound/card1/input20 
Jun 05 17:43:14 localhost kernel: input: HDA **NVidia** HDMI/DP,pcm=8 as /devices/pci0000:40/0000:40:03.1/0000:41:00.1/sound/card1/input21 
Jun 05 17:43:14 localhost kernel: input: HDA **NVidia** HDMI/DP,pcm=9 as /devices/pci0000:40/0000:40:03.1/0000:41:00.1/sound/card1/input22 
Jun 05 17:43:14 localhost kernel: input: HDA **NVidia** HDMI/DP,pcm=10 as /devices/pci0000:40/0000:40:03.1/0000:41:00.1/sound/card1/input23 
Jun 05 17:43:14 localhost kernel: input: HDA **NVidia** HDMI/DP,pcm=11 as /devices/pci0000:40/0000:40:03.1/0000:41:00.1/sound/card1/input24 
Jun 05 17:43:14 localhost kernel: input: HDA **NVidia** HDMI/DP,pcm=12 as /devices/pci0000:40/0000:40:03.1/0000:41:00.1/sound/card1/input25 
Jun 05 17:43:16 localhost kernel: **nvidia**-gpu 0000:41:00.3: i2c timeout error e0000000 
Jun 05 17:44:09 video2 systemd[1]: Starting X **Display** Manager... 
Jun 05 17:44:09 video2 **display**-manager[2928]: /etc/vconsole.conf available 
Jun 05 17:44:09 video2 **display**-manager[2928]: KEYMAP: fi-kotoistus 
Jun 05 17:44:09 video2 **display**-manager[2928]: Command: localectl set-keymap fi-kotoistus 
Jun 05 17:44:09 video2 **display**-manager[2928]: I: Using systemd /usr/share/systemd/kbd-model-map mapping 
Jun 05 17:44:10 video2 systemd[1]: Started X **Display** Manager.
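
The 5.12.4 journal above contains "disagrees about version of symbol module_layout" and "could not insert" messages, which usually indicate that a kernel module was built against a different kernel than the one running. The following is a minimal sketch for pulling such lines out of journal text. The function name find_module_errors is hypothetical, and the patterns are simply copied from the log excerpt above, so other failure modes will not be caught.

```shell
# Print only the lines on stdin that point at a kernel module failing to load.
# The three patterns come from the 5.12.4 journal excerpt in this thread.
find_module_errors() {
    grep -E 'disagrees about version of symbol|could not insert|Exec format error'
}

# Example on a live system (assumes a systemd journal; not executed here):
#   journalctl -b --no-pager | find_module_errors
```

If this prints nothing for the current boot, the module probably loaded (or was never attempted), and the problem lies elsewhere.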

See here:
https://bugzilla.opensuse.org/show_bug.cgi?id=1186904

I had the same problem and only managed to solve it by uninstalling the drivers. I tried to install them again several times but always got the same result, so I kept them uninstalled and will leave it like this for now.

Thanks for the reply, Sauerland. It helped me resolve this issue and get the graphical desktop working again. For others hitting the same (or a similar) problem:

  1. Remove and recreate the kernel source and build tree links. See bug https://bugzilla.opensuse.org/show_bug.cgi?id=1186710 for details on why.
     cd /lib/modules/5.12.9-1-default
     rm build source
     ln -s ../../../src/linux-5.12.9-1 source
     ln -s ../../../src/linux-5.12.9-1-obj/x86_64/default build
  2. Find Nvidia-related packages.
     zypper search nvidia
  3. Reinstall the packages forcibly. I had the following packages installed.
     zypper install -f kernel-firmware-nvidia nvidia-computeG05 nvidia-gfxG05-kmp-default nvidia-glG05 nvidia-texture-tools x11-video-nvidiaG05
  4. Reboot
  5. Enjoy :slight_smile:
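
Before and after relinking, it can help to confirm whether the build and source links in a module directory actually resolve. This is a minimal sketch, not part of the original steps; the function name check_kernel_links is hypothetical, and the directory is passed in as an argument so nothing is hard-coded.

```shell
# Report whether the "build" and "source" links in a kernel module directory
# resolve to something that exists. Returns 1 if either is dangling or missing.
check_kernel_links() {
    moddir=$1
    rc=0
    for name in build source; do
        link="$moddir/$name"
        if [ -L "$link" ] && [ ! -e "$link" ]; then
            # A symlink whose target does not exist.
            echo "dangling: $link -> $(readlink "$link")"
            rc=1
        elif [ -e "$link" ]; then
            echo "ok: $link -> $(readlink -f "$link")"
        else
            echo "missing: $link"
            rc=1
        fi
    done
    return $rc
}

# Example (not executed here):
#   check_kernel_links /lib/modules/5.12.9-1-default
```

A "dangling" line before the fix and two "ok" lines after it would suggest the relinking took effect.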

Thank you for your guide. I had to make a slight modification to make it work…


cd /lib/modules/5.12.9-1-default
rm build source
ln -s /src/linux-5.12.9-1 source
ln -s /src/linux-5.12.9-1-obj/x86_64/default build

Used absolute paths and it worked. It seems this is because of the UsrMerge that is going on now.

Shouldn’t it be /usr/src/linux-5.12.9-1? On my end there’s no src in the root folder.

Correct – that’s what worked for me:

pushd /lib/modules/5.12.9-1-default
rm build source
ln -s /usr/src/linux-5.12.9-1 source
ln -s /usr/src/linux-5.12.9-1-obj/x86_64/default build
zypper install -f kernel-firmware-nvidia nvidia-computeG05 nvidia-gfxG05-kmp-default nvidia-glG05 suse-prime nvidia-texture-tools x11-video-nvidiaG05

Yes you are right…my bad…must have missed the first part when copying :smiley:

See https://en.opensuse.org/openSUSE:Usr_merge for a more general workaround, basically:

cd /usr
ln -s . /usr/usr

This will work without requiring you to find the kernel symlinks for old and new kernels that have not yet been fixed.
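
Why a /usr/usr link helps can be shown in a throwaway directory without touching the real system. This is only a sketch under an assumption: that the affected kernel packages shipped relative links of the form ../../../usr/src/..., which pointed at /usr/src before the merge but land one level too deep once /lib/modules lives under /usr. The function name demo_usr_usr and the kernel name K are made up for the illustration.

```shell
# Build a miniature merged tree under a temp dir and show how "ln -s . usr/usr"
# makes an (assumed) pre-merge relative link resolve again.
demo_usr_usr() {
    root=$(mktemp -d)
    mkdir -p "$root/usr/lib/modules/K" "$root/usr/src/linux-K"
    # Assumed pre-merge link form: three levels up, then usr/src/...
    ln -s ../../../usr/src/linux-K "$root/usr/lib/modules/K/source"
    if [ -e "$root/usr/lib/modules/K/source" ]; then
        echo "before: resolves"
    else
        echo "before: dangling"
    fi
    # The workaround: usr/usr points back at usr itself.
    ln -s . "$root/usr/usr"
    if [ -e "$root/usr/lib/modules/K/source" ]; then
        echo "after: resolves"
    else
        echo "after: dangling"
    fi
    rm -rf "$root"
}
```

Calling demo_usr_usr prints the link state before and after the extra symlink is created, all inside the temporary directory.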

I face the same problem.

So what is the solution? Shall I do


cd /usr
ln -s . /usr/usr

(This looks suspicious: I don't have a /usr/usr/ directory.)


or


pushd /lib/modules/5.12.9-1-default
rm build source
ln -s /usr/src/linux-5.12.9-1 source
ln -s /usr/src/linux-5.12.9-1-obj/x86_64/default build
zypper install -f kernel-firmware-nvidia nvidia-computeG05 nvidia-gfxG05-kmp-default nvidia-glG05 suse-prime nvidia-texture-tools x11-video-nvidiaG05

or

cd /lib/modules/5.12.9-1-default
rm build source
ln -s /src/linux-5.12.9-1 source
ln -s /src/linux-5.12.9-1-obj/x86_64/default build

?

Do the same steps have to be repeated after a similarly failed Nvidia driver installation whenever the Nvidia driver or the kernel is updated in the repositories?

How long is the UsrMerge problem going to be around? (Please don’t say for the next twenty years.)

Hi
The next nineteen years :wink:

I prefer the ln method, the /usr/usr one creates file system loops…


 find /usr -follow -printf ""

find: File system loop detected; ‘/usr/bin/X11’ is part of the same file system loop as ‘/usr/bin’.
find: File system loop detected; ‘/usr/include/c++/7/x86_64-suse-linux/32’ is part of the same file system loop as ‘/usr/include/c++/7/x86_64-suse-linux’.
find: File system loop detected; ‘/usr/include/c++/11/x86_64-suse-linux/32’ is part of the same file system loop as ‘/usr/include/c++/11/x86_64-suse-linux’.
find: File system loop detected; ‘/usr/usr’ is part of the same file system loop as ‘/usr’.

According to bug 1186710 – wrong/dead symlinks with kernel-default-devel-5.12.4-2.1.x86_64 (after usrmerge?), this is no longer an issue.

It doesn’t make sense.
This thread was about kernel 5.12.9, while the kernel in TW is already at 5.12.12. Run a “dup”; if there are still problems, then open a new post with:

  • Output from zypper dup;
  • zypper lr --uri

The point of the “cd /usr; ln -s . /usr/usr” commands is to create /usr/usr. This works around one missing “…/” in the installables by providing a “usr” one level deeper than normal. This should work for subsequent updates with no further effort required.

or

pushd /lib/modules/5.12.9-1-default
rm build source
ln -s /usr/src/linux-5.12.9-1 source
ln -s /usr/src/linux-5.12.9-1-obj/x86_64/default build
zypper install -f kernel-firmware-nvidia nvidia-computeG05 nvidia-gfxG05-kmp-default nvidia-glG05 suse-prime nvidia-texture-tools x11-video-nvidiaG05

or

cd /lib/modules/5.12.9-1-default
rm build source
ln -s /src/linux-5.12.9-1 source
ln -s /src/linux-5.12.9-1-obj/x86_64/default build

?

Do the same steps have to be repeated after a similarly failed Nvidia driver installation whenever the Nvidia driver or the kernel is updated in the repositories?

The solutions above are probably more “correct” in that they actually fix the installables to work without a /usr/usr hack. Although they have the same effect, they have the disadvantage that each new kernel update will need a similar manual patch-up.

How long is the UsrMerge problem going to be around? (Please don’t say for the next twenty years.)

It appears to me that from 5.12.10 onward the relative ‘…/…/…’ links have been changed to absolute links such as ‘source -> /usr/src/linux-5.12.12-1’, so the workaround won’t be required once you purge any earlier kernels from your system. Once you are clear of the broken kernels, it should be possible to remove the /usr/usr symlink.
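
To see which installed kernels still carry relative links, something like the following could be used. It is a sketch only: the function name list_kernel_links is hypothetical, and it merely classifies each link as relative or absolute without judging whether it resolves.

```shell
# For every kernel directory under the given modules root (default /lib/modules),
# print: <kernel> <link name> <relative|absolute> <link target>
list_kernel_links() {
    modroot=${1:-/lib/modules}
    for dir in "$modroot"/*/; do
        for name in build source; do
            link="${dir}${name}"
            # Skip kernels that have no such symlink (e.g. no devel package).
            [ -L "$link" ] || continue
            target=$(readlink "$link")
            case $target in
                /*) kind=absolute ;;
                *)  kind=relative ;;
            esac
            echo "$(basename "$dir") $name $kind $target"
        done
    done
}

# Example (not executed here):
#   list_kernel_links
```

If every line reports "absolute", that matches the observation above about 5.12.10 and later, and the /usr/usr link is probably no longer needed.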

I have tried all suggestions. None worked.

The OS crashes or does not start, things are not found, mkinitrd does not start when it should, and all sorts of other error messages pop up. I made no progress.

It was difficult even to disable nouveau; the usual settings in /etc/modprobe.d/… didn’t work. Even adding brokenmodules=nouveau to the boot parameters didn’t work.

I also cannot install the driver from the driver package downloaded from Nvidia. I tried two versions of the .run file.

Could somebody write a tested manual on how to install Nvidia drivers in openSUSE Leap 15.3? Then I could follow the instructions and report the first problem.

I run an updated version of 15.3. The kernel is 5.3.18-59.10-default. On the same computer, the Nvidia driver worked with one of the previous versions of Leap; I don’t know which one for sure. Nouveau works, but is slow and not useful. The card is a 2080 Ti.

(Are packages checked before being uploaded into repositories, or are package authors trusted without checking?)

All of the above suggestions and discussions relate to fixes for issues resulting from the recent Tumbleweed /usr merge. None of the suggestions apply to Leap. When I was running Leap I just followed the easy way instructions, something similar to: 15. Graphics Drivers - Install Nvidia or ATI/AMD 3D Driver - they just worked.

Thanks. I see.

In which kernel version was the UsrMerge introduced?

I will start a new thread for nvidia driver on 15.3 if needed.

This is not really related to any kernel version. UsrMerge refers to the Tumbleweed operating system update that merges /bin, /lib, /lib64, /sbin into /usr (for example, /bin no longer exists except as a symbolic link to /usr/bin). This is a Tumbleweed-only change related to Tumbleweed OS packaging and updates. Tumbleweed kernel sources prior to kernel 5.12.10 were packaged for the pre-merge situation; the UsrMerge update failed to account for this and retained some “unmerged” symbolic links. This meant that some software installers, such as the Nvidia driver, failed to find kernel source files that they needed.

So the issue has nothing to do with non-Tumbleweed systems and nothing to do with the compatibility of the Nvidia driver or kernel; it is solely related to the packaging layout of older Tumbleweed kernel-src packages. Tumbleweed kernel sources from 5.12.10 onward now have the correct links.
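
Whether a given system has been through the merge can be checked by looking at the top-level directories named above. This is a rough sketch: the function name is_usrmerged is hypothetical, it assumes an x86_64 layout with a lib64 directory, and it only tests that the four entries are symlinks, not where they point.

```shell
# Succeed if bin, sbin, lib and lib64 under the given root are all symlinks,
# which is what the merged layout described above looks like.
# With no argument the real root (/) is inspected.
is_usrmerged() {
    root=${1:-}
    for d in bin sbin lib lib64; do
        [ -L "$root/$d" ] || return 1
    done
    return 0
}

# Example (not executed here):
#   if is_usrmerged; then echo "merged /usr layout"; else echo "classic layout"; fi
```

On Leap 15.3, as discussed above, this would be expected to report the classic layout, which is consistent with the workarounds in this thread not applying there.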