Cannot install Nvidia driver in 15.3 and Tumbleweed

After years of installing Nvidia drivers, I cannot install the driver on Leap 15.3 or Tumbleweed. Both operating systems were updated before the installation. On the same computer, the driver works with previous versions of Leap. The card is a 2080 Ti.

I have tried installing both from the downloaded package and from the Nvidia repository.

The software installs correctly in runlevel 3 or 1. The problems start when the graphical interface is requested.

Switching to runlevel 5 leads to a crash, and a hard reboot is needed afterwards. I couldn’t find any meaningful diagnostic messages.

Attempting to run startx fails with the message

"VGA arbitration: cannot restore default device"

Before starting graphics, YaST’s Hardware Information shows nvidia as the driver for the card. Before the Nvidia driver is installed, the nouveau driver works correctly.

In Tumbleweed, the kernel version is 5.12.13-1-default. I don’t remember the kernel version of 15.3 after the post-installation update.

I have tried the following - nothing helped:

Running nvidia-xconfig
Deleting xorg.conf
Adding BusID of the card in xorg.conf
Reinstalling the OS
Booting with kernel parameters pci=realloc and pci=noaer
Trying driver versions 460.84, 465.31, 470.42
Blacklisting nvidiafb, nouveau, rtsx_usb
Running mkinitrd after installation attempts.
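
A note on the BusID attempt in the list above: lspci prints the bus address in hexadecimal, while the BusID line in xorg.conf expects decimal numbers. A minimal Python sketch of that conversion follows. The function name is hypothetical, only PCI domain 0 is handled, and the example address 65:00.0 is an assumption derived from the bus number 101 reported in the PCI line of the Xorg log.

```python
import re


def lspci_to_busid(address: str) -> str:
    """Convert an lspci address such as '65:00.0' or '0000:65:00.0'
    into an xorg.conf BusID string such as 'PCI:101:0:0'."""
    match = re.fullmatch(
        r"(?:([0-9a-fA-F]{4}):)?([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])",
        address.strip(),
    )
    if match is None:
        raise ValueError(f"not an lspci address: {address!r}")
    domain, bus, device, function = match.groups()
    # Assumption: this sketch only handles PCI domain 0.
    if domain is not None and int(domain, 16) != 0:
        raise ValueError("non-zero PCI domains are not handled here")
    # lspci prints hexadecimal; xorg.conf expects decimal.
    return f"PCI:{int(bus, 16)}:{int(device, 16)}:{int(function, 16)}"


# Hypothetical address; take the real one from the output of lspci.
print(lspci_to_busid("65:00.0"))
```

The resulting string would go into the Device section of xorg.conf as the BusID value.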

The output of startx contains


xauth:  file /root/.serverauth.2381 does not exist

X.Org X Server 1.20.11
X Protocol Version 11, Revision 0
Build Operating System: openSUSE SUSE LINUX
Current Operating System: Linux localhost.localdomain 5.12.13-1-default #1 SMP Mon Jun 28 06:37:23 UTC 2021 (74bd8c0) x86_64
Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.12.13-1-default root=UUID=... splash=verbose 3 pci=noaer mitigations=auto
Build Date: 17 June 2021  12:00:00AM
 
Current version of pixman: 0.40.0
    Before reporting problems, check http://wiki.x.org
    to make sure that you have the latest version.
Markers: (--) probed, (**) from config file, (==) default setting,
    (++) from command line, (!!) notice, (II) informational,
    (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
(==) Log file: "/var/log/Xorg.0.log", Time: Mon Jul  5 00:29:30 2021
(==) Using config file: "/etc/X11/xorg.conf"
(==) Using config directory: "/etc/X11/xorg.conf.d"
(==) Using system config directory "/usr/share/X11/xorg.conf.d"
(EE) Fatal server error:
(EE) no screens found(EE) 
(EE) Please consult the The X.Org Foundation support at http://wiki.x.org for help. 
(EE) Please also check the log file at "/var/log/Xorg.0.log" for additional information.
(EE) VGA Arbitration: Cannot restore default device.
(EE) Server terminated with error (1). Closing log file.
xinit: giving up
xinit: unable to connect to X server: Connection refused
xinit: server error
-------------------------------------------------------------------------------------------
xinit failed. /usr/bin/Xorg is not setuid, maybe that's the reason?
If so either use a display manager (strongly recommended) or adjust /etc/permissions.local and run "chkstat --system --set" afterwards

Xorg.0.log contains


[   296.508] 
X.Org X Server 1.20.11
X Protocol Version 11, Revision 0
[   296.508] Build Operating System: openSUSE SUSE LINUX
[   296.508] Current Operating System: Linux localhost.localdomain 5.12.13-1-default #1 SMP Mon Jun 28 06:37:23 UTC 2021 (74bd8c0) x86_64
[   296.508] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.12.13-1-default root=UUID=... splash=verbose 3 pci=noaer mitigations=auto
[   296.508] Build Date: 17 June 2021  12:00:00AM
[   296.508]  
[   296.508] Current version of pixman: 0.40.0
[   296.508]     Before reporting problems, check http://wiki.x.org
    to make sure that you have the latest version.
[   296.508] Markers: (--) probed, (**) from config file, (==) default setting,
    (++) from command line, (!!) notice, (II) informational,
    (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
[   296.508] (==) Log file: "/var/log/Xorg.0.log", Time: Mon Jul  5 00:29:30 2021
[   296.509] (==) Using config file: "/etc/X11/xorg.conf"
[   296.509] (==) Using config directory: "/etc/X11/xorg.conf.d"
[   296.509] (==) Using system config directory "/usr/share/X11/xorg.conf.d"
[   296.509] (==) ServerLayout "Layout0"
[   296.509] (**) |-->Screen "Screen0" (0)
[   296.509] (**) |   |-->Monitor "Monitor0"
[   296.510] (**) |   |-->Device "Device0"
[   296.510] (**) |-->Input Device "Keyboard0"
[   296.510] (**) |-->Input Device "Mouse0"
[   296.510] (==) Automatically adding devices
[   296.510] (==) Automatically enabling devices
[   296.510] (==) Automatically adding GPU devices
[   296.510] (==) Max clients allowed: 256, resource mask: 0x1fffff
[   296.510] (WW) The directory "/usr/share/fonts/misc/sgi" does not exist.
[   296.510]     Entry deleted from font path.
[   296.510] (==) FontPath set to:
    /usr/share/fonts/misc:unscaled,
    /usr/share/fonts/Type1/,
    /usr/share/fonts/100dpi:unscaled,
    /usr/share/fonts/75dpi:unscaled,
    /usr/share/fonts/ghostscript/,
    /usr/share/fonts/cyrillic:unscaled,
    /usr/share/fonts/truetype/,
    built-ins
[   296.510] (==) ModulePath set to "/usr/lib64/xorg/modules"
[   296.510] (WW) Ignoring unrecognized extension "XFree86-DGA"
[   296.510] (WW) Hotplugging is on, devices using drivers 'kbd', 'mouse' or 'vmmouse' will be disabled.
[   296.510] (WW) Disabling Keyboard0
[   296.510] (WW) Disabling Mouse0
[   296.510] (II) Loader magic: 0x5584e611ea00
[   296.510] (II) Module ABI versions:
[   296.510]     X.Org ANSI C Emulation: 0.4
[   296.510]     X.Org Video Driver: 24.1
[   296.510]     X.Org XInput driver : 24.1
[   296.510]     X.Org Server Extension : 10.0
[   296.511] (++) using VT number 1

[   296.513] (II) systemd-logind: took control of session /org/freedesktop/login1/session/_31
[   296.514] (II) xfree86: Adding drm device (/dev/dri/card0)
[   296.514] (II) systemd-logind: got fd for /dev/dri/card0 226:0 fd 11 paused 0
[   296.536] (--) PCI:*(101@0:0:0) 10de:1e04:3842:2281 rev 161, Mem @ 0xd7000000/16777216, 0xc0000000/268435456, 0xd0000000/33554432, I/O @ 0x0000b000/128, BIOS @ 0x????????/131072
[   296.536] (II) LoadModule: "glx"
[   296.536] (II) Loading /usr/lib64/xorg/modules/extensions/libglx.so
[   296.537] (II) Module glx: vendor="X.Org Foundation"
[   296.537]     compiled for 1.20.11, module version = 1.0.0
[   296.538]     ABI class: X.Org Server Extension, version 10.0
[   296.538] (II) LoadModule: "nvidia"
[   296.538] (II) Loading /usr/lib64/xorg/modules/drivers/nvidia_drv.so
[   296.538] (II) Module nvidia: vendor="NVIDIA Corporation"
[   296.538]     compiled for 1.6.99.901, module version = 1.0.0
[   296.538]     Module class: X.Org Video Driver
[   296.538] (II) NVIDIA dlloader X Driver  460.84  Wed May 26 20:07:09 UTC 2021
[   296.538] (II) NVIDIA Unified Driver for all Supported NVIDIA GPUs
[   296.538] (II) systemd-logind: releasing fd for 226:0
[   296.539] (II) Loading sub module "fb"
[   296.539] (II) LoadModule: "fb"
[   296.539] (II) Loading /usr/lib64/xorg/modules/libfb.so
[   296.539] (II) Module fb: vendor="X.Org Foundation"
[   296.539]     compiled for 1.20.11, module version = 1.0.0
[   296.539]     ABI class: X.Org ANSI C Emulation, version 0.4
[   296.539] (II) Loading sub module "wfb"
[   296.539] (II) LoadModule: "wfb"
[   296.539] (II) Loading /usr/lib64/xorg/modules/libwfb.so
[   296.540] (II) Module wfb: vendor="X.Org Foundation"
[   296.540]     compiled for 1.20.11, module version = 1.0.0
[   296.540]     ABI class: X.Org ANSI C Emulation, version 0.4
[   296.540] (II) Loading sub module "ramdac"
[   296.540] (II) LoadModule: "ramdac"
[   296.540] (II) Module "ramdac" already built-in
[   296.540] (EE) No devices detected.
[   296.540] (EE) Fatal server error:
[   296.540] (EE) no screens found(EE) 
[   296.540] (EE) Please consult the The X.Org Foundation support 
     at http://wiki.x.org for help. 
[   296.540] (EE) Please also check the log file at "/var/log/Xorg.0.log" for additional information.
[   296.540] (EE) 
[   296.543] (EE) Server terminated with error (1). Closing log file.
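
In a log of this length, the few lines that matter (the driver version and the (EE)/(WW) entries) are easy to miss. A minimal Python sketch that pulls them out follows; it assumes the log format shown above. The function name and the returned dictionary layout are hypothetical, and the sample text is a handful of lines copied from the log.

```python
import re

# Optional "[   296.540]" timestamp (the bracket may be missing in pasted logs),
# followed by an (EE) or (WW) marker and the message text.
MARKER = re.compile(r"^(?:\[?\s*\d+\.\d+\]\s*)?\((EE|WW)\)\s*(.*)$")
# Driver banner as it appears in the log above.
VERSION = re.compile(r"NVIDIA dlloader X Driver\s+(\d+(?:\.\d+)+)")


def summarize_xorg_log(text):
    """Collect the NVIDIA driver version plus all error and warning messages."""
    errors, warnings, version = [], [], None
    for line in text.splitlines():
        found = VERSION.search(line)
        if found:
            version = found.group(1)
        marked = MARKER.match(line)
        if marked:
            message = marked.group(2).strip()
            if message:  # skip empty "(EE)" separator lines
                target = errors if marked.group(1) == "EE" else warnings
                target.append(message)
    return {"driver_version": version, "errors": errors, "warnings": warnings}


sample = """\
[   296.510] (WW) Disabling Keyboard0
[   296.538] (II) NVIDIA dlloader X Driver  460.84  Wed May 26 20:07:09 UTC 2021
[   296.540] (EE) No devices detected.
[   296.540] (EE) 
"""
print(summarize_xorg_log(sample))
```

For a real run, the text would be read from /var/log/Xorg.0.log instead of the sample string.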


I don’t know what else to do. I don’t understand the message about VGA arbitration.

On Nvidia’s developer forum, there is a discussion about the situation with the Linux drivers. The essence of the discussion is the following.

  • Nvidia drivers recently became severely buggy.

  • The Xorg server is fine.

  • The main issues are that monitors connected via DisplayPort cause a kernel panic and a crash without leaving any record in the log files, or that the monitor’s maximum resolution is not available.

  • Nvidia has distributed versions 460.80 and 460.84 to Linux distributions as stable, and the distributions picked them up although the bug was already known.

  • Nvidia is working on the issue and may have fixed it in version 470.

Based on the experience of the victims on that forum, version 460.73 looks like the last functional one. I suggest that openSUSE change the current version in the repositories


https://download.nvidia.com/opensuse/leap/15.3

and

https://download.nvidia.com/opensuse/tumbleweed/

to 460.73 or to 470.42.01 (after testing with a few monitors connected to DP).

Thanks for the information, but as you can see those are nvidia.com domains, so I doubt that openSUSE has any control over what is offered there.
Anyway, we can pick the right *.run file and install it “the hard way” until the situation is fixed by a “stable” release.

I did the installation the hard way.

But is there a program, possibly called dkms, which will monitor the kernel version and availability of the Nvidia driver in a certain folder of the computer, and run the steps of the driver installation every time the kernel is changed, in silent mode?

AFAIK these files are created by the openSUSE team. Nvidia only hosts them.

Is the problem solved?

Yes, dkms triggers the rebuild of registered modules when a new kernel is installed. If I recall correctly the actual rebuild happens the first time the new kernel is booted, so that reboot is noticeably longer than usual.
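
To illustrate the kind of check such a tool has to make, here is a minimal Python sketch that looks for nvidia module files belonging to the running kernel. This is not how dkms itself is implemented. The /lib/modules/<release> directory is the standard module location; the function name and the rule "any file starting with nvidia and containing .ko" are assumptions made for illustration.

```python
import os
from pathlib import Path


def find_kernel_modules(modules_root, kernel_release, prefix="nvidia"):
    """Return module files whose names start with `prefix` for one kernel release."""
    kernel_dir = Path(modules_root) / kernel_release
    if not kernel_dir.is_dir():
        return []
    # Modules may be compressed (for example nvidia.ko.xz), so match ".ko" plus any suffix.
    return sorted(p for p in kernel_dir.rglob(f"{prefix}*.ko*") if p.is_file())


if __name__ == "__main__":
    release = os.uname().release  # for example 5.12.13-1-default
    modules = find_kernel_modules("/lib/modules", release)
    if modules:
        print(f"{len(modules)} nvidia module file(s) found for kernel {release}")
    else:
        print(f"no nvidia module found for kernel {release}; a rebuild would be needed")
```

An empty result after a kernel update would be the signal that the driver still has to be built for the new kernel.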

Yes.

My computer has been working normally for a couple of days. I haven’t checked the various connection types, monitor counts, resolutions and so on.

The Nvidia driver package version 460.84, which is currently in repositories, is likely defective.

Bug report?

A new version of the driver has appeared in the Nvidia repository. It works correctly.

No need for a bug report; this is actually Nvidia’s bug. But openSUSE is not entirely without fault: a buggy driver was selected and put in the repository. Then again, I am not sure whether it would have been possible to keep using the last functional version from Nvidia with new kernels coming.

The problem is:
The maintainer of the openSUSE Nvidia driver does not own every Nvidia card, so he relies on Bugzilla.
He can change the Nvidia rpm because it is built on the OBS, but he can only upload it to Nvidia, and that takes some time…

So I install the Nvidia driver the hard way, without dkms,
and have almost no problems (99% of the time).

Maybe I shouldn’t hijack this thread, but I have struggled with this problem and given up, so I didn’t want to trouble people with a new post.
Question: if I want to try “The Hard Way”, is the driver I want the one listed here
https://opensuse.pkgs.org/15.3/nvidia-x86_64/
as “NVIDIA graphics driver for GeForce 600 series and newer”? It is available at
https://download.nvidia.com/opensuse/leap/15.3/x86_64/
Is the only other package I might need the “NVIDIA OpenGL libraries for OpenGL acceleration”? I admit to being ignorant of the function of all the files available at these sites.

I have purged every reference to Nvidia that I can find from my installation of 15.3, as I was going to soldier on with the nouveau driver, which is almost usable some of the time, and look towards replacing my graphics card. So I was not keen to try the “easy way” again.

If anyone is interested, my saga went as follows:
*Identified my graphics card as “Nvidia G86 [GeForce 8500 GT] (rev a1)”.
*Downloaded driver NVIDIA-Linux-x86_64-340.108.run as the correct one to install the “Hard Way” (I thought it might work better than the GeForce 600 series and newer as the “Right” one!).
*When this failed I tried the --use-this-kernel option; the interface was the first thing it tried to do and it failed.
*Installed the GeForce 600 driver the easy way; this succeeded, but the only resolution available was lower than the one recommended for my monitor, so the right margin of the screen was cut off. It seemed stable through reboots. I did various things like running nvidia-xconfig and rebooting into a previous snapshot to compare the functionality. I shut down as there was a thunderstorm.
*When I rebooted, I had a lovely desktop! Perhaps I should have made a snapshot or done a mkinitrd, but I thought my troubles were over. There was thunder on the left (superstition!) so I shut down again to save all my hard work.
*Booting up the next day, no graphics, only a login prompt!
Thinking this through, maybe I should just try the easy way…
Any advice is gratefully accepted, and again, apologies for adding all this to a thread that seems to be solved.

Create a new thread for your problem.
“The Hard Way” means using these drivers: Official Drivers | NVIDIA
For your hardware: Linux x64 (AMD64/EM64T) Display Driver | 340.108 | Linux 64-bit | NVIDIA
Your card is not supported in Leap 15.3. Replace it or use the nouveau driver.

Thank you, Syvatko. I will carry on.