Issues with NVIDIA 352.79 Driver

I allowed Apper to update the NVIDIA G04 graphics driver on my openSUSE 13.2 system, and now it won’t boot using the NVIDIA 352.79 driver. It also seems that I can’t uninstall all the nouveau driver packages without the package manager recommending hundreds of uninstalls of other packages, so I’ve left the packages that trigger those recommendations in place. I can still get to recovery mode, but I’d like some assistance on what to do.

I’ve uninstalled and reinstalled the NVIDIA driver several times, thinking that I might have updated while the repositories were not fully synced, but that didn’t help. When I looked at several log files (Xorg and system), at one point they told me that some of the files were not at the same version level. Those version messages are no longer displayed, but I do get an error message in the X server log about not being able to find the nvidia module. I can post more when I get back to that PC to help troubleshoot.

Thanks in advance,
Bernard

What does “won’t boot” mean?
I suppose you don’t get a graphical system, but can log in just fine in text mode (press Ctrl+Alt+F1)?

This means Xorg fails to start, but otherwise the system is booting fine.

And it also seems that I can’t uninstall all the nouveau driver packages without the package manager recommending 100’s of uninstalls of other packages.

You don’t need to uninstall nouveau, you shouldn’t uninstall it, and you can’t really.
Well, you can uninstall the X driver, but not the rest, such as the kernel module, which is part of the kernel itself.

When I looked at several log files (Xorg and system), at one point they told me that some of the files were not at the same version level. Those version messages are no longer displayed, but I do get an error message in the X server log about not being able to find the nvidia module. I can post more when I get back to that PC to help troubleshoot.

Can you please post the Xorg log?

Also, what nvidia and kernel packages do you have installed?

rpm -qa | egrep "kernel|nvidia"

Here’s the Xorg.0.log file when I try to do a startx:

499.160]
X.Org X Server 1.16.1
Release Date: 2014-09-21
499.160] X Protocol Version 11, Revision 0
499.160] Build Operating System: openSUSE SUSE LINUX
499.160] Current Operating System: Linux Beast 3.16.7-32-desktop #1 SMP PREEMPT Wed Jan 20 14:05:33 UTC 2016 (d4df98a) x86_64
499.160] Kernel command line: BOOT_IMAGE=/vmlinuz-3.16.7-32-desktop root=UUID=77802fe8-ff2c-48df-84d8-020bcca90a47 nomodeset resume=/dev/disk/by-id/ata-ST3320620AS_5QF26GPD-part5 splash=silent quiet showopts vga=0x31a
499.160] Build Date: 28 January 2016 11:12:56AM
499.160]
499.160] Current version of pixman: 0.32.6
499.160] Before reporting problems, check http://wiki.x.org
to make sure that you have the latest version.
499.160] Markers: (--) probed, (**) from config file, (==) default setting,
(++) from command line, (!!) notice, (II) informational,
(WW) warning, (EE) error, (NI) not implemented, (??) unknown.
499.160] (==) Log file: “/var/log/Xorg.0.log”, Time: Fri Feb 5 17:08:26 2016
499.161] (==) Using config file: “/etc/X11/xorg.conf”
499.161] (==) Using config directory: “/etc/X11/xorg.conf.d”
499.161] (==) Using system config directory “/usr/share/X11/xorg.conf.d”
499.161] (==) ServerLayout “Layout0”
499.161] (**) |-->Screen “Screen0” (0)
499.161] (**) |   |-->Monitor “Monitor0”
499.162] (**) |   |-->Device “Device0”
499.162] (**) |-->Input Device “Keyboard0”
499.162] (**) |-->Input Device “Mouse0”
499.162] (==) Automatically adding devices
499.162] (==) Automatically enabling devices
499.162] (==) Automatically adding GPU devices
499.162] (WW) The directory “/usr/share/fonts/misc/sgi” does not exist.
499.162] Entry deleted from font path.
499.162] (==) FontPath set to:
/usr/share/fonts/misc:unscaled,
/usr/share/fonts/Type1/,
/usr/share/fonts/100dpi:unscaled,
/usr/share/fonts/75dpi:unscaled,
/usr/share/fonts/ghostscript/,
/usr/share/fonts/cyrillic:unscaled,
/usr/share/fonts/truetype/,
built-ins
499.162] (==) ModulePath set to “/usr/lib64/xorg/modules”
499.162] (WW) Hotplugging is on, devices using drivers ‘kbd’, ‘mouse’ or ‘vmmouse’ will be disabled.
499.162] (WW) Disabling Keyboard0
499.162] (WW) Disabling Mouse0
499.162] (II) Loader magic: 0x80ec80
499.162] (II) Module ABI versions:
499.162] X.Org ANSI C Emulation: 0.4
499.162] X.Org Video Driver: 18.0
499.162] X.Org XInput driver : 21.0
499.162] X.Org Server Extension : 8.0
499.162] (II) xfree86: Adding drm device (/dev/dri/card0)
499.163] (--) PCI: (0:2:0:0) 1a0a:6200:1461:6202 rev 1, Mem @ 0xfddf8000/32768
499.163] (--) PCI:*(0:4:0:0) 10de:1381:196e:1381 rev 162, Mem @ 0xfb000000/16777216, 0xd0000000/268435456, 0xee000000/33554432, I/O @ 0x00008c00/128, BIOS @ 0x???/524288
499.163] (II) LoadModule: “glx”
499.164] (II) Loading /usr/lib64/xorg/modules/extensions/libglx.so
499.183] (II) Module glx: vendor=“NVIDIA Corporation”
499.183] compiled for 4.0.2, module version = 1.0.0
499.183] Module class: X.Org Server Extension
499.183] (II) NVIDIA GLX Module 352.79 Wed Jan 13 15:54:44 PST 2016
499.183] (II) LoadModule: “nvidia”
499.183] (II) Loading /usr/lib64/xorg/modules/updates/drivers/nvidia_drv.so
499.184] (II) Module nvidia: vendor=“NVIDIA Corporation”
499.184] compiled for 4.0.2, module version = 1.0.0
499.184] Module class: X.Org Video Driver
499.184] (II) NVIDIA dlloader X Driver 352.63 Sat Nov 7 20:29:25 PST 2015
499.184] (II) NVIDIA Unified Driver for all Supported NVIDIA GPUs
499.184] (--) using VT number 2

499.191] (II) Loading sub module “fb”
499.191] (II) LoadModule: “fb”
499.191] (II) Loading /usr/lib64/xorg/modules/libfb.so
499.192] (II) Module fb: vendor=“X.Org Foundation”
499.192] compiled for 1.16.1, module version = 1.0.0
499.192] ABI class: X.Org ANSI C Emulation, version 0.4
499.192] (II) Loading sub module “wfb”
499.192] (II) LoadModule: “wfb”
499.192] (II) Loading /usr/lib64/xorg/modules/libwfb.so
499.192] (II) Module wfb: vendor=“X.Org Foundation”
499.192] compiled for 1.16.1, module version = 1.0.0
499.192] ABI class: X.Org ANSI C Emulation, version 0.4
499.192] (II) Loading sub module “ramdac”
499.192] (II) LoadModule: “ramdac”
499.192] (II) Module “ramdac” already built-in
499.193] (EE) NVIDIA: Failed to initialize the NVIDIA kernel module. Please see the
499.193] (EE) NVIDIA: system’s kernel log for additional error messages and
499.193] (EE) NVIDIA: consult the NVIDIA README for details.
499.193] (EE) No devices detected.
499.193] (EE)
Fatal server error:
499.193] (EE) no screens found(EE)
499.193] (EE)
Please consult the The X.Org Foundation support
at http://wiki.x.org
for help.
499.193] (EE) Please also check the log file at “/var/log/Xorg.0.log” for additional information.
499.193] (EE)

And here’s the output of rpm -qa | egrep "kernel|nvidia":

kernel-desktop-devel-3.16.7-29.1.x86_64
kernel-desktop-3.11.6-4.1.x86_64
kernel-desktop-3.11.10-29.1.x86_64
nvidia-computeG04-352.79-19.1.x86_64
kernel-devel-3.16.7-32.1.noarch
nvidia-uvm-gfxG04-kmp-desktop-352.79_k3.16.6_2-19.1.x86_64
kernel-macros-3.16.7-32.1.noarch
nvidia-glG04-352.79-19.1.x86_64
kernel-desktop-devel-3.16.7-32.1.x86_64
kernel-desktop-3.16.7-29.1.x86_64
x11-video-nvidiaG04-352.79-19.1.x86_64
kernel-desktop-3.16.7-32.1.x86_64
nvidia-gfxG04-kmp-desktop-352.79_k3.16.6_2-19.1.x86_64
kernel-devel-3.16.7-29.1.noarch

Let me know if you’d like additional information.

Thank you,

Bernard

I would rename the xorg.conf first and see if it then works:

499.161] (==) Using config file: “/etc/X11/xorg.conf”

/etc/X11/xorg.conf is not needed any more, but if it is there, it will be used.

OK. I renamed the file with a .old extension. Now when I boot, I no longer have to go through the advanced recovery options, but the NVIDIA driver still does not load.

Also, when I do a Control-Alt-F10 there is an entry on this screen with a line:

NVRM: API Mismatch: the client has the version 352.63, but
NVRM: this kernel module has the version 352.79. Please
NVRM: make sure that this kernel module and all NVIDIA Driver
NVRM: components have the same version.
NVRM: nvidia_frontend_ioctl: minor 255, module->ioctl failed, error -22

Does this give any additional clues as to what is going on?

Thanks

There are 5 packages that make up the NVIDIA driver; be sure ALL of them are the same version. You can check in YaST.
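If you prefer the command line to YaST, something like the following rough sketch could do the same check. It assumes you feed it "name version" pairs from rpm's query format; the function name nvidia_pkg_versions is made up for illustration. The kernel module packages carry a "_k<kernel version>" suffix in their version field, so the sketch strips that before comparing.

```shell
# Read "NAME VERSION" pairs on stdin and print each distinct driver version.
# The "_k<kernel>" suffix that the kmp packages append is stripped first.
# nvidia_pkg_versions is a hypothetical helper name, not an existing tool.
nvidia_pkg_versions() {
    awk '{ v = $2; sub(/_k.*$/, "", v); print v }' | sort -u
}

# Possible usage on the affected machine (not run here):
#   rpm -qa --qf '%{NAME} %{VERSION}\n' | grep -i nvidia | nvidia_pkg_versions
```

If this prints more than one line, the installed packages are not all at the same driver version. A single line only says the packages agree with each other; it says nothing about stray files that are not owned by any package.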

Ok, so indeed it seems the kernel module cannot be loaded, although it is installed.
Maybe it got removed by something?
This happens e.g. if you switch from G03 to G04 without removing the old version first.

Try to reinstall the kernel modules:

sudo zypper in -f nvidia-gfxG04-kmp-desktop nvidia-uvm-gfxG04-kmp-desktop

Well, apparently an old version of the driver is still installed somewhere and used instead of the current one.
Did you upgrade to 13.2 from an older version?
In this case, depending on how you did the upgrade, old nvidia files can be left over.
To prevent such problems, it is better to remove the driver completely before doing an openSUSE upgrade.

See SDB:NVIDIA drivers - openSUSE Wiki.

In this case, check whether you have a directory /usr/lib/xorg/modules/updates/ and/or /usr/lib64/xorg/modules/updates/ with nvidia files in it. If yes, just delete them and the driver should work. (Those directories were used by the nvidia driver in earlier openSUSE releases, and take precedence over the standard locations where the nvidia driver files are installed now.)

Up to now it probably worked because the versions still happened to be the same; the update broke that.
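To look for such left-over files before deleting anything, a small sketch like this might help. The function name find_leftover_nvidia and the optional root-prefix argument are my own additions for illustration; only the two directory paths come from the advice above.

```shell
# List nvidia-related files under the old "updates" module directories.
# Optional argument: a root prefix (defaults to empty, i.e. the real /).
# find_leftover_nvidia is a hypothetical helper name.
find_leftover_nvidia() {
    root=${1:-}
    for dir in "$root/usr/lib/xorg/modules/updates" \
               "$root/usr/lib64/xorg/modules/updates"; do
        # Skip directories that do not exist on this system.
        [ -d "$dir" ] || continue
        find "$dir" -type f -name '*nvidia*'
    done
}

# Possible usage on the affected machine (not run here):
#   find_leftover_nvidia
```

For each file it prints, rpm -qf <file> should tell you whether any installed package owns it. Files that no package owns are good candidates for being left-overs.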

PS: and that really is the problem here, see this from the log:


 499.164] (II) Loading /usr/lib64/xorg/modules/extensions/libglx.so

 499.183] (II) Module glx: vendor="NVIDIA Corporation"
 499.183] compiled for 4.0.2, module version = 1.0.0
 499.183] Module class: X.Org Server Extension
 499.183] (II) NVIDIA GLX Module 352.79 Wed Jan 13 15:54:44 PST 2016
 499.183] (II) LoadModule: "nvidia"
 499.183] (II) Loading /usr/lib64/xorg/modules/updates/drivers/nvidia_drv.so
 499.184] (II) Module nvidia: vendor="NVIDIA Corporation"
 499.184] compiled for 4.0.2, module version = 1.0.0
 499.184] Module class: X.Org Video Driver
 499.184] (II) NVIDIA dlloader X Driver 352.63 Sat Nov 7 20:29:25 PST 2015

So, delete /usr/lib64/xorg/modules/updates/ to fix it.

sudo rm -r /usr/lib64/xorg/modules/updates
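For anyone who lands here with a similar problem: the mismatch between the GLX module and the X driver can be pulled out of the log with a sketch like the one below. It assumes the log contains the "NVIDIA GLX Module" and "NVIDIA dlloader X Driver" lines in the form quoted in this thread; the function name xorg_nvidia_versions is hypothetical.

```shell
# Print the NVIDIA GLX module and X driver versions found in an Xorg log.
# $1 is the path to the log file. xorg_nvidia_versions is a hypothetical
# helper name; the matched strings are the ones quoted in this thread.
xorg_nvidia_versions() {
    sed -n \
        -e 's/.*NVIDIA GLX Module *\([0-9][0-9.]*\).*/glx \1/p' \
        -e 's/.*NVIDIA dlloader X Driver *\([0-9][0-9.]*\).*/driver \1/p' \
        "$1"
}

# Possible usage on the affected machine (not run here):
#   xorg_nvidia_versions /var/log/Xorg.0.log
```

If the two printed versions differ, the X server is loading components from two different driver releases, which is what happened here.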

I deleted /usr/lib64/xorg/modules/updates/ and everything works properly now.

I did upgrade the system about 6-9 months ago (from 13.1 to 13.2), and I have allowed two NVIDIA driver updates to be applied since then. So it puzzles me that it would start using the leftover drivers. :\
Regardless, thank you for your assistance! lol!

Bernard

Great! :)

I did upgrade the system about 6-9 months ago (from 13.1 to 13.2), and I have allowed two NVIDIA driver updates to be applied since then. So it puzzles me that it would start using the leftover drivers. :\

What’s particularly interesting is that the “left over” driver is version 352.63 from Nov 7 2015.

Did you maybe use the nvidia repo for 13.1 for the first few months and changed to the correct one in the last two months?

Or maybe you have both (the 13.1 and the 13.2 repo) in your repo list? In that case, which version gets installed might be quite random.

Better check that. If that’s the case, your system might break again at random whenever there’s an nvidia driver update.
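One way to check might be to filter the repo list for nvidia entries, as in the sketch below. It assumes the table layout printed by zypper lr -u, with columns separated by "|" and the URI in the last column; the function name nvidia_repo_uris is made up for illustration.

```shell
# Read a "|"-separated repo table on stdin (as printed by `zypper lr -u`,
# assuming the URI is the last column) and print the URI of every row
# that mentions nvidia. nvidia_repo_uris is a hypothetical helper name.
nvidia_repo_uris() {
    awk -F'|' 'tolower($0) ~ /nvidia/ {
        uri = $NF
        gsub(/^[ \t]+|[ \t]+$/, "", uri)   # trim surrounding whitespace
        print uri
    }'
}

# Possible usage on the affected machine (not run here):
#   zypper lr -u | nvidia_repo_uris
```

If more than one URI comes back, or the one that does points at a release other than 13.2, that would explain the mixed versions.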