Kernel update breaks NVIDIA driver

Hi, I’ve recently updated to the 4.1.21-14.2-x86_64 kernel and it has broken my nvidia driver. The Xorg.0.log contains the error:

NVIDIA: Failed to initialise the NVIDIA kernel module

I’m now using the kernel:

Linux desktop 4.1.20-11-default #1 SMP PREEMPT Fri Mar 18 14:42:07 UTC 2016 (0a392b2) x86_64 x86_64 x86_64 GNU/Linux

by selecting the advanced options from the grub menu.

The version of the nvidia driver is:

367.18_K4.1.12_1-25.1-x86_64

Has anyone else had this issue? I can’t find any error messages in other logs to help. Any ideas would be appreciated.

Mike Railton
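
For anyone else hunting through the log: Xorg tags error lines with an "(EE)" marker, so they can be filtered out of the noise. This is only a rough sketch; the function name is made up for illustration, and the log path in the comment is the usual default, which may differ on your system.

```shell
#!/bin/sh
# Print only the lines of an Xorg log whose leading marker is (EE).
# An optional "[ timestamp ]" prefix is allowed; the legend line near the top
# of the log (which merely mentions "(EE) error") is skipped.
xorg_errors() {
    grep -E '^(\[[ 0-9.]*] )?\(EE\)' "$1"
}

# Hypothetical usage:
# xorg_errors /var/log/Xorg.0.log
```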

Uninstall the kernel module and install it again:

sudo rpm -e --nodeps nvidia-gfxG04-kmp-default
sudo zypper in nvidia-gfxG04-kmp-default
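
To see afterwards which installed kernels actually have an nvidia module available, something like the following may help. It is a sketch under assumptions: the function name is invented, and it assumes the module files are named nvidia*.ko and are reachable below /lib/modules, either directly or through symlinks.

```shell
#!/bin/sh
# List each kernel directory under a modules root that contains an nvidia*.ko.
# The root defaults to /lib/modules; another path can be passed for testing.
list_nvidia_kernels() {
    root="${1:-/lib/modules}"
    # -L makes find follow symlinks, in case the module is only linked in
    find -L "$root" -name 'nvidia*.ko' 2>/dev/null |
        sed "s|^$root/||" | cut -d/ -f1 | sort -u
}

# Hypothetical usage: compare the output with "uname -r"
# list_nvidia_kernels
```

If the kernel you are booting is missing from the output, the module was not set up for it.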

I am not aware of any problems with the latest Leap 42.1 kernel, but there seems to be a problem with updates:
http://bugzilla.opensuse.org/show_bug.cgi?id=983927

PS: Apparently there is a bug in the current nvidia packages:
https://bugzilla.opensuse.org/show_bug.cgi?id=983934

Try to add this line to /etc/modprobe.d/50-nvidia.conf:

install nvidia-drm /sbin/modprobe nvidia; /sbin/modprobe --ignore-install nvidia-drm

and run “sudo mkinitrd”.
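
If you would rather not edit the file by hand, here is a minimal sketch that appends the line only when it is not already there, so running it twice does no harm. The helper name append_once is made up for illustration; the real file needs root, and mkinitrd still has to be run afterwards.

```shell
#!/bin/sh
# Append a line to a config file unless that exact line is already present.
append_once() {
    file="$1"; line="$2"
    # -q quiet, -x match the whole line, -F treat the pattern as a fixed string
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Hypothetical usage (as root), followed by "mkinitrd":
# append_once /etc/modprobe.d/50-nvidia.conf \
#   'install nvidia-drm /sbin/modprobe nvidia; /sbin/modprobe --ignore-install nvidia-drm'
```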

Experiencing a similar issue. The last round of updates, including several Nvidia driver components, broke the system and it will no longer boot with the latest kernel, 4.1.21. It gives me only the “something went wrong!” screen. I’ve reverted to 4.1.20 - which still works - until I can get the problem sorted out.

No problem here with opensuse 42.1, GNOME 3.16, kernel 4.1.21-14.2-x86_64 and nvidia driver 367.18_k4.1.12_1-25.1

Same setup here, except it’s Nvidia driver 340.96_k4.1.12-_1-45.1. This isn’t the first time an Nvidia update has made my Suse system unbootable. In my experience, adding the Nvidia repository to the list greatly increases the chance of getting a piece of software that hasn’t been sufficiently debugged. It used to be their newest barracuda GPU-computing driver that would always screw things up. Not sure which it is this time. Kernel module?

quoting myself:

This resulted in one of those, “well, duh!” moments where you realize the answer is staring you in the face. I reverted the driverset back to 367.18 and the system boots with kernel 4.1.21 again. Sometimes, the most obvious solution is the best solution.

The packages in the nvidia repo have meanwhile been downgraded to the previous version, so reinstalling the driver should work as well now.

Either with YaST (“Upgrade Unconditionally”), or:

sudo zypper in -f nvidia-gfxG04-kmp-default x11-video-nvidiaG04 nvidia-glG04 nvidia-computeG04

(for G02 and G03 change that accordingly, G03 needs nvidia-uvm-gfxG03-kmp-default as well though, and G02 doesn’t have nvidia-glG02, so leave that out…)
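
Spelled out, the package lists per driver generation could be built like this. It is only a sketch of the rule above; the function name is invented, and the G02/G03 package names are derived by substituting the generation into the G04 names, so check them against the repo before relying on them.

```shell
#!/bin/sh
# Print the driver packages for one generation (G02, G03 or G04).
nvidia_packages() {
    gen="$1"
    pkgs="nvidia-gfx${gen}-kmp-default x11-video-nvidia${gen} nvidia-compute${gen}"
    case "$gen" in
        G02) ;;                                  # there is no nvidia-glG02
        G03) pkgs="$pkgs nvidia-gl${gen} nvidia-uvm-gfx${gen}-kmp-default" ;;
        G04) pkgs="$pkgs nvidia-gl${gen}" ;;
        *)   echo "unknown generation: $gen" >&2; return 1 ;;
    esac
    echo "$pkgs"
}

nvidia_packages G04
```

The result can be passed on, for example: sudo zypper in -f $(nvidia_packages G03)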

But why exactly?
What is wrong with this update?

MichaelRailton has a problem with it, but I don’t, with the same driver and kernel version.
Does Michael have a package installed or not installed that could cause the driver to fail to build?
So are the dependencies not set correctly for the kernel or driver?

I cannot tell you more than what is written in the bug report. (I don’t even use the nvidia driver myself…)

But it is no dependency problem. It seems to be related to the creation of the initrd.
And the proposed “fix” turned out to not work either.

And btw, there are more people who have/had problems, not only MichaelRailton…

I tried this and it had no effect. The original contents of /etc/modprobe.d/50-nvidia.conf were:

options nvidia NVreg_DeviceFileUID=0 NVreg_DeviceFileGID=33 NVreg_DeviceFileMode=0660
install nvidia PATH=$PATH:/bin:/usr/bin; /sbin/modprobe --ignore-install nvidia; /sbin/modprobe nvidia_uvm; test -c /dev/nvidia-uvm || mknod -m 660 /dev/nvidia-uvm c $(cat /proc/devices | while read major device; do if [ "$device" == "nvidia-uvm" ]; then echo $major; break; fi ; done) 0 && chown :video /dev/nvidia-uvm; test -c /dev/nvidiactl || mknod -m 660 /dev/nvidiactl c 195 255 && chown :video /dev/nvidiactl; devid=-1; for dev in $(ls -d /sys/bus/pci/devices/*); do vendorid=$(cat $dev/vendor); if [ "$vendorid" == "0x10de" ]; then class=$(cat $dev/class); classid=${class%%00}; if [ "$classid" == "0x0300" -o "$classid" == "0x0302" ]; then devid=$((devid+1)); test -c /dev/nvidia${devid} || mknod -m 660 /dev/nvidia${devid} c 195 ${devid} && chown :video /dev/nvidia${devid}; fi; fi; done

after editing the contents are:

options nvidia NVreg_DeviceFileUID=0 NVreg_DeviceFileGID=33 NVreg_DeviceFileMode=0660
install nvidia PATH=$PATH:/bin:/usr/bin; /sbin/modprobe --ignore-install nvidia; /sbin/modprobe nvidia_uvm; test -c /dev/nvidia-uvm || mknod -m 660 /dev/nvidia-uvm c $(cat /proc/devices | while read major device; do if [ "$device" == "nvidia-uvm" ]; then echo $major; break; fi ; done) 0 && chown :video /dev/nvidia-uvm; test -c /dev/nvidiactl || mknod -m 660 /dev/nvidiactl c 195 255 && chown :video /dev/nvidiactl; devid=-1; for dev in $(ls -d /sys/bus/pci/devices/*); do vendorid=$(cat $dev/vendor); if [ "$vendorid" == "0x10de" ]; then class=$(cat $dev/class); classid=${class%%00}; if [ "$classid" == "0x0300" -o "$classid" == "0x0302" ]; then devid=$((devid+1)); test -c /dev/nvidia${devid} || mknod -m 660 /dev/nvidia${devid} c 195 ${devid} && chown :video /dev/nvidia${devid}; fi; fi; done
install nvidia-drm /sbin/modprobe nvidia; /sbin/modprobe --ignore-install nvidia-drm

Was that the change you intended?

Mike R

Yes.
Did you run “sudo mkinitrd” after the change?

Meanwhile it turned out that the nvidia Xwrapper script was broken too.
So if the above doesn’t help, try to set DISPLAYMANAGER_XSERVER="Xorg" in /etc/sysconfig/displaymanager (modify the existing line accordingly).
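
For those who prefer the command line, the edit can be sketched with sed. Treat it as an illustration only: the function name is made up, it assumes the line already exists in the file, and the value must not contain "|" or "&" characters.

```shell
#!/bin/sh
# Rewrite the existing DISPLAYMANAGER_XSERVER line in a sysconfig-style file.
set_xserver() {
    file="$1"; value="$2"
    # A .bak copy of the file is kept; only the line starting with the key changes.
    sed -i.bak "s|^DISPLAYMANAGER_XSERVER=.*|DISPLAYMANAGER_XSERVER=\"$value\"|" "$file"
}

# Hypothetical usage (as root):
# set_xserver /etc/sysconfig/displaymanager Xorg
```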

But as I wrote, the packages in the repo have been reverted to the older version.
I’m not sure whether the repo metadata is fixed already, but you could also download the packages manually:
ftp://download.nvidia.com/opensuse/leap/42.1/x86_64

If you uninstall the nvidia driver packages, you should be able to get to a graphical session, where you could download and install the packages more easily…

I tried this and the packages appear to be not there. This is what happened:

sudo zypper in -f nvidia-gfxG04-kmp-default x11-video-nvidiaG04 nvidia-glG04 nvidia-computeG04
Loading repository data...
Reading installed packages...
Forcing installation of 'x11-video-nvidiaG04-367.18-25.1.x86_64' from repository 'nVidia Graphics Drivers'.
Forcing installation of 'nvidia-glG04-367.18-25.1.x86_64' from repository 'nVidia Graphics Drivers'.
Forcing installation of 'nvidia-computeG04-367.18-25.1.x86_64' from repository 'nVidia Graphics Drivers'.
Forcing installation of 'nvidia-gfxG04-kmp-default-367.18_k4.1.12_1-25.1.x86_64' from repository 'nVidia Graphics Drivers'.
Resolving package dependencies...

The following 4 packages are going to be reinstalled:
  nvidia-computeG04 nvidia-gfxG04-kmp-default nvidia-glG04 x11-video-nvidiaG04

4 packages to reinstall.
Overall download size: 84.4 MiB. Already cached: 0 B. No additional space will be used or freed after the operation.
Continue? [y/n/? shows all options] (y): y
Retrieving package nvidia-gfxG04-kmp-default-367.18_k4.1.12_1-25.1.x86_64                            (1/4),   5.8 MiB ( 65.8 MiB unpacked)
Retrieving: nvidia-gfxG04-kmp-default-367.18_k4.1.12_1-25.1.x86_64.rpm ............................................................[error]
File './x86_64/nvidia-gfxG04-kmp-default-367.18_k4.1.12_1-25.1.x86_64.rpm' not found on medium 'http://download.nvidia.com/opensuse/leap/42.1'

Abort, retry, ignore? [a/r/i/? shows all options] (a): a
Problem occured during or after installation or removal of packages:
Installation aborted by user

Please see the above error message for a hint.

This is not really a high priority for me as I have the workaround of using the earlier kernel at boot. As there is a bug report in re the nvidia drivers, perhaps I should wait for the fix to come through.

Mike R

Yes.
As I wrote, the repo metadata is wrong currently.
It still refers to the newer packages that have been removed and replaced with the older (working) ones.

You’d need to download the rpm files manually and install them with rpm.

Or run this command (after you have removed the currently installed nvidia packages), which should download and install them:

sudo rpm -i ftp://download.nvidia.com/opensuse/leap/42.1/x86_64/nvidia-computeG04-361.42-21.1.x86_64.rpm ftp://download.nvidia.com/opensuse/leap/42.1/x86_64/nvidia-gfxG04-kmp-default-361.42_k4.1.12_1-21.1.x86_64.rpm ftp://download.nvidia.com/opensuse/leap/42.1/x86_64/nvidia-glG04-361.42-21.1.x86_64.rpm ftp://download.nvidia.com/opensuse/leap/42.1/x86_64/x11-video-nvidiaG04-361.42-21.1.x86_64.rpm
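
Since that command is hard to read on one line, here is a sketch that builds the same four URLs from their parts. The function and variable names are made up for illustration; the base URL, package names and version strings are the ones from the command above.

```shell
#!/bin/sh
# Print the download URLs of the four older (361.42) G04 packages.
rpm_urls() {
    base="ftp://download.nvidia.com/opensuse/leap/42.1/x86_64"
    ver="361.42"; rel="21.1"; kver="k4.1.12_1"
    for p in nvidia-computeG04 nvidia-glG04 x11-video-nvidiaG04; do
        echo "$base/$p-$ver-$rel.x86_64.rpm"
    done
    # The kernel module package also carries the kernel version in its name.
    echo "$base/nvidia-gfxG04-kmp-default-${ver}_${kver}-$rel.x86_64.rpm"
}

rpm_urls
```

The output could then be handed to rpm, for example: sudo rpm -i $(rpm_urls)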

This is not really a high priority for me as I have the workaround of using the earlier kernel at boot. As there is a bug report in re the nvidia drivers, perhaps I should wait for the fix to come through.

Well, if that’s ok for you, waiting is obviously the easiest thing to do… :wink:

Fixed (hopefully) packages have been submitted; it just takes some time until they are in the repo.
Unfortunately these things always happen shortly before the weekend, and then often get fixed only after the weekend…

I think I’ll wait until either the previous packages are available or the fixed ones arrive. An earlier attempt to remove the nvidia packages resulted in lots of dependent stuff being uninstalled, so it would probably be safer to “leave well alone”. After all, I have a working system.

Thanks for your help. Just one last question: should I edit the thread title to say SOLVED, or should I leave it as it is until the new packages arrive?

Yeah, I understand, sorry for that.
I didn’t mean to make it look like it was his problem alone.

Hi,

not sure if this is the right place to put this, but it’s nvidia related, so here goes. Does anyone know the correct driver to use for EVGA-NVIDIA-GEFORCE GT 730? I don’t see any likely candidates in the repository (but that doesn’t mean there isn’t one :\ ). I tried the one from nvidia’s site: blacklisted nouveau, installed it, then forgot to run “mkinitrd”, and it didn’t work. I ended up reinstalling the system. (Long story short, I just built this box, and it’s taken days to iron out the hardware/Linux compatibility issues; this is hopefully the last one.) I’m a little leery of trying the nvidia site’s driver again, only because I’m not positive that skipping “mkinitrd” was the only reason it failed. Can anyone tell me before I run the install again, and/or how to undo it if it fails?

I’ve been using a copy of Slacko-Puppy Linux (which makes a great live maintenance system, by the way, since it auto-mounts everything at boot) to undo what I can, but with video cards I’m kind of lost. I’ve had to do umpteen OS reinstalls for various reasons: some of them froze, and a few just cut out halfway and rebooted the computer. (Partly my fault: I tried to get away with using some old SLI cards I had at first, and the result was not good, so I scrapped the idea and bought this one. In for a penny, in for a pound.) I even installed a copy of Win7 to make sure it wasn’t just incorrectly installed hardware, and that went without a hitch >:( figures. This is the first box I’ve built in about 16 years, and the others I built were Windows boxes; this is a bit trickier, but worth it. I’ve been using openSUSE for a long time now and could NEVER go back to Windows, so I really need to get this worked out. I’m using Leap; the kernel is 4.1.21-14-default. If any other info is needed, let me know. Sorry if this is kind of vague, but any help is greatly appreciated!

Add the nvidia repo in YaST repository management (under Community repos).
In YaST software management, search for nvidia. The G04 set of packages should do it, but the G03 flavour should also work.

We don’t mark threads “SOLVED” here, and you can’t edit the thread title anyway. You might edit the title of a reply message though, possibly explaining what you did to “solve” your problem (if and when you really get there :wink: )

I get this error when trying to run software updates:

http://i67.tinypic.com/715u78.jpg

Then when going into YaST and doing an online update I get these, and skip them

http://i68.tinypic.com/qq8tgn.jpg

http://i65.tinypic.com/25kn4zo.jpg

Does anyone know how to fix this, going by these errors from YaST?