Jan 18 Kernel update and NVIDIA driver mismatch

Hello all,

I seem to be stuck at the moment. On January 18, my system performed a kernel update. Since that time I have not been able to get back into X.

I am running:
cat /etc/os-release
NAME=openSUSE
VERSION="13.1 (Bottle)"
VERSION_ID="13.1"
PRETTY_NAME="openSUSE 13.1 (Bottle) (x86_64)"
ID=opensuse
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:opensuse:opensuse:13.1"
BUG_REPORT_URL="https://bugs.opensuse.org"
HOME_URL="https://opensuse.org/"
ID_LIKE="suse"

Kernel:
Linux hub 3.11.6-4-desktop #1 SMP PREEMPT Wed Oct 30 18:04:56 UTC 2013 (e6d4a27) x86_64 x86_64 x86_64 GNU/Linux

I have the following card installed: (lspci -k)
05:00.0 VGA compatible controller: NVIDIA Corporation GK106 [GeForce GTX 650 Ti] (rev a1)
Subsystem: Micro-Star International Co., Ltd. Device 2806
Kernel driver in use: nvidia
Kernel modules: nvidiafb, nouveau, nvidia

The /var/log/Xorg.0.log file gives the following:
<begin log clip>
"Default Screen Section" for depth/fbbpp 24/32
[    19.940] (==) NVIDIA(0): Depth 24, (==) framebuffer bpp 32
[    19.940] (==) NVIDIA(0): RGB weight 888
[    19.940] (==) NVIDIA(0): Default visual is TrueColor
[    19.940] (==) NVIDIA(0): Using gamma correction (1.0, 1.0, 1.0)
[    19.940] (**) NVIDIA(0): Enabling 2D acceleration
[    19.940] (EE) NVIDIA(0): Version mismatch detected between the NVIDIA X driver and the
[    19.940] (EE) NVIDIA(0): NVIDIA GLX module. X driver version: 331.38; GLX module
[    19.940] (EE) NVIDIA(0): version: 304.117. Please try reinstalling the NVIDIA
[    19.940] (EE) NVIDIA(0): driver.
[    19.941] (EE) NVIDIA(0): Failed to initialize the NVIDIA kernel module. Please see the
[    19.941] (EE) NVIDIA(0): system's kernel log for additional error messages and
[    19.941] (EE) NVIDIA(0): consult the NVIDIA README for details.
[    19.941] (EE) NVIDIA(0): *** Aborting ***
[    19.941] (EE) NVIDIA(0): Failing initialization of X screen 0
[    19.941] (II) UnloadModule: "nvidia"
[    19.941] (II) UnloadSubModule: "wfb"
[    19.941] (II) UnloadSubModule: "fb"
[    19.941] (EE) Screen(s) found, but none have a usable configuration.
[    19.941] (EE)
Fatal server error:
[    19.941] (EE) no screens found(EE)
[    19.941] (EE)
Please consult the The X.Org Foundation support
</end log clip>
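
For a log like this one, filtering on the "(EE)" marker is a quick way to see only the errors. The following is a minimal sketch; the function name `xorg_errors` is made up for illustration and is not part of any Xorg tooling.

```sh
# Print only the error lines, which Xorg marks with "(EE)",
# from the log file given as the first argument.
xorg_errors() {
    grep -F '(EE)' "$1"
}
```

On the system in question it would be called as `xorg_errors /var/log/Xorg.0.log`.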

So far I have attempted several things to resolve this issue.
I have completely removed all NVIDIA drivers and attempted to revert to the nouveau drivers. The same error message is observed in the log.

I checked the /etc/X11/xorg.conf.d/50-device.conf. This file is basically empty. (I assume this is correct to allow auto configuration but am not sure.)
I have also attempted to blacklist the glx and nvidia drivers from loading.
I have checked for an errant xorg.conf file on my system and did not find one.
I have hand verified that the removal of the nvidia drivers was successful. I did not find any modules left from the previous installation. I did however find mention of nvidia modules in the kernel source directories. (for example: /lib/modules/3.11.6-4-desktop/kernel/drivers/video)
I even tried to back-rev the proprietary NVIDIA driver back to the 304.117 version to no avail.

I currently have the 304.117 version of the NVIDIA driver installed (that is why you see the nvidia module in the lspci -k output above) and am still having the same conflict according to the Xorg.0.log file.

Can anyone provide some guidance with this issue? I am not sure where I am going wrong.

Thanks in advance
Wayne

There was no kernel update that I know of.
But there was an nvidia driver update, which dragged in a couple of extra packages for me.
IIRC: pv and something to do with nvidia and OpenGL.
I’m not on that machine right now, so I can’t look further.

Thank you. You are correct. My bad. Here is the line from /var/log/zypp/history.

2014-01-18 07:04:00 nvidia-gfxG03-kmp-desktop-331.38_k3.11.6_4-23.1.x86_64.rpm installed ok
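
The version field of that package, 331.38_k3.11.6_4, can be read as two parts. The sketch below assumes the usual openSUSE KMP convention: the driver version, then `_k`, then the kernel version the module was built for, with dashes written as underscores. The function names are hypothetical.

```sh
# Driver version: everything before the first "_k" in the KMP version field.
kmp_driver_version() {
    printf '%s\n' "${1%%_k*}"
}

# Kernel version: everything after the first "_k", with "_" turned back into "-".
kmp_kernel_version() {
    printf '%s\n' "${1#*_k}" | tr '_' '-'
}
```

Comparing the result of `kmp_kernel_version 331.38_k3.11.6_4` with the `uname -r` output from the first post (3.11.6-4-desktop) suggests the module was built for the running kernel, which fits with only the driver having been updated.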

Additional rpm output:

make: Entering directory `/usr/src/linux-3.11.6-4-obj/x86_64/desktop'
CC [M] /usr/src/kernel-modules/nvidia-331.38-desktop/nv.o
In file included from /usr/src/linux-3.11.6-4/arch/x86/include/asm/bitops.h:514:0,
                 from /usr/src/linux-3.11.6-4/include/linux/bitops.h:22,
                 from /usr/src/linux-3.11.6-4/include/linux/kernel.h:10,
                 from /usr/src/linux-3.11.6-4/include/linux/sched.h:15,
                 from /usr/src/linux-3.11.6-4/include/linux/utsname.h:5,
                 from /usr/src/kernel-modules/nvidia-331.38-desktop/nv-linux.h:44,
                 from /usr/src/kernel-modules/nvidia-331.38-desktop/nv.c:13:
/usr/src/linux-3.11.6-4/include/linux/bitops.h: In function ‘hweight_long’:
/usr/src/linux-3.11.6-4/include/asm-generic/bitops/const_hweight.h:27:70: warning: signed and unsigned type in conditional expression [-Wsign-compare]
#define hweight64(w) (__builtin_constant_p(w) ? __const_hweight64(w) : __arch_hweight64(w))
^

I had the exact same problem with that Nvidia update with my GTX 650 Ti Boost.

Here’s the culprit:


[    19.940] (EE) NVIDIA(0): Version mismatch detected between the NVIDIA X driver and the
[    19.940] (EE) NVIDIA(0): NVIDIA GLX module. X driver version: 331.38; GLX module
[    19.940] (EE) NVIDIA(0): version: 304.117. Please try reinstalling the NVIDIA
[    19.940] (EE) NVIDIA(0): driver.

BTW this is posted between CODE tags, the # in the editor.

Please post output of


rpm -qa | grep nvidia

My bet is that the kmp’s NVIDIA version does not match that of the x11-video packages.
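
A rough way to check that bet without reading the list by eye is sketched here. It assumes the package list is produced as "name version" pairs with `rpm -qa --qf '%{NAME} %{VERSION}\n'`, and that kmp packages carry a `_k<kernel>` suffix on their version. The function names are made up for illustration.

```sh
# Read "name version" lines on stdin and print each distinct NVIDIA
# driver version once, with any "_k<kernel>" suffix removed.
nvidia_driver_versions() {
    awk 'tolower($1) ~ /nvidia/ { sub(/_k.*/, "", $2); print $2 }' | sort -u
}

# Succeed only if at most one driver version is present on stdin.
nvidia_versions_consistent() {
    count=$(nvidia_driver_versions | wc -l | tr -d ' ')
    [ "$count" -le 1 ]
}
```

Usage would be along the lines of `rpm -qa --qf '%{NAME} %{VERSION}\n' | nvidia_driver_versions`. If that prints more than one line, for example both 304.117 and 331.38, the installed packages do not belong together.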

https://www.dropbox.com/s/0vc5540s6tu997u/IMG_20140119_144257.jpg

Here’s an image of the output of the command you suggested. Not sure if it’s pasted correctly so let me know if you can’t see it.

Please, just post the text. You should still be able to boot to a graphical system by choosing “Recovery mode” under “Advanced Options” in the boot menu.

And please post the output of this as well:

rpm -qa | grep kernel

Anyway, my bet is that you have both the G02 and the G03 driver installed.
I would suggest removing all nvidia packages in YaST, and kernel-default as well if it is installed.

Then start YaST again and only install the packages you want, i.e. either nvidia-gfxG03-kmp-desktop, x11-video-nvidiaG03 and nvidia-computeG03, or the same in their G02 variants.

If you don’t remove everything first and re-install the correct packages, there’s no guarantee that the correct files are present. Those G02 and G03 packages overwrite each other’s files, so it’s quite random which files you really have on your hard disk now.
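
To see whether packages from more than one driver family are installed, something like the following sketch could help. It assumes the family tag (G02, G03) appears in the package name, as it does in the names mentioned in this thread; `nvidia_generations` is a hypothetical helper name.

```sh
# Read package names on stdin and print each driver family tag
# (G02, G03, ...) that occurs in them, once each.
nvidia_generations() {
    sed -n 's/.*\(G0[0-9]\).*/\1/p' | sort -u
}
```

With `rpm -qa --qf '%{NAME}\n' | nvidia_generations`, a single line of output means only one family is present; two lines mean G02 and G03 packages are mixed.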

Recovery mode hasn’t worked since the nvidia update. I guess right clicking on an image icon is too much trouble but yes there were G02 and G03 drivers installed. I tried removing them all with Zypper and reinstalling only the G02 drivers. Still no dice. Still no recovery mode either.

No, but I didn’t even see an image icon in your post.

I right-clicked on that empty icon now.
So you even had 4! nvidia kernel module packages installed… lol!

I tried removing them all with Zypper and reinstalling only the G02 drivers. Still no dice. Still no recovery mode either.

And do you get a graphical mode (maybe in “Recovery mode”) when you just uninstall all nvidia packages?

And please post “rpm -qa | grep nvidia” again to confirm.

Thanks for taking the time to look into this.

Here is the output requested:
hub:~ # rpm -qa | grep -i nvidia
hub:~ #

As you can see, zero nvidia packages are installed at the moment. I also removed the proprietary nvidia drivers by running the installer with the --uninstall option. Since my last posting I have also re-installed the nouveau drivers, to no avail.

Your hunch was correct Wolfi323. I DID have both G2 and G3 versions installed to begin with. Not sure how that happened but I am sure it was my doings. :)

I do not understand why the configuration will not go back to using the nouveau drivers. I have checked the /etc/modprobe.d/50-blacklist.conf file and the module is not currently blacklisted. It seems like I am missing something in the configuration somewhere that has been told to use the nvidia drivers. Also, how can it be complaining about this version mis-match when none of those modules are present on the system from what I can see anyway. Trying to understand.

Wayne

So you had the driver installed using the .run file from nvidia as well?
That is surely calling for trouble…

Your hunch was correct Wolfi323. I DID have both G2 and G3 versions installed to begin with. Not sure how that happened but I am sure it was my doings. :)

That this is even possible is actually libzypp’s fault, because it ignores file conflicts itself and uses “rpm --force” to install the packages (rpm would complain otherwise).

I do not understand why the configuration will not go back to using the nouveau drivers. I have checked the /etc/modprobe.d/50-blacklist.conf file and the module is not currently blacklisted. It seems like I am missing something in the configuration somewhere that has been told to use the nvidia drivers.

Maybe there’s some nouveau blacklist or “nomodeset” kernel option still in place?
Do you have an /etc/X11/xorg.conf? If yes, remove it, as it will force X to load the nvidia driver, which fails obviously (especially if it is not installed anymore… ;) ).
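
Those leftovers can be checked for roughly like this. It is only a sketch: the function names are made up, and the directory to search is passed in rather than assumed.

```sh
# Succeed if the kernel command line passed as $1 contains "nomodeset".
has_nomodeset() {
    case " $1 " in
        *" nomodeset "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the *.conf files in directory $1 that blacklist the nouveau module.
nouveau_blacklists() {
    grep -l '^[[:space:]]*blacklist[[:space:]]\{1,\}nouveau' "$1"/*.conf 2>/dev/null
}
```

Typical calls would be `has_nomodeset "$(cat /proc/cmdline)" && echo "nomodeset is active"`, `nouveau_blacklists /etc/modprobe.d`, and a plain `ls -l /etc/X11/xorg.conf` for the last point.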

Also, how can it be complaining about this version mis-match when none of those modules are present on the system from what I can see anyway.

So you still have that error from the first post in /var/log/Xorg.0.log?
Are you sure that log is current? (could be that X is not even trying to start)
Have a look at the file date, or at the time inside the log.
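
One way to make that check less error-prone is to compare the log against a marker file created just before trying to start X. This is a minimal sketch; the helper name and the marker file are assumptions for illustration.

```sh
# Succeed if the file in $1 was modified more recently than the file in $2.
log_is_newer_than() {
    [ -n "$(find "$1" -newer "$2" 2>/dev/null)" ]
}
```

For example: `touch /tmp/marker`, attempt to start X, then run `log_is_newer_than /var/log/Xorg.0.log /tmp/marker || echo "log is stale, X did not write to it"`.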

If that error still comes up you must have some traces of the nvidia .run installer left on your system.
Could you maybe post the whole log file in that case?

I had already tried recovery mode with no nvidia drivers installed and it didn’t work. And I tried G02 drivers for my GeForce 6xxx card as suggested by openSUSE documentation. That didn’t work. FINALLY got it to work by installing the G03 drivers, which got me to a graphical desktop at 640x480 that didn’t recognize the installed nvidia drivers. So I installed these packages manually from YaST and it is finally back to normal.

nvidia-gfxG03-kmp-pae
nvidia-gfxG03-kmp-desktop
nvidia-glG03

Should I report this as a bug? I never installed any nvidia drivers manually, only used the OpenSuse 1-click from the support pages (which worked great for weeks until the update yesterday) and when this update came down it broke everything. It sounds like the problem may have been that I used the 1-click for GeForce 6xxx cards (which seems logical as that’s what I have) but only the G03 drivers seem to work at this point (listed as only for GeForce 8xxx cards).

I did not have both going at the same time. I tried forcing the system back to 304.117 in order to get it back up once I started having trouble. I also tried installing the latest version and still had issues.

I did try nomodeset as a troubleshooting step but this was a one time attempt. Nothing was present or put permanently in place.


Well. My attention to detail is in a bit of a sorry state at the moment. You nailed this one. The last time stamp on the Xorg.0.log was from 0808 today. So it appears that X is not even starting then. …Right?

Wayne

[QUOTE=iluria;2617630]Should I report this as a bug? I never installed any nvidia drivers manually, only used the OpenSuse 1-click from the support pages (which worked great for weeks until the update yesterday) and when this update came down it broke everything. It sounds like the problem may have been that I used the 1-click for GeForce 6xxx cards (which seems logical as that’s what I have) but only the G03 drivers seem to work at this point (listed as only for GeForce 8xxx cards).[/QUOTE]

I found this to be confusing as well. I did use the 1-Click installer just as you did. I pretty much followed your logic step by step.

I just have not been able to get it going again by re-installing up to this point.

Well I used Zypper from the command line to remove all nvidia packages, then followed the Zypper section of this site exactly, http://en.opensuse.org/SDB:NVIDIA_drivers, but did so as if I had an 8xxx card. Then it worked as I said above: you get a graphical desktop, but some of the G03 nvidia packages are missing and need to be installed manually from YaST. Hopefully I don’t still have some packages missing that I’m unaware of.

You are the man! (Or woman, if that is the case :) ) I was able to get it going by re-installing only the G03 versions of the packages. Thanks so much.

Wayne

You don’t have a 6xxx, but a GeForce 600 series card. 6xxx is GeForce 6 series, which are very old.
Maybe the G02 driver doesn’t support your GTX 650 Ti card anymore.
I had already considered that; that’s why I suggested removing the driver completely. I would have suggested trying G03 next.

Although, the GeForce 650Ti is actually listed as supported by the G02 driver on the nvidia page…:\

nvidia-gfxG03-kmp-pae

You don’t need that, that’s for kernel-pae. Apparently you hit another zypper issue, in that it doesn’t want to install a package that you uninstalled, and prefers to install another kernel instead in this case.

Remove that, and kernel-pae. That’s a 32bit kernel anyway.
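
Whether a given kmp package belongs to the running kernel can be checked by comparing flavors. The sketch below assumes the flavor is the part after the last dash in `uname -r` (for example "desktop" in 3.11.6-4-desktop) and the part after `-kmp-` in the bare package name. The function names are hypothetical.

```sh
# Kernel flavor: text after the last "-" in a kernel release string.
kernel_flavor() {
    printf '%s\n' "${1##*-}"
}

# KMP flavor: text after "-kmp-" in a bare package name (no version).
kmp_flavor() {
    printf '%s\n' "${1##*-kmp-}"
}

# Succeed if the kmp package name in $1 matches the kernel release in $2.
kmp_matches_kernel() {
    [ "$(kmp_flavor "$1")" = "$(kernel_flavor "$2")" ]
}
```

So `kmp_matches_kernel nvidia-gfxG03-kmp-pae "$(uname -r)"` would fail on a system running the desktop kernel, while the same call with nvidia-gfxG03-kmp-desktop would succeed.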

Should I report this as a bug?

What? The libzypp issues?
You could, but I already mentioned them in a bug report nearly a year ago.
And actually, those issues are by design. (YaST/zypper install every package independently, and therefore have to use “rpm --force” to install packages)
IIRC, libzypp will gain the ability to check for file conflicts in a future version, which should fix this issue.

I never installed any nvidia drivers manually, only used the OpenSuse 1-click from the support pages (which worked great for weeks until the update yesterday) and when this update came down it broke everything.

Well, you wrote that you ran the installer with the --uninstall option. That’s why I thought you had installed it that way as well.

The update shouldn’t have caused you any problems if you hadn’t had both G02 and G03 drivers installed.
Such a combination can break at any time when there is an update.

It sounds like the problem may have been that I used the 1-click for GeForce 6xxx cards (which seems logical as that’s what I have) but only the G03 drivers seem to work at this point (listed as only for GeForce 8xxx cards).

Again, you have a GeForce 600 series card, the G03 driver supports everything since the GeForce 8 series.
GeForce 8xxx cards are GeForce 8 series (8600 GT and the like), GeForce 6xxx are GeForce 6 series (6600 GT, for example).
You have a GeForce 600 series.

Yes, nvidia’s card naming scheme is a bit confusing.
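
As a rule of thumb for the cards discussed in this thread only, the series can be read off the number of digits in the model number. The sketch below encodes just that distinction (four digits: single-digit series; three digits: hundreds series) and makes no claim about other naming schemes; `geforce_series` is a made-up name.

```sh
# Print the series for a model string such as "GTX 650 Ti" or "6600".
# Only three- and four-digit model numbers are handled; anything else fails.
geforce_series() {
    model=$(printf '%s' "$1" | tr -cd '0-9')
    case ${#model} in
        4) printf 'GeForce %s series\n' "${model%???}" ;;
        3) printf 'GeForce %s00 series\n' "${model%??}" ;;
        *) return 1 ;;
    esac
}
```

With this, `geforce_series "GTX 650 Ti"` gives the 600 series, while `geforce_series 6600` gives the much older 6 series.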

Well I meant bug in the broadest sense, meaning a bug in the software or process. Like if the 1-click installers are mislabeled I’d consider that a bug. But it seems they aren’t mislabeled just poorly labeled. There should be a link to what cards are considered what series by Nvidia standards, or at least a range of release years. If it had said that GF6xxx cards were from 2006 I’d have known that a GTX 650 wasn’t what was meant by a GeForce 6xxx card.

Well, that page is a wiki page. Anybody can edit and improve it (changes have to be approved first, of course).
You should be able to log in there with the same username/password as here… ;)

Or if you don’t want to edit it yourself, there’s a “Discussion” link at the bottom.

But on the 1-click install section, it does say “GeForce 8 and later”.
The “Repository Way” section should be improved, it doesn’t even mention the G03 driver in the YaST part…
And maybe the mention of the 96.xx driver (x11-video-nvidia, for GeForce 4 and older) should be removed completely, because it is not available anymore for 12.3/13.1 (it doesn’t work at all with the latest Xorg).