nvidia-gfxG03-kmp-default 340.96 recent update problems

Hello, I’ve been having problems with the latest update for the nvidia drivers. The 4 packages that seem to cause the problem are:

nvidia-computeG03 (340.96-45.1)
nvidia-gfxG03-kmp-default (340.96_k4.1.12_1-45.1)
nvidia-glG03 (340.96-45.1)
nvidia-uvm-gfxG03-kmp-default (340.96_k4.1.12_1-45.1)

I have been holding them back for some time now, because the few times I let them install (and rebooted afterwards) I ended up with a black X11 screen: no login prompt, nothing, just a black screen (from xdm, I guess). Each time I reverted the changes using snapper (great tool!!!). I was wondering whether anyone else has had this issue and whether there’s a known solution.

Here’s my gfxcard info:

hwinfo --gfxcard
26: PCI 100.0: 0300 VGA compatible controller (VGA)
  [Created at pci.366]
  Unique ID: VCu0.suXSFqhBVG3
  Parent ID: vSkL.vfb7IfXn7N2
  SysFS ID: /devices/pci0000:00/0000:00:01.0/0000:01:00.0
  SysFS BusID: 0000:01:00.0
  Hardware Class: graphics card
  Model: "nVidia GK107 [GeForce GTX 650]"
  Vendor: pci 0x10de "nVidia Corporation"
  Device: pci 0x0fc6 "GK107 [GeForce GTX 650]"
  SubVendor: pci 0x1569 "Palit Microsystems Inc."
  SubDevice: pci 0x0fc6
  Revision: 0xa1
  Driver: "nvidia"
  Driver Modules: "nvidia"
  Memory Range: 0xde000000-0xdeffffff (rw,non-prefetchable)
  Memory Range: 0xc0000000-0xcfffffff (ro,non-prefetchable)
  Memory Range: 0xd0000000-0xd1ffffff (ro,non-prefetchable)
  I/O Ports: 0xe000-0xefff (rw)
  Memory Range: 0xdf000000-0xdf07ffff (ro,non-prefetchable,disabled)
  IRQ: 140 (196723 events)
  Module Alias: "pci:v000010DEd00000FC6sv00001569sd00000FC6bc03sc00i00"
  Driver Info #0:
    Driver Status: nouveau is not active
    Driver Activation Cmd: "modprobe nouveau"
  Driver Info #1:
    Driver Status: nvidia is active
    Driver Activation Cmd: "modprobe nvidia"
  Config Status: cfg=new, avail=yes, need=no, active=unknown
  Attached to: #10 (PCI bridge)
Primary display adapter: #26

The current (working) version of the drivers is 340.96-40.1. Only the last token from the version differs (40 to 45).

I don’t know what else to investigate in order to provide more useful data about the problem…

These are older packages (Leap’s kernel has been updated, and the version number is reflected in the nvidia-kmp packages).

knurpht@knurphtserver:~> date
ma jul 18 13:38:16 CEST 2016
knurpht@knurphtserver:~> rpm -qa | grep nvidia

Running Leap with an NVIDIA card fine on my server / workstation.

I’m not using that driver so I cannot confirm the problem nor say that it doesn’t exist here.
The forum has been quiet about it recently though, so it’s probably not a general issue.

Hard to say anything though without seeing the Xorg.0.log from a failed start.
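For anyone wanting to pull the relevant lines out of the log, here is a minimal sketch. It assumes the usual log locations /var/log/Xorg.0.log and /var/log/Xorg.0.log.old (the previous start), and the helper name `xorg_problems` is just made up for illustration. Xorg marks errors with (EE) and warnings with (WW), so the sketch filters on those markers.

```shell
# Print only the lines of an Xorg log that carry an error (EE) or
# warning (WW) marker. Note: the legend line near the top of the log
# also mentions both markers, so it will show up as well.
xorg_problems() {
    grep -E '\((EE|WW)\)' "$1"
}

# Assumed default locations; adjust if your display manager logs elsewhere.
for log in /var/log/Xorg.0.log /var/log/Xorg.0.log.old; do
    if [ -r "$log" ]; then
        echo "== $log"
        xorg_problems "$log" || echo "(no EE/WW lines found)"
    fi
done
```

The log has to be saved from the failed start itself, before rolling back or rebooting into a working X session, since a new start rotates it.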

Anyway, I would suggest removing the driver packages completely and then installing the new ones fresh, to hopefully rule out an installation problem.
As you use snapper, you should be able to go back to the old driver again if it should not work either then.

One thing you should check before though is whether you have the kernel-devel and kernel-default-devel for the latest (running) kernel installed.
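A rough sketch of that check, in case it helps. It assumes that `uname -r` prints something like 4.1.26-21-default and that the matching package is named like kernel-default-devel-4.1.26-21.1.x86_64; the helper name `devel_matches_kernel` is hypothetical and the string comparison is only a heuristic based on that naming.

```shell
# Succeed if an rpm package name (as printed by "rpm -q") looks like it
# belongs to the given kernel release (as printed by "uname -r").
# Assumption: release "4.1.26-21-default" pairs with a package named
# like "kernel-default-devel-4.1.26-21.1.x86_64".
devel_matches_kernel() (
    running="$1"            # e.g. 4.1.26-21-default
    pkg="$2"                # e.g. kernel-default-devel-4.1.26-21.1.x86_64
    base="${running%-*}"    # drop the flavor suffix: 4.1.26-21
    case "$pkg" in
        *"-$base."*) return 0 ;;
        *)           return 1 ;;
    esac
)

# Only query the system when rpm is available.
if command -v rpm >/dev/null 2>&1; then
    for name in kernel-devel kernel-default-devel; do
        # With several versions installed, only the last one listed is compared.
        pkg="$(rpm -q "$name" | tail -n 1)"
        if devel_matches_kernel "$(uname -r)" "$pkg"; then
            echo "$name matches the running kernel: $pkg"
        else
            echo "$name does not match $(uname -r): $pkg"
        fi
    done
fi
```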

Btw, your card is supported by the latest driver (G04) too, so you may want to switch to that one. But please uninstall the current driver completely first.
The G03 driver has been legacy for quite a while…
Therefore it might not always work with the latest kernel or Xorg, though that should not be a problem in Leap.

Not in the case of the nvidia driver packages though (as you can see in your own output :wink: ).
They always refer to the kernel shipped in the standard repo, but should work fine with the latest kernel from the Update repo too.

In addition to my previous post:
There is a problem with installation of kmp packages if you have virtualbox-guest-kmp-default installed.
So remove that before you update if you have it installed, you don’t need it on the host anyway.
See http://bugzilla.opensuse.org/show_bug.cgi?id=983927
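A small sketch for checking whether that package is present before updating. The helper name `has_guest_kmp` is made up for illustration, and the snippet only reports and prints the removal command rather than running it.

```shell
# Succeed if the package list on stdin contains the VirtualBox guest KMP.
has_guest_kmp() {
    grep -q '^virtualbox-guest-kmp-default-'
}

# Only query the system when rpm is available.
if command -v rpm >/dev/null 2>&1; then
    if rpm -qa | has_guest_kmp; then
        echo "virtualbox-guest-kmp-default is installed."
        echo "On a host it can be removed with: sudo zypper rm virtualbox-guest-kmp-default"
    else
        echo "virtualbox-guest-kmp-default is not installed."
    fi
fi
```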

I do have that version of the kernel available in YaST, but since I am using the G03 drivers, not G04 (like you do), the kernel update is automatically being held back, since there are no suitable versions of G03 for that kernel.

That’s incorrect.
Nothing is (or should be) being held back because of the nvidia driver.

To see what kernel you use, run “uname -a”.

If you mean the difference in the last number (-24.1 vs. -45.1), that’s just a rebuild count, that obviously differs between the G03 and G04 packages in the nvidia repo.
Completely irrelevant though, and says nothing about the (kernel) version.
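To make the parts of such a version string visible, here is a small sketch that splits it up. The layout (driver version, then `_k` plus the kernel version the module was built against, then `-` plus the rebuild count) is an assumption read off the package list in this thread, and `split_kmp_version` is a hypothetical helper name.

```shell
# Split a KMP package version-release string into its parts.
# Assumed layout: <driver>_k<kernel built against>-<rebuild count>
split_kmp_version() (
    full="$1"
    release="${full##*-}"    # text after the last "-": rebuild count
    version="${full%-*}"     # everything before the last "-"
    case "$version" in
        *_k*)
            driver="${version%%_k*}"   # before the first "_k"
            kernel="${version#*_k}"    # after the first "_k"
            ;;
        *)
            driver="$version"          # non-KMP package: no kernel part
            kernel=""
            ;;
    esac
    printf 'driver=%s kernel=%s rebuild=%s\n' "$driver" "$kernel" "$release"
)

split_kmp_version "340.96_k4.1.12_1-45.1"
```

For the string from the first post this separates 340.96 (driver), 4.1.12_1 (the kernel from the standard repo) and 45.1 (rebuild count), so only the last part changed between the working and the new package.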

Hard to say anything though without seeing the Xorg.0.log from a failed start.

I will try to produce that information.

There is a problem with installation of kmp packages if you have virtualbox-guest-kmp-default installed.

I do have virtualbox. Will try that as well.

To see what kernel you use, run “uname -a”.

Here it is:

Linux iganev 4.1.26-21-default #1 SMP PREEMPT Mon Jun 13 13:32:30 UTC 2016 (294632f) x86_64 x86_64 x86_64 GNU/Linux

After removing virtualbox, I updated the kernel (4.1.27-24-default), uninstalled the G03 drivers and installed G04 instead (367.27-24.1), rebooted and everything seems to work fine :slight_smile: Thank you all for the quick reaction and for the help! This community is the best!:peace:

Just in case it wasn’t clear: you only need to remove virtualbox-guest-kmp-default (which only makes sense in a guest anyway).
Removing virtualbox completely (or virtualbox-host-kmp-default, which is needed to use virtualbox) is not necessary.

uninstalled the G03 drivers and installed G04 instead (367.27-24.1)

The G03 update would likely have worked then as well though.
But yeah, personally I would go with the latest driver (G04) if it supports the card.

Yes, but I was a bit paranoid that it may try to load the module upon boot and didn’t want to see it fail. I have reinstalled virtualbox again and it’s ok. :slight_smile:

This may have pulled in virtualbox-guest-kmp-default though which will cause the same problem again on the next nvidia update.

Unless the problem has been fixed by then of course.
There is a virtualbox update in the queue which should get released today or tomorrow…

Now that I come to think of it, probably virtualbox has held back your kernel update. The current version is built against kernel-4.1.26.
The coming update will fix that though, it is a rebuild for 4.1.27.