Leap 42.3 KDE5/Plasma doesn't start after installing Nvidia drivers for GPU

We are using an Nvidia P2000 GPU for graphics computation on a system and need the non-nouveau drivers from Nvidia to use the CUDA tools.

I have recreated this problem on two quite different platforms. Install steps:

  1. Installed openSUSE Leap 42.3 from DVD/USB image.
  2. Installed several other packages from the Leap 42.3 distribution repo. (i.e. no updates repo)
  3. Removed package: drm-kmp-default (at this point the Plasma desktop was working well on an AMD video adapter)
  4. systemctl set-default multi-user.target
  5. Rebooted to multi-user (non-graphical) mode
  6. Installed NVIDIA-Linux-x86_64-384.98.run (which when executed compiles the Nvidia drivers and installs the drivers in the /lib/modules/<kernel> path)
  7. systemctl set-default graphical.target

Upon reboot, the system shows a black screen with a movable mouse pointer (and a small line in the upper left-hand corner) on the AMD adapter.

Below is additional information from one of the machines (three video adapters: onboard, Radeon, and Nvidia P2000). The other machine, a big Lenovo server with only an onboard Matrox adapter and the P2000, is not shown here.

hwinfo --gfxcard

mixertestimm:~ # hwinfo --gfxcard
32: PCI 100.0: 0300 VGA compatible controller (VGA)
  [Created at pci.378]
  Unique ID: VCu0.Sxqts2bYtk3
  Parent ID: vSkL.JGpLEUrIHC3
  SysFS ID: /devices/pci0000:00/0000:00:01.0/0000:01:00.0
  SysFS BusID: 0000:01:00.0
  Hardware Class: graphics card
  Model: "nVidia GP106GL [Quadro P2000]"
  Vendor: pci 0x10de "nVidia Corporation"
  Device: pci 0x1c30 "GP106GL [Quadro P2000]"
  SubVendor: pci 0x10de "nVidia Corporation"
  SubDevice: pci 0x11b3 
  Revision: 0xa1
  Driver: "nvidia"
  Driver Modules: "nvidia"
  Memory Range: 0xfa000000-0xfaffffff (rw,non-prefetchable)
  Memory Range: 0xc0000000-0xcfffffff (ro,non-prefetchable)
  Memory Range: 0xd0000000-0xd1ffffff (ro,non-prefetchable)
  I/O Ports: 0xe000-0xefff (rw)
  Memory Range: 0xfb000000-0xfb07ffff (ro,non-prefetchable,disabled)
  IRQ: 25 (no events)
  Module Alias: "pci:v000010DEd00001C30sv000010DEsd000011B3bc03sc00i00"
  Driver Info #0:
    Driver Status: nouveau is not active
    Driver Activation Cmd: "modprobe nouveau"
  Driver Info #1:
    Driver Status: nvidia_drm is active
    Driver Activation Cmd: "modprobe nvidia_drm"
  Driver Info #2:
    Driver Status: nvidia is active
    Driver Activation Cmd: "modprobe nvidia"
  Config Status: cfg=new, avail=yes, need=no, active=unknown
  Attached to: #8 (PCI bridge)


35: PCI 400.0: 0300 VGA compatible controller (VGA)
  [Created at pci.378]
  Unique ID: YmUS.bu0+vrXepm5
  Parent ID: 3hqH.PlTr6vYXRV5
  SysFS ID: /devices/pci0000:00/0000:00:03.0/0000:04:00.0
  SysFS BusID: 0000:04:00.0
  Hardware Class: graphics card
  Model: "ATI Cedar [Radeon HD 5000/6000/7350/8350 Series]"
  Vendor: pci 0x1002 "ATI Technologies Inc"
  Device: pci 0x68f9 "Cedar [Radeon HD 5000/6000/7350/8350 Series]"
  SubVendor: pci 0x174b "PC Partner Limited / Sapphire Technology"
  SubDevice: pci 0xe316 
  Driver: "radeon"
  Driver Modules: "drm"
  Memory Range: 0xe0000000-0xefffffff (ro,non-prefetchable)
  Memory Range: 0xfb520000-0xfb53ffff (rw,non-prefetchable)
  I/O Ports: 0xd000-0xdfff (rw)
  Memory Range: 0xfb500000-0xfb51ffff (ro,non-prefetchable,disabled)
  IRQ: 53 (338 events)
  Module Alias: "pci:v00001002d000068F9sv0000174Bsd0000E316bc03sc00i00"
  Driver Info #0:
    Driver Status: radeon is active
    Driver Activation Cmd: "modprobe radeon"
  Config Status: cfg=new, avail=yes, need=no, active=unknown
  Attached to: #11 (PCI bridge)


100: PCI a00.0: 0300 VGA compatible controller (VGA)
  [Created at pci.378]
  Unique ID: cuhJ.PqBrBsyVQIB
  Parent ID: x1VA.yZKistTcjo6
  SysFS ID: /devices/pci0000:00/0000:00:1c.6/0000:09:00.0/0000:0a:00.0
  SysFS BusID: 0000:0a:00.0
  Hardware Class: graphics card
  Device Name: "Aspeed Video AST2400"
  Model: "ASPEED AST1000/2000"
  Vendor: pci 0x1a03 "ASPEED Technology Inc."
  Device: pci 0x2000 "AST1000/2000"
  SubVendor: pci 0x15d9 "Super Micro Computer Inc"
  SubDevice: pci 0x0857 
  Revision: 0x30
  Driver: "ast"
  Driver Modules: "drm"
  Memory Range: 0xf8000000-0xf8ffffff (rw,non-prefetchable)
  Memory Range: 0xf9000000-0xf901ffff (rw,non-prefetchable)
  I/O Ports: 0xa000-0xafff (rw)
  IRQ: 18 (37 events)
  Module Alias: "pci:v00001A03d00002000sv000015D9sd00000857bc03sc00i00"
  Driver Info #0:
    XFree86 v4 Server Module: ast
  Config Status: cfg=new, avail=yes, need=no, active=unknown
  Attached to: #40 (PCI bridge)


Primary display adapter: #100



lsmod:

mixertestimm:~ # lsmod
Module                  Size  Used by
af_packet              45056  0 
iscsi_ibft             16384  0 
iscsi_boot_sysfs       20480  1 iscsi_ibft
msr                    16384  0 
raid0                  20480  1 
md_mod                155648  2 raid0
snd_hda_codec_hdmi     57344  1 
snd_hda_codec_realtek    94208  1 
snd_hda_codec_generic    81920  1 snd_hda_codec_realtek
snd_hda_intel          45056  0 
snd_hda_codec         147456  4 snd_hda_codec_realtek,snd_hda_codec_hdmi,snd_hda_codec_generic,snd_hda_intel
snd_hda_core           81920  5 snd_hda_codec_realtek,snd_hda_codec_hdmi,snd_hda_codec_generic,snd_hda_codec,snd_hda_intel
snd_hwdep              16384  1 snd_hda_codec
joydev                 20480  0 
intel_rapl             24576  0 
sb_edac                32768  0 
edac_core              65536  1 sb_edac
x86_pkg_temp_thermal    16384  0 
intel_powerclamp       16384  0 
coretemp               16384  0 
kvm_intel             184320  0 
kvm                   606208  1 kvm_intel
nvidia_drm             53248  0 
nvidia_modeset        843776  1 nvidia_drm
snd_pcm               135168  4 snd_hda_codec_hdmi,snd_hda_codec,snd_hda_intel,snd_hda_core
snd_timer              36864  1 snd_pcm
nvidia              13139968  46 nvidia_modeset
irqbypass              16384  1 kvm
crct10dif_pclmul       16384  0 
crc32_pclmul           16384  0 
crc32c_intel           24576  0 
ghash_clmulni_intel    16384  0 
igb                   217088  0 
ptp                    20480  1 igb
pps_core               20480  1 ptp
iTCO_wdt               16384  0 
iTCO_vendor_support    16384  1 iTCO_wdt
lpc_ich                24576  0 
dca                    16384  1 igb
ipmi_ssif              28672  0 
ipmi_devintf           20480  0 
drbg                   28672  1 
ansi_cprng             16384  0 
aesni_intel           167936  0 
aes_x86_64             20480  1 aesni_intel
lrw                    16384  1 aesni_intel
gf128mul               16384  1 lrw
glue_helper            16384  1 aesni_intel
ablk_helper            16384  1 aesni_intel
cryptd                 20480  3 ghash_clmulni_intel,aesni_intel,ablk_helper
snd                    90112  8 snd_hda_codec_realtek,snd_hwdep,snd_timer,snd_hda_codec_hdmi,snd_pcm,snd_hda_codec_generic,snd_hda_codec,snd_hda_intel
pcspkr                 16384  0 
i2c_i801               28672  0 
mfd_core               16384  1 lpc_ich
shpchp                 36864  0 
ipmi_si                61440  0 
ipmi_msghandler        53248  3 ipmi_ssif,ipmi_devintf,ipmi_si
soundcore              16384  1 snd
fjes                   32768  0 
processor              49152  0 
hid_generic            16384  0 
usbhid                 53248  0 
ext4                  655360  1 
crc16                  16384  1 ext4
jbd2                  118784  1 ext4
mbcache                16384  2 ext4
sr_mod                 24576  0 
cdrom                  61440  1 sr_mod
sd_mod                 57344  7 
amdkfd                143360  1 
amd_iommu_v2           20480  1 amdkfd
radeon               1601536  2 
ast                    61440  1 
i2c_algo_bit           16384  3 ast,igb,radeon
drm_kms_helper        155648  3 ast,radeon,nvidia_drm
syscopyarea            16384  1 drm_kms_helper
sysfillrect            16384  1 drm_kms_helper
sysimgblt              16384  1 drm_kms_helper
fb_sys_fops            16384  1 drm_kms_helper
ahci                   36864  4 
ttm                   106496  2 ast,radeon
libahci                36864  1 ahci
xhci_pci               16384  0 
ehci_pci               16384  0 
ehci_hcd               81920  1 ehci_pci
xhci_hcd              192512  1 xhci_pci
libata                274432  2 ahci,libahci
drm                   393216  8 ast,ttm,drm_kms_helper,radeon,nvidia_drm
usbcore               270336  5 ehci_hcd,ehci_pci,usbhid,xhci_hcd,xhci_pci
usb_common             16384  1 usbcore
wmi                    16384  0 
button                 16384  0 
sg                     40960  0 
dm_multipath           32768  0 
dm_mod                135168  53 dm_multipath
scsi_dh_rdac           20480  0 
scsi_dh_emc            16384  0 
scsi_dh_alua           20480  0 
scsi_mod              249856  8 sg,scsi_dh_alua,scsi_dh_rdac,dm_multipath,scsi_dh_emc,libata,sd_mod,sr_mod
autofs4                45056  2 
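
A long module list like this is easier to read when it is cut down to the graphics-related modules. The following is a minimal sketch, assuming a POSIX shell with awk; the function name `gpu_modules` is made up for illustration, and the module names are simply the ones that appear in this thread.

```shell
# Hypothetical helper: keep only graphics-related rows of `lsmod` output.
# Reads lsmod-style text on stdin; prints "module<TAB>use-count" per match.
gpu_modules() {
    awk '$1 ~ /^(nvidia|nvidia_drm|nvidia_modeset|nouveau|radeon|ast|drm|drm_kms_helper|ttm)$/ { print $1 "\t" $3 }'
}

# Usage on the affected machine:
#   lsmod | gpu_modules
```

In the listing above this would show nvidia, radeon and ast loaded side by side, with no nouveau row.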

A snippet from /var/log/Xorg.0.log (these are the only (EE) errors in the output):

    14.966] (II) LoadModule: "fglrx"
    14.993] (WW) Warning, couldn't open module fglrx
    14.994] (II) UnloadModule: "fglrx"
    14.994] (II) Unloading fglrx
    14.994] (EE) Failed to load module "fglrx" (module does not exist, 0)
    14.994] (II) LoadModule: "ati"
    14.994] (II) Loading /usr/lib64/xorg/modules/drivers/ati_drv.so
    14.995] (II) Module ati: vendor="X.Org Foundation"
    14.995]    compiled for 1.18.3, module version = 7.9.0
    14.995]    Module class: X.Org Video Driver
    14.995]    ABI class: X.Org Video Driver, version 20.0
    14.995] (II) LoadModule: "nvidia"
    14.995] (II) Loading /usr/lib64/xorg/modules/drivers/nvidia_drv.so
    15.041] (II) Module nvidia: vendor="NVIDIA Corporation"
    15.041]    compiled for 4.0.2, module version = 1.0.0
    15.041]    Module class: X.Org Video Driver
    15.050] (II) LoadModule: "nouveau"
    15.051] (WW) Warning, couldn't open module nouveau
    15.051] (II) UnloadModule: "nouveau"
    15.051] (II) Unloading nouveau
    15.051] (EE) Failed to load module "nouveau" (module does not exist, 0)
    15.051] (II) LoadModule: "nv"
    15.051] (WW) Warning, couldn't open module nv
    15.051] (II) UnloadModule: "nv"
    15.051] (II) Unloading nv
    15.051] (EE) Failed to load module "nv" (module does not exist, 0)
    15.051] (II) LoadModule: "ast"
    15.051] (II) Loading /usr/lib64/xorg/modules/drivers/ast_drv.so
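
To pull just the error lines out of a full log rather than picking them out by eye, something like the following could be used. It is a sketch only: the function name `xorg_errors` is hypothetical, and the default path is the one quoted in this thread.

```shell
# Hypothetical helper: list only the (EE) lines from an Xorg log.
# $1 = path to the log (defaults to /var/log/Xorg.0.log).
xorg_errors() {
    # -F treats the pattern as a fixed string, so the parentheses are literal.
    grep -F '(EE)' "${1:-/var/log/Xorg.0.log}"
}

# Usage on the affected machine:
#   xorg_errors
#   xorg_errors /var/log/Xorg.0.log.old
```

The marker legend near the top of the log may also match, since it mentions (EE) as well; that line can be ignored.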

Any suggestions would be helpful.

Do you install kernel-devel?

https://en.opensuse.org/SDB:NVIDIA_the_hard_way

Yes, kernel-devel is installed. (I think it's needed to compile the Nvidia drivers.)

Well, the nvidia kernel module seems to be loaded correctly.

Can you please provide the full /var/log/Xorg.0.log? (via susepaste.org or similar as it is likely too big)

Do you use some custom /etc/X11/xorg.conf or have modified some files in /etc/X11/xorg.conf.d/?
I find it strange somehow that it seems to try loading the AMD drivers (fglrx and radeon/ati) as well.

Also, is there any particular reason why you are trying to install the nvidia driver the “hard way”?
Maybe you would have better luck using the repo?
https://en.opensuse.org/SDB:NVIDIA_drivers

Thanks for the quick response!

I just tried that:

  1. Loaded the machine again with Leap 42.3 (from scratch).
  2. Manually did the “zypper rm drm-kmp-default” and then a reboot. KDE login and Plasma desktop still came up OK.
  3. Installed the nvidia-glG04-390.48-6.1.x86_64 from your repo.
  4. Once finished, rebooted.

The machine now boots to a full black screen with the small “white line” cursor and I get NO keyboard or mouse functionality. (Can’t Ctrl-Alt-F1 or F2 or anything.) Luckily, I can still log in via ssh and make changes. I rebooted again and pressed Esc during the green chevron with the three dots graphic. It switched to the systemd boot text, but even that finally froze with no keyboard function.

Full Xorg.0.log is here: SUSE Paste

I also have a copy of the same file from just before the nvidia-glG04 package (and its prerequisites) were installed.

I have NOT altered any X11 config files. The above Xorg.0.log file was taken from a very fresh install. The only thing I do is install Leap 42.3 with a superset of packages. I log in, add the normal 42.3 OSS repository and the Nvidia repository (with yast2), install nvidia-glG04 (zypper in nvidia-glG04), wait for it to finish, and reboot.

One thing I did notice is that just after zypper finishes the nvidia-glG04 install, the Plasma desktop starts to fail. I could not use the Reboot or Logoff buttons to log out of KDE. I get the “frowny face” error windows saying the application failed. I have to reboot with a “reboot” command from a root prompt.

According to the log, Xorg seems to be using a Radeon card and the radeon driver.
Having nvidia installed breaks Mesa though, and therefore also radeon’s OpenGL support.
This can also be seen in the log:

360.921] (II) Applying OutputClass “Radeon” to /dev/dri/card1
360.921] loading driver: radeon

361.113] (EE) Failed to initialize GLX extension (Compatible NVIDIA X driver not found)

Apparently you have two graphic chips/cards?

What’s the output of '/sbin/lspci -nnk | egrep -iA3 "VGA|3D"'?

You probably have an AMD CPU with integrated graphics.
Make sure you connect your monitor to the nvidia card (and not the mainboard connector), maybe try to disable the integrated graphics in the BIOS/EFI settings if possible.

PS: if you only want to use the nvidia card for CUDA, not for the display, you can (have to) uninstall nvidia-glG04. This contains nvidia’s OpenGL/GLX support, and breaks/replaces Mesa, i.e. radeon’s (and open source drivers’) OpenGL support.
But that package is only needed for displaying 3D graphics/OpenGL.

Tumbleweed and Leap 15.0 use libglvnd to avoid this problem, by allowing different GLX implementations to be installed at the same time and choosing the proper one at runtime, so you may consider upgrading as well.
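
For the CUDA-only setup described in the PS above, the removal could be wrapped in a small guard so that it only runs when the package is actually present. This is a sketch, not a tested procedure: the function name `remove_nvidia_gl` is hypothetical, the package name is the one used in this thread, and it would need to be run as root.

```shell
# Hypothetical helper: remove nvidia's OpenGL/GLX package if it is installed,
# so that Mesa can serve the adapter that drives the display.
remove_nvidia_gl() {
    nvgl_pkg="nvidia-glG04"   # package name as given in this thread
    if rpm -q "$nvgl_pkg" >/dev/null 2>&1; then
        zypper rm "$nvgl_pkg"
    else
        echo "$nvgl_pkg is not installed; nothing to remove"
    fi
}

# Usage (as root):
#   remove_nvidia_gl
```

zypper lists what else it intends to remove before asking for confirmation, so it is worth reading that summary before accepting it.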

Here is the output of the command:

mixertestimm:~ # /sbin/lspci -nnk | egrep -iA3 "VGA|3D"
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:1c30] (rev a1)
        Subsystem: NVIDIA Corporation Device [10de:11b3]
        Kernel driver in use: nvidia
        Kernel modules: nouveau, nvidia_drm, nvidia
--
04:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Cedar [Radeon HD 5000/6000/7350/8350 Series] [1002:68f9]
        Subsystem: PC Partner Limited / Sapphire Technology Device [174b:e316]
        Kernel driver in use: radeon
        Kernel modules: radeon
--
0a:00.0 VGA compatible controller [0300]: ASPEED Technology, Inc. ASPEED Graphics Family [1a03:2000] (rev 30)
        Subsystem: Super Micro Computer Inc Device [15d9:0857]
        Kernel driver in use: ast
        Kernel modules: ast
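
With several adapters in one machine, it can help to reduce this output to one line per adapter showing which kernel driver is bound to it. A minimal sketch, assuming a POSIX shell with awk; the function name `gpu_drivers` is made up for illustration.

```shell
# Hypothetical helper: pair each VGA/3D device with its bound kernel driver.
# Reads `lspci -nnk` text on stdin; prints "bus-address driver" per device.
# Devices without a "Kernel driver in use" line are not printed.
gpu_drivers() {
    awk '
        # A new non-graphics device starts: forget any remembered address.
        /^[0-9a-f]/ && !/VGA compatible controller|3D controller/ { slot = "" }
        # A graphics device starts: remember its bus address.
        /VGA compatible controller|3D controller/ { slot = $1 }
        # Driver line belonging to a remembered graphics device.
        /Kernel driver in use:/ && slot != "" { print slot, $NF; slot = "" }
    '
}

# Usage on the affected machine:
#   /sbin/lspci -nnk | gpu_drivers
```

For the output above this would give three lines: 01:00.0 with nvidia, 04:00.0 with radeon, and 0a:00.0 with ast.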



I have three video adapters in this machine:

  1. ASPEED Technologies (on-board)
  2. Radeon Cedar
  3. NVIDIA P2000

Would like to have the NVIDIA for GPU computations only and display KDE/Plasma on another display. Will pull the Radeon adapter and get back to you.

Ah. I already was wondering why it tries to load the ast driver as well… :wink:

Would like to have the NVIDIA for GPU computations only and display KDE/Plasma on another display.

As written meanwhile, this should be possible by uninstalling nvidia-glG04.
That’s exactly the reason why it is a separate package…

Will pull the Radeon adapter and get back to you.

You will have the same problem with the ASPEED card though.
Only the nvidia will “work” if you (fully) install the nvidia driver (i.e. including nvidia-glG04).

I suggested disabling radeon under the assumption that you actually want to use the nvidia card for the display.

[QUOTE=wolfi323;2864524]
As written meanwhile, this should be possible by uninstalling nvidia-glG04.
That’s exactly the reason why it is a separate package…[/QUOTE]

We need the Nvidia drivers installed in order for the CUDA tools to work and to use this powerful GPU for high speed computation.

You will have the same problem with the ASPEED card though.
Only the nvidia will “work” if you (fully) install the nvidia driver (i.e. including nvidia-glG04).

Yup, tested it. After removing the Radeon adapter, we are in the same state as with the Radeon adapter (actually a little better). I’m back to the “black screen with the working cursor arrow” and the keyboard and mouse are fully functional. (No sddm login screen appears.)

I have now moved this GPU to a “real” x16 PCI slot and [b]SET THE BIOS TO USE THE NVIDIA CARD as the display adapter!![/b]

The on-board (ASPEED) VGA is now disabled once the OS starts and the KDE/Plasma login and Desktop works!!

Thanks for the help! Now I get to see if using the Nvidia adapter as the display adapter gets in the way of the computational processing.

But you shouldn’t need nvidia-glG04 for that.

Could you give me a quick synopsis of the nvidia-glG04 package? What is different in this package as compared to the “NVIDIA the Hard Way” package (use the *.run executable from Nvidia and compile the drivers yourself)?

As I wrote, it contains nvidia’s OpenGL libraries, that are needed to display hardware accelerated 3D graphics.

What is different in this package as compared to the “NVIDIA the Hard Way” package (use the *.run executable from Nvidia and compile the drivers yourself)?

There is no difference really.
Just that the .run installer installs everything unconditionally (and replaces the system’s libraries, from Mesa e.g.), whereas the rpm packages are split up so you can e.g. leave out the OpenGL part which breaks OpenGL support for other drivers.

Thanks Wolfi! That clarifies it. Thanks again for the help.

J.