Problems switching to NVidia Driver on Leap 15

I had quite a bit of trouble switching from Nouveau to the NVidia driver on Leap 15, so I thought I’d pass on what I learned in case it helps someone else. I’m also hoping the install process can be made a little smarter to avoid these problems.

WHAT I DID

Added the “nVidia Graphics Drivers” repo, selected the “nvidia-glG04” package (which cross-tagged all the associated NVidia driver packages), and installed them. Then I rebooted Leap 15.

THE PROBLEMS

  1. The boot-up console reverted to an 80x25 (I think) text mode, even though it previously had a much higher text resolution (during installation I’d selected 2560x1440 for all the resolutions).
  2. (More importantly:) After boot, you’re left with the whole screen flashing at ~2Hz, and X does not come up.

HOW TO MAKE IT STOP

Reboot into runlevel 3 (though you’re still left without a working X server). After doing so, removing /etc/X11/xorg.conf (if present) may get you a working X server, but not one using the NVidia driver.

SOLUTION FOR BOOT CONSOLE RESOLUTION

yast -> Bootloader -> Kernel Parameters -> Console resolution -> Change from “Autodetected by grub2” to a specific resolution (e.g. 2560x1440)
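
To confirm the change took effect without going back into yast, you can read the setting from the GRUB defaults file. This is a minimal sketch in Python, assuming (I have not verified this on Leap 15) that yast stores the console resolution as GRUB_GFXMODE in /etc/default/grub. The helper name grub_setting is made up for illustration.

```python
import shlex

def grub_setting(config_text, key):
    """Return the last value assigned to key in shell-style KEY=value text, or None."""
    value = None
    for line in config_text.splitlines():
        stripped = line.strip()
        # Skip comments and anything that is not an assignment.
        if stripped.startswith("#") or "=" not in stripped:
            continue
        name, _, raw = stripped.partition("=")
        if name.strip() == key:
            # shlex removes the surrounding quotes and any trailing comment.
            parts = shlex.split(raw, comments=True)
            value = parts[0] if parts else ""
    return value
```

Feeding it the contents of /etc/default/grub and asking for "GRUB_GFXMODE" should return the resolution you picked (e.g. 2560x1440) if the assumption above holds.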

(My guess is that blacklisting nouveau somehow crippled the console’s default text mode detection, causing it to fall back to something any VGA would support.)

MORE INFO ON THE FLASHING SCREEN

What I think is going on is that X is failing to start up because it can’t load the NVidia kernel modules. Evidence for this (from /var/log/Xorg.0.log):


    (EE) NVIDIA: Failed to initialize the NVIDIA kernel module. Please see the
    (EE) NVIDIA:     system's kernel log for additional error messages and
    (EE) NVIDIA:     consult the NVIDIA README for details.
    (EE) No devices detected.
    (EE) 
    Fatal server error:
    (EE) no screens found(EE) 
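
If you want to pull just the error lines out of a long Xorg log, something like the following Python sketch works on the excerpt above. The function name xorg_errors is my own, and the optional timestamp prefix it tolerates is an assumption about how other Xorg builds format their logs.

```python
import re

# Matches Xorg log lines tagged as errors, e.g. "(EE) No devices detected."
# An optional "[   12.345]" timestamp prefix is tolerated (assumption).
EE_LINE = re.compile(r"^\s*(?:\[\s*[\d.]+\]\s*)?\(EE\)\s*(.*)$")

def xorg_errors(log_text):
    """Return the non-empty messages from (EE) lines, in log order."""
    errors = []
    for line in log_text.splitlines():
        match = EE_LINE.match(line)
        if match and match.group(1).strip():
            errors.append(match.group(1).strip())
    return errors
```

Pass it the text of /var/log/Xorg.0.log; a message mentioning the NVIDIA kernel module, as in the excerpt, points at the module rather than at the X configuration.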

Now, for some strange reason, when the system fails to start X, it tries to start X again, repeatedly! Why it’s doing this, I have no idea.

I also discovered that the NVidia kernel modules were installed here:

/lib/modules/4.12.14-lp150.12.4-default

whereas the kernel installed with the Leap 15 DVD is 4.12.14-lp150.11-default, so its modules live here:

/lib/modules/4.12.14-lp150.11-default

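This explains why, when I ran “modprobe nvidia” (after installing the NVidia driver packages and rebooting into runlevel 3), it said it couldn’t find an nvidia kernel module: the running kernel was still 4.12.14-lp150.11-default, but the modules had been installed for 4.12.14-lp150.12.4-default.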
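
The mismatch described above can be checked before rebooting. Here is a rough Python sketch that lists which kernel-version directories under /lib/modules actually contain a given module and compares them to the running kernel. The function names are hypothetical, and matching on a file named nvidia.ko (optionally with a compression suffix) is an assumption about how the packages lay out their files.

```python
import os
import platform

def dirs_with_module(modules_root, module_name):
    """Return sorted kernel-version directory names under modules_root that hold module_name."""
    found = []
    prefix = module_name + ".ko"
    for version in sorted(os.listdir(modules_root)):
        version_dir = os.path.join(modules_root, version)
        if not os.path.isdir(version_dir):
            continue
        # Accept "nvidia.ko" as well as compressed forms like "nvidia.ko.xz".
        for _dirpath, _dirnames, filenames in os.walk(version_dir):
            if any(n == prefix or n.startswith(prefix + ".") for n in filenames):
                found.append(version)
                break
    return found

def module_matches_running_kernel(modules_root, module_name, running=None):
    """True if the module exists for the running kernel (platform.release() by default)."""
    running = running or platform.release()
    return running in dirs_with_module(modules_root, module_name)
```

In my case, module_matches_running_kernel("/lib/modules", "nvidia") would presumably have returned False before the Online Update, since the only directory holding the module was the lp150.12.4 one.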

SOLUTION FOR THE FLASHING SCREEN

Once I suspected why the X server was failing to load the NVidia kernel modules, the solution was simple:

Yast -> Online Update

Let it install everything, make sure there’s no “/etc/X11/xorg.conf”, and then reboot.

No 2Hz flashing, X actually comes up, with the nvidia modules loaded in the kernel, and display performance is much improved.

WHY I DON’T RUN ONLINE UPDATE RIGHT AFTER DVD INSTALL

I’ve had problems in the past with packages (e.g. from packman) refusing to install if I let Online Update upgrade some of the packages to newer versions first. For that reason, I always install all of the packages I want installed first (especially 3rd party packages) before I let Online Update upgrade any packages to newer versions.

QUESTIONS

  1. Could the NVidia driver packages be made smarter, so that they refuse to install unless the installed kernel version matches the kernel version the modules were built for?
  2. Why is a failed X startup repeated ad infinitum in Leap 15? It would be helpful if it quit trying after 1 failure.
  3. Why does “init 3” hang on Leap 15?

As far as I can tell, it looks like a GDM bug, or at least a design problem. GDM does not treat an X server startup failure specially, so from GDM’s point of view the session started and finished normally, and it simply starts another session. That is, after all, exactly what GDM is for.

Why does “init 3” hang on Leap 15?

Are you sure it actually “hangs”? Did you try switching to tty (Ctrl-Alt-F1, Ctrl-Alt-F2 etc)?

OP printed out, ready for trying out when I get home tomorrow morning. Will give feedback on how it went :nerd:.

[quote=“arvidjaar,post:2,topic:132203”]

Are you sure it actually “hangs”? Did you try switching to tty (Ctrl-Alt-F1, Ctrl-Alt-F2 etc)?[/QUOTE]

Sorry, I should have qualified that. It hangs the TTY (text console). However, it does not hang the whole OS. I can still switch ttys with Ctrl-Alt-F# and interact with the other terminals.

It’s as if it never completes the transition from runlevel 5 -> 3.

You often have to exit via Ctrl-C after an init level change.

I don’t recall having to do that before (though I am upgrading from openSUSE 13.1).

But just to confirm: yes, in Leap 15.0, sending the INT, TERM, and/or HUP signal to the “init 3” process does seem to terminate it and bring you back to the shell prompt.