KDE startup hangs after zypper dup

Hello, all.

openSUSE Tumbleweed was running fine until a recent zypper dup update (on Thursday 4th March), but now it gets past the disk decryption step and then the KDE startup hangs. A non-blinking cursor just sits at the top-left of the screen, and no further progress is ever made.

I can switch to tty1, log in, and access the root directories and my encrypted home directory fine, so I don’t believe there’s anything wrong with the disk system or with the Linux core. It’s just the KDE system which fails to make an appearance.

One of the last things seen in /var/log/boot.log is “Starting X Window system”. And in /var/log/Xorg.0.log I can’t spot anything which looks like an error, the last line being “Suspending AIGLX clients for VT switch”.
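A quick way to double-check Xorg.0.log for problems is to filter it for the standard X.Org severity markers, (EE) for errors and (WW) for warnings. The following is only a minimal sketch: the function name xorg_problems is made up for illustration, and it assumes each log line starts with the usual bracketed timestamp (or with the marker itself).

```shell
# Print only the error (EE) and warning (WW) lines from an Xorg log.
# xorg_problems is a hypothetical helper name, not part of X.Org.
# Assumption: each log line starts with an optional "[ timestamp ] " prefix
# followed by the marker; this also skips the marker legend in the log header.
xorg_problems() {
    log="${1:-/var/log/Xorg.0.log}"
    grep -E '^(\[[^]]*\] )?\((EE|WW)\)' "$log"
}
```

For example, xorg_problems /var/log/Xorg.0.log.old would check the log from the previous X session. An empty result only means that no lines carried those markers, not that X started correctly.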

I have successfully run zypper dup from the command line, and other than having to change vendor from packman to openSUSE for several packages, nothing was unusual. But the update did not fix the problem.

I’ve also run the openSUSE live disk which I used to install three weeks ago, and ran the update and repair options, but that did not fix the problem.

Can someone give me a clue as to what to try next in order to diagnose and fix this sudden failure of KDE Plasma to load?

System:

  • Intel Core i5-4460
  • NVIDIA GTX 970 (using proprietary NVIDIA driver)
  • 16GiB RAM
  • Samsung 970 EVO Plus 1 TB PCIe NVMe M.2
  • Gigabyte Z97X Gaming 3 motherboard

(Note that I’ve posted this same question on the Unix & Linux channel on StackExchange, but so far it’s not had any responses.)

As additional information, I just ran zypper dup again, it fetched a new kernel and appeared to fetch new graphics drivers, but still the KDE startup hangs.

Any ideas what else I can try? Is there a way to completely reinstall openSUSE over the root partition in a way which will not touch the home partition?

Yes, just do not format home, and be sure that partition is set to mount at /home. Note that this assumes a separate home partition. The default these days is to have /home on root when you use BTRFS, so you have to override that for a different setup.

Luckily I found this page about system recovery before attempting to reinstall openSUSE. From the GRUB menu I chose to boot from a read-only snapshot, selected a snapshot far enough in the past that it was not affected by the problems caused by updates, booted into KDE to confirm that everything was running okay, and then executed “sudo snapper rollback” to have this working snapshot replace the broken one. Then I rebooted into a working system. Unfortunately, when I ran “sudo zypper dup” again, it immediately broke KDE again on the next boot. So there’s something in that batch of updates which is corrupting the window or desktop system, at least for me.

Could it be something to do with nvidia?

To be sure: you get the login screen, and when you then log in, the screen reverts to the non-blinking cursor on a black screen? The openSUSE lightbulb is never shown?

To clarify, this is what happens:

1. The GRUB decryption prompt appears, and I enter the decryption key.
2. GRUB appears and I select openSUSE.
3. The disk decryption prompt appears, and I enter the decryption key.
4. The page blanks, and the (low-resolution) cursor appears at the top-left, but (instead of blinking a couple of times and then being replaced by the high-resolution screen) the cursor is frozen, and no further progress is ever made towards the window system.
5. After a couple of minutes I give up and hit Ctrl+Alt+F1 to drop into tty1, log in to the console, and then issue sudo reboot now to restart the machine.

So I never make it as far as the KDE login screen, because the terminal (tty7) freezes before the high-resolution screen appears. Which might mean it’s an NVIDIA driver problem, but could also mean that any of a large number of other window system or KDE components are damaged.

I’ve just noticed this page which suggests that Tumbleweed went through a massive overhaul. But the date on that news is a few days after my trouble started, so I’m not sure that the two are related.

However, I can say that I can use sudo snapper rollback to put my system (binaries, configuration files, drivers) back to the way it was, and then I can get into KDE fine. But as soon as I run sudo zypper dup again, my system hangs before reaching the window system on the next reboot. No matter how much time (and new updates) I allow to pass, updating still breaks openSUSE on my machine. So something in the new kernel, or the new xorg or KDE packages, is either getting corrupted or is incompatible with my system.

I can’t go without updates forever, so eventually I’m going to have to replace openSUSE Tumbleweed with something else. Which is frustrating, because it’s been fine for the last few years, up until now.

Hi
So you’re at snapshot < 20210305?


cat /etc/os-release | grep VERSION_ID

0305 had the second glibc update, so it should be a big one… Are you sure you’re not getting kicked out of the update partway through? I would suggest switching to the tty and running:


zypper -vvv dup

Does it finish now? Also, did you try booting with the GRUB boot option nomodeset, if you think it’s the nvidia driver that’s the issue?

Pretty good you can switch back to an older snapshot and things work, but yes, quite frustrating.
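The snapshot check above can also be scripted. This is only a sketch: the helper names are invented here, and it assumes VERSION_ID in /etc/os-release is a plain YYYYMMDD snapshot date, as it is on Tumbleweed.

```shell
# Hypothetical helpers for comparing the installed Tumbleweed snapshot
# against a given snapshot date such as 20210305.

# Print the numeric VERSION_ID from an os-release file (default: /etc/os-release).
tw_snapshot_id() {
    file="${1:-/etc/os-release}"
    sed -n 's/^VERSION_ID="\{0,1\}\([0-9][0-9]*\)"\{0,1\}$/\1/p' "$file"
}

# Succeed if the installed snapshot is older than the given date.
# Usage: is_before_snapshot 20210305 [os-release-file]
is_before_snapshot() {
    id=$(tw_snapshot_id "${2:-/etc/os-release}")
    [ -n "$id" ] && [ "$id" -lt "$1" ]
}
```

After a rollback, is_before_snapshot 20210305 && echo "still on a pre-0305 snapshot" would confirm which side of that update the running system is on.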

If you can fall back to the console, you should be able to get the boot log (journalctl -b/dmesg), and that should give indications of what the problem is, I think. Just save it on a USB stick or a partition that is not part of the BTRFS filesystem.
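Saving both logs from the console could look roughly like this. It is a sketch, not a tested recipe: save_boot_logs is a hypothetical helper name, and the destination directory (for instance on a mounted USB stick) is whatever you pass in. Run it as root so both logs are readable.

```shell
# Save the current boot's journal and the kernel ring buffer to a directory,
# e.g. on a mounted USB stick. save_boot_logs is a hypothetical helper name.
save_boot_logs() {
    dest="$1"
    mkdir -p "$dest" || return 1
    # --no-pager stops journalctl from opening a pager and waiting for input.
    journalctl -b --no-pager > "$dest/journal-boot.txt" 2>&1 \
        || echo "journalctl failed; see $dest/journal-boot.txt" >&2
    dmesg > "$dest/dmesg.txt" 2>&1 \
        || echo "dmesg failed; see $dest/dmesg.txt" >&2
    sync   # flush writes before the stick is unmounted
}
```

With a stick mounted at, say, /mnt/usb (an assumed mount point), save_boot_logs /mnt/usb/logs writes journal-boot.txt and dmesg.txt there.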

Yes, I’m stuck on a snapshot based on Tumbleweed from just before 5th March.

I ran zypper -vvv dup and it completed without showing any sign of compilation failure.

I booted using nomodeset and it simply took me to a command prompt (tty1) and showed no sign that it was trying to boot the window system. I don’t know what that tells us, but it’s no better than booting without nomodeset.

And journalctl -b shows only about a half-dozen lines, ending with a “startup finished” line. (I couldn’t get journalctl -b /dmesg to execute.)

Whatever is wrong with the upgrade process appears to be very stubborn. What is the safest way of installing Tumbleweed from scratch (so that my home partition is not put at risk of change/deletion)?

That should be either journalctl -b and/or dmesg | less. Other options would be journalctl -b | susepaste and/or dmesg | susepaste, so that we here can see. Don’t try to paste either journal or dmesg here directly. They are too voluminous. If susepaste is too ornery, try uploading to http://pastebin.com/ instead.

Is your /home part of the / BTRFS filesystem on one partition, or a separate partition? If separate, saving home is simple: when in partitioner, direct that filesystem to be mounted on /home and do not format it. You may need to select the expert partitioner to do that. Expert is the only partitioner mode I ever use. If /home is part of BTRFS, then it needs to be backed up before installing, then restored afterward.
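If it is not obvious which case applies, /etc/fstab can answer it. A rough sketch follows; home_is_separate is a made-up helper, and it assumes the / and /home entries name their device the same way (for example both by UUID), so that a BTRFS /home subvolume shows the same first field as root.

```shell
# Succeed if the fstab mounts /home from a different device than /.
# home_is_separate is a hypothetical helper. Assumption: both entries name
# their device the same way (e.g. both by UUID), so a BTRFS /home subvolume
# has the same first field as the root filesystem.
home_is_separate() {
    fstab="${1:-/etc/fstab}"
    awk '
        /^[[:space:]]*#/ { next }          # skip comment lines
        $2 == "/"        { root = $1 }     # device of the root filesystem
        $2 == "/home"    { home = $1 }     # device of /home, if listed
        END { exit !(home != "" && home != root) }
    ' "$fstab"
}
```

Something like home_is_separate && echo "separate /home" || echo "back up /home first" then tells you which of the two procedures above applies.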

Something else that might be worth a try: changing from sddm to lightdm or vice versa: update-alternatives --config default-displaymanager.

One more thing to try is to log in as root and try startx. This could narrow the source of the problem to either X itself, or the login manager. If startx works, be sure not to access the internet with it, other than using YaST2 or zypper, and exit X ASAP.

I never worked out what was causing the upgrade obstacle, so I reinstalled openSUSE Tumbleweed on top of itself (deleting and replacing the Linux root partition and the EFI partition, but leaving the home partition and the Windows partitions). That deleted the Windows entry in GRUB, so it took another day of hunting around to find the magic recipe to fix the Windows BCD in the EFI partition, and then regenerate the GRUB configuration so that the “Windows Boot Loader” reappeared as an option. Now I just need to work out why installing the NVIDIA drivers causes various module load failures during xorg startup. Anyone else get the impression that humans have a need to create things more complicated than they can really handle?

Life can be simpler by not choosing to install non-FOSS software, aka NVidia’s proprietary drivers.

I have TW on 5 PCs using various NVidia GPUs, and have never installed a proprietary driver on any of them. I also have others using Intel and AMD GPUs. Other than the older ones of any of the three brands being slower, I don’t notice differences among any of them.

Yeah, my attempt to install NVIDIA caused a boot failure (this time Xorg.0.log showed clear errors relating to module loading, and I suspect this might be because the MOK util step never occurred) so I’m currently using the default (Mesa?) driver. The default driver seems absolutely fine until the fancy application selector or the logout overlay appears, and then the mouse cursor drags quite badly (I think because those overlays use fancy background blur).

Most often any issues with the nvidia driver and TW updates can be cleared up by a forced reinstall of anything installed from the nvidia repo. I just switch to a text console (ctrl-alt-F1), log in as root, and either use curses YaST or zypper to reinstall the driver, for example:


**% zypper se --installed-only -r NVIDIA**
Loading repository data...
Reading installed packages...

S  | Name                      | Summary                                                               | Type
---+---------------------------+-----------------------------------------------------------------------+--------
i+ | nvidia-computeG05         | NVIDIA driver for computing with GPGPU                                | package
i+ | nvidia-gfxG05-kmp-default | NVIDIA graphics driver kernel module for GeForce 600 series and newer | package
i+ | nvidia-glG05              | NVIDIA OpenGL libraries for OpenGL acceleration                       | package
i+ | x11-video-nvidiaG05       | NVIDIA graphics driver for GeForce 600 series and newer               | package

**% zypper se --installed-only -r NVIDIA | awk '$1 == "i+" { print $3 }' | xargs zypper in --force**
Loading repository data...
Reading installed packages...
Forcing installation of 'x11-video-nvidiaG05-460.73.01-39.2.x86_64' from repository 'NVIDIA'.
Forcing installation of 'nvidia-computeG05-460.73.01-39.2.x86_64' from repository 'NVIDIA'.
Forcing installation of 'nvidia-gfxG05-kmp-default-460.73.01_k5.12.0_1-39.8.x86_64' from repository 'NVIDIA'.
Forcing installation of 'nvidia-glG05-460.73.01-39.2.x86_64' from repository 'NVIDIA'.
Resolving package dependencies...

The following 4 packages are going to be reinstalled:
  nvidia-computeG05 nvidia-gfxG05-kmp-default nvidia-glG05 x11-video-nvidiaG05

4 packages to reinstall.
Overall download size: 0 B. Already cached: 180.2 MiB. No additional space will be used or freed after the operation.
**Continue? [y/n/v/...? shows all options] (y): **y


The install has to recompile the nvidia kernel module and the initrd, so this may take a while. Sometimes you can get away with just a forced reinstall of the relevant kernel module: nvidia-gfxG05-kmp-default. Normally I think the nvidia kernel module is rebuilt by a post install script whenever a new kernel is installed, so the above should not be necessary, but I think sometimes a chicken-and-egg situation defeats the automation.

The only time a forced reinstall fails is when the kernel APIs have changed and nvidia has not yet caught up, in which case I just use a boot menu option for the previous kernel until nvidia catches up (TW supports multi-boot kernels). If it’s going to take a while, I’ll change the default boot kernel via YaST.
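One way to see whether the module rebuild described above actually happened for a given kernel is to look for the module file under that kernel’s module tree. This is a sketch under assumptions: has_nvidia_module is a hypothetical helper, and it only checks that a file named nvidia*.ko* exists, not that it loads.

```shell
# Succeed if an NVIDIA kernel module file exists for the given kernel release.
# has_nvidia_module is a hypothetical helper; it only checks that a file
# named nvidia*.ko* exists under the modules tree, not that it will load.
# Usage: has_nvidia_module [kernel-release] [modules-root]
has_nvidia_module() {
    krel="${1:-$(uname -r)}"        # default: the running kernel
    root="${2:-/lib/modules}"       # default: the standard modules tree
    [ -n "$(find "$root/$krel" -name 'nvidia*.ko*' 2>/dev/null | head -n 1)" ]
}
```

From a text console, has_nvidia_module || echo "no NVIDIA module for $(uname -r)" gives a quick hint as to whether a forced reinstall of nvidia-gfxG05-kmp-default is worth trying.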

You should never have to resort to a reinstall to sort out graphics issues. Personally I want the best and most versatile hardware and drivers and have stuck with nvidia. Support here is pretty good and I’ve found the nvidia devs and support people to be really helpful too.

# inxi -GISxy
System:
  Host: p5bse Kernel: 5.10.16-1-default x86_64 bits: 64 compiler: gcc
  v: 10.2.1 Desktop: KDE Plasma 5.21.4 Distro: openSUSE Tumbleweed 20210504
Graphics:
  Device-1: **NVIDIA** GF119 [NVS 310] vendor: Hewlett-Packard **driver: nouveau**
  v: kernel bus-ID: 01:00.0
  Display: x11 server: X.Org 1.20.11 **driver: loaded: modesetting**
  unloaded: fbdev,vesa resolution: 1: 2560x1440~60Hz 2: 2560x1080~60Hz
  OpenGL: renderer: llvmpipe (LLVM 12.0.0 128 bits) v: 4.5 **Mesa** 21.0.2
  direct render: Yes
Info:...Shell: Bash  v: 5.1.4 inxi: 3.3.04

Nouveau (kernel), Modesetting (X foundation) & Mesa (X high-level) are all in use here.