Zypper update freezes

A recent update appears to “freeze” my HP Intel 64-bit desktop, where by “freeze” I mean the system stops with ‘dracut’ attempting to ‘include kernel modules’. The cursor can be moved around the screen with the mouse, which I guess is a function of the video card, but clicking anywhere on the screen produces no response at all.

After waiting for at least 15 minutes with no further Konsole output, I interrupted the first attempted update with Ctrl-C, which produced a Konsole message reporting that it was attempting (unsuccessfully) a graceful exit. Updates can only be ended by powering off: holding down the power button on the desktop.

The system can then be made usable by rolling back to a previous snapper image, so unsuccessful attempts are not trivial. Zypper installation of two RPMs since then happily proceeded with no problem.

The admin logs available in YaST don’t go back far enough (two days) to reveal any diagnostic info.

I’d be grateful for any suggestions as I now have five updates queued. I could install them one at a time but I’m not sure that would be much help as it’s probably the first which brings things to a screeching halt.

Is it possible to do a dummy update using zypper which preserves the integrity of the current snapper image?


Based on your description, it is the running of “dracut” to rebuild the “initrd” that is freezing your system.

If you try those updates, the updates will likely install correctly. And then “dracut” is run after they have all been installed.

If this were my system, I would make a copy of the working “initrd”. And then I would update. If it freezes, I would boot from rescue media to restore that copied “initrd” and then see if it boots.
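
A minimal sketch of that backup step, assuming the usual openSUSE layout where /boot/initrd is a symlink to /boot/initrd-<kernel version>. The function name `backup_initrd` and the backup directory are hypothetical, chosen only for illustration:

```shell
#!/bin/sh
# Copy an initrd image to a backup directory and print the copy's path.
# $1: initrd file or symlink (e.g. /boot/initrd)
# $2: backup directory (hypothetical, pick any location outside /boot)
backup_initrd() {
    # Resolve the symlink so the real image is copied, not the link.
    src=$(readlink -f "$1") || return 1
    [ -f "$src" ] || { echo "no such initrd: $1" >&2; return 1; }
    mkdir -p "$2" || return 1
    dest="$2/$(basename "$src").bak"
    cp -p "$src" "$dest" && echo "$dest"
}

# Usage (as root), before running the update:
#   backup_initrd /boot/initrd /root/initrd-backup
```

If the update then leaves an unbootable initrd, the copy can be put back from rescue media under its original name.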


Switch to multi-user, login as root, update and switch back to graphical:

systemctl isolate multi-user.target
zypper update
systemctl isolate graphical.target

Thanks Neil & Karl for your responses, and apologies for the delay in responding…

I’m almost certain that diagnosis is correct, and sure enough, the first queued update when I wrote the base note was to GRUB2. However I now have quite a few more and I’d like to install them all except the one to GRUB2. This should (hopefully!) prove the point, and it might include an update which fixes the original problem. I’ve had this happen before, though the update normally just fails gracefully rather than freezing the system with no processor activity as far as I can tell (maybe waiting for a lock?).

I can do so by simply un-checking the GRUB update in the ‘Software Updates’ list on the task-manager panel and clicking ‘Install Updates’ or something similar in YaST.

But does the task-manager install button call Zypper or YaST? I can’t see any way of excluding a single update (not a package) from a very long list using Zypper from the Konsole.

I thought I’d then run ‘zypper update GRUB2’ with the --dry-run option and have a look at its logfile in /var to see if that sheds any light on the original problem. I’m a little reluctant to tangle with # systemctl, especially with the ‘isolate’ command since it’s dangerous territory unless you’re an expert!

Any comments?

As far as I know, the update applet does the equivalent of “zypper dup”. So unchecking the one update there should do what you wanted.

Running “zypper update --dry-run grub2” probably won’t tell you what you are looking for. In particular, I don’t think it tells you about the scripts that would run after the update.

The Software Updates applet is nice, if it works. If it causes trouble, fix it. Switch from the graphical target to multi-user and run zypper update instead, as already suggested above.

zypper al grub*

This prevents any package whose name begins with “grub” from being added, replaced or removed unless you approve it: when a transaction conflicts with the lock, zypper presents options for how to proceed, and removing the lock is one of them. However, because this lock contains a wildcard, choosing that option makes zypper ignore the lock for the transaction rather than remove it. For the remove option to actually remove a lock, the lock must contain no wildcard character. Thus:

zypper al grub2-x86_64-efi

will only apply to grub2-x86_64-efi, and nothing else.

zypper rl grub2-x86_64-efi

will also remove the lock.

Note: “rl” and “al” are shorthand aliases for “removelock” and “addlock”.

man zypper

The level of support available to openSUSE users from absolute experts such as nrickert, karlmistelberger, and mrmazda is just amazing! This Forum is invaluable, and I hope it survives re-engineering of the Update process post-Leap15.5.

I investigated mrmazda’s suggestion regarding # zypper al with these results.

# zypper ll

# | Name             | Type    | Repository | Comment
--+------------------+---------+------------+--------------------
1 | grub2-x86_64-efi | package | (any)      | ADL: Freezes system

# zypper al --comment "ADL: Freezes system" grub2*
Specified lock has been successfully added.
# zypper ll

# | Name             | Type    | Repository | Comment
--+------------------+---------+------------+--------------------
1 | grub2*           | package | (any)      | ADL: Freezes system
2 | grub2-x86_64-efi | package | (any)      | ADL: Freezes system

# 

None of this seems to affect the task-manager updater (grub2-x86_64-efi etc. was still ticked there). Locks set in zypper are also non-volatile and propagated to YaST where the effect of globbed package names can be seen. All nice and consistent…

However, locking an update in the task-manager updater takes that entire update off the list without doing anything visible to YaST or zypper. And I think that’s what I really want to do, since this problem may result from a contingent update to some package not caught by a zypper lock. Debugging the root cause will be easier after all the other updates are done (successfully :-) ), and they might fix it.

Of course there’s also a possibility the issue has nothing to do with the GRUB2 update…

Does all that sound plausible?


I can’t answer because I have no knowledge of “update set in the task-manager updater” or “task-manager updater”. What I wrote applies to zypper and, AFAIK, should also apply to YaST2 software management. In openSUSE I rarely use anything but rpm or zypper for package management, with the balance by YaST, mostly in the system installation environment.

Your #2 lock is subsumed by your #1.
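
To see why, here is a rough sketch that checks which package names a wildcard lock covers. It assumes that `*` and `?` in a zypper lock behave like shell globs for this purpose; `lock_covers` is a hypothetical helper, not part of zypper:

```shell
#!/bin/sh
# Return success if a package name matches a lock pattern.
# Assumption: lock wildcards behave like shell glob patterns.
# $1: lock pattern, $2: package name
lock_covers() {
    case "$2" in
        $1) return 0 ;;   # unquoted on purpose so it acts as a pattern
        *)  return 1 ;;
    esac
}

# Compare a few names against the wildcard lock from the listing above.
for pkg in grub2 grub2-x86_64-efi grub2-branding-openSUSE systemd; do
    if lock_covers 'grub2*' "$pkg"; then
        echo "$pkg: covered by grub2*"
    else
        echo "$pkg: not covered"
    fi
done
```

Everything the exact-name lock matches is also matched by `grub2*`, so the second lock adds nothing while the first one exists.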


Well it’s back to the drawing-board again. I unchecked the GRUB2 update in the list which appears when the “Software Updates” button on the task-manager panel is clicked and let the rest run. This time the system froze when vlc-vdpau was being installed (from Packman, I hope).

So I powered it down and rolled back the boot image to the pre-update version, which contained the description “zypp(packagekitd)” with “important=yes”.

I’ll post another note when I have more information. I might try updating one at a time to begin with using zypper interactively from the Konsole, maybe Firefox then GRUB2 just in case it’s a resource problem.

After updating single packages using zypper from Konsole, and sometimes entire updates using the Software Updates button on the task-manager panel, 21 of 22 updates installed with no problems.

The one which appears to be responsible is the “recommended” update to systemd: openSUSE-SLE-15.4-2023-4153(1) which fails consistently as described in my base note. The system becomes completely unresponsive with dracut attempting to ‘include kernel modules’.

The only way to recover control is by powering-down the system, rebooting from a snapper rollback, and creating a new system image which is known to be in a consistent state (e.g. all locks are properly initialised).

I’ll lock systemd updates to prevent accidents and do some further testing, and if & when I have any clues as to the root cause I’ll post that information here.


No problem installing it here by using zypper instead of GUI tool:

# grep 2023-4153 /etc/zypp/repos.d/*.repo
# zypper se -s 2023-4153 | grep 2023-4153
i | openSUSE-SLE-15.5-2023-4153 | patch | 1       | noarch | UpdateSLE
# grep 2023-4153 /var/log/zypp/history
2023-10-24 09:16:55|patch  |openSUSE-SLE-15.5-2023-4153|1|noarch|UpdateSLE|moderate|recommended|needed|applied|
#
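
The same check can be wrapped in a small helper. This is only a sketch: the field positions are inferred from the single history line shown above, and `patch_history` is a hypothetical name:

```shell
#!/bin/sh
# Print date, patch name and final state for matching patch records.
# $1: patch id fragment (e.g. 2023-4153)
# $2: history file (normally /var/log/zypp/history)
patch_history() {
    awk -F'|' -v id="$1" '
        # Field 2 is the record type, padded with spaces ("patch  ").
        # Field 3 is the patch name, field 10 the resulting state.
        $2 ~ /^patch/ && index($3, id) { print $1 " " $3 " " $10 }
    ' "$2"
}

# Usage (root may be needed to read the log):
#   patch_history 2023-4153 /var/log/zypp/history
```

An empty result means zypper has no record of that patch being processed on this machine.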

After some reading on systemctl and the boot process, I reverted to a CLI login using the Ctrl-Alt-F1 method, logged in as root, and then ran:

# systemctl isolate multi-user.target
# zypper up systemd

However the result was the same:

dracut: *** Including module: kernel-modules ***

until I attempted to use Ctrl-C, which produced an unsuccessful “attempting to exit gracefully…” response. At that point I physically powered down the system using the power-off button and rolled back to an earlier image.

I notice it’s possible to use systemctl to freeze UNIT and thaw UNIT, and I guess that might be relevant if dracut uses it, but I have no idea how to make further progress at this point. Maybe upgrade to 15.5 and see if the problem is still there?


Just an idea to try: Set the immutable flag on each of your initrds to prevent dracut from successfully regenerating them. It will read the lockdown and skip, reporting no permission, then quit.
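
A sketch of how that could look, assuming the images live in /boot as initrd-<kernel version> and that the filesystem there supports the immutable attribute. `list_initrds` is a hypothetical helper; `chattr` and `lsattr` are the standard tools:

```shell
#!/bin/sh
# List the real initrd images in a directory, one per line.
# Symlinks such as /boot/initrd are skipped, because the flag
# belongs on the regular files they point to.
# $1: directory to search (normally /boot)
list_initrds() {
    for f in "$1"/initrd-*; do
        [ -f "$f" ] && [ ! -L "$f" ] && printf '%s\n' "$f"
    done
    return 0
}

# Usage (as root):
#   list_initrds /boot | while IFS= read -r f; do chattr +i "$f"; done
#   lsattr /boot/initrd-*      # an 'i' in the flags means immutable
# Undo with 'chattr -i' once dracut should be able to rebuild again.
```

Listing first and setting the flag in a separate step makes it easy to review which files will be affected.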

Another: scan through /etc/dracut.conf.d/*.conf or /etc/dracut.conf to be sure there are no corrupted files. Have you ever modified any content of /etc/dracut.conf.d/ or /etc/dracut.conf?
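
One way to do that scan is to print only the lines that actually set something, so stray or corrupted content stands out. A minimal sketch; `active_settings` is a hypothetical name:

```shell
#!/bin/sh
# Print every active (non-blank, non-comment) line of the given
# config files, prefixed with the file name.
active_settings() {
    for f in "$@"; do
        [ -f "$f" ] || continue
        # /dev/null as a second file forces grep to print file names;
        # '|| true' keeps going when a file has no active lines.
        grep -v -E '^[[:space:]]*(#|$)' "$f" /dev/null || true
    done
    return 0
}

# Usage:
#   active_settings /etc/dracut.conf /etc/dracut.conf.d/*.conf
```

A file that contains only comments produces no output at all.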


The only non-standard package I have installed is EpsonScan2 which does have four configuration files in /etc/sane.d so I could try removing the whole package.

Other than that, I haven’t modified any distribution configs in /etc/*.conf. There’s no evidence of malware, and the files themselves can be read by clamAV.

There are 45 configuration files in /etc including dracut.conf and sysctl.conf, though none look to me like kernel configurations.

There are two in /etc/dracut.conf.d/: 99-debug.conf and ostree.conf. The debug configuration file is just a placeholder containing comments, but ostree.conf contains two lines other than comments:

add_dracutmodules+=" ostree systemd "
reproducible=yes

Is it possible the add_dracutmodules+ command is where the problem lies?

However this is getting beyond my O/S pay-grade, and I’m now finding that updates which depend on the systemd update are failing.

Is an upgrade to 15.5 likely to be the answer? Or would it be better to download 15.5 and install it from scratch but using the existing /home partition? The only issue there is that each user’s desktop environment apparently still has to be painfully re-configured from scratch.

# lsinitrd /boot/initrd | grep systemd | wc -l
155

I’d say odds are high with that many lines with string systemd that systemd is not anything you need to configure for addition to initrds. I have no experience using ostree, or add_dracutmodules+=, so can’t even guess what the answer is to whether problem lies there.

The best way forward might be an offline upgrade. Individual user settings shouldn’t have any bearing on the success of package management or upgrading the OS. OTOH, I don’t use 1-click to install anything, thus avoiding the bad repo configs they too often cause. Thus I have a history of very good experiences with zypper/online upgrades with Leap. I take great care with repo management, and don’t upgrade with any user or other iffy repos enabled.

A better recommendation may follow after you show output from zypper lr -d.

Thanks for your help, mrmazda!

I’ve come pretty much to the same conclusion. Fortunately, I downloaded a DVD version of 15.5 and checked the SHA256 a month ago, before this issue arose. Experience and good software engineering practice suggest it’s best to establish the cause, not just find a workaround, but we all have finite lives and there comes a time…

The decision crystallised when I couldn’t find ImageMagick in the KDE application menu even though zypper assured me that package was installed and up-to-date.

The hardware is an HP “Elite” desktop with SSD memory, so I suppose it’s possible main memory has been corrupted for whatever reason. And on that point, this is a “regional” area with lots of very tall trees which seems subject to lightning strikes. Only this afternoon I noticed one around 30-40 metres away with the bark on most of its trunk charred, possibly having been struck and set on fire during a recent thunderstorm, but clearly not by a bushfire.

I’ll post the results of upgrading here.

The repository list is as follows:

# zypper lr -d
#  | Alias                       | Name                                                                                        | Enabled | GPG Check | Refresh | Priority | Type     | URI                                                                           | Service
---+-----------------------------+---------------------------------------------------------------------------------------------+---------+-----------+---------+----------+----------+-------------------------------------------------------------------------------+--------
 1 | Foreign-RPM-repo            | Foreign-RPM-repo                                                                            | Yes     | ( p) Yes  | Yes     |   99     | plaindir | dir:[this saves epsonScan2 RPMs]
 2 | openSUSE-Leap-15.4-1        | openSUSE-Leap-15.4-1                                                                        | Yes     | (r ) Yes  | No      |   99     | rpm-md   | cd:/?devices=/dev/disk/by-id/wwn-0x5001480000000000                           | 
 3 | packman-essentials          | packman-essentials                                                                          | Yes     | (r ) Yes  | Yes     |   90     | rpm-md   | https://ftp.gwdg.de/pub/linux/misc/packman/suse/openSUSE_Leap_15.4/Essentials | 
 4 | repo-backports-debug-update | Update repository with updates for openSUSE Leap debuginfo packages from openSUSE Backports | No      | ----      | ----    |   99     | NONE     | http://download.opensuse.org/update/leap/15.4/backports_debug/                | 
 5 | repo-backports-update       | Update repository of openSUSE Backports                                                     | Yes     | (r ) Yes  | Yes     |   99     | rpm-md   | http://download.opensuse.org/update/leap/15.4/backports/                      | 
 6 | repo-debug                  | Debug Repository                                                                            | No      | ----      | ----    |   99     | NONE     | http://download.opensuse.org/debug/distribution/leap/15.4/repo/oss/           | 
 7 | repo-debug-non-oss          | Debug Repository (Non-OSS)                                                                  | No      | ----      | ----    |   99     | NONE     | http://download.opensuse.org/debug/distribution/leap/15.4/repo/non-oss/       | 
 8 | repo-debug-update           | Update Repository (Debug)                                                                   | No      | ----      | ----    |   99     | NONE     | http://download.opensuse.org/debug/update/leap/15.4/oss/                      | 
 9 | repo-debug-update-non-oss   | Update Repository (Debug, Non-OSS)                                                          | No      | ----      | ----    |   99     | NONE     | http://download.opensuse.org/debug/update/leap/15.4/non-oss/                  | 
10 | repo-non-oss                | Non-OSS Repository                                                                          | Yes     | (r ) Yes  | Yes     |   99     | rpm-md   | http://download.opensuse.org/distribution/leap/15.4/repo/non-oss/             | 
11 | repo-oss                    | Main Repository                                                                             | Yes     | (r ) Yes  | Yes     |   99     | rpm-md   | http://download.opensuse.org/distribution/leap/15.4/repo/oss/                 | 
12 | repo-sle-debug-update       | Update repository with debuginfo for updates from SUSE Linux Enterprise 15                  | No      | ----      | ----    |   99     | NONE     | http://download.opensuse.org/debug/update/leap/15.4/sle/                      | 
13 | repo-sle-update             | Update repository with updates from SUSE Linux Enterprise 15                                | Yes     | (r ) Yes  | Yes     |   99     | rpm-md   | http://download.opensuse.org/update/leap/15.4/sle/                            | 
14 | repo-source                 | Source Repository                                                                           | No      | ----      | ----    |   99     | NONE     | http://download.opensuse.org/source/distribution/leap/15.4/repo/oss/          | 
15 | repo-update                 | Main Update Repository                                                                      | Yes     | (r ) Yes  | Yes     |   99     | rpm-md   | http://download.opensuse.org/update/leap/15.4/oss/                            | 
16 | repo-update-non-oss         | Update Repository (Non-Oss)                                                                 | Yes     | (r ) Yes  | Yes     |   99     | rpm-md   | http://download.opensuse.org/update/leap/15.4/non-oss/                        | 
#

Upgrading what? When? I still see only 15.4 in your repo list. I suggest disabling repos 1 & 2 even for normal usage, and enabling them only if and when needed for specific packages not found in the standard online repos.
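
A quick way to review what is currently enabled is to filter the `zypper lr -d` table. This sketch assumes the column layout of the listing above (Enabled in column 4, URI in column 9); `enabled_repos` is a hypothetical helper:

```shell
#!/bin/sh
# Read 'zypper lr -d' output on stdin and print "number alias URI"
# for every enabled repository.
enabled_repos() {
    awk -F'|' '
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        # Data rows start with a repository number; the header and the
        # separator line do not, so they are skipped.
        trim($1) ~ /^[0-9]+$/ && trim($4) == "Yes" {
            print trim($1), trim($2), trim($9)
        }
    '
}

# Usage:
#   zypper lr -d | enabled_repos
#   zypper modifyrepo --disable Foreign-RPM-repo openSUSE-Leap-15.4-1
```

The `modifyrepo --disable` line uses the aliases of repos 1 and 2 from the listing; a disabled repo stays configured and can be re-enabled with `--enable` when needed.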


I’m planning on “updating” 15.4 by installing 15.5 offline from the DVD, then replacing each user’s directory in /home with their current 15.4 version. This saves having to painfully re-configure each user’s desktop environment, package configurations, etc.

The installation process will be configured to reformat all partitions except /home without changing anything else, but otherwise run the installation as though it’s a new one.

P.S. When the system freezes during a systemd update I suspect it’s waiting forever to acquire a lock on the RPM package, whether it’s being run from the GUI or the CLI, but why this should make the whole system completely unresponsive I don’t know. However I guess it lends mild support to the possibility of hardware problems.
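
If a second console is still reachable during a hang, one thing that could be checked is whether another process holds the package-management lock. This is only a sketch: it assumes libzypp's lock file is /run/zypp.pid and that it contains the PID of the holder, and `lock_holder` is a hypothetical name:

```shell
#!/bin/sh
# Report whether the process named in a PID lock file is still alive.
# $1: lock file (assumed to be /run/zypp.pid for zypper/libzypp)
lock_holder() {
    [ -r "$1" ] || { echo "no lock file"; return 0; }
    # Keep digits only, in case the file has trailing whitespace.
    pid=$(tr -cd '0-9' < "$1")
    if [ -n "$pid" ] && ps -p "$pid" > /dev/null 2>&1; then
        echo "held by pid $pid"
    else
        echo "stale or empty lock file"
    fi
}

# Usage:
#   lock_holder /run/zypp.pid
```

This would only shed light on a zypper-level lock; it says nothing about why the whole machine stops responding.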

You may want to check the drive. Anything can happen: Weihnachtsbescherung | Karl Mistelberger.

Drive checks

  1. smartctl

erlangen:~ # smartctl -t long /dev/sdb
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.5.9-1-default] (SUSE RPM)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Extended self-test routine immediately in off-line mode".
Drive command "Execute SMART Extended self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 12 minutes for test to complete.
Test will complete after Sun Nov 12 09:24:01 2023 CET
Use smartctl -X to abort test.
erlangen:~ # 
  2. dd
erlangen:~ # dd if=/dev/sdb of=/dev/null bs=4M status=progress
63984107520 bytes (64 GB, 60 GiB) copied, 417 s, 153 MB/s
15264+1 records in
15264+1 records out
64023257088 bytes (64 GB, 60 GiB) copied, 417.298 s, 153 MB/s
erlangen:~ # 

File system checks

Check allocated space

  1. fsck
erlangen:~ # btrfs check --force /dev/sdb
Opening filesystem to check...
WARNING: filesystem mounted, continuing because of --force
Checking filesystem on /dev/sdb
UUID: 78383e24-1ed7-45ad-9a6b-65b8b98b93c2
[1/7] checking root items
[2/7] checking extents
[3/7] checking free space tree
[4/7] checking fs roots
[5/7] checking only csums items (without verifying data)
[6/7] checking root refs
[7/7] checking quota groups
found 35480035328 bytes used, no error found
total csum bytes: 33686488
total tree bytes: 985071616
total fs tree bytes: 904445952
total extent tree bytes: 42057728
btree space waste bytes: 142133594
file data blocks allocated: 41277497344
 referenced 41277435904
erlangen:~ # 
  2. block checksums
erlangen:~ # btrfs scrub start /dev/sdb
scrub started on /dev/sdb, fsid 78383e24-1ed7-45ad-9a6b-65b8b98b93c2 (pid=22465)
erlangen:~ # 
erlangen:~ # btrfs scrub status /dev/sdb
UUID:             78383e24-1ed7-45ad-9a6b-65b8b98b93c2
Scrub started:    Sun Nov 12 09:38:27 2023
Status:           finished
Duration:         0:05:53
Total to scrub:   33.96GiB
Rate:             98.51MiB/s
Error summary:    no errors found
erlangen:~ # 
  3. rewrite everything

Start balance

erlangen:~ # btrfs balance start --full-balance /Sandisk
erlangen:~ # 

Check status:

erlangen:~ # btrfs balance status /Sandisk
Balance on '/Sandisk' is running
32 out of about 36 chunks balanced (33 considered),  11% left
erlangen:~ # 

A drive which passes all of the above is a prerequisite for smooth upgrades and maintenance.