System down

Today, as I usually do, I logged into all the computers on my network, using ssh, each in their own instance of Konsole, to do updates.

bart@ASU-X99:~> ssh asu-aio 
Enter passphrase for key '/home/bart/.ssh/id_ed25519':  
Last login: Thu Nov 10 15:40:42 2022 from 10.118.10.5 
Have a lot of fun... 
bart@ASU-AIO:~> su - 
Password:  
ASU-AIO:~ # zypper up

As expected, all went well except for … :open_mouth: My Wife’s Computer!
During the update of her machine, I saw an error code of

installing grub2-2.06-150400.11.12.1.x86_64  message from syslogd@msi-6400 kernel 1630932.449751  c0] watchdog:BUG:softlockup - cpu2 stuck for 249s! kworker/ug:1:19589

I wrote this error down and typed it in here so it is not a screen copy. The cpu numbers kept repeating for cpu 0, 1 and 2 with the time incrementing each time.

As the cpu stuck messages would not stop, I pressed Ctrl-C to stop the process, went to her machine, and tried doing a few things. It was clear that things were not right, so I rebooted. Her machine came back up and seemed to be OK. I repeated the zypper up process and it seemed to go without errors. I rebooted again. It never came up!

Now, when I boot the computer, I get a grub2 prompt and I don’t know how to proceed. I really, really don’t want to do a fresh install!

Bart

Booting from a Grub prompt is possible, but you have to know what to type, and when. For this purpose, your backup of /boot/grub2/grub.cfg can be used. Find in it the line containing “/etc/grub.d/10_linux”, and enter each line that follows the first “menuentry” after it. You can skip the if & fi lines and the lines in between, and also skip the echo lines. Your backup likely doesn’t have the most recently installed kernel version, so if the vmlinuz and initrd names in it have version strings appended, type simply vmlinuz and initrd instead. For the default kernel and initrd there exist symlinks that work just as well as the originals with all their dashes, dots and numbers. When typing the linux (or linuxefi) line, omit the resume= parameter, replacing it with noresume.

If you don’t succeed, try again with a “recovery” mode entry from the advanced options section of grub.cfg.
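As a rough sketch of what such a manual boot might look like, the small shell helper below prints the lines one could type at the grub> prompt. The function name and the label passed to it are made up for illustration, and it assumes that /boot lives on the root filesystem and that the default vmlinuz and initrd symlinks are in place. On a UEFI install the commands may need to be linuxefi and initrdefi instead of linux and initrd.

```shell
#!/bin/sh
# Print the commands one might type at a "grub>" prompt to boot by hand.
# Assumptions (not taken from the thread): the root filesystem can be
# found by its label, and /boot/vmlinuz and /boot/initrd symlinks exist.
# grub_manual_boot_lines is a hypothetical helper name.
grub_manual_boot_lines() {
    root_label="$1"
    cat <<EOF
search --no-floppy --label --set=root $root_label
linux /boot/vmlinuz root=LABEL=$root_label noresume
initrd /boot/initrd
boot
EOF
}

# Example: grub_manual_boot_lines MY-ROOT-LABEL
```

The printed lines are typed one at a time; the system only starts after the final boot command, so a typo in an earlier line can still be corrected by re-entering it.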

Yes but, with Leap 15.3 and Leap 15.4, the “simple” «zypper refresh» called by «zypper patch» and «zypper update» can occasionally cause problems.

  • The reason is the metadata churn in the repositories caused by the ongoing (almost completed) transition to the SLE and SLE Backports repositories.

The Zypp back-end for PackageKit seems to call “zypper refresh --force” which seems to be quite reliable.

  • Therefore, I usually call:
 # zypper refresh --force

before applying any patches or updates …
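A minimal sketch of that habit as a shell function, assuming it is run as root. The function name and the ZYPPER override variable are illustrative only; the override exists so the sequence can be dry-run without touching the system.

```shell
#!/bin/sh
# Force a metadata refresh, then apply updates only if the refresh succeeded.
# ZYPPER is a hypothetical override so the sequence can be dry-run with
# e.g. ZYPPER=echo; by default the real zypper is called.
refresh_then_update() {
    zypper_cmd="${ZYPPER:-zypper}"
    "$zypper_cmd" refresh --force && "$zypper_cmd" update
}

# Dry run:  ZYPPER=echo refresh_then_update
# Real run: refresh_then_update
```

Chaining with && means a failed refresh stops the run before any packages are touched.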

How do I get to see what is in /boot/grub2/grub.cfg? When I start the computer, it goes immediately to a grub prompt. I can press tab and see a list of commands, but none are familiar to me.

:expressionless:

:open_mouth:

:frowning:

:cry:

Knowing that restoring a program most often doesn’t work, I never have backed up the root partition. Guess that should change!

I removed the drive from her machine, stuck it in a USB drive container and the /boot/grub2/grub.cfg file is essentially empty:

# DO NOT EDIT THIS FILE
#
# It is automatically generated by grub2-mkconfig using templates
# from /etc/grub.d and settings from /etc/default/grub
#

### BEGIN /etc/grub.d/00_header ###
### END /etc/grub.d/00_header ###

### BEGIN /etc/grub.d/00_tuned ###
set tuned_params=""
set tuned_initrd=""
### END /etc/grub.d/00_tuned ###

### BEGIN /etc/grub.d/10_linux ###
### END /etc/grub.d/10_linux ###

### BEGIN /etc/grub.d/20_linux_xen ###
### END /etc/grub.d/20_linux_xen ###

### BEGIN /etc/grub.d/20_memtest86+ ###
### END /etc/grub.d/20_memtest86+ ###

### BEGIN /etc/grub.d/30_os-prober ###
### END /etc/grub.d/30_os-prober ###

### BEGIN /etc/grub.d/30_uefi-firmware ###
### END /etc/grub.d/30_uefi-firmware ###

### BEGIN /etc/grub.d/40_custom ###
### END /etc/grub.d/40_custom ###

### BEGIN /etc/grub.d/41_custom ###
### END /etc/grub.d/41_custom ###

### BEGIN /etc/grub.d/80_suse_btrfs_snapshot ###
### END /etc/grub.d/80_suse_btrfs_snapshot ###

### BEGIN /etc/grub.d/90_persistent ###
### END /etc/grub.d/90_persistent ###

### BEGIN /etc/grub.d/95_textmode ###
### END /etc/grub.d/95_textmode ###


No wonder it won’t boot!

Is there a way to run what would normally be grub2-mkconfig from the grub prompt? It doesn’t seem to be in the list of available commands.

So, I guess it’s time for a new drive and a fresh install.

Then we can spend the next few weeks setting up all the links, installing all the packages that are not in the default installation and, and, and.

At least I do have backups of all the data.

Bart

It depends on the cause of the failure. If it’s an old drive and some sort of mechanical failure, then recovery is unlikely. If the drive is out of warranty, a new drive with a fresh install is probably the most sensible way forward.

So, I guess it’s time for a new drive and a fresh install.
If you are able to boot it using the installation or other removable media, mount the / filesystem to /mnt, and give us here output from:

cat /mnt/etc/fstab
blkid
fdisk -l, or
parted -l

we should be able to construct a list of commands to type at a grub prompt, or include in grub.cfg.

Depending on what went wrong to cause the problem, from a rescue boot you may be able to chroot into the installed system. From there you could try text mode yast bootloader, or run grub2-mkconfig -o /boot/grub2/grub.cfg, to correct the booting problem. Either could well accomplish little or less if the update process went haywire for reasons other than mechanical failure.
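The chroot route might look roughly like the sketch below. It is only an outline under stated assumptions: the installed root partition is passed in as an argument (/dev/sdXN is a placeholder), /boot is on that same filesystem, and the function names and the DRY_RUN switch are made up for illustration. By default the commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch: regenerate grub.cfg for an installed system from rescue or live media.
# Assumptions: the installed root partition is passed as $1 (e.g. /dev/sdXN,
# a placeholder) and /boot is on that same filesystem. With DRY_RUN=1
# (a hypothetical switch, the default here) the commands are only printed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

regenerate_grub_cfg() {
    root_dev="$1"
    target="${2:-/mnt}"
    run mount "$root_dev" "$target" || return 1
    # The chroot needs the kernel's virtual filesystems to probe devices.
    for fs in dev proc sys; do
        run mount --bind "/$fs" "$target/$fs" || return 1
    done
    run chroot "$target" grub2-mkconfig -o /boot/grub2/grub.cfg
}

# Print the plan:   regenerate_grub_cfg /dev/sdXN
# Actually run it:  DRY_RUN=0 regenerate_grub_cfg /dev/sdXN   (as root)
```

On a UEFI system the EFI partition may also need to be mounted at /mnt/boot/efi before reinstalling the bootloader itself.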

It’s not a mechanical failure as far as I can tell. As I said, I took the drive out of her machine and put it in a USB enclosure and can mount it and poke around. I posted the contents of /boot/grub2/grub.cfg before. It has the skeleton of a file there, but was obviously not completed.

I’ll put the drive back in her system and try your suggestions. Thanks!

Bart

I put the drive back into my wife’s computer and booted it with an installation disk. From the grub menu, I selected More and then Boot Linux.

I was presented with several pop-up boxes asking which system I wanted and to select a name, and then my wife’s copy of openSUSE came up! All seemed to be well. So, I opened an instance of Konsole and ran su - to become root. (I can’t show any of the actual screen entries.)
I ran grub2-mkconfig -o /boot/grub2/grub.cfg and got the prompt back with no errors.
I did cat /boot/grub2/grub.cfg and found that the file was seemingly untouched. It was basically a skeleton as I showed above.
I ran touch /boot/grub2/junk to be sure I could write to that directory. I could.
I ran cat > /boot/grub2/junk <enter> “This is some stuff” CTRL C
I ran cat /boot/grub2/junk and verified the text was there. It was. I removed the junk file.
Using Yast in the GUI, I reinstalled all of the grub files that were shown as installed.
I ran grub2-mkconfig -o /boot/grub2/grub.cfg again and got the prompt back with no errors.
I did cat /boot/grub2/grub.cfg again and found that the file was seemingly untouched. It was basically a skeleton as I showed above.
Using Yast in the GUI, I reinstalled all of the kernel files that were shown as installed. (quite a few!)
I ran grub2-mkconfig -o /boot/grub2/grub.cfg again and got the prompt back with no errors.
I did cat /boot/grub2/grub.cfg again and found that the file was seemingly untouched. It was basically a skeleton as I showed above.

When I do an update and the kernel is included, it seems the grub.cfg file is changed as the menu is changed. I thought by reinstalling the kernel, that process would be triggered. Either it was not, or wherever grub2-mkconfig gets its information from is amongst the missing.
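As the header of grub.cfg itself says, the file is generated from the templates in /etc/grub.d and the settings in /etc/default/grub, so one quick check is whether any of those template scripts are empty. A minimal sketch, with a made-up helper name:

```shell
#!/bin/sh
# List zero-length regular files directly inside a directory.
# grub2-mkconfig builds grub.cfg from the scripts in /etc/grub.d, so an
# empty script there contributes nothing to the generated file.
# list_empty_files is a hypothetical helper name.
list_empty_files() {
    dir="${1:-/etc/grub.d}"
    find "$dir" -maxdepth 1 -type f -size 0 | sort
}

# Example: list_empty_files /etc/grub.d
```

Any path it prints is a script that will add nothing to the generated menu.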

What now?

When I do a zypper up often dracut is called, especially when there’s a kernel update. Is there a way to call dracut directly?

Never mind! The man page doesn’t seem to say anything about the grub.cfg file.

Bart

What do:

  • cat /etc/default/grub
  • sudo rpm -qa | grep grub
  • ls -Ggh /etc/grub.d/
  • cat /etc/fstab
  • lsblk -f

show?

dracut -f will rebuild the initrd for the running kernel (IIRC).
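When working from rescue or live media, the running kernel belongs to that media, so the installed kernel version may need to be named explicitly. A minimal sketch, assuming each installed kernel has a directory under /lib/modules; the helper name is illustrative and nothing is executed, the commands are only printed for review:

```shell
#!/bin/sh
# Print one dracut command per installed kernel version, taken from the
# directory names under a modules directory (default /lib/modules).
# Nothing is executed; the printed lines are meant to be reviewed and
# then run as root inside the installed system.
print_dracut_commands() {
    modules_dir="${1:-/lib/modules}"
    for kdir in "$modules_dir"/*/; do
        [ -d "$kdir" ] || continue
        kver="$(basename "$kdir")"
        echo "dracut --force --kver $kver"
    done
}

# Example: print_dracut_commands
```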

I got these responses while using a live version of Leap 15.4.
After doing these, I realized that the responses were obtained from the drives of the live disk, not from the hard drive.
I tried to figure out how to get the console to look at the actual hard drive but couldn’t.
The cat output is accurate, though, because I opened Dolphin, made the root partition of the hard drive active, opened the file, copied the contents to kate, saved it to the desktop on the hard drive and then used drag to move it to a USB drive, brought it to my computer and pasted the results here. So, you won’t see a prompt.

cat /etc/default/grub

# If you change this file, run 'grub2-mkconfig -o /boot/grub2/grub.cfg' afterwards to update
# /boot/grub2/grub.cfg.

# Uncomment to set your own custom distributor. If you leave it unset or empty, the default
# policy is to determine the value from /etc/os-release
GRUB_DISTRIBUTOR=
GRUB_DEFAULT=saved
GRUB_HIDDEN_TIMEOUT=0
GRUB_HIDDEN_TIMEOUT_QUIET=true
GRUB_TIMEOUT=8
GRUB_CMDLINE_LINUX_DEFAULT="resume=/dev/disk/by-label/Leap-Swap showopts"
GRUB_CMDLINE_LINUX=""

# Uncomment to automatically save last booted menu entry in GRUB2 environment
# variable `saved_entry'
# GRUB_SAVEDEFAULT="true"

#Uncomment to enable BadRAM filtering, modify to suit your needs
# This works with Linux (no patch required) and with any kernel that obtains
# the memory map information from GRUB (GNU Mach, kernel of FreeBSD ...)
# GRUB_BADRAM="0x01234567,0xfefefefe,0x89abcdef,0xefefefef"

#Uncomment to disable graphical terminal (grub-pc only)
GRUB_TERMINAL="gfxterm"

# The resolution used on graphical terminal
#note that you can use only modes which your graphic card supports via VBE
# you can see them in real GRUB with the command `vbeinfo'
GRUB_GFXMODE="auto"

# Uncomment if you don't want GRUB to pass "root=UUID=xxx" parameter to Linux
# GRUB_DISABLE_LINUX_UUID=true

#Uncomment to disable generation of recovery mode menu entries
# GRUB_DISABLE_LINUX_RECOVERY="true"

#Uncomment to get a beep at grub start
# GRUB_INIT_TUNE="480 440 1"
GRUB_BACKGROUND=
GRUB_THEME=/boot/grub2/themes/openSUSE/theme.txt
SUSE_BTRFS_SNAPSHOT_BOOTING="true"
GRUB_USE_LINUXEFI="true"
GRUB_DISABLE_OS_PROBER="false"
GRUB_ENABLE_CRYPTODISK="n"
GRUB_CMDLINE_XEN_DEFAULT="vga=gfx-1024x768x16"

sudo rpm -qa | grep grub

MSI-6400:/ # rpm -qa | grep grub
grub2-systemd-sleep-plugin-2.06-150400.11.12.1.noarch
grub2-2.06-150400.11.12.1.x86_64
ruby2.5-rubygem-cfa_grub2-2.0.0-1.55.x86_64
grub2-branding-openSUSE-15.4.20220322-lp154.2.3.noarch
grub2-snapper-plugin-2.06-150400.11.12.1.noarch
grub2-i386-pc-2.06-150400.11.12.1.noarch
grub2-x86_64-efi-2.06-150400.11.12.1.noarch
MSI-6400:/ #


ls -Ggh /etc/grub.d/

MSI-6400:/ # ls -Ggh /etc/grub.d
total 100K
-rwxr-xr-x 1  11K Oct 26 02:12 00_header
-rwxr-xr-x 1  14K Oct 26 02:12 10_linux
-rwxr-xr-x 1  18K Oct 26 02:12 20_linux_xen
-rwxr-xr-x 1 1.9K Oct 26 02:12 20_memtest86+
-rwxr-xr-x 1  13K Oct 26 02:12 30_os-prober
-rwxr-xr-x 1 1.4K Oct 26 02:12 30_uefi-firmware
-rwxr-xr-x 1  700 May  8  2022 35_fwupd
-rwxr-xr-x 1  214 Oct 26 02:12 40_custom
-rwxr-xr-x 1  215 Oct 26 02:12 41_custom
-rwxr-xr-x 1  937 Oct 26 02:12 80_suse_btrfs_snapshot
-rwxr-xr-x 1 1.3K Oct 26 02:12 90_persistent
-rwxr-xr-x 1  270 Oct 26 02:12 95_textmode
-rw-r--r-- 1  483 Oct 26 02:12 README
MSI-6400:/ #


cat /etc/fstab

LABEL=Leap-Swap               swap                swap  defaults              0  0
LABEL=42-3-Root               /                   ext4  acl,user_xattr        1  1
LABEL=42-3-Home               /home               ext4  acl,user_xattr        1  2
UUID=7488-945C                /boot/efi           vfat  umask=0002,utf8=true  0  0
#
#
DEL-OSS:/home/common          /mnt/Server         nfs   noauto,nofail,user    0  0
#
#
SYN-1520:/volume1/Common      /mnt/NAS-Common     nfs   noauto,nofail,user    0  0
SYN-1520:/volume1/Vi          /mnt/NAS-Vi         nfs   noauto,nofail,user    0  0
#
# This entry is for mounting the backup folder on the NAS
SYN-1520:/volume1/Backup/Vinetta /mnt/NAS-Backup  nfs   noauto,nofail,user    0  0



lsblk -f

MSI-6400:/ # lsblk -f
NAME   FSTYPE   FSVER            LABEL                       UUID                                 FSAVAIL FSUSE% MOUNTPOINTS
loop0  squashfs 4.0                                                                                     0   100% /run/overlay/squashfs_container
loop1  ext4     1.0                                          1a2e71c3-f447-4490-b632-d3d25e2b9785      1G    72% /run/overlay/rootfsbase
sda
├─sda1 vfat     FAT16                                        7488-945C
├─sda2 swap     1                Leap-Swap                   bb26c287-f536-4b83-8682-54bbf61a9ac1
├─sda3 ext4     1.0              42-3-Root                   30b70a76-da05-42b0-9c6e-2835e735235c   34.2G    28% /run/media/linux/42-3-Root
└─sda4 ext4     1.0              42-3-Home                   10f916bd-ddb5-4dd6-974b-a5449df741ed  163.3G     4% /run/media/linux/42-3-Home
sdb    iso9660  Joliet Extension openSUSE_Leap_15.4_KDE_Live 2022-11-09-08-19-39-00
├─sdb1 iso9660  Joliet Extension openSUSE_Leap_15.4_KDE_Live 2022-11-09-08-19-39-00                     0   100% /run/overlay/live
├─sdb2 vfat     FAT16            BOOT                        1310-C9F3
└─sdb3 ext4     1.0              cow                         84c1bc0a-ce5a-4269-a71d-f263acbe7118   26.6G     1% /run/overlay/overlayfs
sdc    exfat    1.0                                          F33E-E4FA                              57.3G     0% /run/media/linux/F33E-E4FA
sr0
MSI-6400:/ #

show?

dracut -f will rebuild the initrd for the running kernel (IIRC).

Yup! I found that out just after you posted.

I checked the contents of /etc/default/grub, /etc/grub.d on two of my desktops and they are pretty much the same. The differences are obvious. So, I think I’m going to open /boot/grub2/grub.cfg and manually recreate it based on the two working copies. For most of the files in /etc/grub.d, it looks like I can just search for the “cat <<” line and copy everything up to the EOF.

I’ll be sure to let you know how it works out.

It’s 04:00 hours here and my windows keep closing on me, I’m calling it quits for tonight!

Bart

I WON! rotfl!

Here is what I think happened:
During the update, something went wrong and the process stopped in the middle. (That’s obvious!)
Right at the spot where the grub configuration files were being manipulated.
All of the files in /etc/grub.d/ had been created as empty files, and *.rpmnew files had been created alongside them. (That should have been the tip-off.)
There must be a process that merges the original file and the .rpmnew files that didn’t run.

What I did:
I booted a copy of the installation disk, selected More->Boot Linux, and selected the real copy on the hard disk.
As root, I ran an instance of kate then:
opened /etc/grub.d/00_header
opened /etc/grub.d/00_header.rpmnew, Ctrl-A, Ctrl-C
moved to /etc/grub.d/00_header, Ctrl-V, Ctrl-S
closed both files.
Repeated the above process for all files that had a .rpmnew extension.
Ran grub2-mkconfig and could see that I was successful!
Ran grub2-mkconfig -o /boot/grub2/grub.cfg
Rebooted the computer and all is well!
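
The same copy-and-paste steps could also be done from a console. The sketch below is not what was actually run in this thread; it is a hedged outline with a made-up helper name, and it deliberately leaves alone any file that still has content, since those would need a manual merge.

```shell
#!/bin/sh
# For every FILE.rpmnew in a directory, copy it over FILE when FILE is
# missing or empty. cp -p keeps the mode of the .rpmnew file, which matters
# because grub2-mkconfig only runs scripts that are executable.
# restore_from_rpmnew is a hypothetical helper; run it as root on
# /etc/grub.d only after checking that the .rpmnew files look sane.
restore_from_rpmnew() {
    dir="${1:-/etc/grub.d}"
    for new in "$dir"/*.rpmnew; do
        [ -f "$new" ] || continue
        orig="${new%.rpmnew}"
        # Skip files that still have content; those need a manual merge.
        [ -s "$orig" ] && continue
        cp -p "$new" "$orig" && echo "restored $orig"
    done
}

# Example: restore_from_rpmnew /etc/grub.d
```

Afterwards, running grub2-mkconfig without -o prints the generated configuration to the terminal, which is a safe way to confirm the menu entries are back before writing /boot/grub2/grub.cfg.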

I should have investigated this much sooner as none of my other computers had any .rpmnew files. Only the one that was problematic.

Anyway, I didn’t have to do a fresh install!

Bart

I’m usually happier to fix what broke than to toss it and install anew and/or spend money. It’s my shade of green! Congratulations!

Me too! Thanks a bunch for your help!

Bart