A slow descent to hell

ultome · November 19, 2023, 10:16pm

Hello,
My Tumbleweed was working perfectly and was my favorite distro, with KDE Plasma.
However, things got weird once I started using a custom script to suspend, power off and reboot my system. Why did I use a custom script? I just wanted to make sure I cleanly unmounted my rclone remote (a ProtonDrive) and stopped any background sync with it. In addition to that, I started using sudo systemctl at the end because (I don’t know why, maybe I did a bad configuration job) ProtonVPN required sudo authentication to be turned off… I know it’s a very bad idea now, but at the time it seemed just like a temporary fix or shortcut.
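The script itself was not posted in this thread. As a rough idea of what such a wrapper might look like, here is a minimal sketch. The function names, the `RCLONE_MOUNT` path and the `DRY_RUN` switch are all assumptions for illustration, and stopping a background sync is left out because nothing here says how it was started. By default it only prints what it would do.

```shell
#!/bin/sh
# Hypothetical wrapper; NOT the script from this thread.
# RCLONE_MOUNT is an assumed mount point for the rclone remote.
# DRY_RUN=1 (the default) only prints the commands instead of running them.
RCLONE_MOUNT="${RCLONE_MOUNT:-$HOME/ProtonDrive}"
DRY_RUN="${DRY_RUN:-1}"

# Run a command, or just print it when DRY_RUN is 1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

clean_power_action() {
    action="$1"   # suspend, poweroff or reboot
    case "$action" in
        suspend|poweroff|reboot) ;;
        *) echo "unknown action: $action" >&2; return 1 ;;
    esac
    # Unmount the FUSE mount created by `rclone mount`; stop if that fails.
    run fusermount -u "$RCLONE_MOUNT" || return 1
    # Hand the power action over to systemd.
    run systemctl "$action"
}

clean_power_action suspend
```

On a typical desktop session, `systemctl suspend` is normally permitted without sudo, so a wrapper like this would not need root by itself.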

So then suspending started to malfunction: my PC locked and froze instead of suspending.

Then one day, I had to reboot in front of a black screen telling me that the locking function was broken (or something like this, I wasn’t smart enough to take a screenshot…).

And finally… I can’t boot anything from Tumbleweed’s Grub anymore. Kernel Panics everywhere!

So now I’m stuck with my backup (gaming oriented) Fedora dual-boot, which is very painful (Gnome seems like Windows XP when you come from KDE Plasma…).

I have no idea how to debug this. I can successfully chroot into my Tumbleweed from Fedora, so at least we have that…

All and any help would be appreciated! Thanks.

mrmazda · November 19, 2023, 11:22pm

Give us a bit of information to work with. From chrooting:

efibootmgr
zypper lr -dE
sudo inxi -CSMnaz --vs
lsblk -f

ultome · November 20, 2023, 12:17am

> efibootmgr
EFI variables are not supported on this system.

> zypper lr -dE: openSUSE Paste

> inxi -CSMnaz --vs
inxi 3.3.27-00 (2023-05-07)

(this command never seems to finish; waiting longer got me no further output…)

>  lsblk -f
NAME        FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINTS
sda                                                
└─sda1                                93,9G    58% /
zram0                                              [SWAP]
nvme0n1                                            
├─nvme0n1p1                           49,5M    48% /boot/efi
├─nvme0n1p2                                        
├─nvme0n1p3                                        [SWAP]
└─nvme0n1p4

Tomorrow my computer will probably be in for repair (the fan’s going crazy, as if I needed any more problems), so please forgive me if I don’t answer your next request right away. I’m not abandoning this thread, my PC will just temporarily be somewhere I can’t use it.

Thanks for the fast answer, I hope this will help.

Btw, I’m quite surprised by the output of efibootmgr. It’s the first time I’ve seen that on my PC, and both disks have been relying on EFI so far, so I really don’t understand why we get that…

Ah and just so you have the full picture, here is lsblk -f run from Fedora:

> sudo lsblk -f
NAME        FSTYPE FSVER LABEL       UUID                                 FSAVAIL FSUSE% MOUNTPOINTS
sda                                                                                      
└─sda1      btrfs        Tumbleweed  37b1ace3-0889-4b40-9b0a-a2fe5da56d0a                
zram0                                                                                    [SWAP]
nvme0n1                                                                                  
├─nvme0n1p1 vfat   FAT32 ESP         BAFC-2878                              49,5M    48% /boot/efi
├─nvme0n1p2 ext4   1.0   Fedora_boot e130ac5a-55fb-4620-a63b-41184c9c680f  545,3M    37% /boot
├─nvme0n1p3 swap   1     Linux_swap  81a52966-11cf-4204-8e0b-4568796b2e5b                [SWAP]
└─nvme0n1p4 btrfs        Fedora      8a9a8f64-1ce9-45d2-b9dc-ba4cd756134e   92,6G    58% /home
                                                                                         /

I hope this will help.

nrickert · November 20, 2023, 12:23am

You could try efibootmgr from your Fedora system (outside the “chroot”).

ultome · November 20, 2023, 12:24am

From Fedora:

> efibootmgr
BootCurrent: 0001
Timeout: 0 seconds
BootOrder: 0001,0002,2001,2002,2003
Boot0000* opensuse-secureboot   HD(1,GPT,5dff6717-47a9-4681-a12d-2ce66cc67b4b,0x800,0x32000)/File(\EFI\opensuse\shim.efi)
Boot0001* Fedora        HD(1,GPT,5dff6717-47a9-4681-a12d-2ce66cc67b4b,0x800,0x32000)/File(\EFI\fedora\shim.efi) File(.䍒)
Boot0002* openSUSE      HD(1,GPT,5dff6717-47a9-4681-a12d-2ce66cc67b4b,0x800,0x32000)/File(\EFI\opensuse\grubx64.efi)RC
Boot0003* openSUSE      HD(1,GPT,5dff6717-47a9-4681-a12d-2ce66cc67b4b,0x800,0x32000)/File(\EFI\opensuse\grubx64.efi)RC
Boot2001* EFI USB Device        RC
Boot2002* EFI DVD/CDROM RC
Boot2003* EFI Network   RC

nrickert · November 20, 2023, 12:33am

The reason you did not get that output from your “chroot” is that efibootmgr requires access to “/sys/firmware/efi/efivars”.
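A quick way to check this, sketched under the assumption that /proc is available wherever the check runs: look in the mount table for an efivarfs filesystem at that path. The function name `efivarfs_mounted` is made up for this example.

```shell
#!/bin/sh
# efivarfs_mounted [MOUNTS_FILE] [MOUNT_POINT]
# Succeeds if an efivarfs filesystem is mounted at MOUNT_POINT.
# Defaults: /proc/mounts and /sys/firmware/efi/efivars.
efivarfs_mounted() {
    mounts_file="${1:-/proc/mounts}"
    mount_point="${2:-/sys/firmware/efi/efivars}"
    # /proc/mounts fields: device, mount point, filesystem type, options, ...
    awk -v mp="$mount_point" '
        $2 == mp && $3 == "efivarfs" { found = 1 }
        END { exit(found ? 0 : 1) }
    ' "$mounts_file"
}

if efivarfs_mounted; then
    echo "efivarfs is mounted here"
else
    echo "efivarfs is not mounted here"
fi
```

Run once from Fedora and once inside the chroot, this should show whether the chroot is missing the mount that efibootmgr depends on.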

ultome · November 20, 2023, 12:35am

I can access the directory, but ls shows nothing inside.

nrickert · November 20, 2023, 3:21am

It’s a mount point, and mount with --bind doesn’t carry over anything mounted below the directory you bind, so the efivars filesystem never shows up inside the chroot.
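One common way around this is a recursive bind mount, which does bring submounts along. The sketch below only prints the commands rather than running them. It assumes the Tumbleweed root filesystem is already mounted at a target directory; `/mnt` and the function name `print_chroot_setup` are placeholders, not something taken from this thread.

```shell
#!/bin/sh
# Prints (does not run) the mount commands for a chroot that can see EFI variables.
# Assumption: the Tumbleweed root is already mounted at the target directory;
# /mnt is a placeholder path.
print_chroot_setup() {
    target="${1:-/mnt}"
    for dir in dev proc sys; do
        # --rbind binds recursively, so submounts such as
        # /sys/firmware/efi/efivars come along; plain --bind leaves them out.
        echo "mount --rbind /$dir $target/$dir"
    done
    echo "chroot $target"
}

print_chroot_setup /mnt
```

If the chroot was already set up with plain --bind mounts, mounting the filesystem explicitly with `mount -t efivarfs efivarfs /mnt/sys/firmware/efi/efivars` before entering the chroot should have the same effect.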

karlmistelberger · November 20, 2023, 6:08am

For your information: Grub – EFI – Btrfs | Karl Mistelberger

Ah and just so you have the full picture, here is lsblk -f run from Fedora: …

Keep It Super Simple (KISS):

erlangen:~ # fdisk -l /dev/nvme1n1
Disk /dev/nvme1n1: 1.82 TiB, 2000398934016 bytes, 3907029168 sectors
Disk model: Samsung SSD 970 EVO Plus 2TB            
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: F5B232D0-7A67-461D-8E7D-B86A5B4C6C10

Device           Start        End    Sectors  Size Type
/dev/nvme1n1p1    2048    1050623    1048576  512M EFI System
/dev/nvme1n1p2 1050624 3907028991 3905978368  1.8T Linux filesystem
erlangen:~ #

Infamous host erlangen works fine with a single btrfs partition occupying the whole SSD. Get a simple setup working. You may add complexity at any time.

hui · November 20, 2023, 6:51am

Did you actually read the output the OP provided? Seems not… or else you would have seen that his sda1 is a single btrfs partition used only for Tumbleweed.

Permanently promoting those ridiculous host setups may blind you to other sane ways and possibilities…

rafaellinuxuser · November 20, 2023, 10:55am

One question: couldn’t you boot into the snapshots from GRUB either? Generally, when problems are caused by system changes, booting an older snapshot is the solution.

ultome · November 20, 2023, 10:58am

Indeed that’s one of the first things I tried… I didn’t try them all, but none of those I tried worked.

rafaellinuxuser · November 20, 2023, 11:51am

When you say they didn’t work, do you mean a kernel panic every time? Even with the oldest snapshot?

ultome · November 20, 2023, 10:26pm

Simple question but complicated answer… I get to see logs of the ongoing boot process only under certain conditions, which unfortunately I haven’t fully identified so far… Most of the time I select a boot option, then the screen turns black and that’s it. At times I don’t even know whether my computer has shut down or not; that’s how non-verbose and unresponsive it gets…

I’ll try to see what I can get from the last snapshot when I get my computer back.

To all the friendly souls who want to help me on this topic: my computer is in for repair for a few days (fan problem), so I won’t be able to investigate for some time.

But please don’t un-follow this topic, I’ll revive it ASAP!

karlmistelberger · November 21, 2023, 4:46am

rafaellinuxuser · November 21, 2023, 8:32am

Given the “Kernel panic” message, I have been inclined from the beginning to think that the cause is a hardware failure, most probably overheating. In a laptop, heat can affect everything, including memory, which will then fail randomly or continuously depending on how persistent the problem is that raises the temperature of all components. So it is possible that when you get your computer back, you will no longer have the problem.

There are bootable ISOs with test suites that run stress tests to find out whether the hardware misbehaves under certain conditions.
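Short of a full stress test, the temperatures the kernel reports can be read straight from sysfs on a running Linux system such as the Fedora install. This is a minimal sketch: the function name `print_thermal_zones` is made up, and which zones exist and what they measure depends on the hardware. The `temp` files under /sys/class/thermal hold millidegrees Celsius.

```shell
#!/bin/sh
# print_thermal_zones [BASE_DIR]
# Lists every thermal zone under BASE_DIR (default /sys/class/thermal)
# as "<zone> <type> <temperature>C". Values in sysfs are millidegrees Celsius.
print_thermal_zones() {
    base="${1:-/sys/class/thermal}"
    for zone in "$base"/thermal_zone*; do
        # Skip zones without a readable temp file (also covers "no zones at all").
        [ -r "$zone/temp" ] || continue
        millideg=$(cat "$zone/temp")
        name=$(cat "$zone/type" 2>/dev/null || echo unknown)
        echo "$(basename "$zone") $name $((millideg / 1000))C"
    done
}

print_thermal_zones
```

Running it a few times while the machine is idle and again under load gives a first impression of whether the fan problem is letting temperatures climb.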

In my experience, Windows is more sensitive to problems with RAM modules. I know this because I had a dual-boot computer on which Windows could not run without blue screens, while Linux could. A stress test showed that a RAM module had failed.

ultome · November 21, 2023, 11:18am

It would be absolutely amazing if my openSUSE problem came back from repair fixed along with the noisy fan! I’ll keep you updated. It is indeed true that both problems occurred and worsened simultaneously…

karlmistelberger · November 22, 2023, 5:17am

A Windows notebook would crash whenever a folder was accessed. I cleaned the fan, applied new thermal paste, replaced the HDD with an SSD and installed Xubuntu. The user has been happy for six years now. She maintains the website of the local sports club.

myswtest · November 22, 2023, 3:26pm

Had a similar event happen not long ago.
I booted into the dedicated BIOS and ran the System Diagnostics selection (Advanced, detailed) - no issues showed up. Oddly enough, after shutting it off for 30 minutes and then booting it back up, all seems to be fine. Computers can be unpredictable, like people.