Btrfs filesystem on primary machine completely destroyed itself, secondary in the process of dying 😭

pavinjoseph · March 3, 2024, 5:30pm

Hello everyone,

I find myself in the unenviable position of having to re-setup both my primary and secondary machines.
Currently typing this from the secondary machine as the primary’s btrfs filesystem is completely dead and unable to mount/boot root FS.
btrfs check --repair from a Gnome resuce ISO failed to repair it.

Out of curiosity, I checked my secondary machine from a rescue ISO and it also showed a few errors, though not even 1% of the primary machine.

At the time of the incident on the primary, I was in the process of typing in a reply on this very forum.
Chrome wouldn’t respond, it used to happen sometimes for a few seconds but nothing like this.
I switched to Thunderbird and it too hung, this was a first.
Just had enough time to switch to the terminal and check what was going on, CPU0 was at 100% utilization and systemd-journald was the culprit, it was waiting for disk I/O mostly, so I reckon there were many errors being printed to the journal.
Sadly I wasn’t able to a get a hold of what the errors were as soon after the system crashed.
No kernel panic, I was able to switch to a console TTY and do magic sysrq “REISUB” but it took a few attempts to get the system to power down.
Normally in my past experience REISUB was supposed to reboot immediately, but in this case it took several REISUBs and the system just shut down.

Of course the next time I tried to bootup, grub showed an error that it couldn’t access its own themes and of course no kernel and no bootup was possible as it couldn’t mount the root FS.

Currently I’m in the process of restoring the primary from my backups.

These were my pet machines, so I’m quite sad that this happened even after diligently checking the journal logs for any priority 3 errors and my nightly systemd service logs for transactional dup.
None of them showed any issues, all the disks were unmounted correctly on shutdown/reboot. There were no kernel panics or other issues that could’ve lead to this.

My current theories are:

I ran a btrfs online defrag and balance yesterday to compress existing chunks and then reclaim the space lost due to allocated but partially filled extents.
Problem with theory is that it did not affect the btrfs FS on a SATA SSD in the primary machine, only the NVMe SSD’s root fs was impacted.
There are some issues with the btrfs version in kernels 6.7.x.
The SATA SSD’s working btrfs FS with no errors was created when I was using LMDE (Debian), perhaps this was what saved it.
There are some hardware errors that I don’t know about.
My memory is fine though, memtest and vulkan memtest show no issues.
Smart shows no errors on the NVMe drives themselves, but it doesn’t support performing any self tests.
I was told that Zen2 architecture AMD APU’s have been known to have weird memory issues, with some systems even refusing to POST when running memory at 3200 MHz.
I can confirm this as when I added a 16G module to a free slot in the laptop, I got kernel panics on running graphics intensive load.
Adding an 8G module caused no instabilities. I believe this is because these AMD APU’s onboard graphics reserves/uses a portion of the normal RAM for its VRAM.

Here’s my system details on the secondary machine:

System:
  Kernel: 6.7.4-1-default arch: x86_64 bits: 64 compiler: gcc v: 13.2.1 clocksource: hpet
  Console: pty pts/1 DM: GDM v: 45.0.1 Distro: openSUSE Tumbleweed-Slowroll 20240213
Machine:
  Type: Laptop System: LENOVO product: 82KD v: Lenovo V15 G2 ALC Ua serial: <filter> Chassis:
    type: 10 v: Lenovo V15 G2 ALC Ua serial: <filter>
  Mobo: LENOVO model: LNVNB161216 v: SDK0T76486WIN serial: <filter>
    part-nu: LENOVO_MT_82KD_BU_idea_FM_V15 G2 ALC Ua uuid: dd35a9a6-b90d-4f65-8723-88a4c29167c5
    UEFI: LENOVO v: GLCN46WW date: 03/23/2022
Battery:
  ID-1: BAT0 charge: 38.4 Wh (100.0%) condition: 38.4/38.0 Wh (101.0%) power: 0.1 W volts: 8.7
    min: 7.7 model: BYD L20B2PF0 type: Li-poly serial: <filter> status: full cycles: 60
Memory:
  System RAM: total: 16 GiB available: 13.11 GiB used: 7.03 GiB (53.6%)
  Array-1: capacity: 16 GiB slots: 2 modules: 2 EC: None max-module-size: 8 GiB note: est.
  Device-1: Channel-A DIMM 0 type: DDR4 detail: synchronous unbuffered (unregistered)
    size: 8 GiB speed: 2133 MT/s volts: 1.2 width (bits): data: 64 total: 64 manufacturer: Hynix
    part-no: HMA41GS6AFR8N-TF serial: <filter>
  Device-2: Channel-B DIMM 0 type: DDR4 detail: synchronous unbuffered (unregistered)
    size: 8 GiB speed: spec: 3200 MT/s actual: 2133 MT/s volts: 1.2 width (bits): data: 64 total: 64
    manufacturer: Samsung part-no: M471A1G44AB0-CWE serial: N/A
CPU:
  Info: quad core model: AMD Ryzen 3 5300U with Radeon Graphics bits: 64 type: MT MCP smt: enabled
    arch: Zen 2 rev: 1 cache: L1: 256 KiB L2: 2 MiB L3: 4 MiB
  Speed (MHz): avg: 776 high: 1436 min/max: 400/3900 volts: 1.2 V ext-clock: 100 MHz cores:
    1: 1391 2: 400 3: 400 4: 400 5: 400 6: 1386 7: 1436 8: 400 bogomips: 41528
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm
Graphics:
  Device-1: AMD Lucienne vendor: Lenovo driver: amdgpu v: kernel arch: GCN-5 pcie: speed: 8 GT/s
    lanes: 16 ports: active: HDMI-A-1 off: eDP-1 empty: none bus-ID: 04:00.0 chip-ID: 1002:164c
    class-ID: 0300 temp: 37.0 C
  Device-2: Syntek Integrated Camera driver: uvcvideo type: USB rev: 2.0 speed: 480 Mb/s
    lanes: 1 bus-ID: 1-3:3 chip-ID: 174f:2459 class-ID: fe01 serial: <filter>
  Display: server: X.org v: 1.21.1.11 with: Xwayland v: 23.2.4 compositor: gnome-shell driver:
    X: loaded: modesetting unloaded: fbdev,vesa dri: radeonsi gpu: amdgpu tty: 174x43
  Monitor-1: HDMI-A-1 model: Samsung U32R59x serial: <filter> res: 3840x2160 dpi: 140
    size: 697x392mm (27.44x15.43") diag: 800mm (31.5") modes: max: 3840x2160 min: 720x400
  Monitor-2: eDP-1 model: ChiMei InnoLux 0x15f5 res: 1920x1080 dpi: 142
    size: 344x193mm (13.54x7.6") diag: 394mm (15.5") modes: max: 1920x1080 min: 640x480
  API: OpenGL Message: GL data unavailable in console for root.
  API: EGL Message: EGL data unavailable in console, eglinfo missing.
Audio:
  Device-1: AMD Renoir Radeon High Definition Audio vendor: Lenovo driver: snd_hda_intel v: kernel
    pcie: speed: 8 GT/s lanes: 16 bus-ID: 04:00.1 chip-ID: 1002:1637 class-ID: 0403
  Device-2: AMD ACP/ACP3X/ACP6x Audio Coprocessor vendor: Lenovo driver: N/A pcie: speed: 8 GT/s
    lanes: 16 bus-ID: 04:00.5 chip-ID: 1022:15e2 class-ID: 0480
  Device-3: AMD Family 17h/19h HD Audio vendor: Lenovo driver: snd_hda_intel v: kernel pcie:
    speed: 8 GT/s lanes: 16 bus-ID: 04:00.6 chip-ID: 1022:15e3 class-ID: 0403
  Device-4: Thesycon System & Consulting GmbH DX3 Pro+ driver: snd-usb-audio type: USB rev: 2.0
    speed: 480 Mb/s lanes: 1 bus-ID: 1-2.1.1:6 chip-ID: 152a:8750 class-ID: fe01
  API: ALSA v: k6.7.4-1-default status: kernel-api with: aoss type: oss-emulator
  Server-1: PipeWire v: 1.0.3 status: n/a (root, process) with: 1: pipewire-pulse status: active
    2: wireplumber status: active 3: pipewire-alsa type: plugin 4: pw-jack type: plugin
Network:
  Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet vendor: Lenovo driver: r8169
    v: kernel pcie: speed: 2.5 GT/s lanes: 1 port: 3000 bus-ID: 01:00.0 chip-ID: 10ec:8168
    class-ID: 0200
  IF: enp1s0 state: down mac: <filter>
  Device-2: Realtek RTL8822CE 802.11ac PCIe Wireless Network Adapter vendor: Lenovo
    driver: rtw_8822ce v: N/A pcie: speed: 2.5 GT/s lanes: 1 port: 2000 bus-ID: 02:00.0
    chip-ID: 10ec:c822 class-ID: 0280
  IF: wlp2s0 state: up mac: <filter>
  IF-ID-1: wg-suse-laptop state: unknown speed: N/A duplex: N/A mac: N/A
Bluetooth:
  Device-1: Realtek Bluetooth Radio driver: btusb v: 0.8 type: USB rev: 1.0 speed: 12 Mb/s
    lanes: 1 bus-ID: 3-4:2 chip-ID: 0bda:c123 class-ID: e001 serial: <filter>
  Report: btmgmt ID: hci0 rfk-id: 3 state: down bt-service: enabled,running rfk-block:
    hardware: no software: yes address: <filter> bt-v: 5.1 lmp-v: 10
Drives:
  Local Storage: total: 6.6 TiB used: 2.75 TiB (41.6%)
  ID-1: /dev/nvme0n1 vendor: Samsung model: MZALQ256HBJD-00BL2 size: 238.47 GiB speed: 31.6 Gb/s
    lanes: 4 tech: SSD serial: <filter> fw-rev: 5L2QFXM7 temp: 36.9 C scheme: GPT
  ID-2: /dev/sda vendor: Seagate model: BUP RD size: 3.64 TiB type: USB rev: 3.0 spd: 5 Gb/s
    lanes: 1 tech: N/A serial: N/A fw-rev: 0304 scheme: GPT
  ID-3: /dev/sdb vendor: HGST (Hitachi) model: HTS721010A9E630 size: 931.51 GiB type: USB
    rev: 3.0 spd: 5 Gb/s lanes: 1 tech: HDD rpm: 7200 serial: <filter> fw-rev: 0209 scheme: GPT
  ID-4: /dev/sdc vendor: Western Digital model: WD20NMVW-11EDZS2 size: 1.82 TiB type: USB
    rev: 3.0 spd: 5 Gb/s lanes: 1 tech: HDD rpm: 5200 serial: <filter> fw-rev: 1012 scheme: MBR
Partition:
  ID-1: / size: 219.97 GiB used: 19.4 GiB (8.8%) fs: btrfs dev: /dev/dm-1 mapped: suse-system
  ID-2: /boot/efi size: 511 MiB used: 4.7 MiB (0.9%) fs: vfat dev: /dev/nvme0n1p1
  ID-3: /home size: 219.97 GiB used: 19.4 GiB (8.8%) fs: btrfs dev: /dev/dm-1
    mapped: suse-system
  ID-4: /opt size: 219.97 GiB used: 19.4 GiB (8.8%) fs: btrfs dev: /dev/dm-1 mapped: suse-system
  ID-5: /var size: 219.97 GiB used: 19.4 GiB (8.8%) fs: btrfs dev: /dev/dm-1 mapped: suse-system
Swap:
  ID-1: swap-1 type: partition size: 2 GiB used: 0 KiB (0.0%) priority: -2 dev: /dev/dm-2
    mapped: suse-swap
Sensors:
  System Temperatures: cpu: 41.5 C mobo: N/A gpu: amdgpu temp: 37.0 C
  Fan Speeds (rpm): N/A
Info:
  Processes: 436 Power: uptime: 1h 3m states: freeze,mem,disk suspend: deep wakeups: 0
    hibernate: platform Init: systemd v: 254 default: graphical
  Packages: pm: flatpak pkgs: 8 Compilers: gcc: 13.2.1 Shell: Sudo (sudo) v: 1.9.15p5
    default: Bash v: 5.2.26 running-in: pty pts/1 inxi: 3.3.33

Tagging @karlmistelberger as he has quite the experience with btrfs issues on infamous host erlangen.
Of course I want all your inputs on determing the root cause, which is proving to be elusive.

malcolmlewis · March 3, 2024, 5:43pm

@pavinjoseph very likely using transactional-update on a non standard setup… or perhaps a new issue with Samsung devices?

pavinjoseph · March 3, 2024, 6:05pm

Unlikely as it’s not a complicated program. All it does is create a new snapshot, chroot into it and perform whatever it needs to do. If the action fails the snap is discarded, otherwise it is set as the new default subvolume and you boot into it. This is something the average person with basic knowledge of btrfs, snapper, and zypper/chroot can roll on their own though TU implements a library tukit and does things in a very generic way so as to increase interoperability with other snapshot and FS backends in the future.
But I’m not ruling out anything at this point.

or perhaps a new issue with Samsung devices?

Could be, could be… I do have other NVMe brands at my disposal to test if it comes to that.

Thanks, if something else pops into your mind I’m all ears

karlmistelberger · March 3, 2024, 8:01pm

That’s a sound approach. To my experience the issues are caused by the hardware. All of the machines I installed go with KISS (Keep It Super Simple), i.e a single partition with nothing else:

erlangen:~ # fdisk -l /dev/nvme1n1
Disk /dev/nvme1n1: 1.82 TiB, 2000398934016 bytes, 3907029168 sectors
Disk model: Samsung SSD 970 EVO Plus 2TB            
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: F5B232D0-7A67-461D-8E7D-B86A5B4C6C10

Device           Start        End    Sectors  Size Type
/dev/nvme1n1p1    2048    1050623    1048576  512M EFI System
/dev/nvme1n1p2 1050624 3907028991 3905978368  1.8T Linux filesystem
erlangen:~ #

No swap:

erlangen:~ # free -h
               total        used        free      shared  buff/cache   available
Mem:            30Gi        12Gi        11Gi       444Mi       7.9Gi        18Gi
Swap:             0B          0B          0B
erlangen:~ #

I am trying hard to break the btrfs (46 unsafe shutdowns). The 970 EVO Plus 2TB SSDs show perfect atomic behaviour.

erlangen:~ # smartctl -A /dev/nvme1n1p2
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.7.6-1-default] (SUSE RPM)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF SMART DATA SECTION ===
SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        37 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    0%
Data Units Read:                    141,844,872 [72.6 TB]
Data Units Written:                 67,629,949 [34.6 TB]
Host Read Commands:                 776,490,229
Host Write Commands:                1,202,968,514
Controller Busy Time:               2,391
Power Cycles:                       2,014
Power On Hours:                     1,238
Unsafe Shutdowns:                   46
Media and Data Integrity Errors:    0
Error Information Log Entries:      3,163
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0
Temperature Sensor 1:               37 Celsius
Temperature Sensor 2:               35 Celsius

erlangen:~ #

A single error occurred since 2021-11-24:

erlangen:~ # journalctl -q -g btrfs --priority err 
Feb 28 07:51:16 erlangen kernel: BTRFS error (device nvme0n1p2): couldn't find block (412207087616) (level 1) in tree (2405) with key (650 96 3847)
erlangen:~ #

pavinjoseph · March 3, 2024, 8:34pm

Thanks Karl

I am back on my primary machine, restoration went fast and smooth as the SATA SSD had local backups. Would’ve been much slower restoring from offsite backups.

I will continue to run btrfs check every few days from a rescue ISO and report back.

Now onto fixing the secondary machine, it has only offsite backups

Pavin.

pavinjoseph · March 3, 2024, 8:44pm

I did manage to land one picture of the console while the machine was hung in REISUB, never to boot again with that FS.

pavinjoseph · March 4, 2024, 1:32am

TU failed gracefully due to the recently discussed virtiofsd bug.

Mar 04 03:09:29 suse-pc transactional-update[15355]: ( 26/154) Installing: sane-backends-1.2.1-3.1.x86_64 [....done]
Mar 04 03:09:30 suse-pc transactional-update[15355]: ( 27/154) Installing: libxdp1-1.4.2-2.1.x86_64 [...done]
Mar 04 03:09:30 suse-pc transactional-update[15355]: ( 28/154) Installing: libz1-1.3.1-1.1.x86_64 [...done]
Mar 04 03:09:30 suse-pc transactional-update[15355]: ( 29/154) Installing: mailx-12.5-38.1.x86_64 [..
Mar 04 03:09:30 suse-pc transactional-update[15355]: Calling pre installation script
Mar 04 03:09:30 suse-pc transactional-update[15355]: done]
Mar 04 03:09:30 suse-pc transactional-update[15355]: ( 30/154) Installing: man-pages-6.05.01-3.1.noarch [....done]
Mar 04 03:09:30 suse-pc transactional-update[15355]: ( 31/154) Installing: prctl-1.6-5.1.x86_64 [..done]
Mar 04 03:09:30 suse-pc transactional-update[15355]: ( 32/154) Installing: rtkit-0.13-6.1.x86_64 [..
Mar 04 03:09:30 suse-pc transactional-update[15355]: /usr/bin/systemd-sysusers --replace=/usr/lib/sysusers.d/rtkit.conf -
Mar 04 03:09:30 suse-pc transactional-update[15355]: done]
Mar 04 03:09:31 suse-pc transactional-update[15355]: ( 33/154) Installing: ruby3.3-rubygem-gem2rpm-0.10.1-3.1.x86_64 [..done]
Mar 04 03:09:31 suse-pc transactional-update[15355]: ( 34/154) Installing: unzip-doc-6.00-42.1.noarch [..done]
Mar 04 03:09:31 suse-pc transactional-update[15355]: ( 35/154) Installing: vim-data-common-9.1.0111-8.1.noarch [..done]
Mar 04 03:09:31 suse-pc transactional-update[15355]: ( 36/154) Installing: virtiofsd-1.10.1-2.1.x86_64 [..
Mar 04 03:09:31 suse-pc transactional-update[15355]: error: unpacking of archive failed on file /usr/libexec/virtiofsd;65e4ee13: cpio: File from package already exists as a >
Mar 04 03:09:31 suse-pc transactional-update[15355]: error: virtiofsd-1.10.1-2.1.x86_64: install failed
Mar 04 03:09:31 suse-pc transactional-update[15355]: error: virtiofsd-1.10.1-1.1.x86_64: erase skipped
Mar 04 03:09:31 suse-pc transactional-update[15355]: error]
Mar 04 03:09:31 suse-pc transactional-update[15355]: Installation of virtiofsd-1.10.1-2.1.x86_64 failed:
Mar 04 03:09:31 suse-pc transactional-update[15355]: Error: Subprocess failed. Error: RPM failed: Command exited with status 1.
Mar 04 03:09:31 suse-pc transactional-update[15355]: Abort, retry, ignore? [a/r/i] (a): a
Mar 04 03:09:31 suse-pc transactional-update[15355]: Warning: %posttrans and %transfiletrigger scripts are not executed when aborting!
Mar 04 03:09:31 suse-pc transactional-update[15355]: Problem occurred during or after installation or removal of packages:
Mar 04 03:09:31 suse-pc transactional-update[15355]: Installation has been aborted as directed.
Mar 04 03:09:31 suse-pc transactional-update[15355]: Please see the above error message for a hint.
Mar 04 03:09:31 suse-pc transactional-update[15355]: 2024-03-04 03:09:31 Application returned with exit status 8.
Mar 04 03:09:31 suse-pc transactional-update[15118]: ERROR: zypper --no-cd dup on /.snapshots/6/snapshot failed with exit code 8!
Mar 04 03:09:31 suse-pc transactional-update[15118]: Use '--interactive' for manual problem resolution.
Mar 04 03:09:31 suse-pc transactional-update[15116]: Warning: The following files were changed in the snapshot, but are shadowed by
Mar 04 03:09:31 suse-pc transactional-update[15116]: other mounts and will not be visible to the system:
Mar 04 03:09:31 suse-pc transactional-update[15116]: /.snapshots/6/snapshot/var/lib/sgml/CATALOG.iso_ent
Mar 04 03:09:31 suse-pc transactional-update[15118]: Removing snapshot #6...
Mar 04 03:09:31 suse-pc transactional-update[15551]: 2024-03-04 03:09:31 tukit 4.5.0 started
Mar 04 03:09:31 suse-pc transactional-update[15551]: 2024-03-04 03:09:31 Options: --discard abort 6
Mar 04 03:09:31 suse-pc transactional-update[15551]: 2024-03-04 03:09:31 Discarding snapshot 6.
Mar 04 03:09:32 suse-pc transactional-update[15551]: 2024-03-04 03:09:32 Transaction completed.
Mar 04 03:09:32 suse-pc transactional-update[15116]: transactional-update finished
Mar 04 03:09:32 suse-pc systemd[1]: bkupd.service: Main process exited, code=exited, status=1/FAILURE
Mar 04 03:09:32 suse-pc systemd[1]: bkupd.service: Failed with result 'exit-code'.
Mar 04 03:09:32 suse-pc systemd[1]: Failed to start Backup and transactional update/reboot.
Mar 04 03:09:32 suse-pc systemd[1]: bkupd.service: Consumed 6min 21.729s CPU time.

If it does turn out to be the reason for my troubles, I’m rolling my own. This is an invaluable feature to have

pavinjoseph · March 4, 2024, 1:48am

TU succeeded after renaming existing virtiofsd dir.

Mar 04 07:11:09 suse-pc transactional-update[24520]: 2024-03-04 07:11:09 Application returned with exit status 0.
Mar 04 07:11:09 suse-pc transactional-update[24520]: 2024-03-04 07:11:09 Transaction completed.
Mar 04 07:11:09 suse-pc transactional-update[24180]: Trying to rebuild kdump initrd
Mar 04 07:11:10 suse-pc transactional-update[20371]: 2024-03-04 07:11:10 tukit 4.5.0 started
Mar 04 07:11:10 suse-pc transactional-update[20371]: 2024-03-04 07:11:10 Options: --discard call 11 /sbin/mkdumprd
Mar 04 07:11:10 suse-pc transactional-update[20371]: 2024-03-04 07:11:10 Executing `/sbin/mkdumprd`:
Mar 04 07:11:10 suse-pc transactional-update[20371]: /var/lib/kdump not writable, not regenerating initrd.
Mar 04 07:11:10 suse-pc transactional-update[20371]: 2024-03-04 07:11:10 Application returned with exit status 0.
Mar 04 07:11:10 suse-pc transactional-update[20371]: 2024-03-04 07:11:10 Transaction completed.
Mar 04 07:11:10 suse-pc transactional-update[20382]: 2024-03-04 07:11:10 tukit 4.5.0 started
Mar 04 07:11:10 suse-pc transactional-update[20382]: 2024-03-04 07:11:10 Options: --discard close 11
Mar 04 07:11:10 suse-pc transactional-update[20382]: 2024-03-04 07:11:10 New default snapshot is #11 (/.snapshots/11/snapshot).
Mar 04 07:11:10 suse-pc transactional-update[20382]: 2024-03-04 07:11:10 Transaction completed.
Mar 04 07:11:10 suse-pc transactional-update[24180]: Warning: The following files were changed in the snapshot, but are shadowed by
Mar 04 07:11:10 suse-pc transactional-update[24180]: other mounts and will not be visible to the system:
Mar 04 07:11:10 suse-pc transactional-update[24180]: /.snapshots/11/snapshot/var/lib/sgml/CATALOG.iso_ent
Mar 04 07:11:10 suse-pc transactional-update[24180]: /.snapshots/11/snapshot/var/lib/unbound/root.key
Mar 04 07:11:10 suse-pc transactional-update[24180]: New default snapshot is #11 (/.snapshots/11/snapshot).
Mar 04 07:11:10 suse-pc transactional-update[24180]: Using default snapshot 11 to replace running system...
Mar 04 07:11:10 suse-pc transactional-update[24180]: Applying /usr...
Mar 04 07:11:10 suse-pc transactional-update[24180]: Applying /etc...
Mar 04 07:11:10 suse-pc transactional-update[24180]: Applying /boot...
Mar 04 07:11:10 suse-pc transactional-update[24180]: Executing systemctl daemon-reexec...
Mar 04 07:11:11 suse-pc transactional-update[24180]: Executing create_dirs_from_rpmdb...
Mar 04 07:11:12 suse-pc transactional-update[24180]: Executing systemd-tmpfiles --create...
Mar 04 07:11:12 suse-pc transactional-update[24180]: => Applied default snapshot as new base for running system!
Mar 04 07:11:12 suse-pc transactional-update[24180]:    Running processes will not be restarted automatically.
Mar 04 07:11:12 suse-pc transactional-update[24180]: transactional-update finished
Mar 04 07:11:13 suse-pc systemd[1]: bkupd.service: Deactivated successfully.
Mar 04 07:11:13 suse-pc systemd[1]: Finished Backup and transactional update/reboot.
Mar 04 07:11:13 suse-pc systemd[1]: bkupd.service: Consumed 6min 49.294s CPU time.

pavinjoseph · March 4, 2024, 2:25am

Found errors

Errors relating to quota/qgroups found when running btrfs check from rescue ISO after creating new snapshots and changing default subvolume by TU.

Fixed by disabling quota for the btrfs filesystem: sudo btrfs quota disable /

karlmistelberger · March 4, 2024, 11:45am

Fixing the error manually and restarting bkupd.service is preferred usage.
Quotas slow down applications using btrfs. E.g. snapper list is annoyingly slow with quotas enabled and very fast when disabled.

Any reason for using bkupd.service instead of transactional-update.service? The latter works well with the timer enabled on host 6700k. On infamous host erlangen its timer is disabled. I use the following program starter instead:

karl@erlangen:~/Schreibtisch> cat Dist-Upgrade.desktop 
[Desktop Entry]
Comment[en_GB]=
Comment=
Comment[de_DE]=
Exec=/usr/bin/systemctl start transactional-update.service
GenericName[en_GB]=
GenericName=
GenericName[de_DE]=
Icon=system-upgrade
MimeType=
Name[en_GB]=Dist-Upgrade
Name=Dist-Upgrade
Name[de_DE]=Dist-Upgrade
Path=
StartupNotify=true
Terminal=false
TerminalOptions=
Type=Application
X-DBUS-ServiceName=
X-DBUS-StartupType=
X-KDE-SubstituteUID=false
X-KDE-Username=
karl@erlangen:~/Schreibtisch>

User karl is granted permission to upgrade erlangen:

erlangen:~ # cat /etc/polkit-1/rules.d/00-dup.rules 
// Allow karl to  manage transactional-update.service; 
// fall back to implicit authorization otherwise. 
polkit.addRule(function(action, subject) { 
    if (action.id == "org.freedesktop.systemd1.manage-units" && 
        action.lookup("unit") == "transactional-update.service" && 
        subject.user == "karl") { 
        return polkit.Result.YES; 
    } 
});
erlangen:~ #

pavinjoseph · March 4, 2024, 12:47pm

Ah okay, good to have confirmation

Woah, you’re right. I had gotten used to running sudo snapper list --disable-used-space. Snapper doesn’t show the space used by each snapshot now with quotas disabled with or without the --disable-used-space option.

Yes, there is a bug that causes TU to not perform kexec reboot into a new kernel. Plus I’ve got a different TU flow now that I apply the updates live and only reboot if necessary.

# /etc/systemd/system/bkupd.service
[Unit]
Description=Backup and transactional update/reboot

[Service]
Type=oneshot
ExecStartPre=/usr/local/bin/check-device
ExecStartPre=/usr/bin/borgmatic --verbosity 1 --syslog-verbosity 1
ExecStartPre=/usr/bin/snapper -c root create -c number -d 'RO snap before dup'
ExecStartPre=-/usr/bin/zypperoni ref
ExecStartPre=-/usr/bin/zypperoni --no-confirm dup-download
ExecStart=/usr/sbin/transactional-update --drop-if-no-change apply cleanup dup
ExecStartPost=/usr/bin/systemctl restart kdump.service
ExecStartPost=/usr/bin/systemctl restart kexec-load.service
#ExecStartPost=/usr/bin/systemctl reboot

As you’re not applying the updates live, it’s required to enable the service create-dirs-from-rpmdb.service.

From TU manpage:

Due to the volatile nature of /var the directory will not be mounted into the new snapshot during the transactional-update run, as this would break atomicity: The currently running system depends on the old state of the data (imagine a database migration was triggered by a package). Any modifications to /var therefore have to be in the new system, i.e. modifying the contents of /var as part of the packaging scripts is not allowed.

The only exception to this rule are directories: Those will be recreated during the first boot into the updated system by the create-dirs-from-rpmdb.service helper service. For all other cases please use one of the options described in Packaging for transactional-updates and Migration / Upgrade in the Packaging guidelines. If a package is breaking this rule by installing files into a directory which is not part of the root file system, then a warning message indicating the affected file is printed at the end of the transactional-update run.

pavinjoseph · March 4, 2024, 6:07pm

Secondary fixed

Secondary machine fixed without re-creating btrfs filesystem by disabling quota and running btrfs check from rescue ISO

More and more it looks like the root cause is/was faulty btrfs quota implementation. Lots of errors with quota enabled, slow filesystem with many snapshots, perhaps the defrag/balance drove it over the edge.

pavinjoseph · March 5, 2024, 6:15am

Found culprit

transactional-update (TU) is absolved:

Mar 05 03:08:38 suse-pc transactional-update[18274]: 2024-03-05 03:08:38 Transaction completed.
Mar 05 03:08:38 suse-pc transactional-update[3491]: New default snapshot is #45 (/.snapshots/45/snapshot).
Mar 05 03:08:38 suse-pc transactional-update[3491]: Using default snapshot 45 to replace running system...
Mar 05 03:08:38 suse-pc transactional-update[3491]: Applying /usr...
Mar 05 03:08:38 suse-pc transactional-update[3491]: Applying /etc...
Mar 05 03:08:38 suse-pc transactional-update[3491]: Applying /boot...
Mar 05 03:08:38 suse-pc transactional-update[3491]: Executing systemctl daemon-reexec...
Mar 05 03:08:40 suse-pc transactional-update[3491]: Executing create_dirs_from_rpmdb...
Mar 05 03:08:40 suse-pc transactional-update[3491]: Executing systemd-tmpfiles --create...
Mar 05 03:08:40 suse-pc transactional-update[3491]: => Applied default snapshot as new base for running system!
Mar 05 03:08:40 suse-pc transactional-update[3491]:    Running processes will not be restarted automatically.
Mar 05 03:08:40 suse-pc transactional-update[3491]: transactional-update finished
Mar 05 03:08:41 suse-pc systemd[1]: bkupd.service: Deactivated successfully.
Mar 05 03:08:41 suse-pc systemd[1]: Finished Backup and transactional update/reboot.
Mar 05 03:08:41 suse-pc systemd[1]: bkupd.service: Consumed 5min 52.323s CPU time.

TU ran dup and applied the updates live, ran btrfs check from rescue ISO and no errors!

Root cause has been isolated to faulty btrfs quota/qgroup

Btrfs filesystem in SATA SSD that didn’t fail had quota/qgroup disabled by default when created (by me in previous distro LMDE). It housed all the snapshot backups from the main drive and did not fail or show any errors.

Btrfs filesystem in NVMe drive (on both primary and secondary device) were setup by OpenSuse/Yast installer, so it might have enabled quota as snapper would want that to keep the snapshot space usage in check.

karlmistelberger · March 5, 2024, 7:02am

transactional-update.service creates a new subvolume and deletes it without committing. This causes a btrfs check of the mounted filesystem to report numerous errors. Booting into a rescue system and performing a check of the unmounted file system isn’t needed on infamous host erlangen.

While typing this comment I switched to Emergency Mode and ran the btrfs check with zero messages. Switched back to graphical mode, resumed typing without data loss and started btrfs-scrub.service:

erlangen:~ # btrfs scrub status /
UUID:             0e58bbe5-eff7-4884-bb5d-a0aac3d8a344
Scrub started:    Tue Mar  5 07:44:48 2024
Status:           finished
Duration:         0:05:11
Total to scrub:   832.44GiB
Rate:             2.68GiB/s
Error summary:    no errors found
erlangen:~ #

pavinjoseph · March 5, 2024, 7:16am

I’m appropriating that sticker for future use

As the command it uses is /usr/sbin/transactional-update cleanup dup reboot this could happen when there was an issue running dup or when there is nothing to do I suppose.

I wonder why deleting a subvolume without committing would cause scrub to report errors. What a mess btrfs is still…

karlmistelberger · March 5, 2024, 7:42am

You need to know what you are doing. This challenges inexperienced users, which is bad.

However defaults used by Tumbleweed are robust:

erlangen:~ # btrfs filesystem usage -T /
Overall:
    Device size:                   1.82TiB
    Device allocated:            840.12GiB
    Device unallocated:         1022.39GiB
    Device missing:                  0.00B
    Device slack:                    0.00B
    Used:                        832.50GiB
    Free (estimated):              1.00TiB      (min: 516.63GiB)
    Free (statfs, df):             1.00TiB
    Data ratio:                       1.00
    Metadata ratio:                   2.00
    Global reserve:              512.00MiB      (used: 0.00B)
    Multiple profiles:                  no

                  Data      Metadata System                             
Id Path           single    DUP      DUP       Unallocated Total   Slack
-- -------------- --------- -------- --------- ----------- ------- -----
 1 /dev/nvme1n1p2 800.00GiB 40.00GiB 128.00MiB  1022.39GiB 1.82TiB     -
-- -------------- --------- -------- --------- ----------- ------- -----
   Total          800.00GiB 20.00GiB  64.00MiB  1022.39GiB 1.82TiB 0.00B
   Used           794.57GiB 18.96GiB 144.00KiB                          
erlangen:~ #

Users switching to btrfs may want to carefully read Hardware considerations — BTRFS documentation

Full balancing as well as adding and removing devices may stress your system. If that fails refrain from using btrfs.

pavinjoseph · March 5, 2024, 8:34am

In the context of metadata DUP I agree, but IIRC that’s the default chosen by btrfs itself when creating fs on a single device.

OTOH, the default option of enabling quota/qgroups chosen by OpenSuse installer when choosing to enable snapshots would’ve caused irrecoverable data loss and a lot of grief had I not had backups. I have now opened a bugzilla report so at least the relevant people are aware of this issue:
https://bugzilla.opensuse.org/show_bug.cgi?id=1220912

karlmistelberger · March 5, 2024, 11:44am

rpm indeed owns files located under /var. Enabled the service:

erlangen:~ # systemctl list-unit-files create-dirs-from-rpmdb.service 
UNIT FILE                      STATE   PRESET  
create-dirs-from-rpmdb.service enabled disabled

1 unit files listed.
erlangen:~ #

pavinjoseph · March 8, 2024, 6:36am

Karl, you might want to update TU timer to run at a different time, not at 0000.
Btrfs scrub/balance/trim/defrag runs on the first day of every month at 0000-0100. Give another hour for it to finish doing its thing. That means you should start TU update/reboot only past 0200.
I run TU every day at 0300.

karlmistelberger · March 8, 2024, 7:43am

Host 6700k uses the following drop-in:

6700k:~ # systemctl cat transactional-update.service 
# /usr/lib/systemd/system/transactional-update.service

...

# /etc/systemd/system/transactional-update.service.d/override.conf
[Unit]
After=btrbk.service btrfs-balance.service btrfs-scrub.service

[Service]
ExecStartPre=/usr/bin/nm-online
6700k:~ #

Upon return from sleep transactional-update waits until the services completed their task:

 6700k:~ # journalctl -b -1 -u btrbk.service -u transactional-update.service --identifier systemd
Mar 08 06:43:24 6700k systemd[1]: Starting btrbk backup...
Mar 08 06:43:51 6700k systemd[1]: btrbk.service: Deactivated successfully.
Mar 08 06:43:51 6700k systemd[1]: Finished btrbk backup.
Mar 08 06:43:51 6700k systemd[1]: btrbk.service: Consumed 6.513s CPU time.
Mar 08 06:43:51 6700k systemd[1]: Starting Update the system...
Mar 08 06:45:18 6700k systemd[1]: transactional-update.service: Deactivated successfully.
Mar 08 06:45:18 6700k systemd[1]: Stopped Update the system.
Mar 08 06:45:18 6700k systemd[1]: transactional-update.service: Consumed 53.568s CPU time.
6700k:~ #

It works for Type=oneshot. Still need to check this for Type=simple:

6700k:~ # systemctl cat btrfs-balance.service 
# /usr/lib/systemd/system/btrfs-balance.service
[Unit]
Description=Balance block groups on a btrfs filesystem
Documentation=man:btrfs-balance
After=fstrim.service btrfs-trim.service btrfs-scrub.service

[Service]
Type=simple
ExecStart=/usr/share/btrfsmaintenance/btrfs-balance.sh
IOSchedulingClass=idle
CPUSchedulingPolicy=idle
6700k:~ #