Infamous Host erlangen

Infamous host erlangen does away with virtually all the computing annoyances the author has encountered since 1978.

The system relies on rolling hardware as well as rolling software. Current retail components supported by Tumbleweed out of the box are built into an ATX case.

erlangen:~ # inxi -zFmy132
System:    Kernel: 6.4.3-1-default arch: x86_64 bits: 64 Console: pty pts/1 Distro: openSUSE Tumbleweed 20230719
Machine:   Type: Desktop Mobo: ASRock model: B450 Pro4 R2.0 serial: <filter> UEFI: American Megatrends v: P5.30 date: 03/01/2022
Memory:    System RAM: available: 30.7 GiB used: 11.21 GiB (36.5%)
           Array-1: capacity: 128 GiB slots: 4 EC: None
           Device-1: DIMM 0 type: no module installed
           Device-2: DIMM 1 type: DDR4 size: 16 GiB speed: 2133 MT/s
           Device-3: DIMM 0 type: no module installed
           Device-4: DIMM 1 type: DDR4 size: 16 GiB speed: 2133 MT/s
CPU:       Info: 8-core model: AMD Ryzen 7 5700G with Radeon Graphics bits: 64 type: MT MCP cache: L2: 4 MiB
           Speed (MHz): avg: 1700 min/max: 1400/4672 cores: 1: 1400 2: 1400 3: 1400 4: 1400 5: 1400 6: 1400 7: 1400 8: 1400
             9: 1400 10: 1400 11: 1400 12: 3800 13: 1400 14: 3800 15: 1400 16: 1400
Graphics:  Device-1: Advanced Micro Devices [AMD/ATI] Cezanne [Radeon Vega Series / Radeon Mobile Series] driver: amdgpu v: kernel
           Display: server: X.Org v: 21.1.8 with: Xwayland v: 23.1.2 driver: X: loaded: modesetting unloaded: fbdev,vesa
             dri: radeonsi gpu: amdgpu resolution: 3840x2160~60Hz
           API: OpenGL v: 4.6 Mesa 23.1.3 renderer: AMD Radeon Graphics (renoir LLVM 16.0.6 DRM 3.52 6.4.3-1-default)
Audio:     Device-1: Advanced Micro Devices [AMD/ATI] Renoir Radeon High Definition Audio driver: snd_hda_intel
           Device-2: Advanced Micro Devices [AMD] Family 17h/19h HD Audio driver: snd_hda_intel
           API: ALSA v: k6.4.3-1-default status: kernel-api
Network:   Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet driver: r8169
           IF: enp8s0 state: up speed: 1000 Mbps duplex: full mac: <filter>
Drives:    Local Storage: total: 12.74 TiB used: 4.53 TiB (35.6%)
           ID-1: /dev/nvme0n1 vendor: Samsung model: SSD 970 EVO Plus 2TB size: 1.82 TiB
           ID-2: /dev/nvme1n1 vendor: Samsung model: SSD 970 EVO Plus 2TB size: 1.82 TiB
           ID-3: /dev/sda vendor: Crucial model: CT2000BX500SSD1 size: 1.82 TiB
           ID-4: /dev/sdb vendor: Seagate model: ST8000VN004-2M2101 size: 7.28 TiB
Partition: ID-1: / size: 1.82 TiB used: 542.73 GiB (29.1%) fs: btrfs dev: /dev/nvme1n1p2
           ID-2: /boot/efi size: 511 MiB used: 640 KiB (0.1%) fs: vfat dev: /dev/nvme1n1p1
           ID-3: /home size: 1.82 TiB used: 542.73 GiB (29.1%) fs: btrfs dev: /dev/nvme1n1p2
           ID-4: /opt size: 1.82 TiB used: 542.73 GiB (29.1%) fs: btrfs dev: /dev/nvme1n1p2
           ID-5: /var size: 1.82 TiB used: 542.73 GiB (29.1%) fs: btrfs dev: /dev/nvme1n1p2
Swap:      Alert: No swap data was found.
Sensors:   System Temperatures: cpu: 31.0 C mobo: 31.0 C gpu: amdgpu temp: 34.0 C
           Fan Speeds (RPM): fan-1: 0 fan-2: 718 fan-3: 0 fan-4: 0 fan-5: 0
Info:      Processes: 495 Uptime: 2h 26m Shell: Bash inxi: 3.3.27
erlangen:~ # 

A daily zypper dist-upgrade ensures that maintenance-related issues are detected and resolved early.
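A minimal sketch of such a daily run follows. It is not necessarily the exact procedure used on erlangen: the function names `run` and `daily_upgrade` and the `DRY_RUN` switch are illustrative assumptions. By default the commands are only printed, so the sketch can be inspected safely before it is run as root.

```shell
# Print a command instead of executing it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Refresh repository metadata, then upgrade the whole distribution.
# The upgrade is skipped if the refresh fails.
daily_upgrade() {
    run zypper --non-interactive refresh &&
        run zypper --non-interactive dist-upgrade
}
```

Calling `daily_upgrade` prints the two zypper commands; `DRY_RUN=0 daily_upgrade` executes them.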

Wear and tear of desktop class 4TB HDD model WD40EZRX-22SPEB0

Vendors publish very optimistic MTBF figures for their drives. Actual values are much lower. From avherald.com:

Yes, I always have a good number of HDDs on stock - they usually fail after 20,000-50,000 hours (2.2 years to 5.7 years) although the manufacturer claims a MTBF of 1.6 million hours (182 years).

The above matches other experience well: SMART/HDD/WDC/README.md at master · linuxhw/SMART · GitHub
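The hour figures in the quote can be converted to years of continuous operation with a one-line helper. This is a sketch assuming 8760 hours per year (365 days); the function name `hours_to_years` is made up for illustration.

```shell
# Convert operating hours to years of continuous operation (8760 h per year).
# LC_ALL=C keeps the decimal point independent of the user's locale.
hours_to_years() {
    LC_ALL=C awk -v h="$1" 'BEGIN { printf "%.1f\n", h / 8760 }'
}

# The three figures mentioned in the quote.
for h in 20000 50000 1600000; do
    printf '%8d h = %s years\n' "$h" "$(hours_to_years "$h")"
done
```

This yields about 2.3, 5.7 and 182.6 years. The quote apparently rounds the first and last values down.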

A thorough test of the drive consisted of shrinking the existing 4TB ext4 partition, adding a btrfs partition and rsyncing the data from the former to the latter. Then the ext4 file system on partition sdb1 was deleted and the freed space was added to the btrfs on sdb2.

erlangen:~ # btrfs filesystem usage -T /media/61fc4107-d7da-4c0b-a1f4-d92aa6fc1d26/
Overall:
    Device size:                   3.64TiB
    Device allocated:              1.17TiB
    Device unallocated:            2.47TiB
    Device missing:                  0.00B
    Device slack:                    0.00B
    Used:                          1.17TiB
    Free (estimated):              2.47TiB      (min: 1.24TiB)
    Free (statfs, df):             2.47TiB
    Data ratio:                       1.00
    Metadata ratio:                   2.00
    Global reserve:              512.00MiB      (used: 0.00B)
    Multiple profiles:                  no

             Data    Metadata System                             
Id Path      single  DUP      DUP       Unallocated Total   Slack
-- --------- ------- -------- --------- ----------- ------- -----
 1 /dev/sdb2 1.16TiB  6.00GiB  16.00MiB   762.10GiB 1.91TiB     -
 2 /dev/sdb1 6.00GiB        -         -     1.72TiB 1.73TiB     -
-- --------- ------- -------- --------- ----------- ------- -----
   Total     1.16TiB  3.00GiB   8.00MiB     2.47TiB 3.64TiB 0.00B
   Used      1.16TiB  2.50GiB 176.00KiB                          
erlangen:~ # 

Issuing btrfs device remove /dev/sdb2 resulted in a fatal failure. Moving blocks with this command is a great sanity check for any drive.
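For reference, the two device operations involved can be sketched as follows, assuming the file system is mounted at the path shown in the usage output above. Both `btrfs device add` and `btrfs device remove` take the mount point as their last argument. The names `run`, `swap_devices`, `MNT` and `DRY_RUN` are illustrative assumptions, and the commands are only printed unless `DRY_RUN=0` is set.

```shell
# Print a command instead of executing it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Mount point taken from the usage output above.
MNT=/media/61fc4107-d7da-4c0b-a1f4-d92aa6fc1d26

# First add the freed partition as a second device, then remove the old one.
# Removing a device relocates every chunk stored on it to the remaining
# device, so everything on it has to be read back once.
swap_devices() {
    run btrfs device add /dev/sdb1 "$MNT" &&
        run btrfs device remove /dev/sdb2 "$MNT"
}
```

Because the removal has to read back everything stored on the device, a drive with unreadable sectors is likely to abort the operation.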

Metadata are perfectly consistent:

erlangen:~ # btrfs check --force /dev/sdb2
Opening filesystem to check...
WARNING: filesystem mounted, continuing because of --force
Checking filesystem on /dev/sdb2
UUID: 61fc4107-d7da-4c0b-a1f4-d92aa6fc1d26
[1/7] checking root items
[2/7] checking extents
[3/7] checking free space tree
[4/7] checking fs roots
[5/7] checking only csums items (without verifying data)
[6/7] checking root refs
[7/7] checking quota groups skipped (not enabled on this FS)
found 1278897782784 bytes used, no error found
total csum bytes: 1246301616
total tree bytes: 2684928000
total fs tree bytes: 1093910528
total extent tree bytes: 148422656
btree space waste bytes: 423718490
file data blocks allocated: 1276212854784
 referenced 1276212854784
erlangen:~ # 

However, the data are rotten, as exposed by the numerous errors in the journal.
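Step 5/7 of the check output above states that checksum items were checked without verifying the data. A scrub, by contrast, reads the data and compares them against their checksums. The following sketch lists commands commonly used for this, assuming the mount point from the usage output further up. The names `run`, `verify_data`, `MNT` and `DRY_RUN` are illustrative assumptions, and the commands are only printed unless `DRY_RUN=0` is set.

```shell
# Print a command instead of executing it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Mount point taken from the usage output further up.
MNT=/media/61fc4107-d7da-4c0b-a1f4-d92aa6fc1d26

verify_data() {
    # Read data and metadata and verify checksums; -B waits for completion.
    run btrfs scrub start -B "$MNT"
    # Per-device error counters kept by the file system.
    run btrfs device stats "$MNT"
    # Kernel messages of the current boot that mention BTRFS.
    run journalctl -k -b --grep=BTRFS
}
```

The three commands are deliberately not chained with `&&`, so the counters and the journal are shown even when the scrub reports errors.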

erlangen:~ # smartctl -A /dev/sdb
smartctl 7.3 2022-02-28 r5338 [x86_64-linux-6.4.4-1-default] (SUSE RPM)
Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       3191
  3 Spin_Up_Time            0x0027   253   175   021    Pre-fail  Always       -       2291
  4 Start_Stop_Count        0x0032   091   091   000    Old_age   Always       -       9028
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   080   080   000    Old_age   Always       -       14636
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   097   097   000    Old_age   Always       -       3493
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       130
193 Load_Cycle_Count        0x0032   198   198   000    Old_age   Always       -       8928
194 Temperature_Celsius     0x0022   114   107   000    Old_age   Always       -       38
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       3
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       108

erlangen:~ # 
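The table above is easier to judge when only the sector-related counters are picked out. A small filter can do this; it is a sketch, and both the function name `smart_sector_alerts` and the choice of attributes 5, 197 and 198 as the ones worth watching are assumptions.

```shell
# Read `smartctl -A` output on standard input and print the name and raw
# value of attributes 5, 197 and 198 whenever the raw value is non-zero.
# Column 1 is the attribute ID, column 2 the name, column 10 the raw value.
smart_sector_alerts() {
    awk '($1 == 5 || $1 == 197 || $1 == 198) && $10 + 0 > 0 { print $2, $10 }'
}
```

Used as `smartctl -A /dev/sdb | smart_sector_alerts`, the table above would reduce to the single line `Current_Pending_Sector 3`.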

Backup of infamous host erlangen now relies on SSDs and NAS grade HDDs:

  1. Samsung model: SSD 970 EVO Plus 2TB size: 1.82 TiB
  2. Seagate model: ST8000VN004-2M2101 size: 7.28 TiB

Cargo Cult System Administration

My decades of experience in companies support this assessment. Professional administrators spent an estimated 80% of their time fixing problems that they had brought on themselves through sloppiness.

I always strive for “Science Based System Administration”. System administration is neither science nor witchcraft, but applying scientific methods gives its efficiency and robustness an amazing boost, which is what makes infamous host erlangen really great.

this-is-only-for-build-envs: Much Ado About Nothing

There has been some noise recently: Search results for 'this-is-only-for-build-envs order:latest' - openSUSE Forums

It’s yet another example of annoyances caused by Cargo Cult administration.

Actually, this is a non-issue on infamous host erlangen, which upgraded automatically using the same procedure as every day.