Leap Micro root filesystem - availability choices

I’ve been testing out Leap Micro 5.2, 5.3 (and 5.4 Alpha) as a base OS for container and VM workloads. It is working really well.
I am willing to commit to rolling it out on production machines (3 independent machines); however, there is one unsolved system design issue, at least for me:

  • I get that the aim of Leap Micro is a disposable, container-based / VM host runtime OS, so the default install goes onto a single disk partition, without any redundancy.
  • In the last 14-15 years I’ve run SUSE/openSUSE with great satisfaction as a server OS, with ext3/4 on top of (LVM and) mdraid1 on remote servers: when one of the HDDs inevitably failed, the system continued to operate without interruption, even across reboots, until I replaced the failed HDD.
  • Is this possible with Leap Micro?
    • A btrfs root is mandatory/necessary for snapshots and rollback => if I add a second device to btrfs during install and convert to the raid1 profile, everything works as expected during normal operation. However, in case of a simulated HDD failure (removing the first or the second device…) the system can no longer boot. I’ve looked in the SUSE and openSUSE documentation and found nothing on how to handle this case. I’ve tried various combinations of rootflags=degraded, rd.break=pre-mount, etc.: I could not start the system (it keeps waiting for the missing drives to appear…)
      Sure, booting the machine off a USB stick and mounting with -o degraded seems to work, until you discover that @/etc is a read-only snapshot… so no dice for simple recovery.
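For reference, the degraded-recovery attempt from rescue media described above looks roughly like this. Device names and the snapshot number are examples only, and the subvolume layout may differ on an actual Leap Micro install:

```shell
# Boot from rescue media, then mount the top-level subvolume of the
# surviving btrfs member in degraded mode (/dev/sda2 is an example).
mount -o degraded,subvolid=5 /devv/sda2 /mnt 2>/dev/null || \
mount -o degraded,subvolid=5 /dev/sda2 /mnt

# Check which snapshot is currently set as the default (booted) subvolume.
btrfs subvolume get-default /mnt

# Snapshots are read-only, which is why edits under @/etc fail. One
# possible workaround is clearing the read-only property on the active
# snapshot ("1" here is purely illustrative):
btrfs property get -ts /mnt/@/.snapshots/1/snapshot ro
btrfs property set -ts /mnt/@/.snapshots/1/snapshot ro false
```

Whether flipping the `ro` property is safe with transactional updates is another question; this is only a sketch of the manual recovery path, not a supported procedure.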

What is the suggested setup for this distribution in this case:

  • root btrfs on top of md-raid does not seem optimal, but maybe it is the only way? (I know that I’ll lose btrfs-based self-healing…)
  • root btrfs on top of LVM raid1: seems strongly discouraged, but at least it is very flexible…
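For comparison, the md-raid variant from the first option could be set up roughly like this. Device names are examples, and this has not been validated against the Leap Micro installer:

```shell
# Create a two-member RAID1 array from two partitions (example devices).
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda2 /dev/sdb2

# Put a single-device btrfs on top of the array. Snapshots and rollback
# still work, but btrfs can no longer self-heal from a second copy,
# since it only sees one device (the md array).
mkfs.btrfs /dev/md0

# Persist the array configuration so it assembles at boot.
mdadm --detail --scan >> /etc/mdadm.conf
```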

So my question is: I want some level of better availability for my servers with Leap Micro; can it be done?
If so, what is the recommended way of achieving it on a single-node system?

This is a well-known issue on systemd-based systems. The “integration” with systemd is primitive at best: systemd waits for the btrfs filesystem to become available, and udev does not mark a btrfs filesystem as available until all of its physical devices have been discovered. Up to now, every discussion has resulted in finger-pointing between btrfs and systemd, and nobody has come up with an alternative implementation.
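The mechanism can be inspected on a running system. The rule file path below is where typical systemd packages ship the btrfs udev rule; verify the exact location on your install:

```shell
# udev's btrfs rule calls the "btrfs ready" builtin, which asks the
# kernel whether all member devices of the filesystem have appeared:
grep -B1 -A2 'btrfs ready' /usr/lib/udev/rules.d/64-btrfs.rules

# While a member is missing, the device stays marked not-ready for
# systemd, so systemd keeps waiting on the device unit instead of
# attempting a degraded mount:
udevadm info /dev/sda2 | grep -i 'READY'
```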


Thank you for this insight: I did not manage to uncover this information myself.
Then it is not possible to use btrfs-based raid1 for root.
I’ll give mdraid1 under a btrfs root a try for availability reasons.
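If it helps the testing, a degraded-boot drill for the mdraid variant can be scripted along these lines (array and device names are examples):

```shell
# Mark one mirror member as failed and remove it from the array,
# then reboot to verify the system still comes up on the remaining half.
mdadm /dev/md0 --fail /dev/sdb2
mdadm /dev/md0 --remove /dev/sdb2
cat /proc/mdstat    # the degraded array should show [U_]

# After physically replacing the disk, add the new member and
# let the array resynchronize in the background.
mdadm /dev/md0 --add /dev/sdc2
watch cat /proc/mdstat
```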