[SOLVED] openSUSE Tumbleweed boot failure after zypper dup: NVMe RAID 0 timeout

Disclaimer: Hi everyone, I solved this problem with the help of AI, and this is a recap. I know AI posts are not always appreciated, but I have personally reviewed this text in a human way, hoping to help anyone who finds themselves in difficulty with RAID, SELinux, and boot timeouts.

I hope this helps anyone who, like me, has a RAID setup and has had the same problem.


The problem

After the latest system update (zypper dup), my openSUSE Tumbleweed workstation began failing to boot, consistently dropping into emergency mode.

System Setup:

  • CPU: AMD Ryzen 9 3900X.
  • OS: openSUSE Tumbleweed (Kernel 6.18.7-1-default).
  • Storage: Root partition on LVM, and the /home partition on a Software RAID 0 array composed of two NVMe drives.

Symptoms:

  1. Boot timeout: the process would hang for 90 seconds, eventually timing out while waiting for the UUID of the /home partition (46818d92...).
  2. Explicit RAID disable: The logs showed a specific dracut error: rd.md=0: removing MD RAID activation. This indicated the initramfs was being instructed to ignore RAID devices.
  3. SELinux denials: several avc: denied { getattr } errors appeared for mdadm, preventing the utility from correctly identifying or assembling the array during early boot.
  4. NVMe name swapping: NVMe identifiers (nvme0, nvme1, nvme2) were swapping roles between reboots, making device-path based mounting unreliable.

The solution

The fix involved ensuring persistent identifiers, forcing the RAID assembly in the initramfs stage, and resolving SELinux permission blocks.

1. Persistent identifiers and fstab

The first step was to sanitize /etc/fstab.

I replaced all device-mapper paths (like /dev/mapper/system-root) with UUIDs to ensure the kernel could always find the correct partitions regardless of NVMe enumeration order.

We also optimized the mount order (boot, system, local backup disks, swap and network drives) and options to streamline the process and speed up the initial boot sequence.
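As a minimal sketch of how a UUID gets from blkid into fstab: the sample blkid line below is reconstructed from the details in this post (the /home array on md127, formatted as XFS); on a live system you would run `sudo blkid /dev/md127` instead.

```shell
# Sample blkid output line, reconstructed from this setup;
# on a real system: sudo blkid /dev/md127
blkid_line='/dev/md127: UUID="46818d92-5933-4ccd-b18f-f7a0d7475e38" TYPE="xfs"'

# Pull the filesystem UUID out of the line...
uuid=$(printf '%s\n' "$blkid_line" | sed -n 's/.* UUID="\([^"]*\)".*/\1/p')

# ...and emit the matching fstab entry
printf 'UUID=%s  /home  xfs  defaults  0  0\n' "$uuid"
```

The same pattern works for every partition: one blkid call, one UUID, one fstab line.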

2. GRUB configuration

I updated /etc/default/grub to use UUIDs for all mount points and added kernel flags to override the RAID block.

Updated GRUB_CMDLINE_LINUX_DEFAULT:

  • root=UUID=aea38da1...: Replaced mapper paths with the unique ID.
  • rd.md=1: Explicitly enabled RAID to override the rd.md=0 default found in the logs.
  • rd.auto=1: Forced auto-detection of storage layers.
  • rd.md.uuid=b0c111d4...: Provided the specific UUID of the RAID array to the kernel during pre-boot.
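After editing /etc/default/grub and running `sudo grub2-mkconfig -o /boot/grub2/grub.cfg`, the flags can be verified on the next boot by reading /proc/cmdline. A small sketch of that check, run here against a sample command line (built from the values in this post) rather than the live one:

```shell
# On a real system: cmdline=$(cat /proc/cmdline)
cmdline='root=UUID=aea38da1-e532-409b-92dc-db393b93b625 rd.md=1 rd.auto=1 rd.md.uuid=b0c111d4:7d369110:3fc042ea:7ea0f2fa splash=silent quiet'

# Report whether each RAID flag actually reached the kernel
for flag in rd.md=1 rd.auto=1 rd.md.uuid=b0c111d4:7d369110:3fc042ea:7ea0f2fa; do
  case " $cmdline " in
    *" $flag "*) echo "$flag -> present" ;;
    *)           echo "$flag -> MISSING" ;;
  esac
done
```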

3. Dracut (Initramfs) module fix

I created a configuration file to ensure the RAID drivers are included and the mdadm.conf is respected.

File: /etc/dracut.conf.d/10-raid-fix.conf

add_dracutmodules+=" dm mdraid "
mdadmconf="yes"
hostonly="yes"

After saving, I regenerated the boot images for all kernels:

sudo dracut -f --regenerate-all
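To confirm the module actually landed in the regenerated image, the lsinitrd output can be grepped. A sketch of the check, run here against a sample of the "dracut modules:" section of lsinitrd output (the module list is illustrative; on a real system you would pipe `lsinitrd /boot/initrd` directly):

```shell
# On a live system: modules=$(lsinitrd /boot/initrd)
# Sample excerpt of the "dracut modules:" section
modules='dracut modules:
bash
dm
mdraid
rootfs-block'

# -x matches whole lines, so "mdraid" won't false-match e.g. "mdraid-extra"
printf '%s\n' "$modules" | grep -qx 'mdraid' \
  && echo "mdraid module present in initrd" \
  || echo "mdraid module MISSING"
```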

4. SELinux relabeling

Since SELinux was actively blocking mdadm processes from accessing system modules, I triggered a full filesystem relabel: sudo touch /.autorelabel

Note: The subsequent boot took about 10-15 minutes to finish scanning and re-labeling all files (especially on the large RAID array), but this resolved the permission issues permanently.
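For anyone checking whether they are hitting the same denials: they show up in the journal as `avc: denied` lines. A small sketch of how to pull out the interesting fields, run here against one sample line from my failing boot (on a live system the input would be `journalctl -b -k`):

```shell
# One AVC denial line captured from the failing boot
logline='kernel: audit: type=1400 audit(1770111821.897:4): avc:  denied  { getattr } for  pid=1223 comm="sh" path="/usr/bin/kmod" dev="dm-0" scontext=system_u:system_r:mdadm_t:s0-s0:c0.c1023 tcontext=system_u:object_r:kmod_exec_t:s0 tclass=file permissive=0'

# Extract the source/target contexts: who was blocked from what
# (here: mdadm_t denied access to kmod_exec_t)
printf '%s\n' "$logline" | grep -o 'scontext=[^ ]*\|tcontext=[^ ]*'
```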

Outcome

The system now boots correctly and quickly (very quickly compared to my previous setup!). The RAID 0 array (md127) is automatically assembled and active upon reaching the desktop.

Hope this helps anyone dealing with complex storage setups and boot failures on Tumbleweed!


Cheers mate. I’ve had the same problem since the last update (which is why I’ve reverted to the previous snapshot). I’ll test it tomorrow. Do you know if there’s already a bug report for this?

hey @Thor_Thenhammer I didn’t think it was a Tumbleweed bug, but rather my fault for misconfiguring RAID. If you need more details, feel free to reach out to me.


I have this same problem, but the issue seems to lie in the change of mdadm package from 4.4+31.g541b40d3-1.1.x86_64 to 4.5+39.g1aa6e5de-1.1

I can update everything except mdadm, and it’s all fine. As soon as I update mdadm, my RAID1 device no longer starts at boot


This morning I made another dup, with latest updates and everything was smooth as usual.

In any case, this is the first incident after approximately three years of using Tumbleweed every day :laughing:

Rock solid!


Tried your solution yesterday evening but didn’t work for me.

Absolutely! This is also my first issue in a very long time.

Yes. Locking mdadm to its current state also fixes it for me.

The question is whether the changes are intentional and we need to reconfigure the RAIDs on our side, or whether this is a bug. Can’t find a report on bugzilla yet.

This will be nice to know :slight_smile:

To my knowledge, what I did should avoid issues like this.
But again, I was helped by AI, so there is a potential risk. If you need a step-by-step guide with all the steps, I can send it one-to-one.

This morning I also updated my laptop (1.2 GB of updates), but it isn’t using RAID. mdadm was among the packages. It updated without any issues.


Some remarks: A mapper device like /dev/mapper/system-root should never be a problem since it’s an LVM LV name that is explicitly assigned to be unique. However, something like /dev/mapper/md127 might be a lot less unique, so avoiding this can really help.

For RAIDs etc. you have little other choice than using a UUID. But for plain partitions, I very much prefer to assign a volume label. Not only is that much better to read, it’s also a lot less problematic if you need to move your partitions to a new disk or to a new PC.

[sh @ morgul] ~ 1 % /bin/lsblk -o NAME,SIZE,FSTYPE,LABEL,MOUNTPOINTS
NAME          SIZE FSTYPE      LABEL    MOUNTPOINTS
sda           1.8T                      
├─sda1          1T ext4        work     /work
└─sda2        839G crypto_LUKS          
nvme0n1     931.5G                      
├─nvme0n1p1     1G vfat        efi-boot /boot/efi
├─nvme0n1p2     2G swap        swap     [SWAP]
├─nvme0n1p3   100G ext4        root-01  /
├─nvme0n1p4   100G ext4        root-02  
└─nvme0n1p5 728.5G ext4        ssd-work /ssd-work

[sh @ morgul] ~ 2 % cat /etc/fstab
#
# /dev/nvme0n1
#

LABEL=swap      swap             swap  defaults                  0  0
LABEL=root-01   /                ext4  defaults                  0  1
LABEL=root-02   /alternate-root  ext4  ro,noauto,data=ordered    0  2
LABEL=ssd-work  /ssd-work        ext4  data=ordered              0  2
LABEL=efi-boot  /boot/efi        vfat  utf8,dmask=0077           0  2


#
# /dev/sda
#

LABEL=work      /work            ext4  data=ordered              0  2
LABEL=crypto    /crypto          ext4  user,noauto,data=ordered  0  2

[sh @ morgul] ~ 3 % ls -l /dev/disk/by-label
total 0
lrwxrwxrwx 1 root root 15 Feb  4 09:43 efi-boot -> ../../nvme0n1p1
lrwxrwxrwx 1 root root 15 Feb  4 09:43 root-01 -> ../../nvme0n1p3
lrwxrwxrwx 1 root root 15 Feb  4 09:43 root-02 -> ../../nvme0n1p4
lrwxrwxrwx 1 root root 15 Feb  4 09:43 ssd-work -> ../../nvme0n1p5
lrwxrwxrwx 1 root root 15 Feb  4 09:43 swap -> ../../nvme0n1p2
lrwxrwxrwx 1 root root 10 Feb  4 09:43 work -> ../../sda1

[sh @ morgul] ~ 4 % ls -l /dev/disk/by-uuid
total 0
lrwxrwxrwx 1 root root 10 Feb  4 09:43 1ca85363-7e65-4f21-b710-db41f907a67a -> ../../sda2
lrwxrwxrwx 1 root root 15 Feb  4 09:43 2d30496b-beb0-406c-9965-7f1aa70b74f0 -> ../../nvme0n1p4
lrwxrwxrwx 1 root root 10 Feb  4 09:43 47885202-37cf-4c6a-a30a-8a27cad98c9e -> ../../sda1
lrwxrwxrwx 1 root root 15 Feb  4 09:43 9316fd5f-ea0e-409a-93eb-2483ec97509e -> ../../nvme0n1p2
lrwxrwxrwx 1 root root 15 Feb  4 09:43 9CA2-7D93 -> ../../nvme0n1p1
lrwxrwxrwx 1 root root 15 Feb  4 09:43 a1c83c7c-9e9d-437e-b0a0-35d97c47dcbc -> ../../nvme0n1p3
lrwxrwxrwx 1 root root 15 Feb  4 09:43 bf6592c2-0f36-4cc0-b60c-b8ba1ea72669 -> ../../nvme0n1p5

I hate dealing with UUIDs. They are not made for human consumption.

[sh @ morgul] ~ 5 % sudo blkid
/dev/nvme0n1p5: LABEL="ssd-work" UUID="bf6592c2-0f36-4cc0-b60c-b8ba1ea72669" BLOCK_SIZE="4096" TYPE="ext4" PARTUUID="5902418b-e309-4357-b4f6-113ac721f2d7"
/dev/nvme0n1p3: LABEL="root-01" UUID="a1c83c7c-9e9d-437e-b0a0-35d97c47dcbc" BLOCK_SIZE="4096" TYPE="ext4" PARTUUID="5f02aad8-29bb-4944-8afd-3c567ce44917"
/dev/nvme0n1p1: LABEL_FATBOOT="efi-boot" LABEL="efi-boot" UUID="9CA2-7D93" BLOCK_SIZE="512" TYPE="vfat" PARTUUID="f3471d8b-5fff-4d56-ab83-a1e5cc1e23da"
/dev/nvme0n1p4: LABEL="root-02" UUID="2d30496b-beb0-406c-9965-7f1aa70b74f0" BLOCK_SIZE="4096" TYPE="ext4" PARTUUID="2cb27c37-0b79-4f46-a244-bb113034beec"
/dev/nvme0n1p2: LABEL="swap" UUID="9316fd5f-ea0e-409a-93eb-2483ec97509e" TYPE="swap" PARTUUID="caebe5a5-a477-46ff-94cf-bfb8e3238869"
/dev/sda2: UUID="1ca85363-7e65-4f21-b710-db41f907a67a" TYPE="crypto_LUKS" PARTUUID="9d6b1fd2-25d9-4f84-b8c5-544409a2be78"
/dev/sda1: LABEL="work" UUID="47885202-37cf-4c6a-a30a-8a27cad98c9e" BLOCK_SIZE="4096" TYPE="ext4" PARTUUID="884952d1-bbc4-4608-9f24-c41dfbc79e41"

Yikes - what a mess.


That would be great. Maybe I missed a step. I am eager to try it again.

Do you still have all the logs available while fixing it? Could you maybe open a bug report on bugzilla so some of the maintainers/developers can confirm that this is a bug? I still don’t think that’s the intention. I think most people here have set up their RAID using Yast so that there aren’t any obscure configurations.

Hey @Thor_Thenhammer I’m sharing the detailed process here, along with my files.
I’ll be happy if any of it helps.
I have the logs, and if you point me to where I can open the issue, I’ll do that.

I’ve dupped again this morning with no issues, but there was no mdadm package in the list.


:open_file_folder: PART 1: The Recovery Guide

1. Disclaimer

This documentation serves as a technical record of the steps taken to resolve a persistent boot failure involving NVMe RAID assembly and SELinux permission denials on an openSUSE Tumbleweed workstation (AMD Ryzen 9 3900X).

2. Problem Description

Following a system update (zypper dup), the workstation failed to boot, consistently dropping into Emergency Mode after a 90-second timeout.

System Architecture:

  • Root: LVM on NVMe.
  • Home: Software RAID 0 (MDADM) on 2x NVMe drives.
  • Issue: The system could not mount /home because the RAID array (md127) was not assembling automatically.

Diagnostic Findings (from logs):

  1. NVMe Swapping: The kernel reordered NVMe identifiers (nvme0 vs nvme1) between reboots, breaking static device paths in /etc/fstab.
  2. RAID Disabled: The boot logs showed rd.md=0: removing MD RAID activation, indicating the initramfs was explicitly instructed to ignore RAID.
  3. SELinux Blocks: Logs showed avc: denied { getattr } for mdadm, preventing the RAID tool from accessing kernel modules during boot.

3. The Resolution Process

Step 1: Sanitize /etc/fstab (Persistence)

We replaced unstable device paths (e.g., /dev/mapper/...) with persistent UUIDs to ensure the kernel locates partitions regardless of drive enumeration order.

Step 2: Force RAID in Dracut (Initramfs)

We created a configuration file to force the inclusion of RAID drivers in the boot image and ensure mdadm.conf is read.

  • Action: Created /etc/dracut.conf.d/10-raid-fix.conf.
  • Command: sudo dracut -f --regenerate-all

Step 3: GRUB Kernel Parameters (Override)

We modified the bootloader configuration to override the rd.md=0 block and explicitly point the kernel to the RAID UUID.

  • Action: Edited GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub.
  • Key Flags Added: rd.md=1 (force RAID on), rd.auto=1, rd.md.uuid=<UUID>.
  • Command: sudo grub2-mkconfig -o /boot/grub2/grub.cfg

Step 4: SELinux Relabeling (Security)

To fix the permission denials blocking mdadm, we triggered a full filesystem relabel.

  • Command: sudo touch /.autorelabel then reboot.
  • Result: The subsequent boot took ~10 minutes to scan all files, but resolved the permission errors.

4. Final Outcome

  • Status: The system boots successfully.
  • RAID: md127 is active and assembled automatically ([UU] state).
  • Performance: Boot time is significantly faster due to optimized mount order and UUID usage.
  • Space: Recovered ~13GB of space by cleaning old Btrfs snapshots (snapper delete).

:open_file_folder: PART 2: Configuration Files Reference

Here are the final, working configuration files used in the fix.

File 1: /etc/fstab (Optimized)

Note: Using UUIDs prevents issues when NVMe drives change names (nvme0 ↔ nvme1).

➜  ~ cat /etc/fstab
# /dev/system/root (Btrfs System Volumes)  
UUID=aea38da1-e532-409b-92dc-db393b93b625  /                       btrfs  defaults                          0  0  
UUID=aea38da1-e532-409b-92dc-db393b93b625  /.snapshots             btrfs  subvol=/@/.snapshots              0  0  
UUID=aea38da1-e532-409b-92dc-db393b93b625  /boot/grub2/i386-pc     btrfs  subvol=/@/boot/grub2/i386-pc      0  0  
UUID=aea38da1-e532-409b-92dc-db393b93b625  /boot/grub2/x86_64-efi  btrfs  subvol=/@/boot/grub2/x86_64-efi   0  0  
UUID=aea38da1-e532-409b-92dc-db393b93b625  /opt                    btrfs  subvol=/@/opt                     0  0  
UUID=aea38da1-e532-409b-92dc-db393b93b625  /root                   btrfs  subvol=/@/root                    0  0  
UUID=aea38da1-e532-409b-92dc-db393b93b625  /srv                    btrfs  subvol=/@/srv                     0  0  
UUID=aea38da1-e532-409b-92dc-db393b93b625  /usr/local              btrfs  subvol=/@/usr/local               0  0  
UUID=aea38da1-e532-409b-92dc-db393b93b625  /var                    btrfs  subvol=/@/var                     0  0  
  
# Boot EFI (Physical Partition)  
UUID=0BA3-7AED                             /boot/efi               vfat   utf8                              0  2  
  
# User Data (RAID Array)  
UUID=46818d92-5933-4ccd-b18f-f7a0d7475e38  /home                   xfs    defaults                          0  0  
  
# Local backup partitions  
UUID=454e7577-4429-4741-b7da-c3e937a93092  /volumes/backup         xfs    defaults                      0  0  
UUID=4438d25a-98ab-44e1-8476-7048c533cf5a  /volumes/restic         xfs    defaults                      0  0  
  
# Swap (LVM Volume)  
UUID=373db110-6767-467c-9721-31c56af8da92  swap                    swap   defaults                          0  0

File 2: /etc/default/grub (Bootloader)

Note: The rd.md=1 flag is critical to override the system default that was disabling RAID.

GRUB_DISTRIBUTOR=
GRUB_DEFAULT=saved
GRUB_HIDDEN_TIMEOUT=0
GRUB_HIDDEN_TIMEOUT_QUIET=true
GRUB_TIMEOUT=8
GRUB_CMDLINE_LINUX_DEFAULT=" \
  root=UUID=aea38da1-e532-409b-92dc-db393b93b625 \
  resume=UUID=373db110-6767-467c-9721-31c56af8da92 \
  rd.md=1 \
  rd.auto=1 \
  rd.md.uuid=b0c111d4:7d369110:3fc042ea:7ea0f2fa \
  splash=silent \
  quiet \
  security=selinux \
  selinux=1 \
  amd_iommu=on \
  iommu=pt \
  mitigations=auto"
GRUB_CMDLINE_LINUX=""
GRUB_TERMINAL="gfxterm"
GRUB_GFXMODE="1920x1200"
GRUB_THEME=/boot/grub2/themes/openSUSE/theme.txt
SUSE_BTRFS_SNAPSHOT_BOOTING="true"
GRUB_USE_LINUXEFI="true"
GRUB_DISABLE_OS_PROBER="false"

File 3: /etc/dracut.conf.d/10-raid-fix.conf (Initramfs)

Note: This ensures the RAID kernel modules are physically present in the boot image.

add_dracutmodules+=" dm mdraid "
mdadmconf="yes"
hostonly="yes"

I agree on all points.
I was sceptical at first too, and I procrastinated on updating.

But the more I interacted with the AI, the more I became convinced that it could be the solution. In the end, it didn’t cost me much to experiment, in the sense that I still had snapper as a lifesaver.

In theory, disk configuration is something you do once and that’s it, so having UUIDs in fstab — not human-friendly at all — ultimately does the job.

If you look at the post below, I commented and sorted my new fstab, according to boot logic. This is another suggestion from the AI, and it also improved boot times.

For example, I didn’t know that it would be better to put /swap at the end.

Before, it was in the middle of the others.
I am attaching my previous fstab.bak here below, that was generated when the system was installed:

➜  ~ cat /etc/fstab.bak
/dev/system/root                           /                       btrfs  defaults                      0  0
/dev/system/root                           /var                    btrfs  subvol=/@/var                 0  0
/dev/system/root                           /usr/local              btrfs  subvol=/@/usr/local           0  0
/dev/system/root                           /root                   btrfs  subvol=/@/root                0  0
/dev/system/root                           /opt                    btrfs  subvol=/@/opt                 0  0
/dev/system/root                           /boot/grub2/x86_64-efi  btrfs  subvol=/@/boot/grub2/x86_64-efi  0  0
/dev/system/root                           /boot/grub2/i386-pc     btrfs  subvol=/@/boot/grub2/i386-pc  0  0
UUID=0BA3-7AED                             /boot/efi               vfat   utf8                          0  2
/dev/system/swap0                          swap                    swap   defaults                      0  0
/dev/system/root                           /.snapshots             btrfs  subvol=/@/.snapshots          0  0
UUID=46818d92-5933-4ccd-b18f-f7a0d7475e38  /home                   xfs    defaults                      0  0
LABEL=restic                               /volumes/restic         xfs    defaults                      0  0
LABEL=backup                               /volumes/backup         xfs    defaults                      0  0
/dev/system/root                           /srv                    btrfs  subvol=/@/srv                 0  0

# other network drives
192.168.20.2:/volume1/backup   etc etc
etc
etc

Awesome! Thanks. I will try it again this evening.

Also awesome :). You can create a bug report here: https://bugzilla.suse.com/enter_bug.cgi?product=openSUSE%20Tumbleweed

TL;DR

It’s difficult to say in retrospect what happened. Do you still have snapshots of your system from before and after the issue occurred? It might be interesting to look for dracut errors from the update transaction that created the broken initrd, and to examine the initrd itself.

Because that’s most probably what caused your issue: for some reason the mdraid module and/or its parameters (rd.md.uuid) weren’t set correctly in the initrd. The file that matters (inside the initrd image) is /etc/cmdline.d/90mdraid.conf, see below.

Long story

This is not necessary. /dev/mapper references in fstab are fine as long as the map name (e.g. system-root) is unique. Don’t confuse this with devnode names like /dev/sda or /dev/dm-1, which should indeed be avoided.

None of this should be necessary. rd.auto=1 would be sufficient, but even that is not required. If the mdraid module is included in the initrd, dracut should have generated a file /etc/cmdline.d/90mdraid.conf in your initrd which would contain something like this:

 rd.md.uuid=0fedcba9:12345678:87654321:9abcdef0 

Run lsinitrd /boot/initrd etc/cmdline.d/90mdraid.conf to verify that.

There are 2 possible cases where this wouldn’t happen:

  • if you disable hostonly mode in dracut
  • if the initrd is built from an environment where MD is not present in the block device stack (e.g. from a rescue system on top of a RAID1 where only one disk was present and MD was not loaded).

This isn’t necessary, either. hostonly=yes is the default on openSUSE. dm and mdraid should be autodetected by dracut when you build the RAM disk, except if, as noted above, you build the initrd in a non-standard environment. dracut in hostonly mode assumes that it needs to set up the same block device stack that is present while the initrd is built. Thus by default it will include dm and mdraid if and only if they’re present in the stack below your root device.

Do you have any concrete evidence that this was actually happening, other than that your AI guessed so? Have you checked your audit logs, for example?

Hi! Thanks a lot for taking the time to read through and provide such a detailed and technical explanation. I really appreciate it!

You are absolutely right: in a perfect scenario, openSUSE and dracut should handle all of this automatically. To answer your points and provide the data you asked for…

Regarding the initrd and 90mdraid.conf: I ran the command you suggested on my current, fully working system:

sudo lsinitrd /boot/initrd etc/cmdline.d/90mdraid.conf

but it returns no output and the file does not exist inside my initrd.

My guess is that dracut is completely failing to auto-detect the MD RAID array during initramfs generation. Because this auto-generated file is missing, the initrd defaults to rd.md=0, completely disabling RAID activation.

This is exactly why hardcoding rd.md=1 and rd.md.uuid=... directly into GRUB_CMDLINE_LINUX_DEFAULT was strictly necessary to get the machine booting again.
Without those GRUB parameters, the kernel remained blind to the array.

Regarding /dev/mapper vs UUIDs, that’s a fair point. As per my previous comment, I thought switching to UUIDs was just a way to be 100% bulletproof during the troubleshooting phase, eliminating any possible variable related to the nvme0/nvme1 swapping, even if LVM technically handles the mapper names fine.

The SELinux relabeling wasn’t just an AI guess: I actually found concrete evidence in the journalctl logs during the failing boots. SELinux was actively throwing AVC denials, preventing mdadm scripts from accessing kmod to load the necessary drivers during early boot.

Here are the exact lines from my logs before the relabel:

Feb 03 10:43:41 workstation kernel: audit: type=1400 audit(1770111821.897:4): avc:  denied  { getattr } for  pid=1223 comm="sh" path="/usr/bin/kmod" dev="dm-0" ino=1739916 scontext=system_u:system_r:mdadm_t:s0-s0:c0.c1023 tcontext=system_u:object_r:kmod_exec_t:s0 tclass=file permissive=0
Feb 03 10:43:41 workstation kernel: audit: type=1400 audit(1770111821.906:6): avc:  denied  { getattr } for  pid=1231 comm="sh" path="/usr/bin/kmod" dev="dm-0" ino=1739916 scontext=system_u:system_r:mdadm_t:s0-s0:c0.c1023 tcontext=system_u:object_r:kmod_exec_t:s0 tclass=file permissive=0

Triggering /.autorelabel fixed these denials, which finally allowed the forced RAID assembly to succeed.

Thank you for clarifying the proper openSUSE default behavior. It seems my specific NVMe + LVM + RAID 0 setup causes the auto-detection to trip up, making the manual overrides mandatory.

Indeed, if etc/cmdline.d/90mdraid.conf wasn’t created by dracut while the initrd was built, your RAID stack had not been detected. This was probably caused by the SELinux denials, then. Did the denials occur while you were building the initrd, or during boot? Did you rebuild the initrd after relabeling?

I’m not an SELinux expert, but if relabelling fixes the issue, AFAIU it means that the filesystem labels had previously been broken. The question is, then, how that came to pass. I’ll leave this to the experts in bug 1257793.

I registered here just to report same issues with RAID.

When making a fresh installation of Tumbleweed, the installer properly detects the old mdadm RAID1 at /dev/md0, and I set a mount point for it. After installation, boot fails and drops into the emergency console, while journalctl displays “UUID truncated” errors.

This NEVER happened with January install ISOs, but no matter which February install ISO I use, booting with mdadm RAID always fails because of “UUID truncated” errors.

Thanks for the follow-up.
Yes, I am actually the one who opened Bug 1257793. Your logic makes perfect sense and helps connect the dots on why the breakage happened in the first place.

To answer your questions about the timeline and how the fix was applied.

When did the denials occur?
I explicitly saw the SELinux avc: denied messages in the boot logs (journalctl -b from the emergency shell) during the failing boot attempts. However, based on your logic, it is highly probable that SELinux also silently blocked Dracut from probing the RAID array during the actual zypper dup update transaction. That perfectly explains why Dracut failed to generate the 90mdraid.conf file when building the initrd.

Did I rebuild the initrd after relabeling?
Actually, no! The sequence I used for the fix was:

  1. Hardcoded rd.md=1 and rd.md.uuid=... into /etc/default/grub.
  2. Ran dracut -f --regenerate-all (while the SELinux labels were presumably still broken).
  3. Ran touch /.autorelabel.
  4. Rebooted.

I think it worked because I forced the parameters directly into the GRUB kernel line: it didn’t matter that Dracut’s auto-detection was broken by the SELinux bug. The GRUB parameters forced the initrd to look for the RAID anyway. Then, during the reboot, the .autorelabel pass fixed the broken filesystem contexts. Once the relabeling was done, the mdadm processes in early boot finally had the correct permissions to execute and assemble the array based on my forced GRUB parameters.

Thanks again for the great technical insights, it really helps to understand the root mechanics behind the bug I reported.

Hi and welcome! With the help of AI, I put together the following steps that fixed the RAID boot issues on my machine.

Disclaimer: This worked perfectly for my setup, but I don’t know if it will fix your specific error. If you try it, please leave some feedback here so we can understand if it solves the problem for fresh installs.

Here are the steps to follow:

1. Boot from emergency shell
When the boot fails, type this to assemble the RAID manually and reach the desktop:

mdadm --assemble --scan
exit
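Before typing exit, it's worth confirming the array actually came up. On the real machine that means reading /proc/mdstat (or running `mdadm --detail /dev/md127`); the sketch below runs the same check against a sample mdstat line, since the device names vary per setup:

```shell
# On a live system: mdstat=$(cat /proc/mdstat)
# Sample line for an assembled two-disk RAID 0 (device names illustrative)
mdstat='md127 : active raid0 nvme2n1p1[1] nvme1n1p1[0]'

case "$mdstat" in
  *active*) echo "md127 assembled, safe to exit the emergency shell" ;;
  *)        echo "md127 still missing, re-check the mdadm --assemble output" ;;
esac
```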

2. Modify GRUB
Once logged in, open /etc/default/grub and add these flags to the GRUB_CMDLINE_LINUX_DEFAULT line to force RAID activation:
rd.md=1 rd.auto=1

3. Apply changes and fix SELinux
Open a terminal and run these commands to update the bootloader, rebuild the initramfs, and fix any potential SELinux permission blocks:

sudo grub2-mkconfig -o /boot/grub2/grub.cfg
sudo dracut -f --regenerate-all
sudo touch /.autorelabel

4. Reboot
Restart your system.

Important: The first boot took about 10 minutes in my case, because SELinux has to relabel the filesystem. Do not turn off the system; just let it finish.

If this works, please consider adding your experience to Bug 1257793 on the openSUSE Bugzilla, I think it will definitely help the developers.
