Booting Problems From A Degraded 13.2 RAID 1 Partition

  • While performing some pre-production testing of an openSUSE 13.2 install, I discovered a problem when booting from a degraded mdadm RAID 1 partition. If the second disk, /dev/sdb (containing RAID partition /dev/sdb2), is physically inoperative or disconnected, the system will not boot and I end up in the dracut emergency shell. At that point mdadm shows the array, /dev/md0, to be operational albeit degraded, but nothing that should prevent booting.

  • Thinking that it still had something to do with the degraded array, I reconfigured it as a single-disk RAID 1 in which /dev/md0 contained only /dev/sda2 (after failing out and removing /dev/sdb2 I ran mdadm --grow /dev/md0 --raid-devices=1 --force; the full sequence is sketched below). With the second disk removed it still would not boot and exhibited the same behavior as above. If the second disk was once again physically connected, it booted without incident.
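    For completeness, the full sequence I used to shrink the mirror was roughly the following (reconstructed from memory, so treat it as a sketch; device names match the layout shown further down):
    mdadm /dev/md0 --fail /dev/sdb2
    mdadm /dev/md0 --remove /dev/sdb2
    mdadm --grow /dev/md0 --raid-devices=1 --force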

  • One of my early thoughts was that it might have something to do with the boot loader. 13.2 uses GRUB 2 by default and doesn’t offer GRUB (I guess they’re calling it GRUB Legacy now) as an install option. I’m more familiar with configuring the original, so I installed it (sketched below) without incident and, after removing /dev/sdb, the system exhibited the same behavior as above.
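    What I ran to put GRUB Legacy on the first disk, from the grub shell, was roughly this (again from memory, so a sketch; (hd0,1) is /dev/sda2, which holds /boot):
    grub> device (hd0) /dev/sda
    grub> root (hd0,1)
    grub> setup (hd0)
    grub> quit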

  • The emergency shell gives the option to view the boot log (the commands I used to pull it up are sketched after this list). Some interesting highlights:
    o kernel: md0: is active with 1 out of 2 mirrors.
    o kernel: md0: detected capacity change from 0 to 91267923968
    o kernel: md0: unknown partition table
    o systemd[1]: Found device /dev/md0
    o dracut-initqueue[284]: Warning: Could not boot
    o dracut-initqueue[284]: Warning: /dev/disk/by-uuid/ea3 … does not exist
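    For reference, I pulled these lines out of the report the emergency shell generates, roughly like this (the exact file name may differ depending on the dracut version):
    less /run/initramfs/rdsosreport.txt
    journalctl -b    # if journalctl is available in the initramfs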

  • Interestingly, the /dev/disk/by-uuid entry listed above equates to /dev/sdb1 when /dev/sdb is attached; in this case it isn’t. I thought that perhaps the resume= entry in the GRUB menu was the cause, but no: I had already removed it. (How I confirmed the UUID-to-device mapping is sketched below.)
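    To confirm that mapping I checked, with /dev/sdb attached, roughly:
    ls -l /dev/disk/by-uuid/
    blkid /dev/sdb1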

  • This has never been a problem in previous versions. Any ideas?

Here is my disk configuration:

fdisk -l
Disk /dev/sda: 232.9 GiB, 250059350016 bytes, 488397168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x000459a6

Device Boot Start End Sectors Size Id Type
/dev/sda1 2048 8390655 8388608 4G 82 Linux swap / Solaris
/dev/sda2 * 8390656 186648575 178257920 85G fd Linux raid autodetect

Disk /dev/md0: 85 GiB, 91267923968 bytes, 178257664 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk /dev/sdb: 93.2 GiB, 100030242816 bytes, 195371568 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x528b2130

Device Boot Start End Sectors Size Id Type
/dev/sdb1 2048 8390655 8388608 4G 82 Linux swap / Solaris
/dev/sdb2 * 8390656 186648575 178257920 85G fd Linux raid autodetect

mdadm --detail /dev/md0
/dev/md0:
        Version : 1.0
  Creation Time : Fri May 15 11:32:28 2015
     Raid Level : raid1
     Array Size : 89128832 (85.00 GiB 91.27 GB)
  Used Dev Size : 89128832 (85.00 GiB 91.27 GB)
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Wed May 27 19:39:44 2015
          State : clean

 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           Name : any:0
           UUID : f191ca0c:b31d6d89:41232679:5e77bec6
         Events : 2465

    Number   Major   Minor   RaidDevice State
       0       8        2        0      active sync   /dev/sda2
       2       8       18        1      active sync   /dev/sdb2

cat /etc/fstab
/dev/sda1 swap swap defaults 0 0
/dev/md0 / ext3 acl,user_xattr 1 1

Did you recreate the initrd? Otherwise it still contains the old configuration that refers to the two-disk setup.
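Something like the following should regenerate it on 13.2 (as far as I remember mkinitrd is just a wrapper around dracut there; adjust the kernel version if you call dracut directly):

mkinitrd
# or, calling dracut directly:
dracut --force /boot/initrd-$(uname -r) $(uname -r)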

Yes, the problem is known and is related to the timeouts used by udev/dracut when waiting for a device to appear. Try updating 13.2 with the current patches; make sure to refresh the initrd after that. If it still does not work, open a bug report. I am actually fairly confident there is already a bug report about this problem but do not have the reference handy; try searching for it on Bugzilla.
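Once the initrd has been refreshed you can sanity-check it before rebooting; something along these lines should show whether it still carries references to the old two-disk layout (lsinitrd ships with dracut; the second line only works if the mdraid module embedded an mdadm.conf):

lsinitrd /boot/initrd-$(uname -r) | grep -i mdadm
lsinitrd /boot/initrd-$(uname -r) -f etc/mdadm.conf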

Thanks for your reply.

I’ve not recreated the initrd (I’ll try it), but I really shouldn’t need to in order to get it to boot. This is a configuration I’ve used since version 10.3, or shortly thereafter, without incident. Using mirrored drives is a great and easy way to make backups and to clone a system without having to jump through hoops.

I submitted a bug report on this.