Hi folks.
I’m a bit of a newbie to this, but I have been researching this problem for some time and have not yet found a definitive solution.
From the rather extensive research I’ve done, this appears to be a Grub2 issue… but I’m happy to be corrected!
So, I’ve set up a shiny new openSUSE Leap 15 machine to replace an old openSUSE 11.4 machine.
It’s a fresh install on all-new hardware.
The setup is a pair of 1TB drives that I’ve software-mirrored in Linux. I’ve always been happy with this, as I’ve had drives fail on my old openSUSE 11.x machine and always recovered without issues.
However, where the old machine boots with a degraded RAID array, the new openSUSE Leap 15 machine will not.
So, here’s the story so far:
The machine boots from a BIOS boot partition.
Grub2 is installed onto the boot partitions of both sda and sdb, and is picked up from either drive OK.
When both drives are connected, the RAID Array is fine, and everything boots fine.
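For anyone wanting to confirm the array state before and after pulling a drive, here is a minimal sketch. It assumes the kernel reports status in the usual /proc/mdstat layout (an "mdN : ..." line followed by a line ending in a bitmap such as [UU] or [U_]); the function name list_degraded_md is just something I made up for illustration.

```shell
#!/bin/sh
# list_degraded_md (hypothetical helper): print the name of every md array
# whose member bitmap contains "_", i.e. at least one member is missing.
# Reads /proc/mdstat by default; pass another file as $1 to test it.
list_degraded_md() {
    mdstat_file="${1:-/proc/mdstat}"
    awk '
        # Remember the array name from lines like "md0 : active raid1 ..."
        /^md[0-9]+ :/ { name = $1 }
        # The status line carries the bitmap, e.g. "[2/1] [U_]"
        /\[[U_]+\]/ && name != "" {
            if ($0 ~ /\[[U_]*_[U_]*\]/) print name
            name = ""
        }
    ' "$mdstat_file"
}
```

With both drives connected I would expect it to print nothing. It only tells you what Linux sees once the system is up, so it does not help at the Grub rescue prompt.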
But I want to “simulate” a failed array and be sure the machine boots (wouldn’t anyone…?). So I shut down, pull the SATA connector off one drive and try booting.
At this point, no matter which drive I disconnect, Grub2 boots to the Grub Rescue prompt.
So, I know Grub2 is picking up the MBR as I expect.
No matter which drive is disconnected, Grub2 errors with “Cannot find MUUID xxxxxxxxxxxxxxxxxxxx”
I can “ls” the drives and everything is visible (including my ‘mdx’ volumes), but even when issuing commands in Grub2 rescue mode, I can’t get the machine to start.
Oh, as soon as I reconnect the second drive, it all boots again as if nothing is wrong.
Reading around all of this, it would seem that I can’t get this machine to boot a degraded array.
There was a posting about a dracut bug, but I tried a workaround (that others had used), to no avail.
That workaround was to create /etc/dracut.conf.d/mdadm.conf (which had not existed) and add this line to it:
    install_optional_items+=" /usr/lib/systemd/system/mdadm-last-resort@.service /usr/lib/systemd/system/mdadm-last-resort@.timer "
I then rebuilt the initrd (all OK), and then ensured the new initrd was picked up.
Still no dice. The machine still boots with both drives connected and doesn’t with one removed.
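In case it helps anyone reproduce the workaround, the steps can be sketched roughly as below. The two function names are hypothetical helpers I've made up; the file path and the config line are the ones from the workaround, and the rebuild step assumes the stock dracut and lsinitrd tools, run as root.

```shell
#!/bin/sh
# write_mdadm_dropin (hypothetical helper): write the dracut drop-in used by
# the workaround. Defaults to /etc/dracut.conf.d (needs root); pass another
# directory as $1 to try it without touching the system.
write_mdadm_dropin() {
    conf_dir="${1:-/etc/dracut.conf.d}"
    mkdir -p "$conf_dir" || return 1
    cat > "$conf_dir/mdadm.conf" <<'EOF'
install_optional_items+=" /usr/lib/systemd/system/mdadm-last-resort@.service /usr/lib/systemd/system/mdadm-last-resort@.timer "
EOF
}

# rebuild_and_check_initrd (hypothetical helper): regenerate the initrd for
# the running kernel, then list its contents and look for the added units.
# Not called automatically, since both commands need root on the real machine.
rebuild_and_check_initrd() {
    dracut --force && lsinitrd | grep mdadm-last-resort
}
```

If the grep prints nothing, the units did not make it into the initrd. In my case this is probably beside the point, since Grub2 fails before the initrd is ever loaded.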
With the degraded array, Grub2 never gets as far as the ramdisk bit of the boot cycle anyway.
So, after an awful lot of poking around, it seems that I need some form of arguments or options in Grub2 to force it to use a degraded array.
But I can’t find anything that clearly states what this might look like in terms of commands or options issued to Grub2.
Somehow I need to force Grub2 to use the md0 device regardless of a degraded state. That seems to be the key.
I know others are having these issues, so I thought it worth putting a comment here.
Very grateful for any help from the esteemed forum.