openSUSE 12.2 has problems with software RAID

Hi all,

I am finding that openSUSE 12.2 has some problems with software RAID 1, particularly with 0.9 metadata.

I install a large number of systems as part of my job, so I have a network-boot system based on image tarballs for installing a wide variety of distros. I’ve been doing this for many years, as it’s much faster than installing from DVD every time and makes it easier to back up customizations.

My normal procedure to install a distro onto software RAID is to first partition the drives with one of our default scripts, then create software RAID devices from them, using the default 0.9 metadata. My scripts then create filesystems on the partitions, mount them, and untar a backup image onto them. Finally, the appropriate magic is done to make the system bootable (update mdadm.conf, remove old udev entries, run mkinitrd, install grub [accounting for grub2]). With openSUSE 12.2, I ended up with the system booting into grub 2 rescue mode, claiming that it could not load normal.mod.
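For readers following along, the array-creation step in this procedure might look roughly like the sketch below. The device names (/dev/md5, /dev/sda5, /dev/sdb5) and the function name are assumptions chosen for illustration and are not taken from the actual install scripts. The function only prints the command unless DRY_RUN is set to 0.

```shell
#!/bin/sh
# Sketch: create a two-disk RAID 1 array with an explicit metadata version.
# Device names used here are illustrative assumptions.
DRY_RUN=${DRY_RUN:-1}

create_raid1() {
    # $1 = md device, $2 = metadata version (e.g. 0.90 or 1.0),
    # $3 and $4 = member partitions
    cmd="mdadm --create $1 --level=1 --raid-devices=2 --metadata=$2 $3 $4"
    if [ "$DRY_RUN" = "1" ]; then
        echo "$cmd"    # only show what would be run
    else
        $cmd           # actually create the array (needs root)
    fi
}

create_raid1 /dev/md5 0.90 /dev/sda5 /dev/sdb5
```

Passing 1.0 instead of 0.90 as the second argument gives the metadata version that the DVD installer picked.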

I then tried installing from the DVD to the already existing 0.9 arrays. The install seemed to work, but on reboot it went right back to grub 2’s rescue mode.

I ended up zeroing the drives completely and doing the install with software RAID from the DVD only, using the GUI for creating the partitions, etc. Doing that, the system is working - it’s able to boot properly now using software RAID 1. I’ve noticed that this installation is using 1.0 metadata/superblocks for the RAID devices.

As I need this to be repeatable for multiple systems, I then went back to my network boot, deleted the arrays, and recreated them fresh, using 1.0 superblocks like the DVD install used. When the install finishes and I try to run mkinitrd (from a chroot, with /proc and /sys mounted), I get this output:

netuno:/ # mkinitrd -v

Kernel image: /boot/vmlinuz-3.4.6-2.10-desktop
Initrd image: /boot/initrd-3.4.6-2.10-desktop
KMS drivers: nouveau
Root device: /dev/md5 (mounted on / as ext4)
/usr device: /dev/md6 (mounted on /usr as ext4)
mdadm: cannot open /dev/md/netuno:6: No such file or directory
[BLOCK] /dev/sda -> ahci
[BLOCK] /dev/sdb -> ahci
Device md!netuno:6 not found in sysfs
There was an error generating the initrd (1)

I didn’t have any problems like this with SUSE 12.1, 11.X, 10.X, or 9.X. What has changed to cause this? And how can I fix it?


Could you show “ls -l /dev/md” at this point?

I have figured out the problem I had getting it to work with 1.0 metadata. When mkinitrd is first run, it generates a file /run/mdadm/map. If the /etc/mdadm.conf file is not correct at that point, the map is generated incorrectly. If you then correct mdadm.conf and run mkinitrd again, the incorrect /run/mdadm/map is still used. To fix that problem, you can just delete the map file and then rerun mkinitrd. It will recreate the map with the correct info from mdadm.conf.
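The workaround above can be sketched as a small helper. This is only a minimal sketch: the function name refresh_md_map and the MAP and REBUILD variables are made up for illustration. Only the /run/mdadm/map path and the mkinitrd command come from the description above.

```shell
#!/bin/sh
# Sketch: drop a stale mdadm map so the next mkinitrd run regenerates it.
MAP=${MAP:-/run/mdadm/map}      # map file location reported above
REBUILD=${REBUILD:-mkinitrd}    # command to rerun after removing the map

refresh_md_map() {
    if [ -e "$MAP" ]; then
        rm -f "$MAP" && echo "removed stale $MAP"
    else
        echo "no map file at $MAP"
    fi
    $REBUILD                    # rebuild the initrd
}
```

It would be called as refresh_md_map, as root inside the chroot, after /etc/mdadm.conf has been corrected.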

I still have problems with 0.9 metadata though. It seems that grub2 is not able to read 0.9 metadata arrays correctly. If I use the same procedure that works with a 1.0 array on a 0.9 array, I get this message at the end of the mkinitrd run, from grub2-install failing:

Path `/boot/grub2' is not readable by GRUB on boot. Installation is impossible.

I temporarily modified grub2-install to print its commands (added set -x at the top) and found that “grub2-probe -t fs /boot/grub2” is the command that fails. Running that directly gives this error message:
grub2-probe: error: disk `mduuid/25011c39c382e8021462098afed7fa8' not found.

The UUID listed there matches (without the colons) the UUID of my boot filesystem.
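To make that comparison concrete: mdadm prints array UUIDs as colon-separated groups, while the grub2-probe message shows the digits with no separators. A minimal sketch of normalizing one form to the other follows. The UUID in it is a made-up placeholder, not the one from this system, and the function name is hypothetical.

```shell
#!/bin/sh
# Sketch: strip the colons from an mdadm-style UUID so it can be compared
# with the "mduuid/..." string printed by grub2-probe.
strip_uuid_colons() {
    printf '%s\n' "$1" | tr -d ':'
}

# Placeholder UUID for illustration only.
strip_uuid_colons "01234567:89abcdef:01234567:89abcdef"
# prints 0123456789abcdef0123456789abcdef
```

On a real system the input would come from the UUID line in the output of something like mdadm --detail for the array in question.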

Why can’t grub read 0.9 arrays?