installation/boot problem -- Grub Error 21

I have a Grub Error 21 problem that I haven’t been able to solve despite searching the Internet for solutions.

I have a 1U server box with 4 SATA disks. The last time I installed SUSE 10.0 on it, I had /boot{,2,3,4} partitions (one on each disk), a swap partition on each disk, a RAID 5 reiserfs / partition, and a RAID 5 reiserfs /home partition. This system booted OK.

I recently installed OpenSUSE 11.1 on another 4-SATA 1U server. I used a RAID 1 (mirroring) ext3 /boot partition over all 4 disks (/dev/md0), a RAID 5 xfs / partition (/dev/md1), and a RAID 5 xfs /home partition (/dev/md2). This worked fine.

When I decided to upgrade the old system to 11.1, I attempted to replicate the second system’s partitioning and RAID structure. Everything seemed fine (including the kexec of the installed system). However, when it came time to reboot the system, I got Grub Error 21. Something about the RAID seems to be confusing GRUB on the first system, even though the same layout works fine on the second system.

I guessed that the RAID 1 boot partition was the problem, and reinstalled OpenSUSE 11.1 with an ext3 /boot on /dev/sda (I used a RAID 1 of the other three disks as /boot2). I still got Grub Error 21. Now I am at the point where I need to ask for suggestions, as experimenting with install/boot is rather time-consuming. Can someone suggest what might be going wrong? When I run the “Rescue” system from the DVD, everything looks fine (e.g. I can mount /dev/md{0,1,2} and look around).
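
In GRUB Legacy, error 21 means "Selected disk does not exist", so one thing that could be checked from the Rescue system is whether every disk named in GRUB's device.map is actually present. The following is only a rough sketch in Python. It assumes a Python 3 interpreter is available and that the installed /boot has been mounted at /mnt/boot; that mount point and the function names are my own choices for illustration, not anything the installer provides.

```python
import os
import re

# A GRUB Legacy device.map line looks like: "(hd0)<whitespace>/dev/sda"
DEVICE_MAP_LINE = re.compile(r"^\((?P<grub>[fh]d\d+)\)\s+(?P<dev>\S+)$")


def parse_device_map(text):
    """Return a list of (grub_name, device_path) pairs from device.map text."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = DEVICE_MAP_LINE.match(line)
        if match:
            entries.append((match.group("grub"), match.group("dev")))
    return entries


def missing_devices(entries, exists=os.path.exists):
    """Return the entries whose device path is not present on this system."""
    return [(grub, dev) for grub, dev in entries if not exists(dev)]


if __name__ == "__main__":
    # Assumed location: installed /boot mounted at /mnt/boot in the Rescue system.
    path = "/mnt/boot/grub/device.map"
    if os.path.exists(path):
        with open(path) as handle:
            entries = parse_device_map(handle.read())
        for grub, dev in entries:
            print(grub, "->", dev)
        for grub, dev in missing_devices(entries):
            print("not present:", grub, dev)
    else:
        print("no device.map found at", path)
```

Even if every listed device exists under Linux, the BIOS may still enumerate the disks in a different order than device.map assumes, so a clean result here would not rule that out.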

Here is one hint. System 1 (the one that won’t boot) is a Supermicro 6014H-T. It has one 2-port SATA controller in the Intel chipset on the mobo, and another mobo 4-port SATA controller. During the SUSE 10 days, the 4-port SATA controller was not recognized, and I was forced to add a 2-port SATA controller card to the system to access all four disks (2 from the Intel chipset, 2 from the add-in card). In OpenSUSE 11.1 all three SATA controllers are recognized. I don’t know the mapping of /dev/sd{a,b,c,d} to the controllers. It is possible that it changed between SUSE 10.0 and OpenSUSE 11.1, and that is why GRUB is not working.
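
To find out which controller each /dev/sd{a,b,c,d} hangs off, the sysfs tree could be inspected from the Rescue system. Below is a minimal sketch in Python, again assuming Python 3 is available. It follows each disk's `device` link under /sys/block and picks out the last PCI address in the resolved path, which I am assuming identifies the SATA controller; the function name is made up for illustration.

```python
import os
import re

# PCI addresses in sysfs paths look like "0000:00:1f.2" (domain:bus:slot.function).
PCI_ADDR = re.compile(r"[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]")


def map_disks_to_controllers(sys_block="/sys/block"):
    """Map each sd* disk name to the PCI address of its controller (or None)."""
    mapping = {}
    if not os.path.isdir(sys_block):
        return mapping
    for name in sorted(os.listdir(sys_block)):
        if not name.startswith("sd"):
            continue  # ignore md, loop, ram and other block devices
        # Resolve the "device" link to the full path under /sys/devices.
        real = os.path.realpath(os.path.join(sys_block, name, "device"))
        addresses = PCI_ADDR.findall(real)
        # The last PCI address in the path is the device closest to the disk.
        mapping[name] = addresses[-1] if addresses else None
    return mapping


if __name__ == "__main__":
    for disk, controller in map_disks_to_controllers().items():
        print("/dev/" + disk, "->", controller)
```

Comparing the printed addresses with the output of `lspci` should show which disks sit on the Intel chipset ports, which on the 4-port controller, and which on the add-in card.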

Please suggest some things I can do to debug this situation.