I have two SATA disks, and I installed SuSE 11 with the root partition on RAID 1.
The setup has three RAID 1 arrays, /dev/md0 through /dev/md2, with /boot mounted on md0, / on md1 and my /data on md2. Each mdX array has ReiserFS on it.
I have installed SuSE 9.x and 10.x on at least 15 production machines, so I installed SuSE 11 here the way I usually do, and it went fine. The machine rebooted without problems. I configured grub to fall back to the second disk and tested booting with the first disk unplugged, then with the second disk unplugged; both worked. Then I rebooted the machine a couple of times with both disks attached, and that worked too.
After 3 days I rebooted the machine again. grub loaded the kernel, but the kernel could not assemble md1: there was a message like "md1 is unclean, starting background reconstruction", and then something about a bad bitmap file in md1 with error -5. It then dropped me to the sh# prompt, and /proc/mdstat showed no active RAID arrays.
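For anyone who wants to run the same check, here is a minimal sketch of pulling the array states out of /proc/mdstat. The helper name md_states is made up for illustration (it is not an mdadm tool), and it assumes the usual "mdX : state ..." line layout of that file.

```shell
# Print "<array> <state>" for every md array listed in an mdstat-format file.
# md_states is a hypothetical helper name. It reads /proc/mdstat by default;
# passing a file argument lets you try it on saved output instead.
md_states() {
    awk '$1 ~ /^md[0-9]+$/ && $2 == ":" { print $1, $3 }' "${1:-/proc/mdstat}"
}
```

Running md_states with no argument lists each array with its state (for example "active" or "inactive"). If it prints nothing at all, no arrays are listed in the file.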
I booted with a CD in the rescue mode, and looked at /proc/mdstat:
md0 and md2 were assembled and running on both devices, but md1 had no active devices. I mounted md0 and md2 and checked the files; everything was intact. Then I tried to assemble md1 with mdadm. It gave me “device busy” when I tried mdadm -a /dev/md1 /dev/sdX, and “I/O error” when I created /etc/mdadm.conf, listed the partitions in it, appended the scan output with mdadm --examine --scan >> /etc/mdadm.conf, and then ran mdadm --assemble --scan /dev/md1.
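A rough sketch of a manual assembly attempt that names the member partitions directly, so that it does not depend on /etc/mdadm.conf. It assumes the "device busy" error came from a half-assembled md1 still holding the partitions, which is only a guess. The function names run and reassemble are made up for illustration. Setting DRY_RUN=1 only prints the commands.

```shell
# Echo the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper. Usage: reassemble /dev/mdX member1 member2 ...
reassemble() {
    array=$1
    shift
    run mdadm --stop "$array"            # release members held by a half-assembled array
    run mdadm --examine "$@"             # print each member's superblock for inspection
    run mdadm --assemble "$array" "$@"   # assemble from the explicitly named members
}
```

For example, DRY_RUN=1 reassemble /dev/md1 /dev/sda1 /dev/sdb1 prints the three mdadm commands without touching the disks. The partition names here are the ones from the mdadm -C command further down; substitute whichever partitions actually belong to md1.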
So I reformatted the md1 partitions, reinstalled SuSE on md1, and ran the same tests and reboots. Everything went fine, but after 3 days I got exactly the same problem.
This time I recreated md1 in rescue mode without formatting the partitions (mdadm -C /dev/md1 --level=raid1 --raid-devices=2 /dev/sda1 /dev/sdb1), and the machine rebooted fine after that with all data on md1 intact.
I checked for hardware errors in dmesg, smartctl -a /dev/sdX and the IPMI logs; there was nothing bad that I could see.
Do you guys have any advice on this? At this point I suspect the 2.6.25 kernel, and I am about to revert to SuSE 10.3.