Failed to load raid module while booting

Oh great openSUSE gurus, I give thee kudos and other (inexpensive) offerings… :wink:

My brother-in-law's server failed to reboot. :frowning:
The boot message mentioned something about not being able to load raid modules, and then the system hangs waiting for /dev/md0.

failed kernel update?

I am not able to boot to /dev/md0 (raid 1 root partition), and so I’m not able to repair the boot process…

Hmmm, I’m sort of stuck…

Here is some of the pertinent information:

fdisk -lu:


Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes
Disk identifier: 0x0001ac3b

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *       16065      208844       96390   83  Linux
/dev/sda2          208845     4401809     2096482+  82  Linux swap / Solaris
/dev/sda3         4401810   214114319   104856255   fd  Linux raid autodetect
/dev/sda4       214114320  1953520064   869702872+  fd  Linux raid autodetect

Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes
Disk identifier: 0x000c5c8f

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1           16065      208844       96390   83  Linux
/dev/sdb2          208845     4401809     2096482+  82  Linux swap / Solaris
/dev/sdb3         4401810   214114319   104856255   fd  Linux raid autodetect
/dev/sdb4       214114320  1953520064   869702872+  fd  Linux raid autodetect

Disk /dev/md0: 107.3 GB, 107372728320 bytes
2 heads, 4 sectors/track, 26214045 cylinders, total 209712360 sectors
Units = sectors of 1 * 512 = 512 bytes
Disk identifier: 0x00000000


Disk /dev/md1: 890.5 GB, 890575601664 bytes
2 heads, 4 sectors/track, 217425684 cylinders, total 1739405472 sectors
Units = sectors of 1 * 512 = 512 bytes
Disk identifier: 0x00000000


Disk /dev/sdc: 80.0 GB, 80026361856 bytes
255 heads, 63 sectors/track, 9729 cylinders, total 156301488 sectors
Units = sectors of 1 * 512 = 512 bytes
Disk identifier: 0x2e900b9b

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1              63   156296384    78148161    c  W95 FAT32 (LBA)

/etc/grub.conf (might have been manually trashed by BroNLaw)


setup --stage2=/boot/grub/stage2 (hd0,0) (hd0,0)
quit

/boot/grub/device.map


(hd1)	/dev/disk/by-id/ata-ST31000333AS_5TE064RP
(fd0)	/dev/fd0
(hd0)	/dev/disk/by-id/ata-ST31000333AS_9TE04E1N

/boot/grub/menu.lst


# Modified by YaST2. Last modification on Tue May 19 13:12:01 CDT 2009
default 0
timeout 8
##YaST - generic_mbr
gfxmenu (hd0,0)/message
##YaST - activate

###Don't change this comment - YaST2 identifier: Original name: linux###
title openSUSE 11.1 - 2.6.27.21-0.1
    root (hd0,0)
    kernel /vmlinuz-2.6.27.21-0.1-default root=/dev/md0 splash=silent showopts vga=0x31a
    initrd /initrd-2.6.27.21-0.1-default

###Don't change this comment - YaST2 identifier: Original name: failsafe###
title Failsafe -- openSUSE 11.1 - 2.6.27.21-0.1
    root (hd0,0)
    kernel /vmlinuz-2.6.27.21-0.1-default root=/dev/md0 showopts ide=nodma apm=off noresume edd=off powersaved=off nohz=off highres=off processor.max_cstate=1 x11failsafe vga=0x31a
    initrd /initrd-2.6.27.21-0.1-default

###Don't change this comment - YaST2 identifier: Original name: floppy###
title Floppy
    rootnoverify (fd0)
    chainloader +1

And since the problem is about RAID disks…

/etc/mdadm.conf


DEVICE partitions
ARRAY /dev/md0 level=raid1 UUID=b034117b:02949d0c:1f08f71d:0261cec4
ARRAY /dev/md1 level=raid1 UUID=eb4fbffe:3eb38985:583fea3c:7347cc94

I’ve reached the end of my quaint Linux knowledge, so
any help / tips on how to proceed would be appreciated.

In the meantime I’ll be on my knees praying to the openSUSE gods for forgiveness and redemption…

Jerry

P.S. I confess I started WinXP last week :’(

I’ll throw a wild guess on the table and say that the initrd, for one reason or another, does not contain the necessary raid1 module - running a repair from the DVD should be sufficient to fix the issue.
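One way to test that guess from a rescue shell is to list what is inside the initrd and look for the module file. The sketch below assumes the initrd is a gzip-compressed cpio archive (usual for kernels of this generation, but not confirmed in this thread); `listing_has_module` is a made-up helper name, not an openSUSE tool.

```shell
# listing_has_module: hypothetical helper, not part of openSUSE.
# Reads a list of file paths on stdin (for example from `cpio -it`)
# and succeeds if a kernel module file named <module>.ko
# (optionally .ko.gz) is among them.
listing_has_module() {
    grep -q -E "(^|/)$1\.ko(\.gz)?\$"
}

# Possible usage against the initrd named in menu.lst
# (assumes /boot is mounted and the archive is gzip + cpio):
#   gzip -dc /boot/initrd-2.6.27.21-0.1-default | cpio -it 2>/dev/null \
#       | listing_has_module raid1 && echo "raid1 is in the initrd"
```

If the check fails, the initrd really is missing raid1 and needs to be rebuilt.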

If the repair does not fix the problem for one reason or another, it gets a little trickier, though not necessarily a lot: it would require booting the system from the DVD, installing the original kernel and rebuilding the initrd so that it contains the raid1 module.

The modules included in the initrd are listed in the /etc/sysconfig/kernel file under the INITRD_MODULES="" parameter; this should contain, amongst other things, raid1.
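A quick way to check that parameter without reading the whole file is sketched below. `has_initrd_module` is a hypothetical helper; it assumes the setting sits on a single line of the form INITRD_MODULES="...", and that when the installed system is mounted under /mnt the file is at /mnt/etc/sysconfig/kernel.

```shell
# has_initrd_module: hypothetical helper, not part of openSUSE.
# Usage: has_initrd_module MODULE [FILE]
# Succeeds if MODULE is one of the space-separated names in the
# INITRD_MODULES="..." line of FILE (default: /etc/sysconfig/kernel).
has_initrd_module() {
    module="$1"
    file="${2:-/etc/sysconfig/kernel}"
    [ -r "$file" ] || return 2
    # Extract the text between the double quotes of the setting.
    modules=$(sed -n 's/^INITRD_MODULES="\(.*\)"[[:space:]]*$/\1/p' "$file")
    for m in $modules; do
        if [ "$m" = "$module" ]; then
            return 0
        fi
    done
    return 1
}
```

For example, `has_initrd_module raid1 /mnt/etc/sysconfig/kernel && echo listed` would confirm the entry on a system mounted from the rescue environment. Note that a module being listed here only says what mkinitrd was asked to include, not what the current initrd actually contains.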

My Prayers are answered…
Thx for your time and efforts…

Eeeeck, The repair fails, been there; done that…

The /etc/sysconfig/kernel file does contain the module…
Tips on how to go about the following would be helpful:

  • Install the original kernel
  • rebuild initrd

Jerry continues his mantra:
OOOMMMM, ooommm, oooouuummmm :slight_smile:

You should be able to boot the system from the DVD, as it would load the necessary modules - from the same location where you ran the repair you should be able to choose the advanced (or expert?) options and use “Boot to existing system.”

From there you can then mount the DVD and install the original kernel with zypper (or YAST or rpm from the command line).

However as you can imagine, I cannot guarantee that this will fix the issue at hand - so tread carefully.

Oh High Priestess, while I burn incense at your altar, please pardon my impertinence, but…

The DVD won’t boot the installed system, as I cannot tell it to boot from raid. :frowning:

It does load the raid drivers, which we have been using to mount the raid disks.

The only option I can think of from here is to re-install all of SUSE :X

Any ideas before I spend the next 2 days doing a remote reinstall and reconfigure?

Jerry

No need for such desperation. If you can load the raid drivers and mount the disks, you are in good shape. Once they are mounted, try remaking the initrd with the mkinitrd command. The man page claims that it autodetects all needed features, but if it fails to autodetect the software RAID, add “-f md” next time. The man page has the details.
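For anyone landing here later, the steps in the reply above can be sketched roughly as follows. This is an outline under assumptions, not a tested recipe: the device names are taken from the fdisk output and menu.lst earlier in the thread (root on /dev/md0, a separate /boot on /dev/sda1), /mnt is an arbitrary mount point, and `run` and `rebuild_initrd` are made-up helper names. By default it only prints the commands so they can be reviewed first.

```shell
# Hypothetical helpers; adjust the devices to your own layout.

# Print each command by default; set DRY_RUN=0 to really run it (as root).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: rebuild_initrd [ROOT_ARRAY] [BOOT_PARTITION] [MOUNT_POINT] [FEATURES]
rebuild_initrd() {
    root_md="${1:-/dev/md0}"     # RAID 1 root array from the original post
    boot_part="${2:-/dev/sda1}"  # separate /boot, (hd0,0) in menu.lst
    target="${3:-/mnt}"          # where the installed system gets mounted
    features="$4"                # optional mkinitrd feature list, e.g. md

    # The rescue system may already have assembled the arrays; in that
    # case mdadm can report nothing to do, which is not fatal here.
    run mdadm --assemble --scan ||
        echo "mdadm assembled nothing new; continuing" >&2

    run mount "$root_md" "$target" || return 1
    run mount "$boot_part" "$target/boot" || return 1
    # mkinitrd inside the chroot needs the device, proc and sysfs trees.
    for fs in dev proc sys; do
        run mount --bind "/$fs" "$target/$fs" || return 1
    done

    # Rebuild the initrd from inside the installed system; "-f FEATURES"
    # is only added when a feature list was given.
    run chroot "$target" mkinitrd ${features:+-f "$features"}
}
```

Calling `rebuild_initrd` prints the plan; `DRY_RUN=0 rebuild_initrd` would execute it, and `rebuild_initrd /dev/md0 /dev/sda1 /mnt md` adds the “-f md” option mentioned above for the case where autodetection misses the software RAID.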

Thx, will start with googling and woman-pages (politically correct?) for mkinitrd, and try to work it out from there…

Thx again for all your support and time…

Jerry

The man in manual comes from the Latin for hand, so there is no political correctness issue in continuing to call them man pages.

Yes, Yes, Yes!!!
mkinitrd did it…

Thx a Million, to all…

Jerry

*mumbles something about man pages and equality* :wink: