11.3 - problems booting from RAID 1

Hi there,

Just installed 11.3 on a RAID - unfortunately, I get Error 22: No such partition. Grub appears to be on both drives. Followed the same procedure as I did for 11.2 and that worked flawlessly. Is this a known issue or have I made a silly mistake somewhere?

Any advice gratefully received.

Best Wishes
Ziggx

What is it with RAID? A lot of users are running RAID configs…

Sorry, Hi. :slight_smile:

I say that, because raid is not easy to tackle in oS.

Hello,

I suspect you are seeing this bug:

https://bugzilla.novell.com/show_bug.cgi?id=619566, which I would consider a show-stopper given the number of folks using dmraid these days, knowingly or not. Fake-raid is standard on almost all mobos these days … and folks use it.

As a workaround, assuming of course that you are using dmraid, use mdraid instead. In other words, turn OFF the raid in the bios and configure your install with linux software raid.

FYI, in the little benchmarking I have done, linux software raid (mdraid) is much faster.

Best of luck!

Hi there,

A bit more detail:
I’m using an MDRAID not a fake RAID.
I configured the RAID in the installer as MD0 ( / ), MD1 ( /opt ), MD2 ( /home ) and MD3 ( swap ) - all are RAID 1. I left the grub install as default. Running fdisk shows both disks of MD0 are bootable, and grub shows they both have stage 1 installed.
This is the error I get on trying to boot:

Booting ‘Desktop – openSUSE 11.3 - 2.6.34-12’

Kernel (hd0,5)/boot/vmlinuz-2.6.34-12-desktop root=/dev/disk/by-id/md-uuid-8badba2:f8edcf33:550ede86 resume=/dev/disk/by-id/md-uuid-5d09bda9:25c8c4e5:05fadba3:44438bf2 splash=silent quiet showopts vga=0x31a

Error 22: No such partition.

It looks like it’s not recognising my RAID - any pointers would be great.
Best Wishes
Ziggx

Just looked at Bug 619566 - jeez, this is a show stopper. How can they release 11.3 in this state? Is there a workaround? Could I reinstall 11.2 and do zypper dup to get 11.3?

Ziggx

DaaX wrote:

>
> What is it with RAID? A lot of users are running RAID configs…
>

A lot of harddisks break right on schedule when their three-year warranty has
expired.
A lot of users don’t bother running smartd and regular selftests.

> Sorry, Hi. :slight_smile:
>
> I say that, because raid is not easy to tackle in oS.

Sure it is. Please stop spreading FUD.


Per Jessen, Zürich (21.6°C)
http://en.opensuse.org/User:Pjessen

ziggx wrote:

> Hi there,
>
> A bit more detail:
> I’m using an MDRAID not a fake RAID.
> I configured the RAID in the installer as MD0 ( / ), MD1 ( /opt ), MD2
> ( /home ) and MD3 ( swap ) - all are RAID 1. I left the grub install as
> default. Running fdisk shows both disks of MD0 are bootable, and grub
> shows they both have stage 1 installed.
> This is the error I get on trying to boot:
>
> Booting ‘Desktop – openSUSE 11.3 - 2.6.34-12’
>
> Kernel (hd0,5)/boot/vmlinuz-2.6.34-12-desktop
> root=/dev/disk/by-id/md-uuid-8badba2:f8edcf33:550ede86
> resume=/dev/disk/by-id/md-uuid-5d09bda9:25c8c4e5:05fadba3:44438bf2
> splash=silent quiet showopts vga=0x31a
>
> Error 22: No such partition.

This looks like grub isn’t finding /dev/disk/by-id/md-uuid-<etc> - I
don’t speak grub, but I don’t know how it would get access
to /dev/disk/by-id so early in the process. I would open a bugreport.


Per Jessen, Zürich (21.5°C)
http://en.opensuse.org/User:Pjessen

Per,

The problem is that the grub menu has the wrong boot disk id (hd0,5). This doesn’t exist and it should be (hd0,0). I need to find a way to rewrite the line - I just need to figure out how to access the /boot directory on the raid…

Z

ziggx wrote:

>
> Per,
>
> The problem is that the grub menu has the wrong boot disk id (hd0,5).
> This doesn’t exist and it should be (hd0,0). I need to find a way to
> rewrite the line - I just need to figure out how to access the /boot
> directory on the raid…

Ok, how about if you boot a rescue system from CD and then mount /boot?


Per Jessen, Zürich (23.1°C)
http://en.opensuse.org/User:Pjessen

Ooooof!

Good info ziggx …

I’ll just try to throw out a few pointers …

  1. The boot partition is under /boot … it may be accessed through a normal shell (as root user, of course). The primary grub configuration is in /boot/grub/menu.lst. Disk boot order is in /boot/grub/device.map. If you change the device.map manually, you must run “mkinitrd”.

  2. When you configure grub through the graphical interface, make sure the “install to mbr” option is enabled. This can be done manually also, but the graphical tool is easier to use.

  3. It is possible that the kernel drivers needed for raid did not get built into the initrd. I had this problem with SuSE 11.2/raid10. My fix was to boot the install system, manually mount the broken installation, add “raid10” to /etc/sysconfig/kernel, and rebuild the initrd with the mkinitrd command. – I don’t think this is the problem in this case … just putting it out in case I am wrong …
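To make the device.map point above concrete, here is a minimal sketch that reads the disk order out of a GRUB legacy device.map. It assumes the usual one-entry-per-line layout such as `(hd0) /dev/sda`; the function name `parse_device_map` and the sample contents are made up for illustration and are not taken from this system.

```python
def parse_device_map(text: str) -> dict[str, str]:
    """Map GRUB legacy disk names (e.g. 'hd0') to the device they point at."""
    mapping = {}
    for line in text.splitlines():
        fields = line.split()
        # A valid entry looks like "(hd0)  /dev/sda"; skip anything else
        if len(fields) == 2 and fields[0].startswith("(") and fields[0].endswith(")"):
            mapping[fields[0].strip("()")] = fields[1]
    return mapping


# Hypothetical device.map contents for a two-disk machine
sample_map = "(hd0)\t/dev/sda\n(hd1)\t/dev/sdb\n"
disk_order = parse_device_map(sample_map)
```

From a rescue system, the text would come from the mounted installation's /boot/grub/device.map rather than from a string.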

For those asking about raid, it stands for Redundant Array of Inexpensive Disks. The short and over-simplified version: each write is sent to more than one disk auto-magically. Wikipedia has a nice article.

Good luck …

Hello again,

I just noticed the line “root=/dev/disk/by-id/md-uuid-8badba2:f8edcf33:550ede86”.

That looks a little strange to me, as the raid array is usually accessed through its md device node, /dev/md0, /dev/md1, …

And the fun continues …

Okay,

I’ve reinstalled - this time I have a /boot directory (RAID 1). On startup, it drops me into the grub prompt. From there, I can boot by using this set of commands:

  1. root (hd0,0)
  2. kernel /boot/grub/vmlinuz
  3. initrd /boot/initrd
  4. boot

Here’s the menu.lst contents (only essential bits)
timeout 8
default 0

That looks okay… But when I examine the /etc/sysconfig/bootloader, I find this line which, I think, is causing the grief:

DEFAULT_APPEND=" resume=/dev/disk/by-id/md-uuid-ed9928fc:5df194bc:7b0ddf72:bc7723a6 splash=silent quiet showops"

When I examine the RAID, the UUIDs don’t match. Could this be where it’s going wrong?
Anyone any idea what this SHOULD read?

TIA
Ziggx

That would be the UUID of your swap device. If you can’t determine the UUID, replace the right-hand-side of resume= with the /dev/md device which represents your swap.
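As a rough sketch of that substitution, the snippet below swaps whatever follows resume= for a plain md device. The helper name `set_resume_device` is hypothetical, and /dev/md3 is only an assumption based on the first install, where MD3 was swap; check which md device actually holds swap before using it.

```python
import re


def set_resume_device(append_line: str, device: str) -> str:
    """Replace the value of resume= in a kernel parameter string."""
    # Match "resume=" plus everything up to the next space or closing quote
    return re.sub(r'resume=[^\s"]+', lambda _match: f"resume={device}", append_line)


# Example line shaped like the one in /etc/sysconfig/bootloader
line = 'DEFAULT_APPEND=" resume=/dev/disk/by-id/md-uuid-ed9928fc:5df194bc:7b0ddf72:bc7723a6 splash=silent quiet"'
fixed = set_resume_device(line, "/dev/md3")  # /dev/md3 is an assumption
```

The same function works on a kernel line from menu.lst, since it only touches the resume= token.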

Aah… yeah man, you are right - it is the uuid for swap. Any ideas where else I might look for the problem?

Ziggx

Hello again,

I am really curious as to your exact hardware config. Oddball disk, controller, bios settings ??? It is VERY unusual to have this much difficulty with a SuSE install.

That said, I would make certain that fake-raid is disabled in the bios and confirm that raid1 is built into your initrd (even though I can’t imagine that it is not, nor am I certain it really matters). Lastly, try running “mkinitrd -A” once you are in the running system. From the man page, “Create a so called ‘monster initrd’ which includes all available features and modules”.

Be well!

There’s nothing oddball about the system - vanilla AMD Quad, ASUS mb, etc - it worked perfectly with 11.2 in exactly the same setup booting off a RAID1 partition. There is something very broken about 11.3 and installing onto a RAID1. There are a number of open bugs in Bugzilla on 11.3 & RAID. If I was using a fake raid I could understand the problem.
I think there is something wrong with perl-Bootloader but I don’t have the expertise or knowledge of perl to figure it out. I’m seriously thinking of going back to 11.2 until this gets sorted. Meanwhile, I will check out your suggestions.

TIA
Ziggx

ziggx wrote:

>
> There’s nothing oddball about the system - vanilla AMD Quad, ASUS mb,
> etc - it worked perfectly with 11.2 in exactly the same setup booting
> off a RAID1 partition. There is something very broken about 11.3 and
> installing onto a RAID1. There are a number of open bugs in Bugzilla
> on 11.3 & RAID.

Hmm, I’m almost tempted to try it out myself now - I think I’ve got a
test system with a couple of drives.


Per Jessen, Zürich (29.8°C)
http://en.opensuse.org/User:Pjessen

Ran into the same problem, and I doubt it is actually a RAID problem.

As pointed out in #8, the error message refers to the (hd0,5) stuff. On my system the installation process used (hd0,1) where it should have used (hd0,4), probably because I have no primary partitions and 4 logical partitions in an extended partition on both disks, and the RAID array with the kernel resides in the first logical partitions; i.e. I have sda[1,5-8] and sdb[1,5-8], and /boot with the kernel, initrd etc. is on md0, which resides in sd[ab]5. Note that sda5 is the second existing partition device, which would explain the choice of (hd0,1) rather than (hd0,4), which is what refers to sda5.
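The numbering mismatch is easier to see in code. GRUB legacy counts both disks and partitions from zero, so the first logical partition (sda5) is (hd0,4) no matter how many primary partitions exist. The sketch below assumes sda is the first BIOS disk and sdb the second; the helper name `linux_to_grub_legacy` and the default disk order are illustrative assumptions, and the real order comes from device.map.

```python
import re


def linux_to_grub_legacy(device: str, disk_order=("sda", "sdb")) -> str:
    """Translate a Linux partition name such as 'sda5' into GRUB legacy notation."""
    match = re.fullmatch(r"(?:/dev/)?([a-z]+)(\d+)", device)
    if match is None:
        raise ValueError(f"not a partition device: {device!r}")
    disk, partition = match.group(1), int(match.group(2))
    # GRUB legacy numbers disks and partitions from zero, so subtract one
    return f"(hd{disk_order.index(disk)},{partition - 1})"


print(linux_to_grub_legacy("sda5"))  # first logical partition on the first disk
```

By this rule (hd0,1) would be sda2, which does not exist on a disk with only an extended partition and logical partitions, hence "No such partition".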

GRUB offers to edit the “configuration lines” used to boot. So I changed the kernel and initrd lines to refer to (hd0,4) and the freshly installed 11.3 system started (and the installation and configuration tasks after the first boot continued). Then I started YaST and went to the boot loader configuration. The first time, it crashed after the changes; on starting it again, the (hd0,4) stuff was gone and replaced by /dev/md0. I probably missed a reference to (hd0,1)/messages, as GRUB now complains about that, but it starts 11.3 correctly from the text mode menu.

Unfortunately, I currently end up in a restricted shell because of a fsck failure – apparently a by-id link for md2 comes up too late (because of the device’s size?). Will investigate that on Monday…

Might want to check /etc/fstab also, if you have not.
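One way to do that check is to list the fstab entries that rely on md by-id links, since those are the ones affected if a link appears late. This is only a sketch: the function name `md_by_id_entries` and the sample fstab lines are invented for illustration.

```python
def md_by_id_entries(fstab_text: str) -> list[tuple[str, str]]:
    """Return (device, mount point) pairs whose device is an md by-id link."""
    entries = []
    for line in fstab_text.splitlines():
        fields = line.split()
        # Skip blank lines, comments, and lines without a mount point
        if len(fields) < 2 or fields[0].startswith("#"):
            continue
        if fields[0].startswith("/dev/disk/by-id/md-uuid-"):
            entries.append((fields[0], fields[1]))
    return entries


# Hypothetical fstab contents, not copied from any system in this thread
sample_fstab = (
    "/dev/disk/by-id/md-uuid-ed9928fc:5df194bc:7b0ddf72:bc7723a6 swap swap defaults 0 0\n"
    "/dev/md0 /boot ext4 defaults 1 2\n"
    "# a comment line\n"
)
suspects = md_by_id_entries(sample_fstab)
```

Each device it returns can then be compared against the arrays that actually exist, or swapped for the matching /dev/mdN name.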