Failing over to a second hard drive

Hi,

I recently noticed that one of my test servers (running openSUSE 11.1) was showing SMART pre-failure errors for its hard drive. The server had a spare disk identical to the original, so I ran:

dd if=/dev/sda of=/dev/sdb

Then I powered down the machine, removed the old drive, and moved the second drive to the first SATA channel. It appears, however, that this isn’t enough to get it going.

Trying manual resume from /dev/disk/by-id/ata-Maxtor_6Y080M0_Y20V5C8C-part1
Resume device /dev/disk/by-id/ata-Maxtor_6Y080M0_Y20V5C8C-part1 not found (ignoring)
Waiting for device /dev/disk/by-id/ata-Maxtor_6Y080M0_Y20V5C8C-part2 to appear:..................Could not find /dev/disk/by-id/ata-Maxtor_6Y080M0_Y20V5C8C-part2.
Want me to fall back to /dev/disk/by-id/ata-Maxtor_6Y080M0_Y20V5C8C-part2? (Y/n)

At this point it drops into what I think is single user mode, and I’m not sure where to go from here. I’m sure I can put the old drive back and get it booted if necessary, but I’d like to know what I’m missing, so that if something like this happened and I didn’t have the original drive, I could still restore the system.

Thoughts, criticisms, insults all welcome.

The problem is that the entries in /etc/fstab use the device ID, which is different now that you have swapped the drive. A simple fix would be to edit the fstab file, changing all lines that read

 /dev/disk/by-id/ata-Maxtor_6Y080M0_Y20V5C8C-part1 

to

 /dev/sda1 

and so on for the other partitions (note that it tells you the partition number at the end of the ata-Maxtor…).
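As a rough sketch of that edit, assuming the new drive really does show up as /dev/sda: the function below rewrites every by-id partition reference in an fstab-style file to the matching /dev/sdaN and keeps a backup next to it. The function name rewrite_by_id and the /mnt mount point in the usage note are made up for illustration; OLD_ID is the ID string from your boot messages.

```shell
#!/bin/sh
# Sketch only: map /dev/disk/by-id/<OLD_ID>-partN to /dev/sdaN.
# Assumption: the replacement drive is the first disk (/dev/sda).
OLD_ID='ata-Maxtor_6Y080M0_Y20V5C8C'

# rewrite_by_id is a hypothetical helper name, not part of any tool.
rewrite_by_id() {
    # $1 = file to edit; the untouched original is kept as $1.bak
    cp "$1" "$1.bak" &&
    sed "s|/dev/disk/by-id/${OLD_ID}-part\([0-9][0-9]*\)|/dev/sda\1|g" "$1.bak" > "$1"
}
```

With the installed system's root partition mounted at, say, /mnt from a live CD, you would call it as rewrite_by_id /mnt/etc/fstab and then compare the result against fstab.bak before rebooting.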

If you can’t get to the fstab from the single user mode it boots into, you may have to use a live CD to get there.

Reply back if you run into problems or are unsure how to proceed.

Thanks for the reply!

I booted with a gparted disk, mounted the drive, and edited the /etc/fstab file. However the problem persists with the same error message.

Could you post the output of

fdisk -l

from gparted and also the contents of /etc/fstab? Maybe we can see something with that info.

Result of fdisk -l (sdb is a USB pen drive I used to copy off the contents)

Disk /dev/sda: 81.9 GB, 81964302336 bytes
255 heads, 63 sectors/track, 9964 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x0000ccad

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1         327     2626596   82  Linux swap / Solaris
/dev/sda2   *         328        9964    77409202+  83  Linux

Disk /dev/sdb: 2063 MB, 2063597056 bytes
255 heads, 63 sectors/track, 250 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x91f72d24

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1         251     2015200    6  FAT16
Partition 1 has different physical/logical endings:
     phys=(249, 254, 63) logical=(250, 225, 38)

Current contents of /etc/fstab:

/dev/sda2	 /                    ext3       acl,user_xattr        1 1
/dev/sda1	 swap                 swap       defaults              0 0
proc                 /proc                proc       defaults              0 0
sysfs                /sys                 sysfs      noauto                0 0
debugfs              /sys/kernel/debug    debugfs    noauto                0 0
usbfs                /proc/bus/usb        usbfs      noauto                0 0
devpts               /dev/pts             devpts     mode=0620,gid=5       0 0
/dev/fd0             /media/floppy        auto       noauto,user,sync      0 0

Previous contents of /etc/fstab:

/dev/disk/by-id/ata-Maxtor_6Y080M0_Y20V5C8C-part2 /                    ext3       acl,user_xattr        1 1
/dev/disk/by-id/ata-Maxtor_6Y080M0_Y20V5C8C-part1 swap                 swap       defaults              0 0
proc                 /proc                proc       defaults              0 0
sysfs                /sys                 sysfs      noauto                0 0
debugfs              /sys/kernel/debug    debugfs    noauto                0 0
usbfs                /proc/bus/usb        usbfs      noauto                0 0
devpts               /dev/pts             devpts     mode=0620,gid=5       0 0
/dev/fd0             /media/floppy        auto       noauto,user,sync      0 0

I don’t see anything wrong with your fdisk and fstab output, but maybe someone else would notice something.

So, it still gives the same error of unable to find the disk when you try to boot with the changed fstab?

Yes, it still looks for the by-id Maxtor string. I don’t think it’s getting far enough to where it reads the /etc/fstab file since changing it had no effect. I’m not sure where in the early boot process it reads this info from.

Hi,

You also have to adapt the boot menu. Edit /boot/grub/menu.lst and replace your old drive ID with a proper device or device ID.
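Before touching the file, it may help to preview what the kernel lines would look like afterwards. This is only a sketch: it assumes the new drive is /dev/sda and that the old ID appears in parameters such as root= and resume= on the kernel lines. The function name preview_kernel_lines is invented for the example, and nothing is written to disk.

```shell
#!/bin/sh
# Sketch only: dry run of the menu.lst change, printing the rewritten kernel lines.
# Assumption: the replacement drive is the first disk (/dev/sda).
OLD_ID='ata-Maxtor_6Y080M0_Y20V5C8C'

# preview_kernel_lines is a hypothetical helper name.
preview_kernel_lines() {
    # $1 = path to menu.lst; the file itself is left unchanged.
    # -n plus the p flag prints only kernel lines where a substitution happened.
    sed -n "/^[[:space:]]*kernel/ s|/dev/disk/by-id/${OLD_ID}-part\([0-9][0-9]*\)|/dev/sda\1|gp" "$1"
}
```

If the printed lines look right, make the same replacements in the file by hand or with an editor of your choice.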

Hope this helps

I might have some insight into this. Since this was happening so early in the boot cycle I took a guess and put the old drive back in and pulled up the boot loader editor in Yast. Under ‘Optional Kernel Command Line Parameters’ I found this:

resume=/dev/disk/by-id/ata-Maxtor_6Y080M0_Y20V5C8C-part1

… which I’m guessing is the cause of the error. I’m not sure what to replace it with, however :wink:

Out of curiosity I blanked this resume line out of the kernel parameters. Sadly, that doesn’t seem to have fixed the problem :’( Same error.

Have you done what Monex suggested? I forgot about the entries in /boot/grub/menu.lst. You need to edit that file in the same way you edited the /etc/fstab file, changing the root entries in there from the device IDs to /dev/sda2.
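To double-check that nothing was missed, something like the following could list any lines in those two files that still mention the old drive. It is a sketch under the assumption that the installed system's root is mounted somewhere reachable (for example from a live CD); find_old_refs is a made-up name, and it only looks at the two files discussed in this thread.

```shell
#!/bin/sh
# Sketch only: report leftover references to the old drive ID.
OLD_ID='ata-Maxtor_6Y080M0_Y20V5C8C'

# find_old_refs is a hypothetical helper name.
find_old_refs() {
    # $1 = directory where the installed system's root is mounted
    # Prints file:line:text for every match; the exit status is grep's
    # (0 if something was found, 1 if nothing was found).
    grep -nF "$OLD_ID" "$1/etc/fstab" "$1/boot/grub/menu.lst"
}
```

No output means neither file refers to the old ID any more.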

Let us know if you have done that and the result.

Sorry for the delay. I did what Monex had posted. Below is a picture of the same error.

http://i29.tinypic.com/s3gbv9.jpg

Actually, never mind. I’m going to do a complete reinstall of SuSE 11.1 on this system to the good drive. Thanks anyway :wink: