Thread: RAID5 crash, how to recover?

  1. #1

    Default RAID5 crash, how to recover?

    Hello!

I have a huge problem with my file server (openSUSE 11.3, 64-bit, kernel 2.6.34.7-0.7-default). I've just installed an Intel SASUC8I card, connected 3 of my 7 Samsung 2TB drives to it, and after about an hour it dropped 2 of the disks.

I've managed to trace the problem to the card's BIOS, which I've replaced with the non-RAID edition, so it should now work fine with the kernel's software RAID.

The problem is that I can't find a way to "un-fail" these 2 disks. I'm more than positive that the drives themselves are fine; only the controller was misbehaving. The dropout couldn't have caused any data inconsistency either, since the 2 drives dropped out at virtually the same time and no writing was being done at the time.

I've tried add/re-add, and I get either "mdadm: cannot get array info for /dev/md0" or "mdadm: add new device failed for /dev/sdi1 as 7: Invalid argument" (depending on whether the RAID is run or stopped; in either case, mdstat reports it as inactive).
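For reference, the add/re-add attempts would look roughly like this; a sketch only, since the post doesn't quote the exact invocations (the device name is taken from the error message above):

```shell
# Hedged sketch of the add/re-add attempts; which of the two errors appears
# depends on whether /dev/md0 is run or stopped at the time:
mdadm /dev/md0 --re-add /dev/sdi1
mdadm /dev/md0 --add /dev/sdi1
```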

For a normal or forced assemble, I get "mdadm: /dev/md0 assembled from 5 drives and 1 spare - not enough to start the array".
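The assemble attempts would have been along these lines; again a sketch, since the member device names aren't spelled out in the post (the /dev/sd[c-i]1 naming is an assumption based on the devices mentioned elsewhere in the thread):

```shell
# Hedged sketch of the assemble attempts (member names are assumptions):
mdadm --stop /dev/md0                                 # release the inactive array first
mdadm --assemble /dev/md0 /dev/sd[cdefghi]1           # normal assemble
mdadm --assemble --force /dev/md0 /dev/sd[cdefghi]1   # forced assemble
```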

I've been googling like crazy, and also trying to get info from mdadm's help and man page, but nothing seems to deal with such a freak accident.

Another interesting thing: if I reboot the system, mdstat shows md0 as inactive but lists all the devices with no flags. Only after a run command does it change to the 5 remaining devices, all with (S) flags.

Alternatively: does anyone know where the device-failure info is stored? If I could somehow remove that information from the system (even by reinstalling the OS), I should be able to reassemble the array... Or is it stored in the member drives' superblocks?
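To answer the question above: the persistent failure/role state is recorded in each member drive's md superblock, not in the OS, so reinstalling won't clear it; only the runtime view lives in the kernel. It can be inspected like this (the device name is just an example):

```shell
# Persistent per-member state (role, array state) lives in the md superblock:
mdadm --examine /dev/sdi1
# Runtime, non-persistent state lives in the kernel:
cat /proc/mdstat
```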

About 80% of this array's data is backed up, so if all else fails I can restore most of its content, but I'd much prefer to reassemble this one whole, since there was absolutely no chance of data corruption.

Any and all help is much appreciated! Thank you!

  2. #2

    Default Re: RAID5 crash, how to recover?

Got it all back after working on it continuously for like 10-12 hours.

This was the right command (the device order looks a bit messed up, but the order does matter, so it has to be given exactly like this):

    mdadm --create /dev/md0 --assume-clean --level=5 --metadata=1.0 --chunk=1024 --parity=left-asymmetric --raid-devices=7 /dev/sdh1 /dev/sd[cdefi]1 /dev/sdg1
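Since --create --assume-clean rewrites the superblocks without touching the data, it's worth verifying the result read-only before writing anything; a sketch (the filesystem type and mount point are assumptions, as the post doesn't mention them):

```shell
# Read-only sanity checks after an --assume-clean re-create; neither command
# writes to the array:
fsck -n /dev/md0            # check the filesystem without changing anything
mount -o ro /dev/md0 /mnt   # mount read-only and eyeball the data
```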

I've created the original RAID5 with the YaST installer, so it was a bit of a pain to track down all these non-default params...
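One way to track such parameters down is to read them back from a surviving member's superblock with mdadm --examine; the field names map directly onto the --create flags. A sketch using a made-up excerpt (the values mirror the command above, but this is sample text, not real output, and the exact formatting can vary between mdadm versions):

```shell
# Made-up excerpt of `mdadm --examine /dev/sdc1` output for one member:
examine='     Version : 1.0
  Chunk Size : 1024K
      Layout : left-asymmetric
Raid Devices : 7'

# Map the superblock fields onto the matching --create flags:
chunk=$(printf '%s\n' "$examine" | awk -F' : ' '/Chunk Size/ {sub(/K$/,"",$2); print $2}')
layout=$(printf '%s\n' "$examine" | awk -F' : ' '/Layout/ {print $2}')
echo "--chunk=$chunk --parity=$layout"
```

The "Device Role" field in the same output tells you each member's position, which is how the correct (non-alphabetical) device order for --create can be reconstructed.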

The biggest help was this page: https://raid.wiki.kernel.org/index.p...ice_failure.29 - maybe I should have started there in the first place.

If someone is interested in the subject, I think I can be of further assistance - I got into this really deep, and the above is the essence of my research.

You can get into quite a pickle with software RAID, but I think it's still more recoverable than hardware RAID.
