Thread: Unable To Delete Sw Raid Volume Using Mdadm

  1. #1
    arnoschaefer Guest

    Hi,

    I have just spent hours on Google trying to find a solution to this problem, but nothing helped.

    I am using OpenSuSE 10.2. I set the system up with two disks in a software RAID (RAID1), with three partitions each: swap (2GB), / (20GB) and user data (the rest).

    /root>cat /proc/mdstat
    Personalities : [raid1] [raid0] [raid6] [raid5] [raid4] [linear]
    md2 : active raid1 sda3[0] sdb3[1]
    367631360 blocks [2/2] [UU]

    md1 : active raid1 sda2[0] sdb2[1]
    20972736 blocks [2/2] [UU]

    md0 : active raid1 sda1[0] sdb1[1]
    2104384 blocks [2/2] [UU]


    Later, I decided for performance reasons to remove the md0 swap array and convert its two partitions to regular swap space. However, I cannot delete the array: I always get a "device or resource busy" error.
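
    The performance argument is that the kernel allocates pages round-robin across swap areas that share the same priority, whereas RAID1 writes every page to both disks. A minimal sketch of the fstab entries this would need, assuming the array has already been removed; `swap_fstab_lines` is a hypothetical helper name and `pri=1` is an arbitrary choice, not something taken from this system:

```shell
#!/bin/sh
# Hypothetical helper: print one fstab swap line per partition, all with the
# same priority so the kernel alternates between them.
swap_fstab_lines() {
    for part in "$@"; do
        printf '%s\tswap\tswap\tpri=1\t0 0\n' "$part"
    done
}

# The two former md0 members from the mdstat output above.
swap_fstab_lines /dev/sda1 /dev/sdb1
```

    Each partition would still need `mkswap` run on it before `swapon` accepts it.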

    What I did first was to fail and remove the sdb1 partition from the md0 device. This worked fine:

    /root>mdadm /dev/md0 -f /dev/sdb1 -r /dev/sdb1
    mdadm: set /dev/sdb1 faulty in /dev/md0
    (... other success message).


    Then I turned off the swap on /dev/md0 using "swapoff /dev/md0".

    This is where the trouble started. Removing the second device from the RAID did not work:

    /root>mdadm /dev/md0 -f /dev/sda1 -r /dev/sda1
    mdadm: set /dev/sda1 faulty in /dev/md0
    mdadm: hot remove failed for /dev/sda1: Device or resource busy

    Neither did stopping the RAID:

    /root>mdadm -S /dev/md0
    mdadm: fail to stop array /dev/md0: Device or resource busy

    Rebooting did not change anything. Even after a reboot, the device is still busy.
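
    One thing worth ruling out when an array stays busy across a reboot is that it has simply been re-enabled as swap at boot, for example from /etc/fstab. A rough check, assuming the usual /proc/swaps layout (one header line, device name in the first column); `is_active_swap` is a hypothetical helper and the sample table is made up so the sketch runs anywhere:

```shell
#!/bin/sh
# Hypothetical helper: succeed if device $1 is listed as active swap.
# $2 is the swaps table to read, defaulting to /proc/swaps.
is_active_swap() {
    awk -v dev="$1" 'NR > 1 && $1 == dev { found = 1 } END { exit !found }' \
        "${2:-/proc/swaps}"
}

# Made-up sample in the format of /proc/swaps.
sample=$(mktemp)
printf '%s\n' \
    'Filename   Type        Size     Used  Priority' \
    '/dev/md0   partition   2104376  0     -1' > "$sample"

if is_active_swap /dev/md0 "$sample"; then
    echo "/dev/md0 is still in use as swap; run swapoff /dev/md0 first"
fi
rm -f "$sample"
```

    On a live system the second argument can be dropped so that /proc/swaps itself is read.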

    After trying everything I could think of to get rid of the RAID, I tried it by force and used fdisk to delete the /dev/sda1 partition from the sda disk. This is where it got really weird: after the reboot, the /dev/sdb1 partition (the one I had previously failed and removed from the RAID) was back in the RAID:

    md0 : active(auto-read-only) raid1 sdb1[1]
    2104384 blocks [2/1] [_U]

    Of course, the device was still busy and could not be removed.

    I just deleted /dev/sdb1 using fdisk. Guess what: now the system does not boot anymore:

    md: md0 stopped.
    mdadm: no devices found for /dev/md0
    Invoking userspace resume from /dev/md0
    Invoking in-kernel resume from /dev/md0
    Attempting manual resume
    I/O error reading swsusp image.

    This is insane. Is there any way to delete a RAID short of reinstalling the whole system from scratch? Who could help with this problem? Is there a mailing list for mdadm or something like that?

    BTW what the heck is swsusp? Does this mean that a reboot in OpenSuSE 10.2 is no longer a reboot, but a suspend to disk? If it is, then it is no wonder that rebooting does not fix any problems. How do I get around this?

    Regards,

    Arno

  2. #2
    arnoschaefer Guest

    Well, what do you know: it appears that swsusp was exactly the problem. Apparently, OpenSuSE comes with some sort of hibernation enabled by default. At every boot it tries to resume a suspended image from the swap device (/dev/md0 here), so the swap device is always busy.

    By adding the "noresume" option to the kernel command line, the resume attempt is switched off, and the RAID can be safely deleted.
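
    For anyone hitting the same symptom, a quick way to see whether the boot configuration points resume at the array is to look for a `resume=` parameter on the kernel command line. This is only a sketch: `resume_device` and `has_noresume` are hypothetical helper names, and the sample command line is an illustration consistent with the boot messages above, not a copy of the real one:

```shell
#!/bin/sh
# Hypothetical helper: print the value of the resume= parameter found in a
# kernel command line string, or nothing if there is none.
resume_device() {
    for word in $1; do
        case "$word" in
            resume=*) printf '%s\n' "${word#resume=}" ;;
        esac
    done
}

# Hypothetical helper: succeed if the command line contains "noresume".
has_noresume() {
    for word in $1; do
        [ "$word" = "noresume" ] && return 0
    done
    return 1
}

# Illustrative command line (made up); on a live system use:
#   cmdline=$(cat /proc/cmdline)
cmdline='root=/dev/md1 resume=/dev/md0 splash=silent'

dev=$(resume_device "$cmdline")
if [ -n "$dev" ] && ! has_noresume "$cmdline"; then
    echo "resume is configured for $dev; boot once with noresume before stopping it"
fi
```

    After booting with noresume, the `swapoff /dev/md0` and `mdadm -S /dev/md0` commands from the first post should no longer hit the busy error. Presumably the `resume=` entry in the bootloader configuration also needs to point at the new swap partition (or be dropped) once the array is gone; where that is set is not shown in this thread.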

    This must be one of the most braindead configurations in a Linux distro I have ever come across.

    Regards,

    Arno
