12.3 with RAID10

Hello all. I'm fairly new to Linux, but not to computers and networks. I used the Intel SATA RAID controller on a server to create a RAID10 array from 4x 2TB Western Digital RE4 hard disks, then installed openSUSE 12.3 on it. All is great except that the RAID rebuilds after every reboot. For some reason the array is not marked clean before the system shuts down. I have read that having the root file system on the array can cause this, because the file system is unmounted before the array can be marked clean. “They” say you can modify a script to prevent this, but I have yet to see which script or what the modifications are. I would greatly appreciate any help with this. I have a production database on this server and would rather not rebuild it if I don't have to.

mdadm --detail /dev/md126

/dev/md126:
      Container : /dev/md/imsm0, member 0
     Raid Level : raid10
     Array Size : 3711641600 (3539.70 GiB 3800.72 GB)
  Used Dev Size : 1855820928 (1769.85 GiB 1900.36 GB)
   Raid Devices : 4
  Total Devices : 4

          State : active

 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

         Layout : near=2
     Chunk Size : 64K


           UUID : dd2ab43b:37dd6ee2:8d78e0a9:ba8c4eec
    Number   Major   Minor   RaidDevice State
       3       8        0        0      active sync   /dev/sda
       2       8       16        1      active sync   /dev/sdb
       1       8       32        2      active sync   /dev/sdc
       0       8       48        3      active sync   /dev/sdd

cat /proc/mdstat

Personalities : [raid10] [raid0] [raid1] [raid6] [raid5] [raid4]
md126 : active raid10 sda[3] sdb[2] sdc[1] sdd[0]
      3711641600 blocks super external:/md127/0 64K chunks 2 near-copies [4/4] [UUUU]

md127 : inactive sdc[3] sda[2] sdd[1] sdb[0]
      12612 blocks super external:imsm

unused devices: <none>

Thanks!

Did you install all available updates? There was an update for mdadm in 12.3; from the changelog it sounds like it may fix this.
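
Something along these lines should show whether a newer mdadm is on offer and pull it in (assuming the standard 12.3 update repository is enabled):

# refresh repository metadata
zypper refresh
# compare the installed mdadm with what the repositories offer
zypper info mdadm
# install the newer package if one is available
zypper update mdadm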

And a remark on posting technique in the forums here.

As a new member (welcome!) you will not be aware of the CODE tags. The CODE tags are created by clicking on the # button in the toolbar of the post editor. You are kindly requested to copy/paste your computer texts (when applicable: prompt, command, output and next prompt) directly from the terminal emulator in between CODE tags. That will make them stand out clearly against the “story telling”, keep the layout as it was on the terminal, and it has further advantages in readability.

Thank you for your cooperation.

I applied all updates after the install via YaST and zypper. My mdadm version is:


mdadm --version
mdadm - v3.2.6 - 25th October 2012

My md version is:

 
md --version
mkdir (GNU coreutils) 8.17
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Are these up to date?
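
(Side note: the md --version output above is actually coreutils mkdir answering, since md seems to be a shell alias for mkdir here, so it says nothing about the kernel's md driver. The package versions themselves can be queried directly, e.g.:)

# exact installed package build
rpm -q mdadm
# installed vs. available version from the repositories
zypper info mdadm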

Thanks!

I just checked another server that I use, which has 4x 250GB SSD drives in RAID10, also on the on-board Intel SATA controller, and it is also rebuilding after reboots. This server has the same versions of the OS and mdadm on it. Have I built the RAID incorrectly? I built the arrays in the controller BIOS and then installed the OS. If I recall correctly, the OS installer noticed the setup in the controller and asked if I was OK with this; I'm not sure of the exact message, it was a while ago. I said yes and all was well, until I noticed the resync after reboots. Thanks!
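
If it helps with the diagnosis, mdadm can report what the on-board Intel (IMSM/RST) platform supports and which metadata the member disks carry; something like this (the device name is just an example):

# show the Intel IMSM/RST platform capabilities seen by mdadm
mdadm --detail-platform
# show the metadata format recorded on a member disk
mdadm --examine /dev/sda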

Made progress. After running all updates on one of the 12.3 boxes, my mdadm version is 3.2.6. I built a test box last night and bought 4x 500GB WD SATA3 drives for it. I went through the same procedure of building a RAID10 array with the Intel on-board controller on my MSI G series motherboard, installed 12.3 and updated completely. It still marks the array dirty on reboot or shutdown. I then wiped it and did the entire procedure over with openSUSE 13.1. That does not mark the array dirty on reboot or shutdown. The mdadm version on the openSUSE 13.1 box is 3.3.

How can I update mdadm 3.2.6 on the 12.3 box to the 3.3 version that is on my 13.1 box? Please help :P Thanks!

As I mentioned in the other thread you posted to, it sounds like a bug, so you need to report it to Bugzilla to have it fixed. 12.3 is still supported, so that may generate a solution.

Alternatively you can compile the newer code from source. Of course there is no way to know whether something else will break from this operation. Then again, it could be a kernel problem.
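
If you go the source route: the upstream tarballs live on kernel.org and build with a plain make, no configure step. Roughly like this, assuming gcc, make and the usual development packages are installed (exact file name per the version you pick), and note that make install overwrites the distribution's /sbin/mdadm:

# fetch and unpack the upstream mdadm 3.3 source
wget https://www.kernel.org/pub/linux/utils/raid/mdadm/mdadm-3.3.tar.gz
tar xzf mdadm-3.3.tar.gz
cd mdadm-3.3
# build and install over the packaged binaries
make
make install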

In your position I'd do what had to be done to move to 13.1 and then move to the Evergreen repos to have the advantage of a long support life.

I've spent many hours on this and have not gotten 12.3 on RAID10 to stop resyncing on reboot. I would love to upgrade to 13.x, but the software I need to run on the box, Vicidial, is not supported on that version from a distribution standpoint. If I knew what I was doing I guess I could spend the time trying to make that work, but I think that's more than I can chew.

I even tried a neat little tool called raider. I did the install on a single disk and used raider to convert it to RAID10. It spat out some errors at the very end; everything worked, but the last drive was removed and I could not add it back. It said it was RAID10 with 3 of 4 drives (UUU_) and did not resync when I rebooted, I just could not get the drive added. I think there is a fix for my problem, as this raider setup did not cause a resync on reboots. I have read in many places about a script that marks the array as clean before it unmounts; for the life of me I am unable to find this script. I would pay for that script, BTW.
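
For reference, adding a disk back into a degraded native md array normally goes something like this (device and partition names below are only placeholders, not my actual layout):

# copy the partition table from a healthy member to the re-added disk
sfdisk -d /dev/sda | sfdisk /dev/sdd
# hot-add the matching partition back into the degraded array
mdadm --manage /dev/md0 --add /dev/sdd1
# watch the rebuild progress
cat /proc/mdstat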

Done, Bugzilla report that is. Thanks!

I have 12.3 on RAID10 with no resync happening on reboots. :) The problem is I had to disable my Intel on-board RAID controller to get it this way and just assemble the array during the install.
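
For anyone curious, doing the same thing by hand (AHCI mode, no RST) would look roughly like this; device names and chunk size are just examples, and the installer's partitioner does the equivalent for you:

# create a native md RAID10 array across four disks
mdadm --create /dev/md0 --level=10 --raid-devices=4 --chunk=64 \
      /dev/sda /dev/sdb /dev/sdc /dev/sdd
# record it so the array is assembled automatically at boot
mdadm --detail --scan >> /etc/mdadm.conf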

I still need to test what happens on drive failures, but so far so good. I am not sure about the boot partitions; I guess I will find out, though.

If everything goes well in testing I will then have to consider rebuilding the servers, as they are using the RAID controller. I would just live with it, but the Asterisk system on these requires reboots; I have a cron job kicking one off every weekend. Is this too much stress on the disks? 4x Samsung 250GB SSD drives in one server and 4x Western Digital RE4 2TB drives in the other. Thoughts, please. Thanks!

OK, lots of testing and OS installs. It turns out there is no problem with 12.3 if you apply mdadm-3.3-126.1.x86_64.rpm. RAID10 built with the RST BIOS, or just using AHCI and assembling the array yourself, both work fine. My problem was with the distribution DVD I was using. Apparently they changed the way the system shuts down, so it is not marking the partition clean before the umount. Does anyone know how to make this happen, i.e. marking the partition clean before the umount? Do I need to add something to an rc script? Any help would be greatly appreciated. Thanks!
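
(For completeness, applying that package is just a normal local rpm install, e.g.:)

# upgrade in place from the downloaded rpm
rpm -Uvh mdadm-3.3-126.1.x86_64.rpm
# or let zypper handle it and any dependencies
zypper install ./mdadm-3.3-126.1.x86_64.rpm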

I do not understand all the details, but to me it seems that you have a clear case of it functioning as it should in 12.3, but failing in 13.1.

Then you should raise a bug report IMHO. You can do that at http://en.opensuse.org/Submitting_Bug_Reports (same username and password as here on the forums).

You could then post the bug number here, so that others can add information to it or vote for it.

13.1 has always worked for me; it was 12.3 I had the issue with. It turns out it is a 12.3 from a vendor distribution, and they must have done something to it that breaks this. When I download 12.3 from the openSUSE site and apply mdadm-3.3-126.1.x86_64.rpm it works great. I have tested both 13.1 and 12.3 with RAID10 arrays built in the RST BIOS (telling mdadm to manage them during the install) as well as with AHCI (building my own array through the install partitioner).

What I need to know is how to get the volume marked clean on a reboot or shutdown. I am reading that


mdadm --wait-clean --scan

is key. I have to get this to run just before the root partition is unmounted, but after it is remounted read-only, I think. Any ideas? Thanks! I could really use some help; I have arrays rebuilding every night now. The Asterisk app that runs on these servers has pretty bad memory leaks, I am told, and starts flaking out if you don't reboot it every night.

I found something I would like to try, but I'm not sure how to get it to run. I have never done a script. Any help?


# check out if a software raid is active
if test -e /proc/mdstat -a -x /sbin/mdadm ; then
    while read line ; do
        case "$line" in
        md*:*active*) mddev=--scan; break ;;
        esac
    done < /proc/mdstat
    unset line
    if test -n "$mddev" -a -e /etc/mdadm.conf ; then
        mddev=""
        while read type dev rest; do
            case "$dev" in
            /dev/md*) mddev="${mddev:+$mddev }$dev" ;;
            esac
        done < /etc/mdadm.conf
        unset type dev rest
    fi
fi

# kill splash animation
test "$SPLASH" = yes && /sbin/splash -q

echo "Sending all processes the TERM signal..."
killall5 -15
echo -e "$rc_done_up"

# wait between last SIGTERM and the next SIGKILL
rc_wait /sbin/blogd /sbin/splash

echo "Sending all processes the KILL signal..."
killall5 -9
echo -e "$rc_done_up"

if test -n "$REDIRECT" && /sbin/checkproc /sbin/blogd ; then
    # redirect our famous last messages to default console
    exec 0<> $REDIRECT 1>&0 2>&0
fi

# on umsdos fs this would lead to an error message, so direct errors to
# /dev/null
mount -no remount,ro / 2> /dev/null
sync

# wait for md arrays to become clean
if test -x /sbin/mdadm; then
    /sbin/mdadm --wait-clean --scan
fi
# stop any inactive software raid
if test -n "$mddev" ; then
    /sbin/mdadm --quiet --stop $mddev
    # redirect shell errors to /dev/null
    exec 3>&2 2>/dev/null
    # cause the md arrays to be marked clean immediately
    for proc in /proc/[0-9]* ; do
        test ! -e $proc/exe || continue
        read -t 1 tag name rest < $proc/status || continue
        case "$name" in
        md*_raid*) killproc -n -SIGKILL "$name" ;;
        esac
    done
    unset tag name rest
    # get shell errors back
    exec 2>&3-
fi

You need to do it after root is marked read-only.

In any case, that's what openSUSE 12.3 with the mdadm patch does by default, and it did so in previous versions (before systemd) as well. If your modified openSUSE does not, who knows what else has been changed there?

Agreed! I have no idea what they changed. I would like to know how these changes can be reversed and what the proper shutdown method is. Any example files or code would be greatly appreciated. I am unable to figure out how systemd works and how to manage it. Any good links for this? Thanks, arvidjaar.

Systemd executes programs in /usr/lib/systemd/system-shutdown very late (after all processes have been killed and filesystems have been unmounted or at least remounted read-only), immediately before performing the shutdown/reboot. That is where mdadm hooks in:

bor@opensuse:~> cat /usr/lib/systemd/system-shutdown/mdadm.shutdown 
#!/bin/sh
# We need to ensure all md array with external metadata
# (e.g. IMSM) are clean before completing the shutdown.
/sbin/mdadm --wait-clean --scan
bor@opensuse:~> 

For it to work, mdmon must be exempt from “killall”. Please see the documentation on the systemd site for how that is done.
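
In general, any executable dropped into that directory is run at the same late point. A minimal sketch of adding your own hook there, in case you ever need one (the filename and the command inside are arbitrary examples):

# create a custom late-shutdown hook; it runs after filesystems are
# unmounted or remounted read-only, right before the final poweroff/reboot
cat > /usr/lib/systemd/system-shutdown/my-hook.sh <<'EOF'
#!/bin/sh
# example action: dump the md array state to the console
cat /proc/mdstat > /dev/console 2>/dev/null
EOF
chmod 755 /usr/lib/systemd/system-shutdown/my-hook.sh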

I have the mdadm.shutdown file. I have spent the last 6 hours trying to find a way to stop mdmon from getting killed. I found an article that mentions

/run/sendsigs.omit.d

but I could not find it on any of my servers. I have read that mdmon needs to survive the killing of user-space processes or this problem will occur, but nobody says how to get it to run that way under systemd, or how to stop it from getting killed. Can you point me to an article on this? I would really appreciate it. Thanks!

Assuming you are using systemd at all, mdmon should already implement it (you cannot do it “from outside” anyway):
http://www.freedesktop.org/wiki/Software/systemd/RootStorageDaemons/
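
A quick way to check: per that page, a storage daemon that must survive the final kill marks itself by starting its process name with “@”, which is visible in its command line:

# list running mdmon instances; a leading "@" in the command column
# means the daemon is flagged to survive the shutdown kill phase
ps -eo pid,args | grep '[m]dmon'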

It appears systemd is running:


opensuse12:~ # ps -ef | grep systemd
root       351     1  0 23:05 ?        00:00:00 /usr/lib/systemd/systemd-journald
root       370     1  0 23:05 ?        00:00:00 /usr/lib/systemd/systemd-udevd
root       583     1  0 23:05 ?        00:00:00 /usr/lib/systemd/systemd-logind
message+   602     1  0 23:05 ?        00:00:00 /bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
root      2641  2504  0 23:49 pts/1    00:00:00 grep --color=auto systemd