mdstat: recovery = 161.6% (greater than 100%)?

I was just wondering how an mdadm RAID recovery could exceed 100%. Have a look, this is my current mdstat:

# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [raid0] [raid1]
md2 : active raid5 sdd3[4] sdc3[3] sdb3[1]
      1204119552 blocks super 1.0 level 5, 128k chunk, algorithm 0 [3/2] [_UU]
      [=================================>]  recovery =167.2% (503571456/301029888) finish=5950965.4min speed=27806K/sec
      bitmap: 27/288 pages [108KB], 1024KB chunk

md0 : active raid5 sdd1[4] sdc1[3] sdb1[1]
      4192768 blocks super 1.0 level 5, 128k chunk, algorithm 0 [3/3] [UUU]
      bitmap: 0/8 pages [0KB], 128KB chunk

md1 : active raid5 sdd2[4] sdc2[3] sdb2[1]
      41929472 blocks super 1.0 level 5, 128k chunk, algorithm 0 [3/3] [UUU]
      bitmap: 41/160 pages [164KB], 64KB chunk

unused devices: <none>
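
By the way, the odd percentage is at least consistent with the two counters on that recovery line: it is simply the current position divided by the reported total, and the position has somehow overrun the total. A quick check with bc:

    # sanity check of the percentage shown for md2 above:
    echo "scale=1; 503571456 * 100 / 301029888" | bc
    # -> 167.2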

Background:
I have three mdadm RAID 5 arrays running on three 640 GB disks (WD6400AAKS). After one of the disks showed unreadable/uncorrectable sectors, I decided to replace it with a new one.
This is what I did:

  • physically added the new disk to the last port of my 4-port (Promise) SATA controller; it became sdd (the pre-existing disks were sda, sdb and sdc, with sda being the defective one)

  • partitioned the new disk appropriately (by using one of the good disks as a template, here sdc)

    sfdisk -d /dev/sdc | sfdisk /dev/sdd

  • marked the sda partitions as faulty in each of my three md arrays:

    mdadm /dev/md0 -f /dev/sda1

    mdadm /dev/md1 -f /dev/sda2

    mdadm /dev/md2 -f /dev/sda3

  • added the new sdd partitions to those arrays individually:

    mdadm /dev/md0 -a /dev/sdd1

    … and when the rebuild was done, I removed the partition of the broken disk (sda) from that array (the full per-array sequence is sketched after this list):

    mdadm /dev/md0 -r /dev/sda1
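
For completeness, here is the whole per-array sequence I went through, written out as a small loop. This is just a compact sketch of what I described above; I actually ran the commands one by one, and the wait for the rebuild is of course a manual step:

    # for each of the three arrays md0/md1/md2 (partition n+1 on each disk):
    for n in 0 1 2; do
        p=$((n + 1))
        mdadm /dev/md$n -f /dev/sda$p    # mark the old (defective) partition as faulty
        mdadm /dev/md$n -a /dev/sdd$p    # add the new partition; the rebuild starts right away
        # ... wait for the rebuild to finish, e.g. with: watch cat /proc/mdstat
        mdadm /dev/md$n -r /dev/sda$p    # then remove the failed partition from the array
    done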

BUT: with the last (and by far biggest) array, the rebuild is apparently not going to finish for more than a decade, going by the mdstat entry “finish=5950965.4min”.

Any idea what I did wrong, or what to do to fix this?

Update:
In the meantime my md2 RAID array has recovered completely!
All in all, the “rebuild” took about three hours, plus another three hours for the array, which was then reported as “active, degraded”, to complete the recovery process.
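
(The “active, degraded” state quoted above is what mdadm itself reports for the array; it can be checked at any time with the command below. The exact output fields vary a bit between mdadm versions.)

    # show the array state, rebuild progress and member devices for md2:
    mdadm --detail /dev/md2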

The mdstat now looks just fine:


# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [raid0] [raid1]
md2 : active raid5 sda3[4] sdc3[3] sdb3[1]
      1204119552 blocks super 1.0 level 5, 128k chunk, algorithm 0 [3/3] [UUU]
      bitmap: 2/288 pages [8KB], 1024KB chunk

md0 : active (auto-read-only) raid5 sda1[4] sdc1[3] sdb1[1]
      4192768 blocks super 1.0 level 5, 128k chunk, algorithm 0 [3/3] [UUU]
      bitmap: 0/8 pages [0KB], 128KB chunk

md1 : active raid5 sda2[4] sdc2[3] sdb2[1]
      41929472 blocks super 1.0 level 5, 128k chunk, algorithm 0 [3/3] [UUU]
      bitmap: 2/160 pages [8KB], 64KB chunk

unused devices: <none>

I am still wondering why mdadm was giving me a rebuild percentage above 100% and why it expected such an absurdly high “finish” duration…

However, it finally worked out fine and I am very happy with my GNU/Linux software RAID solution!