Need help creating RAID5 software array

I have a bunch of hard disks of similar type and the same size:

# hdparm -i /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf | grep Serial
Model=WDC WD20EZRX-00DC0B0, FwRev=80.00A80, SerialNo=WD-WCC300465210
Model=WDC WD20EADS-00W4B0, FwRev=01.00A01, SerialNo=WD-WCAVY4624866
Model=WDC WD20EADS-00W4B0, FwRev=01.00A01, SerialNo=WD-WCAVY4624512
Model=WDC WD20EADS-00R6B0, FwRev=01.00A01, SerialNo=WD-WCAVY1854288
Model=WDC WD20EADS-00R6B0, FwRev=01.00A01, SerialNo=WD-WCAVY1811196

These have been prepared like this:

# parted /dev/sdb mklabel gpt
# parted /dev/sdc mklabel gpt
# parted /dev/sdd mklabel gpt
# parted /dev/sde mklabel gpt
# parted /dev/sdf mklabel gpt

# parted -a optimal -- /dev/sdb mkpart primary 2048s 100%
# parted -a optimal -- /dev/sdc mkpart primary 2048s 100%
# parted -a optimal -- /dev/sdd mkpart primary 2048s 100%
# parted -a optimal -- /dev/sde mkpart primary 2048s 100%
# parted -a optimal -- /dev/sdf mkpart primary 2048s 100%

# parted /dev/sdb set 1 raid on
# parted /dev/sdc set 1 raid on
# parted /dev/sdd set 1 raid on
# parted /dev/sde set 1 raid on
# parted /dev/sdf set 1 raid on
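
For reference, the resulting layout can be double-checked with something like this before creating the array (commands only, output omitted; the device names are assumed to still be sdb through sdf):

# lsblk -o NAME,SIZE,TYPE,PTTYPE /dev/sd[b-f]
# parted /dev/sdb unit s print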

Now, when trying to create the array, I get an error message:

# mdadm --create /dev/md0 --level=5 --raid-devices=5 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1
mdadm: You haven't given enough devices (real or missing) to create this array

Frankly, five devices are given, which should be sufficient.

When giving the command like this, I get different output:

# mdadm --create /dev/md0  --level=5 --raid-devices=5 /dev/sd[b-f]1
mdadm: partition table exists on /dev/sdb1
mdadm: partition table exists on /dev/sdb1 but will be lost or
       meaningless after creating array
mdadm: partition table exists on /dev/sde1
mdadm: partition table exists on /dev/sde1 but will be lost or
       meaningless after creating array
mdadm: partition table exists on /dev/sdf1
mdadm: partition table exists on /dev/sdf1 but will be lost or
       meaningless after creating array
Continue creating array?

I don’t get it, to be honest. All five devices were prepared in exactly the same way, but I only get these warnings for three of them. What about /dev/sdc1 and /dev/sdd1?

What am I doing wrong?

Example output of

fdisk -l /dev/sdb1

would be interesting, as well as that of

fdisk -l /dev/sdb

Actually, last night I decided to just ignore the warnings and entered ‘y’ to continue, because I was curious what would happen.

 # mdadm --create /dev/md0  --level=5 --raid-devices=5 /dev/sd[b-f]1
...
Continue creating array? y
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.

It seemed to be working on creating the array:

# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sdf1[5] sde1[3] sdd1[2] sdc1[1] sdb1[0]
      7813525504 blocks super 1.2 level 5, 512k chunk, algorithm 2 [5/4] [UUUU_]
      [===>.................]  recovery = 16.5% (322591160/1953381376) finish=302.1min speed=89936K/sec
      bitmap: 0/15 pages [0KB], 65536KB chunk

unused devices: <none>

Again, there is some unexpected and somewhat strange output: why does it say ‘sdf1[5]’ and not ‘sdf1[4]’? That doesn’t make sense, right?

Anyway, I decided to let it continue overnight. This morning, the array building process had finished:

# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sdf1[5] sde1[3] sdd1[2] sdc1[1] sdb1[0]
      7813525504 blocks super 1.2 level 5, 512k chunk, algorithm 2 [5/5] [UUUUU]
      bitmap: 0/15 pages [0KB], 65536KB chunk

unused devices: <none>

Finally, I could format the array:

# mkfs.btrfs -L wdarray /dev/md0
btrfs-progs v5.14
See http://btrfs.wiki.kernel.org for more information.

Label:              wdarray
UUID:               73f46396-5fb3-41cb-b644-5153ef0310bc
Node size:          16384
Sector size:        4096
Filesystem size:    7.28TiB
Block group profiles:
  Data:             single            8.00MiB
  Metadata:         DUP               1.00GiB
  System:           DUP               8.00MiB
SSD detected:       no
Zoned device:       no
Incompat features:  extref, skinny-metadata
Runtime features:
Checksum:           crc32c
Number of devices:  1
Devices:
   ID        SIZE  PATH
    1     7.28TiB  /dev/md0
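
Mounting it would then be something along these lines (a sketch, assuming a mount point of /mnt/wdarray; the UUID is the one reported by mkfs.btrfs above):

# mkdir -p /mnt/wdarray
# mount /dev/md0 /mnt/wdarray
# echo 'UUID=73f46396-5fb3-41cb-b644-5153ef0310bc /mnt/wdarray btrfs defaults 0 0' >> /etc/fstab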

Here is the ‘fdisk -l /dev/sdn’ output:

# fdisk -l /dev/sdb
Disk /dev/sdb: 1.82 TiB, 2000398934016 bytes, 3907029168 sectors
Disk model: WDC WD20EZRX-00D
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: DD199060-8650-4076-B9D0-4CA99F14CF04

Device     Start        End    Sectors  Size Type
/dev/sdb1   2048 3907028991 3907026944  1.8T Linux RAID

# fdisk -l /dev/sdc
Disk /dev/sdc: 1.82 TiB, 2000398934016 bytes, 3907029168 sectors
Disk model: WDC WD20EADS-00W
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 0ED830E6-1527-47B5-AAAA-801062F013E8

Device     Start        End    Sectors  Size Type
/dev/sdc1   2048 3907028991 3907026944  1.8T Linux RAID

# fdisk -l /dev/sdd
Disk /dev/sdd: 1.82 TiB, 2000398934016 bytes, 3907029168 sectors
Disk model: WDC WD20EADS-00W
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 7A8D6A4A-4D63-4881-96E9-C35295FD51A2

Device     Start        End    Sectors  Size Type
/dev/sdd1   2048 3907028991 3907026944  1.8T Linux RAID

 # fdisk -l /dev/sde
Disk /dev/sde: 1.82 TiB, 2000398934016 bytes, 3907029168 sectors
Disk model: WDC WD20EADS-00R
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 72AE6882-8DFD-46CE-BB76-2A6DE8D98B28

Device     Start        End    Sectors  Size Type
/dev/sde1   2048 3907028991 3907026944  1.8T Linux RAID

# fdisk -l /dev/sdf
Disk /dev/sdf: 1.82 TiB, 2000398934016 bytes, 3907029168 sectors
Disk model: WDC WD20EADS-00R
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: F1630B47-0280-4C04-B5A3-62004AC7F69C

Device     Start        End    Sectors  Size Type
/dev/sdf1   2048 3907028991 3907026944  1.8T Linux RAID

So far, so good. However, the ‘fdisk -l /dev/sdn1’ output now seems somewhat strange to me, not what I would have expected:


 # fdisk -l /dev/sdb1
Disk /dev/sdb1: 1.82 TiB, 2000397795328 bytes, 3907026944 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: dos
Disk identifier: 0x69205244

Device      Boot      Start        End    Sectors   Size Id Type
/dev/sdb1p1       218129509 1920119918 1701990410 811.6G 72 unknown
/dev/sdb1p2       729050177 1273024900  543974724 259.4G 74 unknown
/dev/sdb1p3       168653938  168653938          0     0B 65 Novell Netware 386
/dev/sdb1p4      2692939776 2692991410      51635  25.2M  0 Empty

Partition 1 does not start on physical sector boundary.
Partition 2 does not start on physical sector boundary.
Partition 3 does not start on physical sector boundary.
Partition table entries are not in disk order.


# fdisk -l /dev/sdc1
Disk /dev/sdc1: 1.82 TiB, 2000397795328 bytes, 3907026944 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

# fdisk -l /dev/sdd1
Disk /dev/sdd1: 1.82 TiB, 2000397795328 bytes, 3907026944 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

# fdisk -l /dev/sde1
Disk /dev/sde1: 1.82 TiB, 2000397795328 bytes, 3907026944 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x4e0a0d00

Device      Boot      Start        End    Sectors   Size Id Type
/dev/sde1p1      1920221984 3736432267 1816210284   866G 74 unknown
/dev/sde1p2      1936028192 3889681299 1953653108 931.6G 6c unknown
/dev/sde1p3               0          0          0     0B  0 Empty
/dev/sde1p4        27459978   27460418        441 220.5K  0 Empty

Partition table entries are not in disk order.

 # fdisk -l /dev/sdf1
Disk /dev/sdf1: 1.82 TiB, 2000397795328 bytes, 3907026944 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x4e0a0d00

Device      Boot      Start        End    Sectors   Size Id Type
/dev/sdf1p1      1920221984 3736432267 1816210284   866G 74 unknown
/dev/sdf1p2      1936028192 3889681299 1953653108 931.6G 6c unknown
/dev/sdf1p3               0          0          0     0B  0 Empty
/dev/sdf1p4        27459978   27460418        441 220.5K  0 Empty

Partition table entries are not in disk order.

This is confusing, right?

The newer WD20EZRX has a different physical sector size (4096 bytes) than the older WD20EADS (512 bytes). Could that be causing the issue?

This matches the other information, and to that extent it pretty much makes sense.

So, one disk is missing and mdadm performs a RAID recovery. The number after the component name is an internal number (not the position in the RAID array) which gets assigned sequentially. It looks like mdadm decided to rebuild the RAID5 onto the “new” replacement disk. I cannot say whether it always works this way or whether something happened when creating this array.
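
The role each component currently plays (active, spare, rebuilding) can be checked with mdadm itself; just the commands here, since I do not have your array in front of me:

# mdadm --detail /dev/md0
# mdadm --examine /dev/sdf1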

So, mdadm was right: there is a partition table on /dev/sdb1. Where it came from, I have no idea; you should know better how these disks were used in the past.

Briefly looking at examples on the net, this appears to be the standard behavior.

And which one would this be? Why?
All five disks are clearly detected and can be fdisk’d, b through f.

All disks have been prepared like this on a Windows box:

C:\> diskpart
list disk
select disk 2
clean
convert gpt

So none of them should have had any partition beforehand.

I am not the mdadm developer; you can study the mdadm sources to find out exactly what it does. My educated guess is that it needs to initialize the RAID5 parity and reuses the same engine that is used for recovery after a disk failure, which makes sense. You can try --assume-clean to see whether it behaves differently, but in your particular case it will result in corrupted data (because the disks are not clean).
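
If you want to watch the parity initialization, or trigger a consistency check once it has finished, the md sysfs interface can be used; a sketch, assuming the array really is md0:

# watch cat /proc/mdstat
# echo check > /sys/block/md0/md/sync_action
# cat /sys/block/md0/md/mismatch_cnt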

It (diskpart’s clean) just removes the partition table; it does not touch any data inside the partitions.

Which should be sufficient.

Also, this is not a recovery, this is a new array built upon used disks.
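
Since these disks clearly carry leftovers from a previous life, it might be worth wiping all known signatures before repartitioning, not just recreating the partition table. A sketch (destructive, obviously; it assumes the devices are still sdb through sdf and that any existing array on them has been stopped first):

# wipefs -a /dev/sd[b-f]1
# wipefs -a /dev/sd[b-f]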

I started again from scratch, this time using Linux to delete the partition tables. I also rebooted after each step to avoid issues from lingering /etc/fstab updates. Eventually I end up with the very same illogical disk layout as in the first run. :expressionless:

Step 1: Delete partitions

# sfdisk --delete /dev/sdb
The partition table has been altered.
Calling ioctl() to re-read partition table.
Re-reading the partition table failed.: Device or resource busy
The kernel still uses the old table. The new table will be used at the next reboot or after you run partprobe(8) or partx(8).
Syncing disks.

# sfdisk --delete /dev/sdc
The partition table has been altered.
Calling ioctl() to re-read partition table.
Re-reading the partition table failed.: Device or resource busy
The kernel still uses the old table. The new table will be used at the next reboot or after you run partprobe(8) or partx(8).
Syncing disks.

# sfdisk --delete /dev/sdd
The partition table has been altered.
Calling ioctl() to re-read partition table.
Re-reading the partition table failed.: Device or resource busy
The kernel still uses the old table. The new table will be used at the next reboot or after you run partprobe(8) or partx(8).
Syncing disks.

# sfdisk --delete /dev/sde
The partition table has been altered.
Calling ioctl() to re-read partition table.
Re-reading the partition table failed.: Device or resource busy
The kernel still uses the old table. The new table will be used at the next reboot or after you run partprobe(8) or partx(8).
Syncing disks.

# sfdisk --delete /dev/sdf
The partition table has been altered.
Calling ioctl() to re-read partition table.
Re-reading the partition table failed.: Device or resource busy
The kernel still uses the old table. The new table will be used at the next reboot or after you run partprobe(8) or partx(8).
Syncing disks.

# reboot
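
In hindsight, a reboot should not strictly have been necessary here; as the message itself says, partprobe(8) can make the kernel re-read the tables once nothing is holding the disks any more (a sketch):

# partprobe /dev/sd[b-f]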

Step 2: Set partition labels

$ sudo su
# parted /dev/sdb mklabel gpt
Warning: The existing disk label on /dev/sdb will be destroyed and all data on this disk will be lost. Do you want to continue?
Yes/No? y
Information: You may need to update /etc/fstab.

# parted /dev/sdc mklabel gpt
Warning: The existing disk label on /dev/sdc will be destroyed and all data on this disk will be lost. Do you want to continue?
Yes/No? y
Information: You may need to update /etc/fstab.

# parted /dev/sdd mklabel gpt
Warning: The existing disk label on /dev/sdd will be destroyed and all data on this disk will be lost. Do you want to continue?
Yes/No? y
Information: You may need to update /etc/fstab.

# parted /dev/sde mklabel gpt
Warning: The existing disk label on /dev/sde will be destroyed and all data on this disk will be lost. Do you want to continue?
Yes/No? y
Information: You may need to update /etc/fstab.

# parted /dev/sdf mklabel gpt
Warning: The existing disk label on /dev/sdf will be destroyed and all data on this disk will be lost. Do you want to continue?
Yes/No? y
Information: You may need to update /etc/fstab.

# reboot

Step 3: Create aligned partitions

$ sudo su
# parted -a optimal -- /dev/sdb mkpart primary 2048s 100%
Information: You may need to update /etc/fstab.

# parted -a optimal -- /dev/sdc mkpart primary 2048s 100%
Information: You may need to update /etc/fstab.

# parted -a optimal -- /dev/sdd mkpart primary 2048s 100%
Information: You may need to update /etc/fstab.

# parted -a optimal -- /dev/sde mkpart primary 2048s 100%
Information: You may need to update /etc/fstab.

# parted -a optimal -- /dev/sdf mkpart primary 2048s 100%
Information: You may need to update /etc/fstab.

# reboot

Step 4: Prepare partitions for RAID array creation

$ sudo su
# parted /dev/sdb set 1 raid on
Information: You may need to update /etc/fstab.

# parted /dev/sdc set 1 raid on
Information: You may need to update /etc/fstab.

# parted /dev/sdd set 1 raid on
Information: You may need to update /etc/fstab.

# parted /dev/sde set 1 raid on
Information: You may need to update /etc/fstab.

# parted /dev/sdf set 1 raid on
Information: You may need to update /etc/fstab.

# reboot

Step 5: Create RAID array FAILS

$ sudo mdadm --create /dev/md0  --level=5 --raid-devices=5 /dev/sd[b-f]1
mdadm: cannot open /dev/sdb1: Device or resource busy

??? This is after reboot!

Even worse:

$ sudo fdisk -l /dev/sdb1
Disk /dev/sdb1: 1.82 TiB, 2000397795328 bytes, 3907026944 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: dos
Disk identifier: 0x69205244

Device      Boot      Start        End    Sectors   Size Id Type
/dev/sdb1p1       218129509 1920119918 1701990410 811.6G 72 unknown
/dev/sdb1p2       729050177 1273024900  543974724 259.4G 74 unknown
/dev/sdb1p3       168653938  168653938          0     0B 65 Novell Netware 386
/dev/sdb1p4      2692939776 2692991410      51635  25.2M  0 Empty

Partition 1 does not start on physical sector boundary.
Partition 2 does not start on physical sector boundary.
Partition 3 does not start on physical sector boundary.
Partition table entries are not in disk order.

It looks like none of the previous commands were actually written to the partition tables.

Why is the device busy directly after reboot?
This is the first command I enter right after login:

$ sudo wipefs -a /dev/sdb
wipefs: error: /dev/sdb: probing initialization failed: Device or resource busy
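
My best guess at this point: the leftover md superblocks on the old partitions make the kernel auto-assemble an array during boot, and that array is what keeps the member devices busy. Something like the following should confirm this and release the disks (a sketch; the auto-assembled array may show up under a different name, e.g. /dev/md127):

# cat /proc/mdstat
# mdadm --stop /dev/md0 (or whatever array name /proc/mdstat shows)
# mdadm --zero-superblock /dev/sd[b-f]1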

This is an update on the open issue. After reading the mdadm(8) man page, I came to the conclusion that mdadm trying to be clever could be the issue:

-f, --force

Insist that mdadm accept the geometry and layout specified without question. Normally mdadm will not allow creation of an array with only one device, and will try to create a RAID5 array with one missing drive (as this makes the initial resync work faster). With --force, mdadm will not try to be so clever.

So, I tried again, this time just adding the --force option to the mdadm command:

  1. Stop the faulty array: # mdadm --stop /dev/md0
  2. Clean up: # wipefs -a /dev/sdb (repeat for the other four devices)
  3. Create partitions (as above)
  4. Reset array definition:
    # echo 'DEVICE /dev/hd*[0-9] /dev/sd*[0-9]' > /etc/mdadm.conf
    # mdadm --detail --scan >> /etc/mdadm.conf
  5. Create the array, this time with the --force option:
# mdadm --create /dev/md0  --level=5 --raid-devices=5 /dev/sd[b-f]1 --force
mdadm: /dev/sdc1 appears to be part of a raid array:
       level=raid5 devices=5 ctime=Wed Oct 11 21:46:58 2023
mdadm: /dev/sdd1 appears to be part of a raid array:
       level=raid5 devices=5 ctime=Wed Oct 11 21:46:58 2023
mdadm: /dev/sde1 appears to be part of a raid array:
       level=raid5 devices=5 ctime=Wed Oct 11 21:46:58 2023
mdadm: partition table exists on /dev/sde1 but will be lost or
       meaningless after creating array
mdadm: /dev/sdf1 appears to be part of a raid array:
       level=raid5 devices=5 ctime=Wed Oct 11 21:46:58 2023
mdadm: partition table exists on /dev/sdf1 but will be lost or
       meaningless after creating array
Continue creating array? y
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.

This time it is not recovering; it is building the array from scratch:


# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sdf1[4] sde1[3] sdd1[2] sdc1[1] sdb1[0]
      7813525504 blocks super 1.2 level 5, 512k chunk, algorithm 2 [5/5] [UUUUU]
      [>....................]  resync =  0.0% (1375612/1953381376) finish=809.6min speed=40183K/sec
      bitmap: 15/15 pages [60KB], 65536KB chunk

unused devices: <none>

Looks good so far.
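
Once the resync has finished, the remaining step will be to persist the new array definition and refresh the initramfs so that the array assembles cleanly at boot; roughly like this (the initramfs command differs per distribution: dracut -f is shown here, Debian-based systems use update-initramfs -u):

# mdadm --detail --scan >> /etc/mdadm.conf
# dracut -f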