Hello,
About a year ago, I created a Btrfs-based RAID1 using the YaST Partitioner. I also enabled the disk encryption option during the process.
Now, all of a sudden, the RAID array can no longer be opened with LUKS.
Why is this happening?
The error message for the RAID /dev/md/NAS reads:
LUKS keyslot 4 is invalid.
Device /dev/md/NAS is not a valid LUKS device.
It’s also worth noting that keyslot 4 wasn’t even used. Only keyslots 1 and 2 were set…
The output from mdadm --detail and mdadm --examine shows the following:
nasserver:/home/admin # mdadm --detail /dev/md/NAS
/dev/md/NAS:
Version : 1.0
Creation Time : Mon May 22 06:00:38 2023
Raid Level : raid1
Array Size : 15625879360 (14.55 TiB 16.00 TB)
Used Dev Size : 15625879360 (14.55 TiB 16.00 TB)
Raid Devices : 3
Total Devices : 3
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Wed Jul 31 01:48:24 2024
State : clean
Active Devices : 3
Working Devices : 3
Failed Devices : 0
Spare Devices : 0
Consistency Policy : bitmap
Name : any:NAS
UUID : bbf107f6:69f4d874:565b9db2:bd1a37b7
Events : 604873
Number Major Minor RaidDevice State
0 8 32 0 active sync /dev/sdc
1 8 0 1 active sync /dev/sda
2 8 16 2 active sync /dev/sdb
nasserver:/home/admin # mdadm --misc --examine --verbose /dev/sd[abc]
/dev/sda:
Magic : a92b4efc
Version : 1.0
Feature Map : 0x1
Array UUID : bbf107f6:69f4d874:565b9db2:bd1a37b7
Name : any:NAS
Creation Time : Mon May 22 06:00:38 2023
Raid Level : raid1
Raid Devices : 3
Avail Dev Size : 31251759016 sectors (14.55 TiB 16.00 TB)
Array Size : 15625879360 KiB (14.55 TiB 16.00 TB)
Used Dev Size : 31251758720 sectors (14.55 TiB 16.00 TB)
Super Offset : 31251759088 sectors
Unused Space : before=0 sectors, after=296 sectors
State : clean
Device UUID : e65eef70:fe0e7627:2a07bd6a:2ae8cd7c
Internal Bitmap : -72 sectors from superblock
Update Time : Wed Jul 31 01:48:24 2024
Bad Block Log : 512 entries available at offset -8 sectors
Checksum : d8035600 - correct
Events : 604873
Device Role : Active device 1
Array State : AAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdb:
Magic : a92b4efc
Version : 1.0
Feature Map : 0x1
Array UUID : bbf107f6:69f4d874:565b9db2:bd1a37b7
Name : any:NAS
Creation Time : Mon May 22 06:00:38 2023
Raid Level : raid1
Raid Devices : 3
Avail Dev Size : 35156656040 sectors (16.37 TiB 18.00 TB)
Array Size : 15625879360 KiB (14.55 TiB 16.00 TB)
Used Dev Size : 31251758720 sectors (14.55 TiB 16.00 TB)
Super Offset : 35156656112 sectors
Unused Space : before=0 sectors, after=3904897320 sectors
State : clean
Device UUID : 4eefa158:7e983fa6:9b63c49b:636d4179
Internal Bitmap : -72 sectors from superblock
Update Time : Wed Jul 31 01:48:24 2024
Bad Block Log : 512 entries available at offset -8 sectors
Checksum : 3d7a5196 - correct
Events : 604873
Device Role : Active device 2
Array State : AAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdc:
Magic : a92b4efc
Version : 1.0
Feature Map : 0x1
Array UUID : bbf107f6:69f4d874:565b9db2:bd1a37b7
Name : any:NAS
Creation Time : Mon May 22 06:00:38 2023
Raid Level : raid1
Raid Devices : 3
Avail Dev Size : 35156656040 sectors (16.37 TiB 18.00 TB)
Array Size : 15625879360 KiB (14.55 TiB 16.00 TB)
Used Dev Size : 31251758720 sectors (14.55 TiB 16.00 TB)
Super Offset : 35156656112 sectors
Unused Space : before=0 sectors, after=3904897320 sectors
State : clean
Device UUID : f187ec83:6ce4ae41:e5e52705:0f0fc0da
Internal Bitmap : -72 sectors from superblock
Update Time : Wed Jul 31 01:48:24 2024
Bad Block Log : 512 entries available at offset -8 sectors
Checksum : cf165a1a - correct
Events : 604873
Device Role : Active device 0
Array State : AAA ('A' == active, '.' == missing, 'R' == replacing)
According to mdadm, all checksums are correct. A SMART test with the manufacturer's tools also confirmed that all hard drives are error-free. Therefore, I don't believe it's a hardware issue.
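The equivalent check with smartmontools would be something along these lines (I used the vendor tools, so this is only for reference):
smartctl -H /dev/sda    # overall health self-assessment
smartctl -a /dev/sda    # full SMART attributes and error log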
I rolled back the system with Snapper to an earlier state where the RAID was still working without any issues. However, this didn't make any difference, so I don't believe the problem is caused by faulty installed software.
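The rollback itself was the usual Snapper workflow, roughly (the snapshot number is of course specific to my system):
snapper list            # pick a snapshot from before the problem appeared
snapper rollback 42     # 42 is just an example number
reboot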
I was able to determine that there is a valid LUKS header at the beginning of the drive with device number 2 (Device Role: Active device 2, currently /dev/sdb). This header is recognized when the drive is accessed directly, without mdadm.
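Presumably the simplest way to see this is a read-only dump of the header directly from the member disk, e.g.:
cryptsetup luksDump /dev/sdb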
Therefore, I created a new mdadm RAID array using mdadm --create, so that the drive with the LUKS header is now in the first position. The paths changed from /dev/sdX to /dev/mapper/sdX because, from this point on, I am working on overlays of the sdX devices rather than on the disks themselves (see the overlay sketch after the output below):
mdadm --create /dev/md/NAS --assume-clean --level=1 --metadata=1.0 --raid-devices=3 /dev/mapper/sdb /dev/mapper/sdc /dev/mapper/sda
mdadm: /dev/mapper/sdb appears to be part of a raid array:
level=raid1 devices=3 ctime=Mon May 22 06:00:38 2023
mdadm: /dev/mapper/sdc appears to be part of a raid array:
level=raid1 devices=3 ctime=Mon May 22 06:00:38 2023
mdadm: /dev/mapper/sda appears to be part of a raid array:
level=raid1 devices=3 ctime=Mon May 22 06:00:38 2023
mdadm: largest drive (/dev/mapper/sdb) exceeds size (15625879360K) by more than 1%
Continue creating array? y
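As background on the /dev/mapper/sdX devices: they are copy-on-write overlays, so anything mdadm --create writes never touches the original disks. A minimal sketch of how such an overlay can be built with dmsetup; the file path, overlay size and the 8-sector chunk size are illustrative, not necessarily the exact values I used:
truncate -s 10G /tmp/overlay-sda                   # sparse file that absorbs all writes
loopdev=$(losetup -f --show /tmp/overlay-sda)      # expose the file as a block device
size=$(blockdev --getsz /dev/sda)                  # origin size in 512-byte sectors
dmsetup create sda --table "0 $size snapshot /dev/sda $loopdev P 8"
This yields /dev/mapper/sda; the same is repeated for sdb and sdc.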
When creating the array, mdadm shows the warning "largest drive exceeds size by more than 1%". However, since --assume-clean is used and no resync or rewrite of the data actually takes place, this shouldn't be a problem. Or am I mistaken?
Now the RAID array can be opened with LUKS and mounted again.
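Concretely, something along these lines works again (the mapping name matches the btrfs check output below; the mount point is illustrative):
cryptsetup open /dev/md/NAS data    # prompts for the passphrase, creates /dev/mapper/data
mount /dev/mapper/data /mnt/nas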
A test with btrfs check shows that the filesystem is corrupted and that some checksums are missing. However, the data itself is still accessible.
btrfs check --check-data-csum /dev/mapper/data
Opening filesystem to check…
Checking filesystem on /dev/mapper/data
UUID: b7398a14-99e6-4245-b152-85d4a22e7155
[1/7] checking root items
[2/7] checking extents
[3/7] checking free space cache
[4/7] checking fs roots
root 259 inode 17447 errors 1000, some csum missing
root 7968 inode 17447 errors 1000, some csum missing
root 8540 inode 17447 errors 1000, some csum missing
root 10472 inode 17447 errors 1000, some csum missing
root 11035 inode 17447 errors 1000, some csum missing
root 12429 inode 17447 errors 1000, some csum missing
root 13718 inode 17447 errors 1000, some csum missing
root 14494 inode 17447 errors 1000, some csum missing
root 15513 inode 17447 errors 1000, some csum missing
root 17341 inode 17447 errors 1000, some csum missing
root 19777 inode 17447 errors 1000, some csum missing
root 19897 inode 17447 errors 1000, some csum missing
root 20018 inode 17447 errors 1000, some csum missing
root 20139 inode 17447 errors 1000, some csum missing
root 20260 inode 17447 errors 1000, some csum missing
root 20380 inode 17447 errors 1000, some csum missing
root 20501 inode 17447 errors 1000, some csum missing
root 20621 inode 17447 errors 1000, some csum missing
root 20743 inode 17447 errors 1000, some csum missing
root 20864 inode 17447 errors 1000, some csum missing
root 20869 inode 17447 errors 1000, some csum missing
root 20874 inode 17447 errors 1000, some csum missing
root 20879 inode 17447 errors 1000, some csum missing
root 20884 inode 17447 errors 1000, some csum missing
root 20889 inode 17447 errors 1000, some csum missing
root 20894 inode 17447 errors 1000, some csum missing
root 20899 inode 17447 errors 1000, some csum missing
root 20904 inode 17447 errors 1000, some csum missing
root 20909 inode 17447 errors 1000, some csum missing
root 20914 inode 17447 errors 1000, some csum missing
root 20919 inode 17447 errors 1000, some csum missing
root 20924 inode 17447 errors 1000, some csum missing
root 20929 inode 17447 errors 1000, some csum missing
root 20934 inode 17447 errors 1000, some csum missing
root 20939 inode 17447 errors 1000, some csum missing
root 20944 inode 17447 errors 1000, some csum missing
root 20949 inode 17447 errors 1000, some csum missing
root 20954 inode 17447 errors 1000, some csum missing
root 20959 inode 17447 errors 1000, some csum missing
root 20964 inode 17447 errors 1000, some csum missing
root 20969 inode 17447 errors 1000, some csum missing
ERROR: errors found in fs roots
found 4455135825920 bytes used, error(s) found
total csum bytes: 4337800852
total tree bytes: 12935102464
total fs tree bytes: 4029333504
total extent tree bytes: 3621240832
btree space waste bytes: 2327492901
file data blocks allocated: 444362249580544
referenced 10789058260992
What does “errors 1000” mean? Is that the number of errors?
And does it mean that all Btrfs blocks that do have checksums are free of errors?
Now I have the following questions:
- Should I continue with the current approach and let Btrfs repair the filesystem errors, e.g. with btrfs check --repair? If not, what would be a better approach?
- I don't quite understand why mdadm is being used at the block device level. Btrfs should be managing the RAID, shouldn't it? Or is another RAID layer being used on top of the LUKS encryption, this time managed by Btrfs? Shouldn't Btrfs be aware of the physical block devices underneath the LUKS layer?
- What happens if the drive containing the LUKS header fails? In a RAID1 setup, a drive failure should be possible without data loss. Or is the LUKS header also stored on the other drives? I have already made a backup of the LUKS header, but unfortunately I have no idea how to restore it if that drive fails (my guess at the relevant commands is sketched after this list).
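Regarding the last point: I assume the backup/restore pair I need is cryptsetup luksHeaderBackup / luksHeaderRestore, roughly like this (the file name is just an example, and restoring overwrites the header on the target device, so it must only ever be pointed at the right one):
cryptsetup luksHeaderBackup /dev/md/NAS --header-backup-file nas-luks-header.img
cryptsetup luksHeaderRestore /dev/md/NAS --header-backup-file nas-luks-header.img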
Thank you in advance for answering my questions.
My System:
NAME="openSUSE Leap"
VERSION="15.6"
ID="opensuse-leap"
ID_LIKE="suse opensuse"
VERSION_ID="15.6"
PRETTY_NAME="openSUSE Leap 15.6"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:opensuse:leap:15.6"