Tumbleweed: RAID1 not mounting

Hello everyone!

I had been using a RAID1 for more than 2 years without any kind of problem on openSUSE. About a year ago I moved to Tumbleweed, which inherited the RAID in its initial configuration without any glitches. Once, a power connector on one of the disks got accidentally disconnected, and after fixing that everything kept working perfectly. I could describe myself as a proud and satisfied md user.

This week, after a Tumbleweed update (openSUSE-release-20170626-1.2.x86_64), the array did not mount anymore. In fact it had disappeared from YaST. Using YaST I redefined it (without formatting the partitions!) and manually ran:

mdadm --assemble /dev/md0 

which again worked well. However, any time I try to mount it, I get the same error:

# mount /dev/md0 /mnt/INOUT
mount: wrong fs type, bad option, bad superblock on /dev/md0,
       missing codepage or helper program, or other error

       In some cases useful info is found in syslog - try
       dmesg | tail or so.

These are the results of some diagnosing commands:

# cat mdadm.conf
DEVICE containers partitions
ARRAY /dev/md0 UUID=5f8c1470:011b45c5:a2b2f6f9:d50cf7b9

# mdadm --detail /dev/md0
/dev/md0:
           Version : 1.0
     Creation Time : Mon Jun 26 23:50:15 2017
        Raid Level : raid1
        Array Size : 1953513280 (1863.02 GiB 2000.40 GB)
     Used Dev Size : 1953513280 (1863.02 GiB 2000.40 GB)
      Raid Devices : 2
     Total Devices : 2
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Tue Jun 27 07:28:22 2017
             State : clean 
    Active Devices : 2
   Working Devices : 2
    Failed Devices : 0
     Spare Devices : 0

Consistency Policy : unknown

              Name : any:0
              UUID : 5f8c1470:011b45c5:a2b2f6f9:d50cf7b9
            Events : 4131

    Number   Major   Minor   RaidDevice State
       0       8       17        0      active sync   /dev/sdb1
       1       8       33        1      active sync   /dev/sdc1


# mdadm -E /dev/sdb1
/dev/sdb1:
          Magic : a92b4efc
        Version : 1.0
    Feature Map : 0x1
     Array UUID : 5f8c1470:011b45c5:a2b2f6f9:d50cf7b9
           Name : any:0
  Creation Time : Mon Jun 26 23:50:15 2017
     Raid Level : raid1
   Raid Devices : 2

 Avail Dev Size : 3907026912 (1863.02 GiB 2000.40 GB)
     Array Size : 1953513280 (1863.02 GiB 2000.40 GB)
  Used Dev Size : 3907026560 (1863.02 GiB 2000.40 GB)
   Super Offset : 3907026928 sectors
   Unused Space : before=0 sectors, after=352 sectors
          State : clean
    Device UUID : f369afb2:dae809b7:8b3d710d:9b3cc377

Internal Bitmap : -16 sectors from superblock
    Update Time : Tue Jun 27 07:28:22 2017
  Bad Block Log : 512 entries available at offset -8 sectors
       Checksum : 28e8ee90 - correct
         Events : 4131


   Device Role : Active device 0
   Array State : AA ('A' == active, '.' == missing, 'R' == replacing)

# mdadm -E /dev/sdc1
/dev/sdc1:
          Magic : a92b4efc
        Version : 1.0
    Feature Map : 0x1
     Array UUID : 5f8c1470:011b45c5:a2b2f6f9:d50cf7b9
           Name : any:0
  Creation Time : Mon Jun 26 23:50:15 2017
     Raid Level : raid1
   Raid Devices : 2

 Avail Dev Size : 3907026912 (1863.02 GiB 2000.40 GB)
     Array Size : 1953513280 (1863.02 GiB 2000.40 GB)
  Used Dev Size : 3907026560 (1863.02 GiB 2000.40 GB)
   Super Offset : 3907026928 sectors
   Unused Space : before=0 sectors, after=352 sectors
          State : clean
    Device UUID : baf97132:a11d8af0:4257a807:97048efb

Internal Bitmap : -16 sectors from superblock
    Update Time : Tue Jun 27 07:28:22 2017
  Bad Block Log : 512 entries available at offset -8 sectors
       Checksum : 602d94d2 - correct
         Events : 4131


   Device Role : Active device 1
   Array State : AA ('A' == active, '.' == missing, 'R' == replacing)


# cat /proc/mdstat 
Personalities : [raid1] 
md0 : active raid1 sdb1[0] sdc1[1]
      1953513280 blocks super 1.0 [2/2] [UU]
      bitmap: 0/15 pages [0KB], 65536KB chunk

unused devices: <none>

Unfortunately I cannot remember how I initially formatted the partitions, but I am completely sure it cannot be anything other than btrfs or ext4.

And I am quite sure (but not fully) that there is no LVM on top of the RAID. (What should I do to test for that? Perhaps something like the checks below?)
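These are the read-only checks I suppose would tell whether LVM is there (I am not sure they are the right ones; the device name is the one from my outputs):

# Non-destructive checks for an LVM layer on top of the array:
pvs /dev/md0        # lists the device if it is an LVM physical volume
blkid -p /dev/md0   # low-level probe for any known signature (LVM2_member, ext4, btrfs, ...)
lsblk /dev/md0      # would show logical volumes as children of md0 if LVM were active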

I would like to (preferably) mount the RAID again, or alternatively mount one of the partitions as a loop device to extract the data before reformatting. I’ve tried this, but it seems I need a parameter (Data Offset) from ‘mdadm -E /dev/sd**1’ which does not appear in my output at all.
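If I understand it correctly, the loop approach would be roughly the following (just a sketch, read-only, so it should at least be safe to try):

mdadm -E /dev/sdb1 | grep -i offset   # look for the "Data Offset" line the guides mention
losetup -f --show -r /dev/sdb1        # attach one member read-only; prints e.g. /dev/loop0
mount -o ro /dev/loop0 /mnt/INOUT     # then try mounting the loop device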

Any help will be very appreciated.

Pablo G**

Can’t say I’d know how to troubleshoot, but in general software RAID arrays have been known to go bad.

I think you are already on the right path in considering breaking the array.
Can your system boot with only one of the members?
If it can, you can then perhaps proceed to rebuild your array.

If you want to mount on a loop device and need a refresher or a jump-start, I wrote the following after rejecting numerous obsolete articles and finding that the man pages are not as clear and to the point as they should be:

https://en.opensuse.org/User:Tsu2/loop_devices

Myself, I prefer to spend the money on hardware RAID, even cheap cards… one to use and a spare for if/when the one in use goes bad.

TSU

Thanks Tsu for your answer and info!

Meanwhile, I am still stuck in the same place. Once I get an answer to my questions, your loopback info will become very interesting.

Regards,
Pablo G

That does not mean you did not destroy the filesystem information on this array.

Unfortunately I cannot remember how I initially formatted the partitions

And how do you expect us to know, if autodetection of the filesystem type by mount fails? At this point you can only try data rescue tools. I have never used them, so hopefully somebody else can chime in here.

IMO data rescue should be a last resort.

The whole point of RAID1 is to mirror data on two disks, and as I described, you can boot with only one disk if the data is corrupted on the other, so IMO that should be the first try. Of course, if the data was corrupted before being written to disk, then it would be corrupted on both disks… RAID1 only protects against certain types of failures.

TSU

This part of your answer is important. Do you mean that reattaching the RAID can destroy the data?

In fact I was asking for help on a) mounting or b) dd’ing.

Thanks for your help.

You already tried to mount and it failed. mount without an explicit type tries to auto-detect it. Since that failed, other tools that do more exhaustive detection are needed to try to guess it. If you at least knew what filesystem type it was, you could try some mount options for it.
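For example, read-only probes along these lines might reveal whether anything recognizable is still on the array (just suggestions; the device name is the one from your posts):

blkid -p /dev/md0   # low-level probe for known filesystem signatures
wipefs /dev/md0     # without options it only lists signatures, it wipes nothing
file -s /dev/md0    # reads the raw data and guesses the content type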

or b) dd’ing.

Not sure I understand this one.

Thanks again for your answer!

When you say ‘reboot’… do you mean rebooting the PC? If so, why?

What I already did (no success) is:

# cat /proc/mdstat 
Personalities : [raid1] 
md0 : active raid1 sdb1[0] sdc1[1]
      1953513280 blocks super 1.0 [2/2] [UU]
      bitmap: 0/15 pages [0KB], 65536KB chunk

unused devices: <none>


# mdadm /dev/md0 --manage --fail  /dev/sdb1
mdadm: set /dev/sdb1 faulty in /dev/md0


# mdadm /dev/md0 --manage --remove /dev/sdb1
mdadm: hot removed /dev/sdb1 from /dev/md0

# cat /proc/mdstat 
Personalities : [raid1] 
md0 : active raid1 sdc1[1]
      1953513280 blocks super 1.0 [2/1] [_U]
      bitmap: 0/15 pages [0KB], 65536KB chunk

unused devices: <none>

# mount /dev/md0 /mnt/
mount: wrong fs type, bad option, bad superblock on /dev/md0,
       missing codepage or helper program, or other error

       In some cases useful info is found in syslog - try
       dmesg | tail or so.

# mount -t ext4 /dev/md0 /mnt/
mount: wrong fs type, bad option, bad superblock on /dev/md0,
       missing codepage or helper program, or other error

       In some cases useful info is found in syslog - try
       dmesg | tail or so.

# mount -t btrfs /dev/md0 /mnt
mount: wrong fs type, bad option, bad superblock on /dev/md0,
       missing codepage or helper program, or other error

       In some cases useful info is found in syslog - try
       dmesg | tail or so.

# mdadm -a /dev/md0  /dev/sdb1
mdadm: re-added /dev/sdb1
# mdadm /dev/md0 --manage --fail /dev/sdc1 --remove /dev/sdc1
mdadm: set /dev/sdc1 faulty in /dev/md0
mdadm: hot removed /dev/sdc1 from /dev/md0

# cat /proc/mdstat  
Personalities : [raid1] 
md0 : active raid1 sdb1[0]
      1953513280 blocks super 1.0 [2/1] [U_]
      bitmap: 0/15 pages [0KB], 65536KB chunk

unused devices: <none>

# mount -t ext4 /dev/md0 /mnt
mount: wrong fs type, bad option, bad superblock on /dev/md0,
       missing codepage or helper program, or other error

       In some cases useful info is found in syslog - try
       dmesg | tail or so.

# mount -t btrfs /dev/md0 /mnt
mount: wrong fs type, bad option, bad superblock on /dev/md0,
       missing codepage or helper program, or other error

       In some cases useful info is found in syslog - try
       dmesg | tail or so.


I’ve run a mirrored RAID for some time as well. One problem: the 42.2 install didn’t mount it to /home as I told it to.

It sounds to me like the system doesn’t know that the RAID array is there, but mdadm does, and it’s perfectly OK. What I would do is add a mount for it to fstab (something like the entry below), reboot, and then see what comes up in the kernel boot log.
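Roughly this kind of entry is what I mean (the UUID, mount point and filesystem type are placeholders to adjust to yours; nofail just keeps a missing array from blocking the boot):

# /etc/fstab - example entry for an md array, mounted by filesystem UUID
UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  /home  ext4  defaults,nofail  0  2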

I did wonder if btrfs might cause problems. It seems not, judging from a quick web search indicating that either btrfs’s own RAID or md RAID can be used, but the page I found isn’t too clear on the subject; it was benchmarking the two arrangements.

John

About dd’ing:

I have read somewhere that it is possible to do something like dd if=/dev/md0 … to extract the data from the disk, but it seems that for this action you need a parameter (Data Offset) that my
mdadm --detail /dev/md0

doesn’t provide.
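What I had in mind is roughly the sketch below (the target path is only an example):

# Copy of one member to an image file, so that any rescue attempts can be
# made on the copy instead of the original disk:
dd if=/dev/sdb1 of=/backup/sdb1.img bs=4M conv=noerror,sync status=progress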

If I add an entry to fstab, the system boot is interrupted. My plan was to check everything “by hand” first and only then automate it at boot.

It was definitely btrfs that I had on top of the mdadm RAID1.

Thanks!

/dev/md0 is the whole array. You do not need any offset to copy data from it. Nor is it clear how it is going to help you: if you cannot access the filesystem on this array now, why do you think that copying it somewhere will suddenly make it available?

In this case, finally post the output of dmesg, as the error message tells you to.

Sorry, I meant to say “dd if=/dev/sdb1…”

In this case, finally post the output of dmesg, as the error message tells you to.

Unfortunately nothing happens in dmesg when I try to

mount -t btrfs /dev/md0 /mnt

You have an array with metadata version 1.0. In this case the data starts at the beginning of the device; no offset is needed. The size of the data is slightly less than the size of the underlying device and is shown in the output you provided.
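From the numbers in your own mdadm -E output, the layout of each member works out roughly like this:

# Layout of /dev/sdb1 according to its 1.0 superblock (figures from the -E output above):
#   data        : sectors 0 .. 3907026559   (Used Dev Size = 3907026560 sectors)
#   superblock  : sector 3907026928         (Super Offset, i.e. near the end of the member)
# So whatever filesystem was there should begin at byte 0 of the member.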

Unfortunately nothing happens in dmesg when I try to

mount -t btrfs /dev/md0 /mnt

Then it is not detected as btrfs by the kernel. There is a slight chance that you had metadata 1.1 or 1.2 before, in which case the actual data starts at an offset. Please provide the first 128K from the underlying device (you can do something like “hexdump -C -n 131072 /dev/sdX1” and upload the result to http://susepaste.org).
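For reference, within that first 128 KiB a healthy member would show one of these magic values (the offsets come from the ext4 and btrfs on-disk formats, not from anything specific to your array):

#   ext4 : bytes "53 ef" at offset 0x438         (superblock starts at 1 KiB)
#   btrfs: string "_BHRfS_M" at offset 0x10040   (superblock starts at 64 KiB)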

This is the unpromising result I get:

hexdump -C -n 131072 /dev/sdb1
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00020000

I did not upload it anywhere else, as it is quite self-explanatory: a good reason for not detecting anything. I suppose there is nothing to be done here.

Thanks again for your help!

Hi
I just got the same issue with Tumbleweed. I installed Tumbleweed after adding the online repositories before the install (meaning it got all the packages from the internet).
I have a RAID5 mounted as /home, so my system is on a separate disk. I installed everything on a new HDD.
The system boots, but my RAID is not detected.
So:
mdadm -A --scan … it is found. Great.
There was no mdadm.conf in /etc, so I created one (I found some instructions online). I still have my old installation, so I can see that the mdadm.conf is more or less the same as the one on my openSUSE 12.3.
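In case it helps anyone reading along, I believe the usual way to generate it from the currently visible arrays is something like:

mdadm --detail --scan >> /etc/mdadm.conf   # append ARRAY lines for the arrays mdadm can see now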
I can see the RAID in the partitioner and I mounted it on /data. Great.
All my folders are there, so great again.

Then I rebooted… it is still not there, and the boot stops in the middle, in a kind of recovery mode, because /dev/md0 cannot be found.

I can still assemble it manually using the device description in mdadm.conf: mdadm -A /dev/md0
My array is considered clean and has a persistent superblock (mdadm -D /dev/md0).

So I am also stuck. I have looked at a lot of mdadm forums, but my config seems to be right, since I can assemble it manually and mount it. So my guess is that there is an issue with GRUB2 or the kernel autodetection.
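If the real problem is that the initrd does not know about the array, I understand that regenerating it after fixing /etc/mdadm.conf might help (just a guess on my part):

dracut -f      # rebuild the initrd so it picks up the current /etc/mdadm.conf (mkinitrd also works)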

I don’t want to try what you did with YaST, re-creating an array, even without formatting.

I will try to install the Leap version and see if it works better.

Do not assume the auto-assembled array will always be called /dev/md0. Try mounting by UUID in /etc/fstab and see if it makes any difference.
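Note that the UUID in mdadm.conf is the array UUID, not the filesystem UUID that an fstab entry uses; something like this shows the one fstab needs (a sketch, assuming the array is assembled):

blkid /dev/md0    # the UUID= printed here is the filesystem UUID to use in /etc/fstab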

Thanks for answering my post…

When I did the mount with the partitioner, it generated the mount line in fstab using the UUID. Also, the UUID is present in the mdadm.conf file. I guess mdadm creates both md0 and the UUID entry in the /dev directory. Neither of them worked; there is no md entry in /dev in any form… :frowning:

To conclude:
I have installed Leap 42.3 and the RAID was recognized directly. I just had to mount it in the partitioner.

So Tumbleweed is really unstable. I had KDE graphics issues, RAID issues… Maybe due to a lack of testing. I will file a bug.

I will stay with Leap. It’s more stable…

Thanks for your help.