root RAID device missing from /dev after booting

I’m trying to convert a running SuSE 11.3 machine to boot from RAID1.
Seems to work – I copied / and /boot to (degraded) md arrays, and have successfully booted from the new disk. But the md device for / on the running system does not exist in /dev! Among other things, this has prevented me from adding the original root partition to the array.

Perhaps devtmpfs was mounted on /dev after the md device node was created there, thus overmounting the existing device node? I cannot otherwise imagine how my system can claim to have /dev/md8 mounted as / when /dev/md8 itself does not (seem to) exist. But I don’t know whether I need a new initrd, should tweak devtmpfs and/or udev, or should do something else altogether.

Details:

I’ve been running SuSE 11.3 on a 64-bit machine with 1 disk.
I created the /boot and / arrays from partitions on disk2 with:
mdadm -Cv /dev/md3 -e0.9 -l1 -n2 missing /dev/sdb3
mdadm -Cv /dev/md8 -e0.9 -l1 -n2 missing /dev/sdb6
(the numbering is for historical reasons), then added the configuration details to /etc/mdadm.conf using mdadm -Ds. Formatted as ext3, then copied /boot and / into /dev/md3 and /dev/md8, respectively. Set up GRUB and /etc/fstab on disk2, then booted successfully into the new system. Completed the /boot array (/dev/md3) with:
mdadm /dev/md3 -a /dev/sda3
but couldn’t run the similar command for /dev/md8 because it doesn’t exist! That’s right, df shows:
/dev/md8 766736 345272 382516 48% /
/dev/md3 256586 59520 183818 25% /boot
but “ls /dev/md8” returns “no such file or directory”, and /proc/mdstat contains no information about this array.
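(For completeness, the formatting/copying/fstab steps mentioned above went roughly like this; the mount points and exact copy commands here are illustrative rather than a verbatim transcript:

	mdadm -Ds >> /etc/mdadm.conf        # record the ARRAY lines for the new arrays
	mkfs.ext3 /dev/md3
	mkfs.ext3 /dev/md8
	mkdir -p /mnt/newroot
	mount /dev/md8 /mnt/newroot
	rsync -ax / /mnt/newroot/           # -x: stay on one filesystem, so /proc, /sys, etc. are not copied
	mkdir -p /mnt/newroot/proc /mnt/newroot/sys /mnt/newroot/dev /mnt/newroot/boot
	mount /dev/md3 /mnt/newroot/boot
	rsync -a /boot/ /mnt/newroot/boot/
	vi /mnt/newroot/etc/fstab           # point / at /dev/md8 and /boot at /dev/md3

followed by setting up GRUB on disk2.)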

I’ve tried this both with the stock initrd (2.6.34-12-desktop), and with one I made using mkinitrd after adding “raid1” to INITRD_MODULES in /etc/sysconfig/kernel; no difference. I have not tried making my own initramfs. But I have trouble believing that this is where the problem lies, since the RAID array is assembled and mounted correctly – otherwise I couldn’t boot. Rather, I suspect this is a problem with devtmpfs. Or perhaps the init script, or even something silly like not using /dev/md0…
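For reference, the INITRD_MODULES change was simply along these lines (the rest of the module list is abbreviated here):

	# /etc/sysconfig/kernel (excerpt; "..." stands for the existing module list)
	INITRD_MODULES="... raid1"

	mkinitrd    # rebuild the initrd for the running kernel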

Thanks in advance for any suggestions.

Welcome to our forums, new poster.

It is very good that you try to show your case by copy/pasting the relevant computer commands and their output here. But please do so between CODE tags (Posting in Code Tags - A Guide) to make them readable and clearly different from the rest of your story.

E.g.

henk@boven:~> ls /dev/md8
ls: kan geen toegang krijgen tot /dev/md8: Bestand of map bestaat niet
henk@boven:~> 

or, when you are afraid we cannot understand your local language

henk@boven:~> LANG=C ls /dev/md8
ls: cannot access /dev/md8: No such file or directory
henk@boven:~>

is much better than what you do.

Ouch; sorry. Code used to create arrays was:


	mdadm -Cv /dev/md3 -e0.9 -l1 -n2 missing /dev/sdb3
	mdadm -Cv /dev/md8 -e0.9 -l1 -n2 missing /dev/sdb6

After reboot, df shows:


	/dev/md8 766736 345272 382516 48% /
	/dev/md3 256586 59520 183818 25% /boot

and completing the /boot array with


	mdadm /dev/md3 -a /dev/sda3

works fine, but the similar command for / fails, since /dev/md8 does not appear to exist, and in particular /proc/mdstat contains no information about it.

Further details in original post; sorry again for the newbie error.

Cilantro

I won’t ask you to post again what you have now posted twice above, but when I do a df it looks like this:

henk@boven:~> df
Bestandssysteem     1K-blokken  Gebruikt Beschikbr Geb% Aangekoppeld op
rootfs                20641788   4539116  15054032  24% /
devtmpfs                497896       352    497544   1% /dev
tmpfs                   503484      1464    502020   1% /dev/shm
/dev/sda2             20641788   4539116  15054032  24% /
/dev/sda3             95880040  50338824  44469708  54% /home
/dev/sda5             20641788   5706324  13886824  30% /mnt/oldsys
/dev/sda6            103210940  50503240  47464892  52% /mnt/oldsys/home
henk@boven:~> 
 

You see the difference? All layout is preserved and we can now see the columns (complete with their headings, which are missing in your listing).

I am also afraid that the “code used to create arrays was:” is from your memory and not copy/pasted from a live session.

I note that until now no MD experts have joined us, so I have some dumb remarks (after glancing through man mdadm): did you use the -Q, -D, or -E options to gather information about how mdadm thinks the situation is?
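That would be something like this (device names taken from your earlier posts):

	mdadm -Q /dev/md8        # quick query: is this an md device or a component of one?
	mdadm -D /dev/md8        # detailed information about an assembled array
	mdadm -E /dev/sdb6       # examine the RAID superblock on the member partition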

I am also afraid that the “code used to create arrays was:” is from your memory and not copy/pasted from a live session.

Not true. This is from my cut-and-paste log file kept (on another machine) while I worked (and later used to repeat the process).

I note that until now no MD experts have joined us, so I have some dumb remarks (after glancing through man mdadm): did you use the -Q, -D, or -E options to gather information about how mdadm thinks the situation is?

Yes. /dev/md8 does not exist (after reboot), so no mdadm command yields any information about it. Prior to reboot, that is, after initial assembly, the -Ds option yields


ARRAY /dev/md8 metadata=0.90 UUID=315bd190:eee58a54:2424eef9:7d0c6eeb

which I inserted into /etc/mdadm.conf. The result of -E on the underlying partition is below:


%  mdadm -E /dev/sdb6
/dev/sdb6:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 315bd190:eee58a54:2424eef9:7d0c6eeb (local to host oction)
  Creation Time : Mon Aug  8 16:21:18 2011
     Raid Level : raid1
  Used Dev Size : 779008 (760.88 MiB 797.70 MB)
     Array Size : 779008 (760.88 MiB 797.70 MB)
   Raid Devices : 2
  Total Devices : 1
Preferred Minor : 8

    Update Time : Mon Aug  8 16:32:09 2011
          State : clean
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 72ace2f - correct
         Events : 34


      Number   Major   Minor   RaidDevice State
this     1       8       22        1      active sync   /dev/sdb6

   0     0       0        0        0      removed
   1     1       8       22        1      active sync   /dev/sdb6

Yes, I’d like to hear from some MD experts, but I still suspect that the problem is with devtmpfs, since the array is being assembled correctly.

Thanks,
Cilantro

Another wild jump from me, just to have more info available for the gurus who are yet to come (once we find the correct invocation :wink: ).

Can you give us the output of

fdisk -l

It will show us whether the correct partition type is set on those partitions (it must be fd, “Linux raid autodetect”, IIRC).


#  fdisk -l /dev/sdb

Disk /dev/sdb: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1        1912    15358108+  83  Linux
/dev/sdb2            1913       34112   258646500   83  Linux
/dev/sdb3   *       34113       34145      265072+  fd  Linux raid autodetect
/dev/sdb4           34146       60801   214114320    5  Extended
/dev/sdb5           34146       34784     5132736   83  Linux
/dev/sdb6           34785       34881      779121   fd  Linux raid autodetect
/dev/sdb7           34882       36156    10241406   83  Linux
/dev/sdb8           36157       36795     5132736   83  Linux
/dev/sdb9           36796       36892      779121   83  Linux
/dev/sdb10          36893       37531     5132736   fd  Linux raid autodetect
/dev/sdb11          37532       38806    10241406   83  Linux
/dev/sdb12          38807       59756   168280843+  fd  Linux raid autodetect
/dev/sdb13          59757       60801     8393931   82  Linux swap / Solaris

/dev/sdb10 and /dev/sdb12 are part of RAID10 arrays, which are working properly. /dev/sdb2 is used as temporary workspace, and the remaining partitions are currently unformatted and unused, except for swap.

(Does partition type really matter any more? I thought initrd used mdadm these days. No matter; I set the partition types to fd anyway.)
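For the record, setting the type was just the usual fdisk dance, shown here for /dev/sdb6 with the prompts paraphrased from memory:

# fdisk /dev/sdb
Command (m for help): t
Partition number (1-13): 6
Hex code (type L to list codes): fd
Command (m for help): w

and likewise for the other RAID partitions.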

Cilantro

We had a lot of discussion about which programs/tools look for those types and draw conclusions from them. In any case, the general opinion was to set it correctly, as you did, if only for your own documentation.

Strange that only three of the four are detected.

No, all four are surely being detected – else I couldn’t boot. What I imagine must be happening is that the initrd correctly detects /dev/md8 and mounts it as /, then devtmpfs starts up (a second time?) and creates (recreates?) /dev, somehow overmounting the /dev in which md8 lives. I don’t see any other way for /dev/md8 to be mounted as / without /dev/md8 itself being accessible.

Of course, that doesn’t mean something else entirely isn’t happening; I truly don’t understand devtmpfs. But root is clearly on the successfully assembled and mounted array, even though I can’t access the device after booting.

Cilantro

PS (note added): /proc is presumably also being (re)mounted after booting, since /proc/mdstat also contains no information about the RAID device that contains root.

Hmm. I tried making /dev/md8 by hand:

mknod /dev/md8 b 9 8

Then I tried running

mdadm -A /dev/md8

which not only failed silently, as before, but which removed /dev/md8 in the process. This suggests that mdadm will, under certain circumstances, remove the block file. So perhaps I’m having an mdadm problem after all, and mdadm is for some reason removing /dev/md8. Perhaps when trying to reassemble an already-running array? Still not sure how this can happen while the array is running, but perhaps it’s not a devtmpfs overmount problem after all.
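For reference, the more explicit assemble forms (straight from man mdadm) would be:

	mdadm -A /dev/md8 /dev/sdb6      # name the member device explicitly
	mdadm -A --scan                  # assemble everything listed in /etc/mdadm.conf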

Any MD experts out there? Perhaps I need to run some other init script. There’s not much documentation for doing this by hand; I’m not the first person who didn’t immediately realize that boot.md had to be enabled…

Thanks,
Cilantro

You forget there is the udev daemon running in the background. Everything in /dev is nowadays generated dynamically. Maybe an mknod will succeed at first, but I do not know what happens when udevd notices it. It is a bit more complicated than it looks, I am afraid.

cilantro05 wrote:
> /dev/sdb3 * 34113 34145 265072+ fd Linux raid autodetect
>
> (Does partition type really matter any more? I thought initrd used
> mdadm these days. No matter; I set the partition types to fd anyway.)

I use md RAID but I’ve stayed out of this thread because I’m way too
cowardly to try running / or /boot on a RAID. However I can comment on
this question to point out that

<https://raid.wiki.kernel.org/index.php/RAID_superblock_formats>

says: “Current Linux kernels (as of 2.6.28) can only autodetect (based
on partition type being set to FD) arrays with superblock version 0.90.”

ISTR you are [sensibly IMHO] using 1.0 superblocks.

I don’t know what the implications are.

You are correct, of course, but I thought it was worth a try. I also tried moving /dev out of the way entirely (with “mount --move /dev /mnt/tmp”) to see whether there was anything mounted underneath; no, there is not.

I think that’s probably the best advice yet. If I can’t get this sorted out quickly, I’ll go back to RAID10 on my data partitions, and leave / and /boot alone.

Pretty sure I’m using 0.9 superblocks for / and /boot, but not sure it matters anymore since I believe initrd uses mdadm rather than autodetect these days.

I am still mystified as to how I can have / mounted on /dev/md8 without /dev/md8 existing… I will try booting into single-user mode and/or playing with the boot scripts in /etc/init.d, but if none of that works I’m ready to abandon the attempt.

Thanks to you both for your comments.
Cilantro

SUCCESS!!

Turns out it was an initrd problem after all. I was on the right track trying to rebuild the initrd with RAID modules, but mdadm itself was also missing from it. I’m sure it’s possible to fix this manually, perhaps with mkinitramfs, but there’s an easier way: just use the -f option to mkinitrd.

mkinitrd -f md

This appears to insert both the appropriate modules and a copy of mdadm.
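A quick way to check, assuming the usual gzip-compressed cpio image and SuSE’s initrd-<kernel version> naming, is to list the initrd’s contents:

	zcat /boot/initrd-2.6.34-12-desktop | cpio -it | grep -E 'mdadm|raid'

which should now show the mdadm binary and the raid modules.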

In the interest of full disclosure, I should add that I ran this command after doing a chroot into the new filesystem, while still booting from the old one. I don’t believe this is necessary, but didn’t check. (In the other direction, there’s some chance that chroot alone would be sufficient, without using -f – mkinitrd is pretty smart.) I was loosely following the guide at
http://wiki.archlinux.org/index.php/Convert_a_single_drive_system_to_RAID, which was a big help. However, I was unable to set up GRUB correctly on disk2 (which would be copied to disk1 when the RAID array was assembled); rebooting failed after assembling /boot. After a brief moment of panic, I booted into another OS (I used a SuSE LiveDVD, but that doesn’t matter) and reran the grub setup command on disk1; this worked fine.
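The grub setup I mean is the standard legacy-GRUB incantation; the disk and partition numbers below are illustrative and depend on which disk and partition actually hold /boot:

grub> root (hd0,2)
grub> setup (hd0)
grub> quit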

The only downside is that I now have all levels of RAID as modules in the initrd, and also automatically loaded into the kernel. I’m sure this can be fine-tuned, but I may not bother. (I also believe that raid1 alone isn’t sufficient; there’s also a dm-mod module that is probably needed.)

I wish that the need for “mkinitrd -f md” were better documented; I don’t know whether it is SuSE-specific. Yes, it’s in the mkinitrd man page, but not, so far as I can tell, in any of the RAID HOWTOs. The other step missing from most HOWTOs is the need to run

chkconfig boot.md on

which, however, I had figured out earlier. Without this, mdadm does not run automatically at boot time, so arrays not needed for booting have to be assembled manually.

So what was going on? I do not believe there was any overmounting at all. Rather, /dev/sdb6 was being mounted as / – and somehow later being reported as /dev/md8, even though that array was most likely never successfully assembled. Further backtracking reveals that my menu.lst had “root=/dev/sdb6” in it, rather than “root=/dev/md8”…
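In other words, the kernel line needed to look something like this (path and remaining parameters are illustrative; only root= is the point):

	# /boot/grub/menu.lst
	kernel /vmlinuz-2.6.34-12-desktop root=/dev/md8 ...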

In any case, I am now booting from RAID1 partitions. Not sure it was worth the trouble, but glad it works.

Thanks to all for their comments.
Cilantro