Kernel update caused boot issues. Need some help.

Boot Issues

I took a kernel update and now have this problem.
I was at 2.6.27.7-9 and am now at 2.6.27.19-3.2.

Checking file systems…
fsck 1.41.1 (01-Sep-2008)
fsck.ext3: Invalid argument while trying to open /dev/md0
/dev/md0:
The superblock could not be read or does not describe a correct ext2
filesystem. If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
e2fsck -b 8193 <device>

fsck failed for at least one filesystem (not /).
Please repair manually and reboot.
The root file system is already mounted read-write.

Attention: Only CONTROL-D will reboot the system in this
maintenance mode. shutdown or reboot will not work.
Give root password for login:
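(For reference: the suggested -b 8193 assumes a 1 KiB block size; ext3 filesystems with 4 KiB blocks keep their first backup superblock at 32768 instead. A sketch of how to list the backup locations without writing to the device, once the array is assembled (mke2fs with -n only prints what it would do, and dumpe2fs needs a readable primary superblock):

mke2fs -n /dev/md0
dumpe2fs /dev/md0 | grep -i backup

For the printed locations to be right, the mke2fs options have to match how the filesystem was originally created.)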

cat /proc/mdstat
Personalities : [raid1] [raid0] [raid6] [raid5] [raid4] [linear]
md127 : active (auto-read-only) raid1 sda1[0] sdb1[1]
506032 blocks super 1.0 [2/2] [UU]
bitmap: 0/8 pages [0KB], 32KB chunk

md1 : active raid1 sda2[0] sdb2[1]
976245816 blocks super 1.0 [2/2] [UU]
bitmap: 0/466 pages [0KB], 1024KB chunk

unused devices: <none>

mdadm -S /dev/md127
md: md127 stopped.
md: unbind<sda1>
md: export_rdev(sda1)
md: unbind<sdb1>
md: export_rdev(sdb1)
mdadm: stopped /dev/md127

mdadm --assemble --force --verbose /dev/md0 /dev/sda1 /dev/sdb1

mdadm: looking for devices for /dev/md0
md: md0 stopped.
mdadm: /dev/sda1 is identified as a member of /dev/md/0, slot 0.
mdadm: /dev/sdb1 is identified as a member of /dev/md/0, slot 1.
md: bind<sdb1>
mdadm: added /dev/sdb1 to /dev/md/0 as 1
md: bind<sda1>
mdadm: added /dev/sda1 to /dev/md/0 as 0
raid1: raid set md0 active with 2 out of 2 mirrors
md0: bitmap initialized from disk: read 1/1 pages, set 0 bits
created bitmap (8 pages) for device md0
mdadm: /dev/md/0 has been started with 2 drives.

cat /proc/mdstat
Personalities : [raid1] [raid0] [raid6] [raid5] [raid4] [linear]
md0 : active (auto-read-only) raid1 sda1[0] sdb1[1]
506032 blocks super 1.0 [2/2] [UU]
bitmap: 0/8 pages [0KB], 32KB chunk

md1 : active raid1 sda2[0] sdb2[1]
976245816 blocks super 1.0 [2/2] [UU]
bitmap: 0/466 pages [0KB], 1024KB chunk

unused devices: <none>

e2fsck -b 8193 /dev/md0

Free blocks count wrong for group #0 (7661, counted=1514).
Fix<y>?y
There are several of these.

Boot: ***** FILE SYSTEM WAS MODIFIED *****
Boot: 72/126976 files (16.7% non-contiguous), 85475/506032 blocks

(repair filesystem) #

cat /proc/mdstat
Personalities : [raid1] [raid0] [raid6] [raid5] [raid4] [linear]
md0 : active raid1 sda1[0] sdb1[1]
506032 blocks super 1.0 [2/2] [UU]
bitmap: 0/8 pages [0KB], 32KB chunk

md1 : active raid1 sda2[0] sdb2[1]
976245816 blocks super 1.0 [2/2] [UU]
bitmap: 0/466 pages [0KB], 1024KB chunk

unused devices: <none>

CONTROL-D

When I reboot the same thing happens all over again.

If I issue an init 5, the system boots all the way up and I am able to run.

Any ideas?

Thanks

You may need to fix /etc/mdadm.conf since md0 was not correctly assembled at boot. Also make sure that INITRD_MODULES in /etc/sysconfig/kernel contains all the modules needed to support RAID.
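If the ARRAY lines are stale, a sketch for regenerating them (back up the old file first, and review the result, since this simply appends):

cp /etc/mdadm.conf /etc/mdadm.conf.bak
mdadm --examine --scan >> /etc/mdadm.conf

The initrd carries its own copy of the array configuration, so after editing the file you would most likely also need to rebuild it with mkinitrd.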

Current files

(none):/etc # more mdadm.conf
DEVICE partitions
ARRAY /dev/md0 level=raid1 UUID=70399c23:e69071b3:37c490a5:5a2b749b
ARRAY /dev/md1 level=raid1 UUID=7771a78a:55eb8f6c:42571ebc:40fd1b56

(none):/etc/sysconfig # more kernel

## Path:        System/Kernel
## Description:
## Type:        string
## Command:     /sbin/mkinitrd
#
# This variable contains the list of modules to be added to the initial
# ramdisk by calling the script "mkinitrd"
# (like drivers for scsi-controllers, for lvm or reiserfs)
#
INITRD_MODULES="sata_nv sata_via pata_amd pata_jmicron ahci pata_amd processor thermal ata_generic amd74xx ide_pci_generic fan jbd ext3 raid1 dm_mod edd"

## Type:        string
## Command:     /sbin/mkinitrd
#
# This variable contains the list of modules to be added to the initial
# ramdisk that is created for unprivileged Xen domains (domU); you may need
# drivers for virtual block and network devices in addition to filesystem
# and device-mapper modules.
#
DOMU_INITRD_MODULES="xennet xenblk"

## Type:        string
## ServiceRestart: boot.loadmodules
#
# This variable contains the list of modules to be loaded
# once the main filesystem is active.
# You will find a few default modules for hardware which
# can not be detected automatically.
#
MODULES_LOADED_ON_BOOT=""

## Type:        string
## Default:     ""
#
# The file name of a binary ACPI Differentiated System Description Table
# (DSDT). This table is appended to the initial ram disk (initrd) that
# the mkinitrd script creates. If the kernel finds that its initrd
# contains a DSDT, this table replaces the DSDT of the BIOS. If the file
# specified in ACPI_DSDT is not found or ACPI_DSDT is empty/not specified,
# no DSDT will be appended to the initrd.
# Example path: /etc/acpi/DSDT.aml
#
# You can also override Secondary System Description Tables (SSDTs).
# Add DSDT and SSDT files separated by spaces, e.g. "DSDT.aml SSDT1.aml".
# The files must be named DSDT.aml and/or SSDT[1-9]*.aml.
# For compatibility reasons, if only one file is added it is assumed it is
# the DSDT and will be used as such; in the future the above naming scheme
# will be enforced.
#
# Be aware that overriding these tables can harm your system.
# Only do this if you know what you are doing, and file a bug on
# bugzilla.kernel.org so that the root cause of the issue will get fixed.
#
ACPI_DSDT=""

## Type:        string(yes)
## Default:     ""
#
# Skip doing a minimal preparation of the /usr/src/linux source tree so
# that most header files can be directly included. If set, /usr/src/linux
# will not be touched.
#
SKIP_RUNNING_KERNEL=""

Still having a problem.
(none):/etc # mdadm -E /dev/sda1
/dev/sda1:
Magic : a92b4efc
Version : 1.0
Feature Map : 0x1
Array UUID : 70399c23:e69071b3:37c490a5:5a2b749b
Name : linux:0
Creation Time : Fri Jan 9 22:53:21 2009
Raid Level : raid1
Raid Devices : 2

Avail Dev Size : 1012064 (494.25 MiB 518.18 MB)
Array Size : 1012064 (494.25 MiB 518.18 MB)
Super Offset : 1012072 sectors
State : clean
Device UUID : ea90d068:b28a9fd2:5bf965a1:d534103d

Internal Bitmap : 2 sectors from superblock
Update Time : Sat Mar 7 23:30:59 2009
Checksum : e40db29f - correct
Events : 164

Array Slot : 0 (0, 1)

Array State : Uu

This looks good to me.

Try recreating the initrd with mkinitrd. Also make sure that the symlinks in /boot that grub points to are the right ones.
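Roughly, assuming the default openSUSE layout:

mkinitrd
ls -l /boot/initrd /boot/vmlinuz
grep -E 'kernel|initrd' /boot/grub/menu.lst

The first command rebuilds the initrds for the installed kernels; the other two show whether the /boot symlinks and the grub entries point at the 2.6.27.19-3.2 images.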

Thanks for helping me with this issue.

I tried mkinitrd.

I still get the same issue.
I've since reformatted boot, root, and swap and loaded the stock OS 11.1. The system boots fine with /boot in a RAID1 configuration. I then updated the kernel to 2.6.27.19-3.2. After rebooting, the boot log shows md0 failing. Do you know if the new kernel supports /boot in a RAID1 config? Or is it that mdadmd is not starting up first, so md0 fails?

I'm typing this from an x86_64 computer updated to 2.6.27.19-3.2 with / on ext3 RAID1 and /home on XFS RAID1. I have also run some Factory kernels in between with no issues.
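If the kernel couldn't handle it, this setup would fail too, so the more likely suspect is the initrd. One way to check whether the raid1 module and the mdadm pieces actually made it into the new initrd (a gzipped cpio archive on openSUSE; adjust the file name to whatever is in /boot):

zcat /boot/initrd-2.6.27.19-3.2-default | cpio -it | grep -E 'raid|mdadm'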

I'm experiencing the same problem atm… currently thinking about how to downgrade the kernel again, or reinstall the old system without messing everything up…

After all, it's a RAID system intended for BACKUP… I don't know if I can trust this machine anymore running that unstable kernel/hardware combination :(
(I've had some other problems with the machine: not shutting down (can't stop processes), kacpid going up to 100%, smbd running at 100%, …)

I must admit that my next server won't be an openSUSE one, and I feel really sorry about that fact because I really like YaST.

Please tell me you are not considering RAID a backup solution.

So, I have the same problem.
Fresh openSuSE 11.1 installation.
md0 /boot
md1 LVM ( / (xfs) and swap included and some free space for snapshots)

All works fine.

zypper update
reboot

bam
“Can’t find kernel… etc.”

Now md0 is no longer known. I have to change md0 to md127 in /etc/fstab, and then the machine comes up.

boot.md fails, but apart from that there are no problems, and I can't see why. /var/log/boot.msg has no detailed information for me.
Even after I changed md0 to md127 in /etc/mdadm.conf and rebooted, boot.md still fails.

I think it's a bug introduced with the kernel/module update. Any ideas on how to fix it?
Any ideas why boot.md fails?
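One way to see what boot.md is actually attempting might be to run the same assembly by hand with verbose output; -v shows which ARRAY lines and devices mdadm matches:

mdadm --assemble --scan --config=/etc/mdadm.conf -v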

# mdadm --examine --scan --config=partitions
ARRAY /dev/md/0 level=raid1 metadata=1.0 num-devices=2 UUID=fca0b060:859af8ad:ccbf558a:e252f681 name=linux:0
ARRAY /dev/md/1 level=raid1 metadata=1.0 num-devices=2 UUID=983a0d01:71ff9aad:ed5d7372:97d0590b name=linux:1
# cat /proc/mdstat 
Personalities : [raid1] [raid0] [raid6] [raid5] [raid4] [linear] 
md127 : active raid1 sda1[0] sdb1[1]
      128508 blocks super 1.0 [2/2] [UU]
      bitmap: 0/8 pages [0KB], 8KB chunk

md1 : active raid1 sda2[0] sdb2[1]
      312431984 blocks super 1.0 [2/2] [UU]
      bitmap: 2/298 pages [8KB], 512KB chunk

unused devices: <none>
# cat /etc/fstab 
/dev/vg01/swaplv     swap                 swap       defaults              0 0
/dev/vg01/rootlv     /                    xfs        defaults              1 1
/dev/md127           /boot                ext3       acl,user_xattr        1 2
proc                 /proc                proc       defaults              0 0
sysfs                /sys                 sysfs      noauto                0 0
debugfs              /sys/kernel/debug    debugfs    noauto                0 0
usbfs                /proc/bus/usb        usbfs      noauto                0 0
devpts               /dev/pts             devpts     mode=0620,gid=5       0 0
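The name=linux:0 in that scan output may be the hint: with 1.x metadata, when the array name/homehost recorded in the superblock doesn't match the running system, mdadm falls back to the first free device number counting down from 127, hence md127. A sketch of a possible fix (untested here; adjust the devices to your system), rewriting the recorded name so assembly yields md0 again:

mdadm -S /dev/md127
mdadm --assemble /dev/md0 --name=0 --update=name /dev/sda1 /dev/sdb1

Afterwards the initrd would need rebuilding with mkinitrd, since it carries its own copy of the array configuration.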

I had the same thing happen last week and came to the same solution of putting md127 in fstab. The update changes the device number for some reason.

I was using software RAID-1 and a new install of openSUSE 11.1.