Hi everyone,
I’m running openSUSE 11.4 on several machines and whenever there is a kernel update, there is a slight chance that grub refuses to boot the kernel after the update is done. It happened again today.
Here’s what I do/observe:
- “zypper update”, which (besides other stuff) listed kernel-default. I confirm.
- After this is done, I do “shutdown -r now” to reboot.
- Grub shows its menu (text mode only, I don’t have the graphical bootsplash installed). After the timeout of about 3 seconds, it tries to boot as usual but immediately reports “Error 16: Inconsistent filesystem structure”.
Unfortunately, the problem occurs only randomly. I’m running two 64-bit machines and four 32-bit machines. So far it has happened only on the 32-bit machines, and at every kernel update it hits 1 or 2 of them.
To fix this, I have to boot from external media and recreate the broken boot partition (i.e. copy the files to some temporary place, run “mkfs.ext2 /dev/sda1”, then copy them back onto the new filesystem). Then I mount everything, chroot into the system to be fixed and run “grub-install”. After that I can boot from the hard disk again.
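For reference, the recovery steps above can be sketched roughly as the script below. It is a dry run by default and only prints the commands. The mount point /mnt/root, the backup directory /tmp/boot-backup, the bind mounts and the /dev/sda argument to grub-install are assumptions for illustration (the post only says “grub-install”), and the sketch assumes the encrypted root has already been unlocked and mounted at /mnt/root.

```shell
#!/bin/sh
# Dry-run sketch of the manual recovery: prints the commands instead of
# running them. Set RUN=1 to execute (destructive: mkfs wipes the partition).
BOOT_DEV=${BOOT_DEV:-/dev/sda1}      # /boot partition named in the post
ROOT_MNT=${ROOT_MNT:-/mnt/root}      # assumed: root filesystem already mounted here
BACKUP=${BACKUP:-/tmp/boot-backup}   # assumed temporary location for the files

# Print the command in dry-run mode, execute it when RUN=1.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

rebuild_boot() {
    run mkdir -p "$BACKUP" "$ROOT_MNT/boot"
    run mount "$BOOT_DEV" "$ROOT_MNT/boot"
    run cp -a "$ROOT_MNT/boot/." "$BACKUP/"      # save the current /boot contents
    run umount "$ROOT_MNT/boot"
    run mkfs.ext2 "$BOOT_DEV"                    # recreate the filesystem
    run mount "$BOOT_DEV" "$ROOT_MNT/boot"
    run cp -a "$BACKUP/." "$ROOT_MNT/boot/"      # copy the files back
    for d in dev proc sys; do
        run mount --bind "/$d" "$ROOT_MNT/$d"    # assumed: needed inside the chroot
    done
    run chroot "$ROOT_MNT" grub-install /dev/sda # assumed target: MBR of the only disk
}

rebuild_boot
```

Reading the printed commands before setting RUN=1 is the point of the dry run; the device names should be double-checked on each machine.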
Some more information:
- The filesystem that grub considers inconsistent is OK according to fsck.ext2. I checked this before recreating the filesystem with mkfs.ext2.
- The system contains only one hard disk with 2 partitions: sda1 is /boot, ext2, 80 MiB; sda2 is a LUKS-encrypted LVM container. The partition table is GPT, and grub is installed in the MBR (I think).
- The way I installed these systems is a bit unorthodox: I basically set up the partitioning scheme by hand and mounted everything, then did a “zypper -R /mnt/root install …”. While this might contribute to triggering this behavior, I still believe there is an underlying bug causing this mess. I have no other explanation for why it happens only randomly rather than every time.
Does anyone have an idea where I could look for the cause of this strange behavior? I searched this forum and Google, and many people are seeing this error message from grub, but most threads either (a) only discuss how to get the system to boot normally again or (b) deal with the error appearing in the grub shell run from within the working system. I couldn’t find a thread that addresses my problem (or I wasn’t able to recognize one).
Over the past months I always just “fixed” this as fast as possible to move on. Today I made a copy of /dev/sda1 (“cat /dev/sda1 > file”) right after booting my rescue system. If anyone has the time and interest to track down the underlying bug, I’m happy to share this file, log files and other data. I just don’t know where to look for hints.
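In case it helps anyone who wants to look at the saved image, here is a minimal sketch of read-only checks that could be run on it. The function name inspect_image is hypothetical; sha256sum, dumpe2fs and e2fsck are the standard coreutils/e2fsprogs tools. Whether any of this output points at the cause is unknown; it only collects data that could be compared against a /boot filesystem that grub accepts.

```shell
#!/bin/sh
# inspect_image IMAGE: read-only checks on a saved copy of the /boot filesystem.
# (hypothetical helper, not part of any package)
inspect_image() {
    img=$1
    [ -r "$img" ] || { echo "cannot read image: $img" >&2; return 1; }
    sha256sum "$img"      # fingerprint so others can verify they got the same copy
    dumpe2fs -h "$img"    # superblock only: feature flags, inode size, block size
    e2fsck -fn "$img"     # forced check, answers "no" to every fix (nothing is written)
}
```

It would be called as “inspect_image file”, where “file” is the copy made with cat above.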
Yarny