12.2 unbootable after kernel update

After installing the newest security update to the kernel this morning my 12.2 machine has become unbootable.

It was a (then) new single-boot 12.2 installation using grub2 (no remains of earlier grub1 or multiboot stuff should be on the disk). The kernel has been updated successfully before.
(Xen hypervisor is installed but hasn’t been really used, the machine boots to desktop kernel)

The error message is:

Grub loading…
Welcome to Grub!

error: unknown filesystem
Entering rescue mode

The disk has 3 partitions. sda1 is FAT (not used by grub or Linux in any way, just reserved for BIOS updates), sda2 is ext2 (/boot) and sda3 is a LUKS-encrypted LVM PV containing / and /home (which doesn’t matter yet, because grub first needs to understand sda2 before we get there).

Both sda2 and sda3 are intact: I can mount them when booting from a live CD (for sda3, after decrypting it and activating the LVs). fsck has no complaints.

My MBR is intact (md5sum matches a backup). As I understand it, the MBR contains a generic bootloader that jumps to the first partition with the boot flag set, which is indeed sda2. I guess this step works OK, otherwise I would not see the grub rescue shell.
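For anyone who wants to repeat the MBR check described above, here is a minimal sketch. It assumes the backup is a plain copy of the first 512-byte sector; `mbr_md5` and `mbr.bak` are hypothetical names chosen for illustration.

```shell
# Print the MD5 checksum of the first 512-byte sector of a disk or image file.
# mbr_md5 is a hypothetical helper name, not part of any package.
mbr_md5() {
    dd if="$1" bs=512 count=1 2>/dev/null | md5sum | cut -d' ' -f1
}

# Usage (as root, from the live CD); mbr.bak is an assumed backup file name:
#   dd if=/dev/sda of=mbr.bak bs=512 count=1    # take the backup once
#   [ "$(mbr_md5 /dev/sda)" = "$(mbr_md5 mbr.bak)" ] && echo "MBR unchanged"
```

Note that a matching MBR says nothing about the first sector of the partition itself. In a setup like this one, where grub2 is installed to /dev/sda2, the same helper can be pointed at /dev/sda2 to compare the partition boot sector against a backup.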

Grub thinks the prefix is (hd0,msdos2)/grub2, which again looks correct to me. But “ls (hd0,msdos2)” says unknown filesystem, and as long as grub cannot read this ext2 (/boot) we’re not getting anywhere. The ext2.mod file is there.

While booted from the live CD I reinstalled grub2 to /dev/sda2 using yast2 bootloader (after taking a backup of the whole partition). It made absolutely no difference: it still complains about unknown filesystem.

Could you boot a live CD and get the output of https://raw.github.com/arvidjaar/bootinfoscript/master/bootinfoscript? Post it on SUSE Paste.

Happened to me today too.

The screen stuck before the login screen should appear.

Thanks for your quick reply, that looks like a very useful tool. Unfortunately login.attachmate.com did not want to co-operate a couple of hours ago, so I did not manage to post this reply earlier.

Here it finally is: SUSE Paste

The information looks very reasonable to me; it matches how I thought the system was configured. I could not spot any problem (I’m not a grub2 guru).

Note: This is the status after re-installation of grub2 to /dev/sda2. I would expect that before the re-installation some files were located in different sectors. But as the error message was completely unaffected, the problem probably lies somewhere else. If you think it’s essential for understanding the issue, I can restore /dev/sda2 from backup to the state before the re-installation, when grub2 complained about the “unknown filesystem” for the first time.

Same here (oSUSE 12.2 KDE x86_64 UEFI Install, default partitioning from oSUSE DVD) after a kernel security patch! :P Security patch that makes oSUSE unbootable, lol!
By the way, I found a workaround just to get the system starting.
When on the GRUB menu, press e on the default menu entry, remove the -version suffix from both the vmlinuz and initrd lines, then boot with F10.
This is default

[...]
echo    'Loading Linux 3.4.28-2.20-desktop ...'
linux    /boot/vmlinuz-3.4.28-2.20-desktop root=UUID=blah-blah-blah-blah [...]
echo    'Loading initial ramdisk ...'
initrd    /boot/initrd-3.4.28-2.20-desktop

You’ll need to modify it this way

[...]
echo    'Loading Linux 3.4.28-2.20-desktop ...'
linux    /boot/vmlinuz root=UUID=blah-blah-blah-blah [...]
echo    'Loading initial ramdisk ...'
initrd    /boot/initrd
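The workaround above only helps if the unversioned names actually exist and resolve to a usable kernel and initrd. A small sketch for checking that from a running system or a mounted /boot follows; it assumes /boot/vmlinuz and /boot/initrd are symlinks, as the workaround implies, and `show_boot_links` is a hypothetical helper name.

```shell
# Show where the unversioned kernel and initrd names in a boot directory point.
# show_boot_links is a hypothetical helper; the directory defaults to /boot.
show_boot_links() {
    bootdir="${1:-/boot}"
    for name in vmlinuz initrd; do
        link="$bootdir/$name"
        if [ -e "$link" ]; then
            # -e follows symlinks, so this branch means the target exists
            printf '%s -> %s\n' "$link" "$(readlink -f "$link")"
        else
            printf '%s is missing or dangling\n' "$link"
        fi
    done
}

# Usage:
#   show_boot_links            # inspect /boot on the running system
#   show_boot_links /mnt/boot  # inspect a /boot mounted from a live CD
```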

It detects your filesystem as minix for whatever reason:

insmod minix_be

Edit grub.cfg and replace all “minix_be” with ext2.
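A minimal sketch of that replacement, keeping a backup copy first. `fix_grub_fs_module` is a hypothetical helper name, and /boot/grub2/grub.cfg is the path assumed from this thread; when working from a live CD the file sits under wherever /boot is mounted.

```shell
# Replace "insmod minix_be" with "insmod ext2" in a grub.cfg, after saving
# a backup as <file>.bak. fix_grub_fs_module is a hypothetical helper name.
fix_grub_fs_module() {
    cfg="$1"
    cp -p -- "$cfg" "$cfg.bak" || return 1
    # Only touch insmod lines, so other mentions of the string stay as they are
    sed -i 's/\(insmod[[:space:]][[:space:]]*\)minix_be/\1ext2/' "$cfg"
}

# Usage (as root):
#   fix_grub_fs_module /boot/grub2/grub.cfg
```

Keep in mind that grub.cfg is a generated file, so the change is likely to be lost the next time the configuration is regenerated.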

EDIT. OK, but it may not even be able to fetch ext2 at this point … now I have a valid reason to add a listing of the grub2 modules in core.img. Hmm … if it does not work, try manually reinstalling GRUB2 (from a proper chroot), adding ext2 explicitly:

grub2-install --modules=ext2 --force /dev/sda1

Then open a bug report on http://bugzilla.novell.com and let us know the number.

/dev/sda2 of course. Sorry, should not have replied early in the morning :slight_smile:

Same issue here… Yesterday at home, after updating kernel (through yast of course), didn’t reboot.

This morning, at work, I did the same update in my openSUSE (in a virtual machine) and restarted successfully. I only had to reinstall the guest additions, but I think that is the normal behavior.

Adding more "me too"s is not going to fix it. Open bug report, attach /boot/grub2/i386-pc/core.img (or /boot/efi/EFI/opensuse/grubx64.efi if you have EFI) and result of bootinfoscript before you tried any repair action. Post bug number here.

Care to explain where to open a bug report? Mine is a little different from the OP’s. It boots until the animation finishes, but then the login screen doesn’t appear. The default background image stays there forever, and the only key combination that works is ctrl+alt+del. Where should I look?

I can ctrl+alt+f1 normally before it gets stuck though.

Short story: Machine boots, thanks a lot!!!

Long story: See below.

Longer story: I believe I know what went wrong, but I don’t know how this has been prevented before and how it should be prevented in future.

No worries I would have guessed that anyway. Do what I mean, not what I say :wink:

Exactly, that’s the problem.

That statement (with the correct partition number of course) did not work:

 # grub2-install --modules=ext2 --force /dev/sda2
/usr/sbin/grub2-bios-setup: warning: File system `ext2' doesn't support embedding.
/usr/sbin/grub2-bios-setup: error: embedding is not possible, but this is required for cross-disk install.

Even with force it doesn’t want to write its files into the live CD’s filesystem. (I think it should not have caused any trouble, because I had the correct files in sda2’s filesystem from my previous reinstallation.)

Anyway, let’s do it right. (I always prefer to run without force first in order to understand what nasty things I force it into)

 # mount /dev/sda2 /mnt
linux:/home/linux # grub2-install --modules=ext2 --boot-directory=/mnt /dev/sda2
/usr/sbin/grub2-bios-setup: warning: File system `ext2' doesn't support embedding.
/usr/sbin/grub2-bios-setup: warning: Embedding is not possible.  GRUB can only be installed in this setup by using blocklists.  However, blocklists are UNRELIABLE and their use is discouraged..
/usr/sbin/grub2-bios-setup: error: will not proceed with blocklists.

OK, dangerous things, we need to force it:

 # grub2-install --modules=ext2 --boot-directory=/mnt/ --force /dev/sda2
/usr/sbin/grub2-bios-setup: warning: File system `ext2' doesn't support embedding.
/usr/sbin/grub2-bios-setup: warning: Embedding is not possible.  GRUB can only be installed in this setup by using blocklists.  However, blocklists are UNRELIABLE and their use is discouraged..
Installation finished. No error reported.

After this the machine boots just fine.

This was not required to get the machine to boot. I haven’t had time to look into it yet. Could wrong info there cause trouble some day later?

So if I understand the warnings (which need to be overridden with force) correctly, the problem is that grub hides its essential parts, which run before it can read the filesystem, inside the free blocks of the filesystem. If somebody modifies the filesystem contents, grub could break (if the filesystem uses the wrong “free” blocks). This seems to have happened when the kernel was updated.

However, how has this worked before? This openSUSE 12.2 is not very old and the kernel has not been updated very often, so it could have been luck. But I have an Ubuntu Lucid 10.04 LTS system, more than 2 years old, with exactly the same setup. Ubuntu a) updates the kernel more frequently than openSUSE and b) does not remove old kernels if there was a kernel-internal ABI break. In effect the boot filesystem there gets written to much more, and it has been full several times (until I manually removed old kernels). So how do they manage to never have the hidden grub parts overwritten? Do they re-install grub each time after they update the kernel? That should work.

Maybe even OpenSUSE normally re-installs grub2 (in general the bootloader, because I think grub2 is not the only supported one) after each update of the /boot filesystem? But this very kernel.rpm had a problem in its %post section and did not do it??? Well I know, it’s open source I should read the code and not speculate wildly. But now I have spent so many hours on this issue that I need to do some real work first.

Yes, bugzilla. I hope I can still do it today.

Then please start new thread.

Did I not say “proper chroot”, which implies that /boot is mounted where it belongs? As you said “do what I mean” :slight_smile:

Anyway glad it worked. Now I would be really interested to understand how it could possibly happen. Could you for a start show output of

bor@opensuse:~> sudo /usr/sbin/grub2-probe -t fs -d /dev/sda1
ext2

I just performed latest updates including kernel and did not have any issue. So I think your problem most likely is some latent issue that became obvious now, because bootloader had to be updated and system rebooted.

Under the shower it appeared to me that I must have written quite some nonsense there. I mixed up free space in the filesystem and the gap after the MBR. Of course the information can be “hidden” inside files. Even if grub accesses them via a blocklist before it is able to read the filesystem, the space is perfectly reserved also from the view of the filesystem. Problems would only occur if you copy around files and delete the originals resulting in a changed block address. I don’t think a kernel update would do that.

So I don’t understand how the kernel update managed to break the grub installation. (Of course it could have been something else than the kernel update, but somehow the kernel update still remains the number 1 suspect)

Well, now that you point me at it I see that you said it. But I didn’t read it :(, the code box somehow drew all my attention. And the one or two times in my life I was forced to reinstall grub before, I managed with the --boot-directory option. Is there anything fundamentally better/safer about using chroot? At least the typing effort seems to be bigger, to mount /proc, /dev, /sys (not sure which of them are really required).
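For comparison, a “proper chroot” reinstall from a live CD could look roughly like the sketch below. It is a sketch under assumptions, not a tested recipe for this machine: `cr_sda3` is an arbitrary mapper name, /dev/system/root is a hypothetical LV path for the root filesystem, and `run` and `reinstall_grub_chroot` are hypothetical helper names. With DRY_RUN=1 (the default) the commands are only printed.

```shell
# Sketch of a GRUB2 reinstall from inside a chroot, started from a live CD.
# Assumptions: cr_sda3 is an arbitrary mapper name and /dev/system/root is a
# HYPOTHETICAL LV path. With DRY_RUN=1 (default) commands are only printed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

reinstall_grub_chroot() {
    root_lv="${1:-/dev/system/root}"   # hypothetical LV name, adjust
    target=/mnt

    run cryptsetup luksOpen /dev/sda3 cr_sda3 || return 1   # unlock the LUKS PV
    run vgchange -ay || return 1                            # activate the LVs
    run mount "$root_lv" "$target" || return 1              # / of the installed system
    run mount /dev/sda2 "$target/boot" || return 1          # /boot where it belongs
    for fs in dev proc sys; do
        run mount --bind "/$fs" "$target/$fs" || return 1
    done
    run chroot "$target" grub2-install --modules=ext2 --force /dev/sda2
}

# Usage (as root): first review the printed commands, then run for real:
#   reinstall_grub_chroot /dev/system/root
#   DRY_RUN=0; reinstall_grub_chroot /dev/system/root
```

One difference from the --boot-directory approach is that inside the chroot grub2-install and its modules come from the installed system rather than from the live CD, and /boot is seen at its usual path.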

That I would like to understand, too. I guess you mean /dev/sda2 (my /boot partition) again. Yes, I can run the command when I am close to the machine again. Right now it is in the office (powered-off) and I am at home.

Besides the fact that it’s too late for today, I’m not sure what I should report to bugzilla. It’s not (yet) clear what causes the problem, and it’s not clear under which circumstances it might be reproducible. At least one of the “me toos” had a broken grub2, but he could get around it by modifying grub.cfg, so his problem was not that grub could no longer read the /boot fs. In the meantime those who need a workaround should also find our discussion here.

It says ext2 for my /dev/sda2, no surprise there. If you are interested in what it says about the backup image of the partition from when it was in the “grub unknown filesystem” state, I could try that one as well. Will grub2-probe be happy if the block device is an lvm2 logical volume on an encrypted physical volume? It would be both quicker and safer to restore my backup there. Or will it be smart enough to tell me that grub will not work under those circumstances?

Sure.

Will grub2-probe be happy if the block device is a lvm2 logical volume on a crypted physical volume?

I expect it to work for probing filesystem type.

Yes, grub2-probe did work without complaints. It recognized the restored /boot filesystem containing the broken, probably “Minix-only” grub as ext2. (restoration to a logical volume as mentioned earlier)

Anything else that could still be investigated?

My grub.cfg still contains the insmod minix_be statements. I’m not sure what their relevance really is, because according to my understanding grub.cfg is read using the filesystem support. So they come too late to cause any harm directly (or, thinking the positive way, correct insmod ext2 statements would be unreachable and could not rescue anything if ext2 filesystem support is missing in the beginning). However, the wrong statements in the file are probably a symptom that something went wrong in some phase.

How large is the image? Could you compress it and make available somewhere?