EFI-Boot: problem after crash

On my Ryzen I have three SUSE versions, and none of them is stable:

  1. 42.2 (standard kernel, somewhat less unstable than the other versions) on an NVMe disk (first boot priority)
  2. 42.2 with kernel 4.10.9 (the current “stable” kernel) on a 4TB disk - it’s barely bootable.
  3. Tumbleweed on a disk at a USB3 port (kernel 4.10.9, switched off at its power switch today).

In the morning I started (1) as usual and later put it to sleep (not standby, which doesn’t work, but hibernation, “Ruhezustand”).

On wake-up the boot crashed.
After that, only (2) could be found in the boot options - and not in the boot menu (at <F2>).
Several tries were necessary - after the second try all read-only rollback versions had vanished.
The boot menu didn’t show any option to boot from NVMe (the NVMe disk was visible, but not in the boot sequence).

The boot parameters were corrupted and the parameters saved in NVRAM were gone. After an hour or so I got (2) to start; all data looked OK at first glance, but YaST hung, so there was no way to update the boot menu.
Only the very last option - pressing <F12> at boot time to select what to boot - worked, and then YaST was able to regenerate a boot menu.
(There’s no hint about <F12> at all in the so-called “User’s Manual” of the board, only a 1/10-second flash on the screen at boot time!)
To my surprise it was possible to boot the NVMe disk using <F12> and then set up the boot menu using YaST.

Now I wanted to look at the efi parameters in a root shell, but it didn’t work as expected. Why?

# efibootmgr
Fatal: Couldn't open either sysfs or procfs directories for accessing EFI variables.
Try 'modprobe efivars' as root.
# modprobe efivars
# efibootmgr -v
Fatal: Couldn't open either sysfs or procfs directories for accessing EFI variables.
Try 'modprobe efivars' as root.

But after the next boot I was able to set all EFI parameters from scratch and set up the NVMe disk for booting.

My BIOS info:

# dmidecode -t0
# dmidecode 3.0
Getting SMBIOS data from sysfs.
SMBIOS 3.0 present.

Handle 0x0000, DMI type 0, 24 bytes
BIOS Information
        Vendor: American Megatrends Inc.
        Version: F2
        Release Date: 02/20/2017
        Address: 0xF0000
        Runtime Size: 64 kB
        ROM Size: 16384 kB
        Characteristics:
                PCI is supported
                BIOS is upgradeable
                BIOS shadowing is allowed
                Boot from CD is supported
                Selectable boot is supported
                BIOS ROM is socketed
                EDD is supported
                5.25"/1.2 MB floppy services are supported (int 13h)
                3.5"/720 kB floppy services are supported (int 13h)
                3.5"/2.88 MB floppy services are supported (int 13h)
                Print screen service is supported (int 5h)
                Serial services are supported (int 14h)
                Printer services are supported (int 17h)
                ACPI is supported
                USB legacy is supported
                BIOS boot specification is supported
                Targeted content distribution is supported
                UEFI is supported
        BIOS Revision: 5.12

Remaining question: can anyone explain what happened?

EFI/BIOS seems to be much more complex than the former BIOS, and much more unstable and error-prone.
So it seems necessary to study the mechanism - is there a good source?

It’s a bit hard to know what happened.

If I’m reading your report correctly, you hibernated, and then there were problems in waking up.

Personally, I don’t hibernate. I did try it once, to test. And that worked. But people run into problems depending on their hardware. I think it can be a problem with nvidia graphics, for example.

At one time, I had openSUSE on a USB drive. And if I booted the computer, and used the menu on the hard drive, the system on the USB drive would not work. I had to hit F12 and select the USB drive for booting from the BIOS. Apparently, the BIOS did not fully initialize the USB ports otherwise. However, after booting to the hard drive, I could then do a restart and boot the USB from the grub menu because the USB ports were already initialized by then. My impression is that Microsoft pressured manufacturers to do what they could to speed up boot, and not fully initializing the USB ports was one option that they could take.

You had an issue with “efivars”. This typically means that you booted your computer with legacy booting rather than EFI booting, so that the EFI system was not properly initialized. Perhaps you installed one or more of your systems for legacy booting rather than UEFI booting.
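
A quick way to check which mode the running system was booted in is to look for the /sys/firmware/efi directory, which the kernel only creates after an EFI boot. A minimal sketch (the function name boot_mode and its optional argument are made up for illustration; the argument only exists so the check can be tried against a test directory):

```shell
#!/bin/sh
# boot_mode [SYSFS_ROOT]: print "efi" if the kernel was started through UEFI,
# otherwise "legacy". SYSFS_ROOT defaults to /sys (hypothetical override for testing).
boot_mode() {
    sysfs_root="${1:-/sys}"
    # The kernel creates this directory only when it was booted via EFI
    if [ -d "$sysfs_root/firmware/efi" ]; then
        echo "efi"
    else
        echo "legacy"
    fi
}

boot_mode
```

If this prints "legacy", efibootmgr has no EFI variables to read, no matter which modules are loaded.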

I’m pretty sure that you can find UEFI specifications on the Internet. But they are complex and hard to read. There are also some simpler guides to use UEFI with linux. Maybe search for “rodsbooks efi”.

It was hibernation. I’ve almost never had problems with hibernation, but often with standby.

That’s a perfect hit, see http://www.rodsbooks.com/gb-hybrid-efi/ - it’s a Gigabyte board.
And it seems Rod is describing exactly the same situation.
On the other hand, it would not be enough to study the UEFI specification.

Rod Smith writes about where the “hair pulling begins” with Gigabyte’s Hybrid EFI, especially when there’s no Windows system (I have none).
He writes: “The board has a hard time remembering the boot options set via the Linux efibootmgr utility” …

Although I never explicitly opted for legacy booting, that seems to be true.
Perhaps for one system I missed Rod’s hint: “Set the firmware’s “EFI CD/DVD Boot Option” to “EFI,” if necessary, to force an EFI boot of your installer”.

For EFI newbies like me, studying Rod’s explanations seems to be of great value.

That’s how it is. My 42.2 on the 4TB disk has a /boot/efi; my 42.2 on the NVMe disk has none, only an empty /boot/grub2/x86_64-efi.
I used the identical USB stick for both. I don’t know whether that’s because of an installation default or because I changed a boot parameter.
Until I’ve corrected that, the situation described above may repeat … when it comes unexpectedly, that’s a bad trap.
And the BIOS setup description in the “User’s Manual” for my GA-AB350-Gaming is just a crazy joke … but such ‘manuals’ are standard nowadays.
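
The difference between the two installations can be checked with a few lines of shell. This is only a sketch built on what was observed above (a populated /boot/efi on the EFI install, an empty /boot/grub2/x86_64-efi on the other); the function name has_efi_setup and its ROOT argument are invented for illustration, and the check is a heuristic, not an official test:

```shell
#!/bin/sh
# has_efi_setup [ROOT]: succeed if the system installed under ROOT (default /)
# looks as if it was set up for EFI booting. Heuristic only.
has_efi_setup() {
    root="${1:-/}"
    # A mounted EFI System Partition contains a top-level EFI/ directory
    [ -d "$root/boot/efi/EFI" ] || return 1
    # An empty grub2 module directory suggests the EFI bootloader was never installed
    [ -n "$(ls -A "$root/boot/grub2/x86_64-efi" 2>/dev/null)" ] || return 1
    return 0
}

if has_efi_setup; then
    echo "looks like an EFI installation"
else
    echo "no complete EFI setup found"
fi
```

To inspect the other installation, mount its root partition somewhere and pass that mount point as the argument.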

I can’t tell what went wrong.

However – check whether the 42.2-4TB has an entry in its boot menu, to boot the 42.2-NVME. If it does, then boot that way. You’ll be using EFI, even though it was installed for legacy booting.

And if you get that far, then set up an “fstab” entry to mount your EFI partition as “/boot/efi”. And then go into Yast bootloader, and switch from using “grub2” to using “grub2-efi”.
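
For illustration, such an fstab line could be produced like this. It is only a sketch: the UUID is a made-up placeholder (the real one can be read with blkid), efi_fstab_line is a hypothetical helper, and the mount options are a plain assumption, not necessarily what the openSUSE installer would write.

```shell
#!/bin/sh
# efi_fstab_line UUID: print an fstab line that mounts the EFI System
# Partition (a FAT file system) at /boot/efi. "defaults" is an assumed option set.
efi_fstab_line() {
    uuid="$1"
    printf 'UUID=%s  /boot/efi  vfat  defaults  0  0\n' "$uuid"
}

# ABCD-1234 is a placeholder; use the UUID that `blkid` reports for your EFI partition
efi_fstab_line "ABCD-1234"
```

The printed line would go into /etc/fstab of the NVMe system; after that, `mount /boot/efi` should work as root.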

On second thought, you might do better to just boot the NVME system with the boot menu from the 4TB system. Otherwise you will run into issues with two different systems fighting over which controls the boot menu.

I’m planning a fresh setup starting with kernel 4.11.
Before following your hint, I tried kernel 4.11 RC8-1, had a good feeling about it apart from some kernel panics at boot time (no log exists), and updated to 4.11 RC8-2.
But booting 4.11 RC8-2 failed completely - kernel panic.

But the 4.11 RC8-2 boot attempts gave a perhaps valuable hint (I have one sharp camera picture):

EFI_MEMMAP is not enabled.
Failed to lookup EFI memory descriptor for 0x0…d01a901… Invalid PCI ROM headersignature: expecting 0xaa55, got 0xffff
BUG: unable to handle kernel Null pointer dereference at 0…0
IF: gpiochip_get_data+0x5/0x20

input: PC Speaker as /devices…
…kernel OOPS

… kernel panic

My guess: in the original 42.2 boot parameters my (Ryzen +) AMD Polaris graphics card is set to a very restricted use of HDMI/DisplayPort (VGA instead).

I had to roll back to kernel 4.4 …

Is it necessary to copy the boot options manually for every change?

If those are kernel options on the boot line, then you might need to copy them for each install. The installer usually puts those into “/etc/default/grub” so that they are there for future boots.
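
To see which options are stored there, the file can simply be read. A small sketch (grub_cmdline is a made-up helper name; it relies on the file being plain shell variable assignments, which is how grub2 itself reads it):

```shell
#!/bin/sh
# grub_cmdline [FILE]: print the kernel options stored in a grub defaults file
# (default /etc/default/grub). Note that sourcing the file executes it as shell code.
grub_cmdline() {
    file="${1:-/etc/default/grub}"
    [ -r "$file" ] || return 1
    # Source in a subshell so the variables do not leak into the calling shell
    ( . "$file" && printf '%s\n' "${GRUB_CMDLINE_LINUX_DEFAULT-}" )
}

grub_cmdline || echo "no readable grub defaults file found"

# After editing the file as root, the boot menu has to be regenerated, e.g.:
#   grub2-mkconfig -o /boot/grub2/grub.cfg
```

Options you typed by hand at the boot prompt are not saved anywhere, so they need to be added to that file if they should survive.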

First of all, thanks for all your valuable hints. With them I’ll try to set up Leap with kernel 4.11 as soon as possible.

I think my boot problems are strictly reproducible.
Let’s take:

  • a dinosaur Linux user with equipment from 1996 to 2005, who has always installed different Linuxes in parallel
    (OK, that also caused strange problems, up to warranty replacements of board and disk - both without hardware errors)
  • whose only contact with EFI was having heard of its existence
  • equipment too new to have full Linux support
    (Ryzen, AMD 460 graphics card, DP screen with audio speakers)
  • a GA-AB350-Gaming - he’ll think its ‘hybrid BIOS’ is equivalent to EFI,
    with no information besides its User’s Manual
  • a Leap 42.2 installation medium

and let him look for a stable system, or at least a system that works as ‘well’, ‘fast’ and ‘error-free’ as his very defective old equipment.
(My very first installation, a pure default installation of 42.2 onto the 4TB disk of a Windows-less PC, using the dealer’s parameter set - he had checked the PC with it - reported a defective USB port, used a 1024x768 screen, and froze randomly.)

Before I bought, I tried to get as much information as possible from the web - without experience of one’s own that’s nearly worthless.
Modern technology like EFI looks as if it had been designed as a dungeon game.

I tried to follow the hints I got here, but:

  • Because of lots of “unexpected IRQ trap at vector 07” errors (and messages!) it wasn’t possible to get YaST2 to make changes on the running 42.2 (EFI boot, kernel 4.10.9).
    (In single-user mode the constant “unexpected IRQ trap at vector 07” error messages cover the left part of YaST2.)
    What’s causing these messages? Is it possible to reduce them (even suppressing them might help)?
  • It was not possible to attach a USB3 disk for saving disk contents. It looked like a defective disk.
  • Logout and ordinary shutdown weren’t possible (no reaction).
  • Finally, my attempt to go back to kernel 4.4x ended in an endless loop after removing kernel 4.10.9 and reinstalling kernel 4.4x.
    After that there was no point in trying any rollback version (no kernel …).
    A backward update using a 42.2 install DVD would have been possible - I did that once before, and it worked - but I only had a USB3 medium with the 42.2 image.
    For whatever reason it wasn’t accepted by EFI or BIOS, which showed 30 partitions instead (really existing: 1).

It was necessary to install 42.2 with kernel 4.11.0-1.g1b516a5-default completely from scratch onto my NVMe disk.
Up to now it’s the best-running Linux on my PC. The too-new hardware is still a problem (Ryzen, GA board, Polaris with DP and audio), but it looks usable for the near future.
The kernel is still very imperfect. It runs much better than any kernel before, but a cold boot may take an hour and a series of kernel panics.
Apart from the “unexpected IRQ trap at vector 07” errors, the system has never frozen so far. (However, after loading my old home directory it wasn’t possible to log out or shut down the system; I waited for more than an hour and tried repeatedly.)

While doing that I saw what I had done wrong before.
That I made errors is no surprise.
The ‘newest’ of my PCs before this one was from 2005, and I was trained as a sysadmin only up to 1992.
It’s not totally impossible, but it takes a lot of luck to set all the boot and system parameters so that they have the expected effect.
One of the oldest principles of Unix is to work silently - and indeed also to fail silently.
Because of this silence it is hard for a non-expert to see whether the result is what he expected.

Looking back, there is a big lack of information on setting up EFI/BIOS and 42.2 for users like me.
There are a lot of parameters to set - and the most you get as a description is a list of possible values.
There’s no hint about the meaning of these parameters and no hint about the consequences of setting a particular value.
The “User’s Manual” of the board gives next to no information about EFI/BIOS, and there are lots of very unspecific and/or vague docs on the web -
without an expert looking over your shoulder and giving some explanation it’s very hard and unsatisfying.

I used openSUSE before for as long as it was possible with my old equipment (after that I used Lubuntu and Bodhi).
openSUSE used to be famous for being the easiest way to set up a Linux system.
There were some big pitfalls, and they are gone, but the situation nowadays is much more complex and there are new pitfalls.
E.g. on my NVMe disk I got, without any warning, a system that looked like an EFI system but had old-style partitions (I didn’t know that this can’t work. When I repeated the action with boot device ‘EFI DVD’ instead of a USB stick, a clear warning was visible).
openSUSE acts in a much more professional way than in former times, but I hope openSUSE is also trying to be a distro for ‘normal’ Linux users, and not only for professionals.

But then the installation should definitely give more information that casual users like me can understand.
While installing, the installation program was the only source of information within my reach.
Therefore, for any question in the installation procedure, it should give all the information a layman needs to make the decision. It should ask what the user is planning (e.g. EFI or classical BIOS boot), it should explain what each parameter setting does, and it should comment on the result, e.g.:

  • ‘system ready for EFI boot’
  • ‘to set up for EFI boot the following procedure is necessary’
  • ‘set up for classical BIOS boot. Because you also have an EFI boot system, depending on your board you may get problems’

For one of my 42.2 install attempts, a correct result message would have been “Whatever you do - it may be impossible to boot this system before correcting it for BIOS boot”. To set up this last system I had 1) installed a system using a 42.2 USB stick with old-style partitioning, and 2) updated this system using the same 42.2, but from “EFI DVD”.

In this forum there are a lot of experts giving very precise and helpful commands - often very tersely - so sometimes I miss hints that would help me grow a broader understanding of the situation (learning top-down together with bottom-up is sometimes more efficient).

Setting up EFI boot is no problem: just boot the installer in EFI mode and most things are set. If there is any question, the existing EFI boot partition (if there is one) should be set to mount at /boot/efi, and grub2-efi should be the boot code. Note that if you are multi-booting, all OSes must use the same method, i.e. EFI or MBR.

You may be confused about how the BTRFS file system works. It has many subvolumes which show up in mount, so don’t panic; or use the EXT4 file system, which is more traditional.

openSUSE keeps at least 2 kernels, so if the newer one does not work you can select the previous one using the advanced options at boot. Also, if you are using BTRFS and snapper, you can roll back to previous states.
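
To see which kernels are actually installed (and so which ones the advanced boot options can offer), listing the kernel images in /boot is usually enough. A rough sketch, with list_kernels as a made-up helper name and the vmlinuz-<version> file naming taken as an assumption about the layout:

```shell
#!/bin/sh
# list_kernels [DIR]: print the version string of every kernel image
# named vmlinuz-<version> in DIR (default /boot).
list_kernels() {
    boot_dir="${1:-/boot}"
    for image in "$boot_dir"/vmlinuz-*; do
        [ -e "$image" ] || continue      # the glob matched nothing
        basename "$image" | sed 's/^vmlinuz-//'
    done
}

list_kernels

# With BTRFS and snapper, the available snapshots can be shown as root with:
#   snapper list
```

If only one version shows up, there is no older kernel to fall back on, and a snapper rollback is the remaining option.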

Newer hardware can always be a problem, so Tumbleweed may be better for you, at least until 42.3 or whatever number it comes out with (there is some talk about 15.1 to match the SLES versions).

BTW you can get a live version from GeckoLinux to test without doing an install:

https://geckolinux.github.io/

That’s true - but that’s just the problem. Even for a guy like me, who had only heard the word “EFI” before, installing 42.2 on the 4TB disk was a straightforward exercise (I mainly used the distributor’s parameter set - there were too many crazy new names). But I wanted a system on the NVMe disk. Apart from the boot order I used the identical(!) “BIOS/EFI hybrid board” parameter set, but I got more or less a BIOS boot - and for lack of experience I didn’t see that. At first both systems ran, but soon it was a nightmare. Together with the obvious problems of too-new hardware - the 4.4x kernel proved to be nearly unusable - that was a very irritating exercise.

The installer should really warn when there’s an attempt to use different, incompatible methods for a multiboot system.

There was indeed some confusion.
But BTRFS has proven to be the best choice (though it may have caused some additional problems - sometimes it was impossible to log out or shut down).

I had to use BTRFS rollbacks extensively.

I’ve also tried Tumbleweed, but the problems were practically the same, depending mainly on the kernel version.

Up to now it hasn’t been possible to boot a test system without changing boot parameters - selecting another system for the next boot didn’t work because of the cold-boot problems (kernel panic).

I haven’t read the full thread, but check in the BIOS how your hard drive shows up. Some NVMe drives are listed as “intel raid” (or something like that), which causes severe problems.

Thanks - nice to see there’s a problem I don’t have :wink:

Well, you can always force an MBR boot. But if multi-booting, ALL OSes must boot the same way or they won’t see each other.

Then again, a little Googling will find lots of info, such as this from the official openSUSE site:

https://en.opensuse.org/openSUSE:UEFI

Probably more than you really need to know, but it covers most of the basics.

Ignorance is bliss but it is also expensive :open_mouth:

For my Ryzen, 4.11.0-2.g39615a9-default under 42.2 with EFI boot is the first kernel that doesn’t panic on my PC at a (still slow) cold boot (apart from 4.4x, which wasn’t really usable) :).
Setting up boot for both systems (the second is also 42.2 with the same kernel …) took nearly 40 additional minutes. It was the first time that this process completed under YaST rather than through an update procedure using the DVD image.
Temporarily connecting to the other system works, but an entry for the other system in fstab may still cause a kernel panic during a cold boot.
At least up to kernel 4.11.0-1 a fresh boot was sometimes necessary before YaST2 could be used or a USB disk connected.
I’m now inclined to wait for an update to the successor of 42.2 (and Tumbleweed for the second system).

Since that is the case and I’m too curious, yesterday I installed the newest Tumbleweed onto my NVMe disk (EFI boot).
First boot (with the install DVD loaded): 1 kernel panic, then it booted and ran successfully for the rest of the day.
I tried a cold boot this morning. After 3 kernel panics I switched to my second system (42.2, same kernel linux-ep76 4.11.0-2.g39615a9-default, 4TB disk).
It booted as before - slowly, but without major problems (only, whenever I took a short look at the boot state using <esc>, there were some new “unexpected IRQ trap at vector 07” messages).