Grub2 weirdness

Today, I brought my system down to change a plug on the onboard sound card. When I tried to bring it up, I got the “No operating system found” message. I’m single-booting openSUSE 12.3 with the UEFI FAT32 partition and the root partition on an SSD. The tmp and home partitions are on a 1 TB mechanical drive. This system had not been down for quite some time, so several YOU updates had been installed in the meantime, and I had also updated evolution from build.opensuse.org/GNOME.

The machine was OK: I could boot a KDE Live CD, but the SSD was not showing up as bootable in the BIOS screen. First move: update the BIOS on my system board. Didn’t help. I was concerned that the SSD was going bad. After starting with the Live CD, I poked around for a while and could read and write to the FAT partition of the SSD. I started the machine with a copy of Super Grub2; it found the grubx64.efi file on the SSD and the computer came up like it should. That proved the SSD was still working. I ran grub2-mkconfig without success. I’ll spare you the long list of stuff I tried, but I found that, for the firmware to accept a device as bootable, there had to be a FAT partition with a folder named EFI, containing a subfolder that in turn contains a valid .efi file.
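
A minimal sketch of that check, assuming the EFI FAT partition is already mounted somewhere. The mount point is passed in as an argument (/boot/efi is only a common default, not something stated in this thread), and the function name is made up for illustration. It lists every .efi file that sits exactly one folder below EFI.

```shell
#!/bin/sh
# List candidate EFI loaders on a mounted EFI System Partition.
# Assumption: $1 is the directory where the FAT partition is mounted.
list_efi_loaders() {
    esp="$1"
    # Loaders live one level below EFI/, e.g. EFI/opensuse/grubx64.efi.
    # -iname is used because FAT file names are case-insensitive.
    find "$esp/EFI" -mindepth 2 -maxdepth 2 -type f -iname '*.efi' | LC_ALL=C sort
}
```

For example, `list_efi_loaders /boot/efi` would be the call on a system that mounts the partition there. An empty result means there is no loader in the expected place.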

The default file structure on the UEFI FAT partition is /EFI/opensuse/grubx64.efi. With that structure, my computer will not boot. I had to change the name of both the opensuse folder and the grubx64.efi file. If I change the folder named opensuse to anything else, the machine will start to come up, but will not finish. When I change the name of grubx64.efi to anything else, the machine will boot properly as long as the opensuse folder has been renamed.

Obviously, the problem is not in openSUSE, but in grub2. On 16 August, there was an update to grub. Grub2, grub2-efi, grub2-x86_64.efi and so on were contained in the YOU update I did. I find it hard to believe though, that I’m the only one who had a problem. Although my system is operating now, I’m concerned that, in the future, something will require an update to something in grub that will break it again.

I’m hoping that if someone does have a problem, this message will help. But, I’m hoping even more that someone will say “Oh, you didn’t need to do all that, dummy, here’s how to fix it!”

Bart

Check your plugs. Any time you open the box, there is a chance of producing a loose connection if you bump the cables.

There were several posts about boot problems recently. Unfortunately, each case appears to be different and nobody provided enough information to even try to guess what happened.

In your case it looks like a known firmware bug, but again, you do not say what PC you are using, which manufacturer, etc.

Please show the output of “efibootmgr -v” so we can look at your firmware boot list.

On Wed 11 Sep 2013 03:16:02 AM CDT, arvidjaar wrote:

montana_suse_user;2583907 Wrote:
> I find it hard to believe though, that I’m the only one who had a
> problem.

There were several posts about boot problem recently. Unfortunately,
each case appears to be different and nobody provided enough information
to even try to guess what happened.

In your case it looks like known firmware bug but again - you do not
tell what PC you are using, what manufacturer etc.

Please show output of “efibootmgr -v” to look at your firmware boot
list.

Hi
I have the same thing on my HP ProBook 4430s: it does not find an operating
system (I’m guessing it’s looking for the Windows Boot Manager entry). It’s
just a funky UEFI implementation, and I also think some grub2-efi changes to
the EFI variables play a part.

I can overcome this by setting the nextboot option with efibootmgr (I
actually created a systemd service).

For example, in my case:


efibootmgr

BootCurrent: 0000
Timeout: 0 seconds
BootOrder: 0000,0001
Boot0000* Notebook Upgrade Bay
Boot0001* Notebook Hard Drive
->> Boot0002* opensuse
Boot0003* Notebook Ethernet

efibootmgr -n 0002

BootNext: 0002
BootCurrent: 0000
Timeout: 0 seconds
BootOrder: 0000,0001
Boot0000* Notebook Upgrade Bay
Boot0001* Notebook Hard Drive
Boot0002* opensuse
Boot0003* Notebook Ethernet
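
A rough sketch of how the boot number could be looked up by label instead of being typed by hand, for instance from a script that a systemd service runs. The function name is hypothetical and the parsing assumes the `BootNNNN* label` line format shown above. It only reads efibootmgr output from standard input and changes nothing by itself.

```shell
#!/bin/sh
# Print the four-hex-digit boot number of the entry whose label equals $1.
# Reads `efibootmgr` output on stdin; prints nothing if no entry matches.
bootnum_for_label() {
    awk -v label="$1" '
        /^Boot[0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f]/ {
            num = substr($0, 5, 4)
            rest = substr($0, 9)          # "* opensuse" or "  Setup"
            sub(/\t.*/, "", rest)         # drop a device path, if one is printed
            sub(/^\*?[ \t]+/, "", rest)   # drop the active marker and padding
            if (rest == label) { print num; exit }
        }'
}
```

As root, something like `efibootmgr -n "$(efibootmgr | bootnum_for_label opensuse)"` would then set BootNext to whatever number the opensuse entry currently has.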


Cheers Malcolm °¿° SUSE Knowledge Partner (Linux Counter #276890)
SLED 11 SP3 (x86_64) GNOME 2.28.0 Kernel 3.0.82-0.7-default

For the record, the system board is an ASUS Sabertooth 990FX/GEN3 R2.0 with an AMI BIOS, which I updated to the latest version. It has 32 GB of RAM and an AMD 8-core processor. All drives are SATA. SDA is a Kingston SV200S3 120 GB SSD with a 133 MB FAT partition for the EFI boot system, a 50 GB partition for / and a 50 GB partition for semi-static data that is mounted in my home directory. SDB is a 1 TB Seagate that has 32 GB for swap, 32 GB for /tmp and 850 GB for /home. SDC is a 1 TB drive mounted in my home directory. SDD is a 500 GB drive from my old Windows XP machine. It is mounted in /home and still has some data on it.

bart@Asus-990FX:~> su -
Password:
Asus-990FX:~ # efibootmgr -v
BootCurrent: 0004
Timeout: 0 seconds
BootOrder: 0004,0002,0001
Boot0001* CD/DVD Drive  BIOS(3,0,00)AMGOAMNO........o.P.L.D.S. .D.V.D.-.R.W. .D.H.1.6.A.C.S.H....................A...........................>..Gd-.;.A..MQ..L.0.S.6.A.6.8.4.9.V.Z.6.J.D.Y.0.0.0.R.U.2......AMBOAMNO........o.A.T.A.P.I. . . .i.H.E.S.2.1.2. . . .3....................A...........................>..Gd-.;.A..MQ..L.5.2.2.1.1.0. .1.1.2.2.6.9.0.0.5.3.0.5.5......AMBO
Boot0002* Hard Drive    BIOS(2,0,00)AMGOAMNO........o.K.I.N.G.S.T.O.N. .S.V.2.0.0.S.3.1.2.8.G....................A...........................>..Gd-.;.A..MQ..L.2.4.A.7.0.3.W.7.G.K.K.S. . . . . . . . ......AMBOAMNO........o.S.T.3.1.0.0.0.5.2.8.A.S....................A...........................>..Gd-.;.A..MQ..L. . . . . . . . . . . . .V.9.6.P.E.P.L.5......AMBOAMNO........o.S.T.3.1.0.0.0.5.2.8.A.S....................A...........................>..Gd-.;.A..MQ..L. . . . . . . . . . . . .V.9.6.P.R.Q.N.2......AMBOAMNO........o.M.a.x.t.o.r. .6.H.5.0.0.F.0....................A...........................>..Gd-.;.A..MQ..L.8.H.Z.0.L.4.H.J. . . . . . . . . . . . ......AMBO
Boot0004* UEFI OS       HD(1,1f60800,42800,526eaceb-1396-499a-9187-ef1dfcb8e5bf)File(\EFI\BOOT\BOOTX64.EFI)
Asus-990FX:~ #

Thing is, it’s been working just fine for many months now. I guess there could be an incompatibility between my system and grub2. But I can’t, for the life of me, figure out why simply changing the names of the directory and the efi file would fix it. I created and threw away half a dozen cheap CDs trying different things, and formatted and monkeyed with a USB drive till I thought I’d wear it out. Listened to my wife when I used her real Windows machine to try some things. It never occurred to me (for WAY too long) to simply change the names of the files on the boot drive.

If I go in right now, and change the name of the directory on the boot partition to opensuse, as it was, my machine won’t boot. If I boot Super Grub2, have it load the efi file and change the name of the directory back to BOOT or almost anything other than opensuse, it will boot just fine. The same occurs with the name of the EFI file. If I change it back to grubx64.efi, it won’t boot. If I change it to anything other than the original name, it’ll boot just fine.
I’m tempted to say that someone hard coded some file names into grub, but that wouldn’t happen. I mean… Would it? :)

I do have to say that Super Grub2 downloaded and burned on a disk was a tremendous help!

The upside of all this is, I actually learned something!

Bart

Which directory? You see, most people omit details they assume are obvious making it impossible to actually understand what happened …

> to opensuse, as it was, my machine won’t boot.

Your firmware does not have a boot entry for openSUSE. So of course it will ignore it. Now, why and when exactly that happened is impossible to answer. grub-install will try to delete and re-create the EFI boot entry (maybe it should not delete it if it exists and points to the correct file…); so if something prevented the new entry from being added, it could result in your problem.
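
To see at a glance which loader files the firmware entries actually point to, the `efibootmgr -v` listing can be reduced to one line per entry. This is only a sketch with a made-up function name; it assumes the `File(\path)` notation that appears in the listing earlier in this thread and ignores entries that name no file.

```shell
#!/bin/sh
# Read `efibootmgr -v` output on stdin and print "BootNNNN /path/to/loader"
# for every entry that names a file, with backslashes turned into slashes.
boot_entry_loaders() {
    awk '
        /^Boot[0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f]/ && match($0, /File\([^)]*\)/) {
            path = substr($0, RSTART + 5, RLENGTH - 6)   # text inside File(...)
            gsub(/\\/, "/", path)
            print substr($0, 1, 8), path
        }'
}
```

Run as `efibootmgr -v | boot_entry_loaders`. On the listing posted earlier this leaves a single line for Boot0004, pointing at /EFI/BOOT/BOOTX64.EFI, and nothing that mentions an opensuse folder. As far as I know, that path is the fallback loader location the UEFI specification defines for x86-64, which may be why a folder named BOOT works without an entry of its own.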

It is known that EFI storage space may fragment and defragmentation is not always efficient (or sometimes is done on reboot only). That could be one possibility.

> I’m tempted to say that someone hard coded some file names into grub

Your system behaves as it should given your boot entries.

Could you try to power-cycle your system, boot, run grub2-install and check whether you have any additional boot entry in the efibootmgr output?

Which directory was I talking about? I was of course referring to the FAT partition of the boot drive that has the /EFI directory. The one that, of necessity, as I understand it, must contain another directory which contains the .efi file which contains the code to actually start the boot manager. That is, obviously, the directory being discussed.

> Your firmware does not have boot entry for openSUSE. So of course it will ignore it.

I’d be willing to bet that my firmware does not include an entry such as labamba, but when I rename the directory that was previously named opensuse, the one created by the opensuse installation, to labamba, the boot process works. That is, as long as the .efi file has also been renamed.

> Now why and when exactly it happened is impossible to answer. grub-install will try to delete and re-create EFI boot entry (may be it should not delete it if it exists and points to correct file…); so if something prevented new entry from being added, it could result in your problem.

I did not run grub-install. Didn’t see a need to. It was installed when I installed openSUSE 12.3, and it was running correctly. I simply tried to reboot my machine after shutting down to make a wiring change to the sound card. That had absolutely nothing to do with any software at all.

> It is known that EFI storage space may fragment and defragmentation is not always efficient (or sometimes is done on reboot only). That could be one possibility.

I had not touched the FAT file system on that drive since installation. Of course, I don’t know what goes on during the YOU updates. I guess I could copy the .efi file to a USB drive, format the FAT partition, re-create the directories and copy the .efi file there. That would eliminate any fragmentation problems. Then again, the problem is not that it doesn’t work, the problem is it doesn’t work with pre-existing file names!

I guess I need to find the time to rename the .efi file and the directory where it resides to a lot of different names and see if I can find a pattern. It is such a PITA to reboot a hundred times!

> Your system behaves as it should given your boot entries.

> Could you try to power-cycle your system, boot, run grub2-install and check whether you have any additional boot entry in the efibootmgr output?

Yes. Sure! Good idea! I’ll do that and reply with my findings.

Bart

grub2-install is invoked automatically during a grub2 RPM update, and there were updates.

> I had not touched the FAT file system

I did not mean the EFI System Partition, but the NVRAM where EFI variables and settings are kept (including the list of boot entries).

I took the time tonight, to try various things with this system. I think I’m getting a handle on this.

Here is how I view the boot system to work. Please correct me where I’m not on track.

EFI stands for Extensible Firmware Interface. The word “Firmware” is the key part right now: it means the storage is non-volatile but writable.

Grub2-install has the ability to write to this NVRAM. And does, creating a list of bootable devices, locations of, and names of startup files on these bootable devices.

At boot time, the computer looks for this list of devices in the NVRAM and, if the first device is available, reads the information at the location stored in the NVRAM variable.

That information, for my use, is grub2. Grub2 loads, offers me several choices and then calls the linux kernel. Grub2 then terminates and drops out of RAM.

If I have this right, it’s not so different from the old PC BIOS system except that it gives a large leap in flexibility in defining boot devices and locations of boot files.

It seems to me, that I could rename the directories and files in the EFI partition, to whatever I wanted, and manually create boot entries, using efibootmgr, using the madeup names and the system would boot properly. Although I could imagine all sorts of problems with updates and the like from doing something like this.
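
A sketch of what such a manual entry might look like. The helper below only prints the efibootmgr command rather than running it, so it is safe to experiment with. The function name, the disk, the partition number, the label and the loader name in the example are all made-up values for illustration, not taken from this system.

```shell
#!/bin/sh
# Print (do not run) an efibootmgr command that would create a boot entry.
# Arguments: disk, partition number, label, loader path on the EFI partition.
print_create_entry_cmd() {
    disk="$1" part="$2" label="$3"
    # The firmware expects backslash-separated paths, so convert slashes.
    loader=$(printf '%s' "$4" | tr '/' '\\')
    printf "efibootmgr --create --disk %s --part %s --label '%s' --loader '%s'\n" \
        "$disk" "$part" "$label" "$loader"
}
```

Calling `print_create_entry_cmd /dev/sda 1 labamba /EFI/labamba/myloader.efi` prints a command that can be reviewed first and then run as root. The entry only works if the named file really exists on that partition.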

I took your suggestion and ran grub2-install


bart@Asus-990FX:~> su -
Password:
Asus-990FX:~ # grub2-install
BootCurrent: 0004
Timeout: 0 seconds
BootOrder: 0000,0004,0002,0001
Boot0001* CD/DVD Drive
Boot0002* Hard Drive
Boot0004* UEFI OS
Boot0000* opensuse
Installation finished. No error reported.
Asus-990FX:~ #
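
The BootOrder line in that output decides which entry the firmware tries first. A small sketch, with a hypothetical function name, that prints the entries in the order the firmware would try them, assuming the same line format as above:

```shell
#!/bin/sh
# Read efibootmgr output on stdin and print "NNNN label" lines in the
# sequence given by BootOrder, i.e. the order the firmware tries them.
boot_order_labels() {
    awk '
        /^BootOrder:/ { n = split($2, order, ",") }
        /^Boot[0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f]/ {
            rest = substr($0, 9)
            sub(/\t.*/, "", rest)         # drop a device path, if one is printed
            sub(/^\*?[ \t]+/, "", rest)   # drop the active marker and padding
            label[substr($0, 5, 4)] = rest
        }
        END { for (i = 1; i <= n; i++) print order[i], label[order[i]] }'
}
```

Fed with the listing above (`efibootmgr | boot_order_labels`), the new opensuse entry 0000 comes out first, ahead of the older UEFI OS entry 0004.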

At this point, I looked inside the EFI drive and was surprised to see a /EFI/opensuse/grubx64.efi structure. I deleted the /BOOT directory and rebooted.
The system boots correctly with the structure and names that it refused just before.

I don’t quite understand the Boot0004 entry, but I think it is there because the BOOT/bootx64.efi file still existed when I ran grub2-install. I’m going to try running it again and see if the entry goes away now that the BOOT directory is gone.

My only question now is how and when my NVRAM got modified. It obviously did, as the directory names and file names it wanted were not what was on the drive, proven by the fact that it would boot when they were changed. Guess I’ll probably never know the answer, but at least now I know how to fix the problem if, or when, it ever occurs again.

arvidjaar, it was your references to this NVRAM, the suggestion to run grub2-install and reference to efibootmgr that finally turned on that little light in my head. Thanks, thanks a bunch.

Bart

You could try to dig in /var/log/YaST2. The file perl-BL-standalone-log should contain the output of update-bootloader, which is what openSUSE uses. Maybe you can find some hint there. You should also check whether the correct bootloader type is set: check /etc/sysconfig/bootloader for LOADER_TYPE; it should be grub2-efi.
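
The LOADER_TYPE check can be done with a one-line helper. This is a sketch that assumes the usual `KEY="value"` sysconfig layout; the function name is made up.

```shell
#!/bin/sh
# Print the LOADER_TYPE value from a sysconfig-style file.
# Defaults to /etc/sysconfig/bootloader, the path mentioned above.
loader_type() {
    file="${1:-/etc/sysconfig/bootloader}"
    # Lines look like LOADER_TYPE="grub2-efi"; strip the key and any quotes.
    sed -n 's/^LOADER_TYPE=//p' "$file" | tr -d "\"'" | tail -n 1
}
```

Running `loader_type` with no argument should print grub2-efi on a system set up for UEFI booting with grub2.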

Otherwise your description is 100% correct, I’m impressed.

efibootmgr -v
BootCurrent: 000F
Timeout: 2 seconds
BootOrder: 000E,0011,000F,0004,0005,0006,0007,0008,0009,000A
Boot0000  Setup
Boot0001  Boot Menu
Boot0002  Diagnostic Splash
Boot0003  Acer D2D:
Boot0004* HDD 0: WDC WD5000BPVT-22HXZT3                         ACPI(a0341d0,0)PCI(1f,2)03120a00000000000000..bYVD.A...O.*..
Boot0005* HDD 1:        030a2500d23878bc820f604d8316c068ee79d25b91af625956449f41a7b91f4f892ab0f601
Boot0006* ATAPI CD/DVD: MATSHITADVD-RAM UJ8C2                           ACPI(a0341d0,0)PCI(1f,2)03120a00040000000000......!N.:^G.V.T
Boot0007* USB FDD:      030a2400d23878bc820f604d8316c068ee79d25b6ff015a28830b543a8b8641009461e49
Boot0008* Network Boot: 030a2400d23878bc820f604d8316c068ee79d25b78a84aaf2b2afc4ea79cf5cc8f3d3803
Boot0009* USB HDD:      030a2400d23878bc820f604d8316c068ee79d25b33e821aaaf33bc4789bd419f88c50803
Boot000A* USB CD/DVD:   030a2400d23878bc820f604d8316c068ee79d25b86701296aa5a7848b66cd49dd3ba6a55
Boot000B  UsbBackDoor
Boot000C* Internal Shell:
Boot000D  MEBx Hot Key
Boot000E* Fedora        ACPI(a0341d0,0)PCI(1f,2)03120a00000000000000HD(2,c8800,96000,56738615-741a-4897-abe4-f9d4f5581153)File(\EFI\fedora\shim.efi)..
Boot000F* opensuse      HD(2,c8800,96000,56738615-741a-4897-abe4-f9d4f5581153)File(\EFI\opensuse\grubx64.efi)
Boot0011* Windows Boot Manager  HD(2,c8800,96000,56738615-741a-4897-abe4-f9d4f5581153)File(\EFI\Microsoft\Boot\bootmgfw.efi)WINDOWS.........x...B.C.D.O.B.J.E.C.T.=.{.9.d.e.a.8.6.2.c.-.5.c.d.d.-.4.e.7.0.-.a.c.c.1.-.f.3.2.b.3.4.4.d.4.7.9.5.}...e................

I had to reinstall the openSUSE grub because every time I selected the “opensuse” entry, the UEFI boot manager would start the Fedora grub instead (the same one the Fedora entry starts, hence pointing to the Fedora kernel).
I was able to boot openSUSE manually from the GRUB shell (via the set root=, linuxefi and initrdefi commands).
Once in openSUSE, I ran the command:

zypper install --force grub2 grub2-efi grub2-x86_64-efi

Before that, I had tried to delete and recreate the boot entry with the efibootmgr command, to no avail.
Probably the UEFI firmware is defective or, for some reason, the grubx64.efi file in the opensuse folder was deleted or overwritten by the Fedora one (?) :-O