Grub disordering in Leap 15.6 keeps recurring?

Folks:

Had a long thread going about “log in slowness with TW”, and that seems to have been possibly “solved” by changing the display manager to sddm_qt6 . . . . However, in trying to confirm that all is well with TW log in, I just cold booted to the grub menu and selected the “tumbleweed” system; it booted to Manjaro, which is on another drive. I restarted and this time selected “Tumbleweed, advanced options” . . . and then scrolled down to the fifth line item (all of them named “tumbleweed”), and that successfully booted to TW, and log in was back to normal.

Leap 15.6 is now the “os-prober” system and over the last few days, grub has “disordered” itself a number of times. This morning, for possibly the third time, I ran the “grub2-mkconfig” command, and that found all of the OSs (7 bare metal installs) in their proper sdx locations and I could boot into and restart into a number of them, just fine.

I shut the machine down, and as mentioned above, on cold boot, selecting “TW” in sda7 went to Manjaro in sdb7 . . . . So, grub was “operational” before shutting down and back to not so tidy on cold boot, nothing having been done to grub while the machine was shut down.

Grub was previously installed on TW and ran fine up until a few days back, when, finally in frustration with the slowness of log in, I installed grub and os-prober on Leap, and subsequently OSs were lost or re-ordered . . . willy-nilly.

I wonder if your problem could be related to this thread:

https://forums.opensuse.org/t/problem-with-disks-order-after-snapshot-20230921/169324

and this post:

https://forums.opensuse.org/t/problem-with-disks-order-after-snapshot-20230921/169324/5

Possibly . . . have to check it later, but this problem showed up after 9/21 . . . ??? Thanks for the links . . . .

The latest update . . . seems like on each fresh boot the grub order of systems continues to change . . . in line item order AND in sdx location.

Today, the names of the system associated with the sdx location were changed, but grub would boot the listed sdx that was showing . . . rather than as before booting to ER mode if the name and/or sdx location weren’t correct, so like today it showed “Lubuntu Mantic in sdb7” . . . but Manjaro is located in sdb7 and Manjaro is what booted up . . . .

But, for several weeks I have not been able to boot a Gecko rolling install, by hook or by crook . . . boots to ER . . . .

I tried this suggestion from Michael H on the Factory list-serve. In my case it did not prevail. Grub sees the OSs in “sdb” as being in “sdc” and vice versa.

I don't work with booting very often, so to confirm the procedure, it
should be something like (as root):

    echo 'softdep usb_storage pre: ahci' > /etc/modprobe.d/10-ahci-scsi.conf
    dracut -f --regenerate-all

I first just ran dracut -f and the boot order still varied.  But after adding
the --regenerate-all both a hot and cold boot reverted to the old ordering.
Others reading this might also want to note that mkinitrd is no longer
available, so do use dracut.

The sdX order is determined by the BIOS; this is why you need to use some form of labeling to fix the order in which you want the OS to see the partitions/drives. You cannot depend on the BIOS to always set the right order for you.

Especially on a Mac. :wink:

OK, thanks for the reply on it. I have seen other posts talking about “labeling” . . . but I don’t know what that is . . . or how to do it. Grub was running fine up until a week or so back . . . seems like a perfect storm of problems have hit my openSUSE installs . . . .

@mrmazda:

The scourge of the Mac . . . in spite of the historical issues where other systems were moving UUIDs around . . . all was well until this recent deluge of conundrums.

UUID or label can be used to name the partitions. A UUID is randomly created at partition creation. Labels you need to add yourself, but be sure there are no duplicate labels on the machine. The BIOS sets the order based on the order of discovery. This can change if the hardware changes, or for other reasons.
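For anyone wanting to check the "no duplicate labels" point, here is a minimal sketch. The function name `dup_labels` is made up for illustration; it only reads label names on standard input, one per line, and prints any label that occurs more than once.

```shell
#!/bin/sh
# dup_labels: hypothetical helper, not part of any package.
# Reads one label per line on stdin, ignores empty lines (file systems
# without a label), and prints each label that appears more than once.
dup_labels() {
    grep -v '^$' | sort | uniq -d
}
```

One way to feed it is `lsblk -rno LABEL | dup_labels`; empty output means no duplicates were found. Setting a label is file-system specific (for ext2/3/4, `e2label` is one tool for it), and a labeled file system can then be referenced in /etc/fstab as `LABEL=name`.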

My /etc/fstab shows UUIDs . . . where am I supposed to look to set this data so that grub will recognize the system that is there and selected to boot?

Seems like right now grub is re-ordering on each reboot . . . .

UUID labels only change on creation of a partition; the sdX order is all due to the BIOS and the order in which the drives are found. Don’t confuse them.

On the other hand, 15.6 is still alpha.

You can see the sdX assignments to UUID labels in /dev/disk/by-uuid by looking at the properties of the entries there. All references should be by UUID, no sdX stuff.
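A small sketch of that lookup, in case it helps. The function name `uuid_map` is hypothetical; it assumes the entries in /dev/disk/by-uuid are symlinks to the device nodes, and prints where each one currently points.

```shell
#!/bin/sh
# uuid_map: hypothetical helper for illustration.
# For every symlink in the given directory (default /dev/disk/by-uuid),
# print "<link name> -> <resolved target>".
uuid_map() {
    dir=${1:-/dev/disk/by-uuid}
    for link in "$dir"/*; do
        [ -L "$link" ] || continue
        printf '%s -> %s\n' "$(basename "$link")" "$(readlink -f "$link")"
    done
}
```

Running `uuid_map` after each boot and comparing the output shows whether a given UUID has moved to a different sdX name. `ls -l /dev/disk/by-uuid` or `lsblk -o NAME,UUID,LABEL` give much the same information.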

OK, thanks for the hint. Yes Leap is “alpha”. … but the problem(s) started in TW, so I thought I could switch channels and get around the TW problems . . . .

So far, no success . . . .

In this respect, Leap was ahead of Tumbleweed: it has been loading sd_mod as a module for some time. That’s what SUSE Enterprise does, and Leap is based on that. Tumbleweed was compiling sd_mod into the kernel for increased boot speed, but with everyone using faster hardware, that’s no longer necessary. I think a recent bug prompted Tumbleweed to change to modules as well. So moving from that recent TW to Leap just meant no change in what happens.
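To see which of the two cases a given kernel is in, something like the following sketch might do. The function name `is_builtin` is made up; it assumes the kernel's list of built-in modules lives in /lib/modules/<version>/modules.builtin, with one `.ko` path per line.

```shell
#!/bin/sh
# is_builtin: hypothetical helper for illustration.
# Succeeds (exit 0) if module $1 is listed in the modules.builtin file $2
# (default: the file belonging to the running kernel), fails otherwise.
is_builtin() {
    mod=$1
    list=${2:-/lib/modules/$(uname -r)/modules.builtin}
    grep -q "/${mod}\.ko\$" "$list"
}
```

For example, `is_builtin sd_mod && echo built-in || echo "module or absent"`. If sd_mod is not built in, the moment it gets loaded depends on the initrd and module configuration, which is what the work-arounds in this thread adjust.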

In the bug report, Martin Wilck explained why the following might be a better work around:

   echo sd_mod > /etc/modules-load.d/sd_mod.conf
   dracut -f --regenerate-all

This forces the early initialisation of sd_mod, whereas what I was previously doing just affected the loading of the ahci drivers. I think sd_mod is more general, so that might be worth a try.

You made a straightforward and robust suggestion!

However, in the long term users are better off following the suggestion of Martin Wilck:

Anyway, it has been mentioned multiple times in this discussion, but once more:

** don’t rely on /dev/sd* device names **

This has been unreliable for years, and you have simply been lucky if you haven’t seen issues so far. Closing as WONTFIX.
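As a rough way to find places that still rely on /dev/sd* names, here is a sketch. The function name `find_sd_refs` is hypothetical, and which files are worth scanning is an assumption; on openSUSE the usual candidates would be /etc/fstab, /etc/default/grub and /boot/grub2/grub.cfg.

```shell
#!/bin/sh
# find_sd_refs: hypothetical helper for illustration.
# Prints "file:line:text" for every non-comment line in the given files
# that mentions a /dev/sdX name. Exit status is 0 only if at least one
# such line was found.
find_sd_refs() {
    grep -HnE '/dev/sd[a-z]+[0-9]*' "$@" 2>/dev/null |
        grep -vE '^[^:]+:[0-9]+:[[:space:]]*#'
}
```

For example: `find_sd_refs /etc/fstab /etc/default/grub`. Any line it prints is a candidate for switching to `UUID=` or `LABEL=`.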

I object. UUID and LABEL belong to file systems, not to partitions.

ID and PATH can belong to partitions (but also to other containers).

PARTLABEL and PARTUUID belong to partitions, but exist only in GPT.

And here you were also incorrect:

It is on creation of a file system.
And you can also change the UUID on purpose on an existing file system.

I was already using labels. A few non fstab usages slipped between the cracks, fixed now.

The quoted work-around is just a nice to have - so the output of df, lsblk, and the rest aren’t too disorienting.

I don’t think I’ve been lucky. The decisions behind how the kernel used to be built removed any element of luck (at least for my simple desktop hardware and quite a few others too, it seems).

From a usability aspect, I do think it might be a bit unwise to move the kernel build from a relatively stable assignment of /dev/sd* to a totally unpredictable one. It’s going to break a lot of old setups, and old advice and examples. Then again, Sweden switched overnight from driving on one side of the road to the other. I guess we’ll survive this change too :slight_smile:

Labels are FS-associated, but UUIDs are only associated with partitions; they are created when you create a partition.

Again, the sdX naming is strictly a BIOS job, not a kernel job, and depends on the order in which the BIOS finds the drives.

Nope. But saying yes - no - yes - no is senseless. Thus I will stop with this.

Well, you may try it yourself.

Use some mass-storage device with unpartitioned space. Use e.g. fdisk to create a partition. Then check if there is a /dev/disk/by-uuid symlink for that partition.

And to top that off, use e.g. mke2fs to create an ext2 (or ext3 or ext4) file system in that partition and check again for the UUID.
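For those without a spare device, a variation of the same experiment can be sketched on a scratch image file. This is an assumption-laden stand-in: the empty image plays the role of the freshly created partition, the names `fs_uuid` and `uuid_demo` are made up, and it needs `blkid`, `mke2fs` and `tune2fs` installed. No real disk is touched.

```shell
#!/bin/sh
# fs_uuid: print the file system UUID found on a device or image file,
# or nothing if no file system signature is present.
fs_uuid() {
    blkid -p -s UUID -o value "$1" 2>/dev/null || true
}

# uuid_demo: hypothetical walk-through on a temporary 16 MiB image file.
# Prints the UUID before mkfs, after mkfs, and after changing it on purpose.
uuid_demo() {
    img=$(mktemp) || return 1
    truncate -s 16M "$img"
    echo "before mkfs: $(fs_uuid "$img")"
    mke2fs -q -F -t ext2 "$img" >/dev/null 2>&1 || { rm -f "$img"; return 1; }
    echo "after mkfs: $(fs_uuid "$img")"
    tune2fs -U random "$img" >/dev/null 2>&1 || { rm -f "$img"; return 1; }
    echo "after tune2fs -U random: $(fs_uuid "$img")"
    rm -f "$img"
}
```

Calling `uuid_demo` should show an empty first line, a UUID once the file system exists, and a different UUID after `tune2fs -U random`, which also covers the earlier point that the UUID of an existing file system can be changed on purpose.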
