OS 13.2: Hibernate (aka suspend-to-disk) doesn't work when booting with "Advanced options"

When booting OS 13.2 with any of the “Advanced options”, that is with anything but the latest kernel installed (3.17.6 on this box), and trying to suspend to disk, my laptop ends up with an incorrect grubenv like the following:


# GRUB Environment Block
next_entry=Advanced options for openSUSE>openSUSE, with Linux 3.16.7-7-desktop (recovery mode)
##############################################################################################

Not surprisingly, the next boot is in “recovery mode”. The “next_entry” in grubenv is never reset, so every boot after that goes the same way, without the GRUB menu (understandably), until grubenv is cleared manually.

When originally booting with the first “default” “openSUSE” menu option (that is, with the latest kernel installed), everything works as expected.
So that is not related to HW, swap or the usual troubles with hibernation.
My guess is that some script miscounts the entries in the GRUB configuration, landing one entry past the correct one when it writes grubenv on suspend, and therefore picks the “(recovery mode)” entry, which causes the trouble described above.

Please, systemd experts, point me to the culprit script or whatever, or even to an existing bug report if there is one (a quick search didn’t find this specific trouble…).
To be clear, I need help to submit a to-the-point bug report if this behaviour is not already known.
I seldom use suspend-to-disk, and discovered this just by chance while testing the recently updated kernel, so I don’t really need to fix my system.

Do you have pm-utils installed?
Since the latest systemd update, pm-utils is used again for hibernate/suspend, just as in 13.1.
The responsible script would then be /usr/lib/pm-utils/sleep.d/99Zgrub.

If pm-utils is not installed, the work of setting/resetting the next boot entry is done by /usr/bin/systemd-sleep-grub.

But the problem might also lie in grub2-once which is called to actually set the boot entry.

I’m not aware of a bug report about this…

A small followup:
I tried it here and can reproduce the issue, regardless of whether pm-utils is installed or not (not surprising in the end, as the systemd script is based on the pm-utils script, and the part that finds the currently booted entry in particular is exactly the same).

The scripts seem to miscount in some way. Here’s the debug output (from the pm-utils script; systemd-sleep-grub says basically the same though):

INFO: running prepare-grub
  Skipping grub entry #3, because it has the noresume option
  Skipping grub entry #5, because it has the noresume option
  Skipping grub entry #6, because it has no root= option
  running kernel is grub menu entry 4 (vmlinuz-3.16.6-2-desktop)
  preparing boot-loader: selecting entry 4, kernel /boot/3.16.6-2-desktop
  grub-once:   saving original /boot/grub2/grubenv
  running '/usr/sbin/grub2-once 4'

while those are the menu entries:

# grub2-once --list
     0 openSUSE
     1 Advanced options for openSUSE>openSUSE, with Linux 3.16.7-7-desktop
     2 Advanced options for openSUSE>openSUSE, with Linux 3.16.7-7-desktop (recovery mode)
     3 Advanced options for openSUSE>openSUSE, with Linux 3.16.6-2-desktop
     4 Advanced options for openSUSE>openSUSE, with Linux 3.16.6-2-desktop (recovery mode)
     5 openSUSE Memtest
     6 Microsoft Windows XP Professional (on /dev/sda1)

(I booted #3 here)

Of course #3 and #5 do not have the noresume option, but #2 and #4 do, so the script's numbering is off by one.
So there seems to be something wrong with parsing the grub.cfg file.

I investigated a bit more, and the issue seems to be caused by the “boot snapshot” feature. If I disable this the correct entry gets set.
And indeed, if “SUSE_BTRFS_SNAPSHOT_BOOTING” is true (in /etc/default/grub) the following is created in the menu header:

if [ -n "$extra_cmdline" ]; then
  submenu "Bootable snapshot #$snapshot_num" {
    menuentry "If OK, run 'snapper rollback $snapshot_num' and reboot." { true; }

The hibernate scripts look for the string "menuentry " to identify menu entries, so they definitely get confused by this, I’d say.

If you are running the default kernel, this works fine, as entry #0 and entry #1 are actually the same (#1 is the default kernel entry inside the “Advanced Options” submenu).
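To illustrate the miscount, here is a minimal sketch. The grub.cfg fragment is a hypothetical, trimmed-down stand-in (not copied from a real system), and count_naive is an illustrative helper, not the code from 99Zgrub or systemd-sleep-grub. It only mimics the matching on "menuentry " described above.

```shell
# Hypothetical, trimmed-down grub.cfg fragment (assumption, for illustration only).
sample_cfg() {
cat <<'EOF'
if [ -n "$extra_cmdline" ]; then
  submenu "Bootable snapshot #$snapshot_num" {
    menuentry "If OK, run 'snapper rollback $snapshot_num' and reboot." { true; }
  }
fi
menuentry 'openSUSE' {
}
submenu 'Advanced options for openSUSE' {
  menuentry 'openSUSE, with Linux 3.16.7-7-desktop' {
  }
  menuentry 'openSUSE, with Linux 3.16.7-7-desktop (recovery mode)' {
  }
}
EOF
}

# Count every line that starts with "menuentry " (read strips the indentation).
count_naive() {
    n=0
    while read -r line; do
        case $line in
        menuentry\ *) n=$((n + 1)) ;;
        esac
    done
    echo "$n"
}

sample_cfg | count_naive
```

On this sample the count is 4, although only three of those lines are real boot entries: the snapshot placeholder is counted as well, so every index after it is shifted by one.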

Thanks, good description of what I’m witnessing here too.
I have pm-utils installed but, as you confirm, this doesn’t matter apparently.

Well, you have caught the culprit, apparently. Let’s run a couple more tests tomorrow morning…
If you decide to file a bug report yourself, let me know: I’ll join for testing.

It should not use numbers at all. It should store either the menuentry id (preferably) or the menu entry name. Neither is affected by miscounting.
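A rough sketch of what looking up an id instead of a position could look like. find_entry_id is a hypothetical helper, and the sample entries and their ids are made up. It assumes the generated grub.cfg carries the id after a literal $menuentry_id_option on each menuentry line, and that the kernel release appears on the entry's linux line. How the id would then be handed to grub2-once is not covered here.

```shell
# Print the id of the first menuentry whose "linux" line mentions the given
# kernel release and has no "noresume" option. Reads grub.cfg text on stdin.
find_entry_id() {
    want=$1
    id=
    while read -r line; do
        case $line in
        menuentry\ *)
            # Remember the id of the entry being entered (empty if it has none).
            case $line in
            *"\$menuentry_id_option '"*)
                id=${line#*"\$menuentry_id_option '"}
                id=${id%%"'"*}
                ;;
            *) id= ;;
            esac
            ;;
        linux\ *|linuxefi\ *)
            case $line in
            *noresume*) ;;   # recovery-style entry, cannot resume
            *"$want"*) [ -n "$id" ] && { echo "$id"; return 0; } ;;
            esac
            ;;
        esac
    done
    return 1
}

# Usage with made-up entries; on a real system the input would be grub.cfg
# and the argument something like "$(uname -r)".
find_entry_id 3.16.6-2-desktop <<'EOF'
menuentry "If OK, run 'snapper rollback $snapshot_num' and reboot." { true; }
menuentry 'openSUSE, with Linux 3.16.7-7-desktop' $menuentry_id_option 'example-id-3.16.7' {
    linux /boot/vmlinuz-3.16.7-7-desktop root=/dev/sda2 resume=/dev/sda3
}
menuentry 'openSUSE, with Linux 3.16.6-2-desktop' $menuentry_id_option 'example-id-3.16.6' {
    linux /boot/vmlinuz-3.16.6-2-desktop root=/dev/sda2 resume=/dev/sda3
}
menuentry 'openSUSE, with Linux 3.16.6-2-desktop (recovery mode)' $menuentry_id_option 'example-id-3.16.6-recovery' {
    linux /boot/vmlinuz-3.16.6-2-desktop root=/dev/sda2 noresume
}
EOF
```

Because nothing is counted, the extra snapshot placeholder has no effect on the result.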

I found a fix so trivial that I blush >:) writing about it (and thanks Wolfi323 for pointing out the offending script).
I just added a single quote (escaped as \') in line 39 of /usr/bin/systemd-sleep-grub, from the original:


        case $LINE in
        menuentry\ *)
        let J++

to the fixed:


        case $LINE in
        menuentry\ \'*)
        let J++

That way the script skips the offending extra line


menuentry "If OK, run 'snapper rollback $snapshot_num' and reboot."

and the real boot line count is correct, making everybody happy.
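A small sketch of why the extra quote helps. count_quoted is an illustrative helper and the input lines are made up. It relies on the assumption that the generated real entries open their title with a single quote, while the snapshot placeholder uses a double quote.

```shell
# Count only "menuentry" lines whose title opens with a single quote,
# mirroring the one-character change described above.
count_quoted() {
    n=0
    while read -r line; do
        case $line in
        menuentry\ \'*) n=$((n + 1)) ;;
        esac
    done
    echo "$n"
}

# Hypothetical input: one double-quoted snapshot placeholder, two real entries.
printf '%s\n' \
    "menuentry \"If OK, run 'snapper rollback \$snapshot_num' and reboot.\" { true; }" \
    "menuentry 'openSUSE' {" \
    "menuentry 'openSUSE, with Linux 3.16.7-7-desktop' {" |
    count_quoted
```

This prints 2, since the double-quoted placeholder no longer matches. A hand-written entry with a double-quoted title would be skipped too, so the pattern is only as reliable as that quoting convention.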

To be clear, I uninstalled pm-utils; otherwise it seems that /usr/lib/pm-utils/sleep.d/99Zgrub should be amended as well.

The solution hinted at by arvidjaar seems more solid though…

Nice one! ;)
My first try at a “quick fix” would have been this (line #103 in 99Zgrub, or #37 in systemd-sleep-grub):

        while read LINE; do
            case $LINE in
            menuentry\ *snapper\ rollback*)
                # "filter out" this entry, it's not real...
                ;;
            menuentry\ *)
                let J++
                ;;

Works fine here as well.

The solution hinted at by arvidjaar seems more solid though…

Yes.
But probably also more complicated to implement.

Anyway, I’ll file a bug report later today and let the maintainers decide how to tackle this…

I reported it now:
https://bugzilla.opensuse.org/show_bug.cgi?id=911243