NFS shares unmounted after resume from suspend on Leap 15.4b

Hi all.

On Leap 15.4 beta, my NFS shares on my Synology NAS are unmounted after I resume from suspend. This does not happen on 15.3.
I configured my shares in fstab as follows:

192.168.1.16:/volumeUSB1/usbshare /media/share/Synology/usbshare1 nfsdefaults,nofail 0 0

Is this a bug or has it something to do with newer versions of software on 15.4 and these settings?

Are you using wicked or NetworkManager?

I configured my shares in fstab as follows:

192.168.1.16:/volumeUSB1/usbshare /media/share/Synology/usbshare1 nfsdefaults,nofail 0 0

This line cannot work; it looks like a space is missing somewhere.

I’m using NetworkManager. And something went wrong when copying and pasting from fstab. This is how it actually is:

192.168.1.16:/volumeUSB1/usbshare /media/share/Synology/usbshare1 nfs defaults,nofail 0 0

NetworkManager comes with a dispatcher script that unmounts the NFS filesystems listed in /etc/fstab when an interface goes down and mounts them when an interface comes up. This is /etc/NetworkManager/dispatcher.d/nfs. As a quick test, you could try moving this script out of its directory. Does it change anything?

It has mounted the shares after resuming from suspend. So that ‘seems’ to be a solution.
But it is strange that this script prevents the very thing it is written for.

I have compared the nfs file from a fresh 15.4 install with the one after an upgrade from 15.3, and they are identical.
So now I wonder is this script needed at all? Or is it broken and needs to be fixed. Because this has worked for me, maybe not for others?

No. It did not unmount them before suspend, they remained mounted.

Or is it broken and needs to be fixed.

Something else may have changed.

Put the script back, reboot, capture the output of “cat /proc/self/mountinfo”, suspend, resume, and then provide the full output of “journalctl -b” and “cat /proc/self/mountinfo” again (upload to https://susepaste.org).
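
To make the before/after comparison easier, the capture steps could be scripted roughly as below. This is only a sketch: the function name list_nfs_mounts and the file names in the usage comment are made up for illustration. It relies on the mountinfo layout, where field 5 is the mount point and the filesystem type follows the lone "-" separator.

```shell
#!/bin/sh
# Print the mount points of all NFS filesystems found in a mountinfo file.
# Fields 1-6 are fixed, optional fields follow, then "-" and the fs type.
# list_nfs_mounts is a hypothetical helper name, not an existing tool.
list_nfs_mounts() {
    awk '{
        for (i = 7; i <= NF; i++) {
            if ($i == "-") {
                if ($(i + 1) ~ /^nfs4?$/) print $5
                break
            }
        }
    }' "${1:-/proc/self/mountinfo}"
}

# Usage (file names are arbitrary):
#   cat /proc/self/mountinfo > mountinfo.before
#   ... suspend, resume ...
#   cat /proc/self/mountinfo > mountinfo.after
#   journalctl -b > journal.txt
#   list_nfs_mounts mountinfo.before
#   list_nfs_mounts mountinfo.after
```

If the second call prints fewer mount points than the first, the shares were unmounted somewhere between the two captures.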

Ok here are the files:

Mountinfo after reboot: SUSE Paste
Journalctl -b after reboot: SUSE Paste

Mountinfo after resume: SUSE Paste
Journalctl -b after resume: SUSE Paste

This looks like an inherent race condition.

NetworkManager dispatcher up/down scripts are run asynchronously. So what apparently happens is that a script started on interface down continues to run and unmounts the filesystems after resume. Here is the log from a trivial script that just adds some delay:

May 23 18:24:15 bor-Latitude-E5450 nm-dispatcher[210825]: req:2 'down' [wlan0], "/etc/NetworkManager/dispatcher.d/30-test": run script
...
May 23 18:24:25 bor-Latitude-E5450 kernel: ACPI: Waking up from system sleep state S3
...
May 23 18:24:25 bor-Latitude-E5450 nm-dispatcher[210825]: req:2 'down' [wlan0], "/etc/NetworkManager/dispatcher.d/30-test": complete
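
The test script itself is not shown above. A stand-in that behaves similarly might look like the sketch below. It is not the original 30-test script; the DELAY variable and the messages are assumptions made for illustration. The only thing taken as given is that NetworkManager passes the interface name as the first argument and the action as the second.

```shell
#!/bin/sh
# Hypothetical stand-in for a delay script such as 30-test (not the original).
# $1 = interface name, $2 = action (up, down, pre-up, pre-down, ...).
handle_event() {
    iface=$1
    action=$2
    case "$action" in
        down)
            # Messages go to stdout; use logger(1) instead to get them
            # into the journal.
            echo "down on $iface: start"
            sleep "${DELAY:-10}"   # DELAY is an assumed knob, in seconds
            echo "down on $iface: done"
            ;;
    esac
}

handle_event "$@"
```

If suspend happens while the sleep is running, the rest of the "down" branch only executes after resume, which is the same ordering the log shows.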

You can debug this by adding the file /etc/NetworkManager/conf.d/50-debug.conf with this content

root@bor-Latitude-E5450:~# cat /etc/NetworkManager/conf.d/50-debug.conf
[logging]
domains=DISPATCH:TRACE
root@bor-Latitude-E5450:~# 

and restart NetworkManager.
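
Put together, the debug setup could be done along these lines. This is a sketch: the function name write_debug_conf is invented, the directory argument exists only so it can be tried on a scratch directory first, and the systemd unit names in the comments are assumed to match your installation.

```shell
#!/bin/sh
# Write the logging drop-in with the content shown above.
# write_debug_conf is a hypothetical helper name.
write_debug_conf() {
    dir=${1:-/etc/NetworkManager/conf.d}
    printf '[logging]\ndomains=DISPATCH:TRACE\n' > "$dir/50-debug.conf"
}

# As root (unit names are an assumption):
#   write_debug_conf
#   systemctl restart NetworkManager
#   journalctl -b -u NetworkManager -u NetworkManager-dispatcher
```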

If I am right, you could try adding the nfs script (just a link to the existing one) to the directory /etc/NetworkManager/dispatcher.d/pre-down.d. According to the documentation, scripts in this directory are executed synchronously, so NetworkManager should wait until the filesystems are unmounted before suspend.

On second thought, this will run the script twice, as pre-down and as down, which is probably not what we want. Try moving the script into pre-down.d and additionally linking it into pre-up.d. See “man NetworkManager” for details. It is also possible to leave the script in place, but then the script needs editing: remove “down” from the case condition.

I don’t think I understand what you want me to do.
From what I think I understand:

  • I should make a directory called ‘predown.d’ in /etc/NetworkManager/dispatcher.d/

  • Then make a script in predown.d with this content:

May 23 18:24:15 bor-Latitude-E5450 nm-dispatcher[210825]: req:2 'down' [wlan0], "/etc/NetworkManager/dispatcher.d/30-test": run script
...
May 23 18:24:25 bor-Latitude-E5450 kernel: ACPI: Waking up from system sleep state S3
...
May 23 18:24:25 bor-Latitude-E5450 nm-dispatcher[210825]: req:2 'down' [wlan0], "/etc/NetworkManager/dispatcher.d/30-test": complete

  • Then make a link to the original nfs file in predown.d
  • Then restart NetworkManager

Is that correct?
Also the nfs file stays in /etc/NetworkManager/dispatcher.d?
And what do the times/dates (May 23 18:24:15) mean in the script?
Is that script meant to suspend and resume at a specific time?

No. Create the directories /etc/NetworkManager/dispatcher.d/pre-up.d and /etc/NetworkManager/dispatcher.d/pre-down.d. Move the nfs script into one of them; link it into the other.

mv /etc/NetworkManager/dispatcher.d/nfs /etc/NetworkManager/dispatcher.d/pre-down.d
ln -s ../pre-down.d/nfs /etc/NetworkManager/dispatcher.d/pre-up.d
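
For completeness, here is the same move with the directory creation included, wrapped in a function so it can be tried on a scratch copy first. The function name relocate_nfs_script and its optional argument are illustrative assumptions, not part of NetworkManager.

```shell
#!/bin/sh
# Move the nfs dispatcher script into pre-down.d and link it into pre-up.d.
# The optional argument allows a dry run on a copy of the directory.
relocate_nfs_script() {
    base=${1:-/etc/NetworkManager/dispatcher.d}
    mkdir -p "$base/pre-up.d" "$base/pre-down.d" || return 1
    mv "$base/nfs" "$base/pre-down.d/nfs" || return 1
    # Relative link target, resolved from inside pre-up.d
    ln -s ../pre-down.d/nfs "$base/pre-up.d/nfs"
}

# As root:
#   relocate_nfs_script
#   ls -l /etc/NetworkManager/dispatcher.d/pre-up.d /etc/NetworkManager/dispatcher.d/pre-down.d
```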

Also the nfs file stays in /etc/NetworkManager/dispatcher.d?

No.

It worked!
Now is this ‘THE’ solution? Or should the nfs script be fixed and this bug(?) be reported?

Anyway, I want to thank you for your time and effort.
Besides this, I had zero problems upgrading from 15.3 to 15.4.
15.4 is rock solid already! :slight_smile:

For all I can tell - yes. You should open a bug report on https://bugzilla.opensuse.org (same user account as here). You should also collect logs with trace level using the original nfs script location (which demonstrates the problem), as I told you, and attach them to this bug report.

I have no idea what ‘collecting logs with trace level’ means.
You wrote before that I could debug this by adding the file /etc/NetworkManager/conf.d/50-debug.conf with content

root@bor-Latitude-E5450:~# cat /etc/NetworkManager/conf.d/50-debug.conf
[logging]
domains=DISPATCH:TRACE
root@bor-Latitude-E5450:~#

Now, after I have done that, should I run “cat /proc/self/mountinfo” and “journalctl -b” again after a reboot and after resume from suspend, and add the output of those commands to the bug report?
Would that be enough?

Should be. But you could first provide them here to check whether they contain necessary information to demonstrate the root cause of this bug.

Well, I found out that the ‘solution’ only sometimes works. I will try to get the logs when it doesn’t work again. But at the moment I’m running a little short on time. I can still continue my work after I do “mount -a”.