Problems with unmounting /home

raveland · April 1, 2015, 10:02pm

Hello,

I’ve got 3 machines running openSUSE 13.2. Probably since recent updates, two of the three always fail to unmount the /home partition during shutdown/reboot, so on the next boot the system starts in emergency mode. I haven’t restarted the third machine yet, but chances are high it will behave the same.

After some investigation I’ve found that the shutdown process, which is managed by systemd, is not killing ssh sessions, and these keep /home busy. So umount is unable to unmount it properly.

Basically, “shutdown -r +1 && exit” kinda works, but I don’t think it’s normal.
Previously, shutting down with active ssh sessions worked fine.

What is broken?
Where can I look further?

Thanks

gogalthorp · April 1, 2015, 10:14pm

OK, shutting down implies everything is unmounted. Mounts do not survive a power-off. So something else is going on if a partition does not mount at start-up. If a partition in fstab fails to mount at boot, then the boot fails completely. A corrupted file system may cause this, but then the boot fails. So how do you know that home is not mounted?

raveland · April 1, 2015, 10:36pm

Hello,

Debug info from journald:

Apr 01 16:27:26 mongo umount[2134]: umount: /home: target is busy
Apr 01 16:27:26 mongo umount[2134]: (In some cases useful info about processes that
Apr 01 16:27:26 mongo umount[2134]: use the device is found by lsof(8) or fuser(1).)
Apr 01 16:27:26 mongo systemd[1]: Failed unmounting /home.

And just before shutdown:

mongo:/var/log/journal # lsof /home
COMMAND  PID USER   FD   TYPE DEVICE SIZE/OFF     NODE NAME
bash    4350 john cwd    DIR  254,4     4096 60424193 /home/john
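
A small sketch, in case it helps to boil the lsof output above down to "who is holding the mount". The function name holders is made up for illustration. It only reformats text and assumes the default lsof column order shown above (COMMAND, PID, USER) and command names without spaces.

```shell
# Hypothetical helper: reduce `lsof <mountpoint>` output to "PID USER COMMAND".
# Reads lsof output on stdin and skips the header line.
holders() {
    awk 'NR > 1 { print $2, $3, $1 }' | sort -u
}

# Possible use just before shutdown (as root):
#   lsof /home | holders
```

Each PID printed can then be checked against the open ssh sessions to confirm that a login shell is what keeps /home busy.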

raveland · April 1, 2015, 11:42pm

Another machine produces somewhat different output:

Apr 02 00:32:34 dev3 SuSEfirewall2[16731]: Not unloading firewall rules at system shutdown
Apr 02 00:32:34 dev3 postfix/postfix-script[16739]: stopping the Postfix mail system
Apr 02 00:32:34 dev3 postfix/master[1385]: terminating on signal 15
Apr 02 00:32:34 dev3 wickedd-dhcp4[746]: eth0: Request to release DHCPv4 lease with UUID cafe1b55-42d3-0300-ed02-000005000000
Apr 02 00:32:34 dev3 wickedd-dhcp6[747]: eth0: no lease set
Apr 02 00:32:35 dev3 sh[16702]: /bin/sh: /bin/killall: No such file or directory
Apr 02 00:32:35 dev3 wicked[16747]: eth0            device-ready
Apr 02 00:32:39 dev3 haveged[478]: haveged: Stopping due to signal 15
Apr 02 00:32:39 dev3 haveged[478]: haveged starting up
Apr 02 00:32:39 dev3 umount[16919]: umount: /var/log: target is busy
Apr 02 00:32:39 dev3 umount[16919]: (In some cases useful info about processes that
Apr 02 00:32:39 dev3 dracut-initramfs-restore[16900]: gzip: stdin: not in gzip format
Apr 02 00:32:39 dev3 dracut-initramfs-restore[16900]: cpio: premature end of archive
Apr 02 00:32:39 dev3 umount[16919]: use the device is found by lsof(8) or fuser(1).)
Apr 02 00:32:39 dev3 kernel: watchdog watchdog0: watchdog did not stop!
Apr 02 00:32:40 dev3 systemd-journal[450]: Journal stopped
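
Since the two machines complain about different mount points, a rough sketch for pulling just the busy mount points out of a journal dump may be useful. The function name busy_mounts is an invention for this example, and the pattern assumes the exact umount wording seen in the logs above ("umount: <path>: target is busy").

```shell
# Hypothetical helper: print each mount point that umount reported as busy.
# Reads journal text on stdin, one log entry per line.
busy_mounts() {
    sed -n 's/.*umount: \(.*\): target is busy.*/\1/p' | sort -u
}

# Possible use against the previous boot's journal
# (assumes the journal is persistent, i.e. /var/log/journal exists):
#   journalctl -b -1 --no-pager | busy_mounts
```

It only filters text, so it can also be run over a log excerpt saved from another machine.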

gogalthorp · April 2, 2015, 3:48am

Even if the partition is not unmounted, it should not cause emergency mode at startup. If it did, a power failure would cause that too, and it does not, at least for most people. So I’d say it may be a configuration problem, since it happens on at least 2 machines. Others are not reporting this, so we need to know how the machines are set up.

Start with the repositories:

zypper lr -d

then

cat /etc/fstab

Are you running any services that are not out of the box?

Are these servers or desktops or what?

Which file systems?
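
For the file system question, a minimal sketch that condenses an fstab into "mount point, file system type" pairs, so the answer is short enough to paste here. The function name fstab_summary is hypothetical, and it assumes the usual whitespace-separated fstab fields (device, mount point, type, ...).

```shell
# Hypothetical helper: print "mountpoint fstype" for each fstab entry.
# Reads fstab text on stdin; comment lines and blank lines are skipped.
fstab_summary() {
    awk '!/^[ \t]*(#|$)/ { print $2, $3 }'
}

# Possible use:
#   fstab_summary < /etc/fstab
```

The full `cat /etc/fstab` output is still worth posting, since mount options can matter as well.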