Very slow reboot or shutdown

purevw · July 23, 2021, 8:07pm

I had a failure of ending Session 2 of my user and a 1:29 wait. The resulting shutdown.txt file is 8400 lines of text.

Using the journalctl command I get:


**RYZEN9:~ #** journalctl -b -1 -u *.target
-- Logs begin at Sat 2021-07-17 14:12:25 CDT, end at Fri 2021-07-23 13:02:42 CDT. --
-- No entries --

which makes no sense since the journal file is 56MB in size. This is the complete path the system made for the journal after I created the directory that was specified:

/var/log/journal/3186ca150be445b6a880d1695718cde1/

arvidjaar · July 24, 2021, 8:57am

Show full output of “journalctl -b -1”, not arbitrary filtered content.

tannington · July 24, 2021, 10:27am

And… did you look at the file?

The test one I created on this system (ref post #19) was 7909 lines.

Open the file in a text editor and go to the last line, although the file is large it’s quite easy to quickly scroll back watching the times at the beginning of each line to find the point at which the shutdown pauses. Remember, you’re looking for a 90 second gap, so it should be fairly obvious, even when scrolling quickly.

What was Session 2 of your user doing? Perhaps you should have terminated that first before closing the system down?

tannington · July 24, 2021, 10:43am

Additionally, you may find this thread informative:

https://unix.stackexchange.com/questions/273876/a-stop-job-is-running-for-session-c2-of-user

purevw · July 24, 2021, 4:16pm

All files within the journal directory are in machine language and not viewable in a text editor.

I never open a second session under my user. It is something that the system is doing on it’s own. I thought that it had something to so with using “su”, but I did not use “su” last night and still got the Session 2 shutdown delay.

The shutdown.txt file content has no time stamps. I’m currently looking at the journalctl content for clues.

purevw · July 24, 2021, 4:51pm

Using journalctl -b -1 I got a very long output. From what I see, “kdesud” is the culprit in this case:


Jul 24 00:42:54 RYZEN9 systemd-logind[1431]: Got message type=signal sender=:1.0 destination=n/a path=/org/freedeskto>
Jul 24 00:43:48 RYZEN9 systemd[1]: systemd-timesyncd.service: Got notification message from PID 1325 (WATCHDOG=1)
Jul 24 00:43:48 RYZEN9 systemd[1]: systemd-journald.service: Got notification message from PID 911 (WATCHDOG=1)
Jul 24 00:44:23 RYZEN9 systemd-logind[1431]: Got message type=signal sender=:1.0 destination=n/a path=/org/freedeskto>
Jul 24 00:44:23 RYZEN9 systemd-logind[1431]: Got message type=signal sender=:1.0 destination=n/a path=/org/freedeskto>
Jul 24 00:44:23 RYZEN9 systemd-logind[1431]: Sent message type=method_call sender=n/a destination=org.freedesktop.sys>
Jul 24 00:44:23 RYZEN9 systemd[3308]: Received SIGTERM from PID 1 (systemd).
Jul 24 00:44:23 RYZEN9 systemd[3308]: Activating special unit exit.target
Jul 24 00:44:23 RYZEN9 systemd[1]: session-2.scope: Stopping timed out. Killing.
Jul 24 00:44:23 RYZEN9 systemd[1]: session-2.scope: Killing process 3838 (kdesud) with signal SIGKILL.
Jul 24 00:44:23 RYZEN9 systemd[1]: session-2.scope: Failed with result 'timeout'.
Jul 24 00:44:23 RYZEN9 systemd[1]: session-2.scope changed stop-sigterm -> failed
Jul 24 00:44:23 RYZEN9 systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1>
Jul 24 00:44:23 RYZEN9 systemd[1]: session-2.scope: Failed to destroy cgroup /user.slice/user-1000.slice/session-2.sc>
Jul 24 00:44:23 RYZEN9 systemd[1]: session-2.scope: Job 3946 session-2.scope/stop finished, result=done
Jul 24 00:44:23 RYZEN9 systemd[1]: Stopped Session 2 of user ronnie.

I did not use a “su” terminal last night that I remember, but I always use Yast Online Update immediately after every boot.

This may only address half of the problem. If I get a shutdown delay for my user (not Session 2), it is usually after I have been accessing my larger hard drives, even if I shut down 30 minutes after gkrellm shows that any file moves have completed. My primary concern with that is that twice now I have lost the mappings to a large drive upon shutdown and have to manually re-create the Luks mapping after the next boot. My primary concern is possible data loss if the drive is being forcibly shut down. The drives in question are several Terabytes in size.
The next time I get a shutdown error that is not Session 2, I will go back and search for the cause.

tannington · July 24, 2021, 6:17pm

I was referring to “shutdown-log.txt”, not the journal.

I never open a second session under my user. It is something that the system is doing on it’s own. I thought that it had something to so with using “su”, but I did not use “su” last night and still got the Session 2 shutdown delay.

OK - Then the shutdown-log could contain the answers…

The shutdown.txt file content has no time stamps. I’m currently looking at the journalctl content for clues.

Really ?? - Then something is very amiss in it’s creation. Again from the test one I created on this system (ref post #19):

paul@Orion-15:~$ cd /
paul@Orion-15:/$ ls
bin   dev  home  lib64       mnt  proc  run   shutdown-log.txt  sys  usr
boot  etc  lib   lost+found  opt  root  sbin  srv               tmp  var
paul@Orion-15:/$ tail -n 20 shutdown-log.txt
  201.639555] systemd-shutdown[1]: Detaching loop devices.
  201.639572] systemd-shutdown[1]: sd-device-enumerator: Scan all dirs
  201.639599] systemd-shutdown[1]: sd-device-enumerator: Scanning /sys/bus
  201.639629] systemd-shutdown[1]: sd-device-enumerator: Scanning /sys/class
  201.639683] systemd-shutdown[1]: All loop devices detached.
  201.639686] systemd-shutdown[1]: Stopping MD devices.
  201.639692] systemd-shutdown[1]: sd-device-enumerator: Scan all dirs
  201.639705] systemd-shutdown[1]: sd-device-enumerator: Scanning /sys/bus
  201.639727] systemd-shutdown[1]: sd-device-enumerator: Scanning /sys/class
  201.639766] systemd-shutdown[1]: All MD devices stopped.
  201.639770] systemd-shutdown[1]: Detaching DM devices.
  201.639776] systemd-shutdown[1]: sd-device-enumerator: Scan all dirs
  201.639788] systemd-shutdown[1]: sd-device-enumerator: Scanning /sys/bus
  201.639808] systemd-shutdown[1]: sd-device-enumerator: Scanning /sys/class
  201.639846] systemd-shutdown[1]: All DM devices detached.
  201.639851] systemd-shutdown[1]: All filesystems, swaps, loop devices, MD devices and DM devices detached.
  201.640131] systemd-shutdown[1]: Successfully forked off '(sd-executor)' as PID 1593.
  201.662719] [1593]: Successfully forked off '(direxec)' as PID 1594.
  201.663171] [1593]: Successfully forked off '(direxec)' as PID 1595.
  201.853498] EXT4-fs (sdc1): re-mounted. Opts: acl,user_xattr. Quota mode: none.
paul@Orion-15:/$

purevw · July 25, 2021, 5:12am

tannington:

I was referring to “shutdown-log.txt”, not the journal.

OK - Then the shutdown-log could contain the answers…

Really ?? - Then something is very amiss in it’s creation. Again from the test one I created on this system (ref post #19):

paul@Orion-15:~$ cd /
paul@Orion-15:/$ ls
bin   dev  home  lib64       mnt  proc  run   shutdown-log.txt  sys  usr
boot  etc  lib   lost+found  opt  root  sbin  srv               tmp  var
paul@Orion-15:/$ tail -n 20 shutdown-log.txt
  201.639555] systemd-shutdown[1]: Detaching loop devices.
  201.639572] systemd-shutdown[1]: sd-device-enumerator: Scan all dirs
  201.639599] systemd-shutdown[1]: sd-device-enumerator: Scanning /sys/bus
  201.639629] systemd-shutdown[1]: sd-device-enumerator: Scanning /sys/class
  201.639683] systemd-shutdown[1]: All loop devices detached.
  201.639686] systemd-shutdown[1]: Stopping MD devices.
  201.639692] systemd-shutdown[1]: sd-device-enumerator: Scan all dirs
  201.639705] systemd-shutdown[1]: sd-device-enumerator: Scanning /sys/bus
  201.639727] systemd-shutdown[1]: sd-device-enumerator: Scanning /sys/class
  201.639766] systemd-shutdown[1]: All MD devices stopped.
  201.639770] systemd-shutdown[1]: Detaching DM devices.
  201.639776] systemd-shutdown[1]: sd-device-enumerator: Scan all dirs
  201.639788] systemd-shutdown[1]: sd-device-enumerator: Scanning /sys/bus
  201.639808] systemd-shutdown[1]: sd-device-enumerator: Scanning /sys/class
  201.639846] systemd-shutdown[1]: All DM devices detached.
  201.639851] systemd-shutdown[1]: All filesystems, swaps, loop devices, MD devices and DM devices detached.
  201.640131] systemd-shutdown[1]: Successfully forked off '(sd-executor)' as PID 1593.
  201.662719] [1593]: Successfully forked off '(direxec)' as PID 1594.
  201.663171] [1593]: Successfully forked off '(direxec)' as PID 1595.
  201.853498] EXT4-fs (sdc1): re-mounted. Opts: acl,user_xattr. Quota mode: none.
paul@Orion-15:/$

I guess it was my mistake. I do not see time stamps. I would call those index numbers and have no idea how to interpret “minutes” and “seconds”. I do note a difference between your output and mine. Mine was never able to finalize the final DM device and apparently forced it.


 3238.254689] systemd-shutdown[1]: All filesystems unmounted.
 3238.254691] systemd-shutdown[1]: Deactivating swaps.
 3238.254715] systemd-shutdown[1]: All swaps deactivated.
 3238.254717] systemd-shutdown[1]: Detaching loop devices.
 3238.254726] systemd-shutdown[1]: sd-device-enumerator: Scan all dirs
 3238.254742] systemd-shutdown[1]: sd-device-enumerator: Scanning /sys/bus
 3238.254766] systemd-shutdown[1]: sd-device-enumerator: Scanning /sys/class
 3238.254816] systemd-shutdown[1]: All loop devices detached.
 3238.254817] systemd-shutdown[1]: Detaching DM devices.
 3238.254820] systemd-shutdown[1]: sd-device-enumerator: Scan all dirs
 3238.254825] systemd-shutdown[1]: sd-device-enumerator: Scanning /sys/bus
 3238.254837] systemd-shutdown[1]: sd-device-enumerator: Scanning /sys/class
 3238.254987] systemd-shutdown[1]: Not all DM devices detached, 1 left.
 3238.255104] systemd-shutdown[1]: Detaching DM devices.
 3238.255107] systemd-shutdown[1]: sd-device-enumerator: Scan all dirs
 3238.255113] systemd-shutdown[1]: sd-device-enumerator: Scanning /sys/bus
 3238.255126] systemd-shutdown[1]: sd-device-enumerator: Scanning /sys/class
 3238.255249] systemd-shutdown[1]: Not all DM devices detached, 1 left.
 3238.255251] systemd-shutdown[1]: Cannot finalize remaining DM devices, continuing.
 3238.255395] systemd-shutdown[1]: Successfully forked off '(sd-executor)' as PID 11207.
 3238.256992] [11207]: Successfully forked off '(direxec)' as PID 11208.
 3238.257147] [11207]: Successfully forked off '(direxec)' as PID 11209.
 3238.257296] [11207]: Successfully forked off '(direxec)' as PID 11210.
 3238.263967] EXT4-fs (dm-0): re-mounted. Opts: (null)

I would note that my / is encrypted, which may be the DM that couldn’t be finalized because it was being written to.

arvidjaar · July 25, 2021, 9:38am

Default dmesg timestamp is in seconds since boot (this does not include time when system was suspended).

tannington · July 26, 2021, 6:35pm

OK. Time stamp or index number, you need to look further back in the shutdown-log.txt to find the point at which the 90 sec timeout begins, look at what is happening in the log for several lines immediately prior to that for any possible clues.

purevw · August 1, 2021, 5:21pm

The only pre-timeout failure is the package “pidgin” and it is always the same:


 3145.197645] systemd[1]: Received SIGCHLD from PID 3606 (pidgin).
 3145.197658] systemd[1]: Child 3606 (pidgin) died (code=exited, status=1/FAILURE)
 3145.197688] systemd[1]: session-2.scope: Failed to read oom_kill field of memory.events cgroup attribute: No such file or directory
 3145.197690] systemd[1]: session-2.scope: Child 3606 belongs to session-2.scope.

The code shows that the process died, so I am not sure of it could be causing the timeout. I also do not know why pidgin is under Session-2, unless pidgin is spawning Session-2 if I open an active chat window.
I will verify the theory shortly. I do not run pidgin as SU, but I do have it set to autostart with KDE upon login.

I have saved 3 of the shutdown.txt files and they all show the same at the end of the time-out. It is basically the same information as the journalctl command gave me:

From failure 1:

 3235.494506] systemd[1]: session-2.scope: Stopping timed out. Killing.
 3235.494779] systemd[1]: session-2.scope: Killing process 3937 (kdesud) with signal SIGKILL.
 3235.494831] systemd[1]: session-2.scope: Failed with result 'timeout'.
 3235.494838] systemd[1]: session-2.scope changed stop-sigterm -> failed


From failure 2:


 1871.515893] systemd[1]: session-2.scope: Stopping timed out. Killing.
 1871.516037] systemd[1]: session-2.scope: Killing process 3833 (kdesud) with signal SIGKILL.
 1871.516094] systemd[1]: session-2.scope: Failed with result 'timeout'.
 1871.516103] systemd[1]: session-2.scope changed stop-sigterm -> failed


From failure 3:


 6451.547019] systemd[1]: systemd-timesyncd.service: Got notification message from PID 1364 (WATCHDOG=1)
 6472.546754] systemd[1]: session-2.scope: Stopping timed out. Killing.
 6472.546901] systemd[1]: session-2.scope: Killing process 2894 (kdesud) with signal SIGKILL.
 6472.546961] systemd[1]: session-2.scope: Failed with result 'timeout'.
 6472.546969] systemd[1]: session-2.scope changed stop-sigterm -> failed

It seems that kdesud may be a problem. My question is, what is causing Session-2 of my user to start in the first place? On the third failure, the only package I ran as root was Yast2.Online Update. It was opened by the system icon and not by a terminal. That is the only time I used the admin password during that session.

karlmistelberger · August 1, 2021, 5:46pm

Ronnie Bailey:

The only pre-timeout failure is the package “pidgin” and it is always the same:
 3145.197645] systemd[1]: Received SIGCHLD from PID 3606 (pidgin).
 3145.197658] systemd[1]: Child 3606 (pidgin) died (code=exited, status=1/FAILURE)
 3145.197688] systemd[1]: session-2.scope: Failed to read oom_kill field of memory.events cgroup attribute: No such file or directory
 3145.197690] systemd[1]: session-2.scope: Child 3606 belongs to session-2.scope.
The code shows that the process died, so I am not sure of it could be causing the timeout. I also do not know why pidgin is under Session-2, unless pidgin is spawning Session-2 if I open an active chat window.
I will verify the theory shortly. I do not run pidgin as SU, but I do have it set to autostart with KDE upon login.

I have saved 3 of the shutdown.txt files and they all show the same at the end of the time-out. It is basically the same information as the journalctl command gave me:
From failure 1:

 3235.494506] systemd[1]: session-2.scope: Stopping timed out. Killing.
 3235.494779] systemd[1]: session-2.scope: Killing process 3937 (kdesud) with signal SIGKILL.
 3235.494831] systemd[1]: session-2.scope: Failed with result 'timeout'.
 3235.494838] systemd[1]: session-2.scope changed stop-sigterm -> failed


From failure 2:


 1871.515893] systemd[1]: session-2.scope: Stopping timed out. Killing.
 1871.516037] systemd[1]: session-2.scope: Killing process 3833 (kdesud) with signal SIGKILL.
 1871.516094] systemd[1]: session-2.scope: Failed with result 'timeout'.
 1871.516103] systemd[1]: session-2.scope changed stop-sigterm -> failed


From failure 3:


 6451.547019] systemd[1]: systemd-timesyncd.service: Got notification message from PID 1364 (WATCHDOG=1)
 6472.546754] systemd[1]: session-2.scope: Stopping timed out. Killing.
 6472.546901] systemd[1]: session-2.scope: Killing process 2894 (kdesud) with signal SIGKILL.
 6472.546961] systemd[1]: session-2.scope: Failed with result 'timeout'.
 6472.546969] systemd[1]: session-2.scope changed stop-sigterm -> failed 
It seems that kdesud may be a problem. My question is, what is causing Session-2 of my user to start in the first place? On the third failure, the only package I ran as root was Yast2.Online Update. It was opened by the system icon and not by a terminal. That is the only time I used the admin password during that session.

On my machine there is a unit session-2.scope too:

**erlangen:~ #** systemctl list-units --type scope  
  UNIT            LOAD   ACTIVE SUB     DESCRIPTION               
  init.scope      loaded active running System and Service Manager 
  session-2.scope loaded active running Session 2 of user karl 

LOAD   = Reflects whether the unit definition was properly loaded. 
ACTIVE = The high-level unit activation state, i.e. generalization of SUB. 
SUB    = The low-level unit activation state, values depend on unit type. 
**2 loaded units listed.** Pass --all to see loaded but inactive units, too. 
To show all installed unit files use 'systemctl list-unit-files'. 
**erlangen:~ #**

In the first place it’s me who is creating this unit by logging in.

purevw · August 2, 2021, 2:09am

karlmistelberger:

On my machine there is a unit session-2.scope too:

**erlangen:~ #** systemctl list-units --type scope  
  UNIT            LOAD   ACTIVE SUB     DESCRIPTION               
  init.scope      loaded active running System and Service Manager 
  session-2.scope loaded active running Session 2 of user karl 

LOAD   = Reflects whether the unit definition was properly loaded. 
ACTIVE = The high-level unit activation state, i.e. generalization of SUB. 
SUB    = The low-level unit activation state, values depend on unit type. 
**2 loaded units listed.** Pass --all to see loaded but inactive units, too. 
To show all installed unit files use 'systemctl list-unit-files'. 
**erlangen:~ #**

In the first place it’s me who is creating this unit by logging in.

I was not aware. Mine shows the same immediately upon login. I ran the systemctl command as a regular user, but the output was identical.
The question remains. What is causing the 90 second pause? “pidgin” is reported to die at the beginning of the pause. “kdesud” is killed immediately at the end of the pause / timeout.
I can confirm that in my last session, I did not open a chat window with pidgin and although pidgin is always running in the background, shutdown was normal. I am chatting via pidgin now. Once done, I will close the chat window, and then exit pidgin before shutdown. Perhaps shutdown will be normal if pidgin is no longer running in the background during shutdown. Possibly there is a bug in pidgin, but I would not know what to report, other than the slow shutdown.
I am installing psi+ and will try it as an alternative to pidgin. I will also stop pidgin from starting at login.

karlmistelberger · August 2, 2021, 7:17am

Ronnie Bailey:

I was not aware. Mine shows the same immediately upon login. I ran the systemctl command as a regular user, but the output was identical.
The question remains. What is causing the 90 second pause? “pidgin” is reported to die at the beginning of the pause. “kdesud” is killed immediately at the end of the pause / timeout.
I can confirm that in my last session, I did not open a chat window with pidgin and although pidgin is always running in the background, shutdown was normal. I am chatting via pidgin now. Once done, I will close the chat window, and then exit pidgin before shutdown. Perhaps shutdown will be normal if pidgin is no longer running in the background during shutdown. Possibly there is a bug in pidgin, but I would not know what to report, other than the slow shutdown.
I am installing psi+ and will try it as an alternative to pidgin. I will also stop pidgin from starting at login.

Prerequisite for debugging is a persistent journal as discussed earlier. Preferably run the commands as user root:

Startup of user session in previous boot (-b -1):

**i3-4130:~ #** journalctl -b -1 -o short-monotonic -u user@1000.service -g Start 
-- Journal begins at Thu 2021-07-29 10:51:52 CEST, ends at Mon 2021-08-02 07:06:10 CEST. -- 
    4.985405] i3-4130 systemd[1]: Starting User Manager for UID 1000... 
    5.101950] i3-4130 systemd[1023]: Started Timer for profile-sync-daemon - 1Hour. 
    5.102292] i3-4130 systemd[1023]: Starting D-Bus User Message Bus Socket. 
    5.103687] i3-4130 systemd[1]: Started User Manager for UID 1000. 
    5.105348] i3-4130 systemd[1023]: Starting Profile-sync-daemon... 
    5.140661] i3-4130 systemd[1023]: Started D-Bus User Message Bus. 
    5.199344] i3-4130 systemd[1023]: Starting Timed resync... 
    6.233307] i3-4130 systemd[1023]: Starting KDE Global Shortcuts Server... 
    6.289597] i3-4130 systemd[1023]: Started KDE Global Shortcuts Server. 
    6.289682] i3-4130 systemd[1023]: Startup finished in 1.294s. 
    6.411322] i3-4130 systemd[1023]: Starting User preferences database... 
    6.419531] i3-4130 systemd[1023]: Started User preferences database. 
    6.618531] i3-4130 systemd[1023]: Starting Sound Service... 
    6.626818] i3-4130 systemd[1023]: Starting KActivityManager Activity manager Service... 
    6.685827] i3-4130 systemd[1023]: Started KActivityManager Activity manager Service. 
    7.747875] i3-4130 systemd[1023]: Started Sound Service. 
   13.185342] i3-4130 systemd[1023]: Started Konsole - Terminal. 
   32.082275] i3-4130 systemd[1023]: Started Firefox - Web Browser. 
   32.377618] i3-4130 systemd[1023]: Starting Virtual filesystem service... 
   32.392857] i3-4130 systemd[1023]: Started Virtual filesystem service. 
   32.951102] i3-4130 systemd[1023]: Starting Accessibility services bus... 
   32.959462] i3-4130 systemd[1023]: Started Accessibility services bus. 
  432.121206] i3-4130 systemd[1023]: Started Systemeinstellungen - Systemeinstellungen. 
**i3-4130:~ #**

Shutdown of user session in previous boot (-b -1):

**i3-4130:~ #** journalctl -b -1 -o short-monotonic -u user@1000.service -g Stop                     
-- Journal begins at Thu 2021-07-29 10:51:52 CEST, ends at Mon 2021-08-02 07:06:10 CEST. -- 
 3343.390184] i3-4130 systemd[1]: Stopping User Manager for UID 1000... 
 3343.391391] i3-4130 systemd[1023]: Stopped target Main User Target. 
 3343.391494] i3-4130 systemd[1023]: Stopping Accessibility services bus... 
 3343.391861] i3-4130 systemd[1023]: Stopping D-Bus User Message Bus... 
 3343.391991] i3-4130 systemd[1023]: Stopping User preferences database... 
 3343.392094] i3-4130 systemd[1023]: Stopping Virtual filesystem service... 
 3343.392635] i3-4130 systemd[1023]: Stopping Profile-sync-daemon... 
 3343.392736] i3-4130 systemd[1023]: Stopped Sound Service. 
 3343.394767] i3-4130 systemd[1023]: Stopped Accessibility services bus. 
 3343.395392] i3-4130 systemd[1023]: Stopped User preferences database. 
 3343.402752] i3-4130 systemd[1023]: Stopped D-Bus User Message Bus. 
 3343.403938] i3-4130 systemd[1023]: Stopped Virtual filesystem service. 
 3343.727236] i3-4130 systemd[1023]: Stopped Profile-sync-daemon. 
 3343.727314] i3-4130 systemd[1023]: Stopped target Basic System. 
 3343.727366] i3-4130 systemd[1023]: Stopped target Paths. 
 3343.727416] i3-4130 systemd[1023]: Stopped target Sockets. 
 3343.727461] i3-4130 systemd[1023]: Stopped target Timers. 
 3343.727542] i3-4130 systemd[1023]: Stopped Timer for profile-sync-daemon - 1Hour. 
 3343.748266] i3-4130 systemd[1]: Stopped User Manager for UID 1000. 
**i3-4130:~ #**

The above will tell you the culprit of the delay, Post as above.

gogalthorp · August 2, 2021, 3:58pm

Orphan code Ie code that dies but is stall in active list

Manual kill with ps and kill commands before shut down if this works you must determine why the process is failing

.

purevw · August 2, 2021, 6:40pm

Pidgin always started upon login by choice. The process only caused problems during the sessions that I actively opened a pidgin chat window and initiated or responded to a chat. Those are the times when there was a delay during shutdown. If I never opened a chat window, pidgin ran only in the background and shutdown was normal.

I have replaced pidgin with another XMPP/Jabber client and my shutdowns have been normal so far, even after active chats.

purevw · August 3, 2021, 6:04pm

I had another timeout on my last shutdown. It seems to be related to the second shutdown problem that I had mentioned. If I am interpreting it correctly, the system is having problems releasing my two large encrypted luks drives. The code is fairly long and I apologize, but I wanted to include all from the time the system called for release of the encrypted drives.



  857.473620] systemd-cryptsetup[5635]: Deactivating volume 04T.
  857.473625] systemd-cryptsetup[5635]: dm versions    opencount flush ]   [16384] (*1)
  857.473630] systemd-cryptsetup[5635]: dm status 04T   opencount noflush ]   [16384] (*1)
  857.473635] systemd-cryptsetup[5635]: dm versions    opencount flush ]   [16384] (*1)
  857.473638] systemd-cryptsetup[5635]: dm table 04T   opencount flush securedata ]   [16384] (*1)
  857.473647] systemd-cryptsetup[5635]: Trying to open and read device /dev/sdc1 with direct-io.
  857.473839] systemd-cryptsetup[5635]: dm versions    opencount flush ]   [16384] (*1)
  857.473852] systemd-cryptsetup[5635]: Udev cookie 0xd4d04e1 (semid 15) created
  857.473855] systemd-cryptsetup[5635]: Udev cookie 0xd4d04e1 (semid 15) incremented to 1
  857.473858] systemd-cryptsetup[5635]: Udev cookie 0xd4d04e1 (semid 15) incremented to 2
  857.473862] systemd-cryptsetup[5635]: Udev cookie 0xd4d04e1 (semid 15) assigned to REMOVE task(2) with flags DISABLE_LIBRARY_FALLBACK         (0x20)
  857.473865] systemd-cryptsetup[5635]: dm remove 04T   opencount flush retryremove ]   [16384] (*1)
  857.473868] systemd-cryptsetup[5635]: device-mapper: remove ioctl on 04T  failed: Device or resource busy
  857.475015] systemd-cryptsetup[5645]: Allocating crypt device context by device 10T.
  857.475022] systemd-cryptsetup[5645]: Initialising device-mapper backend library.
  857.475097] systemd-cryptsetup[5645]: dm version    opencount flush ]   [16384] (*1)
  857.475106] systemd-cryptsetup[5645]: dm versions    opencount flush ]   [16384] (*1)
  857.475113] systemd-cryptsetup[5645]: Detected dm-ioctl version 4.42.0.
  857.475115] systemd-cryptsetup[5645]: Detected dm-crypt version 1.22.0.
  857.475129] systemd-cryptsetup[5645]: Device-mapper backend running with UDEV support enabled.
  857.475134] systemd-cryptsetup[5645]: dm status 10T   opencount noflush ]   [16384] (*1)
  857.475142] systemd-cryptsetup[5645]: Releasing device-mapper backend.
  857.475167] systemd-cryptsetup[5645]: Trying to open and read device /dev/sdb1 with direct-io.
  857.488119] systemd-journald[1205]: Successfully sent stream file descriptor to service manager.
  857.489356] systemd-cryptsetup[5645]: Allocating context for crypt device /dev/sdb1.
  857.489362] systemd-cryptsetup[5645]: Trying to open and read device /dev/sdb1 with direct-io.
  857.489599] systemd-cryptsetup[5645]: Initialising device-mapper backend library.
  857.489618] systemd-cryptsetup[5645]: dm versions    opencount flush ]   [16384] (*1)
  857.489639] systemd-cryptsetup[5645]: dm table 10T   opencount flush securedata ]   [16384] (*1)
  857.489668] systemd-cryptsetup[5645]: Trying to open and read device /dev/sdb1 with direct-io.
  857.489843] systemd-cryptsetup[5645]: dm versions    opencount flush ]   [16384] (*1)
  857.489853] systemd-cryptsetup[5645]: dm deps 10T   opencount flush ]   [16384] (*1)
  857.489882] systemd-cryptsetup[5645]: Crypto backend (OpenSSL 1.1.1d  10 Sep 2019) initialized in cryptsetup library version 2.3.4.
  857.489886] systemd-cryptsetup[5645]: Detected kernel Linux 5.3.18-59.16-default x86_64.
  857.490109] systemd-cryptsetup[5645]: PBKDF pbkdf2-sha256, time_ms 2000 (iterations 0).
  857.490113] systemd-cryptsetup[5645]: Reading LUKS header of size 1024 from device /dev/sdb1
  857.490278] systemd-cryptsetup[5645]: Key length 32, device size 19532871647 sectors, header size 2050 sectors.
  857.490285] systemd-cryptsetup[5645]: Deactivating volume 10T.
  857.490294] systemd-cryptsetup[5645]: dm versions    opencount flush ]   [16384] (*1)
  857.490305] systemd-cryptsetup[5645]: dm status 10T   opencount noflush ]   [16384] (*1)
  857.490315] systemd-cryptsetup[5645]: dm versions    opencount flush ]   [16384] (*1)
  857.490323] systemd-cryptsetup[5645]: dm table 10T   opencount flush securedata ]   [16384] (*1)
  857.490342] systemd-cryptsetup[5645]: Trying to open and read device /dev/sdb1 with direct-io.
  857.490771] systemd-cryptsetup[5645]: dm versions    opencount flush ]   [16384] (*1)
  857.490798] systemd-cryptsetup[5645]: Udev cookie 0xd4deeea (semid 16) created
  857.490804] systemd-cryptsetup[5645]: Udev cookie 0xd4deeea (semid 16) incremented to 1
  857.490812] systemd-cryptsetup[5645]: Udev cookie 0xd4deeea (semid 16) incremented to 2
  857.490819] systemd-cryptsetup[5645]: Udev cookie 0xd4deeea (semid 16) assigned to REMOVE task(2) with flags DISABLE_LIBRARY_FALLBACK         (0x20)
  857.490826] systemd-cryptsetup[5645]: dm remove 10T   opencount flush retryremove ]   [16384] (*1)
  857.512600] systemd-cryptsetup[5645]: Udev cookie 0xd4deeea (semid 16) decremented to 0
  857.512604] systemd-cryptsetup[5645]: Udev cookie 0xd4deeea (semid 16) waiting for zero
  857.512608] systemd-cryptsetup[5645]: Udev cookie 0xd4deeea (semid 16) destroyed
  857.512613] systemd-cryptsetup[5645]: Releasing crypt device /dev/sdb1 context.
  857.512615] systemd-cryptsetup[5645]: Releasing device-mapper backend.
  857.512618] systemd-cryptsetup[5645]: Closing read only fd for /dev/sdb1.
  857.673954] systemd-cryptsetup[5635]: dm remove 04T   opencount flush retryremove ]   [16384] (*2)
  857.696580] systemd-cryptsetup[5635]: Udev cookie 0xd4d04e1 (semid 15) decremented to 0
  857.696583] systemd-cryptsetup[5635]: Udev cookie 0xd4d04e1 (semid 15) waiting for zero
  857.696588] systemd-cryptsetup[5635]: Udev cookie 0xd4d04e1 (semid 15) destroyed
  857.696599] systemd-cryptsetup[5635]: Releasing crypt device /dev/sdc1 context.
  857.696601] systemd-cryptsetup[5635]: Releasing device-mapper backend.
  857.696606] systemd-cryptsetup[5635]: Closing read only fd for /dev/sdc1.
  858.917164] systemd-journald[1205]: Successfully sent stream file descriptor to service manager.
  858.919642] systemd-journald[1205]: Successfully sent stream file descriptor to service manager.
  887.494395] systemd[1]: systemd-logind.service: Got notification message from PID 1726 (WATCHDOG=1)
  947.493625] systemd[1]: session-2.scope: Stopping timed out. Killing.
  947.493892] systemd[1]: session-2.scope: Killing process 3449 (kdesud) with signal SIGKILL.
  947.493944] systemd[1]: session-2.scope: Failed with result 'timeout'.
  947.493951] systemd[1]: session-2.scope changed stop-sigterm -> failed

All file managers and programs that were accessing the drives were closed before I initiated shutdown. This takes me back to my hypothesis that lagging writes from disk caching may be part of the problem. I have 64GB of RAM and a large part of it is used by disk caching for those two drives. What might I do to further troubleshoot this?

karlmistelberger · August 4, 2021, 8:42am

Ever considered providing sound evidence?

All file managers and programs that were accessing the drives were closed before I initiated shutdown. This takes me back to my hypothesis that lagging writes from disk caching may be part of the problem. I have 64GB of RAM and a large part of it is used by disk caching for those two drives. What might I do to further troubleshoot this?

Analyzing wild hypotheses has no merit. Rely on the system’s tools for analysis, such as ‘systemd-cgls -u user-1000.slice’ for user processes and ‘systemd-cgls -u system.slice’.

purevw · August 4, 2021, 6:24pm

If I knew all the tools for diagnostics and troubleshooting I would not need this forum. I was simply throwing out an idea based on the fact that it seems to occur after I move large files, my RAM maxes out because of disk cache, and remains maxed out even 30 minutes after the file transfers are done. I suppose I can keep random observations to myself and post only output of commands that you specify.

Analyzing wild hypotheses has no merit. Rely on the system’s tools for analysis, such as ‘systemd-cgls -u user-1000.slice’ for user processes and ‘systemd-cgls -u system.slice’.

I assume you mean I need to run both tools every time before shutdown? It may be a week or more before this error is duplicated.

purevw · August 4, 2021, 7:07pm

There was a delay as I shut down. Here is the output of the commands that were suggested, taken just before shutdown:

 
systemd-cgls -u user-1000.slice Unit user-1000.slice (/user.slice/user-1000.slice): 
├─user@1000.service … 
│ ├─gvfs-goa-volume-monitor.service  
│ │ └─2720 /usr/lib/gvfs/gvfs-goa-volume-monitor 
│ ├─gvfs-daemon.service  
│ │ ├─2640 /usr/lib/gvfs/gvfsd 
│ │ └─2654 /usr/lib/gvfs/gvfsd-fuse /run/user/1000/gvfs -f 
│ ├─gvfs-udisks2-volume-monitor.service  
│ │ └─2666 /usr/lib/gvfs/gvfs-udisks2-volume-monitor 
│ ├─init.scope  
│ │ ├─2432 /usr/lib/systemd/systemd --user 
│ │ └─2433 (sd-pam) 
│ ├─gvfs-gphoto2-volume-monitor.service  
│ │ └─2712 /usr/lib/gvfs/gvfs-gphoto2-volume-monitor 
│ ├─at-spi-dbus-bus.service  
│ │ └─3419 /usr/lib/at-spi2/at-spi-bus-launcher 
│ ├─gvfs-metadata.service  
│ │ └─5926 /usr/lib/gvfs/gvfsd-metadata 
│ ├─dbus.service  
│ │ ├─2453 /usr/bin/dbus-daemon --session --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only 
│ │ ├─2535 /usr/bin/kglobalaccel5 
│ │ ├─2566 /usr/lib/dconf-service 
│ │ ├─2592 /usr/lib64/libexec/kactivitymanagerd 
│ │ ├─2724 /usr/lib/goa-daemon 
│ │ ├─2726 /usr/lib64/libexec/kf5/kscreen_backend_launcher 
│ │ ├─2728 /usr/lib/bluetooth/obexd 
│ │ ├─2777 /usr/lib/goa-identity-service 
│ │ ├─2905 /opt/teamviewer/tv_bin/TeamViewer 
│ │ └─3069 /usr/bin/kwalletd5 
│ ├─gvfs-mtp-volume-monitor.service  
│ │ └─2701 /usr/lib/gvfs/gvfs-mtp-volume-monitor 
│ └─gvfs-afc-volume-monitor.service  
│   └─2782 /usr/lib/gvfs/gvfs-afc-volume-monitor 
└─session-2.scope  
 ├─2430 /usr/lib/sddm/sddm-helper --socket /tmp/sddm-authee63e27b-1240-49cf-a1e3-4588387cf3b1 --id 1 --start /usr/bi> 
 ├─2445 /usr/bin/startplasma-x11 
 ├─2498 /usr/bin/ssh-agent /usr/bin/gpg-agent --sh --daemon --keep-display /etc/X11/xinit/xinitrc 
 ├─2499 /usr/bin/gpg-agent --sh --daemon --keep-display /etc/X11/xinit/xinitrc 
 ├─2508 /usr/lib64/libexec/kf5/start_kdeinit --kded +kcminit_startup 
 ├─2509 kdeinit5: Running... 
 ├─2510 /usr/lib64/libexec/kf5/klauncher --fd=9 
 ├─2513 kded5 
 ├─2520 /usr/bin/kaccess 
 ├─2527 /usr/bin/plasma_session 
 ├─2537 /usr/bin/ksmserver 
 ├─2575 /usr/bin/kwin_x11 
 ├─2577 /usr/bin/plasmashell 
 ├─2579 /usr/bin/xembedsniproxy 
 ├─2581 /usr/lib64/libexec/polkit-kde-authentication-agent-1 
 ├─2602 /usr/bin/xsettingsd 
 ├─2611 /usr/bin/gmenudbusmenuproxy 
 ├─2615 /usr/lib64/libexec/kdeconnectd 
 ├─2618 /usr/bin/python3 /usr/bin/hp-systray -x 
 ├─2624 /usr/lib64/libexec/org_kde_powerdevil 
 ├─2626 /usr/bin/psi-plus 
 ├─2635 /usr/bin/gkrellm 
 ├─2644 /usr/bin/kmix --keepvisibility 
 ├─2766 /usr/bin/python3 /usr/bin/hp-systray -x 
 ├─2767 /usr/bin/python3 /usr/bin/hp-systray -x 
 ├─2941 /usr/lib64/qt5/libexec/QtWebEngineProcess --type=zygote --no-zygote-sandbox --application-name=Psi%2B --webe> 
 ├─2942 /usr/lib64/qt5/libexec/QtWebEngineProcess --type=zygote --application-name=Psi%2B --webengine-schemes=qrc:sL> 
 ├─2948 /usr/lib64/qt5/libexec/QtWebEngineProcess --type=zygote --application-name=Psi%2B --webengine-schemes=qrc:sL> 
 ├─2962 /usr/lib64/qt5/libexec/QtWebEngineProcess --type=utility --utility-sub-type=network.mojom.NetworkService --e> 
 ├─3008 /usr/lib64/qt5/libexec/QtWebEngineProcess --type=renderer --disable-speech-api --enable-threaded-compositing> 
 ├─3009 /usr/lib64/qt5/libexec/QtWebEngineProcess --type=renderer --disable-speech-api --enable-threaded-compositing> 
 ├─3054 /usr/bin/ksysguardd 
 ├─3095 /usr/lib64/libexec/kf5/kdesud 
 ├─3174 dbus-launch --autolaunch 3186ca150be445b6a880d1695718cde1 --binary-syntax --close-stderr 
 ├─3175 /usr/bin/dbus-daemon --syslog-only --fork --print-pid 5 --print-address 7 --session 
 ├─3678 file.so [kdeinit5] file local:/run/user/1000/klaunchersAMaYv.1.slave-socket local:/run/user/1000/kded5eZYDWC> 
 ├─8061 /usr/bin/konsole 
 ├─8095 /bin/bash 
 ├─8121 su 
 ├─8124 bash 
 ├─8154 systemd-cgls -u user-1000.slice 
 └─8155 less


systemd-cgls -u system.slice    
Unit system.slice (/system.slice): 
├─irqbalance.service  
│ └─1368 /usr/sbin/irqbalance --foreground 
├─haveged.service  
│ └─982 /usr/sbin/haveged -w 1024 -v 0 -F 
├─nmb.service  
│ ├─1598 /usr/sbin/nmbd --foreground --no-process-group 
│ └─1649 /usr/sbin/nmbd --foreground --no-process-group 
├─systemd-udevd.service  
│ └─977 /usr/lib/systemd/systemd-udevd 
├─cron.service  
│ └─1884 /usr/sbin/cron -n 
├─polkit.service  
│ └─1378 /usr/lib/polkit-1/polkitd --no-debug 
├─rtkit-daemon.service  
│ └─4271 /usr/lib/rtkit/rtkit-daemon 
├─chronyd.service  
│ └─1609 /usr/sbin/chronyd 
├─bluetooth.service  
│ └─1364 /usr/lib/bluetooth/bluetoothd 
├─auditd.service  
│ └─1338 /sbin/auditd 
├─ModemManager.service  
│ └─1488 /usr/sbin/ModemManager 
├─systemd-journald.service  
│ └─918 /usr/lib/systemd/systemd-journald 
├─sshd.service  
│ └─1616 sshd: /usr/sbin/sshd -D 
[listener] 0 of 10-100 startups 
├─display-manager.service  
│ ├─2215 /usr/bin/sddm 
│ └─2218 /usr/bin/X -nolisten tcp -auth /run/sddm/{0aa9f01b-0dea-4f2c-97c4-f86263489aa7} -background none -noreset -d> 
├─systemd-ask-password-plymouth.service  
│ └─955 /usr/bin/systemd-tty-ask-password-agent --watch --plymouth 
├─NetworkManager.service  
│ └─1578 /usr/sbin/NetworkManager --no-daemon 
├─rsyslog.service  
│ └─1411 /usr/sbin/rsyslogd -n -iNONE 
├─firewalld.service  
│ └─1490 /usr/bin/python3 /usr/sbin/firewalld --nofork --nopid 
├─teamviewerd.service  
│ └─2154 /opt/teamviewer/tv_bin/teamviewerd -d 
├─nscd.service  
│ └─1388 /usr/sbin/nscd 
├─wsdd2.service  
│ └─1396 /usr/sbin/wsdd2 
├─postfix.service  
│ ├─1857 /usr/lib/postfix/bin//master -w 
│ ├─1864 pickup -l -t fifo -u 
│ └─1865 qmgr -l -t fifo -u 
├─cups.service  
│ └─1365 /usr/sbin/cupsd -l 
├─smb.service  
│ ├─2132 /usr/sbin/smbd --foreground --no-process-group 
│ ├─2134 /usr/sbin/smbd --foreground --no-process-group 
│ ├─2135 /usr/sbin/smbd --foreground --no-process-group 
│ └─2136 /usr/sbin/smbd --foreground --no-process-group 
├─upower.service  
│ └─2370 /usr/lib/upower/upowerd 
├─mcelog.service  
│ └─1493 /usr/sbin/mcelog --ignorenodev --daemon --foreground 
├─udisks2.service  
│ └─2347 /usr/lib/udisks2/udisksd 
├─dbus.service  
│ └─1366 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only 
├─systemd-timesyncd.service  
│ └─1337 /usr/lib/systemd/systemd-timesyncd 
├─system-getty.slice  
│ └─getty@tty1.service  
│   └─2214 /sbin/agetty -o -p -- \u --noclear tty1 linux 
├─avahi-daemon.service  
│ └─1363 avahi-daemon: running [RYZEN9.local] 
├─smartd.service  
│ └─1387 /usr/sbin/smartd -n 
└─systemd-logind.service  
 └─1428 /usr/lib/systemd/systemd-logind