Hang at reboot/shutdown; nmi_watchdog issue

For the past month or two (i.e., many snapshots ago!), my computer has hung on about one in every five reboots or shutdowns. The screen goes from black to chartreuse, and then nothing; I have to use the Magic-SysRq keys to reboot or power down.

After a balked reboot, I ran # journalctl -b -1 and saw:

kernel: watchdog: watchdog0: watchdog did not stop!

I did some research – https://www.pcsuggest.com/disable-nmi-watchdog-linux/ – and:

(1) added kernel.nmi_watchdog=0 to the end of /etc/sysctl.conf

and

(2) added nmi_watchdog=0 as a boot parameter.
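
For anyone following along, one way to confirm that both changes actually took effect after a reboot is to ask the running kernel. This is only a sketch, assuming a standard Linux /proc layout; the function name has_nmi_watchdog_off is made up for illustration.

```shell
# Hypothetical helper: report whether a kernel command line contains
# nmi_watchdog=0. Pass the command line text as the first argument.
has_nmi_watchdog_off() {
    case " $1 " in
        *" nmi_watchdog=0 "*) echo "yes" ;;
        *) echo "no" ;;
    esac
}

# Boot parameter as seen by the running kernel.
if [ -r /proc/cmdline ]; then
    echo "nmi_watchdog=0 on command line: $(has_nmi_watchdog_off "$(cat /proc/cmdline)")"
fi

# sysctl value as seen by the running kernel (0 means the NMI watchdog is off).
if [ -r /proc/sys/kernel/nmi_watchdog ]; then
    echo "kernel.nmi_watchdog = $(cat /proc/sys/kernel/nmi_watchdog)"
fi
```

If the command line check says "no", the boot parameter was probably not written to the bootloader configuration that is actually in use.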

No luck. The computer still sometimes reboots or powers off properly, sometimes hangs.

I ran # journalctl -b -1 after the last incident that required the Magic-SysRq combination, and saw:

Sep 26 21:36:32 linux-j26j systemd[1]: Starting Reboot...
Sep 26 21:36:32 linux-j26j systemd[1]: Shutting down.
Sep 26 21:36:32 linux-j26j systemd[1]: Hardware watchdog 'iTCO_wdt', version 0
Sep 26 21:36:32 linux-j26j systemd[1]: Set hardware watchdog to 10min.
Sep 26 21:36:32 linux-j26j kernel: watchdog: watchdog0: watchdog did not stop!
Sep 26 21:36:32 linux-j26j kernel: systemd-shutdow: 28 output lines suppressed due to ratelimiting
Sep 26 21:36:32 linux-j26j systemd-shutdown[1]: Syncing filesystems and block devices.
Sep 26 21:36:32 linux-j26j systemd-journald[424]: Journal stopped

… which suggests (at least to me) that the computer is trying to fire up nmi_watchdog before shutdown, despite sysctl.conf and the boot parameters.
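
It may be worth separating two different watchdogs here. The log names the hardware watchdog driver iTCO_wdt, which appears to be a separate mechanism from the NMI watchdog that kernel.nmi_watchdog controls, so these messages may not be affected by the nmi_watchdog settings at all. A small sketch for pulling only the watchdog lines out of the previous boot's journal (watchdog_lines is a made-up helper name):

```shell
# Hypothetical helper: keep only watchdog-related lines from journal
# text supplied on standard input (case-insensitive match).
watchdog_lines() {
    grep -i 'watchdog'
}

# On the affected machine, as root:
#   journalctl -b -1 | watchdog_lines
```

Comparing this output between a clean shutdown and a hung one might show whether the "did not stop" message really correlates with the hang.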

I wonder if the chartreuse screen also points to a video driver issue. I have the nvidia driver loaded “the hard way,” and have seen the same chartreuse screen immediately after the video driver loads when booting to run level 3. However, upgrading drivers from 396.54 to the latest 410.57 hasn’t helped.

Any troubleshooting suggestions will be appreciated.

Magic-SysRq always reboots or shuts down the computer. I could continue to live with the problem.

An update, for anyone who might discover this thread while dealing with similar issues.

After writing the post above, I discovered:

https://github.com/systemd/systemd/issues/8485

and edited ShutdownWatchdogSec= in /etc/systemd/system.conf, with the hope that the edit might heal the balky reboot/shutdown issue. It didn’t. I have since removed all the kernel.nmi_watchdog related configuration changes.
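
The post does not say which value was tried. To see what is currently set, something like the following sketch could be used; it assumes the option lives in the main file rather than in a drop-in under /etc/systemd/system.conf.d/, and shutdown_watchdog_setting is a made-up helper name. A commented-out line means the built-in default applies, which according to the systemd documentation is 10min and matches the "Set hardware watchdog to 10min." line in the log above.

```shell
# Hypothetical helper: print the active (uncommented) ShutdownWatchdogSec=
# line from a systemd manager config file. Prints nothing if the option is
# unset or commented out. The file path defaults to /etc/systemd/system.conf.
shutdown_watchdog_setting() {
    grep -E '^[[:space:]]*ShutdownWatchdogSec=' "${1:-/etc/systemd/system.conf}"
}

# On the affected machine:
#   shutdown_watchdog_setting
```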

The problem also occurs on a Tumbleweed laptop with the 410.57 nvidia driver and an Nvidia Quadro M3000M card, with similar symptoms. Sometimes shutdown or reboot works normally, sometimes it hangs with a chartreuse screen.

Magic-SysRq R E is enough to make the computer continue with reboot or shutdown.
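
Since this workaround depends on Magic-SysRq, it can be useful to check beforehand that the key combination is enabled. A minimal sketch, assuming the usual /proc/sys/kernel/sysrq interface, where 0 means disabled, 1 means all functions enabled, and larger values are a bitmask of allowed functions (sysrq_status is a made-up helper name):

```shell
# Hypothetical helper: describe a kernel.sysrq value passed as the
# first argument.
sysrq_status() {
    case "$1" in
        0) echo "disabled" ;;
        1) echo "all functions enabled" ;;
        *) echo "bitmask $1: only some functions enabled" ;;
    esac
}

# Current value on this machine, if the interface is available.
if [ -r /proc/sys/kernel/sysrq ]; then
    echo "Magic-SysRq: $(sysrq_status "$(cat /proc/sys/kernel/sysrq)")"
fi
```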

I have been suffering from exactly the same symptoms, on and off, for some time now! At the moment it happens every time after a session of work. However, if I reboot and then reboot again immediately after the system comes back up, it shuts down OK. But if I start working for some undetermined period, at the end of that it will hang just as you describe!

Sorry to learn that you’re suffering, too, griadooss! I still see the same problem occasionally, but not as often as when I opened this post in late September. I’m now on snapshot 20181126 with the 4.19.4 kernel and the nvidia 410.66 driver installed the hard way.

I still suspect an nvidia issue. If you’re not using nvidia, though, I may change my mind.



||Alternate Version|Installed Version|
|---|---|---|
|Version:|390.87-10.2|390.87-10.2|
|Build Time:|Fri 26 Oct 2018 23:53:55 AEDT|Fri 26 Oct 2018 23:53:55 AEDT|
|Install Time:||Sat 03 Nov 2018 21:12:52 AEDT|
|Package Group:|System/Libraries|System/Libraries|
|License:|SUSE-NonFree|SUSE-NonFree|
|Installed Size:|111.4 MiB|111.4 MiB|
|Download Size:|27.5 MiB|0 B|
|Distribution:||Proprietary:X11:Drivers / openSUSE_Leap_42.3|
|Vendor:|obs://build.suse.de/Proprietary:X11:Drivers|obs://build.suse.de/Proprietary:X11:Drivers|
|Packager:|||
|Architecture:|x86_64|x86_64|
|Build Host:|||
|URL:|https://www.nvidia.com/object/unix.html|https://www.nvidia.com/object/unix.html|
|Source Package:|x11-video-nvidiaG04-390.87-10.2|x11-video-nvidiaG04-390.87-10.2|
|Media No.:|||
|Authors:|||



I had the same problem: shutdown took more than one minute after the last message (something like “started Power-off” or so), and sometimes I could see something like “watchdog did not stop”. I have Intel graphics, however, with no nvidia. The extended shutdowns have stopped since version 20181122. Now the shutdown is pretty fast.