For the past month or two (i.e., for many snapshots now!), my computer has hung on about one in every five reboots or shutdowns. The screen goes from black to chartreuse, and then nothing happens; I have to use the Magic SysRq keys to reboot or power down.
After a balked reboot, I ran # journalctl -b -1 and saw:
kernel: watchdog: watchdog0: watchdog did not stop!
I did some research – https://www.pcsuggest.com/disable-nmi-watchdog-linux/ – and:
(1) added kernel.nmi_watchdog=0 to the end of /etc/sysctl.conf
and
(2) added nmi_watchdog=0 as a boot parameter.
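A quick way to confirm that both settings actually took effect might look like the sketch below. The helper name cmdline_nmi is made up for illustration; it only assumes the standard /proc/cmdline and /proc/sys/kernel/nmi_watchdog files.

```shell
#!/bin/sh
# cmdline_nmi is a hypothetical helper: it prints the nmi_watchdog
# setting found in a kernel command line file (default: /proc/cmdline),
# or a short notice if the parameter is absent.
cmdline_nmi() {
    grep -o 'nmi_watchdog=[0-9]*' "${1:-/proc/cmdline}" \
        || echo "nmi_watchdog not set on command line"
}

# Did the boot parameter reach the running kernel?
cmdline_nmi

# Value the kernel is using right now (0 means the NMI watchdog is off).
if [ -r /proc/sys/kernel/nmi_watchdog ]; then
    cat /proc/sys/kernel/nmi_watchdog
fi
```

If the second command prints 0 and the hang still occurs, the two settings are at least being applied as intended.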
No luck. The computer still sometimes reboots or powers off properly and sometimes hangs.
I ran # journalctl -b -1 after the most recent incident that required the Magic SysRq combination, and saw:
Sep 26 21:36:32 linux-j26j systemd[1]: Starting Reboot...
Sep 26 21:36:32 linux-j26j systemd[1]: Shutting down.
Sep 26 21:36:32 linux-j26j systemd[1]: Hardware watchdog 'iTCO_wdt', version 0
Sep 26 21:36:32 linux-j26j systemd[1]: Set hardware watchdog to 10min.
Sep 26 21:36:32 linux-j26j kernel: watchdog: watchdog0: watchdog did not stop!
Sep 26 21:36:32 linux-j26j kernel: systemd-shutdow: 28 output lines suppressed due to ratelimiting
Sep 26 21:36:32 linux-j26j systemd-shutdown[1]: Syncing filesystems and block devices.
Sep 26 21:36:32 linux-j26j systemd-journald[424]: Journal stopped
… which suggests (at least to me) that the computer is trying to fire up nmi_watchdog before shutdown, despite sysctl.conf and the boot parameters.
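To compare several failed shutdowns without reading the whole journal each time, the watchdog-related lines can be pulled out with a small filter. This is only a sketch, and watchdog_lines is a made-up helper name; it reads journal text on standard input.

```shell
#!/bin/sh
# watchdog_lines is a hypothetical helper: it keeps only the lines
# that mention a watchdog (case-insensitive) from text read on stdin.
watchdog_lines() {
    grep -i 'watchdog'
}

# Typical use on the previous boot's log (run as root):
#   journalctl -b -1 --no-pager | watchdog_lines
```

Applied to the excerpt above, this would keep the "Hardware watchdog" and "Set hardware watchdog" lines from systemd along with the kernel's "watchdog did not stop!" message.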
I wonder if the chartreuse screen also points to a video driver issue. I have the nvidia driver loaded “the hard way,” and I have seen the same chartreuse screen immediately after the video driver loads when booting to runlevel 3. However, upgrading the driver from 396.54 to the latest 410.57 hasn’t helped.
Any troubleshooting suggestions will be appreciated.
Magic SysRq always reboots or shuts down the computer, so I could continue to live with the problem.