Unexpected shutdown

Just a few minutes ago, my computer shut down unexpectedly. I don’t think it’s a heat issue; I suspect it has more to do with hacking. I had a Console window open in superuser mode, though it was idle, and I was torrenting.

I looked at dmesg and journalctl --system -b --full --no-pager
but they show only messages from the current boot, not from the previous one.

Try:

journalctl --system -b -1 --full --no-pager
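
If that shows nothing from before the crash, the journal may not be stored persistently. Here is a minimal sketch for checking that, assuming journald is running with its default Storage=auto setting (in which case logs survive a reboot only if /var/log/journal exists). The helper name journal_is_persistent is made up for this example.

```shell
# Hypothetical helper: succeeds if the persistent journal directory exists.
journal_is_persistent() {
    journal_dir="${1:-/var/log/journal}"   # default persistent location
    [ -d "$journal_dir" ]
}

if journal_is_persistent; then
    echo "Persistent journal directory found; try: journalctl --list-boots"
else
    echo "No /var/log/journal, so logs from earlier boots are not kept."
    echo "To keep them: sudo mkdir -p /var/log/journal, then reboot."
fi
```

journalctl --list-boots prints one line per retained boot, so you can see which offsets (-1, -2, ...) are actually available.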

It’s probably just a bug, not “hacking” or something exotic like that. Focus on what you know, not what your guesses are.

System crashes are often hard to diagnose, because the system usually doesn’t have the opportunity to write logs to disk during a crash. If it happens again, what you might do is connect another device to your system via ssh and run journalctl --system -f from that device (over ssh), as that’ll show you the messages leading up to the system crash.
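
A rough sketch of that, to be run on the second device. The function name, the user@crashing-host target and the log file name are placeholders, and the remote user is assumed to be allowed to read the system journal (root, or a member of the systemd-journal group).

```shell
# Run on the SECOND machine, not on the one that crashes.
# Follows the remote journal and keeps a local copy via tee, so the
# last messages survive even if the remote system goes down.
follow_remote_journal() {
    target="$1"                          # e.g. user@crashing-host (placeholder)
    logfile="${2:-remote-journal.log}"   # local copy of everything received
    ssh "$target" 'journalctl --system -f' | tee "$logfile"
}

# Example call (placeholder host):
#   follow_remote_journal user@crashing-host crash.log
```

When the crash happens, the ssh session will hang or drop, and the last lines in the local file are the closest you will get to what the system logged just before it went down.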

Thanks for the good advice to focus on what I know, not on what my guesses are. But still… I shouldn’t feel like I can’t at least express an opinion.

There could still be some indication of a problem, though, right? There should be some journal messages from before the system crashed.

Certainly you can express an opinion, but just know that it’s not likely anything that anyone can act on because there’s no supporting data for that hypothesis at this time. Without supporting facts, opinions aren’t helpful in getting to the root cause of the issue - and they can be a huge distraction.

I’ve been doing online support for decades, and if I had a nickel for every time someone said “I was hacked!” and it ended up being something more mundane, I could probably retire. (Maybe a slight exaggeration there…but I’ve never once had anyone say “I’ve been hacked” and it was actually what the problem was).

In troubleshooting, we have to deal with what we know, not what we guess to be the case. If we focus on a guess, then that almost always leads to the wrong conclusion, or to ignoring what the evidence shows.

It is entirely possible for nothing of value to be written to any log. Remember that logs are stored on disk. If the system crashes unexpectedly, no processes will write to disk, so those log entries will be lost. That’s why I suggest having a second device connected via ssh tailing the log files - those messages will be shown up to the point that the system is unable to do that, and that’s usually (not always, though) longer than anything that gets written to disk.

This is why system crashes are so difficult to diagnose. Barring information on-screen at the time of the crash, it’s possible (and by design, I should note) that nothing will be preserved. Systems that halt usually don’t commit anything to disk because the state of the OS is uncertain, and writing data to disk risks corrupting even more data. (“Abend”, short for “Abnormal End”, was the common term for this on an operating system I used a lot in the 90s and 2000s; the Windows BSOD is another example.) When the system integrity is compromised, there’s no guarantee (for example) that a file pointer for a file being written to disk is actually pointing to the right place on the disk. Writing that log file risks inadvertently overwriting the boot sector of the drive or the partition table - which could lead to massive data loss.

When the system can’t trust itself to keep data safe, halting is frequently the safest thing it can do - in spite of the fact that that can corrupt open files.

So it happened again, and it looks like there were all sorts of btrfs filesystem errors that kept showing up in text.

That seems like a more likely cause - what sort of filesystem errors did you see reported?

I don’t remember exactly, just that they were btrfs filesystem errors. I wish I could have somehow gone back to the GUI or pressed some sort of key combination to shut down the computer.

If you enable the Magic SysRq keys, you can sometimes more cleanly shut things down if the system has hung. Doesn’t always work, but it can.
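
As a sketch of how to check and enable them: the kernel exposes the setting in /proc/sys/kernel/sysrq, where 0 means disabled, 1 means all functions are enabled, and any other value is a bitmask that enables a subset. The describe_sysrq helper and the 90-sysrq.conf file name are just illustrative choices.

```shell
# Hypothetical helper: turn the sysrq value into a human-readable summary.
describe_sysrq() {
    case "$1" in
        0) echo "disabled" ;;
        1) echo "all functions enabled" ;;
        *) echo "partially enabled (bitmask $1)" ;;
    esac
}

# Report the current setting, if the kernel exposes it.
if [ -r /proc/sys/kernel/sysrq ]; then
    describe_sysrq "$(cat /proc/sys/kernel/sysrq)"
fi

# Enable everything until the next reboot (needs root):
#   echo 1 | sudo tee /proc/sys/kernel/sysrq
# Make it permanent (file name is an arbitrary choice):
#   echo 'kernel.sysrq = 1' | sudo tee /etc/sysctl.d/90-sysrq.conf
```

With that enabled, holding Alt+SysRq and pressing R, E, I, S, U, B in turn, a second or two apart, terminates processes, syncs and remounts the filesystems read-only, and reboots; pressing O instead of B powers off. On some laptops the SysRq key needs Fn as well.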

You can try using smartctl to see if there are any drive errors reported, and use btrfs check (btrfsck is the older, deprecated name for the same tool; I’d boot from a live ISO for this) to check the filesystem integrity. I’d run it the first time without the options to fix errors, and just use it to see what errors are reported.
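
Something along these lines, as a sketch only. The device names are placeholders you would need to replace, the function name check_drive is made up, and the filesystem must not be mounted while it is checked. If the drive is LUKS-encrypted, open it first (for example with cryptsetup open /dev/sdX1 somename) and check /dev/mapper/somename rather than the raw partition.

```shell
# Read-only health check: SMART data for the disk, then a btrfs check
# that only reports problems and does not try to repair anything.
check_drive() {
    disk="$1"   # whole disk, e.g. /dev/sdX (placeholder)
    fs="$2"     # unmounted btrfs device, e.g. /dev/sdX1 or /dev/mapper/somename
    sudo smartctl -H -A "$disk"          # overall health + attribute table
    sudo btrfs check --readonly "$fs"    # report errors only, change nothing
}

# Example call (placeholder devices):
#   check_drive /dev/sdX /dev/mapper/somename
```

Some USB enclosures don’t pass SMART commands through unless you add -d sat to the smartctl call, and some don’t pass them through at all.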

If the issue happens again, do what you can to capture the messages - the more specific the messages are, the easier it’ll be for someone to help you.

@as-muncher So this is the external HDD attached to the laptop at the USB port with full-disk encryption? If so, it could be overloading the USB port. Is the external HDD in a powered enclosure, and is it actually powered?

@malcolmlewis The external HDD is just connected to USB port, not powered externally.

@hendersj It would be nice if I could set some sort of flag, or if gparted or Disks, when asked to scan a drive on a running system, would instead schedule the scan for the next boot by changing the init script.

You could always boot from rescue media or live media and run a filesystem check manually, rather than waiting for the next boot and trying to automate it for boot time.

@as-muncher then I would suggest you look at powering the enclosure especially with a HDD…

I don’t think it’s serious. If it were a power issue, I would hear the drive chunking, starting and stopping. It’s not a power issue.

And there’s no option to boot in safe mode at the boot screen, eh?

I’d still check the power specs for the HDD. USB 2.0 ports generally supply up to 500 mA (900 mA for USB 3.x); if the drive needs more than that, the port can power off or fail, and that leads to filesystem corruption…

I usually have it plugged into a powered 5-port hub, and I’m mindful of the current draw and the specs of the hub.

@as-muncher Ahh, then it is powered via an external source, not the laptop… I’d be looking at the hub then. Power the device separately and plug it directly into the laptop to make sure all is OK.

I’m pretty sure it’s not a power issue. I think the cause may be that, on a couple of occasions, I held down the power button in Xfce because it would not power off immediately. I guess the power button behavior has to be configured separately in GNOME, KDE, and Xfce? That could possibly be it. I don’t know. I haven’t had the system power off due to btrfs filesystem errors since.

Those filesystem issues can certainly be caused by insufficient power.

I’d have a look at dmesg while the system is running and see what it’s reporting. My guess is that it’s probably going to be telling you a lot is happening specifically with that drive.
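
For example, you could follow the kernel log and keep only the lines that look storage-related. This is a sketch: the helper name and the pattern are my own guess at what is relevant and will probably need adjusting for your setup.

```shell
# Hypothetical helper: keep only kernel messages that mention btrfs, USB,
# I/O errors, or sdX disk names (case-insensitive).
filter_storage_messages() {
    grep -iE 'btrfs|usb|i/o error|sd[a-z]'
}

# Follow the kernel log live (Ctrl+C to stop):
#   sudo dmesg --follow | filter_storage_messages
# Kernel messages from the previous boot, if the journal kept them:
#   journalctl -k -b -1 --no-pager | filter_storage_messages
```

Repeated USB resets or disconnects for that drive, followed by btrfs errors, would point toward the connection or power rather than the filesystem itself.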

Storage devices are pretty sensitive to not having enough power, and a faulty power regulator or simply insufficient power is a great way to end up with a completely corrupted drive.

And an external hub can certainly develop power issues. I used to have one that absolutely fried a USB Yubikey that had previously worked just fine in it.
