I’ve been running 11.3 for about 3 days and have noticed that it crashes with depressing regularity. From the user’s perspective, the screen simply locks up: the mouse cursor won’t move, nothing responds to keystrokes, and the system is dead in the water.
But, I can go to another system and ssh back into this one. When I do a top I see at least one and sometimes two Xorg processes owned by root that are grabbing every available CPU cycle. This box has dual cores, so it can support two processes running close to 100% CPU. I tried sending the runaway processes a SIGQUIT signal, which did stop them and did restore normal system operation. But I couldn’t find the expected core dump anywhere. If I don’t do anything to stop the runaway processes, eventually the whole system locks up, including the ssh session.
I haven’t been keeping real close track, but my general impression is that this only happens when we have two X sessions running. My wife and I usually keep separate login sessions that we switch between during the day.
I realize this isn’t a lot to go on. Any suggestions on what I can do to collect some more info? In particular, what do I have to do to get a core dump from the runaway process, and what should I do with it if I can get it?
Knowing your exact hardware is important for anyone to provide advice, as a LOT of the time such freezes tend to be hardware specific, especially when other forum members are not all reporting the same freeze. Can you advise exactly what hardware you have? Motherboard? Graphics? Amount of memory? Ethernet/wireless device?
In my experience, freezes tend to be due to:
memory hardware problem
motherboard hardware problem
kernel incompatible with something in one’s hardware
graphics hardware or graphics driver problem
ethernet wired/wireless hardware or ethernet driver problem
After such a freeze, one can sometimes reboot the PC to a liveCD (or access it via ssh, as you can) and then copy the following to a USB stick (or to another PC over ssh):
/var/log/messages
/home/username/.xsession-errors
and then look at the end of those files to see if they provide any hints as to the problem. I think there may also be some other log files in /var/log you can look at (I’m not at a Linux PC right now and my memory has failed me).
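If it helps, here is a rough sketch of the "look at the end of those files" step in Python. The function name tail and the default of 50 lines are just my own choices for illustration, and reading /var/log/messages will normally need root.

```python
from collections import deque


def tail(path, count=50):
    """Return the last `count` lines of a text file (hypothetical helper)."""
    # errors="replace" keeps odd bytes in a log from aborting the read.
    with open(path, "r", errors="replace") as handle:
        # A bounded deque keeps only the newest `count` lines in memory.
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]


# Example use (paths taken from the list above, username is a placeholder):
#   for line in tail("/var/log/messages"):
#       print(line)
#   for line in tail("/home/username/.xsession-errors"):
#       print(line)
```

Run it over ssh right after a hang, before rebooting, so the newest entries are the ones from the freeze.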
In your case, since you can get back in via ssh, that HELPS narrow things down; this is likely due to Xorg freezing, as you noted.
You could try a different graphics driver for a while and see if that solves the problem.
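On the missing core dump: one common reason (I can't say it is the reason here) is that the process is running with a core file size limit of 0, in which case no core file is written. On kernels that provide /proc/<pid>/limits you can check this for the running Xorg process. Below is a small sketch in Python; the function name core_limits is made up for illustration, and you would substitute the real PID that top shows.

```python
def core_limits(limits_text):
    """Return (soft, hard) core file size limits parsed from the text of
    /proc/<pid>/limits, or None if the row is missing (hypothetical helper)."""
    prefix = "Max core file size"
    for line in limits_text.splitlines():
        if line.startswith(prefix):
            # The rest of the row is: soft limit, hard limit, units.
            fields = line[len(prefix):].split()
            return fields[0], fields[1]
    return None


# Example use over ssh (1234 is a placeholder PID, may need root):
#   with open("/proc/1234/limits") as handle:
#       print(core_limits(handle.read()))
```

If the soft limit comes back as "0", that would explain why nothing was dumped, and the limit would have to be raised in whatever starts the X server before the next hang.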
Sorry I’ve taken so long to get back to this. Been working on a bunch of other problems.
Part of my attempts to solve another problem involved installing the nvidia proprietary driver instead of using the default nouveau driver. And the hang problem, which had been occurring 2-3 times a day, hasn’t recurred since I switched drivers. I’m still working on the resolution problem. When I get that fixed, I would be willing to switch back to see if I can get some more data on what might be triggering the hang. Or we can just mark this down as resolved.