System randomly freezes or crashes to the login screen, glitches until rebooted

Approximately once every 1 to 3 days of uptime, the system experiences a sudden and inexplicable crash: the image completely freezes in place, although unlike similar crashes in the past I can keep moving the mouse pointer around. A few seconds afterward, I find myself in a black console… and a few seconds after that, I’m back at the login screen. If I attempt to log back in, however, the image either freezes again or desktop effects no longer work, without any error message as to why. Not even forcefully restarting X11 (control + alt + backspace twice) fixes the remaining glitches, and the only way to truly recover the system is to reboot.

Although the crashes are completely random, I vaguely get the impression they might be happening when an event triggers certain desktop effects. Several times the crash occurred as a system tray notification popped up, whereas just now the system crashed while I was switching desktops in the middle of the desktop cube animation. Certain games might also have a probability of causing this.

I use the free video drivers and default system packages, all latest versions of openSUSE Tumbleweed. My card is a Radeon R7 370, GCN 1.0 on RadeonSI.

This continues to be a huge issue with the latest Tumbleweed packages! The system now freezes and crashes on an hourly basis. It’s likely something triggering a GPU crash, as it only seems to happen the moment certain desktop effects occur or certain panels pop up; even bringing up the taskbar (from auto-hide) or clicking the arrow which expands the system tray can cause it.

The only way to recover the system without restarting is to immediately hit “control + alt + backspace” to kill X11 as soon as I notice the freeze, before it has time to become permanent or the glitches I mentioned take place. Can anyone else confirm this, and let us know whether a solution is being worked on?

Kernel 4.10 has been… ehm… *cough* slightly experimental for the last few weeks.

I expect one of the usual suspects to show up here within the next few hours to tell you that TW is a rolling release and you should only use it for your microwave or so… :wink:

openSUSE Tumbleweed is generally very stable… there are only a few exceptions that occur a few times each year, which could (and likely will) be solved with better automated testing. TW only uses stable releases of each package, always tested to work at a basic level. The real madness would be running Factory directly, and expecting that to work perfectly all the time :slight_smile: But yes, it does seem like Kernel 4.10 would be responsible for this… though it’s just as likely the “radeon” driver itself.

As a test, try adding “intel_idle.max_cstate=1” (without the quotes) to the kernel boot parameters.
To make it permanent, add it to GRUB_CMDLINE_LINUX_DEFAULT= in **/etc/default/grub**, then run grub2-mkconfig -o /boot/grub2/grub.cfg
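For anyone who prefers not to edit the file by hand, here is a minimal sketch of that change. It assumes GNU sed and that the GRUB_CMDLINE_LINUX_DEFAULT value is wrapped in double quotes; `add_kernel_param` is a made-up helper name, not an openSUSE tool.

```shell
# Append a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT in a grub defaults file.
# add_kernel_param is a hypothetical helper; it assumes a double-quoted value.
add_kernel_param() {
    local param="$1"
    local file="${2:-/etc/default/grub}"

    # Do nothing if the parameter is already on that line.
    if grep '^GRUB_CMDLINE_LINUX_DEFAULT=' "$file" | grep -qF -- "$param"; then
        return 0
    fi

    # Insert the parameter just before the closing double quote.
    sed -i "s|^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"|\1 ${param}\"|" "$file"
}
```

Run as root, this would be `add_kernel_param intel_idle.max_cstate=1` followed by the grub2-mkconfig command above. Making a backup copy of /etc/default/grub first is probably wise.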

I assume this might not matter, but my video card is an AMD and not Intel. I figure the parameter might refer to the processor though, which is indeed a Core i7 in my case.

But I might try that if the freeze happens again. Surprisingly, a crash hasn’t occurred during the last 24 hours, after some massive package updates… the problem might have been solved though I won’t be surprised if it hasn’t. Kernel 4.10.2 also hit Tumbleweed today, and from the looks of it 4.10.3 (with lots of driver fixes) might be in tomorrow… I should also see if those change anything.

The issue has persisted from kernel 4.10.1 through 4.10.3. The new kernel has updated drivers, including for radeon which is what my card uses. This implies it might not be a driver issue, although I’m not sure where else such a trigger may be hidden.

I discovered an important detail today: the crash is not limited to KDE desktop effects, unlike most crashes of this sort and what I initially suspected! I’ve had compositing disabled for several hours (alt + shift + F12) yet the exact same crash occurred just now: all windows and buttons froze in place while only the mouse cursor could be moved, the monitor then turned itself on and off a few times, and I found myself back at the login manager. Upon killing X11 and logging back in, I was presented with a window asking to check my compositor settings… indicating that KDE might have detected compositing as the source, although it had been entirely turned off so that couldn’t be true.

Could a maintainer or developer familiar with this part of the system please be dispatched to the issue? I reported it two weeks ago, and so far there has been no official response here. While I understand the developers are busy with other concerns, this is a major problem due to which I can’t keep my system from being shut down at random daily intervals. Considering that such a crash may occur anytime, including during a package update or while other processes are handling data, it leaves my system at risk of data corruption!

Additionally, could I please request that updates for x11 and the AMD video driver are prioritized? http://tumbleweed.boombatower.com indicates that Devel has the following versions pending: xorg-x11-server 1.19.3 (currently 1.19.2), xf86-video-ati 7.9.0 (currently 7.8.0), Mesa 17.0.2 (currently 17.0.1). Yesterday’s Factory snapshot appears to have ignored these packages. As Kernel updates don’t seem to affect the issue, I’m hoping a new version of one of these components might include a solution. Thank you.
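A rough way to see where an installed system stands against those pending versions, assuming an RPM-based system and GNU sort (for `-V`). `version_lt` and `check_pending` are made-up helper names, and the package list is just the three versions quoted above.

```shell
# Return success if version $1 is strictly older than version $2.
# version_lt is a hypothetical helper; it relies on GNU sort's -V option.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Report packages whose installed version is older than the pending one.
check_pending() {
    while read -r pkg pending; do
        # Skip the package if rpm is missing or the package is not installed.
        installed=$(rpm -q --qf '%{VERSION}' "$pkg" 2>/dev/null) || continue
        if version_lt "$installed" "$pending"; then
            echo "$pkg: installed $installed, pending $pending"
        fi
    done <<'EOF'
xorg-x11-server 1.19.3
xf86-video-ati 7.9.0
Mesa 17.0.2
EOF
}
```

Calling `check_pending` prints one line per package that is still behind, and nothing once the snapshot has caught up.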

No developers here, just us users. Report it on Bugzilla with as much detail as you can.

The joys of a rolling release :open_mouth:

Already reported in two more places:

https://bugzilla.opensuse.org/show_bug.cgi?id=1028575
https://bugs.freedesktop.org/show_bug.cgi?id=100306

openSUSE Tumbleweed is almost always stable; this sort of thing is the exception, and in the past worse issues happened with stable openSUSE releases too (though this was in the 11.x and pre-42.x era).

I apologize for the chain of comments! I’m also discussing this issue on IRC (#dri-devel on irc.freenode.org) and someone seemingly familiar with the code took a look at the logs I posted. Apparently this is a GPU lockup after all. They suggested I tell the llvm packagers to revert r280589. Someone else also mentioned a regression in llvm 3.9.1.

The log of our conversation can also be found online, in case I happened to miss any significant details: https://people.freedesktop.org/~cbrill/dri-log/?channel=dri-devel&date=2017-03-21
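For anyone wanting to check their own kernel logs for the same kind of GPU lockup, here is a small sketch. `find_gpu_lockups` is a made-up helper name, and the patterns are an assumption based on messages the radeon kernel driver typically prints; other drivers word these differently.

```shell
# Filter kernel log text on stdin down to lines that look like GPU lockup reports.
# find_gpu_lockups is a hypothetical helper; the patterns are assumptions and
# may need adjusting for drivers other than radeon.
find_gpu_lockups() {
    grep -i -E 'GPU lockup|ring [0-9]+ stalled|GPU (soft)?reset'
}
```

Typical use would be `dmesg | find_gpu_lockups` right after a freeze, or `journalctl -k -b -1 | find_gpu_lockups` to look at the previous boot (which only works if the journal is persistent).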

You can get more immediate answers on the openSUSE Factory Mail List.
I would recommend subscribing.

Sent it there as well just now, given that no one has noticed this or said anything official for two weeks. Thanks for the suggestion.

I see the same problem happening on Leap 42.2 running KDE (Plasma, as the purists would like to semanticize ;)), so it is not exclusive to Tumbleweed. This is on an install I was working on for somebody else, but they wanted it quick and stable, so I installed another distro and desktop. Unfortunately, I cannot add much to the bug reports, as I do not have the machine at my disposal and it is not having that issue now.

That same machine, a Lenovo laptop, ran openSUSE 13.1 with KDE4 flawlessly since the day 13.1 came out.

But, as you have observed, there is a definite problem with openSUSE right now that seems to be getting mostly ignored by those who (should) be able to help.

BTW: Have you tried this, which seems to be helping in a similar situation?

I had the same problem on openSUSE Leap 42.2.
I used your suggestion and deleted

rm /usr/share/appdata/org.kde.gcompris.desktop
rm /usr/share/applications/org.kde.gcompris.desktop

And now, my desktop doesn’t freeze anymore.

I am curious if it will help you.

Thank you for the suggestion! However, I don’t have gcompris installed, nor either of those two files. Also I don’t believe the issue is KDE / Plasma exclusive: some games still cause GPU freezes themselves… and at the end of the day Mesa and the drivers shouldn’t allow applications using them to crash the system, so the flaw is still ultimately on their end.

In any case, it helps to know someone can confirm this. Could you please check if the machine you noticed those crashes on also has an AMD card and uses the RadeonSI architecture?

And if you want a stable distro, I still recommend openSUSE 42.2. This issue was only introduced in Tumbleweed about two weeks ago for me.

Nope, it is Intel Centrino.

And if you want a stable distro, I still recommend openSUSE 42.2. This issue was only introduced in Tumbleweed about two weeks ago for me.

Nope, complete lockup and only the mouse pointer moves: Alt-Tab does not work, Ctrl-Esc does not work, clicks do not work. The only way out is Ctrl-Alt-Backspace x2 and logging back in.

Not suitable for an install & pickup/deliver situation where the client needs quick and stable up and running.

First time in all these years I have had to use caution on whose machine I install openSUSE, which has me a bit worried about what is going on.

Here is another one for you to test (although I will start another thread and file a bug report for it as soon as I get time):

User cannot change their own password without root privileges???

… oh, and why I think it might be KDE-related?

Because, so far, all the numerous threads about similar freezes are people using KDE. But, that proves nothing, at this point. Could be Gnome or Xfce users have not experienced it as yet, or have not reported it in the forums.

Interesting… the freeze might not be radeon exclusive then. GPU freezes might be more frequent with KDE because it has an advanced desktop compositing system, whereas others use a simpler rendering pipeline… this means that if there’s an issue in a driver, it’s likely to trigger it alongside games that use OpenGL. And it’s sad to hear that 42.2 is in such a bad state too… Linux in general has a tendency to not work well with certain hardware unfortunately, because various drivers are always buggy.

Well, a relief of some sort: At least I know this is NOT my beloved openSUSE!

Seems it is happening with other distros, as well.

And, it seems to be somewhat randomly targeting systems, and I even get the sense that the symptoms can slightly vary.

I see this problem popping up on the IRC channels lately, as well as in the mailing lists.

Here are some excerpts from the mailing lists:

While that restarts plasmashell when it’s working properly, it doesn’t
when the problem occurs. I just tried it now and got the following:

$ Calling appendChild() on a null node does nothing.
"Quitting application plasmashell failed. Error reported was:

org.freedesktop.DBus.Error.NoReply : Did not receive a reply. Possible
causes include: the remote application did not send a reply, the message
bus security policy blocked the reply, the reply timeout expired, or the
network connection was broken."

=====

That doesn’t work when it locks. I get the error:

$ "Quitting application plasmashell failed. Error reported was:

org.freedesktop.DBus.Error.NoReply : Did not receive a reply. Possible
causes include: the remote application did not send a reply, the message
bus security policy blocked the reply, the reply timeout expired, or the
network connection was broken."

Like Patrick’s suggestion, it will restart plasmashell when everything
is normal, but not when the problem occurs.
The only way I can get out of it is to use Ctrl-Alt-Backspace to force a
desktop shutdown.

=====

Sorry to be joining this thread late, but I have been on vacation… James, I too have been plagued by this failure on my laptop but, for some mysterious reason, not on my desktop systems. They are all running the same version of openSUSE Leap 42.2 and KDE/Plasma. I and others have also reported it, not only here on openSUSE but in other distros as well. Yes, the kicker bar freezes and nothing but open windows respond to mouse or keyboard input. Something about plasmashell appears to be sick…

The best solution I have found so far is to leave a Konsole shell window open on my desktop. That way I can also get to it so as to issue the following two commands -

killall plasmashell
plasmashell &

Do not issue this as root, but just do it under your own user account. Basically kill the plasmashell and restart it without regard to process ids. There are other ways to do this, as already has been pointed out, but like you I found problems and errors when I tried other approaches. This seems to be the more reliable way for me to get my desktop/plasmashell back.
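The two commands above could be wrapped in a small function so they are quicker to type when the desktop is half frozen. This is only a sketch: `restart_shell` is a made-up name, and detaching with `nohup` is my own addition so that the new plasmashell is not tied to the Konsole window it was started from.

```shell
# Restart a desktop shell process as the current (non-root) user.
# restart_shell is a hypothetical wrapper around killall + relaunch.
restart_shell() {
    local name="${1:-plasmashell}"

    # Refuse to run as root, as the post above advises.
    if [ "$(id -u)" -eq 0 ]; then
        echo "run this as your normal user, not root" >&2
        return 1
    fi

    # Kill any running instances; ignore the error if none are running.
    killall "$name" 2>/dev/null || true

    # Start a new instance in the background, detached from this terminal.
    nohup "$name" >/dev/null 2>&1 &
}
```

With the function in ~/.bashrc, typing `restart_shell` in the open Konsole window does the same thing as the two commands.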

=====

This same problem plagued Manjaro months ago, and was resolved by regular KDE updates.

While it lasted I used Marc’s solution keyed into the little command window that popped up after pressing alt-f2.

Any need to do this disappeared months ago: Manjaro is a rolling release, so regular KDE upgrades came along and the problem went away.

I believe it was traced to invisible remnants of a screen overlay that had crashed but was not completely shut down and was still intercepting mouse and keyboard inputs on some areas of the screen.
There were numerous hits in bugs.kde.org on this issue.

For me it occurred on a desktop with an Intel video chipset, FWIW.

I wonder something: Are you using SDDM for the Display Manager? If so, can we rule out SDDM? Try installing KDM (I understand it is no longer installed by default) and switching to it, then find out if the freezing problem persists.

I would test that, but I do not have an operating machine with 42.2, Tumbleweed, KDE-Plasma at this time.