System reboots itself regularly/randomly

Hello.

I don’t really know what to make of this, but today my computer suddenly started to switch off and reboot; it has now happened three times in a row. There were some updates today which might have triggered something; however, my other desktop runs fine, though with a slightly different set of packages. Both run openSUSE 13.1 with KDE 4.11.5.

I could really use some help interpreting the system log. This is from the end of booting until it went down again this last time; if I should include the whole boot-up process, just say so.

 2014-09-25T15:14:27.505603+02:00 DDAmt systemd[1]: Startup finished in 2.193s (kernel) + 6.193s (userspace) = 8.387s.
 2014-09-25T15:14:33.882520+02:00 DDAmt polkitd[1700]: Registered Authentication Agent for unix-session:1 (system bus name :1.19 [/usr/lib64/kde4/libexec/polkit-kde-authentication-agent-1], object path /org/kde/PolicyKit1/AuthenticationAgent, locale en_GB.UTF-8)
 2014-09-25T15:14:57.501917+02:00 DDAmt systemd[1]: Starting Stop Read-Ahead Data Collection...
 2014-09-25T15:14:57.505692+02:00 DDAmt systemd[1]: Started Stop Read-Ahead Data Collection.
 2014-09-25T15:15:01.510305+02:00 DDAmt /usr/sbin/cron[2534]: pam_unix(crond:session): session opened for user root by (uid=0)
 2014-09-25T15:15:01.512582+02:00 DDAmt systemd[1]: Starting user-0.slice.
 2014-09-25T15:15:01.513264+02:00 DDAmt systemd[1]: Created slice user-0.slice.
 2014-09-25T15:15:01.513933+02:00 DDAmt systemd[1]: Starting User Manager for 0...
 2014-09-25T15:15:01.516038+02:00 DDAmt systemd[1]: Starting Session 2 of user root.
 2014-09-25T15:15:01.516569+02:00 DDAmt systemd[1]: Started Session 2 of user root.
 2014-09-25T15:15:01.518665+02:00 DDAmt systemd: pam_unix(systemd-user:session): session opened for user root by (uid=0)
 2014-09-25T15:15:01.522768+02:00 DDAmt systemd[2535]: Failed to open private bus connection: Failed to connect to socket /run/user/0/dbus/user_bus_socket: No such file or directory
 2014-09-25T15:15:01.528810+02:00 DDAmt systemd[2535]: Stopped target Sound Card.
 2014-09-25T15:15:01.529080+02:00 DDAmt systemd[2535]: Starting Default.
 2014-09-25T15:15:01.529308+02:00 DDAmt systemd[2535]: Reached target Default.
 2014-09-25T15:15:01.529621+02:00 DDAmt systemd[2535]: Startup finished in 6ms.
 2014-09-25T15:15:01.529862+02:00 DDAmt systemd[1]: Started User Manager for 0.
 2014-09-25T15:15:01.540536+02:00 DDAmt /USR/SBIN/CRON[2534]: pam_unix(crond:session): session closed for user root
 2014-09-25T15:15:03.878522+02:00 DDAmt kernel: [   45.632062] fuse init (API version 7.22)
 2014-09-25T15:15:03.879517+02:00 DDAmt systemd[1]: Mounting FUSE Control File System...
 2014-09-25T15:15:03.883585+02:00 DDAmt systemd[1]: Mounted FUSE Control File System.
 2014-09-25T15:29:19.118922+02:00 DDAmt systemd[1]: Starting Cleanup of Temporary Directories...
 2014-09-25T15:29:19.127032+02:00 DDAmt systemd-tmpfiles[2829]: stat(/run/user/1000/gvfs) failed: Permission denied
 2014-09-25T15:29:19.180399+02:00 DDAmt systemd[1]: Started Cleanup of Temporary Directories.
 2014-09-25T15:30:01.548346+02:00 DDAmt /usr/sbin/cron[2831]: pam_unix(crond:session): session opened for user root by (uid=0)
 2014-09-25T15:30:01.551607+02:00 DDAmt systemd[1]: Starting Session 3 of user root.
 2014-09-25T15:30:01.552032+02:00 DDAmt systemd[1]: Started Session 3 of user root.
 2014-09-25T15:30:01.580901+02:00 DDAmt /USR/SBIN/CRON[2831]: pam_unix(crond:session): session closed for user root
 2014-09-25T15:34:36.327168+02:00 DDAmt rsyslogd: [origin software="rsyslogd" swVersion="7.4.7" x-pid="578" x-info="http://www.rsyslog.com"] start
 2014-09-25T15:34:36.327176+02:00 DDAmt systemd[1]: Expecting device dev-disk-by\x2did-ata\x2dWDC_WD1002FAEX\x2d00Z3A0_WD\x2dWCATRA450801\x2dpart2.device...
 2014-09-25T15:34:36.327269+02:00 DDAmt kernel: [    0.000000] Initializing cgroup subsys cpuset
 2014-09-25T15:34:36.327269+02:00 DDAmt systemd[1]: Starting Root Slice.
 2014-09-25T15:34:36.327273+02:00 DDAmt kernel: [    0.000000] Initializing cgroup subsys cpu
 2014-09-25T15:34:36.327274+02:00 DDAmt kernel: [    0.000000] Initializing cgroup subsys cpuacct
 2014-09-25T15:34:36.327276+02:00 DDAmt kernel: [    0.000000] Linux version 3.11.10-21-desktop (geeko@buildhost) (gcc version 4.8.1 20130909 [gcc-4_8-branch revision 202388] (SUSE Linux) ) #1 SMP PREEMPT Mon Jul 21 15:28:46 UTC 2014 (9a9565d)
 2014-09-25T15:34:36.327278+02:00 DDAmt kernel: [    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-3.11.10-21-desktop root=UUID=c0b50e76-ab2b-4084-aa19-27985dee36b4 resume=/dev/disk/by-id/ata-WDC_WD1002FAEX-00Z3A0_WD-WCATRA450801-part2 splash=silent quiet showopts elevator=deadline


Thanks,
Olav

Needless to say, this should not be happening.

The log is from startup. If you can find logs from just before shutdown, that might be more useful.

This could be a hardware problem. You might try reseating the memory chips.

Hi nrickert.

Would you suggest a log file? I don’t know much about diagnosing.
I did have a look in the warnings log. There are lots of messages, and I checked those I thought were relevant, but none of them were specific to the present problem; all had occurred earlier.

BTW, I did find something a bit peculiar: I tried to manually empty /tmp using superuser Dolphin and received error messages for probably half of the files/dirs there, either ‘no permission …’ or ‘file does not exist’.
Removing them from the command line was no problem, though.
This is on an SSD, which (should) clear /tmp at boot (as seen in the log).

Thank you!

If some process is using a temp file, you may not be able to delete it from a GUI even as root, though the command line would have no problem. However, you may mess up whatever was using the files. Was this before the random shutdown or after?

On 2014-09-25 16:36, nrickert wrote:
>
> Needless to say, this should not be happening.
>
> The log is from startup. If you can find logs from just before
> shutdown, that might be more useful.

It is there. Look again:


2014-09-25T15:30:01.552032+02:00 DDAmt systemd[1]: Started Session 3 of user root.
2014-09-25T15:30:01.580901+02:00 DDAmt /USR/SBIN/CRON[2831]: pam_unix(crond:session): session closed for user root
* 2014-09-25T15:34:36.327168+02:00 DDAmt rsyslogd: [origin software="rsyslogd" swVersion="7.4.7" x-pid="578" x-info="http://www.rsyslog.com"] start
2014-09-25T15:34:36.327176+02:00 DDAmt systemd[1]: Expecting device dev-disk-by\x2did-ata\x2dWDC_WD1002FAEX\x2d00Z3A0_WD\x2dWCATRA450801\x2dpart2.device...
2014-09-25T15:34:36.327269+02:00 DDAmt kernel: [    0.000000] Initializing cgroup subsys cpuset
2014-09-25T15:34:36.327269+02:00 DDAmt systemd[1]: Starting Root Slice.
2014-09-25T15:34:36.327273+02:00 DDAmt kernel: [    0.000000] Initializing cgroup subsys cpu
2014-09-25T15:34:36.327274+02:00 DDAmt kernel: [    0.000000] Initializing cgroup subsys cpuacct
2014-09-25T15:34:36.327276+02:00 DDAmt kernel: [    0.000000] Linux version 3.11.10-21-desktop (geeko@buildhost) (gcc version 4.8.1 20130909 [gcc-4_8-branch revision 202388] (SUSE Linux) ) #1 SMP PREEMPT Mon Jul 21 15:28:46 UTC 2014 (9a9565d)
2014-09-25T15:34:36.327278+02:00 DDAmt kernel: [    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-3.11.10-21-desktop root=UUID=c0b50e76-ab2b-4084-aa19-27985dee36b4 resume=/dev/disk/by-id/ata-WDC_WD1002FAEX-00Z3A0_WD-WCATRA450801-part2 splash=silent quiet showopts elevator=deadline

Boot is at 2014-09-25T15:34 (the kernel “command line” entry is just a few lines below).
The line before it, at 2014-09-25T15:30, is from cron, the last entry from the previous session.
It crashed suddenly and without log entries between those two.
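
For anyone wanting to find such crash points in a longer log, here is a minimal Python sketch of the reading above: it looks for each rsyslogd "start" line and reports the entry just before it and the size of the gap. The function name `find_restarts` and the 20-line lookback are made up for illustration, and treating an rsyslogd "exiting on signal" line as the sign of a clean shutdown is an assumption, not something shown in this thread. It also assumes every line begins with an ISO timestamp, as in the excerpt.

```python
from collections import deque
from datetime import datetime

# Two lines copied from the log excerpt in this thread.
SAMPLE = [
    '2014-09-25T15:30:01.580901+02:00 DDAmt /USR/SBIN/CRON[2831]: pam_unix(crond:session): session closed for user root',
    '2014-09-25T15:34:36.327168+02:00 DDAmt rsyslogd: [origin software="rsyslogd" swVersion="7.4.7" x-pid="578" x-info="http://www.rsyslog.com"] start',
]

def find_restarts(lines, lookback=20):
    """Return one (last_line, start_line, gap_seconds, looks_clean) tuple per rsyslogd start."""
    recent = deque(maxlen=lookback)  # the last few lines seen before the current one
    found = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if "rsyslogd:" in line and line.endswith("] start") and recent:
            last = recent[-1]
            gap = (datetime.fromisoformat(line.split()[0])
                   - datetime.fromisoformat(last.split()[0])).total_seconds()
            # Assumption: a clean shutdown leaves an "exiting on signal" line shortly before.
            looks_clean = any("exiting on signal" in r for r in recent)
            found.append((last, line, gap, looks_clean))
            recent.clear()  # do not let the previous session leak into the next check
        recent.append(line)
    return found

for last, start, gap, looks_clean in find_restarts(SAMPLE):
    print(f"logging restarted after a gap of {gap:.0f}s; clean shutdown seen: {looks_clean}")
    print("last entry before it:", last)
```

To run it on a real log, pass the open file instead of `SAMPLE`, for example `find_restarts(open("/var/log/messages"))`; the path is a guess at where the system log lives on your install.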

> This could be a hardware problem. You might try reseating the memory
> chips.

Yep.
Or heat.


Cheers / Saludos,

Carlos E. R.
(from 13.1 x86_64 “Bottle” at Telcontar)

Hi, I suppose you’re right, but I have done this for years and have not experienced this before; I don’t do it while apps are running, though.

Lots of things other than apps use /tmp for temp files.

Hi, heat is not the problem, I’m quite sure of that. I also just ran a memtest; one full pass revealed no problems, so I presume it’s not a memory problem either.
One thing I did not mention initially is that I changed my graphics card three days ago; I did not think about it, actually, as I had not noticed anything before today.

Is there a log entry somewhere that would reveal more clearly what is happening?

Thanks!

Boot with “nomodeset”. You can do that manually at the boot prompt. If it runs reliably for an extended period of time that way, then you may have evidence that it is graphics-card related. Maybe also look at Xorg.log.
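
As a rough aid for the Xorg.log suggestion, here is a small Python sketch that prints only the lines the X server marked as errors, (EE), or warnings, (WW). The function name `xorg_problems` is made up, and the two file paths are assumptions about where the logs live; the ".old" file would be the log of the previous X session, which is the one of interest after a sudden reboot.

```python
import re
from pathlib import Path

# A line that begins with an (EE) or (WW) marker, optionally preceded by a
# "[ seconds.millis]" timestamp. Anchoring at the start of the line skips the
# marker legend in the log header, which is indented.
MARKER = re.compile(r"^(?:\[\s*\d+\.\d+\]\s+)?\((EE|WW)\)")

def xorg_problems(lines):
    """Return (marker, line) pairs for the error and warning lines."""
    hits = []
    for line in lines:
        m = MARKER.match(line)
        if m:
            hits.append((m.group(1), line.rstrip()))
    return hits

# Assumed locations; adjust to wherever your X server writes its log.
for name in ("/var/log/Xorg.0.log.old", "/var/log/Xorg.0.log"):
    path = Path(name)
    try:
        text = path.read_text(errors="replace")
    except OSError:
        continue  # file missing or unreadable, nothing to show
    print(f"== {name}")
    for marker, line in xorg_problems(text.splitlines()):
        print(line)
```

Warnings are often harmless, so the (EE) lines near the end of the old log are probably the ones worth posting.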

I used to have this problem when I was overclocking my CPU a little too high. If you are, try setting it without overclocking (or a slightly lower overclock setting) and see if it helps.

Thanks for the inputs.

I did a system reinstall, as things were getting weird.
I tried to log in as root to remove various local user configurations, resetting the home variables that way; however, even that was more or less blocked, as the previously working root-user settings could not be loaded. I ended up with a blank screen, no panel, no nothing, though I could start processes from a terminal.
I found this very strange, as the only thing I ever did to the system was install the hardware (GPU) and reinstall the nvidia driver afterwards; plus, I had used it for a couple of days without noticing anything in particular before now.

The GPU is brand new and the card I previously had in this box I moved over to another machine; that transition went problem-free though, with the exact same procedure as above.

So it might be that I unintentionally inflicted some damage somewhere when I mounted the GPU in the rack; I hope not, though.

Thanks all!

BTW, no overclocking has been attempted, though I had in fact considered doing so :) It is an NVIDIA MSI 760 TwinFrozr Gaming … (I’m not really much of a gamer, though).

On 2014-09-26 00:16, F Sauce wrote:

> The GPU is brand new and the card I previously had in this box I moved
> over to another machine; that transition went problem-free though, with
> the exact same procedure as above.

Try reseating the card. If not, change back to the old card. If that
works, the card may be faulty or not compatible.

Run gkrellm, and activate the temperature sensors; then watch them.


Cheers / Saludos,

Carlos E. R.
(from 13.1 x86_64 “Bottle” at Telcontar)

I have tested the temperatures; they are quite good (low) for HD, GPU and CPU.

Same problem after the reinstall, oh well:)

On 2014-09-26 12:26, F Sauce wrote:

My system just rebooted suddenly today. I’m suspecting the nvidia driver
and downgrading it to a previous version.

My working cycle includes hibernating the machine, instead of halting
it, when I stop using it. Conserves power. I normally get into the room,
power it up, then go make a cup of tea. When I get back, it is running
and the mail fetched and so.

About when I updated nvidia to G03-340 I started getting this problem:
some days when I get there with my tea, the machine is not “running”,
but it has done a full boot, and it is doing an fsck of all filesystems,
as normal after a crash. One suspicion is that I may have switched off the
power bar too soon the previous day, distractedly, before hibernation
finished. I know I have done this once, but thrice in a row?

So I made a point of not powering up and then going to get tea, but
instead making tea first, then powering it up, sitting down and watching.
It had been going well for a long time, since 2014-09-11. And today I saw
it happen: the machine came up from hibernation, it awoke, the graphics
came up; at about this point the screen saver should kick in and hide the
display. Instead, the machine rebooted hard and I saw the BIOS RAM check
going by. A full and sudden reboot.

What?

So now I’m downgrading nvidia, to G03-331, to see if it is that… I
have another suspect, the xfs home filesystem (a known problem I have),
but that would cause a kernel problem and report, not a bios crash/boot.


Cheers / Saludos,

Carlos E. R.
(from 13.1 x86_64 “Bottle” at Telcontar)

OK, that is interesting.
I have two computers running NVIDIA; this one is unstable, the other one is quite fine, and only this one uses the latest driver. I hadn’t thought about it, so I checked: it runs 331.79.
What I have noticed is that the reboot issue here is most likely related to screen blanking/saving; it has not happened while the screen has been active.

I tried playing left4dead just to stress the GPU a little; it lagged a bit here and there, though that was probably only the server, so it seems fine in some ways at least.

Please let me know what conclusion you end up with.

Thanks,
Olav

On 2014-09-26 17:06, F Sauce wrote:

> Please let me know what conclusion you end up with.

I have now downgraded from G03-340.32 to G03-331.79. I had to reboot
several times, remove some rpms from older versions, force reinstall of
the G03-331.79 rpms… because the kernel was still loading the wrong
driver. Now it is working correctly, it seems.
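
One way to see which driver version the kernel actually loaded, as opposed to which rpm is installed, is to ask the kernel. The Python sketch below reads `/sys/module/<name>/version`, where the kernel exposes the version a loaded module declares; that the nvidia module fills this in on your system is an assumption, and the function name `loaded_module_version` is made up for illustration. The wanted version, 331.79, is the one mentioned in this thread.

```python
from pathlib import Path

def loaded_module_version(module="nvidia", sysfs=Path("/sys/module")):
    """Return the version string of a loaded kernel module, or None if it is
    not loaded or does not expose a version."""
    try:
        return (Path(sysfs) / module / "version").read_text().strip()
    except OSError:
        return None

wanted = "331.79"  # the version the downgrade was supposed to install
loaded = loaded_module_version()
if loaded is None:
    print("no nvidia module version found (module not loaded?)")
elif loaded == wanted:
    print(f"kernel is running the expected driver {loaded}")
else:
    print(f"kernel loaded {loaded}, but {wanted} was expected")
```

If the two differ after a reboot, a stale module from an older package is probably still being picked up, which matches what is described here.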

But my crashes were very sporadic, could take days or weeks to happen again.

If I understand correctly, the one you have problems with runs G03-340;
I would try downgrading to find out if that solves your issue.

However, the older version is not on the repo. I just happen to have a
backup…


Cheers / Saludos,

Carlos E. R.
(from 13.1 x86_64 “Bottle” at Telcontar)

OK
The older driver should be available from NVIDIA though, I may install it that way.

I had to reboot
several times, remove some rpms from older versions, force reinstall of
the G03-331.79 rpms… because the kernel was still loading the wrong
driver.

I went that route myself a month or so ago:)
I manually removed the older modules of the package that was to be upgraded; I don’t remember precisely, but due to some dependency resolution I overlooked (I guess), I ended up ‘using’ a driver which wasn’t really installed, though its module was still present.

On 2014-09-26 18:16, F Sauce wrote:
>
> OK
> The older driver should be available from NVIDIA though, I may install
> it that way.

The hard way, yes.

>> I had to reboot
>> several times, remove some rpms from older versions, force reinstall of
>> the G03-331.79 rpms… because the kernel was still loading the wrong
>> driver.
> I went that route myself a month or so ago:)
> I manually removed the older modules of the package that was to be
> upgraded; I don’t remember precisely, but due to some dependency
> resolution I overlooked, I ended up ‘using’ a driver which wasn’t
> really installed, though its module was still present.

Same issue, but I got no dependency message.


Cheers / Saludos,

Carlos E. R.
(from 13.1 x86_64 “Bottle” at Telcontar)

You seem to be running KDE.

Go into the KDE power settings, and disable screen dimming. Leave the setting to save power by switching off the screen. Disable (uncheck) only the dimming.

See if that helps.

On my laptop with Intel graphics, I sometimes have system freezes. The freeze occurs after dimming. It seems to occur just as the screen returns to full brightness (when I move a mouse). Since I disabled dimming, freezes have been very rare.

On an older laptop with radeon graphics, when the screen returns to full brightness (after having dimmed), it goes blank. And rebooting is the only way to get it back. Since I disabled dimming, I have not had a recurrence.

I don’t know if the suggestion will help, but it is worth a try.

Yes, I suppose I could do that, though it is a bit like sweeping the problem under the carpet.
But I think it would work for a normal session.

I’m a bit more optimistic about the whole thing, though, as the issue seems likely to be software related, perhaps driver related.

The machine has been running today and it does not crash while the screen is active; and, as it seems, it does not crash unless Firefox is running together with Flash when screen blanking occurs. Konqueror + Flash has been running for hours now with no crash/reboot.

This may also be an issue: What to do about error message: (process:2862): GLib-CRITICAL **: g_slice_set_config: assertion `sys_page_size == 0' failed | Firefox Support Forum | Mozilla Support
Starting Firefox (and Konqueror) from Konsole shows some similarity with that bug:

olav@DDAmt:~> firefox 

(process:12672): GLib-CRITICAL **: g_slice_set_config: assertion 'sys_page_size == 0' failed


(firefox:12672): Gdk-WARNING **: gdk_window_set_icon_list: icons too large


(firefox:12672): Gdk-WARNING **: gdk_window_set_icon_list: icons too large
^X^C
olav@DDAmt:~> konqueror 
^C
olav@DDAmt:~> 



Do any of you get these errors?
The solution that worked for him will not work for me, though; I have none of those installed.

Thanks