Hi,
my work PC (openSUSE 13.1) has started restarting suddenly (it just switches off as if the power cable were pulled out and then restarts). This seems related to the CPU, but I am not sure about this. It began at the beginning of the week; before that it was sometimes on for weeks without problems.
This week I had to run some programs which needed full CPU load for about 2 days, and the 3 restarts I am talking about always happened during these periods, not at the beginning but mostly after about one day of running. But I cannot be sure about this, because there were only 1-2 days this week where I had no programs running; during that time the PC was stable. Yesterday I started a new program, and the PC restarted just 5-10 minutes ago while I was in front of it.
But as I said, before this week everything was fine. The last time I had programs running was about 3 months ago; some of them ran for weeks and everything was stable.
The CPU fan is still running, but there are so many other possible causes, like the power supply, the RAM or the thermal paste on the CPU.
What would you do? Checking the temperatures is problematic because the PC mostly restarts when I am at home.
Is there a CPU and/or RAM checking program I could run overnight which tests and logs everything, so that I can check for a hardware problem the next day when the PC has probably restarted?
Earlier versions of KDE in openSUSE 13.1 had a memory leak, which I believe could possibly cause a reboot (unsure), but the fixes have been out for some time.
wrt thermal / heat problems: my experience (some years back) with that sort of problem was that it typically causes a complete shutdown and not just a restart.
If your PC has a /var/log/messages file, that may provide some hints.
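To make the /var/log/messages suggestion a bit more concrete, here is a minimal sketch that pulls out the last few lines written before each boot. It assumes that every boot shows up in the log as a kernel line containing "Linux version"; that marker and the function name `lines_before_boots` are assumptions for illustration, not something confirmed for this particular machine. If the lines just before a boot marker show no shutdown messages at all, that would be consistent with a sudden power loss rather than a clean restart.

```python
def lines_before_boots(lines, marker="Linux version", context=5):
    """For each line containing `marker`, return the `context` lines just before it.

    `marker` is an assumed boot indicator; adjust it to whatever your log
    actually prints at startup.
    """
    results = []
    for i, line in enumerate(lines):
        if marker in line:
            # Slice the lines directly preceding the boot marker.
            results.append(lines[max(0, i - context):i])
    return results
```

Reading the log usually needs root, so one possible way to use it is `lines_before_boots(open("/var/log/messages", errors="replace").read().splitlines())` from a root Python session, then printing each returned group.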
If you want to record the temperature and CPU usage you could use the program “psensor”. When you enable logging it creates a file at ~/.psensor/sensor.log and records the parameters as comma separated values. Combined with the suggested logs from /var/log you could check for correlations between the crashes and the recorded parameters.
From my experience with overheated CPUs I can say that they rather lead to a shutdown than a restart.
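If psensor is not an option, a small script can record similar data. The sketch below is not psensor and does not use its log format; it is a stand-alone example that appends a timestamp, the 1-minute load average and any temperatures found under /sys/class/thermal to a CSV file, and forces every row to disk so the last readings should survive a sudden power-off. The sysfs path, the function names and the 10 second interval are assumptions; on some boards the temperatures are only exposed elsewhere (for example under /sys/class/hwmon), in which case the pattern has to be changed.

```python
import glob
import os
import time


def read_temps_c(pattern="/sys/class/thermal/thermal_zone*/temp"):
    """Return temperatures in deg C from sysfs files that report millidegrees."""
    temps = []
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path) as f:
                temps.append(int(f.read().strip()) / 1000.0)
        except (OSError, ValueError):
            continue  # skip sensors that cannot be read or parsed
    return temps


def format_row(timestamp, load1, temps):
    """Build one CSV row: epoch seconds, 1-minute load average, temperatures."""
    fields = ["%d" % timestamp, "%.2f" % load1] + ["%.1f" % t for t in temps]
    return ",".join(fields)


def log_forever(logfile, interval_s=10):
    """Append one row every interval_s seconds and force it to disk."""
    with open(logfile, "a") as out:
        while True:
            row = format_row(time.time(), os.getloadavg()[0], read_temps_c())
            out.write(row + "\n")
            out.flush()
            os.fsync(out.fileno())  # push the row to disk before sleeping
            time.sleep(interval_s)
```

Calling `log_forever("/var/tmp/hwlog.csv")` before leaving work would leave a file whose last row gives the approximate time, load and temperature just before the restart.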
Thanks for your answers. Before I left my workplace I noticed another possible reason for the restarts: I use conky to display some basic system information on my desktop, including the RAM usage, and I noticed that the maximum RAM listed was only 16 GB instead of the 24 GB I have installed (that was not the case last week). So I checked every single RAM stick (I have six 4 GB sticks) by having only one plugged in and then rebooting into the BIOS. In the end every stick was detected by the BIOS, so I plugged all 6 of them back in.
A full boot into KDE revealed that the full 24 GB are now found. I do not know if that was the reason, but I started one of my programs again. If the PC is still running on Monday, then probably the RAM was just not plugged in correctly (though I do not know how it could have come loose). If the PC restarts again I will replace the power supply and the CPU thermal paste; if that does not help either, I will come back to this thread.
Did you actually test the RAM? Just plugging it in does not mean much; the problem may be in a single cell. Changing the order of the sticks may change when the problem happens. Also, RAM problems can be heat related, so you really need to run memtest at least overnight. The option should be in the boot menu; if not, boot from install media and you will find it there.
No, I did not run memtest on the RAM sticks, because the problem was that only 16 of the 24 GB were listed/found by the OS, so I thought this could only be the case if they were not correctly detected by the BIOS in the first place. A simple test showed that they are all detected by the BIOS, and then by the OS too.
My plan is to switch the power supply and the whole CPU cooler anyway, and then run a full memtest over the weekend and maybe next week.
If the PC survives this, then one of the two actions was helpful; if not, at least I can rule out CPU overheating and power problems.
At the moment I am sure that the PC will crash/restart again during memtest, and on Monday I will have nothing because there probably are no logs.
Check if your PC needs to be cleaned. If that’s the case, carefully clean the CPU, clean the gold contacts of the RAM sticks with a pencil eraser, then clean the RAM slots with a can of compressed air, then test each RAM stick and memory slot individually with memtest.
If your PC is 6 years old or more, it is OK to check and/or replace the thermal grease of the CPU (OTOH, if the CPU is getting hot, the PC is supposed to shut down, not restart!).
Look if there’s a BIOS update for your motherboard, or check if there’s a setting in the BIOS that is causing the trouble!
Check that you are not overloading the PSU, and whether the PSU needs to be opened and cleaned with compressed air!
P.S. Remember to check all the cables and connectors of your PC for possible damage!
OK, so I just switched the power supply, cleaned the whole case and replaced the CPU cooler and the thermal paste as well. I am currently running the normal memtest (memtest86+ 5.01, F1 fail-safe mode I guess).
There also seems to be an SMP or multithreaded mode, selected by pressing F2. Should I do that or should I stay in the normal mode?
OK, the memtest passed after 2 hours without any errors. I am still at work. Should I just let this run, is there a deeper test I can run over the weekend, or should I try to run my program again and see if the PC is still running on Monday? What would you do?
OK, the weekend is over and I am at work again. The memtest ran for 72 hours, 22 minutes and 30 seconds; 12 tests were passed and none of them failed (zero errors).
I will now start one of my programs again; I will post again if the PC restarts.
… could also be graphics, or as it appears, a memory corruption problem, though most often these two reasons just cause a lockup. But, occasionally, they can cause a reboot.