-
openSUSE 13.1 (i586) server hangs / freezes / gets unresponsive randomly
Wow, it's been a while since I last wrote on these forums. Guess my knowledge is better nowadays. Hi all.
Anyways, today my knowledge ended after days of trying to figure this out.
My setup is quite simple: I have one little home server running in my closet, which contains an Intel Core 2 Duo E6300 1.86GHz processor, an Asus P5LD2 SE motherboard, 2 GB of Kingston DDR2 RAM, a Nexus NX-3500 350W power supply, and a couple of hard drives.
My server is over 5 years old and has now been running for 4 years without any problems. I'm running LAMPP, PostgreSQL, a couple of media-streaming services like Plex and Subsonic, and irssi and bots. For a couple of months now it has been freezing randomly: all connections die, SSH times out, etc. Nothing helps but physically pressing the reset button. When I check the logs there's nothing in them, only an endless line of ^@^@^@^@^@ characters where the freeze was. After that there is a skip in the timestamps (the time during the "timeout"/freezing/unresponsiveness) and then boot messages, with nothing in between and no hints about what's going on.
What I've done so far:
- Replaced my original Deltaco 350W power supply with a Nexus NX-3500, old but works slightly better I think
- Replaced two of the 512 MB RAM sticks with one 1 GB stick
- Disabled all fan controls in the BIOS, which took the CPU fan from 800 RPM to 1700-2000 RPM
- Removed an unnecessary graphics card and a WLAN card I have not used for ages
- Applied new thermal paste to the CPU after wiping off the old one; it had been the same paste for about 6 years, there was very little of it left and it was dry
No visible difference. Sensors show somewhat high numbers at times, with cores 1 and 2 sometimes over 70 °C, but they settle at a stable 45-50 °C quite quickly.
Any hints about what's going on? Is it the system, some process, or a hardware failure? Maybe the motherboard is going bad? Should I start considering a new motherboard/processor, or buy a more effective cooler?
Chameleon in my heart since 2006.
-
Re: openSUSE 13.1 (i586) server hangs / freezes / gets unresponsive randomly
I may have solved this myself. I still had the NVIDIA proprietary drivers installed, even though I have been running at runlevel 3 for a long time. I don't know whether that caused a conflict, but I removed them by running the .run file with the --uninstall option. I also created a bash script that alerts me through Pushover if the temperature gets too high (I have cpu_temp running as a service that writes the CPU temperature to a file), and I run it as a cron job every minute:
Code:
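#!/bin/bash
# Read the latest CPU temperature written by the cpu_temp service.
TEMP=$(cat /home/rolle/.cpu_temp)
# Drop any fractional part; -gt compares integers (> inside [[ ]] compares strings).
if [[ ${TEMP%.*} -gt 60 ]]; then
    sendpushover 'CPU temp is CRITICAL! (60°C+)'
fi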
So when I got the alert, I noticed that the CPU (both cores) had been running at 100% for 10 minutes and temperatures were as high as 90 °C. The process was perl pisg, which caused the CPU to throttle. I removed that cron job and am now investing in a new cooler; I think something is broken in the old fan.
Server uptime is now over a day and I think it will stay up. If not, I'll be back. Hope this helps someone.
Chameleon in my heart since 2006.
-
Re: openSUSE 13.1 (i586) server hangs / freezes / gets unresponsive randomly
Hi again. Sorry for the double post, but I solved it. It was not the temperatures, nor the perl or python processes. It was the hard drive that was failing. I wonder why the errors didn't show up in /var/log/messages but only on the server's actual monitor: bad block and superblock errors, a lot of them. Finally, after one freeze and reboot, I got "No operating system found".
I got most of the data saved with KNOPPIX and TestDisk. Now reinstalling openSUSE. Hope this helps someone.
Chameleon in my heart since 2006.
-
Re: openSUSE 13.1 (i586) server hangs / freezes / gets unresponsive randomly
I hope your reinstalled server works better now, but it may be that the corrupted disk blocks are a result of a failing system and not the cause.
90C is crazy high. Is the closet door open so there's adequate air circulation? The thermal paste layer should be very thin -- too much paste (or any gap or non-flat surface mating) prevents it from transferring heat to the heatsink. Isopropyl alcohol is good for cleaning off the old paste. Dust clogged in the CPU or case or PSU fans can also cause overheating. Try vacuuming them and look for dust collecting on the motherboard (esp. along the RAM sockets).
The ^@ characters are just NULs (binary zero).
The pair of 512MB couldn't have contributed much heat beyond a single 1GB, and the C2D will run faster with dual-channels. You could always run a RAM test if you think one of them might be flaky.
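To check how many NUL bytes ended up in a log and to read the lines around them, something like the following could be used. This is only a minimal sketch; the function names `count_nuls` and `strip_nuls` are made up for illustration.

```shell
#!/bin/bash
# Count NUL bytes in a file: delete every byte except NUL, then count what is left.
count_nuls() {
    tr -cd '\000' < "$1" | wc -c
}

# Print the file with NUL bytes removed so the surrounding lines are readable.
strip_nuls() {
    tr -d '\000' < "$1"
}
```

For example, `count_nuls /var/log/messages` prints the number of NUL bytes, and `strip_nuls /var/log/messages | less` shows the log without them.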
-
Re: openSUSE 13.1 (i586) server hangs / freezes / gets unresponsive randomly
 Originally Posted by BlueDev
I hope your reinstalled server works better now, but it may be that the corrupted disk blocks are a result of a failing system and not the cause.
Thanks for the answer! That's true, I probably should have thought of that before. I'm sad to say my server is still not working. The amazing thing is that I replaced the entire hardware. I now have a Compaq PC of the same age with almost the same kind of parts, such as an Intel dual-core processor. No graphics card. I did a clean reinstall of openSUSE again, just to be sure. I had to disable the Intel Management Engine to stop the flood of "[65.456035] mei_me 0000:00:03.0: reset: connect/disconnect timeout" messages in the logs.
I should probably mention that the only thing left from the old PC in this one is the other hard drive (1 TB Western Digital Green). I backed up my earlier system to that drive in case I need the files. I scanned it with e2fsck and it has bad blocks, so I played it safe and am not using it. So it's unlikely to cause these freezes, since the system is not on it?
When I was backing up the original Seagate 320 GB drive, which my system was installed on when I first experienced these freezes, I noticed that at first I couldn't mount the partition my /home was on. Only KNOPPIX managed to mount it, in its GUI, so that way I got everything backed up. I emptied a spare hard drive from another PC, formatted it as ext4 and installed openSUSE 13.1 on it (minimal server stuff of course, nothing more).
 Originally Posted by BlueDev
90C is crazy high. Is the closet door open so there's adequate air circulation? The thermal paste layer should be very thin -- too much paste (or any gap or non-flat surface mating) prevents it from transferring heat to the heatsink. Isopropyl alcohol is good for cleaning off the old paste. Dust clogged in the CPU or case or PSU fans can also cause overheating. Try vacuuming them and look for dust collecting on the motherboard (esp. along the RAM sockets).
Yeah, I double-checked that. Most of the time the temperatures were all good, but I guess something was wrong with my fan. Nevertheless, I now have entirely new hardware with a proper cooler and everything, and temperatures are very low all the time. And I STILL experience the same downtime! How is this possible?
 Originally Posted by BlueDev
The ^@ characters are just NULs (binary zero).
The pair of 512MB couldn't have contributed much heat beyond a single 1GB, and the C2D will run faster with dual-channels. You could always run a RAM test if you think one of them might be flaky.
I have newer RAM now. I tested the older sticks and nothing special showed up. A question: how do I boot into memtest when I don't have the option in GRUB? (In fact, my server boots without splash, delay or GUI.)
I was installing the same software I used earlier when I experienced this crash again:
- Latest Transmission (I'm using a nightly build; could that be the problem? It usually happens when downloading a lot of data at once)
- Couchpotato
- Sick-Beard
- eggdrop IRC bot
- willie IRC bot
- irssi
- apache2, mysql/mariadb, php
- postgreSQL (not yet installed so I highly doubt this is the cause)
- plexmediaserver (not yet installed so I highly doubt this is the cause)
- dropbox
- btsync (not yet installed so I highly doubt this is the cause)
- subsonic (not yet installed so I highly doubt this is the cause)
- samba
The only hint I currently have is that it happens when I'm writing lots of data at once or when the system is doing something process-intensive, like transcoding a video while watching Plex, but I'm not sure about that either.
For the very first time in my life I'm out of luck with these things. I'm really not sure whether my hard drives are done for, or whether it's my memory, or what. I replaced the motherboard and processor, so can I rule those out at this point? I don't know.
I need help figuring this out.
Chameleon in my heart since 2006.
-
Re: openSUSE 13.1 (i586) server hangs / freezes / gets unresponsive randomly
On 2013-12-29 01:06, rollex2 wrote:
> My setup is quite simple, I have one little home server running in my
> closet, which contains Intel Core 2 Duo E6300 1.86GHz processor, Asus
> P5LD2 SE motherboard, 2 GB kingston DDR2 RAM, Nexus NX-3500 350W power,
> and couple of hard drives.
13.1, 32 bit arch, has a nasty kernel bug. Under some circumstances, like hibernating the machine,
it triggers. I can't find a link just now. It needs a kernel update to a certain version, but this
has not yet been released on the update channel.
However, I understand the core 2 duo is a 64 bit processor, so you could try that. You don't have
lots of memory, though.
And anyway, I don't know if that is the cause of your problem.
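Whether the processor can run the 64 bit install can be checked from the CPU flags: "lm" (long mode) marks an x86 CPU as 64 bit capable. A minimal sketch follows; the function name `supports_64bit` is just an illustration, and it assumes an x86 style /proc/cpuinfo with a "flags" line.

```shell
#!/bin/bash
# Succeed if the "flags" line of a cpuinfo-style file contains the word "lm"
# (long mode), meaning the CPU can run a 64-bit kernel.
# Defaults to /proc/cpuinfo when no file is given.
supports_64bit() {
    grep -m1 '^flags' "${1:-/proc/cpuinfo}" | grep -qw 'lm'
}
```

Usage: `supports_64bit && echo "64 bit capable"`. Separately, `uname -m` shows the architecture of the kernel that is currently running (for example i686 on a 32 bit install).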
--
Cheers / Saludos,
Carlos E. R.
(from 13.1 x86_64 "Bottle" (Elessar))
-
Re: openSUSE 13.1 (i586) server hangs / freezes / gets unresponsive randomly
On 2014-01-03 21:06, rollex2 wrote:
>
> Hi again. Sorry for the double posts but I solved it. It was not the
> temperatures, nor the perl or python processes. It was the hard drive
> that was breaking. I wonder why they didn't show up in /var/log/messages
> but only in the actual monitor of the server. Bad blocks and superblocks
> errors, a lot of them. Finally after one freezing and reboot "No
> operating system found".
You should run the long test of smartctl to make sure the disk is bad, and that the bad blocks were
not caused by the crashes.
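As a rough sketch of that smartctl sequence (the wrapper functions and the DRY_RUN switch are made up for illustration, /dev/sdX is a placeholder for the real device, and the commands need root):

```shell
#!/bin/bash
# Run a command, or only print it when DRY_RUN=1 (handy for reviewing first).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Start the extended (long) SMART self-test. It runs inside the drive in the
# background, and smartctl reports an estimated completion time.
start_long_test() {
    run smartctl -t long "$1"
}

# Once the test has finished: self-test log, overall health verdict, attribute table.
show_results() {
    run smartctl -l selftest "$1"
    run smartctl -H "$1"
    run smartctl -A "$1"
}
```

Usage: `start_long_test /dev/sdX`, wait for the estimated time, then `show_results /dev/sdX`. In the attribute table, counters such as reallocated and pending sectors are the ones usually looked at first.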
--
Cheers / Saludos,
Carlos E. R.
(from 13.1 x86_64 "Bottle" (Elessar))
-
Re: openSUSE 13.1 (i586) server hangs / freezes / gets unresponsive randomly
On 2014-01-05 13:06, rollex2 wrote:
> I should probably mention that only thing I have left from old PC in
> this one is the other hard drive (1 TB Western Digital Green). I have
> backed up my earlier system to that drive in case I need the files
> there. I scanned it with e2fsck and it has bad blocks, so I played safe
> to not to use it. So it's unlikely it would cause these freezes since
> the system is not on it?
Bad blocks _may_ cause freezes, but typically they don't. The system tries about 20 times to read
the same block, resetting the hard disk perhaps between each attempt. I'm not sure of the numbers,
but I'm sure that it takes a long time. Depending on where the problem is, main system disk or data
disk, the system may be totally unresponsive, or may recover. If it can write to the log, it will
certainly do so.
I had a system, with IDE buses, which periodically crashed, once or twice a month. I had to
power off, reseat the IDE cables, and it would reboot just fine. I never found the cause. I still
keep the system, but I don't run it. It was only one hard disk which was affected.
> Yeah I double checked that. Most of the time the temperatures were all
> good but I guess something was wrong with my fan. Nevertheless, I have
> now entirely new hardware and that has a proper cooler and everything,
> temperatures are very low all the time. And I STILL experience the same
> downtime! How is this possible?
Dunno... :-o
> Only hint I currently have is that it happens when I'm writing lots of
> data at once or when system is operating some process-intensive actions
> like transcoding a video while watching plex, but I'm not sure about
> that either.
How big is the swap? 2 gigs of RAM is too little nowadays.
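One way to answer that is to read the totals from /proc/meminfo, where the values are listed in kB. A small sketch follows; the function name `mem_and_swap_mib` is hypothetical.

```shell
#!/bin/bash
# Print MemTotal and SwapTotal in MiB from a meminfo-style file.
# Defaults to /proc/meminfo when no file is given.
mem_and_swap_mib() {
    awk '/^(MemTotal|SwapTotal):/ { printf "%s %d MiB\n", $1, $2 / 1024 }' "${1:-/proc/meminfo}"
}
```

Running `mem_and_swap_mib` with no argument reports the live system. `free -m` and `swapon -s` show the same information in more detail.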
--
Cheers / Saludos,
Carlos E. R.
(from 13.1 x86_64 "Bottle" (Elessar))
-
Re: openSUSE 13.1 (i586) server hangs / freezes / gets unresponsive randomly
 Originally Posted by robin_listas
13.1, 32 bit arch, has a nasty kernel bug. Under some circumstances, like hibernating the machine,
it triggers. I can't find a link just now. It needs a kernel update to a certain version, but this
has not yet been released on the update channel.
However, I understand the core 2 duo is a 64 bit processor, so you could try that. You don't have
lots of memory, though.
And anyway, I don't know if that is the cause of your problem.
Well, I forgot to mention that every time the crash occurs I see a lot of kernel output at the end of the log, and what caught my eye was "DWARF2 unwinder stuck" in the middle of it. I also thought this could be kernel-related.
I indeed have 32bit 13.1. I would appreciate the links or kernel update procedure!
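To pull the kernel output leading up to that message out of the log so it can be posted, a sketch along these lines might help. The function name `context_before` is made up, and the number of context lines is a guess to adjust.

```shell
#!/bin/bash
# Print every line containing a fixed string, together with the N lines
# before it (default 40), prefixed with line numbers.
context_before() {
    local pattern="$1" file="$2" lines="${3:-40}"
    grep -n -B "$lines" -F -- "$pattern" "$file"
}
```

Usage: `context_before 'DWARF2 unwinder stuck' /var/log/messages 40`.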
Chameleon in my heart since 2006.
-
Re: openSUSE 13.1 (i586) server hangs / freezes / gets unresponsive randomly
On 2014-01-05 13:56, rollex2 wrote:
> I indeed have 32bit 13.1. I would appreciate the links or kernel update
> procedure!
A procedure I don't have, because there is no official update for the problem they found. And the
emails that explained it I have saved on a system that is currently running a backup, so I can not
access them. Ping me in some hours.
It is possible to update to a more recent kernel from the repo where they experiment with them. I'm
not familiar with that repo, so I hesitate to give instructions.
I have a 32 bit machine I want to update to 13.1, but I'm also waiting for the official kernel
update before doing it.
But I don't know if this reported problem is related to yours.
--
Cheers / Saludos,
Carlos E. R.
(from 13.1 x86_64 "Bottle" (Elessar))