I have a couple of servers based on 10.1: one is the main server, the other does nightly backups and provides printing and a few other services. They are for home use. They are many years old (from when 10.1 was released), but they run 24/7 flawlessly apart from a few drive crashes over the years.
To answer the first question I am sure many are thinking: I haven’t upgraded because when they were new (I was replacing an NT 4.0 server at the time) I was just happy to get them running, and later on they worked as needed, so why bother. Today I would like to upgrade for security reasons, but the hardware is now too old and incompatible.
Also, I have been running a home server for probably 15 years (Win 95) and the only time I have ever lost data was when I migrated from Microsoft to Linux.
So, I am having a problem with the main server: it will just suddenly disappear (hang), and at this point I don’t know why. Nothing has changed, either with the servers or with anything else around them.
Over the last week this has happened three times, and each time the only choice was to power cycle. Something is clearly wrong, as these machines never used to hang.
Here is the log file from the most recent hang, showing the last few entries before it:
Dec 12 17:49:56 SambaServer smbd: read_data: read failure for 4 bytes to client 192.168.0.51. Error = No route to host
Dec 12 18:01:23 SambaServer syslog-ng: STATS: dropped 0
Dec 12 18:21:07 SambaServer zmd: NetworkManagerModule (WARN): Failed to connect to NetworkManager
Dec 12 18:21:39 SambaServer zmd: Daemon (WARN): Not starting remote web server
Dec 12 18:26:06 SambaServer zmd: ShutdownManager (WARN): Preparing to sleep...
Dec 12 18:26:08 SambaServer zmd: ShutdownManager (WARN): Going to sleep, waking up at 12/13/2013 18:11:06
My question is, what can I do next to determine what is happening?
As it is unlikely that anything changed in the software (simply because there have been no updates for years already), IMHO it is something in the hardware. And because it happens intermittently, it will be difficult to find. But the messages point to the network. Network card trying to die?
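One cheap check along those lines is whether the interface error counters are climbing between hangs. A rough sketch, assuming /proc/net/dev has its usual layout (two header lines, then one line per interface); nic_errors is just an illustrative helper name, not an existing tool:

```shell
# Print receive/transmit error and drop counters for each interface.
# Reads /proc/net/dev by default; pass a file name to inspect a saved copy.
# Usage: nic_errors
nic_errors() {
    awk 'NR > 2 {
        sub(/:/, " ")   # some kernels print "eth0:1234" with no space after the colon
        printf "%s rx_errs=%s rx_drop=%s tx_errs=%s tx_drop=%s\n", $1, $4, $5, $12, $13
    }' "${1:-/proc/net/dev}"
}
```

Running it a few times over a day or two and comparing the numbers should show whether the errors are growing; counters that stay at zero would not rule out the card, but steadily rising ones would be a strong hint.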
On Sun, 15 Dec 2013 21:06:02 +0000, sd read wrote:
> Here is a sample from dmesg in a terminal (obviously the server is not
> hung at this point):
All that’s showing is iptables information, not anything useful. You
may need to tweak the log settings to disable this. There are bound to
be useful messages in there somewhere, but the iptables info is drowning
it out.
On 2013-12-16 00:21, Jim Henderson wrote:
> On Sun, 15 Dec 2013 21:06:02 +0000, sd read wrote:
>> Here is a sample from dmesg in a terminal (obviously the server is not
>> hung at this point):
> All that’s showing is iptables information, not anything useful. You
> may need to tweak the log settings to disable this. There are bound to
> be useful messages in there somewhere, but the iptables info is drowning
> it out.
Instead of looking at dmesg, look at the contents of
“/var/log/messages”; it should not have the iptables messages, which
should instead go to “/var/log/firewall”.
Cheers / Saludos,
Carlos E. R.
(from 12.3 x86_64 “Dartmouth” at Telcontar)
On Sun, 15 Dec 2013 23:53:06 +0000, Carlos E. R. wrote:
> On 2013-12-16 00:21, Jim Henderson wrote:
>> On Sun, 15 Dec 2013 21:06:02 +0000, sd read wrote:
>>> Here is a sample from dmesg in a terminal (obviously the server is not
>>> hung at this point):
>> All that’s showing is iptables information, not anything useful. You
>> may need to tweak the log settings to disable this. There are bound to
>> be useful messages in there somewhere, but the iptables info is
>> drowning it out.
> Instead of looking at dmesg, look at the contents of
> “/var/log/messages”, it should not have the iptables messages, which
> instead should go to “/var/log/firewall”
I usually find that hardware-related error messages are somewhat clearer
in dmesg. Maybe that’s only me, though.
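If the iptables lines are what makes dmesg hard to read, they can be filtered out on the fly instead of changing the log settings. A minimal sketch, assuming the firewall entries follow the usual netfilter LOG layout (IN=, OUT= and SRC= fields on one line); drop_firewall_lines is a made-up helper name:

```shell
# Read kernel log text on stdin and drop anything that looks like a
# netfilter LOG entry (those carry IN=..., OUT=... and SRC=... fields).
# Usage: dmesg | drop_firewall_lines | tail -n 50
drop_firewall_lines() {
    grep -v -E 'IN=[^ ]* OUT=[^ ]* .*SRC=[0-9.]+'
}
```

Whatever is left should be the driver and hardware messages, if there are any.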
On Mon, 16 Dec 2013 00:26:01 +0000, sd read wrote:
> Hi Jim, I don’t know how to tweak the log settings as I don’t even know
> where to find dmesg. What I posted was the result of typing dmesg in a
> terminal.
> I looked through YaST and googled this but was not able to figure out
> how I would change the log settings.
> Carlos, sorry I wasn’t clear in my initial post but what I pasted there
> is from /var/log/messages. It is what I believe was last recorded
> before it hung.
Check in YaST to see if the magic SysRq keys are enabled - if they’re
not, enable them - then when it hangs, hold Alt and hit:
SysRq+S (Forces the kernel to sync the disks)
SysRq+U (Forces the kernel to remount the partitions read-only)
SysRq+B (Causes the system to reboot)
That should get the last of what is in the logs flushed to disk, might
give us more to go on.
10.1 is a long ways out of support, so it’s going to be somewhat
difficult to point you precisely where you need to make those settings -
on 13.1, it’s at YaST -> Kernel Settings; there’s a checkbox to enable
the keys. You can also enable them from the terminal with:
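A minimal sketch of what that could look like, assuming the kernel exposes the usual /proc/sys/kernel/sysrq switch (0 means disabled, 1 means all SysRq functions enabled); the helper names are made up for illustration:

```shell
# Show the current SysRq setting.
sysrq_status() {
    cat "${1:-/proc/sys/kernel/sysrq}"
}

# Enable all SysRq functions. Needs root, and the setting is lost at reboot.
sysrq_enable() {
    echo 1 > "${1:-/proc/sys/kernel/sysrq}"
}
```

Run sysrq_enable as root and then sysrq_status to confirm it reads 1. Without the wrappers, this is just writing 1 to /proc/sys/kernel/sysrq.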
I’m with Henk here. Software hasn’t changed in years, nor have config files, ergo conclusio: hardware failure. And, since the machine is old, only one conclusion: back up now while it’s still possible, then replace.
Several years ago I got a call from an old customer who was ordering a replacement CPU fan. In the course of the conversation I heard him say: “I’m going to hate to have to power this machine down. It’s a Linux server that’s been running continuously since I installed it seven years ago, and it’s never even been rebooted. I wanted to see how long it could go, but now the fan’s so noisy it’ll fail soon if I don’t replace it.”