On Tue, 02 Aug 2011 22:33:06 +0000, Carlos E. R. wrote:
> On 2011-08-02 22:49, Jim Henderson wrote:
>> True, but I would think that getting the hardware clock synced up with
>> reality certainly would help.
>
> Not really… sorry.
Well, in general it would, just not with this specific problem. A wrong
hardware clock does cause problems, but I should’ve clarified that it
isn’t necessarily a contributing factor to this specific problem.
Jim
Jim Henderson
openSUSE Forums Administrator
Forum Use Terms & Conditions at http://tinyurl.com/openSUSE-T-C
On 2011-08-03 01:21, Jim Henderson wrote:
> On Tue, 02 Aug 2011 22:33:06 +0000, Carlos E. R. wrote:
>
>> On 2011-08-02 22:49, Jim Henderson wrote:
>>> True, but I would think that getting the hardware clock synced up with
>>> reality certainly would help.
>>
>> Not really… sorry.
>
> Well, in general it would, just not with this specific problem. A wrong
> hardware clock does cause problems, but I should’ve clarified that it
> isn’t necessarily a contributing factor to this specific problem.
Nowadays, not really! X’-)
Not if you have permanent network access and can use it to get the time.
Remember that the initial PC design did not have a hardware clock.
It makes things easier, but it is not absolutely necessary.
However, the same battery is also used to keep the BIOS config. Nowadays
they could use flash memory instead, but the design is older.
–
Cheers / Saludos,
Carlos E. R.
(from 11.4 x86_64 “Celadon” at Telcontar)
On Wed, 03 Aug 2011 00:38:06 +0000, Carlos E. R. wrote:
> On 2011-08-03 01:21, Jim Henderson wrote:
>> On Tue, 02 Aug 2011 22:33:06 +0000, Carlos E. R. wrote:
>>
>>> On 2011-08-02 22:49, Jim Henderson wrote:
>>>> True, but I would think that getting the hardware clock synced up
>>>> with reality certainly would help.
>>>
>>> Not really… sorry.
>>
>> Well, in general it would, just not with this specific problem. A wrong
>> hardware clock does cause problems, but I should’ve clarified that it
>> isn’t necessarily a contributing factor to this specific problem.
>
> Nowadays, not really! X’-)
>
> Not if you have permanent network access and can use it to get the time.
> Remember that the initial PC design did not have a hardware clock.
>
> It makes things easier, but it is not absolutely necessary.
That’s what I was trying to say. It’s been kinda a long day.
> However, the same battery is also used to keep the BIOS config. Nowadays
> they could use flash memory instead, but the design is older.
Yes, that’s certainly true.
I can remember from years past having problems with NetWare (which used
DOS as a bootloader) keeping time but the hardware clock not updating -
and on the next boot, the time would come from the clock, and would be
set backwards to the last time the server was booted.
Fortunately, that’s not been a problem for many, many years. But just
like the old saying about a man with two watches…
Jim
Umm, lots of things to process but not much to do yet.
My system doesn’t freeze or hang or anything when NTP fails; downloads stop and the clock applet shows the wrong time, that’s it. Come to think of it, the applet still keeps ticking, it’s just behind by the amount of time between the NTP failure and me waking up the display.
Well, actually that sounds pretty much like a clock freeze. I looked into the power settings again and there’s nothing about any kind of sleep or hibernation, everything is set to never or do nothing, yet my display goes black after five minutes of inactivity anyway…
It’s quite possible that NTP fail is simply another symptom, not a cause, but what could be the problem? kTorrent doesn’t keep any logs, nothing to look into.
There’s no way to replicate the fail either, it happens at random, maybe once or twice a week. Maybe it has something to do with network failure, my modem/router reconnects automatically but if the computer gets a different IP the connection with torrent peers might get broken. What would happen if NTP runs a sync at the same time, too?
I found the hwclock --show command; it turns out my hardware clock is half an hour behind while the system time is correct.
I reset it with hwclock --systohc, and it’s correct now, more or less.
Is it true that ntpd has a default 3600 sec limit for time difference during sync or it will give up?
On Wed, 03 Aug 2011 11:36:02 +0000, Stan Ice wrote:
> It’s quite possible that NTP fail is simply another symptom, not a
> cause, but what could be the problem?
It’s probably not another symptom either, but a separate issue. One
thing that’s important is to understand how NTP actually works - you seem
to think it should be running at regular intervals, but that’s not how it
works, as I explained.
> kTorrent doesn’t keep any logs,
> nothing to look into.
>
> There’s no way to replicate the fail either, it happens at random, maybe
> once or twice a week. Maybe it has something to do with network failure,
> my modem/router reconnects automatically but if the computer gets a
> different IP the connection with torrent peers might get broken. What
> would happen if NTP runs a sync at the same time, too?
It could have something to do with a network problem, that’s most
likely I would think. What kind of router are you using, and do you have
port forwarding set up properly for your torrents? (If your machine
isn’t reachable from outside, torrents can be very fussy about that)
Jim
On Wed, 03 Aug 2011 11:56:03 +0000, Stan Ice wrote:
> Is it true that ntpd has a default 3600 sec limit for time difference
> during sync or it will give up?
By default, ntp uses a concept called “insane time” (yes, that’s really
what it’s called): if the time is off by more than about 17 minutes
(1000 seconds, actually), the time source is declared to be unreliable.
http://www.jamesgosling.com/Troubleshooting%20Time%20Synchronization%20Issues%20Notes.html
is one place that explains this (the article is about NetWare rather than
Linux, but the concept and interval are the same).
Jim
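To put the threshold above next to the numbers from this thread: ntpd's documented defaults are a step threshold of 0.128 seconds and a panic threshold of 1000 seconds. The sketch below is not ntpd code. `classify_offset` is a made-up helper that only shows, roughly, which behaviour to expect for an offset given in whole milliseconds.

```shell
#!/bin/sh
# Hypothetical helper, not part of ntpd: classify a clock offset (in ms)
# against ntpd's default step (128 ms) and panic (1000 s) thresholds.
classify_offset() {
    offset_ms=$1
    # Use the absolute value: being ahead or behind is treated the same.
    if [ "$offset_ms" -lt 0 ]; then
        offset_ms=$((0 - offset_ms))
    fi
    if [ "$offset_ms" -gt 1000000 ]; then
        echo "panic"   # ntpd gives up unless told otherwise (-g, "tinker panic 0")
    elif [ "$offset_ms" -gt 128 ]; then
        echo "step"    # the clock is set in one jump
    else
        echo "slew"    # the clock rate is nudged gradually
    fi
}

classify_offset 50        # a 50 ms offset
classify_offset 1800000   # half an hour, as reported in this thread
```

With these defaults a half-hour error falls well past the panic threshold, which fits the observation that ntpd does not correct it until the service is restarted.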
On 2011-08-03 13:56, Stan Ice wrote:
> Is it true that ntpd has a default 3600 sec limit for time difference
> during sync or it will give up?
Yes.
I don’t remember if the figure is 3000 seconds or something else, but there
is certainly a limit and it is not big.
The reason is that ntp tries to adjust the clock very slowly. If you are
one second slow, what it does is speed up the clock by, say, 1% (I don’t know
the exact figure) and wait for the clock to catch up with reality: at that
rate, 100 seconds. Then it slows the clock back down to keep it in exact sync.
This is done so that 1 second is always 1 second, or only a trifle off.
You can see that adjusting for a difference of several minutes would take
hours, and an hour would take days. If the error is big, it has to give up.
However, the script that starts ntpd in SUSE also does something else:
before starting the daemon, it queries the time and sets it instantly, no
matter the difference. Only then is the daemon started.
If something then makes the clock go bad, ntp will abort. With your kind of
problem, ntp simply cannot keep the clock right.
–
Cheers / Saludos,
Carlos E. R.
(from 11.4 x86_64 “Celadon” at Telcontar)
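The slewing arithmetic in Carlos's explanation above can be checked with a few lines of shell. This is only a sketch: `catch_up_seconds` is a made-up helper, the 1% rate is the guess from the post, and 500 ppm is the maximum slew rate given in the ntpd documentation.

```shell
#!/bin/sh
# Seconds needed to remove an offset by running the clock fast or slow.
# $1 = offset in seconds, $2 = slew rate in parts per million (ppm).
catch_up_seconds() {
    echo $(( $1 * 1000000 / $2 ))
}

catch_up_seconds 1 10000      # 1 s at the 1% guess:  100
catch_up_seconds 300 10000    # 5 min at 1%:          30000  (over 8 hours)
catch_up_seconds 3600 10000   # 1 h at 1%:            360000 (over 4 days)
catch_up_seconds 1 500        # 1 s at 500 ppm:       2000
```

At the real 500 ppm limit the times are 20 times longer than with the 1% guess, so the conclusion holds even more strongly: large errors cannot sensibly be slewed away.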
It turns out my hardware clock was off by half an hour for the past three days, from the beginning of this thread. I assume it went bad after the latest time freeze, I first noticed it when system ran a file check during boot and I assumed it was automatically corrected. Not at all.
Ntpd apparently ignored it completely and kept the system time correct instead.
Thinking it over, I now tend to suspect some massive sleep freeze that stops torrents and the system clock. On wake up system clock resumes from the freeze moment and stays behind by the total freeze time.
Since system time becomes truly insane NTPD probably doesn’t even try to correct it until the service is restarted manually from Yast.
Where else can I find the traces of this deep freeze? There’s no sign of it in NTP logs, where else can it be recorded?
It’s a desktop, it never hibernates or anything. And why does it keep turning off the monitor no matter what I set in Configure Desktop?
On 08/04/2011 01:26 PM, Stan Ice wrote:
>
> It turns out my hardware clock was off by half an hour
just a wild idea: there are several places on earth where the local time
is exactly 30 minutes off of the rest of the world…
have you maybe set your time zone to one of those areas? accidentally?
–
DD
Caveat-Hardware-Software
openSUSE®, the “German Engineered Automobiles” of operating systems!
No, also if that were the case my hardware clock and system clock would have been the same.
What surprised me was that this half an hour difference between hardware clock and system time stayed hidden for so long, I actually only assume it started after the latest freeze, it could have been there for ages.
On 2011-08-04 13:26, Stan Ice wrote:
>
> It turns out my hardware clock was off by half an hour for the past
> three days, from the beginning of this thread. I assume it went bad
> after the latest time freeze, I first noticed it when system ran a file
> check during boot and I assumed it was automatically corrected. Not at
> all.
Ah, yes, I saw it.
> Ntpd apparently ignored it completely and kept the system time correct
> instead.
By design, yes.
> Thinking it over, I now tend to suspect some massive sleep freeze that
> stops torrents and the system clock. On wake up system clock resumes
> from the freeze moment and stays behind by the total freeze time.
Right.
> Since system time becomes truly insane NTPD probably doesn’t even try
> to correct it until the service is restarted manually from Yast.
Yes, that is correct.
> Where else can I find the traces of this deep freeze? There’s no sign
> of it in NTP logs, where else can it be recorded?
Only that ntpd quits, perhaps.
As I wrote on another post, I had a very similar problem for years, and I
wrote my own script to record the problem. It can be found in the bugzilla
that tracked the issue.
Basically, the script (actually a Pascal binary) runs in a loop with a
one-second delay. Every second it checked the time; if more time than that
had passed, it wrote a message to a file. In your case I would delay 5
seconds, and check the time not of the system clock, but of the hwclock.
–
Cheers / Saludos,
Carlos E. R.
(from 11.4 x86_64 “Celadon” at Telcontar)
So it would be logging any abnormalities with the time? Would it be collecting any other data to diagnose the underlying issue?
Hmm, I also got a question - what happens if the system time freezes - what would keep the 5 second interval? NTP is off, hardware clock probably stops, too. Time applet freezes - is there anything left to count the seconds?
On 2011-08-04 18:16, Stan Ice wrote:
>
> So it would be logging any abnormalities with the time? Would it be
> collecting any other data to diagnose the underlying issue?
It is up to you when you create the script.
Have a read of Bugzilla 350980 to get some ideas. The script I used is
logged there, you would have to adapt it.
> Hmm, I also got a question - what happens if the system time freezes -
> what would keep the 5 second interval? NTP is off, hardware clock
> probably stops, too. Time applet freezes - is there anything left to
> count the seconds?
The hardware clock is independent, it is a real clock like the one on your
wrist, it does not stop, even if your computer crashes completely.
It would be something like this:
while true ; do
    sleep 5
    hwclock --show
    # Save the output somewhere and compare it to the value from the previous
    # loop: the difference should be less than 6 seconds. If it is not, alarm.
done
The hwclock will always be correct, but the sleep 5 delay will not if the
system has a problem.
–
Cheers / Saludos,
Carlos E. R.
(from 11.4 x86_64 “Celadon” at Telcontar)
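A slightly fuller sketch of the loop Carlos describes, with several assumptions that are not from the thread: instead of parsing the text printed by hwclock --show, it reads the RTC as a number of seconds from /sys/class/rtc/rtc0/since_epoch (a Linux sysfs file, which may be named differently on some machines), and the names `check_gap`, `watch_rtc`, `RTC_FILE` and `LOG_FILE` are made up for illustration.

```shell
#!/bin/sh
# Assumption: the kernel exposes the RTC as seconds since the epoch here.
RTC_FILE=${RTC_FILE:-/sys/class/rtc/rtc0/since_epoch}

# Print a warning if the RTC advanced by $3 seconds or more between the
# readings $1 (previous) and $2 (current); print nothing otherwise.
check_gap() {
    gap=$(( $2 - $1 ))
    if [ "$gap" -ge "$3" ]; then
        echo "hwclock advanced ${gap} s during a 5 s sleep"
    fi
}

# Loop forever, appending any warning to a log file (hypothetical name).
watch_rtc() {
    prev=$(cat "$RTC_FILE")
    while true; do
        sleep 5
        now=$(cat "$RTC_FILE")
        msg=$(check_gap "$prev" "$now" 7)
        if [ -n "$msg" ]; then
            echo "$(date): $msg" >> "${LOG_FILE:-rtc-watch.log}"
        fi
        prev=$now
    done
}
```

To use it, add a final line `watch_rtc` and leave the script running in a terminal. The alarm limit is 7 rather than 6 seconds because two whole-second readings taken a little over 5 seconds apart can legitimately differ by 6.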
I’ve caught the problem with my bare hands today. I started downloading, checked half an hour later and saw the time freeze in action.
It was the system time that froze; the hardware clock was still correct. When I woke up the screen the system clock resumed ticking, missing five minutes of sleeping time.
For lack of better options, I just restarted NTP from runlevels and the system time corrected itself in a minute.
Desktop woke up just fine, without any extra delays or anything, so I think it affected only the clock, nothing else. Downloads stopped but I think I have an idea why.
Grabit has a visual representation of download progress - how many threads are running, overall speed, and a moving graph. When I woke up the computer Grabit showed 0 Kbps and it showed that downloads dropped just now. It didn’t record the missing five minutes of inactivity at all - as far as Grabit is concerned, they didn’t happen, time literally froze.
I suspect that when it tried to calculate the current speed it got a time difference of 0 between two update points, and you can’t divide anything by 0, so the function returned an error and Grabit just stopped, totally bewildered. The same thing happens to torrents, I guess. God knows where else in the applications they need to divide something by a time difference. It can’t be zero, no one ever expected that, no one planned to deal with this kind of error.
I think it sounds plausible.
Now the main question - how could the system clock just stop like that? How does it keep its time between NTP check in points?
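The divide-by-zero guess above can be illustrated with a small sketch. Nothing here is taken from Grabit or KTorrent; `speed_bytes_per_s` is a made-up function that only shows how a speed calculation could guard against two timestamps that are identical because the clock stood still.

```shell
#!/bin/sh
# Hypothetical speed calculation: bytes transferred / elapsed seconds.
# $1 = bytes, $2 = start timestamp, $3 = end timestamp (whole seconds).
speed_bytes_per_s() {
    bytes=$1
    elapsed=$(( $3 - $2 ))
    # A frozen clock makes the elapsed time 0, so check before dividing.
    if [ "$elapsed" -le 0 ]; then
        echo "unknown"
    else
        echo $(( bytes / elapsed ))
    fi
}

speed_bytes_per_s 5000 100 105   # normal case: 1000
speed_bytes_per_s 5000 100 100   # clock did not move: unknown
```

Without the check, the shell itself aborts the arithmetic with a division-by-zero error, which is the kind of failure being guessed at here.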
hi,
i’m sorry but i have to ask, did you change your motherboard battery recently?
i’ve had computers that even lost time completely over powerdown, just needed to change the lithium battery…
regards,
Reda
On 2011-08-05 17:56, Stan Ice wrote:
> Now the main question - how could the system clock just stop like that?
> How does it keep its time between NTP check in points?
No, you are wrong. The problem is not the clock stopping. The problem is
your entire computer stopping. If the computer stops, the clock also stops.
If you don’t believe me, run an application with a moving graphic display,
you will see how it stops.
Open a bugzilla. You can point to mine, similar but different.
–
Cheers / Saludos,
Carlos E. R.
(from 11.4 x86_64 “Celadon” at Telcontar)
> No, you are wrong. The problem is not the clock stopping. The problem is
> your entire computer stopping. If the computer stops, the clock also stops.
> If you don’t believe me, run an application with a moving graphic display,
> you will see how it stops.
A bit of a chicken and egg dilemma. I assume if my whole computer stopped it wouldn’t just wake up without a glitch on a mouse movement. What would it even mean “computer stopped” compared to “clock stopped” in this case?
Btw, I just ran “zypper dup” and the ntp client was included in the update. Maybe the issue will go away, but we sort of agreed that ntp was not the cause.
Since hardware clock kept ticking and there are no other symptoms of the battery failing I ruled it out for the time being.
The issue still persists, I updated to 11.4 (not a clean install) and it’s still there.
When some of my folders are mounted via fuse on other computers they just go into disk sleep until my system comes back.
Whatever the reason, restarting NTP does the trick.
There are too many clicks to restart it from yast module, can someone suggest a way to restart ntp with a bash script?
I can do it with “rcntp restart” in terminal already but it still requires manual typing, I created ntp.sh file with executable permissions:
#!/bin/sh
su
rcntp stop
It does nothing when double clicked on, and gives password prompt when run from terminal but it doesn’t stop the service according to runlevels status refresh.
I realize I’m deviating from my initial query but it seems there’s no progress beyond acknowledging system clock freeze symptom anyway.
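One likely reason the script above does nothing: su on its own line opens an interactive root shell, and the rcntp line only runs after that shell exits, as the ordinary user again. A possible fix, sketched under assumptions, is to hand the command to su with -c. `restart_ntp` and `RUN_AS_ROOT` are made-up names, and the sudo variant in the comment only works if sudo is configured on the machine.

```shell
#!/bin/sh
# Sketch: restart NTP as root with a single command.
# "su -c" runs one command as root and then returns.
# RUN_AS_ROOT is a hypothetical override, e.g. RUN_AS_ROOT="sudo sh -c".
restart_ntp() {
    ${RUN_AS_ROOT:-su -c} "rcntp restart"
}
```

To use it, add a final line `restart_ntp` to ntp.sh and start it from a terminal, since su still has to ask for the root password. That is probably also why double-clicking does nothing: there is no terminal to type the password into.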