11.1 systems locking up?

Not running the samba server here… I am often connecting to samba shares though… could the samba client have the same effect?

I’ll keep an eye out to see if the hang happens after connecting to a smb share

I had a samba server running but it isn’t used. I’ll turn it off but I doubt if that will affect anything.

Assuming it’s a common cause the reason has to be quite basic. The CPUs range from a 450MHz Celeron to a recent dual-core Athlon. So it’s not the CPU or memory. It can’t be the GUI, the slowest machine is console only. I thought it was HPET, but only the newest machine has HPET. It can’t be the word length, there are 32-bit and 64-bit machines. It can’t be VirtualBox, the server isn’t running VirtualBox, only mail services. There is no particular service or action that is more likely to cause the freeze. Not that it happens often enough to tell, maybe once a day. The only thing that is common is 11.1 and the 2.6.27 kernel.

Unlikely as it is I’m starting to think there might be a rare corner case in the kernel, maybe something to do with tickless kernels. But why haven’t I experienced it with other distros, e.g. Kubuntu 8.10? Did 11.1 introduce a regression or something?

I JUST read a recent thread/posting in hardware (or applications or
???) here that a guy said he just went back one kernel version (in
11.1) and solved ALL his locking problems…

Sorry, I can’t find it again.

Hi Odin,

Thanks for that! You’re probably referring to the last post in this thread: Opensuse 11 freezes at random - openSUSE Forums

Although the op was about 11.0 and reiserfs… the last one is about 11.1 and reverting to a previous kernel. Could well be!

Cheers,
Wj

Magic31 wrote:
> Hi Odin,
>
> Thanks for that!

Thank YOU for finding and reporting to Ken, who I hope will
back-level his kernel and (maybe) verify the problem; and report
specifics to the kernel/SUSE gurus.

I used to experience lock-ups like this, and in my experience they were nearly always caused by ACPI issues in the interaction between the BIOS, kernel, and drivers.

You can test if this is your problem by booting with acpi=off and seeing if your freezes go away. If that works, you can try booting with pci=noacpi, which is much easier to live with on a day to day basis.
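
For anyone trying this, it helps to confirm which of these parameters the running kernel was actually booted with. Here is a minimal Python sketch, assuming a Linux system with /proc mounted. The function names and the `ACPI_PARAMS` set are made up for illustration; only `acpi=off` and `pci=noacpi` come from the post above.

```python
from pathlib import Path

# The two parameters mentioned in the post above. This set is an
# illustrative assumption, not a complete list of ACPI-related options.
ACPI_PARAMS = {"acpi=off", "pci=noacpi"}


def acpi_params_in(cmdline):
    """Return the ACPI test parameters found in a kernel command line string."""
    return [token for token in cmdline.split() if token in ACPI_PARAMS]


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with (Linux only)."""
    return Path(path).read_text().strip()


if __name__ == "__main__":
    found = acpi_params_in(read_cmdline())
    print("ACPI test parameters in effect:", ", ".join(found) or "none")
```

If it prints "none" after you thought you had added a parameter at the boot prompt, the option did not reach the kernel and the test run tells you nothing.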

In past versions of openSUSE (10.2, 10.3, and 11.0) these issues have gone away for me after new kernels or video drivers were released.

On Tue January 6 2009 07:01 am, Odin wrote:

> Magic31 wrote:
>> Hi Odin,
>>
>> Thanks for that!
>
> Thank YOU for finding and reporting to Ken, who I hope will
> back-level his kernel and (maybe) verify the problem; and report
> specifics to the kernel/SUSE gurus.

Are you aware of the following Bugzilla:
https://bugzilla.novell.com/show_bug.cgi?id=463372


P. V.
“We’re all in this together, I’m pulling for you.” Red Green

Yes, that sounds right. All the affected machines were running dovecot, and most people don’t, which would explain why the bug was not widespread. I’ve voted for the bug. I’ll try running one of my machines without dovecot for a while to see what happens.

On Tue January 6 2009 09:06 pm, ken yap wrote:

>
> Yes, that sounds right. All the affected machines were running dovecot,
> and most people don’t, which would explain why the bug was not
> widespread. I’ve voted for the bug. I’ll try running one of my machines
> without dovecot for a while to see what happens.
>
>
ken;

Since the Samba errors seem to be due to a broken inotify module, it would
appear that this is the bug causing both problems (and, I suspect, a number of
others).

P. V.
“We’re all in this together, I’m pulling for you.” Red Green

I’m running a machine with Dovecot and have this problem. Now to try and find another kernel.

Just to report my status…

I haven’t had a hang since last Sunday.

Three things I’ve done:

  1. Removed Java 1_6_0 and reverted to 1_5_0
  2. Haven’t connected to an smb share since Sunday (again, not running a Samba Server on this system, only connecting to others)
  3. Updated compiz with the latest packages in X11 repo and reset the compiz config.

If the system stays as stable as it is, I’ll try connecting to some Windows shares this weekend and see what happens.

On Wed January 7 2009 10:56 am, Magic31 wrote:

>
> Just to report my status…
>
> I haven’t had a hang since last Sunday.
>
> Three things I’ve done:
> 1) Removed Java 1_6_0 and reverted to 1_5_0
> 2) Haven’t connected to an smb share since Sunday (again, not running a
> Samba Server on this system, only connecting to others)
> 3) Updated compiz with the latest packages in X11 repo and reset the
> compiz config.
>
> If the system stays as stable as it is, I’ll try connecting to some
> Windows shares this weekend and see what happens.
>
>
Magic31
I’m quite sure the “inotify bug” will only be a problem on servers.

P. V.
“We’re all in this together, I’m pulling for you.” Red Green

Thanks PV. It could be that I’m having, or had, a different issue. Still, I thought I’d post my findings for others to try (and maybe confirm).

Not my intention to mess up this thread… just add some thoughts :wink:

Cheers,
Wj

I tried running overnight without dovecot. No issues, so that’s encouraging. This morning I saw in Bugzilla that the bug had been resolved, and you can get a KotD (Kernel of the Day) to try if you are impatient.

Index of /pub/projects/kernel/kotd/HEAD

So I’m running 2.6.27.10 now, with dovecot active and we’ll see what happens, but I believe that this is the correct resolution.

Just a quick update. I ran 2.6.27.10-HEAD for a few days from KotD while waiting for an update to show up. All the time it’s been stable. Tonight I got impatient and started looking in Factory. There has been a 2.6.27.10 there since Jan 10. So I added Factory, installed 2.6.27.10, then disabled Factory. I will keep an eye on it, but I believe it is really fixed—the changelog does mention the inotify bug—and there won’t be any more drama. The URL to add in YaST is:

http://download.opensuse.org/factory/repo/oss/

I think anybody with this problem will have to get the kernel from Factory, as a kernel update is big and disruptive to everybody, so I don’t think you will see a newer kernel appear in the Update repo until a security issue is discovered.
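
After swapping kernels it is easy to reboot into the old one by accident, so a quick check of the running version can save some confusion. A rough Python sketch follows. Treating 2.6.27.10 as the threshold is an assumption based only on what was reported in this thread, and the function names are made up for illustration.

```python
import platform
import re

# Version reported as stable in this thread. Using it as the cut-off is an
# assumption; whether other builds carry the fix is not checked here.
FIXED = (2, 6, 27, 10)


def kernel_tuple(release):
    """Turn a release string such as '2.6.27.10-2-pae' into (2, 6, 27, 10)."""
    match = re.match(r"\d+(?:\.\d+)*", release)
    if not match:
        raise ValueError("unrecognised kernel release: %r" % release)
    return tuple(int(part) for part in match.group().split("."))


def at_least(release, minimum=FIXED):
    """True if the numeric part of the release is >= the minimum version."""
    return kernel_tuple(release) >= minimum


if __name__ == "__main__":
    running = platform.release()
    status = "at or above" if at_least(running) else "below"
    print("Running kernel %s is %s 2.6.27.10" % (running, status))
```

Only the leading dotted numbers are compared; the distribution suffix after the first dash (build number, flavour such as pae or default) is ignored.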

As I have VirtualBox installed I also had to update kernel-source and rerun /etc/init.d/vboxdrv setup. Fortunately I have lots of download quota left. :slight_smile:

Thanks to all for the tips, but especially to PV, who pinpointed the problem.

I have an older system which occasionally locks up. Apparently the IDE drives go to sleep and fail to wake up.

The SATA drives seem to be immune to this behavior.

Your comment is neither here nor there. You didn’t read the history of this thread. This lockup is a kernel bug and happened on all kinds of systems, but the common thing was software using inotify.
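
To see whether a given machine is running anything that uses inotify at all, one rough approach is to look through /proc for processes holding inotify descriptors. The Python sketch below is only an illustration. The function name is made up, and the link names it matches (`inotify` and `anon_inode:inotify`) are an assumption about how the kernel labels these descriptors, which may differ between kernel versions.

```python
import os

# Names the /proc/<pid>/fd symlink may show for an inotify descriptor.
# Assumption: these labels vary by kernel version, so adjust if needed.
INOTIFY_TARGETS = ("inotify", "anon_inode:inotify")


def inotify_users(proc_root="/proc"):
    """Map PID -> number of open inotify descriptors we are allowed to see."""
    users = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission
        count = 0
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor closed while we were looking
            if target in INOTIFY_TARGETS:
                count += 1
        if count:
            users[int(entry)] = count
    return users


if __name__ == "__main__":
    for pid, count in sorted(inotify_users().items()):
        print("PID %d holds %d inotify descriptor(s)" % (pid, count))
```

Run it as root to see every process; as a normal user it will only report your own.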

There were Seagate drives with that problem. I think a firmware update was issued to cure it.

About your question: how many …?
I don’t know, but I have it too: our otherwise good server locks up about once a day, around 17:00…

Has anybody found out what may help (apart from going back to 11.0)?

SuSE 11.1 32 bit
Kernel: kernel_pae
RAM: 512 MB
Intel server board and Celeron with on-board RAID and 4 HDs.
Software: Apache 2, Samba, Cyrus-IMAP

Another phenomenon: VNC has become terribly unreliable and often breaks with socket errors on login.

Thank you,
Harald

Well, a real “Puzzled Penguin” here; I thought they only lived in zoos or in the southern hemisphere…
No (Edit) button; I missed that there are multiple pages.

So I will try the factory kernel. Thank you for the clue.
I’ll report back later, on all hemispheres…