Network connection freezes

No idea what’s going on here. At first I thought it was the new version of Opera, but Firefox does the same thing.

At various times, my network just freezes for 60 seconds at a time. Yes, the length is consistent (though I can’t guarantee it’s exactly 60 seconds since I never know when it will happen, so I can’t time it with a stopwatch). I’ll just click on a link, or try to send an email message, or anything - and the program will indicate that it is trying to connect, but will just sit there. I can of course cancel the operation in whichever program I was using, but if I cancel and try again (and it’s been less than a minute) the same thing will happen.

Depending on where it was in the operation when the network froze, I can get various network errors if I just wait for it to resume, “Connection closed by peer” or “Parameter error” if I was submitting a webform. Quite aggravating. And yes, 10.3 didn’t do this.

System? 11.0, connecting to the network over a LAN. (Cable modem at the other end of the LAN, but the router takes care of that.)

Anything in /var/log/messages or /var/log/NetworkManager when this is
happening?

Perhaps open a terminal of some kind (konsole, xterm, gnome-terminal)
and just have your box ping your gateway 24x7 to see if those are also
dropping at the same time (probably are) and see what error comes back
on those dropped packets.
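
Something along these lines could do the pinging and also measure how long each freeze lasts, since a stopwatch is not practical. This is only a rough sketch: the function names and the log format are made up for illustration, and the `-c` and `-W` options assume the Linux iputils `ping`.

```shell
# Print one line per second: "<epoch seconds> ok" or "<epoch seconds> FAIL".
# The gateway address is passed in as the first argument.
watch_gateway() {
    gateway=$1
    while :; do
        # -c 1: send a single probe; -W 1: wait at most one second for a reply
        if ping -c 1 -W 1 "$gateway" >/dev/null 2>&1; then
            echo "$(date +%s) ok"
        else
            echo "$(date +%s) FAIL"
        fi
        sleep 1
    done
}

# Read that log on stdin and print each outage as "<start epoch> <seconds>".
summarize_outages() {
    awk '
        $2 == "FAIL" && !down { down = 1; start = $1 }
        $2 == "ok"   &&  down { down = 0; print start, $1 - start }
    '
}
```

Leave `watch_gateway <your gateway address> >> ping.log` running in a terminal, then run `summarize_outages < ping.log` after a freeze to see when it started and roughly how long it lasted.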

Good luck.

sgunhouse wrote:
| No idea what’s going on here. At first I thought it was the new version
| of Opera, but Firefox does the same thing.
|
| At various times, my network just freezes for 60 seconds at a time.
| Yes, the length is consistent (though I can’t guarantee it’s exactly 60
| seconds since I never know when it will happen, so I can’t time it with
| a stopwatch). I’ll just click on a link, or try to send an email
| message, or anything - and the program will indicate that it is trying
| to connect, but will just sit there. I can of course cancel the
| operation in whichever program I was using, but if I cancel and try
| again (and it’s been less than a minute) the same thing will happen.
|
| Depending on where it was in the operation when the network froze, I
| can get various network errors if I just wait for it to resume,
| “Connection closed by peer” or “Parameter error” if I was submitting a
| webform. Quite aggravating. And yes, 10.3 didn’t do this.
|
| System? 11.0, connecting to the network over a LAN. (Cable modem at the
| other end of the LAN, but the router takes care of that.)
|
|

Have a look in dmesg to see whether the link is maybe going up and down:

dmesg | grep -i link

Also check /var/log/messages
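
Since a plain grep for "link" also matches unrelated lines (ACPI interrupt links, netlink), a slightly narrower filter might help. A minimal sketch, assuming the interface is eth0 unless told otherwise; `link_events` is just a name picked for this example:

```shell
# Keep only log lines that mention both the given interface and "link".
# Reads from stdin, so it works on dmesg output or on a log file.
link_events() {
    iface=${1:-eth0}
    grep -i "$iface" | grep -i 'link'
}
```

Usage would be `dmesg | link_events eth0` or `link_events eth0 < /var/log/messages`. Repeated "link down" / "link up" pairs around the time of a freeze would point at the cable, switch port or driver.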

steve@linux-z3ki:~> dmesg | grep -i link
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 *5 6 7 10 11 14 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 10 *11 14 15)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 *10 11 14 15)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 10 *11 14 15)
ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 *5 6 7 10 11 14 15)
ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 6 7 10 *11 14 15)
ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 6 7 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 6 7 *10 11 14 15)
audit: initializing netlink socket (disabled)
ADDRCONF(NETDEV_UP): eth0: link is not ready
e100: eth0: e100_watchdog: link up, 100Mbps, full-duplex
ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready

Nothing I can identify there …

This could be interesting …

Jul 3 12:22:04 linux-z3ki dhclient: DHCPREQUEST on eth0 to 192.168.0.1 port 67
Jul 3 12:22:04 linux-z3ki dhclient: DHCPACK from 192.168.0.1
Jul 3 12:22:04 linux-z3ki dhclient: bound to 192.168.0.100 -- renewal in 1516 seconds.
Jul 3 12:47:20 linux-z3ki dhclient: DHCPREQUEST on eth0 to 192.168.0.1 port 67
Jul 3 12:47:20 linux-z3ki dhclient: DHCPACK from 192.168.0.1
Jul 3 12:47:20 linux-z3ki dhclient: bound to 192.168.0.100 -- renewal in 1553 seconds.
Jul 3 13:07:25 linux-z3ki syslog-ng[1738]: STATS: dropped 0
Jul 3 13:13:13 linux-z3ki dhclient: DHCPREQUEST on eth0 to 192.168.0.1 port 67
Jul 3 13:13:13 linux-z3ki dhclient: DHCPACK from 192.168.0.1
Jul 3 13:13:13 linux-z3ki dhclient: bound to 192.168.0.100 -- renewal in 1555 seconds.
Jul 3 13:39:08 linux-z3ki dhclient: DHCPREQUEST on eth0 to 192.168.0.1 port 67
Jul 3 13:39:08 linux-z3ki dhclient: DHCPACK from 192.168.0.1
Jul 3 13:39:08 linux-z3ki dhclient: bound to 192.168.0.100 -- renewal in 1620 seconds.
Jul 3 14:06:08 linux-z3ki dhclient: DHCPREQUEST on eth0 to 192.168.0.1 port 67
Jul 3 14:06:08 linux-z3ki dhclient: DHCPACK from 192.168.0.1
Jul 3 14:06:08 linux-z3ki dhclient: bound to 192.168.0.100 -- renewal in 1629 seconds.
Jul 3 14:07:25 linux-z3ki syslog-ng[1738]: STATS: dropped 0
Jul 3 14:33:17 linux-z3ki dhclient: DHCPREQUEST on eth0 to 192.168.0.1 port 67
Jul 3 14:33:17 linux-z3ki dhclient: DHCPACK from 192.168.0.1
Jul 3 14:33:17 linux-z3ki dhclient: bound to 192.168.0.100 -- renewal in 1606 seconds.
Jul 3 15:00:03 linux-z3ki dhclient: DHCPREQUEST on eth0 to 192.168.0.1 port 67
Jul 3 15:00:03 linux-z3ki dhclient: DHCPACK from 192.168.0.1
Jul 3 15:00:03 linux-z3ki dhclient: bound to 192.168.0.100 -- renewal in 1718 seconds.
Jul 3 15:07:25 linux-z3ki syslog-ng[1738]: STATS: dropped 0
Jul 3 15:28:41 linux-z3ki dhclient: DHCPREQUEST on eth0 to 192.168.0.1 port 67
Jul 3 15:28:41 linux-z3ki dhclient: DHCPACK from 192.168.0.1
Jul 3 15:28:41 linux-z3ki dhclient: bound to 192.168.0.100 -- renewal in 1480 seconds.
Jul 3 15:53:21 linux-z3ki dhclient: DHCPREQUEST on eth0 to 192.168.0.1 port 67
Jul 3 15:53:21 linux-z3ki dhclient: DHCPACK from 192.168.0.1
Jul 3 15:53:21 linux-z3ki dhclient: bound to 192.168.0.100 -- renewal in 1515 seconds.
Jul 3 16:07:25 linux-z3ki syslog-ng[1738]: STATS: dropped 0
Jul 3 16:18:36 linux-z3ki dhclient: DHCPREQUEST on eth0 to 192.168.0.1 port 67
Jul 3 16:18:36 linux-z3ki dhclient: DHCPACK from 192.168.0.1
Jul 3 16:18:36 linux-z3ki dhclient: bound to 192.168.0.100 -- renewal in 1718 seconds.

Not sure why it would be renewing the lease so often; the renewals are roughly 25 to 29 minutes apart. That sounds about like the frequency of my connection issues. The question is why 10.3 didn’t do the same thing.
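
To compare the renewal times against the times of the freezes, the "bound to" lines can be boiled down to a timestamp plus the renewal delay. A small sketch, assuming the syslog layout shown above (time of day in the third field); `renewal_intervals` is a made-up name:

```shell
# Read syslog lines on stdin and, for each dhclient "bound to" line,
# print the time of day and the announced renewal delay as minutes/seconds.
renewal_intervals() {
    awk '
        /dhclient: bound to/ {
            for (i = 1; i < NF; i++)
                if ($i == "renewal" && $(i + 1) == "in") {
                    secs = $(i + 2)
                    printf "%s %dm%02ds\n", $3, int(secs / 60), secs % 60
                }
        }
    '
}
```

Run it as `renewal_intervals < /var/log/messages`. If the freezes line up with the printed times, the DHCP renewal is a likely suspect; if they do not, it is probably a coincidence.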

openSUSE 11 includes a new version of dhcp, 3.0 if I remember correctly. There could be a bug in it…

I’ve also been having some strange connection hiccups (like RDP sessions disconnecting) and losing static routes I’ve set when a new lease comes in. I thought it had to do with the b44 (Broadcom) driver… but it could be a more general thing.

It’s quite late here, so I’ll get back in the morning and give the openSUSE bug reports a scan to see if I can find similar reports.
If not, it might be time to open one :)

Cheers,
Wj

Strangely, I see a message which from the title could be about the same issue (possibly) at the top of the forum just now. The network hanging for a minute at a time should be pretty obvious …

It could be coincidence, but today I’m having a stable connection on a network where I normally get the quoted behavior.
Also my /var/log/messages looks much cleaner…
Most important, dhcp and route table stay as I first set them this morning.

It could be a kernel (and other) update(s) I applied yesterday… my running kernel version is now 2.6.25.9-0.2-pae.

@Steve, Curious to know if it seems better for you too with the new kernel?

Cheers,
Wj

Some further investigation: turns out I can still produce hiccups when I connect to a small switch they are running here. (This morning I had connected using another cable that runs to another switch; last time I seem to have been connected to the small switch.)

The switch that is giving me this issue has its management IP on an old IP address that is different from the network subnet I’m in.

Looking at the output of dmesg I’m getting this:

martian source 192.168.30.241 from 130.57.4.15, on dev eth0
ll header: 00:15:c5:1c:13:69:00:04:75:d9:68:f5:08:00
martian source 192.168.30.241 from 130.57.4.15, on dev eth0
ll header: 00:15:c5:1c:13:69:00:04:75:d9:68:f5:08:00
martian source 192.168.30.241 from 130.57.4.15, on dev eth0
ll header: 00:15:c5:1c:13:69:00:04:75:d9:68:f5:08:00
martian source 192.168.30.241 from 195.135.221.140, on dev eth0
ll header: 00:15:c5:1c:13:69:00:04:75:d9:68:f5:08:00
martian source 192.168.30.241 from 195.135.221.140, on dev eth0
ll header: 00:15:c5:1c:13:69:00:04:75:d9:68:f5:08:00

195.135.221.x is the old subnet used and seems to be causing my NIC to disconnect / reconnect? Or maybe the switch is just faulty.

Strange thing is that other systems (Windows XP) don’t seem to be bothered by this.
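
To see at a glance which addresses the kernel is complaining about, the martian lines can be tallied per sender. A rough sketch, assuming the message layout shown above, where the address after "from" is the one the packet claims to come from; `martian_sources` is only an illustrative name:

```shell
# Read kernel log lines on stdin and count "martian source" entries
# per address following the word "from", most frequent first.
martian_sources() {
    awk '
        /martian source/ {
            for (i = 1; i < NF; i++)
                if ($i == "from") {
                    addr = $(i + 1)
                    sub(/,$/, "", addr)   # drop the trailing comma
                    count[addr]++
                }
        }
        END { for (a in count) print count[a], a }
    ' | sort -rn
}
```

Usage: `dmesg | martian_sources`. For the excerpt above it would list 130.57.4.15 three times and 195.135.221.140 twice.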

Using my wireless connection I’m also having no issues here… so it does seem to have to do with the connection to the switch with the 195.135.221.140 address.

I guess you could also be seeing this in a network that uses multiple subnets but has no VLAN-configured switches.

Curious if others see this happening too… Can’t test right now if openSUSE 10.3 would not have the issue.

-Wj

Magic31 wrote:

>
> Some further investigation: turns out I can still produce hiccups when I
> connect to a small switch they are running here. (This morning I had
> connected using another cable that runs to another switch; last time I
> seem to have been connected to the small switch.)
>
> The switch that is giving me this issue has its management IP on an
> old IP address that is different from the network subnet I’m in.
>
> Looking at the output of dmesg I’m getting this:
>
>>
>> martian source 192.168.30.241 from 130.57.4.15, on dev eth0
>> ll header: 00:15:c5:1c:13:69:00:04:75:d9:68:f5:08:00
>> martian source 192.168.30.241 from 130.57.4.15, on dev eth0
>> ll header: 00:15:c5:1c:13:69:00:04:75:d9:68:f5:08:00
>> martian source 192.168.30.241 from 130.57.4.15, on dev eth0
>> ll header: 00:15:c5:1c:13:69:00:04:75:d9:68:f5:08:00
>> martian source 192.168.30.241 from 195.135.221.140, on dev eth0
>> ll header: 00:15:c5:1c:13:69:00:04:75:d9:68:f5:08:00
>> martian source 192.168.30.241 from 195.135.221.140, on dev eth0
>> ll header: 00:15:c5:1c:13:69:00:04:75:d9:68:f5:08:00
>>
>
> 195.135.221.x is the old subnet used and seems to be causing my NIC to
> disconnect / reconnect? Or maybe the switch is just faulty.
>
> Strange thing is that other systems (Windows XP) don’t seem to be
> bothered by this.
>
> …
>
> Using my wireless connection I’m also having no issues here… so it
> does seem to have to do with the connection to the switch with the
> 195.135.221.140 address.
>
> I guess you could also be seeing this in a network that uses multiple
> subnets but has no VLAN-configured switches.
>
> …
>
> Curious if others see this happening too… Can’t test right now if
> openSUSE 10.3 would not have the issue.
>
> -Wj
>
>

Try investigating ECN (explicit congestion notification). It could be part of the
problem you’re facing.

A system-wide setting for it is located in /etc/sysconfig/sysctl. I think
it’s normally enabled, I forget, and I adjust the setting to off in my usual
setups.

I seem to recall somewhere that 10.3 defaulted to off, 11.0 defaulted to on.
Don’t know, hopefully someone could remind me.

The /etc/sysconfig/sysctl setting takes effect on boot. To adjust on the fly,

(as root)

to turn off

echo 0 > /proc/sys/net/ipv4/tcp_ecn

to turn on

echo 1 > /proc/sys/net/ipv4/tcp_ecn
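
Before changing anything it may be worth checking what the value currently is. A minimal sketch; `ecn_state` is a made-up helper, and the meaning given for the value 2 comes from the kernel's ip-sysctl documentation for newer kernels, so it may not apply to the kernel shipped with 11.0:

```shell
# Print a short description of the current tcp_ecn setting.
# An alternative file can be passed as the first argument (handy for testing).
ecn_state() {
    file=${1:-/proc/sys/net/ipv4/tcp_ecn}
    case $(cat "$file") in
        0) echo "ECN disabled" ;;
        1) echo "ECN enabled, also requested on outgoing connections" ;;
        2) echo "ECN accepted on incoming connections only" ;;   # newer kernels
        *) echo "unrecognised value" ;;
    esac
}
```

Run `ecn_state` before and after the echo commands above to confirm the change took. As far as I know, `sysctl -w net.ipv4.tcp_ecn=0` (as root) has the same effect as the echo.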

Loni


L R Nix
lornix@lornix.com

Hi Loni,

That’s a valuable tip! I will test this, but I won’t be at the site in question until later next week. I am curious to see if that setting fixes things.

Will report back…

Thanks,
Willem