Wifi is up, access point is visible, but network goes down.

This is 12.3 with Xfce and Network Manager, connecting through wifi.

Once in a while, sometimes twice a day, the network just disconnects: the bars in the NM icon turn all grey and there’s no way to reconnect short of a restart, which always works.

Changing the settings, restarting networking via either NM or systemctl, toggling the physical wifi button on the laptop, logging out: nothing helps. IPv4 addresses are set manually.

Here’s some terminal output:


stan@linux-ektu:~> /sbin/ifconfig
eth0      Link encap:Ethernet  HWaddr 00:0F:B0:80:8A:C0  
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
          Interrupt:21 

eth1      Link encap:Ethernet  HWaddr 00:12:F0:B1:C9:DA  
          inet addr:192.168.2.111  Bcast:192.168.2.255  Mask:255.255.255.0
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:2022 errors:0 dropped:0 overruns:0 frame:0
          TX packets:565 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:404037 (394.5 Kb)  TX bytes:178080 (173.9 Kb)

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:5232 errors:0 dropped:0 overruns:0 frame:0
          TX packets:5232 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:501005 (489.2 Kb)  TX bytes:501005 (489.2 Kb)

stan@linux-ektu:~> sudo /usr/sbin/iwlist scan
root's password:
lo        Interface doesn't support scanning.

eth0      Interface doesn't support scanning.

eth1      Scan completed :
          Cell 01 - Address: 00:22:2D:3F:C7:6A
                    ESSID:"SMC"
                    Protocol:IEEE 802.11bg
                    Mode:Master
                    Frequency:2.462 GHz (Channel 11)
                    Encryption key:off
                    Bit Rates:1 Mb/s; 2 Mb/s; 5.5 Mb/s; 6 Mb/s; 9 Mb/s
                              11 Mb/s; 12 Mb/s; 18 Mb/s; 22 Mb/s; 24 Mb/s
                              36 Mb/s; 48 Mb/s; 54 Mb/s
                    Quality=93/100  Signal level=-34 dBm  
                    Extra: Last beacon: 75ms ago

stan@linux-ektu:~> /usr/sbin/iwconfig
lo        no wireless extensions.

eth0      no wireless extensions.

eth1      IEEE 802.11bg  ESSID:"SMC"  
          Mode:Managed  Frequency:2.462 GHz  Access Point: 00:22:2D:3F:C7:6A   
          Bit Rate:54 Mb/s   Tx-Power=20 dBm   Sensitivity=8/0  
          Retry limit:7   RTS thr:off   Fragment thr:off
          Power Management:off
          Link Quality:0  Signal level:0  Noise level:0
          Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
          Tx excessive retries:0  Invalid misc:0   Missed beacon:0

stan@linux-ektu:~> ping 192.168.2.1
PING 192.168.2.1 (192.168.2.1) 56(84) bytes of data.
From 192.168.2.111 icmp_seq=1 Destination Host Unreachable
From 192.168.2.111 icmp_seq=2 Destination Host Unreachable
From 192.168.2.111 icmp_seq=3 Destination Host Unreachable
From 192.168.2.111 icmp_seq=4 Destination Host Unreachable
From 192.168.2.111 icmp_seq=5 Destination Host Unreachable
^C

What gives? Why does nothing but a restart fix it? One time I got the network back after troubleshooting with ifconfig and other terminal commands, but I can’t replicate that solution anymore.

After toggling wifi off and on, NM tries to connect, gives up, and shows the “no network” icon. At this point iwconfig and ifconfig look like this:

stan@linux-ektu:~> /usr/sbin/iwconfig
lo        no wireless extensions.

eth0      no wireless extensions.

eth1      IEEE 802.11bg  ESSID:"SMC"  
          Mode:Managed  Frequency:2.462 GHz  Access Point: 00:22:2D:3F:C7:6A   
          Bit Rate:54 Mb/s   Tx-Power=20 dBm   Sensitivity=8/0  
          Retry limit:7   RTS thr:off   Fragment thr:off
          Power Management:off
          Link Quality:0  Signal level:0  Noise level:0
          Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
          Tx excessive retries:0  Invalid misc:0   Missed beacon:0

stan@linux-ektu:~> /sbin/ifconfig
eth0      Link encap:Ethernet  HWaddr 00:0F:B0:80:8A:C0  
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
          Interrupt:21 

eth1      Link encap:Ethernet  HWaddr 00:12:F0:B1:C9:DA  
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:2624 errors:0 dropped:0 overruns:0 frame:0
          TX packets:565 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:465690 (454.7 Kb)  TX bytes:178080 (173.9 Kb)

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:5640 errors:0 dropped:0 overruns:0 frame:0
          TX packets:5640 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:538938 (526.3 Kb)  TX bytes:538938 (526.3 Kb)


The gateway settings and IP address are gone, but the SSID and access point are still visible.

From what you posted,

What kind of wireless NIC are you using?
How was it set up?
Ordinarily, eth interfaces are wired, not wireless.

Aside from those anomalies,
Research your NIC
Consider alternate drivers
Reboot the AP now and maybe regularly
Consider replacing the AP

TSU

It’s a notebook circa 2006 that had openSUSE on it for three or four years without any wifi problems, but 12.3 was a fresh install and is now the only OS on it. I don’t remember what drivers were in use on previous installs.


stan@linux-ektu:~> lspci -nnk

.....

05:02.0 Network controller [0280]: Intel Corporation PRO/Wireless 2200BG [Calexico2] Network Connection [8086:4220] (rev 05)
    Subsystem: Intel Corporation WM3B2300BG Mini-PCI Card [8086:2701]
    Kernel driver in use: ipw2200

After install I clicked on NM icon and set wireless connection, that’s all.

It’s interesting that it reports the wifi connection as eth1 and not as wlan, thanks for pointing that out. Can it be corrected?

By rebooting and replacing AP you mean mess with the router? There are tons of other devices on my home network and they work okay even when this laptop drops connection.

On 08/13/2013 05:36 AM, Stan Ice wrote:
>
> It’s a notebook circa 2006 that had openSUSE on it for three or four
> years without any wifi problems, but 12.3 was a fresh install and is now
> the only OS on it. I don’t remember what drivers were in use on previous
> installs.
>
>
> Code:
> --------------------
>
> stan@linux-ektu:~> lspci -nnk
>
> …
>
> 05:02.0 Network controller [0280]: Intel Corporation PRO/Wireless 2200BG [Calexico2] Network Connection [8086:4220] (rev 05)
> Subsystem: Intel Corporation WM3B2300BG Mini-PCI Card [8086:2701]
> Kernel driver in use: ipw2200
>
> --------------------
>
>
> After install I clicked on NM icon and set wireless connection, that’s
> all.
>
>
> It’s interesting that it reports wifi connection as eth1 and not as
> wlan, thanks for pointing that out. Can it be corrected?

Why do you think it needs correcting? Wireless will run perfectly well on eth1
or even xyz23. The renaming is probably done by one of the rules in
/etc/udev/rules.d/. File 70-persistent-net.rules is the first place to look.

> By rebooting and replacing AP you mean mess with the router? There are
> tons of other devices on my home network and they work okay even when
> this laptop drops connection.

Have you checked the output of the dmesg command for clues? Perhaps the firmware
is not available. You can ensure that it is installed with


sudo zypper install ipw-firmware

The file needed is /lib/firmware/ipw2200-ibss.fw.

Agreed, was pointing out the interface name only because it’s a possible indicator of a more substantive issue (eg not installing the correct driver) but the interface name itself is only cosmetic.

So, this is what I’m thinking…
From what you’ve posted it looks like the same kernel driver may be used for both wired and wireless (weird to me, but likely entirely possible). So, when you see the eth interface it seems to me that only your wired interface is configured.

Is there a hardware or other switch to turn on your wireless radio?

If the radio is definitely on, then I’d suspect that there should be some documentation specific to your driver regarding configuration.
Who knows, maybe there is another wireless driver with a similar name that needs to be loaded.

TSU

Here’s a thought…

When you’re connecting wirelessly, <are you disconnecting the wired interface>?
That’d be critical.

Also, there may be a latency issue when switching from wired to wireless, if switching is possible at all. You may have to at least stop and restart networking, or even reboot, to “set” your system to use wireless instead of wired.

If you continue to experience problems, post your

ip addr

TSU

Ethernet chip and driver are different:


...
05:01.0 Ethernet controller [0200]: Broadcom Corporation NetXtreme BCM5788 Gigabit Ethernet [14e4:169c] (rev 03)
    Subsystem: Acer Incorporated [ALI] Device [1025:0081]
    Kernel driver in use: tg3

I don’t think I’ve ever tried wired connection on this install, there’s definitely no switching going on.

Laptop has a physical button to turn wifi on and off but that doesn’t solve the problem and has no effect on configuration, as far as I can tell.

I didn’t think it’s a firmware problem, I checked it and the driver before starting the thread, there’s nothing suspicious there, apart from eth1 name that could mean something is installed incorrectly.

The file /lib/firmware/ipw2200-ibss.fw is there.

This is from 70-persistent-net.rules

SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:12:f0:b1:c9:da", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth1"
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:0f:b0:80:8a:c0", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"

And this is installing firmware just in case:


stan@linux-ektu:~> sudo zypper install ipw-firmware
root's password:
Retrieving repository 'Packman Repository' metadata ......................[done]
Building repository 'Packman Repository' cache ...........................[done]
Retrieving repository 'openSUSE-12.3-Update' metadata ....................[done]
Building repository 'openSUSE-12.3-Update' cache .........................[done]
Retrieving repository 'openSUSE-12.3-Update-Non-Oss' metadata ............[done]
Building repository 'openSUSE-12.3-Update-Non-Oss' cache .................[done]
Loading repository data...
Reading installed packages...
Resolving package dependencies...

Nothing to do.
stan@linux-ektu:~> 


The dmesg output should be checked when wifi fails, and I should look at its very end, right? So far I haven’t seen the problem recur since I started the thread; it might take a while.

I haven’t tried “ip addr”, only “ifconfig”. What happens is that when wifi fails, the address is still reported as usual until I try to reconnect; after that it’s gone and only the AP is visible. You can see it in my first post: there are two “ifconfig” outputs there, before and after reconnecting/restarting wireless.

On 08/13/2013 10:56 PM, Stan Ice wrote:
>
> Ethernet chip and driver are different:
>
>
> Code:
> --------------------
>
> …
> 05:01.0 Ethernet controller [0200]: Broadcom Corporation NetXtreme BCM5788 Gigabit Ethernet [14e4:169c] (rev 03)
> Subsystem: Acer Incorporated [ALI] Device [1025:0081]
> Kernel driver in use: tg3
>
> --------------------
>
>
> I don’t think I’ve ever tried wired connection on this install, there’s
> definitely no switching going on.
>
> Laptop has a physical button to turn wifi on and off but that doesn’t
> solve the problem and has no effect on configuration, as far as I can
> tell.
>
> I didn’t think it’s a firmware problem, I checked it and the driver
> before starting the thread, there’s nothing suspicious there, apart from
> eth1 name that could mean something is installed incorrectly.
>
> The file /lib/firmware/ipw2200-ibss.fw is there.
>
> This is from 70-persistent-net.rules
>
> SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*",
> ATTR{address}=="00:12:f0:b1:c9:da", ATTR{dev_id}=="0x0",
> ATTR{type}=="1", KERNEL=="eth*", NAME="eth1"

The above rule applies to a network interface with MAC address 00:12:f0:b1:c9:da
and a kernel name that matches “eth*” and renames it to “eth1”. From this, it
seems that driver ipw2200 creates device ethX, not wlanX. It is an old driver
from kernels older than the 2.6 series, and the convention has changed; however,
changing its behavior could break lots of systems. Ignore the name.
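
Since the name says nothing about the device type, here is a minimal sketch of one way to tell wired from wireless interfaces by looking at sysfs instead. It assumes the driver exposes either a "wireless" directory (wireless extensions) or a "phy80211" link (cfg80211) under /sys/class/net; the function names and the SYSFS_NET variable are made up for illustration.

```shell
#!/bin/sh
# Report whether a network interface is wireless, regardless of its name.
# SYSFS_NET is a hypothetical override so the check can be pointed elsewhere;
# by default it is the real sysfs location.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

is_wireless() {
    # Wireless-extensions drivers expose a "wireless" directory,
    # cfg80211 drivers expose a "phy80211" link.
    [ -e "$SYSFS_NET/$1/wireless" ] || [ -e "$SYSFS_NET/$1/phy80211" ]
}

list_interfaces() {
    # Print one line per interface with a rough classification.
    for path in "$SYSFS_NET"/*; do
        [ -e "$path" ] || continue
        name=${path##*/}
        if is_wireless "$name"; then
            echo "$name wireless"
        else
            echo "$name wired-or-virtual"
        fi
    done
}
```

Calling list_interfaces on this laptop would be expected to mark eth1 as wireless, if the assumption about sysfs holds for ipw2200.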

As you have not shown the dmesg output at the time the network disconnects,
there is little more I can do or say. Your before and after iwconfig outputs
show that the interface still has a connection to the access point, but the
ifconfig output shows that the IP address has been dropped.

Please install the rfkill package and post the output of ‘/usr/sbin/rfkill list’
when the network goes down.
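
Because the outage is rare, it may help to save everything in one go when it happens. The following is only a sketch: the interface name eth1 comes from the output posted earlier, while the output directory and the function name are assumptions. Commands that are missing or fail simply leave their error text in the file.

```shell
#!/bin/sh
# Save the state of the wireless interface to a timestamped file.
# IFACE and OUTDIR are assumptions; adjust them to your system.
IFACE="${IFACE:-eth1}"
OUTDIR="${OUTDIR:-$HOME/wifi-debug}"

snapshot() {
    mkdir -p "$OUTDIR" || return 1
    out="$OUTDIR/wifi-$(date +%Y%m%d-%H%M%S).log"
    {
        echo "== rfkill list =="
        /usr/sbin/rfkill list 2>&1 || true
        echo "== ip addr show $IFACE =="
        ip addr show "$IFACE" 2>&1 || true
        echo "== iwconfig $IFACE =="
        /usr/sbin/iwconfig "$IFACE" 2>&1 || true
        echo "== dmesg (last 30 lines) =="
        dmesg 2>&1 | tail -n 30 || true
    } > "$out"
    # Print the file name so it can be attached to a post.
    echo "$out"
}
```

Run snapshot once when the network drops and once more after trying to reconnect, then post both files.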

Okay, spotted network outage today.

As it went down:

stan@linux-ektu:~> /usr/sbin/rfkill list
0: phy0: Wireless LAN
    Soft blocked: no
    Hard blocked: no
stan@linux-ektu:~> ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
2: eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN qlen 1000
    link/ether 00:0f:b0:80:8a:c0 brd ff:ff:ff:ff:ff:ff
3: eth1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN qlen 1000
    link/ether 00:12:f0:b1:c9:da brd ff:ff:ff:ff:ff:ff
    inet 192.168.2.111/24 brd 192.168.2.255 scope global eth1

Unfortunately, the “ip addr” output after trying to reconnect got lost; I only remember that the IP address wasn’t set. “rfkill” was exactly the same, and so was dmesg.

This is the last dmesg line, after roughly 6-7 hours of up time.

[26617.864791] ipw2200: Failed to send ASSOCIATE: Already sending a command.

I can upload entire dmesg to “paste”, I don’t think it’s relevant, though.

On 08/15/2013 06:56 AM, Stan Ice wrote:
>
> This is the last dmesg line, after roughly 6-7 hours of up time.
>
>
> Code:
> --------------------
> [26617.864791] ipw2200: Failed to send ASSOCIATE: Already sending a command.
> --------------------
>
>
> I can upload entire dmesg to “paste”, I don’t think it’s relevant,
> though.

I don’t think so either, but the two or three entries before this one would be useful. A
little research on the kernel patch history shows the following commit:

============================================================================
commit dd447319895d0c0af423e483d9b63f84f3f8869a
Author: Stanislav Yakovlev <stas.yakovlev@gmail.com>
Date: Thu Apr 19 15:55:09 2012 -0400

ipw2200: Fix race condition in the command completion acknowledge

Driver incorrectly validates command completion: instead of waiting
for a command to be acknowledged it continues execution. Most of the
time driver gets acknowledge of the command completion in a tasklet
before it executes the next one. But sometimes it sends the next
command before it gets acknowledge for the previous one. In such a
case one of the following error messages appear in the log:

Failed to send SYSTEM_CONFIG: Already sending a command.
Failed to send ASSOCIATE: Already sending a command.
Failed to send TX_POWER: Already sending a command.

After that you need to reload the driver to get it working again.

This bug occurs during roaming (reported by Sam Varshavchik)
https://bugzilla.redhat.com/show_bug.cgi?id=738508
and machine booting (reported by Tom Gundersen and Mads Kiilerich)
https://bugs.archlinux.org/task/28097
https://bugzilla.redhat.com/show_bug.cgi?id=802106

This patch doesn’t fix the delay issue during firmware load.
But at least device now works as usual after boot.

Cc: stable@kernel.org
Signed-off-by: Stanislav Yakovlev <stas.yakovlev@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

============================================================================

The symptoms certainly seem to be the same as you are seeing. This patch has
been applied to the openSUSE 12.3 kernel, thus I would guess that it did not fix
all the race conditions.

It does say that you need to reload the driver. Try the following:


sudo /sbin/modprobe -rv ipw2200
sudo /sbin/modprobe -v ipw2200

That should eliminate the need to reboot at that point.
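
If typing that each time gets tedious, the check and the reload could be wrapped in a small helper. This is a rough sketch, not a tested watchdog: the function names are invented, the pattern is taken from the messages listed in the commit text, and the reload still needs root. It only looks at the last 20 lines of dmesg, so it is meant to be run by hand when the network drops.

```shell
#!/bin/sh
# Detect the ipw2200 "Already sending a command" failure in kernel log text
# and reload the driver. Function names are illustrative only.

has_cmd_race() {
    # Reads log text on stdin; succeeds if the driver reported the error.
    grep -q 'ipw2200: Failed to send .*: Already sending a command'
}

reload_ipw2200() {
    # The same two modprobe commands as above; must be run as root.
    /sbin/modprobe -rv ipw2200 && /sbin/modprobe -v ipw2200
}

check_and_reload() {
    # Only the most recent kernel messages are examined.
    if dmesg | tail -n 20 | has_cmd_race; then
        reload_ipw2200
    fi
}
```

Run check_and_reload as root; if the error is not among the recent messages it does nothing.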

For the next step, I would contact Stanislav Yakovlev <stas.yakovlev@gmail.com>
with Cc to linux-wireless@vger.kernel.org and linux-kernel@vger.kernel.org
reporting that the problem still exists even though the patch in commit dd44731
has been applied. He will probably ask for some additional debugging info, which
can be enabled by a module parameter at load time.

Preceding lines in dmesg are from several hours earlier and were probably produced when I applied patches and updates:


[   50.470932] Bluetooth: BNEP filters: protocol multicast
[   50.470945] Bluetooth: BNEP socket layer initialized
[   50.805802] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[   51.056471] NET: Registered protocol family 17
[   74.712863] fuse init (API version 7.20)
[   88.705608] IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
[   97.874503] EXT4-fs (sda6): re-mounted. Opts: acl,user_xattr,commit=0
[  117.214368] EXT4-fs (sda7): re-mounted. Opts: acl,user_xattr,commit=0
[ 9162.601081] device-mapper: uevent: version 1.0.3
[ 9162.601201] device-mapper: ioctl: 4.23.0-ioctl (2012-07-25) initialised: dm-devel@redhat.com
[26617.864791] ipw2200: Failed to send ASSOCIATE: Already sending a command.

Reloading driver fixes the problem without needing a reboot, thanks.

I sent an e-mail to the above addresses, let’s see what they say. One of them, linux-wireless@vger.kernel.org, is apparently dead.

I suspect the driver is too old to be fixed. Interestingly, I’ve never had this issue on previous installs. Maybe they used a different driver, no way to tell now.

Where can I check for other possible drivers?

On 08/21/2013 03:36 AM, Stan Ice wrote:
> Reloading driver fixes the problem without needing a reboot, thanks.

Good.

> I sent an e-mail to the above addresses, let’s see what they say. One
> of them, linux-wireless@vger.kernel.org, is apparently dead.

I get 50-60 E-mails each day from linux-wireless@vger.kernel.org. It is most
definitely not dead! Try again there.

> I suspect the driver is too old to be fixed. Interestingly, I’ve never
> had this issue on previous installs. Maybe they used a different driver,
> no way to tell now.
>
> Where can I check for other possible drivers?

There is only one driver for this device. Yes, it is old, but it seems as though
something changed in the kernel that caused it to fail. Of course, one other
interpretation is that your hardware is failing, but we will choose that
explanation only when everything else is eliminated.

The other fix mentioned a “race” condition, which seems a likely possibility.
Something in the code occasionally gets done in the wrong order and that causes
the CPU on the device to lock up. These kinds of things could be caused by
subtle bugs in either the driver or the firmware.

There are not many changes in this driver between 3.7.10-1.16 and 3.11-rc6, but
some of them look as though they might be significant. If you follow the
instructions at http://en.opensuse.org/openSUSE:Kernel_of_the_day, you will
obtain a copy of the latest kernel, and you will be able to see if it fixes your
problem.

Yesterday my emails to both linux-wireless@vger.kernel.org and linux-kernel@vger.kernel.org bounced back as undeliverable. Reading through the automated reply from their server, I see that it rejected HTML that had crept into the text. I resent the message as plain text and so far there are no rejections; let’s see what they answer.

Let’s see if updating the kernel is worth it.