Problem - one of my network connections being dropped.

I’m using wicked for network management and am connected to 2 networks. Both have web access via their own independent ISPs, but I only use the 2nd one to access the printer on it. Both networks obtain their address from their respective routers. The printer IP address is fixed in the router to allow cups to work with it. Printer access is via wifi. My normal web access via the other router is wired.

This has worked flawlessly for months but I have had problems for the last few weeks. For some reason the wired connection is dropped and the system automatically falls back to accessing the web via the wifi connection. When this happens the router on the wired connection shows a red fault light and no connection to my pc. If I reset the router and reboot, all goes back to normal: web via wire and printer via wifi to the other network. There never seem to be any problems with the router I am connected to via wifi.

Initially, as I and others are involved in some extensive house modifications, I thought that the lead to the router was damaged. That has been replaced and the problem still occurs. It seems to happen whenever the pc is on but not used for some time. Hard to be sure about that though.

I’m not sure what to do to investigate further. One thing that would help is some method of manually bringing the wired connection on eth0 back up for web access. ifup just brings up a message stating set up in progress. Same if I use an ifdown first. wlan0 can be brought down and back up without any problems.

John

From looking around this should set my network connections in the same way as they are from boot.


systemctl restart network

However status then shows this


systemctl status network
● wicked.service - wicked managed network interfaces
   Loaded: loaded (/usr/lib/systemd/system/wicked.service; enabled; vendor preset: disabled)
   Active: active (exited) since Thu 2017-07-06 11:27:36 BST; 1min 9s ago
  Process: 4419 ExecStop=/usr/sbin/wicked --systemd ifdown all (code=exited, status=0/SUCCESS)
  Process: 4549 ExecStart=/usr/sbin/wicked --systemd ifup all (code=exited, status=0/SUCCESS)
 Main PID: 4549 (code=exited, status=0/SUCCESS)

Jul 06 11:27:06 localhost systemd[1]: Starting wicked managed network interfaces...
Jul 06 11:27:36 localhost wicked[4549]: lo              up
Jul 06 11:27:36 localhost wicked[4549]: eth0            setup-in-progress
Jul 06 11:27:36 localhost wicked[4549]: wlan0           up
Jul 06 11:27:36 localhost systemd[1]: Started wicked managed network interfaces.

The router on eth0 was showing a green light - logged in and connected to my isp. Eth0’s config file also shows it as the default but the net is still coming in via wlan0. The router on eth0 doesn’t show that a pc is connected, so the restart hasn’t actually requested an address from it.
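For anyone reading along, one way to pick out the interface that wicked has not finished configuring is to filter that status output for any state other than "up". This is only a rough sketch: it assumes the log lines look like the ones above ("wicked[PID]: NAME STATE"), and not_up_interfaces is a made-up helper name, not part of wicked.

```shell
#!/bin/sh
# Hypothetical helper (not part of wicked): list interfaces whose
# reported state is anything other than "up".
# Reads `systemctl status network` output on stdin and assumes the
# line layout "... wicked[PID]: NAME STATE" seen in the output above.
not_up_interfaces() {
    awk 'NF >= 3 && $(NF-2) ~ /^wicked\[[0-9]+\]:$/ && $NF != "up" {
        print $(NF-1), $NF
    }'
}

# Usage (needs a running systemd with wicked):
#   systemctl status network | not_up_interfaces
```

If eth0 keeps coming back as setup-in-progress, `cat /sys/class/net/eth0/carrier` is a quick extra check: it normally prints 1 when the kernel sees a link on the cable and 0 when it does not.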

John

What happens if you set the config for eth0 to static IP ? Don’t forget to add 8.8.8.8 and 8.8.4.4 as DNS and your router’s internal IP as the gateway.

Thanks. I seem to have solved the problem - well at least the connection indicator lights are now working correctly and the restart network command has set up my connections correctly. Suddenly dropping eth0 and switching to wlan0 for internet needs a longer test.

I had mixed feelings about the cable I was using - ex my son, only used a couple of times and ‘perfectly ok’. Cat 6 and expensive too. The plugs on the end didn’t seem to latch that solidly to me so I went out and bought another from pcworld. The system fired up the connection as soon as I plugged it in. It didn’t automatically switch internet to it but systemctl restart network sorted that out correctly.

;) Must admit it’s a better cable. It uncurled pretty easily for a 10m cable straight from the box and the plugs on the end do fit well and latch positively.

I run the printer on 192.168.1.x and the web on 192.168.2.x as follows


ip route show
default via 192.168.2.1 dev eth0  proto dhcp 
192.168.1.0/24 dev wlan0  proto kernel  scope link  src 192.168.1.160 
192.168.2.0/24 dev eth0  proto kernel  scope link  src 192.168.2.2 

The printer address on wlan0 is fixed on the router it’s connecting to. Both links were set to dhcp via yast.

John

In setups like this I prefer to have control, so a static IP, outside e.g. the DHCP range of the router.

I had some help on here setting this up and did try static addresses. A connection to the network the printer was on via wlan0 was established but not long after the router kicked it out. I assume because it didn’t set the address itself. So out of curiosity I just used the same method I would use setting a normal network connection via yast - little to do after checking use dhcp. That connected to it and also it seems added a default route line to my eth0 config file and installed some or all of modem manager to handle the wifi on wlan0. I then connected to the printer router via wifi and set the printer to a static address and used that to set cups up.

John

Did you check the Ethernet path with Wireshark running on the machine experiencing the “shaky copper” connection (dropped packets and so on . . . )?
I had a similar cable issue a while ago with the DVI monitor cable – intermittent “black screen” while booting – a new graphics card didn’t fix the issue but a new cable did (also expensive) . . .

No. :shame: I have used wireshark in the past but my immediate thought was the router log. Now I can get at that reliably it’s showing about 3,000 recoverable errors and a couple of hundred that weren’t so I’ve cleared it.

It looks like the main problem was the cable. My connection remained up overnight and hasn’t switched to the wlan one. Previously it had been dropping the eth0 connection every night and sometimes during the day.

I still seem to have one problem - claws mail. From time to time it’s failing to connect to the isp to download emails. Symptoms are odd though. The pop up reporting the error doesn’t show so it appears to have locked up. If I log out and back in the pop up does show and claws is ok - so if the pop up had shown and was cleared all would have been ok. So far it has done this once with the new cable. First thing to sort is why the error pop up is not displaying. This seems to be due to kde etc upgrades. There may be other reasons for failing to connect. New cordless phones installed and the wifi router has been moved. It’s now closer to the telephone cables. On the other hand all could be due to some software upgrade. I am running the supported version of claws using pop…

John

If I were to guess, your problem is simple… Your wireless has a Default Gateway.
Remove it (Everything else, including net address, subnet mask, DNS, etc should remain), restart your network service and you should be fine.

Needless to say, your wired connection should have an assigned DG.
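If the wireless side stays on DHCP, one way to do this with wicked is a per-interface override in the ifcfg file. The fragment below is a sketch only: it assumes your release honours DHCLIENT_SET_DEFAULT_ROUTE in ifcfg files as described in ifcfg-dhcp(5), so check that man page before relying on it.

```shell
# /etc/sysconfig/network/ifcfg-wlan0 (fragment, leave the other lines as they are)
# Assumption: the DHCP client skips installing a default route for this
# interface when this is set to 'no' (see ifcfg-dhcp(5) on your release).
DHCLIENT_SET_DEFAULT_ROUTE='no'
```

After editing, systemctl restart network should apply it, and ip route should then show a single default line pointing at the wired interface.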

If you had posted an “ip addr” in the beginning, this would have been apparent to everyone.

Bonus points but not likely necessary from your description… If your printer didn’t have the same NetworkID as the wireless router, then you’d have to create a special routing table entry for your printer.

TSU

How about a list of (root and user) CLI commands needed to troubleshoot these issues? For example:


 # ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 14:da:e9:ec:a0:4d brd ff:ff:ff:ff:ff:ff
    inet 192.168.178.22/24 brd 192.168.178.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 2003:65:ee1f:8c00:5939:7667:592c:ff37/64 scope global temporary dynamic
       valid_lft 7112sec preferred_lft 1712sec
    inet6 2003:65:ee1f:8c00:16da:e9ff:feec:a04d/64 scope global mngtmpaddr dynamic
       valid_lft 7112sec preferred_lft 1712sec
    inet6 fe80::16da:e9ff:feec:a04d/64 scope link
       valid_lft forever preferred_lft forever
 #
 # netstat -i
Kernel Interface table
Iface   MTU Met    RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0   1500   0   117273      0      2      0    77590      0      0      0 BMRU
lo    65536   0      500      0      0      0      500      0      0      0 LRU
 #
 # ethtool --statistics eth0
NIC statistics:
     tx_packets: 77515
     rx_packets: 117126
     tx_errors: 0
     rx_errors: 0
     rx_missed: 0
     align_errors: 0
     tx_single_collisions: 0
     tx_multi_collisions: 0
     unicast: 98327
     broadcast: 14697
     multicast: 4102
     tx_aborted: 0
     tx_underrun: 0
 # 

 > 
 > l /sys/class/net/eth0/statistics/
total 0
drwxr-xr-x 2 root root    0  7. Jul 18:04 ./
drwxr-xr-x 5 root root    0  7. Jul 10:31 ../
-r--r--r-- 1 root root 4096  7. Jul 10:37 collisions
-r--r--r-- 1 root root 4096  7. Jul 10:37 multicast
-r--r--r-- 1 root root 4096  7. Jul 10:37 rx_bytes
-r--r--r-- 1 root root 4096  7. Jul 10:37 rx_compressed
-r--r--r-- 1 root root 4096  7. Jul 10:37 rx_crc_errors
-r--r--r-- 1 root root 4096  7. Jul 10:37 rx_dropped
-r--r--r-- 1 root root 4096  7. Jul 10:37 rx_errors
-r--r--r-- 1 root root 4096  7. Jul 10:37 rx_fifo_errors
-r--r--r-- 1 root root 4096  7. Jul 10:37 rx_frame_errors
-r--r--r-- 1 root root 4096  7. Jul 10:37 rx_length_errors
-r--r--r-- 1 root root 4096  7. Jul 10:37 rx_missed_errors
-r--r--r-- 1 root root 4096  7. Jul 10:37 rx_over_errors
-r--r--r-- 1 root root 4096  7. Jul 10:37 rx_packets
-r--r--r-- 1 root root 4096  7. Jul 10:37 tx_aborted_errors
-r--r--r-- 1 root root 4096  7. Jul 10:37 tx_bytes
-r--r--r-- 1 root root 4096  7. Jul 10:37 tx_carrier_errors
-r--r--r-- 1 root root 4096  7. Jul 10:37 tx_compressed
-r--r--r-- 1 root root 4096  7. Jul 10:37 tx_dropped
-r--r--r-- 1 root root 4096  7. Jul 10:37 tx_errors
-r--r--r-- 1 root root 4096  7. Jul 10:37 tx_fifo_errors
-r--r--r-- 1 root root 4096  7. Jul 10:37 tx_heartbeat_errors
-r--r--r-- 1 root root 4096  7. Jul 10:37 tx_packets
-r--r--r-- 1 root root 4096  7. Jul 10:37 tx_window_errors
 > 

The counters in “/sys/class/net/eth0/statistics/” are readable by everyone, but the interpretation is easier with “ethtool” (root only).
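Since those files hold one number each, a short loop can show just the counters that are not zero, which is handy when a cable is suspect. A minimal sketch, where nonzero_error_counters is a made-up helper name and the directory is passed in as an argument:

```shell
#!/bin/sh
# Hypothetical helper: print every non-zero error, drop, or collision
# counter found in a statistics directory such as
# /sys/class/net/eth0/statistics (readable without root).
nonzero_error_counters() {
    dir=$1
    for f in "$dir"/*_errors "$dir"/*_dropped "$dir"/collisions; do
        # Skip unmatched globs and unreadable files.
        [ -r "$f" ] || continue
        read -r value < "$f" || continue
        if [ "$value" -ne 0 ]; then
            printf '%s %s\n' "${f##*/}" "$value"
        fi
    done
    return 0
}

# Usage:
#   nonzero_error_counters /sys/class/net/eth0/statistics
```

Running it twice a few minutes apart shows whether a counter such as rx_crc_errors is still climbing.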

Forgot to mention “ip route”, which displays the routing table for your interfaces:

ip route

If you didn’t extract the reasoning behind my idea,

  • By default DHCP and many people set up Default Gateways on all interfaces, which is typically wrong. The word “default” in itself should provide the clue that there should be only one.
  • If you have multiple Default Gateways, then some of your machine’s network requests will be routed through a wrong gateway, and responses may be routed to the wrong interface.
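To check this quickly, the default lines can be pulled out of the ip route output. A rough sketch, with default_route_devs being a made-up helper name; more than one line of output means more than one default gateway is installed.

```shell
#!/bin/sh
# Hypothetical helper: print the device of every default route found
# in `ip route show` output read from stdin.
default_route_devs() {
    awk '$1 == "default" {
        for (i = 2; i < NF; i++)
            if ($i == "dev") print $(i + 1)
    }'
}

# Usage:
#   ip route show | default_route_devs
```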

TSU

On Fri, 07 Jul 2017 16:36:02 GMT
dcurtisfra <dcurtisfra@no-mx.forums.microfocus.com> wrote:

> The counters in “/sys/class/net/eth0/statistics/” are readable by
> everyone but, the interpretation is easier with “ethtool” (root only).

<Nitpick>
ethtool --statistics INTERFACE

works fine as restricted user.

</Nitpick>

AK


Never attribute to malice that which can be adequately explained by stupidity.
(R.J. Hanlon)