ARP cache table never deletes any item (openSuSE 13.1)

After upgrading to openSuSE 13.1 (zypper dup), the ARP cache is no longer cleaned. Dead entries stay in the table, each with its MAC address, even though the corresponding machines have been switched off for many days.

After the command “ip -s -s neigh flush all”, the dead entries only change to “(incomplete)” in the HWaddress column, but they stay in the ARP cache table forever.

Before 13.1, dead entries were regularly and automatically changed to “(incomplete)” and then disappeared from the ARP cache table.

I just checked.

There’s an entry in my arp cache, for a WiFi connected computer that has been powered off for two days.

On the other hand, there is no entry for an ethernet computer that I used yesterday, but is now powered off.

I have no idea whether the WiFi vs. ethernet makes a difference.

I found that either “ifconfig <interface> down/up” or “systemctl restart network” cleans the ARP cache for the interface (for all interfaces in the case of a restart). Unplugging and replugging the cable (or restarting the switch) has the same result as “ifconfig eth0 down” followed by “ifconfig eth0 up”. Maybe one of these was the reason the ethernet entry (or entries) got cleaned.

Has “Wifi entry” defined HWaddress (MAC), or is there an “(incomplete)” string?

I have had this situation on every machine since the update from 12.3 to 13.1. The ARP cache is cleaned regularly only on the last one, which still runs openSuSE 12.3.

Does “Wifi entry” have … of course

sorry for my CZenglish :frowning:

It still had the HWaddress at the time. Checking again, it now says incomplete.

All entries stayed in the ARP cache the whole weekend, even though the machines had been switched off since Friday evening. This also happens under the new kernel 3.11.10-7.1, which I installed on Friday on all servers (several kernel_default on x86_64 machines and also one kernel_pae). Only the last installation, with openSuSE 12.3, cleans the ARP cache as needed.

I did another check. I am still seeing this.

What I am not seeing, is any serious problem.

I ping a system that has been down since yesterday. And, shortly thereafter, I see the arp entry change to “incomplete”.

It looks as if ping tried 10 times, then removed the arp data. The ping output gives destination unreachable from sequence 10 on.

If you ping the “dead” machine, the corresponding entry changes to “(incomplete)”. But I know that under openSuSE 12.3 the “(incomplete)” entries are removed after a while. I have no idea which process takes care of that, and I also have no idea about the timeouts for cleaning “(incomplete)” entries, because until now I had no reason to be interested in them.
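On the question of timeouts: the kernel exposes its neighbour-table tunables as files under /proc/sys/net/ipv4/neigh/. The following is only a minimal sketch for printing some of them. The helper name show_neigh_tunables is made up for illustration, and nothing in this thread establishes which of these values (if any) explains the difference between 12.3 and 13.1.

```sh
#!/bin/sh
# Print selected neighbour-table tunables found in a directory.
# show_neigh_tunables is a hypothetical helper name; the default directory
# holds the kernel's "default" IPv4 neighbour settings.
show_neigh_tunables() {
    dir=${1:-/proc/sys/net/ipv4/neigh/default}
    for name in gc_interval gc_stale_time base_reachable_time_ms \
                gc_thresh1 gc_thresh2 gc_thresh3; do
        # Skip anything that is missing or unreadable on this system.
        if [ -r "$dir/$name" ]; then
            printf '%s = %s\n' "$name" "$(cat "$dir/$name")"
        fi
    done
}

show_neigh_tunables
```

Running the same script on a 12.3 and a 13.1 machine and comparing the output would show whether any of these defaults differ between the two.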

I’m not disagreeing that this is a change in behavior. Presumably, this is a change in kernel networking code.

What I won’t do, at present, is say that the new behavior is wrong. There’s no easy way to decide that. The arp cache is, after all, a cache. If it is useful to keep old entries in the cache, then it is not wrong.

Here’s what really matters: suppose I shut down a machine, and replace its ethernet card. When I power up the machine, will it be recognized by other systems on the LAN?

I don’t currently have a spare ethernet card sitting around to test that. The fact that an entry reverts to “incomplete” after a failed ping suggests that this will work properly.

Sorry, but I really do say that the behavior in openSuSE 13.1 is wrong, because of the DHCP server. For example: if the HWaddress never changes to at least “(incomplete)”, I think (but I may be mistaken) that the lease for that IP address will never be released.

And it has happened to me that when I pinged a switched-off machine (whose firewall did not block ICMP), nothing happened. No response, no “no response”. Simply nothing. Only after the command “ip -s -s neigh flush all” did the HWaddress change to “(incomplete)”, and with another ping I finally got “Destination Host Unreachable”.

This behavior is the reason my last server stays on openSuSE 12.3: it serves as DHCP, DNS and PDC for the intranet, and I don’t care for any problems with angry users …

And once more sorry for my czenglish, I hope that my lamentation is at least this time a little bit understandable :shame:

If you think it’s a bug, you can file a bug report.

I’m not sure I understand that.

Normally, a lease is released only on request of the client. And that can only happen when the client is online.

Otherwise, a lease expires when its time is up. Normally, a DHCP server will just expire the lease, though it might keep a record in its tables so that if it sees the same MAC address, it will give the same IP. As far as I know, the DHCP server is not checking the arp cache for this.

When I last tried this, I got no response for a while, and then “Destination Host Unreachable”, starting with “icmp_seq=10”.

The “unreachable” response is a linux-ism. I have never seen that on a solaris system or on a BSD system, though I admit it is a long time since I last used a BSD system. Those systems just give no response. They only give a “Host Unreachable” if they receive an ICMP packet from a router, stating that the host is unreachable. They do not initiate an unreachable response based on the arp cache.

I do not know why this change was made. But, thus far, I have not seen any problems caused by the change.

How many entries do you have? By default, the garbage collector does not start if there are fewer than 128 entries.

/proc/sys/net/ipv4/neigh/default/gc_thresh1 shows 128, and the number of entries in the ARP cache is below that, that’s for sure.

BUT I found the same value on several servers with 13.1 and on my last server with 12.3 too. The behavior changed after the upgrade to 13.1; the server with 12.3 works well (for me). If the garbage collector does not start when there are fewer than 128 entries by default (gc_thresh1), what kind of devil cleans the ARP cache on the server with openSuSE 12.3 ???
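The comparison described here (number of entries versus gc_thresh1) can be scripted. This is a rough sketch: the helper name below_gc_thresh1 is hypothetical, and counting the lines of “ip -4 neigh show” is only an approximation of what the kernel counts internally.

```sh
#!/bin/sh
# Succeeds when the entry count ($1) is below the threshold ($2).
# below_gc_thresh1 is a hypothetical helper name.
below_gc_thresh1() {
    [ "$1" -lt "$2" ]
}

thresh_file=/proc/sys/net/ipv4/neigh/default/gc_thresh1
if [ -r "$thresh_file" ] && command -v ip >/dev/null 2>&1; then
    # One line per listed IPv4 neighbour entry (approximate count).
    entries=$(ip -4 neigh show | awk 'END { print NR }')
    thresh=$(cat "$thresh_file")
    if below_gc_thresh1 "$entries" "$thresh"; then
        echo "$entries entries, below gc_thresh1=$thresh"
    else
        echo "$entries entries, at or above gc_thresh1=$thresh"
    fi
fi
```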

Tomorrow I will change gc_thresh1 to a lower value on one server; sysctl.conf will help me with that, I think…

For nrickert: All I need is a return to the old behavior. If somebody changed that, there has to be some configuration for it. It is enough to point me in the right direction. I cannot find any comprehensive “history of changes” for openSuSE 13.1. The text at https://doc.opensuse.org/release-notes/x86_64/openSUSE/13.1/RELEASE-NOTES.en.html is poor.
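A sketch of what that change might look like. The value 16 is an arbitrary placeholder, not a recommendation, and gc_thresh1_line is a hypothetical helper that only prints the sysctl.conf line. Whether lowering the threshold actually restores the 12.3 behavior is untested here.

```sh
#!/bin/sh
# Emit a sysctl.conf line for the neighbour GC lower threshold.
# gc_thresh1_line is a hypothetical helper name.
gc_thresh1_line() {
    printf 'net.ipv4.neigh.default.gc_thresh1 = %s\n' "$1"
}

# 16 is an arbitrary example value, not a recommendation.
gc_thresh1_line 16

# As root, the line could be appended to /etc/sysctl.conf and loaded:
#   gc_thresh1_line 16 >> /etc/sysctl.conf
#   sysctl -p
# or the value could be set for the running kernel only:
#   sysctl -w net.ipv4.neigh.default.gc_thresh1=16
```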

I found this: Configuring ARP age timeout

It looks as if the stale entry has indeed been marked as stale. It is just that the “arp -a” command does not show that information. The appropriate “ip” command does.
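To get a quick overview of the states that “arp -a” hides, one option is to tally the last field of each line of “ip neigh show”, which is where ip prints the neighbour state (STALE, REACHABLE, FAILED and so on). A minimal sketch, with neigh_state_summary as a made-up helper name:

```sh
#!/bin/sh
# Count neighbour entries per state, reading "ip neigh show" output on stdin.
# The state is taken from the last whitespace-separated field of each line.
# neigh_state_summary is a hypothetical helper name.
neigh_state_summary() {
    awk 'NF { count[$NF]++ } END { for (s in count) print s, count[s] }' | sort
}

if command -v ip >/dev/null 2>&1; then
    ip neigh show | neigh_state_summary
fi
```

If most of the “dead” machines show up as STALE or FAILED here, the entries have been aged even though they are still listed.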

So this is the real problem you want to solve. Start with providing output of

ip -s ne list

as well as

ping remote-system

with full output and packet trace when you do ping.

Thanks, I will try it during the weekend, when most of the machines will be switched off.