Hello!
I’ve updated a machine today from an ancient CentOS distribution to the latest openSUSE, 13.2.
Everything worked fine, except that I got regular network failures (every minute or so, depending on network traffic). The network would go down and then come back up again after about two seconds.
Upon checking journalctl, I found the following:
Dec 03 22:48:16 gw1 kernel: e1000e 0000:00:19.0 enp0s25: Detected Hardware Unit Hang:
TDH <23>
TDT <35>
next_to_use <35>
next_to_clean <21>
buffer_info[next_to_clean]:
time_stamp <10047c8ff>
next_to_watch <23>
jiffies <10047cbb5>
next_to_watch.status <0>
MAC Status <802a3>
PHY Status <792d>
PHY 1000BASE-T Status <3800>
PHY Extended Status <3000>
PCI Status <10>
Dec 03 22:48:18 gw1 kernel: e1000e 0000:00:19.0 enp0s25: Detected Hardware Unit Hang:
TDH <23>
TDT <35>
next_to_use <35>
next_to_clean <21>
buffer_info[next_to_clean]:
time_stamp <10047c8ff>
next_to_watch <23>
jiffies <10047cda9>
next_to_watch.status <0>
MAC Status <802a3>
PHY Status <792d>
PHY 1000BASE-T Status <3800>
PHY Extended Status <3000>
PCI Status <10>
Dec 03 22:48:20 gw1 kernel: e1000e 0000:00:19.0 enp0s25: Detected Hardware Unit Hang:
TDH <23>
TDT <35>
next_to_use <35>
next_to_clean <21>
buffer_info[next_to_clean]:
time_stamp <10047c8ff>
next_to_watch <23>
jiffies <10047cf9d>
next_to_watch.status <0>
MAC Status <802a3>
PHY Status <792d>
PHY 1000BASE-T Status <3800>
PHY Extended Status <3000>
PCI Status <10>
Dec 03 22:48:22 gw1 kernel: e1000e 0000:00:19.0 enp0s25: Detected Hardware Unit Hang:
TDH <23>
TDT <35>
next_to_use <35>
next_to_clean <21>
buffer_info[next_to_clean]:
time_stamp <10047c8ff>
next_to_watch <23>
jiffies <10047d191>
next_to_watch.status <0>
MAC Status <802a3>
PHY Status <792d>
PHY 1000BASE-T Status <3800>
PHY Extended Status <3000>
PCI Status <10>
Dec 03 22:48:22 gw1 kernel: e1000e 0000:00:19.0 enp0s25: Reset adapter unexpectedly
Dec 03 22:48:25 gw1 kernel: e1000e: enp0s25 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
This keeps repeating over and over, at intervals of about two minutes.
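To get a feel for how long the transmit queue was stuck, the hex values in the dump can be decoded with plain bash arithmetic. This is only a rough sketch: it assumes the two reports really are 2 seconds apart (the journal timestamps only have one-second resolution) and that time_stamp is the jiffies value at which the stuck buffer was queued.

```shell
#!/usr/bin/env bash
# Values copied from the first two hang reports in the log above.
time_stamp=0x10047c8ff   # time_stamp of buffer_info[next_to_clean]
jiffies_1=0x10047cbb5    # jiffies in the 22:48:16 report
jiffies_2=0x10047cda9    # jiffies in the 22:48:18 report

# Assumption: the reports are 2 s apart, which gives the kernel tick rate.
hz=$(( (jiffies_2 - jiffies_1) / 2 ))

# How long the buffer had been pending at the first report, in milliseconds.
stuck_ms=$(( (jiffies_1 - time_stamp) * 1000 / hz ))

echo "HZ=$hz stuck_ms=$stuck_ms"
```

Under those assumptions the tick rate works out to 250 and the buffer had been waiting roughly 2.8 seconds when the first hang was reported, which matches the outages I was seeing.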
The system is based on an Intel DP65LT motherboard that has an Intel 82566DC integrated Gbit network controller (according to the manual). It has the latest BIOS update.
Searching for the error message, I found this solution, and it helped.
I ran
ethtool -K enp0s25 tso off
and the journalctl messages disappeared and the network connection ran fine, as before.
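To confirm that the change actually took effect, the current offload state can be read back with `ethtool -k` (lowercase k). The small helper below is just a sketch; `tso_state` is a name I made up, and it simply picks the tcp-segmentation-offload line out of whatever `ethtool -k` prints.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `ethtool -k` output on stdin and print the
# state ("on" or "off") of the tcp-segmentation-offload feature.
tso_state() {
    awk '$1 == "tcp-segmentation-offload:" { print $2; exit }'
}

# Usage on the affected machine (requires ethtool and the real interface):
#   ethtool -k enp0s25 | tso_state
```

After the `ethtool -K` command above, this should print off.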
Why is this happening?
I never had problems with the older Linux distro that was on this machine, but I never checked with ethtool whether the older driver/kernel used the tso feature. (I didn’t bother to back it up before erasing it, since openSUSE ran flawlessly on all of my other machines.)
Is tso broken in this chip or in the driver?
I understand that there may be a performance penalty if I disable the chip’s tso feature, so is there anything I can do to fix this properly?
Regards,
Andy.