High ping after IRQ is disabled

Hello,

I’m having some issues with one of the servers.

Some details:
openSUSE 11.4, 64-bit

ethtool - eth1
neisha:~ # ethtool eth1
Settings for eth1:
Supported ports: [ TP MII ]
Supported link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Half 1000baseT/Full
Supports auto-negotiation: Yes
Advertised link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Half 1000baseT/Full
Advertised pause frame use: No
Advertised auto-negotiation: Yes
Link partner advertised link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Half 1000baseT/Full
Link partner advertised pause frame use: No
Link partner advertised auto-negotiation: Yes
Speed: 1000Mb/s
Duplex: Full
Port: MII
PHYAD: 0
Transceiver: internal
Auto-negotiation: on
Supports Wake-on: pumbg
Wake-on: g
Current message level: 0x00000033 (51)
drv probe ifdown ifup
Link detected: yes

IRQ
cat /proc/interrupts
CPU0 CPU1 CPU2 CPU3
0: 97 0 0 0 IO-APIC-edge timer
1: 2 0 0 0 IO-APIC-edge i8042
4: 19696 0 0 0 IO-APIC-edge serial
5: 0 0 0 0 IO-APIC-edge parport0
8: 6 0 0 0 IO-APIC-edge rtc0
9: 0 0 0 0 IO-APIC-fasteoi acpi
12: 4 0 0 0 IO-APIC-edge i8042
16: 2110738 0 0 0 IO-APIC-fasteoi nouveau, eth1
17: 165 0 0 0 IO-APIC-fasteoi hda_intel
20: 67538 304505 0 0 IO-APIC-fasteoi ata_piix, ata_piix
23: 72 0 0 0 IO-APIC-fasteoi ehci_hcd:usb1, ehci_hcd:usb2
40: 0 0 0 0 PCI-MSI-edge PCIe PME
41: 0 0 0 0 PCI-MSI-edge PCIe PME
42: 0 0 0 0 PCI-MSI-edge PCIe PME
43: 0 0 0 0 PCI-MSI-edge PCIe PME
44: 0 0 0 0 PCI-MSI-edge PCIe PME
45: 0 0 0 0 PCI-MSI-edge PCIe PME
46: 122558 0 0 0 PCI-MSI-edge eth0
NMI: 755 1032 504 500 Non-maskable interrupts
LOC: 857635 585207 312385 202730 Local timer interrupts
SPU: 0 0 0 0 Spurious interrupts
PMI: 755 1032 504 500 Performance monitoring interrupts
IWI: 0 0 0 0 IRQ work interrupts
RES: 16529 10961 8489 6055 Rescheduling interrupts
CAL: 2411 2214 6910 10274 Function call interrupts
TLB: 3493 9936 5535 2120 TLB shootdowns
TRM: 0 0 0 0 Thermal event interrupts
THR: 0 0 0 0 Threshold APIC interrupts
MCE: 0 0 0 0 Machine check exceptions
MCP: 58 58 58 58 Machine check polls
ERR: 3
MIS: 0
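Note that IRQ 16 in this listing is shared by nouveau and eth1. As a minimal sketch (not part of the original post), the following Python snippet picks out such shared lines from /proc/interrupts text. It assumes the column layout shown above, where the interrupt type is a single token such as IO-APIC-fasteoi; other kernels may format that column differently. The function name `shared_irqs` and the `SAMPLE` text are made up for illustration.

```python
def shared_irqs(interrupts_text):
    """Return {irq: [device, ...]} for numbered IRQs with more than one distinct device."""
    lines = interrupts_text.strip().splitlines()
    ncpu = len(lines[0].split())              # header row: CPU0 CPU1 ...
    shared = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        label = label.strip()
        if not label.isdigit():               # skip NMI, LOC, ERR, ...
            continue
        fields = rest.split()
        # Assumed layout: per-CPU counts, one type token, then device names.
        names = " ".join(fields[ncpu + 1:])
        devices = sorted({d.strip() for d in names.split(",") if d.strip()})
        if len(devices) > 1:
            shared[int(label)] = devices
    return shared


# Excerpt of the output posted above.
SAMPLE = """\
CPU0 CPU1 CPU2 CPU3
16: 2110738 0 0 0 IO-APIC-fasteoi nouveau, eth1
20: 67538 304505 0 0 IO-APIC-fasteoi ata_piix, ata_piix
23: 72 0 0 0 IO-APIC-fasteoi ehci_hcd:usb1, ehci_hcd:usb2
46: 122558 0 0 0 PCI-MSI-edge eth0
NMI: 755 1032 504 500 Non-maskable interrupts
"""

if __name__ == "__main__":
    for irq, devices in shared_irqs(SAMPLE).items():
        print(irq, devices)
```

On a live system the same function could be fed the contents of /proc/interrupts instead of the sample text.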

Error in messages log
Dec 25 18:56:39 neisha kernel: [15589.904611] irq 16: nobody cared (try booting with the "irqpoll" option)
Dec 25 18:56:39 neisha kernel: [15589.904617] Pid: 0, comm: swapper Not tainted 2.6.37.6-0.9-default #1
Dec 25 18:56:39 neisha kernel: [15589.904619] Call Trace:
Dec 25 18:56:39 neisha kernel: [15589.904637] [<ffffffff81005819>] dump_trace+0x69/0x2e0
Dec 25 18:56:39 neisha kernel: [15589.904641] [<ffffffff814bb153>] dump_stack+0x69/0x6f
Dec 25 18:56:39 neisha kernel: [15589.904645] [<ffffffff810c612e>] __report_bad_irq+0x1e/0x90
Dec 25 18:56:39 neisha kernel: [15589.904647] [<ffffffff810c6349>] note_interrupt+0x1a9/0x200
Dec 25 18:56:39 neisha kernel: [15589.904651] [<ffffffff810c7275>] handle_fasteoi_irq+0xd5/0x110
Dec 25 18:56:39 neisha kernel: [15589.904654] [<ffffffff81005725>] handle_irq+0x15/0x20
Dec 25 18:56:39 neisha kernel: [15589.904656] [<ffffffff8100538e>] do_IRQ+0x5e/0xe0
Dec 25 18:56:39 neisha kernel: [15589.904659] [<ffffffff814be153>] ret_from_intr+0x0/0xa
Dec 25 18:56:39 neisha kernel: [15589.904663] [<ffffffff812a5ace>] intel_idle+0xbe/0x110
Dec 25 18:56:39 neisha kernel: [15589.904667] [<ffffffff8139dfb8>] cpuidle_idle_call+0xb8/0x2b0
Dec 25 18:56:39 neisha kernel: [15589.904671] [<ffffffff8100125c>] cpu_idle+0x4c/0x90
Dec 25 18:56:39 neisha kernel: [15589.904675] [<ffffffff81b32be9>] start_kernel+0x38b/0x396
Dec 25 18:56:39 neisha kernel: [15589.904678] [<ffffffff81b32414>] x86_64_start_kernel+0xf9/0xff
Dec 25 18:56:39 neisha kernel: [15589.904680] handlers:
Dec 25 18:56:39 neisha kernel: [15589.904680] [<ffffffffa0116460>] (nouveau_irq_handler+0x0/0x1b0 [nouveau])
Dec 25 18:56:39 neisha kernel: [15589.904696] [<ffffffffa0232ff0>] (rtl8169_interrupt+0x0/0x220 [r8169])
Dec 25 18:56:39 neisha kernel: [15589.904700] Disabling IRQ #16
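The "handlers:" part of this message names every driver registered on the disabled line. A rough Python sketch for pulling that information out of a messages log follows; it is only an illustration, the function name `disabled_irq_report` is hypothetical, and the regular expressions assume the exact message wording seen in this log.

```python
import re

# Matches handler lines such as "(rtl8169_interrupt+0x0/0x220 [r8169])".
HANDLER_RE = re.compile(r"\((\w+)\+0x[0-9a-f]+/0x[0-9a-f]+ \[(\w+)\]\)")
DISABLED_RE = re.compile(r"Disabling IRQ #(\d+)")


def disabled_irq_report(log_text):
    """Return (irq_number_or_None, [(handler_function, module), ...])."""
    irq = None
    handlers = []
    for line in log_text.splitlines():
        m = DISABLED_RE.search(line)
        if m:
            irq = int(m.group(1))
        m = HANDLER_RE.search(line)
        if m:
            handlers.append((m.group(1), m.group(2)))
    return irq, handlers


# Lines taken from the log above.
SAMPLE_LOG = """\
Dec 25 18:56:39 neisha kernel: [15589.904611] irq 16: nobody cared (try booting with the "irqpoll" option)
Dec 25 18:56:39 neisha kernel: [15589.904645] [<ffffffff810c612e>] __report_bad_irq+0x1e/0x90
Dec 25 18:56:39 neisha kernel: [15589.904680] handlers:
Dec 25 18:56:39 neisha kernel: [15589.904680] [<ffffffffa0116460>] (nouveau_irq_handler+0x0/0x1b0 [nouveau])
Dec 25 18:56:39 neisha kernel: [15589.904696] [<ffffffffa0232ff0>] (rtl8169_interrupt+0x0/0x220 [r8169])
Dec 25 18:56:39 neisha kernel: [15589.904700] Disabling IRQ #16
"""

if __name__ == "__main__":
    irq, handlers = disabled_irq_report(SAMPLE_LOG)
    print("disabled IRQ:", irq)
    for func, module in handlers:
        print("  handler", func, "from module", module)
```

For this log it reports IRQ 16 with handlers from the nouveau and r8169 modules, which matches the shared line in /proc/interrupts.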

Symptoms
Everything works fine until the IRQ is disabled.
Then the ping to the PCI network card (IRQ 16) goes from 0.05 ms to 90 ms.
After the network is restarted, everything is OK again.

The problem occurs only with the PCI card and does not affect the network card on the motherboard.

Recent changes
We have changed the motherboard and CPU. After these problems appeared we also changed the NIC, but the problem persists.

Any ideas? I’m out of them.

This looks like good material for a bug report, to my mind. Have you checked which driver your NIC uses? You can do that with

/sbin/lspci -nnk

Maybe this driver is known to be problematic with this kernel.
Best regards,
Greg

Hello,

These are both NICs:

04:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller [10ec:8168] (rev 06)
Subsystem: ASUSTeK Computer Inc. P8P67 Deluxe Motherboard [Realtek RTL8111E] [1043:8432]
Kernel driver in use: r8169
06:00.0 PCI bridge [0604]: Device [1b21:1080] (rev 01)
07:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL-8169 Gigabit Ethernet [10ec:8169] (rev 10)
Subsystem: Realtek Semiconductor Co., Ltd. RTL8169/8110 Family PCI Gigabit Ethernet NIC [10ec:8169]
Kernel driver in use: r8169

They both use the same driver.
07 is the problematic one, I guess.

Best,
Domen

It looks like this particular driver can be really problematic depending on the kernel version, as indicated in the first post here:
Gentoo Forums :: View topic - [SOLVED] lot of issues with RTL-8110SC/8169SC

If I were you, I would file a bug report and try a different kernel version, or compile the Realtek driver from source.
I’m using

grzes@opensuse:~> uname -a
Linux opensuse.pl 3.1.5-1-desktop #1 SMP PREEMPT Sat Dec 10 19:20:50 UTC 2011 (d70fd6b) i686 i686 i386 GNU/Linux

from Index of /repositories/Kernel:/stable/standard

because my WiFi breaks when torrenting with the default 11.4 kernel.

Best regards,
Greg

Thanks.

I’ll do the distro upgrade to 12.1 then, which I was meaning to do anyway.

Best,
Domen

Without an IRQ, the ‘add-on’ device needs to ‘politely wait’ for processor time, where the onboard NIC may not have to.

?
Landis.
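To put a rough number on the "politely wait" idea, here is a small Python sketch of how much delay periodic polling adds compared with an interrupt. It is a toy model, not a measurement. The 100 ms polling interval is an assumption (my understanding is that the kernel checks a disabled line's handlers on a timer of roughly that period, but this has not been verified for 2.6.37), and the function name `extra_latency_ms` is made up.

```python
def extra_latency_ms(poll_interval_ms, arrivals_ms):
    """Return (mean, max) wait between each packet arrival and the next poll tick.

    Poll ticks are assumed to happen at 0, poll_interval_ms, 2*poll_interval_ms, ...
    A packet that arrives exactly on a tick is served with no extra wait.
    """
    waits = [(poll_interval_ms - t % poll_interval_ms) % poll_interval_ms
             for t in arrivals_ms]
    return sum(waits) / len(waits), max(waits)


if __name__ == "__main__":
    # One packet per millisecond for 10 seconds, polled every 100 ms (assumed).
    arrivals = range(1, 10001)
    mean_wait, max_wait = extra_latency_ms(100, arrivals)
    print("mean extra wait: %.1f ms, worst case: %d ms" % (mean_wait, max_wait))
```

Under these assumptions a packet waits about 50 ms on average and just under 100 ms in the worst case, which would be in the same range as the 90 ms pings reported at the start of the thread.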