Loops of Kernel Oops with NULL pointer dereference at do_softirq

Hi guys.

I've just joined this community in the hope that somebody can help us with a problem that is killing us.
The problem started after we upgraded the kernel and openSUSE (the system previously ran kernel 2.6.34.7 on openSUSE 11.3).
It is an Advantech system running openSUSE 13.1 and kernel 3.4.6 (the problem also happened with kernel 3.11.10), with a Digium PCI card (dahdi driver).

The loop of kernel Oopses happens intermittently and we cannot reproduce it on demand, but I suspect it is related to load, CPU or IRQs.
I am not expecting a miraculous answer, but any hint, guess or direction for the problem we are facing will be very much appreciated!

Thanks

[134031.334446] BUG: unable to handle kernel NULL pointer dereference at 0000000a
[134031.342277] IP: <c0204d74>] print_context_stack+0x54/0xa0
[134031.347974] *pdpt = 00000000336c7001 *pde = 0000000000000000
[134031.354115] Oops: 0000 #1] SMP
[134031.357709] Modules linked in: binfmt_misc ppp_generic slhc wctdm24xxp(O) wctdm(O) wcfxo(O) wctc4xxp(O) dahdi_transcode(O) wcb4xxp(O) wcte12xp(O) dahdi_voicebus(O) ip6table_mangle ip6table_filter ip6_tables iptable_nat xt_DSCP iptable_mangle ipt_REJECT xt_tcpudp xt_LOG xt_limit xt_multiport xt_addrtype xt_conntrack iptable_filter ip_tables ipt_MASQUERADE nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 x_tables nf_conntrack wct4xxp(O) dahdi(O) crc_ccitt loop acpi_cpufreq mperf coretemp video serio_raw shpchp pcspkr iTCO_wdt pci_hotplug i2c_i801 iTCO_vendor_support crc32c_intel aesni_intel cryptd aes_i586 microcode button 8021q garp stp xfrm_user ipcomp xfrm_ipcomp esp4 ah4 af_key sg autofs4 af_packet ata_generic ata_piix ehci_hcd usbcore e1000e usb_common fan thermal processor thermal_sys scsi_dh_rdac scsi_dh_hp_sw scsi_dh_alua scsi_dh_emc scsi_dh mptspi mptscsih mptbase scsi_transport_spi
[134031.437900]
[134031.439505] Pid: 3762, comm: kamailio Tainted: G O 3.4.6-2.10.1-pae #1 Advantech SYS-2USM03-6M01E/SYS-2USM03-6M01E
[134031.451013] EIP: 0060:<c0204d74>] EFLAGS: 00210293 CPU: 1
[134031.456641] EIP is at print_context_stack+0x54/0xa0
[134031.461632] EAX: ffffe000 EBX: 0000000a ECX: 00000000 EDX: 0000000a
[134031.468090] ESI: 00000000 EDI: 00000000 EBP: f410deb8 ESP: f410de80
[134031.474514] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[134031.480061] CR0: 80050033 CR2: 0000000a CR3: 1f4e3000 CR4: 000407f0
[134031.486465] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[134031.492880] DR6: ffff0ff0 DR7: 00000400
[134031.496844] Process kamailio (pid: 3762, ti=f410c000 task=f244b0d0 task.ti=f2d64000)
[134031.504712] Stack:
[134031.506822] ffffe000 00000000 00000000 00001ffc c069f1a8 00000000 c07f56fc f410deb8
[134031.514822] c020401b c069f1a8 c07f56fc 00000000 f410deb8 0000000a 00000000 00000000
[134031.522879] c07f56fc 00000000 00000000 f410df10 c020513f 00000000 c069f1a8 c07f56fc
[134031.530938] Call Trace:
[134031.533508] <c020401b>] dump_trace+0x9b/0xf0
[134031.538088] <c020513f>] show_trace_log_lvl+0x3f/0x50
[134031.543398] <c068b77d>] dump_stack+0x75/0x7a
[134031.547992] <c0230c0c>] warn_slowpath_common+0x6c/0xa0
[134031.553461] <c0230c73>] warn_slowpath_fmt+0x33/0x40
[134031.558656] <c05d70d2>] dev_watchdog+0x1d2/0x1e0
[134031.563580] <c023db91>] run_timer_softirq+0xe1/0x290
[134031.568850] <c023725e>] __do_softirq+0x8e/0x170
[134031.573705] <c0203eb9>] do_softirq+0x59/0xa0
[134031.578244] <f410e108>] 0xf410e107
[134031.581950] DWARF2 unwinder stuck at 0xf410e108
[134031.586599]
[134031.588209] Leftover inexact backtrace:
[134031.588210]
[134031.593746] <c0694ea1>] ? __schedule+0x341/0x730
[134031.598702] <c0694ea1>] ? __schedule+0x341/0x730
[134031.603633] <c0350065>] ? bioset_free+0xb5/0xd0
[134031.608501] <c0350063>] ? bioset_free+0xb3/0xd0
[134031.613348] <IRQ>
[134031.615394] BUG: unable to handle kernel NULL pointer dereference at 0000000a
[134031.622936] IP: <c0204d74>] print_context_stack+0x54/0xa0
[134031.628589] *pdpt = 00000000336c7001 *pde = 0000000000000000
[134031.634509] Oops: 0000 #2] SMP
[134031.637884] Modules linked in: binfmt_misc ppp_generic slhc wctdm24xxp(O) wctdm(O) wcfxo(O) wctc4xxp(O) dahdi_transcode(O) wcb4xxp(O) wcte12xp(O) dahdi_voicebus(O) ip6table_mangle ip6table_filter ip6_tables iptable_nat xt_DSCP iptable_mangle ipt_REJECT xt_tcpudp xt_LOG xt_limit xt_multiport xt_addrtype xt_conntrack iptable_filter ip_tables ipt_MASQUERADE nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 x_tables nf_conntrack wct4xxp(O) dahdi(O) crc_ccitt loop acpi_cpufreq mperf coretemp video serio_raw shpchp pcspkr iTCO_wdt pci_hotplug i2c_i801 iTCO_vendor_support crc32c_intel aesni_intel cryptd aes_i586 microcode button 8021q garp stp xfrm_user ipcomp xfrm_ipcomp esp4 ah4 af_key sg autofs4 af_packet ata_generic ata_piix ehci_hcd usbcore e1000e usb_common fan thermal processor thermal_sys scsi_dh_rdac scsi_dh_hp_sw scsi_dh_alua scsi_dh_emc scsi_dh mptspi mptscsih mptbase scsi_transport_spi
[134031.717927]
[134031.719513] Pid: 3762, comm: kamailio Tainted: G O 3.4.6-2.10.1-pae #1 Advantech SYS-2USM03-6M01E/SYS-2USM03-6M01E
[134031.730981] EIP: 0060:<c0204d74>] EFLAGS: 00210093 CPU: 1
[134031.736662] EIP is at print_context_stack+0x54/0xa0
[134031.741687] EAX: ffffe000 EBX: 0000000a ECX: 00000000 EDX: 0000000a
[134031.748086] ESI: 00000000 EDI: 00000000 EBP: f410dcac ESP: f410dc74
[134031.754498] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[134031.760055] CR0: 80050033 CR2: 0000000a CR3: 1f4e3000 CR4: 000407f0
[134031.766496] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[134031.772892] DR6: ffff0ff0 DR7: 00000400
[134031.776846] Process kamailio (pid: 3762, ti=f410c000 task=f244b0d0 task.ti=f2d64000)
[134031.784740] Stack:
[134031.786852] ffffe000 00000000 00000000 00001ffc c069f1a8 00000000 c07e5c64 f410dcac
[134031.794859] c020401b c069f1a8 c07e5c64 00000000 f410dcac 0000000a 00000000 f410de44
[134031.802932] c07e5c64 00000000 f410de44 f410de80 c020513f 00000000 c069f1a8 c07e5c64
[134031.810939] Call Trace:
[134031.813515] <c020401b>] dump_trace+0x9b/0xf0
[134031.818092] <c020513f>] show_trace_log_lvl+0x3f/0x50
[134031.823417] <c02040c0>] show_stack_log_lvl+0x50/0xd0
[134031.828677] <c0204212>] show_registers+0xd2/0x1e0
[134031.833671] <c069759f>] __die+0x8f/0xf0
[134031.837794] <c068c507>] no_context+0x17f/0x1ac
[134031.842544] <c068c66e>] __bad_area_nosemaphore+0x13a/0x142
[134031.848351] <c068c685>] bad_area_nosemaphore+0xf/0x11
[134031.853709] <c0699717>] do_page_fault+0x3f7/0x450
[134031.858721] <c0696d6a>] error_code+0x5a/0x60
[134031.863326] <c0204d74>] print_context_stack+0x54/0xa0
[134031.868699] <c020401b>] dump_trace+0x9b/0xf0
[134031.873253] <c020513f>] show_trace_log_lvl+0x3f/0x50
[134031.878498] <c068b77d>] dump_stack+0x75/0x7a
[134031.883077] <c0230c0c>] warn_slowpath_common+0x6c/0xa0
[134031.888537] <c0230c73>] warn_slowpath_fmt+0x33/0x40
[134031.893732] <c05d70d2>] dev_watchdog+0x1d2/0x1e0
[134031.898640] <c023db91>] run_timer_softirq+0xe1/0x290
[134031.903911] <c023725e>] __do_softirq+0x8e/0x170
[134031.908757] <c0203eb9>] do_softirq+0x59/0xa0
[134031.913354] <f410e108>] 0xf410e107
[134031.917097] DWARF2 unwinder stuck at 0xf410e108
[134031.921739]
[134031.923332] Leftover inexact backtrace:
[134031.923332]
[134031.928858] <c0694ea1>] ? __schedule+0x341/0x730
[134031.933805] <c0694ea1>] ? __schedule+0x341/0x730
[134031.938764] <c0350065>] ? bioset_free+0xb5/0xd0
[134031.943616] <c0350063>] ? bioset_free+0xb3/0xd0
[134031.948431] <IRQ>
/usr/sbin/kamailio[3886]: ERROR: <core> [tcp_main.c:4553]: connect 10.100.182.156:5061 failed (timeout)
[134031.950494] BUG: unable to handle kernel NULL pointer dereference at 0000000a
[134031.959424] IP: <c0204d74>] print_context_stack+0x54/0xa0
[134031.961181] [dahdi] wct4xxp: Need to increase latency. Estimated latency should be 3
[134031.961246] [dahdi] wct4xxp: Increased latency to 3
[134031.979596] *pdpt = 00000000336c7001 *pde = 0000000000000000
[134031.988266] Oops: 0000 #3] SMP
[134031.991148] [dahdi] wct4xxp: Need to increase latency. Estimated latency should be 4
[134031.991188] [dahdi] wct4xxp: Increased latency to 4
[134032.006182]
[134032.007765] Modules linked in: binfmt_misc ppp_generic slhc wctdm24xxp(O) wctdm(O) wcfxo(O) wctc4xxp(O) dahdi_transcode(O) wcb4xxp(O) wcte12xp(O) dahdi_voicebus(O) ip6table_mangle ip6table_filter ip6_tables iptable_nat xt_DSCP iptable_mangle ipt_REJECT xt_tcpudp xt_LOG xt_limit xt_multiport xt_addrtype xt_conntrack iptable_filter ip_tables ipt_MASQUERADE nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 x_tables nf_conntrack wct4xxp(O) dahdi(O) crc_ccitt loop acpi_cpufreq mperf coretemp video serio_raw shpchp pcspkr iTCO_wdt pci_hotplug i2c_i801 iTCO_vendor_support crc32c_intel aesni_intel cryptd aes_i586 microcode button 8021q garp stp xfrm_user ipcomp xfrm_ipcomp esp4 ah4 af_key sg autofs4 af_packet ata_generic ata_piix ehci_hcd usbcore e1000e usb_common fan thermal processor thermal_sys scsi_dh_rdac scsi_dh_hp_sw scsi_dh_alua scsi_dh_emc scsi_dh mptspi mptscsih mptbase scsi_transport_spi
[134032.089453]
[134032.091065] Pid: 3762, comm: kamailio Tainted: G O 3.4.6-2.10.1-pae #1 Advantech SYS-2USM03-6M01E/SYS-2USM03-6M01E
[134032.102551] EIP: 0060:<c0204d74>] EFLAGS: 00210093 CPU: 1
[134032.108161] EIP is at print_context_stack+0x54/0xa0
[134032.113186] EAX: ffffe000 EBX: 0000000a ECX: 00000000 EDX: 0000000a
[134032.119618] ESI: 00000000 EDI: 00000000 EBP: f410daa0 ESP: f410da68
[134032.119662] [dahdi] wct4xxp: Need to increase latency. Estimated latency should be 7
[134032.119701] [dahdi] wct4xxp: Increased latency to 7
[134032.139916] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[134032.145437] CR0: 80050033 CR2: 0000000a CR3: 1f4e3000 CR4: 000407f0
[134032.151825] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[134032.158194] DR6: ffff0ff0 DR7: 00000400
[134032.162150] Process kamailio (pid: 3762, ti=f410c000 task=f244b0d0 task.ti=f2d64000)
[134032.170026] Stack:
[134032.172137] ffffe000 00000000 00000000 00001ffc c069f1a8 00000000 c07e5c64 f410daa0
[134032.180135] c020401b c069f1a8 c07e5c64 00000000 f410daa0 0000000a 00000000 f410dc38
[134032.188193] c07e5c64 00000000 f410dc38 f410dc74 c020513f 00000000 c069f1a8 c07e5c64
[134032.196227] Call Trace:
[134032.198802] <c020401b>] dump_trace+0x9b/0xf0
[134032.203366] <c020513f>] show_trace_log_lvl+0x3f/0x50
[134032.208679] <c02040c0>] show_stack_log_lvl+0x50/0xd0
[134032.213940] <c0204212>] show_registers+0xd2/0x1e0
[134032.218977] <c069759f>] __die+0x8f/0xf0
[134032.223107] <c068c507>] no_context+0x17f/0x1ac
[134032.227874] <c068c66e>] __bad_area_nosemaphore+0x13a/0x142
[134032.233689] <c068c685>] bad_area_nosemaphore+0xf/0x11
[134032.239013] <c0699717>] do_page_fault+0x3f7/0x450
[134032.244058] <c0696d6a>] error_code+0x5a/0x60
[134032.248647] <c0204d74>] print_context_stack+0x54/0xa0
[134032.254012] <c020401b>] dump_trace+0x9b/0xf0
[134032.258589] <c020513f>] show_trace_log_lvl+0x3f/0x50
[134032.263879] <c02040c0>] show_stack_log_lvl+0x50/0xd0
[134032.269176] <c0204212>] show_registers+0xd2/0x1e0
[134032.274203] <c069759f>] __die+0x8f/0xf0
[134032.278367] <c068c507>] no_context+0x17f/0x1ac
[134032.283102] <c068c66e>] __bad_area_nosemaphore+0x13a/0x142
[134032.288936] <c068c685>] bad_area_nosemaphore+0xf/0x11
[134032.294319] <c0699717>] do_page_fault+0x3f7/0x450
[134032.299340] <c0696d6a>] error_code+0x5a/0x60
[134032.303936] <c0204d74>] print_context_stack+0x54/0xa0
[134032.309328] <c020401b>] dump_trace+0x9b/0xf0
[134032.313891] <c020513f>] show_trace_log_lvl+0x3f/0x50
[134032.313923] [dahdi] wct4xxp: Need to increase latency. Estimated latency should be 71
[134032.314011] [dahdi] wct4xxp: Increased latency to 71
[134032.332475] <c068b77d>] dump_stack+0x75/0x7a
[134032.337062] <c0230c0c>] warn_slowpath_common+0x6c/0xa0
[134032.342566] <c0230c73>] warn_slowpath_fmt+0x33/0x40
[134032.347760] <c05d70d2>] dev_watchdog+0x1d2/0x1e0
[134032.352702] <c023db91>] run_timer_softirq+0xe1/0x290
[134032.357972] <c023725e>] __do_softirq+0x8e/0x170
[134032.362820] <c0203eb9>] do_softirq+0x59/0xa0
[134032.367419] <f410e108>] 0xf410e107
[134032.371141] DWARF2 unwinder stuck at 0xf410e108
[134032.375801]
[134032.377383] Leftover inexact backtrace:
[134032.377384]
[134032.382972] <c0694ea1>] ? __schedule+0x341/0x730
[134032.387954] <c0694ea1>] ? __schedule+0x341/0x730
[134032.392868] <c0350065>] ? bioset_free+0xb5/0xd0
[134032.397684] <c0350063>] ? bioset_free+0xb3/0xd0
[134032.402514] <IRQ>
[134032.404551] BUG: unable to handle kernel NULL pointer dereference at 0000000a
[134032.412083] IP: <c0204d74>] print_context_stack+0x54/0xa0
[134032.417710] *pdpt = 00000000336c7001 *pde = 0000000000000000
[134032.423623] Oops: 0000 #4] SMP
and goes on and on …

Did you patch your openSUSE 13.1 system lately? My system uses kernel 3.12.53.
So it seems that you are a bit behind in bringing your system up to date.

Thanks Henk.

We went back to kernel 3.4.6 to work around another problem we found with kernel 3.11.10 (IRQs stop being processed on the second Digium PRI PCI board).
Thank you very much for your hint; we will try kernel 3.12.53 and let you know.

We have managed to install the latest kernel version, but the problem still happens.
I would really appreciate it if anyone has another hint for us.
Thanks in advance

[239000.666672] BUG: unable to handle kernel NULL pointer dereference at 0000000a
[239000.674200] IP: <c0204c94>] print_context_stack+0x54/0xa0
[239000.679836] *pdpt = 000000002fac4001 *pde = 0000000000000000
[239000.685792] Oops: 0000 #1] SMP
[239000.689208] Modules linked in: nfnetlink_log nfnetlink binfmt_misc ppp_generic slhc wctdm24xxp(O) wcaxx(O) wctdm(O) wcfxo(O) wctc4xxp(O) dahdi_transcode(O) wcb4xxp(O) wcte13xp(O) wcte12xp(O) dahdi_voicebus(O) ip6table_mangle ip6table_filter ip6_tables iptable_nat nf_nat_ipv4 xt_DSCP iptable_mangle ipt_REJECT xt_tcpudp xt_LOG xt_limit nf_conntrack_ipv4 nf_defrag_ipv4 xt_multiport xt_addrtype xt_conntrack iptable_filter ip_tables ipt_MASQUERADE nf_nat x_tables nf_conntrack wct4xxp(O) wcte43x(O) oct612x(O) dahdi(O) crc_ccitt loop x86_pkg_temp_thermal intel_powerclamp intel_rapl coretemp crc32_pclmul iTCO_wdt iTCO_vendor_support aesni_intel ablk_helper cryptd lrw aes_i586 xts gf128mul lpc_ich pcspkr mfd_core serio_raw i2c_i801 mei_me mei shpchp battery tpm_infineon video tpm_tis tpm tpm_bios button 8021q mrp garp stp llc xfrm_user ipcomp xfrm_ipcomp esp4 ah4 af_key xfrm_algo sg autofs4 af_packet ata_generic ata_piix crc32c_intel ehci_pci ehci_hcd e1000e ptp pps_core usbcore usb_common fan thermal processor thermal_sys scsi_dh_rdac scsi_dh_alua scsi_dh_emc scsi_dh_hp_sw scsi_dh mptspi mptscsih mptbase scsi_transport_spi
[239000.791579] CPU: 1 PID: 3917 Comm: ssm Tainted: G O 3.12.53-40.1-pae #1
[239000.799394] Hardware name: Advantech SYS-2USM03-6M01E/SYS-2USM03-6M01E, BIOS 4.6.4 11/22/2011
[239000.808153] task: efbeef50 ti: f48f6000 task.ti: ea5c6000
[239000.813737] EIP: 0060:<c0204c94>] EFLAGS: 00210293 CPU: 1
[239000.819365] EIP is at print_context_stack+0x54/0xa0
[239000.824448] EAX: ffffe000 EBX: 0000000a ECX: 00000000 EDX: 0000000a
[239000.830950] ESI: 00000000 EDI: 00000000 EBP: f48f7e6c ESP: f48f7e34
[239000.837433] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[239000.842971] CR0: 80050033 CR2: 0000000a CR3: 2a5b0000 CR4: 000407f0
[239000.849386] Stack:
[239000.851507] ffffe000 00000000 00000000 00001ffc c071b1e0 00000000 c08a17f6 f48f7e6c
[239000.859634] c02040db c071b1e0 c08a17f6 00000000 f48f7e6c 0000000a 00000000 ffffffff
[239000.867759] c08a17f6 00000000 00000000 f48f7ed0 c020504f 00000000 c071b1e0 c08a17f6
[239000.875829] Call Trace:
[239000.878485] <c02040db>] dump_trace+0x9b/0xf0
[239000.883092] <c020504f>] show_trace_log_lvl+0x3f/0x50
[239000.888382] <c0204180>] show_stack_log_lvl+0x50/0xd0
[239000.893730] <c020509f>] show_stack+0x1f/0x40
[239000.898380] <c070b5df>] dump_stack+0x3e/0x4e
[239000.902941] <c02461a8>] warn_slowpath_common+0x88/0xc0
[239000.908402] <c0246213>] warn_slowpath_fmt+0x33/0x40
[239000.913621] <c065857e>] dev_watchdog+0x1de/0x1f0
[239000.918584] <c0250c24>] call_timer_fn+0x24/0xe0
[239000.923454] <c0251958>] run_timer_softirq+0x178/0x210
[239000.928810] <c024adb6>] __do_softirq+0xb6/0x1d0
[239000.933719] <c0203f79>] do_softirq+0x59/0xa0
[239000.938364] <f48f8140>] 0xf48f813f
[239000.942124] DWARF2 unwinder stuck at 0xf48f8140
[239000.946932]
[239000.948543] Leftover inexact backtrace:
[239000.948543]
[239000.954132] <c03074d0>] ? mempool_free_slab+0x10/0x10
[239000.959563] <c03074e0>] ? mempool_alloc_pages+0x10/0x10
[239000.965132] <c02ab4f0>] ? try_to_force_load+0x50/0x50
[239000.970571] <f7b8a000>] ? 0xf7b89fff
[239000.974481] <c02ad509>] ? module_alloc_update_bounds+0x9/0x50
[239000.980603] <c054a170>] ? tty_prepare_flip_string+0x40/0x40
[239000.986547] <c054a170>] ? tty_prepare_flip_string+0x40/0x40
[239000.992457] <c054a170>] ? tty_prepare_flip_string+0x40/0x40
[239000.998385] <c054a170>] ? tty_prepare_flip_string+0x40/0x40
[239001.004278] <c054a170>] ? tty_prepare_flip_string+0x40/0x40
[239001.010208] <c054a170>] ? tty_prepare_flip_string+0x40/0x40
[239001.016127] <c054a170>] ? tty_prepare_flip_string+0x40/0x40
[239001.022135] <c054a170>] ? tty_prepare_flip_string+0x40/0x40
[239001.028035] <c054a170>] ? tty_prepare_flip_string+0x40/0x40
[239001.033977] <c054a170>] ? tty_prepare_flip_string+0x40/0x40
[239001.039897] <c054a170>] ? tty_prepare_flip_string+0x40/0x40
[239001.045954] <c054a170>] ? tty_prepare_flip_string+0x40/0x40
[239001.051857] <c054a170>] ? tty_prepare_flip_string+0x40/0x40
[239001.057832] <c054a170>] ? tty_prepare_flip_string+0x40/0x40
[239001.063784] <c054a170>] ? tty_prepare_flip_string+0x40/0x40
[239001.069695] <c054a170>] ? tty_prepare_flip_string+0x40/0x40
[239001.075640] <IRQ>
[239001.077680] BUG: unable to handle kernel NULL pointer dereference at 0000000a
[239001.085270] IP: <c0204c94>] print_context_stack+0x54/0xa0
[239001.090957] *pdpt = 000000002fac4001 *pde = 0000000000000000
[239001.096970] Oops: 0000 #2] SMP
[239001.100368] Modules linked in: nfnetlink_log nfnetlink binfmt_misc ppp_generic slhc wctdm24xxp(O) wcaxx(O) wctdm(O) wcfxo(O) wctc4xxp(O) dahdi_transcode(O) wcb4xxp(O) wcte13xp(O) wcte12xp(O) dahdi_voicebus(O) ip6table_mangle ip6table_filter ip6_tables iptable_nat nf_nat_ipv4 xt_DSCP iptable_mangle ipt_REJECT xt_tcpudp xt_LOG xt_limit nf_conntrack_ipv4 nf_defrag_ipv4 xt_multiport xt_addrtype xt_conntrack iptable_filter ip_tables ipt_MASQUERADE nf_nat x_tables nf_conntrack wct4xxp(O) wcte43x(O) oct612x(O) dahdi(O) crc_ccitt loop x86_pkg_temp_thermal intel_powerclamp intel_rapl coretemp crc32_pclmul iTCO_wdt iTCO_vendor_support aesni_intel ablk_helper cryptd lrw aes_i586 xts gf128mul lpc_ich pcspkr mfd_core serio_raw i2c_i801 mei_me mei shpchp battery tpm_infineon video tpm_tis tpm tpm_bios button 8021q mrp garp stp llc xfrm_user ipcomp xfrm_ipcomp esp4 ah4 af_key xfrm_algo sg autofs4 af_packet ata_generic ata_piix crc32c_intel ehci_pci ehci_hcd e1000e ptp pps_core usbcore usb_common fan thermal processor thermal_sys scsi_dh_rdac scsi_dh_alua scsi_dh_emc scsi_dh_hp_sw

The Oops repeats over and over until the final panic:

[239013.910284] BUG: unable to handle kernel NULL pointer dereference at 0000000a
[239013.917875] IP: <c0204c94>] print_context_stack+0x54/0xa0
[239013.923569] *pdpt = 000000002fac4001 *pde = 0000000000000000
[239013.929526] Oops: 0000 #17] SMP
[239013.932979] Modules linked in: nfnetlink_log nfnetlink binfmt_misc ppp_generic slhc wctdm24xxp(O) wcaxx(O) wctdm(O) wcfxo(O) wctc4xxp(O) dahdi_transcode(O) wcb4xxp(O) wcte13xp(O) wcte12xp(O) dahdi_voicebus(O) ip6table_mangle ip6table_filter ip6_tables iptable_nat nf_nat_ipv4 xt_DSCP iptable_mangle ipt_REJECT xt_tcpudp xt_LOG xt_limit nf_conntrack_ipv4 nf_defrag_ipv4 xt_multiport xt_addrtype xt_conntrack iptable_filter ip_tables ipt_MASQUERADE nf_nat x_tables nf_conntrack wct4xxp(O) wcte43x(O) oct612x(O) dahdi(O) crc_ccitt loop x86_pkg_temp_thermal intel_powerclamp intel_rapl coretemp crc32_pclmul iTCO_wdt iTCO_vendor_support aesni_intel ablk_helper cryptd lrw aes_i586 xts gf128mul lpc_ich pcspkr mfd_core serio_raw i2c_i801 mei_me mei shpchp battery tpm_infineon video tpm_tis tpm tpm_bios button 8021q mrp garp stp llc xfrm_user ipcomp xfrm_ipcomp esp4 ah4 af_key xfrm_algo sg autofs4 af_packet ata_generic ata_piix crc32c_intel ehci_pci ehci_hcd e1000e ptp pps_core usbcore usb_common fan thermal processor thermal_sys scsi_dh_rdac scsi_dh_alua scsi_dh_emc scsi_dh_hp_sw scsi_dh mptspi mptscsih mptbase scsi_transport_spi
[239014.035382] CPU: 1 PID: 3917 Comm: ssm Tainted: G O 3.12.53-40.1-pae #1
[239014.043172] Hardware name: Advantech SYS-2USM03-6M01E/SYS-2USM03-6M01E, BIOS 4.6.4 11/22/2011
[239014.051922] task: efbeef50 ti: f48f4000 task.ti: ea5c6000
[239014.057459] EIP: 0060:<c0204c94>] EFLAGS: 00210093 CPU: 1
[239014.063124] EIP is at print_context_stack+0x54/0xa0
[239014.068175] EAX: ffffe000 EBX: 0000000a ECX: 00000000 EDX: 0000000a
[239014.074692] ESI: 00000000 EDI: 00000000 EBP: f48f5e6c ESP: f48f5e34
[239014.081173] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[239014.086785] CR0: 80050033 CR2: 0000000a CR3: 2a5b0000 CR4: 000407f0
[239014.093222] Stack:
[239014.095396] ffffe000 00000000 00000000 00001ffc c071b1e0 00000000 c0886102 f48f5e6c
[239014.103524] c02040db c071b1e0 c0886102 00000000 f48f5e6c 0000000a 00000000 ffffffff
[239014.111608] c0886102 00000000 f48f5ff8 f48f6034 c020504f 00000000 c071b1e0 c0886102
[239014.119717] Call Trace:
[239014.122278] <c02040db>] dump_trace+0x9b/0xf0
[239014.126891] <c020504f>] show_trace_log_lvl+0x3f/0x50
[239014.132259] <c0204180>] show_stack_log_lvl+0x50/0xd0
[239014.137520] <c0204291>] show_regs+0x91/0x1a0
[239014.142176] <c0712524>] __die+0x94/0x100
[239014.146476] <c0706f90>] no_context+0x1c7/0x242
[239014.151265] <c070712e>] __bad_area_nosemaphore+0x123/0x12b
[239014.157166] <c0707145>] bad_area_nosemaphore+0xf/0x11
[239014.162586] <c0714640>] __do_page_fault+0xe0/0x4d0
[239014.167801] <c0711cb3>] error_code+0x67/0x6c
[239014.172458] DWARF2 unwinder stuck at error_code+0x67/0x6c
[239014.178037]
[239014.179648] Leftover inexact backtrace:
[239014.179648]
[239014.185216] Code: 8d b4 26 00 00 00 00 85 f6 74 14 39 f3 73 05 3b 1c 24 73 17 83 c4 10 89 f8 5b 5e 5f 5d c3 90 3b 5c 24 08 76 ef 3b 5c 24 0c 73 e9 <8b> 2b 89 e8 e8 c3 bb 05 00 85 c0 74 16 8d 47 04 39 c3 74 18 89
[239014.205458] EIP: <c0204c94>] print_context_stack+0x54/0xa0 SS:ESP 0068:f48f5e34
[239014.213110] CR2: 000000000000000a
[239014.216921] ---[ end trace b05d221280869788 ]---
[239014.221705] Kernel panic - not syncing: Fatal exception in interrupt
[239014.228263] Rebooting in 5 seconds…
[239019.260555] ACPI MEMORY or I/O RESET_REG.

Hi Guys, just an update.

Debugging a little further, we found that the Oops (NULL pointer dereference) happens inside the warning raised in dev_watchdog: the call trace goes from dev_watchdog through warn_slowpath_fmt and dump_stack, and the fault occurs in print_context_stack while the stack is being printed.
We do not know what this warning was supposed to show us, because we did not have the right debug level.
We are changing it now to see what it is.

But is there any reason for a warning to produce a kernel Oops? Is this correct?

    if (some_queue_timedout) {
        WARN_ONCE(1, KERN_INFO "NETDEV WATCHDOG: %s (%s): transmit queue %u timed out\n",
                  dev->name, netdev_drivername(dev), i);
        dev->netdev_ops->ndo_tx_timeout(dev);
    }
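
On how a warning can end in an Oops: the warning prints a stack trace, and the traces above show the fault inside print_context_stack, reached through dump_stack, with CR2 and EBX both holding 0000000a. That suggests the stack walker read through a bogus frame address. What follows is only a user-space sketch in C of the kind of guard a stack walker can apply before reading a frame. The names `struct frame`, `in_stack` and `walk_frames` are hypothetical and chosen for illustration; this is not the kernel's dump_trace or print_context_stack code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical frame record: saved frame pointer followed by the return
 * address, the layout used when code is built with frame pointers. */
struct frame {
    struct frame *next;   /* caller's frame */
    uintptr_t     ret;    /* return address */
};

/* Return 1 only if [p, p + len) lies inside the stack area and p is
 * pointer-aligned. An address such as 0xa fails this check. */
static int in_stack(const void *stack, size_t size, const void *p, size_t len)
{
    uintptr_t lo = (uintptr_t)stack;
    uintptr_t hi = lo + size;
    uintptr_t a  = (uintptr_t)p;

    if (a % sizeof(void *) != 0)
        return 0;
    return a >= lo && a <= hi && hi - a >= len;
}

/* Walk at most max frames, validating each frame address before it is
 * dereferenced. Returns the number of frames visited. */
static int walk_frames(const void *stack, size_t size,
                       const struct frame *fp, int max)
{
    int n = 0;

    while (n < max && in_stack(stack, size, fp, sizeof(*fp))) {
        const struct frame *next = fp->next;

        n++;
        /* Frames must move towards higher addresses; otherwise stop. */
        if ((uintptr_t)next <= (uintptr_t)fp)
            break;
        fp = next;
    }
    return n;
}
```

A walker with this kind of check stops at the bad frame instead of faulting on it. Whether the unwinder in these kernels is missing such a check is an assumption we have not confirmed.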

Another update:

In order to test the output, we tried the following:
modprobe lkdtm
echo EXCEPTION > /sys/kernel/debug/provoke-crash/INT_HARDWARE_ENTRY

We expected a single Oops; instead, the result was a loop of Oopses similar to the one we are hunting.
We think this is not correct, i.e. the kernel should never enter a loop of Oopses, or should it?

root[4720]: @shellLog: [3179] root 42 2016-04-19 11:38:53 echo E 221.704665] BUG: unable to handle kernel NULL pointer dereference at (null)
221.712600] IP: <f7d0b304>] lkdtm_do_action+0x164/0x450 [lkdtm]
221.718591] *pdpt = 0000000036e65001 *pde = 0000000000000000
221.724576] Oops: 0002 #1] SMP
221.727635] Modules linked in: lkdtm ppp_generic slhc ip6table_mangle ip6table_filter ip6_tables iptable_nat nf_nat_ipv4 iptable_mangle ipt_REJECT xt_tcpudp xt_LOG xt_limit nf_conntrack_ipv4 nf_defrag_ipv4 xt_multiport xt_addrtype xt_conntrack iptable_filter ip_tables ipt_MASQUERADE nf_nat x_tables nf_conntrack loop mptctl coretemp crc32_pclmul aesni_intel vmw_balloon ablk_helper cryptd lrw aes_i586 xts gf128mul pcspkr shpchp serio_raw i2c_piix4 battery ac button 8021q mrp garp stp llc xfrm_user ipcomp xfrm_ipcomp esp4 ah4 af_key xfrm_algo sg autofs4 af_packet sr_mod cdrom ata_generic ata_piix crc32c_intel uhci_hcd ehci_hcd processor thermal_sys usbcore usb_common e1000 scsi_dh_hp_sw scsi_dh_alua scsi_dh_rdac scsi_dh_emc scsi_dh mptspi mptscsih mptbase scsi_transport_spi
221.796453] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.12.53-40.1-pae #1
221.802583] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2012
221.813529] task: c0ac2a00 ti: f4c08000 task.ti: c0ab6000
221.818610] EIP: 0060:<f7d0b304>] EFLAGS: 00210083 CPU: 0
221.824489] EIP is at lkdtm_do_action+0x164/0x450 [lkdtm]
221.829540] EAX: 00000004 EBX: c0ab7f54 ECX: 00000000 EDX: 00200082
221.835589] ESI: 00200046 EDI: 00200000 EBP: 00000082 ESP: f4c09f24
221.842508] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
221.847552] CR0: 8005003b CR2: 00000000 CR3: 36e64000 CR4: 000007f0
221.853623] Stack:
221.855635] 00000000 f7d0c141 f7d0c1c2 00000000 00200082 c0707c25 00000000 ffffffff
221.863582] 00000000 00000000 f7d0c820 00200082 f7d0b73b f7d0c820 f7d0c141 f7d0c1c2
221.871499] 00000000 c0ab7f54 2634aae6 c0ab7f54 00200000 f7d0b7d5 c0718573 c0ab7f54
221.878681] Call Trace:
221.881501] <f7d0b7d5>] jp_do_irq+0x5/0x10 [lkdtm]
221.886498] <c0718573>] common_interrupt+0x33/0x38
221.891486] <c024ad77>] __do_softirq+0x77/0x1d0
221.895617] <c0203f79>] do_softirq+0x59/0xa0
221.900545] <f4c0a080>] 0xf4c0a07f
221.903617] DWARF2 unwinder stuck at 0xf4c0a080
221.908555]
221.909634] Leftover inexact backtrace:
221.909634]
221.915513] <IRQ>
221.917523] BUG: unable to handle kernel NULL pointer dereference at 0000000a
221.924502] IP: <c0204c94>] print_context_stack+0x54/0xa0
221.929669] *pdpt = 0000000036e65001 *pde = 0000000000000000
221.935564] Oops: 0000 #2] SMP
221.938668] Modules linked in: lkdtm ppp_generic slhc ip6table_mangle ip6table_filter ip6_tables iptable_nat nf_nat_ipv4 iptable_mangle ipt_REJECT xt_tcpudp xt_LOG xt_limit nf_conntrack_ipv4 nf_defrag_ipv4 xt_multiport xt_addrtype xt_conntrack iptable_filter ip_tables ipt_MASQUERADE nf_nat x_tables nf_conntrack loop mptctl coretemp crc32_pclmul aesni_intel vmw_balloon ablk_helper cryptd lrw aes_i586 xts gf128mul pcspkr shpchp serio_raw i2c_piix4 battery ac button 8021q mrp garp stp llc xfrm_user ipcomp xfrm_ipcomp esp4 ah4 af_key xfrm_algo sg autofs4 af_packet sr_mod cdrom ata_generic ata_piix crc32c_intel uhci_hcd ehci_hcd processor thermal_sys usbcore usb_common e1000 scsi_dh_hp_sw scsi_dh_alua scsi_dh_rdac scsi_dh_emc scsi_dh mptspi mptscsih mptbase scsi_transport_spi
222.007418] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.12.53-40.1-pae #1
222.013561] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2012
222.024502] task: c0ac2a00 ti: f4c08000 task.ti: c0ab6000
etc…

For the moment, we have patched the dump_trace function to avoid the NULL pointer dereference, and we found out that the original output was a warning.
We will keep debugging to understand why the stack trace ends up with a NULL value after a series of stack printouts.

[54009.630418] ------------[ cut here ]------------
[54009.635252] WARNING: at /usr/src/packages/BUILD/kernel-pae-3.4.6/linux-3.4/net/sched/sch_generic.c:256 dev_watchdog+0x1d2/0x1e0()
[54009.647230] Hardware name: SYS-2USM03-6M01E
[54009.651592] NETDEV WATCHDOG: eth0 (e1000e): transmit queue 0 timed out
[54009.658337] Modules linked in: ppp_generic slhc wctdm24xxp(O) wctdm(O) wcfxo(O) wctc4xxp(O) dahdi_transcode(O) wcb4xxp(O) ip6table_mangle ip6table_filter ip6_tables iptable_nat xt_DSCP iptable_mangle ipt_REJECT xt_tcpudp xt_LOG xt_limit xt_multiport xt_addrtype xt_conntrack iptable_filter ip_tables ipt_MASQUERADE nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 x_tables wcte12xp(O) dahdi_voicebus(O) nf_conntrack wct4xxp(O) dahdi(O) crc_ccitt loop acpi_cpufreq mperf video iTCO_wdt i2c_i801 iTCO_vendor_support coretemp crc32c_intel aesni_intel cryptd aes_i586 microcode pcspkr serio_raw shpchp pci_hotplug button 8021q garp stp xfrm_user ipcomp xfrm_ipcomp esp4 ah4 af_key sg autofs4 af_packet ata_generic ehci_hcd ata_piix e1000e(O) usbcore ptp usb_common pps_core thermal fan processor thermal_sys scsi_dh_emc scsi_dh_rdac scsi_dh_hp_sw scsi_dh_alua scsi_dh mptspi mptscsih mptbase scsi_transport_spi
[54009.741010] OSB-ALF: dump_stack(); was removed
[54009.745559] ---[ end trace bf927e9462ba2c2f ]---
[54009.750495] e1000e 0000:00:19.0: Net device Info
[54009.755138] e1000e: Device Name state trans_start last_rx
[54009.762367] e1000e: eth0 0000000000000003 0000000000CD1BB4 0000000000000000
[54009.770372] e1000e 0000:00:19.0: Register Dump
[54009.774890] e1000e: Register Name Value
[54009.779038] e1000e: CTRL 58100240
[54009.783337] e1000e: STATUS 00080043
[54009.787624] e1000e: CTRL_EXT 195a1000
[54009.791927] e1000e: ICR 00000000
[54009.796236] e1000e: RCTL 04008002
[54009.800572] e1000e: RDLEN 00001000
[54009.804889] e1000e: RDH 00000095
[54009.809193] e1000e: RDT 00000090
[54009.813520] e1000e: RDTR 00000000
[54009.817839] e1000e: RXDCTL[0-1] 00010000 00010000
[54009.822973] e1000e: ERT 00000000
[54009.827282] e1000e: RDBAL 1f28a000
[54009.831645] e1000e: RDBAH 00000000
[54009.835949] e1000e: RDFH 00000c66
[54009.840257] e1000e: RDFT 00000c66
[54009.844566] e1000e: RDFHS 00000c66
[54009.848896] e1000e: RDFTS 00000c66
[54009.853230] e1000e: RDFPC 00000000
[54009.857559] e1000e: TCTL 3103f0fa
[54009.861860] e1000e: TDBAL 1f1a7000
[54009.866197] e1000e: TDBAH 00000000
[54009.870515] e1000e: TDLEN 00001000
[54009.874842] e1000e: TDH 000000f7
[54009.879197] e1000e: TDT 000000f7
[54009.883517] e1000e: TIDV 00000008
[54009.887845] e1000e: TXDCTL[0-1] 0141001f 0141001f
[54009.892976] e1000e: TADV 00000020
[54009.897295] e1000e: TARC[0-1] 0d800403 45000403
[54009.902409] e1000e: TDFH 00000fa4
[54009.906720] e1000e: TDFT 00000fa4
[54009.911038] e1000e: TDFHS 00000fa4
[54009.915351] e1000e: TDFTS 00000fa4
[54009.919660] e1000e: TDFPC 00000000
[54009.924006] e1000e 0000:00:19.0: Tx Ring Summary
[54009.928668] e1000e: Queue [NTU][NTC][bi(ntc)->dma ] leng ntw timestamp
[54009.935508] e1000e: 0 F7 F7 0000000000000000 0086 F8 0000000000000000
[54009.943012] e1000e 0000:00:19.0: Tx Ring Dump
[54009.947398] e1000e: Tl[desc][address 63:0 ][SpeCssSCmCsLen][bi->dma ] leng ntw timestamp bi->skb <-- Legacy format
[54009.959734] e1000e: Tc[desc][Ce CoCsIpceCoS][MssHlRSCm0Plen][bi->dma ] leng ntw timestamp bi->skb <-- Ext Context format
[54009.972578] e1000e: Td[desc][address 63:0 ][VlaPoRSCm1Dlen][bi->dma ] leng ntw timestamp bi->skb <-- Ext Data format
[54009.985114] e1000e: Td[0x000] 0000000033AE0E02 00000000AB100086 0000000000000000 0086 0 0000000000000000 (null)
[54009.995950] e1000e: Tc[0x001] 0000282200000000 0000000020000000 0000000000000000 0086 2 0000000000000000 (null)
[54010.006795] e1000e: Td[0x002] 0000000033AE0E02 00000000AB100086 0000000000000000 0086 2 0000000000000000 (null)
[54010.017682] e1000e: Tc[0x003] 0000282200000000 0000000020000000 0000000000000000 0086 4 0000000000000000 (null)