Hi,
I keep getting these dumps in /var/log/messages
Mostly they “just appear”, nothing special going on, suddenly the computer freezes for some 30 seconds or so and I’ve got used to the phenomena now so I just wait and after some time everything gets going again and the loggings have appeared. They start with these lines:
kernel: irq 21: nobody cared (try booting with the "irqpoll" option)
kernel: Pid: 2654, comm: hald-addon-stor Tainted: P 2.6.27.25-0.1-default #1
I’ve added “irqpoll” to boot options a long time ago but it doesn’t help.
Then there is a trace:
kernel: Call Trace:
kernel: <ffffffff8020da29>] show_trace_log_lvl+0x41/0x58
kernel: <ffffffff8049a616>] dump_stack+0x69/0x6f
kernel: <ffffffff8027c182>] __report_bad_irq+0x30/0x7d
kernel: <ffffffff8027c296>] note_interrupt+0xc7/0x109
kernel: <ffffffff8027c940>] handle_fasteoi_irq+0xab/0xcf
kernel: <ffffffff8020e7e1>] do_IRQ+0x6d/0xda
kernel: <ffffffff8020c531>] ret_from_intr+0x0/0xa
kernel: <ffffffffa029630b>] IRQMGR_ReadRegister+0x2b/0x30 [fglrx]
kernel: <ffffffffa02980d2>] RV770_IVRing_GetPointers+0x92/0xc0 [fglrx]
kernel: <ffffffffa02976ac>] R6XXIsr_CheckForActiveInterrupts+0x4c/0xa0 [fglrx]
kernel: <ffffffffa028a406>] IRQMGR_InterruptServiceRoutine+0x26/0x70 [fglrx]
kernel: <ffffffffa01d4cec>] IRQMGR_CallbackWrapperRet+0xc/0x20 [fglrx]
kernel: <ffffffffa01ba6ae>] KAS_Ih_Execute+0xbf/0x14e [fglrx]
kernel: <ffffffffa01ba3aa>] KCL_PUB_InterruptHandlerWrap+0x30/0x5a [fglrx]
kernel: <ffffffff8027b629>] handle_IRQ_event+0x25/0x53
kernel: <ffffffff8027c853>] handle_edge_irq+0xe4/0x126
kernel: <ffffffff8020e7e1>] do_IRQ+0x6d/0xda
kernel: <ffffffff8020c531>] ret_from_intr+0x0/0xa
kernel: <ffffffff8033b3ba>] get_request+0x213/0x394
kernel: <ffffffff8033b56a>] get_request_wait+0x2f/0x194
kernel: <ffffffffa000c44d>] scsi_execute+0x3b/0x11a [scsi_mod]
kernel: <ffffffffa000c5a8>] scsi_execute_req+0x7c/0xae [scsi_mod]
kernel: <ffffffffa0006a58>] ioctl_internal_command+0x5b/0x15f [scsi_mod]
kernel: <ffffffffa0006ba6>] scsi_set_medium_removal+0x4a/0x73 [scsi_mod]
kernel: <ffffffffa0419b4d>] cdrom_release+0x8b/0xe8 [cdrom]
kernel: <ffffffffa0430545>] sr_block_release+0x21/0x36 [sr_mod]
kernel: <ffffffff802d77a3>] __blkdev_put+0x8b/0x13e
kernel: <ffffffff802b24a0>] __fput+0xa1/0x165
kernel: <ffffffff802afba9>] filp_close+0x5b/0x62
kernel: <ffffffff802afc44>] sys_close+0x94/0xcd
kernel: <ffffffff8020bfbb>] system_call_fastpath+0x16/0x1b
kernel: <00007fe8a021be70>] 0x7fe8a021be70
After the trace this kind of “informative” loggings are periodically appearing:
kernel: ata7.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
kernel: ata7.00: cmd a0/00:00:00:08:00/00:00:00:00:00/a0 tag 0 pio 16392 in
kernel: cdb 4a 01 00 00 10 00 00 00 08 00 00 00 00 00 00 00
kernel: res 40/00:03:00:00:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
kernel: ata7.00: status: { DRDY }
kernel: ata7: soft resetting link
kernel: ata7.00: configured for PIO0
kernel: ata7: EH complete
I have removed the time stamps etc from the examples.
After the initial freeze the computer seems to be ok, just these disturbing loggings.
Any ideas?
If you use the command ‘cat /proc/interrupts’, you will see which
device is using IRQ21.
The disk message looks like some kind of hardware error. Note that
that disk is reconfigures for PIO (programmed I/O) rather than DMA.
That really hurts performance.
OK, so this is the output of ‘cat /proc/interrupts’
# cat /proc/interrupts
CPU0 CPU1 CPU2 CPU3
0: 54 0 0 24 IO-APIC-edge timer
1: 0 0 0 2 IO-APIC-edge i8042
4: 0 0 0 1 IO-APIC-edge
6: 0 0 0 5 IO-APIC-edge floppy
8: 0 0 0 1 IO-APIC-edge rtc0
9: 0 0 0 1 IO-APIC-fasteoi acpi
12: 0 0 0 4 IO-APIC-edge i8042
14: 25 0 1 323 IO-APIC-edge pata_atiixp
15: 0 0 0 0 IO-APIC-edge pata_atiixp
16: 0 600 0 359 IO-APIC-fasteoi ohci_hcd:usb1, HDA Intel
17: 0 0 0 0 IO-APIC-fasteoi ohci_hcd:usb2, ohci_hcd:usb4
18: 0 922 18670 77 IO-APIC-fasteoi ohci_hcd:usb3, ohci_hcd:usb5
19: 0 0 0 39 IO-APIC-fasteoi ehci_hcd:usb6, HDA Intel
21: 141981 0 0 301 IO-APIC-fasteoi pata_sil680
22: 9008 113865 11 7157 IO-APIC-fasteoi ahci, ohci1394
4347: 411930 0 4 922 PCI-MSI-edge fglrx[0]@PCI:1:0:0
4348: 598157 0 0 45 PCI-MSI-edge eth0
NMI: 0 0 0 0 Non-maskable interrupts
LOC: 1919277 1952461 1939041 1858256 Local timer interrupts
RES: 101463 116777 107593 106539 Rescheduling interrupts
CAL: 11528 3480 11052 11052 function call interrupts
TLB: 1488 1417 1621 1153 TLB shootdowns
TRM: 0 0 0 0 Thermal event interrupts
THR: 0 0 0 0 Threshold APIC interrupts
SPU: 0 0 0 0 Spurious interrupts
ERR: 0
At the time of this writing the freeze hasn’t occurred, switched on the computer not long ago, so the output might look different after the freeze?
The message in the end of my first post was just an example, there are “all sorts” of devices, or whatever they are, in the messages, here are some more examples:
kernel: handlers:
kernel: <ffffffffa0041e4f>] (ata_sff_interrupt+0x0/0x23b [libata])
kernel: Disabling IRQ #21
kernel: ata7.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
kernel: ata7.00: cmd a0/00:00:00:00:00/00:00:00:00:00/a0 tag 0
kernel: cdb 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
kernel: res 40/00:02:00:08:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
kernel: ata7.00: status: { DRDY }
kernel: ata7: soft resetting link
kernel: ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
kernel: ata8.00: cmd a0/00:00:00:00:00/00:00:00:00:00/a0 tag 0
kernel: cdb 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
kernel: res 40/00:03:00:00:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
kernel: ata8.00: status: { DRDY }
kernel: ata8: soft resetting link
kernel: ata7.00: configured for MWDMA2
kernel: ata7: EH complete
kernel: ata8.00: configured for UDMA/66
kernel: ata8: EH complete
kernel: ata7.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
kernel: ata7.00: cmd a0/00:00:00:00:00/00:00:00:00:00/a0 tag 0
kernel: cdb 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
kernel: res 40/00:02:00:08:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
kernel: ata7.00: status: { DRDY }
kernel: ata7: soft resetting link
kernel: ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
kernel: ata8.00: cmd a0/00:00:00:00:00/00:00:00:00:00/a0 tag 0
kernel: cdb 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
kernel: res 40/00:03:00:00:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
kernel: ata8.00: status: { DRDY }
kernel: ata8: soft resetting link
kernel: ata7.00: configured for MWDMA2
kernel: ata7: EH complete
kernel: ata8.00: configured for UDMA/66
kernel: ata8: EH complete
kernel: ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
kernel: ata8.00: cmd a0/00:00:00:08:00/00:00:00:00:00/a0 tag 0 pio 16392 in
kernel: cdb 4a 01 00 00 10 00 00 00 08 00 00 00 00 00 00 00
kernel: res 40/00:03:00:00:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
kernel: ata8.00: status: { DRDY }
kernel: ata8: soft resetting link
kernel: ata8.00: configured for UDMA/66
kernel: ata8: EH complete
kernel: ata8.00: limiting speed to UDMA/44:PIO4
kernel: ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
kernel: ata8.00: cmd a0/00:00:00:08:00/00:00:00:00:00/a0 tag 0 pio 16392 in
kernel: cdb 4a 01 00 00 10 00 00 00 08 00 00 00 00 00 00 00
kernel: res 40/00:03:00:00:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
kernel: ata8.00: status: { DRDY }
kernel: ata8: soft resetting link
kernel: ata8.00: configured for UDMA/44
kernel: ata8: EH complete
kernel: ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
kernel: ata8.00: cmd a0/00:00:00:08:00/00:00:00:00:00/a0 tag 0 pio 16392 in
kernel: cdb 4a 01 00 00 10 00 00 00 08 00 00 00 00 00 00 00
kernel: res 40/00:03:00:00:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
kernel: ata8.00: status: { DRDY }
kernel: ata8: soft resetting link
kernel: ata8.00: configured for UDMA/44
kernel: ata8: EH complete
kernel: ata7.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
kernel: ata7.00: cmd a0/00:00:00:00:00/00:00:00:00:00/a0 tag 0
kernel: cdb 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
kernel: res 40/00:02:00:08:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
kernel: ata7.00: status: { DRDY }
kernel: ata7: soft resetting link
kernel: ata7.00: configured for MWDMA2
kernel: ata7: EH complete
kernel: ata8.00: limiting speed to UDMA/33:PIO4
kernel: ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
kernel: ata8.00: cmd a0/00:00:00:08:00/00:00:00:00:00/a0 tag 0 pio 16392 in
kernel: cdb 4a 01 00 00 10 00 00 00 08 00 00 00 00 00 00 00
kernel: res 40/00:03:00:00:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
kernel: ata8.00: status: { DRDY }
kernel: ata8: soft resetting link
kernel: ata8.00: configured for UDMA/33
kernel: ata8: EH complete
kernel: sr1: CDROM (ioctl) error, command: Get event status notification 4a 01 00 00 10 00 00 00 08 00
kernel: sr: Sense Key : Aborted Command [current] [descriptor]
kernel: sr: Add. Sense: No additional sense information
kernel: ata8.00: limiting speed to PIO4
kernel: ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
kernel: ata8.00: cmd a0/00:00:00:00:00/00:00:00:00:00/a0 tag 0
kernel: cdb 1e 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
kernel: res 40/00:03:00:00:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
kernel: ata8.00: status: { DRDY }
kernel: ata8: soft resetting link
kernel: ata8.00: configured for PIO4
kernel: ata8: EH complete
kernel: ata7.00: qc timeout (cmd 0xa0)
kernel: ata7.00: limiting speed to MWDMA1:PIO4
kernel: ata7.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
kernel: ata7.00: cmd a0/00:00:00:00:00/00:00:00:00:00/a0 tag 0
kernel: cdb 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
kernel: res 51/20:03:00:00:00/00:00:00:00:00/a0 Emask 0x5 (timeout)
kernel: ata7.00: status: { DRDY ERR }
kernel: ata7: soft resetting link
kernel: ata7.00: configured for MWDMA1
kernel: ata7: EH complete
kernel: ata8.00: limiting speed to PIO3
kernel: ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
kernel: ata8.00: cmd a0/00:00:00:00:00/00:00:00:00:00/a0 tag 0
kernel: cdb 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
kernel: res 40/00:03:00:00:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
kernel: ata8.00: status: { DRDY }
kernel: ata8: soft resetting link
As you say it looks like some hardware error, something wrong with “ata” and disks or so. It does not seem to give me any other problem than the initial freeze and then the loggings, all disks seem to work ok.
man with hat wrote:
> OK, so this is the output of ‘cat /proc/interrupts’
>
> Code:
> --------------------
>
> # cat /proc/interrupts
> 21: 141981 0 0 301 IO-APIC-fasteoi pata_sil680
> As you say it looks like some hardware error, something wrong with
> “ata” and disks or so. It does not seem to give me any other problem
> than the initial freeze and then the loggings, all disks seem to work
> ok.
A reboot of the same kernel will result in identical IRQ assignments.
Your hardware is erroring, and reverting to PIO mode, which is slow.
Keep that drive backed up.
This phenomena has been with me for several kernel versions. Anyway I’ve managed to identify the disks, they are two old ATA disks connected to a PCI card because there wasn’t enough connectors on the motherboard. I guess it’s the PCI card that’s giving the errors. Thanks for your help.