Hello!
I installed OpenSUSE 11 a few weeks ago and have been having problems with its ATA/SATA device handling. I have an AMD Athlon 64 X2 3800+ CPU on an Asus M2V motherboard (VIA VT8237A south bridge), with 1 ATA HDD, 1 ATA DVD-RW and 1 SATA HDD.
The minor problem is that any disk activity drives one CPU core to 100% utilization. With UDMA5 on the ATA HDD and with SATA, I’d expect almost no CPU utilization - which is the case under WinXP.
The major problem is that after some time I cannot access either my SATA HDD or my ATA DVD-RW at all. Sometimes this happens while reading or writing a large amount of data; other times it happens for no apparent reason (e.g. 5 minutes after boot, without any disk activity). When it happens, one core’s utilization stays at 100% continuously. I have to restart the PC to be able to access the disk again. The system log contains the error messages below (see end of post).
After googling for similar problems, I found and tried the following, to no avail:
- disable smartd
- disable and uninstall beagle
What seems to work is disabling APIC. If I start OpenSUSE with the noapic kernel boot parameter, the problem doesn’t happen and I can use my PC for hours. If I start the system without this parameter, the problem reliably happens within 5-30 minutes.
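To make sure a given test run really was booted with (or without) the parameter, the running kernel's command line can be checked. This is only a minimal sketch: the helper name `has_boot_param` is my own, and it assumes a Linux system where `/proc/cmdline` is readable.

```python
def has_boot_param(cmdline: str, param: str) -> bool:
    """Return True if `param` appears as a whole word on the kernel command line."""
    # Split on whitespace so that e.g. "apic=debug" does not match "noapic".
    return param in cmdline.split()


if __name__ == "__main__":
    try:
        # /proc/cmdline holds the parameters the running kernel was booted with.
        with open("/proc/cmdline") as f:
            current = f.read()
    except OSError:
        current = ""  # not on Linux, or /proc is not mounted
    print("noapic active:", has_boot_param(current, "noapic"))
```

Running it once after each reboot and noting the result next to the time of the first ata3 error makes it easier to keep the two cases apart.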
Needless to say, I have no such problems under WinXP; I can use my PC for 10-16 hours without any trouble.
Any ideas or suggestions?
My suspect is the libata module. This is the first time I have used OpenSUSE, but I have used Linux for years. My previous distro had an older kernel that didn’t use libata, and I didn’t have any problems on this very same hardware.
Jul 17 18:15:17 macisuse kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Jul 17 18:15:17 macisuse kernel: ata3.00: cmd 25/00:08:2d:2b:37/00:00:21:00:00/e0 tag 0 dma 4096 in
Jul 17 18:15:17 macisuse kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jul 17 18:15:17 macisuse kernel: ata3.00: status: { DRDY }
Jul 17 18:15:17 macisuse kernel: ata3: soft resetting link
Jul 17 18:15:41 macisuse su: (to root) maci on /dev/pts/3
Jul 17 18:15:48 macisuse kernel: ata3.00: qc timeout (cmd 0x27)
Jul 17 18:15:48 macisuse kernel: ata3.00: failed to read native max address (err_mask=0x4)
Jul 17 18:15:48 macisuse kernel: ata3.00: revalidation failed (errno=-5)
Jul 17 18:15:48 macisuse kernel: ata3: failed to recover some devices, retrying in 5 secs
... above rows repeated several times ...
Jul 17 18:16:59 macisuse kernel: sd 2:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Jul 17 18:16:59 macisuse kernel: end_request: I/O error, dev sdb, sector 557263661
Jul 17 18:16:59 macisuse kernel: EXT3-fs error (device sdb6): ext3_get_inode_loc: unable to read inode block - inode=26509313, block=53018626
Jul 17 18:16:59 macisuse kernel: sd 2:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Jul 17 18:16:59 macisuse kernel: end_request: I/O error, dev sdb, sector 133114653
Jul 17 18:16:59 macisuse kernel: Buffer I/O error on device sdb6, logical block 0
Jul 17 18:16:59 macisuse kernel: lost page write due to I/O error on sdb6
Jul 17 18:16:59 macisuse kernel: sd 2:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Jul 17 18:16:59 macisuse kernel: end_request: I/O error, dev sdb, sector 133127221
Jul 17 18:16:59 macisuse kernel: Buffer I/O error on device sdb6, logical block 1571
Jul 17 18:16:59 macisuse kernel: lost page write due to I/O error on sdb6
Jul 17 18:16:59 macisuse kernel: sd 2:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Jul 17 18:16:59 macisuse kernel: end_request: I/O error, dev sdb, sector 135211821
Jul 17 18:16:59 macisuse kernel: EXT3-fs error (device sdb6): ext3_get_inode_loc: unable to read inode block - inode=131073, block=262146
Jul 17 18:16:59 macisuse kernel: [<c01071d9>] dump_trace+0x63/0x227
Jul 17 18:16:59 macisuse kernel: [<c0107c8a>] show_trace+0x15/0x29
Jul 17 18:16:59 macisuse kernel: [<c02e85e5>] _etext+0x5b/0x65
Jul 17 18:16:59 macisuse kernel: [<c0125759>] warn_on_slowpath+0x41/0x67
Jul 17 18:16:59 macisuse kernel: [<c0196e13>] mark_buffer_dirty+0x23/0x72
Jul 17 18:16:59 macisuse kernel: [<f96311f9>] ext3_commit_super+0x40/0x53 [ext3]
Jul 17 18:16:59 macisuse kernel: [<f96326c1>] ext3_handle_error+0x71/0x95 [ext3]
Jul 17 18:16:59 macisuse kernel: [<f9632774>] ext3_error+0x39/0x43 [ext3]
Jul 17 18:16:59 macisuse kernel: [<f962a736>] __ext3_get_inode_loc+0x293/0x2ba [ext3]
Jul 17 18:16:59 macisuse kernel: [<f962a7b4>] ext3_iget+0x57/0x324 [ext3]
Jul 17 18:16:59 macisuse kernel: [<f962fd10>] ext3_lookup+0x67/0xa2 [ext3]
Jul 17 18:16:59 macisuse kernel: [<c0180284>] do_lookup+0xa1/0x140
Jul 17 18:16:59 macisuse kernel: [<c0182256>] __link_path_walk+0x899/0xcf9
Jul 17 18:16:59 macisuse kernel: [<c0182702>] path_walk+0x4c/0x9b
Jul 17 18:16:59 macisuse kernel: [<c0182a4f>] do_path_lookup+0x181/0x1ca
Jul 17 18:16:59 macisuse kernel: [<c01832b6>] __user_walk_fd+0x2f/0x43
Jul 17 18:16:59 macisuse kernel: [<c017cbb3>] vfs_lstat_fd+0x16/0x3d
Jul 17 18:16:59 macisuse kernel: [<c017cc45>] vfs_lstat+0x11/0x13
Jul 17 18:16:59 macisuse kernel: [<c017cc5b>] sys_lstat64+0x14/0x28
Jul 17 18:16:59 macisuse kernel: [<c01059e4>] sysenter_past_esp+0x6d/0xa9
Jul 17 18:16:59 macisuse kernel: [<ffffe430>] 0xffffe430
Jul 17 18:16:59 macisuse kernel: =======================
Jul 17 18:16:59 macisuse kernel: ---[ end trace 01a11084dbb38cf1 ]---
... below rows repeated several times ...
Jul 17 18:16:59 macisuse kernel: sd 2:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Jul 17 18:16:59 macisuse kernel: end_request: I/O error, dev sdb, sector 133114653
Jul 17 18:16:59 macisuse kernel: Buffer I/O error on device sdb6, logical block 0
Jul 17 18:16:59 macisuse kernel: lost page write due to I/O error on sdb6
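
For anyone reading the log: the "cmd" line of the first ata3.00 exception can be decoded by hand. Below is a minimal sketch, assuming the field order that (as far as I can tell) libata's error handler prints: command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device. The function name `decode_cmd` and the small opcode table are my own, and the table only covers the two opcodes that appear in this log.

```python
import re

# Assumed layout of the taskfile printed after "cmd":
# command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device
CMD_RE = re.compile(r"cmd ((?:[0-9a-f]{2}[/:]){11}[0-9a-f]{2})")

# Only the ATA opcodes that show up in this log.
ATA_OPCODES = {0x25: "READ DMA EXT", 0x27: "READ NATIVE MAX ADDRESS EXT"}


def decode_cmd(line: str) -> dict:
    """Decode opcode, 48-bit LBA and sector count from a libata 'cmd' log line."""
    m = CMD_RE.search(line)
    if m is None:
        raise ValueError("no libata cmd taskfile found in line")
    fields = [int(x, 16) for x in re.split(r"[/:]", m.group(1))]
    (command, _feature, nsect, lbal, lbam, lbah,
     _hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah, _device) = fields
    # LBA48: the "hob" (high order byte) registers carry bits 24..47.
    lba = ((hob_lbah << 40) | (hob_lbam << 32) | (hob_lbal << 24)
           | (lbah << 16) | (lbam << 8) | lbal)
    sectors = (hob_nsect << 8) | nsect
    return {
        "opcode": ATA_OPCODES.get(command, hex(command)),
        "lba": lba,
        "sectors": sectors,
    }


if __name__ == "__main__":
    line = "ata3.00: cmd 25/00:08:2d:2b:37/00:00:21:00:00/e0 tag 0 dma 4096 in"
    print(decode_cmd(line))
```

Under these assumptions the command that timed out is a READ DMA EXT of 8 sectors (8 x 512 = 4096 bytes, matching "dma 4096 in") at LBA 557263661, which is the same sector reported by the first "end_request: I/O error" line for sdb.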