Strange SATA and network interaction

Hi,

I’m getting a strange fault and would appreciate any ideas on how to fix it. If there is no obvious fix, should I file it as a bug report in Bugzilla?

I have bought a new computer with an ASUS P5M-MX motherboard, which uses the nForce 610i (MCP73V) chipset, and an Intel E4600 CPU.
When I installed openSUSE 10.3 (64-bit), I found the computer occasionally froze, for example when starting a program or shutting down the system.

When I checked the system log I found a sequence of messages like this was repeating all the time:

Jul 12 09:14:28 suse110 kernel: ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Jul 12 09:14:28 suse110 kernel: ata5.00: cmd a0/00:00:00:00:00/00:00:00:00:00/a0 tag 0
Jul 12 09:14:28 suse110 kernel: cdb 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Jul 12 09:14:28 suse110 kernel: res 40/00:03:00:00:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
Jul 12 09:14:28 suse110 kernel: ata5.00: status: { DRDY }
Jul 12 09:14:28 suse110 kernel: ata5: soft resetting link
Jul 12 09:14:28 suse110 kernel: ata5: nv_mode_filter: 0x701f&0x701f->0x701f, BIOS=0x7000 (0xc0000000) ACPI=0x701f (60:600:0x13)
Jul 12 09:14:28 suse110 kernel: ata5.00: configured for UDMA/33
Jul 12 09:14:28 suse110 kernel: ata5: EH complete
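
A rough way to see how often this sequence repeats is to count the exception lines in the log. This is only a sketch: it assumes the messages end up in /var/log/messages (an assumption, adjust the path if your syslog writes elsewhere), and count_ata_exceptions is just a helper name made up for the example.

```shell
# Count "ataN.NN: exception" lines in a syslog-style file.
# The default path is an assumption; pass another file as the first argument.
count_ata_exceptions() {
    logfile="${1:-/var/log/messages}"
    # grep -c prints the number of matching lines (0 if there are none)
    grep -c -E 'kernel: ata[0-9]+\.[0-9]+: exception' "$logfile"
}
```

Running count_ata_exceptions before and after a test copy should show whether the count is still climbing.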

I therefore upgraded to openSUSE 11.0 (64-bit). This appeared to fix the problem, but I have since found that the same messages can still be triggered on this version. The difference is that on 11.0 they only start after something triggers them, whereas on 10.3 they happened all the time. Once triggered, they continue until I shut down the system.

The only thing that seems to trigger these log messages is mounting a network share over Ethernet:
mount.cifs //192.168.0.4/share /mnt/buffalo -o username=guest
and then copying a directory between the network drive and my SATA hard drive. As soon as I start the copy the messages begin, and they continue even after the copy has completed, until I shut down the system.

I have not found anything else that triggers the log messages. For instance, copying a directory between the SATA disk and a USB drive does not cause any problems, nor does copying directories within the SATA disk.

I don’t think it is a hardware problem with the hard drive because I used a different physical drive when I switched from 10.3 to 11.0.

Are there any parameters I can try changing in the BIOS or the Linux configuration, such as packet size or something like that?

Thanks,

Martin

PS - I can’t get the built-in nForce chipset Ethernet port to work, so I am using a separate PCI card for Ethernet.

As I didn’t get a reply, I have filed this as bug number 409484 in Bugzilla.

Martin

Just a few tips:

  • disable smartd
  • disable beagle
  • if you still get the error, try the “noapic” kernel boot parameter (type noapic at the GRUB boot prompt before you press Enter to boot openSUSE)
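
To confirm that a parameter typed at the boot prompt actually reached the kernel, the command line can be read back from /proc/cmdline after booting. A minimal sketch, where has_boot_param is a hypothetical helper name and the optional second argument exists only so the check can be tried against an ordinary file:

```shell
# Succeed if the given parameter appears as a whole word on the kernel
# command line. /proc/cmdline is the standard Linux location.
has_boot_param() {
    param="$1"
    cmdline_file="${2:-/proc/cmdline}"
    # Put each word on its own line, then look for an exact fixed-string match
    tr ' ' '\n' < "$cmdline_file" | grep -q -x -F -- "$param"
}
```

For example, `has_boot_param noapic && echo "noapic is active"` prints the message only when the parameter was really passed.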

I have just discovered that this bug is specific to the PCI Ethernet card that I am using, and is therefore not as severe as I first thought. The card is just a generic 10/100M PCI card that uses an RTL8139C chip.

I tried stopping smartd as suggested by simico (by using system monitor to kill the process). This did not affect the bug; I was still able to trigger it.

I don’t have Beagle installed so that is not an issue.

I then tried putting noapic into GRUB at boot as suggested. This appeared to cure the problem, in that the messages did not appear in the log. However, the system froze during shutdown, so I suspect this just masked the log messages and did not cure the underlying problem.
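
Since noapic changes how interrupts are routed, one thing that might be worth checking (purely a guess, nothing in the logs above confirms it) is whether the PCI network card shares an interrupt line with the disk controller. The sketch below lists the lines of /proc/interrupts that name more than one device; shared_irqs is a made-up helper name, and it relies on the kernel separating devices on a shared line with a comma and a space.

```shell
# Print the /proc/interrupts lines where several devices share one IRQ.
# An optional file argument allows trying it on a saved copy instead.
shared_irqs() {
    irq_file="${1:-/proc/interrupts}"
    # Numbered IRQ lines whose device column contains ", " list two or more devices
    grep -E '^ *[0-9]+:.*, ' "$irq_file"
}
```

If the network card and the ATA controller turn up on the same line, that would at least be a detail worth adding to the bug report.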

Then, while using the computer, the bug was triggered when I was streaming an audio file. So the bug can be triggered by either CIFS copies or audio streaming, but not by normal internet browsing, downloads, email, the SUSE updater or other web activity.

I therefore decided to investigate the network card, and after a lot of hassle I finally managed to get the NVIDIA Ethernet port on the motherboard to work (I still don’t know how; after I had done a lot of things in YaST, it just started working). When I used this port the problem did not happen.

Thanks,

Martin