Hard disk locking up

Hi,

I believe I have a problem with my hardware, possibly in my BIOS settings, but it could be something physically wrong.

During normal operation (launching an application, for example), my PC intermittently freezes for 10-30 seconds. Can someone please decipher the following messages from ‘dmesg’?

[  107.772470] ata1.00: BMDMA stat 0x4
[  111.435967] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[  111.435971] ata1.00: BMDMA stat 0x4
[  111.435975] ata1.00: failed command: READ DMA
[  111.435983] ata1.00: cmd c8/00:08:0f:75:c4/00:00:00:00:00/e6 tag 0 dma 4096 in
[  111.435985]          res 51/40:04:13:75:c4/00:00:00:00:00/e6 Emask 0x9 (media error)
[  111.435988] ata1.00: status: { DRDY ERR }
[  111.435991] ata1.00: error: { UNC }
[  111.477334] ata1.00: configured for UDMA/133
[  111.477345] ata1: EH complete
[  115.090267] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[  115.090272] ata1.00: BMDMA stat 0x4
[  115.090276] ata1.00: failed command: READ DMA
[  115.090284] ata1.00: cmd c8/00:08:0f:75:c4/00:00:00:00:00/e6 tag 0 dma 4096 in
[  115.090285]          res 51/40:04:13:75:c4/00:00:00:00:00/e6 Emask 0x9 (media error)
[  115.090289] ata1.00: status: { DRDY ERR }
[  115.090291] ata1.00: error: { UNC }
[  115.131529] ata1.00: configured for UDMA/133
[  115.131538] ata1: EH complete
[  118.723983] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[  118.723987] ata1.00: BMDMA stat 0x4
[  118.723991] ata1.00: failed command: READ DMA
[  118.724000] ata1.00: cmd c8/00:08:0f:75:c4/00:00:00:00:00/e6 tag 0 dma 4096 in
[  118.724001]          res 51/40:04:13:75:c4/00:00:00:00:00/e6 Emask 0x9 (media error)
[  118.724010] ata1.00: status: { DRDY ERR }
[  118.724013] ata1.00: error: { UNC }
[  118.765427] ata1.00: configured for UDMA/133
[  118.765438] sd 0:0:0:0: [sda] Unhandled sense code
[  118.765441] sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  118.765444] sd 0:0:0:0: [sda] Sense Key : Medium Error [current] [descriptor]
[  118.765449] Descriptor sense data with sense descriptors (in hex):
[  118.765452]         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
[  118.765460]         06 c4 75 13
[  118.765464] sd 0:0:0:0: [sda] Add. Sense: Unrecovered read error - auto reallocate failed
[  118.765470] sd 0:0:0:0: [sda] CDB: Read(10): 28 00 06 c4 75 0f 00 00 08 00
[  118.765478] end_request: I/O error, dev sda, sector 113538323
[  118.765491] ata1: EH complete

Thanks!

Jason
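
For anyone trying to decipher log lines like the ones above, here is a rough sketch of how the sector numbers can be recovered from the "cmd" and "res" lines. It assumes libata's usual taskfile layout (command or status, then feature or error, then nsect:lbal:lbam:lbah, then five "hob" bytes, then the device register) and a 28-bit LBA command such as READ DMA (c8). The function name taskfile_lba28 is made up for illustration.

```shell
#!/bin/sh
# Decode the 28-bit LBA from a libata "cmd" or "res" taskfile string.
# Assumed field order after splitting on "/" and ":":
#   $1 cmd/status  $2 feature/error  $3 nsect  $4 lbal  $5 lbam  $6 lbah
#   $7..$11 hob bytes (unused for LBA28)  $12 device (low nibble = LBA bits 27:24)
taskfile_lba28() (
    set -f          # no filename globbing while the argument is split
    IFS='/:'
    set -- $1       # split the taskfile string into positional parameters
    echo $(( ((0x${12} & 0xf) << 24) | (0x$6 << 16) | (0x$5 << 8) | 0x$4 ))
)

# "cmd" line from the log: first sector of the 8-sector read request
taskfile_lba28 'c8/00:08:0f:75:c4/00:00:00:00:00/e6'   # prints 113538319
# "res" line from the log: sector at which the drive gave up
taskfile_lba28 '51/40:04:13:75:c4/00:00:00:00:00/e6'   # prints 113538323
```

The second value agrees with the "end_request: I/O error, dev sda, sector 113538323" line, and the first agrees with the LBA bytes in the Read(10) CDB (06 c4 75 0f). Taken together with "error: { UNC }" and "Medium Error", this suggests the kernel keeps retrying a read of one unreadable sector, which would fit the pauses described.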

Hello Jason, and welcome to the openSUSE forums. Sorry to hear of your hard drive problems. I would have to say that nearly all errors of this kind recorded on hard drives are hardware related. Every so often a reformat can help, which means starting over, so you would want to save any personal data to a different hard drive first. Every so often the problem is overheating caused by a bad cooling fan or some other heat-related fault. And every so often a bad power supply (which can also overheat) is the problem.

Cleaning out all excess dust from any laptop or desktop is a very good idea. Do it with the system turned off and unplugged (and the battery removed, for laptops), using the duster spray you can find at most computer shops and online. You might be surprised at the amount of dust that builds up in just one year of operation.

But in the end, depending on the age of the hard drive, I think it’s time to look for a new one while your old one is not yet dead. All hard drives will fail given long enough. The vast majority do not fail during their normal lifetime, but some do. So it’s prudent to consider that it may be time to replace the old hard drive, though quickly pursuing some of my other suggestions is fine as well.

Thank You,

If that is an IDE hard disk, start by replacing the cable. Next, run a surface test on that disk from PartedMagic or a Linux live CD with an HDD diagnostic tool (maybe just boot a Fedora Live CD; it usually checks for bad sectors automatically).

+1. Sometimes it’s just a loose connector acting up when heated.

But if the HD is constantly making noises (clicks), it’s probably going bad. I’d also check it with SMART, possibly from the PartedMagic live CD, a great tool. Also see how old it is; I wouldn’t really trust any drive over 5 years old.

please try again wrote:

>
> If that is an IDE hard disk, start by replacing the cable. Next run a
> surface test on that disk from PartedMagic or a Linux Live CD with a
> HDD diagnostic tool (maybe just boot a Fedora Live CD. It usually
> automatically checks for bad sectors).

You can also download a tool from many of the hard drive vendors’ web
sites. Seagate has SeaTools. WD has one too, I believe.

Download the tool and create a bootable CD. You can run a quick
non-destructive test or, if necessary, very thorough diagnostics.

If you have smartmontools installed you can try “smartctl”. See the man
pages for more info.


Kevin Boyle, Knowledge Partner - Novell/SUSE/NetIQ
http://support.novell.com/community/volunteers/1kevinb.html
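
As a rough sketch of the smartctl suggestion: the helper below filters "smartctl -A" output down to the attributes most closely tied to bad sectors. It assumes smartctl's standard attribute table, where column 2 is the attribute name and column 10 the raw value. The function name bad_sector_attrs and the device name /dev/sda are only placeholders.

```shell
#!/bin/sh
# Print name and raw value of the SMART attributes related to bad sectors.
# Reads `smartctl -A` output on stdin.
bad_sector_attrs() {
    awk '$2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable)$/ { print $2, $10 }'
}

# Typical use, as root (adjust the device name to your disk):
#   smartctl -H /dev/sda                      # overall health verdict
#   smartctl -A /dev/sda | bad_sector_attrs   # counters related to bad sectors
#   smartctl -t long /dev/sda                 # start an extended self-test
#   smartctl -l selftest /dev/sda             # read the self-test log afterwards
```

Non-zero raw values for the pending or reallocated counters would be consistent with the media errors in the kernel log, though how much weight to give them varies by drive model.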

On 01/06/2012 04:06 AM, linux-2point0 wrote:
>
> sorry if there is no link my dad messed with it and i could not find
> the link again…

you are doing fine! (there is a great link in your first post on hard
drive failure…but do listen to dad, he is a smart guy–we depend on
him around here)

and, thanks for the tip on where to get the GNU/Linux Basic Guide


DD
openSUSE®, the “German Engineered Automobiles” of operating systems!

Thanks for the help everyone and I apologize for not responding sooner. Apparently I was not auto-subscribed to this thread!

My power supply was actually going bad and causing problems with my USB devices. That has been resolved, but it had no effect on the hard drive failures. The drive is a Seagate IDE. Since last week, I’ve installed openSUSE 12.1 x86_64 on a separate Seagate SATA hard drive. My drive info from hdparm is below (/dev/sdb is my IDE drive, running 11.3).

# sudo /sbin/hdparm -i /dev/sda

/dev/sda:

 Model=ST3200822AS, FwRev=3.01, SerialNo=4LJ0T3CS
 Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs RotSpdTol>.5% }
 RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=4
 BuffType=unknown, BuffSize=8192kB, MaxMultSect=16, MultSect=16
 CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=390721968
 IORDY=on/off, tPIO={min:240,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes:  pio0 pio1 pio2 pio3 pio4 
 DMA modes:  mdma0 mdma1 mdma2 
 UDMA modes: udma0 udma1 udma2 udma3 udma4 udma5 *udma6 
 AdvancedPM=no WriteCache=enabled
 Drive conforms to: ATA/ATAPI-6 T13 1410D revision 2:  ATA/ATAPI-1,2,3,4,5,6

 * signifies the current active mode

# sudo /sbin/hdparm -i /dev/sdb

/dev/sdb:

 Model=ST3200822A, FwRev=3.01, SerialNo=5LJ0EDLJ
 Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs RotSpdTol>.5% }
 RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=4
 BuffType=unknown, BuffSize=8192kB, MaxMultSect=16, MultSect=16
 CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=390721968
 IORDY=on/off, tPIO={min:240,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes:  pio0 pio1 pio2 pio3 pio4 
 DMA modes:  mdma0 mdma1 mdma2 
 UDMA modes: udma0 udma1 *udma2 udma3 udma4 udma5 
 AdvancedPM=no WriteCache=enabled
 Drive conforms to: ATA/ATAPI-6 T13 1410D revision 2:  ATA/ATAPI-1,2,3,4,5,6

 * signifies the current active mode

I did have some different errors reported with 12.1, but the hard drive still continues to pause. I’ll do some more debugging and report back here.
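
A small sketch for reading the active transfer mode out of "hdparm -i" output like the listings above, where the asterisk marks the mode in use. The function name active_udma_mode is hypothetical, and the sed pattern assumes the "UDMA modes:" line format shown in this thread.

```shell
#!/bin/sh
# Print the UDMA mode that `hdparm -i` marks with "*" as currently active.
# Reads hdparm -i output on stdin; prints nothing if no UDMA mode is active.
active_udma_mode() {
    sed -n 's/^ *UDMA modes:.*\*\(udma[0-9]\).*/\1/p'
}

# Typical use, as root (adjust the device name to your disk):
#   hdparm -i /dev/sdb | active_udma_mode
```

In the listings above this gives udma6 for /dev/sda and udma2 for /dev/sdb, although /dev/sdb advertises modes up to udma5. UDMA modes above udma2 need an 80-conductor cable, so a drive held at udma2 is one more reason to try the cable swap suggested earlier. It could also be a controller or BIOS limit, so treat it as a hint only.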

Hi
A couple of things: check the load cycle count. I disable it here on the
notebook as it’s way too busy;


smartctl -a /dev/sda | grep Load_Cycle_Count
hdparm -B /dev/sda

http://old-en.opensuse.org/Disk_Power_Management

I need to do something on the systemd side, as I currently have to set it
manually after each reboot…


Cheers Malcolm °¿° (Linux Counter #276890)
openSUSE 12.1 (x86_64) Kernel 3.1.0-1.2-desktop
up 15:10, 3 users, load average: 0.01, 0.03, 0.05
CPU Intel i5 CPU M520@2.40GHz | Intel Arrandale GPU
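
A minimal sketch of the load cycle check mentioned above, assuming smartctl's standard attribute table (raw value in column 10). The function name load_cycle_count is made up for illustration, and the device name is a placeholder.

```shell
#!/bin/sh
# Print the raw Load_Cycle_Count from `smartctl -A` output read on stdin.
load_cycle_count() {
    awk '$2 == "Load_Cycle_Count" { print $10 }'
}

# Typical use, as root (adjust the device name to your disk):
#   smartctl -A /dev/sda | load_cycle_count   # sample it twice, some minutes apart
#   hdparm -B /dev/sda                        # query the current APM level
#   hdparm -B 255 /dev/sda                    # disable APM, if the drive supports it
```

Both hdparm listings earlier in the thread report AdvancedPM=no, which suggests those two drives do not support APM at all. If so, the -B setting would not apply to them, and this check matters mostly for notebook drives.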