Thread: Hard disk failure? SMART and smartmontools

  1. #1
    Join Date
    Jun 2008
    Location
    Morecambe Bay
    Posts
    103

    Hard disk failure? SMART and smartmontools

    I have a brand new Hitachi drive (250 GB and only 3 days old). I have installed a dual-boot system with XP and Suse 10.3. I am also a noob, so please bear with me...

    I keep getting SMART error messages when I'm in Suse. The Hitachi diagnostic tool tells me the disk is fine, and smartmontools says "PASSED" when I run "smartctl -H /dev/sda".

    I have also run a couple of short and long self-tests, and no errors show up:
    # smartctl -l selftest /dev/sda
    smartctl version 5.37 [i686-suse-linux-gnu] Copyright (C) 2002-6 Bruce Allen
    Home page is smartmontools Home Page (last updated $Date: 2008/06/16 17:31:16 $)

    === START OF READ SMART DATA SECTION ===
    SMART Self-test log structure revision number 1
    Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
    # 1 Short offline Completed without error 00% 43 -
    # 2 Short offline Completed without error 00% 41 -
    # 3 Extended offline Aborted by host 90% 37 -
    # 4 Extended offline Completed without error 00% 36 -
    # 5 Short offline Completed without error 00% 34 -
    Yet the SMART error log shows the following:
    Error 68 occurred at disk power-on lifetime: 43 hours (1 days + 19 hours)
    When the command that caused the error occurred, the device was active or idle.

    After command completion occurred, registers were:
    ER ST SC SN CL CH DH
    -- -- -- -- -- -- --
    84 51 00 29 36 ba e8 Error: ICRC, ABRT at LBA = 0x08ba3629 = 146421289

    Commands leading to the command that caused the error were:
    CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
    -- -- -- -- -- -- -- -- ---------------- --------------------
    25 00 a0 8a 32 ba e0 00 03:23:54.100 READ DMA EXT
    27 00 00 00 00 00 e0 00 03:23:54.100 READ NATIVE MAX ADDRESS EXT
    ec 00 00 00 00 00 a0 02 03:23:54.100 IDENTIFY DEVICE
    ef 03 46 00 00 00 a0 02 03:23:54.100 SET FEATURES [Set transfer mode]
    27 00 00 00 00 00 e0 00 03:23:54.100 READ NATIVE MAX ADDRESS EXT

    Error 67 occurred at disk power-on lifetime: 43 hours (1 days + 19 hours)
    When the command that caused the error occurred, the device was active or idle.

    After command completion occurred, registers were:
    ER ST SC SN CL CH DH
    -- -- -- -- -- -- --
    84 51 00 29 36 ba e8 Error: ICRC, ABRT at LBA = 0x08ba3629 = 146421289

    Commands leading to the command that caused the error were:
    CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
    -- -- -- -- -- -- -- -- ---------------- --------------------
    25 00 a0 8a 32 ba e0 00 03:23:53.900 READ DMA EXT
    25 00 00 8a 2e ba e0 00 03:23:53.900 READ DMA EXT
    25 00 00 8a 2a ba e0 00 03:23:53.900 READ DMA EXT
    25 00 00 8a 26 ba e0 00 03:23:53.900 READ DMA EXT
    c8 00 f8 92 25 ba e8 00 03:23:53.900 READ DMA

    Error 66 occurred at disk power-on lifetime: 43 hours (1 days + 19 hours)
    When the command that caused the error occurred, the device was active or idle.

    After command completion occurred, registers were:
    ER ST SC SN CL CH DH
    -- -- -- -- -- -- --
    84 51 00 49 18 2e e8 Error: ICRC, ABRT at LBA = 0x082e1849 = 137238601

    Commands leading to the command that caused the error were:
    CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
    -- -- -- -- -- -- -- -- ---------------- --------------------
    c8 00 70 da 17 2e e8 00 03:23:32.200 READ DMA
    c8 00 70 6a 17 2e e8 00 03:23:32.200 READ DMA
    25 00 58 12 16 2e e0 00 03:23:32.200 READ DMA EXT
    c8 00 18 ca 15 2e e8 00 03:23:32.200 READ DMA
    c8 00 c0 5a 12 2e e8 00 03:23:32.200 READ DMA

    Error 65 occurred at disk power-on lifetime: 43 hours (1 days + 19 hours)
    When the command that caused the error occurred, the device was active or idle.

    After command completion occurred, registers were:
    ER ST SC SN CL CH DH
    -- -- -- -- -- -- --
    84 51 00 f1 c8 47 e9 Error: ICRC, ABRT at LBA = 0x0947c8f1 = 155699441

    Commands leading to the command that caused the error were:
    CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
    -- -- -- -- -- -- -- -- ---------------- --------------------
    c8 00 48 aa c8 47 e9 00 03:23:31.600 READ DMA
    c8 00 70 6a 3f 2c e8 00 03:23:31.600 READ DMA
    c8 00 48 ca 18 2a e8 00 03:23:31.600 READ DMA
    c8 00 20 82 a0 29 e8 00 03:23:31.600 READ DMA
    c8 00 08 ca 7b 29 e8 00 03:23:31.600 READ DMA

    Error 64 occurred at disk power-on lifetime: 40 hours (1 days + 16 hours)
    When the command that caused the error occurred, the device was active or idle.

    After command completion occurred, registers were:
    ER ST SC SN CL CH DH
    -- -- -- -- -- -- --
    84 51 00 29 36 ba e8 Error: ICRC, ABRT at LBA = 0x08ba3629 = 146421289

    Commands leading to the command that caused the error were:
    CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
    -- -- -- -- -- -- -- -- ---------------- --------------------
    25 00 a0 8a 32 ba e0 00 00:39:24.200 READ DMA EXT
    25 00 00 8a 2e ba e0 00 00:39:24.200 READ DMA EXT
    25 00 00 8a 2a ba e0 00 00:39:24.100 READ DMA EXT
    25 00 00 8a 26 ba e0 00 00:39:24.100 READ DMA EXT
    c8 00 f8 92 25 ba e8 00 00:39:24.100 READ DMA
    Unfortunately, I can't make sense of any of this, and Hitachi support is closed. Could something be wrong with the SMART monitor itself, so that it sends me error messages although everything is fine?
    Thanks,
    g

    PS: I don't get any smart messages in XP, btw.
    PPS: I just got another message "Your hard disk drive is failing! S.M.A.R.T. message: Device: /dev/sda, ATA error count increased from 66 to 68"
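
For readers who find the register dump above opaque, here is a minimal Python sketch of how the ER, SN, CL, CH and DH bytes map onto the "ICRC, ABRT at LBA = ..." text that smartctl prints. The function names are made up for illustration, and only the 28-bit decoding that matches the values in this log is shown. ICRC is the ATA flag for an interface CRC error on the link between drive and controller, which is often a hint to look at the cable or port rather than the platters, though that is not a certain diagnosis.

```python
from collections import Counter

# A subset of the ATA error register bits, as reported for Ultra DMA transfers.
ERROR_BITS = {
    0x80: "ICRC",  # interface CRC error during a UDMA transfer
    0x40: "UNC",   # uncorrectable data error
    0x10: "IDNF",  # requested sector ID not found
    0x04: "ABRT",  # command aborted
}


def decode_error(er: int) -> list[str]:
    """Return the names of the flags set in the ER (error) register."""
    return [name for bit, name in ERROR_BITS.items() if er & bit]


def decode_lba28(sn: int, cl: int, ch: int, dh: int) -> int:
    """Rebuild the 28-bit LBA from the SN, CL, CH and DH registers."""
    # The low nibble of DH holds LBA bits 24..27; its high nibble is mode flags.
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


# (ER, SN, CL, CH, DH) copied from errors 68, 67, 66, 65 and 64 in the log.
rows = [
    (0x84, 0x29, 0x36, 0xBA, 0xE8),
    (0x84, 0x29, 0x36, 0xBA, 0xE8),
    (0x84, 0x49, 0x18, 0x2E, 0xE8),
    (0x84, 0xF1, 0xC8, 0x47, 0xE9),
    (0x84, 0x29, 0x36, 0xBA, 0xE8),
]

# Count how often each LBA shows up among the logged errors.
counts = Counter(decode_lba28(sn, cl, ch, dh) for _, sn, cl, ch, dh in rows)

for er, sn, cl, ch, dh in rows:
    print(decode_error(er), decode_lba28(sn, cl, ch, dh))
print(counts.most_common(1))
```

Three of the five logged errors land on the same LBA, 146421289, and all five carry the same ICRC and ABRT flags.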

  2. #2

    Re: Hard disk failure? SMART and smartmontools

    You might try booting to a live CD and running:

    fsck /dev/sda1 (or whatever the drive is)

    Do not fsck a mounted drive; you might fsck everything up.

    Also, check the BIOS. Many BIOSes have a SMART status you can check.

  3. #3

    Re: Hard disk failure? SMART and smartmontools

    I used the GParted live disc and identified the part of my disk that is supposedly dodgy (the root directory) and fsck gave me the following reply:
    /dev/hda6 primary superblock features different from backup, check forced

    PASS 1: checking inodes, blocks & sizes
    PASS 2: checking directory structure
    PASS 3: checking directory connectivity
    PASS 4: checking reference counts
    PASS 5: checking group summary information

    /dev/hda6: ***file system was modified***
    /dev/hda6: 173069/2626560 files (0.5% non-contiguous), 982273/524880 blocks
    I have now started Suse again, and SMART tells me my other (backup) hard drive is now failing as well!

    Your hard disk drive is failing! S.M.A.R.T. message: Device: /dev/sdb, 1 Currently unreadable (pending) sectors
    I had to replace my old hard disk because SMART told me it was failing.
    Three different hard drives failing in as many days?
    thanks,
    g

  4. #4

    Re: Hard disk failure? SMART and smartmontools

    That is odd. I have never seen those errors, and I've been using Linux since 2003. Maybe your motherboard is having issues with its IDE bus. Can you move the HD to another IDE port on your motherboard? Maybe that will help. Just a wild guess, though.
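
One way to follow up on the bus suggestion is to watch the raw SMART attributes before and after moving the drive to another port. The Python sketch below pulls two raw values out of the attribute table printed by `smartctl -A`: 199 (usually named UDMA_CRC_Error_Count) and 197 (Current_Pending_Sector). The sample table and its numbers are invented for illustration and are not taken from this thread, the function names are hypothetical, and attribute names can differ between drive models.

```python
import subprocess

# Attribute IDs to watch: 197 = pending sectors, 199 = UDMA CRC errors.
WATCHED = {197, 199}


def watched_nonzero(text: str) -> dict[str, int]:
    """Return watched attributes whose raw value is greater than zero."""
    hits = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attr_id, name, raw = int(fields[0]), fields[1], fields[9]
        if attr_id in WATCHED and raw.isdigit() and int(raw) > 0:
            hits[name] = int(raw)
    return hits


def read_attributes(device: str) -> str:
    """Run `smartctl -A` on a device (needs root and smartmontools)."""
    # smartctl uses its exit status as a bitmask, so it is not checked here.
    result = subprocess.run(
        ["smartctl", "-A", device], capture_output=True, text=True
    )
    return result.stdout


# Hypothetical sample output; the values are made up for this example.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       1
199 UDMA_CRC_Error_Count    0x000a   200   253   000    Old_age   Always       -       5
"""

print(watched_nonzero(SAMPLE))
```

On a real system you would call `watched_nonzero(read_attributes("/dev/sda"))` as root. If the CRC count stops growing after the drive is moved to another port, that would support the bus theory, though it would not prove it.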
