smartd reports disk errors

The SMART daemon (smartd) reports a few unreadable disk sectors. See below.

Is there a way to clear the errors? A way to mark the sectors as a no-go area?

2018-01-28T10:54:10-0700 sma-station14l smartd[1344]: Device: /dev/sda [SAT], 1 Currently unreadable (pending) sectors
2018-01-28T10:54:10-0700 sma-station14l smartd[1344]: Device: /dev/sdb [SAT], 145 Currently unreadable (pending) sectors
2018-01-28T10:54:10-0700 sma-station14l smartd[1344]: Device: /dev/sdb [SAT], 145 Offline uncorrectable sectors

SMART should remap them automatically unless the drive has run out of spare sectors.

Time for a new drive :’(

SpinRite may fix it, but it is not free: https://www.grc.com/intro.htm. Unless you plan on fixing a lot of drives, you can buy a new TB drive for less money.

I’ve found SMART won’t automatically recover, only mark and report problem sectors.

Perhaps most importantly, I’ve found that you should re-run the tests to see whether the number of reported bad sectors increases or stays the same. I have one disk whose count has stayed the same for many years; it is not heavily accessed, but it is powered on 24/7.

If the number of bad sectors increases, plan on replacing the disk ASAP.
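
One way to keep an eye on this is to pull the pending-sector counts out of the smartd log lines (like the ones in the first post) and compare two snapshots. A rough Python sketch follows; the regular expression and the function names are my own, and the "later" snapshot is invented purely to show the comparison, it is not real data.

```python
import re

# Matches smartd lines such as:
#   ... smartd[1344]: Device: /dev/sdb [SAT], 145 Currently unreadable (pending) sectors
PENDING_RE = re.compile(
    r"Device: (?P<dev>\S+) .*?, (?P<count>\d+) Currently unreadable \(pending\) sectors"
)

def pending_counts(log_text):
    """Return {device: last reported pending-sector count} from smartd log text."""
    counts = {}
    for line in log_text.splitlines():
        m = PENDING_RE.search(line)
        if m:
            counts[m.group("dev")] = int(m.group("count"))
    return counts

def growing(old, new):
    """Return the devices whose pending count is higher in `new` than in `old`."""
    return sorted(dev for dev, n in new.items() if n > old.get(dev, 0))

# Lines taken from the first post.
earlier = """\
2018-01-28T10:54:10-0700 sma-station14l smartd[1344]: Device: /dev/sda [SAT], 1 Currently unreadable (pending) sectors
2018-01-28T10:54:10-0700 sma-station14l smartd[1344]: Device: /dev/sdb [SAT], 145 Currently unreadable (pending) sectors
"""
# HYPOTHETICAL later snapshot, made up only to demonstrate the comparison.
later = """\
2018-02-28T10:54:10-0700 sma-station14l smartd[1344]: Device: /dev/sda [SAT], 1 Currently unreadable (pending) sectors
2018-02-28T10:54:10-0700 sma-station14l smartd[1344]: Device: /dev/sdb [SAT], 160 Currently unreadable (pending) sectors
"""

print(pending_counts(earlier))
print(growing(pending_counts(earlier), pending_counts(later)))
```

A device that shows up in the second list is the one to plan a replacement for.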

Sectors can also be marked bad by mistake.
If you want to try to recover those sectors, you will need to research that disk manufacturer’s recommendations. Some OEMs like Seagate make an image available for trying to recover those sectors. But unless the number of marked sectors is very large, it’s probably not worth the trouble… Nowadays most disks are set up with a reserve of sectors to replace bad ones, so your total disk capacity isn’t affected.

It’s my impression, though, that most disk manufacturers don’t provide a way to recover bad sectors; the only way to do so is at the factory (you will need to send the disk in to be “repaired”).

TSU

No, and bad sectors cannot be “recovered” anyway, as they are bad on a hardware level.
(special data recovery labs may be able to reconstruct the data though, but that’s really expensive)

Writing to those sectors should cause the hard drive to reallocate them to a different place, if there are still spare sectors.

You can force it with the badblocks command with the “-n” option.
From “man badblocks”:

   -n     Use non-destructive read-write mode.  By default only  a  non-
          destructive  read-only  test is done.  This option must not be
          combined with the -w option, as they are mutually exclusive.

(you can also specify a range of sectors, see the manpage)
Use it at your own risk though.
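
To illustrate the range form, here is a small Python sketch that only builds the command line, it does not run anything. It assumes the synopsis from the manpage (`badblocks [options] device [last-block] [first-block]`) and that `-b` is set to the drive's logical sector size, so the block numbers line up with the LBAs that SMART reports. The helper name and the example range are hypothetical; take real LBAs from your own self-test log (e.g. the LBA_of_first_error column of `smartctl -l selftest`).

```python
def badblocks_args(device, first_lba, last_lba, sector_size=512):
    """Build an argv list for a non-destructive read-write test of an LBA range.

    Assumes sector_size matches the drive's logical sector size, so that
    badblocks block numbers correspond to the LBAs reported by SMART.
    """
    if first_lba < 0 or last_lba < first_lba:
        raise ValueError("need 0 <= first_lba <= last_lba")
    return [
        "badblocks",
        "-n",                    # non-destructive read-write mode
        "-s", "-v",              # show progress, be verbose
        "-b", str(sector_size),  # block size in bytes
        device,
        str(last_lba),           # badblocks takes the last block first...
        str(first_lba),          # ...and the first block second
    ]

# HYPOTHETICAL range, for illustration only.
print(" ".join(badblocks_args("/dev/sdb", 1000, 2000)))
```

Double-check the printed command before running it as root, and make sure nothing on the device is mounted.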

OTOH, 145 uncorrectable sectors do sound a bit much. Better replace the hard disk, I’d say.

Take a look at what ‘smartctl --health /dev/sda’ and ‘smartctl --health /dev/sdb’ are reporting.

  • If the news is bad, then a new drive will need to be purchased.

The ‘smartctl’ man page offers some examples of how to run self-tests on the drives. For example, selecting LBAs 10 to 100 and 30 to 300: “smartctl -t select,10-100 -t select,30-300 -t afterselect,on -t pending,45 /dev/sda”.

As I posted earlier, you can download and run Seagate SeaTools to perform a low-level disk re-format to try to recover bad blocks.
When you run a SeaTools scan, it will test each marked bad block to see if it can be written to and data read from it correctly, overriding normal system behavior.

AFAIK this is the only publicly available tool that provides a way for a user to recover erroneously marked bad blocks. If a block is <really> bad (no mistake), then yes, it can’t be recovered… In all other circumstances, it’s a problem that can be fixed only by the OEM factory.

TSU

AFAICS, one of the most useful pieces of information for resolving the question “Is the disk nearing ‘end-of-life’?” is the following section of “smartctl --all” (the following example is from my SSD used for system directories; the values I consider to be important are highlighted):


Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 **Raw_Read_Error_Rate     0x000f   100   100   050    Pre-fail  Always       -       0/3595062**
  5 Retired_Block_Count     0x0033   100   100   003    Pre-fail  Always       -       0
  9 **Power_On_Hours_and_Msec 0x0032   100   100   000    Old_age   Always       -       19529h+42m+00.140s**
 12 Power_Cycle_Count       0x0032   098   098   000    Old_age   Always       -       2199
171 Program_Fail_Count      0x0032   000   000   000    Old_age   Always       -       1
172 Erase_Fail_Count        0x0032   000   000   000    Old_age   Always       -       0
174 Unexpect_Power_Loss_Ct  0x0030   000   000   000    Old_age   Offline      -       103
177 Wear_Range_Delta        0x0000   000   000   000    Old_age   Offline      -       1
181 Program_Fail_Count      0x0032   000   000   000    Old_age   Always       -       1
182 Erase_Fail_Count        0x0032   000   000   000    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0022   030   030   000    Old_age   Always       -       30 (Min/Max 30/30)
195 **ECC_Uncorr_Error_Count  0x001c   100   100   000    Old_age   Offline      -       0/3595062**
196 Reallocated_Event_Count 0x0033   100   100   000    Pre-fail  Always       -       0
231 **SSD_Life_Left           0x0013   100   100   010    Pre-fail  Always       -       0**
233 SandForce_Internal      0x0000   000   000   000    Old_age   Offline      -       256
234 SandForce_Internal      0x0032   000   000   000    Old_age   Always       -       256
241 **Lifetime_Writes_GiB     0x0032   000   000   000    Old_age   Always       -       256**
242 **Lifetime_Reads_GiB      0x0032   000   000   000    Old_age   Always       -       3264**
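
If you would rather not eyeball the table, it can be parsed. The Python sketch below reads rows in the layout shown above (with the ** highlight markers stripped) and lists any attribute whose normalized VALUE has dropped to or below a non-zero THRESH. That rule is my understanding of how smartctl fills the WHEN_FAILED column, so treat it as an assumption; a threshold of 000 is taken to mean "never fails". The function names are made up for this example, and the sample rows are copied from the table above.

```python
def parse_attributes(table_text):
    """Parse 'smartctl --all' attribute rows into a list of dicts."""
    rows = []
    for line in table_text.splitlines():
        # Drop the forum's ** highlight markers, then split into 10 columns;
        # the last column (RAW_VALUE) may itself contain spaces.
        fields = line.replace("**", " ").split(None, 9)
        if len(fields) < 10 or not fields[0].isdigit():
            continue  # header line or anything that is not an attribute row
        rows.append({
            "id": int(fields[0]),
            "name": fields[1],
            "value": int(fields[3]),
            "worst": int(fields[4]),
            "thresh": int(fields[5]),
            "type": fields[6],
            "raw": fields[9].strip(),
        })
    return rows

def failing(rows):
    """Names of attributes with VALUE at or below a non-zero THRESH."""
    return [r["name"] for r in rows if r["thresh"] > 0 and r["value"] <= r["thresh"]]

sample = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 **Raw_Read_Error_Rate     0x000f   100   100   050    Pre-fail  Always       -       0/3595062**
  5 Retired_Block_Count     0x0033   100   100   003    Pre-fail  Always       -       0
194 Temperature_Celsius     0x0022   030   030   000    Old_age   Always       -       30 (Min/Max 30/30)
231 **SSD_Life_Left           0x0013   100   100   010    Pre-fail  Always       -       0**
"""

rows = parse_attributes(sample)
for r in rows:
    if r["thresh"] > 0:
        # Headroom between the normalized value and its failure threshold.
        print(r["name"], r["value"] - r["thresh"])
print(failing(rows))
```

For the sample rows nothing is at or below its threshold, which matches the "-" in the WHEN_FAILED column.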