Hard drive error

The drive suddenly will not boot. The message asks me to replace it with a bootable device. The entire drive is used for a Leap 15.3 installation. When it is connected to another functioning box and started from an updated grub, it reaches…

[OK] Reached target Init Root Device

and stops. testdisk reports no problems. Any suggestions will be appreciated.

# gdisk /dev/sdb
GPT fdisk (gdisk) version 1.0.1


Partition table scan:
  MBR: protective
  BSD: not present
  APM: not present
  GPT: present

Command (? for help): p

Found valid GPT with protective MBR; using GPT.
Disk /dev/sdb: 1953525168 sectors, 931.5 GiB
Logical sector size: 512 bytes
Disk identifier (GUID): EBDEC3E4-CF60-4F22-BF56-78FF95E5785A
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 1953525134
Partitions will be aligned on 2048-sector boundaries
Total free space is 1050590 sectors (513.0 MiB)


Number  Start (sector)    End (sector)  Size       Code  Name
   1         1050624      1949329407   929.0 GiB   8300  
   2      1949329408      1953525134   2.0 GiB     8200  
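
As a quick sanity check on the listing above, the free-space figure can be recomputed from the sector numbers gdisk prints. This is only a sketch: the variable names are made up for illustration, and the values are copied from the output above.

```shell
#!/bin/sh
# Sector numbers copied from the gdisk listing (512-byte logical sectors).
first_usable=34
last_usable=1953525134
p1_start=1050624
p2_end=1953525134

# Unallocated sectors in front of partition 1 and behind partition 2.
gap_before=$((p1_start - first_usable))
gap_after=$((last_usable - p2_end))

free_sectors=$((gap_before + gap_after))
# Convert to MiB, rounded to the nearest whole MiB.
free_mib=$(( (free_sectors * 512 + 524288) / 1048576 ))

echo "gap before partition 1: $gap_before sectors"
echo "gap after partition 2:  $gap_after sectors"
echo "total free: $free_sectors sectors (about $free_mib MiB)"
```

The total comes out at 1050590 sectors, the same figure gdisk reports, and all of it sits in front of partition 1. The listing shows only type codes 8300 and 8200; whether that gap or the absence of any other partition type has anything to do with the boot failure cannot be told from this output alone.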


You may want to check everything using journalctl, fsck, smartctl.

Watch for extra messages:

erlangen:~ # journalctl -b -g sda
May 30 11:31:38 erlangen kernel: sd 0:0:0:0: [sda] 7814037168 512-byte logical blocks: (4.00 TB/3.64 TiB)
May 30 11:31:38 erlangen kernel: sd 0:0:0:0: [sda] 4096-byte physical blocks
May 30 11:31:38 erlangen kernel: sd 0:0:0:0: [sda] Write Protect is off
May 30 11:31:38 erlangen kernel: sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
May 30 11:31:38 erlangen kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
May 30 11:31:46 erlangen kernel:  sda: sda1
May 30 11:31:46 erlangen kernel: sd 0:0:0:0: [sda] Attached SCSI disk
May 30 11:31:47 erlangen udisksd[1212]: Mounted /dev/sda1 (system) at /HDD on behalf of uid 1000
May 30 11:31:47 erlangen kernel: EXT4-fs (sda1): mounted filesystem with ordered data mode. Quota mode: none.
erlangen:~ #
erlangen:~ # fsck -f /dev/sda1
fsck from util-linux 2.37.4 
e2fsck 1.46.5 (30-Dec-2021) 
Pass 1: Checking inodes, blocks, and sizes 
Pass 2: Checking directory structure 
Pass 3: Checking directory connectivity 
Pass 4: Checking reference counts 
Pass 5: Checking group summary information 
HDD: 1770095/244195328 files (1.2% non-contiguous), 586841533/976753920 blocks 
erlangen:~ #
erlangen:~ # smartctl -A /dev/sda
smartctl 7.3 2022-02-28 r5338 [x86_64-linux-5.17.9-1-default] (SUSE RPM) 
Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org 

=== START OF READ SMART DATA SECTION === 
SMART Attributes Data Structure revision number: 16 
Vendor Specific SMART Attributes with Thresholds: 
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE 
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       840 
  3 Spin_Up_Time            0x0027   181   175   021    Pre-fail  Always       -       7933 
  4 Start_Stop_Count        0x0032   092   092   000    Old_age   Always       -       8452 
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0 
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0 
  9 Power_On_Hours          0x0032   081   081   000    Old_age   Always       -       14182 
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0 
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0 
 12 Power_Cycle_Count       0x0032   097   097   000    Old_age   Always       -       3102 
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       88 
193 Load_Cycle_Count        0x0032   198   198   000    Old_age   Always       -       8394 
194 Temperature_Celsius     0x0022   123   110   000    Old_age   Always       -       29 
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0 
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       3 
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0 
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0 
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       116 

erlangen:~ #
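
To make "extra messages" easier to spot in a long attribute table, the smartctl output can be piped through a small filter. The sketch below is hypothetical: the function name flag_smart is made up, and the choice of attribute IDs 5, 197 and 198 is an assumption based on common practice, not something smartmontools prescribes.

```shell
#!/bin/sh
# Print the sector-health counters whose RAW_VALUE (10th column) is non-zero.
# Assumption: IDs 5, 197 and 198 are the ones worth flagging; adjust to taste.
flag_smart() {
    awk '($1 == 5 || $1 == 197 || $1 == 198) && $10 + 0 > 0 {
        printf "%s raw=%s\n", $2, $10
    }'
}

# Usage (needs root and a real device):
#   smartctl -A /dev/sda | flag_smart
```

Fed the table above, it would print only the Current_Pending_Sector line, whose raw value is 3. A non-zero pending count is worth watching over time, though a small value on its own does not prove the drive is failing.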

Thank you for the suggestions. I did try them with no problems reported. I gave up and re-installed and so far the drive boots normally.

If you re-install and re-format the disk’s partitions during the re-installation, any bad blocks will be found and marked in the disk’s bad block table.

  • Checking for bad blocks on a system disk was, and remains, tricky.
    Checking for bad blocks on an active volume was never reliable and probably never will be.
  • The only reliable method of dealing with bad blocks on a system disk is to use another system to perform the bad block scan on the suspect disk and then re-install the system on that disk.
    Often the bad block(s) have destroyed at least one critical system file, so the only safe way to recover the system is to re-install it.
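
A minimal sketch of what scanning from another system could look like, assuming the suspect disk is attached to a healthy box and none of its partitions are mounted. The helper scan_plan is hypothetical and only prints the commands instead of running them; /dev/sdX is a placeholder for the real device name.

```shell
#!/bin/sh
# Hypothetical helper: print a read-only scan plan for a suspect disk
# attached to a second, healthy system. Nothing is executed here.
scan_plan() {
    dev=$1
    mounts=${2:-/proc/mounts}   # second argument exists only to ease testing

    # Refuse if the disk or one of its partitions is currently mounted.
    if grep -q "^$dev" "$mounts"; then
        echo "refusing: $dev (or a partition on it) is mounted" >&2
        return 1
    fi

    echo "smartctl -t long $dev"       # start the drive's extended self-test
    echo "smartctl -l selftest $dev"   # read the self-test log once it has finished
    echo "badblocks -sv $dev"          # read-only surface scan (badblocks default mode)
}

# Usage:
#   scan_plan /dev/sdX
```

Printing the plan first leaves room to double-check the device name before anything touches the disk. The badblocks run shown is the default read-only mode; its write modes are deliberately left out here.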

Yes, yes, I know, modern file systems usually manage to handle any bad blocks appearing in a partition, but especially for the system partition the emphasis has to be placed on “usually” … >:)