The machine suddenly will not boot. The firmware message asks me to replace the drive with a bootable device. The entire drive is used for a Leap 15.3 installation. When the drive is connected to another, working box and started from that box's updated GRUB menu, the boot reaches…
[OK] Reached target Initrd Root Device
and stops there. testdisk reports no problems. Any suggestions will be appreciated.
# gdisk /dev/sdb
GPT fdisk (gdisk) version 1.0.1
Partition table scan:
MBR: protective
BSD: not present
APM: not present
GPT: present
Command (? for help): p
Found valid GPT with protective MBR; using GPT.
Disk /dev/sdb: 1953525168 sectors, 931.5 GiB
Logical sector size: 512 bytes
Disk identifier (GUID): EBDEC3E4-CF60-4F22-BF56-78FF95E5785A
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 1953525134
Partitions will be aligned on 2048-sector boundaries
Total free space is 1050590 sectors (513.0 MiB)
Number Start (sector) End (sector) Size Code Name
1 1050624 1949329407 929.0 GiB 8300
2 1949329408 1953525134 2.0 GiB 8200
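One way to read the listing above is to check where the 513.0 MiB of free space sits and whether any partition carries a type code a boot loader normally relies on with GPT. The following is a minimal sketch, not a diagnosis: the rows are copied by hand from the `p` output, the helper names are hypothetical, and the choice of `EF00` (EFI System Partition) and `EF02` (BIOS boot partition) as the codes to look for is an assumption about what this machine needs. The thread does not confirm that a missing boot partition caused the failure.

```python
# Partition rows copied from the gdisk `p` listing: (number, start, end, type code).
PARTITIONS = [
    (1, 1050624, 1949329407, "8300"),
    (2, 1949329408, 1953525134, "8200"),
]
FIRST_USABLE = 34
LAST_USABLE = 1953525134
SECTOR_SIZE = 512  # bytes, from "Logical sector size"

# gdisk type codes (assumption: these are the ones worth looking for here).
# EF00 = EFI System Partition, EF02 = BIOS boot partition.
BOOT_CODES = {"EF00", "EF02"}


def free_gaps(partitions, first_usable, last_usable):
    """Return (start, end) sector ranges not covered by any partition."""
    gaps = []
    cursor = first_usable
    for _, start, end, _ in sorted(partitions, key=lambda p: p[1]):
        if start > cursor:
            gaps.append((cursor, start - 1))
        cursor = max(cursor, end + 1)
    if cursor <= last_usable:
        gaps.append((cursor, last_usable))
    return gaps


def gap_mib(gap, sector_size=SECTOR_SIZE):
    """Size of a (start, end) sector range in MiB."""
    start, end = gap
    return (end - start + 1) * sector_size / 2**20


gaps = free_gaps(PARTITIONS, FIRST_USABLE, LAST_USABLE)
has_boot_partition = any(code.upper() in BOOT_CODES for *_, code in PARTITIONS)

for gap in gaps:
    print(f"free: sectors {gap[0]}-{gap[1]} ({gap_mib(gap):.1f} MiB)")
print("EF00/EF02 partition present:", has_boot_partition)
```

With these rows the only gap is at the front of the disk, before partition 1, and it accounts for the whole free space that gdisk reports.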
You may want to check everything using journalctl, fsck, smartctl.
Watch for extra messages:
**erlangen:~ #** journalctl -b -g sda
May 30 11:31:38 erlangen kernel: sd 0:0:0:0: [sda] 7814037168 512-byte logical blocks: (4.00 TB/3.64 TiB)
May 30 11:31:38 erlangen kernel: sd 0:0:0:0: [sda] 4096-byte physical blocks
May 30 11:31:38 erlangen kernel: sd 0:0:0:0: [sda] Write Protect is off
May 30 11:31:38 erlangen kernel: sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
May 30 11:31:38 erlangen kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
May 30 11:31:46 erlangen kernel: sda: sda1
May 30 11:31:46 erlangen kernel: sd 0:0:0:0: [sda] Attached SCSI disk
May 30 11:31:47 erlangen udisksd[1212]: Mounted /dev/sda1 (system) at /HDD on behalf of uid 1000
May 30 11:31:47 erlangen kernel: EXT4-fs (sda1): mounted filesystem with ordered data mode. Quota mode: none.
**erlangen:~ #**
**erlangen:~ #** fsck -f /dev/sda1
fsck from util-linux 2.37.4
e2fsck 1.46.5 (30-Dec-2021)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
HDD: 1770095/244195328 files (1.2% non-contiguous), 586841533/976753920 blocks
**erlangen:~ #**
**erlangen:~ #** smartctl -A /dev/sda
smartctl 7.3 2022-02-28 r5338 [x86_64-linux-5.17.9-1-default] (SUSE RPM)
Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 840
3 Spin_Up_Time 0x0027 181 175 021 Pre-fail Always - 7933
4 Start_Stop_Count 0x0032 092 092 000 Old_age Always - 8452
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0
9 Power_On_Hours 0x0032 081 081 000 Old_age Always - 14182
10 Spin_Retry_Count 0x0032 100 100 000 Old_age Always - 0
11 Calibration_Retry_Count 0x0032 100 100 000 Old_age Always - 0
12 Power_Cycle_Count 0x0032 097 097 000 Old_age Always - 3102
192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 88
193 Load_Cycle_Count 0x0032 198 198 000 Old_age Always - 8394
194 Temperature_Celsius 0x0022 123 110 000 Old_age Always - 29
196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 3
198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 116
**erlangen:~ #**
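To make "watch for extra messages" concrete for the SMART table, here is a small sketch that picks out counters whose raw value is non-zero. The set of attributes in `WATCHED` is an assumption (a common choice of sector-health counters), not something stated in this thread, and the function names are hypothetical. The sample rows are copied from the `smartctl -A` output above.

```python
# A few rows copied from the `smartctl -A /dev/sda` output above.
SMART_OUTPUT = """\
  5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 3
198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
"""

# Assumption: counters where any non-zero raw value deserves a closer look.
WATCHED = {
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "UDMA_CRC_Error_Count",
}


def parse_attributes(text):
    """Map attribute name to integer raw value for each attribute row."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows have 10 columns and start with a numeric ID.
        if len(fields) >= 10 and fields[0].isdigit() and fields[9].isdigit():
            attrs[fields[1]] = int(fields[9])
    return attrs


def suspicious(attrs, watched=WATCHED):
    """Return the watched attributes whose raw value is above zero."""
    return {name: raw for name, raw in attrs.items() if name in watched and raw > 0}


flagged = suspicious(parse_attributes(SMART_OUTPUT))
print(flagged)
```

On the sample rows this flags only `Current_Pending_Sector`, whose raw value is 3 in the listing.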
Thank you for the suggestions. I tried them and no problems were reported. I gave up and re-installed, and so far the drive boots normally.
If you re-install and re-format the disk's partitions during the re-installation, any bad blocks that are found will be marked in the disk's bad block table.
Checking for bad blocks on a system disk was, and remains, tricky.
Checking for bad blocks on an active volume was never reliable and probably never will be.
The only reliable method of dealing with bad blocks on a system disk is to use another system to perform the bad block scan on the suspect disk and then re-install the system on that disk.
Often, the bad blocks have destroyed at least one critical system file, so the only safe way to recover the system is to re-install it.
Yes, yes, I know, modern file systems usually manage to handle any bad blocks appearing in a partition but, especially for the system partition, the emphasis has to be placed on “usually” … >:)
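As a rough illustration of what a read-only scan from another system does, the sketch below reads a device or image chunk by chunk and records the offsets where the read fails. It is an assumption-laden stand-in, not a replacement for a dedicated tool such as `badblocks`: the function name and chunk size are hypothetical, reads may be served from the page cache rather than the platters, and the demo runs against a temporary file instead of a real disk.

```python
import os
import tempfile


def find_unreadable_chunks(path, chunk_size=1024 * 1024):
    """Read `path` sequentially and return byte offsets of chunks that fail to read."""
    bad_offsets = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.lseek(fd, 0, os.SEEK_END)  # works for files and Linux block devices
        offset = 0
        while offset < size:
            length = min(chunk_size, size - offset)
            try:
                os.pread(fd, length, offset)
            except OSError:
                # An I/O error here usually means the drive could not read the sectors.
                bad_offsets.append(offset)
            offset += length
    finally:
        os.close(fd)
    return bad_offsets


# Demo on a healthy 3 MiB temporary file; a real run would point at the
# unmounted suspect disk (for example /dev/sdb) from the other system, as root.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\0" * (3 * 1024 * 1024))
    demo_path = tmp.name

demo_result = find_unreadable_chunks(demo_path)
os.remove(demo_path)
print(demo_result)
```

The scan only reads, so it does not modify the suspect disk; the disk should still be unmounted while it runs.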