Loads grub, no errors, but will not complete start - bios changed boot order, now will not start

I think I just lost a hard drive. The BIOS has automatically made a correction, and my guess is that it changed the boot order. As it is, grub loads fine with no error codes and starts to boot, but locks in safe mode. Looks like sdb is a problem, but fdisk will not clean it.

My bios does not have a manual reorder that I can locate. My guess is that fstab needs to be looked at as well as grub. If I can get into linux I can do that with Yast. If there is a better way, please let me know. Any help is appreciated. I need help getting it started period.

This is unclear. Can you boot?

You can do it from any live/rescue Linux media, but you either need to edit /etc/fstab by hand or chroot into the installed system to get at the system files via YaST.
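For reference, a chroot from live media usually follows the pattern sketched below. The partition /dev/sda2 and the mount point /mnt are placeholders, not values from this thread; check the real root partition with lsblk or fdisk -l first. The function only prints the commands so they can be reviewed before being run as root, and a setup with a separate /boot or btrfs subvolumes would likely need extra mounts.

```shell
# Print the usual sequence for chrooting into an installed system from
# live/rescue media. Nothing is executed here; review, then run as root.
# $1 = root partition of the installed system (placeholder default)
# $2 = mount point on the live system (placeholder default)
print_chroot_steps() {
    root_part="${1:-/dev/sda2}"
    mnt="${2:-/mnt}"
    echo "mount $root_part $mnt"
    # Bind the live system's kernel filesystems into the target
    for fs in dev proc sys; do
        echo "mount --bind /$fs $mnt/$fs"
    done
    echo "chroot $mnt /bin/bash"
    echo "# inside the chroot: edit /etc/fstab by hand, or run yast"
}
```

Calling `print_chroot_steps /dev/sdc3 /mnt/suse` would list the steps for that (hypothetical) partition and mount point.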

Also, running smartctl can tell you whether the drive is damaged.

I haven’t tried the rescue disk yet. Usually fdisk cleans it up, so I tried that first. Thanks, I will work on that when I get home tonight.

Yes the computer boots, runs grub, but will not load linux.

I can get into windows via dual boot, but not linux.

What do you call “safe mode”? Please attach picture of your screen.

I have a Gecko Linux disk, using it as a rescue disk. I am into the live version. What is it I should be looking for? I am guessing grub.

thanks

I am sorry, but can’t attach a screen shot. It won’t complete the booting process. It hangs during the boot. The 3 flipping green dots. F1 shows:

] a start job is running for dev-disk-by\x2duuid-b0bas . . . .

So you have a reference in /etc/fstab to a drive or filesystem that is no longer present on your system. Boot from the openSUSE distribution DVD in rescue mode and compare the /etc/fstab entries with what you actually have.
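One way to do that comparison is sketched below. It assumes the fstab refers to filesystems by `UUID=` or by a `/dev/disk/by-uuid/` path (entries using by-id paths, labels, or quoted UUIDs are not handled), and the function name is made up for this example. With only one argument it asks blkid for the UUIDs that are actually present, which needs root to see every device.

```shell
# List UUID-based entries in an fstab file that do not match any
# filesystem currently visible to the system.
# $1 = path to the fstab to check (e.g. /mnt/etc/fstab from rescue media)
# $2 = optional file with one known UUID per line; if omitted, blkid is
#      queried instead
missing_fstab_uuids() {
    fstab="$1"
    if [ -n "${2:-}" ]; then
        present="$(cat "$2")"
    else
        present="$(blkid -s UUID -o value)"
    fi
    # Commented-out and non-UUID lines do not match the pattern and are skipped
    awk '$1 ~ /^(UUID=|\/dev\/disk\/by-uuid\/)/ {
             id = $1
             sub(/^(UUID=|\/dev\/disk\/by-uuid\/)/, "", id)
             print id, $2
         }' "$fstab" |
    while read -r uuid mountpoint; do
        if ! printf '%s\n' "$present" | grep -qxF "$uuid"; then
            echo "missing: UUID=$uuid ($mountpoint)"
        fi
    done
}
```

From rescue media, with the installed root mounted on /mnt, this would be run as `missing_fstab_uuids /mnt/etc/fstab`. Any line it prints is a candidate for the entry the boot is waiting on.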

You are past grub at this point. Which is why you always need to describe what you actually see, not what you think it means.

Thanks for the help. Sorry for any misunderstanding. I thought I did say it got past grub, and linux started loading, but then hung. Anyway, below is the last of the text, as best as I could copy it down.

Failed command: read DMA
status ( DRDY ERR )
error ( UNC )
blk_update_request: I/O error dev sdb, sector (a big ol' string) 
buffer I/O error sdb2
Welcome to emergency mode!  blah, blah blah 

As a test I installed Gecko Linux on an empty drive. I compared the /boot/grub2/device.map from that new install to the one on my openSUSE install.

Gecko shows the device.map as:
(hd1) /dev/sdb
(hd0) /dev/sdc
(hd2) /dev/sda

openSUSE shows the device.map as:
(hd2) /dev/sdc
(hd0) /dev/sda
(hd1) /dev/sdb

Those differences are not a problem. They just reflect that Gecko linux is installed on a different disk.

The other message perhaps indicates a disk problem. Try running “smartctl” in your gecko linux, to test the main hard drive.

I tried from a terminal window, but it doesn’t seem to be on the system. Thoughts?

:~> smartctl
bash: smartctl: command not found

I installed smartmontools. smartctl shows:

[CODE smartctl --all /dev/sdb
smartctl 6.2 2013-11-07 r3856 [x86_64-linux-4.1.21-14-default] (SUSE RPM)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family: Western Digital Caviar Blue Serial ATA
Device Model: WDC WD1600AAJS-98PSA0
Serial Number: WD-WCAP93951157
LU WWN Device Id: 5 0014ee 10081852c
Firmware Version: 05.06H05
User Capacity: 160,041,885,696 bytes [160 GB]
Sector Size: 512 bytes logical/physical
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA/ATAPI-7 (minor revision not indicated)
Local Time is: Sun Jun 19 14:49:32 2016 AKDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.

General SMART Values:
Offline data collection status: (0x84) Offline data collection activity
was suspended by an interrupting command from host.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 4380) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 58) minutes.
Conveyance self-test routine
recommended polling time: ( 6) minutes.
SCT capabilities: (0x103f) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 184 183 051 Pre-fail Always - 474460
3 Spin_Up_Time 0x0003 158 154 021 Pre-fail Always - 3091
4 Start_Stop_Count 0x0032 099 099 000 Old_age Always - 1772
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
7 Seek_Error_Rate 0x000e 200 200 051 Old_age Always - 0
9 Power_On_Hours 0x0032 027 027 000 Old_age Always - 53821
10 Spin_Retry_Count 0x0012 100 100 051 Old_age Always - 0
11 Calibration_Retry_Count 0x0012 100 100 051 Old_age Always - 0
12 Power_Cycle_Count 0x0032 099 099 000 Old_age Always - 1118
192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 437
193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 1772
194 Temperature_Celsius 0x0022 107 087 000 Old_age Always - 36
196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0
197 Current_Pending_Sector 0x0012 172 171 000 Old_age Always - 732
198 Offline_Uncorrectable 0x0010 200 200 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0008 001 001 051 Old_age Offline FAILING_NOW 79604

SMART Error Log Version: 1
Warning: ATA error count 40326 inconsistent with error log pointer 4

ATA Error Count: 40326 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It “wraps” after 49.710 days.

Error 40326 occurred at disk power-on lifetime: 53820 hours (2242 days + 12 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH


40 51 00 10 18 00 e5 Error: UNC at LBA = 0x05001810 = 83892240

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name


c8 00 08 10 18 00 05 00 2d+22:24:14.908 READ DMA
27 00 00 00 00 00 00 00 2d+22:24:14.904 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
ec 00 00 00 00 00 00 00 2d+22:24:14.895 IDENTIFY DEVICE
ef 03 46 00 00 00 00 00 2d+22:24:14.887 SET FEATURES [Set transfer mode]
27 00 00 00 00 00 00 00 2d+22:24:14.872 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]

Error 40325 occurred at disk power-on lifetime: 53820 hours (2242 days + 12 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH


40 51 00 10 18 00 e5 Error: UNC at LBA = 0x05001810 = 83892240

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name


c8 00 08 10 18 00 05 00 2d+22:24:12.626 READ DMA
25 00 80 80 18 00 05 00 2d+22:24:12.615 READ DMA EXT
c8 00 38 40 18 00 05 00 2d+22:24:12.612 READ DMA
27 00 00 00 00 00 00 00 2d+22:24:12.466 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
ec 00 00 00 00 00 00 00 2d+22:24:12.457 IDENTIFY DEVICE

Error 40324 occurred at disk power-on lifetime: 53820 hours (2242 days + 12 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH


40 51 00 10 18 00 e5 Error: UNC at LBA = 0x05001810 = 83892240

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name


c8 00 08 10 18 00 05 00 2d+22:24:10.174 READ DMA
c8 00 18 20 18 00 05 00 2d+22:24:10.174 READ DMA
c8 00 08 78 18 00 05 00 2d+22:24:10.174 READ DMA
c8 00 08 38 18 00 05 00 2d+22:24:10.173 READ DMA
c8 00 08 18 18 00 05 00 2d+22:24:10.173 READ DMA

Error 40323 occurred at disk power-on lifetime: 53820 hours (2242 days + 12 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH


40 51 00 00 08 00 e0 Error: UNC at LBA = 0x00000800 = 2048

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name


c8 00 08 00 08 00 00 00 2d+22:24:08.065 READ DMA
27 00 00 00 00 00 00 00 2d+22:24:07.823 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
ec 00 00 00 00 00 00 00 2d+22:24:07.814 IDENTIFY DEVICE
ef 03 46 00 00 00 00 00 2d+22:24:07.806 SET FEATURES [Set transfer mode]
27 00 00 00 00 00 00 00 2d+22:24:07.791 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]

Error 40322 occurred at disk power-on lifetime: 53820 hours (2242 days + 12 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH


40 51 00 00 08 00 e0 Error: UNC at LBA = 0x00000800 = 2048

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name


c8 00 08 00 08 00 00 00 2d+22:24:05.840 READ DMA
c8 00 08 f0 17 00 05 00 2d+22:24:05.837 READ DMA
c8 00 08 80 17 00 05 00 2d+22:24:05.837 READ DMA
27 00 00 00 00 00 00 00 2d+22:24:05.521 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
ec 00 00 00 00 00 00 00 2d+22:24:05.512 IDENTIFY DEVICE

SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error

1 Extended offline Completed: read failure 90% 48989 3016278

2 Short offline Completed: read failure 90% 48989 3016281

3 Short offline Completed: read failure 90% 19972 61587048

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
]

Appears to be a recurring error on startup. I’d say the drive is quickly becoming toast. Replace it.
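The rows in the attribute table that support this are Current_Pending_Sector (raw value 732) and Multi_Zone_Error_Rate (FAILING_NOW). A rough filter for pulling such rows out of `smartctl -A` output is sketched below. The function name and the choice of attributes 5, 197 and 198 as the ones worth flagging are assumptions of this sketch, not something smartctl defines.

```shell
# Read `smartctl -A` output on stdin and print the attributes that look
# worrying: any row whose WHEN_FAILED column is set, plus a nonzero raw
# count for reallocated (5), pending (197) or uncorrectable (198) sectors.
smart_red_flags() {
    awk '
        # Attribute rows start with a numeric ID and have 10 columns:
        # ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        $1 ~ /^[0-9]+$/ && NF >= 10 {
            if ($9 != "-")
                print $2 ": " $9
            else if (($1 == 5 || $1 == 197 || $1 == 198) && $10 + 0 > 0)
                print $2 ": raw value " $10
        }
    '
}
```

Typical use would be `smartctl -A /dev/sdb | smart_red_flags` as root. An empty result does not prove the disk is healthy, so the self-test log is still worth reading.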

Do you know what the partition /dev/sdb2 is used for?
Is it important for the boot process of the original os itself? I mean like root (/) or /usr filesystem.
If it just contains data, like /home or similar, you can boot into gecko, mount the original root device somewhere, and edit fstab in its etc subdirectory: comment out the line with /dev/sdb2.
Then try to reboot into the original os.
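A minimal sketch of that fstab edit, assuming the original root is mounted at /mnt and the entry really is written as /dev/sdb2 in the first field. If the fstab uses a UUID or by-id path instead, pass that exact string as the second argument. The function name is made up for this example, and a backup is kept next to the file.

```shell
# Comment out every active fstab line whose first field is the given device.
# $1 = fstab path (e.g. /mnt/etc/fstab)
# $2 = device string exactly as written in the first column (e.g. /dev/sdb2)
# A backup copy is written to "$1.bak" before the file is changed.
comment_out_device() {
    fstab="$1"
    dev="$2"
    cp -p "$fstab" "$fstab.bak"
    # Read from the backup, write the edited version back to the original
    awk -v dev="$dev" '$1 == dev { print "#" $0; next } { print }' \
        "$fstab.bak" > "$fstab"
}
```

For the case described here that would be `comment_out_device /mnt/etc/fstab /dev/sdb2`, run as root from the live system.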

Thanks. I did remark out the sdb lines. It was the /home directory. It will still not boot. Is there something in grub that needs to change? Or some other boot file?

Depends on the error messages you see (and we don’t :wink: ).

In the end you will have to replace the disk, so you could add one now, install a linux on that and see what you can salvage of sdb from there, if necessary.

If you have bad disk sectors, chances are that you have corrupted files, so I doubt that messing with things will help. Save what you can if you don’t have backups. That drive will only get worse.

I have pulled the old bad drive. It is no longer part of the system.

I live booted to Gecko and tried to modify the fstab file, but it still wouldn’t boot. I did not bring my notes on that error code to work, but will try to get it up here when I get home.

As for what I have tried:
I tried to install Gecko in an open partition. It started but hung when trying to install grub. It paused and allowed me to modify grub, load to mbr (or not), change boot directory, etc. I tried 3 or 4 variations, but none would allow grub to install successfully, so I cancelled the installation.

I was able to install Mint Mate into that partition. Success, finally! When installing, I mounted all of the drives. I then used the fstab file from Mint as a sample for the mount points and hard drive descriptions to clean up the openSuse fstab. While in Mint I was able to run the smart tools to verify the remaining hard drives are all clean. They are.

Next step, boot to openSUSE. It ran through grub fine. I selected openSUSE. openSUSE starts up. I select F1 to see the logs. Previously it stalled and kicked me into the text repair or emergency mode (? didn’t write that down) that contains “systemctl reboot”, etc.
This time it stalled out with no error messages.

The last two lines that show are:

  • started X Display Manager
  • Reached target graphical interface
    (then nothing, stalls and hangs)

Thanks for the help. Please let me know what you need me to do to provide any additional information to fix this mess. It has been over two weeks without openSuse and it is sorely missed.

I assume Gecko is some pet name you use for openSUSE???

Is this now an EFI or MBR boot? Are you booting the installer in the same mode that you wish to use to boot the OS?? What mode is any other OS using???

What video card???

Maybe show us fdisk -l
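For the EFI-or-MBR question, a quick check from any running Linux (installed or live) is whether /sys/firmware/efi exists, since the kernel only creates that directory when it was booted via UEFI. A small sketch follows; the function name and its optional directory argument are only there for illustration.

```shell
# Report how the currently running system was booted.
# $1 = directory to test (defaults to /sys/firmware/efi)
boot_mode() {
    if [ -d "${1:-/sys/firmware/efi}" ]; then
        echo "UEFI"
    else
        echo "legacy BIOS (MBR-style boot)"
    fi
}

boot_mode
```

Running this from the live installer and again from the working Mint install would show whether the two are being booted in the same mode.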

I’m pretty sure that he is talking about Gecko Linux, which is a distro based on openSUSE and built with SUSE Studio.