Tumbleweed dead after NTFS partition failed

I have a Debian/Tumbleweed dual-boot setup on the machine I’m currently using. Right now, I’m running Debian because Tumbleweed won’t boot.

My machine has an SSD (where Debian is installed) and an HDD (where Tumbleweed is installed). The HDD has an NTFS partition that won’t mount anymore because it needs to be checked with chkdsk and I don’t have Windows available. That was causing issues on Debian as well, so I disabled it in fstab and now Debian boots up fine.

I tried the same for Tumbleweed, but it wasn’t enough. I keep getting partition mount errors while booting Tumbleweed, and after several minutes the boot drops me to the command-line login prompt and I get no video.

Any ideas?

HDD/SSD layout:

# SSD:
sda      8:0    0 447,1G  0 disk 
├─sda1   8:1    0 435,4G  0 part /  # Debian is here
├─sda2   8:2    0     1K  0 part 
└─sda5   8:5    0  11,7G  0 part    # swap
sdb      8:16   0   2,7T  0 disk 
├─sdb1   8:17   0 874,3G  0 part    # shared (dirty) NTFS partition 
├─sdb2   8:18   0 732,4G  0 part    # shared ext4 partition
└─sdb3   8:19   0   1,2T  0 part    # Tumbleweed is here

Not without seeing the actual errors, or better the full output of

journalctl -b --no-pager --full

I noticed a few additional things while trying to get the journalctl log:

  • In rescue mode, running any command took ages. I also noticed this appeared on the screen after I logged in:

  • Back on Debian, I also noticed that mounting and navigating Tumbleweed’s root partition is painfully slow. I feel the whole HDD might be dying :grimacing:.

Anyway, here goes the journalctl log: openSUSE Paste

Show

cat /etc/fstab
lsblk -f

Yes, it is possible. The problem has nothing to do with the NTFS partition (at least, not directly). systemd-udevd fails to start, so systemd is not aware of the available devices and cannot mount any of the partitions. systemd-udevd times out on startup, and systemd-journald is also killed by a timeout while it scans the journal directory, which does point to HDD issues. The dmesg output may give more information (the journal could have been lost).
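As a rough illustration of the dmesg suggestion, here is a minimal sketch that filters kernel messages down to lines that look like disk trouble. The function name `disk_errors` and the match pattern are assumptions made for this example; the pattern is deliberately loose and not an exhaustive list of kernel messages.

```shell
# Hypothetical helper: keep only kernel log lines that look like ATA or
# block-layer I/O trouble. The pattern is an assumption, not a complete list.
disk_errors() {
    grep -iE 'ata[0-9]+(\.[0-9]+)?:|I/O error|medium error|failed command'
}

# Usage (reading the kernel log usually needs root):
#   dmesg | disk_errors
#   journalctl -k -b --no-pager | disk_errors
```

If the journal was lost, as suggested above, the `dmesg` variant is the one more likely to still show something.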

UUID=8868245b-c659-4be6-8956-74070de9cfed  /                  ext4  defaults             0  1
UUID=0740fa40-66fe-4478-9222-1fe57b68c256  /mnt/rootdebian    ext4  data=ordered         0  2
#UUID=2DCF654B20FA290D                      /mnt/hdextra       ntfs  fmask=133,dmask=022  0  0
UUID=ac09d1f4-f00f-43fa-af93-4d5f3661cf90  /mnt/hdextralinux  ext4  data=ordered         0  2
UUID=4a5217be-9d1c-4960-a83f-a047f8657834  swap               swap  defaults             0  0

NAME   FSTYPE FSVER LABEL          UUID                                 FSAVAIL FSUSE% MOUNTPOINTS
sda                                                                                    
├─sda1 ntfs         HD Extra       2DCF654B20FA290D                                    
├─sda2 ext4   1.0   HD Extra Linux ac09d1f4-f00f-43fa-af93-4d5f3661cf90                
└─sda3 ext4   1.0                  8868245b-c659-4be6-8956-74070de9cfed      1T     5% /
sdb                                                                                    
├─sdb1 ext4   1.0                  0740fa40-66fe-4478-9222-1fe57b68c256                
├─sdb2                                                                                 
└─sdb5 swap   1                    4a5217be-9d1c-4960-a83f-a047f8657834                                         

Here: openSUSE Paste

Sometimes it’s not the HDD at fault, but an old, particularly red, SATA cable. Do you have another cable to try? New ones don’t cost much.

How old is the HDD? Is it one of those perpendicular recording types that cost Seagate a $300M lawsuit settlement?

I just ran smartctl on my HDD. This is the error count:

Device Error Count: 1686 (device log contains only the most recent 20 errors)

This is the attributes table:

Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR--   097   094   006    -    200286920
  3 Spin_Up_Time            PO----   093   092   000    -    0
  4 Start_Stop_Count        -O--CK   097   097   020    -    3628
  5 Reallocated_Sector_Ct   PO--CK   100   100   010    -    472
  7 Seek_Error_Rate         POSR--   079   060   030    -    99353315
  9 Power_On_Hours          -O--CK   084   084   000    -    14474
 10 Spin_Retry_Count        PO--C-   100   100   097    -    0
 12 Power_Cycle_Count       -O--CK   098   098   020    -    3021
183 Runtime_Bad_Block       -O--CK   064   064   000    -    36
184 End-to-End_Error        -O--CK   100   100   099    -    0
187 Reported_Uncorrect      -O--CK   001   001   000    -    1686
188 Command_Timeout         -O--CK   100   098   000    -    0 1 14
189 High_Fly_Writes         -O-RCK   079   079   000    -    21
190 Airflow_Temperature_Cel -O---K   061   051   045    -    39 (Min/Max 27/39)
191 G-Sense_Error_Rate      -O--CK   100   100   000    -    0
192 Power-Off_Retract_Count -O--CK   100   100   000    -    953
193 Load_Cycle_Count        -O--CK   071   071   000    -    58048
194 Temperature_Celsius     -O---K   039   049   000    -    39 (0 16 0 0 0)
197 Current_Pending_Sector  -O--C-   001   001   000    -    18480
198 Offline_Uncorrectable   ----C-   001   001   000    -    18480
199 UDMA_CRC_Error_Count    -OSRCK   200   199   000    -    188
240 Head_Flying_Hours       ------   100   253   000    -    7963h+22m+44.536s
241 Total_LBAs_Written      ------   100   253   000    -    8416897190
242 Total_LBAs_Read         ------   100   253   000    -    545801730357
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

Is it just me, or does this look really bad?

Model Family:     Seagate Barracuda 7200.14 (AF)
Device Model:     ST3000DM001-1ER166

Oh dear. I’m glad I never bought one of those.

Reallocated_Sector_Ct, Current_Pending_Sector and Offline_Uncorrectable bother me too.


It’s dying, emphasis on IDs 197, and 198.
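For anyone who wants to pull just those counters out of a long smartctl report, here is a minimal sketch. It assumes the 8-column "brief" attribute layout shown in the table above (ID, name, flags, value, worst, thresh, fail, raw); the function name `smart_bad_sectors` is made up for this example.

```shell
# Hypothetical helper: read a smartctl attribute table on stdin and print the
# reallocated (5), pending (197) and offline-uncorrectable (198) counters
# whose raw value is non-zero. Assumes the raw value is the 8th column, as in
# the brief attribute format shown in this thread.
smart_bad_sectors() {
    awk '$1 == 5 || $1 == 197 || $1 == 198 {
        if ($8 + 0 > 0) print $1, $2, $8
    }'
}

# Usage (device name is a placeholder):
#   sudo smartctl -A -f brief /dev/sdX | smart_bad_sectors
```

Fed the table posted above, it prints the rows for IDs 5, 197 and 198 with raw values 472, 18480 and 18480. On a healthy disk it would normally print nothing.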


Damn…

I guess I’m buying a replacement ASAP. Any ideas how to clone it to a new HDD recovering as much data as possible? Will dd if=/dev/sda of=/dev/sdb be enough?
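On the dd question: by default, plain dd aborts at the first read error, which is a problem on a disk with thousands of pending sectors. A commonly used alternative is GNU ddrescue, which records progress in a mapfile and can be resumed. The sketch below is only a dry run that prints a two-pass plan; it assumes GNU ddrescue is installed, and the function name `rescue_plan`, the device names and the mapfile path are all placeholders for this example.

```shell
# Hypothetical dry-run helper: print (do not execute) a two-pass GNU ddrescue
# plan. Arguments: source device, destination device, mapfile path.
#   pass 1: -n skips the slow scraping phase and copies the easy areas first
#   pass 2: -d uses direct disc access, -r3 retries bad areas up to 3 times
#   -f is needed in both passes because the destination is a block device
rescue_plan() {
    printf 'ddrescue -f -n %s %s %s\n' "$1" "$2" "$3"
    printf 'ddrescue -d -f -r3 %s %s %s\n' "$1" "$2" "$3"
}

# Usage: check the device names with lsblk first, since they can differ
# between boots. Keep the mapfile on a third, healthy disk.
#   rescue_plan /dev/sdX /dev/sdY /path/to/rescue.map
```

Printing the commands instead of running them is a deliberate choice here: swapping source and destination would overwrite the failing disk, so it is worth reading the plan before pasting it into a root shell.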

I’m also accepting HDD recommendations in the <$100 range

@romariorios rotating rust, SSD, 2.5", 3.5", capacity? You need to buy two, one for backup…

I want something as similar as possible to what I already have: a 3 TB, 3.5-inch HDD. Preferably 7200 rpm, but I don’t mind downgrading to 5400 rpm to save money.

(I don’t know what rotating rust means)

@romariorios Rotating Rust = HDD :wink: Something like a Western Digital RE 3 TB Enterprise Hard Drive: 3.5 Inch, 7200 RPM, SATA III, 64 MB Cache - WD3000FYYZ (Old Model) is US$100… I would make sure it has a good warranty and is an Enterprise version.


ouch…

(R$ 1358 ≃ US$ 250)


Anyway, I feel like this thread has been derailed enough. I’m gonna mark it as solved and open a new thread asking about the data rescue procedure. Thanks for the help!
