Catastrophic failure after adding Packman-Essentials repository.

I discovered I needed the libx264 codec and added the Packman-Essentials repository on Friday night. I instructed YaST to switch system packages to the new repository. I was horrified when informed that over 200 packages were to be updated, but proceeded anyway. Bad move. Upon rebooting, I was presented with an Emergency Mode prompt. Entering ‘journalctl’ gave the following output:

http://tomr.fastmail.net/public/misc/journalctl_01.jpg

http://tomr.fastmail.net/public/misc/journalctl_2.jpg

Inserting the TW installation media and entering Recovery Mode gives the following:

http://tomr.fastmail.net/public/misc/recovery_01.jpg

I might be mistaken about the above due to the dizzying array of startup methods. And I can’t be sure there are no hardware errors. I saw stuff about what looked like the boot drive in some error messages. The box in question is ancient hardware from 2005. I’ve just been using it as a server. It hasn’t been touched for about 18 months except for automatic security patches.

Adding the packman repo and switching packages to packman should not cause the problems that you saw. I have never had any problems doing that.

I have, however, had a disk fail. It seems to have been the disk electronics rather than the recording surface. And it wasn’t pretty. The number of errors was huge, and running “fsck” only made it worse. (I replaced the disk, reinstalled openSUSE, and restored “/home” from a recent backup – and all has been fine since then).

Hi
If it’s a desktop, check the disk SATA cable(s) and power supply; if it’s a laptop, pull the disk, clean the contacts and re-insert it… It may help…

Thanks, guys. I pulled the disk and cleaned things up inside the box before I took the pictures of the journalctl output. At this point, I’m still stuck at the Emergency prompt and am prepared to reinstall TW. I verified that the system seems to be able to perform an install, but I hesitate to do so, thinking there may be a less drastic solution. But I gather that no other obvious repair is possible.

Hi
Check the smartctl -a /dev/sdN output to see what the disk attributes are like, and run a test of the disk as well… Boot from a live (rescue) USB image for this…
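
For anyone who wants to script those checks from a live system, here is a minimal sketch. It assumes smartmontools is installed and that it runs as root. The device path /dev/sdX is a placeholder, and the helper names smartctl_commands and run_checks are made up for illustration; -a, -t short and -l selftest are standard smartctl options.

```python
import subprocess


def smartctl_commands(device):
    """Return the smartctl invocations for a basic health check of `device`."""
    return [
        ["smartctl", "-a", device],              # identity, attributes, error log
        ["smartctl", "-t", "short", device],     # start a short self-test
        ["smartctl", "-l", "selftest", device],  # read the self-test log
    ]


def run_checks(device):
    """Run each command in turn and echo it first (hypothetical helper)."""
    for cmd in smartctl_commands(device):
        print("$", " ".join(cmd))
        # smartctl's exit status is a bit mask, so non-zero is not always fatal.
        subprocess.run(cmd, check=False)
```

Calling run_checks("/dev/sdX") with the real device name runs all three. The short test takes a minute or so to finish in the background, so the self-test log is worth reading again a little later.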

I couldn’t get the system into any kind of shell I recognized so I went ahead and reinstalled TW using an old install disk and the Internet. The installation appeared to succeed but then when I went to reboot I was thrown directly into the rescue prompt.

So I shut the machine down, pulled the drive and plugged it into my laptop running 15.2. Running the smartctl command gives the following results. The self-testing capability seems to be limited, but nothing strikes me as obviously wrong:

smartctl 7.0 2019-05-21 r4917 [x86_64-linux-5.3.18-lp152.66-default] (SUSE RPM)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Indilinx Barefoot based SSDs
Device Model:     Patriot Torqx 2 32GB SSD
Serial Number:    BA1407210B0800023048
Firmware Version: S5FAM014
User Capacity:    32,017,047,552 bytes [32.0 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS (minor revision not indicated)
SATA Version is:  SATA 2.6, 3.0 Gb/s
Local Time is:    Fri Mar 19 18:15:44 2021 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.

Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever  
                                        been run.
Total time to complete Offline  
data collection:                (  255) seconds.
Offline data collection
capabilities:                    (0x1b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        No Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine  
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        (   2) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   ---   ---   ---    Pre-fail  Always       -       1095216686180
  9 Power_On_Hours          0x0012   ---   ---   ---    Old_age   Always       -       1310745700
 12 Power_Cycle_Count       0x0012   ---   ---   ---    Old_age   Always       -       12018788
168 Unknown_Attribute       0x0012   100   100   000    Old_age   Always       -       160
170 Unknown_Attribute       0x0003   100   100   010    Pre-fail  Always       -       34359738417
173 Unknown_Attribute       0x0012   100   100   000    Old_age   Always       -       36971316
192 Power-Off_Retract_Count 0x0012   100   100   000    Old_age   Always       -       49
218 Unknown_Attribute       0x000b   100   100   050    Pre-fail  Always       -       158

SMART Error Log Version: 1
ATA Error Count: 779 (device log contains only the most recent five errors)
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 779 occurred at disk power-on lifetime: 21 hours (0 days + 21 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 08 18 fd 0b e0  Error: UNC 8 sectors at LBA = 0x000bfd18 = 785688

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 18 fd 0b e0 08  21d+13:13:21.000  READ DMA
  c8 00 20 78 24 c8 e2 08  21d+13:13:21.000  READ DMA
  27 00 00 00 00 00 e0 08  21d+13:13:21.000  READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
  ec 00 00 00 00 00 a0 08  21d+13:13:21.000  IDENTIFY DEVICE
  ef 03 45 00 00 00 a0 08  21d+13:13:21.000  SET FEATURES [Set transfer mode]

...


SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%         0         -
# 2  Short offline       Completed without error       00%         0         -
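
For anyone reading the error log above, here is a small sketch of how the reported LBA follows from the register bytes, and where the 49.710-day wrap comes from. The function name lba28_from_registers is made up for illustration. The wrap figure is what you would expect from a 32-bit millisecond counter, which is an assumption on my part rather than something the log states.

```python
def lba28_from_registers(sn, cl, ch, dh):
    """Combine ATA task-file registers into a 28-bit LBA."""
    # Only the low 4 bits of Device/Head carry LBA bits 24-27.
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


# Register bytes from the "Error 779" line: SN=18 CL=fd CH=0b DH=e0
lba = lba28_from_registers(0x18, 0xFD, 0x0B, 0xE0)
print(hex(lba), lba)  # 0xbfd18 785688

# Assumed 32-bit millisecond counter: it wraps after 2**32 ms.
wrap_days = 2**32 / (1000 * 60 * 60 * 24)
print(f"{wrap_days:.3f}")  # 49.710
```

The decoded value matches the "LBA = 0x000bfd18 = 785688" that smartctl printed for the failed READ DMA.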

Hi
I would suggest consulting the manufacturer’s website for those attributes; those raw values seem awfully suspect, plus “ATA Error Count: 779”. The original test was at 0 hours, so it’s not relevant.

Is the device still under warranty? If so, arrange a replacement…
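
One way to eyeball those raw values is to print them in hex, since vendors often pack several fields into the 48-bit raw field. This is only a sketch: the numbers are copied from the attribute table in the smartctl output above, and the variable names are made up for illustration.

```python
# RAW_VALUE column copied from the smartctl attribute table.
raw_values = {
    1: 1095216686180,  # Raw_Read_Error_Rate
    9: 1310745700,     # Power_On_Hours
    12: 12018788,      # Power_Cycle_Count
    168: 160,
    170: 34359738417,
    173: 36971316,
    192: 49,           # Power-Off_Retract_Count
    218: 158,
}

for attr_id, raw in raw_values.items():
    # 12 hex digits cover the 48-bit raw field of a SMART attribute.
    print(f"{attr_id:3d}  {raw:>14d}  0x{raw:012x}")

# Low 16 bits of each raw value, to check for a shared pattern.
low16 = {attr_id: raw & 0xFFFF for attr_id, raw in raw_values.items()}
```

For attributes 1, 9 and 12 (the ones showing "---" for VALUE and WORST) the low 16 bits all come out as 0x6464, i.e. two bytes of 100. That might mean smartctl is not decoding these fields the way the firmware lays them out, but that is a guess on my part, so the manufacturer’s documentation is still the place to confirm it.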

Thanks. I tried rebooting again, this time “directly” (without going through the menu on the install medium) and it succeeded. So the ancient box is still doing my bidding for at least a little while longer.

The drive appeared in 2011! Try some current hardware. SATA SSDs now are fast, reliable and cheap.

#if _FP_W_TYPE_SIZE < 32
#error "Here's a nickel kid.  Go buy yourself a real computer."
#endif
        -- linux/arch/sparc64/double.h