Results 1 to 2 of 2

Thread: Extra Fun With Backup To External Disk

  1. #1
    Join Date
    Jan 2014
    Location
    Erlangen
    Posts
    3,138

    Default Extra Fun With Backup To External Disk

    Backup failed with:
    Code:
    erlangen:~ # journalctl --since 07:00 --until 07:10 -u backup-home-WD25.service   
    -- Logs begin at Fri 2021-06-04 21:27:14 CEST, end at Wed 2021-06-16 15:14:22 CEST. -- 
    Jun 16 07:03:18 erlangen systemd[1]: Starting Backup /home to USB disk... 
    Jun 16 07:03:22 erlangen rsync[13087]: rsync: [generator] recv_generator: failed to stat "/WD25/backup/home/Albums/Bilder/2015/2015-02-12/20150216075315-77010-map.html": Input/output error (5) 
    Jun 16 07:03:22 erlangen rsync[13087]: rsync: [generator] recv_generator: failed to stat "/WD25/backup/home/Albums/Bilder/2015/2015-03-10/20150316022150-04180-map.html": Input/output error (5) 
    Jun 16 07:03:29 erlangen rsync[13087]: rsync: [generator] recv_generator: failed to stat "/WD25/backup/home/Albums/Bilder/2017/2017-06-16/20170617124057-07168-map.html": Input/output error (5) 
    Jun 16 07:03:34 erlangen rsync[13087]: rsync: [generator] recv_generator: failed to stat "/WD25/backup/home/Albums/Bilder/2018/2018-05-30/.jalbum/p1230368.mp4.info": Input/output error (5) 
    Jun 16 07:03:39 erlangen rsync[13087]: rsync: [generator] recv_generator: failed to stat "/WD25/backup/home/Albums/Bilder/2019/2019-09-04/p1310878.jpg": Input/output error (5) 
    Jun 16 07:04:27 erlangen rsync[13087]: rsync: [generator] recv_generator: failed to stat "/WD25/backup/home/Albums/jAlbums/Bilder/2009/2009-07-14/thumbs/imgp7089.jpg": Input/output error (5) 
    Jun 16 07:04:32 erlangen rsync[13087]: rsync: [generator] recv_generator: failed to stat "/WD25/backup/home/Albums/jAlbums/Bilder/2010/2010-05-19/slides/imgp9397.jpg": Input/output error (5) 
    Jun 16 07:04:34 erlangen rsync[13087]: rsync: [generator] recv_generator: failed to stat "/WD25/backup/home/Albums/jAlbums/Bilder/2010/2010-08-09/thumbs/imgp0064.jpg": Input/output error (5) 
    Jun 16 07:04:37 erlangen rsync[13087]: rsync: [generator] recv_generator: failed to stat "/WD25/backup/home/Albums/jAlbums/Bilder/2014/2014-02-01/thumbs/p1010670.jpg": Input/output error (5) 
    Jun 16 07:04:38 erlangen rsync[13087]: rsync: [generator] recv_generator: failed to stat "/WD25/backup/home/Albums/jAlbums/Bilder/2014/2014-03-26/thumbs/p1020445.jpg": Input/output error (5) 
    Jun 16 07:04:39 erlangen rsync[13087]: rsync: [generator] recv_generator: failed to stat "/WD25/backup/home/Albums/jAlbums/Bilder/2014/2014-05-17/slides/p1030429.jpg": Input/output error (5) 
    Jun 16 07:04:58 erlangen rsync[13086]: rsync error: received SIGINT, SIGTERM, or SIGHUP (code 20) at rsync.c(701) [sender=3.2.3] 
    Jun 16 07:04:58 erlangen systemd[1]: backup-home-WD25.service: Main process exited, code=exited, status=20/n/a
    Jun 16 07:04:58 erlangen rsync[13087]: rsync error: received SIGUSR1 (code 19) at main.c(1608) [generator=3.2.3] 
    Jun 16 07:04:58 erlangen systemd[1]: backup-home-WD25.service: Failed with result 'exit-code'.
    Jun 16 07:04:58 erlangen systemd[1]: Stopped Backup /home to USB disk. 
    Jun 16 07:04:58 erlangen systemd[1]: backup-home-WD25.service: Consumed 17.297s CPU time.
    erlangen:~ #
    
    Unmounted and ran fsck on 2TB ext4. Got prompted for continuing after read error in directory inode. When prompted for overwriting I did so. Fsck would hang and even 'kill -9' wouldn't work. Thus I unplugged the USB drive and plugged it in again. Another fsck reported a clean file system. I started a long selftest (400 minutes):
    Code:
    erlangen:~ # smartctl -l selftest /dev/sdd            
    smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.12.9-1-default] (SUSE RPM) 
    Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org 
    
    === START OF READ SMART DATA SECTION === 
    SMART Self-test log structure revision number 1 
    Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error 
    # 1  Extended offline    Completed without error       00%       899         - 
    # 2  Extended offline    Completed without error       00%         8         - 
    
    erlangen:~ #
    Restarted backup service and got a nice message:
    Code:
    erlangen:~ # journalctl --since 13:00 -u backup-home-WD25.service                 
    -- Logs begin at Fri 2021-06-04 21:27:14 CEST, end at Wed 2021-06-16 15:24:47 CEST. -- 
    Jun 16 15:07:31 erlangen systemd[1]: Starting Backup /home to USB disk... 
    Jun 16 15:12:55 erlangen systemd[1]: backup-home-WD25.service: Succeeded. 
    Jun 16 15:12:55 erlangen systemd[1]: Finished Backup /home to USB disk. 
    Jun 16 15:12:55 erlangen systemd[1]: backup-home-WD25.service: Consumed 42.747s CPU time.
    erlangen:~ #
    Smart attributes are what they should be:
    Code:
    erlangen:~ # smartctl -A /dev/sdd                                 
    smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.12.9-1-default] (SUSE RPM) 
    Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org 
    
    === START OF READ SMART DATA SECTION === 
    SMART Attributes Data Structure revision number: 16 
    Vendor Specific SMART Attributes with Thresholds: 
    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE 
      1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0 
      3 Spin_Up_Time            0x0027   208   200   021    Pre-fail  Always       -       4600 
      4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       598 
      5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0 
      7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0 
      9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       899 
     10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0 
     11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0 
     12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       302 
    192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       258 
    193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       2471 
    194 Temperature_Celsius     0x0022   113   101   000    Old_age   Always       -       39 
    196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0 
    197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0 
    198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0 
    199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0 
    200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0 
    
    erlangen:~ #
    Note the 899 Power_On_Hours. An error can occur at any age of the drive.
    AMD Athlon 4850e (2009), openSUSE 13.1, KDE 4, Intel i3-4130 (2014), i7-6700K (2016), i5-8250U (2018), AMD Ryzen 5 3400G (2020), openSUSE Tumbleweed, KDE Plasma 5

  2. #2
    Join Date
    Jan 2014
    Location
    Erlangen
    Posts
    3,138

    Default More Fun With Backup To Hard Disk Drive

    This drive started to fail smart testing:
    Code:
    erlangen:~ # smartctl -i /dev/sdb          
    smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.13.2-1-default] (SUSE RPM) 
    Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org 
    
    === START OF INFORMATION SECTION === 
    Model Family:     Western Digital Green 
    Device Model:     WDC WD40EZRX-22SPEB0 
    Serial Number:    WD-WCC4E2FYXSNV 
    LU WWN Device Id: 5 0014ee 262d2e71e 
    Firmware Version: 80.00A80 
    User Capacity:    4,000,787,030,016 bytes [4.00 TB] 
    Sector Sizes:     512 bytes logical, 4096 bytes physical 
    Rotation Rate:    5400 rpm 
    Device is:        In smartctl database [for details use: -P show] 
    ATA Version is:   ACS-2 (minor revision not indicated) 
    SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s) 
    Local Time is:    Thu Jul 22 06:48:17 2021 CEST 
    SMART support is: Available - device has SMART capability. 
    SMART support is: Enabled 
    
    erlangen:~ #
    
    Bad sectors caused short offline tests to fail:
    Code:
    erlangen:~ # smartctl -l selftest /dev/sdb 
    smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.13.2-1-default] (SUSE RPM) 
    Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org 
    
    === START OF READ SMART DATA SECTION === 
    SMART Self-test log structure revision number 1 
    Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error 
     # 1  Extended offline    Completed without error       00%     12832         - 
     # 2  Short offline       Completed without error       00%     12815         - 
    # 3  Short offline       Completed: read failure       20%     12799         11653928 
    # 4  Short offline       Completed: read failure       20%     12799         11651807 
    # 5  Short offline       Completed: read failure       20%     12799         11651806 
    # 6  Short offline       Completed: read failure       20%     12799         11651805 
    # 7  Short offline       Completed: read failure       20%     12798         11651803 
    # 8  Short offline       Completed: read failure       20%     12798         11651802 
    # 9  Short offline       Completed: read failure       20%     12798         11651801 
    #10  Short offline       Completed: read failure       20%     12798         11650680 
    #11  Short offline       Completed: read failure       20%     12798         11651800 
    #12  Short offline       Completed: read failure       90%     12734         11651800 
    #13  Short offline       Completed: read failure       90%     12718         11648558 
    #14  Short offline       Interrupted (host reset)      90%     12678         - 
    #15  Short offline       Completed: read failure       20%     12662         11651800 
    #16  Extended offline    Completed: read failure       90%     12641         11651800 
    #17  Conveyance offline  Completed without error       00%     12564         - 
    #18  Extended offline    Interrupted (host reset)      90%     11508         - 
    #19  Short offline       Completed without error       00%     10641         - 
    #20  Extended offline    Interrupted (host reset)      90%     10386         - 
    #21  Extended offline    Interrupted (host reset)      70%     10331         - 
    13 of 13 failed self-tests are outdated by newer successful extended offline self-test # 1 
    
    erlangen:~ #
    
    Recovery used procedures from https://www.smartmontools.org/wiki/BadBlockHowto debugfs revealed all bad sectors were owned by a single file. Read errors were confirmed using hdparm --read-sector 11651800 /dev/sdb Writing to the bad sectors fixed them: hdparm --yes-i-know-what-i-am-doing --write-sector 11651800 /dev/sdb

    Subsequent checking revealed that the affected file was intact. Reading the drive with 'dd if=/dev/sdb of=/dev/null bs=4M status=progress' succeeded as well as short and extended SMART self tests.

    Since recovering error rates #1 and #200 stay constant during operation. Pending sectors #197 and uncorrectable sectors #198 are fixed but reported erroneously.
    Code:
    erlangen:~ # smartctl -A /dev/sdb              
    smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.13.2-1-default] (SUSE RPM) 
    Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org 
    
    === START OF READ SMART DATA SECTION === 
    SMART Attributes Data Structure revision number: 16 
    Vendor Specific SMART Attributes with Thresholds: 
    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE 
      1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       492 
      3 Spin_Up_Time            0x0027   180   175   021    Pre-fail  Always       -       7966 
      4 Start_Stop_Count        0x0032   094   094   000    Old_age   Always       -       6868 
      5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0 
      7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0 
      9 Power_On_Hours          0x0032   083   083   000    Old_age   Always       -       12840 
     10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0 
     11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0 
     12 Power_Cycle_Count       0x0032   098   098   000    Old_age   Always       -       2375 
    192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       68 
    193 Load_Cycle_Count        0x0032   198   198   000    Old_age   Always       -       6831 
    194 Temperature_Celsius     0x0022   121   110   000    Old_age   Always       -       31 
    196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0 
    197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       4 
    198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       2 
    199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0 
    200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       110 
    
    erlangen:~ #
    
    The following test is expected to fix the inconsistency:
    Code:
    erlangen:~ # smartctl -t offline /dev/sdb              
    smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.13.2-1-default] (SUSE RPM) 
    Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org 
    
    === START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION === 
    Sending command: "Execute SMART off-line routine immediately in off-line mode". 
    Drive command "Execute SMART off-line routine immediately in off-line mode" successful. 
    Testing has begun. 
    
    Please wait 53520 seconds for test to complete. 
    Test will complete after Thu Jul 22 22:05:05 2021 CEST 
    Use smartctl -X to abort test. 
    erlangen:~ #
    
    AMD Athlon 4850e (2009), openSUSE 13.1, KDE 4, Intel i3-4130 (2014), i7-6700K (2016), i5-8250U (2018), AMD Ryzen 5 3400G (2020), openSUSE Tumbleweed, KDE Plasma 5

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •