uname -a
Linux dhcppc0 3.11.10-21-desktop #1 SMP PREEMPT Mon Jul 21 15:28:46 UTC 2014 (9a9565d) x86_64 x86_64 x86_64 GNU/Linux
I have a WDC_WD1600JD_22HBB0 250G external SATA laptop drive connected to USB Bus 001 Device 005 via a JMicron (152d:2337) ATA/ATAPI bridge adapter.
I have formatted this as ext4 (/dev/sdc1) using gparted with label ‘old tosh’ and there are approx 32GB of files on the volume.
I want to profile and store the bad blocks on this drive, so I do a
#/sbin/fsck.ext4 -c -c -C 0 /dev/sdc1
and prepare for a 20-30 hour wait…
The badblock check proceeds and it is at 10% after about 2h30m.
I leave the machine alone for a few more hours (2-3) and I return to find that the machine is unresponsive - the mouse cursor
does not respond to mouse movements, although the (usb) mouse is still powered. The caps lock key does not toggle
the caps lock led, although the (usb) keyboard is still powered.
On the screen I read the terminal output and it says
done
old tosh: updating bad block inode
but no signs of it returning to the # root prompt.
The operation seems to have completed far too early.
I look in /var/log/messages and see this sequence (timestamps omitted, but time consistent with failure time)
My USB drive is sd 7:0:0:0: at the start of the fsck operation
sd 7:0:0:0: [sdc] Unhandled error code
sd 7:0:0:0: [sdc]
Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
sd 7:0:0:0: [sdc] CDB:
Write(10): 2a 00 06 f1 40 00 00 00 f0 00
end_request: I/O error, dev sdc, sector 116473856
usb 1-2: reset high-speed USB device number 6 using ehci-pci
usb 1-2: reset high-speed USB device number 6 using ehci-pci
usb 1-2: reset high-speed USB device number 6 using ehci-pci
usb 1-2: reset high-speed USB device number 6 using ehci-pci
usb 1-2: reset high-speed USB device number 6 using ehci-pci
sd 7:0:0:0: Device offlined - not ready after error recovery
sd 7:0:0:0: [sdc] Unhandled error code
sd 7:0:0:0: [sdc]
Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK
sd 7:0:0:0: [sdc] CDB:
Write(10): 2a 00 07 01 bc f0 00 00 f0 00
end_request: I/O error, dev sdc, sector 117554416
sd 7:0:0:0: rejecting I/O to offline device
sd 7:0:0:0: [sdc] killing request
sd 7:0:0:0: [sdc] Unhandled error code
sd 7:0:0:0: [sdc]
Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
sd 7:0:0:0: [sdc] CDB:
Write(10): 2a 00 07 01 bd e0 00 00 20 00
end_request: I/O error, dev sdc, sector 117554656
sd 7:0:0:0: rejecting I/O to offline device
sd 7:0:0:0: rejecting I/O to offline device
sd 7:0:0:0: rejecting I/O to offline device
...
sd 7:0:0:0: rejecting I/O to offline device
the log floods with this last message, repeated every few microseconds or so, e.g.
2014-11-14T05:13:27.998640+00:00 dhcppc0 kernel: [21315.550191] sd 7:0:0:0: rejecting I/O to offline device
2014-11-14T05:13:27.998643+00:00 dhcppc0 kernel: [21315.550215] sd 7:0:0:0: rejecting I/O to offline device
then the device re-attaches at timestamp 21341.283023 and this is the complete log up to the lock-up and re-boot
2014-11-14T05:13:30.862797+00:00 dhcppc0 kernel: [21318.415214] sd 7:0:0:0: rejecting I/O to offline device
2014-11-14T05:13:30.862799+00:00 dhcppc0 kernel: [21318.415238] sd 7:0:0:0: rejecting I/O to offline device
2014-11-14T05:13:30.862801+00:00 dhcppc0 kernel: [21318.415262] sd 7:0:0:0: rejecting I/O to offline device
2014-11-14T05:13:30.862803+00:00 dhcppc0 kernel: [21318.415285] sd 7:0:0:0: rejecting I/O to offline device
2014-11-14T05:13:30.862805+00:00 dhcppc0 kernel: [21318.415309] sd 7:0:0:0: rejecting I/O to offline device
2014-11-14T05:13:30.862808+00:00 dhcppc0 kernel: [21318.415333] sd 7:0:0:0: rejecting I/O to offline device
2014-11-14T05:13:30.862810+00:00 dhcppc0 kernel: [21318.415357] sd 7:0:0:0: rejecting I/O to offline device
2014-11-14T05:13:30.862812+00:00 dhcppc0 kernel: [21318.415378] usb 1-2: USB disconnect, device number 6
2014-11-14T05:13:30.862814+00:00 dhcppc0 kernel: [21318.415381] sd 7:0:0:0: rejecting I/O to offline device
2014-11-14T05:13:30.862817+00:00 dhcppc0 kernel: [21318.415404] sd 7:0:0:0: rejecting I/O to offline device
2014-11-14T05:13:30.862819+00:00 dhcppc0 kernel: [21318.415427] sd 7:0:0:0: rejecting I/O to offline device
2014-11-14T05:13:30.862821+00:00 dhcppc0 kernel: [21318.415451] sd 7:0:0:0: rejecting I/O to offline device
2014-11-14T05:13:30.978177+00:00 dhcppc0 org.gtk.Private.UDisks2VolumeMonitor[2489]: ### debug: emit_signal: 0x6cfbd0
2014-11-14T05:13:31.023019+00:00 dhcppc0 org.gtk.Private.UDisks2VolumeMonitor[2489]: ### debug: emit_signal: 0x730030
2014-11-14T05:13:31.023667+00:00 dhcppc0 org.gtk.Private.UDisks2VolumeMonitor[2489]: ### debug: emit_signal: 0x6cfbd0
2014-11-14T05:13:53.730763+00:00 dhcppc0 kernel: [21341.283023] usb 1-2: new high-speed USB device number 7 using ehci-pci
2014-11-14T05:13:53.844763+00:00 dhcppc0 kernel: [21341.397974] usb 1-2: New USB device found, idVendor=152d, idProduct=2337
2014-11-14T05:13:53.844788+00:00 dhcppc0 kernel: [21341.397980] usb 1-2: New USB device strings: Mfr=1, Product=2, SerialNumber=5
2014-11-14T05:13:53.844791+00:00 dhcppc0 kernel: [21341.397983] usb 1-2: Product: USB to ATA/ATAPI Bridge
2014-11-14T05:13:53.844793+00:00 dhcppc0 kernel: [21341.397986] usb 1-2: Manufacturer: JMicron
2014-11-14T05:13:53.844795+00:00 dhcppc0 kernel: [21341.397988] usb 1-2: SerialNumber: D50AA92969FF
2014-11-14T05:13:53.847762+00:00 dhcppc0 kernel: [21341.400075] usb-storage 1-2:1.0: USB Mass Storage device detected
2014-11-14T05:13:53.847788+00:00 dhcppc0 kernel: [21341.400258] scsi8 : usb-storage 1-2:1.0
2014-11-14T05:13:53.850404+00:00 dhcppc0 mtp-probe: checking bus 1, device 7: "/sys/devices/pci0000:00/0000:00:13.5/usb1/1-2"
2014-11-14T05:13:53.852007+00:00 dhcppc0 mtp-probe: bus: 1, device: 7 was not an MTP device
2014-11-14T05:13:54.920329+00:00 dhcppc0 kernel: [21342.473154] scsi 8:0:0:0: Direct-Access WDC WD25 00BEVT-22A23T0 PQ: 0 ANSI: 2 CCS
2014-11-14T05:13:54.920353+00:00 dhcppc0 kernel: [21342.473440] sd 8:0:0:0: Attached scsi generic sg3 type 0
2014-11-14T05:13:54.929761+00:00 dhcppc0 kernel: [21342.482625] sd 8:0:0:0: [sdd] 488397168 512-byte logical blocks: (250 GB/232 GiB)
2014-11-14T05:13:54.930785+00:00 dhcppc0 kernel: [21342.483632] sd 8:0:0:0: [sdd] Write Protect is off
2014-11-14T05:13:54.930798+00:00 dhcppc0 kernel: [21342.483637] sd 8:0:0:0: [sdd] Mode Sense: 28 00 00 00
2014-11-14T05:13:54.931775+00:00 dhcppc0 kernel: [21342.484626] sd 8:0:0:0: [sdd] No Caching mode page found
2014-11-14T05:13:54.931795+00:00 dhcppc0 kernel: [21342.484630] sd 8:0:0:0: [sdd] Assuming drive cache: write through
2014-11-14T05:13:54.934821+00:00 dhcppc0 kernel: [21342.487997] sd 8:0:0:0: [sdd] No Caching mode page found
2014-11-14T05:13:54.934843+00:00 dhcppc0 kernel: [21342.488020] sd 8:0:0:0: [sdd] Assuming drive cache: write through
2014-11-14T05:13:54.947107+00:00 dhcppc0 kernel: [21342.500292] sdd: sdd1
2014-11-14T05:13:54.956768+00:00 dhcppc0 kernel: [21342.509156] sd 8:0:0:0: [sdd] No Caching mode page found
2014-11-14T05:13:54.956792+00:00 dhcppc0 kernel: [21342.509164] sd 8:0:0:0: [sdd] Assuming drive cache: write through
2014-11-14T05:13:54.956795+00:00 dhcppc0 kernel: [21342.509170] sd 8:0:0:0: [sdd] Attached SCSI disk
2014-11-14T05:13:55.231909+00:00 dhcppc0 org.gtk.Private.UDisks2VolumeMonitor[2489]: ### debug: emit_signal: 0x7300b0
2014-11-14T05:13:55.439810+00:00 dhcppc0 org.gtk.Private.UDisks2VolumeMonitor[2489]: ### debug: emit_signal: 0x7300b0
2014-11-14T05:13:55.448218+00:00 dhcppc0 org.gtk.Private.UDisks2VolumeMonitor[2489]: ### debug: emit_signal: 0x6cf950
2014-11-14T05:13:55.716810+00:00 dhcppc0 org.gtk.Private.UDisks2VolumeMonitor[2489]: ### debug: in handle_volume_mount
2014-11-14T05:13:55.815765+00:00 dhcppc0 kernel: [21343.368045] EXT4-fs (sdd1): warning: mounting fs with errors, running e2fsck is recommended
2014-11-14T05:13:55.830269+00:00 dhcppc0 kernel: [21343.377245] EXT4-fs (sdd1): mounted filesystem with ordered data mode. Opts: (null)
2014-11-14T05:13:55.838491+00:00 dhcppc0 org.gtk.Private.UDisks2VolumeMonitor[2489]: ### debug: emit_signal: 0x6cf950
2014-11-14T05:13:55.846963+00:00 dhcppc0 org.gtk.Private.UDisks2VolumeMonitor[2489]: libbluray/bdnav/index_parse.c:162: indx_parse(): error opening /run/media/user/old tosh/BDMV/index.bdmv
2014-11-14T05:13:55.855869+00:00 dhcppc0 org.gtk.Private.UDisks2VolumeMonitor[2489]: libbluray/bdnav/index_parse.c:162: indx_parse(): error opening /run/media/user/old tosh/BDMV/BACKUP/index.bdmv
2014-11-14T05:13:55.857520+00:00 dhcppc0 org.gtk.Private.UDisks2VolumeMonitor[2489]: ### debug: emit_signal: 0x71d480
2014-11-14T05:08:57.221125+00:00 dhcppc0 udisksd[3375]: message repeated 28 times: Error performing housekeeping for drive /org/freedesktop/UDisks2/drives/WDC_WD1600JD_22HBB0_WD_WMAL91398348: Error updating SMART data: sk_disk_smart_status: Input/output error (udisks-error-quark, 0)]
2014-11-14T05:13:55.864764+00:00 dhcppc0 udisksd[3375]: Mounted /dev/sdd1 at /run/media/user/old tosh on behalf of uid 1000
2014-11-14T05:13:55.873000+00:00 dhcppc0 org.gtk.Private.UDisks2VolumeMonitor[2489]: ### debug: in volume_mount_cb
2014-11-14T05:13:55.873759+00:00 dhcppc0 org.gtk.Private.UDisks2VolumeMonitor[2489]: ### debug: success
2014-11-14T05:15:01.626777+00:00 dhcppc0 systemd[1]: Starting Session 25 of user root.
2014-11-14T05:15:01.664319+00:00 dhcppc0 systemd[1]: Started Session 25 of user root.
the drive re-attaches as /dev/sdd, not /dev/sdc
This is the second time this has happened. After the first occurrence, I made the following
changes
-change APM to prevent spin-down
#/sbin/hdparm -B128 /dev/sdc
-another spin-down/standby change
#/sbin/hdparm -S0 /dev/sdc
-added
options usbcore autosuspend=-1
to /etc/modprobe.d/99-local.conf
I installed sdparm and did
#/sbin/sdparm -a /dev/sdc
but no STANDBY parameter is listed.
So - my questions
Why does the USB drive get reset by ehci-pci ?
Why does the USB drive go offline ?
Why does the fsck indicate premature completion ?
Why does Linux lock solid and require a power-off re-boot when this happens ?