MPT3SAS / LSI 9305-16i Issues (HBA/SATA Drive resets) on Multiple machines

I think I have found a problem with the LSI 9305-16i SAS/SATA HBA (and/or the MPT3SAS driver) under Leap 15.2 (latest patches). I have been working this issue for about 4 weeks now, and I think I've pretty much tried everything I can to mitigate it (short of simply using a different HBA). I have lots of info (including dmesg dumps), but here is the basic gist of things:

I have a couple of systems configured as prototypes for a Virtual Machine server (we are an OEM integrator). Both of these machines are running Leap 15.2 on ASUS WS621E dual-Xeon motherboards with recent online updates, patches, drivers, BIOS, and firmware (part of the troubleshooting process). These systems run virtual machines, each on its own dedicated SSD: one VM guest to one 2.5" SATA SSD. These SATA SSDs are connected to LSI 9305-16i HBAs. The firmware and BIOS on these HBAs have been updated to the very latest rev (Dec 2020), but more on that later.

Anywho… The host OS has been dismounting and remounting (read-only) the host filesystem on each SSD about once a week for each of these VMs. This has happened pretty consistently on BOTH machines, on two different networks, and has happened to each and every VM regardless of the guest OS (Ubuntu appliance, SUSE Linux, openSUSE SMG appliance). The host filesystem (on each SSD) was originally EXT4, but we have also tried XFS as part of the troubleshooting process. (The only difference is HOW the disconnect problem manifests itself; changing the host FS didn't resolve the issue.) Host filesystems are also mounted "noatime" and "nodiratime" to minimize unnecessary updates. Moving the VMs off the SATA drives and onto an internal RAID array mitigates the issue (likely because it takes the LSI HBA and MPT3SAS driver out of the path), but this is not a desirable end configuration for us.
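For reference, the per-VM filesystems are mounted along these lines (a sketch from memory; the labels are illustrative rather than copied from our actual fstab, but the mount points and options match what we use):

# /etc/fstab excerpt (illustrative labels; one SSD per VM guest)
LABEL=VMD4   /VMDisks/VMD4   ext4   noatime,nodiratime   0  2
LABEL=Mail   /VMDisks/Mail   ext4   noatime,nodiratime   0  2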

dmesg shows the disk/HBA timeouts are clearly related to MPT3SAS. One of the latest online updates actually upgraded this driver (to 35.101.00.00), but that did not resolve the issue.
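(For anyone following along, this is how I confirm which mpt3sas version is actually loaded after each update; both commands should agree:)

cat /sys/module/mpt3sas/version
modinfo mpt3sas | grep -i '^version'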

I also found an interesting link (from June 2020) about SATA drive timeouts on LSI 9305 HBAs and a firmware update (16.00.12.00) which was privately released to resolve them. Specifically, the fix was supposed to resolve timeouts which only affected SATA drives on this SAS/SATA HBA. I also noticed that a similarly versioned firmware release was made publicly available in Dec 2020. Needless to say, I was pretty convinced this would solve the issue, as it was a pretty good description of what we were seeing. I applied the latest firmware release and, unfortunately, it did not:

https://www.truenas.com/community/resources/lsi-9300-xx-firmware-update.145/

I opened a ticket with LSI/Avago/Broadcom tech support but have not heard anything back. (Surprise, surprise…) I have a bunch of data and observations (and can surely get whatever else is needed), and I'd appreciate the ability to get a dialog started with anyone here who might be able to provide more insight and/or help resolve the issue.

Update…

It looks like the weekly TRIM operation is directly related to these issues. That operation runs at midnight every Monday morning, and that is precisely when the SATA SSDs connected to the 9305-16i typically see the "remount" (reset) issues.
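(You can confirm the schedule yourself; on a stock install the fstrim.timer unit should show OnCalendar=weekly, which systemd interprets as Monday 00:00 - exactly when my resets occur:)

systemctl list-timers fstrim.timer
systemctl cat fstrim.timer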

Output from my latest fstrim logs (journalctl -b -u fstrim.service) looks like this. You can see one VM ("Dimension") drop off the list with an ioctl failure:

-- Logs begin at Fri 2021-04-16 18:12:36 CDT, end at Wed 2021-04-21 12:00:01 CDT. --
Apr 19 00:00:01 sundance systemd[1]: Starting Discard unused blocks on filesystems from /etc/fstab...
Apr 19 00:03:43 sundance fstrim[13415]: fstrim: /VMDisks/Dimension: FITRIM ioctl failed: Input/output error
Apr 19 00:07:46 sundance fstrim[13415]: /VMDisks/VMD6: 460.2 GiB (494097096704 bytes) trimmed on /dev/sdi1
Apr 19 00:07:46 sundance fstrim[13415]: /VMDisks/VMD5: 411 GiB (441261481984 bytes) trimmed on /dev/sdh1
Apr 19 00:07:46 sundance fstrim[13415]: /VMDisks/VMD4: 435 GiB (467064438784 bytes) trimmed on /dev/sde1
Apr 19 00:07:46 sundance fstrim[13415]: /VMDisks/Mail: 435 GiB (467063312384 bytes) trimmed on /dev/sdg1
Apr 19 00:07:46 sundance fstrim[13415]: /VMDisks/Web: 411 GiB (441261477888 bytes) trimmed on /dev/sdd1
Apr 19 00:07:46 sundance systemd[1]: fstrim.service: Main process exited, code=exited, status=64/n/a
Apr 19 00:07:46 sundance systemd[1]: Failed to start Discard unused blocks on filesystems from /etc/fstab.
Apr 19 00:07:46 sundance systemd[1]: fstrim.service: Unit entered failed state.
Apr 19 00:07:46 sundance systemd[1]: fstrim.service: Failed with result 'exit-code'.

Corresponding info from dmesg looks like this:

[193707.808764] sd 10:0:2:0: attempting task abort!scmd(0x00000000a6d4920f), outstanding for 30748 ms & timeout 30000 ms
[193707.808773] sd 10:0:2:0: [sdf] tag#4967 CDB: Unmap/Read sub-channel 42 00 00 00 00 00 00 00 18 00
[193707.808779] scsi target10:0:2: handle(0x001a), sas_address(0x300062b2069af741), phy(1)
[193707.808783] scsi target10:0:2: enclosure logical id(0x500062b2069af740), slot(2)
[193707.808786] scsi target10:0:2: enclosure level(0x0000), connector name( )
.
. (a bunch more like the above)
.
[193836.917666] scsi target10:0:2: target reset: SUCCESS scmd(0x00000000fa8806c1)
[193837.771473] sd 10:0:2:0: Power-on or device reset occurred
[193865.852538] sd 10:0:2:0: attempting task abort!scmd(0x00000000d20880d1), outstanding for 7028 ms & timeout 7000 ms
[193865.852547] sd 10:0:2:0: [sdf] tag#4952 CDB: ATA command pass through(16) 85 06 20 00 00 00 00 00 00 00 00 00 00 00 e5 00
[193865.852552] scsi target10:0:2: handle(0x001a), sas_address(0x300062b2069af741), phy(1)
[193865.852556] scsi target10:0:2: enclosure logical id(0x500062b2069af740), slot(2)
[193865.852559] scsi target10:0:2: enclosure level(0x0000), connector name( )
[193865.881899] sd 10:0:2:0: [sdf] tag#4979 FAILED Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK
[193865.881915] sd 10:0:2:0: [sdf] tag#4981 FAILED Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK
[193865.881917] sd 10:0:2:0: [sdf] tag#4979 CDB: Write(10) 2a 08 1d d8 66 08 00 00 08 00
[193865.881926] blk_update_request: I/O error, dev sdf, sector 500721160 op 0x1:(WRITE) flags 0x20800 phys_seg 1 prio class 0
[193865.881929] sd 10:0:2:0: [sdf] tag#4981 CDB: Read(10) 28 00 00 00 00 00 00 00 08 00
[193865.881933] sd 10:0:2:0: [sdf] tag#4980 timing out command, waited 180s
[193865.881940] blk_update_request: I/O error, dev sdf, sector 500721160 op 0x1:(WRITE) flags 0x20800 phys_seg 1 prio class 0
[193865.881951] blk_update_request: I/O error, dev sdf, sector 0 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
[193865.881953] sd 10:0:2:0: [sdf] tag#4980 FAILED Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK
[193865.881970] sd 10:0:2:0: [sdf] tag#4980 CDB: Unmap/Read sub-channel 42 00 00 00 00 00 00 00 18 00
[193865.881993] blk_update_request: I/O error, dev sdf, sector 249563136 op 0x3:(DISCARD) flags 0x4800 phys_seg 1 prio class 0
[193865.882004] sd 10:0:2:0: task abort: SUCCESS scmd(0x00000000d20880d1)
[193865.882011] Aborting journal on device sdf1-8.
[193865.882032] sd 10:0:2:0: [sdf] tag#4978 FAILED Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK
[193865.882042] sd 10:0:2:0: [sdf] tag#4978 CDB: Unmap/Read sub-channel 42 00 00 00 00 00 00 00 18 00
[193865.882051] blk_update_request: I/O error, dev sdf, sector 249825279 op 0x3:(DISCARD) flags 0x800 phys_seg 1 prio class 0
[193866.521415] sd 10:0:2:0: Power-on or device reset occurred
[193866.525954] EXT4-fs error (device sdf1): ext4_journal_check_start:61: Detected aborted journal
[193866.525957] EXT4-fs (sdf1): Remounting filesystem read-only
[193866.528151] EXT4-fs (sdf1): ext4_writepages: jbd2_start: 9223372036854775806 pages, ino 14; err -30

So, it appears the LSI/Avago/Broadcom 9305-16i HBA does not "like" the FSTRIM operation. I don't know if this is unique to the TRIM request or if it just exacerbates some underlying issue…

I expect I can just disable the service, but it seems like I shouldn’t have to…

Anyone?

I would love to know how I could get the above info in front of the engineers at LSI/Broadcom/Avago! They are not responding to the support form I submitted on their web site.

I think I have this pretty well cornered, and you'd think they would be interested in solving the problem…

https://forums.opensuse.org/showthread.php/552227-Ошибка-выполнения-fstrim-для-NTFS-на-ядрах-5-1X

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=7e0815797656f029fab2edc309406cddf931b9d8

Test your systems with the kernel from the kernel:stable repo.

Thanks for the reply. I am running the latest distro kernel with all online patches applied. This is not a custom kernel, and the issue has survived several online updates. As this is a fully configured environment, I'm not inclined to replace/rebuild everything. What I am already running should be a "stable kernel" insofar as it has not been customized or rebuilt in any way. (Correct me if I am wrong…)
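(If I do end up testing a newer kernel, my understanding is it would go roughly like this, without rebuilding anything; the repo URL is my assumption of the usual OBS layout for Kernel:stable, so double-check it before using:)

uname -r    # what I'm running now
zypper ar -f https://download.opensuse.org/repositories/Kernel:/stable/standard/ kernel-stable   # assumed repo URL
zypper dup --from kernel-stable --allow-vendor-change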

In other news… I did finally hear back from LSI. After a brief discussion over the phone, they are requesting some additional information generated by their log-collection utility. I am sending this to them now. As part of the discussion, the engineer I am working with hinted that this very well may be a TRIM issue. He is not even sure whether the HBA supports TRIM. (If it doesn't, I would simply think it should ignore the request.)

Anyway, it seems that the weekly FSTRIM cycle may very well be a key piece of this issue… This problem may also only manifest itself if you have an SSD connected to these HBAs.

I just got the following info back from LSI (even before I submitted my diagnostics log to them). It appears that the pre-configured (automatic) FSTRIM service may be what is triggering these problems. I would expect that anyone attaching SATA SSDs to a 9305 HBA may very well be exposed to this type of issue. I have temporarily disabled the FSTRIM timer to hopefully mitigate this problem. However, as these are SSDs, I think fully operational TRIM support is needed to take advantage of wear leveling in these types of drives.
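(For anyone wanting to do the same in the meantime, it is the timer, not the service, that drives the weekly run; disabling it looks like this:)

systemctl disable --now fstrim.timer
systemctl list-timers fstrim.timer    # should no longer show a scheduled run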

(BTW, we are using 512 GB Samsung 860 Pros in this application):

TRIM is partially supported.
I think fstrim may be sending incorrect encapsulation commands through the SAT-L data packet.
See the following IT firmware limitations:

LSI SAS HBAs with IT firmware do support TRIM, but with these limitations:

The drives must support both "Data Set Management TRIM supported (limit 8 blocks)" and "Deterministic read ZEROs after TRIM" in their ATA options.
The Samsung 850 PROs don't have "Deterministic read ZEROs after TRIM" support, and thus TRIM cannot be run on these drives when attached to LSI SAS HBAs with IT firmware.

You can also use sg_unmap to send a SCSI UNMAP command. sg_unmap is part of the sg3_utils package, which can be downloaded from here: http://sg.danny.cz/sg/sg3_utils.html

Usage: sg_unmap [--grpnum=GN] [--help] [--in=FILE] [--lba=LBA,LBA...]
                [--num=NUM,NUM...] [--timeout=TO] [--verbose] [--version] DEVICE

Send a SCSI UNMAP command to DEVICE to unmap one or more logical blocks. The unmap capability is closely related to the ATA DATA SET MANAGEMENT command with the "Trim" bit set. See the following for a more detailed description: http://manpages.ubuntu.com/manpages/lucid/man8/sg_unmap.8.html

Example:

In this example there is a SATA SSD at sdc. To tell the capacity of the SATA SSD:

sg_readcap /dev/sdc

Read Capacity results:

Last logical block address=117231407 (0x6fccf2f), Number of block=117231408

Logical block length=512 bytes

Hence:

Device size: 60022480896 bytes, 57241.9 MiB, 60.02 GB

Then run the sg_unmap command:

sg_unmap --lba=0 --num=117231407 /dev/sdc

or

sg_unmap --lba=0 --num=117231408 /dev/sdc
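For what it's worth, sg3_utils installs straight from the distro repos, and before issuing a whole-device UNMAP like LSI's example I would first check the Block Limits VPD page to see what maximum unmap LBA count the HBA's SCSI-ATA translation actually advertises (sdc here just follows LSI's example device, not one of my drives):

zypper install sg3_utils
sg_vpd -p bl /dev/sdc    # look for 'Maximum unmap LBA count' and 'Maximum unmap block descriptor count'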

So, LSI/Avago/Broadcom is telling us their SAS HBAs don't fully support TRIM unless the target drive supports BOTH

   *    Data Set Management TRIM supported (limit 8 blocks)
   *    Deterministic read ZEROs after TRIM

and when I ask hdparm for a detailed listing of one of our Samsung 860 Pro SSDs, it specifically says they DO support both of those features:

hdparm -I /dev/sdf1:

ATA device, with non-removable media
Model Number: Samsung SSD 860 PRO 512GB
Serial Number: S5HTNS0NA02561F
Firmware Revision: RVM02B6Q
Transport: Serial, ATA8-AST, SATA 1.0a, SATA II Extensions, SATA Rev 2.5, SATA Rev 2.6, SATA Rev 3.0
Standards:
Used: unknown (minor revision code 0x005e)
Supported: 11 8 7 6 5
Likely used: 11
Configuration:
Logical max current
cylinders 16383 16383
heads 16 16
sectors/track 63 63

CHS current addressable sectors: 16514064
LBA user addressable sectors: 268435455
LBA48 user addressable sectors: 1000215216
Logical Sector size: 512 bytes
Physical Sector size: 512 bytes
Logical Sector-0 offset: 0 bytes
device size with M = 1024*1024: 488386 MBytes
device size with M = 1000*1000: 512110 MBytes (512 GB)
cache/buffer size = unknown
Form Factor: 2.5 inch
Nominal Media Rotation Rate: Solid State Device
Capabilities:
LBA, IORDY(can be disabled)
Queue depth: 32
Standby timer values: spec’d by Standard, no device specific minimum
R/W multiple sector transfer: Max = 1 Current = 1
DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 udma6
Cycle time: min=120ns recommended=120ns
PIO: pio0 pio1 pio2 pio3 pio4
Cycle time: no flow control=120ns IORDY flow control=120ns
Commands/features:
Enabled Supported:
* SMART feature set
Security Mode feature set
* Power Management feature set
* Write cache
* Look-ahead
* Host Protected Area feature set
* WRITE_BUFFER command
* READ_BUFFER command
* NOP cmd
* DOWNLOAD_MICROCODE
SET_MAX security extension
* 48-bit Address feature set
* Device Configuration Overlay feature set
* Mandatory FLUSH_CACHE
* FLUSH_CACHE_EXT
* SMART error logging
* SMART self-test
* General Purpose Logging feature set
* WRITE_{DMA|MULTIPLE}_FUA_EXT
* 64-bit World wide name
Write-Read-Verify feature set
* WRITE_UNCORRECTABLE_EXT command
* {READ,WRITE}_DMA_EXT_GPL commands
* Segmented DOWNLOAD_MICROCODE
* Gen1 signaling speed (1.5Gb/s)
* Gen2 signaling speed (3.0Gb/s)
* Gen3 signaling speed (6.0Gb/s)
* Native Command Queueing (NCQ)
* Phy event counters
* READ_LOG_DMA_EXT equivalent to READ_LOG_EXT
* DMA Setup Auto-Activate optimization
Device-initiated interface power management
* Asynchronous notification (eg. media change)
* Software settings preservation
Device Sleep (DEVSLP)
* SMART Command Transport (SCT) feature set
* SCT Write Same (AC2)
* SCT Error Recovery Control (AC3)
* SCT Features Control (AC4)
* SCT Data Tables (AC5)
* reserved 69[4]
* DOWNLOAD MICROCODE DMA command
* SET MAX SETPASSWORD/UNLOCK DMA commands
* WRITE BUFFER DMA command
* READ BUFFER DMA command
* Data Set Management TRIM supported (limit 8 blocks)
* Deterministic read ZEROs after TRIM
Security:
Master password revision code = 65534
supported
not enabled
not locked
not frozen
not expired: security count
supported: enhanced erase
2min for SECURITY ERASE UNIT. 2min for ENHANCED SECURITY ERASE UNIT.
Logical Unit WWN Device Identifier: 5002538e30a290ec
NAA : 5
IEEE OUI : 002538
Unique ID : e30a290ec
Device Sleep:
DEVSLP Exit Timeout (DETO): 50 ms (drive)
Minimum DEVSLP Assertion Time (MDAT): 30 ms (drive)
Checksum: correct

So, something is not as it appears… I have asked LSI to clarify this information, as according to both hdparm and LSI's own stated requirements, we should NOT be having any TRIM (fstrim) related issues…
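In the meantime, here is how I have been cross-checking what the kernel actually sees for discard on a drive behind the HBA (sdf is the drive from the dmesg output above; the sg_vpd page shows what the HBA's SCSI-ATA translation reports):

lsblk -D /dev/sdf                            # DISC-GRAN / DISC-MAX as seen by the block layer
cat /sys/block/sdf/queue/discard_max_bytes   # non-zero means the kernel will issue discards
sg_vpd -p lbpv /dev/sdf                      # Logical Block Provisioning page - UNMAP (LBPU) support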

So you're playing a "Guess what?" game, holding back needed info about the used drives… Not good… >:)

Samsung SSDs have a long history of errors:

  1. With TRIM https://www.algolia.com/blog/engineering/when-solid-state-drives-are-not-that-solid/
    https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/ata/libata-core.c?id=9a9324d3969678d44b330e1230ad2c8ae67acf81
    Now: https://github.com/torvalds/linux/blob/master/drivers/ata/libata-core.c
    Check your libata-core.c.

  2. With NCQ - 860 SATA series.
    I had problems with Samsung 860 Evo + NCQ + some chipsets.

Check what you’re using, and what is working, with different controllers.
IMHO Samsung products poison systems.

Why so angry?

These are BRAND NEW drives. These are not "used". …and I'm NOT playing any "games" or "holding anything back". I clearly identified that we were using 860 Pros in an earlier post. (But thanks for that anyway…)

Yes, I am aware of the blacklisting of Samsung SSDs, but the code I find (posted on the net) is not always consistent. Sometimes it basically says "Samsung*", sometimes "Samsung 8*", and sometimes it specifically calls out the 840s and 850s. (I'm not sure which variant is being used in my specific Leap 15.2 kernel.)

…and FSTRIM is absolutely "trying" to work on these drives (despite any blacklisting). So my thought is that this particular model (860 Pro) may not be blacklisted in my specific kernel.
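(This is how I would check which blacklist variant my kernel actually carries - it assumes the kernel-source package is installed; note that, as I understand it, the libata list only applies to drives on an AHCI/SATA port, while a drive behind the 9305 goes through the HBA's own SCSI-ATA translation, so that list wouldn't even come into play here:)

zypper install kernel-source    # if not already present
grep -n -B1 -A2 '"Samsung SSD' /usr/src/linux/drivers/ata/libata-core.c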

Regardless, it seems that using these drives on my specific HBA (LSI 9305) is problematic and FSTRIM appears to be the initiator.

Furthermore, I just got another email from LSI last night. LSI Support has now confirmed that they do not support TRIM for ANY SATA SSD connected to ANY of their SAS/SATA HBAs. (They said they DO support the SAS variant of the command - for SAS SSDs - but fully implementing the SATA variant would essentially be too problematic/difficult/not-worth-it, etc.)

They specifically recommend connecting SATA SSDs directly to a dedicated SATA controller (not a SAS controller that also supports SATA). Good luck finding one with anything more than a basic performance rating. Maybe that's what I need to do in any case…

I have a couple of Areca ARC-1330-8i HBAs arriving next week for testing. I need support for 16 drives in total, so 2 HBAs will be required. We have a very good working relationship with Areca, and they tell me they don't expect any problems. I also have a couple of WD Red SSDs on order to throw into the mix.

I'd like to report back what I find, but I'm not sure it's worth doing if I'm going to get accused of playing games and holding stuff back. I was the major contributor to my own thread to help make sure all the information I was receiving was available to anyone following along, but it appears that was not appreciated (at least by some). :frowning: It appears some know better and don't appreciate folks asking for help, so good for you!

You have to go on and keep contributing despite some criticism on the forum.
You are playing games with hardware manufacturers, which are eager to sell you pro-grade goods at overinflated prices, while you are trying to use consumer-level stuff.
For consumer SSDs you may use consumer controllers: https://www.asmedia.com.tw/products-list/8a2YQ99xzaUH2qg5/58dYQ8bxZ4UR9wG5 - ASM1166/ASM1164/etc.
To use them you will need to add their IDs to the Linux AHCI driver.
IMHO SAS is better than SATA in your case.
IMHO you may use NVMe SSD drives.

No, you are patently wrong here! …and your condescending attitude is not helpful or constructive. (…and NO, I don't need SAS SSDs. All I want is to be able to use open-market SATA SSDs for the affordable performance they can provide.)

I provide the following info for other users who are genuinely interested in this issue and how it can be resolved.

  • LSI has confirmed that FSTRIM will not work with ANY of their SAS/SATA HBAs (RAID and non-RAID alike). (See the prior message in this thread.)

  • I acquired a couple of Areca ARC-1330-8i SAS/SATA HBAs for testing (for a total of 16 drives) and replaced the LSI card in my system.

  • After installing the Areca driver (ARCSAS), mounting the associated drives, and booting the respective virtual machines, I manually issued an FSTRIM operation (fstrim -a -v).

  • The manual FSTRIM operation succeeded without issue: no VMs crashed, no filesystems were remounted read-only, no filesystem corruption. It should be noted that the automatic FSTRIM service (timer) had been disabled over a week ago, so there was a full payload of trimming waiting to be applied. I have therefore re-enabled the automatic FSTRIM service (the exact commands are sketched just after this list).

  • It should be noted that this was done with the original Samsung 860 Pro drives in place. (I also have some WD SA500 Red SSDs standing by, but it appears I don't need them.)
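Roughly the sequence I used to verify on the Areca setup (the journalctl check at the end is just my own sanity check):

fstrim -a -v                          # manual trim of every mounted filesystem that supports discard
systemctl enable --now fstrim.timer   # put the weekly timer back in place
journalctl -u fstrim.service -b       # confirm no FITRIM ioctl errors this time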

TL;DR: Using an Areca SAS/SATA HBA instead of an LSI card solved this issue. The Samsung 860 Pro SSDs appear NOT to be a core problem here.

IOW: If you want to take advantage of the relatively high performance provided by SATA SSDs on a robust controller/chipset, try using an Areca SAS/SATA HBA…