I’ve read that these kinds of errors can also be caused by bad cables and bad power supplies. My system was working fine with all the same components and cables and an EVO 850 as its root drive. I’ve re-seated all the cables and swapped them around, with no change.
In my case the kernel seems to have managed to recover from the errors, and comparing against backups I’ve yet to discover any corruption. I’ve disabled NCQ and the errors have not recurred.
The problem occurs regardless of OS: it also exists with Leap and Windows.
Disabling NCQ cripples SSD performance.
Leaving NCQ enabled may lead to unrecoverable errors, and it also cripples performance, because the SATA link speed gets lowered and other fallbacks kick in.
The best way: use another SSD.
Other ways:
Use another controller. ASMedia ASM1061 with updated AHCI firmware is good enough.
Using an NVMe drive instead:
2. For UEFI boot: include an NVMe driver in the BIOS, or load it from some media before using the drive (some FAT16/FAT32 partition).
3. For legacy boot: use a separate “/boot” on SATA/IDE/USB media + the NVMe drive.
Thanks for the warning. From what I’ve googled it seems this problem was resolved ages ago, probably way before I bought mine. Anyway the fstrim timer has been running for months (years) with no issues on two desktops here.
I ran some jobs on my desktop, driving reads to about 525 MB/s at queue depth 31. I then tried queue depths 1 and 8, set like this:
echo 8 > /sys/block/sdb/device/queue_depth
A queue depth of 1 dropped the read rate to 455 MB/s. It’s a 13.3% loss, but not a crippling loss (the loss might be bigger for other kinds of workloads). In the name of stability I can live with that (for a desktop PC).
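For anyone who wants to check the arithmetic or plug in their own numbers, here is a small sketch. The function name `pct_loss` is just something I made up for illustration; it takes the baseline and the reduced read rate in MB/s.

```shell
#!/bin/sh
# pct_loss BASE NEW: percentage drop from BASE to NEW, one decimal place.
# (hypothetical helper name, not part of any tool)
pct_loss() {
    awk -v base="$1" -v new="$2" \
        'BEGIN { printf "%.1f\n", (base - new) / base * 100 }'
}

# Read rates from above: 525 MB/s at queue depth 31, 455 MB/s at depth 1.
pct_loss 525 455   # prints 13.3
```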
This seems to suggest that for some usage scenarios, particularly for server-level products, disabling NCQ will have a bigger impact.
I wonder if 8 might be more stable than 31 for the 860 EVO and AMD controllers, because it seems to perform as well as 31?
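If anyone wants to experiment with intermediate depths, here is a rough sketch of a wrapper around the same sysfs file used above. `set_queue_depth` and the `SYSFS_ROOT` variable are my own inventions (the latter only so the function can be tried against a scratch directory); the only real interface is `/sys/block/<dev>/device/queue_depth`. It has to run as root, and the setting does not survive a reboot.

```shell
#!/bin/sh
# SYSFS_ROOT is a hypothetical override for dry runs; it defaults to the real /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# set_queue_depth DEV DEPTH: write DEPTH to the device's queue_depth file
# and report the old and new values. (hypothetical helper name)
set_queue_depth() {
    dev="$1"
    depth="$2"
    path="$SYSFS_ROOT/block/$dev/device/queue_depth"
    if [ ! -w "$path" ]; then
        echo "cannot write $path (wrong device name, or not root?)" >&2
        return 1
    fi
    old=$(cat "$path")
    echo "$depth" > "$path" || return 1
    # Read the value back, since the kernel may clamp what was written.
    echo "$dev: queue_depth $old -> $(cat "$path")"
}
```

Something like `set_queue_depth sdb 8` followed by the same read job would then make it easy to compare 8 against 31 under identical conditions.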
Switching to another controller is a good suggestion. But Google reveals that different controllers and firmware variants may have different issues, so it will have to wait until I’m hungry for another 13% of throughput. I think for my desktop use, this might be good enough.