[Tumbleweed] NVMe Timeouts & Stutters/Freezes with Ryzen 7 9700X / RX 9070 XT (Mesa 26)

Hey there,
this is my first post, so please correct me if I did something wrong :slight_smile:
I will first describe my problem and then ask for ideas about what the cause could be.

Problem Description: I am experiencing intermittent but severe system freezes (approx. 9-10 seconds), specifically when a game writes a save or when loading “complex” web pages. The system does not fully crash, but I/O hangs completely for several seconds before recovering.

This behavior started appearing recently, likely correlating with the shift to Kernel 6.18.9-1-default or to the Mesa 26 stack. Unfortunately, I am not sure exactly when it started. Sometimes I boot the PC and it just works, sometimes these freezes happen every 5 minutes. The system was very stable for months prior to these updates.

Investigation & Logs: dmesg -w reveals that the NVMe drive is timing out and aborting write requests. It seems the drive enters a power-saving state or fails to send an interrupt during heavy GPU bus load, leading to a “lost interrupt” scenario.

Here are the relevant logs capturing the freeze:

[ 4400.554247] [   T290] nvme nvme0: I/O tag 264 (2108) opcode 0x1 (I/O Cmd) QID 1 timeout, aborting req_op:WRITE(1) size:262144
[ 4400.554257] [   T290] nvme nvme0: I/O tag 265 (2109) opcode 0x1 (I/O Cmd) QID 1 timeout, aborting req_op:WRITE(1) size:262144
[ 4400.554260] [   T290] nvme nvme0: I/O tag 266 (e10a) opcode 0x1 (I/O Cmd) QID 1 timeout, aborting req_op:WRITE(1) size:262144
[ 4400.554261] [   T290] nvme nvme0: I/O tag 267 (d10b) opcode 0x1 (I/O Cmd) QID 1 timeout, aborting req_op:WRITE(1) size:262144
...
[ 4408.142488] [    C10] nvme nvme0: Abort status: 0x0
[ 4408.193624] [    C10] nvme nvme0: Abort status: 0x0
[ 4408.204505] [    C10] nvme nvme0: Abort status: 0x0
[ 4408.220136] [    C10] nvme nvme0: Abort status: 0x0

Also, sporadic timeouts occurred:

[  276.938426] [   T488] nvme nvme0: I/O tag 150 (8096) opcode 0x1 (I/O Cmd) QID 1 timeout, aborting req_op:WRITE(1) size:262144
...
[  290.834301] [    C10] nvme nvme0: Abort status: 0x0
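
To put a number on each freeze, the timeout and abort lines can be paired up. The following is a minimal Python sketch and not part of the original troubleshooting; `stall_durations` and its regular expression are hypothetical helpers that assume the dmesg format shown above (timestamp, task/CPU column, then the `nvme nvme0:` message).

```python
import re

# Matches "[ 4400.554247] [   T290] nvme nvme0: <message>" as seen in the logs above.
LINE_RE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+\[[^\]]*\]\s+nvme (?P<dev>nvme\d+): (?P<msg>.*)$"
)

SAMPLE = """\
[ 4400.554247] [   T290] nvme nvme0: I/O tag 264 (2108) opcode 0x1 (I/O Cmd) QID 1 timeout, aborting req_op:WRITE(1) size:262144
[ 4400.554257] [   T290] nvme nvme0: I/O tag 265 (2109) opcode 0x1 (I/O Cmd) QID 1 timeout, aborting req_op:WRITE(1) size:262144
[ 4408.142488] [    C10] nvme nvme0: Abort status: 0x0
[ 4408.220136] [    C10] nvme nvme0: Abort status: 0x0
[  276.938426] [   T488] nvme nvme0: I/O tag 150 (8096) opcode 0x1 (I/O Cmd) QID 1 timeout, aborting req_op:WRITE(1) size:262144
[  290.834301] [    C10] nvme nvme0: Abort status: 0x0
"""


def stall_durations(log_text):
    """Return (start, end, seconds) for each burst of timeouts.

    A burst starts at the first 'timeout, aborting' line and ends at the
    last 'Abort status' line seen before the next timeout line.
    """
    bursts = []
    start = None
    end = None
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skips "..." and unrelated kernel messages
        ts = float(match.group("ts"))
        msg = match.group("msg")
        if "timeout, aborting" in msg:
            if start is not None and end is not None:
                # A new timeout after aborts closes the previous burst.
                bursts.append((start, end, round(end - start, 3)))
                start, end = None, None
            if start is None:
                start = ts
        elif msg.startswith("Abort status") and start is not None:
            end = ts
    if start is not None and end is not None:
        bursts.append((start, end, round(end - start, 3)))
    return bursts


if __name__ == "__main__":
    for start, end, seconds in stall_durations(SAMPLE):
        print(f"stall from {start:.3f} to {end:.3f}: {seconds} s")
```

On the excerpts above this gives roughly 7.7 s and 13.9 s between the first timeout and the last abort. Note that the kernel only logs the timeout after its I/O timeout has already expired, so the real stall is longer than the logged gap.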

Troubleshooting Steps Taken:
GPU and CPU are in performance mode/governor.

I have verified that the SSD is physically installed in the top M.2 slot (CPU lanes, Bus 04:00.0) via lspci -tv, so it is not routed through the Promontory chipset.

I have applied the following fixes, which seem to mitigate the issue but require aggressive settings:

  1. Filesystem:
  • Disabled BTRFS quotas (btrfs quota disable /) and removed discard from /etc/fstab (replaced with periodic fstrim timer) to reduce controller load.
  2. Kernel Parameters:
  • nvme_core.default_ps_max_latency_us=0 (To disable APST)
  • pcie_aspm.policy=performance (To disable ASPM/L1 states)
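
To double-check that both kernel parameters are really in effect after a reboot, the boot command line can be compared with what the modules report. This is only a sketch under assumptions: the two sysfs paths are where current kernels usually expose these settings, and the helper names (`cmdline_params`, `active_aspm_policy`, `read_text`) are made up for illustration.

```python
from pathlib import Path


def cmdline_params(cmdline):
    """Split a kernel command line into {key: value}; bare flags map to None.

    Simplification: quoted values containing spaces are not handled.
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def active_aspm_policy(policy_text):
    """Return the bracketed entry, e.g. 'performance' from
    'default [performance] powersave powersupersave'."""
    for word in policy_text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return None


def read_text(path):
    """Read a small procfs/sysfs file; return None if missing or unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    params = cmdline_params(read_text("/proc/cmdline") or "")
    print("cmdline APST limit :", params.get("nvme_core.default_ps_max_latency_us"))
    print("cmdline ASPM policy:", params.get("pcie_aspm.policy"))

    # Assumed sysfs locations; they may be absent on some kernels or configs.
    apst = read_text("/sys/module/nvme_core/parameters/default_ps_max_latency_us")
    policy = read_text("/sys/module/pcie_aspm/parameters/policy")
    print("effective APST limit :", apst)
    print("effective ASPM policy:", active_aspm_policy(policy) if policy else None)
```

If the command-line values and the effective values disagree, the parameters were probably not written to the bootloader configuration that is actually booted.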

Idea: It appears there is a conflict involving the Kioxia/Phison controller, the new aggressive power management in Kernel 6.18, and potentially the IOMMU dropping interrupts during high bandwidth usage by the RDNA 4 GPU (Mesa 26).

Next steps:

  • Adding iommu=soft (or disabling IOMMU in BIOS under AMD CBS → NBIO) seems to be the only way to reliably stop the QID timeouts. However, I would rather not disable the IOMMU.

My question: does anybody else see these issues, or have another idea what the problem could be?

Specs:

  • OS: openSUSE Tumbleweed
  • Kernel: Linux 6.18.9-1-default
  • CPU: AMD Ryzen 7 9700X (Granite Ridge)
  • Motherboard: ASRock B850 Pro-A WiFi
  • GPU: AMD Radeon RX 9070 XT (Navi 48 / RDNA 4) using Mesa 26.0 (git/rc)
  • Storage: Kioxia Exceria Plus G3 NVMe (DRAM-less), installed in M2_1 (CPU-attached Slot).
  • Filesystem: BTRFS

Thanks for your help :slight_smile:

@jan_mrt Hi, just to be sure, the Mesa release in Tumbleweed is now 26.0.0-1.1 not an RC release. Is the system up to date with the latest snapshot 20260214?

Hi, my bad. I see it switched to Packman. I do not know when this happened. I will change back to the main repo. I am away from my PC for a week now, but it is 26.0.0 nevertheless.

Information for package Mesa:
-----------------------------
Repository     : Packman
Name           : Mesa
Version        : 26.0.0-1699.2.pm.2
Arch           : x86_64
Vendor         : http://packman.links2linux.de
Installed Size : 7,2 KiB
Installed      : Yes (automatically)
Status         : up-to-date
Source package : Mesa-26.0.0-1699.2.pm.2.src
Upstream URL   : https://www.mesa3d.org
Summary        : System for rendering 3-D graphics

@jan_mrt AFAIK as an AMD user you're stuck with Packman, since the patent-encumbered encoders/decoders are required?

Could try switching back, but not sure how that will go.

Ok. I guess it is probably not caused by the difference between the main repo and the Packman repo.
I also found this bug report from some time ago. It looks like a similar issue to the one I see with my Kioxia SSD.

https://bugzilla.kernel.org/show_bug.cgi?id=217871

I will keep you updated on whether I can solve it once I am back.
I was just hoping somebody had seen a similar issue :slight_smile:

@jan_mrt I’m not using VMD (RSTe); I’ve disabled it on my Dell systems that can use it…

I do have a couple of systems using NVMe drives with a Phison controller (Silicon Power devices), one running Tumbleweed, the other Aeon, and I don’t have any issues…

/sbin/lspci -nnk | grep NVMe
01:00.0 Non-Volatile memory controller [0108]: Phison Electronics Corporation E12 NVMe Controller [1987:5012] (rev 01)
	Subsystem: Phison Electronics Corporation E12 NVMe Controller [1987:5012]

@malcolmlewis thanks for the info. I will dig a little deeper after my vacation and hope to fix it. At the moment, this issue is very annoying -.-

@malcolmlewis : I set iommu=soft and I still have this issue.

Sometimes it works for hours, sometimes I get these timeouts every few seconds or minutes.
It always happens when I “load/open” something, for example when I open Dolphin or when a game autosaves. The PC then becomes unusable and freezes for several seconds.

Do you have any further ideas? I do not know what else to check by now. :confused:

Here are some further logs from a couple of minutes ago:

[16959.987144] [    T474] nvme nvme0: I/O tag 763 (72fb) opcode 0x1 (I/O Cmd) QID 1 timeout, aborting req_op:WRITE(1) size:262144
[16959.987155] [    T474] nvme nvme0: I/O tag 764 (02fc) opcode 0x1 (I/O Cmd) QID 1 timeout, aborting req_op:WRITE(1) size:262144
[16959.987158] [    T474] nvme nvme0: I/O tag 765 (b2fd) opcode 0x1 (I/O Cmd) QID 1 timeout, aborting req_op:WRITE(1) size:262144
[16959.987160] [    T474] nvme nvme0: I/O tag 766 (b2fe) opcode 0x1 (I/O Cmd) QID 1 timeout, aborting req_op:WRITE(1) size:262144
[16970.578121] [      C1] nvme nvme0: Abort status: 0x0
[16970.578764] [      C1] nvme nvme0: Abort status: 0x0
[16970.579408] [      C1] nvme nvme0: Abort status: 0x0
[16970.580049] [      C1] nvme nvme0: Abort status: 0x0
[17069.364890] [    T472] nvme nvme0: I/O tag 682 (62aa) opcode 0x1 (I/O Cmd) QID 6 timeout, aborting req_op:WRITE(1) size:262144
[17069.364900] [    T472] nvme nvme0: I/O tag 683 (42ab) opcode 0x1 (I/O Cmd) QID 6 timeout, aborting req_op:WRITE(1) size:262144
[17069.364903] [    T472] nvme nvme0: I/O tag 684 (32ac) opcode 0x1 (I/O Cmd) QID 6 timeout, aborting req_op:WRITE(1) size:262144
[17069.364904] [    T472] nvme nvme0: I/O tag 685 (22ad) opcode 0x1 (I/O Cmd) QID 6 timeout, aborting req_op:WRITE(1) size:262144
[17079.737551] [      C1] nvme nvme0: Abort status: 0x0
[17079.738188] [      C1] nvme nvme0: Abort status: 0x0
[17079.738829] [      C1] nvme nvme0: Abort status: 0x0
[17079.739472] [      C1] nvme nvme0: Abort status: 0x0
[17127.727204] [    T470] nvme nvme0: I/O tag 612 (7264) opcode 0x1 (I/O Cmd) QID 10 timeout, aborting req_op:WRITE(1) size:262144
[17127.727215] [    T470] nvme nvme0: I/O tag 613 (a265) opcode 0x1 (I/O Cmd) QID 10 timeout, aborting req_op:WRITE(1) size:262144
[17127.727217] [    T470] nvme nvme0: I/O tag 614 (f266) opcode 0x1 (I/O Cmd) QID 10 timeout, aborting req_op:WRITE(1) size:262144
[17127.727219] [    T470] nvme nvme0: I/O tag 615 (5267) opcode 0x1 (I/O Cmd) QID 10 timeout, aborting req_op:WRITE(1) size:262144
[17138.049620] [      C1] nvme nvme0: Abort status: 0x0
[17138.050264] [      C1] nvme nvme0: Abort status: 0x0
[17138.050909] [      C1] nvme nvme0: Abort status: 0x0
[17138.051552] [      C1] nvme nvme0: Abort status: 0x0
[17268.526478] [    T298] nvme nvme0: I/O tag 370 (7172) opcode 0x1 (I/O Cmd) QID 10 timeout, aborting req_op:WRITE(1) size:262144
[17268.526489] [    T298] nvme nvme0: I/O tag 371 (c173) opcode 0x1 (I/O Cmd) QID 10 timeout, aborting req_op:WRITE(1) size:262144
[17268.526492] [    T298] nvme nvme0: I/O tag 372 (3174) opcode 0x1 (I/O Cmd) QID 10 timeout, aborting req_op:WRITE(1) size:262144
[17268.526493] [    T298] nvme nvme0: I/O tag 373 (9175) opcode 0x1 (I/O Cmd) QID 10 timeout, aborting req_op:WRITE(1) size:262144
[17278.982015] [      C1] nvme nvme0: Abort status: 0x0
[17278.982651] [      C1] nvme nvme0: Abort status: 0x0
[17278.983297] [      C1] nvme nvme0: Abort status: 0x0
[17278.983937] [      C1] nvme nvme0: Abort status: 0x0
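
One thing that stands out in these logs is that the timeouts are spread over several queues (QID 1, 6, 10) and are all WRITE requests, mostly 262144 bytes in size. A small tally makes that easier to see across a long dmesg dump. This is an illustrative Python sketch; `summarize_timeouts` is a hypothetical helper that only assumes the message format shown above.

```python
import re
from collections import Counter

# Picks the queue ID, operation and size out of a timeout line such as:
# "... QID 6 timeout, aborting req_op:WRITE(1) size:262144"
TIMEOUT_RE = re.compile(
    r"QID (\d+) timeout, aborting req_op:(\w+)\(\d+\) size:(\d+)"
)

SAMPLE = """\
[16959.987144] [    T474] nvme nvme0: I/O tag 763 (72fb) opcode 0x1 (I/O Cmd) QID 1 timeout, aborting req_op:WRITE(1) size:262144
[16970.578121] [      C1] nvme nvme0: Abort status: 0x0
[17069.364890] [    T472] nvme nvme0: I/O tag 682 (62aa) opcode 0x1 (I/O Cmd) QID 6 timeout, aborting req_op:WRITE(1) size:262144
[17127.727204] [    T470] nvme nvme0: I/O tag 612 (7264) opcode 0x1 (I/O Cmd) QID 10 timeout, aborting req_op:WRITE(1) size:262144
"""


def summarize_timeouts(log_text):
    """Count timed-out requests per queue ID, per operation and per size."""
    by_queue = Counter()
    by_op = Counter()
    by_size = Counter()
    for line in log_text.splitlines():
        match = TIMEOUT_RE.search(line)
        if not match:
            continue  # abort-status lines and other messages are ignored
        qid, op, size = match.groups()
        by_queue[int(qid)] += 1
        by_op[op] += 1
        by_size[int(size)] += 1
    return by_queue, by_op, by_size


if __name__ == "__main__":
    queues, ops, sizes = summarize_timeouts(SAMPLE)
    print("per queue:", dict(queues))
    print("per op   :", dict(ops))
    print("per size :", dict(sizes))
```

If the counts are spread across many queues, as they appear to be here, a problem tied to one CPU core or one interrupt vector becomes less likely, though that is only a hint and not proof.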

Is the machine fully up to date? I.e.

knurpht@Lenovo-P16:~> cat /etc/os-release | grep -i version_id
VERSION_ID="20260220"
knurpht@Lenovo-P16:~> 

Yes it is :slight_smile:

...@localhost: cat /etc/os-release | grep -i version_id
VERSION_ID="20260220"

Is Mesa the stock one or the one from Packman?
zypper se -si Mesa

Thanks for your help.
It is from the Packman repo:

i  | Mesa                            | package | 26.0.0-1699.2.pm.2 | x86_64 | Packman
i  | Mesa-32bit                      | package | 26.0.0-1699.2.pm.2 | x86_64 | Packman
i  | Mesa-demo                       | package | 9.0.0-7.1          | x86_64 | Main Repository (OSS)
i  | Mesa-demo-egl                   | package | 9.0.0-7.1          | x86_64 | Main Repository (OSS)
i  | Mesa-demo-es                    | package | 9.0.0-7.1          | x86_64 | Main Repository (OSS)
i  | Mesa-demo-x                     | package | 9.0.0-7.1          | x86_64 | Main Repository (OSS)
i  | Mesa-dri                        | package | 26.0.0-1699.2.pm.4 | x86_64 | Packman
i+ | Mesa-dri-32bit                  | package | 26.0.0-1699.2.pm.4 | x86_64 | Packman
i  | Mesa-libEGL1                    | package | 26.0.0-1699.2.pm.2 | x86_64 | Packman
i  | Mesa-libEGL1-32bit              | package | 26.0.0-1699.2.pm.2 | x86_64 | Packman
i  | Mesa-libGL1                     | package | 26.0.0-1699.2.pm.2 | x86_64 | Packman
i  | Mesa-libGL1-32bit               | package | 26.0.0-1699.2.pm.2 | x86_64 | Packman
i  | Mesa-libva                      | package | 26.0.0-1699.2.pm.4 | x86_64 | Packman
i  | Mesa-vulkan-anti-lag-32bit      | package | 26.0.0-1699.2.pm.4 | x86_64 | Packman
i  | Mesa-vulkan-device-select       | package | 26.0.0-1699.2.pm.4 | x86_64 | Packman
i  | Mesa-vulkan-device-select-32bit | package | 26.0.0-1699.2.pm.4 | x86_64 | Packman

To be honest, by now I am no longer sure that it has anything to do with Mesa 26.
BR

Looks OK. Just to make sure, does this also occur for a freshly created user?

Will try that. I have not had time yet. If I find a solution, I will post it here.

For your information: I still have no clue what the issue is.

A full reinstallation is something I do not want to do.
Sometimes it works for hours, then it starts spamming the “nvme timeout” messages every few minutes and I basically cannot use the computer anymore.

This was a couple of minutes ago, after working flawlessly for ~3 hours:

[10251.817039] [     C14] hrtimer: interrupt took 1623 ns
[10494.693544] [    T246] nvme nvme0: I/O tag 244 (00f4) opcode 0x1 (I/O Cmd) QID 2 timeout, aborting req_op:WRITE(1) size:262144
[10494.693557] [    T246] nvme nvme0: I/O tag 245 (f0f5) opcode 0x1 (I/O Cmd) QID 2 timeout, aborting req_op:WRITE(1) size:262144
[10494.693561] [    T246] nvme nvme0: I/O tag 246 (30f6) opcode 0x1 (I/O Cmd) QID 2 timeout, aborting req_op:WRITE(1) size:262144
[10494.693563] [    T246] nvme nvme0: I/O tag 247 (a0f7) opcode 0x1 (I/O Cmd) QID 2 timeout, aborting req_op:WRITE(1) size:262144
[10512.425170] [     C10] nvme nvme0: Abort status: 0x0
[10512.425832] [     C10] nvme nvme0: Abort status: 0x0
[10512.426496] [     C10] nvme nvme0: Abort status: 0x0
[10512.427163] [     C10] nvme nvme0: Abort status: 0x0
[10655.077521] [    T471] nvme nvme0: I/O tag 7 (4007) opcode 0x1 (I/O Cmd) QID 2 timeout, aborting req_op:WRITE(1) size:262144
[10655.077547] [    T471] nvme nvme0: I/O tag 8 (a008) opcode 0x1 (I/O Cmd) QID 2 timeout, aborting req_op:WRITE(1) size:262144
[10655.077551] [    T471] nvme nvme0: I/O tag 9 (f009) opcode 0x1 (I/O Cmd) QID 2 timeout, aborting req_op:WRITE(1) size:262144
[10655.077554] [    T471] nvme nvme0: I/O tag 10 (b00a) opcode 0x1 (I/O Cmd) QID 2 timeout, aborting req_op:WRITE(1) size:262144
[10671.929829] [     C10] nvme nvme0: Abort status: 0x0
[10671.930488] [     C10] nvme nvme0: Abort status: 0x0
[10671.931152] [     C10] nvme nvme0: Abort status: 0x0
[10671.931833] [     C10] nvme nvme0: Abort status: 0x0
[10714.984137] [    T246] nvme nvme0: I/O tag 499 (51f3) opcode 0x1 (I/O Cmd) QID 10 timeout, aborting req_op:WRITE(1) size:262144
[10714.984149] [    T246] nvme nvme0: I/O tag 500 (71f4) opcode 0x1 (I/O Cmd) QID 10 timeout, aborting req_op:WRITE(1) size:262144
[10714.984152] [    T246] nvme nvme0: I/O tag 501 (01f5) opcode 0x1 (I/O Cmd) QID 10 timeout, aborting req_op:WRITE(1) size:262144
[10714.984154] [    T246] nvme nvme0: I/O tag 502 (a1f6) opcode 0x1 (I/O Cmd) QID 10 timeout, aborting req_op:WRITE(1) size:49152
[10725.889333] [     C10] nvme nvme0: Abort status: 0x0
[10725.889989] [     C10] nvme nvme0: Abort status: 0x0
[10725.890655] [     C10] nvme nvme0: Abort status: 0x0
[10725.891321] [     C10] nvme nvme0: Abort status: 0x0
[10805.095536] [    T152] nvme nvme0: I/O tag 354 (c162) opcode 0x1 (I/O Cmd) QID 9 timeout, aborting req_op:WRITE(1) size:208896
[10805.095548] [    T152] nvme nvme0: I/O tag 355 (a163) opcode 0x1 (I/O Cmd) QID 9 timeout, aborting req_op:WRITE(1) size:262144
[10805.095550] [    T152] nvme nvme0: I/O tag 356 (d164) opcode 0x1 (I/O Cmd) QID 9 timeout, aborting req_op:WRITE(1) size:262144
[10805.095552] [    T152] nvme nvme0: I/O tag 357 (2165) opcode 0x1 (I/O Cmd) QID 9 timeout, aborting req_op:WRITE(1) size:262144
[10821.938523] [     C10] nvme nvme0: Abort status: 0x0
[10821.939183] [     C10] nvme nvme0: Abort status: 0x0
[10821.939848] [     C10] nvme nvme0: Abort status: 0x0
[10821.940513] [     C10] nvme nvme0: Abort status: 0x0
[10865.508148] [    T310] nvme nvme0: I/O tag 322 (6142) opcode 0x1 (I/O Cmd) QID 1 timeout, aborting req_op:WRITE(1) size:262144
[10865.508161] [    T310] nvme nvme0: I/O tag 323 (b143) opcode 0x1 (I/O Cmd) QID 1 timeout, aborting req_op:WRITE(1) size:262144
[10865.508163] [    T310] nvme nvme0: I/O tag 324 (3144) opcode 0x1 (I/O Cmd) QID 1 timeout, aborting req_op:WRITE(1) size:262144
[10865.508165] [    T310] nvme nvme0: I/O tag 325 (9145) opcode 0x1 (I/O Cmd) QID 1 timeout, aborting req_op:WRITE(1) size:262144
[10881.655799] [     C10] nvme nvme0: Abort status: 0x0
[10881.656457] [     C10] nvme nvme0: Abort status: 0x0
[10881.657111] [     C10] nvme nvme0: Abort status: 0x0
[10881.657780] [     C10] nvme nvme0: Abort status: 0x0
[10939.750670] [    T253] nvme nvme0: I/O tag 157 (609d) opcode 0x1 (I/O Cmd) QID 5 timeout, aborting req_op:WRITE(1) size:262144
[10939.750680] [    T253] nvme nvme0: I/O tag 448 (d1c0) opcode 0x1 (I/O Cmd) QID 5 timeout, aborting req_op:WRITE(1) size:262144
[10939.750682] [    T253] nvme nvme0: I/O tag 449 (21c1) opcode 0x1 (I/O Cmd) QID 5 timeout, aborting req_op:WRITE(1) size:262144
[10939.750684] [    T253] nvme nvme0: I/O tag 450 (61c2) opcode 0x1 (I/O Cmd) QID 5 timeout, aborting req_op:WRITE(1) size:262144
[10956.157744] [     C10] nvme nvme0: Abort status: 0x0
[10956.158400] [     C10] nvme nvme0: Abort status: 0x0
[10956.159073] [     C10] nvme nvme0: Abort status: 0x0
[10956.159735] [     C10] nvme nvme0: Abort status: 0x0
[11007.398252] [    T152] nvme nvme0: I/O tag 266 (310a) opcode 0x1 (I/O Cmd) QID 2 timeout, aborting req_op:WRITE(1) size:262144
[11007.398270] [    T152] nvme nvme0: I/O tag 267 (610b) opcode 0x1 (I/O Cmd) QID 2 timeout, aborting req_op:WRITE(1) size:262144
[11007.398274] [    T152] nvme nvme0: I/O tag 268 (910c) opcode 0x1 (I/O Cmd) QID 2 timeout, aborting req_op:WRITE(1) size:61440
[11007.398278] [    T152] nvme nvme0: I/O tag 269 (910d) opcode 0x1 (I/O Cmd) QID 2 timeout, aborting req_op:WRITE(1) size:262144
[11024.083949] [     C10] nvme nvme0: Abort status: 0x0
[11024.084608] [     C10] nvme nvme0: Abort status: 0x0
[11024.085278] [     C10] nvme nvme0: Abort status: 0x0
[11024.085944] [     C10] nvme nvme0: Abort status: 0x0
[11074.405796] [    T252] nvme nvme0: I/O tag 450 (b1c2) opcode 0x1 (I/O Cmd) QID 1 timeout, aborting req_op:WRITE(1) size:262144
[11074.405808] [    T252] nvme nvme0: I/O tag 451 (11c3) opcode 0x1 (I/O Cmd) QID 1 timeout, aborting req_op:WRITE(1) size:262144
[11074.405810] [    T252] nvme nvme0: I/O tag 452 (41c4) opcode 0x1 (I/O Cmd) QID 1 timeout, aborting req_op:WRITE(1) size:53248
[11074.405812] [    T252] nvme nvme0: I/O tag 453 (c1c5) opcode 0x1 (I/O Cmd) QID 1 timeout, aborting req_op:WRITE(1) size:262144
[11091.590269] [     C10] nvme nvme0: Abort status: 0x0
[11091.590931] [     C10] nvme nvme0: Abort status: 0x0
[11091.591593] [     C10] nvme nvme0: Abort status: 0x0
[11091.592250] [     C10] nvme nvme0: Abort status: 0x0

@jan_mrt what does the output from smartctl -x /dev/nvme0 show? Is the device overheating? sensors nvme-pci-*

Output of 'smartctl -x /dev/nvme0':

=== START OF INFORMATION SECTION ===
Model Number:                       KIOXIA-EXCERIA PLUS G3 SSD
Serial Number:                      5FRKF0ZYZ0EA
Firmware Version:                   ELFA01.2
PCI Vendor/Subsystem ID:            0x1e0f
IEEE OUI Identifier:                0x8ce38e
Controller ID:                      0
NVMe Version:                       1.4
Number of Namespaces:               1
Namespace 1 Size/Capacity:          2.000.398.934.016 [2,00 TB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            8ce38e 050199e91b
Local Time is:                      Sat Mar 21 18:17:59 2026 CET
Firmware Updates (0x12):            1 Slot, no Reset required
Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
Optional NVM Commands (0x005f):     Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp
Log Page Attributes (0x0e):         Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg
Maximum Data Transfer Size:         64 Pages
Warning  Comp. Temp. Threshold:     83 Celsius
Critical Comp. Temp. Threshold:     85 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     5.50W       -        -    0  0  0  0        0       0
 1 +     3.60W       -        -    1  1  1  1        0       0
 2 +     2.10W       -        -    2  2  2  2        0       0
 3 -   0.0500W       -        -    3  3  3  3     1500    2500
 4 -   0.0050W       -        -    4  4  4  4     5000   30000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         2
 1 -    4096       0         1

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02, NSID 0xffffffff)
Critical Warning:                   0x00
Temperature:                        38 Celsius
Available Spare:                    100%
Available Spare Threshold:          5%
Percentage Used:                    2%
Data Units Read:                    24.804.823 [12,7 TB]
Data Units Written:                 49.875.820 [25,5 TB]
Host Read Commands:                 136.077.052
Host Write Commands:                318.838.464
Controller Busy Time:               542
Power Cycles:                       164
Power On Hours:                     601
Unsafe Shutdowns:                   17
Media and Data Integrity Errors:    0
Error Information Log Entries:      0
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0
Temperature Sensor 1:               38 Celsius

Error Information (NVMe Log 0x01, 16 of 255 entries)
No Errors Logged

Self-test Log (NVMe Log 0x06, NSID 0xffffffff)
Self-test status: No self-test in progress
Num  Test_Description  Status                       Power_on_Hours  Failing_LBA  NSID Seg SCT Code
 0   Short             Completed without error                 598            -     -   -   -    -
 1   Short             Completed without error                 597            -     -   -   -    -
 2   Extended          Completed without error                 588            -     -   -   -    -
 3   Short             Completed without error                 581            -     -   -   -    -
 4   Short             Completed without error                 576            -     -   -   -    -
 5   Short             Completed without error                 563            -     -   -   -    -
 6   Short             Completed without error                 558            -     -   -   -    -
 7   Short             Completed without error                 545            -     -   -   -    -
 8   Short             Completed without error                 539            -     -   -   -    -
 9   Short             Completed without error                 534            -     -   -   -    -
10   Short             Completed without error                 523            -     -   -   -    -
11   Short             Completed without error                 512            -     -   -   -    -
12   Short             Completed without error                 505            -     -   -   -    -
13   Short             Completed without error                 504            -     -   -   -    -
14   Short             Completed without error                 492            -     -   -   -    -
15   Short             Completed without error                 484            -     -   -   -    -
16   Extended          Completed without error                 482            -     -   -   -    -
17   Short             Completed without error                 477            -     -   -   -    -
18   Short             Completed without error                 473            -     -   -   -    -
19   Short             Completed without error                 459            -     -   -   -    -
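
Regarding the power state table in the smartctl output above: the sketch below is a simplified model of which states APST could use for a given `nvme_core.default_ps_max_latency_us`. It rests on assumptions that should be checked against the kernel source: only non-operational states (marked `-`) are candidates, a state is skipped when its exit latency exceeds the limit, a limit of 0 disables APST, and the kernel default is assumed to be 100000 µs. `PowerState` and `apst_candidates` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PowerState:
    index: int
    operational: bool       # '+' in the smartctl "Op" column
    entry_latency_us: int   # Ent_Lat
    exit_latency_us: int    # Ex_Lat


# Values copied from the "Supported Power States" table above.
STATES = [
    PowerState(0, True, 0, 0),
    PowerState(1, True, 0, 0),
    PowerState(2, True, 0, 0),
    PowerState(3, False, 1500, 2500),
    PowerState(4, False, 5000, 30000),
]


def apst_candidates(states, max_latency_us):
    """Return indices of states this simplified model would let APST enter."""
    if max_latency_us == 0:
        return []  # assumption: 0 switches APST off entirely
    return [
        s.index
        for s in states
        if not s.operational and s.exit_latency_us <= max_latency_us
    ]


if __name__ == "__main__":
    for limit in (100000, 5000, 0):
        print(f"limit {limit:>6} us -> candidate states {apst_candidates(STATES, limit)}")
```

Under these assumptions the default limit would allow both PS3 and PS4, including the 30 ms exit latency of PS4, while a limit of 0 allows neither.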

Output of sensors:

 sensors        
mt7921_phy0-pci-0b00
Adapter: PCI adapter
temp1:        +28.0°C  

spd5118-i2c-20-51
Adapter: SMBus PIIX4 adapter port 0 at 0b00
temp1:        +31.2°C  (low  =  +0.0°C, high = +55.0°C)
                       (crit low =  +0.0°C, crit = +85.0°C)

amdgpu-pci-0f00
Adapter: PCI adapter
vddgfx:        1.36 V  
vddnb:         1.19 V  
edge:         +38.0°C  
PPT:          12.00 mW 
sclk:         600 MHz 

nvme-pci-0400
Adapter: PCI adapter
Composite:    +36.9°C  (low  =  -0.1°C, high = +82.8°C)
                       (crit = +84.8°C)
Sensor 1:     +36.9°C  (low  = -273.1°C, high = +65261.8°C)

hidpp_battery_0-hid-3-6
Adapter: HID adapter
in0:           3.81 V  

spd5118-i2c-20-53
Adapter: SMBus PIIX4 adapter port 0 at 0b00
temp1:        +28.0°C  (low  =  +0.0°C, high = +55.0°C)
                       (crit low =  +0.0°C, crit = +85.0°C)

k10temp-pci-00c3
Adapter: PCI adapter
Tctl:         +58.0°C  
Tccd1:        +42.6°C  

amdgpu-pci-0300
Adapter: PCI adapter
vddgfx:      319.00 mV 
fan1:           0 RPM  (min =    0 RPM, max = 3200 RPM)
edge:         +32.0°C  (crit = +110.0°C, hyst = -273.1°C)
                       (emerg = +115.0°C)
junction:     +32.0°C  (crit = +110.0°C, hyst = -273.1°C)
                       (emerg = +115.0°C)
mem:          +32.0°C  (crit = +108.0°C, hyst = -273.1°C)
                       (emerg = +113.0°C)
PPT:          26.00 W  (cap = 304.00 W)
pwm1:              0%
sclk:          48 MHz 
mclk:         772 MHz 

I guess no overheating :smiley:

That is a lot of writing to that NVMe in not many hours… is that what you expected? Or are the system logs filling up with errors?

To be honest, I have no clue. It does seem like a lot, you are right.

Maybe it is due to agentic coding in combination with BTRFS? I will have to look into it.

@jan_mrt I think you need to investigate:

25.5 TB ≈ 25,500 GB (smartctl reports decimal units); 25,500 GB / 601 h ≈ 42.4 GB per hour…

That’s a lot of data…
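
The same estimate can be redone from the raw SMART counters instead of the rounded TB figure. This is a minimal Python sketch; `average_write_rate` is a hypothetical helper, and it relies on the NVMe convention that one "data unit" is 1000 blocks of 512 bytes. Power-on hours include idle time, so this is an average and says nothing about bursts.

```python
# One NVMe SMART "data unit" is 1000 x 512 bytes.
BYTES_PER_DATA_UNIT = 512 * 1000


def average_write_rate(data_units_written, power_on_hours):
    """Return (total TB, GB per hour, MB per second) in decimal units."""
    total_bytes = data_units_written * BYTES_PER_DATA_UNIT
    total_tb = total_bytes / 1e12
    gb_per_hour = total_bytes / 1e9 / power_on_hours
    mb_per_second = total_bytes / 1e6 / (power_on_hours * 3600)
    return total_tb, gb_per_hour, mb_per_second


if __name__ == "__main__":
    # "Data Units Written" and "Power On Hours" from the smartctl output above.
    tb, gb_h, mb_s = average_write_rate(49_875_820, 601)
    print(f"total written: {tb:.2f} TB")
    print(f"average rate : {gb_h:.1f} GB/h ({mb_s:.1f} MB/s)")
```

This lands in the same range as the estimate above, at a bit over 40 GB per hour, or roughly 12 MB/s sustained for every hour the drive has been powered on.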