bremse
September 15, 2024, 5:39am
1
I have the following problem: During a game (“Diablo IV” via Lutris) the system freezes completely at irregular intervals. This state lasts for about 10 - 20 seconds, after which the game continues to run without any problems.
At the time of the error I find the following in the log:
greenzack:~ # journalctl --system --since '2024-09-14 15:03' --until '2024-09-14 15:06'
Sep 14 15:04:01 greenzack kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, but soft recovered
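For anyone wanting to pull the same events without knowing the exact time window, here is a minimal sketch. It assumes the message format shown in the log line above; the helper name amdgpu_timeouts is made up for illustration, and reading the kernel journal may require root or membership in the systemd-journal group.

```shell
# Keep only amdgpu ring-timeout lines from kernel log text on stdin.
amdgpu_timeouts() {
    grep -E 'amdgpu_job_timedout.*ring [^ ]+ timeout'
}

# Kernel messages of the current boot; prints nothing if there were no timeouts.
journalctl -k -b --no-pager 2>/dev/null | amdgpu_timeouts || true
```

Replacing -b with -b -1 would look at the previous boot instead.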
Details about my system:
bremse@greenzack:~> inxi -bz
System:
Kernel: 6.10.9-1-default arch: x86_64 bits: 64
Desktop: KDE Plasma v: 6.1.5 Distro: openSUSE Tumbleweed 20240912
Machine:
Type: Desktop System: Micro-Star product: MS-7E16 v: 1.0
serial: <superuser required>
Mobo: Micro-Star model: X670E GAMING PLUS WIFI (MS-7E16) v: 1.0
serial: <superuser required> UEFI: American Megatrends LLC. v: 1.60
date: 06/19/2024
CPU:
Info: 12-core AMD Ryzen 9 7900X3D [MT MCP] speed (MHz): avg: 545
min/max: 545/5660
Graphics:
Device-1: Advanced Micro Devices [AMD/ATI] Navi 32 [Radeon RX 7700 XT /
7800 XT] driver: amdgpu v: kernel
Device-2: Advanced Micro Devices [AMD/ATI] Raphael driver: amdgpu
v: kernel
Display: x11 server: X.Org v: 21.1.12 with: Xwayland v: 24.1.2 driver: X:
loaded: modesetting unloaded: fbdev,vesa dri: radeonsi gpu: amdgpu
resolution: 3840x2160~60Hz
API: OpenGL v: 4.6 compat-v: 4.5 vendor: amd mesa v: 24.1.3 renderer: AMD
Radeon RX 7800 XT (radeonsi navi32 LLVM 18.1.8 DRM 3.57 6.10.9-1-default)
Network:
Device-1: Realtek RTL8125 2.5GbE driver: r8169
Device-2: MEDIATEK MT7922 802.11ax PCI Express Wireless Network Adapter
driver: mt7921e
Drives:
Local Storage: total: 1.82 TiB used: 145.48 GiB (7.8%)
Info:
Memory: total: 32 GiB note: est. available: 30.45 GiB used: 3.87 GiB (12.7%)
Processes: 472 Uptime: 0h 15m Shell: Bash inxi: 3.3.36
I have found several posts about this error here in the forum, but none of them seems to describe “my error” exactly.
Svyatko
September 16, 2024, 6:36pm
2
Disable the built-in GPU, reboot, then post the output of
inxi -aGMSz
I’d try updating Mesa and the kernel (which ships AMDGPU) to bleeding-edge versions, and DXVK to master, to see whether whatever is causing the crash/error has been improved in the meantime.
Alternatively, disable features via the RADV/VKD3D/etc. environment variables in case a specific instruction is causing the issue.
Beyond that, you’ll have to figure out what specifically is causing the crash, possibly bisect, try things to eliminate that specific crash, and have fun with that rabbit hole. Generally speaking, amdgpu_job_timedout could mean anything within AMDGPU’s stack, so good luck.
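As a rough sketch of the environment-variable approach, assuming the game runs through RADV and VKD3D-Proton: the flags below are documented RADV_DEBUG and VKD3D_CONFIG options, but which one helps here (if any) is a guess, so try them one at a time and retest. In Lutris they would go into the game's environment variable settings rather than a shell.

```shell
# Sketch only: enable one setting at a time, then retest.

# RADV: disable delta color compression.
export RADV_DEBUG=nodcc

# Other documented RADV_DEBUG flags to try instead:
# export RADV_DEBUG=nohiz       # disable HiZ depth optimization
# export RADV_DEBUG=zerovram    # zero-initialize VRAM allocations

# VKD3D-Proton: do not use async compute or transfer queues.
# export VKD3D_CONFIG=single_queue
```

If the timeouts stop with one flag set, that narrows down which part of the driver is involved and is worth mentioning in a Mesa bug report.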
bremse
September 17, 2024, 4:17pm
4
Built-in GPU disabled, rebooted. Output of inxi -aGMSz:
System:
Kernel: 6.10.9-1-default arch: x86_64 bits: 64 compiler: gcc v: 14.2.0
clocksource: tsc avail: hpet,acpi_pm
parameters: BOOT_IMAGE=/boot/vmlinuz-6.10.9-1-default
root=UUID=f3477739-a3b9-41f6-9305-1812d2b68c14 splash=silent
resume=/dev/disk/by-uuid/525a40ac-1a90-4f1b-8ba3-684767c2a182
mitigations=auto quiet security=apparmor
Desktop: KDE Plasma v: 6.1.5 tk: Qt v: N/A info: frameworks v: 6.6.0
wm: kwin_x11 with: krunner tools: avail: xscreensaver vt: 2 dm: SDDM
Distro: openSUSE Tumbleweed 20240916
Machine:
Type: Desktop System: Micro-Star product: MS-7E16 v: 1.0
serial: <superuser required>
Mobo: Micro-Star model: X670E GAMING PLUS WIFI (MS-7E16) v: 1.0
serial: <superuser required> uuid: <superuser required> UEFI: American
Megatrends LLC. v: 1.60 date: 06/19/2024
Graphics:
Device-1: Advanced Micro Devices [AMD/ATI] Navi 32 [Radeon RX 7700 XT /
7800 XT] vendor: Tul / PowerColor driver: amdgpu v: kernel arch: RDNA-3
code: Navi-3x process: TSMC n5 (5nm) built: 2022+ pcie: gen: 4
speed: 16 GT/s lanes: 16 ports: active: HDMI-A-1 empty: DP-1, DP-2, DP-3,
Writeback-1 bus-ID: 03:00.0 chip-ID: 1002:747e class-ID: 0300
Display: x11 server: X.Org v: 21.1.12 with: Xwayland v: 24.1.2
compositor: kwin_x11 driver: X: loaded: modesetting unloaded: fbdev,vesa
dri: radeonsi gpu: amdgpu display-ID: :0 screens: 1
Screen-1: 0 s-res: 3840x2160 s-dpi: 96 s-size: 1016x571mm (40.00x22.48")
s-diag: 1165mm (45.88")
Monitor-1: HDMI-A-1 mapped: HDMI-1 model: ASUS VP28U serial: <filter>
built: 2019 res: 3840x2160 hz: 60 dpi: 157 gamma: 1.2
size: 621x341mm (24.45x13.43") diag: 708mm (27.9") ratio: 16:9 modes:
max: 3840x2160 min: 640x350
API: EGL v: 1.5 hw: drv: amd radeonsi platforms: device: 0 drv: radeonsi
device: 1 drv: swrast gbm: drv: kms_swrast surfaceless: drv: radeonsi x11:
drv: radeonsi inactive: wayland
API: OpenGL v: 4.6 compat-v: 4.5 vendor: amd mesa v: 24.1.7 glx-v: 1.4
direct-render: yes renderer: AMD Radeon RX 7800 XT (radeonsi navi32 LLVM
18.1.8 DRM 3.57 6.10.9-1-default) device-ID: 1002:747e memory: 15.62 GiB
unified: no
API: Vulkan v: 1.3.290 layers: 8 device: 0 type: discrete-gpu name: AMD
Radeon RX 7800 XT (RADV NAVI32) driver: N/A device-ID: 1002:747e
surfaces: xcb,xlib
Do you think this has already fixed the error?
As written, the freeze occurs irregularly. I probably won’t be able to test it again until the weekend.
Svyatko
September 30, 2024, 4:59pm
5
Why modesetting? Why not amdgpu?
Something in config files?
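One way to look into this, as a sketch: X.Org normally falls back to modesetting when the amdgpu X driver is not installed or not selected. Assuming the usual config locations, and assuming the package is called xf86-video-amdgpu as on openSUSE, something like the following shows whether any config file names a driver explicitly. The helper name find_driver_lines is made up for illustration.

```shell
# List explicit Driver lines in X.Org config files or directories.
find_driver_lines() {
    grep -rsiE '^[[:space:]]*Driver[[:space:]]' "$@" \
        || echo "no explicit Driver line found"
}

find_driver_lines /etc/X11/xorg.conf /etc/X11/xorg.conf.d /usr/share/X11/xorg.conf.d

# Check whether the amdgpu X driver package is installed (rpm-based systems only).
if command -v rpm >/dev/null 2>&1; then
    rpm -q xf86-video-amdgpu || true
fi
```

If no Driver line turns up and the package is not installed, modesetting is simply the default and nothing is misconfigured.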
bremse
October 3, 2024, 4:37am
6
To be honest, I have no idea. I haven’t explicitly configured it that way and I wouldn’t know how to change modesetting to amdgpu.
But: I have now played for several hours under Wayland instead of X11, and the error has not occurred so far. Maybe it is an X11 peculiarity? Anyway, I can live with using Wayland instead of X11.
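For anyone comparing the two setups, a small sketch to confirm which session type is actually running, assuming the login manager sets XDG_SESSION_TYPE (SDDM normally does). The helper name session_type is made up for illustration.

```shell
# Print the session type of the current login, typically "x11" or "wayland".
session_type() {
    echo "${XDG_SESSION_TYPE:-unknown}"
}

session_type
```

Running inxi -Gz again in the Wayland session should also show the change in its Display line.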