Teuniz
October 5, 2023, 8:03am
1
Hi everybody,
I’m experiencing instabilities and system crashes on my brand new workstation.
Output of dmesg:
amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:157 vmid:0 pasid:0, for process pid 0 thread pid 0)
amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000000006004000 from client 0x12 (VMC)
amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x0000073A
amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: DCEDMC (0x3)
amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x5
amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x3
amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x1
amdgpu 0000:03:00.0: amdgpu: RW: 0x0
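A side note for reading these dumps: the decoded lines under the raw `*_L2_PROTECTION_FAULT_STATUS` word appear to be plain bitfields of that word. Below is a minimal Python sketch of that decoding. The bit positions are an assumption, inferred only from the values dmesg prints in this thread (0x0000073A here, and 0x00201031 in a later dump); they are not taken from AMD documentation. The names `FIELDS` and `decode_fault_status` are made up for illustration.

```python
# Hypothetical decoder for the raw fault status word printed by amdgpu.
# ASSUMPTION: the (shift, width) pairs below were inferred from the decoded
# values shown next to the raw word in dmesg, not from official register docs.
FIELDS = {
    "MORE_FAULTS": (0, 1),
    "WALKER_ERROR": (1, 3),
    "PERMISSION_FAULTS": (4, 4),
    "MAPPING_ERROR": (8, 1),
    "CID": (9, 9),  # printed by dmesg as "Faulty UTCL2 client ID"
    "RW": (18, 1),
}


def decode_fault_status(status):
    """Split a raw status word into the named fields listed in FIELDS."""
    return {
        name: (status >> shift) & ((1 << width) - 1)
        for name, (shift, width) in FIELDS.items()
    }


if __name__ == "__main__":
    # Raw values copied from the dmesg output in this thread.
    for raw in (0x0000073A, 0x00201031):
        fields = decode_fault_status(raw)
        print(f"0x{raw:08X}: " + ", ".join(f"{k}=0x{v:x}" for k, v in fields.items()))
```

A status word of 0x00000000 decodes to all-zero fields, which matches the many follow-up fault blocks that report no walker, permission, or mapping error.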
Some system info:
Operating System: openSUSE Leap 15.5
KDE Plasma Version: 5.27.4
KDE Frameworks Version: 5.103.0
Qt Version: 5.15.8
Kernel Version: 5.14.21-150500.55.28-default (64-bit)
Graphics Platform: X11
Processors: 32 × 13th Gen Intel® Core™ i9-13900K
Memory: 31.0 GiB of RAM
Graphics Processor: AMD Radeon Pro W6600
Manufacturer: HP
Product Name: HP Z2 Tower G9 Workstation Desktop PC
I’m not using the closed-source AMD Radeon driver; I’m using the open-source driver that was installed by default.
Any ideas?
Thanks
@Teuniz:
Have you installed the latest kernel patch «openSUSE-SLE-15.5-2023-3971», dated the 4th of October 2023?
There are a couple of DRM-related amdgpu fixes in there, but no other amdgpu changes …
Teuniz
October 5, 2023, 9:09am
3
Yes:
pluto@titan:~> uname -a
Linux titan 5.14.21-150500.55.28-default #1 SMP PREEMPT_DYNAMIC Fri Sep 22 10:04:29 UTC 2023 (c11336f) x86_64 x86_64 x86_64 GNU/Linux
pluto@titan:~>
@Teuniz:
Do you have the “kernel-firmware-amdgpu” package installed?
Yes, there’s also a SLED / SLES 15 SP4 rpm on the AMD support page: <https://www.amd.com/en/support/professional-graphics/amd-radeon-pro/amd-radeon-pro-w6000-series/amd-radeon-pro-w6600>
I’ve downloaded it (“amdgpu-install-5.5.50503-1.noarch.rpm”), but it contains only some repository files, some other amdgpu installation executables, and a EULA:
[amdgpu]
name=AMDGPU 5.5.3 repository
baseurl=https://repo.radeon.com/amdgpu/5.5.3/sle/$amdgpudistro/main/x86_64
enabled=1
gpgcheck=1
gpgkey=file:///etc/amdgpu-install/rocm.gpg.key
[amdgpu-src]
name=AMDGPU 5.5.3 repository
baseurl=https://repo.radeon.com/amdgpu/5.5.3/sle/$amdgpudistro/main/source
enabled=0
gpgcheck=1
gpgkey=file:///etc/amdgpu-install/rocm.gpg.key
BASEURL=https://repo.radeon.com
RELEASE=5.5.3
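To see at a glance which of the repositories in that file would actually be active, the INI-style part can be read with Python's standard `configparser`. This is only a sketch: the text is pasted from the listing above, and the helper name `enabled_repos` is made up for illustration. `$amdgpudistro` is left exactly as it appears in the file and is not expanded.

```python
import configparser

# Repository definitions copied from the listing above.
REPO_TEXT = """\
[amdgpu]
name=AMDGPU 5.5.3 repository
baseurl=https://repo.radeon.com/amdgpu/5.5.3/sle/$amdgpudistro/main/x86_64
enabled=1
gpgcheck=1
gpgkey=file:///etc/amdgpu-install/rocm.gpg.key
[amdgpu-src]
name=AMDGPU 5.5.3 repository
baseurl=https://repo.radeon.com/amdgpu/5.5.3/sle/$amdgpudistro/main/source
enabled=0
gpgcheck=1
gpgkey=file:///etc/amdgpu-install/rocm.gpg.key
"""


def enabled_repos(text):
    """Return {section: baseurl} for every section with enabled=1."""
    # interpolation=None keeps values literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return {
        section: parser[section]["baseurl"]
        for section in parser.sections()
        if parser[section].getboolean("enabled")
    }


if __name__ == "__main__":
    for name, url in enabled_repos(REPO_TEXT).items():
        print(name, url)
```

Only the binary `amdgpu` section is enabled; the source repository is switched off.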
Teuniz
October 6, 2023, 7:58am
6
This morning there was a hard crash, three minutes after turning on the computer. Output of dmesg:
[ 5.456732] amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:157 vmid:0 pasid:0, for process pid 0 thread pid 0)
[ 5.456738] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000000006004000 from client 0x12 (VMC)
[ 5.456741] amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 5.456742] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: unknown (0x0)
[ 5.456744] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 5.456745] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 5.456746] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 5.456747] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 5.456748] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 8.050957] e1000e 0000:00:1f.6 eth0: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[ 8.051044] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 8.062737] NET: Registered PF_PACKET protocol family
[ 11.244951] usbcore: registered new interface driver ov534_9
[ 16.399275] usb 1-6.4.4.4: 2:1: cannot get freq at ep 0x84
[ 16.406767] usb 1-6.4.4.4: Warning! Unlikely big volume range (=511), cval->res is probably wrong.
[ 16.406774] usb 1-6.4.4.4: [3] FU [Mic Capture Volume] ch = 1, val = -32767/32767/128
[ 16.407307] usbcore: registered new interface driver snd-usb-audio
[ 40.077098] usb 1-6.4.4.4: 2:1: cannot get freq at ep 0x84
[ 40.081473] usb 1-6.4.4.4: 2:1: cannot get freq at ep 0x84
[ 198.090162] process 'software/Telegram/Telegram/Telegram' started with executable stack
[ 199.557518] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:2 pasid:32772, for process plasmashell pid 8364 thread plasmashel:cs0 pid 8406)
[ 199.557547] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x000080050841f000 from client 0x1b (UTCL2)
[ 199.557560] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00201031
[ 199.557569] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8)
[ 199.557577] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x1
[ 199.557584] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 199.557591] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x3
[ 199.557594] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 199.557595] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 199.557599] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:2 pasid:32772, for process plasmashell pid 8364 thread plasmashel:cs0 pid 8406)
[ 199.557602] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x000080050840d000 from client 0x1b (UTCL2)
[ 199.557604] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 199.557606] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 199.557608] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 199.557610] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 199.557611] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 199.557612] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 199.557614] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 199.557617] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:2 pasid:32772, for process plasmashell pid 8364 thread plasmashel:cs0 pid 8406)
[ 199.557620] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x000080050840e000 from client 0x1b (UTCL2)
[ 199.557622] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 199.557624] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 199.557625] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 199.557627] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 199.557628] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 199.557630] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 199.557631] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 199.557635] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:2 pasid:32772, for process plasmashell pid 8364 thread plasmashel:cs0 pid 8406)
[ 199.557638] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x000080050841e000 from client 0x1b (UTCL2)
[ 199.557639] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 199.557641] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 199.557643] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 199.557644] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 199.557646] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 199.557647] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 199.557649] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 199.557652] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:2 pasid:32772, for process plasmashell pid 8364 thread plasmashel:cs0 pid 8406)
[ 199.557655] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x000080050841b000 from client 0x1b (UTCL2)
[ 199.557657] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 199.557658] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 199.557660] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 199.557661] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 199.557663] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 199.557664] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 199.557666] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 199.557669] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:2 pasid:32772, for process plasmashell pid 8364 thread plasmashel:cs0 pid 8406)
[ 199.557672] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x000080050841d000 from client 0x1b (UTCL2)
[ 199.557674] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 199.557676] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 199.557677] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 199.557679] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 199.557680] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 199.557682] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 199.557683] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 199.557687] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:2 pasid:32772, for process plasmashell pid 8364 thread plasmashel:cs0 pid 8406)
[ 199.557689] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x000080050840f000 from client 0x1b (UTCL2)
[ 199.557691] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 199.557693] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 199.557695] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 199.557696] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 199.557697] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 199.557699] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 199.557700] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 199.557704] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:2 pasid:32772, for process plasmashell pid 8364 thread plasmashel:cs0 pid 8406)
[ 199.557707] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x000080050840c000 from client 0x1b (UTCL2)
[ 199.557709] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 199.557710] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 199.557712] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 199.557713] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 199.557715] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 199.557716] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 199.557718] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 199.557721] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:2 pasid:32772, for process plasmashell pid 8364 thread plasmashel:cs0 pid 8406)
[ 199.557724] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x000080050841c000 from client 0x1b (UTCL2)
[ 199.557726] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 199.557727] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 199.557729] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 199.557731] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 199.557732] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 199.557734] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 199.557735] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 199.557739] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:2 pasid:32772, for process plasmashell pid 8364 thread plasmashel:cs0 pid 8406)
[ 199.557741] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x000080010a438000 from client 0x1b (UTCL2)
[ 199.557743] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 199.557745] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 199.557746] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 199.557748] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 199.557749] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 199.557751] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 199.557752] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 209.713254] gmc_v10_0_process_interrupt: 24 callbacks suppressed
[ 209.713268] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:2 pasid:32772, for process plasmashell pid 8364 thread plasmashel:cs0 pid 8406)
[ 209.713301] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000800508488000 from client 0x1b (UTCL2)
[ 209.713314] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00201031
[ 209.713322] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8)
[ 209.713331] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x1
[ 209.713339] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 209.713346] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x3
[ 209.713353] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 209.713360] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 209.713372] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:2 pasid:32772, for process plasmashell pid 8364 thread plasmashel:cs0 pid 8406)
[ 209.713385] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000800508489000 from client 0x1b (UTCL2)
[ 209.713396] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 209.713404] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 209.713412] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 209.713419] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 209.713426] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 209.713439] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 209.713448] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 209.713460] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:2 pasid:32772, for process plasmashell pid 8364 thread plasmashel:cs0 pid 8406)
[ 209.713472] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x000080050848d000 from client 0x1b (UTCL2)
[ 209.713481] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 209.713489] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 209.713496] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 209.713503] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 209.713509] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 209.713516] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 209.713522] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 209.713532] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:2 pasid:32772, for process plasmashell pid 8364 thread plasmashel:cs0 pid 8406)
[ 209.713544] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x000080050848c000 from client 0x1b (UTCL2)
[ 209.713553] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 209.713560] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 209.713568] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 209.713574] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 209.713581] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 209.713589] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 209.713599] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 209.713608] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:2 pasid:32772, for process plasmashell pid 8364 thread plasmashel:cs0 pid 8406)
[ 209.713620] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x000080010a60f000 from client 0x1b (UTCL2)
[ 209.713630] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 209.713637] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 209.713644] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 209.713651] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 209.713657] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 209.713664] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 209.713671] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 209.713680] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:2 pasid:32772, for process plasmashell pid 8364 thread plasmashel:cs0 pid 8406)
[ 209.713692] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x000080050848e000 from client 0x1b (UTCL2)
[ 209.713701] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 209.713708] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 209.713715] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 209.713722] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 209.713729] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 209.713735] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 209.713742] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 209.713752] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:2 pasid:32772, for process plasmashell pid 8364 thread plasmashel:cs0 pid 8406)
[ 209.713763] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x000080010a60e000 from client 0x1b (UTCL2)
[ 209.713772] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 209.713780] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 209.713787] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 209.713793] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 209.713800] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 209.713807] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 209.713813] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 209.713823] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:2 pasid:32772, for process plasmashell pid 8364 thread plasmashel:cs0 pid 8406)
[ 209.713834] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x000080050848e000 from client 0x1b (UTCL2)
[ 209.713843] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 209.713851] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 209.713858] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 209.713864] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 209.713871] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 209.713878] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 209.713884] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 209.713893] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:2 pasid:32772, for process plasmashell pid 8364 thread plasmashel:cs0 pid 8406)
[ 209.713904] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x000080010a60e000 from client 0x1b (UTCL2)
[ 209.713912] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 209.713918] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 209.713924] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 209.713929] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 209.713934] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 209.713940] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 209.713945] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 209.713953] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:2 pasid:32772, for process plasmashell pid 8364 thread plasmashel:cs0 pid 8406)
[ 209.713963] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x000080050848f000 from client 0x1b (UTCL2)
[ 209.713970] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 209.713976] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 209.713982] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 209.713988] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 209.713993] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 209.713999] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 209.714004] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 209.723103] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=8370, emitted seq=8372
[ 209.723241] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process plasmashell pid 8364 thread plasmashel:cs0 pid 8406
[ 209.723353] amdgpu 0000:03:00.0: amdgpu: GPU reset begin!
[ 209.844936] amdgpu 0000:03:00.0: amdgpu: free PSP TMR buffer
[ 209.890072] amdgpu 0000:03:00.0: amdgpu: MODE1 reset
[ 209.890074] amdgpu 0000:03:00.0: amdgpu: GPU mode1 reset
[ 209.890131] amdgpu 0000:03:00.0: amdgpu: GPU smu mode1 reset
[ 210.417219] amdgpu 0000:03:00.0: amdgpu: GPU reset succeeded, trying to resume
[ 210.417452] [drm] PCIE GART of 512M enabled (table at 0x00000081FEB00000).
[ 210.417679] [drm] VRAM is lost due to GPU reset!
[ 210.417681] [drm] PSP is resuming...
[ 210.510381] [drm] reserve 0xa00000 from 0x81fd000000 for PSP TMR
[ 210.630194] amdgpu 0000:03:00.0: amdgpu: RAS: optional ras ta ucode is not available
[ 210.651477] amdgpu 0000:03:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
[ 210.651481] amdgpu 0000:03:00.0: amdgpu: SMU is resuming...
[ 210.651485] amdgpu 0000:03:00.0: amdgpu: smu driver if version = 0x0000000f, smu fw if version = 0x00000013, smu fw program = 0, version = 0x003b2b00 (59.43.0)
[ 210.651491] amdgpu 0000:03:00.0: amdgpu: SMU driver if version not matched
[ 210.651531] amdgpu 0000:03:00.0: amdgpu: use vbios provided pptable
[ 210.703679] amdgpu 0000:03:00.0: amdgpu: SMU is resumed successfully!
[ 210.704933] [drm] DMUB hardware initialized: version=0x0202001E
[ 211.025687] [drm] kiq ring mec 2 pipe 1 q 0
[ 211.029880] [drm] VCN decode and encode initialized successfully(under DPG Mode).
[ 211.030192] [drm] JPEG decode initialized successfully.
[ 211.030234] amdgpu 0000:03:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
[ 211.030235] amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[ 211.030236] amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[ 211.030236] amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
[ 211.030237] amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
[ 211.030237] amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
[ 211.030238] amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
[ 211.030238] amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
[ 211.030239] amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
[ 211.030239] amdgpu 0000:03:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
[ 211.030240] amdgpu 0000:03:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
[ 211.030240] amdgpu 0000:03:00.0: amdgpu: ring sdma1 uses VM inv eng 13 on hub 0
[ 211.030241] amdgpu 0000:03:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 1
[ 211.030241] amdgpu 0000:03:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 1
[ 211.030242] amdgpu 0000:03:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 1
[ 211.030242] amdgpu 0000:03:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 1
[ 211.034486] amdgpu 0000:03:00.0: amdgpu: recover vram bo from shadow start
[ 211.037134] amdgpu 0000:03:00.0: amdgpu: recover vram bo from shadow done
[ 211.037135] [drm] Skip scheduling IBs!
[ 211.037136] [drm] Skip scheduling IBs!
[ 211.037144] amdgpu 0000:03:00.0: amdgpu: GPU reset(2) succeeded!
[ 211.037219] [drm] Skip scheduling IBs!
[ 211.037224] [drm] Skip scheduling IBs!
[ 211.037225] [drm] Skip scheduling IBs!
[ 211.037226] [drm] Skip scheduling IBs!
[ 211.037227] [drm] Skip scheduling IBs!
[ 211.037228] [drm] Skip scheduling IBs!
[ 211.037229] [drm] Skip scheduling IBs!
[ 211.037230] [drm] Skip scheduling IBs!
[ 211.037231] [drm] Skip scheduling IBs!
[ 211.037231] [drm] Skip scheduling IBs!
[ 211.037232] [drm] Skip scheduling IBs!
[ 211.037234] [drm] Skip scheduling IBs!
[ 211.037235] [drm] Skip scheduling IBs!
[ 211.037235] [drm] Skip scheduling IBs!
[ 211.037236] [drm] Skip scheduling IBs!
[ 211.037238] [drm] Skip scheduling IBs!
[ 211.037239] [drm] Skip scheduling IBs!
[ 211.037240] [drm] Skip scheduling IBs!
[ 211.037242] [drm] Skip scheduling IBs!
[ 211.037243] [drm] Skip scheduling IBs!
[ 211.037243] [drm] Skip scheduling IBs!
[ 211.037244] [drm] Skip scheduling IBs!
[ 211.037245] [drm] Skip scheduling IBs!
[ 211.037246] [drm] Skip scheduling IBs!
[ 211.037247] [drm] Skip scheduling IBs!
[ 211.037248] [drm] Skip scheduling IBs!
[ 211.037248] [drm] Skip scheduling IBs!
[ 211.037249] [drm] Skip scheduling IBs!
[ 211.037250] [drm] Skip scheduling IBs!
[ 211.037251] [drm] Skip scheduling IBs!
[ 211.037252] [drm] Skip scheduling IBs!
[ 211.037253] [drm] Skip scheduling IBs!
[ 211.037255] [drm] Skip scheduling IBs!
[ 211.345505] ------------[ cut here ]------------
[ 211.345511] WARNING: CPU: 11 PID: 1336 at ../drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c:655 amdgpu_irq_put+0x68/0x90 [amdgpu]
[ 211.345961] Modules linked in: snd_usb_audio snd_usbmidi_lib snd_rawmidi snd_seq_device af_packet vboxnetadp(OEN) vboxnetflt(OEN) vboxdrv(OEN) qrtr(N) ns(N) dmi_sysfs snd_sof_pci_intel_tgl snd_sof_intel_hda_common soundwire_intel soundwire_generic_allocation soundwire_cadence snd_sof_intel_hda snd_sof_pci snd_sof_xtensa_dsp intel_rapl_msr intel_rapl_common snd_sof intel_pmc_core x86_pkg_temp_thermal intel_powerclamp snd_sof_utils snd_soc_hdac_hda coretemp snd_soc_acpi_intel_match gspca_ov534_9(N) snd_soc_acpi gspca_main(N) soundwire_bus videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 snd_hda_ext_core videobuf2_common snd_soc_core kvm_intel snd_hda_codec_realtek videodev nls_iso8859_1 nls_cp437 snd_hda_codec_generic vfat iTCO_wdt snd_compress intel_pmc_bxt fat snd_pcm_dmaengine snd_hda_codec_hdmi mc ledtrig_audio iTCO_vendor_support snd_hda_intel kvm snd_intel_dspcfg joydev snd_intel_sdw_acpi snd_hda_codec snd_hda_core irqbypass snd_hwdep snd_pcm hp_wmi e1000e
[ 211.346026] sparse_keymap platform_profile rfkill pcspkr snd_timer wmi_bmof snd i2c_i801 i2c_smbus soundcore thermal acpi_pad acpi_tad(N) button fuse efi_pstore(N) configfs ip_tables x_tables ext4 crc16 mbcache jbd2 hid_generic usbhid amdgpu drm_ttm_helper ttm mfd_core iommu_v2 gpu_sched i2c_algo_bit drm_buddy drm_display_helper sr_mod crc32_pclmul crc32c_intel cdrom drm_kms_helper syscopyarea sysfillrect xhci_pci ghash_clmulni_intel sysimgblt xhci_pci_renesas fb_sys_fops drm xhci_hcd ahci nvme libahci aesni_intel nvme_core libata usbcore nvme_common crypto_simd t10_pi cec cryptd crc64_rocksoft_generic serio_raw rc_core crc64_rocksoft crc64 wmi video sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_mod msr efivarfs
[ 211.346097] Supported: No, Unsupported modules are loaded
[ 211.346100] CPU: 11 PID: 1336 Comm: kworker/11:2 Tainted: G OE N 5.14.21-150500.55.28-default #1 SLE15-SP5 da22099d09f8a2ed0ffeb0ddc7d6834f2130cf96
[ 211.346107] Hardware name: HP HP Z2 Tower G9 Workstation Desktop PC/895C, BIOS U50 Ver. 02.02.02 06/28/2023
[ 211.346110] Workqueue: events drm_mode_rmfb_work_fn [drm]
[ 211.346168] RIP: 0010:amdgpu_irq_put+0x68/0x90 [amdgpu]
[ 211.346481] Code: e8 48 8b 53 08 f0 ff 0c 82 b8 00 00 00 00 74 09 5b 5d 41 5c c3 cc cc cc cc 89 ea 48 89 de 4c 89 e7 5b 5d 41 5c e9 88 fd ff ff <0f> 0b b8 ea ff ff ff eb dd b8 ea ff ff ff c3 cc cc cc cc b8 fe ff
[ 211.346485] RSP: 0000:ffff9413023f3908 EFLAGS: 00010046
[ 211.346488] RAX: 0000000000000000 RBX: ffff88dc19d26580 RCX: ffffffffc0f236c0
[ 211.346491] RDX: ffff88dd9b38cca0 RSI: 0000000000000001 RDI: ffff88dc19d26580
[ 211.346493] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
[ 211.346495] R10: 0000000000000000 R11: ffff9413023f3808 R12: ffff88dc19d20000
[ 211.346497] R13: ffff88dc05012600 R14: 0000000000000002 R15: ffff88dc01227e00
[ 211.346499] FS: 0000000000000000(0000) GS:ffff88e35f2c0000(0000) knlGS:0000000000000000
[ 211.346502] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 211.346504] CR2: 00007f3746e2da94 CR3: 0000000299610003 CR4: 0000000000770ee0
[ 211.346507] PKRU: 55555554
[ 211.346508] Call Trace:
[ 211.346513] <TASK>
[ 211.346518] dm_disable_vblank+0x51/0x130 [amdgpu 17a410583c2993eb962397c227436e66691afb3b]
[ 211.346991] drm_vblank_disable_and_save+0xab/0xf0 [drm a122522a20481b0cab26daeafc4de230de4e73b6]
[ 211.347039] drm_crtc_vblank_off+0xbd/0x250 [drm a122522a20481b0cab26daeafc4de230de4e73b6]
[ 211.347083] amdgpu_dm_atomic_commit_tail+0x178/0x3230 [amdgpu 17a410583c2993eb962397c227436e66691afb3b]
[ 211.347543] ? dcn20_populate_dml_pipes_from_context+0x116/0xe40 [amdgpu 17a410583c2993eb962397c227436e66691afb3b]
[ 211.348007] ? dcn30_internal_validate_bw+0xf4/0xa40 [amdgpu 17a410583c2993eb962397c227436e66691afb3b]
[ 211.348468] ? slab_post_alloc_hook+0x4f/0x250
[ 211.348478] ? dcn30_validate_bandwidth+0x110/0x2d0 [amdgpu 17a410583c2993eb962397c227436e66691afb3b]
[ 211.348937] ? dc_validate_global_state+0x2c9/0x3a0 [amdgpu 17a410583c2993eb962397c227436e66691afb3b]
[ 211.349372] ? dma_resv_iter_first_unlocked+0x62/0x70
[ 211.349380] ? dma_resv_get_fences+0x4d/0x230
[ 211.349385] ? dma_resv_get_singleton+0x2d/0x110
[ 211.349391] ? drm_gem_plane_helper_prepare_fb+0xf2/0x1f0 [drm_kms_helper ca0181e504e84fe85a6bcce4a2f13ad4eabaaa43]
[ 211.349417] ? wait_for_completion_timeout+0xd1/0x100
[ 211.349426] commit_tail+0x91/0x120 [drm_kms_helper ca0181e504e84fe85a6bcce4a2f13ad4eabaaa43]
[ 211.349449] drm_atomic_helper_commit+0x10f/0x140 [drm_kms_helper ca0181e504e84fe85a6bcce4a2f13ad4eabaaa43]
[ 211.349469] drm_atomic_commit+0x93/0xc0 [drm a122522a20481b0cab26daeafc4de230de4e73b6]
[ 211.349523] ? __drm_printfn_seq_file+0x20/0x20 [drm a122522a20481b0cab26daeafc4de230de4e73b6]
[ 211.349573] drm_framebuffer_remove+0x491/0x4d0 [drm a122522a20481b0cab26daeafc4de230de4e73b6]
[ 211.349626] drm_mode_rmfb_work_fn+0x6c/0x80 [drm a122522a20481b0cab26daeafc4de230de4e73b6]
[ 211.349675] process_one_work+0x264/0x440
[ 211.349682] worker_thread+0x217/0x3c0
[ 211.349686] ? process_one_work+0x440/0x440
[ 211.349690] kthread+0x154/0x180
[ 211.349694] ? set_kthread_struct+0x50/0x50
[ 211.349698] ret_from_fork+0x1f/0x30
[ 211.349707] </TASK>
[ 211.349709] ---[ end trace a1efba71edd23e90 ]---
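A dump this long is easier to read once it is condensed. The following Python sketch counts page faults per hub and process, collects the distinct faulting page addresses, and counts GPU resets. The regular expressions are written against the line format seen in this thread only; other kernel versions may print these lines differently, and the names `summarize`, `FAULT_RE`, and `ADDR_RE` are made up for illustration. The sample lines are copied from the dmesg output above.

```python
import re
from collections import Counter

# Patterns match the amdgpu line format shown in this thread (an assumption
# for any other kernel version).
FAULT_RE = re.compile(
    r"\[(?P<hub>\w+)\] page fault \(.*?for process\s*(?P<proc>.*?)\s*pid (?P<pid>\d+) thread"
)
ADDR_RE = re.compile(r"in page starting at address (0x[0-9a-fA-F]+)")

SAMPLE = """\
[ 5.456732] amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:157 vmid:0 pasid:0, for process pid 0 thread pid 0)
[ 5.456738] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000000006004000 from client 0x12 (VMC)
[ 199.557518] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:2 pasid:32772, for process plasmashell pid 8364 thread plasmashel:cs0 pid 8406)
[ 199.557547] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x000080050841f000 from client 0x1b (UTCL2)
[ 199.557599] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:2 pasid:32772, for process plasmashell pid 8364 thread plasmashel:cs0 pid 8406)
[ 199.557602] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x000080050840d000 from client 0x1b (UTCL2)
[ 209.723353] amdgpu 0000:03:00.0: amdgpu: GPU reset begin!
"""


def summarize(log_text):
    """Return (fault counts per (hub, process, pid), set of page addresses, reset count)."""
    faults = Counter()
    pages = set()
    resets = 0
    for line in log_text.splitlines():
        fault = FAULT_RE.search(line)
        if fault:
            # Faults without a process name are reported as "(none)".
            key = (fault["hub"], fault["proc"] or "(none)", int(fault["pid"]))
            faults[key] += 1
            continue
        addr = ADDR_RE.search(line)
        if addr:
            pages.add(int(addr.group(1), 16))
        elif "GPU reset begin" in line:
            resets += 1
    return faults, pages, resets


if __name__ == "__main__":
    faults, pages, resets = summarize(SAMPLE)
    for (hub, proc, pid), count in faults.items():
        print(f"{hub}: {count} fault(s) for {proc} (pid {pid})")
    print(f"{len(pages)} distinct page(s), {resets} GPU reset(s)")
```

To run it on a real log, the output of `dmesg` could be saved to a file and its contents passed to `summarize` instead of `SAMPLE`.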
@Teuniz:
What happens with a fresh, new, empty user, i.e. a test user?
On this system:
Operating System: openSUSE Leap 15.5
KDE Plasma Version: 5.27.4
KDE Frameworks Version: 5.103.0
Qt Version: 5.15.8
Kernel Version: 5.14.21-150500.55.28-default (64-bit)
Graphics Platform: X11
Processors: 8 × AMD Ryzen 5 3400G with Radeon Vega Graphics
Memory: 29.3 GiB of RAM
Graphics Processor: AMD Radeon Vega 11 Graphics
Manufacturer: ASUS
I’m seeing this:
# journalctl -b 0 --no-hostname --output=short-monotonic | grep -i 'amdgpu' -B3 -A3
[ 4.997899] kernel: cherry 0003:046A:0023.0002: input,hidraw2: USB HID v1.11 Keyboard [HID 046a:0023] on usb-0000:01:00.0-9/input0
[ 4.998485] kernel: input: HID 046a:0023 as /devices/pci0000:00/0000:00:01.2/0000:01:00.0/usb1/1-9/1-9:1.1/0003:046A:0023.0003/input/input2
[ 5.061525] kernel: cherry 0003:046A:0023.0003: input,hidraw3: USB HID v1.11 Device [HID 046a:0023] on usb-0000:01:00.0-9/input1
[ 5.289007] kernel: [drm] amdgpu kernel modesetting enabled.
[ 5.306549] kernel: amdgpu: Topology: Add APU node [0x0:0x0]
[ 5.306774] kernel: amdgpu 0000:06:00.0: enabling device (0006 -> 0007)
[ 5.306857] kernel: [drm] initializing kernel modesetting (RAVEN 0x1002:0x15D8 0x1043:0x876B 0xC8).
[ 5.306877] kernel: [drm] register mmio base: 0xFCD00000
[ 5.306879] kernel: [drm] register mmio size: 524288
--
[ 5.306980] kernel: [drm] add ip block number 6 <gfx_v9_0>
[ 5.306983] kernel: [drm] add ip block number 7 <sdma_v4_0>
[ 5.306985] kernel: [drm] add ip block number 8 <vcn_v1_0>
[ 5.307108] kernel: amdgpu 0000:06:00.0: amdgpu: Fetched VBIOS from VFCT
[ 5.307112] kernel: amdgpu: ATOM BIOS: 113-PICASSO-118
[ 5.307874] kernel: [drm] VCN decode is enabled in VM mode
[ 5.307875] kernel: [drm] VCN encode is enabled in VM mode
[ 5.307876] kernel: [drm] JPEG decode is enabled in VM mode
[ 5.307977] kernel: Console: switching to colour dummy device 80x25
[ 5.308032] kernel: amdgpu 0000:06:00.0: vgaarb: deactivate vga console
[ 5.308035] kernel: amdgpu 0000:06:00.0: amdgpu: Trusted Memory Zone (TMZ) feature enabled
[ 5.308074] kernel: [drm] vm size is 262144 GB, 4 levels, block size is 9-bit, fragment size is 9-bit
[ 5.308082] kernel: amdgpu 0000:06:00.0: amdgpu: VRAM: 2048M 0x000000F400000000 - 0x000000F47FFFFFFF (2048M used)
[ 5.308086] kernel: amdgpu 0000:06:00.0: amdgpu: GART: 1024M 0x0000000000000000 - 0x000000003FFFFFFF
[ 5.308089] kernel: amdgpu 0000:06:00.0: amdgpu: AGP: 267419648M 0x000000F800000000 - 0x0000FFFFFFFFFFFF
[ 5.308098] kernel: [drm] Detected VRAM RAM=2048M, BAR=2048M
[ 5.308100] kernel: [drm] RAM width 128bits DDR4
[ 5.308239] kernel: [drm] amdgpu: 2048M of VRAM memory ready
[ 5.308242] kernel: [drm] amdgpu: 14984M of GTT memory ready.
[ 5.308254] kernel: [drm] GART: num cpu pages 262144, num gpu pages 262144
[ 5.308478] kernel: [drm] PCIE GART of 1024M enabled.
[ 5.308479] kernel: [drm] PTB located at 0x000000F400A00000
[ 5.313842] kernel: amdgpu 0000:06:00.0: amdgpu: PSP runtime database doesn't exist
[ 5.313845] kernel: amdgpu 0000:06:00.0: amdgpu: PSP runtime database doesn't exist
[ 5.313882] kernel: amdgpu: hwmgr_sw_init smu backed is smu10_smu
[ 5.346989] kernel: [drm] Found VCN firmware Version ENC: 1.15 DEC: 3 VEP: 0 Revision: 0
[ 5.347012] kernel: amdgpu 0000:06:00.0: amdgpu: Will use PSP to load VCN firmware
[ 5.367514] kernel: [drm] reserve 0x400000 from 0xf47fc00000 for PSP TMR
[ 5.431424] kernel: amdgpu 0000:06:00.0: amdgpu: RAS: optional ras ta ucode is not available
[ 5.436173] kernel: amdgpu 0000:06:00.0: amdgpu: RAP: optional rap ta ucode is not available
[ 5.439169] kernel: [drm] DM_PPLIB: values for F clock
[ 5.439171] kernel: [drm] DM_PPLIB: 400000 in kHz, 3099 in mV
[ 5.439173] kernel: [drm] DM_PPLIB: 933000 in kHz, 3574 in mV
--
[ 5.439409] kernel: [drm] Display Core initialized with v3.2.207!
[ 5.477881] kernel: [drm] kiq ring mec 2 pipe 1 q 0
[ 5.491803] kernel: [drm] VCN decode and encode initialized successfully(under SPG Mode).
[ 5.494366] kernel: kfd kfd: amdgpu: Allocated 3969056 bytes on gart
[ 5.494455] kernel: amdgpu: sdma_bitmap: 3
[ 5.537763] kernel: memmap_init_zone_device initialised 524288 pages in 12ms
[ 5.537771] kernel: amdgpu: HMM registered 2048MB device memory
[ 5.537847] kernel: amdgpu: Topology: Add APU node [0x15d8:0x1002]
[ 5.537851] kernel: kfd kfd: amdgpu: added device 1002:15d8
[ 5.538084] kernel: amdgpu 0000:06:00.0: amdgpu: SE 1, SH per SE 1, CU per SH 11, active_cu_number 11
[ 5.538202] kernel: amdgpu 0000:06:00.0: amdgpu: ring gfx uses VM inv eng 0 on hub 0
[ 5.538205] kernel: amdgpu 0000:06:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[ 5.538207] kernel: amdgpu 0000:06:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[ 5.538210] kernel: amdgpu 0000:06:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
[ 5.538212] kernel: amdgpu 0000:06:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
[ 5.538214] kernel: amdgpu 0000:06:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
[ 5.538217] kernel: amdgpu 0000:06:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
[ 5.538219] kernel: amdgpu 0000:06:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
[ 5.538221] kernel: amdgpu 0000:06:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
[ 5.538223] kernel: amdgpu 0000:06:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
[ 5.538226] kernel: amdgpu 0000:06:00.0: amdgpu: ring sdma0 uses VM inv eng 0 on hub 1
[ 5.538228] kernel: amdgpu 0000:06:00.0: amdgpu: ring vcn_dec uses VM inv eng 1 on hub 1
[ 5.538230] kernel: amdgpu 0000:06:00.0: amdgpu: ring vcn_enc0 uses VM inv eng 4 on hub 1
[ 5.538232] kernel: amdgpu 0000:06:00.0: amdgpu: ring vcn_enc1 uses VM inv eng 5 on hub 1
[ 5.538235] kernel: amdgpu 0000:06:00.0: amdgpu: ring jpeg_dec uses VM inv eng 6 on hub 1
[ 5.543432] kernel: [drm] Initialized amdgpu 3.49.0 20150101 for 0000:06:00.0 on minor 0
[ 5.552095] kernel: fbcon: amdgpudrmfb (fb0) is primary device
[ 5.601962] kernel: Console: switching to colour frame buffer device 240x67
[ 5.626334] kernel: amdgpu 0000:06:00.0: [drm] fb0: amdgpudrmfb frame buffer device
[ 5.697238] systemd[1]: Finished dracut initqueue hook.
[ 5.697503] systemd[1]: Reached target Preparation for Remote File Systems.
[ 5.697563] systemd[1]: Reached target Remote File Systems.
--
[ 15.577613] systemd-fsck[787]: /dev/sdc1: 19 files, 655/63961 clusters
[ 15.580130] systemd[1]: Finished File System Check on /dev/disk/by-uuid/E385-55AF.
[ 15.582189] systemd[1]: Mounting /boot/efi...
[ 15.585971] kernel: snd_hda_intel 0000:06:00.1: bound 0000:06:00.0 (ops amdgpu_dm_audio_component_bind_ops [amdgpu])
[ 15.589192] systemd-fsck[779]: System_Tmp: clean, 66/262144 files, 37416/1048576 blocks
[ 15.629547] systemd[1]: Finished File System Check on /dev/disk/by-uuid/6b6e0183-7d30-47cd-81c4-f2d689b5ad65.
[ 15.635014] systemd[1]: Mounting /tmp...
#
Teuniz
October 7, 2023, 6:18am
8
I’ll try on Monday, but I’m now pretty sure it’s a kernel/driver bug.
When I revert to an older kernel, everything works fine.
Teuniz
October 9, 2023, 10:11am
9
For some users the latest kernel update resolved the problem, but not for me.
Apparently I’m in the minority, and we don’t have a support contract with SUSE, so I did the only thing I could: I set the bootloader to boot an older kernel (5.14.21-150500.55.19-default) until the problem is really resolved.
https://bugzilla.suse.com/show_bug.cgi?id=1215523
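Since the workaround depends on the machine really booting the older kernel, a small check at login can help catch an update that switched the default entry. The following is only a sketch: `PINNED`, `release_key` and `compare_to_pinned` are hypothetical names, and ordering releases by their digit groups is a heuristic, not an official versioning rule.

```python
import platform
import re

# Kernel release reported as working in this thread.
PINNED = "5.14.21-150500.55.19-default"


def release_key(release: str) -> tuple:
    """Turn a release string into a tuple of its numeric groups (heuristic)."""
    return tuple(int(n) for n in re.findall(r"\d+", release))


def compare_to_pinned(release: str, pinned: str = PINNED) -> str:
    """Return 'same', 'newer' or 'older' relative to the pinned release."""
    if release == pinned:
        return "same"
    return "newer" if release_key(release) > release_key(pinned) else "older"


if __name__ == "__main__":
    running = platform.release()  # same string as `uname -r`
    print(f"running {running}: {compare_to_pinned(running)} than/as pinned {PINNED}")
```

With the releases mentioned in the thread, 5.14.21-150500.55.28-default and 5.14.21-150500.55.31-default both compare as "newer" than the pinned one.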
Teuniz
October 17, 2023, 9:16am
10
Recently there was another kernel update for Leap 15.5 (5.14.21-150500.55.31-default).
I installed it and the system crashed after a couple of hours:
[ 5.565784] amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:157 vmid:0 pasid:0, for process pid 0 thread pid 0)
[ 5.565790] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000000006004000 from client 0x12 (VMC)
[ 5.565792] amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x0000073A
[ 5.565794] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: DCEDMC (0x3)
[ 5.565795] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 5.565796] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x5
[ 5.565797] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x3
[ 5.565798] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x1
[ 5.565799] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 8.190663] e1000e 0000:00:1f.6 eth0: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[ 8.190762] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 8.202946] NET: Registered PF_PACKET protocol family
[ 11.352915] usbcore: registered new interface driver ov534_9
[ 16.398870] usb 1-6.4.4.4: 2:1: cannot get freq at ep 0x84
[ 16.407310] usb 1-6.4.4.4: Warning! Unlikely big volume range (=511), cval->res is probably wrong.
[ 16.407318] usb 1-6.4.4.4: [3] FU [Mic Capture Volume] ch = 1, val = -32767/32767/128
[ 16.407757] usbcore: registered new interface driver snd-usb-audio
[ 40.649290] usb 1-6.4.4.4: 2:1: cannot get freq at ep 0x84
[ 40.653864] usb 1-6.4.4.4: 2:1: cannot get freq at ep 0x84
[ 180.725212] process 'software/Telegram/Telegram/Telegram' started with executable stack
[ 358.078011] amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:157 vmid:0 pasid:0, for process pid 0 thread pid 0)
[ 358.078026] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000000006004000 from client 0x12 (VMC)
[ 358.078039] amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x0000073A
[ 358.078048] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: DCEDMC (0x3)
[ 358.078053] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 358.078055] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x5
[ 358.078057] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x3
[ 358.078061] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x1
[ 358.078063] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 2216.913933] usb 1-6-port2: disabled by hub (EMI?), re-enabling...
[ 2216.914668] usb 1-6.2: USB disconnect, device number 4
[ 2217.200390] usb 1-6.2: new low-speed USB device number 8 using xhci_hcd
[ 2217.304691] usb 1-6.2: New USB device found, idVendor=093a, idProduct=2510, bcdDevice= 1.00
[ 2217.304703] usb 1-6.2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[ 2217.304706] usb 1-6.2: Product: USB OPTICAL MOUSE
[ 2217.304709] usb 1-6.2: Manufacturer: PIXART
[ 2217.311224] input: PIXART USB OPTICAL MOUSE as /devices/pci0000:00/0000:00:14.0/usb1/1-6/1-6.2/1-6.2:1.0/0003:093A:2510.0004/input/input20
[ 2217.311636] hid-generic 0003:093A:2510.0004: input,hidraw2: USB HID v1.11 Mouse [PIXART USB OPTICAL MOUSE] on usb-0000:00:14.0-6.2/input0
[ 5492.829194] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:6 pasid:32786, for process kicad pid 19222 thread kicad:cs0 pid 19243)
[ 5492.829226] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000800155410000 from client 0x1b (UTCL2)
[ 5492.829236] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00601031
[ 5492.829242] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8)
[ 5492.829248] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x1
[ 5492.829253] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 5492.829258] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x3
[ 5492.829263] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 5492.829268] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 5492.829278] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:6 pasid:32786, for process kicad pid 19222 thread kicad:cs0 pid 19243)
[ 5492.829287] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000800155425000 from client 0x1b (UTCL2)
[ 5492.829295] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 5492.829301] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 5492.829306] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 5492.829311] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 5492.829316] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 5492.829321] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 5492.829326] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 5492.829334] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:6 pasid:32786, for process kicad pid 19222 thread kicad:cs0 pid 19243)
[ 5492.829343] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000800155424000 from client 0x1b (UTCL2)
[ 5492.829350] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 5492.829356] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 5492.829361] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 5492.829366] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 5492.829371] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 5492.829376] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 5492.829381] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 5492.829388] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:6 pasid:32786, for process kicad pid 19222 thread kicad:cs0 pid 19243)
[ 5492.829397] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000800155411000 from client 0x1b (UTCL2)
[ 5492.829404] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 5492.829409] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 5492.829415] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 5492.829420] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 5492.829425] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 5492.829430] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 5492.829434] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 5492.829442] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:6 pasid:32786, for process kicad pid 19222 thread kicad:cs0 pid 19243)
[ 5492.829451] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000800155415000 from client 0x1b (UTCL2)
[ 5492.829457] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 5492.829463] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 5492.829468] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 5492.829473] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 5492.829478] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 5492.829483] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 5492.829488] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 5492.829495] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:6 pasid:32786, for process kicad pid 19222 thread kicad:cs0 pid 19243)
[ 5492.829504] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000800155412000 from client 0x1b (UTCL2)
[ 5492.829511] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 5492.829516] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 5492.829522] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 5492.829527] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 5492.829532] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 5492.829537] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 5492.829542] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 5492.829549] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:6 pasid:32786, for process kicad pid 19222 thread kicad:cs0 pid 19243)
[ 5492.829558] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000800155422000 from client 0x1b (UTCL2)
[ 5492.829565] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 5492.829570] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 5492.829576] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 5492.829580] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 5492.829585] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 5492.829590] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 5492.829595] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 5492.829603] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:6 pasid:32786, for process kicad pid 19222 thread kicad:cs0 pid 19243)
[ 5492.829611] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000800155413000 from client 0x1b (UTCL2)
[ 5492.829618] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 5492.829623] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 5492.829629] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 5492.829634] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 5492.829639] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 5492.829643] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 5492.829654] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 5492.829663] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:6 pasid:32786, for process kicad pid 19222 thread kicad:cs0 pid 19243)
[ 5492.829671] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000800155423000 from client 0x1b (UTCL2)
[ 5492.829678] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 5492.829684] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 5492.829689] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 5492.829694] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 5492.829699] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 5492.829704] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 5492.829709] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 5492.829716] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:6 pasid:32786, for process kicad pid 19222 thread kicad:cs0 pid 19243)
[ 5492.829725] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000800155446000 from client 0x1b (UTCL2)
[ 5492.829731] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 5492.829737] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 5492.829742] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 5492.829747] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 5492.829752] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 5492.829757] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 5492.829762] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 5503.013405] gmc_v10_0_process_interrupt: 6 callbacks suppressed
[ 5503.013418] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:6 pasid:32786, for process kicad pid 19222 thread kicad:cs0 pid 19243)
[ 5503.013444] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000800155484000 from client 0x1b (UTCL2)
[ 5503.013456] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00601031
[ 5503.013464] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8)
[ 5503.013471] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x1
[ 5503.013478] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 5503.013484] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x3
[ 5503.013490] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 5503.013496] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 5503.013508] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:6 pasid:32786, for process kicad pid 19222 thread kicad:cs0 pid 19243)
[ 5503.013520] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000800155484000 from client 0x1b (UTCL2)
[ 5503.013529] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 5503.013536] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 5503.013543] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 5503.013549] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 5503.013555] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 5503.013561] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 5503.013567] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 5503.013576] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:6 pasid:32786, for process kicad pid 19222 thread kicad:cs0 pid 19243)
[ 5503.013587] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000800155485000 from client 0x1b (UTCL2)
[ 5503.013596] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 5503.013603] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 5503.013609] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 5503.013615] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 5503.013621] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 5503.013628] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 5503.013634] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 5503.013643] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:6 pasid:32786, for process kicad pid 19222 thread kicad:cs0 pid 19243)
[ 5503.013653] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000800155481000 from client 0x1b (UTCL2)
[ 5503.013662] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 5503.013669] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 5503.013675] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 5503.013681] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 5503.013687] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 5503.013693] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 5503.013699] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 5503.013708] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:6 pasid:32786, for process kicad pid 19222 thread kicad:cs0 pid 19243)
[ 5503.013719] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000800155480000 from client 0x1b (UTCL2)
[ 5503.013727] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 5503.013734] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 5503.013741] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 5503.013746] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 5503.013752] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 5503.013759] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 5503.013765] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 5503.013774] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:6 pasid:32786, for process kicad pid 19222 thread kicad:cs0 pid 19243)
[ 5503.013784] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000800155480000 from client 0x1b (UTCL2)
[ 5503.013792] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 5503.013806] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 5503.013812] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 5503.013818] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 5503.013824] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 5503.013831] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 5503.013837] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 5503.013846] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:6 pasid:32786, for process kicad pid 19222 thread kicad:cs0 pid 19243)
[ 5503.013857] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000800155472000 from client 0x1b (UTCL2)
[ 5503.013865] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 5503.013872] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 5503.013879] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 5503.013885] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 5503.013891] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 5503.013897] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 5503.013903] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 5503.013912] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:6 pasid:32786, for process kicad pid 19222 thread kicad:cs0 pid 19243)
[ 5503.013923] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000800155473000 from client 0x1b (UTCL2)
[ 5503.013931] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 5503.013938] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 5503.013945] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 5503.013950] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 5503.013956] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 5503.013963] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 5503.013969] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 5503.013978] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:6 pasid:32786, for process kicad pid 19222 thread kicad:cs0 pid 19243)
[ 5503.013988] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000800155477000 from client 0x1b (UTCL2)
[ 5503.013996] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 5503.014003] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 5503.014010] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 5503.014016] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 5503.014022] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 5503.014028] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 5503.014034] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 5503.014043] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:6 pasid:32786, for process kicad pid 19222 thread kicad:cs0 pid 19243)
[ 5503.014054] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000800155631000 from client 0x1b (UTCL2)
[ 5503.014062] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 5503.014069] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 5503.014076] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 5503.014082] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 5503.014088] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 5503.014094] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 5503.014100] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 5503.023222] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=234323, emitted seq=234325
[ 5503.023354] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process kicad pid 19222 thread kicad:cs0 pid 19243
[ 5503.023459] amdgpu 0000:03:00.0: amdgpu: GPU reset begin!
[ 5503.150178] amdgpu 0000:03:00.0: amdgpu: free PSP TMR buffer
[ 5503.195282] amdgpu 0000:03:00.0: amdgpu: MODE1 reset
[ 5503.195284] amdgpu 0000:03:00.0: amdgpu: GPU mode1 reset
[ 5503.195340] amdgpu 0000:03:00.0: amdgpu: GPU smu mode1 reset
[ 5503.717309] amdgpu 0000:03:00.0: amdgpu: GPU reset succeeded, trying to resume
[ 5503.717552] [drm] PCIE GART of 512M enabled (table at 0x00000081FEB00000).
[ 5503.717770] [drm] VRAM is lost due to GPU reset!
[ 5503.717772] [drm] PSP is resuming...
[ 5503.810562] [drm] reserve 0xa00000 from 0x81fd000000 for PSP TMR
[ 5503.932157] amdgpu 0000:03:00.0: amdgpu: RAS: optional ras ta ucode is not available
[ 5503.953441] amdgpu 0000:03:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
[ 5503.953446] amdgpu 0000:03:00.0: amdgpu: SMU is resuming...
[ 5503.953450] amdgpu 0000:03:00.0: amdgpu: smu driver if version = 0x0000000f, smu fw if version = 0x00000013, smu fw program = 0, version = 0x003b2b00 (59.43.0)
[ 5503.953455] amdgpu 0000:03:00.0: amdgpu: SMU driver if version not matched
[ 5503.953496] amdgpu 0000:03:00.0: amdgpu: use vbios provided pptable
[ 5504.005938] amdgpu 0000:03:00.0: amdgpu: SMU is resumed successfully!
[ 5504.007193] [drm] DMUB hardware initialized: version=0x0202001E
[ 5504.327974] [drm] kiq ring mec 2 pipe 1 q 0
[ 5504.332179] [drm] VCN decode and encode initialized successfully(under DPG Mode).
[ 5504.333071] [drm] JPEG decode initialized successfully.
[ 5504.333117] amdgpu 0000:03:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
[ 5504.333118] amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[ 5504.333119] amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[ 5504.333119] amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
[ 5504.333120] amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
[ 5504.333120] amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
[ 5504.333121] amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
[ 5504.333121] amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
[ 5504.333122] amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
[ 5504.333122] amdgpu 0000:03:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
[ 5504.333123] amdgpu 0000:03:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
[ 5504.333123] amdgpu 0000:03:00.0: amdgpu: ring sdma1 uses VM inv eng 13 on hub 0
[ 5504.333124] amdgpu 0000:03:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 1
[ 5504.333124] amdgpu 0000:03:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 1
[ 5504.333125] amdgpu 0000:03:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 1
[ 5504.333125] amdgpu 0000:03:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 1
[ 5504.336840] amdgpu 0000:03:00.0: amdgpu: recover vram bo from shadow start
[ 5504.342479] amdgpu 0000:03:00.0: amdgpu: recover vram bo from shadow done
[ 5504.342481] [drm] Skip scheduling IBs!
[ 5504.342482] [drm] Skip scheduling IBs!
[ 5504.342489] amdgpu 0000:03:00.0: amdgpu: GPU reset(2) succeeded!
[ 5504.342591] [drm] Skip scheduling IBs!
[ 5504.342606] [drm] Skip scheduling IBs!
[ 5504.342609] [drm] Skip scheduling IBs!
[ 5504.342623] [drm] Skip scheduling IBs!
[ 5504.613573] ------------[ cut here ]------------
[ 5504.613579] WARNING: CPU: 17 PID: 19476 at ../drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c:655 amdgpu_irq_put+0x68/0x90 [amdgpu]
[ 5504.613976] Modules linked in: snd_usb_audio snd_usbmidi_lib snd_rawmidi snd_seq_device af_packet vboxnetadp(OEN) vboxnetflt(OEN) vboxdrv(OEN) qrtr(N) ns(N) dmi_sysfs snd_sof_pci_intel_tgl snd_sof_intel_hda_common soundwire_intel soundwire_generic_allocation soundwire_cadence snd_sof_intel_hda snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_sof_utils snd_soc_hdac_hda snd_hda_codec_realtek snd_soc_acpi_intel_match snd_soc_acpi soundwire_bus snd_hda_codec_generic snd_hda_ext_core ledtrig_audio snd_soc_core intel_rapl_msr intel_rapl_common intel_pmc_core snd_hda_codec_hdmi x86_pkg_temp_thermal snd_compress intel_powerclamp snd_pcm_dmaengine coretemp gspca_ov534_9(N) snd_hda_intel gspca_main(N) snd_intel_dspcfg iTCO_wdt snd_intel_sdw_acpi intel_pmc_bxt videobuf2_vmalloc iTCO_vendor_support videobuf2_memops snd_hda_codec kvm_intel videobuf2_v4l2 snd_hda_core videobuf2_common snd_hwdep nls_iso8859_1 videodev snd_pcm nls_cp437 vfat hp_wmi snd_timer i2c_i801 sparse_keymap kvm fat
[ 5504.614040] i2c_smbus mc joydev platform_profile snd e1000e rfkill irqbypass pcspkr wmi_bmof soundcore thermal acpi_pad acpi_tad(N) button fuse configfs efi_pstore(N) ip_tables x_tables ext4 crc16 mbcache jbd2 hid_generic usbhid amdgpu drm_ttm_helper ttm mfd_core iommu_v2 gpu_sched i2c_algo_bit drm_buddy drm_display_helper drm_kms_helper crc32_pclmul crc32c_intel sr_mod cdrom syscopyarea xhci_pci sysfillrect sysimgblt xhci_pci_renesas ghash_clmulni_intel fb_sys_fops drm xhci_hcd ahci nvme libahci aesni_intel nvme_core libata usbcore crypto_simd cryptd nvme_common cec t10_pi serio_raw crc64_rocksoft_generic rc_core crc64_rocksoft crc64 wmi video sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_mod msr efivarfs
[ 5504.614115] Supported: No, Unsupported modules are loaded
[ 5504.614118] CPU: 17 PID: 19476 Comm: kworker/17:0 Tainted: G OE N 5.14.21-150500.55.31-default #1 SLE15-SP5 50ce754eb473b546d66f6e8b4e6e4e306f340e78
[ 5504.614126] Hardware name: HP HP Z2 Tower G9 Workstation Desktop PC/895C, BIOS U50 Ver. 02.02.02 06/28/2023
[ 5504.614129] Workqueue: events drm_mode_rmfb_work_fn [drm]
[ 5504.614197] RIP: 0010:amdgpu_irq_put+0x68/0x90 [amdgpu]
[ 5504.614488] Code: e8 48 8b 53 08 f0 ff 0c 82 b8 00 00 00 00 74 09 5b 5d 41 5c c3 cc cc cc cc 89 ea 48 89 de 4c 89 e7 5b 5d 41 5c e9 88 fd ff ff <0f> 0b b8 ea ff ff ff eb dd b8 ea ff ff ff c3 cc cc cc cc b8 fe ff
[ 5504.614492] RSP: 0018:ffffb33201bab908 EFLAGS: 00010046
[ 5504.614496] RAX: 0000000000000000 RBX: ffff9fb8221a6580 RCX: ffffffffc0f236c0
[ 5504.614499] RDX: ffff9fb8191ce6a0 RSI: 0000000000000001 RDI: ffff9fb8221a6580
[ 5504.614501] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
[ 5504.614503] R10: 0000000000000000 R11: ffffb33201bab808 R12: ffff9fb8221a0000
[ 5504.614505] R13: ffff9fb85cf40c00 R14: 0000000000000002 R15: ffff9fb8013fb5c0
[ 5504.614507] FS: 0000000000000000(0000) GS:ffff9fbf5f440000(0000) knlGS:0000000000000000
[ 5504.614511] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5504.614514] CR2: 00007fe78001dff8 CR3: 000000045c810000 CR4: 0000000000750ee0
[ 5504.614517] PKRU: 55555554
[ 5504.614518] Call Trace:
[ 5504.614524] <TASK>
[ 5504.614529] dm_disable_vblank+0x51/0x130 [amdgpu 2c72935baa5937dfb9e671dc8e6b836e4a24dc2d]
[ 5504.614974] drm_vblank_disable_and_save+0xab/0xf0 [drm b7a9a327381557ef05f2df2f04ddba3d982e9f24]
[ 5504.615022] drm_crtc_vblank_off+0xbd/0x250 [drm b7a9a327381557ef05f2df2f04ddba3d982e9f24]
[ 5504.615061] amdgpu_dm_atomic_commit_tail+0x178/0x3230 [amdgpu 2c72935baa5937dfb9e671dc8e6b836e4a24dc2d]
[ 5504.615471] ? dcn20_populate_dml_pipes_from_context+0x116/0xe40 [amdgpu 2c72935baa5937dfb9e671dc8e6b836e4a24dc2d]
[ 5504.615896] ? dcn30_internal_validate_bw+0xf4/0xa40 [amdgpu 2c72935baa5937dfb9e671dc8e6b836e4a24dc2d]
[ 5504.616317] ? slab_post_alloc_hook+0x4f/0x250
[ 5504.616327] ? dcn30_validate_bandwidth+0x110/0x2d0 [amdgpu 2c72935baa5937dfb9e671dc8e6b836e4a24dc2d]
[ 5504.616727] ? dc_validate_global_state+0x2c9/0x3a0 [amdgpu 2c72935baa5937dfb9e671dc8e6b836e4a24dc2d]
[ 5504.617094] ? dma_resv_iter_first_unlocked+0x62/0x70
[ 5504.617102] ? dma_resv_get_fences+0x4d/0x230
[ 5504.617107] ? dma_resv_get_singleton+0x2d/0x110
[ 5504.617112] ? drm_gem_plane_helper_prepare_fb+0xf2/0x1f0 [drm_kms_helper b9d92b90f05cb9ae9f6cf3d96cd9cdabf337b730]
[ 5504.617142] ? wait_for_completion_timeout+0xd1/0x100
[ 5504.617150] commit_tail+0x91/0x120 [drm_kms_helper b9d92b90f05cb9ae9f6cf3d96cd9cdabf337b730]
[ 5504.617172] drm_atomic_helper_commit+0x10f/0x140 [drm_kms_helper b9d92b90f05cb9ae9f6cf3d96cd9cdabf337b730]
[ 5504.617191] drm_atomic_commit+0x93/0xc0 [drm b7a9a327381557ef05f2df2f04ddba3d982e9f24]
[ 5504.617238] ? __drm_printfn_seq_file+0x20/0x20 [drm b7a9a327381557ef05f2df2f04ddba3d982e9f24]
[ 5504.617279] drm_framebuffer_remove+0x491/0x4d0 [drm b7a9a327381557ef05f2df2f04ddba3d982e9f24]
[ 5504.617323] drm_mode_rmfb_work_fn+0x6c/0x80 [drm b7a9a327381557ef05f2df2f04ddba3d982e9f24]
[ 5504.617360] process_one_work+0x264/0x440
[ 5504.617368] worker_thread+0x217/0x3c0
[ 5504.617372] ? process_one_work+0x440/0x440
[ 5504.617376] kthread+0x154/0x180
[ 5504.617380] ? set_kthread_struct+0x50/0x50
[ 5504.617384] ret_from_fork+0x1f/0x30
[ 5504.617390] </TASK>
[ 5504.617391] ---[ end trace c3c72df3d54bc608 ]---
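The named fields printed under each page fault (MORE_FAULTS, WALKER_ERROR and so on) look like bit fields of the `..._L2_PROTECTION_FAULT_STATUS` value on the line above them. Below is a minimal Python sketch for decoding such a value by hand. The bit layout in `FIELDS` is an assumption inferred from the values in this log, not taken from AMD documentation, and `decode_fault_status` is a hypothetical helper name.

```python
# Assumed layout: field name -> (bit offset, width in bits).
# Inferred from this log; it may differ on other GPU generations.
FIELDS = {
    "MORE_FAULTS": (0, 1),
    "WALKER_ERROR": (1, 3),
    "PERMISSION_FAULTS": (4, 4),
    "MAPPING_ERROR": (8, 1),
    "CID": (9, 9),    # the "Faulty UTCL2 client ID" number
    "RW": (18, 1),
    "VMID": (20, 4),
}


def decode_fault_status(status: int) -> dict:
    """Split a raw fault status word into the assumed bit fields."""
    return {
        name: (status >> shift) & ((1 << width) - 1)
        for name, (shift, width) in FIELDS.items()
    }


if __name__ == "__main__":
    # The two non-zero status values that appear in the log above.
    for raw in (0x0000073A, 0x00601031):
        fields = decode_fault_status(raw)
        print(f"0x{raw:08X}:", ", ".join(f"{k}=0x{v:x}" for k, v in fields.items()))
```

Under this layout, 0x0000073A decodes to WALKER_ERROR 0x5, PERMISSION_FAULTS 0x3, MAPPING_ERROR 0x1 and client ID 0x3, and 0x00601031 decodes to MORE_FAULTS 0x1, PERMISSION_FAULTS 0x3, client ID 0x8 and VMID 6. Both match the lines the driver printed. The follow-up faults with status 0x00000000 decode to all zeros.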
Teuniz
October 18, 2023, 6:19am
12
I don’t think so. Apart from the fact that it’s brand new, kernel 5.14.21-150500.55.19-default works fine.
I guess there aren’t many people using the combination of Linux, a Radeon Pro W6600 and the open source driver.
Teuniz
November 19, 2023, 5:07pm
13
After some DDG’ing, I found multiple complaints about kernel crashes caused by the amdgpu driver:
kernel, crash, amdgpu
https://bbs.archlinux.org/viewtopic.php?id=284076&p=3
At home I’m using an AMD Radeon RX 550 and it seems rock stable with Leap 15.5 and the latest kernel updates.
Unfortunately, my workstation at work, which has an AMD Radeon Pro W6600, needs to stay on an old and insecure kernel, otherwise it crashes.
When I have time I’ll try the suggested kernel parameters like “amdgpu.mcbp=0” etc.
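When testing a kernel parameter such as “amdgpu.mcbp=0”, it helps to confirm after the reboot that the parameter really reached the kernel. The following Python sketch reads /proc/cmdline for that. `kernel_params` is a hypothetical helper, and the simple whitespace split assumes no quoted values containing spaces.

```python
from pathlib import Path


def kernel_params(cmdline: str) -> dict:
    """Parse a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")  # split on the first '=' only
        params[key] = value if sep else None
    return params


if __name__ == "__main__":
    path = Path("/proc/cmdline")  # command line of the running kernel
    if path.exists():
        params = kernel_params(path.read_text())
        print("amdgpu.mcbp =", params.get("amdgpu.mcbp", "<not set>"))
```

If the output is "<not set>", the bootloader entry that was booted did not carry the parameter.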
Teuniz
February 9, 2024, 4:21pm
14
They asked me to file a new bug report, which I did.
Then they asked me to try a very recent kernel (6.7.3) from a particular repo, which I did.
Now, finally, my new workstation runs stable for the first time since I bought it!
Apparently, there are not many SUSE users with an AMD Radeon Pro W6600 who use the open source driver…