AMD Driver doesn't load after kernel upgrade to 6.14.6-1+

Hello everyone,

Around May I did my monthly zypper dup, and after updating the kernel I noticed my external monitor was not working.

System

openSUSE Tumbleweed
Framework Intel 11th Gen
MSI AMD RX 560 connected via a Thunderbolt enclosure

Please note that the Thunderbolt enclosure works perfectly fine: it charges the laptop, and the other peripherals connected to it (mouse, keyboard, Ethernet) work perfectly fine too.

The external monitor is connected to the external GPU, and everything had been working perfectly fine up until the latest kernel upgrade.

After rebooting, I noticed my external monitor was not being recognized and the fan on the GPU was off… I sensed something was wrong, and checking nvtop I noticed my GPU was not detected. I tried the classic unplug/replug (both the Thunderbolt cable and the GPU in the PCIe slot) with no success. At this point I feared the worst… But I decided to boot back into the old kernel and… the GPU and the monitor turned on fine. I checked nvtop and ran a few games, and everything seemed normal. Then I rebooted into the new kernel, and neither the GPU nor the monitor turned on.

All right, so I ran dmesg on the new and old kernels to compare, and to my surprise the new kernel can’t load the amdgpu driver:

dmesg | grep amd (6.14.6-1-default, NEW, NOT WORKING)

[  25.968117] [   T3229] [drm] amdgpu kernel modesetting enabled.
[   25.968243] [   T3229] amdgpu: Virtual CRAT table created for CPU
[   25.968253] [   T3229] amdgpu: Topology: Add CPU node
[   25.968352] [   T3229] amdgpu 0000:82:00.0: enabling device (0000 -> 0003)
[   25.968766] [   T3229] amdgpu 0000:82:00.0: amdgpu: detected ip block number 0 <vi_common>
[   25.968767] [   T3229] amdgpu 0000:82:00.0: amdgpu: detected ip block number 1 <gmc_v8_0>
[   25.968769] [   T3229] amdgpu 0000:82:00.0: amdgpu: detected ip block number 2 <tonga_ih>
[   25.968770] [   T3229] amdgpu 0000:82:00.0: amdgpu: detected ip block number 3 <gfx_v8_0>
[   25.968771] [   T3229] amdgpu 0000:82:00.0: amdgpu: detected ip block number 4 <sdma_v3_0>
[   25.968772] [   T3229] amdgpu 0000:82:00.0: amdgpu: detected ip block number 5 <powerplay>
[   25.968773] [   T3229] amdgpu 0000:82:00.0: amdgpu: detected ip block number 6 <dm>
[   25.968774] [   T3229] amdgpu 0000:82:00.0: amdgpu: detected ip block number 7 <uvd_v6_0>
[   25.968775] [   T3229] amdgpu 0000:82:00.0: amdgpu: detected ip block number 8 <vce_v3_0>
[   26.245651] [   T3229] amdgpu 0000:82:00.0: amdgpu: Fetched VBIOS from ROM BAR
[   26.245658] [   T3229] amdgpu: ATOM BIOS: 113-C98121-M01
[   26.251947] [   T3229] amdgpu 0000:82:00.0: amdgpu: Trusted Memory Zone (TMZ) feature not supported
[   26.251952] [   T3229] amdgpu 0000:82:00.0: amdgpu: PCIE atomic ops is not supported
[   26.723470] [   T3229] amdgpu 0000:82:00.0: BAR 2 [mem 0x6070000000-0x60701fffff 64bit pref]: releasing
[   26.723475] [   T3229] amdgpu 0000:82:00.0: BAR 0 [??? 0x00000000 flags 0x0]: releasing
[   26.723476] [   T3229] [drm:amdgpu_device_resize_fb_bar [amdgpu]] *ERROR* Problem resizing BAR0 (-16).
[   26.723725] [   T3229] amdgpu 0000:82:00.0: BAR 2 [mem 0x6070000000-0x60701fffff 64bit pref]: assigned
[   26.723750] [   T3229] amdgpu 0000:82:00.0: amdgpu: VRAM: 4096M 0x000000F400000000 - 0x000000F4FFFFFFFF (4096M used)
[   26.723751] [   T3229] amdgpu 0000:82:00.0: amdgpu: GART: 256M 0x000000FF00000000 - 0x000000FF0FFFFFFF
[   26.723767] [   T3229] Modules linked in: amdgpu(+) amdxcp drm_panel_backlight_quirks rfcomm snd_seq_dummy snd_hrtimer snd_seq snd_seq_device af_packet nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat ebtable_nat ebtable_broute ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables vboxnetadp(OE) vboxnetflt(OE) qrtr cmac vboxdrv(OE) algif_hash algif_skcipher nf_tables af_alg bnep iptable_filter binfmt_misc nls_iso8859_1 nls_cp437 cdc_mbim cdc_wdm vfat fat snd_sof_pci_intel_tgl snd_sof_pci_intel_cnl snd_sof_intel_hda_generic soundwire_intel soundwire_cadence snd_sof_intel_hda_common snd_soc_hdac_hda snd_sof_intel_hda_mlink snd_sof_intel_hda snd_sof_pci snd_sof_xtensa_dsp snd_sof iwlmvm snd_sof_utils snd_soc_acpi_intel_match
[   26.723892] [   T3229]  amdgpu_bo_init+0x3b/0x80 [amdgpu ceeb1a328b927030634b9ceb6c638b1de44ce6ca]
[   26.724064] [   T3229]  gmc_v8_0_sw_init+0x2d2/0x6c0 [amdgpu ceeb1a328b927030634b9ceb6c638b1de44ce6ca]
[   26.724375] [   T3229]  amdgpu_device_init.cold+0x166c/0x21db [amdgpu ceeb1a328b927030634b9ceb6c638b1de44ce6ca]
[   26.724727] [   T3229]  amdgpu_driver_load_kms+0x19/0x70 [amdgpu ceeb1a328b927030634b9ceb6c638b1de44ce6ca]
[   26.724988] [   T3229]  amdgpu_pci_probe+0x199/0x430 [amdgpu ceeb1a328b927030634b9ceb6c638b1de44ce6ca]
[   26.725283] [   T3229]  ? __pfx_amdgpu_init+0x10/0x10 [amdgpu ceeb1a328b927030634b9ceb6c638b1de44ce6ca]
[   26.725602] [   T3229] [drm:amdgpu_bo_init [amdgpu]] *ERROR* Unable to set WC memtype for the aperture base
[   26.725864] [   T3229] [drm:amdgpu_device_init.cold [amdgpu]] *ERROR* sw_init of IP block <gmc_v8_0> failed -22
[   26.726207] [   T3229] amdgpu 0000:82:00.0: amdgpu: amdgpu_device_ip_init failed
[   26.726209] [   T3229] amdgpu 0000:82:00.0: amdgpu: Fatal error during GPU init
[   26.726266] [   T3229] amdgpu 0000:82:00.0: amdgpu: amdgpu: finishing device.
[   26.726421] [   T3229] amdgpu 0000:82:00.0: probe with driver amdgpu failed with error -22

dmesg | grep amd (6.14.1-1-default, OLD, WORKING)

[   14.948100] [   T1928] [drm] amdgpu kernel modesetting enabled.
[   14.949267] [   T1928] amdgpu: Virtual CRAT table created for CPU
[   14.950370] [   T1928] amdgpu: Topology: Add CPU node
[   14.951532] [   T1928] amdgpu 0000:82:00.0: enabling device (0000 -> 0003)
[   14.956549] [   T1928] amdgpu 0000:82:00.0: amdgpu: detected ip block number 0 <vi_common>
[   14.957542] [   T1928] amdgpu 0000:82:00.0: amdgpu: detected ip block number 1 <gmc_v8_0>
[   14.958515] [   T1928] amdgpu 0000:82:00.0: amdgpu: detected ip block number 2 <tonga_ih>
[   14.959479] [   T1928] amdgpu 0000:82:00.0: amdgpu: detected ip block number 3 <gfx_v8_0>
[   14.960446] [   T1928] amdgpu 0000:82:00.0: amdgpu: detected ip block number 4 <sdma_v3_0>
[   14.961577] [   T1928] amdgpu 0000:82:00.0: amdgpu: detected ip block number 5 <powerplay>
[   14.962804] [   T1928] amdgpu 0000:82:00.0: amdgpu: detected ip block number 6 <dm>
[   14.963906] [   T1928] amdgpu 0000:82:00.0: amdgpu: detected ip block number 7 <uvd_v6_0>
[   14.964961] [   T1928] amdgpu 0000:82:00.0: amdgpu: detected ip block number 8 <vce_v3_0>
[   15.242315] [   T1928] amdgpu 0000:82:00.0: amdgpu: Fetched VBIOS from ROM BAR
[   15.243464] [   T1928] amdgpu: ATOM BIOS: 113-C98121-M01
[   15.253963] [   T1928] amdgpu 0000:82:00.0: amdgpu: Trusted Memory Zone (TMZ) feature not supported
[   15.255108] [   T1928] amdgpu 0000:82:00.0: amdgpu: PCIE atomic ops is not supported
[   15.730616] [   T1928] amdgpu 0000:82:00.0: BAR 2 [mem 0x6070000000-0x60701fffff 64bit pref]: releasing
[   15.731639] [   T1928] amdgpu 0000:82:00.0: BAR 0 [mem 0x6060000000-0x606fffffff 64bit pref]: releasing
[   15.738177] [   T1928] amdgpu 0000:82:00.0: BAR 0 [mem size 0x100000000 64bit pref]: can't assign; no space
[   15.739384] [   T1928] amdgpu 0000:82:00.0: BAR 0 [mem size 0x100000000 64bit pref]: failed to assign
[   15.740601] [   T1928] amdgpu 0000:82:00.0: BAR 2 [mem 0x52200000-0x523fffff 64bit pref]: assigned
[   15.750813] [   T1928] amdgpu 0000:82:00.0: BAR 0 [mem 0x6060000000-0x606fffffff 64bit pref]: assigned
[   15.753106] [   T1928] amdgpu 0000:82:00.0: amdgpu: VRAM: 4096M 0x000000F400000000 - 0x000000F4FFFFFFFF (4096M used)
[   15.754253] [   T1928] amdgpu 0000:82:00.0: amdgpu: GART: 256M 0x000000FF00000000 - 0x000000FF0FFFFFFF
[   15.757850] [   T1928] [drm] amdgpu: 4096M of VRAM memory ready
[   15.758974] [   T1928] [drm] amdgpu: 32044M of GTT memory ready.
[   15.771258] [   T1928] amdgpu: hwmgr_sw_init smu backed is polaris10_smu
[   16.139077] [   T1928] snd_hda_intel 0000:82:00.1: bound 0000:82:00.0 (ops amdgpu_dm_audio_component_bind_ops [amdgpu])
[   16.323496] [   T1928] kfd kfd: amdgpu: skipped device 1002:67ff, PCI rejects atomics 730<0
[   16.324620] [   T1928] amdgpu 0000:82:00.0: amdgpu: SE 2, SH per SE 1, CU per SH 8, active_cu_number 16
[   16.330503] [   T1928] amdgpu 0000:82:00.0: amdgpu: Using BOCO for runtime pm
[   16.331818] [   T1928] amdgpu 0000:82:00.0: [drm] Registered 5 planes with drm panic
[   16.332961] [   T1928] [drm] Initialized amdgpu 3.61.0 for 0000:82:00.0 on minor 0
[   16.349142] [   T1928] amdgpu 0000:82:00.0: [drm] fb1: amdgpudrmfb frame buffer device
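
Comparing the two captures by eye is hard because every line carries a different timestamp and task ID. A minimal sketch of one way to line them up, assuming each `dmesg | grep amd` output has been saved as text: it strips the `[time] [Tnnnn]` prefix and then diffs the remaining messages. The function names `normalize` and `diff_logs` are made up for this illustration.

```python
import difflib
import re

# Matches the "[   25.968117] [   T3229] " prefix seen in the logs above:
# a timestamp, optionally followed by a task-ID column.
PREFIX = re.compile(r"^\[\s*\d+\.\d+\]\s*(?:\[\s*T\d+\]\s*)?")


def normalize(log_text):
    """Drop timestamps and task IDs so only the messages are compared."""
    return [
        PREFIX.sub("", line).rstrip()
        for line in log_text.splitlines()
        if line.strip()
    ]


def diff_logs(old_text, new_text):
    """Return unified-diff lines between two normalized dmesg captures."""
    return list(
        difflib.unified_diff(
            normalize(old_text),
            normalize(new_text),
            fromfile="old-kernel",
            tofile="new-kernel",
            lineterm="",
            n=0,  # no context lines, show only what changed
        )
    )


# Two BAR 0 lines taken from the logs above, one per kernel.
old = "[   15.731639] [   T1928] amdgpu 0000:82:00.0: BAR 0 [mem 0x6060000000-0x606fffffff 64bit pref]: releasing"
new = "[   26.723475] [   T3229] amdgpu 0000:82:00.0: BAR 0 [??? 0x00000000 flags 0x0]: releasing"
for line in diff_logs(old, new):
    print(line)
```

On the logs in this post, the first difference such a diff surfaces is the BAR 0 line: the old kernel releases a real memory range, while the new one reports `[??? 0x00000000 flags 0x0]` just before the resize error.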

I checked today and there is a new kernel, 6.14.6-2-default. I quickly updated to it but no luck: the driver still doesn’t load. Unfortunately my wizard-fu doesn’t go that far and I’m unable to troubleshoot further.
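
For reading the failing log, it helps to know that the negative numbers in the error lines are kernel errno values: -16 is EBUSY and -22 is EINVAL. A small sketch that pulls them out of error lines and names them, assuming it runs on a Linux machine (Python's `errno` table reflects the host OS); `explain_errors` is a hypothetical helper, and it only labels the codes, it does not identify the root cause.

```python
import errno
import os
import re

# A minus sign followed by digits, not glued to a preceding word character,
# so address ranges like 0x6070000000-0x60701fffff are ignored.
NEG_NUMBER = re.compile(r"(?<!\w)-(\d+)\b")


def explain_errors(log_text):
    """Return (code, symbolic name, description) for errno values in error lines."""
    results = []
    for line in log_text.splitlines():
        if "ERROR" not in line and "failed" not in line:
            continue
        for match in NEG_NUMBER.finditer(line):
            code = int(match.group(1))
            name = errno.errorcode.get(code, "unknown")
            results.append((code, name, os.strerror(code)))
    return results


# Two error lines copied from the 6.14.6-1 log.
sample = (
    "[drm:amdgpu_device_resize_fb_bar [amdgpu]] *ERROR* Problem resizing BAR0 (-16).\n"
    "amdgpu 0000:82:00.0: probe with driver amdgpu failed with error -22"
)
for code, name, text in explain_errors(sample):
    print(f"-{code}: {name} ({text})")
```

Read this way, the failing boot goes: BAR 0 resize refused (EBUSY), then the WC memtype error, then `gmc_v8_0` sw_init and the whole probe failing with EINVAL.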

Any ideas? Is this a bug/regression in the kernel/openSUSE?

Thanks,

(Edited OP post to use preformatted text rather than quoted - please note the difference and use the </> button in the editor for this rather than the double quote button. Preformatted is much more readable. Thanks!)


@lakostem There is a rebuilt kernel 6.14.6-2-default in the latest snapshots; I suggest running zypper dup and testing that…

@malcolmlewis I reran zypper dup today, but I could not find any 6.14.6-2-default, only kernel-longterm-6.12.31-1.1.x86_64. Nevertheless I updated and booted into longterm 6.12.31, and it works fine (which I believe is because it is older than 6.14.6).

Any other ideas?
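
The results reported so far can be ordered to narrow down where the breakage appeared. A rough sketch, using only the versions and outcomes mentioned in this thread; `kernel_key` and `regression_window` are hypothetical helper names, and the numeric tuple key is there because plain string comparison would sort something like 6.14.10 before 6.14.6.

```python
import re


def kernel_key(version):
    """Turn a version such as '6.14.6-1' into a sortable tuple (6, 14, 6, 1)."""
    return tuple(int(part) for part in re.findall(r"\d+", version))


def regression_window(results):
    """Return (last good version, first bad version) in version order."""
    last_good = None
    for version in sorted(results, key=kernel_key):
        if results[version]:
            last_good = version
        else:
            return last_good, version
    return last_good, None


# Outcomes reported in this thread (True = amdgpu loads).
reports = {
    "6.12.31-1.1": True,  # kernel-longterm
    "6.14.1-1": True,
    "6.14.6-1": False,
}

print(regression_window(reports))
```

With these three data points the window comes out as 6.14.1-1 to 6.14.6-1, which suggests the change arrived in one of the 6.14 stable updates in between rather than in 6.14 itself.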

@lakostem I suggest running env ZYPP_PCK_PRELOAD=0 zypper -vvv dup or zypper -vvv dup. There have been mirror issues, which should be fixed now.

I just did zypper -vvv dup and got some mesa packages but no new kernel.

I guess I will stay on LTS 6.12.31 until the bug is fixed.

Hello,

I ran zypper dup today and it updated to kernel 6.15. I immediately rebooted into 6.15 and I can confirm it is loading the amdgpu driver correctly, yay.

Issue solved I believe.
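
For anyone who wants to confirm the same thing after a future kernel update, here is a minimal sketch that classifies a saved dmesg capture by looking for the two marker lines that appear in the logs earlier in this thread. `amdgpu_status` is a made-up helper, and the exact message wording is an assumption that may differ on other kernel versions.

```python
def amdgpu_status(dmesg_text):
    """Classify an amdgpu probe as 'loaded', 'failed', or 'unknown'."""
    status = "unknown"
    for line in dmesg_text.splitlines():
        # Final line of the failing 6.14.6-1 log.
        if "probe with driver amdgpu failed" in line:
            return "failed"
        # Printed by the working 6.14.1-1 kernel once init completes.
        if "[drm] Initialized amdgpu" in line:
            status = "loaded"
    return status


working = "[   16.332961] [   T1928] [drm] Initialized amdgpu 3.61.0 for 0000:82:00.0 on minor 0"
broken = "[   26.726421] [   T3229] amdgpu 0000:82:00.0: probe with driver amdgpu failed with error -22"
print(amdgpu_status(working), amdgpu_status(broken))
```

Feeding it the text of `dmesg` from the running kernel gives a quick yes/no before plugging in the monitor and checking nvtop.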
