Amdgpu Trouble

The trouble first showed up as a one-minute freeze: “kwin_x11[1741]: Freeze in OpenGL initialization detected”. After updating the machine, the following error appeared during boot: “kwin_x11[1431]: kwin_core: Compositing is not possible”.

localhost:~ # inxi -SCGxxx
System:    Host: localhost.localdomain Kernel: 5.8.4-1-default x86_64 bits: 64 compiler: N/A Console: tty 1 wm: kwin_x11 
           dm: SDDM Distro: openSUSE Tumbleweed 20200906 
CPU:       Topology: Quad Core model: AMD Ryzen 5 3400G with Radeon Vega Graphics bits: 64 type: MT MCP arch: Zen+ rev: 1 
           L1 cache: 384 KiB L2 cache: 2048 KiB L3 cache: 4096 KiB 
           flags: avx avx2 lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm bogomips: 59090 
           Speed: 1395 MHz min/max: 1400/3700 MHz boost: enabled Core speeds (MHz): 1: 1398 2: 1317 3: 1256 4: 1257 5: 1397 
           6: 1267 7: 1256 8: 1255 
Graphics:  Device-1: Advanced Micro Devices [AMD/ATI] Picasso driver: amdgpu v: kernel bus ID: 06:00.0 chip ID: 1002:15d8 
           Display: server: X.Org 1.20.9 compositor: kwin_x11 driver: amdgpu FAILED: ati unloaded: fbdev,modesetting,vesa 
           resolution: 1920x1200~60Hz s-dpi: 96 
           OpenGL: renderer: AMD RAVEN (DRM 3.38.0 5.8.4-1-default LLVM 10.0.1) v: 4.6 Mesa 20.1.7 direct render: Yes 
localhost:~ # 

Any ideas?

Selecting System Settings > Display and Monitor > Rendering Backend > OpenGL 3.1 seems to fix the problem.

More trouble: https://bugzilla.opensuse.org/show_bug.cgi?id=1177428

[    4.355661] kernel: amdgpu 0000:06:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring gfx test failed (-110)
[    4.355743] kernel: [drm:amdgpu_device_ip_init [amdgpu]] *ERROR* hw_init of IP block <gfx_v9_0> failed -110
[    4.355745] kernel: amdgpu 0000:06:00.0: amdgpu: amdgpu_device_ip_init failed
[    4.355748] kernel: amdgpu 0000:06:00.0: amdgpu: Fatal error during GPU init
[    4.430736] kernel: BUG: kernel NULL pointer dereference, address: 0000000000000008
[    4.430738] kernel: #PF: supervisor read access in kernel mode
[    4.430739] kernel: #PF: error_code(0x0000) - not-present page
[    4.263558] systemd-udevd[316]: 0000:06:00.0: Worker [334] failed
[    4.432830] kernel: BUG: unable to handle page fault for address: 0000000000016928
[    4.432835] kernel: #PF: supervisor read access in kernel mode
[    4.432837] kernel: #PF: error_code(0x0000) - not-present page
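
In case it helps anyone hitting the same thing: a rough sketch of a filter that pulls lines like the ones above out of a full boot log. The function name `amdgpu_errors` is made up for illustration, and the pattern is only a guess based on the messages quoted in this thread, so it may miss other failure modes.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on stdin and keep only lines that
# look like amdgpu errors or kernel BUG reports (pattern derived from the
# messages quoted in this thread, not from any official list).
amdgpu_errors() {
    grep -E 'amdgpu.*(\*ERROR\*|Fatal error|failed)|BUG: '
}

# Possible use on a running system (may need root):
#   journalctl -k -b | amdgpu_errors
#   dmesg | amdgpu_errors
```

Lines that only mention the PCI address, such as the systemd-udevd worker failure, are deliberately not matched by this pattern.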

I had to boot into snapshot 20201007 and roll back. :(

Fixed the failed boot by booting with ‘nomodeset’ and running the following commands:


rm -rf /lib/firmware/amdgpu
zypper dup
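
Before removing the firmware directory it may be worth double-checking that the current boot really has ‘nomodeset’ on the kernel command line. A minimal sketch, assuming a POSIX shell; the function name `has_nomodeset` is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given kernel command line contains
# "nomodeset" as a separate word.
has_nomodeset() {
    case " $1 " in
        *" nomodeset "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Possible use on a running system:
#   has_nomodeset "$(cat /proc/cmdline)" && echo "booted with nomodeset"
```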

The suspend/resume issue still persists.

Fixed: https://bugzilla.opensuse.org/show_bug.cgi?id=1177428#c79

The upstream developers suggest tracking down the bug by bisection. Building a kernel from the current version at https://github.com/openSUSE/kernel, installing it and running it is straightforward. However, I am lost when it comes to building older versions. Any suggestions?