AMDGPU broken after Kernel Upgrade

Hi,
I’ve been running Leap 42.3 for months with an AMD A8-9600 R7 IGP (Carrizo) using amdgpu with no problems: full desktop effects, HW decoding of H.265, etc. I had been using a kernel from the Kernel:HEAD repository from the start, because that gave me proper suspend-to-RAM, which did not work with the standard kernel. The graphics, however, had been working with the standard kernel, too.

Then, about a week ago, after a regular software update which included a kernel update, my graphics were broken:

  • When booting, most of the time, after the first few lines of boot logging appear, the display briefly flashes and then turns off, with the monitor going to standby. Then the monitor comes back on, but stays black. Switching to another virtual terminal (Ctrl-Alt-F1) does nothing either.
  • About one time out of 10, the system comes up to KDE properly, but without accelerated graphics (no desktop effects, no HW video decoding, etc.).
  • I can reliably boot in recovery mode and then run startx, but that obviously does not have accelerated graphics, either.
  • The behavior is the same no matter which kernel version I boot. In particular, the kernel that had been working fine before now shows the same issues.

So in order to clean things up, I removed all newer kernels and installed the latest one from opensuse-updates, 4.4.104-39-default. But this did not help either; the behavior is still the same.

I have the latest opensuse-updates versions of “kernel-firmware” and “xf86-video-amdgpu” installed, so they should match the kernel.
I also have a bunch of packages (mostly “lib*”) with “vdpau” or “amdgpu” in the name, and “ucode-amd”. I have made sure those are all the most recent from opensuse-updates, too. I have not changed anything there in a long time, so that is probably not the issue.

One peculiar thing is that when I run mkinitrd, I get a bunch of lines like

dracut: Possible missing firmware "amdgpu/stoney_ce.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/carrizo_rlc.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/carrizo_mec2.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/carrizo_mec.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/carrizo_me.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/carrizo_pfp.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/carrizo_ce.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/topaz_sdma1.bin" for kernel module "amdgpu.ko"

But these files are clearly present:

# ll /lib/firmware/amdgpu
total 9636
-rw-r--r-- 1 root root   8832 Mai 30  2017 carrizo_ce.bin
-rw-r--r-- 1 root root  17024 Mai 30  2017 carrizo_me.bin
-rw-r--r-- 2 root root 262784 Mai 30  2017 carrizo_mec2.bin
-rw-r--r-- 2 root root 262784 Mai 30  2017 carrizo_mec.bin
-rw-r--r-- 1 root root  17024 Mai 30  2017 carrizo_pfp.bin
-rw-r--r-- 1 root root  18932 Mai 30  2017 carrizo_rlc.bin
-rw-r--r-- 2 root root  10624 Mai 30  2017 carrizo_sdma1.bin
-rw-r--r-- 2 root root  10624 Mai 30  2017 carrizo_sdma.bin
-rw-r--r-- 1 root root 268000 Mai 30  2017 carrizo_uvd.bin
-rw-r--r-- 1 root root 175840 Mai 30  2017 carrizo_vce.bin
-rw-r--r-- 2 root root   8832 Mai 30  2017 fiji_ce.bin
...
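
A quick way to cross-check the dracut warnings against what is actually on disk is sketched below. It is only a sketch: `check_missing_fw` is a made-up helper name, and it assumes the warning lines have exactly the `Possible missing firmware "…"` form quoted above and that the firmware root is /lib/firmware unless another directory is passed in.

```shell
#!/bin/sh
# Hypothetical helper (not part of dracut): read dracut warnings on stdin and
# report whether each named firmware file exists under a firmware root.
# The root defaults to /lib/firmware; pass another directory as $1 to override.
check_missing_fw() {
    fw_root="${1:-/lib/firmware}"
    # Extract the first quoted string after "Possible missing firmware";
    # lines that do not match the pattern are ignored.
    sed -n 's/.*Possible missing firmware "\([^"]*\)".*/\1/p' |
    while IFS= read -r fw; do
        if [ -e "$fw_root/$fw" ]; then
            echo "present: $fw"
        else
            echo "absent:  $fw"
        fi
    done
}

# Example (as root), not executed here:
#   mkinitrd 2>&1 | check_missing_fw
```

If a file is reported as present while dracut still warns about it, the warning is probably not about the file itself, and the dracut configuration is worth a look.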

More info:

# cat /proc/cmdline                                                                
BOOT_IMAGE=/boot/vmlinuz-4.4.104-39-default root=UUID=68f92734-8c00-4cb5-9800-ca7e17df9309 resume=/dev/disk/by-id/ata-Samsung_SSD_840_Series_S19HNEBD343680A-part2 splash=nosplash quiet showopts

So no “vga=” or anything.

# lsmod | grep -i amd
edac_mce_amd           28672  0 
amdkfd                139264  1 
amd_iommu_v2           20480  1 amdkfd
amdgpu                679936  0 
i2c_algo_bit           16384  1 amdgpu
drm_kms_helper        155648  1 amdgpu
ttm                   106496  1 amdgpu
drm                   393216  3 ttm,drm_kms_helper,amdgpu

# lspci -nnk | grep -A3 VGA
00:01.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Carrizo [1002:9874] (rev e2)
        Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:1e20]
        Kernel modules: amdgpu
00:01.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Kabini HDMI/DP Audio [1002:9840]

So it seems the amdgpu driver is loaded and used.
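
A side note on the lspci output above: it lists "Kernel modules: amdgpu" but no "Kernel driver in use:" line, which usually means the module is available but not actually bound to the device. The sketch below pulls that field out of `lspci -k` style output; `vga_driver_in_use` is a made-up name, and the parsing assumes the usual layout (device lines start at column 0, detail lines are indented).

```shell
#!/bin/sh
# Hypothetical helper: read `lspci -k` / `lspci -nnk` output on stdin and print
# the driver bound to the VGA controller. Prints nothing and returns 1 if the
# VGA device has no "Kernel driver in use:" line.
vga_driver_in_use() {
    awk '
        # A device header line starts with a hex bus address in column 0.
        /^[0-9a-f]/ { in_vga = ($0 ~ /VGA compatible controller/) }
        in_vga && /Kernel driver in use:/ {
            sub(/.*Kernel driver in use:[ \t]*/, "")
            print
            found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Example, not executed here:
#   lspci -nnk | vga_driver_in_use || echo "no driver bound to the VGA device"
```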

# glxgears

shows gears but runs a bit choppy.

# vdpauinfo
display: :0   screen: 0
libva info: VA-API version 0.39.4
libva info: va_getDriverName() returns -1
libva error: va_getDriverName() failed with unknown libva error,driver_name=(null)
API version: 1
Information string: OpenGL/VAAPI backend for VDPAU

Video surface:

name   width height types
-------------------------------------------
420     4096  4096  NV12 YV12 UYVY YUYV Y8U8V8A8 V8U8Y8A8
422     4096  4096  NV12 YV12 UYVY YUYV Y8U8V8A8 V8U8Y8A8
444     4096  4096  NV12 YV12 UYVY YUYV Y8U8V8A8 V8U8Y8A8

Decoder capabilities:

name                        level macbs width height
----------------------------------------------------
MPEG1                          --- not supported ---
MPEG2_SIMPLE                   --- not supported ---
...

So no HW decoding.

Any help or ideas would be much appreciated…
Thanks a lot in advance!
Greetz

kingstah

Hi and welcome to the Forum :)
Sounds like it’s the same type of bug as this one, just with different firmware…
https://bugzilla.opensuse.org/show_bug.cgi?id=1072431

I would suggest filing a new bug for your hardware and referencing the above one in your report:
openSUSE:Submitting bug reports - openSUSE

Thanks. I filed https://bugzilla.opensuse.org/show_bug.cgi?id=1077848

I came across this: https://forums.opensuse.org/showthread.php/528958-amd-driver-not-loaded-during-startup

It turns out I also had some conf files in /etc/dracut.conf.d/:

# ll /etc/dracut.conf.d/
total 20
-rw-r--r-- 1 root root   22 Dez 22 14:13 02-early-microcode.conf
-rw-r--r-- 1 root root  487 Dez 22 14:13 99-debug.conf
-rw-r--r-- 1 root root   96 Nov 12 19:06 amdgpu-4.13.0-2.g7e9e30a-default.conf
-rw-r--r-- 1 root root   88 Sep  9 19:03 amdgpu-pro-4.4.85-22-default.conf
drwxr-xr-x 3 root root 4096 Aug 11 21:44 modules.d

After I deleted them, a new run of mkinitrd did not yield any warnings, and all problems were fixed…
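
For anyone landing here with the same symptoms, the check can be sketched roughly as follows. This is an assumption-laden sketch, not a tested recipe: `list_stale_amdgpu_confs` is a made-up helper, and it assumes the leftover files are named `amdgpu-<kernel version>.conf` or `amdgpu-pro-<kernel version>.conf` as in the listing above. The thread does not show what those files contained.

```shell
#!/bin/sh
# Hypothetical helper: list amdgpu-*.conf files in a dracut config directory
# whose embedded kernel version has no matching directory under the modules
# root, i.e. files that look like leftovers from kernels no longer installed.
# Defaults: /etc/dracut.conf.d and /lib/modules; override with $1 and $2.
list_stale_amdgpu_confs() {
    conf_dir="${1:-/etc/dracut.conf.d}"
    modules_root="${2:-/lib/modules}"
    for f in "$conf_dir"/amdgpu-*.conf; do
        [ -e "$f" ] || continue      # the glob matched nothing
        kver=$(basename "$f" .conf)  # e.g. amdgpu-pro-4.4.85-22-default
        kver=${kver#amdgpu-}         # drop the "amdgpu-" prefix
        kver=${kver#pro-}            # drop "pro-" if present
        if [ ! -d "$modules_root/$kver" ]; then
            echo "$f"
        fi
    done
}

# Possible follow-up (as root), not executed here:
#   list_stale_amdgpu_confs          # review the candidates first
#   rpm -qf <file>                   # see whether any package owns the file
#   mv <file> /root/                 # move it aside rather than deleting it
#   mkinitrd                         # rebuild the initrd and watch for warnings
```

Moving the files aside instead of deleting them keeps a way back in case one of them turns out to be needed.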