AMDGPU failure after zypper dup today

I am seeing


amdgpu 0000:2a:00.0: amdgpu: amdgpu_device_ip_init failed

after zypper dup today. I have been able to boot with nomodeset to get the system up, but booting the older kernel does not fix it. This must be a driver issue for the integrated graphics I have on my Ryzen 3400G. I’ve found bug 1177428 which seems like it contains a similar issue with the AMDGPU and a recent f/w update which I suspect I installed with my update today.

That bug says run


zypper in --oldpackage --force http://download.opensuse.org/history/20201007/tumbleweed/repo/oss/noarch/kernel-firmware-amdgpu-20200916-1.1.noarch.rpm

but that gives errors when I try to run it


Problem: kernel-firmware-all-20201005-1.1.noarch requires kernel-firmware-amdgpu = 20201005, but this requirement cannot be provided
 Solution 1: deinstallation of kernel-firmware-all-20201005-1.1.noarch
 Solution 2: do not install kernel-firmware-amdgpu-20200916-1.1.noarch
 Solution 3: break kernel-firmware-all-20201005-1.1.noarch by ignoring some of its dependencies


but I cancelled as I was not sure which option, if any, was safe, and I don’t want to make things worse. I’m guessing option 3, but I’m not sure?

Having just looked at the kernel-firmware-all package it is a meta package so I guess option 3 is worth a try…

Stuart

Well I did option 3 and now my system comes up OK. I guess there is a bug in the latest kernel-firmware-amdgpu. My question is will this get fixed by that bug or do I need to open a new one as the bug has other issues as well?

I’ve locked the kernel-firmware-amdgpu for now until it gets resolved.

Stuart

Snapshot 20201008 works fine with the old firmware on one of my installations. However, suspend/resume is still broken. I think no new bug report is needed: https://bugzilla.opensuse.org/show_bug.cgi?id=1177428#c26

Thanks, I’ll wait to see how the bug gets resolved and leave the f/w locked for now. I don’t use suspend or hibernate as my SSD is so fast it boots quicker than the old system resumed from hibernate!

Stuart

Hello,

Just to mention, I’m getting the same issue on a Ryzen 3750H. I reverted to the snapshot from before the zypper dup, as it didn’t seem to boot with just the old kernel.

amdgpu 0000:05:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring gfx test failed (-110)
[drm:amdgpu_device_ip_init [amdgpu]] *ERROR* hw_init of IP block <gfx_v9_0> failed -110
amdgpu 0000:05:00.0: amdgpu: amdgpu_device_ip_init failed

I can’t test suspend/resume since I don’t use it, but the system not booting was odd enough.
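
For anyone checking whether they are hitting the same failure, here is a minimal sketch of a filter that pulls the relevant lines out of kernel log text. The function name is made up for illustration, and the patterns are taken only from the messages quoted in this thread, so other amdgpu failures may not match.

```shell
# Filter kernel log text on stdin down to the amdgpu init failures
# quoted in this thread. The function name is illustrative only.
amdgpu_init_errors() {
    grep -E 'amdgpu_device_ip_init failed|hw_init of IP block .* failed|ring .* test failed'
}
```

It can be fed from the current boot with `dmesg | amdgpu_init_errors`, or from the previous boot with `journalctl -k -b -1 | amdgpu_init_errors`, assuming the journal is persistent.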

Hibernate has always been slow, but suspend to RAM is less than 1 second on my Intel machines. On resume you can hear a video already playing before the display of the monitor is up.

Hmm, I seem to have the same problem - with Intel and NVidia on a laptop…
At least it sounds very similar to me and I started a thread here: https://forums.opensuse.org/showthread.php/545803-Tumbleweed-dup-10-Oct-2020-lost-GUI-with-nouveau-(works-with-nomodeset)?p=2973343

Same issue here:
[AMD/ATI] Picasso [1002:15d8] (rev c3)
ATOM BIOS: 113-PICASSO-114

Solution works:
Boot with the nomodeset parameter to reach the desktop, then downgrade the broken package:
zypper in --oldpackage --force http://download.opensuse.org/history/20201007/tumbleweed/repo/oss/noarch/kernel-firmware-amdgpu-20200916-1.1.noarch.rpm
Choose solution 3: break kernel-firmware-all-20201005-1.1.noarch by ignoring some of its dependencies

Long-time lurker, short-time openSUSE TW user (after many years with other distros).
Just registered to report the same issue.
CPU is an AMD 3400G on an ASRock A300 DeskMini.

After yesterday’s zypper dup, the system would not start this morning, showing:

AMD-Vi: Unable to write to IOMMU perf counter

I could boot by editing the boot command line to add:

nomodeset iommu=off

But then, when GDM prompted for login, it would not log me in. It kept looping back to the login prompt even though I typed the correct password.

I was able to start my system by going back to a snapshot from a few days ago.

I can use my system in that previous state, but since I am not experienced with openSUSE TW: how do I properly update my latest snapshot (the one giving the error) once a fix is out?

thanks.

Firmware is broken:


erlangen:~ # ll /lib/firmware/amdgpu/picasso*.xz
--w------- 11 root root    0 Oct 12 09:21 /lib/firmware/amdgpu/picasso_asd.bin.xz
--w------- 11 root root    0 Oct 12 09:21 /lib/firmware/amdgpu/picasso_ce.bin.xz
--w------- 11 root root    0 Oct 12 09:21 /lib/firmware/amdgpu/picasso_gpu_info.bin.xz
--w------- 11 root root    0 Oct 12 09:21 /lib/firmware/amdgpu/picasso_me.bin.xz
--w------- 11 root root    0 Oct 12 09:21 /lib/firmware/amdgpu/picasso_mec.bin.xz
--w------- 11 root root    0 Oct 12 09:21 /lib/firmware/amdgpu/picasso_mec2.bin.xz
--w------- 11 root root    0 Oct 12 09:21 /lib/firmware/amdgpu/picasso_pfp.bin.xz
--w------- 11 root root    0 Oct 12 09:21 /lib/firmware/amdgpu/picasso_rlc.bin.xz
-rw-r--r--  1 root root 9160 Oct  8 13:23 /lib/firmware/amdgpu/picasso_rlc_am4.bin.xz
--w------- 11 root root    0 Oct 12 09:21 /lib/firmware/amdgpu/picasso_sdma.bin.xz
-rw-r--r--  1 root root 9548 Oct  8 13:23 /lib/firmware/amdgpu/picasso_ta.bin.xz
--w------- 11 root root    0 Oct 12 09:21 /lib/firmware/amdgpu/picasso_vcn.bin.xz
erlangen:~ # 

Install the previous version: https://forums.opensuse.org/showthread.php/545773-AMDGPU-failure-after-zypper-dup-today?p=2973652#post2973652 Then lock it and upgrade to the current snapshot.
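
To check whether your own installation has the same zero-byte files as in the listing above, something like the following sketch may help. The function name and the `*.bin.xz` pattern are assumptions based on that listing, not anything the package itself provides.

```shell
# List zero-byte compressed firmware files under a directory
# (default: /lib/firmware/amdgpu). Function name is illustrative only.
empty_firmware() {
    find "${1:-/lib/firmware/amdgpu}" -type f -name '*.bin.xz' -size 0 | sort
}
```

Running `empty_firmware` with no argument checks the default path. Empty output means no zero-byte firmware files were found there.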

My problem is that I can’t reach the desktop with my latest snapshot; I run into that password trouble.

Log in as root on a virtual console and fix the system. Once fixed, it will boot into the desktop.

:~ # zypper in --oldpackage --force http://download.opensuse.org/repositories/home:/tiwai:/test:/fw-fix/openSUSE_Factory/noarch/kernel-firmware-amdgpu-20201005-336.1.noarch.rpm

fixed it but I have installed this:

zypper in --oldpackage --force http://download.opensuse.org/history…1.1.noarch.rpm

Instead of your fix.

How do I lock it so it does not get upgraded to the broken firmware?

Thank you

You would add a lock: zypper al kernel-firmware-amdgpu

However, it’s only the upgrade which is broken. See https://forums.opensuse.org/showthread.php/544219-Amdgpu-Trouble?p=2973760#post2973760 for fixing the upgrade.

Possibly, I have the same problem after today’s dup to 20201009.
The system halts before prompting for the password to the /home partition.
After a couple of restarts (the last one without the quiet option) the pre-dup snapshot was broken too (maybe after purging old kernels).
The nomodeset option allows me to boot into the GUI.

The system is Ryzen 3600, ASUS X470 Pro, GTX980, nouveau, KDE desktop.

There is a kernel-firmware update for TW available today which has a new amdgpu version in it (20201005-3.1). Does this fix the issue where you have to nomodeset to get the system up? Anyone tried it yet?

Stuart

Decided to bite the bullet and install the amdgpu firmware update, and yes, it boots fine. Not sure about the other issues like suspend/resume, but at least my system came up fine.

Stuart