I have a new system; the headline specs are dual AMD EPYC 7282 16-core processors on a Supermicro H11DSi-NT motherboard. Leap 15.1 installs (all defaults) and boots fine. Tumbleweed installs fine, but just after a GRUB selection is made, at “Loading initial ramdisk”, the machine resets. All installs are fresh, and it’s not a dual boot.
Notes:
It does not hang, as is the case with seemingly every other bug report regarding a failure at this point in the boot.
It’s not graphics. I’ve installed Tumbleweed server. Same problem.
I have installed with both Legacy and UEFI booting. The machine supports both. Same problem with both.
I have installed on both NVMe and SATA drives. Same problem with both.
The Tumbleweed live stick behaves exactly the same way, with the same failure, as does the latest Manjaro (18.1.4). None of the other OSes I’ve tried suffer from this issue (openSUSE Leap 15.1, Fedora 30/31, Ubuntu 18.04/19.10).
I’ve tried installing Leap 15.1 and zypper dup-ing to Tumbleweed repos. Same problem. I can’t successfully boot the old kernels after the upgrade, but I can snapper back to a working Leap 15.1 system.
I’ve tried removing “quiet” and adding “loglevel” settings to the kernel command line in GRUB, but the reset seems to happen too quickly for any output to appear.
Thoughts? Anyone aware of the issue? Any ideas how to get some more logging information?
I can post more hardware details if needed. Any help would be appreciated. I would very much like to run Tumbleweed if possible. I’ve been running it on my last system for a while, and it’s great.
I’m afraid glibc on Leap 15.1 isn’t compatible with the latest kernel in that repository. In yast:
This request will break your system!
kernel-default-5.4.5-1.1.g47eef04.i586 conflicts with libc.so.6()(64bit) provided by glibc-2.26-lp151.18.7.x86_64
On a similar tack, I have tried upgrading the whole system to Tumbleweed (e.g., by following https://www.techrepublic.com/article/how-to-upgrade-opensuse-leap-to-opensuse-tumbleweed/). That process also gives me multiple kernels* I can try booting from, all of which fail. This leads me to believe that it’s not the kernel that’s the issue. Might it be the bootloader or something like that?
*Kernels available are 4.12.14-lp151.27.3 (from 15.1 install), 4.12.14-lp151.28.36.1 (from updates to 15.1), and 5.3.12-1.1 (from upgrade to Tumbleweed)
You could try the Tumbleweed kernel, which is a 5.3 kernel. Add the Tumbleweed repo, install just the kernel, then disable that repo. That would test if there is a problem with 5.3 kernels.
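For anyone wanting to try it, here is a rough sketch of that test as zypper commands. The repo alias “tw-oss” is made up for illustration, and the URL is assumed to be the usual Tumbleweed OSS repo, so check both before running anything. As written, the script only prints the commands; it executes them only when DRY_RUN is set to 0, in which case it needs to run as root.

```shell
#!/bin/sh
# Sketch only: test-boot a Tumbleweed kernel on an installed Leap system.
# Assumptions (not from the thread): the alias "tw-oss" is arbitrary, and the
# URL is assumed to point at the Tumbleweed OSS repository.
TW_ALIAS="tw-oss"
TW_URL="http://download.opensuse.org/tumbleweed/repo/oss/"

# Print each command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

try_tw_kernel() {
    run zypper addrepo "$TW_URL" "$TW_ALIAS"              # add the Tumbleweed repo
    run zypper refresh "$TW_ALIAS"                        # fetch its metadata
    run zypper install --from "$TW_ALIAS" kernel-default  # install just the kernel
    run zypper modifyrepo --disable "$TW_ALIAS"           # disable the repo again
}
```

Calling try_tw_kernel prints the four commands for review. The Leap kernels stay installed, so they should still be selectable from the GRUB menu if the new one fails.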
My advice, however, would be to skip that test. Tumbleweed should be updating to a 5.4 kernel in the next few days. I would suggest waiting for that. And then try the tumbleweed live iso, to see whether that boots. Don’t try updating your installed system to Tumbleweed until you have a live iso that boots.
Changing the runlevel didn’t do anything. On a more positive note, though …
… I found it. The offending package is “ucode-amd”. In hindsight that might seem obvious for bleeding edge AMD processors. I, however, figured this out the tedious way by installing the most basic Leap 15.1 system I could, switching the repos over to Tumbleweed, and updating things one-by-one (well, in vaguely logical groups).
So, I can have a Tumbleweed system either by installing Leap 15.1, switching the repos over to Tumbleweed, putting a lock on the “ucode-amd” package, and then running zypper dup, or by installing Tumbleweed, booting a live stick, manually downgrading that package in a chroot, and then locking it. I’m currently running with the lock in place, and it will remain until the “ucode-amd” package is updated such that the Tumbleweed live image boots. If that seems like a terrible idea for reasons I am unaware of, please, anyone, let me know.
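In case it helps anyone else, the lock-then-upgrade route looks roughly like this. It is a sketch, assuming the repos have already been switched over to Tumbleweed; the function names and the DRY_RUN switch are my own, not anything zypper provides. It only prints the commands unless DRY_RUN is set to 0, in which case it needs to run as root.

```shell
#!/bin/sh
# Sketch only: hold back ucode-amd across a distribution upgrade.
# Assumption: the repos already point at Tumbleweed before this is run.
PKG="ucode-amd"

# Print each command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

hold_microcode_and_dup() {
    run zypper addlock "$PKG"   # keep zypper from changing this package
    run zypper locks            # list the locks to confirm it is there
    run zypper dup              # upgrade everything else
}

# For later, once a fixed ucode-amd has reached Tumbleweed.
release_microcode() {
    run zypper removelock "$PKG"
    run zypper dup
}
```

hold_microcode_and_dup is the part I actually needed; release_microcode is only there for when the lock can come off.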
Thank you, @nrickert, for your help. Before this I really had no idea that you could pull repos in from other releases and selectively upgrade or downgrade in order to test compatibility in this way. Without your suggestions along these lines I would not have cracked this.
Does anyone know who develops/maintains “ucode-amd”? I’d like to report the issue if I can. Does it come from AMD themselves?
You can probably install from the DVD installer. During the install, click on “Software” on the summary screen and mark “ucode-amd” as not to be installed. Perhaps you can even lock it there.
Note that I have not tested this. Still, it is a possibility to consider if you need to reinstall.
Thank you, @nrickert for your help. Before this I really had no idea that you could pull repos in from other releases and selectively upgrade or downgrade in order to test compatibility in this way.
We don’t usually recommend that, because it often causes problems. But there are circumstances where it can solve hardware compatibility issues.
Quick update in case anyone else is following/experiencing this issue. I raised a bug report here: https://bugzilla.opensuse.org/show_bug.cgi?id=1160204. The conclusion of the discussion there is that the latest release of AMD microcode within the kernel firmware repository fixes this issue. We now just need to wait until it becomes part of the Tumbleweed release. Until then, I’m keeping the lock and the package version from Leap 15.1.