System hangs at POST only after running openSUSE - AMD FX CPU

I have a really curious problem:

I have two computers with a similar setup: the same motherboard (Asus Sabertooth 990FX R2), AMD FX 8-core CPUs and ECC memory (the main reason I went with AMD/Asus a few years ago). One is running Windows 10 and the other openSUSE Leap. Both have been running quite stably for several years. When support for Leap 42.3 ended, I upgraded to Leap 15.0, and the problem started. I upgraded to Leap 15.1 and the problem persists.

The computer running Linux, after rebooting (warm boot) or after shutting down and powering on (cold boot), hangs at POST (Power-On Self-Test) and the screen remains blank. After a minute or two the motherboard emits beeps that seem to indicate a graphics card or memory problem. If I switch the power supply off and on, or press the reset button, the computer boots. It also boots if I press the “MemOK” button on the motherboard. After booting, the UEFI BIOS shows the message “Overclocking failed” even though no overclocking is configured.

I have tried different memory sticks and video cards. Then I swapped the boot disks between the two computers, with the same result: whichever machine boots Windows reboots normally, while whichever boots openSUSE does not. So I do not think it is caused by a failing component (which was my first guess).

When I have time, I intend to try another Linux distro and another Asus AM3+ motherboard.

It looks to me like something in the most recent kernels stays resident until the system is completely powered off or a hardware reset is performed. Any idea what it might be? Perhaps CPU microcode updates survive cold reboots? Can an older ucode-amd package be installed?
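One way to test the microcode guess is to note the microcode revision the kernel reports, once after a full power cycle and once on a normal boot, and compare the two. The following is only a minimal sketch: it assumes the kernel exposes a "microcode" field in /proc/cpuinfo (as it does on x86), and the helper name microcode_revisions is made up for illustration.

```python
import re
from pathlib import Path


def microcode_revisions(cpuinfo_text):
    """Return the distinct 'microcode' values found in /proc/cpuinfo-style text."""
    # Each logical CPU has its own "microcode : 0x..." line; collect unique values.
    return set(re.findall(r"^microcode\s*:\s*(\S+)", cpuinfo_text, flags=re.MULTILINE))


if __name__ == "__main__":
    try:
        text = Path("/proc/cpuinfo").read_text()
    except OSError as exc:
        print(f"cannot read /proc/cpuinfo: {exc}")
    else:
        revisions = sorted(microcode_revisions(text))
        print("microcode revisions:", revisions if revisions else "none reported")
```

If the revision differs between the two boots, that would at least show which revision the OS loads on top of the one the BIOS provides. It would not by itself prove that microcode causes the hang.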

Hi and welcome to the Forum :)
Maybe the fwupdate (firmware) service is pending or running? You could also try adding iommu=soft to the boot options.

Have a read here: https://www.kernel.org/doc/Documentation/x86/x86_64/boot-options.txt
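On openSUSE the option is typically added to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub (or through the YaST boot loader module), followed by regenerating the GRUB configuration. To confirm that the running kernel really received it, the command line in /proc/cmdline can be inspected. A rough sketch follows; the helper name kernel_params is hypothetical, and the parsing is simplified (quoted values containing spaces are not handled).

```python
from pathlib import Path


def kernel_params(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    # Simplification: tokens are split on whitespace, so quoted values
    # that contain spaces are not parsed correctly.
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


if __name__ == "__main__":
    try:
        cmdline = Path("/proc/cmdline").read_text()
    except OSError as exc:
        print(f"cannot read /proc/cmdline: {exc}")
    else:
        params = kernel_params(cmdline)
        if "iommu" in params:
            print("iommu option in effect:", params["iommu"])
        else:
            print("no iommu option on the kernel command line")
```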

Is the problem really during POST?
Hard to tell, since you didn’t provide an exact sequence and description of the failure.
Traditionally, though, a POST failure is considered a hardware or BIOS/EFI failure that has nothing to do with the OS, because at that point nothing needed to read the OS off a disk has even been initialized yet (the minimal requirements are generally memory, CPU and disk, and sometimes display and I/O such as mouse and keyboard).

TSU

message “Overclocking failed”

Possibly automatic overclocking is turned on in the BIOS, because this is the default setting.
Manually turn off every option for automatic overclocking.

For an AMD AM3+ system you need special tuning with iommu and other kernel parameters.

Hi everybody! And thank you for your answers.

I know this is very weird. The problem seemed to be solved last week after applying a zypper update. I did not examine all the packages included, but I am sure kernel-firmware was one of them. Perhaps this affects the chipset?
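To see exactly which packages that update touched, the zypper log in /var/log/zypp/history can be filtered by date. This is only a sketch under assumptions: it assumes the usual pipe-separated record layout (timestamp|action|name|version|...), the helper name installed_packages is made up, and reading the file normally requires root.

```python
import sys
from pathlib import Path


def installed_packages(history_text, day):
    """Return (name, version) for each 'install' record whose timestamp starts with day."""
    found = []
    for line in history_text.splitlines():
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("|")
        # Assumed layout: timestamp|action|name|version|arch|...
        if len(fields) >= 4 and fields[1] == "install" and fields[0].startswith(day):
            found.append((fields[2], fields[3]))
    return found


if __name__ == "__main__":
    # Optional argument: a date prefix such as YYYY-MM-DD; empty means all records.
    day = sys.argv[1] if len(sys.argv) > 1 else ""
    try:
        text = Path("/var/log/zypp/history").read_text()
    except OSError as exc:
        print(f"cannot read zypp history: {exc}")
    else:
        for name, version in installed_packages(text, day):
            print(name, version)
```

Filtering the output for kernel-firmware, ucode-amd and the kernel packages would show whether any of them changed around the time the hang went away.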

I intend to do further testing when I have more time.

Possibly automatic overclocking is turned on in the BIOS, because this is the default setting.

Nope. At first I had mild overclocking on the turbo states, but disabling it did not make any difference.

Is the problem really during POST?
Hard to tell, since you didn’t provide an exact sequence and description of the failure.

I will try to be clearer. The sequences tried are:

  1. System running -> reboot -> system shows messages about shutting down services, unmounting drives, etc. until the halt message -> screen goes blank -> after one or two minutes, error beeps from the motherboard
  2. System running -> shutdown from OS -> computer shuts down normally (fans stop, power LED goes off). Press power button -> fans start spinning, screen is blank -> after one or two minutes, error beeps
  3. When the motherboard is beeping -> press the reset button on the computer -> the computer starts normally (one short beep after the first phase of POST, screen shows BIOS messages, disk drives found, RAM, etc., then it loads GRUB and Linux)
  4. If I press the MemOK button on the motherboard before the error beeps, it also manages to boot.
  5. Shut down system -> turn off power supply or disconnect from mains -> reconnect -> push power button -> computer starts normally

When running Windows 10, it always reboots normally.

Maybe the fwupdate (firmware) service is pending or running? You could also try adding iommu=soft to the boot options.

That I have not tried. I will keep it in mind for when I have time to review the issue.