HELP - AMD Microcode issues, Kernel ACPI mode, screen tearing, single thread use (100%) instead of 24

Hi everyone,

Over the last couple of days, since the latest kernel update (6.8.1.2), I have had nothing but trouble with openSUSE Tumbleweed. The system stopped loading after the GRUB2 menu and showed microcode messages related to the CPU (Ryzen 9 3900X). I tried quick, fresh reinstalls, even opting for Leap (always from USB), but that one doesn’t work even when selecting the alternative kernels. So I managed to reinstall Tumbleweed with the alternative kernel (no ACPI mode), but I’ve ended up with really bad performance, and neofetch shows only 1 thread instead of 24. This is the first time something so awkward has happened, and I have no clue how to solve it.

Some legit questions/considerations:

  1. The microcode isn’t the motherboard BIOS update, so if it relates to the CPU only, how can I update it from the BIOS rather than from a sluggish and potentially unstable environment?
  2. I’ve updated to the latest mobo BIOS - B450 MORTAR MAX | Motherboard | MSI Global - but apart from a factory reset, equivalent to removing the CMOS battery (yes, I’ve tried that option too, which is equivalent to dismounting the CPU, which I’ve also done), nothing changed. The CPU isn’t bricked, because POST and all the diagnostics work as usual, but it’s as if something unstable and different is inside it.
  3. Most importantly: if I can’t solve this here, I’ll have to go to a tech center, and for me personally that’s a huge humiliation because I’m somewhat qualified myself. Any help from someone who has already had this exact issue on AMD CPUs?
  4. Please do me a favor: if you answer, post detailed procedures for whatever I need to query on the openSUSE system, in case this issue can be solved from Linux. I hope, though, that it can be fixed from the motherboard BIOS.
  5. I have an APC Pro 1300 backup, in case anyone wonders whether I’ve taken the necessary precautions.

Thanks in Advance, sincerely,
Carlo

ADDITIONAL NOTE: Before this happened I was in another OS session (a debloated Win11), gaming as normal, when the system suddenly shut down and rebooted. After the reboot, I could no longer launch either Windows or openSUSE.
Now, it’s easy to blame MS for the chain of events that brought me here, but that’s unfair because the cause, the origin, isn’t clear. I don’t know exactly HOW it happened (whether through the internet connection or an MS Windows update in the background). Now I have an OS with screen tearing and only one CPU thread, and even opening a browser heats the CPU dangerously. I can’t work like that.

As you have already tried reinstalling and even older kernels with Leap, it could only be the new BIOS firmware.
I’m guessing it’s not possible to downgrade that or reset to default without taking it to a hardware specialist. Hopefully someone more experienced can chime in.

To confirm the theory and make sure the basics are fine:

  1. Run memtest, memtest_vulkan, and glmark2 to see if the normal RAM, VRAM, or graphics H/W is problematic.
  2. Try a few different live ISOs like Fedora, Debian (for its 6.1 LTS kernel), etc. for the latter two GPU tests.

Thank you, I’ll look into it and let you know about the memtest. Meanwhile I’ve picked up another piece of info through CoreControl (which seems to “see” the CPU only, not the GPU): it clearly shows the microcode version (0x8701030) and, sadly, only one core and one thread instead of 12/24. CPU-X shows similar results, plus temperatures.
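
For a cross-check from a terminal rather than a GUI tool, the same two values can be read from /proc/cpuinfo. This is only a minimal sketch, assuming the usual x86 layout of that file (one "processor" line and one "microcode" line per logical CPU); the function names microcode_version and logical_cpus are made up for illustration and are not system commands:

```shell
# Print the microcode revision reported for the first CPU in a cpuinfo file.
# The file defaults to /proc/cpuinfo; a different path can be passed as $1.
microcode_version() {
    awk -F': ' '/^microcode/ { print $2; exit }' "${1:-/proc/cpuinfo}"
}

# Count the logical CPUs (threads) the running kernel has brought up,
# i.e. the number of "processor" entries in the same file.
logical_cpus() {
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}

# Usage, from any terminal:
#   microcode_version
#   logical_cpus
```

If logical_cpus prints 1 here as well, the limit comes from the kernel itself and not from neofetch or the GUI tool.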

@Citizen839X If you’re down to one core and have lost the GPU (is this a discrete one requiring power?), I would be checking out the power supply and connections…

I don’t think it’s a matter of hardware/electrical instability, after three passed memtest runs over 6 hours…

@Citizen839X so it shows all the cores available in memtest… Are they all online? Check with cat /sys/devices/system/cpu/online

I don’t understand…

@Citizen839X you indicated the system was only showing one core working…

Yes, through neofetch and CoreControl. I’ve just checked that path and I don’t have an “online” folder under cpu.

@Citizen839X what about cat /sys/devices/system/cpu/offline and cat /sys/devices/system/cpu/present?

OK, they are a sort of text file, right? I opened them with Kate: “offline” is empty and “present” shows the value “0”. There is also “online”, which also shows “0”.

@Citizen839X no need to open them, just use the cat command from a terminal, for example:

:~> cat /sys/devices/system/cpu/offline

:~> cat /sys/devices/system/cpu/present
0-35

:~> cat /sys/devices/system/cpu/online 
0-35

What about the output from;

:~> lscpu | grep NUMA

NUMA node(s):                         1
NUMA node0 CPU(s):                    0-35

it says:

localhost:/home/carlo # lscpu | grep NUMA
NUMA node(s): 1
NUMA node0 CPU(s): 0

and I confirm the value 0 for the previous ones.

By the way, I was wondering: if “0” is a value that indicates 1 core, my total should be 23, right? I mean threads, not cores. Cores should be 11 (0 plus 11 for a total of 12).
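
On the counting question: these files list CPU indexes starting at 0, so a bare 0 means exactly one logical CPU, and a CPU with all 24 threads online would show 0-23 (the 0-35 in the example above is a 36-thread machine). Here is a small sketch of that arithmetic; count_cpus is a hypothetical helper written for this thread, not a system command:

```shell
# Count the logical CPUs described by a kernel CPU-list string such as
# "0", "0-23" or "0-3,8-11" (the format used by the files under
# /sys/devices/system/cpu/). An empty string counts as zero CPUs.
count_cpus() {
    list=$1
    total=0
    old_ifs=$IFS
    IFS=,                       # split the list on commas
    for part in $list; do
        case $part in
            *-*)                # a range "low-high" covers high-low+1 CPUs
                low=${part%-*}
                high=${part#*-}
                total=$((total + high - low + 1))
                ;;
            ?*)                 # a single index is one CPU
                total=$((total + 1))
                ;;
        esac
    done
    IFS=$old_ifs
    echo "$total"
}

# count_cpus 0      prints 1   (what this system reports)
# count_cpus 0-23   prints 24  (what a fully working 3900X should report)
# count_cpus "$(cat /sys/devices/system/cpu/online)"
```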

@Citizen839X all very strange since MemTest sees the device and all the cores… If you check the system BIOS what does it show…

Why strange? I’m running under an alternative kernel with no ACPI and compatible video mode. It’s like running in a limited virtual machine environment.

@Citizen839X ahh ok. So which release is this (cat /etc/os-release), and which kernel?

Man, honestly, your intervention is pointless. I can hear the CPU blowing out heat from here, just from opening the browser. It’s using only one core.

Anyway,

NAME="openSUSE Tumbleweed"
# VERSION="20240329"
ID="opensuse-tumbleweed"
ID_LIKE="opensuse suse"
VERSION_ID="20240329"
PRETTY_NAME="openSUSE Tumbleweed"
ANSI_COLOR="0;32"
# CPE 2.3 format, boo#1217921
CPE_NAME="cpe:2.3:o:opensuse:tumbleweed:20240329:*:*:*:*:*:*:*"
#CPE 2.2 format
#CPE_NAME="cpe:/o:opensuse:tumbleweed:20240329"
BUG_REPORT_URL="https://bugzilla.opensuse.org"
SUPPORT_URL="https://bugs.opensuse.org"
HOME_URL="https://www.opensuse.org"
DOCUMENTATION_URL="https://en.opensuse.org/Portal:Tumbleweed"
LOGO="distributor-logo-Tumbleweed"

So, very likely, if I remove acpi=off from GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, the system will not start.
Meanwhile I found this interesting explanation - ACPI Kernel Parameters and how to choose them – Discovery
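
One way to test that without touching /etc/default/grub is a one-off boot: at the GRUB2 menu press e, delete acpi=off from the line starting with linux, and boot with F10 or Ctrl+x. That change applies to that single boot only, so if the system hangs, the next reboot uses the saved entry again. If it does boot, the permanent change could be sketched as below; strip_acpi_off is a hypothetical helper for illustration, and it assumes a simple space-separated parameter list with no quoted values containing spaces:

```shell
# Print a kernel command line with the acpi=off parameter removed.
# Every other parameter is kept, in the same order.
strip_acpi_off() {
    out=
    for word in $1; do
        [ "$word" = "acpi=off" ] && continue
        out="${out:+$out }$word"
    done
    printf '%s\n' "$out"
}

# Show what the running kernel was actually booted with:
#   cat /proc/cmdline
#
# Preview the edited value before changing anything:
#   strip_acpi_off "splash=silent acpi=off quiet"
#   prints: splash=silent quiet
#
# To make it permanent (as root), edit GRUB_CMDLINE_LINUX_DEFAULT in
# /etc/default/grub by hand, then regenerate the GRUB2 configuration:
#   grub2-mkconfig -o /boot/grub2/grub.cfg
```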