Hello!
Since upgrading to kernel 6.9.x on openSUSE Tumbleweed, my apps run almost entirely on a single CPU core.
CPU is AMD FX 8350 8 core(*)
Motherboard is GA-990FXA-UD7 (rev 1.x) (**)
(Yes, I know this is very ancient hardware)
htop while Firefox loads with the currently latest kernel, 6.9.1:
(Only core 0 sees most of the activity.)
cpupower monitor:
CPU| C0 | Cx | Freq || POLL | C1 | C2
0| 13.87| 86.13| 2418|| 0.00| 1.92| 84.38
1| 0.35| 99.65| 1414|| 0.00| 0.01| 99.65
2| 0.05| 99.95| 1656|| 0.00| 0.01| 99.93
3| 0.06| 99.94| 1458|| 0.00| 0.00| 99.93
4| 0.18| 99.82| 1677|| 0.00| 0.02| 99.80
5| 0.02| 99.98| 2398|| 0.00| 0.00| 99.98
6| 0.45| 99.55| 2733|| 0.00| 0.02| 99.55
7| 0.08| 99.92| 2446|| 0.00| 0.00| 99.97
htop while Firefox loads with the previous kernel, 6.8.9:
(Threads are spread across all cores.)
cpupower monitor:
| Mperf || Idle_Stats
CPU| C0 | Cx | Freq || POLL | C1 | C2
0| 2.86| 97.14| 1430|| 0.00| 0.86| 96.39
7| 8.78| 91.22| 1406|| 0.00| 2.31| 89.06
3| 4.07| 95.93| 1839|| 0.00| 0.06| 95.96
1| 3.40| 96.60| 1575|| 0.00| 0.82| 95.95
6| 2.46| 97.54| 1396|| 0.00| 0.02| 97.59
5| 3.51| 96.49| 1396|| 0.00| 1.24| 95.36
2| 3.19| 96.81| 1472|| 0.00| 0.00| 96.87
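To put a number on the imbalance, each CPU's share of the summed C0 (busy) column can be computed from the pasted rows. For the rows shown above, CPU 0 accounts for roughly 92% of the total C0 on 6.9.1, while on 6.8.9 the busiest CPU (CPU 7) is at roughly 31%. The following is only a sketch: `c0_share` is a made-up helper name, and it assumes the input file has the same pipe-separated layout as the tables above.

```shell
# c0_share: hypothetical helper, not part of cpupower.
# Reads "cpupower monitor" rows (CPU| C0 | Cx | ...) from a file and prints
# each CPU's share of the summed C0 (busy) percentage, in input order.
c0_share() {
    awk -F'|' '
        # Data rows start with a bare CPU number; header rows do not.
        $1 ~ /^ *[0-9]+ *$/ {
            n++
            cpu[n] = $1 + 0
            c0[n] = $2 + 0
            total += c0[n]
        }
        END {
            for (i = 1; i <= n; i++) {
                share = (total > 0) ? 100 * c0[i] / total : 0
                printf "cpu%d %.1f%%\n", cpu[i], share
            }
        }
    ' "$1"
}
```

Usage would be something like saving one of the tables to a file (the name `monitor.txt` is just an example) and running `c0_share monitor.txt`.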
This is the suspicious entry I find in the logs (journalctl --boot):
On kernel 6.9.1, while smpboot is initialising the other cores, I get a bunch of errors:
jun 03 10:02:33 saturn kernel: smpboot: CPU0: AMD FX(tm)-8350 Eight-Core Processor (family: 0x15, model: 0x2, stepping: 0x0)
jun 03 10:02:33 saturn kernel: Performance Events: Fam15h core perfctr, AMD PMU driver.
jun 03 10:02:33 saturn kernel: ... version: 0
jun 03 10:02:33 saturn kernel: ... bit width: 48
jun 03 10:02:33 saturn kernel: ... generic registers: 6
jun 03 10:02:33 saturn kernel: ... value mask: 0000ffffffffffff
jun 03 10:02:33 saturn kernel: ... max period: 00007fffffffffff
jun 03 10:02:33 saturn kernel: ... fixed-purpose events: 0
jun 03 10:02:33 saturn kernel: ... event mask: 000000000000003f
jun 03 10:02:33 saturn kernel: signal: max sigframe size: 1776
jun 03 10:02:33 saturn kernel: rcu: Hierarchical SRCU implementation.
jun 03 10:02:33 saturn kernel: rcu: Max phase no-delay instances is 1000.
jun 03 10:02:33 saturn kernel: NMI watchdog: Enabled. Permanently consumes one hw-PMU counter.
jun 03 10:02:33 saturn kernel: smp: Bringing up secondary CPUs ...
jun 03 10:02:33 saturn kernel: smpboot: x86: Booting SMP configuration:
jun 03 10:02:33 saturn kernel: .... node #0, CPUs: #2 #4 #6
jun 03 10:02:33 saturn kernel: __common_interrupt: 2.55 No irq handler for vector
jun 03 10:02:33 saturn kernel: __common_interrupt: 4.55 No irq handler for vector
jun 03 10:02:33 saturn kernel: __common_interrupt: 6.55 No irq handler for vector
jun 03 10:02:33 saturn kernel: #1 #3 #5 #7
jun 03 10:02:33 saturn kernel: __common_interrupt: 1.55 No irq handler for vector
jun 03 10:02:33 saturn kernel: ------------[ cut here ]------------
jun 03 10:02:33 saturn kernel: WARNING: CPU: 3 PID: 0 at kernel/sched/core.c:6482 sched_cpu_starting+0x193/0x250
jun 03 10:02:33 saturn kernel: Modules linked in:
jun 03 10:02:33 saturn kernel: CPU: 3 PID: 0 Comm: swapper/3 Not tainted 6.9.1-1-default #1 openSUSE Tumbleweed c5471a56f12c40709b95530f47f6c0b39e75f136
jun 03 10:02:33 saturn kernel: Hardware name: Gigabyte Technology Co., Ltd. GA-990FXA-UD7/GA-990FXA-UD7, BIOS F11d 07/09/2013
jun 03 10:02:33 saturn kernel: RIP: 0010:sched_cpu_starting+0x193/0x250
jun 03 10:02:33 saturn kernel: Code: 00 8b 0d 80 33 fd 01 39 c8 0f 83 6c ff ff ff 48 63 d0 48 8b 3c d5 00 be 4f 8b 4c 01 e7 39 c3 75 c7 4c 89 b7 68 0c 00 00 eb c7 <0f> 0b eb c3 be 04 00 00 00 89 df e8 dd 51 02 00 84 c0 0f 85 71 ff
jun 03 10:02:33 saturn kernel: RSP: 0000:ffffb279800e3e28 EFLAGS: 00010006
jun 03 10:02:33 saturn kernel: RAX: 0000000000000001 RBX: 0000000000000003 RCX: 0000000000000008
jun 03 10:02:33 saturn kernel: RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffff94586eabc740
jun 03 10:02:33 saturn kernel: RBP: ffff94554004fc98 R08: 0000000000000003 R09: ffff94586eb80000
jun 03 10:02:33 saturn kernel: R10: ffff94554004fc98 R11: 0000000000000006 R12: 000000000003c740
jun 03 10:02:33 saturn kernel: R13: 000000000003c740 R14: ffff94586eb3c740 R15: 0000000000000003
jun 03 10:02:33 saturn kernel: FS: 0000000000000000(0000) GS:ffff94586eb80000(0000) knlGS:0000000000000000
jun 03 10:02:33 saturn kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
jun 03 10:02:33 saturn kernel: CR2: 0000000000000000 CR3: 00000001aba36000 CR4: 00000000000406f0
jun 03 10:02:33 saturn kernel: Call Trace:
jun 03 10:02:33 saturn kernel: <TASK>
jun 03 10:02:33 saturn kernel: ? sched_cpu_starting+0x193/0x250
jun 03 10:02:33 saturn kernel: ? __warn.cold+0xa8/0x102
jun 03 10:02:33 saturn kernel: ? sched_cpu_starting+0x193/0x250
jun 03 10:02:33 saturn kernel: ? report_bug+0xd8/0x150
jun 03 10:02:33 saturn kernel: ? handle_bug+0x3c/0x80
jun 03 10:02:33 saturn kernel: ? exc_invalid_op+0x17/0x70
jun 03 10:02:33 saturn kernel: ? asm_exc_invalid_op+0x1a/0x20
jun 03 10:02:33 saturn kernel: ? sched_cpu_starting+0x193/0x250
jun 03 10:02:33 saturn kernel: ? sched_cpu_starting+0x16a/0x250
jun 03 10:02:33 saturn kernel: ? __pfx_sched_cpu_starting+0x10/0x10
jun 03 10:02:33 saturn kernel: cpuhp_invoke_callback+0xf8/0x450
jun 03 10:02:33 saturn kernel: __cpuhp_invoke_callback_range+0x67/0xb0
jun 03 10:02:33 saturn kernel: start_secondary+0x9c/0x140
jun 03 10:02:33 saturn kernel: common_startup_64+0x13e/0x141
jun 03 10:02:33 saturn kernel: </TASK>
jun 03 10:02:33 saturn kernel: ---[ end trace 0000000000000000 ]---
jun 03 10:02:33 saturn kernel: __common_interrupt: 3.55 No irq handler for vector
jun 03 10:02:33 saturn kernel: __common_interrupt: 5.55 No irq handler for vector
jun 03 10:02:33 saturn kernel: __common_interrupt: 7.55 No irq handler for vector
jun 03 10:02:33 saturn kernel: smp: Brought up 1 node, 8 CPUs
jun 03 10:02:33 saturn kernel: smpboot: Total of 8 processors activated (64306.06 BogoMIPS)
jun 03 10:02:33 saturn kernel: ------------[ cut here ]------------
jun 03 10:02:33 saturn kernel: WARNING: CPU: 0 PID: 1 at kernel/sched/topology.c:2408 build_sched_domains+0x724/0x1310
jun 03 10:02:33 saturn kernel: Modules linked in:
jun 03 10:02:33 saturn kernel: CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 6.9.1-1-default #1 openSUSE Tumbleweed c5471a56f12c40709b95530f47f6c0b39e75f136
jun 03 10:02:33 saturn kernel: Hardware name: Gigabyte Technology Co., Ltd. GA-990FXA-UD7/GA-990FXA-UD7, BIOS F11d 07/09/2013
jun 03 10:02:33 saturn kernel: RIP: 0010:build_sched_domains+0x724/0x1310
jun 03 10:02:33 saturn kernel: Code: 04 41 89 56 3c 48 8b 15 72 2e 8f 02 48 63 4d 14 39 34 8a 0f 8e 9c fe ff ff 25 e9 ef ff ff 80 cc 04 41 89 46 3c e9 8b fe ff ff <0f> 0b 41 be f4 ff ff ff 48 8b 44 24 70 8b 10 85 d2 0f 84 09 02 00
jun 03 10:02:33 saturn kernel: RSP: 0018:ffffb27980033d88 EFLAGS: 00010202
jun 03 10:02:33 saturn kernel: RAX: 00000000ffffff01 RBX: 0000000000000000 RCX: 00000000ffffff01
jun 03 10:02:33 saturn kernel: RDX: 00000000fffffff8 RSI: 0000000000000003 RDI: ffff94554004f660
jun 03 10:02:33 saturn kernel: RBP: ffff945540234a00 R08: ffff94554004f660 R09: 0000000000000000
jun 03 10:02:33 saturn kernel: R10: ffffb27980033d50 R11: 0000000039b6461e R12: 0000000000000001
jun 03 10:02:33 saturn kernel: R13: ffff94554004f018 R14: 0000000000000001 R15: ffff94554004f2c0
jun 03 10:02:33 saturn kernel: FS: 0000000000000000(0000) GS:ffff94586ea00000(0000) knlGS:0000000000000000
jun 03 10:02:33 saturn kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
jun 03 10:02:33 saturn kernel: CR2: ffff9455eca01000 CR3: 00000001aba36000 CR4: 00000000000406f0
jun 03 10:02:33 saturn kernel: Call Trace:
jun 03 10:02:33 saturn kernel: <TASK>
jun 03 10:02:33 saturn kernel: ? build_sched_domains+0x724/0x1310
jun 03 10:02:33 saturn kernel: ? __warn.cold+0xa8/0x102
jun 03 10:02:33 saturn kernel: ? build_sched_domains+0x724/0x1310
jun 03 10:02:33 saturn kernel: ? report_bug+0xd8/0x150
jun 03 10:02:33 saturn kernel: ? handle_bug+0x3c/0x80
jun 03 10:02:33 saturn kernel: ? exc_invalid_op+0x17/0x70
jun 03 10:02:33 saturn kernel: ? asm_exc_invalid_op+0x1a/0x20
jun 03 10:02:33 saturn kernel: ? build_sched_domains+0x724/0x1310
jun 03 10:02:33 saturn kernel: ? build_sched_domains+0x35b/0x1310
jun 03 10:02:33 saturn kernel: ? alloc_cpumask_var_node+0x23/0x40
jun 03 10:02:33 saturn kernel: ? alloc_cpumask_var_node+0x23/0x40
jun 03 10:02:33 saturn kernel: ? __pfx_kernel_init+0x10/0x10
jun 03 10:02:33 saturn kernel: sched_init_smp+0x3e/0xc0
jun 03 10:02:33 saturn kernel: ? stop_machine+0x30/0x40
jun 03 10:02:33 saturn kernel: ? __pfx_kernel_init+0x10/0x10
jun 03 10:02:33 saturn kernel: kernel_init_freeable+0x137/0x2a0
jun 03 10:02:33 saturn kernel: ? __pfx_kernel_init+0x10/0x10
jun 03 10:02:33 saturn kernel: kernel_init+0x1a/0x130
jun 03 10:02:33 saturn kernel: ret_from_fork+0x34/0x50
jun 03 10:02:33 saturn kernel: ? __pfx_kernel_init+0x10/0x10
jun 03 10:02:33 saturn kernel: ret_from_fork_asm+0x1a/0x30
jun 03 10:02:33 saturn kernel: </TASK>
jun 03 10:02:33 saturn kernel: ---[ end trace 0000000000000000 ]---
With kernel 6.8.9, by contrast, smpboot seems to run without problems:
jun 03 11:03:48 saturn kernel: smpboot: CPU0: AMD FX(tm)-8350 Eight-Core Processor (family: 0x15, model: 0x2, stepping: 0x0)
jun 03 11:03:48 saturn kernel: RCU Tasks: Setting shift to 3 and lim to 1 rcu_task_cb_adjust=1.
jun 03 11:03:48 saturn kernel: RCU Tasks Rude: Setting shift to 3 and lim to 1 rcu_task_cb_adjust=1.
jun 03 11:03:48 saturn kernel: RCU Tasks Trace: Setting shift to 3 and lim to 1 rcu_task_cb_adjust=1.
jun 03 11:03:48 saturn kernel: Performance Events: Fam15h core perfctr, AMD PMU driver.
jun 03 11:03:48 saturn kernel: ... version: 0
jun 03 11:03:48 saturn kernel: ... bit width: 48
jun 03 11:03:48 saturn kernel: ... generic registers: 6
jun 03 11:03:48 saturn kernel: ... value mask: 0000ffffffffffff
jun 03 11:03:48 saturn kernel: ... max period: 00007fffffffffff
jun 03 11:03:48 saturn kernel: ... fixed-purpose events: 0
jun 03 11:03:48 saturn kernel: ... event mask: 000000000000003f
jun 03 11:03:48 saturn kernel: signal: max sigframe size: 1776
jun 03 11:03:48 saturn kernel: rcu: Hierarchical SRCU implementation.
jun 03 11:03:48 saturn kernel: rcu: Max phase no-delay instances is 1000.
jun 03 11:03:48 saturn kernel: NMI watchdog: Enabled. Permanently consumes one hw-PMU counter.
jun 03 11:03:48 saturn kernel: smp: Bringing up secondary CPUs ...
jun 03 11:03:48 saturn kernel: smpboot: x86: Booting SMP configuration:
jun 03 11:03:48 saturn kernel: .... node #0, CPUs: #2 #4 #6 #1 #3 #5 #7
jun 03 11:03:48 saturn kernel: smp: Brought up 1 node, 8 CPUs
jun 03 11:03:48 saturn kernel: smpboot: Max logical packages: 1
jun 03 11:03:48 saturn kernel: smpboot: Total of 8 processors activated (64313.13 BogoMIPS)
> grep 'smp' boot.6.*
boot.6.8.txt:jun 03 11:03:48 saturn kernel: smpboot: Allowing 8 CPUs, 0 hotplug CPUs
boot.6.8.txt:jun 03 11:03:48 saturn kernel: smpboot: CPU0: AMD FX(tm)-8350 Eight-Core Processor (family: 0x15, model: 0x2, stepping: 0x0)
boot.6.8.txt:jun 03 11:03:48 saturn kernel: smp: Bringing up secondary CPUs ...
boot.6.8.txt:jun 03 11:03:48 saturn kernel: smpboot: x86: Booting SMP configuration:
boot.6.8.txt:jun 03 11:03:48 saturn kernel: smp: Brought up 1 node, 8 CPUs
boot.6.8.txt:jun 03 11:03:48 saturn kernel: smpboot: Max logical packages: 1
boot.6.8.txt:jun 03 11:03:48 saturn kernel: smpboot: Total of 8 processors activated (64313.13 BogoMIPS)
boot.6.9.txt:jun 03 10:02:33 saturn kernel: smpboot: CPU0: AMD FX(tm)-8350 Eight-Core Processor (family: 0x15, model: 0x2, stepping: 0x0)
boot.6.9.txt:jun 03 10:02:33 saturn kernel: smp: Bringing up secondary CPUs ...
boot.6.9.txt:jun 03 10:02:33 saturn kernel: smpboot: x86: Booting SMP configuration:
boot.6.9.txt:jun 03 10:02:33 saturn kernel: smp: Brought up 1 node, 8 CPUs
boot.6.9.txt:jun 03 10:02:33 saturn kernel: smpboot: Total of 8 processors activated (64306.06 BogoMIPS)
boot.6.9.txt:jun 03 10:02:33 saturn kernel: sched_init_smp+0x3e/0xc0
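The grep above only catches lines containing 'smp'. A fuller comparison of the two saved logs is possible if the timestamp and host prefix are stripped first, so that diff only reports differences in the kernel messages themselves. This is a rough sketch: the function names are made up, and it assumes both files use the "<month> <day> <time> <host> kernel: " prefix seen above. Values that legitimately change between boots (addresses, BogoMIPS) will still show up as differences.

```shell
# Hypothetical helpers for comparing two saved "journalctl --boot" dumps.

# Strip the "<month> <day> <time> <host> kernel: " prefix from each line.
strip_journal_prefix() {
    sed -E 's/^[[:alpha:]]+ +[0-9]+ [0-9:]+ [^ ]+ kernel: //' "$1"
}

# Print a unified diff of the two logs with prefixes removed.
# Returns the exit status of diff (0 = identical, 1 = differences).
compare_boot_logs() {
    old_log=$1
    new_log=$2
    tmp_old=$(mktemp)
    tmp_new=$(mktemp)
    strip_journal_prefix "$old_log" > "$tmp_old"
    strip_journal_prefix "$new_log" > "$tmp_new"
    diff -u "$tmp_old" "$tmp_new"
    status=$?
    rm -f "$tmp_old" "$tmp_new"
    return "$status"
}
```

With the files from the grep above, this would be called as `compare_boot_logs boot.6.8.txt boot.6.9.txt`.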
Does anybody have an idea how to investigate this further, or where to report this bug?
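Since both warnings come from the scheduler (sched_cpu_starting and build_sched_domains), one thing that might be worth comparing between the two kernels is the CPU topology each one exposes in sysfs. Whether it actually differs here is only a guess. A minimal sketch, where `show_cpu_topology` is a made-up name and the optional argument exists only so the function can be pointed at a copy of the sysfs tree:

```shell
# show_cpu_topology: hypothetical helper.
# Prints one line per CPU with its core_id and sibling lists as reported
# under /sys/devices/system/cpu/cpuN/topology/.
show_cpu_topology() {
    root=${1:-/sys/devices/system/cpu}
    for dir in "$root"/cpu[0-9]*; do
        # Skip entries without topology information (e.g. offline CPUs).
        [ -d "$dir/topology" ] || continue
        cpu=${dir##*/cpu}
        core_id=$(cat "$dir/topology/core_id" 2>/dev/null || echo '?')
        threads=$(cat "$dir/topology/thread_siblings_list" 2>/dev/null || echo '?')
        package=$(cat "$dir/topology/core_siblings_list" 2>/dev/null || echo '?')
        printf 'cpu%s core_id=%s thread_siblings=%s core_siblings=%s\n' \
            "$cpu" "$core_id" "$threads" "$package"
    done
}
```

Running `show_cpu_topology` once under each kernel and comparing the two outputs would show whether the cores are grouped differently.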
(*): 4 modules of 2 half-cores each, sharing L2 cache and FPU between both half-cores.
(**): The rev 1.x uses a BIOS-based firmware (with some EFI compatibility layer), not a UEFI-based firmware, unlike the later rev 3.x.