Leap 15.5 guest on VMware Fusion: display locks up on screen blanking

I have been using openSUSE as a guest OS on VMware Fusion (on a Mac) for years with no issues. One of the recent kernel updates (within the last month or so) has surfaced this problem.

There are three scenarios in which the OS hangs the screen, all of which involve blanking the screen:

  1. When the system puts the display to sleep after the configured idle time
  2. When I select the screen lock command from the menu
  3. When I shut down the machine and the screen switches to the small VGA mode

In scenarios 1 and 2, the system remains reachable via SSH but emits a beep every 15 seconds or so, and the kernel messages shown further below appear in the messages log.

Attempting to shut down, either from the SSH session or from the VMware Fusion menu, results in a hang with no further response.

This seems to be connected to some kind of display mode change. I have increased the CPUs and memory allocated to the guest to quite large amounts, with no difference, so it is not about running out of memory or CPUs, as the messages might suggest. Something gets stuck in kernel mode at this display change.

I am using the latest open-vm-tools available in the openSUSE distribution. I am not using a full-screen display, just a windowed display in VMware Fusion.

Any thoughts on how to fix this?

If I change the setting so the display never sleeps, the system runs without any problem for as long as I keep it up. It still hangs when I shut down (when the screen changes to VGA mode) or when I select lock screen (when the screen blanks).

When it is hung, VMware Fusion is unable to shut the guest down, and Fusion itself cannot be quit because it reports that the guest is busy. So I have to force-kill Fusion.

Nothing on the VMware Fusion side changed before the problem appeared.

kernel:[   93.552538][    C0] BUG: workqueue lockup - pool cpus=2 node=0 flags=0x0 nice=0 stuck for 70s!
2024-01-05T05:23:02.461305-08:00 HOMESERVER kernel: [   93.552538][    C0] BUG: workqueue lockup - pool cpus=2 node=0 flags=0x0 nice=0 stuck for 70s!
2024-01-05T05:23:02.461369-08:00 HOMESERVER kernel: [   93.553105][    C0] Showing busy workqueues and worker pools:
2024-01-05T05:23:02.461397-08:00 HOMESERVER kernel: [   93.553168][    C0] workqueue events: flags=0x0
2024-01-05T05:23:02.461776-08:00 HOMESERVER kernel: [   93.553288][    C0]   pwq 4: cpus=2 node=0 flags=0x0 nice=0 active=18/256 refcnt=20
2024-01-05T05:23:02.461780-08:00 HOMESERVER kernel: [   93.553301][    C0]     in-flight: 324:drm_mode_rmfb_work_fn [drm] BAR(1580)
2024-01-05T05:23:02.461781-08:00 HOMESERVER kernel: [   93.553346][    C0]     pending: free_work, kernfs_notify_workfn, cgroup_bpf_release, cgroup_bpf_release, cgroup_bpf_release, cgroup_bpf_release, kfree_rcu_monitor, cgroup_bpf_release, cgroup_bpf_release, cgroup_bpf_release, cgroup_bpf_release, netstamp_clear, console_callback, cgroup_bpf_release, cgroup_bpf_release, cgroup_bpf_release, cgroup_bpf_release
2024-01-05T05:23:02.461785-08:00 HOMESERVER kernel: [   93.553433][    C0]   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
2024-01-05T05:23:02.461785-08:00 HOMESERVER kernel: [   93.553444][    C0]     pending: e1000_watchdog [e1000]
2024-01-05T05:23:02.461786-08:00 HOMESERVER kernel: [   93.553515][    C0] workqueue events_power_efficient: flags=0x80
2024-01-05T05:23:02.461787-08:00 HOMESERVER kernel: [   93.553543][    C0]   pwq 4: cpus=2 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
2024-01-05T05:23:02.461788-08:00 HOMESERVER kernel: [   93.553555][    C0]     pending: neigh_periodic_work
2024-01-05T05:23:02.461788-08:00 HOMESERVER kernel: [   93.553600][    C0] workqueue rcu_gp: flags=0x8
2024-01-05T05:23:02.461789-08:00 HOMESERVER kernel: [   93.553629][    C0]   pwq 4: cpus=2 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
2024-01-05T05:23:02.461790-08:00 HOMESERVER kernel: [   93.553641][    C0]     pending: wait_rcu_exp_gp
2024-01-05T05:23:02.462057-08:00 HOMESERVER kernel: [   93.553705][    C0] workqueue mm_percpu_wq: flags=0x8
2024-01-05T05:23:02.462061-08:00 HOMESERVER kernel: [   93.553741][    C0]   pwq 4: cpus=2 node=0 flags=0x0 nice=0 active=2/256 refcnt=4
2024-01-05T05:23:02.462062-08:00 HOMESERVER kernel: [   93.553752][    C0]     pending: vmstat_update, lru_add_drain_per_cpu BAR(50)
2024-01-05T05:23:02.462063-08:00 HOMESERVER kernel: [   93.553815][    C0] workqueue cgroup_destroy: flags=0x0
2024-01-05T05:23:02.462063-08:00 HOMESERVER kernel: [   93.553840][    C0]   pwq 4: cpus=2 node=0 flags=0x0 nice=0 active=1/1 refcnt=7
2024-01-05T05:23:02.462064-08:00 HOMESERVER kernel: [   93.553852][    C0]     pending: css_killed_work_fn
2024-01-05T05:23:02.462065-08:00 HOMESERVER kernel: [   93.553856][    C0]     inactive: css_killed_work_fn, css_killed_work_fn, css_killed_work_fn, css_killed_work_fn, css_killed_work_fn
2024-01-05T05:23:02.491230-08:00 HOMESERVER kernel: [   93.554199][    C0] pool 4: cpus=2 node=0 flags=0x0 nice=0 hung=70s workers=3 idle: 111 28
2024-01-05T05:23:02.491240-08:00 HOMESERVER kernel: [   93.554260][    C0] Showing backtraces of running workers in stalled CPU-bound worker pools:
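
The line that stands out in the log above is the in-flight work item `drm_mode_rmfb_work_fn [drm]` on the stuck pool, which fits the idea that the hang happens in the display path. For anyone wanting to check their own log for the same signature, here is a minimal Python sketch that pulls the lockup lines and the in-flight work items out of a set of log lines. The function name `summarize_lockups` and the regular expressions are my own illustration, based only on the line formats visible in this excerpt, so other kernels may print slightly different text.

```python
import re

# Patterns are assumptions derived from the log excerpt in this post.
# Example: "BUG: workqueue lockup - pool cpus=2 node=0 flags=0x0 nice=0 stuck for 70s!"
LOCKUP = re.compile(
    r"BUG: workqueue lockup - pool (?P<pool>.*?) stuck for (?P<secs>\d+)s!"
)
# Example: "in-flight: 324:drm_mode_rmfb_work_fn [drm] BAR(1580)"
IN_FLIGHT = re.compile(r"in-flight:\s*(?P<items>.+)$")
# One in-flight item: worker PID, work function, optional [module]
ITEM = re.compile(r"(?P<pid>\d+):(?P<func>\w+)(?:\s+\[(?P<module>\w+)\])?")


def summarize_lockups(lines):
    """Return (lockups, in_flight) found in an iterable of log lines.

    lockups   -> list of (pool description, seconds stuck)
    in_flight -> list of (worker pid, work function, module or None)
    """
    lockups, in_flight = [], []
    for line in lines:
        m = LOCKUP.search(line)
        if m:
            lockups.append((m.group("pool"), int(m.group("secs"))))
            continue
        m = IN_FLIGHT.search(line)
        if m:
            for item in ITEM.finditer(m.group("items")):
                in_flight.append(
                    (int(item.group("pid")), item.group("func"), item.group("module"))
                )
    return lockups, in_flight


# Two lines copied from the log in this post.
SAMPLE = [
    "2024-01-05T05:23:02.461305-08:00 HOMESERVER kernel: [   93.552538][    C0] "
    "BUG: workqueue lockup - pool cpus=2 node=0 flags=0x0 nice=0 stuck for 70s!",
    "2024-01-05T05:23:02.461780-08:00 HOMESERVER kernel: [   93.553301][    C0]     "
    "in-flight: 324:drm_mode_rmfb_work_fn [drm] BAR(1580)",
]

lockups, in_flight = summarize_lockups(SAMPLE)
print(lockups)
print(in_flight)
```

To run it against a real log, pass an open file instead of `SAMPLE`, for example `summarize_lockups(open("/var/log/messages"))`. That path is an assumption about where the messages log lives on a given install.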

I tried this with a fresh stock install of Leap 15.5. Same problem.

Reverting to the 5.14.21 kernel fixes this. I am not sure which later kernel version introduced the problem.

I also found that changing the resolution to anything hangs the display, so it is related to the display driver.

It looks like VMware’s open-vm-tools, the open-source package included with the distribution, which contains its own display driver, is incompatible with newer kernels.