I have been using TW for about 1.5 years in a VMware VM with Windows 10 as the host.
Overall it has worked well: I have not encountered any major issues, and I have generally updated with zypper dup shortly after new updates became available.
Based on that success, I decided to install TW in a dual-boot setup.
As a dry run, I first downloaded the latest TW ISO image, created a new VM, installed TW in it, and then installed the other apps/packages and configured it as I had done in the first VM.
For the last 2 weeks or so I have been having random problems and crashes in the newly built VM.
I use that VM all day long and don’t seem to run into any issues. However, some time after the machine becomes idle, generally at night when I go to bed, the machine crashes.
After it happened the first few times, I started 2 ssh sessions on the host computer, one running top and the other running journalctl --follow, so that I could see what happens just before it crashes.
From that I have found 2 common things:
- plasmashell was at 100% CPU
- journalctl has an error like this: watchdog: BUG: soft lockup - CPU#3 stuck for 23s! [plasmashell:37158]
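Since the lockup message has a regular shape, the CPU number, stall duration, process name, and PID can be pulled out of saved journalctl output to compare crashes. This is only a minimal sketch: the function name parse_soft_lockup is hypothetical, and the pattern is based solely on the single message format quoted above.

```python
import re

# Pattern based on the one message format seen in this journal; other kernels
# may word it differently (assumption).
LOCKUP_RE = re.compile(
    r"watchdog: BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^\]]+):(?P<pid>\d+)\]"
)


def parse_soft_lockup(line):
    """Return a dict describing a soft-lockup line, or None if it does not match."""
    match = LOCKUP_RE.search(line)
    if match is None:
        return None
    return {
        "cpu": int(match.group("cpu")),
        "seconds": int(match.group("secs")),
        # The name itself may contain ':' (e.g. kworker/3:1); the PID is
        # whatever follows the last colon inside the brackets.
        "comm": match.group("comm"),
        "pid": int(match.group("pid")),
    }


if __name__ == "__main__":
    sample = "watchdog: BUG: soft lockup - CPU#3 stuck for 23s! [plasmashell:37158]"
    print(parse_soft_lockup(sample))
```

Running it over each saved log would show whether the stuck process is always plasmashell or varies between crashes.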
Today there was some additional info:
watchdog: BUG: soft lockup - CPU#3 stuck for 23s! [plasmashell:37158]
Modules linked in: binfmt_misc udp_diag tcp_diag inet_diag md4 cmac nls_utf8 cifs libarc4 dns_resolver fscache libdes af_packet nft_objref nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_tables ebtable_nat ebtable_broute ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security ip_set nfnetlink ebtable_filter ebtables rfkill ip6table_filter ip6_tables iptable_filter ip_tables x_tables bpfilter vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock snd_seq_midi snd_seq_midi_event snd_seq snd_ens1371 snd_ac97_codec ac97_bus gameport snd_rawmidi intel_rapl_msr intel_rapl_common kvm_intel snd_seq_device snd_pcm snd_timer snd kvm soundcore vmw_balloon irqbypass e1000 mptctl vmw_vmci joydev i2c_piix4 pcspkr efi_pstore
tiny_power_button button ac nls_iso8859_1 nls_cp437 vfat fat fuse configfs hid_generic usbhid sr_mod cdrom ata_generic vmwgfx drm_kms_helper crct10dif_pclmul crc32_pclmul ghash_clmulni_intel xhci_pci xhci_pci_renesas syscopyarea sysfillrect sysimgblt xhci_hcd fb_sys_fops cec rc_core ttm uhci_hcd drm aesni_intel glue_helper crypto_simd ehci_pci ehci_hcd cryptd usbcore ata_piix serio_raw mptspi scsi_transport_spi mptscsih mptbase btrfs blake2b_generic libcrc32c crc32c_intel xor raid6_pq sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua msr efivarfs
CPU: 3 PID: 37158 Comm: plasmashell Tainted: G W 5.11.4-1-default #1 openSUSE Tumbleweed
Hardware name: VMware, Inc. VMware7,1/440BX Desktop Reference Platform, BIOS VMW71.00V.16722896.B64.2008100651 08/10/2020
RIP: 0010:native_queued_spin_lock_slowpath+0x20/0x1d0
Code: 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 0f 1f 44 00 00 ba 01 00 00 00 8b 07 85 c0 75 09 f0 0f b1 17 85 c0 75 f2 c3 f3 90 <eb> ed 81 fe 00 01 00 00 74 43 40 30 f6 85 f6 75 65 f0 0f ba 2f 08
RSP: 0000:ffffb9aa82f37c88 EFLAGS: 00000202
RAX: 0000000000000008 RBX: ffff9f5340000000 RCX: 0000000000000027
RDX: 0000000000000001 RSI: 0000000000000008 RDI: fffff8b601ee8028
RBP: ffffb9aa82f37cc0 R08: fffff8b601ee8028 R09: 0000000081000200
R10: 000000007ba00fff R11: 0000000000000000 R12: ffff9f53bba00000
R13: ffff9f54d3357080 R14: 00007f2a6e800000 R15: 0000000000000000
FS: 00007f2d543fb980(0000) GS:ffff9f5575ec0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f2a6e7fffe0 CR3: 0000000107bea006 CR4: 00000000001706e0
Call Trace:
Occasionally I come back to the machine after it has been idle for a while but before it has crashed, and I have observed that all the swap space has been consumed. It seems like there must be a memory leak, although it seems odd that I can work on the machine all day and not have that problem until the machine is idle at night.
The VM has 8 GB of RAM allocated and an 8 GB swap partition.
Generally, when I leave the machine for the night there are only a few apps left running: Konsole, KSysGuard, and Chrome.
On the few occasions when I got to the machine before it crashed, I tried to find out what was consuming all that swap space but was unable to determine what it was.
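One possible way to narrow this down is to rank processes by the VmSwap field in /proc/<pid>/status. This is a minimal sketch rather than something already tried here, and the helper names parse_status and swap_by_process are hypothetical. As far as I know, VmSwap does not include swapped-out shared memory (shmem/tmpfs), so the totals may not add up to the swap usage reported by free.

```python
import os
import re

VMSWAP_RE = re.compile(r"^VmSwap:\s+(\d+)\s+kB", re.MULTILINE)
NAME_RE = re.compile(r"^Name:\s+(.*)$", re.MULTILINE)


def parse_status(text):
    """Return (name, swap_kb) from the text of a /proc/<pid>/status file."""
    name = NAME_RE.search(text)
    swap = VMSWAP_RE.search(text)
    # Kernel threads have no VmSwap line; treat that as 0 kB.
    return (name.group(1) if name else "?", int(swap.group(1)) if swap else 0)


def swap_by_process(proc_root="/proc"):
    """Return (swap_kb, pid, name) tuples for processes using swap, largest first."""
    rows = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a PID directory
        try:
            with open(os.path.join(proc_root, entry, "status")) as fh:
                name, swap_kb = parse_status(fh.read())
        except OSError:
            continue  # process exited or is not readable
        if swap_kb:
            rows.append((swap_kb, int(entry), name))
    return sorted(rows, reverse=True)


if __name__ == "__main__":
    for swap_kb, pid, name in swap_by_process()[:15]:
        print(f"{swap_kb:>10} kB  {pid:>7}  {name}")
```

Running this periodically (for example from one of the ssh sessions) and appending the output to a file would leave a record of which process was growing before the next crash.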
The host machine is an Intel i7 with 32 GB of RAM.
The “stuck” CPU problem has never happened with the host machine (Windows 10), nor has it happened with any of my Windows 10 virtual machines.
I am only using 2 additional repos, one for Skype and the other for VLC (added following the TW instructions).
Here’s the repository list:
Repository priorities in effect:
90 (raised priority) : 1 repository
99 (default priority) : 7 repositories
# | Alias | Name | Enabled | GPG Check | Refresh
--+----------------------------------+---------------------------+---------+-----------+--------
1 | download.opensuse.org-non-oss | Main Repository (NON-OSS) | Yes | (r ) Yes | Yes
2 | download.opensuse.org-oss | Main Repository (DEBUG) | Yes | (r ) Yes | Yes
3 | download.opensuse.org-oss_1 | Main Repository (Sources) | Yes | (r ) Yes | Yes
4 | download.opensuse.org-oss_2 | Main Repository (OSS) | Yes | (r ) Yes | Yes
5 | download.opensuse.org-tumbleweed | Main Update Repository | Yes | (r ) Yes | Yes
6 | google-chrome | google-chrome | Yes | (r ) Yes | Yes
7 | openSUSE-20210215-0 | openSUSE-20210215-0 | No | ---- | ----
8 | skype-stable | skype (stable) | Yes | (r ) Yes | Yes
9 | vlc | VLC | Yes | (r ) Yes | Yes
The above error occurred last night while running TW release 20210311, but similar errors have been occurring with the builds from the last 2 weeks or so. I just did a zypper dup up to 20210312.
I would greatly appreciate any help in how to best go about debugging this issue.
Thanks!