System fails hard with IO errors, bus errors and segfaults

Hello, I’m not sure this is the proper subforum for such a question, but anyway, here it is.
I’m using Tumbleweed on a Dell T5600 workstation. Sometimes (rarely, but today it has happened twice!), all of a sudden all terminal commands start failing, printing “Segmentation fault”, “bus error” and “I/O error” messages. Shortly after, the system freezes, the Plasma desktop disappears, the background goes black and the only things I can still see are my open windows. At this point, there’s nothing I can do apart from restarting the machine using the Ctrl+Alt+SysRq+B key combination.
It has happened both during light usage scenarios and during very CPU-intensive tasks (today it has happened twice while compiling AOSP - the Android OS).
I am very worried because I’m afraid some piece of hardware is failing, but I’m not really sure it is a hardware problem.
Does anyone have any suggestions for investigating the issue further?
I will check journalctl in a short while and I’ll post any relevant line here.
Thanks!

A search for the hardware suggests that it was probably produced around 2013, which makes it about 8 years old now …

  • So yes, it could be a hardware failure …

Have you tried replacing the (BIOS CMOS) battery on the mainboard?

Why Tumbleweed?

  • Given the age of the box, Leap should be perfectly usable …

The reason I’m unsure whether it is a hardware failure is that the seller who sold me the workstation a couple of months ago told me it was refurbished and tested…
I haven’t tried replacing the CMOS battery.
I’m using Tumbleweed because I am a computer engineering student & I do Android development in my free time, and I want to have the latest versions of development software & tools.

It is worth mentioning that I was using a custom kernel from which I had stripped lots of modules using modprobed-db; maybe that is the reason… I’ll investigate more deeply in the next few days.

Have you run a memory test using memtest or a similar tool?

I would do that as the first step because the symptoms are just what you might encounter if the system was having issues with RAM and/or CPU heating.
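
If the memory test comes back clean, the kernel log from the boot that crashed may help narrow things down. Below is a rough Python sketch for tallying suspicious lines. It assumes persistent journald logging is enabled, so that `journalctl -k -b -1` can still show the previous boot. The pattern list and the category names are my own guesses for illustration, not an exhaustive catalogue of hardware error messages.

```python
import re
from collections import Counter

# Illustrative patterns only (an assumption, not a complete list of
# what the kernel can log when hardware misbehaves).
PATTERNS = {
    "storage": re.compile(r"I/O error|ata\d+.*(failed command|exception)", re.I),
    "memory_cpu": re.compile(r"machine check|\bmce\b|\bEDAC\b|Hardware Error", re.I),
    "crash": re.compile(r"segfault at|general protection|\bOops\b", re.I),
}


def classify(line):
    """Return the first category whose pattern matches the line, or None."""
    for name, pattern in PATTERNS.items():
        if pattern.search(line):
            return name
    return None


def summarize(lines):
    """Count how many lines fall into each category."""
    counts = Counter()
    for line in lines:
        category = classify(line)
        if category is not None:
            counts[category] += 1
    return counts
```

One way to use it would be to save the log with `journalctl -k -b -1 > boot.log` and then call `summarize(open("boot.log"))`. Many "storage" hits would point towards the disk or its cabling, while "memory_cpu" hits would make RAM or overheating more likely. An empty result does not prove the hardware is fine, since a hard freeze can stop the journal from being written at all.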

Young engineers should be taught to be pragmatic – I would change to another engineering school …

  • Which lecturer is teaching that custom kernels should be used on machines being used for software development?
    • If you’re writing code for Android, then the target is not the development system’s hardware – you’re writing code which will be cross-compiled …
    • If you’re testing your code in an ARM emulator, then I would be very careful about using “latest versions” – the emulator’s support for the newest versions should be viewed with suspicion – you may well end up being an emulator tester as well as proving that your own code runs in an ARM environment …

Lately I experienced similar problems while using a certain external USB device. Unplugging and re-plugging the cable a couple of times solved the problem; but not for very long.
It seems that corrosion is a problem with my 9-year-old system…

Nope, I haven’t. Will run memory testing soon.

I’m using Tumbleweed because, as I’ve mentioned, I often need and/or want up-to-date packages: I needed Java 14 as soon as it was released, in a short while I’ll need a very up-to-date C++ compiler, etc. Having the latest Plasma desktop is a big bonus too.
I’m using a custom kernel because I like trying new things and tweaking stuff. I implemented some patches from Zen kernel & the MuQSS CPU scheduler. This kind of stuff has never given me trouble on other computers, but I’ll try to revert to a vanilla kernel for a while to see if it makes a difference.

Did you keep a copy of the standard kernel? You can certainly break things rolling your own. It’s nice to experiment and all, but you need to leave yourself a way back in case… :’(
You can keep several kernels and choose which to use at boot, but you need to follow the rules.

You’re going the Gentoo way (or something similar), not the SUSE way.

Java 14 is available for Leap & TW: https://software.opensuse.org/search?utf8=✓&baseproject=ALL&q=java+14

With Leap, you can install several C/C++ compilers in different versions.

It is better to use openSUSE’s kernels, not vanilla ones.

Choose what to do: tune custom kernels or develop software.