OpenSUSE Keeps "crashing"

I have been having problems with SUSE 15.1 “crashing”.

It started with the mouse and/or keyboard ceasing to work. Usually re-plugging the USB device would solve the problem, and the system would come back to life. Then it got to the point where the only way back was a reboot.

Then Thunderbird failed to load. I gave the desktop box to my brother, who was able to get it working again. He suspected a fault in the HDD. I purchased a new HDD (Seagate 2TB), the seller installed Windows, and my brother installed SUSE 15.1 from an ISO he had. (The Seagate replaced a WD 1TB.)

The problems persisted, so I gave the box to my brother again. He was able to fix it, but he advised me that it might be beneficial to upgrade my hardware. So I purchased a new motherboard and CPU. The problems persisted, so I purchased a new power supply (650W, up from 470W). The problems persisted. As well as the USB dropouts, I was getting errors “Configuration file [somefile] not writeable” (refer to an earlier thread, “Configuration file not writeable”).

I refitted my old WD drive (1TB), reinstalled SUSE 15.1 (from the same ISO my brother had and gave to me), and the problems seemed to go away. So maybe the system doesn’t like Seagate 2TB 7200rpm drives. So I purchased a new WD 2TB drive (WD20EZAZ 2TB), installed Windows 10, and SUSE 15.1. That performed ok for a few days, then the problems with the USB dropouts started again. This morning, GRUB failed and shut down the system (restarted ok).

The connection to “outside” is via a bridging modem and a Powerline pack with AC pass-through, to a modem connected to the Fibre to the Node optic fibre (Australian NBN) network. This is only because the phone connection to the NBN is in the kitchen, and the computer is in the study, aka bedroom.

I have done nothing to alter the default behaviour of openSUSE. When I install, I choose the Plasma option – should I be opting for GNOME? I chose to have a separate /home partition. And I chose to have a larger swap partition (even on the 1TB drives).

It’s probably only since connecting to the NBN that I have been having these problems.

I am running out of hair to pull trying to figure out what is going on here.


greg@linux-tqjb:~> sudo fdisk -l
[sudo] password for root:
Disk /dev/sda: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors
Disk model: WDC WD20EZAZ-00G
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: dos
Disk identifier: 0x4bd16b36

Device     Boot      Start        End    Sectors   Size Id Type
/dev/sda1  *          2048     104447     102400    50M  7 HPFS/NTFS/exFAT
/dev/sda2           104448 1301272282 1301167835 620.5G  7 HPFS/NTFS/exFAT
/dev/sda3       1301272576 1302323199    1050624   513M 27 Hidden NTFS WinRE
/dev/sda4       1302323200 3907029167 2604705968   1.2T  f W95 Ext'd (LBA)
/dev/sda5       1302325248 2182000639  879675392 419.5G 83 Linux
/dev/sda6       2182002688 3878436863 1696434176 808.9G 83 Linux
/dev/sda7       3878438912 3907029167   28590256  13.6G 82 Linux swap / Solaris

A possibly good place to start is to try to determine what your machine was doing just before it crashed.

You can display the last 100 lines of the system log for your previous boot. Change the “-1” to display logs from boots further back in time (“-2”, “-3”, and so on); you just need to remember which sessions crashed and which did not. Change the “100” to display fewer or more lines.

journalctl -b -1 -n 100
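If you end up checking several boots, the two numbers in that command can be wrapped in a small helper. This is only a sketch in Python: the function name `journal_tail_cmd` is made up for illustration, and it assumes nothing beyond journalctl's `-b`, `-n` and `--no-pager` options.

```python
import shlex

def journal_tail_cmd(boots_back: int = 1, lines: int = 100) -> list[str]:
    """Build a journalctl command showing the tail of an earlier boot.

    boots_back=0 is the current boot, 1 the previous one, 2 the one before that.
    """
    if boots_back < 0 or lines < 1:
        raise ValueError("boots_back must be >= 0 and lines must be >= 1")
    offset = "0" if boots_back == 0 else f"-{boots_back}"
    return ["journalctl", "--no-pager", "-b", offset, "-n", str(lines)]

# Print the commands instead of running them, so they can be checked first.
print(shlex.join(journal_tail_cmd()))       # journalctl --no-pager -b -1 -n 100
print(shlex.join(journal_tail_cmd(2, 50)))  # journalctl --no-pager -b -2 -n 50
```

The returned list can be handed to `subprocess.run` on the affected machine. Looking at earlier boots only works once the journal is stored persistently.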

My condolences for spending so much on your machine… Sounds like it’s practically brand new by now!

TSU

Thanks TSU.

Well, I don’t think there is a single piece of hardware that hasn’t been replaced. New case. New PS. New HDD. New DVD. New MB / CPU. New fans!


greg@linux-tqjb:~> sudo journalctl -b -1 100
Specifying boot ID or boot offset has no effect, no persistent journal was found.

(have made journal persistent)

Another problem is that Firefox just suddenly shuts down. Mozilla gets informed, but Firefox then will not restart, and this seems to corrupt the system shutdown (shutdown is only possible through the reset button).

I read somewhere about GPT partitioning and drives >2TB. Any likelihood the OS or maybe the BIOS doesn’t like 2TB drives?

The annoying thing is that I can use old hardware with small HDDs and never have these problems.

You may or may not believe this. I was moving the mouse to log out of this forum and the **** system locked up! I had to use the reset button. After I rebooted, I opened Firefox and was still logged in here.

You did a lot of changes to your hardware!

To begin with

  • Did you check that all cables are connected properly and are plugged into the correct places?
  • Have you checked for any broken cables?
  • Are all memory modules correctly plugged in?
  • Are all bus devices (graphics card, WLAN-card, …) correctly plugged in?

You do have MS Windows on your system. Do you have the same or similar problems using it?

When you are using USB-devices do you plug them into the ports provided by the motherboard or into some remote ports provided by your computer case?

It might be helpful if you could provide details (manufacturer, model, …) of your motherboard, CPU, memory modules, power supply, …

Regards

susejunky

susejunky,

It only seems to be problematic when I run a 2TB HDD.

Anyway, new hardware:

GigaByte AMD D3SH M/B
AMD Ryzen 5 2600 6 core CPU (I think)
2 x Team 8GB DDR4 2666 RAM
Thermaltake LitePower 650W P/S
Seagate ST2000D 2TB HDD (causes problems)
or
WD WD20EZAZ 2TB HDD (causes problems)
or
WD WD1003 1TB (doesn’t cause problems)

Also, getting:

Message from syslogd … BUG: soft lockup – CPU#1 stuck for 22s! (threaded-ml:2244)

Your journal should have been persistent by default.
Do you have any idea how that might have changed?
Was this a fresh install 15.1 or did you upgrade from an earlier version?
Did you make your journal persistent by changing the setting in the following file?

/etc/systemd/journald.conf

If you get that working properly,
it should give you best chance of determining the problem.
You might notice, for instance, that the “soft lockup” error on its own is insufficient information to track down your problem.

TSU

Sorry, but I could not find any “GigaByte AMD D3SH M/B”. There are some “DS3H” motherboards, but you would have to be more specific to allow identification of your board.

However even older motherboards should support disks up to 2TB.
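The arithmetic behind that can be sketched as follows. It assumes the commonly cited MBR limit (32-bit sector addresses with 512-byte logical sectors); the sector count is the one fdisk reported at the top of this thread, and the function name is only illustrative.

```python
# Assumption: classic MBR with 32-bit sector addresses and 512-byte logical sectors.
SECTOR_BYTES = 512
MBR_MAX_SECTORS = 2**32

def fits_in_mbr(total_sectors: int) -> bool:
    """True if every sector of the disk is addressable from an MBR partition table."""
    return total_sectors <= MBR_MAX_SECTORS

disk_sectors = 3907029168  # /dev/sda, from the fdisk output in this thread

print(MBR_MAX_SECTORS * SECTOR_BYTES)  # 2199023255552 bytes = 2 TiB
print(disk_sectors * SECTOR_BYTES)     # 2000398934016 bytes, about 1.8 TiB
print(fits_in_mbr(disk_sectors))       # True
```

So a drive sold as "2TB" still sits below the MBR ceiling; the limit only starts to bite on drives larger than that.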

So again:

  • Have you checked for broken/not properly connected cables?
  • Does MS Windows show the same problems?

You could try to

  • update the firmware (BIOS/UEFI) of your motherboard.
  • format your 2TB disks using a GPT partition scheme.

As far as I know, a ‘soft lockup’ happens when “something” causes the kernel to loop in kernel mode for more than a predefined period of time without giving other tasks a chance to run. Are there any other log messages that could help to identify the culprit?

Run as “root”

# journalctl -b 0 -p 3

and show the result here.
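If the journal turns out to contain many of these messages, a rough way to see which task is involved each time is to tally them. The sketch below is in Python and assumes the message layout quoted earlier in this thread (the kernel normally prints a plain hyphen and square brackets, the forum copy shows an en dash and parentheses, so both are accepted). The helper names are made up for illustration.

```python
import re
from collections import Counter

# Accepts both "soft lockup - CPU#1 stuck for 22s! [task:pid]" and the
# forum-mangled variant with an en dash and parentheses.
SOFT_LOCKUP = re.compile(
    r"soft lockup\s*[-\u2013]\s*CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s!"
    r"\s*[\[(](?P<task>[^\])]+?):(?P<pid>\d+)[\])]"
)

def parse_soft_lockup(line: str):
    """Return CPU, duration, task name and PID from a soft lockup line, or None."""
    m = SOFT_LOCKUP.search(line)
    if m is None:
        return None
    return {
        "cpu": int(m["cpu"]),
        "secs": int(m["secs"]),
        "task": m["task"],
        "pid": int(m["pid"]),
    }

def lockups_by_task(lines):
    """Count soft lockup messages per task name."""
    hits = (parse_soft_lockup(line) for line in lines)
    return Counter(hit["task"] for hit in hits if hit)

sample = "Message from syslogd … BUG: soft lockup – CPU#1 stuck for 22s! (threaded-ml:2244)"
print(parse_soft_lockup(sample))
```

Feeding it the lines of saved journal output would show whether the lockups always hit the same task or are spread across unrelated ones, which hints at software versus hardware.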

Regards

susejunky

No idea. Installed like that? I didn’t touch it until I discovered it might help with the diagnosis.

Did you make your journal persistent by changing the setting in

/etc/systemd/journald.conf

Yes. Every option was #d. Original option was ‘auto’, changed to ‘persistent’.

Sorry, but I could not find any “GigaByte AMD D3SH M/B”. There are some “DS3H” motherboards, but you would have to be more specific to allow identification of your board.

My bad, it’s a Gigabyte B450M S2H. And the CPU is a Ryzen 3200G with Radeon graphics. Network is also ‘onboard’.

However even older motherboards should support disks up to 2TB.

So again:

  • Have you checked for broken/not properly connected cables?

Yes. Actually took the MB out and re-installed it.

  • Does MS Windows show the same problems?

Win 10. No problems, although I don’t use Win that often. Only have it because I need it for some very specific apps.

You could try to

  • update the firmware (BIOS/UEFI) of your motherboard.

It has the latest BIOS (well, it was the latest when I purchased it about 3 months ago).

  • format your 2TB disks using a GPT partition scheme.

I just let the SUSE installer decide how to format. So presumably it formats as GPT.

Just out of interest, I have the WD20EZAZ installed against the “old” hardware (MB, RAM, PS) while I’m responding, and it has been behaving itself quite nicely.

You can check here:
B450M S2H (rev. 1.x) Support | Motherboard - GIGABYTE Global
to see whether your BIOS/UEFI is up-to-date or not.

Hmmm, …

… this is definitely NO GPT scheme. So probably whoever installed MS Windows first used an MBR scheme, and the openSUSE installer will not change the partition scheme.
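The giveaway in the fdisk output is the "Disklabel type" line: fdisk prints `dos` for an MBR table and `gpt` for GPT. A small Python sketch for pulling that value out of saved output (the function name is hypothetical, and the sample text is trimmed from the listing in the first post):

```python
import re

def disklabel_type(fdisk_output: str):
    """Return the value of the 'Disklabel type:' line from `fdisk -l` output, or None."""
    m = re.search(r"^Disklabel type:\s*(\S+)", fdisk_output, re.MULTILINE)
    return m.group(1) if m else None

sample = """Disk /dev/sda: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors
Disk model: WDC WD20EZAZ-00G
Disklabel type: dos
Disk identifier: 0x4bd16b36
"""

label = disklabel_type(sample)
print("MBR" if label == "dos" else label)  # MBR
```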

If MS Windows shows no errors, then you definitely need to check your openSUSE journal for errors.

Regards

susejunky

  1. For Ryzen 3200G (Picasso) you need B1 BIOS revision.
    https://www.gigabyte.com/Motherboard/B450M-DS3H-rev-10/support#support-cpu

  2. For Ryzen 3200G (Picasso) you need Leap 15.2, or 15.1 with a newer kernel, not the standard 4.12.

  3. The WDC WD20EZAZ was designed as a cheap, low-speed media vault, not as a system disk.
    Additionally, it uses ill-fated SMR recording. If you can, return it to the seller.
    Use an SSD as the system disk; NVMe is the better option.

  4. Use UEFI + GPT.
    You need to reinstall Leap, and, maybe Windows.

According to this https://forums.opensuse.org/showthread.php/543271-OpenSUSE-Keeps-quot-crashing-quot?p=2957164#post2957164 post OP uses a Gigabyte B450M S2H and not a Gigabyte B450M DS3H.

However both boards support a Ryzen 3200G CPU when firmware version F40 (or newer) is installed. If the board was purchased after May 2019 that probably is the case anyway.
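As a quick sanity check, the "F40 or newer" condition can be expressed in a few lines of Python. This is only a sketch: it assumes the version strings have the form "F" plus a number, optionally followed by a single lower-case letter, and the function names are made up. On Linux the installed version can usually be read from `/sys/class/dmi/id/bios_version`.

```python
import re

def firmware_number(version: str) -> int:
    """Extract the numeric part of a version string such as 'F40' or 'F51a'."""
    m = re.fullmatch(r"[Ff](\d+)[a-z]?", version.strip())
    if m is None:
        raise ValueError(f"unexpected firmware version format: {version!r}")
    return int(m.group(1))

def firmware_at_least(installed: str, required: str = "F40") -> bool:
    """True if the installed firmware is the required version or newer."""
    return firmware_number(installed) >= firmware_number(required)

print(firmware_at_least("F50"))  # True
print(firmware_at_least("F32"))  # False
```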

Regards

susejunky

I was able to format as a GPT, however that was even worse! The system would hang during installation! Or the system would hang during boot! Or the system would hang after login!

Anyway, I’ve gone back to the original MBR/BIOS setup, running on the old MB/CPU and only 8GB DDR3, and - touch wood - no problems. And I can even have IOMMU enabled with no error messages on boot.

FYI:


greg@localhost:~> cd /etc/systemd/
greg@localhost:/etc/systemd> more journald.conf
#  This file is part of systemd.
#
#  systemd is free software; you can redistribute it and/or modify it
#  under the terms of the GNU Lesser General Public License as published by
#  the Free Software Foundation; either version 2.1 of the License, or
#  (at your option) any later version.
#
# Entries in this file show the compile time defaults.
# You can change settings by editing this file.
# Defaults can be restored by simply deleting this file.
#
# See journald.conf(5) for details.

[Journal]
#Storage=auto
Storage=persistent
#Compress=yes
#Seal=yes

By default, journald.conf has ALL options commented out!
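For anyone wanting to check which setting actually takes effect in a file like this, here is a minimal Python sketch. It assumes the file parses as a plain INI file (true for the excerpt above), treats a missing or commented-out `Storage=` as the compile-time default `auto` that the file itself shows, and uses a made-up function name.

```python
import configparser

def journal_storage(conf_text: str) -> str:
    """Return the effective Storage= value from journald.conf text."""
    # strict=False lets a repeated key override an earlier one instead of raising.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(conf_text)
    # Lines starting with '#' are comments, so only uncommented keys count.
    return parser.get("Journal", "Storage", fallback="auto")

edited = """[Journal]
#Storage=auto
Storage=persistent
#Compress=yes
#Seal=yes
"""

untouched = """[Journal]
#Storage=auto
#Compress=yes
#Seal=yes
"""

print(journal_storage(edited))     # persistent
print(journal_storage(untouched))  # auto
```

With `auto`, journald only keeps logs across reboots if the directory `/var/log/journal` exists, which would explain the earlier "no persistent journal was found" message.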

However both boards support a Ryzen 3200G CPU when firmware version F40 (or newer) is installed

BIOS version F50.

There seems to be a later version https://download.gigabyte.com/FileList/BIOS/mb_bios_b450m-s2h_f51.zip

You could try to use a later kernel with openSUSE Leap 15.2 (from repository https://download.opensuse.org/repositories/Kernel:/stable/standard/).

The easiest way to test whether a newer kernel would help is to use an openSUSE Tumbleweed Live system (https://download.opensuse.org/tumbleweed/iso/openSUSE-Tumbleweed-KDE-Live-x86_64-Current.iso).

Regards

susejunky