I think my homeserver is going to die

And it’s not even 14 years old. :upside_down_face: Well, I knew the day would come…

Here’s the hardware background:

I kept the server running 24/7, always with quite a bad conscience. Eventually, I configured it to shut down at night after the backup and come back on via WOL, any key, or the BIOS clock. Starting some weeks ago, a couple of times but not yet reproducibly, it didn’t. It refused to start even with the regular power button. Only disconnecting it from the power supply for some time let it start from the power button again. The other methods work well the next day. Everything I can think of is screaming “capacitors”. Due to its age I’m not going to try any repair or pursue this in the hardware section, unless someone comes up with a very easy remedy.

The thing is, I spent ages setting everything up, i.e. printer and scanner server, file sharing, backups, Twonkyserver, Nextcloud, Zoneminder etc. At last, everything is working as I want it to. Although I have documented everything quite meticulously, it would still take ages to get it all running the same way again. So, after getting a new mobo, I am going to try to move the system to it without a fresh install.

The questions:
Does anybody have advice on what to take care of, what to do, what NOT to do?
Anything I should take care of before decommissioning the old mobo?

More details:

lanserv:~ # inxi -SMNDmPG
System:    Host: lanserv Kernel: 5.14.21-150400.24.55-default x86_64 bits: 64 Console: pty pts/0 Distro: openSUSE Leap 15.4
Machine:   Type: Desktop Mobo: Gigabyte model: GA-790XTA-UD4 serial: N/A BIOS: Award v: F2 date: 12/03/2009
Memory:    RAM: total: 17.5 GiB used: 1.15 GiB (6.6%)
           Array-1: capacity: 32 GiB note: est. slots: 4 EC: None
           Device-1: A0 size: 2 GiB speed: 1800 MT/s
           Device-2: A1 size: No Module Installed
           Device-3: A2 size: 8 GiB speed: 1800 MT/s
           Device-4: A3 size: 8 GiB speed: 1800 MT/s
Graphics:  Device-1: NVIDIA G98 [GeForce 8400 GS Rev. 2] driver: nouveau v: kernel
           Display: server: X.Org 1.20.3 driver: loaded: nouveau unloaded: fbdev,modesetting,nv,vesa
           resolution: 1920x1080~60Hz
           OpenGL: renderer: llvmpipe (LLVM 11.0.1 128 bits) v: 4.5 Mesa 21.2.4
Network:   Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet driver: r8169
Drives:    Local Storage: total: 3.4 TiB used: 761.39 GiB (21.9%)
           ID-1: /dev/sda vendor: Seagate model: BarraCuda Q1 SSD ZA240CV10001 size: 223.57 GiB
           ID-2: /dev/sdb vendor: Seagate model: ST3500514NS size: 465.76 GiB
           ID-3: /dev/sdc vendor: Marvell model: Raid VD 0 size: 931 GiB
           ID-4: /dev/sdd vendor: Western Digital model: WD20NPVZ-00WFZT0 size: 1.82 TiB
Partition: ID-1: / size: 60 GiB used: 18.43 GiB (30.7%) fs: btrfs dev: /dev/sda2
           ID-2: /home size: 12.51 GiB used: 1.59 GiB (12.7%) fs: xfs dev: /dev/sda3
           ID-3: /opt size: 60 GiB used: 18.43 GiB (30.7%) fs: btrfs dev: /dev/sda2
           ID-4: /tmp size: 60 GiB used: 18.43 GiB (30.7%) fs: btrfs dev: /dev/sda2
           ID-5: /var/log size: 60 GiB used: 18.43 GiB (30.7%) fs: btrfs dev: /dev/sda2
           ID-6: /var/tmp size: 60 GiB used: 18.43 GiB (30.7%) fs: btrfs dev: /dev/sda2
           ID-7: swap-1 size: 2.01 GiB used: 0 KiB (0.0%) fs: swap dev: /dev/sda1

I’m going to stick with a bare metal install, legacy boot from MBR, no other OS involved, no secure boot. I’m preferring Asus for the mobo, AMD CPU, some basic graphics just for the convenience. System is on sda! sdc is the data raid, sdb / sdd for backups.

I’ll back up /etc/. I assume I need to take care of grub / boot and kernel / firmware, and modify /etc/fstab.
Any advice possible, already?
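
On the /etc/fstab point, one thing that can be checked ahead of time is whether any entry refers to a kernel device name such as /dev/sda2. Those names depend on controller and probe order, so they may come out differently on another board, whereas UUID= or LABEL= entries follow the filesystem itself. Below is a minimal sketch in Python; the function name `unstable_fstab_entries` and the list of name prefixes are my own assumptions for illustration, not part of any tool. Many installers already write UUIDs, so there may be nothing to change.

```python
import re

# Kernel-assigned names (/dev/sda2, /dev/nvme0n1p1, ...) depend on controller
# and probe order, so they may differ after the disks move to another board.
# The prefix list is an assumption and is not exhaustive.
UNSTABLE = re.compile(r"^/dev/(sd|hd|vd|nvme)")


def unstable_fstab_entries(fstab_text):
    """Return (device, mount_point) pairs that use kernel device names."""
    hits = []
    for line in fstab_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        device, mount_point = fields[0], fields[1]
        if UNSTABLE.match(device):
            hits.append((device, mount_point))
    return hits
```

To use it, read the file and pass the text in, e.g. `unstable_fstab_entries(open("/etc/fstab").read())`. An empty list means no entry relies on an sdX-style name.
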
I haven’t started to look for the new hardware, yet. It will take some time as spring has come and the garden is demanding its caretaking. I’ll be grateful for any help, process may be slow - no offence!!

Thanks to all and have a nice weekend! Happy Easter to the ones who care! :wink:

@kasi042 Hi, you might need to select some older hardware that still has a legacy boot option…

Hi Malcolm,

Yes, thanks! I’m certainly not looking for cutting edge. If I have to switch to UEFI I may be doomed, right?

@kasi042 well as long as it has the option for legacy boot or UEFI boot… If you find a board, get the User Manual and check the BIOS settings section.
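
To confirm which mode a running Linux system was actually booted in (before and after the move), a common check is whether the kernel exposes the directory /sys/firmware/efi, which is only present on a UEFI boot. A minimal sketch in Python; the function name `boot_mode` and its `sys_root` parameter are made up here for illustration and so the check can be tried against a test directory.

```python
import os


def boot_mode(sys_root="/sys"):
    """Return 'UEFI' if the EFI firmware directory exists, else 'legacy BIOS'."""
    # On a UEFI boot the kernel populates <sys_root>/firmware/efi;
    # on a legacy (BIOS/CSM) boot that directory is absent.
    efi_dir = os.path.join(sys_root, "firmware", "efi")
    return "UEFI" if os.path.isdir(efi_dir) else "legacy BIOS"
```

Calling `boot_mode()` with no argument checks the live system. This only reports how the current session was booted, not which modes the board supports.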

Do the capacitors look ok on the motherboard? It could just be the CMOS battery. Is the power supply all ok?

Not sure! :thinking: That’s why I’m a Mechanical Engineer, rather than an Electrical Engineer. :smiley:
But hey:

I first suspected the power supply but it’s fairly new, maybe 2-3 years. I got reminded after opening the case and giving it a thorough vacuuming.
The CMOS battery may be worth a try. It’s not the first one, though. I didn’t think of this because the box recovered after being disconnected from the mains rather than losing its settings. However, I’ll get a new one and then we’ll see.
Thanks for the hint!

@mrmazda has info on “bulging capacitors”… maybe they will advise.

In my experience with aged hardware you need to take everything apart and reassemble a minimal setup, which is power supply, mainboard, CPU and RAM. Anything can happen. It started with a Puzzling Keyboard and ended with the replacement of the mainboard: Segfault Trouble Shooting.

Badcaps.net has all the information anyone could use about electrolytic capacitors, but basically if the top is not flat, and/or anything can be detected oozing from top or bottom, it’s either already failed, or destined to fail sooner than later. Newer motherboards have largely switched from using electrolytics to polys for all the larger values, mostly those near CPU, so finding bad ones is much less common than in the decade starting around 2002. Polys are typically much closer in height to width than high value electrolytics, and don’t have plastic wraps for labeling. A motherboard made in 2009 even with electrolytics is much less likely to have bad caps. This is not the case with PSUs. I’ve yet to see a PSU with a poly anywhere in it, and most less well made PSUs are equipped with less than the best electrolytics have to offer.

If a PSU is under warranty you would hope to use, you can’t get it open to inspect for obviously bad ones, because of a seal. Unless you have an outright failure, the only real way to tell if you have a problem with one is a swap for known good. A digital tester might report a serious problem, but is usually useless for intermittent issues. If warranty is expired, open it to check for obvious badness, as well as manufacturer’s names. If all the names are Japanese companies you can count on long life. Rubycon, (United) Chemicon (UCC), Panasonic/Matsushita, Nichicon produce the undisputed best. Second tier I can’t name, but are usually close behind. The rest you don’t want to see in your PSU. That list is long.

Before deciding what to do, spend as much time as possible running memtest86 (proprietary, with a free-to-use prior version; most up to date on newer chip technologies) or memtest86+ (FOSS). 4 full passes should be the minimum, but the more done, the more likely it will spot an intermittent failure. Memory can go bad from old age.

Hi all,
Thanks for the advice and actually confirming my thoughts on the hardware sector. I’m not intending to try any repair on the mobo. I’d mess it up, anyway. I must admit mentioning the “very easy remedy” was a rather rhetorical part. I’m not hoping for it.
The PSU was already replaced by a BeQuiet! Dark Power Pro maybe 3-4 years ago. Nothing is impossible, but I don’t suspect it.
So I’m rather focussing on moving the system (on sda) to a new mobo, hoping to avoid a fresh install. Any advice on this part is very welcome.
@malcolmlewis: In my desktop I have an ASUS Prime X570-P and it’s working fine with legacy boot. The word “legacy” isn’t even mentioned in the manual and all hints referring to “boot” are rather sparse. I’ll keep an eye on this but probably I’ll just take the chance here.