I am running an openSUSE 13.2 system with ext4 filesystems. The system should survive hard reboots (power loss) as well as possible, because the power supply is not 100% stable (no UPS available).
So I want to configure a filesystem check with repair on each reboot.
But openSUSE uses systemd version 220, and the fsck.repair=yes parameter was only introduced in version 223.
Is there any other way to perform a file system check plus repair (including unsafe repairs) on boot?
Do you know a link where I can download systemd version >= 223 for openSUSE 13.2?
Thanks for any hints!
Uli
P.S.: Please do not answer that I should install a UPS. I agree that would be the best solution, but it’s not possible at the moment.
After a power failure the root file system is sometimes not clean, and therefore the system does not boot properly.
During the boot process fsck checks the root file system and finds errors, but it does not repair them. I guess the errors cannot be repaired safely.
Therefore the boot process stops.
A repair is not automatic; you must do it manually from a rescue disk. The normal way is to check the file system and replay the journal if needed. It is not certain that a repair will work, which is why it must be done manually.
You can be left with partially recovered files and a lost+found directory full of orphaned fragments.
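For reference, the manual procedure described above can be sketched as a small shell function. This is only a sketch under assumptions: the file system is ext2/3/4, it is not mounted read-write (for example, you are booted from a rescue disk), and the function name `repair_ext4` and the `FSCK` override variable are made up for illustration. It relies on the documented e2fsck exit status, where bit 4 means "errors left uncorrected".

```shell
#!/bin/sh
# Sketch: escalate from a safe e2fsck pass to a forced "answer yes" pass.
# FSCK is a hypothetical override (handy for dry runs); default is e2fsck.
FSCK="${FSCK:-e2fsck}"

# repair_ext4 DEVICE
# The device must NOT be mounted read-write while this runs.
repair_ext4() {
    dev="$1"

    # Pass 1: -p ("preen") replays the journal and fixes only what is safe.
    "$FSCK" -p "$dev"
    fsck_rc=$?

    # Bit 4 of the exit status means "errors left uncorrected".
    if [ $(( fsck_rc & 4 )) -ne 0 ]; then
        # Pass 2: -f forces a full check, -y answers "yes" to every question.
        # This may move damaged files into lost+found or discard data.
        "$FSCK" -f -y "$dev"
        fsck_rc=$?
    fi

    return "$fsck_rc"
}
```

From a rescue system this would be called as `repair_ext4 /dev/sda2`, where `/dev/sda2` stands in for your actual root partition. An exit status of 0 or 1 means the file system is now consistent; anything with bit 4 still set means even the forced pass could not fix everything.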
But the systems are running unattended and there is no chance to upgrade. Assume that I have a slow SSH connection and nothing more.
Therefore, besides the good suggestions to upgrade the system, I am looking for a way to repair the file system automatically. If something goes wrong, the system will not be running. OK, but it is also not running if I don’t try to repair it.
That’s why I am trying to do such “stupid” things.
First,
If you’re supporting a remote system with absolutely no possibility of local access (is there anyone on the premises who can execute specified instructions if necessary?), then you **must** look into implementing a “lights out” solution, which is essentially a minimal hardware computer running side by side with the main system that gives you access to it at a very low level. Better hardware designed to be deployed in a colo will have this option.
As noted,
Later systemd versions have this option, so the easiest solution is to upgrade, even if you have to build a machine locally with modern software, fully configured to replace your remote machine, and then ship it to the colo for them to swap in.
Else,
I suppose it might be possible to cobble together a custom software solution… You’d probably need to start with a full boot log to understand the boot sequence, then insert a trigger that activates your fsck script if needed. There are also a number of posted solutions on the Internet; look for one published around the same vintage as your OS (2011?). Anything posted for CentOS, RHEL or Fedora has a decent chance of working on openSUSE 13.2 as well.
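As a rough idea of what such a trigger could look like, here is a sketch that decides from the superblock whether a repair is worth attempting. It assumes an ext2/3/4 file system and parses the "Filesystem state:" and "Filesystem features:" lines printed by `dumpe2fs -h`; the function name `needs_fsck` is made up for illustration.

```shell
#!/bin/sh
# needs_fsck: reads `dumpe2fs -h DEVICE` output on stdin and returns 0 (true)
# if the superblock reports an unclean state or a journal awaiting replay.
needs_fsck() {
    state=""
    features=""
    while IFS= read -r line; do
        case "$line" in
            "Filesystem state:"*)    state="${line#*:}" ;;
            "Filesystem features:"*) features="${line#*:}" ;;
        esac
    done

    # No state line at all: we cannot tell, so err on the side of checking.
    [ -n "$state" ] || return 0

    # "not clean" and "... with errors" both call for a check.
    case "$state" in
        *"not clean"*|*error*) return 0 ;;
    esac

    # needs_recovery means the journal has not been replayed yet.
    case " $features " in
        *" needs_recovery "*) return 0 ;;
    esac

    return 1
}
```

It could be used along the lines of `dumpe2fs -h /dev/sda2 2>/dev/null | needs_fsck && echo "repair needed"`, with `/dev/sda2` standing in for the real device. Keeping the decision separate from the repair itself makes it easy to log what the script would have done before letting it loose on an unattended machine.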
the root partition (and all other partitions) are checked and fixed before they are mounted.
It sounds a bit complicated, but the dracut hook is required because otherwise the boot process stops if there is an error on the root file system. The hook now tries to fix such an error. It may not always work, but in that case no other manual repair would work either.
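For anyone who wants to try the same thing, a hook of this kind might look roughly like the sketch below. This is not the exact script from this thread but a minimal illustration under assumptions: the root file system is ext4, dracut passes the root device in `$root` as `block:/dev/...`, and the hook file and helper names (`fsck-root.sh`, `root_blockdev`, `fsck_root`, the `FSCK` override) are made up. To get it into the initramfs you would also need a small dracut module whose `install()` function registers the script with `inst_hook pre-mount ...` and copies `e2fsck` in with `inst_multiple`.

```shell
#!/bin/sh
# Sketch of a dracut pre-mount hook (hypothetical file name: fsck-root.sh).
# dracut hands the root device to hooks in $root, typically "block:/dev/...".

# root_blockdev SPEC: print the device path for a "block:" spec, fail otherwise.
root_blockdev() {
    case "$1" in
        block:/*) printf '%s\n' "${1#block:}" ;;
        *)        return 1 ;;
    esac
}

# fsck_root SPEC: run an unattended repair on the root device before mounting.
fsck_root() {
    dev=$(root_blockdev "$1") || return 0   # not a block device: nothing to do
    # -y answers "yes" to every repair question; data loss is possible.
    "${FSCK:-e2fsck}" -y "$dev"
    # Always report success so the boot continues even if errors remain.
    return 0
}

# Run only when dracut has set $root, so sourcing this file elsewhere is harmless.
if [ -n "${root:-}" ]; then
    fsck_root "$root"
fi
```

The hook deliberately ignores the e2fsck exit status: on an unattended machine, a boot attempt on a partly repaired file system is the only remaining option anyway. On a clean file system `e2fsck -y` returns almost immediately, so the cost on a normal boot is small.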