Recovering JFS Support

Hi
Here you go: [opensuse-factory] [PLEASE SPEAK UP] Disabling legacy file systems by default? - openSUSE Factory - openSUSE Mailing Lists

The blacklist was added, but it is up to the end user to enable legacy filesystems if they choose to use them; they won’t be enabled by default.

Hi everyone,

Honestly, I am not 100% sure where to ask this but here goes…

When I upgraded to 15.2 about 4 months ago, it was the first time in a very long time that I did a clean install. I had been using JFS for everything except the
boot partition for over 15 years. When I updated to 15.1, I had to un-blacklist JFS so that it would work with my system as I was doing a live update. Given the
blacklist though, I thought it would be best to switch to a different file-system during the 15.2 fresh install and I did.

Now, in just those 4 months, I already had my first file corruption, while I NEVER had one in more than 15 years of using JFS!

Where can we ask for JFS support back? A Google search shows an external article saying that ‘OpenSUSE is considering blacklisting filesystem’ but I cannot
find the actual discussion. Is it publicly available? It would be good to know what led to the decision because JFS has been incredibly stable and reliable while
also delivering very solid performance.

Thanks,

  • Itai

It was never really removed, the file you need to edit or remove is

/etc/modprobe.d/60-blacklist_fs-jfs.conf

And the reasoning is clear: JFS is not actively developed and has outstanding CVEs.

That being said, if you use ext4 and have corruption, I’d say the issue lies elsewhere and not the choice of FS.
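
For anyone who wants to check the state of that file before editing it, here is a minimal Python sketch. It assumes the file uses the standard modprobe.d `blacklist <module>` directive; the sample text in the comments and the helper names are hypothetical, and the real file on your system may contain additional lines.

```python
import os

# Path taken from the post above.
JFS_BLACKLIST = "/etc/modprobe.d/60-blacklist_fs-jfs.conf"


def blacklisted_modules(conf_text):
    """Return the set of module names blacklisted in modprobe.d-style text."""
    modules = set()
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) == 2 and parts[0] == "blacklist":
            # modprobe treats '-' and '_' in module names as equivalent.
            modules.add(parts[1].replace("-", "_"))
    return modules


def is_jfs_blacklisted(path=JFS_BLACKLIST):
    """True if the file exists and still blacklists jfs (hypothetical helper)."""
    if not os.path.exists(path):
        return False
    with open(path, "r", encoding="utf-8") as handle:
        return "jfs" in blacklisted_modules(handle.read())


# Hypothetical file content, before and after commenting out the directive:
#   blacklisted_modules("# legacy fs\nblacklist jfs\n")    -> contains "jfs"
#   blacklisted_modules("# legacy fs\n# blacklist jfs\n")  -> empty set
```

Commenting out the `blacklist jfs` line (or removing the file) is the "edit or remove" step described above; the sketch only reads the file and changes nothing.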

Thank you for all the responses!

Don’t know why the opensuse-factory thread was so hard to find. That’s exactly what I was looking for.

My worry - and the reason I switched away from JFS at the opportunity - is that once something is blacklisted, it is more likely to be removed in an upcoming
release.

Also, to be clear about the issue: file-system corruption is extremely rare and I’ve not seen it happen for a very long time. Certainly never under normal operation.
There were a few kernels over the last 20 years that made my system unstable and would either freeze or reboot the system. When that happened, there
were a few incidents of file-system corruption. I wasn’t counting, but around a dozen; that’s less than 1 per year on a heavily used system! The vast majority
of this time, I was running JFS on nearly all partitions but the boot partition used some variant of EXT2/3/4. Of all those times there was corruption, it was on
EXT-based file-systems only despite the fact that those were significantly less used! There are 16 TB of disks on my machine and until the OpenSUSE 15.2
upgrade, all but 2GB were JFS.

Yesterday, my machine experienced a spontaneous reboot resulting in file-system corruption. fsck ran and marked the FS as clean, but I found one file that got
truncated to zero size. It’s just a day’s work but it reminded me that this never happened with JFS. Exactly zero times. Could be just bad luck but the statistical
difference is frightening.

[  +0.003970] mce: [Hardware Error]: Machine check events logged
[  +0.000001] mce: [Hardware Error]: CPU 5: Machine Check: 0 Bank 5: bea0000000000108
[  +0.000002] mce: [Hardware Error]: TSC 0 ADDR 7f1f808d0eb2 MISC d012000100000000 SYND 4d000000 IPID 500b000000000
[  +0.000003] mce: [Hardware Error]: PROCESSOR 2:870f10 TIME 1612291925 SOCKET 0 APIC 5 microcode 8701021

Hopefully the error here won’t happen again; I upgraded the motherboard BIOS as recommended by people who had this error too.
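
In case it helps anyone reading a similar log, here is a rough Python sketch that pulls the CPU, bank and status word out of the "Machine Check" line and names the top status bits. The bit names (VAL, OVER, UC, EN, MISCV, ADDRV, PCC for bits 63 down to 57) follow the common machine-check status register layout as far as I know; everything below bit 57 is vendor specific and is deliberately left undecoded. The function and field names are made up for illustration.

```python
import re

# Matches e.g. "CPU 5: Machine Check: 0 Bank 5: bea0000000000108"
MCE_LINE = re.compile(
    r"CPU (\d+): Machine Check: ([0-9a-fA-F]+) Bank (\d+): ([0-9a-fA-F]+)"
)

# Upper status bits, highest first (assumed common layout, see note above).
STATUS_FLAGS = [
    (63, "VAL"),    # register contains a valid error
    (62, "OVER"),   # an earlier error was overwritten
    (61, "UC"),     # uncorrected error
    (60, "EN"),     # error reporting was enabled
    (59, "MISCV"),  # MISC register is valid
    (58, "ADDRV"),  # ADDR register is valid
    (57, "PCC"),    # processor context may be corrupt
]


def parse_mce_line(line):
    """Return a dict describing one 'Machine Check' log line, or None."""
    match = MCE_LINE.search(line)
    if match is None:
        return None
    status = int(match.group(4), 16)
    flags = [name for bit, name in STATUS_FLAGS if status >> bit & 1]
    return {
        "cpu": int(match.group(1)),
        "mcgstatus": int(match.group(2), 16),  # assumed to be printed in hex
        "bank": int(match.group(3)),
        "status": status,
        "flags": flags,
    }


if __name__ == "__main__":
    sample = "mce: [Hardware Error]: CPU 5: Machine Check: 0 Bank 5: bea0000000000108"
    print(parse_mce_line(sample))
```

This is only a reading aid under the assumptions above, not a replacement for a proper decoder for your CPU.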

Not going to switch file-systems right away but I would consider it if another corruption happens this year.

  • Itai

Anytime a system crashes in the middle of a write operation, there is a possibility of corruption.
So, it stands to reason that you should understand why your system crashed, and take steps to prevent it from happening again.
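
On the application side, one common way to reduce the chance of ending up with a zero-length file after a crash is to write to a temporary file, flush it to disk, and only then rename it over the original. Below is a minimal Python sketch of that pattern, assuming a Linux system and a file small enough to hold in memory; `atomic_write` is a hypothetical helper name, and this does not protect against the crash itself or against hardware faults.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace `path` with `data` (bytes) so a crash leaves old or new content."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the rename stays
    # within one file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())  # push file contents to the disk
        os.replace(tmp_path, path)     # atomic rename over the target
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # Sync the directory so the rename itself is persisted.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

A call such as `atomic_write("notes.txt", b"new contents")` either leaves the previous file intact or the complete new one in place.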

That said,
It looks like EXT3 and later (including EXT4) have a journaling layer, so in theory they had a chance to recover automatically, although they obviously didn’t.

There are other FS to choose from as well.
In fact, BTRFS is supposed to take FS journaling a step further by autodetecting and autorepairing.
And BTRFS snapshots are supposed to retain a version copy of every file that has changed on that volume (likely because this can cause latency, the default configuration excludes a number of directories).

TSU

Absolutely. A random crash is worrisome. That’s the first thing I looked into, and it hasn’t happened again since the BIOS update, but others also suggested disabling C-states.
It’s a rather hard one to diagnose, unfortunately, but fixing it was the top priority.

That’s pretty much the observation: some file-systems are more resilient than others, particularly in how they handle corruption and recover.

BTRFS seems interesting, but the complexity of such a relatively new file-system made me decide to wait for long-term users to report. The file-system is one
component where I consider it too risky to try something so new.

  • Itai

Try ZFS.
BTRFS is too slow for me.
Use UPS, ECC RAM, etc. server stuff.

Thanks! Will take a look at ZFS… I even double checked that it’s not blacklisted!

Glad you mentioned performance because that was the initial reason I stayed with JFS for a long time. Not only did it manage to be incredibly robust but
it also out-performed other file-systems in terms of I/O with less performance overhead. XFS is similar in terms of performance, with a few operations
faster and some slower, depending on I/O size, type of drive, etc. I was responsible for realtime performance for almost a decade, so I got to run these
tests on enormous amounts of data using all sorts of disk configurations.

My workstation is simpler but still runs on high-end components, including a UPS, which helps reduce unexpected shutdowns. These
spontaneous reboots have occurred a few times, but not since the BIOS upgrade, so it was probably a firmware bug that is now solved.

  • Itai

Hi
You might also want to look at the I/O schedulers in use; the scheduler is typically 'none' these days with the likes of NVMe devices… Zero issues seen with btrfs for many years now for the operating system - openSUSE/SLE (no snapshots/snapper).
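
To see which scheduler each disk is using, the kernel exposes a `queue/scheduler` file per block device under /sys/block, with the active choice shown in square brackets. A small Python sketch that reads those files, assuming a Linux system with sysfs mounted in the usual place (the function names are just for illustration):

```python
import glob
import re


def active_scheduler(text):
    """Return the bracketed entry, e.g. 'none' from '[none] mq-deadline kyber'."""
    match = re.search(r"\[([^\]]+)\]", text)
    return match.group(1) if match else None


def list_schedulers(pattern="/sys/block/*/queue/scheduler"):
    """Map each block device name to its active I/O scheduler."""
    result = {}
    for path in sorted(glob.glob(pattern)):
        device = path.split("/")[3]  # /sys/block/<device>/queue/scheduler
        try:
            with open(path, "r", encoding="utf-8") as handle:
                result[device] = active_scheduler(handle.read())
        except OSError:
            # Device went away or is unreadable; skip it.
            continue
    return result


if __name__ == "__main__":
    for device, scheduler in list_schedulers().items():
        print(device, scheduler)
```

This only reports the current setting; it does not change anything on the system.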