Safest Long Term Storage

What’s the safest long-term file storage on Linux? I’m looking to set up an archive box for A/V media files, something in the range of 25-50 TB to start with.

I use BTRFS but just read about this >> https://events.static.linuxfound.org/sites/events/files/slides/AFL%20filesystem%20fuzzing%2C%20Vault%202016_0.pdf

Trying to avoid bit-rot or crashy smashy.

Also, what RAID setup would you recommend? I was thinking RAID 5E with parity + spare would be a good one to aim for. That cuts out the use of BTRFS though…

No real XFS for Linux aside from Ubuntu, but would rather not use Ubuntu.

Thoughts?

I am a bit confused.

First, from the title and what you start with, I get the impression that you are asking what hardware is best for long-term (many years?) storage of archives.

But then you switch to the subject “file systems”. I have no idea why you mention Btrfs here. I would say that it is best used when you need snapshotting, and that is, at least on openSUSE, mainly connected with the root file system and the snapshotting related to software updates on such a system. Not particularly for long-term archiving (my personal opinion).

Then again you switch to RAID. Which can of course be used to save yourself from hardware failures, and which you can add on top of the use of reliable hardware. But I would handle it step by step: storage location, hardware, RAID or similar, file system (or not), archiving software. And not jump randomly between those.

You seem to suggest that XFS is not supported in openSUSE. The fact that nowadays XFS is used by default for /home on a fresh installation, seems to contradict that. But maybe I misunderstand completely your remark about “real XFS”.

I’ve got 4 x 2 TB WD fully encrypted hard disks using LUKS/ReiserFS and they’re still working perfectly after 4, nearly 5 years. Actually since the 20th of July 2014 @ 14:32 to be precise ;).

They’re backup drives that live in a wall safe in anti-static bags and I only connect them when I need them, so they’re not spinning permanently of course.

I just connected them to test them and they’re still working perfectly as far as I can tell.

I did once get some file corruption on one of them while copying new files and panicked, thinking I’d lost everything - but it was solved perfectly by running the ReiserFS repair tool. Quite impressive seeing as the drive is encrypted!
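For the curious, the check/repair was roughly along these lines – device and mapper names changed here, and the repair step only after a read-only check flagged problems:

    cryptsetup luksOpen /dev/sdX1 backup1            # unlock; the repair tool works on the decrypted mapping
    reiserfsck --check /dev/mapper/backup1           # read-only check first
    # reiserfsck --fix-fixable /dev/mapper/backup1   # only if the check reports fixable corruption
    cryptsetup luksClose backup1

That is also why the encryption doesn’t get in the way: the filesystem tools never see the ciphertext, only the opened mapping.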

The filesystem shouldn’t make much of a difference, the physical media and how it’s stored is the main concern in my opinion. Just make sure you have a backup of the backup (redundancy), that’s why I have four drives - 2 are copies of the other 2.

Interesting read

One thing about using a file system of a certain type is, in my opinion, that it must be supported by the operating system you use to manage those disks. E.g. ReiserFS is basically dead and nobody knows how long it will remain supported in e.g. openSUSE. You are then stuck with that file system, and you cannot change/upgrade away from the system that can still handle ReiserFS. Which is a very stable situation of course, but be sure that you will not need any additions to that system whatsoever in the future. And the other hardware (besides the disks used for archiving) also has to be stable, because when it breaks and you are forced to acquire a new system with a new CPU, etc., that may not be supported by the old OS (of which you should still have the installation media, which should still be readable).

Indeed, IMHO creating something that will keep functioning for a very long time is not easy.

Yes, good point there @hccv.

I have a SATA-to-USB2/eSATA adapter box and a live Linux USB stick which has support for the filesystems, and it stays in storage with them (just in case).

Hopefully that should cover me for a few years more, as long as USB 2 is still supported for backward compatibility, but if that changes then I’d need to look for another solution.

Future-proofing is not an easy task!

Correct lol!

And my remarks were mainly aimed at the OP, to show him exactly that.

I have it on good authority, from talking directly to disk storage manufacturers who spend enormous amounts of money testing their products, that bit rot is FUD; there is no rational explanation for it to happen if you store your disks in an optimal environment.

And it is relevant to consider temperature and humidity: if you look closely at your rotating disk enclosure, there will be at least one pin hole marked “Do not cover” or similar, to allow for exchange between the interior and external atmospheres. That said, there are also some drives that are built with a special sealed nitrogen fill to keep humidity out, since humidity is what causes oxidation and your “bit rot”.

So,
Just use common sense based on your situation.
If your drives are mounted in a machine with no special environment, then fire up those drives once in a while to heat them up and “dry out” the disk interior.
If something is to be archived, then unmount your disk, remove and store in a special controlled environment indefinitely.

Or, if you are suspicious of any electromagnetic storage, then choose something that is less volatile like SSD or optical media… But even these should be stored according to recommendations associated with that type of medium.
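If you do fire the drives up periodically, a simple way to confirm nothing has silently changed in the meantime is a checksum manifest – a rough sketch, mount point and manifest path made up:

    cd /mnt/archive
    find . -type f -print0 | xargs -0 sha256sum > /root/archive-manifest.sha256   # build once, after the archive is written
    sha256sum --quiet -c /root/archive-manifest.sha256                            # re-run on each spin-up; silence means all good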

TSU

From what I’ve seen during Leap 15.1 Beta testing, no longer true …

  • AFAICS, a default Leap 15.1 single-disk installation will use a single Btrfs partition with “/home/” as a sub-volume …

  1. Make sure that the disks are suitable for 24-hour operation – NAS operation.
  2. Make sure that the box can support RAID.
  3. Even if you purchase 14 TB drives, you’ll need to consider at least 12 drives (3 per RAID) …
  4. If it’s only an archive, consider magnetic tape – still the most reliable, and cheapest, long-term medium for archiving – “LTO” …
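If tape turns out to be the route, a rough sketch of writing an archive set with plain tar – assuming the drive appears as /dev/nst0 (the non-rewinding device) and a block size the drive is happy with:

    mt -f /dev/nst0 rewind
    tar -cvf /dev/nst0 -b 512 /srv/archive      # 512 x 512-byte records = 256 KiB tape blocks
    mt -f /dev/nst0 offline                     # rewind and eject the cartridge
    # read back later with:  tar -tvf /dev/nst0 -b 512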

On Wed, 20 Mar 2019 02:38:03 +0000, PeterChz wrote:

> No real XFS for Linux aside from Ubuntu, but would rather not use
> Ubuntu.

I use XFS here for my home partition on Leap 15.0. Not sure what you
mean by this.


Jim Henderson
openSUSE Forums Administrator
Forum Use Terms & Conditions at http://tinyurl.com/openSUSE-T-C

Don’t forget that other risk.

You store your data on reliable media, guaranteed to have no bit rot.

And then you discover that technology has changed and you can no longer find devices that can read that media.

Data archived on CD-ROM was supposed to have a “shelf-life” of more than 30 years … Experience has shown that it depends on the manufacturing quality and the burning …

Yes, there are special “glass” drives for archiving data but, AFAIK, not a consumer product …

@PeterChz:

The NAS I prefer to use, because it’s “Linux friendly”, has this statement regarding their filesystem choice: <https://www.qnap.com/solution/qnap-ext4/en-us/>.
Yes, using an Internet search tool to try to find out which is “better” – ext4, XFS or Btrfs – doesn’t seem to turn up anything one could take as being “current”: there’s information which is at least 6 years old and which, IMHO, no longer reflects the actual “state of affairs” …

XFS:

  • It has features such as “striped RAID arrays” and native backup/restore utilities which, IMHO, are “very nice to have” for archives …
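A rough illustration of both of those points – device names, stripe geometry and dump labels are made up:

    mkfs.xfs -d su=64k,sw=4 /dev/md0                                            # align XFS to a hypothetical 64 KiB x 4-disk stripe
    xfsdump -l 0 -L archive0 -M media0 -f /backup/archive.xfsdump /mnt/archive  # level-0 dump of the mounted archive
    xfsrestore -f /backup/archive.xfsdump /mnt/restore                          # restore it elsewhere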

Thanks everyone for all the responses.
I just realized my typo: ZFS :X not XFS. Sorry.
XFS does seem like a solid alternative. But my fear was that, even though openSUSE or another OS CAN implement ZFS manually, I don’t know if they should. I was curious if anyone rolls their own ZFS storage setup.
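To be concrete about what I mean by rolling my own (device names are placeholders, and on openSUSE this would mean installing the out-of-tree ZFS kernel module first), the pool I had in mind would be roughly:

    zpool create archive raidz2 sda sdb sdc sdd sde sdf   # raidz2 survives two failed disks
    zpool add archive spare sdg                           # hot spare
    zfs set compression=lz4 archive
    zpool scrub archive                                   # periodic scrubs catch silent corruption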

From my research and you guys’ help, tape – if I can find a machine cheap enough – would be the best archivable copy. Though it was recommended to me that if I go with tape, it’s best to always use the same machine: some have had issues with a different machine of the same model not reading tapes 100%. Then again, that might be one person’s bad experience.

Having one system semi-live running as a NAS is a good standby; we have a small non-RAID version of this going currently, and I’ll have to check that link, @dcurtisfra, for a more robust setup. Curious if there are setups that put the RAID card or server to sleep unless it’s woken up for use, like Wake-on-LAN? That way periodic tests of the systems could be done or even automated, and wear reduced.
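Something roughly like this is what I was imagining – interface name and MAC address made up, and wakeonlan is just one of several tools that can send the magic packet:

    # on the NAS: allow wake-up via magic packet, then suspend it
    ethtool -s eth0 wol g
    systemctl suspend
    # from another box on the LAN, whenever the archive is needed:
    wakeonlan AA:BB:CC:DD:EE:FF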

Interesting to hear that bit rot is FUD. I’ve felt the pain on old floppies and CDs, and some drives have died just due to old age and use, causing data loss. But low-use, properly stored drives, I can see, have a chance of survival.

I find this such an interesting topic too, there are so many levels of choice that need to harmonize.

Software:
| NAS software: SAMBA, NFS, etc.
| Operating System
| Logical Formatting of disks: XFS, ZFS, ReiserFS

Hardware:
| Hardware Raid of Disk
| Size and Type of Hard Disk

Advice on any level is appreciated, or a level I am not considering.

Having a hard disk copy stored offsite somewhere is a good idea. More ambitious ideas, like building an OpenStack cloud between servers placed across the country at friends’ or family’s houses or co-los, have crossed my mind. But that may be the enterprise workday seeping into my skull. Overkill for keeping family photos safe for 20-50 years. :slight_smile: :slight_smile: Cheaper to just create a handful of TB drive backups.

… how about writing data to 35mm film… and other mad scientist ideas. . . .

Interesting 35mm film data storage exists.
https://www.cpclondon.com/
Using QR codes to save data on each frame. Yikes, 300–400 bytes per frame. Not data-dense enough for most things, but talk about long-term storage.
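To put that in perspective, a quick back-of-the-envelope check (assuming roughly 350 bytes per frame):

    echo $(( 1024 * 1024 * 1024 / 350 ))    # frames needed for a single GiB: about 3.07 million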

As someone that did backups for a major Fortune company: they started with high-density tape cartridges, and after 6 months, when we needed to extract some files, both the primary and backup tapes had read errors that we had to work around.

I recommended a different backup scheme, which they implemented. It runs on SLES, the file systems are on LVM, each file system has 3 mirrors, and each backup device is in a different city to prevent a disaster from making the data unavailable.

The primary set of backups is on the live servers. Each live server has an additional set of backups that are offline, except for the nightly rsync that keeps them a good backup, and then they are put offline again.

The storage is on large EMC, HP and Fujitsu storage arrays that also mirror to other sites in other states – different brands to protect against a possible design problem.

Disk failures still happen but no data is lost. Just a drive rebuild via LVM.
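The rebuild itself is roughly this – VG/LV and device names are placeholders:

    pvcreate /dev/sdX1                        # the replacement disk
    vgextend vg_backup /dev/sdX1
    lvconvert --repair vg_backup/lv_data      # rebuild the failed mirror leg onto the new PV
    vgreduce --removemissing vg_backup        # drop the dead PV from the VG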

The backups get a weekly low-priority dd read of each drive to look for failures. It is sad that almost every week we see a drive go bad. Makes no difference if it’s Seagate, WD, Toshiba or IBM – they will all expire sometime. I have an old system with an MFM drive that is 30 years old and still works (64 MB – takes 30 minutes to read it all – on a 386SX running Mandrake Linux from 1994).
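The weekly check is nothing fancy – essentially a low-priority full read of every member disk, something like (device name made up):

    ionice -c 3 nice -n 19 dd if=/dev/sdb of=/dev/null bs=1M iflag=direct status=progress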

My home backups are the same, except they are on USB3 4 TB drives on 3 separate systems that rsync after I confirm that nothing has been compromised. I use openSUSE 15.0 and LVM to mirror, with ext4 file systems under LVM.
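The sync itself is plain rsync, roughly like this – paths are made up:

    rsync -aHAX --delete /home/ /mnt/usb_backup/home/    # archive mode, preserving hard links, ACLs and xattrs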

Personally, I dislike Btrfs as too complex, with too many jobs running and no real benefits. I back up and can restore my computers to any one of the last 30 weeks of backups. I test on another system every so often to make sure that the backups are complete and that the steps to reset grub2 and /boot work fine on new iron (unformatted disk, booted off a live USB key).

I have been using Unix/Xenix/Linux since 1973 – I almost know what I am doing. The sad thing is, every time I almost know everything, they change it again. But I don’t miss the 4-hour boot times when the power went off (fsck everything).

At least you got time then to go for a good dinner. lol!

Rotating hard drives are typically an important part of any backup system today, and are supposed to be reliable even as long-term storage if done the right way. Perhaps more importantly for many, reliable storage can be the prime way to avoid needing to do a recovery from backup at all.

Backblaze has been releasing statistical results for the drives it uses each year, and as a major backup company it goes through enormous numbers of drives annually.

The following article includes Backblaze statistics for the year 2018

https://www.extremetech.com/extreme/175089-who-makes-the-most-reliable-hard-drives

It should be no surprise that the Backblaze reports have a major impact on the street price of drives; in some years I’ve seen HGST drives (best rated) at twice the price of Seagate (which for a few years had really poor reliability statistics). It’s probably also a primary reason why laggards in some years, like Seagate, have greatly improved their reliability since.

TSU

Rotating drives is a really good suggestion. I wasn’t familiar with that technique, but failing hard drives, as many have mentioned, are the worst fear. With an enterprise contract it’s not such a big deal; they must make tons on these flaky drives, or at the very least make us feel dependent on their support contracts. But having some RAID in place, like RAID 1 on the primary device, has been a life saver to me. I was able to recover twice in 5 years from disastrous failures.
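For anyone starting from scratch, the mirror I mean is just software RAID 1 – partition names made up:

    mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda2 /dev/sdb2
    mkfs.ext4 /dev/md0
    mdadm --detail --scan >> /etc/mdadm.conf      # so the array re-assembles at boot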

larryr, that is a fantastic setup. Presuming these are user files, how was access control enforced? Local system users, Samba or NFS to supply the rest of the office? Curious if there is a better way to provide network attached storage. At home a SAMBA share running one shared user is what connects our phones, TV and computers to view photos, documents etc. I know SAMBA is more powerful than what I use it for, and I should probably dive in a little deeper. I never jumped into offerings like Plex, ownCloud, etc. They seem to change and update too quickly for my taste. Just looking for a reliable standby that can integrate with just about any other OS to provide the greatest functionality.
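For reference, my home share is basically nothing more than this in smb.conf – share name, path and user are placeholders:

    [media]
        path = /srv/media
        valid users = mediauser
        read only = yes
        guest ok = no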

Rotating the three drives for your home setup is great too. I appreciate the ext4 with LVM setup. Pretty minimal but robust. Never had trouble with that myself.

This discussion really makes me realize all the issues with my current setup. While I have a little bit extra in place, it’s not nearly enough if I value my data. I have to look into a safe deposit box for some offsite storage, where I can rotate a drive every few months. I don’t know that I’ll be lucky enough to set up something at a distant relative’s or the like. Someone needs to start a low-cost co-lo in their garage – or a few people could, and we could store running computers there for offsite archive storage. :slight_smile: :slight_smile:

Never use CDs for archiving; I learned that the hard way.
Look it up: the dye layer that is hit by the laser in a writer is completely exposed to air, humidity, physical scratching, anything you can think of.
A DVD, though, is different: the dye layer is sandwiched between plastic layers on both sides, so it is less exposed.

TSU