Bit rot is real

I have recently read an article about bit rot:
http://arstechnica.com/information-technology/2014/01/bitrot-and-atomic-cows-inside-next-gen-filesystems/

Bit rot is the phenomenon in hard drives where a bit in a stored file arbitrarily flips its value on the disk.

The author says that bit rot is quite common:
“It’s not uncommon at all to see five, or 10, or 50 checksum errors on a disk that’s been in service for a few years.”
And I am afraid I have encountered it many times. Some of the files had to be discarded.

Bit rot cannot be easily detected in ext4 and many other filesystems, since in most cases the filesystem’s integrity is not violated. A backup of the file made after bit rot will, naturally, also be corrupted, so unless special measures are taken the damage silently propagates into the backups.

The basic way to find out whether bit rot has taken place is to calculate the checksums of the file.

Now, I wonder whether there are tools to detect and to fight bit rot without switching to btrfs? Maybe some utilities that would calculate the md5sums of important files automatically and compare them from time to time?
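
For example, the detection part could be as simple as a checksum manifest kept next to the data; this is only a sketch, and the directory and file names are placeholders:

  # build a manifest of checksums for every file under /data (placeholder path)
  find /data -type f -exec md5sum {} + > /backup/manifest.md5

  # later, re-check the files against the stored manifest;
  # any line reported as FAILED points to a changed (possibly rotted) file
  md5sum -c --quiet /backup/manifest.md5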

ZStefan wrote:
>
> I have recently read an article about bit rot:
> http://tinyurl.com/nc5763k
>

The article probably blows things out of proportion.


GNOME 3.10.2
openSUSE 13.1 (Bottle) (x86_64) 64-bit
Kernel Linux 3.11.6-4-desktop

On 2014-01-25 06:46, ZStefan wrote:
>
> I have recently read an article about bit rot:
> http://tinyurl.com/nc5763k

Interesting.

Two snags, though: it requires duplicating hardware, i.e., using the raid
version of btrfs. And I have seen btrfs failing completely.

It has also been possible, for many years, to store data with enough
redundancy to recover damaged sectors, without using raid. It is
routinely used for such things as transmissions from remote space
missions, which use a single data stream (a retransmission, even if
possible, could take hours to request, so error detection alone is
insufficient). It is called forward error correction.


Cheers / Saludos,

Carlos E. R.
(from 12.3 x86_64 “Dartmouth” at Telcontar)

If you see this a lot then you should check the drive: smartctl -a /dev/sdX, where X is the drive letter.
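
For example (the device name below is just a placeholder, and attribute names vary between vendors):

  # show the SMART attribute table and pick out the usual trouble indicators
  smartctl -A /dev/sda | grep -Ei 'Reallocated|Pending|Uncorrectable|CRC'

  # run a long self-test, then read the result from the self-test log later
  smartctl -t long /dev/sda
  smartctl -l selftest /dev/sda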

On 2014-01-25 20:06, gogalthorp wrote:
>
> If you see this a lot then you should check the drive: smartctl -a
> /dev/sdX, where X is the drive letter.

No, smartctl can not detect this type of problem. It detects sectors
that are completely damaged: sectors where you write something and read
back something else entirely, every time you try.

This problem is way more subtle: just a bit changing in gigabytes of
data. It simply cannot be detected unless the filesystem (or the
hardware) is designed for it.


Cheers / Saludos,

Carlos E. R.
(from 12.3 x86_64 “Dartmouth” at Telcontar)

Sure, bit rot happens, but it is rare; if you see it a lot, something in the hardware is going south.
A bit change should still make a checksum fail, and that should be recorded by SMART.
Never hurts to check in any case.

I am looking for some automated system, so that I don’t need to keep track of the checksums of data files by hand. By data files I mean files that are not going to be modified after acquisition.

smartctl does not read files, and it does not create or compare checksums.

On 2014-01-26 06:56, gogalthorp wrote:
>
> Sure, bit rot happens, but it is rare; if you see it a lot, something
> in the hardware is going south.

Without checksums you will never know when this happens, so you don’t
know how frequent it is.

> A bit change should still make a checksum fail, and that should be
> recorded by SMART.

No, it is not. Or not always.

In the article’s demonstration, a single bit is changed on one of the
mirror sides, using system tools. As far as the disk hardware is
concerned, the data is absolutely correct, so no detection is possible.

But the advanced features of btrfs detected and corrected the error
automatically.

That’s the point.
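
(For reference, on a btrfs raid1 volume that check can also be run on demand with a scrub; the mount point below is just a placeholder.)

  # assumes a btrfs filesystem with raid1 data mounted at /data (placeholder)
  sudo btrfs scrub start /data     # re-reads every block and verifies the checksums
  sudo btrfs scrub status /data    # reports how many errors were found and corrected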

> Never hurts to check in any case.

Oh, I do. No errors.

Actually, if you look carefully at the output from modern disks, you
will see figures that show the underlying raw error rate, which is
pretty high. These errors are continuously corrected; the hardware is
designed for this. The signals the head reads are simply very close to
the noise level. With bigger disk sizes, the chances of errors that
cannot be corrected (with nothing else going wrong) increase.


Cheers / Saludos,

Carlos E. R.
(from 12.3 x86_64 “Dartmouth” at Telcontar)

ZStefan wrote:
> I have recently read an article about bit rot:
> http://tinyurl.com/nc5763k
>
> Bit rot is the phenomenon in hard drives when a bit arbitrarily switches
> its value in a file on HD.
>
> The author says that bit rot is quite common:

Err, this is all total rubbish.

Yes magnetic media get bit errors. That’s why magnetic media have used
error-correction since the Ark. If a bit rots in a sector, the drive
detects and corrects it and there is no data corruption. That’s why
filesystems haven’t bothered much historically, because the medium is
reliable.

Sure, if you manually change a bit in a file, you will see that a bit
has been changed. But if a bit arbitrarily changes state in a sector,
the hardware will detect and correct it.

Well, maybe the error correction at the hardware or firmware level has its limits, since I have seen files with bit rot. I still keep one such file, a few GB in size, in which some bits are evidently wrong. All the files were saved on HDs and then read years later. Unfortunately, I didn’t even think about creating md5 sums of them immediately after acquisition.

While a file’s content can change for various reasons, in two cases I suspected a defect that was magnetic in nature, and investigated. I checked the HDs in a variety of ways, from soft, non-data-destructive tools to the manufacturer’s own tools. All tests showed healthy HDs. How does one explain bit rot then? Data from the HDs was backed up regularly, but the content was not visually checked. I discarded the HDs.

Another interesting statement from the article:

“It’s a common misconception to think that RAID protects data from corruption since it introduces redundancy. The reality is exactly the opposite: traditional RAID increases the likelihood of data corruption since it introduces more physical devices with more things to go wrong.” There is more explanation in the article, reasoning that non-catastrophic failure of a disk in RAID leads to data corruption.

I don’t use RAID and advise colleagues not to use RAID for storage, and only to use RAID0 for speed. “For storage, use single disks and make backups instead of relying on RAID’s abilities,” I say to them. I have read somewhere that Google does not use RAIDs for data storage, but this information may be old or wrong.

A memory error could cause that. Low end computers do not have error detection for memory, but they still have it for hard drives. And I’m not sure about SSD devices.

Also, there is a finite error rate when reading and writing data. When you get into huge files or disks, the errors become more evident. Correction algorithms are not perfect either. For example, each sector has a checksum, but a checksum does not uniquely identify the data; it can only signal that the data does not match the checksum. A mismatch normally triggers a reread (possibly many) of the sector, but it is still possible for corrupted data to produce the same checksum as the one recorded. Then you have bad data but a matching checksum.

OK, this would be rare but mathematically possible. But when you are talking about billions of bytes, it is something you have to face.

I check all my computers with memtest from time to time. There were no errors. But, of course, there are transient RAM errors. I still think those two cases were magnetic media errors.

The ext4 filesystem only has an option for keeping checksums of the files’ metadata. btrfs can keep checksums of the files’ data, but the filesystem is experimental (their own roadmap has no end, and they use phrases like “… is no longer unstable”). A simple backup may do a disservice, since the bit rot will be backed up. AppArmor does not monitor file contents. RAID is not recommended for data storage.

What tools are left to at least automatically detect file corruption? Is the only way left to calculate and monitor checksums myself?

gogalthorp wrote:
> Also, there is a finite error rate when reading and writing data. When
> you get into huge files or disks, the errors become more evident.
> Correction algorithms are not perfect either. For example, each sector
> has a checksum

No, a sector does not have a checksum, it has an ECC!

As nrickert says, on normal desktop machines it’s the main memory and the
buses that don’t have ECC. That’s where errors can arise more easily.
Those are the first things to fix on any machine intended to store data.

> OK, this would be rare but mathematically possible. But when you are
> talking about billions of bytes, it is something you have to face.

Right, but it’s also worth bearing in mind that computers do work, and
that many, many large companies have, over many years, stored a great
deal of data without having to invoke mystical methods to preserve it.

There’s an awful lot of literature published about the errors that can
and do occur and the best ways to minimise their impact. Best to read
that rather than dubious blogs and forum postings.

ECC is a type of checksum; it can only correct small changes, and it is not a unique one-to-one value for the data. It does not uniquely match the data, otherwise we could have humongous compression. But maybe it is 42 after all :wink:

Here is a simplified example that I know of how simple checksumming may fail to detect errors.

This is for old hard drives, now not in use. The sector size is 512 bytes and the checksum size is 2 bytes. The simple checksum scheme that was used is insensitive to byte swaps.

Now consider two consecutive bytes on the hard drive. The bit content of the first and second bytes is
00000000 10000000

If, because of bit rot, two bits in these bytes change their values, the new content will be
10000000 00000000

This is equivalent to a byte swap and will go unnoticed.

I can imagine that similar things can happen even with modern, more advanced checks.
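
A quick toy demonstration of the same point at the command line (file names are placeholders): a plain additive checksum cannot tell the two byte orders apart, while a CRC such as the one computed by cksum can.

  # write the two byte patterns from the example above into two files
  printf '\x00\x80' > original.bin      # bytes 00000000 10000000
  printf '\x80\x00' > swapped.bin       # bytes 10000000 00000000

  # a plain additive 16-bit checksum is identical for both byte orders
  echo $(( (0x00 + 0x80) % 65536 ))     # 128
  echo $(( (0x80 + 0x00) % 65536 ))     # 128

  # a CRC (here CRC-32 from cksum) gives different values for the two files
  cksum original.bin swapped.bin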

On 2014-01-25 06:46, ZStefan wrote:
> Now, I wonder whether there are tools to detect and to fight bit rot
> without switching to btrfs? Maybe some utilities that would calculate
> the md5sums of important files automatically and compare them from time
> to time?

par2.

It generates parity files, using the Reed-Solomon algorithm to produce a
RAID-like data protection and recovery system.
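
For example, something along these lines (file names and the redundancy
level are just placeholders):

  # create recovery files with ~10% redundancy for a data file
  par2 create -r10 important.par2 important-data.img

  # later: verify the file against the recovery data, and repair if damaged
  par2 verify important.par2
  par2 repair important.par2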


Cheers / Saludos,

Carlos E. R.
(from 12.3 x86_64 “Dartmouth” at Telcontar)

Those old hard drives used CRC16 check sums, which would have detected this error.

On 01/30/2014 07:56 AM, nrickert wrote:
>
> ZStefan wrote:
>> Here is a simplified example that I know of how simple checksumming may
>> fail to detect errors.
>>
>> This is for old hard drives, now not in use. The sector size is 512
>> bytes and the checksum size is 2 bytes. The simple checksum scheme
>> that was used is insensitive to byte swaps.
>>
>> Now consider two consecutive bytes on the hard drive. The bit content
>> of the first and second bytes is
>> 00000000 10000000
>
> Those old hard drives used CRC16 check sums, which would have detected
> this error.

@ZStefan: I know your mind is probably made up and we should not bother you with
facts, but we may still influence some other readers of this thread. Before you
contribute to FUD, please read up on how ECC works.

A legacy disk with 512-byte sectors uses a 50-byte ECC field, and the newer
4K-sector disks use 100 bytes. Given the number of errors that such ECC
lengths can correct, any procedure that could be devised to “refresh” the files
would greatly increase the probability of error. This increase would be due to
the extra load on the disk drives causing earlier failure, and to the cosmic-ray
and other mechanisms that affect consumer-grade hardware without ECC memory or
ECC-protected data paths.