large file cp or diff error

openSUSE 13.1 fresh install…

Copied some VMware vmdk files (with VMware not installed, i.e., an archive copy to disk) from an eSATA disk to a SATA disk, to different places: on the first install, the files were copied after 1.5TB of other data, then errors; then a clean re-install, copied with no other data to a different place on the disk, then checked again.

Copying a mix of small and 17GB-or-greater files, then getting diff or cmp errors on the large files (not sure how many differences, as diff/cmp seem to halt after the first one)
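
(I suppose I could count the differences with something like the following, using cmp's -l option – bigfile1/bigfile2 just being placeholder names:)

cmp -l bigfile1 bigfile2 | wc -l      # number of differing bytes
cmp -l bigfile1 bigfile2 | head       # offsets and values of the first few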

Only having the problem on these large files; many smaller files seem okay (e.g., up to 4 or 5 GB). Ran it twice, as described above. Have not seen this with previous versions (11.3, 11.4, 12.1, 12.2, 12.3) – I do this each time for a new disk.

The source disk is 3000GB, the destination is 2000GB, so it is not a disk overflow; running LUKS on both disks, single encrypted root plus /boot, ext4.

Ran badblocks (no errors in r/w) on the destination disk; disk has been used before in this manner with no problems.
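
(For reference, the read/write test was roughly along these lines – /dev/sdX is a placeholder, and note that -w is the destructive write test, only for a disk with nothing on it; -n is the non-destructive variant:)

badblocks -svw /dev/sdX    # destructive read/write test (wipes the disk)
badblocks -svn /dev/sdX    # non-destructive read/write test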

Has anyone else seen this type of problem?

When I installed my first 13.1,
I immediately noticed that by default the /tmp (and other tmp) directories are deployed as tmpfs, which means that they run in memory and don’t write to the hard drive.

When I was experimenting with this in 12.3, I quickly found that although there are some performance advantages to deploying /tmp to tmpfs, it also means that file transfers and downloads which place a temporary copy in /tmp can be a problem on "memory challenged" machines when transferring very large files. Some apps like BitTorrent don't suffer from this because they transfer only tiny pieces at a time and generally write to the same directory as the app rather than to /tmp, but in general most everything else will write to /tmp.

So, if I were to guess, you have no more than 4GB RAM in your machine.

The solution is to re-point /tmp to a location on disk. In my case this was easy because I typically don't allocate the whole disk, precisely for these (and other) little emergencies; I simply created a partition in my empty space and pointed /tmp there. But you can point /tmp anywhere you have sufficient space.
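
As a rough sketch (the TMPPART label and the device name are just placeholders, and you'd want to do this while nothing is using /tmp):

mkfs.ext4 -L TMPPART /dev/sdXn                        # format the spare partition
echo 'LABEL=TMPPART  /tmp  ext4  defaults  0 2' >> /etc/fstab
mount /tmp  &&  chmod 1777 /tmp                       # /tmp needs the sticky bit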

I expect that for a distro like openSUSE, it's probably safer to deploy /tmp to disk by default.
And if the machine has <8GB RAM, I would probably recommend choosing "custom layout" and re-configuring this option during the initial install.

TSU

3 GB RAM. I don't see /tmp deployed anywhere – there are tmpfs entries in 'df -a' for /var/run, /var/lock, etc., but no mount point for /tmp; on a 12.3 machine, same thing. One thing: I use scripts to install, and my disk setup is 16G for swap whether the machine has 3GB (like this one) or 4, 6 or 8GB. The machine in question is an hp-dc5700MT, E6700 Core2 Duo and 3GB RAM. I don't make separate partitions for /home, /var, or /tmp anymore, just /boot (plain) and / and swap (encrypted), because it's just easier with LUKS (only one password at boot time). Just wondering what is different about 13.1; I have not had this problem in previous versions of openSUSE.

Thanks for the reply, btw.

Type mount

In 13.1, some mounts are done by systemd using tmpfs and are no longer in fstab.
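
For example (findmnt is optional but handy here):

mount -t tmpfs       # list only the tmpfs mounts
findmnt -T /tmp      # show which filesystem actually holds /tmp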

Not there, either – thanks

And I even noticed that /tmp was persistent when another disk was pulled from a machine and mounted as a backup device.

Well, for now, problem solved by using dd to copy the huge files: “dd if=bigfile1 of=bigfile2 bs=16M”

It’s quick enough and “diff -q bigfile1 bigfile2” produces no error messages. Could experiment with different block sizes (bs=) but that’s for another day.
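
(For extra peace of mind I may also record checksums, something like this, so the copies can be re-verified later – sums.sha256 is just an arbitrary file name:)

sha256sum bigfile1 bigfile2                 # the two hashes should match
sha256sum bigfile1 bigfile2 > sums.sha256
sha256sum -c sums.sha256                    # re-verify the same files later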

Thanks to Tsu and gogalthorp (did not know about the tmpfs mounts from systemd).

It really depends on how you got to 13.1. If you did an online upgrade and the old fstab was kept, then the mounts may still be in there. But as I understand it, those should be removed so that systemd handles it.


No. /tmp is not setup as tmpfs in openSUSE, not even on 13.1.
By default it is just a directory on the / partition, that’s why it is not in the fstab either.

/run, /var/run, /var/lock, and /dev are tmpfs’s however, f.e.

On 2014-01-15 19:26, tsu2 wrote:

> When I installed my first 13.1,
> I immediately noticed that by default the /tmp (and other tmp)
> directories are deployed as tmpfs, which means that they run in memory
> and don’t write to the hard drive.

Not on any of my installs.


Cheers / Saludos,

Carlos E. R.
(from 12.3 x86_64 “Dartmouth” at Telcontar)

On 2014-01-16 06:16, ThomasZMitchell wrote:

> Well, for now, problem solved by using dd to copy the huge files: “dd
> if=bigfile1 of=bigfile2 bs=16M”

I now use dd_rescue instead for disk images.


dd_rhelp source destination

It can restart a copy that is half done, and is much more reliable in case of
read errors.

13.1 ships with a different version of dd_rescue, the GNU version, with
a slightly different syntax. I cannot check it now.

Alternatively, to copy directories for backup, I use rsync with checksums:


rsync --archive --acls --xattrs --hard-links --stats  \
--human-readable  --checksum  \
/source_directory/  /destination_directory

If there are errors, it should copy the bad parts again. You can even
run it twice as a recheck.
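
A verify-only pass could look something like this (just a sketch: with
--dry-run nothing is changed, and --itemize-changes lists whatever still
differs by checksum):

rsync --archive --checksum --dry-run --itemize-changes  \
/source_directory/  /destination_directory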

> It’s quick enough and “diff -q bigfile1 bigfile2” produces no error
> messages. Could experiment with different block sizes (bs=) but that’s
> for another day.

If ‘cp’ copies with errors, I would bugzilla about it. Seems serious.


Cheers / Saludos,

Carlos E. R.
(from 12.3 x86_64 “Dartmouth” at Telcontar)

On 2014-01-16 12:46, wolfi323 wrote:
>
> No. /tmp is not setup as tmpfs in openSUSE, not even on 13.1.
> By default it is just a directory on the / partition, that’s why it is
> not in the fstab either.
>
> /run, /var/run, /var/lock, and /dev are tmpfs’s however, f.e.

Correct.

The systemd people want /tmp to be a tmpfs. The openSUSE maintainers
refuse. This causes some problems, like incorrect purging of old files
in /tmp (on 12.3 at least).


Cheers / Saludos,

Carlos E. R.
(from 12.3 x86_64 “Dartmouth” at Telcontar)

Oops! It works on the 16G files but not the 68G files. I will re-test with 12.3 (fresh install) and re-post. Suspecting the hardware, but this machine has been rock-solid over the years (HP's frustratingly conservative build that's just plain reliable).

I have never 'upgraded' to a new version – I have always done fresh installs. I often read of small problems with the upgrade procedure – Linux is so huge now, and even fresh installs have little bugs – so I think it is just better to install fresh to minimize problems.

Kudos to openSUSE for keeping '/tmp' as a directory off of '/'… Every now and then I copy a file out of /tmp to get a hard copy elsewhere. I know that /tmp files get purged occasionally, but you should be able to reboot and immediately get back something you were just doing when the machine crashes or you accidentally reboot. (I have sometimes thought "Leave > Sleep" and instead clicked on "Leave > Shutdown". The world could really use an 'Always Do Only What I Mean' operating system!)

There is an inherent error rate. Normally we do not see it, but it is still there and may show up on huge file transfers. It also makes RAID on huge file systems a bit chancy. Consumer grade disks have an order of magnitude higher rate than commercial grade drives, i.e. you get what you pay for. Check the published error rates for the hardware.

Perhaps true. Not heavily loaded here, though, and the best cure for an inherent failure rate is to (groan) use diff to check the cp results (which I very often do). Perhaps not possible in every scenario, but true in mine. Mostly, I keep the disks cool. I have had one Hitachi (before WD acquired them) start to go bad, but I had left it in a machine for transfers and then forgot about it; Linux began to report S.M.A.R.T. errors, so bye-bye disk! It was not well cooled in its installed position. Otherwise, I usually upgrade before they even have a chance to fail. I do regard disks as fragile, so I am careful with them. Lately, I have been using Toshiba's consumer drives because I think they are like the old Hitachis (didn't WD have to sell something to someone to acquire Hitachi, and wasn't it the desktop drive division?). Not seeing any issues with the Toshibas.

Actually, for any manufacturer, the published specs seem suspicious to me, because there are not even 10,000 hours in a year and I see MTBFs of 150,000 hours or more; that's like 17 years! And that's the mean time, which suggests the expectation that some drives must be able to last 34 years. How can they really say? Who keeps a drive for even 5 years? Who tests their hardware for even 10 years while not releasing it to market?

Now, for the rest of the story. Sorry about the post. Machines with more RAM (different config as well) did not have these cp/diff errors. This machine did not have errors like this before. An equivalently configured machine did not have any cp/diff errors now, so I suspect bad RAM (MemTest+ 4.2 did not catch it). I am sorry for the time I wasted if anyone investigated this further. The conversation was enlightening, however.

On 2014-01-17 10:46, ThomasZMitchell wrote:
>
> gogalthorp;2616629 Wrote:
>> There is an inherent error rate. Normally we do not see it, but it is
>> still there and may show up on huge file transfers. It also makes RAID on
>> huge file systems a bit chancy. Consumer grade disks have an order of
>> magnitude higher rate than commercial grade drives, i.e. you get what you
>> pay for. Check the published error rates for the hardware.
>
> Perhaps true. Not heavily loaded here, though, and the best cure for
> an inherent failure rate is to (groan) use diff to check the cp results
> (which I very often do).

Or simply use tools that compare the copies as they generate them, as I
posted a while back.

Comment: MS-DOS had a system-wide switch that forced verification of
every disk write operation. It was off by default. I don't think Windows
has it, but it might.


Cheers / Saludos,

Carlos E. R.
(from 12.3 x86_64 “Dartmouth” at Telcontar)

I use the diff method because I want to compare what is stored on the disk… With verification as you go (like rsync) and hardware (and software) caching, I worry that I might just be writing to disk and then comparing my in-memory 'original' data with data stored in a cache. When I perform a backup or archive, I want to know what is on the platters; the software cache in Linux can be several GB afaik, and the disk cache is whatever the drive is supposed to have (8 to 64 MB on my drives). But again, I am not using the machine for a while, so I can do that. If the data set is huge, I diff after the copy operation. If the data set is relatively small, I will actually reboot and then diff.
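
(Instead of a full reboot, I suppose the page cache could be dropped before the diff, something along these lines as root – though this only clears the kernel's cache, not the drive's own small write cache:)

sync                                   # flush dirty pages to disk first
echo 3 > /proc/sys/vm/drop_caches      # drop page cache, dentries and inodes
diff -q bigfile1 bigfile2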

I use Toshiba/Hitachi because I can 'hdparm -W0 /dev/sda' (write-cache off) and it makes a difference. I have read that other drives may or may not have this. But I still don't really know unless I am relatively certain that the data has to be read off the platters.
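
(For reference, checking and toggling it looks roughly like this – /dev/sda being the usual example device:)

hdparm -W  /dev/sda     # query the current write-cache setting
hdparm -W0 /dev/sda     # turn write caching off
hdparm -W1 /dev/sda     # turn it back on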

On 2014-01-17 17:56, ThomasZMitchell wrote:
>
> robin_listas;2616792 Wrote:

> I use the diff method because I want to compare what is stored on the
> disk…With verification as you go (like rsync) and hardware (and
> software) caching, I worry that I might just be writing to disk and then
> comparing my in-memory ‘original’ data with data stored in a cache.

Good point.

Which is why I run rsync twice. The second run can’t hit cached copies,
because the copies are way larger than memory.

> I use Toshiba/Hitachi because I can ‘hdparm -W0 /dev/sda’ (write-cache
> off) and it makes a difference. I have read that other drives may or
> may not have this. But I still don’t really know unless I am relatively
> certain that the data has to be read off the platters

Disabling the write cache makes the disk slower, and possibly increases
wear on it.


Cheers / Saludos,

Carlos E. R.
(from 12.3 x86_64 “Dartmouth” at Telcontar)

Checking my most recent openSUSE 13.1 installs, it looks like wolfi is right, /tmp is not installed to tmpfs by default.

Now it's got me wondering where I saw it long ago.
I'm thinking it might have been an upgrade of a 12.3 where /tmp had been configured as tmpfs but was definitely later changed to a disk location. That could mean a rarely seen bug where, during the upgrade, the original /tmp configuration was re-applied.

But for 99% of all openSUSE 13.1 installs, /tmp should be on disk.
Which of course means that you should have plenty of free disk space in the root partition.

And if it's an SSD, it becomes even more imperative to enable TRIM.
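
For instance, a one-off trim of the root filesystem would be something like this, run as root (whether periodic trimming is already scheduled depends on the setup, and it may need extra configuration if the filesystem sits on LUKS):

fstrim -v /      # discard unused blocks on / and report how much was trimmed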

TSU

On 2014-01-18 02:06, tsu2 wrote:
>
> Checking my most recent openSUSE 13.1 installs, it looks like wolfi is
> right, /tmp is not installed to tmpfs by default.
>
> Now it's got me wondering where I saw it long ago.
> I'm thinking it might have been an upgrade of a 12.3 where /tmp had been
> configured as tmpfs but was definitely later changed to a disk location.
> That could mean a rarely seen bug where, during the upgrade, the
> original /tmp configuration was re-applied.

I have not seen /tmp as tmpfs in any openSUSE release. Maybe they tried
this in factory and then switched back. Maybe an install done during
this time would retain the tmpfs setting.


Cheers / Saludos,

Carlos E. R.
(from 12.3 x86_64 “Dartmouth” at Telcontar)

Wait – disabling the write cache does not make the disk slower – it runs at the same speed regardless. Having the write cache enabled increases the user's perception of quickness, because you click on save and the system is ready for a new activity even though your 50 MB file is not quite written to disk yet. Well, you know what I mean. Disabling the write cache will make the disk heads work more when writing, and thus wear out sooner, but my assumption is that ext4 is also optimizing in a way that minimizes disk head movement – most likely elevator seek optimizations. This code probably disagrees with the hdd's caching code. Most hdds are supposed to respect one of the ATA commands to flush 'just this data' to disk NOW, but I have read that not all honor that command. I don't know which ones – not even sure about Toshiba or Hitachi.

There is a rub with the 'hardware' write cache on the disk. The relevant OS filesystem journalling code knows the proper order of writes to achieve its goals. Sort of like: "(1) Gonna write new data and where it goes, (2) Wrote a temp copy and where it went, (3) Gonna point the relevant inode to the new data (or just copy the new data over the old), (4) Pointed to it (or copied to it), (5) Successfully done." I think that's a simplistic overview of how it's done (hand-waving), and if it is, you can have a catastrophic halt anywhere in there and only lose what you were writing if the halt occurs at steps (1) or (2) (but the rest of the disk's other data and the original are intact). If a system halt occurs at (3) or (4), you can 'recover' your data with a slight delay upon reboot – no loss of data. If there is a failure at (5), you will 'waste' a bit of time at reboot recovering something that did not need to be recovered. No big deal.

If the hdd re-orders the writes (which it would have to, to be optimal), then imagine what can happen if the steps above come out as (3) (4) (5) (1) (2) or (5) (1) (2) (3) (4) and there is a halt in the sequence: loss of data. I do not know how directories are handled, but if they are also journaled, you can have some amount of data corruption. I do not want any. Not knowing the design or code of ext4, I do not know what safeguards are in there or how directories are handled. Are they journaled?
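
(Out of curiosity, I suppose the active journal and mount options can be inspected roughly like this, as root – /dev/sda2 is just a placeholder for the root device:)

findmnt -o TARGET,FSTYPE,OPTIONS /                      # mount options in effect for /
dumpe2fs -h /dev/sda2 | grep -Ei 'features|journal'     # superblock: has_journal etc.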

We take other steps to safeguard the data, such as using a reliable machine and a UPS. But there are also kernel panics that just halt everything. The journal can be halted; I think the hdd only halts for a power failure. If the hdd itself fails, then your data is toast anyway. I have heard of battery-backed controllers and disks that guarantee to write all the data in the cache before they spin down in a power failure, but I am not so sure about 64 MB. And the probability of such failures is very low (though I want zero). And we are talking about writing data, which does not happen so much overall in a consumer environment. A commercial environment, transactional processing, real-time response requirements – that is a different story.

Reading data is not affected at all by disabling the write cache. We read much, much more than we write; we just want the writing to be accurate.

I have read articles on the web about this. Many comments such as 'I want the performance' and 'I've never had an issue', but other comments exhorting us to disable it to prevent data loss. I disable.

I occasionally see startup messages about recovering from the journal (ext4) after what I thought was a clean shutdown (1 in a hundred boots) – openSUSE 12.3 and 13.1, mostly. For me, even shutdown can be problematic.

One comment – for setting up a backup copy of my data, I could have enabled caching to speed up the write and then done the diff. So for that scenario, disabled write cache is inefficient and overkill. But habits are habits.