13.1 freezes while copying files

Hi all

The system:

It is an otherwise stable openSUSE 13.1 64-bit installation on a new hard disk, with the latest updates applied.
Desktop is KDE.
Dual-boot with win7.
Core i5 with 8 GB RAM.
ASUS motherboard.
Intel onboard graphics.
Traditional partition table (MBR).

The job:

Copy a bit more than one million files from a backup on an external 2TB USB 3.0 hard disk (formatted NTFS)
to a freshly formatted 1.2TB NTFS partition - win7 ‘drive’ D: .
I started that job yesterday night.

The symptoms:

In the ‘Digital Clock Settings’ I chose to show the seconds.
But when I came back to the machine, the ‘Digital Clock’ was no longer counting the seconds,
or only sporadically, at intervals of sometimes more than 10 minutes.

Further, I had the impression that copying had slowed down.

I tried to cancel copying by clicking the ‘stop’ button in the notifications.

Again, the ‘Digital Clock’ stopped counting the seconds.

15 mins later, copying actually stopped.

Still, the desktop was not responsive!

I had to press Ctrl-Alt-F1 and, after logging in, run ‘shutdown -h now’.

Any idea?
Mike

Rather good problem description. The only thing missing, from my point of view, is how you started the copying (I would probably use some form of the cp command, but you may have done something completely different).

BTW (and purely out of personal amazement), is Linux really better at copying from one non-Linux fs to another non-Linux fs than the Windows system that is native to those filesystems?

Hi Henk!

I used the desktop, logged in as root, dragging the selected directories to the other Dolphin window.

Apart from the point with the freezing:

absolutely, yes, it is!

Beforehand I tried to copy contents of the NTFS ‘drive’ D: to another external USB hard disk using win7.
Those previous contents of ‘drive’ D: resulted from cloning (using dd) the failing internal hard disk
that was present in the PC beforehand.

Several times win7 just refused to do the job.
It aborted - while collecting the data to be copied - without any error message !

Much worse: when I chose to copy smaller amounts of the data, win7 did not abort,
but after finishing the last copy and rebooting, that external NTFS hard disk wasn’t readable any more !!
And win7 was not able to repair that.
Obviously win7 had shot the NTFS on that external disk.

I have simply never experienced such behaviour under Linux,
neither with Linux-native file systems, nor with NTFS or FAT … :wink:

In fact, using openSUSE, I could make a backup copy to a second external NTFS hard disk
(and to a third external USB hard disk formatted ReiserFS)
from which I wanted to restore the data now.

Thanks
Mike

That really shocks me. I was not able to read any further.

The reason was that I didn’t want to type (or copy/paste) dozens of directory names,
while on the other hand I still need root privileges to write to NTFS partition D: .
The latter may either be so because I didn’t edit polkit settings yet,
or because I checked “don’t mount at startup” in the mount options of YaST’s partitioner,
or both.

At least I unplugged the ethernet cable before proceeding.

I do not know what the ethernet cable has to do with logging in as root in the GUI (or with all the other self-inflicted injuries one might do to one’s system), but I do not feel in a position to form a complete picture of what you have done and what the consequences may be. In my working life I would have had a big quarrel with my boss about such a situation. Now, being a volunteer, I stop where my expertise stops.

Maybe others have more imagination.

Not really. If you were able to log in on tty1, the system was not frozen (contrary to the subject). So what you should have done was investigate: check the current CPU load (top alone would already provide some useful information), check memory and swap consumption, check disk activity. This would give a starting point. My best bet: the program you used to copy the files filled up all available memory, so the system started to swap heavily.
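
The checks suggested above could be scripted roughly as follows. This is only a minimal sketch, assuming a Linux system with /proc mounted; it reads the kernel's counters directly instead of calling top or free, so it may still respond when those tools are slow to start. The selection of fields is an assumption about what is most useful here.

```shell
# 1-, 5- and 15-minute load averages
load=$(cut -d ' ' -f 1-3 /proc/loadavg)
echo "load average: $load"

# Memory, swap, and data still waiting to be written to disk (values in kB)
mem=$(grep -E '^(MemFree|Buffers|Cached|SwapTotal|SwapFree|Dirty|Writeback):' /proc/meminfo)
echo "$mem"

# Number of processes currently blocked waiting for I/O to complete
blocked=$(awk '/^procs_blocked/ {print $2}' /proc/stat)
echo "processes blocked on I/O: $blocked"
```

Run from tty1 while the copy is in progress, a large Dirty value together with a non-zero count of blocked processes would point at slow disk writes rather than at CPU load or swapping.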

On 2014-06-02 12:06, ratzi wrote:
>
> Hi Henk!
>
> hcvv;2646893 Wrote:
>> Rather good problem description. The only thing missing, from my point
>> of view, is how you started the copying (I would probably use some form
>> of the cp command, but you may have done something completely different).
>
> I used the desktop, logged in as root, dragging the selected directories
> to the other Dolphin window.

Logging in as root on the desktop is anathema here :slight_smile:

You have been godwinated :-p

For doing large copy jobs, I refuse to use Dolphin or anything similar.
I simply use ‘mc’ in a terminal where I do “su -” first.

(using windows for the copy)

> For a few times win7 just refused to do the job.
> It aborted - while collecting the data to be copied - without -any-
> error message !

Maybe one of those disks, source or destination, has a badly corrupted
NTFS filesystem. Note that both systems have problems with it. I would look at
the Linux syslog during the problem period.

> Much worse: when I did choose to copy smaller amounts of the data, win7
> did not abort,
> but after finishing the last copy, and re-booting, that external NTFS
> hard disk wasn’t readable any more !!

Well, you do have problems with that disk.

> And win7 was not able to repair that.
> Obviously win7 had shot the NTFS on that external disk.

I know of a certain proprietary application that does a good job of
recovering files out of a bad ntfs partition, and it is not expensive,
if that’s what you want… (yes, I tried photorec first. Horrible).


Cheers / Saludos,

Carlos E. R.
(from 13.1 x86_64 “Bottle” at Telcontar)

Hi arvidjaar!

[quote="arvidjaar,post:7,topic:100818"]

Not really. If you were able to login to tty1, system was not frozen (contrary to subject). So what you had to do was to investigate - check current CPU load (using top would already provide some useful information), check memory and swap consumption, check disk activity. This would give some starting point. My best bet - program you used to copy files filled up all available memory so system started to swap heavily.[/quote]

That swap was being used seems to be a very likely scenario.
But it may not be the whole story.

I finally decided to copy the data in smaller portions.

That works quite well, but it doesn’t solve everything.

Now I have a copy of 34’000 files (or 27 GB) running.

When I click on the chameleon in the lower left corner of the screen (the ‘Kickoff Application Launcher’),
the clock freezes again.

Then I hit Ctrl-Alt-F1, and logged in as root, to enter ‘top’ and ‘free’.

The system again lags considerably - the login took time, and I had to wait minutes before the system returned to the desktop session after hitting Ctrl-Alt-F7 from there.

Since that system is almost frozen, I’m no longer able to copy the output of ‘top’ and ‘free’ to a USB pen drive to transfer it to the machine that I’m using to write this. So I’m typing a few things by hand.

‘top’ gives 7.3% of CPU for plasma-desktop, 3.3% for mount.ntfs-3g, and a few entries (Xorg, kio_file) just below 1%.
The rest is negligible.

‘free’ gives (again after some time)

             total       used       free     shared    buffers     cached
Mem:       7891596    7768600     122996      42444    3485388    3829696
-/+ buffers/cache:     453516    7438080
Swap:     16777212          0   16777212
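
As a side note on reading these numbers: the ‘-/+ buffers/cache’ line can be reproduced from the ‘Mem:’ line. A small sketch of the arithmetic, using only the values shown above (the variable names are just for illustration):

```shell
# Values in kB, copied from the 'free' output above
used=7768600
free_kb=122996
buffers=3485388
cached=3829696

# Memory actually held by programs: used minus buffers and page cache
app_used=$((used - buffers - cached))
# Memory the kernel can hand out on demand: free plus buffers and page cache
available=$((free_kb + buffers + cached))

echo "used by applications: $app_used kB"    # 453516 kB
echo "available on demand:  $available kB"   # 7438080 kB
```

Both results match the second line of the output: only about 450 MB is held by programs, the rest is buffers and page cache, and no swap is in use. So at the moment of this snapshot, heavy swapping does not appear to be what slows the system down.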

The program I used to copy the files is the file manager of KDE: dolphin.

Disk activity: the LED indicating disk activity (signal provided by the ASUS motherboard)
is flickering all the time at high frequency.

Thanks
Mike

On 2014-06-02 23:56, ratzi wrote:

> The program I used to copy the files is the file manager of KDE:
> dolphin.

Use ‘mc’, in a terminal.


su -
mc

> Disk activity: the LED indicating disk activity (signal provided by the
> ASUS motherboard)
> is flickering all the time at high frequency.

Try “iotop -o” on tty1 or 2, as root. It will tell you what is using the
disk, and how much.


Cheers / Saludos,

Carlos E. R.
(from 13.1 x86_64 “Bottle” at Telcontar)

Could you show the output of “cat /proc/meminfo”?

Perhaps look for bad I/O error rates … bad cable, bad drive, loose or dirty connections?

I recently copied my entire /home, including hidden files, to a backup USB NTFS drive; only about 50 GB.
All done as a normal user BTW - and using Dolphin.

It went perfectly.

Things can get flaky when dealing with FAT/NTFS. Others have noted problems when copying large numbers of files to and from these file systems. Using command-line tools rather than a GUI would probably help somewhat when copying tons of files.

Dear Carlos!

Maybe :wink:

Yes, of course there are reasons to not log in as root!
One of the biggest shortcomings of windows for years was that the user always has been logged in as ‘root’
(which was the same in versions of MacOS older than OS X as well !!)
leading to a high vulnerability.
All the windows viruses and worms tell a long story of that!

I read around a bit, and mc as well as rsync seem to be fine,
while one won’t get progress info from ‘cp’ -
which even for ‘dd’ is different, because there one can use the ‘kill’ command
to get ‘dd’ to tell a bit more about what it’s actually doing :wink:
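
For reference, a minimal sketch of that trick, assuming GNU dd from coreutils (other dd implementations may react differently to the signal, and may even terminate). The copy from /dev/zero to /dev/null and the temporary file are only stand-ins for a real job:

```shell
status_file=$(mktemp)

# Start a long-running dd in the background; its status messages go to stderr
dd if=/dev/zero of=/dev/null bs=1M 2>"$status_file" &
dd_pid=$!

sleep 1                       # give dd time to install its signal handler
kill -USR1 "$dd_pid"          # GNU dd reports records and bytes copied so far, then carries on
sleep 1
kill "$dd_pid"                # end the demonstration copy
wait "$dd_pid" 2>/dev/null || true

# Show what dd reported in response to the signal
report=$(cat "$status_file")
rm -f "$status_file"
echo "$report"
```

Newer versions of GNU dd also accept the operand status=progress, which prints a running total without any signal.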

But in the end dolphin should do the job. See a posting to follow.

No, I don’t think that I have a hardware problem with that external disk.
Look, I’m still using an old G3 Mac running MacOS 9,
and I sometimes transfer data from it to Linux by means of an USB pen drive.
Sometimes I make backup copies of those USB pen drives.

Now MacOS 9 adds a file ‘FINDER.DAT’ and sometimes a folder ‘RESOURCE.FRK’
to every folder of a USB pen drive, which itself is formatted FAT32.

I have already had the following experience: when I back up a USB pen drive that MacOS 9
has ‘amended’ this way to a windows volume, ‘scandisk’ of windows thinks that
the volume has errors and has to be repaired!
But if I manually delete all these ‘FINDER.DAT’ files and ‘RESOURCE.FRK’ folders beforehand,
scandisk is completely happy!

That I tried to copy files from an NTFS volume whose file system may have a few corrupted entries
may have made it harder for windows.

But the result was clear: this external hard disk was very healthy until the last copying of files to it using win7!

After all my experience I would say that Linux is much more robust!!

Thank you.
Maybe I’ll contact you on that if everything here goes wrong.

But usually I prefer to have a sufficient number of backup copies (which just decreased by one, see above),
instead of relying on such tools:

I just bricked a USB pen drive using ‘testdisk’ included on the GParted live CD … its light went out forever after I told ‘testdisk’ to write the changes to the partition table.
This USB pen drive doesn’t even appear anymore in Dolphin’s list of available partitions/drives.
But it still prevents my BIOS from booting when it is plugged in - kind of a zombie, now.

I had no luck with ‘iotop’; when I called it I got

myHost:~ # iotop -o
If 'iotop' is not a typo you can use command-not-found to lookup the package that contains it, like this:
    cnf iotop
myHost:~ # 

Probably that one needs a .rpm which I haven’t installed.

Best wishes
Mike

Hi Gerry!

No, it doesn’t appear to be a problem of that kind.

When I copy only a few huge files, like the ISOs of openSUSE DVDs and CDs, the rate reaches up to 135 MB/s.

That looks like the rate you would expect from USB 3.0, rather than like a hardware problem.

The problem is rather the tons of small files that you have if you work with LaTeX or save web pages for offline reading, etc.

Thanks
Mike

Hi all !

Prompted by the suggestions in some postings of this thread that
running a root session may pose a problem,
I removed the option values “user,noauto,users,gid=users,fmask=133,dmask=022”
from my fstab file for the windows NTFS partition ‘D:’,
in order to be able to write to it as the standard user (not requiring root privileges).

Result: no more freezing, and copying went as if the handbrake had been taken off !

Because it was not clear whether these changes, or the fact that I was no longer logged in as root, caused the improvement,
I made another test.

I logged in as root again.

This time I copied a folder with 122 GB in 200’000 files in 8500 sub-folders.
No freeze.

So apparently, removing the option values “user,noauto,users,gid=users,fmask=133,dmask=022” from fstab was critical.

Being logged in as standard user or root, at the end of the day, didn’t seem to have any effect !

To me it appears rather likely that, with the option values “user,noauto,users,gid=users,fmask=133,dmask=022” in place,
the system checks for every single small file whether it has the permission to do so.

That is also supported by the overall behaviour during the copy: while beforehand the disk light was flickering at a high rate all the time,
after removing the option values “user,noauto,users,gid=users,fmask=133,dmask=022” from fstab,
the internal hard disk is accessed only from time to time, but then heavily.
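
One way to compare two sets of mount options without risking real data would be to time the creation of many small files on the partition. The following is only a rough sketch: the function name small_file_bench, the file count and the use of /tmp in the example call are made up for illustration, and for a real test the directory would be one on the mounted partition (such as /windows/D).

```shell
# Usage: small_file_bench DIR COUNT
# Creates COUNT one-byte files in a scratch directory below DIR and reports the elapsed time.
small_file_bench() {
    dir=$1
    count=$2
    bench_dir=$(mktemp -d "$dir/bench.XXXXXX") || return 1

    start=$(date +%s)
    i=0
    while [ "$i" -lt "$count" ]; do
        printf 'x' > "$bench_dir/file_$i"
        i=$((i + 1))
    done
    sync                                   # include the time needed to flush the writes to disk
    end=$(date +%s)

    created=$(( $(ls "$bench_dir" | wc -l) ))
    rm -rf "$bench_dir"                    # remove the scratch files again
    echo "$created files in $((end - start)) s"
}

small_file_bench "${TMPDIR:-/tmp}" 1000
```

Running it once with the old options and once without them would give two numbers to compare, which is easier to report than the behaviour of the disk light.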

Thank all of you!

For me the problem seems to be solved.

Mike

Congrats on figuring it out. I’d say that is worth a posting to Bugzilla to report this behavior to the developers. Perhaps a fix can be had in the next version.

On 2014-06-03 19:56, ratzi wrote:

> So apparently, removing the option values
> “user,noauto,users,gid=users,fmask=133,dmask=022” from fstab was
> critical.

Could you please use the command “mount” with no options, to find out
how that filesystem was actually mounted? Then umount, comment out the
line, mount with the desktop, and then repeat the “mount” command.

Post here both result lines and we can compare what are the differences.

> To me it appears rather likely, that with the option values
> “user,noauto,users,gid=users,fmask=133,dmask=022” in place,
> the system seems to check for every single small file, if it has the
> permission to do so.

No, no, not so. No matter how you mount a filesystem, permission checks
are always done.

> Thank all of you!
>
> For me the problem seems to be solved.

Welcome! But please, let us investigate the issue a bit more. It is not
often that we find someone with this particular problem, so, if you
don’t mind, some help from you would be nice. Perhaps we can pinpoint
the root cause.

For instance, if there is a content indexer that tries to index every
file that is written, this can cause a heavy load. I read a report that
such an indexer (the new one in KDE, IIRC) was indexing files downloaded
by torrent at the same time as they were being downloaded. These files change
continuously and slowly, over hours or days, and each change caused the
indexer to scan the complete file again. This caused a huge load.

And indexers may scan some directories and ignore others. And, depending
on how you mount, they look in one place or another.

I’m not saying that is your case, but it is a possibility; running
“iotop -o” in a terminal could tell us that immediately. Of course, you
have to install it. Package iotop, default repos.


Cheers / Saludos,

Carlos E. R.
(from 13.1 x86_64 “Bottle” at Telcontar)

Hi Carlos,

do you really want me to re-introduce the parameters in fstab that caused my problems?

I could, yes, and it would cause me more trouble, but what would that show?

I did the following, so that you can see the effect:
I mounted /windows/C (not mounted by default) before I called ‘mount’.

The result of ‘mount’ then is

devtmpfs on /dev type devtmpfs (rw,relatime,size=3932868k,nr_inodes=983217,mode=755)
tmpfs on /dev/shm type tmpfs (rw,relatime)
tmpfs on /run type tmpfs (rw,nosuid,nodev,relatime,mode=755)
devpts on /dev/pts type devpts (rw,relatime,gid=5,mode=620,ptmxmode=000)
/dev/sda7 on / type reiserfs (rw,relatime)
proc on /proc type proc (rw,relatime)
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /sys/fs/cgroup type tmpfs (rw,nosuid,nodev,noexec,mode=755)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd)
pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpuacct,cpu)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/net_cls type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=30,pgrp=1,timeout=300,minproto=5,maxproto=5,direct)
debugfs on /sys/kernel/debug type debugfs (rw,relatime)
mqueue on /dev/mqueue type mqueue (rw,relatime)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime)
tmpfs on /var/lock type tmpfs (rw,nosuid,nodev,relatime,mode=755)
tmpfs on /var/run type tmpfs (rw,nosuid,nodev,relatime,mode=755)
/dev/sda4 on /windows/D type fuseblk (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other,blksize=4096)
/dev/sda9 on /home type reiserfs (rw,relatime)
/dev/sda8 on /home123 type ext3 (ro,nosuid,nodev,noexec,relatime,data=ordered,user)
/dev/sdb1 on /run/media/root/HP v165w type vfat (rw,nosuid,nodev,relatime,fmask=0022,dmask=0077,codepage=437,iocharset=iso8859-1,shortname=mixed,showexec,utf8,flush,errors=remount-ro,uhelper=udisks2)
/dev/sdb1 on /var/run/media/root/HP v165w type vfat (rw,nosuid,nodev,relatime,fmask=0022,dmask=0077,codepage=437,iocharset=iso8859-1,shortname=mixed,showexec,utf8,flush,errors=remount-ro)
/dev/sda3 on /windows/C type fuseblk (rw,nosuid,nodev,noexec,relatime,user_id=0,group_id=0,default_permissions,allow_other,blksize=4096)

The differences between the settings for /windows/C and /windows/D
(beforehand, the parameters for these 2 partitions in fstab were the same)
may give you the clue that you want.
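
To make the comparison concrete, the two option lists can be split and compared mechanically. A small sketch, with the option strings copied from the /windows/D and /windows/C lines above; the temporary directory and the variable names are only illustrative:

```shell
# Option lists copied from the 'mount' output above
opts_d="rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other,blksize=4096"
opts_c="rw,nosuid,nodev,noexec,relatime,user_id=0,group_id=0,default_permissions,allow_other,blksize=4096"

tmp=$(mktemp -d)

# One option per line, sorted, so that comm can compare the two sets
printf '%s\n' "$opts_d" | tr ',' '\n' | LC_ALL=C sort > "$tmp/d"
printf '%s\n' "$opts_c" | tr ',' '\n' | LC_ALL=C sort > "$tmp/c"

# comm -23 keeps lines found only in the first file, comm -13 only in the second
only_d=$(LC_ALL=C comm -23 "$tmp/d" "$tmp/c" | paste -sd, -)
only_c=$(LC_ALL=C comm -13 "$tmp/d" "$tmp/c" | paste -sd, -)
rm -rf "$tmp"

echo "only on /windows/D: ${only_d:-none}"
echo "only on /windows/C: ${only_c:-none}"
```

The only differences are noexec and default_permissions, both present on /windows/C and absent on /windows/D. This does not prove that default_permissions is what slowed the copy down, but it looks like the obvious candidate to test.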

I disabled ‘Desktop Search’ some time ago before all of that.

Look, I don’t want to return to the state in which my system freezes during large copies.
Once all of my data has been copied and checked (!!), I may run that test.
Please don’t ask me now, because I would risk losing my personal data (like photographs of my kids and more).

Best wishes
Mike