Copying is very very CPU intensive

Hi,

I just switched from Windows to OpenSUSE, and my problem is the following:
Copying is very CPU intensive: the kworker processes start to use the whole CPU capacity and my system becomes unresponsive. If I copy with mc, mc itself also uses a lot of CPU in addition to the kworkers; if I copy with Krusader, Krusader’s CPU usage is OK (but the kworkers’ is not!).
When I copy with Krusader (from one directory to another on the same drive), I can see that the transfer rate sometimes goes unrealistically high for my system (60-70 MB/s), sometimes drops to a realistic 20-30 MB/s, and sometimes stalls(!?).

(Running badblocks is not CPU intensive: it uses 8%, and the kworkers use 0-1%.)

I’m using OpenSUSE 12.1, kernel 3.1.9-1.4-desktop x86_64, KDE 4.7.2.
I’m using lvm encrypted volumes.

I tried the kernel-desktop-3.1.10~jng3-3.x86_64 without any luck.

I have a HP 8510p notebook, with Intel(R) Core™2 Duo T9300 @ 2.50GHz, and 8GB RAM.

What can I do, what can be wrong, and how can I fix it?
If you can, please help.

Thank you in advance,

atskler

(sorry for my poor English)

On 2012-02-19 00:56, atskler wrote:

> I just switched from Windows to OpenSUSE, and my problem is the
> following:

Is the source or the destination of your copy one of your old windows
partitions? The ntfs driver is very much cpu intensive.

Another possibility is that i/o on your disk is not using DMA.


Cheers / Saludos,

Carlos E. R.
(from 11.4 x86_64 “Celadon” at Telcontar)

> I’m using lvm encrypted volumes.

using encryption will be slow
And on a notebook with one drive it will be very slow

On 2012-02-19 02:16, JohnVV wrote:
>> I’m using lvm encrypted volumes.
>>
> using encryption will be slow

True, I missed that. Encryption needs cpu power.


Cheers / Saludos,

Carlos E. R.
(from 11.4 x86_64 “Celadon” at Telcontar)

First, thank you for the answers:

To: #2 / robin_listas
I know that the ntfs driver is cpu intensive.
But I copied from ext4 partition to ext4 partition, and now from an external ext4 to my internal ext4.

DMA: I ran hdparm on my disks and got the following; if I read it correctly, everything is OK with DMA.

>>> Internal sata disk: /dev/sda:

Model=ST9320423AS, FwRev=0002SDM1, SerialNo=5VJ1JDYW
Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs RotSpdTol>.5% }
RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=4
BuffType=unknown, BuffSize=16384kB, MaxMultSect=16, MultSect=16
CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=625142448
IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
PIO modes: pio0 pio1 pio2 pio3 pio4
DMA modes: mdma0 mdma1 mdma2
UDMA modes: udma0 udma1 udma2 udma3 udma4 udma5 *udma6
AdvancedPM=yes: unknown setting WriteCache=enabled
Drive conforms to: unknown: ATA/ATAPI-4,5,6,7

 * signifies the current active mode

Timing cached reads: 10748 MB in 1.99 seconds = 5393.46 MB/sec
Timing buffered disk reads: 248 MB in 3.00 seconds = 82.58 MB/sec

>>> External esata disk with pcmcia adapter: /dev/sdb:

Model=WDC WD10EURS-630AB1, FwRev=80.00A80, SerialNo=WD-WCAV5P255694
Config={ HardSect NotMFM HdSw>15uSec SpinMotCtl Fixed DTR>5Mbs FmtGapReq }
RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=50
BuffType=unknown, BuffSize=unknown, MaxMultSect=16, MultSect=off
CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=1953525168
IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
PIO modes: pio0 pio3 pio4
DMA modes: mdma0 mdma1 mdma2
UDMA modes: udma0 udma1 udma2 udma3 udma4 *udma5 udma6
AdvancedPM=yes: unknown setting WriteCache=enabled
Drive conforms to: Unspecified: ATA/ATAPI-1,2,3,4,5,6,7

 * signifies the current active mode

Timing cached reads: 7554 MB in 2.00 seconds = 3786.22 MB/sec
Timing buffered disk reads: 180 MB in 3.02 seconds = 59.57 MB/sec
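
The active transfer mode is the one hdparm marks with an asterisk, so it can be picked out of the listings above mechanically. A minimal sketch, assuming output in the format shown here; `active_dma_mode` is a hypothetical helper name, not part of hdparm:

```shell
# Hypothetical helper: print the transfer mode that `hdparm -i` marks
# with '*' (the active one). Reads hdparm output on stdin and prints
# e.g. "udma6", or nothing if no DMA mode is starred.
active_dma_mode() {
    grep -o '\*[a-z]*dma[0-9]*' | tr -d '*' | head -n 1
}

# Example (needs root; /dev/sda is only an example device name):
#   hdparm -i /dev/sda | active_dma_mode
```

If this prints a udma mode, as in both listings above, the drive is not falling back to PIO.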

To: #3 / JohnVV
Concerning encryption: in Windows I used Truecrypt to encrypt my whole system, and I never experienced such high CPU usage when copying, even when copying to another external Truecrypt-encrypted storage. I think my CPU is strong enough to handle this kind of operation comfortably, at least in Windows with … I never really noticed that I was encrypted under Windows.

And if I’m right, the disk encryption should be done by kcryptd and kcryptd_io, but they do nothing.


Addendum:
I copied from my external ext4 PCMCIA eSATA drive to my internal ext4 SATA drive and took some top and iotop measurements. Please look not only at the top and iotop output but also at the copy progress window (MiB/s, stalled, paused, etc.).
Please note that I can’t capture the really high CPU or system load, because my system is unresponsive in that state.

Measurement screenshots are here – hi res. images:
‘dbg - top, iotop’ (http://atskler.net/dbg/)

Thank you in advance for your further help.

The mc and cp measurements show the same results: > mc-cp <

On 2012-02-19 16:46, atskler wrote:
>
> First, thank you for the answers:
>
> To: #2 / robin_listas
> I know that the ntfs driver is cpu intensive.
> But I copied from ext4 partition to ext4 partition, and now from an
> external ext4 to my internal ext4.
>
> DMA: I hdparmed my disks and I see the following, and if I’m right with
> DMA everything is OK.

Ok.

> Timing buffered disk reads: 248 MB in 3.00 seconds = 82.58 MB/sec

So, your hardware is capable.

(next time, please use code tags to post such text. Advanced editor, # button).

> To: #3 / JohnVV
> Concerning encryption: in Windows I used Truecrypt to encrypt my whole
> system, and I never experienced such high cpu usage when I copied
> something, even if I copied to an another external Truecrypt encrypted
> storage. I think my CPU is strong enough to calmly handle this kind of
> operations, at least in Windows with … I never really noticed that I’m
> encrypted under Windows.

Nor me in Linux, but things can change.

> Measurement screenshots are here – hi res. images:
> ‘dbg - top, iotop’ (http://atskler.net/dbg/)

So, you are using krusader for the copy. There have been reports of slow
response while using kde for copy operations. I think you have seen the thread.


Cheers / Saludos,

Carlos E. R.
(from 11.4 x86_64 “Celadon” at Telcontar)

> So, you are using krusader for the copy. There have been reports of slow
> response while using kde for copy operations. I think you have seen the thread.

Yes, and I tried mc and cp with the same results - under KDE Terminal if it matters. You can check it in: #6

On 2012-02-19 23:06, atskler wrote:
> Yes, and I tried mc and cp with the same results - under KDE Terminal
> if it matters. You can check it in: #6

Then try in text mode (ctrl-alt-f1), outside of kde. Even better, before
you log into kde.


Cheers / Saludos,

Carlos E. R.
(from 11.4 x86_64 “Celadon” at Telcontar)

> Then try in text mode (ctrl-alt-f1), outside of kde. Even better, before
> you log into kde.

Tried. I started my system without xdm and KDE. The situation is exactly the same.

On 2012-02-20 00:46, atskler wrote:
>
>> Then try in text mode (ctrl-alt-f1), outside of kde. Even better, before
>> you log into kde.
>
> Tried. I started my system without xdm, KDE. The situation is exactly
> the same.

Then I have no idea…


Cheers / Saludos,

Carlos E. R.
(from 11.4 x86_64 “Celadon” at Telcontar)

On 02/20/2012 12:46 AM, atskler wrote:
> The situation is exactly the same.

then what remains constant that might cause the symptom:

hardware (drive, cable and/or disk controller board)? heat?
software: using RAID? reiserfs? btrfs? with vs without encryption? BIOS
problem (upgrade)?


DD http://tinyurl.com/DD-Caveat
Read what Distro Watch writes: http://tinyurl.com/SUSEonDW

OK. I did some further research.
I found this:

/usr/src/linux-3.1.9-1.4/Documentation/workqueue.txt

It says:

  1. Debugging

Because the work functions are executed by generic worker threads
there are a few tricks needed to shed some light on misbehaving
workqueue users.

Worker threads show up in the process list as:

root      5671  0.0  0.0      0     0 ?        S    12:07   0:00 [kworker/0:1]
root      5672  0.0  0.0      0     0 ?        S    12:07   0:00 [kworker/1:2]
root      5673  0.0  0.0      0     0 ?        S    12:12   0:00 [kworker/0:0]
root      5674  0.0  0.0      0     0 ?        S    12:13   0:00 [kworker/1:0]

If kworkers are going crazy (using too much cpu), there are two types
of possible problems:

  1. Something being scheduled in rapid succession
  2. A single work item that consumes lots of cpu cycles

The first one can be tracked using tracing:

	$ echo workqueue:workqueue_queue_work > /sys/kernel/debug/tracing/set_event
	$ cat /sys/kernel/debug/tracing/trace_pipe > out.txt
	(wait a few secs)
	^C

If something is busy looping on work queueing, it would be dominating
the output and the offender can be determined with the work item
function.

For the second type of problems it should be possible to just check
the stack trace of the offending worker thread.

	$ cat /proc/THE_OFFENDING_KWORKER/stack

The work item’s function should be trivially visible in the stack
trace.
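
To apply the second check, the offending worker has to be identified first. A minimal sketch, assuming the procps `ps` output format; `busiest_kworkers` is a hypothetical helper name:

```shell
# Hypothetical helper: read `ps -eo pid,pcpu,comm` output on stdin and
# print only the kworker rows, busiest first, as "PID %CPU NAME".
busiest_kworkers() {
    awk '$3 ~ /^kworker/ { print $1, $2, $3 }' | sort -k2,2 -rn
}

# Example (reading /proc/PID/stack needs root):
#   ps -eo pid,pcpu,comm | busiest_kworkers | head -n 3
#   cat /proc/5671/stack    # 5671 is the example PID from the excerpt above
```

Because a kworker can move from one work item to the next quickly, reading the stack several times in a row gives a more reliable picture than a single sample.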

And my out.txt - http://atskler.net/dbg/out.txt (3 MB!) - is full of:

function=kcryptd_crypt
function=do_dbs_timer
function=flush_to_ldisc
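
A 3 MB trace is easier to judge as a tally than by eye. A minimal sketch, assuming each trace line carries a `function=NAME` field as in the entries above; `count_work_functions` is a hypothetical helper name:

```shell
# Hypothetical helper: count how often each work function appears in a
# workqueue trace file, most frequent first.
# Output lines have the form "<count> <function>".
count_work_functions() {
    grep -o 'function=[A-Za-z0-9_.]*' "$1" \
        | cut -d= -f2 \
        | sort | uniq -c | sort -rn \
        | awk '{ print $1, $2 }'
}

# Example:
#   count_work_functions out.txt | head
```

The function at the top of the tally is the one being queued most often during the capture.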

So my problem, as you correctly thought, arises at least in part from disk encryption.


Related docs:
High CPU Utilization When Copying to Ext4
Hard performance hit after encrypting

On 02/20/2012 02:56 PM, atskler wrote:
>
> I found this:
> /usr/src/linux-3.1.9-1.4/Documentation/workqueue.txt
>
> So my problem, as you correctly thought, at least in one part arises
> from disk encryption.

cool find…i’ve never been in that directory before…it is mostly over
my head…but very happy it is useful to you…

what i do not know is, if this crypt slowdown is more than should be
expected, or not…i don’t know, maybe the crypto code needs work, or
maybe the high use of CPU ticks is expected and ‘normal’…

you might join an irc channel and see if a crypto-wise kernel kruncher
is hanging out…begin here:
http://en.opensuse.org/openSUSE:Communication_channels


DD http://tinyurl.com/DD-Caveat http://tinyurl.com/SUSEonDW

> what i do not know is, if this crypt slowdown is more than should be
> expected, or not…i don’t know, maybe the crypto code needs work, or
> maybe the high use of CPU ticks is expected and ‘normal’…

You are right. As I said, I used Truecrypt in Windows, and it has (in Linux too) a benchmark function that measures the speed of the different encryption algorithms in RAM. On my system:
AES 184 MB/s
Twofish 153 MB/s
Serpent 79 MB/s
AES-Twofish 79 MB/s
Serpent-AES 52 MB/s
Twofish-Serpent 52 MB/s
Serpent-Twofish-AES 40 MB/s
AES-Twofish-Serpent 40 MB/s

So, one of my thoughts is that the OpenSuSE 12.1 installer chooses a stronger and/or slower encryption than Truecrypt’s AES. Currently I’m trying to find out which encryption type it uses, and then to benchmark it in RAM somehow.

> you might join an irc channel and see if a crypto-wise kernel kruncher
> is hanging out…begin here:
> openSUSE:Communication channels - openSUSE

I’m really inexperienced in English, and in Linux too. It would be difficult for me …



On 2012-02-20 18:14, DenverD wrote:
> what i do not know is, if this crypt slowdown is more than should be
> expected, or not…i don’t know, maybe the crypto code needs work, or maybe
> the high use of CPU ticks is expected and ‘normal’…

It depends on the data, I guess, and the cpu used.


Cheers / Saludos,

Carlos E. R.
(from 11.4 x86_64 “Celadon” at Telcontar)

On 2012-02-20 19:06, atskler wrote:

> So, one of my thoughts is that OpenSuSE 12.1 installer chooses
> something stronger and/or slower encryption than Truecrypt’s AES.
> Currently I’m trying to find out this encryption type and then somehow
> benchmarking it in RAM.

It is not that simple, you have to create the encrypted filesystem manually
and then test it.

To find out what you are using, do something like this:

Telcontar:~ # file -s /dev/sdc9
/dev/sdc9: LUKS encrypted file, ver 1 [aes, cbc-essiv:sha256, sha1] UUID: 4*3

To do it fast, you will have to script it.
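
For scripting, the cipher fields can be pulled out of that `file -s` line. A minimal sketch, assuming the output format shown above; `luks_cipher_spec` is a hypothetical helper name. `cryptsetup luksDump` on the same device reports the same header fields in more detail.

```shell
# Hypothetical helper: extract the bracketed "cipher, mode, hash" part
# from a `file -s` line describing a LUKS volume. Reads stdin and
# prints nothing for lines that are not LUKS.
luks_cipher_spec() {
    sed -n 's/.*LUKS encrypted file.*\[\([^]]*\)].*/\1/p'
}

# Example (needs root; the device name is taken from the post above):
#   file -s /dev/sdc9 | luks_cipher_spec
```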


Cheers / Saludos,

Carlos E. R.
(from 11.4 x86_64 “Celadon” at Telcontar)

I measured my system and it can encrypt at ~105 MB/s.
After further investigation, I think my problem comes from I/O scheduling and caching.
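
The settings behind those two suspects can be read without changing anything. A minimal sketch, assuming the usual sysfs and procfs paths; `active_scheduler` is a hypothetical helper name and `sda` is only an example device:

```shell
# Hypothetical helper: print the active I/O scheduler from a
# /sys/block/<dev>/queue/scheduler line, where the kernel marks the
# active one with square brackets, e.g. "noop deadline [cfq]".
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Example (sda is an example device name):
#   active_scheduler < /sys/block/sda/queue/scheduler
#   cat /proc/sys/vm/dirty_ratio /proc/sys/vm/dirty_background_ratio
```

The two dirty_* values control how much written data may sit in the page cache before it is flushed to disk, which may be related to the alternating bursts and stalls seen in the copy dialog.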

Currently I have a lot of work, so I have suspended my research into this problem.

On 2012-02-23 17:16, atskler wrote:
>
> I measured my system and it can encrypt with ~105 MB/sec.

That’s about a typical hard disk limit. Measure the HD speed on non
encrypted writes.
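
One way to take that measurement is a plain dd write. A minimal sketch, assuming GNU dd; `write_speed_test` is a hypothetical helper name, and the scratch file must be on the filesystem being tested:

```shell
# Hypothetical helper: write COUNT MiB of zeros to FILE and print dd's
# summary line, which includes the throughput.
# conv=fdatasync makes dd flush the data to disk before it reports, so
# the page cache does not inflate the figure.
# FILE is a scratch path: it is overwritten and then removed.
write_speed_test() {
    dd if=/dev/zero of="$1" bs=1M count="$2" conv=fdatasync 2>&1 | tail -n 1
    rm -f "$1"
}

# Example (the path is an assumption; use any unencrypted filesystem):
#   write_speed_test /mnt/plain/ddtest.tmp 1024
```

Running the same test once on an unencrypted filesystem and once on the encrypted volume shows how much of the gap is due to encryption.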


Cheers / Saludos,

Carlos E. R.
(from 11.4 x86_64 “Celadon” at Telcontar)