A lot of interesting observations on my end. First, my CPU makes a strange squeaking noise while this code runs. The code seems to run better either inside a virtual environment or, more likely, with 32-bit math (my host is 64-bit). There seems to be little difference between compiling with clang and compiling with gcc. Also, if I use -ffast-math, it actually is “fast” on openSUSE. It uses a lot more resident memory on openSUSE. I used the command in the file to compile (g++ -Wall -Wextra -O2 check_speed.cpp) and took the best of three runs. Edit: I wonder whether this code is specifically designed to exploit a bug (in libc, perhaps?).
Lol - of course you know where the squeaking noise comes from… No, it is not the sound of the many tired electrons that need to be moved around to yield those cos()-s and exp()-s.
Still, the times you get don’t answer the main question: does the code run slower on openSUSE 12.3 than on openSUSE 12.2, by how much, and why?
And the VM, Debian, clang and -ffast-math introduce four new degrees of freedom, which only complicate the 12.3 vs. 12.2 comparison.
I don’t use -ffast-math. But it is interesting how miserably slow the VM with openSUSE 12.3 and kernel 3.7.10-1.1-default is. I can only guess that it is because the compilation cannot be optimized in a good way.
I am complicating nothing. I show the results with openSUSE 12.3 and with something that is not openSUSE 12.3. I do not feel like downloading openSUSE 12.2; I only have Debian and openSUSE 12.3 images available. You can do your own comparison if you are interested in adding the data to the bug report. I did this to learn and have fun.
Either way, I am still quite sure the overall performance of openSUSE 12.3 is not poor. I think this code is specific somehow. If I had spent more than 5 minutes in C++ code in my life, I would know why.
What is the processor, and what is its maximum operating frequency?
One way to get the answers is to look at file /proc/cpuinfo while the code is being executed.
My computers have Intel Core i7 (two similar varieties of the processor when compared in single execution thread mode), and CPU frequency around 3.5 GHz, which is boosted under load to about 4 GHz.
On 2013-03-26 04:16, ZStefan wrote:
>> I would be happy to test a better example.
> Here is a better example. On opensuse 12.2, it takes 2 s. On opensuse
> 12.3, it takes 4.4 s.
cer@Telcontar:~/bin/test> time a.out
z=34 a0=12 a1=34
real 0m0.001s
user 0m0.001s
sys 0m0.000s
cer@Telcontar:~/bin/test> time a.out
z=34 a0=12 a1=34
real 0m0.001s
user 0m0.000s
sys 0m0.000s
cer@Telcontar:~/bin/test> time a.out
z=34 a0=12 a1=34
real 0m0.001s
user 0m0.000s
sys 0m0.000s
cer@Telcontar:~/bin/test>
cpuinfo:
model name : Intel(R) Core(TM)2 Quad CPU Q9550 @ 2.83GHz
Zero seconds run time on 12.1. Not enough time to make measurements. How can it
take 2…4 seconds on your system? :-?
–
Cheers / Saludos,
Carlos E. R.
(from 12.1 x86_64 “Asparagus” at Telcontar)
A bit of speculation.
Since most of the time is spent in the cos and exp calculations if it
was really faster on 12.2 it tends to point to glibc as the problem as
this contains the libm.
–
PC: oS 12.3 x86_64 | i7-2600@3.40GHz | 16GB | KDE 4.10.0 | GTX 650 Ti
ThinkPad E320: oS 12.3 x86_64 | i3@2.30GHz | 8GB | KDE 4.10.0 | HD 3000
HannsBook: oS 12.3 x86_64 | SU4100@1.3GHz | 2GB | KDE 4.10.0 | GMA4500
On 26.03.2013 17:23, Martin Helm wrote:
> A bit of speculation.
> Since most of the time is spent in the cos and exp calculations if it
> was really faster on 12.2 it tends to point to glibc as the problem as
> this contains the libm.
>
Hm, maybe I am too tired now, but I do not see a double cos in cmath
only a float and a long double cos ???
On 26.03.2013 17:32, Martin Helm wrote:
> On 26.03.2013 17:23, Martin Helm wrote:
>> A bit of speculation.
>> Since most of the time is spent in the cos and exp calculations if it
>> was really faster on 12.2 it tends to point to glibc as the problem as
>> this contains the libm.
>>
> Hm, maybe I am too tired now, but I do not see a double cos in cmath
> only a float and a long double cos ???
>
That was a red herring, but I see the same program on a crappy old CPU
performing faster with 12.2 64-bit than on my i7 with 12.3 64-bit:
michaela@michaela-pc:~/scratch> time ./a.out
x = 0.998523
real 0m3.499s
user 0m3.495s
sys 0m0.001s
michaela@michaela-pc:~/scratch>
Intel(R) Core™2 Quad CPU Q8300 @ 2.50GHz
What’s going on here?
I did one more test with the ‘Code 1’ I described in [Post #9](https://forums.opensuse.org/english/other-forums/development/programming-scripting/484988-executable-running-slower-opensuse-12-3-a.html#post2540874) of this thread. I compiled it in a VM running openSUSE 11.3 64-bit, kernel 2.6.34.7-0.5-desktop, with g++ 4.5.0 20100604. This binary runs in that VM at 940 iterations per second. If I run the very same binary (no recompilation!) on openSUSE 12.3 64-bit (not a VM but a ‘real PC’) with the stock kernel, I get only 230 iterations per second (same as previously in Post #9). The only compiler flag I used is -O2, nothing else. The CPU is an Intel Xeon E3-1270v2 (same core as the i7 3770…)
So since this code was compiled with an older version of gcc (V. 4.5.0 instead of the one that ships with OS12.3, which is 4.7.x) I assume the problem lies within a library that was supplied with the 12.3 distro, rather than the compiler itself.
On 26.03.2013 18:46, trs123 wrote:
> So since this code was compiled with an older version of gcc (V. 4.5.0
> instead of the one that ships with OS12.3, which is 4.7.x) I assume the
> problem lies within a library that was supplied with the 12.3 distro,
> rather than the compiler itself.
It’s the libm. I copied over the libm from 12.2:
LD_PRELOAD=./libm-2.15.so time ./a.out
x = 0.998523
3.32user
0.00system
0:03.32elapsed
compared to
real 0m7.838s
user 0m7.828s
sys 0m0.002s
without the LD_PRELOAD using the 2.17 system library.
Yeay! I can confirm this! I copied a very old libm.so from openSUSE 11.3 (64-bit version) over the /lib64/libm-2.17.so that was provided with openSUSE 12.3. The speedup is tremendous:
Code 1: went from 230 i/s (see post #9) to 940 i/s (factor of 4.09 improvement!)
Code 2: went from 920 i/s (see post #9) to 2450 i/s (factor of 2.66 improvement!)
Glad it’s not my ‘buggy code’ as some posters mentioned
I hope there will be a corrected version supplied through the online update soon so everyone will benefit.
On 26.03.2013 20:16, trs123 wrote:
> Glad it’s not my ‘buggy code’ as some posters mentioned
>
> I hope there will be a corrected version supplied through the online
> update soon so everyone will benefit.
>
You should file a bug report about this problem, otherwise nobody will
fix it.
I can also do it later tonight.