I have an ugly performance issue on a 64-bit system: the 32-bit binary runs about 2x faster than the “native” 64-bit version on the same system.
I was doing some benchmarks of my own and this puzzles me.
The system is a fully updated 11.4.
I was initially testing two Java programs on different machines, an 11.3 32-bit on a Phenom II X4 2.6 GHz and my old notebook with 11.4, and found the same performance on both.
Then I used a couple of mini-benchmark test programs in C, just a very inefficient but CPU-intensive prime number counter.
The exact same source in pure ASCII C, compiled without any add-on libraries, produces with -m32 a program nearly 2x faster than the “native” 64-bit build.
I first copied the 32-bit version from my 11.3 machine, then built it on the 11.4 machine; the results are the same. My notebook has a Turion 64 X2 2 GHz CPU, which is way older, being a K8.
Both machines use the standard -desktop kernel of their corresponding 11.x versions.
A BIG gcc bug?
A kernel bug?
A library bug? Even statically compiled versions show the same scale factor.
If this is general, the implication is that 64-bit systems out there might be outperformed by the equivalent 32-bit distro.
Any ideas?
#include <stdio.h>
#include <stdlib.h>

/* Trial division: find the smallest divisor > 1; num is prime if that divisor is num itself. */
static inline int is_prime2(long num){
    long den=1;
    do{
        den++;
    }while( num%den != 0);
    if(den == num){
        return 1;
    }else{
        return 0;
    }
}

int main(int argc,char* argv[]){
    long i=0;
    int j=0; /* number of primes found */
    for(i=2;i<=200000;i++){
        if (is_prime2(i) ){
            //printf("%li / ",i);
            j++;
        }
    }
    printf("Num Primos: %i\n",j);
    return 0;
}
Then compiled with:
gcc -Wall -m32 -O2 prim.c -o prim32
gcc -Wall -m64 -O2 prim.c -o prim64
And tested with:
notebook:~/src>time ./prim32 ; time ./prim64
Num Primos: 17984
real 0m36.638s
user 0m36.600s
sys 0m0.004s
Num Primos: 17984
real 1m4.446s
user 1m4.412s
sys 0m0.002s
If anyone can reproduce these results, please let me know.
PS: I know that 64-bit systems have bigger long, double, etc. But this means more memory use; it should not affect speed per se.