Executable's speed jumping between low and high in 13.1

The compilation command I use is


g++ -Wall -O2 code.C

But the bistable behavior is observed with any executable, whether compiled with other options or with none.

On 2014-01-19 13:25, flymail wrote:
> On 2014-01-18, Carlos E. R. <> wrote:
>> No, no, I mean the line to compile your code.
>
> Doesn’t ZStefan’s post (#12) say…
>
> g++ -Wall -O2 code.C
>
> … or am I missing something?

Much later, on another post than the code. And as something he is
trying, IIRC. I don’t know if that’s how we should try it as well.

[…]

Ok, built it and tried. Run time for each pass was around 4.3 seconds, with a
few going to 4.4. So, no big variance.

cpuinfo:

model name : Pentium(R) Dual-Core CPU T4300 @ 2.10GHz


Cheers / Saludos,

Carlos E. R.
(from 12.3 x86_64 “Dartmouth” at Telcontar)

On 2014-01-18 09:16, ZStefan wrote:
> If I know correctly, microcode is some sort of helper driver of the
> processor. It may execute the compiled code in different ways if it
> is defective.
>
You could check whether ucode-intel or ucode-amd is installed on your system.
If not, install the appropriate one; if one is installed, uninstall it and
see whether it makes a difference.
That is a shot in the dark, of course.


PC: oS 13.1 x86_64 | i7-2600@3.40GHz | 16GB | KDE 4.11 | GTX 650 Ti
ThinkPad E320: oS 13.1 x86_64 | i3@2.30GHz | 8GB | KDE 4.11 | HD 3000
HTPC: oS 13.1 x86_64 | Celeron@1.8GHz | 2GB | Gnome 3.10 | HD 2500

Thank you for your thoughts. I agree with your conclusions as much as you do.

When I was installing opensuse 13.1 on a powerful computer in December (Intel i7, 16 GB RAM), I observed the same bistable behavior with a math computation program. However, after a series of updates that I did in the week after installation, the speed stabilized. “Stabilized” means the speed is mostly normal, but occasionally and uncontrollably it slows down, perhaps once a week.

User-induced external influences like loading the CPU, tasking the OS, or hot-changing hardware have no effect. I monitor the CPU frequency, and it does not decrease during the slow phase of the bistable behavior.

For these reasons, I think that all computers running the current opensuse with its kernel and other packages are prone to the slowdown. The fact that it was not observed on the computers that contributors to this thread tested does not convince me: I think there is some chance that the slowdown will show up on some computers with Intel CPUs. I would suggest testing the speed right after a reboot or a major update.

What I am sure of is that there was nothing like this behavior in previous opensuse versions. I haven’t seen bistable behavior with any other OS or computer.

Well, maybe the next kernel will change things, but I made an attempt with a newer kernel and saw no change.

My best guess is that there is something wrong inside the CPU or with CPU-kernel interaction.

I have ucode-intel installed. Will delete it, and see what happens. Thanks for the advice.

I have found several benchmarking packages from opensuse, will try them. I haven’t dealt with benchmarking software before.

I was able to check the effect of the microcode on the Intel i7 computer. The weak computer is not with me today.

The nominal frequency of the processor is 3600 MHz and a slight overclocking is allowed in BIOS.

I removed the ucode-intel package which was installed by default. The frequency of the processor got fixed at 1360 MHz. The executable ran slowly and the load couldn’t raise the frequency.

When I re-installed the package, the frequency became dynamic, rising to 3800 MHz. The executable ran normally.

A full shutdown, perhaps with power off, is required for a change to the microcode package to take full effect - this is what I observed.

Earlier, as the executable ran slowly, I suspected the frequency and monitored it, but didn’t see anything wrong. It was fixed to 3800 MHz under load.

But interesting is the ratio of speeds. The ratio of speeds, close to 2.8, is equal to the ratio of frequencies 3800/1360 with accuracy ± 3%.

So, the ucode-intel package may have something to do with speed jumping. I will get more information as I do the same on the weak computer. Maybe I can install an older version of ucode-intel in that computer. It may be that the speed is jumping because the CPU frequency is jumping, although my earlier observation shows fixed frequency.

On 2014-01-19 21:36, ZStefan wrote:

> But interesting is the ratio of speeds. The ratio of speeds, close to
> 2.8, is equal to the ratio of frequencies 3800/1360 with accuracy ± 3%.

Very interesting.

> So, the ucode-intel package may have something to do with speed jumping.
> I will get more information as I do the same on the weak computer. Maybe
> I can install an older version of ucode-intel in that computer. It may
> be that the speed is jumping because the CPU frequency is jumping,
> although my earlier observation shows fixed frequency.

You are getting somewhere indeed.


Cheers / Saludos,

Carlos E. R.
(from 12.3 x86_64 “Dartmouth” at Telcontar)

On 2014-01-18, ZStefan <ZStefan@no-mx.forums.opensuse.org> wrote:
>
> flymail;2616993 Wrote:
>>
>> I don’t understand - what is `defective microcode’?
>>
>
> If I know correctly, microcode is some sort of helper driver of the
> processor. It may execute the compiled code in different ways if it is
> defective.

Sounds like a dodgy CPU firmware workaround. Have you considered a manual BIOS update?

The BIOS is the latest.

I checked the effect of removing and downgrading the ucode-intel package on the weak computer. The bistable behavior is the same. The ratio of speeds does not match any ratio of frequencies. I observe sometimes low speed, sometimes high speed, and sometimes a jump from high speed to low speed during a run. This is independent of ucode-intel, whether present, absent, or downgraded. The frequency of the loaded core is fixed at the highest value.

While downgrading, I found out that the package ucode-intel (earlier called microcode) contains only updates to the microcode. The same is actually said in Yast’s description of the packages. While installing an older version of microcode, the rpm installer complained that the old package’s files conflict with the kernel. (I forced the installation.) So, the microcode files likely also exist in the kernel package.

The reason of bistable speed is still unknown. Currently, the most likely culprit is the kernel. There was no such behavior with previous versions of opensuse.

The most influential factor for obtaining high-speed running is a long shutdown before booting. But the effect does not seem to be thermal.

On 2014-01-20 19:36, ZStefan wrote:

> The reason of bistable speed is still unknown. Currently, the most
> likely culprit is the kernel. There was no such behavior with previous
> versions of opensuse.
>
> The most influential factor for obtaining high-speed running is a long
> shutdown before booting. But the effect does not seem to be thermal.

At this point I would consider a Bugzilla. Post your sample code, test
runs, what happens removing the microcode update, that it works on one
machine but not on another (and exact cpu model)… Those people should
know best what the kernel contains.


Cheers / Saludos,

Carlos E. R.
(from 12.3 x86_64 “Dartmouth” at Telcontar)

Report to bugzilla submitted, # 859561

Observed the same behavior with kernel 3.13.0-rc8-3.gf011587

By the way, it looks like the program in Post # 10 still uses libm in some way. Here is what KDE System Guard is showing about the process:


Library Usage

The memory usage of a process is found by adding up the memory usage of each of its libraries, plus the process's own heap, stack and any other mappings. 

Private

60 KB   [heap]
40 KB   /usr/lib64/libstdc++.so.6.0.18
24 KB   /lib64/libc-2.18.so
16 KB   /home/user/a.out
12 KB   [stack]
8 KB    /lib64/libgcc_s.so.1
8 KB    /lib64/libm-2.18.so
8 KB    /lib64/ld-2.18.so

Shared

480 KB  /usr/lib64/libstdc++.so.6.0.18
412 KB  /lib64/libc-2.18.so
108 KB  /lib64/ld-2.18.so
64 KB   /lib64/libm-2.18.so
20 KB   /lib64/libgcc_s.so.1
4 KB    [vdso]

On 2014-01-25, ZStefan <ZStefan@no-mx.forums.opensuse.org> wrote:
>
> Observed the same behavior with kernel 3.13.0-rc8-3.gf011587
>
> By the way, it looks like the program in Post # 10 still uses libm in
> some way. Here is what KDE System Guard is showing about the process:
>

Hmm. I’m not sure it’s actually using it. Consider the following `do nothing’ program (a.cpp):


int main(int argc, char* argv[]) {
  return 0;
}

Now:


sh-4.2$ g++ a.cpp
sh-4.2$ ./a.out
sh-4.2$ ldd a.out
linux-vdso.so.1 (0x00007fffc17fe000)
libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x00007f7680f2d000)
libm.so.6 => /lib64/libm.so.6 (0x00007f7680c2a000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f7680a13000)
libc.so.6 => /lib64/libc.so.6 (0x00007f7680664000)
/lib64/ld-linux-x86-64.so.2 (0x00007f7681235000)
sh-4.2$

Now convert from C++ to C:


sh-4.2$ mv a.cpp a.c
sh-4.2$ gcc a.c
sh-4.2$ ./a.out
sh-4.2$ ldd a.out
linux-vdso.so.1 (0x00007fffa31b5000)
libc.so.6 => /lib64/libc.so.6 (0x00007f3fca218000)
/lib64/ld-linux-x86-64.so.2 (0x00007f3fca5c7000)
sh-4.2$

From this I infer that g++ (not gcc) includes libm as a library dependency even if it isn’t explicitly included or used
in code.

I will re-program the code from C++ to C and see how the speed behaves. Will take a few days.

On 2014-01-28, ZStefan <ZStefan@no-mx.forums.opensuse.org> wrote:
> I will re-program the code from C++ to C and see how the speed behaves.
> Will take a few days.

Of course, you’re free to do so, but I don’t think it will make any difference because you’ll end up using libm in C
anyway. My guess in C++ is that libstdc++ pulls in libm and so g++ will link to libm whether or not it’s needed by your
C++ code. Comparing the assembler outputs for the C and C++ versions of the `do nothing’ program above shows identical
machine instructions, and I strongly suspect the same will be the case after you translate your C++ to C.

I translated the program into C, compiled and ran, and the same unstable behavior is observed.

The executable does not use libm.

I also ran a small benchmark called himeno, observed the same.

Here’s the code:


#include <stdio.h> 
#include <stdlib.h>
#include <time.h>

//---------------------------------------------------------------------------
static const volatile double COS2PI5  =  0.30901699437494745;
static const volatile double SIN2PI5  =  0.95105651629515353;
static const volatile double COS4PI5  = -0.80901699437494734;
static const volatile double SIN4PI5  =  0.58778525229247325;

//---------------------------------------------------------------------------
void fft5(double* ro, double* io, double* ri, double* ii) {
double sr1, sr2, si1, si2, dr1, dr2, di1, di2;
double ar1, ar2, ai1, ai2, br1, br2, bi1, bi2;

sr1 = ri[1] + ri[4]; si1 = ii[1] + ii[4];
dr1 = ri[1] - ri[4]; di1 = ii[1] - ii[4];
sr2 = ri[2] + ri[3]; si2 = ii[2] + ii[3];
dr2 = ri[2] - ri[3]; di2 = ii[2] - ii[3];

ar1 = ri[0] + sr1*COS2PI5 + sr2*COS4PI5;
ai1 = ii[0] + si1*COS2PI5 + si2*COS4PI5;
ar2 = ri[0] + sr1*COS4PI5 + sr2*COS2PI5;
ai2 = ii[0] + si1*COS4PI5 + si2*COS2PI5;

br1 = dr1*SIN2PI5 + dr2*SIN4PI5;
bi1 = di1*SIN2PI5 + di2*SIN4PI5;
br2 = dr1*SIN4PI5 - dr2*SIN2PI5;
bi2 = di1*SIN4PI5 - di2*SIN2PI5;

ro[0] = ri[0] + sr1 + sr2; io[0] = ii[0] + si1 + si2;
ro[1] = ar1 + bi1;         io[1] = ai1 - br1;
ro[2] = ar2 + bi2;         io[2] = ai2 - br2;
ro[3] = ar2 - bi2;         io[3] = ai2 + br2;
ro[4] = ar1 - bi1;         io[4] = ai1 + br1;
}


//---------------------------------------------------------------------------
int main(int argc, char* argv[])
{
int i, j, n, N, M;
M = 50;
N = 50000000;
n = 5;
clock_t t0, t1;

double* ri = (double*)(malloc(sizeof(double) * n));
double* ii = (double*)(malloc(sizeof(double) * n));
double* ro = (double*)(malloc(sizeof(double) * n));
double* io = (double*)(malloc(sizeof(double) * n));

for (i=0; i<n; i++) {ri[i]=(double)i; ii[i]=(double)(n-i);}

fft5(ro, io, ri, ii);

printf("Result: \n");
for (i = 0; i<n; i++) 
 {
  printf("%5d:   (%12.8lf,  %12.8lf) \n", i, ro[i], io[i]);
 }

for (j = 0; j<M; j++) {
 t0 = clock();
 for (i = N; i; i--) {fft5(ro, io, ri, ii);}
 t1 = clock();
 printf("j: %4d   Number of clock cycles = %12d\n", j, (int)(t1-t0));
}

free(ri); free(ii); free(ro); free(io);
return 0;  
}

On 2014-01-29, ZStefan <ZStefan@no-mx.forums.opensuse.org> wrote:
> I translated the program into C, compiled and ran, and the same unstable
> behavior is observed.
>
> The executable does not use libm.

Ahh. I thought you were previously referring to your program (which calculates cosines and sines) not mine - I already
had a C version of my program! But your experiment has me thinking - do you see the same behaviour when you compile as a
32-bit executable (e.g. gcc -m32 code.c)? IIRC, gcc compiles 64-bit executables using SIMD instructions for many
mathematical operations whereas 32 bit compiles with bog standard FPU instructions. Comparing the two may therefore help
identify the problem area.

I suppose the definitive test to exclude compilation issues entirely is to rewrite the mathematical function in
inline assembler, because that way we bypass the compiler. However, it’s at least 100 lines of assembler, whether you use
SIMD or FPU instructions. My guess is it’s not worth it, because you also say…

On 2014-01-29, ZStefan <ZStefan@no-mx.forums.opensuse.org> wrote:
> I also ran a small benchmark called himeno, observed the same.

…which shows a precompiled benchmark program suffers a similar fate. Again, I think the problem is either the machine
(unlikely if it’s fine with other openSUSE versions) or its interaction with the kernel. Another possibility may be
unwanted effects of openSUSE patches. To confirm the kernel is the problem, I’d run the same benchmarks on another distro
with the same kernel version (3.11) such as Mint Petra.

Thanks for the advice.

I compiled with -m32. The running executable shows the same bistable behavior. Overall, it is about 20% slower than the 64-bit executable.