* Confusing optimization
From: Luca Béla Palkovics @ 2010-05-09 19:25 UTC
To: gcc-help
void a()
{
    // ... do my stuff
}
void b()
{
    // ... do my stuff
}
int main(int argc, char *argv[])
{
    a();
    b();
}
>g++ main.cpp -O3
>./a.out
a takes 125ms
b takes 340ms
Now the same, but separated:
int main(int argc, char *argv[])
{
    a();
}
>g++ main.cpp -O3
>./a.out
a takes 85ms
Is this normal? b has nothing to do with a, so why does a get slower?
(b is also faster without a.)
Luca.
* Re: Confusing optimization
From: Ian Lance Taylor @ 2010-05-10 4:49 UTC
To: Luca Béla Palkovics; +Cc: gcc-help
"Luca Béla Palkovics" <luca.bela.palkovics@gmail.com> writes:
> Is this normal? b has nothing to do with a, so why does a get slower?
> (b is also faster without a.)
There are a number of possibilities. It's hard to know what is
happening without an exact test case. You also neglected to say what
platform you are running on.
Some possibilities are:
1) Measurement error. Surprisingly often people are not measuring
what they think they are measuring, and you didn't provide any
details about how you got your timings.
2) Instruction cache effects, if a() and b() call other functions.
When both are linked together, those other functions will be at
different addresses, and whether they are contiguous may change,
all affecting the instruction cache.
3) Exact alignment of loop starts may shift when both are linked
together, affecting the processor's branch optimizers if it has
any. Similarly, the exact alignment of labels may shift. You can
control these using gcc options like -falign-functions,
-falign-jumps, -falign-labels, -falign-loops.
There are other, less likely, possibilities.
Ian
* Re: Re: Confusing optimization
From: Luca Béla Palkovics @ 2010-05-10 14:27 UTC
To: Ian Lance Taylor; +Cc: gcc-help
On Sunday, 2010-05-09 at 21:49 -0700, Ian Lance Taylor wrote:
> Some possibilities are:
>
> 1) Measurement error. Surprisingly often people are not measuring
> what they think they are measuring, and you didn't provide any
> details about how you got your timings.
The function is extern "C" and I measure the time from outside it:
long start=GetTime(); //own function using gettimeofday (returns ms)
A();
long time=GetTime()-start;
//Output time..
So it should be correct.
> 2) Instruction cache effects, if a() and b() call other functions.
> When both are linked together, those other functions will be at
> different addresses, and whether they are contiguous may change,
> all affecting the instruction cache.
> 3) Exact alignment of loop starts may shift when both are linked
> together, affecting the processor's branch optimizers if it has
> any. Similarly, the exact alignment of labels may shift. You can
> control these using gcc options like -falign-functions,
> -falign-jumps, -falign-labels, -falign-loops.
Well, I have now looked at the assembly.
The platform is Ubuntu 10.04 with g++ 4.4.3.
In each pair of lines below, the first line is function "a" alone (125ms)
and the second line is function "a" with "b" (130ms).
...
.text:00000000004023BA jmp short loc_402425
.text:00000000004023BA jmp short loc_40241F
...
.text:0000000000402425 cmp [rsp+298h+var_118], 3
.text:000000000040241F cmp [rsp+298h+var_118], 3
.text:000000000040242D jz short loc_40243E
.text:0000000000402427 jz short loc_402438
...
LOOP START
...
.text:000000000040243E cmp [rsp+298h+var_110], 0FFFFFEh
.text:0000000000402438 cmp [rsp+298h+var_110], 0FFFFFEh
.text:000000000040244A setle al
.text:0000000000402444 setle al
.text:000000000040244D test al, al
.text:0000000000402447 test al, al
.text:000000000040244F jnz loc_4023BC
.text:0000000000402449 jnz loc_4023BC
...
.text:00000000004023C3 cmp [rsp+298h+var_98], 3
.text:00000000004023BC cmp [rsp+298h+var_58], 3
.text:00000000004023CA jz short loc_4023D9
.text:00000000004023C4 jz short loc_4023D3
...
.text:00000000004023D9 mov rax, [rsp+298h+var_110]
.text:00000000004023D3 mov rax, [rsp+298h+var_110]
.text:00000000004023E1 add [rsp+298h+var_90], rax
.text:00000000004023DB add [rsp+298h+var_50], rax
.text:00000000004023E9 cmp [rsp+298h+var_D8], 3
.text:00000000004023E3 cmp [rsp+298h+var_98], 3
.text:00000000004023F1 jz short loc_4023FF
.text:00000000004023EB jz short loc_4023F9
...
.text:00000000004023FF inc [rsp+298h+var_D0]
.text:00000000004023F9 inc [rsp+298h+var_90]
.text:0000000000402407 cmp [rsp+298h+var_118], 3
.text:0000000000402401 cmp [rsp+298h+var_118], 3
.text:000000000040240F jz short loc_40241D
.text:0000000000402409 jz short loc_402417
...
.text:000000000040241D inc [rsp+298h+var_110]
.text:0000000000402417 inc [rsp+298h+var_110]
.text:0000000000402425 cmp [rsp+298h+var_118], 3
.text:000000000040241F cmp [rsp+298h+var_118], 3
.text:000000000040242D jz short loc_40243E
.text:0000000000402427 jz short loc_402438
...
LOOP END
...
C++ code:
for(i=0L;i<0xFFFFFFL;i++)
{
Temp+=i;
Test++;
}
The operators =, ++, += and < are overloaded functions.
So, well, it must be the cache or the alignment.
Maybe I should flip my ifs; it is a jz every time. Hmm.
I hope I answered this mail in the correct way.
This is my third time using a mailing list. :P
Luca.
* Re: Confusing optimization
From: Ian Lance Taylor @ 2010-05-10 16:41 UTC
To: Luca Béla Palkovics; +Cc: gcc-help
"Luca Béla Palkovics" <luca.bela.palkovics@gmail.com> writes:
> On Sunday, 2010-05-09 at 21:49 -0700, Ian Lance Taylor wrote:
>> Some possibilities are:
>>
>> 1) Measurement error. Surprisingly often people are not measuring
>> what they think they are measuring, and you didn't provide any
>> details about how you got your timings.
> The function is extern "C" and I measure the time from outside it:
>
> long start=GetTime(); //own function using gettimeofday (returns ms)
> A();
> long time=GetTime()-start;
> //Output time..
>
> So it should be correct.
Using gettimeofday means that the measurements are highly sensitive to
any other load on the machine. To get measurements this way, you must
at the very least run the function many times in a loop, and you need
an outer loop to test first one function then the other. Then you
need to average all the results.
If you are interested in how much CPU time the functions require,
using the clock function will be more accurate. However, clock can be
misleading if the functions make any system calls.
Ian
* Re: Confusing optimization
From: Luca Béla Palkovics @ 2010-05-12 14:24 UTC
To: Ian Lance Taylor; +Cc: gcc-help
On Monday, 2010-05-10 at 09:41 -0700, Ian Lance Taylor wrote:
> If you are interested in how much CPU time the functions require,
> using the clock function will be more accurate. However, clock can be
> misleading if the functions make any system calls.
Okay, I will check out the clock function.
I have rewritten a lot of my code and the problem disappeared.
Thanks a lot for the help and information.
Luca.