public inbox for gcc-bugs@sourceware.org
* [Bug libgomp/97213] New: OpenMP "if" is dramatically slower than code-level "if" - why?
From: ttsiodras at gmail dot com @ 2020-09-26 13:50 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97213
            Bug ID: 97213
           Summary: OpenMP "if" is dramatically slower than code-level
                    "if" - why?
           Product: gcc
           Version: 10.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libgomp
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ttsiodras at gmail dot com
                CC: jakub at gcc dot gnu.org
  Target Milestone: ---
While trying to understand how OpenMP `task` works, I wrote this benchmark:
#include <omp.h>
#include <stdio.h>

long fib(int val)
{
    if (val < 2)
        return val;
    long total = 0;
    {
        #pragma omp task shared(total) if(val==45)
        total += fib(val-1);
        #pragma omp task shared(total) if(val==45)
        total += fib(val-2);
        #pragma omp taskwait
    }
    return total;
}

int main()
{
    #pragma omp parallel
    #pragma omp single
    {
        long res = fib(45);
        printf("fib(45)=%ld\n", res);
    }
}
It's a simple Fibonacci calculation that only spawns two tasks, at the
top level of fib(45): one task computes fib(44), the other computes
fib(43), and the results are added and returned.
I know there's a chance of a race on the "+=" to total, but that's not
the point here. Here's the performance on my i5 laptop:
$ gcc -O2 with_openmp_if.c -fopenmp
$ time ./a.out
fib(45)=1134903170
real 1m4.244s
user 1m44.696s
sys 0m0.010s
64 seconds... Now compare this to the same code, but with the "if" moved from
the OpenMP level to the user-code level, i.e. this change in "fib":
long fib(int val)
{
    if (val < 2)
        return val;
    long total = 0;
    {
        if (val == 45) {
            #pragma omp task shared(total)
            total += fib(val-1);
            #pragma omp task shared(total)
            total += fib(val-2);
            #pragma omp taskwait
        } else
            return fib(val-1) + fib(val-2);
    }
    return total;
}
$ gcc -O2 with_normal_if.c -fopenmp
$ time ./a.out
fib(45)=1134903170
real 0m8.585s
user 0m14.021s
sys 0m0.011s
We go from 64 seconds down to 8.5 seconds.
Why?
What does the OpenMP-level "if" do so differently that it makes the program
roughly 7.5 times slower?
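
For anyone rerunning the benchmark who wants to avoid the race on "total"
acknowledged above, here is a minimal sketch that gives each task its own
result variable. The names fib_serial and fib_two_tasks are hypothetical and
not part of the original report; the structure otherwise mirrors the
code-level "if" variant. It has not been timed, so no performance claim is
attached to it.

```c
/* Plain recursive Fibonacci with no OpenMP directives at all. */
long fib_serial(int val)
{
    return val < 2 ? val : fib_serial(val - 1) + fib_serial(val - 2);
}

/* Spawns exactly two tasks at the top level, as in the report, but each
   task writes to its own variable, so there is no race on a shared sum. */
long fib_two_tasks(int val)
{
    if (val < 2)
        return val;
    long a = 0, b = 0;
    #pragma omp task shared(a)
    a = fib_serial(val - 1);
    #pragma omp task shared(b)
    b = fib_serial(val - 2);
    /* Both child tasks must finish before a and b are read. */
    #pragma omp taskwait
    return a + b;
}
```

To get two threads working, fib_two_tasks would be called from inside a
"#pragma omp parallel" / "#pragma omp single" region, as in the original
main(). Built without -fopenmp, the pragmas are ignored and the function
still returns the same value.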
* [Bug libgomp/97213] OpenMP "if" is dramatically slower than code-level "if" - why?
From: jakub at gcc dot gnu.org @ 2020-09-26 14:12 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97213
--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Even with if(false) the implementation has to create a new data environment
etc.
if(false) just means the task will be included, i.e. the generating task will
only continue when the included task finishes and the generating thread will
execute the task.
You'd also need to add the mergeable clause to let the implementation, for
if(false), pretend there wasn't a task directive at all, but that is just an
optimization option that GCC doesn't use right now (it would require basically
copying the region once again).
Also, there is the overhead of the taskwait that you perform unconditionally at
all levels.
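
The semantics described in this comment can be sketched as follows. This is
an illustration only, with a hypothetical function name
(sum_with_undeferred_tasks) that does not appear in the report. Because the
"if" clause is false, the generating task is suspended until each task body
has finished, so the shared sum needs neither a taskwait nor any other
synchronization. The task directive is still encountered on every iteration,
which is where the per-task overhead discussed above comes from. The
mergeable clause is included only to show the syntax; per the comment above,
GCC does not currently exploit it.

```c
/* Adds 1..n, running each addition as a task whose "if" clause is false. */
long sum_with_undeferred_tasks(int n)
{
    long total = 0;
    for (int i = 1; i <= n; i++) {
        /* if(0): execution of the task is not deferred. The loop resumes
           only after the task body has completed, so there is no race on
           total even though it is shared. */
        #pragma omp task shared(total) firstprivate(i) if(0) mergeable
        total += i;
    }
    return total;
}
```

Calling sum_with_undeferred_tasks(100) returns 5050 whether or not the code
is built with -fopenmp.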
* [Bug libgomp/97213] OpenMP "if" is dramatically slower than code-level "if" - why?
From: ttsiodras at gmail dot com @ 2020-09-26 14:58 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97213
--- Comment #2 from Thanassis Tsiodras <ttsiodras at gmail dot com> ---
I see. I was not aware of "mergeable", TBH - thanks for pointing it out (it led
me to reading about "data environments").
Thanks, Jakub.
* [Bug libgomp/97213] OpenMP "if" is dramatically slower than code-level "if" - why?
From: jakub at gcc dot gnu.org @ 2020-09-26 15:01 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97213
--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Note, I think a significant part of the speedup comes from tail recursion
optimization, which will be prevented even with a mergeable task. Computing
Fibonacci this way is not efficient.
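
To illustrate the efficiency remark: the doubly recursive definition takes
exponential time, while a simple loop computes the same value in a number of
steps linear in n. The sketch below is one such loop; the name fib_iterative
is hypothetical and not from the thread, and it is offered as a comparison
point rather than as a replacement for the tasking benchmark.

```c
/* Iterative Fibonacci: keeps only the last two values, O(n) additions. */
long fib_iterative(int n)
{
    long a = 0;  /* fib(i)     */
    long b = 1;  /* fib(i + 1) */
    for (int i = 0; i < n; i++) {
        long next = a + b;
        a = b;
        b = next;
    }
    return a;
}
```

fib_iterative(45) returns 1134903170, matching the output shown in the
original report.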
* [Bug libgomp/97213] OpenMP "if" is dramatically slower than code-level "if" - why?
From: ttsiodras at gmail dot com @ 2020-09-26 15:07 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97213
Thanassis Tsiodras <ttsiodras at gmail dot com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|UNCONFIRMED                 |RESOLVED
--- Comment #4 from Thanassis Tsiodras <ttsiodras at gmail dot com> ---
Marking as resolved.