public inbox for gcc-bugs@sourceware.org
* [Bug libgomp/97213] New: OpenMP "if" is dramatically slower than code-level "if" - why?
@ 2020-09-26 13:50 ttsiodras at gmail dot com
  2020-09-26 14:12 ` [Bug libgomp/97213] " jakub at gcc dot gnu.org
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: ttsiodras at gmail dot com @ 2020-09-26 13:50 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97213

            Bug ID: 97213
           Summary: OpenMP "if" is dramatically slower than code-level
                    "if" - why?
           Product: gcc
           Version: 10.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libgomp
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ttsiodras at gmail dot com
                CC: jakub at gcc dot gnu.org
  Target Milestone: ---

While trying to understand how OpenMP `task` works, I wrote this benchmark:

    #include <omp.h>
    #include <stdio.h>

    long fib(int val)
    {
        if (val < 2)
            return val;

        long total = 0;
        {
            #pragma omp task shared(total) if(val==45)
            total += fib(val-1);
            #pragma omp task shared(total) if(val==45)
            total += fib(val-2);
            #pragma omp taskwait
        }
        return total;
    }

    int main()
    {
        #pragma omp parallel
        #pragma omp single
        {
            long res = fib(45);
            printf("fib(45)=%ld\n", res);
        }
    }

It's a simple Fibonacci calculation that only spawns two tasks, at the
top level of fib(45): basically, one thread does fib(44), the other does
fib(43), and the results are added and returned.
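
To put a number on how often the two pragmas are reached in fib(45), as
opposed to how often the "if" clause is actually true, here is a minimal
sketch. The helper name `nonleaf_calls` is mine and is not part of the
benchmark; it only counts the calls with val >= 2, each of which reaches both
`task` pragmas.

```c
/* Hypothetical helper, not part of the benchmark.
 * Counts the fib(val) calls with val >= 2 made while computing fib(n),
 * i.e. the calls that get past the early return and reach the pragmas.
 * Recurrence: N(0) = N(1) = 0, N(n) = N(n-1) + N(n-2) + 1. */
long long nonleaf_calls(int n)
{
    long long prev = 0; /* N(i-2) */
    long long cur = 0;  /* N(i-1) */

    for (int i = 2; i <= n; i++) {
        long long next = cur + prev + 1;
        prev = cur;
        cur = next;
    }
    return cur;
}
```

By this count, `nonleaf_calls(45)` is 1836311902, so the two pragmas together
are reached about 3.67 billion times, while `val==45` holds in only one of
those calls.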

I know there's a potential race on the "+=" into `total`, but that's not
the point here. Here's the performance on my i5 laptop:

    $ gcc -O2 with_openmp_if.c -fopenmp

    $ time ./a.out 
    fib(45)=1134903170

    real    1m4.244s
    user    1m44.696s
    sys     0m0.010s

That's 64 seconds. Now compare this to the same code, but with the "if" moved
from the OpenMP level to the user-code level, i.e. with this change in "fib":

    long fib(int val)
    {
        if (val < 2)
            return val;

        long total = 0;
        {
            if (val == 45) {
                #pragma omp task shared(total)
                total += fib(val-1);
                #pragma omp task shared(total)
                total += fib(val-2);
                #pragma omp taskwait
            } else
                return fib(val-1) + fib(val-2);
        }
        return total;
    }

    $ gcc -O2 with_normal_if.c -fopenmp

    $ time ./a.out 
    fib(45)=1134903170

    real    0m8.585s
    user    0m14.021s
    sys     0m0.011s

We go from 64 seconds down to 8.5 seconds.

Why? 

What does the OpenMP-level "if" do so differently that it makes the program
roughly 7.5 times slower?
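
As an aside on the race mentioned above: here is a sketch of how the
code-level "if" version could avoid it, assuming each task writes to its own
variable instead of sharing `total`. The names `fib_serial` and `fib_tasks`
and the `top` parameter are hypothetical, introduced only for this
illustration, and I have not timed this variant.

```c
/* Plain recursion with no OpenMP constructs at all. */
long fib_serial(int val)
{
    if (val < 2)
        return val;
    return fib_serial(val - 1) + fib_serial(val - 2);
}

/* Spawns two tasks only when val == top (top plays the role of the
 * hard-coded 45 in the benchmark); every other level recurses serially. */
long fib_tasks(int val, int top)
{
    if (val < 2)
        return val;
    if (val != top)
        return fib_serial(val);

    long a = 0, b = 0;
    /* Each task writes a different variable, so there is no race. */
    #pragma omp task shared(a)
    a = fib_tasks(val - 1, top);
    #pragma omp task shared(b)
    b = fib_tasks(val - 2, top);
    /* Wait for both child tasks before reading a and b. */
    #pragma omp taskwait
    return a + b;
}
```

It would be called as `fib_tasks(45, 45)` from inside the same `parallel` /
`single` region as before.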

