public inbox for gcc-bugs@sourceware.org
* [Bug c++/49575] New: OpenMP has a problem with -funroll-loops
@ 2011-06-29  9:46 sailorweb2 at hotmail dot com
  2011-06-29 18:41 ` [Bug c++/49575] " jakub at gcc dot gnu.org
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: sailorweb2 at hotmail dot com @ 2011-06-29  9:46 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49575

           Summary: OpenMP has a problem with -funroll-loops
           Product: gcc
           Version: 4.5.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: sailorweb2@hotmail.com


The -funroll-loops optimisation option does not work with OpenMP in some cases.
An example is included below. It was compiled with the options

g++ -g -O2 -funroll-loops -fomit-frame-pointer -march=native -fopenmp

On a machine with a single 4-core Intel CPU running Kubuntu 11.04, the
following program compiled with OpenMP is around 20 times slower than the same
program compiled without OpenMP, because the -funroll-loops option does not
work on the OpenMP version. Defining the variable k as a constant, or moving
its declaration inside the for loop, solves the problem, but neither should be
necessary. I think -funroll-loops should work on the following program as it is.


#include <math.h>
#include <iostream>

using namespace std;

int main ()
{
  long double i=0;
  long double k=0.7;

  #pragma omp parallel for firstprivate(k) reduction(+:i)
  for(int t=1; t<300000000; t++){       
    for(int n=1; n<16; n++){
      i=i+pow(k,n);
    }
  }

  cout << i<<"\t";
  return 0;
}
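
For reference, the second workaround mentioned above (declaring k inside the
loop) might look roughly like the sketch below. The function name sum_powers
and its outer_iterations parameter are hypothetical; they are introduced only
so that the loop can be called with a small iteration count. This is an
illustration of the workaround, not a fix for the underlying issue.

```cpp
#include <cmath>

// Hypothetical helper: the same arithmetic as the testcase, but with k
// declared as a constant inside the loop body instead of being passed
// into the parallel region with firstprivate.
long double sum_powers(int outer_iterations)
{
  long double i = 0;

  #pragma omp parallel for reduction(+:i)
  for (int t = 1; t < outer_iterations; t++) {
    const long double k = 0.7;  // visible as a constant inside the region
    for (int n = 1; n < 16; n++) {
      i = i + std::pow(k, n);
    }
  }

  return i;
}
```

Calling sum_powers(300000000) corresponds to the loop bounds of the original
testcase. Without -fopenmp the pragma is ignored and the loop runs serially.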

Initial discussion on this topic was at
http://stackoverflow.com/questions/6506987/why-openmp-version-is-slower


* [Bug c++/49575] OpenMP has a problem with -funroll-loops
From: jakub at gcc dot gnu.org @ 2011-06-29 18:41 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49575

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |openmp
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-06-29 18:40:42 UTC ---
The problem isn't that the loop isn't unrolled; it is unrolled just fine.
The problem is that the ompexp pass runs too early: no CCP is performed before
it, so the compiler isn't able to figure out that k is constant in the loop
unless you explicitly say so or declare it in the body of the parallel region.
We currently expand omp before SSA, while CCP needs SSA, but I am not sure what
kinds of issues reordering the passes could cause.
An alternative to reshuffling the passes would be some kind of OpenMP IPA
optimization: if constant/gimple invariant values are stored into the omp_data
structure, we could modify both the caller not to store them and the callees to
replace all reads of them from that structure with the invariant.


* [Bug c++/49575] OpenMP has a problem with -funroll-loops
From: sailorweb2 at hotmail dot com @ 2011-06-29 19:38 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49575

--- Comment #2 from Duncan <sailorweb2 at hotmail dot com> 2011-06-29 19:38:00 UTC ---
I am new to OpenMP so I do not know the details, but as far as I know, variable
k is declared firstprivate, so each thread will have an independent local
copy. Why does it need to be defined as a constant or declared in the body of
the parallel region even though it is firstprivate? Won't firstprivate
automatically define a local copy of k in the body of the parallel region?


* [Bug c++/49575] OpenMP has a problem with -funroll-loops
From: jakub at gcc dot gnu.org @ 2011-06-29 19:50 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49575

--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-06-29 19:49:47 UTC ---
With firstprivate of course you get a copy of the variable, but it is still a
variable.  The *.omp_fn* function which is created from the #pragma omp
parallel region will load it from the parameter structure and use it:
  k = .omp_data_i->k;
...
  D.1646 = (double) k;
  i.33 = i + (long double) D.1646;
  temp.34 = pow (D.1646, 2.0e+0);
  i.36 = i.33 + (long double) temp.34;
  temp.37 = pow (D.1646, 3.0e+0);
  i.39 = i.36 + (long double) temp.37;
  temp.40 = pow (D.1646, 4.0e+0);
  i.42 = i.39 + (long double) temp.40;
  temp.43 = pow (D.1646, 5.0e+0);
...
But as the outlined function doesn't look at its caller, k is a variable as far
as it is concerned.  Without -fopenmp, ccp propagates the 0.7 value to the uses
of k and we end up with:
  i.32 = i + 6.999999999999999555910790149937383830547332763671875e-1;
  i.33 = i.32 + 4.89999999999999935607064571740920655429363250732421875e-1;
  i.34 = i.33 + 3.42999999999999916067139338338165543973445892333984375e-1;
  i.35 = i.34 + 2.400999999999999523492277830882812850177288055419921875e-1;
  i.36 = i.35 + 1.680699999999999416644413940957747399806976318359375e-1;
  i.37 = i.36 + 1.1764899999999996194066653742993366904556751251220703125e-1;
  i.38 = i.37 + 8.23542999999999636440151107308338396251201629638671875e-2;
...
because gcc saw pow (0.7, 2.0e+0), pow (0.7, 3.0e+0), etc. and constant folded
them.  So, there aren't any pow calls without -fopenmp (I wonder why nothing
folded even the additions with -ffast-math, though).  Now, if with -fopenmp we
do an interprocedural optimization so that .omp_data_i->k loads in the callee
are replaced with whatever constant value has been stored into that field
after constant propagation is performed in the caller, we can again constant
propagate the 0.7 down into the pow calls and constant fold them.
Of course the testcase doesn't make any sense this way, but constant
propagation across the parallel region boundary could help real-world OpenMP
programs too.
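
The situation described in this comment can be imitated by hand, roughly as in
the sketch below. It is only an illustration: the names omp_data_sketch,
omp_fn_sketch and omp_fn_after_propagation are hypothetical, the code is not
what GCC actually generates, and threading is left out entirely. The first
function reads k from a structure filled in by its caller, so when it is
compiled in isolation k is just a loaded value. The second shows what the
callee would look like once the constant has been propagated into it.

```cpp
#include <cmath>

// Hypothetical stand-in for the data block passed to the outlined function.
struct omp_data_sketch {
  long double k;
  long double i;
};

// Hypothetical stand-in for an outlined *.omp_fn* body: k is loaded from
// the structure, so nothing in this function says that it is 0.7.
void omp_fn_sketch(omp_data_sketch *data, int begin, int end)
{
  long double k = data->k;
  long double sum = 0;
  for (int t = begin; t < end; t++)
    for (int n = 1; n < 16; n++)
      sum += std::pow(k, n);
  data->i += sum;
}

// The same body after the proposed propagation: the load is replaced by
// the constant the caller stored, so the pow calls have constant arguments.
void omp_fn_after_propagation(omp_data_sketch *data, int begin, int end)
{
  const long double k = 0.7;
  long double sum = 0;
  for (int t = begin; t < end; t++)
    for (int n = 1; n < 16; n++)
      sum += std::pow(k, n);
  data->i += sum;
}
```

Both functions compute the same result when the caller stores 0.7 into the k
field; the difference is only in what is known about k inside the callee.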


* [Bug middle-end/49575] OpenMP has a problem with -funroll-loops
From: pinskia at gcc dot gnu.org @ 2011-12-16  0:43 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49575

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2011-12-16
          Component|c++                         |middle-end
     Ever Confirmed|0                           |1
           Severity|normal                      |enhancement

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> 2011-12-16 00:40:09 UTC ---
Confirmed.

