public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/60997] New: -fopenmp conflicts with -floop-interchange
@ 2014-04-29  9:54 dominiq at lps dot ens.fr
  2014-04-29 12:11 ` [Bug tree-optimization/60997] " rguenth at gcc dot gnu.org
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: dominiq at lps dot ens.fr @ 2014-04-29  9:54 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60997

            Bug ID: 60997
           Summary: -fopenmp conflicts with -floop-interchange
           Product: gcc
           Version: 4.10.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: dominiq at lps dot ens.fr
                CC: grosser at gcc dot gnu.org, jakub at gcc dot gnu.org,
                    mircea.namolaru at inria dot fr

Created attachment 32703
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32703&action=edit
Test for three variants of the matrix product

Compiling the attached code with -Ofast gives the following timing at run time

[Book15] Fortran/omp_tst% gfc -Ofast omp_tst_4_db_2.f90
[Book15] Fortran/omp_tst% time a.out
   94378416668672.000     
 Elapsed time =   3.7326660000000000      seconds
   94378416668672.000     
 Elapsed time =  0.57225000000000004      seconds
   94378416668672.000     
 Elapsed time =   6.9233669999999998      seconds
   94378416668672.000     
 Elapsed time =  0.47757300000000003      seconds
11.704u 0.030s 0:11.73 100.0%    0+0k 0+0io 2pf+0w

Adding -floop-interchange at compile time gives

[Book15] Fortran/omp_tst% gfc -Ofast omp_tst_4_db_2.f90 -floop-interchange
[Book15] Fortran/omp_tst% time a.out
   94378416668672.000     
 Elapsed time =  0.57357899999999995      seconds
   94378416668672.000     
 Elapsed time =  0.56863100000000000      seconds
   94378416668672.000     
 Elapsed time =  0.56851499999999999      seconds
   94378416668672.000     
 Elapsed time =  0.47033199999999997      seconds
2.195u 0.015s 0:02.21 99.5%    0+0k 0+0io 0pf+0w

i.e., the three variants of the loop are transformed to the fastest one. Adding
-fopenmp (and -fexternal-blas -framework vecLib) gives

[Book15] Fortran/omp_tst% gfc -Ofast omp_tst_4_db_2.f90 -floop-interchange -fopenmp -fexternal-blas -framework vecLib
[Book15] Fortran/omp_tst% time a.out
   94378416668672.000     
 Elapsed time =   1.8143670000000001      seconds
   94378416668672.000     
 Elapsed time =  0.12886900000000001      seconds
   94378416668672.000     
 Elapsed time =   2.0025420000000000      seconds
   94378416668672.000     
 Elapsed time =   2.9204999999999998E-002 seconds
31.030u 0.064s 0:04.00 777.2%    0+0k 4+4io 2pf+0w

i.e., the loop interchange is prevented by the -fopenmp option. This is
probably because the -fopenmp option is processed before the graphite
optimizations.

The last timing in each run is for the MATMUL intrinsic, as a reference (using
the system BLAS gives a 15-fold speed-up).
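
For readers without the attachment, the idea of "three variants of the matrix
product" can be sketched as three loop orderings of the same triple loop. This
is only an illustration in C, not the attached Fortran source: the function
names, the size N and the choice of orderings are assumptions, and C arrays
are row-major whereas Fortran arrays are column-major, so which ordering is
fastest differs between the two languages.

```c
#include <stddef.h>

#define N 64 /* hypothetical size, chosen only for illustration */

/* All three functions accumulate into c, so the caller must zero c first. */

/* i-j-k: the inner k loop reads b[k][j] with stride N (poor locality in C). */
void matmul_ijk(double c[N][N], double a[N][N], double b[N][N])
{
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            for (size_t k = 0; k < N; k++)
                c[i][j] += a[i][k] * b[k][j];
}

/* i-k-j: the inner j loop walks c[i][j] and b[k][j] with unit stride. */
void matmul_ikj(double c[N][N], double a[N][N], double b[N][N])
{
    for (size_t i = 0; i < N; i++)
        for (size_t k = 0; k < N; k++)
            for (size_t j = 0; j < N; j++)
                c[i][j] += a[i][k] * b[k][j];
}

/* j-k-i: the inner i loop strides by N through both c and a. */
void matmul_jki(double c[N][N], double a[N][N], double b[N][N])
{
    for (size_t j = 0; j < N; j++)
        for (size_t k = 0; k < N; k++)
            for (size_t i = 0; i < N; i++)
                c[i][j] += a[i][k] * b[k][j];
}
```

Loop interchange is the transformation that turns one of these orderings into
another; it is legal here because every ordering performs the same updates to
each c[i][j] in the same order of k.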



* [Bug tree-optimization/60997] -fopenmp conflicts with -floop-interchange
  2014-04-29  9:54 [Bug tree-optimization/60997] New: -fopenmp conflicts with -floop-interchange dominiq at lps dot ens.fr
@ 2014-04-29 12:11 ` rguenth at gcc dot gnu.org
  2014-04-29 13:38 ` dominiq at lps dot ens.fr
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: rguenth at gcc dot gnu.org @ 2014-04-29 12:11 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60997

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
IIRC -fopenmp has similar issues for vectorization (it defeats points-to
analysis)



* [Bug tree-optimization/60997] -fopenmp conflicts with -floop-interchange
  2014-04-29  9:54 [Bug tree-optimization/60997] New: -fopenmp conflicts with -floop-interchange dominiq at lps dot ens.fr
  2014-04-29 12:11 ` [Bug tree-optimization/60997] " rguenth at gcc dot gnu.org
@ 2014-04-29 13:38 ` dominiq at lps dot ens.fr
  2014-04-29 15:24 ` mircea.namolaru at inria dot fr
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: dominiq at lps dot ens.fr @ 2014-04-29 13:38 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60997

--- Comment #2 from Dominique d'Humieres <dominiq at lps dot ens.fr> ---
> IIRC -fopenmp has similar issues for vectorization (it defeats
> points-to analysis)

PR46032?

[Book15] f90/bug% g++ -Ofast pr46032.cpp -fopt-info-vec -fgraphite -fgraphite-identity -floop-block -floop-flatten -floop-interchange -floop-strip-mine -ftree-loop-linear -floop-parallelize-all
pr46032.cpp:11:34: note: loop vectorized
[Book15] f90/bug% g++ -Ofast pr46032.cpp -fopt-info-vec -fopenmp -fgraphite -floop-block -floop-flatten -floop-interchange -floop-strip-mine -ftree-loop-linear
pr46032.cpp:11:34: note: loop vectorized
pr46032.cpp:11:34: note: loop versioned for vectorization because of possible aliasing
pr46032.cpp:11:34: note: loop peeled for vectorization to enhance alignment

but

[Book15] f90/bug% g++ -Ofast pr46032.cpp -fopt-info-vec -fopenmp -fgraphite-identity
[Book15] f90/bug% g++ -Ofast pr46032.cpp -fopt-info-vec -fopenmp -fgraphite -floop-block -floop-flatten -floop-interchange -floop-strip-mine -ftree-loop-linear -floop-parallelize-all
[Book15]



* [Bug tree-optimization/60997] -fopenmp conflicts with -floop-interchange
  2014-04-29  9:54 [Bug tree-optimization/60997] New: -fopenmp conflicts with -floop-interchange dominiq at lps dot ens.fr
  2014-04-29 12:11 ` [Bug tree-optimization/60997] " rguenth at gcc dot gnu.org
  2014-04-29 13:38 ` dominiq at lps dot ens.fr
@ 2014-04-29 15:24 ` mircea.namolaru at inria dot fr
  2014-06-12 18:22 ` dominiq at lps dot ens.fr
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: mircea.namolaru at inria dot fr @ 2014-04-29 15:24 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60997

--- Comment #3 from Mircea Namolaru <mircea.namolaru at inria dot fr> ---
It is not that -floop-interchange is disabled, but the code received by
graphite is different if the option -fopenmp is enabled. In this case the check
for data dependencies required by loop-interchange fails. I will check in more
depth whether the data dependencies are right in this case or whether there is
a problem with them (but probably not).

I guess that the problem is the same for vectorization (but there the data
dependencies for vectorization are not checked by graphite).
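
One way to picture why the code reaching graphite differs under -fopenmp: GCC
moves the body of a parallel region into a separate function and hands it the
shared variables through a record. The sketch below imitates that by hand in
C. It is a conceptual illustration only; the struct, the function names and
the size N are hypothetical, the code GCC actually generates looks different,
and whether this is the exact cause of the failing dependence check here is
not established in this report.

```c
#define N 64 /* hypothetical size, chosen only for illustration */

/* Hypothetical stand-in for the record a compiler builds to pass shared
   variables into an outlined parallel region. */
struct shared_data {
    double (*a)[N];
    double (*b)[N];
    double (*c)[N];
};

/* Hand-written stand-in for an outlined region body. Inside this function
   the arrays are reachable only through pointers loaded from *d, so nothing
   in the function itself shows that c is distinct from a and b. */
static void region_body(struct shared_data *d)
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            for (int k = 0; k < N; k++)
                d->c[i][j] += d->a[i][k] * d->b[k][j];
}

/* The original function reduces to packing the pointers and invoking the
   body; a real OpenMP runtime would run the body on several threads. */
void matmul_outlined(double c[N][N], double a[N][N], double b[N][N])
{
    struct shared_data d = { a, b, c };
    region_body(&d);
}
```

An optimizer looking only at region_body has to assume that a store to
d->c[i][j] may change what d->a or d->b point at, which is the kind of
information loss that can make a dependence or aliasing check fail.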



* [Bug tree-optimization/60997] -fopenmp conflicts with -floop-interchange
  2014-04-29  9:54 [Bug tree-optimization/60997] New: -fopenmp conflicts with -floop-interchange dominiq at lps dot ens.fr
                   ` (2 preceding siblings ...)
  2014-04-29 15:24 ` mircea.namolaru at inria dot fr
@ 2014-06-12 18:22 ` dominiq at lps dot ens.fr
  2014-06-12 18:58 ` jakub at gcc dot gnu.org
  2014-06-12 19:55 ` jakub at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: dominiq at lps dot ens.fr @ 2014-06-12 18:22 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60997

Dominique d'Humieres <dominiq at lps dot ens.fr> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2014-06-12
     Ever confirmed|0                           |1

--- Comment #4 from Dominique d'Humieres <dominiq at lps dot ens.fr> ---
What if the vectorization or graphite optimizations were done before handling
OpenMP?



* [Bug tree-optimization/60997] -fopenmp conflicts with -floop-interchange
  2014-04-29  9:54 [Bug tree-optimization/60997] New: -fopenmp conflicts with -floop-interchange dominiq at lps dot ens.fr
                   ` (3 preceding siblings ...)
  2014-06-12 18:22 ` dominiq at lps dot ens.fr
@ 2014-06-12 18:58 ` jakub at gcc dot gnu.org
  2014-06-12 19:55 ` jakub at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: jakub at gcc dot gnu.org @ 2014-06-12 18:58 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60997

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
That is not possible.



* [Bug tree-optimization/60997] -fopenmp conflicts with -floop-interchange
  2014-04-29  9:54 [Bug tree-optimization/60997] New: -fopenmp conflicts with -floop-interchange dominiq at lps dot ens.fr
                   ` (4 preceding siblings ...)
  2014-06-12 18:58 ` jakub at gcc dot gnu.org
@ 2014-06-12 19:55 ` jakub at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: jakub at gcc dot gnu.org @ 2014-06-12 19:55 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60997

--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Also, the compiler really isn't allowed to interchange the !$OMP DO loop
(outermost) with the inner ones; it has different semantics because of the
markup.  The compiler could still interchange the inner two loops if it is
beneficial.  You can also try to use a collapse(2) or collapse(3) clause on the
!$OMP DO; then the outermost two or all 3 loops are workshared together.
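
A minimal sketch of the collapse(2) suggestion, written in C rather than
Fortran, so "#pragma omp parallel for" stands in for the !$OMP DO markup. The
function name and the size N are hypothetical, and the loop body is a plain
matrix product, not the code from the attachment.

```c
#define N 64 /* hypothetical size, chosen only for illustration */

void matmul_collapse2(double c[N][N], double a[N][N], double b[N][N])
{
    /* collapse(2) merges the i and j loops into one iteration space of
       N*N iterations that is divided among the threads. The k loop is not
       part of the worksharing construct and runs sequentially inside each
       (i, j) iteration. */
    #pragma omp parallel for collapse(2)
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            double sum = 0.0;
            for (int k = 0; k < N; k++)
                sum += a[i][k] * b[k][j];
            c[i][j] = sum; /* each (i, j) writes its own element: no race */
        }
}
```

Built without -fopenmp the pragma is ignored and the function runs serially
with the same result. Using collapse(3) instead would require the k loop to be
perfectly nested as well, with the accumulation into c[i][j] protected, for
example by a reduction.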


