[Bug fortran/57328] New: Missed optimization: Unable to vectorize Fortran min and max intrinsics

public inbox for gcc-bugs@sourceware.org

* [Bug fortran/57328] New: Missed optimization: Unable to vectorize Fortran min and max intrinsics
@ 2013-05-19  2:39 spam.brian.taylor at gmail dot com
  2013-05-21  1:03 ` [Bug fortran/57328] " bdavis at gcc dot gnu.org
                   ` (11 more replies)
  0 siblings, 12 replies; 13+ messages in thread
From: spam.brian.taylor at gmail dot com @ 2013-05-19  2:39 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57328

            Bug ID: 57328
           Summary: Missed optimization: Unable to vectorize Fortran min
                    and max intrinsics
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: fortran
          Assignee: unassigned at gcc dot gnu.org
          Reporter: spam.brian.taylor at gmail dot com

Created attachment 30145
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30145&action=edit
Test case for vectorization of loops containing max and if

Use of the Fortran min or max intrinsic functions within a loop appears to
prevent vectorization of the loop.  Replacement of min or max with conditional
assignment using if statements allows vectorization.

A simple test case using max is attached.  If compiled with "gfortran -O2
-msse2 -ftree-vectorize -ftree-vectorizer-verbose=1 -c max_vs_ifs_in_loop.F90",
I get (with extraneous output snipped):

...
max_vs_ifs_in_loop.F90:1: note: vectorized 0 loops in function.
...
max_vs_ifs_in_loop.F90:17: note: LOOP VECTORIZED.
...

gfortran should be able to vectorize loops containing min or max, using any
number of arguments to these intrinsics, e.g. "tmp = max(r1, r2, r3, r4)".

Compiler info:
user@host $ gfortran --version
GNU Fortran (GCC) 4.8.0
Copyright (C) 2013 Free Software Foundation, Inc.

GNU Fortran comes with NO WARRANTY, to the extent permitted by law.
You may redistribute copies of GNU Fortran
under the terms of the GNU General Public License.
For more information about these matters, see the file named COPYING

* [Bug fortran/57328] Missed optimization: Unable to vectorize Fortran min and max intrinsics
From: bdavis at gcc dot gnu.org @ 2013-05-21  1:03 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57328

Bud Davis <bdavis at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bdavis at gcc dot gnu.org

--- Comment #1 from Bud Davis <bdavis at gcc dot gnu.org> ---
subroutine max_in_loop(rin, rout)
integer :: rin(1000), rout(1000), tmp
!real :: rin(1000), rout(1000), tmp
integer :: i

do i = 2, 1000
  tmp = min(rin(i-1), rin(i))
  rout(i) = tmp
end do

end subroutine

The integer version above is vectorized.
The floating point number makes it special in some way.

Looking in trans-intrinsic.c, it is special.

   /* FIXME: When the IEEE_ARITHMETIC module is implemented, the call to
         __builtin_isnan might be made dependent on that module being loaded,
         to help performance of programs that don't rely on IEEE semantics.  */


* [Bug fortran/57328] Missed optimization: Unable to vectorize Fortran min and max intrinsics
From: burnus at gcc dot gnu.org @ 2013-05-21  7:08 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57328

Tobias Burnus <burnus at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |burnus at gcc dot gnu.org

--- Comment #2 from Tobias Burnus <burnus at gcc dot gnu.org> ---
(In reply to Bud Davis from comment #1)
> The floating point number makes it special in some way.

My suspicion is that this is due to special handling for IEEE 754:2008, which
requires that
   MAX (NaN, x) = MAX (x, NaN) = x
   MIN (NaN, x) = MIN (x, NaN) = x
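
To illustrate the rule quoted above, here is a minimal C sketch. The names naive_max and nan_aware_max are made up for this example and are not taken from GCC or from the attached test case. It shows why a plain comparison is not equivalent to a NaN-aware maximum: the plain form returns NaN when its second argument is NaN, while the NaN-aware form returns the other operand. C's fmax follows the same return-the-non-NaN rule.

```c
#include <math.h>

/* Plain comparison, similar to what a hand-written "if" computes.
   Any comparison with NaN is false, so the result depends on
   which operand is the NaN. */
double naive_max(double a, double b)
{
    return (a > b) ? a : b;
}

/* NaN-aware variant: if exactly one operand is NaN, return the other.
   Hypothetical helper for illustration only. */
double nan_aware_max(double a, double b)
{
    if (isnan(a))
        return b;
    if (isnan(b))
        return a;
    return (a > b) ? a : b;
}
```

With these definitions, naive_max(1.0, NAN) yields NaN while nan_aware_max(1.0, NAN) yields 1.0; the extra NaN tests are the part that the thread identifies as getting in the way of vectorization.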


* [Bug fortran/57328] Missed optimization: Unable to vectorize Fortran min and max intrinsics
From: rguenth at gcc dot gnu.org @ 2013-05-21  9:16 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57328

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
Yes, you generally need -ffast-math here (or -ffinite-math-only at least).


* [Bug fortran/57328] Missed optimization: Unable to vectorize Fortran min and max intrinsics
From: glisse at gcc dot gnu.org @ 2013-05-21  9:56 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57328

--- Comment #4 from Marc Glisse <glisse at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #3)
> Yes, you generally need -ffast-math here (or -ffinite-math-only at least).

SSE2 has an unord comparison instruction (aka isnan) though, so vectorizing the
full version of min/max should still work, and be even more worth it than for
the finite-only min/max... Maybe a target issue?


* [Bug fortran/57328] Missed optimization: Unable to vectorize Fortran min and max intrinsics
From: jakub at gcc dot gnu.org @ 2013-05-21 10:16 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57328

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
But vectorization reorders the loop iterations, thus say if some value is sNaN,
you'd get exceptions in different order.  So, I'm afraid without -ffast-math
you can vectorize this only if the user says that the order of iterations
doesn't matter (say using OpenMP 4.0 #pragma omp simd or Cilk+ #pragma simd).
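
As a rough illustration of the annotation mentioned above, the following C sketch marks a NaN-aware pairwise minimum loop with the OpenMP 4.0 simd pragma. The function name pairwise_min and the restrict qualifiers are assumptions made for this example; the Fortran spelling of the directive would be !$omp simd. The pragma only takes effect when the compiler is invoked with OpenMP (or OpenMP SIMD) support enabled, and nothing here claims that a particular GCC version actually vectorizes the loop.

```c
#include <math.h>
#include <stddef.h>

/* rout[i] = NaN-aware min of rin[i-1] and rin[i], for i = 1 .. n-1.
   The pragma tells the compiler that iterations may be executed
   in SIMD fashion; without OpenMP support it is simply ignored. */
void pairwise_min(const double *restrict rin, double *restrict rout, size_t n)
{
#pragma omp simd
    for (size_t i = 1; i < n; i++) {
        double a = rin[i - 1];
        double b = rin[i];
        /* If one operand is NaN, take the other one. */
        rout[i] = isnan(a) ? b : (isnan(b) ? a : (a < b ? a : b));
    }
}
```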


* [Bug fortran/57328] Missed optimization: Unable to vectorize Fortran min and max intrinsics
From: glisse at gcc dot gnu.org @ 2013-05-21 11:21 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57328

Marc Glisse <glisse at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization

--- Comment #6 from Marc Glisse <glisse at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #5)
> But vectorization reorders the loop iterations, thus say if some value is
> sNaN, you'd get exceptions in different order.  So, I'm afraid without
> -ffast-math you can vectorize this only if the user says that the order of
> iterations doesn't matter (say using OpenMP 4.0 #pragma omp simd or Cilk+
> #pragma simd).

Ah, I was only thinking of quiet nans. -fno-signaling-nans should be enough
though, no? (I checked and it doesn't help, which makes sense since it is the
default) I think it is quite common to care about quiet nans but not use
signaling nans.


* [Bug fortran/57328] Missed optimization: Unable to vectorize Fortran min and max intrinsics
From: spam.brian.taylor at gmail dot com @ 2013-05-21 18:17 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57328

--- Comment #7 from Brian Taylor <spam.brian.taylor at gmail dot com> ---
(In reply to Jakub Jelinek from comment #5)
> But vectorization reorders the loop iterations, thus say if some value is
> sNaN, you'd get exceptions in different order.  So, I'm afraid without
> -ffast-math you can vectorize this only if the user says that the order of
> iterations doesn't matter (say using OpenMP 4.0 #pragma omp simd or Cilk+
> #pragma simd).

I'm not sure this is actually a problem (or perhaps there is another bug),
because as I noted in the PR replacing min or max with a "functionally
equivalent" sequence of if statements allows vectorization.


* [Bug fortran/57328] Missed optimization: Unable to vectorize Fortran min and max intrinsics
From: bdavis at gcc dot gnu.org @ 2013-05-21 20:25 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57328

--- Comment #8 from Bud Davis <bdavis at gcc dot gnu.org> ---
The compiler generates code for min and max that checks if an argument is NaN
(floating point numbers only, of course).

This differs from the example you posted, which would not give the correct
answer when an argument is NaN.


* [Bug tree-optimization/57328] Missed optimization: Unable to vectorize Fortran min and max intrinsics
From: glisse at gcc dot gnu.org @ 2013-05-21 21:25 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57328

Marc Glisse <glisse at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|fortran                     |tree-optimization

--- Comment #9 from Marc Glisse <glisse at gcc dot gnu.org> ---
The difficulty seems to be with vectorizing an AND or OR of 2 conditions.
a<b || b unord b leaves control flow around
a<b | b unord b complains that bit-precision arithmetic is not supported

The second one in particular looks like a limitation in the vectorizer that
would be nice to lift. I get the same issue with a loop using a[i]<0&b[i]<=0,
it isn't related to unord in particular.
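
For reference, the two ways of combining conditions discussed above can be written at the source level roughly as follows. This is only a sketch: the function names select_logical and select_bitwise and the constants 1.0 and 2.0 are invented for the example, and the behaviour described in this comment concerns the compiler's internal representation, so the C forms are an approximation of what the vectorizer sees.

```c
#include <stddef.h>

/* Short-circuit form: the second test is only evaluated when the first
   one holds, which can leave control flow inside the loop body. */
void select_logical(const double *a, const double *b, double *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (a[i] < 0 && b[i] <= 0) ? 1.0 : 2.0;
}

/* Bitwise form: both tests are always evaluated and their 0/1 results
   are combined with AND, so there is no branch between the two tests. */
void select_bitwise(const double *a, const double *b, double *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = ((a[i] < 0) & (b[i] <= 0)) ? 1.0 : 2.0;
}
```

Both functions compute the same values because the tests have no side effects; they differ only in the shape of the code handed to the optimizer.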


* [Bug tree-optimization/57328] Missed optimization: Unable to vectorize Fortran min and max intrinsics
From: glisse at gcc dot gnu.org @ 2013-05-22 14:22 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57328

Marc Glisse <glisse at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |glisse at gcc dot gnu.org

--- Comment #10 from Marc Glisse <glisse at gcc dot gnu.org> ---
(In reply to Marc Glisse from comment #9)
> The difficulty seems to be with vectorizing an AND or OR of 2 conditions.
> a<b || b unord b leaves control flow around
> a<b | b unord b complains that bit-precision arithmetic is not supported
> 
> The second one in particular looks like a limitation in the vectorizer that
> would be nice to lift. I get the same issue with a loop using
> a[i]<0&b[i]<=0, it isn't related to unord in particular.

Interestingly enough, using cond=a[i]<0&b[i]<=0 in a cond_expr fails to
vectorize, but using (double)cond!=0 in the same cond_expr does vectorize (to a
horrible result), thanks to vect_recog_bool_pattern, which is triggered by a
conversion of the result of AND. I assume we want a similar pattern thing
triggered by cond_expr. What pattern should it present to the vectorizer?
Something like c1=(a[i]<0)?-1:0 (for c1, -1 and 0 of an integer type of the
same size as the cond_expr), c2=(b[i]<=0)?-1:0, c=c1&c2, cond_expr(c!=0,...)
maybe, so it looks as close as possible to the desired vectorized form? (we
don't have to use -1 instead of 1 here; sticking to 1 may allow sharing more
code with the existing pattern, but then we would want a later pass to change
it)
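
A source-level analogue of the pattern proposed above might look like the following C sketch. It is an assumption-laden illustration, not the vectorizer's actual pattern code: the function name select_with_masks and the arrays x and y are hypothetical, and int64_t is chosen only because it has the same size as the double values being selected.

```c
#include <stddef.h>
#include <stdint.h>

/* Each comparison becomes an all-ones (-1) or all-zeros (0) integer of the
   same width as the data, the masks are combined with a bitwise AND, and
   the final selection tests the combined mask. */
void select_with_masks(const double *a, const double *b,
                       const double *x, const double *y,
                       double *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int64_t c1 = (a[i] < 0) ? -1 : 0;   /* mask for the first test  */
        int64_t c2 = (b[i] <= 0) ? -1 : 0;  /* mask for the second test */
        int64_t c = c1 & c2;                /* both tests must hold     */
        out[i] = (c != 0) ? x[i] : y[i];
    }
}
```

Using -1 rather than 1 mirrors the all-ones lane masks that SIMD comparison instructions typically produce, which is the sense in which this form is close to the desired vectorized code.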

* [Bug tree-optimization/57328] Missed optimization: Unable to vectorize Fortran min and max intrinsics
From: glisse at gcc dot gnu.org @ 2013-06-13 20:29 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57328

Marc Glisse <glisse at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2013-06-13
             Blocks|                            |53947
     Ever confirmed|0                           |1

--- Comment #11 from Marc Glisse <glisse at gcc dot gnu.org> ---
Confirming that we currently can't vectorize code that involves a
(con|dis)junction of several conditions.


* [Bug tree-optimization/57328] Missed optimization: Unable to vectorize Fortran min and max intrinsics
From: rguenth at gcc dot gnu.org @ 2014-05-19  8:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57328

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
      Known to work|                            |4.10.0
         Resolution|---                         |FIXED

--- Comment #12 from Richard Biener <rguenth at gcc dot gnu.org> ---
The testcase is vectorized just fine for me with -Ofast for 4.8, 4.9 and trunk.
With -O3 half of it is vectorized as reported.

On trunk this is fixed and we vectorize both functions (after the fix
for PR61194).

