public inbox for gcc-bugs@sourceware.org
* [Bug middle-end/37150] vectorizer misses some loops
       [not found] <bug-37150-4@http.gcc.gnu.org/bugzilla/>
@ 2012-07-13  8:44 ` rguenth at gcc dot gnu.org
  2012-10-06 10:39 ` Joost.VandeVondele at mat dot ethz.ch
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: rguenth at gcc dot gnu.org @ 2012-07-13  8:44 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37150

Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|                            |53947

--- Comment #12 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-07-13 08:43:54 UTC ---
Link to vectorizer missed-optimization meta-bug.



* [Bug middle-end/37150] vectorizer misses some loops
       [not found] <bug-37150-4@http.gcc.gnu.org/bugzilla/>
  2012-07-13  8:44 ` [Bug middle-end/37150] vectorizer misses some loops rguenth at gcc dot gnu.org
@ 2012-10-06 10:39 ` Joost.VandeVondele at mat dot ethz.ch
  2013-03-27 12:26 ` [Bug middle-end/37150] basic-block vectorization misses some unrolled loops rguenth at gcc dot gnu.org
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: Joost.VandeVondele at mat dot ethz.ch @ 2012-10-06 10:39 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37150

Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2009-08-06 07:54:57         |2012-10-06 7:54:57

--- Comment #13 from Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> 2012-10-06 10:38:57 UTC ---
Reconfirming this with current trunk:

ifort:            1.02s
gfortran 4.8:     2.01s

Compiled with

gfortran -ffast-math -march=native -O3 -v PR37150.f90

where, according to the -v output, -march=native expands to

-march=corei7 -mcx16 -msahf -mno-movbe -mno-aes -mno-pclmul -mpopcnt -mno-abm
-mno-lwp -mno-fma -mno-fma4 -mno-xop -mno-bmi -mno-bmi2 -mno-tbm -mno-avx
-mno-avx2 -msse4.2 -msse4.1 -mno-lzcnt -mno-rtm -mno-hle -mno-rdrnd -mno-f16c
-mno-fsgsbase -mno-rdseed -mno-prfchw -mno-adx



* [Bug middle-end/37150] basic-block vectorization misses some unrolled loops
       [not found] <bug-37150-4@http.gcc.gnu.org/bugzilla/>
  2012-07-13  8:44 ` [Bug middle-end/37150] vectorizer misses some loops rguenth at gcc dot gnu.org
  2012-10-06 10:39 ` Joost.VandeVondele at mat dot ethz.ch
@ 2013-03-27 12:26 ` rguenth at gcc dot gnu.org
  2013-03-27 12:53 ` Joost.VandeVondele at mat dot ethz.ch
  2021-02-11 11:10 ` rguenth at gcc dot gnu.org
  4 siblings, 0 replies; 12+ messages in thread
From: rguenth at gcc dot gnu.org @ 2013-03-27 12:26 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37150

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
         AssignedTo|unassigned at gcc dot       |rguenth at gcc dot gnu.org
                   |gnu.org                     |
            Summary|vectorizer misses some      |basic-block vectorization
                   |loops                       |misses some unrolled loops

--- Comment #14 from Richard Biener <rguenth at gcc dot gnu.org> 2013-03-27 12:26:40 UTC ---
Confirmed - this should be handled by BB vectorization.

t.f90:1: note: === vect_analyze_slp ===
t.f90:1: note: Failed to SLP the basic block.
t.f90:1: note: not vectorized: failed to find SLP opportunities in basic block.

A smaller, but still sensible, testcase would be appreciated.

For now BB analysis stops at

  coef_x = {};

because it cannot find a vector type for it.  If we fix that we end up
with

t.f90:1: note: === vect_slp_analyze_data_ref_dependences ===
t.f90:1: note: can't determine dependence between coef_x[_2719] and coef_x
t.f90:1: note: not vectorized: unhandled data dependence in basic block.

because dependence analysis apparently does not handle the whole-array
reference coef_x.  If we fix that we end up with

t.f90:1: note: can't determine dependence between coef_x[_2719] and coef_x[0]
t.f90:1: note: not vectorized: unhandled data dependence in basic block.

So the issue boils down to the fact that we run all dependence checks up
front, instead of first looking for SLP opportunities and then checking
dependences only within the basic-block region the SLP tree covers.
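
As a rough illustration of the shape being discussed, here is a minimal C
sketch. It is not the attached Fortran testcase: the function name
coef_kernel, the size NCOEF and the arithmetic are all hypothetical. It only
shows a whole-array clear, a store through a runtime index, and a group of
adjacent stores in one basic block. No claim is made about which GCC versions
vectorize it.

```c
#include <assert.h>
#include <stddef.h>

enum { NCOEF = 4 };

/* Hypothetical stand-in for the kernel; every name here is illustrative. */
void coef_kernel(const double *w, size_t idx, double *out)
{
    assert(w != NULL && out != NULL);

    double coef_x[NCOEF] = {0};   /* whole-array clear, like coef_x = {} */

    coef_x[idx % NCOEF] += w[0];  /* store through a runtime index */
    coef_x[0] += w[1];            /* store to a constant index */

    /* Four adjacent stores: the kind of group BB SLP starts from. */
    out[0] = coef_x[0] * w[0];
    out[1] = coef_x[1] * w[1];
    out[2] = coef_x[2] * w[2];
    out[3] = coef_x[3] * w[3];
}
```

The dependence question is whether the runtime-index store can touch the same
element as the clear or the constant-index store. That only matters for the
statements that would actually end up in the SLP tree.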



* [Bug middle-end/37150] basic-block vectorization misses some unrolled loops
       [not found] <bug-37150-4@http.gcc.gnu.org/bugzilla/>
                   ` (2 preceding siblings ...)
  2013-03-27 12:26 ` [Bug middle-end/37150] basic-block vectorization misses some unrolled loops rguenth at gcc dot gnu.org
@ 2013-03-27 12:53 ` Joost.VandeVondele at mat dot ethz.ch
  2021-02-11 11:10 ` rguenth at gcc dot gnu.org
  4 siblings, 0 replies; 12+ messages in thread
From: Joost.VandeVondele at mat dot ethz.ch @ 2013-03-27 12:53 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37150

--- Comment #15 from Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> 2013-03-27 12:53:16 UTC ---
Created attachment 29738
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29738
maybe smaller testcase version ?

Attached is what I think is roughly the smallest kernel of this type that we
have in the code. I checked this is at least partially vectorized with ifort,
but not so with gfortran trunk. It is still not such a small testcase, I'm
afraid.



* [Bug middle-end/37150] basic-block vectorization misses some unrolled loops
       [not found] <bug-37150-4@http.gcc.gnu.org/bugzilla/>
                   ` (3 preceding siblings ...)
  2013-03-27 12:53 ` Joost.VandeVondele at mat dot ethz.ch
@ 2021-02-11 11:10 ` rguenth at gcc dot gnu.org
  4 siblings, 0 replies; 12+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-02-11 11:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=37150

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2012-10-06 07:54:57         |2021-2-11

--- Comment #30 from Richard Biener <rguenth at gcc dot gnu.org> ---
For the non-reduced testcase the problem is (still) that there is no
grouped store; the only stores left at the point of vectorization are

             grid(i,j,k) = grid(i,j,k)     + s01
             grid(i,j2,k) = grid(i,j2,k)   + s03
             grid(i,j,k2) = grid(i,j,k2)   + s02
             grid(i,j2,k2) = grid(i,j2,k2) + s04

so the coef_xy and coef_x arrays are completely elided.  And the above
stores are not contiguous.

The approaches that start SLP vectorization from arbitrary seeds would
eventually help here (as would, of course, starting from the loads, which
is something that has been brought up at some points).
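
To make the "not contiguous" point concrete, here is a minimal C sketch of
the four updates on a flattened, column-major grid. The helper names grid_off
and update4 and the zero-based indexing are assumptions made for
illustration, not part of the testcase.

```c
#include <assert.h>
#include <stddef.h>

/* Column-major (Fortran-order) offset of grid(i,j,k), zero-based. */
size_t grid_off(size_t i, size_t j, size_t k, size_t ni, size_t nj)
{
    assert(i < ni && j < nj);
    return i + ni * (j + nj * k);
}

/* The four remaining updates: same i, two j values, two k values. */
void update4(double *grid, size_t ni, size_t nj,
             size_t i, size_t j, size_t j2, size_t k, size_t k2,
             double s01, double s02, double s03, double s04)
{
    grid[grid_off(i, j,  k,  ni, nj)] += s01;
    grid[grid_off(i, j2, k,  ni, nj)] += s03;
    grid[grid_off(i, j,  k2, ni, nj)] += s02;
    grid[grid_off(i, j2, k2, ni, nj)] += s04;
}
```

Because only j and k vary, the four addresses are separated by multiples of
ni elements. They therefore never form one run of adjacent elements that
could serve as a store group.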


* [Bug middle-end/37150] vectorizer misses some loops
  2008-08-18 15:34 [Bug middle-end/37150] New: vectorizer issue jv244 at cam dot ac dot uk
                   ` (5 preceding siblings ...)
  2009-08-06 10:49 ` irar at il dot ibm dot com
@ 2009-08-06 11:11 ` jv244 at cam dot ac dot uk
  6 siblings, 0 replies; 12+ messages in thread
From: jv244 at cam dot ac dot uk @ 2009-08-06 11:11 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #11 from jv244 at cam dot ac dot uk  2009-08-06 11:11 -------
(In reply to comment #10)
> Finding a benchmark could really help to push these items to the top of
> vectorizer's todo list.

we're lucky here ;-)

http://gcc.gnu.org/benchmarks/

has a link to

http://cp2k.berlios.de/gfortran/

The code discussed (in particular the above collocate_fast_2.f90) is (in a
slightly older but equivalent variant, see
ftp://ftp.berlios.de/pub/cp2k/gfortran/gcc_bench.tgz) a significant part of the
bench01 benchmark. Getting the same performance as ifort on this kernel would
speed up this benchmark by ~10%, which would be highly significant.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37150



* [Bug middle-end/37150] vectorizer misses some loops
  2008-08-18 15:34 [Bug middle-end/37150] New: vectorizer issue jv244 at cam dot ac dot uk
                   ` (4 preceding siblings ...)
  2009-08-06 10:24 ` jv244 at cam dot ac dot uk
@ 2009-08-06 10:49 ` irar at il dot ibm dot com
  2009-08-06 11:11 ` jv244 at cam dot ac dot uk
  6 siblings, 0 replies; 12+ messages in thread
From: irar at il dot ibm dot com @ 2009-08-06 10:49 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #10 from irar at il dot ibm dot com  2009-08-06 10:49 -------
Yes. The problem is that only a basic implementation was added. To vectorize
this code, several improvements must be made: support stmt group sizes greater
than the vector size, allow loads and stores to the same location, initiate SLP
analysis from groups of loads, support misaligned accesses, etc.
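
Two of these items can be sketched in C with a hypothetical unrolled kernel
(the name axpy8 and the group size of eight are illustrative only). With
128-bit SSE vectors holding two doubles, the group of eight statements is
larger than one vector, and each statement both loads from and stores to the
same element of y.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical unrolled kernel; name and group size are illustrative. */
void axpy8(double *y, const double *x, double a)
{
    assert(y != NULL && x != NULL);

    /* Eight isomorphic statements, each loading and storing the same y[n]. */
    y[0] += a * x[0];
    y[1] += a * x[1];
    y[2] += a * x[2];
    y[3] += a * x[3];
    y[4] += a * x[4];
    y[5] += a * x[5];
    y[6] += a * x[6];
    y[7] += a * x[7];
}
```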

Finding a benchmark could really help to push these items to the top of
vectorizer's todo list.

Ira


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37150



* [Bug middle-end/37150] vectorizer misses some loops
  2008-08-18 15:34 [Bug middle-end/37150] New: vectorizer issue jv244 at cam dot ac dot uk
                   ` (3 preceding siblings ...)
  2009-08-06  9:40 ` rguenth at gcc dot gnu dot org
@ 2009-08-06 10:24 ` jv244 at cam dot ac dot uk
  2009-08-06 10:49 ` irar at il dot ibm dot com
  2009-08-06 11:11 ` jv244 at cam dot ac dot uk
  6 siblings, 0 replies; 12+ messages in thread
From: jv244 at cam dot ac dot uk @ 2009-08-06 10:24 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #9 from jv244 at cam dot ac dot uk  2009-08-06 10:24 -------
(In reply to comment #8)
> I think that scalar code vectorization should instead catch this.

Is this 'scalar code vectorization' the same as the SLP that has already been
added?


-- 

jv244 at cam dot ac dot uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |irar at il dot ibm dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37150



* [Bug middle-end/37150] vectorizer misses some loops
  2008-08-18 15:34 [Bug middle-end/37150] New: vectorizer issue jv244 at cam dot ac dot uk
                   ` (2 preceding siblings ...)
  2009-08-06  7:55 ` jv244 at cam dot ac dot uk
@ 2009-08-06  9:40 ` rguenth at gcc dot gnu dot org
  2009-08-06 10:24 ` jv244 at cam dot ac dot uk
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-08-06  9:40 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #8 from rguenth at gcc dot gnu dot org  2009-08-06 09:40 -------
I think that scalar code vectorization should instead catch this.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37150



* [Bug middle-end/37150] vectorizer misses some loops
  2008-08-18 15:34 [Bug middle-end/37150] New: vectorizer issue jv244 at cam dot ac dot uk
  2008-08-24 22:50 ` [Bug middle-end/37150] vectorizer misses some loops pinskia at gcc dot gnu dot org
  2008-12-27  6:33 ` pinskia at gcc dot gnu dot org
@ 2009-08-06  7:55 ` jv244 at cam dot ac dot uk
  2009-08-06  9:40 ` rguenth at gcc dot gnu dot org
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: jv244 at cam dot ac dot uk @ 2009-08-06  7:55 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #7 from jv244 at cam dot ac dot uk  2009-08-06 07:54 -------
Just verified that current trunk is not yet able to vectorize the test.f90
code. It would be cool if this could be fixed (maybe along the lines of
Richard's previous patch?), as this is similar to CP2K's kernel routines:

> gfortran -O3 -march=native -ffast-math test.f90 &> /dev/null
> time ./a.out

real    0m2.306s
user    0m2.304s
sys     0m0.000s
> ifort -O3 -xT test.f90 &> /dev/null
> time ./a.out

real    0m1.812s
user    0m1.808s
sys     0m0.004s


-- 

jv244 at cam dot ac dot uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2008-12-27 06:31:06         |2009-08-06 07:54:57
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37150



* [Bug middle-end/37150] vectorizer misses some loops
  2008-08-18 15:34 [Bug middle-end/37150] New: vectorizer issue jv244 at cam dot ac dot uk
  2008-08-24 22:50 ` [Bug middle-end/37150] vectorizer misses some loops pinskia at gcc dot gnu dot org
@ 2008-12-27  6:33 ` pinskia at gcc dot gnu dot org
  2009-08-06  7:55 ` jv244 at cam dot ac dot uk
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2008-12-27  6:33 UTC (permalink / raw)
  To: gcc-bugs



-- 

pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2008-12-27 06:31:06
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37150



* [Bug middle-end/37150] vectorizer misses some loops
  2008-08-18 15:34 [Bug middle-end/37150] New: vectorizer issue jv244 at cam dot ac dot uk
@ 2008-08-24 22:50 ` pinskia at gcc dot gnu dot org
  2008-12-27  6:33 ` pinskia at gcc dot gnu dot org
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2008-08-24 22:50 UTC (permalink / raw)
  To: gcc-bugs



-- 

pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
            Summary|vectorizer issue            |vectorizer misses some loops


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37150


