public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/56812] New: Simple loop is not SLP-vectorized after r196872
@ 2013-04-02 11:21 ysrumyan at gmail dot com
  2013-04-02 11:22 ` [Bug tree-optimization/56812] " ysrumyan at gmail dot com
                   ` (14 more replies)
  0 siblings, 15 replies; 16+ messages in thread
From: ysrumyan at gmail dot com @ 2013-04-02 11:21 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812

             Bug #: 56812
           Summary: Simple loop is not SLP-vectorized after r196872
    Classification: Unclassified
           Product: gcc
           Version: 4.9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: ysrumyan@gmail.com


A simple loop in the attached test case is no longer SLP-vectorized after
r196872; the vectorizer reports:

t.cc:12: note: can't determine dependence between this_4(D)->data[0] and
this_4(D)->data[i_14]
t.cc:12: note: not vectorized: unhandled data dependence in basic block.

To reproduce the failure it is sufficient to compile this test on x86 with the
following options:
  -O3 -funroll-loops -march=IVB
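
The attachment is not reproduced in this thread. Judging only from the
diagnostic above (stores to this->data[0] and this->data[i] inside a member
function), the loop presumably has roughly the following shape. This is a
hypothetical sketch: the class name Packet, the array length of 8, and the
helpers filled_with and self_check are assumptions, not the contents of the
attached test case.

```cpp
#include <cassert>

// Hypothetical reconstruction of the loop shape; NOT the attached test case.
class Packet {
public:
  static const int N = 8;  // assumed array length
  float data[N];

  // Once the loop is fully unrolled, the body is a run of adjacent stores
  // (data[0] = x; data[1] = x; ...), the kind of straight-line group that
  // basic-block SLP vectorization looks for.
  void Set(float x) {
    for (int i = 0; i < N; i++)
      data[i] = x;
  }
};

// Returns true when every element of p.data equals x.
inline bool filled_with(const Packet& p, float x) {
  for (int i = 0; i < Packet::N; i++)
    if (p.data[i] != x)
      return false;
  return true;
}

// Small behavioural check of the sketch itself.
inline void self_check() {
  Packet p;
  p.Set(-1.0f);
  assert(filled_with(p, -1.0f));
}
```

Whether this exact shape triggers the reported diagnostic depends on the
compiler revision and options; it is meant only to make the two data
references in the message easier to picture.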



* [Bug tree-optimization/56812] Simple loop is not SLP-vectorized after r196872
  2013-04-02 11:21 [Bug tree-optimization/56812] New: Simple loop is not SLP-vectorized after r196872 ysrumyan at gmail dot com
@ 2013-04-02 11:22 ` ysrumyan at gmail dot com
  2013-04-02 11:41 ` ysrumyan at gmail dot com
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: ysrumyan at gmail dot com @ 2013-04-02 11:22 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812

--- Comment #1 from Yuri Rumyantsev <ysrumyan at gmail dot com> 2013-04-02 11:22:45 UTC ---
Created attachment 29775
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29775
testcase

Compile with the -O3 -funroll-loops options.



* [Bug tree-optimization/56812] Simple loop is not SLP-vectorized after r196872
  2013-04-02 11:21 [Bug tree-optimization/56812] New: Simple loop is not SLP-vectorized after r196872 ysrumyan at gmail dot com
  2013-04-02 11:22 ` [Bug tree-optimization/56812] " ysrumyan at gmail dot com
@ 2013-04-02 11:41 ` ysrumyan at gmail dot com
  2013-04-02 13:20 ` rguenth at gcc dot gnu.org
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: ysrumyan at gmail dot com @ 2013-04-02 11:41 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812

--- Comment #2 from Yuri Rumyantsev <ysrumyan at gmail dot com> 2013-04-02 11:41:23 UTC ---
Sorry, I made a typo in the -march option: it must be -march=corei7 -mavx.



* [Bug tree-optimization/56812] Simple loop is not SLP-vectorized after r196872
  2013-04-02 11:21 [Bug tree-optimization/56812] New: Simple loop is not SLP-vectorized after r196872 ysrumyan at gmail dot com
  2013-04-02 11:22 ` [Bug tree-optimization/56812] " ysrumyan at gmail dot com
  2013-04-02 11:41 ` ysrumyan at gmail dot com
@ 2013-04-02 13:20 ` rguenth at gcc dot gnu.org
  2013-04-02 13:27 ` ysrumyan at gmail dot com
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: rguenth at gcc dot gnu.org @ 2013-04-02 13:20 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |WAITING
   Last reconfirmed|                            |2013-04-02
     Ever Confirmed|0                           |1

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> 2013-04-02 13:20:47 UTC ---
It should not make a difference ... I see that vectorization is determined
to be not profitable though:

t.C:11: note: === vect_update_slp_costs_according_to_vf ===
cost model: prologue peel iters set to vf/2.
cost model: epilogue peel iters set to vf/2 because peeling for alignment is
unknown.
t.C:11: note: Cost model analysis:
  Vector inside of loop cost: 1
  Vector prologue cost: 11
  Vector epilogue cost: 2
  Scalar iteration cost: 1
  Scalar outside cost: 0
  Vector outside cost: 13
  prologue iterations: 2
  epilogue iterations: 2
  Calculated minimum iters for profitability: 17

but it's not vectorized with 4.8 either.  -fno-vect-cost-model fixes this.

Are you sure you attached the correct testcase?
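
The threshold of 17 in the dump can be reproduced by hand. The sketch below
assumes a break-even formula of the shape used, as far as I understand it, by
GCC's loop cost model (outside cost amortized over the per-iteration saving,
then bumped by one if the scalar code is still no more expensive), and a
vectorization factor of 4, inferred from "prologue iterations: 2" being vf/2.
Neither assumption is stated in the dump, and the function name is
hypothetical.

```cpp
// Sketch of a break-even computation; all names are illustrative.
constexpr int min_profitable_iters(int vec_inside, int vec_outside,
                                   int scalar_iter, int scalar_outside,
                                   int prologue_iters, int epilogue_iters,
                                   int vf) {
  // Saving per vector iteration; vectorization never pays off if it is <= 0.
  const int saving = scalar_iter * vf - vec_inside;
  if (saving <= 0)
    return -1;
  if (vec_outside <= 0)
    return 1;
  int n = ((vec_outside - scalar_outside) * vf
           - vec_inside * prologue_iters
           - vec_inside * epilogue_iters) / saving;
  // Bump by one if the scaled scalar cost does not yet exceed the scaled
  // vector cost at n iterations.
  if (scalar_iter * vf * n
      <= vec_inside * n + (vec_outside - scalar_outside) * vf)
    ++n;
  return n;
}

// Values from the dump above, with the assumed vf = 4:
// (13 * 4 - 2 - 2) / (4 - 1) = 16, then bumped to 17.
static_assert(min_profitable_iters(1, 13, 1, 0, 2, 2, 4) == 17,
              "matches the reported threshold");
```

With these inputs almost all of the cost sits in the prologue (11 of 13), which
is why a short loop falls below the threshold unless the cost model is
disabled.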



* [Bug tree-optimization/56812] Simple loop is not SLP-vectorized after r196872
  2013-04-02 11:21 [Bug tree-optimization/56812] New: Simple loop is not SLP-vectorized after r196872 ysrumyan at gmail dot com
                   ` (2 preceding siblings ...)
  2013-04-02 13:20 ` rguenth at gcc dot gnu.org
@ 2013-04-02 13:27 ` ysrumyan at gmail dot com
  2013-04-02 13:36 ` rguenth at gcc dot gnu.org
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: ysrumyan at gmail dot com @ 2013-04-02 13:27 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812

--- Comment #4 from Yuri Rumyantsev <ysrumyan at gmail dot com> 2013-04-02 13:27:15 UTC ---
Yes, the test case is correct. If we revert your changes, we get the following
(with -ftree-vectorizer-verbose=3):

t.cc:12: note: vectorizing stmts using SLP.
BASIC BLOCK VECTORIZED

t.cc:12: note: basic block vectorized using SLP



* [Bug tree-optimization/56812] Simple loop is not SLP-vectorized after r196872
  2013-04-02 11:21 [Bug tree-optimization/56812] New: Simple loop is not SLP-vectorized after r196872 ysrumyan at gmail dot com
                   ` (3 preceding siblings ...)
  2013-04-02 13:27 ` ysrumyan at gmail dot com
@ 2013-04-02 13:36 ` rguenth at gcc dot gnu.org
  2013-04-02 14:06 ` rguenth at gcc dot gnu.org
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: rguenth at gcc dot gnu.org @ 2013-04-02 13:36 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |ASSIGNED
         AssignedTo|unassigned at gcc dot       |rguenth at gcc dot gnu.org
                   |gnu.org                     |

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> 2013-04-02 13:36:11 UTC ---
Ah, I see.  Confirmed.



* [Bug tree-optimization/56812] Simple loop is not SLP-vectorized after r196872
  2013-04-02 11:21 [Bug tree-optimization/56812] New: Simple loop is not SLP-vectorized after r196872 ysrumyan at gmail dot com
                   ` (4 preceding siblings ...)
  2013-04-02 13:36 ` rguenth at gcc dot gnu.org
@ 2013-04-02 14:06 ` rguenth at gcc dot gnu.org
  2013-04-03  8:04 ` rguenth at gcc dot gnu.org
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: rguenth at gcc dot gnu.org @ 2013-04-02 14:06 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> 2013-04-02 14:06:12 UTC ---
The BB vectorization case ran into

      /* When vectorizing a basic block unknown depnedence can still mean
         grouped access.  */
      if (vect_check_interleaving (dra, drb))
         return false;

that is, whenever dra and drb are part of the same interleaving chain,
it considers the accesses to be independent.  I'm not sure this is
a good idea in general, but it's easy to reinstate.

There is no reason why dependence analysis should fail here though
(it does because dr_may_alias_p does not use SCEV info as computed by
dr_analyze_innermost and used by interleaving chain analysis, but instead
it looks at the base DR_REF which is also used as DR_BASE_OBJECT).
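
The idea behind that check can be pictured with a toy model. The sketch below
is not GCC code: ToyRef and both functions are hypothetical stand-ins, and the
chain test (same base, same step, same element size, offsets differing by a
whole number of elements) is an assumption about what "same interleaving
chain" means here.

```cpp
// Toy stand-in for a data reference; NOT GCC's data_reference.
struct ToyRef {
  int base_id;     // identifies the base object, e.g. this->data
  long init;       // constant byte offset from the base
  long step;       // byte stride per iteration (0 in straight-line code)
  long elem_size;  // size of the accessed element in bytes
};

// Assumed chain criterion: same base, stride and element size, with the
// offsets an integral number of elements apart.
constexpr bool same_chain(const ToyRef& a, const ToyRef& b) {
  return a.base_id == b.base_id && a.step == b.step
         && a.elem_size == b.elem_size
         && (a.init - b.init) % a.elem_size == 0;
}

// Two members of one chain at different offsets touch different elements,
// so in this model they can be treated as independent.
constexpr bool independent_in_chain(const ToyRef& a, const ToyRef& b) {
  return same_chain(a, b) && a.init != b.init;
}

// data[0] and data[3] of a float array: same chain, distinct elements.
static_assert(independent_in_chain(ToyRef{0, 0, 0, 4}, ToyRef{0, 12, 0, 4}),
              "distinct elements of one chain");
```

A reference to a different base, or one at the same offset, gets no answer
from this shortcut and still needs real dependence analysis.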



* [Bug tree-optimization/56812] Simple loop is not SLP-vectorized after r196872
  2013-04-02 11:21 [Bug tree-optimization/56812] New: Simple loop is not SLP-vectorized after r196872 ysrumyan at gmail dot com
                   ` (5 preceding siblings ...)
  2013-04-02 14:06 ` rguenth at gcc dot gnu.org
@ 2013-04-03  8:04 ` rguenth at gcc dot gnu.org
  2013-04-04  9:35 ` schwab@linux-m68k.org
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: rguenth at gcc dot gnu.org @ 2013-04-03  8:04 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED

--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> 2013-04-03 08:04:16 UTC ---
Author: rguenth
Date: Wed Apr  3 08:03:33 2013
New Revision: 197390

URL: http://gcc.gnu.org/viewcvs?rev=197390&root=gcc&view=rev
Log:
2013-04-03  Richard Biener  <rguenther@suse.de>

    PR tree-optimization/56812
    * tree-vect-data-refs.c (vect_slp_analyze_data_ref_dependence):
    DRs of the same interleaving chain are independent.

    * g++.dg/vect/slp-pr56812.cc: New testcase.

Added:
    trunk/gcc/testsuite/g++.dg/vect/slp-pr56812.cc
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-vect-data-refs.c



* [Bug tree-optimization/56812] Simple loop is not SLP-vectorized after r196872
  2013-04-02 11:21 [Bug tree-optimization/56812] New: Simple loop is not SLP-vectorized after r196872 ysrumyan at gmail dot com
                   ` (6 preceding siblings ...)
  2013-04-03  8:04 ` rguenth at gcc dot gnu.org
@ 2013-04-04  9:35 ` schwab@linux-m68k.org
  2013-04-04  9:45 ` rguenther at suse dot de
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: schwab@linux-m68k.org @ 2013-04-04  9:35 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812

--- Comment #8 from Andreas Schwab <schwab@linux-m68k.org> 2013-04-04 09:35:51 UTC ---
The test is failing on ia64:

$ grep basic slp-pr56812.cc.115t.slp 
/usr/local/gcc/gcc-20130404/gcc/testsuite/g++.dg/vect/slp-pr56812.cc:16: note:
not vectorized: unsupported alignment in basic block.



* [Bug tree-optimization/56812] Simple loop is not SLP-vectorized after r196872
  2013-04-02 11:21 [Bug tree-optimization/56812] New: Simple loop is not SLP-vectorized after r196872 ysrumyan at gmail dot com
                   ` (7 preceding siblings ...)
  2013-04-04  9:35 ` schwab@linux-m68k.org
@ 2013-04-04  9:45 ` rguenther at suse dot de
  2013-04-04 17:24 ` schwab@linux-m68k.org
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: rguenther at suse dot de @ 2013-04-04  9:45 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812

--- Comment #9 from rguenther at suse dot de <rguenther at suse dot de> 2013-04-04 09:45:27 UTC ---
On Thu, 4 Apr 2013, schwab@linux-m68k.org wrote:

> 
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812
> 
> --- Comment #8 from Andreas Schwab <schwab@linux-m68k.org> 2013-04-04 09:35:51 UTC ---
> The test is failing on ia64:
> 
> $ grep basic slp-pr56812.cc.115t.slp 
> /usr/local/gcc/gcc-20130404/gcc/testsuite/g++.dg/vect/slp-pr56812.cc:16: note:
> not vectorized: unsupported alignment in basic block.

Does adding

/* { dg-require-effective-target vect_hw_misalign } */

work?



* [Bug tree-optimization/56812] Simple loop is not SLP-vectorized after r196872
  2013-04-02 11:21 [Bug tree-optimization/56812] New: Simple loop is not SLP-vectorized after r196872 ysrumyan at gmail dot com
                   ` (8 preceding siblings ...)
  2013-04-04  9:45 ` rguenther at suse dot de
@ 2013-04-04 17:24 ` schwab@linux-m68k.org
  2013-04-07 13:36 ` dominiq at lps dot ens.fr
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: schwab@linux-m68k.org @ 2013-04-04 17:24 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812

--- Comment #10 from Andreas Schwab <schwab@linux-m68k.org> 2013-04-04 17:24:14 UTC ---
Yes, that will skip the test.



* [Bug tree-optimization/56812] Simple loop is not SLP-vectorized after r196872
  2013-04-02 11:21 [Bug tree-optimization/56812] New: Simple loop is not SLP-vectorized after r196872 ysrumyan at gmail dot com
                   ` (9 preceding siblings ...)
  2013-04-04 17:24 ` schwab@linux-m68k.org
@ 2013-04-07 13:36 ` dominiq at lps dot ens.fr
  2013-04-08 14:03 ` ysrumyan at gmail dot com
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: dominiq at lps dot ens.fr @ 2013-04-07 13:36 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812

--- Comment #11 from Dominique d'Humieres <dominiq at lps dot ens.fr> 2013-04-07 13:35:54 UTC ---
> The test is failing on ia64:
>
> $ grep basic slp-pr56812.cc.115t.slp 
> /usr/local/gcc/gcc-20130404/gcc/testsuite/g++.dg/vect/slp-pr56812.cc:16: note:
> not vectorized: unsupported alignment in basic block.

It is also failing on powerpc*-*-*. The test is skipped if I follow comment
#9.



* [Bug tree-optimization/56812] Simple loop is not SLP-vectorized after r196872
  2013-04-02 11:21 [Bug tree-optimization/56812] New: Simple loop is not SLP-vectorized after r196872 ysrumyan at gmail dot com
                   ` (10 preceding siblings ...)
  2013-04-07 13:36 ` dominiq at lps dot ens.fr
@ 2013-04-08 14:03 ` ysrumyan at gmail dot com
  2013-04-08 14:05 ` ysrumyan at gmail dot com
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: ysrumyan at gmail dot com @ 2013-04-08 14:03 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812

--- Comment #12 from Yuri Rumyantsev <ysrumyan at gmail dot com> 2013-04-08 14:03:45 UTC ---
Richard,

We found another issue related to your fix (r196872). For the attached
test case t1.c, vect_gen_niters_for_prolog_loop() now uses a non-invariant
pointer (v1) to compute the number of prolog iterations, whereas before your
fix it used an invariant pointer (x), so all of these computations could be
hoisted out of the outermost loop:

before your fix
  <bb 6>:
  niters.3_17 = (unsigned int) len_7;
  vect_px.4_4 = x_24(D);
  _119 = (unsigned long) vect_px.4_4;
  _118 = _119 & 31;
  _117 = _118 >> 2;
  _116 = -_117;
  _115 = (unsigned int) _116;
  _114 = _115 & 7;
  prolog_loop_niters.5_52 = MIN_EXPR <niters.3_17, _114>;

after your fix

  <bb 6>:
  niters.3_17 = (unsigned int) len_7;
  vect_pv1.4_4 = v1_16;
  _119 = (unsigned long) vect_pv1.4_4;

This leads to a 7% performance regression on 482.sphinx3 from SPEC2006, since
the iteration count of the outer loop (4096) is much greater than that of the
inner loop (13).

This can be reproduced with the following options:

  -O3 -funroll-loops -ffast-math -march=corei7 -mavx
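
The "before your fix" GIMPLE sequence can be read as ordinary C++. The sketch
below mirrors it statement by statement. The function name is hypothetical, and
the reading of the constants (mask 31 as 32-byte vector alignment, shift by 2
as 4-byte elements, mask 7 as eight lanes) is an inference from the dump rather
than something stated in it.

```cpp
#include <cstdint>

// Number of scalar iterations to peel so that the access at addr reaches
// 32-byte alignment, clamped to the loop's trip count.
constexpr unsigned prolog_loop_niters(std::uintptr_t addr, unsigned niters) {
  const std::uintptr_t misalign_bytes = addr & 31;            // _118
  const std::uintptr_t misalign_elems = misalign_bytes >> 2;  // _117
  // Negate in unsigned arithmetic, truncate, keep the low three bits:
  // this is (8 - misalign_elems) mod 8.                         _116 .. _114
  const unsigned to_peel = static_cast<unsigned>(0 - misalign_elems) & 7;
  return niters < to_peel ? niters : to_peel;                 // MIN_EXPR
}

static_assert(prolog_loop_niters(0x1000, 13) == 0, "aligned: nothing to peel");
static_assert(prolog_loop_niters(0x1004, 13) == 7, "one element past alignment");
static_assert(prolog_loop_niters(0x101c, 13) == 1, "one element short of it");
static_assert(prolog_loop_niters(0x1004, 3) == 3, "clamped to the trip count");
```

The result depends only on the pointer and the trip count, so when the pointer
is loop-invariant (x) the whole computation can be done once outside the outer
loop; when it is based on v1, which changes per outer iteration, it has to be
redone each time.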



* [Bug tree-optimization/56812] Simple loop is not SLP-vectorized after r196872
  2013-04-02 11:21 [Bug tree-optimization/56812] New: Simple loop is not SLP-vectorized after r196872 ysrumyan at gmail dot com
                   ` (11 preceding siblings ...)
  2013-04-08 14:03 ` ysrumyan at gmail dot com
@ 2013-04-08 14:05 ` ysrumyan at gmail dot com
  2013-04-08 15:04 ` rguenth at gcc dot gnu.org
  2013-04-08 15:48 ` ysrumyan at gmail dot com
  14 siblings, 0 replies; 16+ messages in thread
From: ysrumyan at gmail dot com @ 2013-04-08 14:05 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812

--- Comment #13 from Yuri Rumyantsev <ysrumyan at gmail dot com> 2013-04-08 14:05:26 UTC ---
Created attachment 29824
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29824
testcase

The following options were used to compile on x86:

 -O3 -funroll-loops -ffast-math -march=corei7 -mavx



* [Bug tree-optimization/56812] Simple loop is not SLP-vectorized after r196872
  2013-04-02 11:21 [Bug tree-optimization/56812] New: Simple loop is not SLP-vectorized after r196872 ysrumyan at gmail dot com
                   ` (12 preceding siblings ...)
  2013-04-08 14:05 ` ysrumyan at gmail dot com
@ 2013-04-08 15:04 ` rguenth at gcc dot gnu.org
  2013-04-08 15:48 ` ysrumyan at gmail dot com
  14 siblings, 0 replies; 16+ messages in thread
From: rguenth at gcc dot gnu.org @ 2013-04-08 15:04 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812

--- Comment #14 from Richard Biener <rguenth at gcc dot gnu.org> 2013-04-08 15:04:10 UTC ---
(In reply to comment #13)
> Created attachment 29824 [details]
> testcase
> 
> The following options were used to compile on x86:
> 
>  -O3 -funroll-loops -ffast-math -march=corei7 -mavx

Can you open a new bugreport for this please?



* [Bug tree-optimization/56812] Simple loop is not SLP-vectorized after r196872
  2013-04-02 11:21 [Bug tree-optimization/56812] New: Simple loop is not SLP-vectorized after r196872 ysrumyan at gmail dot com
                   ` (13 preceding siblings ...)
  2013-04-08 15:04 ` rguenth at gcc dot gnu.org
@ 2013-04-08 15:48 ` ysrumyan at gmail dot com
  14 siblings, 0 replies; 16+ messages in thread
From: ysrumyan at gmail dot com @ 2013-04-08 15:48 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812

--- Comment #15 from Yuri Rumyantsev <ysrumyan at gmail dot com> 2013-04-08 15:48:45 UTC ---
A new bug has been opened for it:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56878

