* [Bug tree-optimization/56812] Simple loop is not SLP-vectorized after r196872
2013-04-02 11:21 [Bug tree-optimization/56812] New: Simple loop is not SLP-vectorized after r196872 ysrumyan at gmail dot com
@ 2013-04-02 11:22 ` ysrumyan at gmail dot com
2013-04-02 11:41 ` ysrumyan at gmail dot com
` (13 subsequent siblings)
14 siblings, 0 replies; 16+ messages in thread
From: ysrumyan at gmail dot com @ 2013-04-02 11:22 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812
--- Comment #1 from Yuri Rumyantsev <ysrumyan at gmail dot com> 2013-04-02 11:22:45 UTC ---
Created attachment 29775
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29775
testcase
It needs to be compiled with the -O3 -funroll-loops options.
From: ysrumyan at gmail dot com @ 2013-04-02 11:41 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812
--- Comment #2 from Yuri Rumyantsev <ysrumyan at gmail dot com> 2013-04-02 11:41:23 UTC ---
Sorry, I made a typo in the -march option; it must be -march=corei7 -mavx.
From: rguenth at gcc dot gnu.org @ 2013-04-02 13:20 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |WAITING
   Last reconfirmed|                            |2013-04-02
     Ever Confirmed|0                           |1
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> 2013-04-02 13:20:47 UTC ---
It should not make a difference ... I see that vectorization is determined
to be not profitable though:
t.C:11: note: === vect_update_slp_costs_according_to_vf ===cost model: prologue
peel iters set to vf/2.cost model: epilogue peel iters set to vf/2 because
peeling for alignment is unknown.
t.C:11: note: Cost model analysis:
Vector inside of loop cost: 1
Vector prologue cost: 11
Vector epilogue cost: 2
Scalar iteration cost: 1
Scalar outside cost: 0
Vector outside cost: 13
prologue iterations: 2
epilogue iterations: 2
Calculated minimum iters for profitability: 17
but it's not vectorized with 4.8 either. -fno-vect-cost-model fixes this.
Are you sure you attached the correct testcase?
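For reference, the "Calculated minimum iters for profitability" figure in the dump above can be reproduced with a small break-even calculation. The sketch below is only an illustration: the struct and function names are hypothetical, the formula is an assumed reconstruction of the vectorizer's break-even test rather than something quoted in this thread, and the vectorization factor of 4 is inferred from "prologue iterations: 2" together with "peel iters set to vf/2".

```cpp
// Inputs mirror the fields of the "Cost model analysis" dump.
// The struct itself is hypothetical, not a GCC data structure.
struct VectCosts {
    int vec_inside_cost;      // "Vector inside of loop cost"
    int vec_outside_cost;     // "Vector outside cost" (prologue + epilogue)
    int scalar_iter_cost;     // "Scalar iteration cost"
    int scalar_outside_cost;  // "Scalar outside cost"
    int peel_prologue;        // "prologue iterations"
    int peel_epilogue;        // "epilogue iterations"
    int vf;                   // vectorization factor (ASSUMED, not printed in the dump)
};

// ASSUMPTION: break-even iteration count for vector vs. scalar code.
// Returns -1 when the vector loop body is never cheaper than vf scalar iterations.
int min_profitable_iters(const VectCosts &c) {
    const int gain_per_vector_iter = c.scalar_iter_cost * c.vf - c.vec_inside_cost;
    if (gain_per_vector_iter <= 0)
        return -1;
    if (c.vec_outside_cost <= 0)
        return 1;

    const int extra_outside = c.vec_outside_cost - c.scalar_outside_cost;
    int iters = (extra_outside * c.vf
                 - c.vec_inside_cost * c.peel_prologue
                 - c.vec_inside_cost * c.peel_epilogue) / gain_per_vector_iter;

    // Integer division rounds down; bump once if the vector version is
    // still not strictly cheaper at this iteration count.
    if (c.scalar_iter_cost * c.vf * iters
        <= c.vec_inside_cost * iters + extra_outside * c.vf)
        ++iters;
    return iters;
}
```

With the dump's values (inside cost 1, outside cost 13, scalar iteration cost 1, scalar outside cost 0, two prologue and two epilogue iterations) and the assumed vf of 4, this yields 17, matching the reported figure. A short loop with few iterations would fall under that threshold.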
From: ysrumyan at gmail dot com @ 2013-04-02 13:27 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812
--- Comment #4 from Yuri Rumyantsev <ysrumyan at gmail dot com> 2013-04-02 13:27:15 UTC ---
Yes, the test-case is correct. If we revert your changes, we get the following
(with -ftree-vectorizer-verbose=3):
t.cc:12: note: vectorizing stmts using SLP.BASIC BLOCK VECTORIZED
t.cc:12: note: basic block vectorized using SLP
From: rguenth at gcc dot gnu.org @ 2013-04-02 13:36 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |ASSIGNED
         AssignedTo|unassigned at gcc dot       |rguenth at gcc dot gnu.org
                   |gnu.org                     |
--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> 2013-04-02 13:36:11 UTC ---
Ah, I see. Confirmed.
From: rguenth at gcc dot gnu.org @ 2013-04-02 14:06 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812
--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> 2013-04-02 14:06:12 UTC ---
The BB vectorization case ran into
      /* When vectorizing a basic block unknown dependence can still mean
         grouped access.  */
      if (vect_check_interleaving (dra, drb))
        return false;
that is, whenever dra and drb are part of the same interleaving chain
it considers the accesses to be independent. I'm not sure this is
a good idea in general, but it's easy to reinstate.
There is no reason why dependence analysis should fail here though
(it does because dr_may_alias_p does not use SCEV info as computed by
dr_analyze_innermost and used by interleaving chain analysis, but instead
it looks at the base DR_REF which is also used as DR_BASE_OBJECT).
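To make the shortcut described above concrete, here is a minimal sketch of the idea. Everything in it is hypothetical and heavily simplified: DataRef is a stand-in rather than GCC's data_reference, and same_interleaving_chain is an assumed approximation of what vect_check_interleaving decides, not its actual implementation.

```cpp
// Hypothetical, simplified description of one memory access.
struct DataRef {
    int  base_id;  // identifies the base object or pointer
    long init;     // constant byte offset from the base
    long step;     // byte step per iteration (0 in straight-line code)
    long size;     // access size in bytes
};

// ASSUMPTION: two accesses belong to one interleaving chain when they share
// base, step and size, and sit at distinct offsets that differ by a whole
// number of elements. Such accesses cannot overlap within one iteration.
bool same_interleaving_chain(const DataRef &a, const DataRef &b) {
    if (a.base_id != b.base_id || a.step != b.step || a.size != b.size)
        return false;
    const long delta = a.init - b.init;
    return delta != 0 && delta % a.size == 0;
}

// Mirrors the quoted snippet: when dependence analysis gives no answer,
// members of the same chain are still treated as independent.
// Returns true when a dependence has to be assumed.
bool unknown_dependence_blocks_slp(const DataRef &a, const DataRef &b) {
    if (same_interleaving_chain(a, b))
        return false;  // grouped access: considered independent
    return true;       // otherwise stay conservative
}
```

For example, stores to data[0] and data[1] of a float array (offsets 0 and 4, size 4, same base) count as one chain and do not block basic-block vectorization in this model, while two accesses through unrelated bases still do.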
From: rguenth at gcc dot gnu.org @ 2013-04-03 8:04 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED
--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> 2013-04-03 08:04:16 UTC ---
Author: rguenth
Date: Wed Apr 3 08:03:33 2013
New Revision: 197390
URL: http://gcc.gnu.org/viewcvs?rev=197390&root=gcc&view=rev
Log:
2013-04-03 Richard Biener <rguenther@suse.de>
PR tree-optimization/56812
* tree-vect-data-refs.c (vect_slp_analyze_data_ref_dependence):
DRs of the same interleaving chain are independent.
* g++.dg/vect/slp-pr56812.cc: New testcase.
Added:
trunk/gcc/testsuite/g++.dg/vect/slp-pr56812.cc
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-vect-data-refs.c
From: schwab@linux-m68k.org @ 2013-04-04 9:35 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812
--- Comment #8 from Andreas Schwab <schwab@linux-m68k.org> 2013-04-04 09:35:51 UTC ---
The test is failing on ia64:
$ grep basic slp-pr56812.cc.115t.slp
/usr/local/gcc/gcc-20130404/gcc/testsuite/g++.dg/vect/slp-pr56812.cc:16: note:
not vectorized: unsupported alignment in basic block.
From: rguenther at suse dot de @ 2013-04-04 9:45 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812
--- Comment #9 from rguenther at suse dot de <rguenther at suse dot de> 2013-04-04 09:45:27 UTC ---
On Thu, 4 Apr 2013, schwab@linux-m68k.org wrote:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812
>
> --- Comment #8 from Andreas Schwab <schwab@linux-m68k.org> 2013-04-04 09:35:51 UTC ---
> The test is failing on ia64:
>
> $ grep basic slp-pr56812.cc.115t.slp
> /usr/local/gcc/gcc-20130404/gcc/testsuite/g++.dg/vect/slp-pr56812.cc:16: note:
> not vectorized: unsupported alignment in basic block.
Does adding
/* { dg-require-effective-target vect_hw_misalign } */
work?
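As an illustration of where such a directive would sit, here is a sketch of a DejaGnu-style test file. The test body is hypothetical (the real slp-pr56812.cc is not shown in this thread); only the vect_hw_misalign directive and the scanned message "basic block vectorized using SLP" come from the comments above, and the other directives are assumptions about how such a test is typically laid out.

```cpp
/* { dg-do compile } */
/* { dg-require-effective-target vect_hw_misalign } */

// Hypothetical test body: a short fixed-count loop filling a small array,
// which after unrolling becomes a group of adjacent stores in one block.
struct Data {
    float values[8];
    void set(float x);
};

void Data::set(float x) {
    for (int i = 0; i < 8; i++)
        values[i] = x;
}

/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp" } } */
```

With the requirement in place, targets that cannot do misaligned vector accesses in hardware report the test as unsupported instead of failing it.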
From: schwab@linux-m68k.org @ 2013-04-04 17:24 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812
--- Comment #10 from Andreas Schwab <schwab@linux-m68k.org> 2013-04-04 17:24:14 UTC ---
Yes, that will skip the test.
From: dominiq at lps dot ens.fr @ 2013-04-07 13:36 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812
--- Comment #11 from Dominique d'Humieres <dominiq at lps dot ens.fr> 2013-04-07 13:35:54 UTC ---
> The test is failing on ia64:
>
> $ grep basic slp-pr56812.cc.115t.slp
> /usr/local/gcc/gcc-20130404/gcc/testsuite/g++.dg/vect/slp-pr56812.cc:16: note:
> not vectorized: unsupported alignment in basic block.
It is also failing on powerpc*-*-*. The test is skipped if I follow comment
#9.
From: ysrumyan at gmail dot com @ 2013-04-08 14:03 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812
--- Comment #12 from Yuri Rumyantsev <ysrumyan at gmail dot com> 2013-04-08 14:03:45 UTC ---
Richard,
We found another issue related to your fix (r196872). For the attached
test-case t1.c, vect_gen_niters_for_prolog_loop() now uses a non-invariant
pointer (v1) to compute the number of prolog iterations. Before your fix it
used an invariant pointer (x), so all of these computations could be hoisted
out of the outermost loop:
before your fix
<bb 6>:
niters.3_17 = (unsigned int) len_7;
vect_px.4_4 = x_24(D);
_119 = (unsigned long) vect_px.4_4;
_118 = _119 & 31;
_117 = _118 >> 2;
_116 = -_117;
_115 = (unsigned int) _116;
_114 = _115 & 7;
prolog_loop_niters.5_52 = MIN_EXPR <niters.3_17, _114>;
after your fix
<bb 6>:
niters.3_17 = (unsigned int) len_7;
vect_pv1.4_4 = v1_16;
_119 = (unsigned long) vect_pv1.4_4;
This leads to a 7% performance regression on 482.sphinx3 from SPEC2006, since
the iteration count of the outer loop (4096) is much greater than that of the
inner loop (13).
This can be reproduced with following options:
-O3 -funroll-loops -ffast-math -march=corei7 -mavx
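The prolog computation in the "before your fix" dump can be read as the following C++ sketch. The function name is hypothetical; the constants (32-byte alignment, 4-byte elements, 8 elements per vector) are read directly off the masks and shifts in the dump, and taking the address as an integer is a simplification made for illustration.

```cpp
#include <algorithm>
#include <cstdint>

// Number of scalar iterations to peel so that the access starting at `addr`
// reaches 32-byte alignment, capped by the loop's trip count.
unsigned prolog_loop_niters(std::uintptr_t addr, unsigned niters) {
    const std::uintptr_t misalign_bytes = addr & 31;            // _118 = _119 & 31
    const std::uintptr_t misalign_elems = misalign_bytes >> 2;  // _117 = _118 >> 2
    // _116 = -_117; _115 = (unsigned int) _116; _114 = _115 & 7
    const unsigned to_aligned = static_cast<unsigned>(0 - misalign_elems) & 7;
    return std::min(niters, to_aligned);                        // MIN_EXPR
}
```

When the address comes from a pointer that is invariant in the outer loop (x in the dump), this value only has to be computed once and can be hoisted. When it comes from a pointer that changes on every outer iteration (v1), it has to be recomputed each time, which is the extra per-iteration work reported above.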
From: ysrumyan at gmail dot com @ 2013-04-08 14:05 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812
--- Comment #13 from Yuri Rumyantsev <ysrumyan at gmail dot com> 2013-04-08 14:05:26 UTC ---
Created attachment 29824
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29824
testcase
The following options were used to compile on x86:
-O3 -funroll-loops -ffast-math -march=corei7 -mavx
From: rguenth at gcc dot gnu.org @ 2013-04-08 15:04 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812
--- Comment #14 from Richard Biener <rguenth at gcc dot gnu.org> 2013-04-08 15:04:10 UTC ---
(In reply to comment #13)
> Created attachment 29824 [details]
> testcase
>
> The following options were used to compile on x86:
>
> -O3 -funroll-loops -ffast-math -march=corei7 -mavx
Can you open a new bug report for this, please?
From: ysrumyan at gmail dot com @ 2013-04-08 15:48 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56812
--- Comment #15 from Yuri Rumyantsev <ysrumyan at gmail dot com> 2013-04-08 15:48:45 UTC ---
A new bug has been opened for it:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56878