public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/63677] Failure to constant fold with vectorization.
       [not found] <bug-63677-4@http.gcc.gnu.org/bugzilla/>
@ 2014-10-29 16:50 ` belagod at gcc dot gnu.org
  2014-10-29 17:24 ` pinskia at gcc dot gnu.org
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 8+ messages in thread
From: belagod at gcc dot gnu.org @ 2014-10-29 16:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63677

--- Comment #1 from Tejas Belagod <belagod at gcc dot gnu.org> ---
The behaviour is similar on aarch64, so this does not look like a backend
issue.



* [Bug tree-optimization/63677] Failure to constant fold with vectorization.
       [not found] <bug-63677-4@http.gcc.gnu.org/bugzilla/>
  2014-10-29 16:50 ` [Bug tree-optimization/63677] Failure to constant fold with vectorization belagod at gcc dot gnu.org
@ 2014-10-29 17:24 ` pinskia at gcc dot gnu.org
  2014-10-29 18:21 ` jakub at gcc dot gnu.org
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 8+ messages in thread
From: pinskia at gcc dot gnu.org @ 2014-10-29 17:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63677

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I have seen this also.



* [Bug tree-optimization/63677] Failure to constant fold with vectorization.
       [not found] <bug-63677-4@http.gcc.gnu.org/bugzilla/>
  2014-10-29 16:50 ` [Bug tree-optimization/63677] Failure to constant fold with vectorization belagod at gcc dot gnu.org
  2014-10-29 17:24 ` pinskia at gcc dot gnu.org
@ 2014-10-29 18:21 ` jakub at gcc dot gnu.org
  2014-10-30  9:52 ` rguenth at gcc dot gnu.org
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 8+ messages in thread
From: jakub at gcc dot gnu.org @ 2014-10-29 18:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63677

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2014-10-29
                 CC|                            |jakub at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
The problem is that the loop is first vectorized, then several passes later slp
vectorizes the initialization, so after some cleanups we have e.g. in cddce2:
  MEM[(int *)&a] = { 0, 1, 2, 3 };
  MEM[(int *)&a + 16B] = { 4, 5, 6, 7 };
  vect__13.6_20 = MEM[(int *)&a];
  vect__13.6_17 = MEM[(int *)&a + 16B];
But there is no further FRE pass that would optimize the loads into
  vect__13.6_20 = { 0, 1, 2, 3 };
  vect__13.6_17 = { 4, 5, 6, 7 };
(presumably that would need to happen before forwprop4, which could in theory
refold all the stmts into a constant).

Richard, how expensive would it be to schedule another FRE pass if anything has
been vectorized in the current function (by either the vect pass or slp)?  Or
are there other passes that handle this?  Looking at e.g.
typedef int V __attribute__((vector_size (4 * sizeof (int))));
struct S { int a[4]; };
V __attribute__ ((noinline)) foo (struct S *p)
{
  *(V *) p = (V) { 1, 2, 3, 4 };
  return *(V *) p;
}
with -O2 -fno-tree-fre, it seems DOM is able to do that, but unfortunately at
dom2 time the values have not been sufficiently forward propagated for dom2 to
optimize this.
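
A minimal sketch for experimenting with the example above. The function foo
is taken from the comment; the driver check_foo and the explicit alignment of
the local struct are assumptions added here so the vector store through the
cast pointer is suitably aligned.

```c
/* GCC vector extension: V holds four ints.  */
typedef int V __attribute__((vector_size (4 * sizeof (int))));
struct S { int a[4]; };

/* From the comment: store a constant vector, then load it back.
   An optimizer that forwards the store to the load can return the
   constant directly.  */
V __attribute__ ((noinline)) foo (struct S *p)
{
  *(V *) p = (V) { 1, 2, 3, 4 };
  return *(V *) p;
}

/* Hypothetical driver (not part of the original report): returns 1 if
   both the returned vector and the memory hold { 1, 2, 3, 4 }.  */
int check_foo (void)
{
  /* Align the object as V requires, since foo accesses it as a V.  */
  struct S s __attribute__ ((aligned (__alignof__ (V)))) = { { 0, 0, 0, 0 } };
  V v = foo (&s);
  int i;

  for (i = 0; i < 4; i++)
    if (v[i] != i + 1 || s.a[i] != i + 1)
      return 0;
  return 1;
}
```

Compiling this with -O2 -fno-tree-fre -fdump-tree-all and reading the dumps
is one way to see which pass, if any, replaces the load with the constant.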



* [Bug tree-optimization/63677] Failure to constant fold with vectorization.
       [not found] <bug-63677-4@http.gcc.gnu.org/bugzilla/>
                   ` (2 preceding siblings ...)
  2014-10-29 18:21 ` jakub at gcc dot gnu.org
@ 2014-10-30  9:52 ` rguenth at gcc dot gnu.org
  2014-11-19  9:35 ` rguenth at gcc dot gnu.org
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 8+ messages in thread
From: rguenth at gcc dot gnu.org @ 2014-10-30  9:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63677

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #3)
> The problem is that the loop is first vectorized, then several passes later
> slp vectorizes the initialization, so after some cleanups we have e.g. in
> cddce2:
>   MEM[(int *)&a] = { 0, 1, 2, 3 };
>   MEM[(int *)&a + 16B] = { 4, 5, 6, 7 };
>   vect__13.6_20 = MEM[(int *)&a];
>   vect__13.6_17 = MEM[(int *)&a + 16B];
> But there is no further FRE pass that would optimize the loads into
>   vect__13.6_20 = { 0, 1, 2, 3 };
>   vect__13.6_17 = { 4, 5, 6, 7 };
> (presumably that would need to happen before forwprop4, which could in theory
> refold all the stmts into a constant).
> 
> Richard, how expensive would it be to schedule another FRE pass if anything has
> been vectorized in the current function (by either the vect pass or slp)?  Or
> are there other passes that handle this?  Looking at e.g.
> typedef int V __attribute__((vector_size (4 * sizeof (int))));
> struct S { int a[4]; };
> V __attribute__ ((noinline)) foo (struct S *p)
> {
>   *(V *) p = (V) { 1, 2, 3, 4 };
>   return *(V *) p;
> }
> with -O2 -fno-tree-fre, it seems DOM is able to do that, but unfortunately
> at dom2 time the values have not been sufficiently forward propagated for
> dom2 to optimize this.

For the case in question only FRE can handle CSEing of
the MEM[(int *)&a] load (DOM should handle the load of _17 fine).
I'm not very fond of adding more passes, but in theory a FRE right
after pass_tree_loop_done could do the trick.  Though ideally you'd
want it a bit later, after vector lowering - and after tracer
(so where the current DOM sits and remove DOM).  Of course FRE is
more expensive than DOM and DOM might catch some jump threading
opportunities (though VRP does that as well).



* [Bug tree-optimization/63677] Failure to constant fold with vectorization.
       [not found] <bug-63677-4@http.gcc.gnu.org/bugzilla/>
                   ` (3 preceding siblings ...)
  2014-10-30  9:52 ` rguenth at gcc dot gnu.org
@ 2014-11-19  9:35 ` rguenth at gcc dot gnu.org
  2014-11-20  8:41 ` rguenth at gcc dot gnu.org
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 8+ messages in thread
From: rguenth at gcc dot gnu.org @ 2014-11-19  9:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63677

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
With the patch from PR63864 we still don't optimize:

  <bb 2>:
  vect_cst_.12_23 = { 0, 1, 2, 3 };
  vect_cst_.11_32 = { 4, 5, 6, 7 };
  vectp.14_2 = &a[0];
  MEM[(int *)&a] = { 0, 1, 2, 3 };
  vectp.14_21 = &a[0] + 16;
  MEM[(int *)vectp.14_21] = { 4, 5, 6, 7 };
  vectp_a.5_22 = &a;
  vect__13.6_20 = MEM[(int *)&a];

this is because while seeing the candidate MEM[(int *)&a] = { 0, 1, 2, 3 };
for the load vect__13.6_20 = MEM[(int *)&a]; we fail to disambiguate
against the store to MEM[(int *)vectp.14_21] which is not simplified
to MEM[&a, 16] = { 4, 5, 6, 7 }; because DOM does not have the "trick"
of representing invariant-ptr + CST as &MEM[&..., CST'] for propagation.
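
A source-level analogy of that representation, as a rough sketch only: it
does not use any GCC internals, and the names offsets_overlap and
second_store_is_disjoint are hypothetical. Once the second store's address is
expressed as the same base plus a constant offset, checking that it cannot
clobber the first vector reduces to comparing two offset ranges.

```c
#include <stddef.h>

/* Size in bytes of one four-int vector (16 when int is 4 bytes).  */
enum { VEC_BYTES = 4 * sizeof (int) };

static int a[8];

/* Hypothetical helper: 1 if the byte ranges [off1, off1 + len1) and
   [off2, off2 + len2), both relative to the same base, overlap.  */
static int offsets_overlap (size_t off1, size_t len1,
                            size_t off2, size_t len2)
{
  return off1 < off2 + len2 && off2 < off1 + len1;
}

/* Returns 1 if "base + constant" names the same address as &a[4] and the
   second vector-sized store is disjoint from the first one.  */
int second_store_is_disjoint (void)
{
  char *base = (char *) &a[0];
  char *second = base + VEC_BYTES;   /* like vectp.14_21 = &a[0] + 16 */
  int same_address = (second == (char *) &a[4]);
  size_t off = (size_t) (second - base);

  return same_address && !offsets_overlap (0, VEC_BYTES, off, VEC_BYTES);
}
```

With the address left as an opaque pointer value, no such offset comparison
is available, which matches the failed disambiguation described above.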

If I fix that (huh, not sure why we don't simply fold the pointer-plus that
way, now four places do that trick for propagation...) then it works:

LKUP STMT vect__13.6_20 = MEM[(int *)&a]
          vect__13.6_20 = MEM[(int *)&a];
FIND: { 0, 1, 2, 3 }
  Replaced redundant expr 'MEM[(int *)&a]' with '{ 0, 1, 2, 3 }'

t.c.183t.optimized:

foo ()
{
  <bb 2>:
  return 28;

}
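
The original testcase is not quoted in this thread. As an assumption based on
the constants { 0, 1, 2, 3 } and { 4, 5, 6, 7 } and on the folded result
above, a function of roughly the following shape would exercise the same
path; the name sum_array is hypothetical.

```c
/* Assumed reconstruction: sum a small array of known constants.
   The initialization can be SLP-vectorized into two vector stores and the
   loop into vector loads; folding all of it yields the constant 28.  */
int sum_array (void)
{
  const int a[8] = { 0, 1, 2, 3, 4, 5, 6, 7 };
  int i, sum = 0;

  for (i = 0; i < 8; i++)
    sum += a[i];
  return sum;
}
```

Compiling with -O3 -fdump-tree-optimized and checking whether the dump
contains a plain "return 28;" is one way to see whether the folding happened.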



* [Bug tree-optimization/63677] Failure to constant fold with vectorization.
       [not found] <bug-63677-4@http.gcc.gnu.org/bugzilla/>
                   ` (4 preceding siblings ...)
  2014-11-19  9:35 ` rguenth at gcc dot gnu.org
@ 2014-11-20  8:41 ` rguenth at gcc dot gnu.org
  2014-11-20  8:43 ` rguenth at gcc dot gnu.org
  2014-11-20 12:44 ` rguenth at gcc dot gnu.org
  7 siblings, 0 replies; 8+ messages in thread
From: rguenth at gcc dot gnu.org @ 2014-11-20  8:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63677

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
Author: rguenth
Date: Thu Nov 20 08:40:52 2014
New Revision: 217827

URL: https://gcc.gnu.org/viewcvs?rev=217827&root=gcc&view=rev
Log:
2014-11-20   Richard Biener  <rguenther@suse.de>

    PR tree-optimization/63677
    * tree-ssa-dom.c: Include gimplify.h for unshare_expr.
    (avail_exprs_stack): Make a vector of pairs.
    (struct hash_expr_elt): Replace stmt member with vop member.
    (expr_elt_hasher::equal): Simplify.
    (initialize_hash_element): Adjust.
    (initialize_hash_element_from_expr): Likewise.
    (dom_opt_dom_walker::thread_across_edge): Likewise.
    (record_cond): Likewise.
    (dom_opt_dom_walker::before_dom_children): Likewise.
    (print_expr_hash_elt): Likewise.
    (remove_local_expressions_from_table): Restore previous state
    if requested.
    (record_equivalences_from_stmt): Record &x + CST as constant
    &MEM[&x, CST] for further propagation.
    (vuse_eq): New function.
    (lookup_avail_expr): For loads use the alias oracle to see
    whether a candidate from the expr hash is usable.
    (avail_expr_hash): Do not hash VUSEs.

    * gcc.dg/tree-ssa/ssa-dom-cse-2.c: New testcase.
    * gcc.dg/tree-ssa/ssa-dom-cse-3.c: Likewise.

Added:
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-3.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-ssa-dom.c



* [Bug tree-optimization/63677] Failure to constant fold with vectorization.
       [not found] <bug-63677-4@http.gcc.gnu.org/bugzilla/>
                   ` (5 preceding siblings ...)
  2014-11-20  8:41 ` rguenth at gcc dot gnu.org
@ 2014-11-20  8:43 ` rguenth at gcc dot gnu.org
  2014-11-20 12:44 ` rguenth at gcc dot gnu.org
  7 siblings, 0 replies; 8+ messages in thread
From: rguenth at gcc dot gnu.org @ 2014-11-20  8:43 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63677

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed for GCC 5.



* [Bug tree-optimization/63677] Failure to constant fold with vectorization.
       [not found] <bug-63677-4@http.gcc.gnu.org/bugzilla/>
                   ` (6 preceding siblings ...)
  2014-11-20  8:43 ` rguenth at gcc dot gnu.org
@ 2014-11-20 12:44 ` rguenth at gcc dot gnu.org
  7 siblings, 0 replies; 8+ messages in thread
From: rguenth at gcc dot gnu.org @ 2014-11-20 12:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63677

--- Comment #8 from Richard Biener <rguenth at gcc dot gnu.org> ---
*** Bug 63679 has been marked as a duplicate of this bug. ***

