From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugs-return-468230-listarch-gcc-bugs=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 5539 invoked by alias); 24 Nov 2014 11:16:37 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-bugs.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-bugs/>
List-Post: <mailto:gcc-bugs@gcc.gnu.org>
List-Help: <mailto:gcc-bugs-help@gcc.gnu.org>
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 5477 invoked by uid 48); 24 Nov 2014 11:16:32 -0000
From: "belagod at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/63679] [5.0 Regression][AArch64] Failure to constant fold.
Date: Mon, 24 Nov 2014 11:16:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 5.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: belagod at gcc dot gnu.org
X-Bugzilla-Status: ASSIGNED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: rguenth at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 5.0
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Message-ID: <bug-63679-4-sYN6FpuAD6@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-63679-4@http.gcc.gnu.org/bugzilla/>
References: <bug-63679-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2014-11/txt/msg02702.txt.bz2

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63679

--- Comment #17 from Tejas Belagod <belagod at gcc dot gnu.org> ---
> -
>  	    /* Do a block move either if the size is so small as to make
>  	       each individual move a sub-unit move on average, or if it
> -	       is so large as to make individual moves inefficient.  */
> +	       is so large as to make individual moves inefficient.  Reuse
> +	       the same costs logic as we use in the SRA passes.  */
> +            unsigned max_scalarization_size
> +	      = optimize_function_for_size_p (cfun)
> +	        ? PARAM_VALUE (PARAM_SRA_MAX_SCALARIZATION_SIZE_SIZE)
> +		: PARAM_VALUE (PARAM_SRA_MAX_SCALARIZATION_SIZE_SPEED);
> +
>  	    if (size > 0
>  		&& num_nonzero_elements > 1
>  		&& (size < num_nonzero_elements
> -		    || !can_move_by_pieces (size, align)))
> +		    || size > max_scalarization_size))
>  	      {
>  		if (notify_temp_creation)
>  		  return GS_ERROR;

I think both move_by_pieces and SRA can co-exist here:
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 8e3dd83..be51ce7 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -70,6 +70,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "omp-low.h"
 #include "gimple-low.h"
 #include "cilk.h"
+#include "params.h"

 #include "langhooks-def.h"    /* FIXME: for lhd_set_decl_assembler_name */
 #include "tree-pass.h"        /* FIXME: only for PROP_gimple_any */
@@ -3895,7 +3896,6 @@ gimplify_init_constructor (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
                       DECL_ATTRIBUTES (current_function_decl))))
       {
         HOST_WIDE_INT size = int_size_in_bytes (type);
        unsigned int align;

         /* ??? We can still get unbounded array types, at least
            from the C++ front end.  This seems wrong, but attempt
@@ -3907,20 +3907,19 @@ gimplify_init_constructor (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
           TREE_TYPE (ctor) = type = TREE_TYPE (object);
           }

        /* Find the maximum alignment we can assume for the object.  */
        /* ??? Make use of DECL_OFFSET_ALIGN.  */
        if (DECL_P (object))
          align = DECL_ALIGN (object);
        else
          align = TYPE_ALIGN (type);

         /* Do a block move either if the size is so small as to make
            each individual move a sub-unit move on average, or if it
-           is so large as to make individual moves inefficient.  */
+           is so large as to make individual moves inefficient.  Reuse
+           the same costs logic as we use in the SRA passes.  */
+            unsigned max_scalarization_size
+          = optimize_function_for_size_p (cfun)
+            ? PARAM_VALUE (PARAM_SRA_MAX_SCALARIZATION_SIZE_SIZE)
+        : PARAM_VALUE (PARAM_SRA_MAX_SCALARIZATION_SIZE_SPEED);
+
         if (size > 0
         && num_nonzero_elements > 1
         && (size < num_nonzero_elements
+            || size > max_scalarization_size
            || !can_move_by_pieces (size, align)))
           {
         if (notify_temp_creation)
           return GS_ERROR;

If it isn't profitable to do an SRA, we can fall back to the backend hook to
move it by pieces. This way, I think we'll have more opportunity for
optimization.
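
As a rough illustration of the combined condition in the patch above, here is a
minimal standalone sketch in C. It is not GCC code: sketch_use_block_move and
sketch_can_move_by_pieces are hypothetical names, the 16-byte limit in the
stand-in hook is an arbitrary assumption, and max_scalarization_size is passed
in as a plain argument rather than read from the SRA params.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the backend hook can_move_by_pieces ().
   Assumption for illustration only: blocks of up to 16 bytes can be
   moved by pieces, regardless of alignment.  */
static bool
sketch_can_move_by_pieces (long size, unsigned int align)
{
  (void) align;
  return size <= 16;
}

/* Mirror of the proposed gimplifier test: return true when the
   constructor should become a block move instead of individual
   element stores.  */
static bool
sketch_use_block_move (long size, long num_nonzero_elements,
                       unsigned int align,
                       unsigned int max_scalarization_size)
{
  return size > 0
         && num_nonzero_elements > 1
         && (size < num_nonzero_elements
             || size > (long) max_scalarization_size
             || !sketch_can_move_by_pieces (size, align));
}
```

With a limit of 32, an 8-byte object with two nonzero elements is not block
moved, whereas a 64-byte object (over the SRA limit) or a 24-byte object
(rejected by the stand-in hook) is.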
>>From gcc-bugs-return-468231-listarch-gcc-bugs=gcc.gnu.org@gcc.gnu.org Mon Nov 24 11:20:46 2014
Return-Path: <gcc-bugs-return-468231-listarch-gcc-bugs=gcc.gnu.org@gcc.gnu.org>
Delivered-To: listarch-gcc-bugs@gcc.gnu.org
Received: (qmail 10349 invoked by alias); 24 Nov 2014 11:20:46 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-bugs.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-bugs/>
List-Post: <mailto:gcc-bugs@gcc.gnu.org>
List-Help: <mailto:gcc-bugs-help@gcc.gnu.org>
Sender: gcc-bugs-owner@gcc.gnu.org
Delivered-To: mailing list gcc-bugs@gcc.gnu.org
Received: (qmail 10310 invoked by uid 48); 24 Nov 2014 11:20:41 -0000
From: "rguenth at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/64031] (un-)conditional execution state is not preserved by PRE/sink
Date: Mon, 24 Nov 2014 11:20:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 4.9.3
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rguenth at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: keywords bug_status cf_reconfirmed_on blocked short_desc everconfirmed
Message-ID: <bug-64031-4-MYEACF7ugX@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-64031-4@http.gcc.gnu.org/bugzilla/>
References: <bug-64031-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2014-11/txt/msg02703.txt.bz2
Content-length: 1378

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64031

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2014-11-24
             Blocks|                            |53947
            Summary|Vectorization of max/min is |(un-)conditional execution
                   |not robust enough           |state is not preserved by
                   |                            |PRE/sink
     Ever confirmed|0                           |1

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
The issue is that PRE optimizes this to

  f2_11 = f2_10 * f2_10;
  if (f2_10 < f2_11)
    goto <bb 5>;
  else
    goto <bb 4>;

  <bb 4>:
  pretmp_25 = f2_11 * f2_11;

  <bb 5>:
  # prephitmp_26 = PHI <f2_11(3), pretmp_25(4)>
  *_9 = prephitmp_26;

and f2_11 * f2_11 may trap, so ifcvt refuses to execute it unconditionally
(even though it was only PRE that made it conditionally executed).

Thus "confirmed": both PRE and code sinking can make stmts conditionally
executed that were not so before, which can pessimize transforms done by
later passes such as LIM and if-conversion.
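
To make the shape of the problem concrete, here is a hypothetical source-level
sketch in C. It is not the testcase from the report; the function names are
invented, and the two functions only mimic the control flow before and after
the transformation shown in the dump above.

```c
/* Before PRE (hypothetical): the second multiply executes on every path,
   so a later pass may treat it as safe to execute unconditionally.  */
static float
sketch_before_pre (float f2)
{
  float sq = f2 * f2;
  float quad = sq * sq;        /* unconditional */
  return f2 < sq ? sq : quad;
}

/* After PRE (hypothetical): the multiply now lives only on the else path,
   like pretmp_25 in <bb 4>.  Because it may trap, if-conversion will not
   hoist it back out of the branch.  */
static float
sketch_after_pre (float f2)
{
  float sq = f2 * f2;
  float result;
  if (f2 < sq)
    result = sq;
  else
    result = sq * sq;          /* conditional */
  return result;
}
```

Both functions return the same value for any input; the only difference is
whether the second multiply is executed unconditionally.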