From mboxrd@z Thu Jan  1 00:00:00 1970
Received: (qmail 5539 invoked by alias); 24 Nov 2014 11:16:37 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 5477 invoked by uid 48); 24 Nov 2014 11:16:32 -0000
From: "belagod at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/63679] [5.0 Regression][AArch64] Failure to constant fold.
Date: Mon, 24 Nov 2014 11:16:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 5.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: belagod at gcc dot gnu.org
X-Bugzilla-Status: ASSIGNED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: rguenth at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 5.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2014-11/txt/msg02702.txt.bz2

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63679

--- Comment #17 from Tejas Belagod ---
> -
>    /* Do a block move either if the size is so small as to make
>       each individual move a sub-unit move on average, or if it
> -     is so large as to make individual moves inefficient.  */
> +     is so large as to make individual moves inefficient.  Reuse
> +     the same costs logic as we use in the SRA passes.  */
> +  unsigned max_scalarization_size
> +    = optimize_function_for_size_p (cfun)
> +      ? PARAM_VALUE (PARAM_SRA_MAX_SCALARIZATION_SIZE_SIZE)
> +      : PARAM_VALUE (PARAM_SRA_MAX_SCALARIZATION_SIZE_SPEED);
> +
>    if (size > 0
>        && num_nonzero_elements > 1
>        && (size < num_nonzero_elements
> -          || !can_move_by_pieces (size, align)))
> +          || size > max_scalarization_size))
>      {
>        if (notify_temp_creation)
>          return GS_ERROR;

I think both move_by_pieces and SRA can co-exist here:

diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 8e3dd83..be51ce7 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -70,6 +70,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "omp-low.h"
 #include "gimple-low.h"
 #include "cilk.h"
+#include "params.h"

 #include "langhooks-def.h"     /* FIXME: for lhd_set_decl_assembler_name */
 #include "tree-pass.h"         /* FIXME: only for PROP_gimple_any */
@@ -3895,7 +3896,6 @@ gimplify_init_constructor (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
                             DECL_ATTRIBUTES (current_function_decl))))
          {
            HOST_WIDE_INT size = int_size_in_bytes (type);
            unsigned int align;

            /* ??? We can still get unbounded array types, at least
               from the C++ front end.  This seems wrong, but attempt
@@ -3907,20 +3907,19 @@ gimplify_init_constructor (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
                TREE_TYPE (ctor) = type = TREE_TYPE (object);
              }

            /* Find the maximum alignment we can assume for the object.  */
            /* ??? Make use of DECL_OFFSET_ALIGN.  */
            if (DECL_P (object))
              align = DECL_ALIGN (object);
            else
              align = TYPE_ALIGN (type);

            /* Do a block move either if the size is so small as to make
               each individual move a sub-unit move on average, or if it
-              is so large as to make individual moves inefficient.  */
+              is so large as to make individual moves inefficient.  Reuse
+              the same costs logic as we use in the SRA passes.  */
+           unsigned max_scalarization_size
+             = optimize_function_for_size_p (cfun)
+               ? PARAM_VALUE (PARAM_SRA_MAX_SCALARIZATION_SIZE_SIZE)
+               : PARAM_VALUE (PARAM_SRA_MAX_SCALARIZATION_SIZE_SPEED);
+
            if (size > 0
                && num_nonzero_elements > 1
                && (size < num_nonzero_elements
+                   || size > max_scalarization_size
                    || !can_move_by_pieces (size, align)))
              {
                if (notify_temp_creation)
                  return GS_ERROR;

If it isn't profitable to do an SRA, we can fall back to the backend hook to
move it by pieces. This way, I think we'll have more opportunity for
optimization.

From gcc-bugs-return-468231-listarch-gcc-bugs=gcc.gnu.org@gcc.gnu.org Mon Nov 24 11:20:46 2014
Delivered-To: listarch-gcc-bugs@gcc.gnu.org
Received: (qmail 10349 invoked by alias); 24 Nov 2014 11:20:46 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
Sender: gcc-bugs-owner@gcc.gnu.org
Delivered-To: mailing list gcc-bugs@gcc.gnu.org
Received: (qmail 10310 invoked by uid 48); 24 Nov 2014 11:20:41 -0000
From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/64031] (un-)conditional execution state is not preserved by PRE/sink
Date: Mon, 24 Nov 2014 11:20:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 4.9.3
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rguenth at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Changed-Fields: keywords bug_status cf_reconfirmed_on blocked short_desc everconfirmed
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2014-11/txt/msg02703.txt.bz2
Content-length: 1378

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64031

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2014-11-24
             Blocks|                            |53947
            Summary|Vectorization of max/min is |(un-)conditional execution
                   |not robust enough           |state is not preserved by
                   |                            |PRE/sink
     Ever confirmed|0                           |1

--- Comment #2 from Richard Biener ---
The issue is that PRE optimizes this to

  f2_11 = f2_10 * f2_10;
  if (f2_10 < f2_11)
    goto ;
  else
    goto ;

  :
  pretmp_25 = f2_11 * f2_11;

  :
  # prephitmp_26 = PHI
  *_9 = prephitmp_26;

and f2_11 * f2_11 may trap, thus ifcvt refuses to execute it unconditionally
(but only PRE made it executed conditionally).

Thus "confirmed" that both PRE and code sinking can make stmts executed
conditionally while they were not so before, which can pessimize transforms
done by later passes such as LIM and if-conversion.
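
The effect described in comment #2 can be modelled in plain C. This is only an
illustrative sketch, not the testcase from the bug: the function names are
made up, and the value taken on the else path (f2_11 itself) is an assumption,
since the PHI arguments are not visible in the dump above. `unconditional`
performs the second multiply on every path; `after_pre` performs it only under
the branch, which is the shape PRE left behind.

```c
#include <assert.h>

/* Hypothetical model: the second multiply runs on every path, so a later
   pass may treat it as already executed unconditionally.  */
float unconditional(float f2_10)
{
    float f2_11 = f2_10 * f2_10;
    float sq = f2_11 * f2_11;               /* always executed */
    return (f2_10 < f2_11) ? sq : f2_11;    /* else value is an assumption */
}

/* Hypothetical model of the shape after PRE: the multiply now runs only
   when the branch is taken.  */
float after_pre(float f2_10)
{
    float f2_11 = f2_10 * f2_10;
    float result = f2_11;                   /* else value is an assumption */
    if (f2_10 < f2_11)
        result = f2_11 * f2_11;             /* conditionally executed */
    return result;
}

/* Both shapes compute the same value; only where the multiply executes
   differs.  */
void check_same(float x)
{
    assert(unconditional(x) == after_pre(x));
}
```

Calling `check_same` with an input that takes the branch (2.0f) and one that
does not (0.5f) exercises both paths. The results agree, so the difference is
purely in whether the possibly trapping multiply is executed on every path.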