From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/100756] [12/13 Regression] vect: Superfluous epilog created on s390x
Date: Fri, 21 Oct 2022 12:21:18 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 12.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rguenth at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 12.3
X-Bugzilla-Changed-Fields: cf_reconfirmed_on everconfirmed bug_status short_desc cc target_milestone keywords
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100756

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2022-10-21
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
            Summary|vect: Superfluous epilog    |[12/13 Regression] vect:
                   |created on s390x            |Superfluous epilog created
                   |                            |on s390x
                 CC|                            |amacleod at redhat dot com
   Target Milestone|---                         |12.3
           Keywords|                            |missed-optimization

--- Comment #5 from Richard Biener ---
I think the issue is that we now have

   [local count: 118111600]:
  _15 = n_8(D) * 4;
  if (n_8(D) > 0)
    goto ; [89.00%]
  else
    goto ; [11.00%]

while we probably had

   [local count: 118111600]:
  _15 = n_8(D) * 4;
  if (_15 > 0)
    goto ; [89.00%]
  else
    goto ; [11.00%]

before the change.  Loop header copying applies VN to the copied blocks:

Processing block 0: BB6
Value numbering stmt = i_9 = PHI <0(2)>
Setting value number of i_9 to 0 (changed)
Replaced redundant PHI node defining i_9 with 0
Value numbering stmt = result_14 = PHI <0(2)>
Setting value number of result_14 to 0 (changed)
Replaced redundant PHI node defining result_14 with 0
Value numbering stmt = _15 = n_8(D) * 4;
Setting value number of _15 to _15 (changed)
Making available beyond BB6 _15 for value _15
Value numbering stmt = if (_15 > i_9)
Recording on edge 6->7 _15 gt_expr 0 == true
Recording on edge 6->7 _15 le_expr 0 == false
Recording on edge 6->7 _15 ne_expr 0 == true
Recording on edge 6->7 _15 ge_expr 0 == true
Recording on edge 6->7 _15 lt_expr 0 == false
Recording on edge 6->7 _15 eq_expr 0 == false
marking outgoing edge 6 -> 7 executable
gimple_simplified to if (n_8(D) > 0)

with

   [local count: 118111600]:
  _15 = n_8(D) * 4;
  if (n_8(D) > 0)
    goto ; [89.00%]
  else
    goto ; [11.00%]

   [local count: 955630225]:
  # i_16 = PHI
  # result_17 = PHI
  _1 = (long unsigned int) i_16;
  _2 = _1 * 4;
  _3 = a_11(D) + _2;
  _4 = *_3;
  result_12 = _4 + result_17;
  i_13 = i_16 + 1;
  _5 = n_8(D) * 4;
  if (_5 > i_13)
    goto ; [89.00%]
  else
    goto ; [11.00%]

so it was a single use in the compare (because CSE only later introduces
more uses through DOM).  The niter code then ends up with maybe-zero as
_15 <= 0 and a condition of n_8(D) > 0 it tries to simplify with
tree_simplify_using_condition (called from simplify_using_initial_conditions).
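For orientation, the GIMPLE above looks like a plain reduction loop whose bound is n * 4. The following is only a reconstruction inferred from the dump, not the testcase attached to the PR; the function name sum_4n is hypothetical. Because signed overflow is undefined, n * 4 > 0 holds exactly when n > 0, which is presumably why the header-copy compare can be rewritten to test n_8(D) directly.

```c
/* Hypothetical reconstruction inferred from the GIMPLE dump; the
   actual testcase in the PR may differ.  */
int
sum_4n (const int *a, int n)
{
  int result = 0;

  /* The bound n * 4 shows up as _15 in the copied loop header and
     as _5 in the loop latch test.  */
  for (int i = 0; i < n * 4; i++)
    result += a[i];

  return result;
}
```

With an eight-element array and n = 2 the loop runs eight times; with n <= 0 it does not run at all, which is the maybe-zero case the niter analysis wants to rule out on the guarded path.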
The old tree_simplify_using_condition machinery would be a perfect candidate
to be rewritten using path ranger, but in a somewhat extended mode that can
"skip" diamonds, i.e. the path just contains dominators of the loop entry
edge on which we want to evaluate the _15 <= 0 condition.

To make the old simplification code work we can do the following:

diff --git a/gcc/tree-ssa-loop-niter.cc b/gcc/tree-ssa-loop-niter.cc
index 1e0f609d8b6..4ffcef4f4ff 100644
--- a/gcc/tree-ssa-loop-niter.cc
+++ b/gcc/tree-ssa-loop-niter.cc
@@ -2216,6 +2216,7 @@ expand_simple_operations (tree expr, tree stop, hash_map &cache)
 
     case PLUS_EXPR:
     case MINUS_EXPR:
+    case MULT_EXPR:
       if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (expr))
           && TYPE_OVERFLOW_TRAPS (TREE_TYPE (expr)))
         return expr;

but that can of course have unintended side-effects elsewhere (this function
is also used by IVOPTs).
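To illustrate why the extra case matters, here is a toy model of the expansion step. It is not GCC code: struct expr, expand_simple and mentions are hypothetical stand-ins, and the real expand_simple_operations does more (caching, a stop name, constant-operand checks). The sketch only shows that _15 stays opaque unless multiplication is on the allow-list, so the condition on n_8(D) has nothing to match against.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy expression tree; a stand-in for GCC's `tree`, not the real type.  */
enum kind { K_CONST, K_NAME, K_PLUS, K_MINUS, K_MULT };

struct expr
{
  enum kind kind;
  long value;                    /* K_CONST only.  */
  const char *name;              /* K_NAME only.  */
  const struct expr *def;        /* K_NAME: defining expression, or NULL
                                    for a parameter such as n_8(D).  */
  const struct expr *op0, *op1;  /* Operands of binary operations.  */
};

/* Replace a name by its defining expression, but only when the
   definition is one of the "simple" operation kinds.  */
const struct expr *
expand_simple (const struct expr *e, bool allow_mult)
{
  if (e->kind != K_NAME || e->def == NULL)
    return e;

  switch (e->def->kind)
    {
    case K_PLUS:
    case K_MINUS:
      return e->def;
    case K_MULT:
      /* Models the proposed `case MULT_EXPR:` addition.  */
      return allow_mult ? e->def : e;
    default:
      return e;
    }
}

/* Return true if E, after expansion, refers to the name N.  */
bool
mentions (const struct expr *e, const struct expr *n, bool allow_mult)
{
  e = expand_simple (e, allow_mult);
  if (e == n)
    return true;
  if (e->kind == K_PLUS || e->kind == K_MINUS || e->kind == K_MULT)
    return mentions (e->op0, n, allow_mult)
           || mentions (e->op1, n, allow_mult);
  return false;
}
```

Building _15 as a name defined by n_8 * 4, mentions returns false with allow_mult off and true with it on. In the real code the second case would be what lets _15 <= 0 be rewritten in terms of n_8(D) and then simplified against n_8(D) > 0.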