From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/110991] [14 Regression] Dead Code Elimination Regression at -O2 since r14-1135-gc53f51005de
Date: Mon, 14 Aug 2023 07:28:29 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110991

Richard Biener changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
             Status|NEW                           |ASSIGNED
             Target|                              |x86_64-*-*
   Target Milestone|---                           |14.0
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org

--- Comment #3 from Richard Biener ---
So the difference is that GCC 14 vectorizes the loop and that
vectorized loop is not completely unrolled because

Loop 1 likely iterates at most 2 times.
Estimating sizes for loop 1
 BB: 3, after_exit: 0
  size:   1 _34 = vect_vec_iv_.15_33 + { 252, 252, 252, 252 };
  size:   0 vect_a.16_35 = VIEW_CONVERT_EXPR(vect_vec_iv_.15_33);
  size:   1 vect_iftmp.17_36 = vect_a.16_35 << 3;
  size:   1 mask__23.18_38 = vect_a.16_35 < { 0, 0, 0, 0 };
  size:   1 vect_iftmp.19_40 = VEC_COND_EXPR ;
  size:   1 ivtmp_44 = ivtmp_43 + 1;
   Induction variable computation will be folded away.
  size:   2 if (ivtmp_44 < 3)
   Exit condition will be eliminated in peeled copies.
   Exit condition will be eliminated in last copy.
   Constant conditional.
 BB: 9, after_exit: 1
size: 7-3, last_iteration: 7-3
  Loop size: 7
  Estimated size after unrolling: 8
Not unrolling loop 1: size would grow.

When we still have a loop, there's nothing that can fully elide things.
Without vectorization we have

Loop 2 likely iterates at most 11 times.
Estimating sizes for loop 2
 BB: 10, after_exit: 0
  size:   0 a.2_13 = (signed char) a.6_22;
   Induction variable computation will be folded away.
  size:   2 if (a.2_13 < 0)
   Constant conditional.
 BB: 13, after_exit: 1
 BB: 12, after_exit: 0
  size:   1 _26 = a.6_22 + 255;
   Induction variable computation will be folded away.
  size:   1 ivtmp_27 = ivtmp_4 - 1;
   Induction variable computation will be folded away.
  size:   2 if (ivtmp_27 != 0)
   Exit condition will be eliminated in peeled copies.
   Exit condition will be eliminated in last copy.
   Constant conditional.
 BB: 11, after_exit: 0
  size:   1 iftmp.0_12 = a.2_13 << 3;
   Induction variable computation will be folded away.
size: 7-7, last_iteration: 7-7
  Loop size: 7
  Estimated size after unrolling: 1

Unrolling relies on constant_after_peeling, which relies on SCEV, which
doesn't handle vector IVs. I have a patch improving it to

size: 7-4, last_iteration: 7-4
  Loop size: 7
  Estimated size after unrolling: 6

IIRC I also had a patch more appropriately "propagating" constness at
some point.
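
For readers following the numbers: the three "Estimated size after unrolling" figures in the dumps above can be reproduced with a small model. This is a sketch under assumptions, not a quote from GCC. It assumes the estimate is (trip count) x (loop size minus what peeling eliminates), plus the same difference for the last iteration, scaled by 2/3 and clamped to at least 1, and that the "size would grow" check simply compares the estimate against the loop size. The names LoopSize, estimated_unrolled_size and would_grow are hypothetical, and the trip counts are taken from the "likely iterates at most N times" lines.

```python
from dataclasses import dataclass


@dataclass
class LoopSize:
    """Hypothetical holder for a 'size: A-B, last_iteration: C-D' dump line."""
    overall: int                    # A: instructions counted in the loop body
    eliminated_by_peeling: int      # B: of those, expected to fold away per copy
    last_iteration: int             # C: instructions counted for the last copy
    last_iteration_eliminated: int  # D: of those, expected to fold away


def estimated_unrolled_size(size: LoopSize, nunroll: int) -> int:
    """Assumed model of the estimate printed in the dump (not GCC's code)."""
    insns = nunroll * (size.overall - size.eliminated_by_peeling)
    insns += size.last_iteration - size.last_iteration_eliminated
    # Assumed optimism factor: later passes remove about a third.
    insns = insns * 2 // 3
    return max(insns, 1)


def would_grow(size: LoopSize, nunroll: int) -> bool:
    """Assumed simplification of the 'size would grow' check."""
    return estimated_unrolled_size(size, nunroll) > size.overall


if __name__ == "__main__":
    cases = {
        "loop 1, vectorized (7-3)": (LoopSize(7, 3, 7, 3), 2),
        "loop 2, scalar (7-7)": (LoopSize(7, 7, 7, 7), 11),
        "loop 1, with patch (7-4)": (LoopSize(7, 4, 7, 4), 2),
    }
    for name, (size, nunroll) in cases.items():
        est = estimated_unrolled_size(size, nunroll)
        print(f"{name}: estimate {est}, grows: {would_grow(size, nunroll)}")
```

Under these assumptions the model yields 8, 1 and 6 for the three cases, matching the dumps. It also shows why the eliminated count matters so much: going from 7-3 to 7-4 is enough to bring the estimate from above the loop size of 7 to below it.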