From mboxrd@z Thu Jan 1 00:00:00 1970
Received: by sourceware.org (Postfix, from userid 48) id 2BF293858D37; Wed, 2 Feb 2022 23:51:37 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 2BF293858D37
From: "pinskia at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/104357] [Aarch64] Failure to use csinv instead of mvn+csel where possible
Date: Wed, 02 Feb 2022 23:51:37 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 12.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: pinskia at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: bug_status cf_reconfirmed_on component everconfirmed
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: gcc-bugs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-bugs mailing list
X-List-Received-Date: Wed, 02 Feb 2022 23:51:37 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104357

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2022-02-02
          Component|target                      |tree-optimization
     Ever confirmed|0                           |1

--- Comment #1 from Andrew Pinski ---
This will get GCC closer to what clang/LLVM produces:
unsigned char stbi__clamp(int x)
{
    int t = x;
    if ((unsigned)x > 255)
    {
        if (x < 0) t = 0;
        else if (x > 255) t = -1;
    }
    return t;
}

---- CUT ----
The zero-extends are due to the cast not being outside of
the csel, and the RTL level is not really good at cross-bb optimizations.
The gimple level looks like:
  [local count: 1073741824]:
  x.0_1 = (unsigned int) x_3(D);
  if (x.0_1 > 255)
    goto ; [50.00%]
  else
    goto ; [50.00%]

  [local count: 536870913]:
  _7 = x_3(D) >= 0;
  _6 = (unsigned char) _7;
  _8 = -_6;
  goto ; [100.00%]

  [local count: 536870913]:
  _4 = (unsigned char) x_3(D);

  [local count: 1073741824]:
  # _2 = PHI <_8(3), _4(4)>
  return _2;

Which in theory could be improved to what I gave above. The gimple level has
no knowledge of the RTL/target level, where doing the negate (-) in unsigned
still needs a zero extend.
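
For reference, a minimal C sketch of the shape being discussed: the select is done in int and there is a single narrowing cast outside of it, rather than one cast per arm as in the gimple dump. The names clamp_branchy, clamp_select and clamp_forms_agree are made up for this illustration and are not from the bug report, and nothing here claims to be what GCC or clang emit; it only checks that the two source forms return the same values.

```c
/* Branchy form, same logic as the stbi__clamp variant in the comment. */
static unsigned char clamp_branchy(int x)
{
    int t = x;
    if ((unsigned)x > 255) {
        if (x < 0)
            t = 0;
        else if (x > 255)
            t = -1;
    }
    return t;   /* implicit narrowing to unsigned char */
}

/* Hypothetical select form: choose in int, then cast once.
   -(x >= 0) is 0 for negative x and -1 (0xff after narrowing) otherwise. */
static unsigned char clamp_select(int x)
{
    int t = ((unsigned)x > 255) ? -(x >= 0) : x;
    return (unsigned char)t;
}

/* Returns 1 if both forms agree for every x in [lo, hi], else 0.
   Assumes hi < INT_MAX so that x++ cannot overflow. */
static int clamp_forms_agree(int lo, int hi)
{
    for (int x = lo; x <= hi; x++)
        if (clamp_branchy(x) != clamp_select(x))
            return 0;
    return 1;
}
```

Calling clamp_forms_agree(-1000, 1000) covers the negative, in-range and above-255 cases. Whether a given compiler then picks csinv for the select form is a separate question that this sketch does not answer.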