From: "segher at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug middle-end/78103] Failure to optimize with __builtin_clzl
Date: Mon, 26 Jul 2021 21:26:26 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78103

--- Comment #15 from Segher Boessenkool ---
(In reply to Jakub Jelinek from comment #14)
> (In reply to Segher Boessenkool from comment #13)
> > (In reply to Jakub Jelinek from comment #10)
> > > Unfortunately, it doesn't work for the #c0 testcase, after the combiner
> > > splitter kicks in, the combiner doesn't even try that 4 insn combination.
> >
> > It does for me?
>
> But only in the unpatched gcc, no?

Yes, of course.

> For #c0 findLastSet I actually need to combine 5 original instructions, [...]
That is not something we want to ever implement: 4 insns already is too
expensive unless we try only the simplest, and/or only very specific
combinations.

> and
> what I was hoping for is to first combine first 3 instructions into 2,
> 9, 10 -> 12 to get rid of the useless sign-extension,

You should be able to combine only 10 and 12 even, to a SImode xor followed by
the sign extension (may not work out wrt costs, but it isn't even tried).  Or,
why is r86 DImode anyway?

> the value is known to
> be 0..63, so zero extension is fine, into 10 (bsr) and 12 (xor with zero
> extend), which is what the #c9 patch does.
> And then I was hoping 10, 12, 13 -> 14 would be attempted to be combined
> because 13 is mov of a constant.  But that doesn't happen because the 9, 10
> -> 12 combination with the #c9 patch throws away the 12 -> 10 LOG_LINKS and
> doesn't add a new one, even when 10 is a setter of a fresh new pseudo and 12
> is the only use of that pseudo.

This is only safe if it *is* a new pseudo, and even then, you need to prevent
getting stuck somehow.

insn 10 is the most problematic thing here btw, having the same pseudo as
input and as output (it is not the unique setter either).  This happens in
expand already, probably a machine pattern that forgets to create new registers
where it should?
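
For reference, the arithmetic behind the quoted "the value is known to be
0..63, so zero extension is fine" can be sketched in C.  This is only an
illustration under stated assumptions, not the #c0 testcase: the function
names are made up, and __builtin_clzll stands in for __builtin_clzl so the
sketch does not depend on the width of long.  It shows that for v in 0..63,
63 - v equals v ^ 63 (hence bsr followed by xor), and that sign- and
zero-extending such a value give the same 64-bit result.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: index of the highest set bit, written with a
   subtraction.  x must be nonzero; __builtin_clzll(0) is undefined. */
int last_set_sub(uint64_t x) { return 63 - __builtin_clzll(x); }

/* Same value with xor: for any v in 0..63, 63 - v == (v ^ 63). */
int last_set_xor(uint64_t x) { return __builtin_clzll(x) ^ 63; }

/* Widen a 32-bit value to 64 bits by sign extension or zero extension. */
uint64_t widen_signed(int32_t v)   { return (uint64_t)(int64_t)v; }
uint64_t widen_unsigned(int32_t v) { return (uint64_t)(uint32_t)v; }

/* Check both identities for every possible highest-bit position. */
void check_identities(void)
{
    for (int bit = 0; bit < 64; bit++) {
        uint64_t x = (uint64_t)1 << bit;  /* only 'bit' set */
        uint64_t y = x | (x - 1);         /* 'bit' and all lower bits set */

        assert(last_set_sub(x) == bit && last_set_xor(x) == bit);
        assert(last_set_sub(y) == bit && last_set_xor(y) == bit);

        /* The result is in 0..63, so the sign bit is clear and both
           extensions agree. */
        assert(widen_signed(last_set_xor(x)) == widen_unsigned(last_set_xor(x)));
    }
}
```

Calling check_identities() from a test program exercises all 64 bit positions;
whether the combiner actually produces the xor form is a separate question and
is what the discussion above is about.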