From: "bina2374 at gmail dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/95632] Redundant zero extension
Date: Mon, 15 Jun 2020 09:49:16 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95632

--- Comment #2 from Mel Chen ---
(In reply to Jim Wilson from comment #1)
> We sign extend HImode constants as that is the natural thing to do to make
> arithmetic work.
> This does mean that unsigned short logical operations need
> a zero extend after the operation which might otherwise be unnecessary.
> This can't be handled at rtl generation time as we don't know if the
> constant will be used for arithmetic or logicals or signed or unsigned.  But
> maybe an optimization pass could go over the code and convert HImode
> constants to signed or unsigned as appropriate to reduce the number of
> sign/zero extend operations.  We have the ree pass that we might be able to
> extend to handle this.

Extending the ree pass is a good approach, but at the moment it seems to scan
only XXX_extend. Because the zero_extend has already been split into 2 shift
instructions before the ree pass, do we need to keep the zero_extend until the
ree pass? Or is there any other way to know that the shift pair was a
zero_extend?

>
> Handling this in combine requires a 4->3 splitter which is something combine
> doesn't do.  We could work around that by not splitting constants before
> combine, but that would be a major change and probably not beneficial, as we
> wouldn't be able to easily optimize the high part of the constants anymore.

I agree. This way is a bit risky.

>
> Another approach here might be to split the xor along with the constant.  If
> we generated something like
>     srli a0,a0,1
>     xori a0,a0,1
>     li   a5,-24576
>     xor  a0,a0,a5
> then we can optimize away the following zero extend with a 3->2 splitter
> which combine already supports via find_split_point.  We can still optimize
> the high part of the constant.  Since the immediates are sign extended, if
> the low part of the immediate has the sign bit set, we would have to invert
> the high part of the immediate to get the right result.  At least I think
> that works, I haven't double checked it yet.  This only works for or if the
> low part doesn't have the sign bit set.  And this only works for and if the
> low part does have the sign bit set.
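
The arithmetic in the quoted paragraph can be checked with a small model. The
Python sketch below is an illustration only, not GCC code: the helper names
(sext12, split_xor_const, xor_via_split) are hypothetical, registers are
modelled as 32-bit words, and x is assumed to be already zero-extended from 16
bits. It splits a 16-bit constant into a sign-extended 12-bit low part (what an
xori would apply) and a high part with the low 12 bits clear, and shows that
the high part comes out inverted when the low part has its sign bit set.

```python
MASK32 = 0xFFFFFFFF  # assumption: model registers as 32-bit words


def sext12(v):
    """Sign-extend the low 12 bits of v to a 32-bit two's-complement pattern."""
    v &= 0xFFF
    return (v | 0xFFFFF000) if v & 0x800 else v


def split_xor_const(c):
    """Split constant c into (lo, hi) such that x ^ lo ^ hi == x ^ c.

    lo is the sign-extended 12-bit immediate an xori would apply.
    hi has its low 12 bits clear, so a single lui could load it.
    """
    lo = sext12(c)
    hi = (c ^ lo) & MASK32  # compensates for the sign extension of lo
    return lo, hi


def xor_via_split(x, c):
    """Compute x ^ c as two xor steps, as in the quoted instruction sequence."""
    lo, hi = split_xor_const(c)
    return ((x ^ lo) ^ hi) & MASK32


# Low part without the sign bit: the high part is just the upper bits of c.
print([hex(p) for p in split_xor_const(0xA001)])
# Low part with the sign bit set: bits 12..15 of the high part are inverted
# and the bits above them are all ones.
print([hex(p) for p in split_xor_const(0xA801)])
```

The same trick does not carry over directly to the other logical operations,
which is consistent with the restrictions in the quoted text: an or with a
sign-extended low part that has bit 11 set forces the upper bits to 1 and a
second or cannot clear them, while an and with a low part that has bit 11
clear zeroes the upper bits and a second and cannot restore them.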

I'm not sure how difficult it is to split 1 xor into 2 xors before the combine
pass, but I have another proposal.

The following is the combine dump:

Trying 8, 9, 10 -> 11:
    8: r79:SI=0xffffffffffffa000
    9: r78:SI=r79:SI+0x1
      REG_DEAD r79:SI
      REG_EQUAL 0xffffffffffffa001
   10: r77:SI=r72:SI^r78:SI
      REG_DEAD r78:SI
      REG_DEAD r72:SI
   11: r80:SI=zero_extend(r77:SI#0)
      REG_DEAD r77:SI
Failed to match this instruction:
(set (reg:SI 80)
    (xor:SI (reg:SI 72 [ _4 ])
        (const_int 40961 [0xa001])))

Is it possible to pretend that we have a pattern that can match
xor (reg:SI 80), (reg:SI 72), 0xa001 in the combine pass?
Then, if the constant is too large to fit in the immediate field, it can be
split into 2 xors in the split pass.