From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 48) id 0FF8B3858D28; Sun, 5 Mar 2023 11:48:39 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 0FF8B3858D28
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1678016920; bh=s+ru0Pn7ZxkThQ9NXoxos+yavOqUD/C5mfJo+nCuasw=; h=From:To:Subject:Date:In-Reply-To:References:From; b=Coz5LD4kxmPuK7e6Sot786IZfXl0CjjhfrZtsWJxDkovbT3jdF5gwvN0I7ftNLHgA Y17EzIXNGstvbkYpbrSfz8eMzqRPL/EWdipcgvB+cSWX+ijnPKJSW6Cl8MLXDJe+G7 gzvIyWj7fiFOh0FIh8MPYIHtGFLPz1oPJj63k6R0=
From: "jskumari at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/103784] suboptimal code for returning bool value on target ppc
Date: Sun, 05 Mar 2023 11:48:39 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 12.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: jskumari at gcc dot gnu.org
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: jskumari at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103784

--- Comment #10 from Surya Kumari Jangala ---
After the expand pass, we have a single return bb which first zero extends
r117 (this reg holds the return value which has been set by predecessor
blocks). Zero extension is done because r117 is of mode QI and we want to
change the mode to DI. After zero extension, r117 is copied to r3.
The input RTL to the peephole2 pass is similar, i.e., the write of the value
to r3 occurs in predecessor BBs while the zero extension of r3 happens in the
return bb. So we cannot do any peephole optimization to get rid of the
unnecessary zero extension. Note that when the return value is written into
r3, it has mode QI. Later, in the return bb, r3 is zero extended to convert
its mode into DI.

However, after the bbro (basic block reordering) pass, we have 2 return BBs.
In each BB, the return value is copied into r3 (in QI mode), and then r3 is
zero extended. Note that bbro occurs after peephole2. We can do another
peephole after bbro, and get rid of the unnecessary zero extension.

However, we will not always get an opportunity to do a peephole. That is, the
instructions that write into r3 and zero extend r3 can be in different BBs.

A possible solution to this issue would be to have a separate pass that can
remove the zero extends. In brief, the new pass can do the following:

First, create webs.
Then find definitions (that is, writes into registers) that reach zero extend
insns.
Mark such definitions (to indicate that the value is going to be zero
extended later on). Then, at the time of assembly generation (final pass),
definitions which have been marked should be converted to assembly
instructions which work on the extended mode (for example, with -m64, the
generated assembly should work on the entire 64-bit register instead of just
a part of it).
If we generate such assembly instructions, then the zero extend instruction
can be removed, i.e., no assembly need be generated for it.

Note that for definitions that reach zero extends as well as other uses, we
cannot remove the zero extends.
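
To make the idea concrete, here is a rough toy model of the marking step in Python. It is only a sketch and not GCC code: the `Insn` class, the three op names and the helpers `reached_uses` and `remove_zero_extends` are all made up for illustration. It also simplifies the proposal in two ways: it follows each definition to its uses directly instead of building webs first, and it drops the removable zero extends immediately instead of deferring that to the final pass.

```python
from dataclasses import dataclass


@dataclass(eq=False)  # identity comparison, so insns can live in sets and dicts
class Insn:
    op: str              # "set": narrow (QI) write, "zext": in-place zero extend, "use": other read
    reg: str
    widen: bool = False  # set by the pass: emit this write in the wide (DI) mode


def reached_uses(blocks, succs, block, index):
    """Return the insns that read the value written by blocks[block][index]."""
    reg = blocks[block][index].reg
    uses, seen, work = [], set(), [(block, index + 1)]
    while work:
        name, start = work.pop()
        killed = False
        for insn in blocks[name][start:]:
            if insn.reg != reg:
                continue
            if insn.op == "set":
                killed = True          # a later write ends the reach of ours
                break
            uses.append(insn)
            if insn.op == "zext":
                killed = True          # zext rewrites the register in place
                break
        if not killed:
            for nxt in succs.get(name, []):
                if nxt not in seen:
                    seen.add(nxt)
                    work.append((nxt, 0))
    return uses


def remove_zero_extends(blocks, succs):
    """Mark widenable writes; return a copy of blocks without removable zexts."""
    reaching = {}  # zext insn -> [(write that reaches it, write has only zext uses)]
    for name, insns in blocks.items():
        for i, insn in enumerate(insns):
            if insn.op != "set":
                continue
            uses = reached_uses(blocks, succs, name, i)
            only_zext = bool(uses) and all(u.op == "zext" for u in uses)
            for u in uses:
                if u.op == "zext":
                    reaching.setdefault(u, []).append((insn, only_zext))
    removable = set()
    for zext, defs in reaching.items():
        # A zext can go only if every write reaching it has no other kind of use.
        if all(ok for _, ok in defs):
            removable.add(zext)
            for d, _ in defs:
                d.widen = True
    return {n: [i for i in insns if i not in removable] for n, insns in blocks.items()}


# Shape described above: two predecessor BBs write r3, the return bb extends it.
blocks = {
    "then": [Insn("set", "r3")],
    "else": [Insn("set", "r3")],
    "ret": [Insn("zext", "r3")],
}
succs = {"then": ["ret"], "else": ["ret"]}
result = remove_zero_extends(blocks, succs)

# A write that reaches both another use and a zext: the zext has to stay.
mixed = {"bb": [Insn("set", "r3"), Insn("use", "r3"), Insn("zext", "r3")]}
mixed_result = remove_zero_extends(mixed, {})
```

In the first example both writes to r3 end up marked and the return bb loses its zero extend. In the second example nothing is marked or removed, which corresponds to the last note above.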