From: "cvs-commit at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/104345] [12 Regression] "nvptx: Transition nvptx backend to STORE_FLAG_VALUE = 1" patch made some code generation worse
Date: Thu, 10 Feb 2022 08:02:44 +0000
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 12.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: minor
X-Bugzilla-Status: ASSIGNED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: roger at nextmovesoftware dot com
X-Bugzilla-Target-Milestone: 12.0

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104345

--- Comment #10 from CVS Commits ---
The master branch has been updated by Tom de Vries:

https://gcc.gnu.org/g:9bacd7af2e3bba9ddad17e7de4e2d299419d819d

commit r12-7167-g9bacd7af2e3bba9ddad17e7de4e2d299419d819d
Author: Roger Sayle
Date: Fri Feb 4 04:13:53 2022 +0100

PR target/104345: Use nvptx "set" instruction for cond ? -1 : 0

This patch addresses the "increased register pressure" regression on nvptx-none caused by my change to transition the backend to a STORE_FLAG_VALUE = 1 target.
This improved code generation for the more common case of producing 0/1 Boolean values, but unfortunately made things marginally worse when a 0/-1 mask value is desired. Unfortunately, nvptx kernels are extremely sensitive to changes in register usage, which was observable in the reported PR.

This patch provides optimizations for -(cond ? 1 : 0), effectively simplifying this into cond ? -1 : 0, where these ternary operators are provided by nvptx's selp instruction, and for the specific case of SImode, using (restoring) nvptx's "set" instruction (which avoids the need for a predicate register).

This patch has been tested on nvptx-none hosted on x86_64-pc-linux-gnu with a "make" and "make -k check" with no new failures. Unfortunately, the exact register usage of an nvptx kernel depends upon the version of the CUDA drivers being used (and the hardware), but I believe this change should resolve the PR (for Thomas) by improving code generation for the cases that regressed.

gcc/ChangeLog:

        PR target/104345
        * config/nvptx/nvptx.md (sel_true): Fix indentation.
        (sel_false): Likewise.
        (define_code_iterator eqne): New code iterator for EQ and NE.
        (*selp_neg_): New define_insn_and_split to optimize the negation of a selp instruction.
        (*selp_not_): New define_insn_and_split to optimize the bitwise not of a selp instruction.
        (*setcc_int): Use set instruction for neg:SI of a selp.

gcc/testsuite/ChangeLog:

        PR target/104345
        * gcc.target/nvptx/neg-selp.c: New test case.
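
For readers who want to see the -(cond ? 1 : 0) simplification from the commit message at the source level, here is a minimal C sketch. It is illustrative only and is not the contents of neg-selp.c; the function names are hypothetical, and the constants in the bitwise-not variant are an assumption about what such a pattern might fold, not a description of the actual nvptx.md patterns.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper names; none of these come from the GCC testsuite. */

/* With a 0/1 comparison result, a 0/-1 mask is naively computed as a
   select followed by a negation. */
int32_t mask_via_neg(int32_t a, int32_t b)
{
    return -(a == b ? 1 : 0);
}

/* The same mask as a single select, i.e. cond ? -1 : 0. */
int32_t mask_via_select(int32_t a, int32_t b)
{
    return a == b ? -1 : 0;
}

/* Bitwise not distributes over the select arms in the same way:
   ~(cond ? 0 : -1) == cond ? -1 : 0.  (Assumed example constants.) */
int32_t mask_via_not(int32_t a, int32_t b)
{
    return ~(a == b ? 0 : -1);
}

/* Checks that all three formulations agree for one pair of inputs. */
void check_masks(int32_t a, int32_t b)
{
    assert(mask_via_neg(a, b) == mask_via_select(a, b));
    assert(mask_via_not(a, b) == mask_via_select(a, b));
}
```

Calling check_masks(3, 3) and check_masks(3, 4) exercises both the equal and the unequal case. Whether a given compiler build emits selp or set for these functions depends on the target and options, so the sketch only demonstrates the arithmetic identity.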