From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id ABEFB3858D33 for ; Mon, 17 Jul 2023 22:12:26 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org ABEFB3858D33 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 7C549C15; Mon, 17 Jul 2023 15:13:09 -0700 (PDT) Received: from localhost (e121540-lin.manchester.arm.com [10.32.110.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 1A9293F738; Mon, 17 Jul 2023 15:12:24 -0700 (PDT) From: Richard Sandiford To: Manolis Tsamis Mail-Followup-To: Manolis Tsamis ,gcc-patches@gcc.gnu.org, Philipp Tomsich , Jakub Jelinek , Andrew Pinski , Robin Dapp , richard.sandiford@arm.com Cc: gcc-patches@gcc.gnu.org, Philipp Tomsich , Jakub Jelinek , Andrew Pinski , Robin Dapp Subject: Re: [PATCH v2 0/2] ifcvt: Allow if conversion of arithmetic in basic blocks with multiple sets References: <20230713140904.3274306-1-manolis.tsamis@vrull.eu> Date: Mon, 17 Jul 2023 23:12:23 +0100 In-Reply-To: <20230713140904.3274306-1-manolis.tsamis@vrull.eu> (Manolis Tsamis's message of "Thu, 13 Jul 2023 16:09:02 +0200") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Status: No, score=-20.7 required=5.0 tests=BAYES_00,KAM_DMARC_NONE,KAM_DMARC_STATUS,KAM_LAZY_DOMAIN_SECURITY,KAM_SHORT,SPF_HELO_NONE,SPF_NONE,TXREP,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Manolis Tsamis writes: > noce_convert_multiple_sets has been introduced and extended over time to handle > if conversion for blocks with multiple sets. Currently this is focused on > register moves and rejects any sort of arithmetic operations. > > This series is an extension to allow more sequences to take part in if > conversion. The first patch is a required change to emit correct code and the > second patch whitelists a larger number of operations through > bb_ok_for_noce_convert_multiple_sets. > > For targets that have a rich selection of conditional instructions, > like aarch64, I have seen an ~5x increase of profitable if conversions for > multiple set blocks in SPEC benchmarks. Also tested with a wide variety of > benchmarks and I have not seen performance regressions on either x64 / aarch64. Interesting results. Are you free to say which target you used for aarch64? If I've understood the cost heuristics correctly, we'll allow a "predictable" branch to be replaced by up to 5 simple conditional instructions and an "unpredictable" branch to be replaced by up to 10 simple conditional instructions. That seems pretty high. And I'm not sure how well we guess predictability in the absence of real profile information. So my gut instinct was that the limitations of the current code might be saving us from overly generous limits. It sounds from your results like that might not be the case though. Still, if it does turn out to be the case in future, I agree we should fix the costs rather than hamstring the code. > Some samples that previously resulted in a branch but now better use these > instructions can be seen in the provided test case. > > Tested on aarch64 and x64; On x64 some tests that use __builtin_rint are > failing with an ICE but I believe that it's not an issue of this change. > force_operand crashes when (and:DF (not:DF (reg:DF 88)) (reg/v:DF 83 [ x ])) > is provided through emit_conditional_move. I guess that needs to be fixed first though. (Thanks for checking both targets.) My main comments on the series are: (1) It isn't obvious which operations should be included in the list in patch 2 and which shouldn't. Also, the patch only checks the outermost operation, and so it allows the inner rtxes to be arbitrarily complex. Because of that, it might be better to remove the condition altogether and just rely on the other routines to do costing and correctness checks. (2) Don't you also need to update the "rewiring" mechanism, to cope with cases where the then block has something like: if (a == 0) { a = b op c; -> a' = a == 0 ? b op c : a; d = a op b; -> d = a == 0 ? a' op b : d; } a = a' At the moment the code only handles regs and subregs, whereas but IIUC it should now iterate over all the regs in the SET_SRC. And I suppose that creates the need for multiple possible rewirings in the same insn, so that it isn't a simple insn -> index mapping any more. Thanks, Richard > > > Changes in v2: > - Change "conditional moves" to "conditional instructions" > in bb_ok_for_noce_convert_multiple_sets's comment. > > Manolis Tsamis (2): > ifcvt: handle sequences that clobber flags in > noce_convert_multiple_sets > ifcvt: Allow more operations in multiple set if conversion > > gcc/ifcvt.cc | 109 ++++++++++-------- > .../aarch64/ifcvt_multiple_sets_arithm.c | 67 +++++++++++ > 2 files changed, 127 insertions(+), 49 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/aarch64/ifcvt_multiple_sets_arithm.c