From: Richard Sandiford
To: Manolis Tsamis
Cc: gcc-patches@gcc.gnu.org, Philipp Tomsich, Robin Dapp, Jakub Jelinek
Subject: Re: [PATCH v3 1/4] ifcvt: handle sequences that clobber flags in noce_convert_multiple_sets
References: <20230830101400.1539313-1-manolis.tsamis@vrull.eu> <20230830101400.1539313-2-manolis.tsamis@vrull.eu>
Date: Thu, 19 Oct 2023 20:41:29 +0100
In-Reply-To: <20230830101400.1539313-2-manolis.tsamis@vrull.eu> (Manolis Tsamis's message of "Wed, 30 Aug 2023 12:13:57 +0200")

Manolis Tsamis writes:
> This is an extension of what was done in PR106590.
>
> Currently if a sequence generated in noce_convert_multiple_sets clobbers the
> condition rtx (cc_cmp or rev_cc_cmp) then only seq1 is used afterwards
> (sequences that emit the comparison itself). Since this applies only from the
> next iteration it assumes that the sequences generated (in particular seq2)
> doesn't clobber the condition rtx itself before using it in the if_then_else,
> which is only true in specific cases (currently only register/subregister moves
> are allowed).
>
> This patch changes this so it also tests if seq2 clobbers cc_cmp/rev_cc_cmp in
> the current iteration. This makes it possible to include arithmetic operations
> in noce_convert_multiple_sets.
>
> gcc/ChangeLog:
>
>         * ifcvt.cc (check_for_cc_cmp_clobbers): Use modified_in_p instead.
>         (noce_convert_multiple_sets_1): Don't use seq2 if it clobbers cc_cmp.
>
> Signed-off-by: Manolis Tsamis
> ---
>
> (no changes since v1)
>
>  gcc/ifcvt.cc | 49 +++++++++++++++++++------------------------------
>  1 file changed, 19 insertions(+), 30 deletions(-)

Sorry for the slow review. TBH I was hoping someone else would pick
it up, since (a) I'm not very familiar with this code, and (b) I don't
really agree with the way that the current code works.
I'm not sure the current dependency checking is safe, so I'm nervous
about adding even more cases to it. And it feels like the different
ifcvt techniques are not now well distinguished, so that they're
beginning to overlap and compete with one another. None of that is
your fault, of course. :)

> diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc
> index a0af553b9ff..3273aeca125 100644
> --- a/gcc/ifcvt.cc
> +++ b/gcc/ifcvt.cc
> @@ -3375,20 +3375,6 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
>    return true;
>  }
>
> -/* Helper function for noce_convert_multiple_sets_1. If store to
> -   DEST can affect P[0] or P[1], clear P[0]. Called via note_stores. */
> -
> -static void
> -check_for_cc_cmp_clobbers (rtx dest, const_rtx, void *p0)
> -{
> -  rtx *p = (rtx *) p0;
> -  if (p[0] == NULL_RTX)
> -    return;
> -  if (reg_overlap_mentioned_p (dest, p[0])
> -      || (p[1] && reg_overlap_mentioned_p (dest, p[1])))
> -    p[0] = NULL_RTX;
> -}
> -
>  /* This goes through all relevant insns of IF_INFO->then_bb and tries to
>     create conditional moves. In case a simple move sufficis the insn
>     should be listed in NEED_NO_CMOV. The rewired-src cases should be
> @@ -3552,9 +3538,17 @@ noce_convert_multiple_sets_1 (struct noce_if_info *if_info,
>           creating an additional compare for each. If successful, costing
>           is easier and this sequence is usually preferred. */
>        if (cc_cmp)
> -        seq2 = try_emit_cmove_seq (if_info, temp, cond,
> -                                   new_val, old_val, need_cmov,
> -                                   &cost2, &temp_dest2, cc_cmp, rev_cc_cmp);
> +        {
> +          seq2 = try_emit_cmove_seq (if_info, temp, cond,
> +                                     new_val, old_val, need_cmov,
> +                                     &cost2, &temp_dest2, cc_cmp, rev_cc_cmp);
> +
> +          /* The if_then_else in SEQ2 may be affected when cc_cmp/rev_cc_cmp is
> +             clobbered. We can't safely use the sequence in this case. */
> +          if (seq2 && (modified_in_p (cc_cmp, seq2)
> +              || (rev_cc_cmp && modified_in_p (rev_cc_cmp, seq2))))
> +            seq2 = NULL;

modified_in_p only checks the first instruction in seq2, not the whole
sequence.
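
To illustrate the point about single-insn checks, here is a standalone
toy model. It is not GCC code: ToyInsn, toy_modified_in_p and
toy_modified_in_seq_p are hypothetical names, and registers are plain
integers. It only shows that a check applied to the head of a chain
misses a clobber made by a later insn, whereas walking the chain
finds it.

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for an insn: the registers it writes, plus a link to the
// next insn in the sequence.  Hypothetical type, not GCC's rtx_insn.
struct ToyInsn
{
  std::vector<int> sets;
  const ToyInsn *next;
};

// Models a single-insn query: does INSN itself write REG?
bool toy_modified_in_p (int reg, const ToyInsn *insn)
{
  for (int r : insn->sets)
    if (r == reg)
      return true;
  return false;
}

// Walks the whole chain starting at SEQ and asks the same question of
// every insn in it.
bool toy_modified_in_seq_p (int reg, const ToyInsn *seq)
{
  for (const ToyInsn *walk = seq; walk; walk = walk->next)
    if (toy_modified_in_p (reg, walk))
      return true;
  return false;
}

// Two-insn sequence in which only the second insn clobbers the
// (hypothetical) flags register number 17.
void toy_demo ()
{
  const int cc = 17;
  ToyInsn second{{cc}, nullptr};
  ToyInsn first{{1}, &second};

  assert (!toy_modified_in_p (cc, &first));     // head-only check misses it
  assert (toy_modified_in_seq_p (cc, &first));  // full walk finds it
}
```

In the real code the equivalent would presumably be a loop over
NEXT_INSN, as the existing walk over seq2 already does.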

I think the unpatched approach is OK in cases where seq2 clobbers
cc_cmp/rev_cc_cmp in or after the last use, since instructions are
defined to operate on a read-all/compute/write-all basis.

Soon after the snippet above, the unpatched code has this loop:

          /* The backend might have created a sequence that uses the
             condition. Check this. */
          rtx_insn *walk = seq2;
          while (walk)
            {
              rtx set = single_set (walk);

              if (!set || !SET_SRC (set))

This condition looks odd. A SET_SRC is never null. But more importantly,
continuing means "assume the best", and I don't think we should assume
the best for (say) an insn with two parallel sets.

It doesn't look like the series addresses this, but !set seems more
likely to occur if we extend the function to general operations.

                {
                  walk = NEXT_INSN (walk);
                  continue;
                }

              rtx src = SET_SRC (set);

              if (XEXP (set, 1) && GET_CODE (XEXP (set, 1)) == IF_THEN_ELSE)
                ; /* We assume that this is the cmove created by the backend that
                     naturally uses the condition. Therefore we ignore it. */

XEXP (set, 1) is src. But here too, I'm a bit nervous about assuming
that any if_then_else is safe to ignore, even currently. It seems
more dangerous if we extend the function to handle more cases.

              else
                {
                  if (reg_mentioned_p (XEXP (cond, 0), src)
                      || reg_mentioned_p (XEXP (cond, 1), src))
                    {
                      read_comparison = true;
                      break;
                    }
                }
              walk = NEXT_INSN (walk);
            }

Maybe a safer way to check whether "insn" is sensitive to the values
in "val" is to use:

  subrtx_iterator::array_type array;
  FOR_EACH_SUBRTX (iter, array, val, NONCONST)
    if (OBJECT_P (*iter) && reg_overlap_mentioned_p (*iter, insn))
      return true;
  return false;

which doesn't assume that insn is a single set.

But I'm not sure which cases this code is trying to catch. Is it trying
to catch cases where seq2 "spontaneously" uses registers that happen to
overlap with cond? If so, then when does that happen? And if it does
happen, wouldn't the sequence also have to set the registers first?

If instead the code is trying to detect uses of cond that were fed
to seq2 outside of cond itself, via try_emit_cmove_seq, then wouldn't
it be enough to check old_val and new_val?

Anyway, the point I was trying to get to was: we could use the
FOR_EACH_SUBRTX above to check whether an instruction in seq2
is sensitive to the value of cc_cmp/rev_cc_cmp. And we can use
modified_in_p, as in your patch, to check whether an instruction
changes the value of cc_cmp/rev_cc_cmp. We could therefore record
whether we've seen a modification of cc_cmp/rev_cc_cmp, then reject
seq2 if we see a subsequent use.

Thanks,
Richard

> +        }
>
>            /* The backend might have created a sequence that uses the
>               condition. Check this. */
> @@ -3609,21 +3603,16 @@ noce_convert_multiple_sets_1 (struct noce_if_info *if_info,
>              return false;
>          }
>
> -      if (cc_cmp)
> +      if (cc_cmp && seq == seq1)
>          {
> -          /* Check if SEQ can clobber registers mentioned in
> -             cc_cmp and/or rev_cc_cmp. If yes, we need to use
> -             only seq1 from that point on. */
> -          rtx cc_cmp_pair[2] = { cc_cmp, rev_cc_cmp };
> -          for (walk = seq; walk; walk = NEXT_INSN (walk))
> +          /* Check if SEQ can clobber registers mentioned in cc_cmp/rev_cc_cmp.
> +             If yes, we need to use only seq1 from that point on.
> +             Only check when we use seq1 since we have already tested seq2. */
> +          if (modified_in_p (cc_cmp, seq)
> +              || (rev_cc_cmp && modified_in_p (rev_cc_cmp, seq)))
>              {
> -              note_stores (walk, check_for_cc_cmp_clobbers, cc_cmp_pair);
> -              if (cc_cmp_pair[0] == NULL_RTX)
> -                {
> -                  cc_cmp = NULL_RTX;
> -                  rev_cc_cmp = NULL_RTX;
> -                  break;
> -                }
> +              cc_cmp = NULL_RTX;
> +              rev_cc_cmp = NULL_RTX;
>              }
>          }
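
For reference, the "record a modification, then reject on a subsequent
use" idea from the review can be sketched as a standalone toy model.
This is only a sketch under assumptions, not a proposed ifcvt.cc change:
ToyInsn, toy_overlaps and toy_seq_safe_p are hypothetical names,
registers are plain integers, and the per-insn use/set sets stand in
for what FOR_EACH_SUBRTX plus reg_overlap_mentioned_p and modified_in_p
would report in the real code.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Toy stand-in for an insn: the registers it reads and the registers it
// writes or clobbers.  Hypothetical type, not GCC's rtx_insn.
struct ToyInsn
{
  std::set<int> uses;
  std::set<int> sets;
};

// True if A and B have at least one register in common.
bool toy_overlaps (const std::set<int> &a, const std::set<int> &b)
{
  for (int r : a)
    if (b.count (r))
      return true;
  return false;
}

// Returns true if no insn in SEQ reads a register in COND_REGS after an
// earlier insn has modified one.  The check is deliberately coarse: a
// write to any condition register taints later reads of all of them.
bool toy_seq_safe_p (const std::vector<ToyInsn> &seq,
                     const std::set<int> &cond_regs)
{
  bool modified = false;
  for (const ToyInsn &insn : seq)
    {
      // An insn reads all its inputs before writing its outputs, so a use
      // and a clobber in the same insn is fine: test the use first.
      if (modified && toy_overlaps (insn.uses, cond_regs))
        return false;
      if (toy_overlaps (insn.sets, cond_regs))
        modified = true;
    }
  return true;
}

// Three small cases with a hypothetical flags register number 17.
void toy_demo_seq ()
{
  const int cc = 17;
  // Use and clobber in the same (last) insn: accepted.
  assert (toy_seq_safe_p ({ToyInsn{{cc}, {cc}}}, {cc}));
  // Clobber first, use afterwards: rejected.
  assert (!toy_seq_safe_p ({ToyInsn{{1}, {cc}}, ToyInsn{{cc}, {2}}}, {cc}));
  // Use first, clobber afterwards: accepted.
  assert (toy_seq_safe_p ({ToyInsn{{cc}, {2}}, ToyInsn{{1}, {cc}}}, {cc}));
}
```

Testing the use before recording the modification is what makes a
clobber "in or after the last use" acceptable in this model.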