From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id E73433857359 for ; Tue, 28 Nov 2023 10:12:06 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org E73433857359 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org E73433857359 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701166329; cv=none; b=MXfZ1G3xgSIAO0HCGlqY7NlFmcefHiZH5zpG6bFrwdf4UukvwXRypYYjJXeE+Je9LvyIw1vjbfhIt6HfAroEe3Bzs7EYH/vPdVOQhdT/DTvIyxsaYSQTav4pNGBAJZmisHH8qBRfZTHpmfv70C7k3VzLWPvi49995Jh14G73I4M= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701166329; c=relaxed/simple; bh=iWWRovBwnRBAiSLfIgW/HuI9ADEx7VsNyL7VxmzyAJs=; h=From:To:Subject:Date:Message-ID:MIME-Version; b=rCNXX5MN2CQkqq+NRr/Zf4WLBp6rp8AaFtYzdrDTWKFO3MtcZD6+TOxSsVqHDyQHJYC0mo8/w8uyt+v03a3RRTU/wKF2a/zhwud6WHvCbjNWL+oXMbbGQ+FU2ZLZC+LLTad7lxbFrEUfsWNKtWqZ7UUz5IIvD8zHxwpblhNbT9w= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 1E314C15; Tue, 28 Nov 2023 02:12:54 -0800 (PST) Received: from localhost (e121540-lin.manchester.arm.com [10.32.110.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id A931B3F73F; Tue, 28 Nov 2023 02:12:05 -0800 (PST) From: Richard Sandiford To: Manolis Tsamis Mail-Followup-To: Manolis Tsamis ,gcc-patches@gcc.gnu.org, Robin Dapp , Jeff Law , Philipp Tomsich , richard.sandiford@arm.com Cc: gcc-patches@gcc.gnu.org, Robin Dapp , Jeff Law , Philipp Tomsich Subject: Re: [PATCH v2] ifcvt: Handle multiple rewired regs and refactor noce_convert_multiple_sets References: <20231121180054.3949602-1-manolis.tsamis@vrull.eu> Date: Tue, 28 Nov 2023 10:12:04 +0000 In-Reply-To: (Manolis Tsamis's message of "Tue, 28 Nov 2023 10:48:59 +0200") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-22.5 required=5.0 tests=BAYES_00,GIT_PATCH_0,KAM_DMARC_NONE,KAM_DMARC_STATUS,KAM_LAZY_DOMAIN_SECURITY,SPF_HELO_NONE,SPF_NONE,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Manolis Tsamis writes: > On Thu, Nov 23, 2023 at 11:01=E2=80=AFPM Richard Sandiford > wrote: >> >> Manolis Tsamis writes: >> > The existing implementation of need_cmov_or_rewire and >> > noce_convert_multiple_sets_1 assumes that sets are either REG or SUBRE= G. >> > This commit enchances them so they can handle/rewire arbitrary set sta= tements. >> > >> > To do that a new helper struct noce_multiple_sets_info is introduced w= hich is >> > used by noce_convert_multiple_sets and its helper functions. This resu= lts in >> > cleaner function signatures, improved efficientcy (a number of vecs an= d hash >> > set/map are replaced with a single vec of struct) and simplicity. >> > >> > gcc/ChangeLog: >> > >> > * ifcvt.cc (need_cmov_or_rewire): Renamed init_noce_multiple_set= s_info. >> > (init_noce_multiple_sets_info): Initialize noce_multiple_sets_in= fo. >> > (noce_convert_multiple_sets_1): Use noce_multiple_sets_info and = handle >> > rewiring of multiple registers. >> > (noce_convert_multiple_sets): Updated to use noce_multiple_sets_= info. >> > * ifcvt.h (struct noce_multiple_sets_info): Introduce new struct >> > noce_multiple_sets_info to store info for noce_convert_multiple_= sets. >> > >> > Signed-off-by: Manolis Tsamis >> > --- >> >> Thanks, this looks like a really nice clean-up. One comment below: >> > Thanks! > >> > >> > Changes in v2: >> > - Made standalone patch. >> > - Better comments, some more checks. >> > >> > gcc/ifcvt.cc | 252 +++++++++++++++++++++++---------------------------- >> > gcc/ifcvt.h | 16 ++++ >> > 2 files changed, 129 insertions(+), 139 deletions(-) >> > >> > diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc >> > index a0af553b9ff..9486d54de34 100644 >> > --- a/gcc/ifcvt.cc >> > +++ b/gcc/ifcvt.cc >> > @@ -98,14 +98,10 @@ static bool dead_or_predicable (basic_block, basic= _block, basic_block, >> > edge, bool); >> > static void noce_emit_move_insn (rtx, rtx); >> > static rtx_insn *block_has_only_trap (basic_block); >> > -static void need_cmov_or_rewire (basic_block, hash_set *, >> > - hash_map *); >> > +static void init_noce_multiple_sets_info (basic_block, >> > + auto_delete_vec &); >> > static bool noce_convert_multiple_sets_1 (struct noce_if_info *, >> > - hash_set *, >> > - hash_map *, >> > - auto_vec *, >> > - auto_vec *, >> > - auto_vec *, int *); >> > + auto_delete_vec &, int *); >> > >> > /* Count the number of non-jump active insns in BB. */ >> > >> > @@ -3270,24 +3266,13 @@ noce_convert_multiple_sets (struct noce_if_inf= o *if_info) >> > rtx x =3D XEXP (cond, 0); >> > rtx y =3D XEXP (cond, 1); >> > >> > - /* The true targets for a conditional move. */ >> > - auto_vec targets; >> > - /* The temporaries introduced to allow us to not consider register >> > - overlap. */ >> > - auto_vec temporaries; >> > - /* The insns we've emitted. */ >> > - auto_vec unmodified_insns; >> > - >> > - hash_set need_no_cmov; >> > - hash_map rewired_src; >> > - >> > - need_cmov_or_rewire (then_bb, &need_no_cmov, &rewired_src); >> > + auto_delete_vec insn_info; >> > + init_noce_multiple_sets_info (then_bb, insn_info); >> > >> > int last_needs_comparison =3D -1; >> > >> > bool ok =3D noce_convert_multiple_sets_1 >> > - (if_info, &need_no_cmov, &rewired_src, &targets, &temporaries, >> > - &unmodified_insns, &last_needs_comparison); >> > + (if_info, insn_info, &last_needs_comparison); >> > if (!ok) >> > return false; >> > >> > @@ -3302,8 +3287,7 @@ noce_convert_multiple_sets (struct noce_if_info = *if_info) >> > end_sequence (); >> > start_sequence (); >> > ok =3D noce_convert_multiple_sets_1 >> > - (if_info, &need_no_cmov, &rewired_src, &targets, &temporaries, >> > - &unmodified_insns, &last_needs_comparison); >> > + (if_info, insn_info, &last_needs_comparison); >> > /* Actually we should not fail anymore if we reached here, >> > but better still check. */ >> > if (!ok) >> > @@ -3312,12 +3296,12 @@ noce_convert_multiple_sets (struct noce_if_inf= o *if_info) >> > >> > /* We must have seen some sort of insn to insert, otherwise we were >> > given an empty BB to convert, and we can't handle that. */ >> > - gcc_assert (!unmodified_insns.is_empty ()); >> > + gcc_assert (!insn_info.is_empty ()); >> > >> > /* Now fixup the assignments. */ >> > - for (unsigned i =3D 0; i < targets.length (); i++) >> > - if (targets[i] !=3D temporaries[i]) >> > - noce_emit_move_insn (targets[i], temporaries[i]); >> > + for (unsigned i =3D 0; i < insn_info.length (); i++) >> > + if (insn_info[i]->target !=3D insn_info[i]->temporary) >> > + noce_emit_move_insn (insn_info[i]->target, insn_info[i]->tempor= ary); >> > >> > /* Actually emit the sequence if it isn't too expensive. */ >> > rtx_insn *seq =3D get_insns (); >> > @@ -3332,10 +3316,10 @@ noce_convert_multiple_sets (struct noce_if_inf= o *if_info) >> > set_used_flags (insn); >> > >> > /* Mark all our temporaries and targets as used. */ >> > - for (unsigned i =3D 0; i < targets.length (); i++) >> > + for (unsigned i =3D 0; i < insn_info.length (); i++) >> > { >> > - set_used_flags (temporaries[i]); >> > - set_used_flags (targets[i]); >> > + set_used_flags (insn_info[i]->temporary); >> > + set_used_flags (insn_info[i]->target); >> > } >> > >> > set_used_flags (cond); >> > @@ -3354,7 +3338,7 @@ noce_convert_multiple_sets (struct noce_if_info = *if_info) >> > return false; >> > >> > emit_insn_before_setloc (seq, if_info->jump, >> > - INSN_LOCATION (unmodified_insns.last ())); >> > + INSN_LOCATION (insn_info.last ()->unmodified_= insn)); >> > >> > /* Clean up THEN_BB and the edges in and out of it. */ >> > remove_edge (find_edge (test_bb, join_bb)); >> > @@ -3389,21 +3373,13 @@ check_for_cc_cmp_clobbers (rtx dest, const_rtx= , void *p0) >> > p[0] =3D NULL_RTX; >> > } >> > >> > -/* This goes through all relevant insns of IF_INFO->then_bb and tries= to >> > - create conditional moves. In case a simple move sufficis the insn >> > - should be listed in NEED_NO_CMOV. The rewired-src cases should be >> > - specified via REWIRED_SRC. TARGETS, TEMPORARIES and UNMODIFIED_IN= SNS >> > - are specified and used in noce_convert_multiple_sets and should be= passed >> > - to this function.. */ >> > +/* This goes through all relevant insns of IF_INFO->then_bb and tries= to create >> > + conditional moves. Information for the insns is kept in INSN_INFO= . */ >> > >> > static bool >> > noce_convert_multiple_sets_1 (struct noce_if_info *if_info, >> > - hash_set *need_no_cmov, >> > - hash_map *rewired_src, >> > - auto_vec *targets, >> > - auto_vec *temporaries, >> > - auto_vec *unmodified_insns, >> > - int *last_needs_comparison) >> > + auto_delete_vec &insn_i= nfo, >> > + int *last_needs_comparison) >> > { >> > basic_block then_bb =3D if_info->then_bb; >> > rtx_insn *jump =3D if_info->jump; >> > @@ -3421,11 +3397,6 @@ noce_convert_multiple_sets_1 (struct noce_if_in= fo *if_info, >> > >> > rtx_insn *insn; >> > int count =3D 0; >> > - >> > - targets->truncate (0); >> > - temporaries->truncate (0); >> > - unmodified_insns->truncate (0); >> > - >> > bool second_try =3D *last_needs_comparison !=3D -1; >> > >> > FOR_BB_INSNS (then_bb, insn) >> > @@ -3434,6 +3405,8 @@ noce_convert_multiple_sets_1 (struct noce_if_inf= o *if_info, >> > if (!active_insn_p (insn)) >> > continue; >> > >> > + noce_multiple_sets_info *info =3D insn_info[count]; >> > + >> > rtx set =3D single_set (insn); >> > gcc_checking_assert (set); >> > >> > @@ -3441,9 +3414,12 @@ noce_convert_multiple_sets_1 (struct noce_if_in= fo *if_info, >> > rtx temp; >> > >> > rtx new_val =3D SET_SRC (set); >> > - if (int *ii =3D rewired_src->get (insn)) >> > - new_val =3D simplify_replace_rtx (new_val, (*targets)[*ii], >> > - (*temporaries)[*ii]); >> > + >> > + int i, ii; >> > + FOR_EACH_VEC_ELT (info->rewired_src, i, ii) >> > + new_val =3D simplify_replace_rtx (new_val, insn_info[ii]->target, >> > + insn_info[ii]->temporary); >> > + >> > rtx old_val =3D target; >> > >> > /* As we are transforming >> > @@ -3484,7 +3460,7 @@ noce_convert_multiple_sets_1 (struct noce_if_inf= o *if_info, >> > /* We have identified swap-style idioms before. A normal >> > set will need to be a cmov while the first instruction of a swa= p-style >> > idiom can be a regular move. This helps with costing. */ >> > - bool need_cmov =3D !need_no_cmov->contains (insn); >> > + bool need_cmov =3D info->need_cmov; >> > >> > /* If we had a non-canonical conditional jump (i.e. one where >> > the fallthrough is to the "else" case) we need to reverse >> > @@ -3632,21 +3608,102 @@ noce_convert_multiple_sets_1 (struct noce_if_= info *if_info, >> > >> > /* Bookkeeping. */ >> > count++; >> > - targets->safe_push (target); >> > - temporaries->safe_push (temp_dest); >> > - unmodified_insns->safe_push (insn); >> > + >> > + info->target =3D target; >> > + info->temporary =3D temp_dest; >> > + info->unmodified_insn =3D insn; >> > } >> > >> > + /* The number of processed insns must match init_noce_multiple_sets= _info. */ >> > + gcc_assert (count =3D=3D (int) insn_info.length ()); >> > + >> > /* Even if we did not actually need the comparison, we want to make= sure >> > to try a second time in order to get rid of the temporaries. */ >> > if (*last_needs_comparison =3D=3D -1) >> > *last_needs_comparison =3D 0; >> > >> > - >> > return true; >> > } >> > >> > +/* Find local swap-style idioms in BB and mark the first insn (1) >> > + that is only a temporary as not needing a conditional move as >> > + it is going to be dead afterwards anyway. >> > + >> > + (1) int tmp =3D a; >> > + a =3D b; >> > + b =3D tmp; >> > + >> > + ifcvt >> > + --> >> > + >> > + tmp =3D a; >> > + a =3D cond ? b : a_old; >> > + b =3D cond ? tmp : b_old; >> > + >> > + Additionally, store the index of insns like (2) when a subsequent >> > + SET reads from their destination. >> > >> > + (2) int c =3D a; >> > + int d =3D c; >> > + >> > + ifcvt >> > + --> >> > + >> > + c =3D cond ? a : c_old; >> > + d =3D cond ? d : c; // Need to use c rather than c_old here. >> > +*/ >> > + >> > +static void >> > +init_noce_multiple_sets_info (basic_block bb, >> > + auto_delete_vec &insn_info) >> > +{ >> > + rtx_insn *insn; >> > + int count =3D 0; >> > + auto_vec dests; >> > + >> > + /* Iterate over all SETs, storing the destinations >> > + in DEST. >> > + - If we hit a SET that reads from a destination >> > + that we have seen before and the corresponding register >> > + is dead afterwards, the register does not need to be >> > + moved conditionally. >> > + - If we encounter a previously changed register, >> > + rewire the read to the original source. */ >> > + FOR_BB_INSNS (bb, insn) >> > + { >> > + if (!active_insn_p (insn)) >> > + continue; >> > + >> > + noce_multiple_sets_info *info =3D new noce_multiple_sets_info; >> > + info->target =3D NULL_RTX; >> > + info->temporary =3D NULL_RTX; >> > + info->unmodified_insn =3D NULL; >> > + info->need_cmov =3D true; >> > + insn_info.safe_push (info); >> > + >> > + rtx set =3D single_set (insn); >> > + gcc_checking_assert (set); >> > + >> > + rtx src =3D SET_SRC (set); >> > + rtx dest =3D SET_DEST (set); >> > + >> > + /* Check if the current SET's source is the same >> > + as any previously seen destination. >> > + This is quadratic but the number of insns in BB >> > + is bounded by PARAM_MAX_RTL_IF_CONVERSION_INSNS. */ >> > + for (int i =3D count - 1; i >=3D 0; --i) >> > + if (reg_mentioned_p (dests[i], src)) >> > + { >> > + if (find_reg_note (insn, REG_DEAD, src) !=3D NULL_RTX) >> >> This is pre-existing, but is a REG_DEAD note enough? I think we also >> need to check that the register isn't live on exit from the condition >> block (or, alternatively, isn't live on entry to the join block). >> >> E.g. for: >> >> if (cond) >> { >> tmp =3D a; >> a =3D b; >> b =3D tmp; >> tmp =3D ...; >> } >> ...tmp...; >> >> we need to preserve the original value of tmp when !cond, even though >> tmp is temporarily dead in the "then" block. >> >> Thanks, >> Richard >> > Good question. I played around with some tests and looked at the > original code again and I believe it's fine. > In your example, when insn is at `b =3D tmp;` then `tmp =3D a;` will be > marked as need_no_cmov because tmp will have a REG_DEAD at insn. `tmp > =3D ...;` will be emitted as a conditional set and the use outside the > conditional will be just fine. I don't see any issue. But I think we end up with: tmp =3D a; a =3D cond ? b : a; b =3D cond ? tmp : b; tmp =3D cond ? ... : tmp; So the net effect for !cond will be to set tmp to a, rather than preserve the original value of tmp. Thanks, Richard > > But it made me realize that my updated code doesn't really make sense > on that part. Since I allow src to be an arbitrary expression and not > a single reg, the code > > if (find_reg_note (insn, REG_DEAD, src) !=3D NULL_RTX) > > is nonsensical. Apart from the fact that the check is loop invariant > it also doesn't properly handle src anyway. I'll have to think again > about how to refactor this check now that src is not just a reg. > > Thanks, > Manolis