From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pj1-x1030.google.com (mail-pj1-x1030.google.com [IPv6:2607:f8b0:4864:20::1030]) by sourceware.org (Postfix) with ESMTPS id C4EA23858D37 for ; Tue, 28 Nov 2023 08:49:36 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org C4EA23858D37 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=vrull.eu Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=vrull.eu ARC-Filter: OpenARC Filter v1.0.0 sourceware.org C4EA23858D37 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::1030 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701161379; cv=none; b=nosTUK2khRK+xAN0RbJ7UdQp2OmQC+fLjORdkBp5WeiTkTrU2zyvbQ48bue0LZfTPFw/E5jO5pkCNhWnn3sOZ/kwYi+2HGpqRK9gi0ICI0fsDEGKPFAuACWLuWX3smJYq9euk/PU5fRIw9+x7tO75hOrWLxxmP0bg75AtPBrbJU= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701161379; c=relaxed/simple; bh=xkJHF+ADwFxPbtevvFYRspRLy5iKAeXPoQRVUaycgiI=; h=DKIM-Signature:MIME-Version:From:Date:Message-ID:Subject:To; b=G7xQhELIGxU2WegcUHeVlFmeAVe8einuH0l798TUasy3N6ae7pN6U+1yszyeeHv2ui2FWD+BQ+ewukkZbN1Kx+8/Uw3SsnZK6pwAD+njw5S7E7gqibcVi8JknclZ1Z5t+bfbTXQC8yhBAevCR9fiqgFaX6DNjEJwgaSCXRRnnYE= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-pj1-x1030.google.com with SMTP id 98e67ed59e1d1-2858f58ed3cso3472498a91.2 for ; Tue, 28 Nov 2023 00:49:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=vrull.eu; s=google; t=1701161375; x=1701766175; darn=gcc.gnu.org; h=content-transfer-encoding:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=Jroesrxr9u6AwHJzZO9YDzIiDuxw7z16pPJOwkc7RFs=; b=azGGi6TwbdQJXqo5d2mHKSjfiS2kqgYZN8+q6TJ4c9+AkAtyIJ8lBFVHrv4px8r8zj lPWxhfQXaLm9cOWO9lw9gJIuBIghvJIRV6/096W0PWABz1nQfHJcmxNJidLAEkBLVHEs Zy0sktCmmxAvBpg9UDFqGE0i9llSPxyM6CUEWMQd2BhrAtT3ZP8BkNqISud7oGrdZWCZ i8/neLQV4xZWbqiB+MocCKUios0qSBYn4I5tdKf4ZCxykxzQlPbdbb4xp2B18dU5F8+/ gT72BtLaq1G1tqxHTSOdLQt+9Bz/gpHMEZ6W4E6LTG3jJv6zf0pDfLzUIC4sy1ibGK2I DrIw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701161375; x=1701766175; h=content-transfer-encoding:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Jroesrxr9u6AwHJzZO9YDzIiDuxw7z16pPJOwkc7RFs=; b=EytLFLCYFfW15pLkdqEwiK83+So4lBnfdR0SN+xbsL9bExCFxI6DKYx4CDf1c3Bb20 9uqhwLoTCqAh7reTl7sv3TX8C/Yby0OPRKixB8UdVES2H0c5GZUJmsHXHVYgDQ0+MfTk IDxzQlKU1J6827npsMrxaSl9y50fwFhSfHVv9tvFEiqKK0ehXi0m4Mrplmp5wTCmZneu YdQ44W0NZFNNPpNdW92Q5OOlbKsHo9v+bxhOGhxNU7kV6L3yPkhLthhJ/m/Gz+1kpSmy /wX5DHySfpEUcO3j56wh2pDEhD/MNhojWHIUbYqtWk4oqIRjhqI36uhuGXM/eDdIRaom SFCQ== X-Gm-Message-State: AOJu0Yze9BxdIiFqLpY8HBQswxrl54Rk3DK9CWXvzPHTZoh5n+7ogqru 7QPh8hzjzJDC/XjqYmFdRMAc19kgDs+wFPu7xmyg6Q== X-Google-Smtp-Source: AGHT+IEq8LnnJSK9TAHx8KYP6FKhXe3INGygjBiYVHSbK6ion/1cIxLfGeRHWCa+nOECzVJV/nZg5nXWqg4WlEkTqSw= X-Received: by 2002:a17:90b:314c:b0:285:a18a:49c0 with SMTP id ip12-20020a17090b314c00b00285a18a49c0mr12742199pjb.28.1701161375397; Tue, 28 Nov 2023 00:49:35 -0800 (PST) MIME-Version: 1.0 References: <20231121180054.3949602-1-manolis.tsamis@vrull.eu> In-Reply-To: From: Manolis Tsamis Date: Tue, 28 Nov 2023 10:48:59 +0200 Message-ID: Subject: Re: [PATCH v2] ifcvt: Handle multiple rewired regs and refactor noce_convert_multiple_sets To: Manolis Tsamis , gcc-patches@gcc.gnu.org, Robin Dapp , Jeff Law , Philipp Tomsich , richard.sandiford@arm.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-8.4 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,JMQ_SPF_NEUTRAL,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Thu, Nov 23, 2023 at 11:01=E2=80=AFPM Richard Sandiford wrote: > > Manolis Tsamis writes: > > The existing implementation of need_cmov_or_rewire and > > noce_convert_multiple_sets_1 assumes that sets are either REG or SUBREG= . > > This commit enchances them so they can handle/rewire arbitrary set stat= ements. > > > > To do that a new helper struct noce_multiple_sets_info is introduced wh= ich is > > used by noce_convert_multiple_sets and its helper functions. This resul= ts in > > cleaner function signatures, improved efficientcy (a number of vecs and= hash > > set/map are replaced with a single vec of struct) and simplicity. > > > > gcc/ChangeLog: > > > > * ifcvt.cc (need_cmov_or_rewire): Renamed init_noce_multiple_sets= _info. > > (init_noce_multiple_sets_info): Initialize noce_multiple_sets_inf= o. > > (noce_convert_multiple_sets_1): Use noce_multiple_sets_info and h= andle > > rewiring of multiple registers. > > (noce_convert_multiple_sets): Updated to use noce_multiple_sets_i= nfo. > > * ifcvt.h (struct noce_multiple_sets_info): Introduce new struct > > noce_multiple_sets_info to store info for noce_convert_multiple_s= ets. > > > > Signed-off-by: Manolis Tsamis > > --- > > Thanks, this looks like a really nice clean-up. One comment below: > Thanks! > > > > Changes in v2: > > - Made standalone patch. > > - Better comments, some more checks. > > > > gcc/ifcvt.cc | 252 +++++++++++++++++++++++---------------------------- > > gcc/ifcvt.h | 16 ++++ > > 2 files changed, 129 insertions(+), 139 deletions(-) > > > > diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc > > index a0af553b9ff..9486d54de34 100644 > > --- a/gcc/ifcvt.cc > > +++ b/gcc/ifcvt.cc > > @@ -98,14 +98,10 @@ static bool dead_or_predicable (basic_block, basic_= block, basic_block, > > edge, bool); > > static void noce_emit_move_insn (rtx, rtx); > > static rtx_insn *block_has_only_trap (basic_block); > > -static void need_cmov_or_rewire (basic_block, hash_set *, > > - hash_map *); > > +static void init_noce_multiple_sets_info (basic_block, > > + auto_delete_vec &); > > static bool noce_convert_multiple_sets_1 (struct noce_if_info *, > > - hash_set *, > > - hash_map *, > > - auto_vec *, > > - auto_vec *, > > - auto_vec *, int *); > > + auto_delete_vec &, int *); > > > > /* Count the number of non-jump active insns in BB. */ > > > > @@ -3270,24 +3266,13 @@ noce_convert_multiple_sets (struct noce_if_info= *if_info) > > rtx x =3D XEXP (cond, 0); > > rtx y =3D XEXP (cond, 1); > > > > - /* The true targets for a conditional move. */ > > - auto_vec targets; > > - /* The temporaries introduced to allow us to not consider register > > - overlap. */ > > - auto_vec temporaries; > > - /* The insns we've emitted. */ > > - auto_vec unmodified_insns; > > - > > - hash_set need_no_cmov; > > - hash_map rewired_src; > > - > > - need_cmov_or_rewire (then_bb, &need_no_cmov, &rewired_src); > > + auto_delete_vec insn_info; > > + init_noce_multiple_sets_info (then_bb, insn_info); > > > > int last_needs_comparison =3D -1; > > > > bool ok =3D noce_convert_multiple_sets_1 > > - (if_info, &need_no_cmov, &rewired_src, &targets, &temporaries, > > - &unmodified_insns, &last_needs_comparison); > > + (if_info, insn_info, &last_needs_comparison); > > if (!ok) > > return false; > > > > @@ -3302,8 +3287,7 @@ noce_convert_multiple_sets (struct noce_if_info *= if_info) > > end_sequence (); > > start_sequence (); > > ok =3D noce_convert_multiple_sets_1 > > - (if_info, &need_no_cmov, &rewired_src, &targets, &temporaries, > > - &unmodified_insns, &last_needs_comparison); > > + (if_info, insn_info, &last_needs_comparison); > > /* Actually we should not fail anymore if we reached here, > > but better still check. */ > > if (!ok) > > @@ -3312,12 +3296,12 @@ noce_convert_multiple_sets (struct noce_if_info= *if_info) > > > > /* We must have seen some sort of insn to insert, otherwise we were > > given an empty BB to convert, and we can't handle that. */ > > - gcc_assert (!unmodified_insns.is_empty ()); > > + gcc_assert (!insn_info.is_empty ()); > > > > /* Now fixup the assignments. */ > > - for (unsigned i =3D 0; i < targets.length (); i++) > > - if (targets[i] !=3D temporaries[i]) > > - noce_emit_move_insn (targets[i], temporaries[i]); > > + for (unsigned i =3D 0; i < insn_info.length (); i++) > > + if (insn_info[i]->target !=3D insn_info[i]->temporary) > > + noce_emit_move_insn (insn_info[i]->target, insn_info[i]->tempora= ry); > > > > /* Actually emit the sequence if it isn't too expensive. */ > > rtx_insn *seq =3D get_insns (); > > @@ -3332,10 +3316,10 @@ noce_convert_multiple_sets (struct noce_if_info= *if_info) > > set_used_flags (insn); > > > > /* Mark all our temporaries and targets as used. */ > > - for (unsigned i =3D 0; i < targets.length (); i++) > > + for (unsigned i =3D 0; i < insn_info.length (); i++) > > { > > - set_used_flags (temporaries[i]); > > - set_used_flags (targets[i]); > > + set_used_flags (insn_info[i]->temporary); > > + set_used_flags (insn_info[i]->target); > > } > > > > set_used_flags (cond); > > @@ -3354,7 +3338,7 @@ noce_convert_multiple_sets (struct noce_if_info *= if_info) > > return false; > > > > emit_insn_before_setloc (seq, if_info->jump, > > - INSN_LOCATION (unmodified_insns.last ())); > > + INSN_LOCATION (insn_info.last ()->unmodified_i= nsn)); > > > > /* Clean up THEN_BB and the edges in and out of it. */ > > remove_edge (find_edge (test_bb, join_bb)); > > @@ -3389,21 +3373,13 @@ check_for_cc_cmp_clobbers (rtx dest, const_rtx,= void *p0) > > p[0] =3D NULL_RTX; > > } > > > > -/* This goes through all relevant insns of IF_INFO->then_bb and tries = to > > - create conditional moves. In case a simple move sufficis the insn > > - should be listed in NEED_NO_CMOV. The rewired-src cases should be > > - specified via REWIRED_SRC. TARGETS, TEMPORARIES and UNMODIFIED_INS= NS > > - are specified and used in noce_convert_multiple_sets and should be = passed > > - to this function.. */ > > +/* This goes through all relevant insns of IF_INFO->then_bb and tries = to create > > + conditional moves. Information for the insns is kept in INSN_INFO.= */ > > > > static bool > > noce_convert_multiple_sets_1 (struct noce_if_info *if_info, > > - hash_set *need_no_cmov, > > - hash_map *rewired_src, > > - auto_vec *targets, > > - auto_vec *temporaries, > > - auto_vec *unmodified_insns, > > - int *last_needs_comparison) > > + auto_delete_vec &insn_in= fo, > > + int *last_needs_comparison) > > { > > basic_block then_bb =3D if_info->then_bb; > > rtx_insn *jump =3D if_info->jump; > > @@ -3421,11 +3397,6 @@ noce_convert_multiple_sets_1 (struct noce_if_inf= o *if_info, > > > > rtx_insn *insn; > > int count =3D 0; > > - > > - targets->truncate (0); > > - temporaries->truncate (0); > > - unmodified_insns->truncate (0); > > - > > bool second_try =3D *last_needs_comparison !=3D -1; > > > > FOR_BB_INSNS (then_bb, insn) > > @@ -3434,6 +3405,8 @@ noce_convert_multiple_sets_1 (struct noce_if_info= *if_info, > > if (!active_insn_p (insn)) > > continue; > > > > + noce_multiple_sets_info *info =3D insn_info[count]; > > + > > rtx set =3D single_set (insn); > > gcc_checking_assert (set); > > > > @@ -3441,9 +3414,12 @@ noce_convert_multiple_sets_1 (struct noce_if_inf= o *if_info, > > rtx temp; > > > > rtx new_val =3D SET_SRC (set); > > - if (int *ii =3D rewired_src->get (insn)) > > - new_val =3D simplify_replace_rtx (new_val, (*targets)[*ii], > > - (*temporaries)[*ii]); > > + > > + int i, ii; > > + FOR_EACH_VEC_ELT (info->rewired_src, i, ii) > > + new_val =3D simplify_replace_rtx (new_val, insn_info[ii]->target, > > + insn_info[ii]->temporary); > > + > > rtx old_val =3D target; > > > > /* As we are transforming > > @@ -3484,7 +3460,7 @@ noce_convert_multiple_sets_1 (struct noce_if_info= *if_info, > > /* We have identified swap-style idioms before. A normal > > set will need to be a cmov while the first instruction of a swap= -style > > idiom can be a regular move. This helps with costing. */ > > - bool need_cmov =3D !need_no_cmov->contains (insn); > > + bool need_cmov =3D info->need_cmov; > > > > /* If we had a non-canonical conditional jump (i.e. one where > > the fallthrough is to the "else" case) we need to reverse > > @@ -3632,21 +3608,102 @@ noce_convert_multiple_sets_1 (struct noce_if_i= nfo *if_info, > > > > /* Bookkeeping. */ > > count++; > > - targets->safe_push (target); > > - temporaries->safe_push (temp_dest); > > - unmodified_insns->safe_push (insn); > > + > > + info->target =3D target; > > + info->temporary =3D temp_dest; > > + info->unmodified_insn =3D insn; > > } > > > > + /* The number of processed insns must match init_noce_multiple_sets_= info. */ > > + gcc_assert (count =3D=3D (int) insn_info.length ()); > > + > > /* Even if we did not actually need the comparison, we want to make = sure > > to try a second time in order to get rid of the temporaries. */ > > if (*last_needs_comparison =3D=3D -1) > > *last_needs_comparison =3D 0; > > > > - > > return true; > > } > > > > +/* Find local swap-style idioms in BB and mark the first insn (1) > > + that is only a temporary as not needing a conditional move as > > + it is going to be dead afterwards anyway. > > + > > + (1) int tmp =3D a; > > + a =3D b; > > + b =3D tmp; > > + > > + ifcvt > > + --> > > + > > + tmp =3D a; > > + a =3D cond ? b : a_old; > > + b =3D cond ? tmp : b_old; > > + > > + Additionally, store the index of insns like (2) when a subsequent > > + SET reads from their destination. > > > > + (2) int c =3D a; > > + int d =3D c; > > + > > + ifcvt > > + --> > > + > > + c =3D cond ? a : c_old; > > + d =3D cond ? d : c; // Need to use c rather than c_old here. > > +*/ > > + > > +static void > > +init_noce_multiple_sets_info (basic_block bb, > > + auto_delete_vec &insn_info) > > +{ > > + rtx_insn *insn; > > + int count =3D 0; > > + auto_vec dests; > > + > > + /* Iterate over all SETs, storing the destinations > > + in DEST. > > + - If we hit a SET that reads from a destination > > + that we have seen before and the corresponding register > > + is dead afterwards, the register does not need to be > > + moved conditionally. > > + - If we encounter a previously changed register, > > + rewire the read to the original source. */ > > + FOR_BB_INSNS (bb, insn) > > + { > > + if (!active_insn_p (insn)) > > + continue; > > + > > + noce_multiple_sets_info *info =3D new noce_multiple_sets_info; > > + info->target =3D NULL_RTX; > > + info->temporary =3D NULL_RTX; > > + info->unmodified_insn =3D NULL; > > + info->need_cmov =3D true; > > + insn_info.safe_push (info); > > + > > + rtx set =3D single_set (insn); > > + gcc_checking_assert (set); > > + > > + rtx src =3D SET_SRC (set); > > + rtx dest =3D SET_DEST (set); > > + > > + /* Check if the current SET's source is the same > > + as any previously seen destination. > > + This is quadratic but the number of insns in BB > > + is bounded by PARAM_MAX_RTL_IF_CONVERSION_INSNS. */ > > + for (int i =3D count - 1; i >=3D 0; --i) > > + if (reg_mentioned_p (dests[i], src)) > > + { > > + if (find_reg_note (insn, REG_DEAD, src) !=3D NULL_RTX) > > This is pre-existing, but is a REG_DEAD note enough? I think we also > need to check that the register isn't live on exit from the condition > block (or, alternatively, isn't live on entry to the join block). > > E.g. for: > > if (cond) > { > tmp =3D a; > a =3D b; > b =3D tmp; > tmp =3D ...; > } > ...tmp...; > > we need to preserve the original value of tmp when !cond, even though > tmp is temporarily dead in the "then" block. > > Thanks, > Richard > Good question. I played around with some tests and looked at the original code again and I believe it's fine. In your example, when insn is at `b =3D tmp;` then `tmp =3D a;` will be marked as need_no_cmov because tmp will have a REG_DEAD at insn. `tmp =3D ...;` will be emitted as a conditional set and the use outside the conditional will be just fine. I don't see any issue. But it made me realize that my updated code doesn't really make sense on that part. Since I allow src to be an arbitrary expression and not a single reg, the code if (find_reg_note (insn, REG_DEAD, src) !=3D NULL_RTX) is nonsensical. Apart from the fact that the check is loop invariant it also doesn't properly handle src anyway. I'll have to think again about how to refactor this check now that src is not just a reg. Thanks, Manolis > > > + insn_info[i]->need_cmov =3D false; > > + else > > + insn_info[count]->rewired_src.safe_push (i); > > + } > > + > > + dests.safe_push (dest); > > + count++; > > + } > > +} > > > > /* Return true iff basic block TEST_BB is comprised of only > > (SET (REG) (REG)) insns suitable for conversion to a series > > @@ -4121,89 +4178,6 @@ check_cond_move_block (basic_block bb, > > return true; > > } > > > > -/* Find local swap-style idioms in BB and mark the first insn (1) > > - that is only a temporary as not needing a conditional move as > > - it is going to be dead afterwards anyway. > > - > > - (1) int tmp =3D a; > > - a =3D b; > > - b =3D tmp; > > - > > - ifcvt > > - --> > > - > > - tmp =3D a; > > - a =3D cond ? b : a_old; > > - b =3D cond ? tmp : b_old; > > - > > - Additionally, store the index of insns like (2) when a subsequent > > - SET reads from their destination. > > - > > - (2) int c =3D a; > > - int d =3D c; > > - > > - ifcvt > > - --> > > - > > - c =3D cond ? a : c_old; > > - d =3D cond ? d : c; // Need to use c rather than c_old here. > > -*/ > > - > > -static void > > -need_cmov_or_rewire (basic_block bb, > > - hash_set *need_no_cmov, > > - hash_map *rewired_src) > > -{ > > - rtx_insn *insn; > > - int count =3D 0; > > - auto_vec insns; > > - auto_vec dests; > > - > > - /* Iterate over all SETs, storing the destinations > > - in DEST. > > - - If we hit a SET that reads from a destination > > - that we have seen before and the corresponding register > > - is dead afterwards, the register does not need to be > > - moved conditionally. > > - - If we encounter a previously changed register, > > - rewire the read to the original source. */ > > - FOR_BB_INSNS (bb, insn) > > - { > > - rtx set, src, dest; > > - > > - if (!active_insn_p (insn)) > > - continue; > > - > > - set =3D single_set (insn); > > - if (set =3D=3D NULL_RTX) > > - continue; > > - > > - src =3D SET_SRC (set); > > - if (SUBREG_P (src)) > > - src =3D SUBREG_REG (src); > > - dest =3D SET_DEST (set); > > - > > - /* Check if the current SET's source is the same > > - as any previously seen destination. > > - This is quadratic but the number of insns in BB > > - is bounded by PARAM_MAX_RTL_IF_CONVERSION_INSNS. */ > > - if (REG_P (src)) > > - for (int i =3D count - 1; i >=3D 0; --i) > > - if (reg_overlap_mentioned_p (src, dests[i])) > > - { > > - if (find_reg_note (insn, REG_DEAD, src) !=3D NULL_RTX) > > - need_no_cmov->add (insns[i]); > > - else > > - rewired_src->put (insn, i); > > - } > > - > > - insns.safe_push (insn); > > - dests.safe_push (dest); > > - > > - count++; > > - } > > -} > > - > > /* Given a basic block BB suitable for conditional move conversion, > > a condition COND, and pointer maps THEN_VALS and ELSE_VALS containi= ng > > the register values depending on COND, emit the insns in the block = as > > diff --git a/gcc/ifcvt.h b/gcc/ifcvt.h > > index be1385aabe4..83420fe1207 100644 > > --- a/gcc/ifcvt.h > > +++ b/gcc/ifcvt.h > > @@ -40,6 +40,22 @@ struct ce_if_block > > int pass; /* Pass number. */ > > }; > > > > +struct noce_multiple_sets_info > > +{ > > + /* A list of indices to instructions that we need to rewire into thi= s > > + instruction when we replace them with temporary conditional moves= . */ > > + auto_vec rewired_src; > > + /* The true targets for a conditional move. */ > > + rtx target; > > + /* The temporary introduced for this conditional move. */ > > + rtx temporary; > > + /* The original instruction that we're replacing. */ > > + rtx_insn *unmodified_insn; > > + /* True if a conditional move is needed. > > + False if a simple move can be used instead. */ > > + bool need_cmov; > > +}; > > + > > /* Used by noce_process_if_block to communicate with its subroutines. > > > > The subroutines know that A and B may be evaluated freely. They