From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pf1-x42c.google.com (mail-pf1-x42c.google.com [IPv6:2607:f8b0:4864:20::42c]) by sourceware.org (Postfix) with ESMTPS id 1B89338582B6 for ; Tue, 28 Nov 2023 10:35:24 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 1B89338582B6 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=vrull.eu Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=vrull.eu ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 1B89338582B6 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::42c ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701167726; cv=none; b=pZW6Cg1F5JQlbGC+9mlcKS63cU8aPXN/hjVllXqiQDDcuX27Zd2YRy3QgZE4h1x/MdvDFjMeYSp7aeqYsF0X0VKPHjlTXjWLslFVQmgPizihRy44xChx3ZsyWCd2Fkahb6hCWpnBP7QwX9BvNXZM3GdouXBVkI23f1XLfxgeZyI= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701167726; c=relaxed/simple; bh=LwT2q0zP81XPYFAJF4iEstDb0wPOGxvSU5MSe2d4SLw=; h=DKIM-Signature:MIME-Version:From:Date:Message-ID:Subject:To; b=RljVC0GAAf/oPXq5g0xPgts9iVHNO6Kj65VVtHQwooA6pcwTUebn+EaYP9/I0Eh5AtPexMRuqLzMvSOgjNM60LKlfJr1QrAvPsOCR94mm0XOesgg3NOP9Fr1IfvPAK0UMVhL/xvO6AAXIiG7IMxFIQo1c5iWwEk8m5g/bZSj+rc= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-pf1-x42c.google.com with SMTP id d2e1a72fcca58-6cc02e77a9cso2866302b3a.0 for ; Tue, 28 Nov 2023 02:35:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=vrull.eu; s=google; t=1701167723; x=1701772523; darn=gcc.gnu.org; h=content-transfer-encoding:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=DhMkYnuoJsTl5j7OwPcLfYA3jU+Fzs5kIm2G2u8N0Ug=; b=pN0Ums6JrLKdk2myr9bSHDgccVNoz2EivGK6OLd2pCkUhW2x2x34EZsIHFXTos0zbq tL6hL9MzlwzhDLtJ6MXz4wiB7/sumSt7+zmBLf2PwG4777t2W5bxpAgp6AuPlRGqC9hc 1hxo0zkE3yZ2iXLxYGL8JEXv6GtcPkVMuZ4f0NS6KoWe6ib5sg4+TAhrITCjxi+imEIg 1BAunHiHKbMswW2ZgeyBqgzsnWH5a7sGBgaaGtR8BA2XaetVJPUqS/z3vERMNppbBkPF 2V4mjb8konmpOywbplDYa2zsBs4eZmoat5bz6maJ2P6U1nBCEYCU2uldaWys36G+vFOz tqCw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701167723; x=1701772523; h=content-transfer-encoding:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=DhMkYnuoJsTl5j7OwPcLfYA3jU+Fzs5kIm2G2u8N0Ug=; b=bxn0R2V9LrEF9FaJ0HCYnYn54v0u64ZShyqHmmUjoOFlPofxu7b2RXHKUyziL4v/dH /nLQBHpyshaEHv9AorlPtPgHDKHPVi2dTwxHzl9PI7InVYGt9L5kec4sBuB4UBDxoF6L m2sXuH72coVt5MeQ/oZQ1GzebaAWiqdmJ6sofijODjmHPjvJbDjqo4PKFW+OLLpakBN/ pzwHAK77zNApiQdj09HaPgCwh5C/xUofwL+G/qGexhYqL6he+jm8pwZ7M5koXNWlc9Tu U5dZ+7DvImKsTE0gyGiEsdKx70AW+LspojfnHzVBbXD9z/LDxjYJSQQgMlD10t65dvJx W6wQ== X-Gm-Message-State: AOJu0Yzs8EotfMLj6Hn8Uw8t905z/dlpI7lZ7tk5uKaaop41iIqlpu7M 5Ek1roI/X9qcbGbmRzFoFht/7hsIPHEAfE7nrDHsjw== X-Google-Smtp-Source: AGHT+IHYpnsmMRl2cqyEUfN3y5hlFOurf1PNbVqtqLO6DLHbvump4TpVR4Z0RTP0N2/FRB3NsTaQPr2Poe4+k4EpSfo= X-Received: by 2002:a05:6a20:a122:b0:18c:95f1:20bf with SMTP id q34-20020a056a20a12200b0018c95f120bfmr6996034pzk.47.1701167722867; Tue, 28 Nov 2023 02:35:22 -0800 (PST) MIME-Version: 1.0 References: <20231121180054.3949602-1-manolis.tsamis@vrull.eu> In-Reply-To: From: Manolis Tsamis Date: Tue, 28 Nov 2023 12:34:46 +0200 Message-ID: Subject: Re: [PATCH v2] ifcvt: Handle multiple rewired regs and refactor noce_convert_multiple_sets To: Manolis Tsamis , gcc-patches@gcc.gnu.org, Robin Dapp , Jeff Law , Philipp Tomsich , richard.sandiford@arm.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-8.7 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,JMQ_SPF_NEUTRAL,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Tue, Nov 28, 2023 at 12:12=E2=80=AFPM Richard Sandiford wrote: > > Manolis Tsamis writes: > > On Thu, Nov 23, 2023 at 11:01=E2=80=AFPM Richard Sandiford > > wrote: > >> > >> Manolis Tsamis writes: > >> > The existing implementation of need_cmov_or_rewire and > >> > noce_convert_multiple_sets_1 assumes that sets are either REG or SUB= REG. > >> > This commit enchances them so they can handle/rewire arbitrary set s= tatements. > >> > > >> > To do that a new helper struct noce_multiple_sets_info is introduced= which is > >> > used by noce_convert_multiple_sets and its helper functions. This re= sults in > >> > cleaner function signatures, improved efficientcy (a number of vecs = and hash > >> > set/map are replaced with a single vec of struct) and simplicity. > >> > > >> > gcc/ChangeLog: > >> > > >> > * ifcvt.cc (need_cmov_or_rewire): Renamed init_noce_multiple_s= ets_info. > >> > (init_noce_multiple_sets_info): Initialize noce_multiple_sets_= info. > >> > (noce_convert_multiple_sets_1): Use noce_multiple_sets_info an= d handle > >> > rewiring of multiple registers. > >> > (noce_convert_multiple_sets): Updated to use noce_multiple_set= s_info. > >> > * ifcvt.h (struct noce_multiple_sets_info): Introduce new stru= ct > >> > noce_multiple_sets_info to store info for noce_convert_multipl= e_sets. > >> > > >> > Signed-off-by: Manolis Tsamis > >> > --- > >> > >> Thanks, this looks like a really nice clean-up. One comment below: > >> > > Thanks! > > > >> > > >> > Changes in v2: > >> > - Made standalone patch. > >> > - Better comments, some more checks. > >> > > >> > gcc/ifcvt.cc | 252 +++++++++++++++++++++++-------------------------= --- > >> > gcc/ifcvt.h | 16 ++++ > >> > 2 files changed, 129 insertions(+), 139 deletions(-) > >> > > >> > diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc > >> > index a0af553b9ff..9486d54de34 100644 > >> > --- a/gcc/ifcvt.cc > >> > +++ b/gcc/ifcvt.cc > >> > @@ -98,14 +98,10 @@ static bool dead_or_predicable (basic_block, bas= ic_block, basic_block, > >> > edge, bool); > >> > static void noce_emit_move_insn (rtx, rtx); > >> > static rtx_insn *block_has_only_trap (basic_block); > >> > -static void need_cmov_or_rewire (basic_block, hash_set = *, > >> > - hash_map *); > >> > +static void init_noce_multiple_sets_info (basic_block, > >> > + auto_delete_vec &); > >> > static bool noce_convert_multiple_sets_1 (struct noce_if_info *, > >> > - hash_set *, > >> > - hash_map *, > >> > - auto_vec *, > >> > - auto_vec *, > >> > - auto_vec *, int = *); > >> > + auto_delete_vec &, int *); > >> > > >> > /* Count the number of non-jump active insns in BB. */ > >> > > >> > @@ -3270,24 +3266,13 @@ noce_convert_multiple_sets (struct noce_if_i= nfo *if_info) > >> > rtx x =3D XEXP (cond, 0); > >> > rtx y =3D XEXP (cond, 1); > >> > > >> > - /* The true targets for a conditional move. */ > >> > - auto_vec targets; > >> > - /* The temporaries introduced to allow us to not consider registe= r > >> > - overlap. */ > >> > - auto_vec temporaries; > >> > - /* The insns we've emitted. */ > >> > - auto_vec unmodified_insns; > >> > - > >> > - hash_set need_no_cmov; > >> > - hash_map rewired_src; > >> > - > >> > - need_cmov_or_rewire (then_bb, &need_no_cmov, &rewired_src); > >> > + auto_delete_vec insn_info; > >> > + init_noce_multiple_sets_info (then_bb, insn_info); > >> > > >> > int last_needs_comparison =3D -1; > >> > > >> > bool ok =3D noce_convert_multiple_sets_1 > >> > - (if_info, &need_no_cmov, &rewired_src, &targets, &temporaries, > >> > - &unmodified_insns, &last_needs_comparison); > >> > + (if_info, insn_info, &last_needs_comparison); > >> > if (!ok) > >> > return false; > >> > > >> > @@ -3302,8 +3287,7 @@ noce_convert_multiple_sets (struct noce_if_inf= o *if_info) > >> > end_sequence (); > >> > start_sequence (); > >> > ok =3D noce_convert_multiple_sets_1 > >> > - (if_info, &need_no_cmov, &rewired_src, &targets, &temporaries, > >> > - &unmodified_insns, &last_needs_comparison); > >> > + (if_info, insn_info, &last_needs_comparison); > >> > /* Actually we should not fail anymore if we reached here, > >> > but better still check. */ > >> > if (!ok) > >> > @@ -3312,12 +3296,12 @@ noce_convert_multiple_sets (struct noce_if_i= nfo *if_info) > >> > > >> > /* We must have seen some sort of insn to insert, otherwise we we= re > >> > given an empty BB to convert, and we can't handle that. */ > >> > - gcc_assert (!unmodified_insns.is_empty ()); > >> > + gcc_assert (!insn_info.is_empty ()); > >> > > >> > /* Now fixup the assignments. */ > >> > - for (unsigned i =3D 0; i < targets.length (); i++) > >> > - if (targets[i] !=3D temporaries[i]) > >> > - noce_emit_move_insn (targets[i], temporaries[i]); > >> > + for (unsigned i =3D 0; i < insn_info.length (); i++) > >> > + if (insn_info[i]->target !=3D insn_info[i]->temporary) > >> > + noce_emit_move_insn (insn_info[i]->target, insn_info[i]->temp= orary); > >> > > >> > /* Actually emit the sequence if it isn't too expensive. */ > >> > rtx_insn *seq =3D get_insns (); > >> > @@ -3332,10 +3316,10 @@ noce_convert_multiple_sets (struct noce_if_i= nfo *if_info) > >> > set_used_flags (insn); > >> > > >> > /* Mark all our temporaries and targets as used. */ > >> > - for (unsigned i =3D 0; i < targets.length (); i++) > >> > + for (unsigned i =3D 0; i < insn_info.length (); i++) > >> > { > >> > - set_used_flags (temporaries[i]); > >> > - set_used_flags (targets[i]); > >> > + set_used_flags (insn_info[i]->temporary); > >> > + set_used_flags (insn_info[i]->target); > >> > } > >> > > >> > set_used_flags (cond); > >> > @@ -3354,7 +3338,7 @@ noce_convert_multiple_sets (struct noce_if_inf= o *if_info) > >> > return false; > >> > > >> > emit_insn_before_setloc (seq, if_info->jump, > >> > - INSN_LOCATION (unmodified_insns.last ())); > >> > + INSN_LOCATION (insn_info.last ()->unmodifie= d_insn)); > >> > > >> > /* Clean up THEN_BB and the edges in and out of it. */ > >> > remove_edge (find_edge (test_bb, join_bb)); > >> > @@ -3389,21 +3373,13 @@ check_for_cc_cmp_clobbers (rtx dest, const_r= tx, void *p0) > >> > p[0] =3D NULL_RTX; > >> > } > >> > > >> > -/* This goes through all relevant insns of IF_INFO->then_bb and tri= es to > >> > - create conditional moves. In case a simple move sufficis the in= sn > >> > - should be listed in NEED_NO_CMOV. The rewired-src cases should = be > >> > - specified via REWIRED_SRC. TARGETS, TEMPORARIES and UNMODIFIED_= INSNS > >> > - are specified and used in noce_convert_multiple_sets and should = be passed > >> > - to this function.. */ > >> > +/* This goes through all relevant insns of IF_INFO->then_bb and tri= es to create > >> > + conditional moves. Information for the insns is kept in INSN_IN= FO. */ > >> > > >> > static bool > >> > noce_convert_multiple_sets_1 (struct noce_if_info *if_info, > >> > - hash_set *need_no_cmov, > >> > - hash_map *rewired_src, > >> > - auto_vec *targets, > >> > - auto_vec *temporaries, > >> > - auto_vec *unmodified_insns, > >> > - int *last_needs_comparison) > >> > + auto_delete_vec &insn= _info, > >> > + int *last_needs_comparison) > >> > { > >> > basic_block then_bb =3D if_info->then_bb; > >> > rtx_insn *jump =3D if_info->jump; > >> > @@ -3421,11 +3397,6 @@ noce_convert_multiple_sets_1 (struct noce_if_= info *if_info, > >> > > >> > rtx_insn *insn; > >> > int count =3D 0; > >> > - > >> > - targets->truncate (0); > >> > - temporaries->truncate (0); > >> > - unmodified_insns->truncate (0); > >> > - > >> > bool second_try =3D *last_needs_comparison !=3D -1; > >> > > >> > FOR_BB_INSNS (then_bb, insn) > >> > @@ -3434,6 +3405,8 @@ noce_convert_multiple_sets_1 (struct noce_if_i= nfo *if_info, > >> > if (!active_insn_p (insn)) > >> > continue; > >> > > >> > + noce_multiple_sets_info *info =3D insn_info[count]; > >> > + > >> > rtx set =3D single_set (insn); > >> > gcc_checking_assert (set); > >> > > >> > @@ -3441,9 +3414,12 @@ noce_convert_multiple_sets_1 (struct noce_if_= info *if_info, > >> > rtx temp; > >> > > >> > rtx new_val =3D SET_SRC (set); > >> > - if (int *ii =3D rewired_src->get (insn)) > >> > - new_val =3D simplify_replace_rtx (new_val, (*targets)[*ii], > >> > - (*temporaries)[*ii]); > >> > + > >> > + int i, ii; > >> > + FOR_EACH_VEC_ELT (info->rewired_src, i, ii) > >> > + new_val =3D simplify_replace_rtx (new_val, insn_info[ii]->targ= et, > >> > + insn_info[ii]->temporary); > >> > + > >> > rtx old_val =3D target; > >> > > >> > /* As we are transforming > >> > @@ -3484,7 +3460,7 @@ noce_convert_multiple_sets_1 (struct noce_if_i= nfo *if_info, > >> > /* We have identified swap-style idioms before. A normal > >> > set will need to be a cmov while the first instruction of a s= wap-style > >> > idiom can be a regular move. This helps with costing. */ > >> > - bool need_cmov =3D !need_no_cmov->contains (insn); > >> > + bool need_cmov =3D info->need_cmov; > >> > > >> > /* If we had a non-canonical conditional jump (i.e. one where > >> > the fallthrough is to the "else" case) we need to reverse > >> > @@ -3632,21 +3608,102 @@ noce_convert_multiple_sets_1 (struct noce_i= f_info *if_info, > >> > > >> > /* Bookkeeping. */ > >> > count++; > >> > - targets->safe_push (target); > >> > - temporaries->safe_push (temp_dest); > >> > - unmodified_insns->safe_push (insn); > >> > + > >> > + info->target =3D target; > >> > + info->temporary =3D temp_dest; > >> > + info->unmodified_insn =3D insn; > >> > } > >> > > >> > + /* The number of processed insns must match init_noce_multiple_se= ts_info. */ > >> > + gcc_assert (count =3D=3D (int) insn_info.length ()); > >> > + > >> > /* Even if we did not actually need the comparison, we want to ma= ke sure > >> > to try a second time in order to get rid of the temporaries. = */ > >> > if (*last_needs_comparison =3D=3D -1) > >> > *last_needs_comparison =3D 0; > >> > > >> > - > >> > return true; > >> > } > >> > > >> > +/* Find local swap-style idioms in BB and mark the first insn (1) > >> > + that is only a temporary as not needing a conditional move as > >> > + it is going to be dead afterwards anyway. > >> > + > >> > + (1) int tmp =3D a; > >> > + a =3D b; > >> > + b =3D tmp; > >> > + > >> > + ifcvt > >> > + --> > >> > + > >> > + tmp =3D a; > >> > + a =3D cond ? b : a_old; > >> > + b =3D cond ? tmp : b_old; > >> > + > >> > + Additionally, store the index of insns like (2) when a subseque= nt > >> > + SET reads from their destination. > >> > > >> > + (2) int c =3D a; > >> > + int d =3D c; > >> > + > >> > + ifcvt > >> > + --> > >> > + > >> > + c =3D cond ? a : c_old; > >> > + d =3D cond ? d : c; // Need to use c rather than c_old her= e. > >> > +*/ > >> > + > >> > +static void > >> > +init_noce_multiple_sets_info (basic_block bb, > >> > + auto_delete_vec &insn_in= fo) > >> > +{ > >> > + rtx_insn *insn; > >> > + int count =3D 0; > >> > + auto_vec dests; > >> > + > >> > + /* Iterate over all SETs, storing the destinations > >> > + in DEST. > >> > + - If we hit a SET that reads from a destination > >> > + that we have seen before and the corresponding register > >> > + is dead afterwards, the register does not need to be > >> > + moved conditionally. > >> > + - If we encounter a previously changed register, > >> > + rewire the read to the original source. */ > >> > + FOR_BB_INSNS (bb, insn) > >> > + { > >> > + if (!active_insn_p (insn)) > >> > + continue; > >> > + > >> > + noce_multiple_sets_info *info =3D new noce_multiple_sets_info= ; > >> > + info->target =3D NULL_RTX; > >> > + info->temporary =3D NULL_RTX; > >> > + info->unmodified_insn =3D NULL; > >> > + info->need_cmov =3D true; > >> > + insn_info.safe_push (info); > >> > + > >> > + rtx set =3D single_set (insn); > >> > + gcc_checking_assert (set); > >> > + > >> > + rtx src =3D SET_SRC (set); > >> > + rtx dest =3D SET_DEST (set); > >> > + > >> > + /* Check if the current SET's source is the same > >> > + as any previously seen destination. > >> > + This is quadratic but the number of insns in BB > >> > + is bounded by PARAM_MAX_RTL_IF_CONVERSION_INSNS. */ > >> > + for (int i =3D count - 1; i >=3D 0; --i) > >> > + if (reg_mentioned_p (dests[i], src)) > >> > + { > >> > + if (find_reg_note (insn, REG_DEAD, src) !=3D NULL_RTX) > >> > >> This is pre-existing, but is a REG_DEAD note enough? I think we also > >> need to check that the register isn't live on exit from the condition > >> block (or, alternatively, isn't live on entry to the join block). > >> > >> E.g. for: > >> > >> if (cond) > >> { > >> tmp =3D a; > >> a =3D b; > >> b =3D tmp; > >> tmp =3D ...; > >> } > >> ...tmp...; > >> > >> we need to preserve the original value of tmp when !cond, even though > >> tmp is temporarily dead in the "then" block. > >> > >> Thanks, > >> Richard > >> > > Good question. I played around with some tests and looked at the > > original code again and I believe it's fine. > > In your example, when insn is at `b =3D tmp;` then `tmp =3D a;` will be > > marked as need_no_cmov because tmp will have a REG_DEAD at insn. `tmp > > =3D ...;` will be emitted as a conditional set and the use outside the > > conditional will be just fine. I don't see any issue. > > But I think we end up with: > > tmp =3D a; > a =3D cond ? b : a; > b =3D cond ? tmp : b; > tmp =3D cond ? ... : tmp; > > So the net effect for !cond will be to set tmp to a, rather than > preserve the original value of tmp. > Ok, I now see what you've meant. Then, indeed, it looks like a bug to me (that fortunately didn't trigger?). Since I'll have to refactor my change around REG_DEAD anyway, I can make sure to fix this issue too. I'll also re-try to create some test case that triggers this bug, if possib= le. Thanks, Manolis > Thanks, > Richard > > > > > But it made me realize that my updated code doesn't really make sense > > on that part. Since I allow src to be an arbitrary expression and not > > a single reg, the code > > > > if (find_reg_note (insn, REG_DEAD, src) !=3D NULL_RTX) > > > > is nonsensical. Apart from the fact that the check is loop invariant > > it also doesn't properly handle src anyway. I'll have to think again > > about how to refactor this check now that src is not just a reg. > > > > Thanks, > > Manolis