From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-yw1-x1132.google.com (mail-yw1-x1132.google.com [IPv6:2607:f8b0:4864:20::1132]) by sourceware.org (Postfix) with ESMTPS id E19BB3858D39 for ; Thu, 7 Dec 2023 17:45:10 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org E19BB3858D39 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=vrull.eu Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=vrull.eu ARC-Filter: OpenARC Filter v1.0.0 sourceware.org E19BB3858D39 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::1132 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701971113; cv=none; b=bHVw51eCIaxYX0TXIqW+gq/EYo0QUo7PKYSHVOjmMN2hXmyTM+Nh9N4LnSs1rOGjjvnZ58WhB7nCf/Uq4mqTJszTIktwxsTwT6xntuqK7yLy9oxtKmgcXqLrBp0/eWZfxJsb2cWYvZKkDTmg4v2BvxsRzffiMjVbBSaYPF4YvC8= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701971113; c=relaxed/simple; bh=y2wGcITReF5hzHlD8LsPdkrXCn84W90h2NG7/wqFSFY=; h=DKIM-Signature:MIME-Version:From:Date:Message-ID:Subject:To; b=qHB9R89mYRzPUDZ15CLyDPmNY68dPkq5xSx8iY7r5zPFACtDIkrLL9R0VRXhsIVWOZoISbgXP42aB8xri35tfM8no8tQ7Amt0oIvdld+ogS2Z39wTjQJUd3WGQM+TetGNnyIy4rhgXv5hCqbOlgYdbyWm28G8TkkyrXtzgkGThk= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-yw1-x1132.google.com with SMTP id 00721157ae682-5d3644ca426so10505867b3.1 for ; Thu, 07 Dec 2023 09:45:10 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=vrull.eu; s=google; t=1701971110; x=1702575910; darn=gcc.gnu.org; h=content-transfer-encoding:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=PKphFXqIl36HTH+Ck/jfaVapeIKtDcGn41Gc/EfEtrY=; b=QZhwW2h5DTzpHnkHGlLIRvG9j/sQgyKkeVY9R1OV4U10vw1vTfKYH2f5lqJmU0i+l2 b7HNeUjnzn9+FehLEB2NXFT4+WycBH8SrXL7a5GW6IBJx26CITaBXS/Nl93ihk9EDwkT siVR3MDR4GsPNlC2xpe7xsa9HQCy+q4SK/lnQVkvOGbbkziVzyryWmsRWNn95Gns9Nnr R3Jy5QFSSpoRMWF4RcB3HmdEGhw37sn/sBC8KaXDlLeoq/5mgVCfsmQ4sTBDpxNufn3G zgfDgyHEC2Gw5tWCKQbb6kzNVgzqdWTPp5h/w2jmQZvGVQVhx/EMp8f0tJpsouXRF2h8 4zWg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701971110; x=1702575910; h=content-transfer-encoding:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=PKphFXqIl36HTH+Ck/jfaVapeIKtDcGn41Gc/EfEtrY=; b=cPBadtcHhPNHjnLoIavlxb8/rbGGABJ/4617HsbGPCorSVOUSvkqcUJ/HXxdBKk9A0 v6t9kjb7ZZSLSC+sSJbPTw0DCRPrNDyDya6kJV52YVjb+cqtlqqp4nTHYB0lJY4nIpac Ly+KdhpIKtZGnZVMadrpVudhaB55Tz5d/PQFNz5dr1f/mBTpb4yzXbSPGgVkoya8Hj6J xzG8WJ7P69+mvDPOoNRlcK144h7XKZ5D+ElVYyqJZ5nG747Or8DHUDx2T4/oYlLWE2/z ToOjM91qUGQIEfUQHcFye6aa2CvA6T2FANvK+k0nXuAqZ0WElpXhPKgsVzvt1jAMP1Ye EyVQ== X-Gm-Message-State: AOJu0Yya3Dpawy2jYo1d9wNlhmAOWpnEW+rfOjWsCCoEFvGA9WzyjeBr mGBHOvyh0/SN9T/zhZAL/EukSlhtldQgCAzdE9vfuA== X-Google-Smtp-Source: AGHT+IFvzXR711uJNqugPwul6ansO2J9eK/4YtPgEJzjeOyhuACKqOvBuMDMpgwJWAKts80Uryv/HgLh6nIsKUYvCiw= X-Received: by 2002:a25:cbd5:0:b0:db7:dad0:76af with SMTP id b204-20020a25cbd5000000b00db7dad076afmr2368250ybg.75.1701971109884; Thu, 07 Dec 2023 09:45:09 -0800 (PST) MIME-Version: 1.0 References: <20231121180054.3949602-1-manolis.tsamis@vrull.eu> In-Reply-To: From: Manolis Tsamis Date: Thu, 7 Dec 2023 19:44:32 +0200 Message-ID: Subject: Re: [PATCH v2] ifcvt: Handle multiple rewired regs and refactor noce_convert_multiple_sets To: Manolis Tsamis , gcc-patches@gcc.gnu.org, Robin Dapp , Jeff Law , Philipp Tomsich , richard.sandiford@arm.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-8.5 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Thu, Nov 23, 2023 at 11:01=E2=80=AFPM Richard Sandiford wrote: > > Manolis Tsamis writes: > > The existing implementation of need_cmov_or_rewire and > > noce_convert_multiple_sets_1 assumes that sets are either REG or SUBREG= . > > This commit enchances them so they can handle/rewire arbitrary set stat= ements. > > > > To do that a new helper struct noce_multiple_sets_info is introduced wh= ich is > > used by noce_convert_multiple_sets and its helper functions. This resul= ts in > > cleaner function signatures, improved efficientcy (a number of vecs and= hash > > set/map are replaced with a single vec of struct) and simplicity. > > > > gcc/ChangeLog: > > > > * ifcvt.cc (need_cmov_or_rewire): Renamed init_noce_multiple_sets= _info. > > (init_noce_multiple_sets_info): Initialize noce_multiple_sets_inf= o. > > (noce_convert_multiple_sets_1): Use noce_multiple_sets_info and h= andle > > rewiring of multiple registers. > > (noce_convert_multiple_sets): Updated to use noce_multiple_sets_i= nfo. > > * ifcvt.h (struct noce_multiple_sets_info): Introduce new struct > > noce_multiple_sets_info to store info for noce_convert_multiple_s= ets. > > > > Signed-off-by: Manolis Tsamis > > --- > > Thanks, this looks like a really nice clean-up. One comment below: > > > > > Changes in v2: > > - Made standalone patch. > > - Better comments, some more checks. > > > > gcc/ifcvt.cc | 252 +++++++++++++++++++++++---------------------------- > > gcc/ifcvt.h | 16 ++++ > > 2 files changed, 129 insertions(+), 139 deletions(-) > > > > diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc > > index a0af553b9ff..9486d54de34 100644 > > --- a/gcc/ifcvt.cc > > +++ b/gcc/ifcvt.cc > > @@ -98,14 +98,10 @@ static bool dead_or_predicable (basic_block, basic_= block, basic_block, > > edge, bool); > > static void noce_emit_move_insn (rtx, rtx); > > static rtx_insn *block_has_only_trap (basic_block); > > -static void need_cmov_or_rewire (basic_block, hash_set *, > > - hash_map *); > > +static void init_noce_multiple_sets_info (basic_block, > > + auto_delete_vec &); > > static bool noce_convert_multiple_sets_1 (struct noce_if_info *, > > - hash_set *, > > - hash_map *, > > - auto_vec *, > > - auto_vec *, > > - auto_vec *, int *); > > + auto_delete_vec &, int *); > > > > /* Count the number of non-jump active insns in BB. */ > > > > @@ -3270,24 +3266,13 @@ noce_convert_multiple_sets (struct noce_if_info= *if_info) > > rtx x =3D XEXP (cond, 0); > > rtx y =3D XEXP (cond, 1); > > > > - /* The true targets for a conditional move. */ > > - auto_vec targets; > > - /* The temporaries introduced to allow us to not consider register > > - overlap. */ > > - auto_vec temporaries; > > - /* The insns we've emitted. */ > > - auto_vec unmodified_insns; > > - > > - hash_set need_no_cmov; > > - hash_map rewired_src; > > - > > - need_cmov_or_rewire (then_bb, &need_no_cmov, &rewired_src); > > + auto_delete_vec insn_info; > > + init_noce_multiple_sets_info (then_bb, insn_info); > > > > int last_needs_comparison =3D -1; > > > > bool ok =3D noce_convert_multiple_sets_1 > > - (if_info, &need_no_cmov, &rewired_src, &targets, &temporaries, > > - &unmodified_insns, &last_needs_comparison); > > + (if_info, insn_info, &last_needs_comparison); > > if (!ok) > > return false; > > > > @@ -3302,8 +3287,7 @@ noce_convert_multiple_sets (struct noce_if_info *= if_info) > > end_sequence (); > > start_sequence (); > > ok =3D noce_convert_multiple_sets_1 > > - (if_info, &need_no_cmov, &rewired_src, &targets, &temporaries, > > - &unmodified_insns, &last_needs_comparison); > > + (if_info, insn_info, &last_needs_comparison); > > /* Actually we should not fail anymore if we reached here, > > but better still check. */ > > if (!ok) > > @@ -3312,12 +3296,12 @@ noce_convert_multiple_sets (struct noce_if_info= *if_info) > > > > /* We must have seen some sort of insn to insert, otherwise we were > > given an empty BB to convert, and we can't handle that. */ > > - gcc_assert (!unmodified_insns.is_empty ()); > > + gcc_assert (!insn_info.is_empty ()); > > > > /* Now fixup the assignments. */ > > - for (unsigned i =3D 0; i < targets.length (); i++) > > - if (targets[i] !=3D temporaries[i]) > > - noce_emit_move_insn (targets[i], temporaries[i]); > > + for (unsigned i =3D 0; i < insn_info.length (); i++) > > + if (insn_info[i]->target !=3D insn_info[i]->temporary) > > + noce_emit_move_insn (insn_info[i]->target, insn_info[i]->tempora= ry); > > > > /* Actually emit the sequence if it isn't too expensive. */ > > rtx_insn *seq =3D get_insns (); > > @@ -3332,10 +3316,10 @@ noce_convert_multiple_sets (struct noce_if_info= *if_info) > > set_used_flags (insn); > > > > /* Mark all our temporaries and targets as used. */ > > - for (unsigned i =3D 0; i < targets.length (); i++) > > + for (unsigned i =3D 0; i < insn_info.length (); i++) > > { > > - set_used_flags (temporaries[i]); > > - set_used_flags (targets[i]); > > + set_used_flags (insn_info[i]->temporary); > > + set_used_flags (insn_info[i]->target); > > } > > > > set_used_flags (cond); > > @@ -3354,7 +3338,7 @@ noce_convert_multiple_sets (struct noce_if_info *= if_info) > > return false; > > > > emit_insn_before_setloc (seq, if_info->jump, > > - INSN_LOCATION (unmodified_insns.last ())); > > + INSN_LOCATION (insn_info.last ()->unmodified_i= nsn)); > > > > /* Clean up THEN_BB and the edges in and out of it. */ > > remove_edge (find_edge (test_bb, join_bb)); > > @@ -3389,21 +3373,13 @@ check_for_cc_cmp_clobbers (rtx dest, const_rtx,= void *p0) > > p[0] =3D NULL_RTX; > > } > > > > -/* This goes through all relevant insns of IF_INFO->then_bb and tries = to > > - create conditional moves. In case a simple move sufficis the insn > > - should be listed in NEED_NO_CMOV. The rewired-src cases should be > > - specified via REWIRED_SRC. TARGETS, TEMPORARIES and UNMODIFIED_INS= NS > > - are specified and used in noce_convert_multiple_sets and should be = passed > > - to this function.. */ > > +/* This goes through all relevant insns of IF_INFO->then_bb and tries = to create > > + conditional moves. Information for the insns is kept in INSN_INFO.= */ > > > > static bool > > noce_convert_multiple_sets_1 (struct noce_if_info *if_info, > > - hash_set *need_no_cmov, > > - hash_map *rewired_src, > > - auto_vec *targets, > > - auto_vec *temporaries, > > - auto_vec *unmodified_insns, > > - int *last_needs_comparison) > > + auto_delete_vec &insn_in= fo, > > + int *last_needs_comparison) > > { > > basic_block then_bb =3D if_info->then_bb; > > rtx_insn *jump =3D if_info->jump; > > @@ -3421,11 +3397,6 @@ noce_convert_multiple_sets_1 (struct noce_if_inf= o *if_info, > > > > rtx_insn *insn; > > int count =3D 0; > > - > > - targets->truncate (0); > > - temporaries->truncate (0); > > - unmodified_insns->truncate (0); > > - > > bool second_try =3D *last_needs_comparison !=3D -1; > > > > FOR_BB_INSNS (then_bb, insn) > > @@ -3434,6 +3405,8 @@ noce_convert_multiple_sets_1 (struct noce_if_info= *if_info, > > if (!active_insn_p (insn)) > > continue; > > > > + noce_multiple_sets_info *info =3D insn_info[count]; > > + > > rtx set =3D single_set (insn); > > gcc_checking_assert (set); > > > > @@ -3441,9 +3414,12 @@ noce_convert_multiple_sets_1 (struct noce_if_inf= o *if_info, > > rtx temp; > > > > rtx new_val =3D SET_SRC (set); > > - if (int *ii =3D rewired_src->get (insn)) > > - new_val =3D simplify_replace_rtx (new_val, (*targets)[*ii], > > - (*temporaries)[*ii]); > > + > > + int i, ii; > > + FOR_EACH_VEC_ELT (info->rewired_src, i, ii) > > + new_val =3D simplify_replace_rtx (new_val, insn_info[ii]->target, > > + insn_info[ii]->temporary); > > + > > rtx old_val =3D target; > > > > /* As we are transforming > > @@ -3484,7 +3460,7 @@ noce_convert_multiple_sets_1 (struct noce_if_info= *if_info, > > /* We have identified swap-style idioms before. A normal > > set will need to be a cmov while the first instruction of a swap= -style > > idiom can be a regular move. This helps with costing. */ > > - bool need_cmov =3D !need_no_cmov->contains (insn); > > + bool need_cmov =3D info->need_cmov; > > > > /* If we had a non-canonical conditional jump (i.e. one where > > the fallthrough is to the "else" case) we need to reverse > > @@ -3632,21 +3608,102 @@ noce_convert_multiple_sets_1 (struct noce_if_i= nfo *if_info, > > > > /* Bookkeeping. */ > > count++; > > - targets->safe_push (target); > > - temporaries->safe_push (temp_dest); > > - unmodified_insns->safe_push (insn); > > + > > + info->target =3D target; > > + info->temporary =3D temp_dest; > > + info->unmodified_insn =3D insn; > > } > > > > + /* The number of processed insns must match init_noce_multiple_sets_= info. */ > > + gcc_assert (count =3D=3D (int) insn_info.length ()); > > + > > /* Even if we did not actually need the comparison, we want to make = sure > > to try a second time in order to get rid of the temporaries. */ > > if (*last_needs_comparison =3D=3D -1) > > *last_needs_comparison =3D 0; > > > > - > > return true; > > } > > > > +/* Find local swap-style idioms in BB and mark the first insn (1) > > + that is only a temporary as not needing a conditional move as > > + it is going to be dead afterwards anyway. > > + > > + (1) int tmp =3D a; > > + a =3D b; > > + b =3D tmp; > > + > > + ifcvt > > + --> > > + > > + tmp =3D a; > > + a =3D cond ? b : a_old; > > + b =3D cond ? tmp : b_old; > > + > > + Additionally, store the index of insns like (2) when a subsequent > > + SET reads from their destination. > > > > + (2) int c =3D a; > > + int d =3D c; > > + > > + ifcvt > > + --> > > + > > + c =3D cond ? a : c_old; > > + d =3D cond ? d : c; // Need to use c rather than c_old here. > > +*/ > > + > > +static void > > +init_noce_multiple_sets_info (basic_block bb, > > + auto_delete_vec &insn_info) > > +{ > > + rtx_insn *insn; > > + int count =3D 0; > > + auto_vec dests; > > + > > + /* Iterate over all SETs, storing the destinations > > + in DEST. > > + - If we hit a SET that reads from a destination > > + that we have seen before and the corresponding register > > + is dead afterwards, the register does not need to be > > + moved conditionally. > > + - If we encounter a previously changed register, > > + rewire the read to the original source. */ > > + FOR_BB_INSNS (bb, insn) > > + { > > + if (!active_insn_p (insn)) > > + continue; > > + > > + noce_multiple_sets_info *info =3D new noce_multiple_sets_info; > > + info->target =3D NULL_RTX; > > + info->temporary =3D NULL_RTX; > > + info->unmodified_insn =3D NULL; > > + info->need_cmov =3D true; > > + insn_info.safe_push (info); > > + > > + rtx set =3D single_set (insn); > > + gcc_checking_assert (set); > > + > > + rtx src =3D SET_SRC (set); > > + rtx dest =3D SET_DEST (set); > > + > > + /* Check if the current SET's source is the same > > + as any previously seen destination. > > + This is quadratic but the number of insns in BB > > + is bounded by PARAM_MAX_RTL_IF_CONVERSION_INSNS. */ > > + for (int i =3D count - 1; i >=3D 0; --i) > > + if (reg_mentioned_p (dests[i], src)) > > + { > > + if (find_reg_note (insn, REG_DEAD, src) !=3D NULL_RTX) > > This is pre-existing, but is a REG_DEAD note enough? I think we also > need to check that the register isn't live on exit from the condition > block (or, alternatively, isn't live on entry to the join block). > > E.g. for: > > if (cond) > { > tmp =3D a; > a =3D b; > b =3D tmp; > tmp =3D ...; > } > ...tmp...; > > we need to preserve the original value of tmp when !cond, even though > tmp is temporarily dead in the "then" block. > Something else that I noticed about the current code is that the condition = below if (find_reg_note (insn, REG_DEAD, src) !=3D NULL_RTX) need_no_cmov->add (insns[i]); else rewired_src->put (insn, i); considers rewiring and no_cmov exclusive, but they are not since the same register can appear in SRC and as a REG_DEAD note in a single instruction. For example if we had something like (insn 1 (set (reg:SI 107) (reg:SI 105))) (insn 2 (set (reg:SI 102) (reg:SI 107)) (expr_list:REG_DEAD (reg:SI 107)) Then the existing code would not rewrite insn 2 because reg 107 is also dead at insn 2. But although we have identified these issues I haven't been able to trigger them in either x86 or aarch64. One reason looks to be that although there is coded support for SUBREG in noce_convert_multiple_sets, which would make it possible to write some more complicated testcases, it does not trigger. On aarch64 (and (reg X) (const_int 0xfff...)) is used and on x86 the SUBREG is nested within zero_extend or sign_extend so SUBREG_P is false in both cases. So we're left with only simple register moves which look to work fine (there may be related issues in other targets though). Manolis > Thanks, > Richard > > > > + insn_info[i]->need_cmov =3D false; > > + else > > + insn_info[count]->rewired_src.safe_push (i); > > + } > > + > > + dests.safe_push (dest); > > + count++; > > + } > > +} > > > > /* Return true iff basic block TEST_BB is comprised of only > > (SET (REG) (REG)) insns suitable for conversion to a series > > @@ -4121,89 +4178,6 @@ check_cond_move_block (basic_block bb, > > return true; > > } > > > > -/* Find local swap-style idioms in BB and mark the first insn (1) > > - that is only a temporary as not needing a conditional move as > > - it is going to be dead afterwards anyway. > > - > > - (1) int tmp =3D a; > > - a =3D b; > > - b =3D tmp; > > - > > - ifcvt > > - --> > > - > > - tmp =3D a; > > - a =3D cond ? b : a_old; > > - b =3D cond ? tmp : b_old; > > - > > - Additionally, store the index of insns like (2) when a subsequent > > - SET reads from their destination. > > - > > - (2) int c =3D a; > > - int d =3D c; > > - > > - ifcvt > > - --> > > - > > - c =3D cond ? a : c_old; > > - d =3D cond ? d : c; // Need to use c rather than c_old here. > > -*/ > > - > > -static void > > -need_cmov_or_rewire (basic_block bb, > > - hash_set *need_no_cmov, > > - hash_map *rewired_src) > > -{ > > - rtx_insn *insn; > > - int count =3D 0; > > - auto_vec insns; > > - auto_vec dests; > > - > > - /* Iterate over all SETs, storing the destinations > > - in DEST. > > - - If we hit a SET that reads from a destination > > - that we have seen before and the corresponding register > > - is dead afterwards, the register does not need to be > > - moved conditionally. > > - - If we encounter a previously changed register, > > - rewire the read to the original source. */ > > - FOR_BB_INSNS (bb, insn) > > - { > > - rtx set, src, dest; > > - > > - if (!active_insn_p (insn)) > > - continue; > > - > > - set =3D single_set (insn); > > - if (set =3D=3D NULL_RTX) > > - continue; > > - > > - src =3D SET_SRC (set); > > - if (SUBREG_P (src)) > > - src =3D SUBREG_REG (src); > > - dest =3D SET_DEST (set); > > - > > - /* Check if the current SET's source is the same > > - as any previously seen destination. > > - This is quadratic but the number of insns in BB > > - is bounded by PARAM_MAX_RTL_IF_CONVERSION_INSNS. */ > > - if (REG_P (src)) > > - for (int i =3D count - 1; i >=3D 0; --i) > > - if (reg_overlap_mentioned_p (src, dests[i])) > > - { > > - if (find_reg_note (insn, REG_DEAD, src) !=3D NULL_RTX) > > - need_no_cmov->add (insns[i]); > > - else > > - rewired_src->put (insn, i); > > - } > > - > > - insns.safe_push (insn); > > - dests.safe_push (dest); > > - > > - count++; > > - } > > -} > > - > > /* Given a basic block BB suitable for conditional move conversion, > > a condition COND, and pointer maps THEN_VALS and ELSE_VALS containi= ng > > the register values depending on COND, emit the insns in the block = as > > diff --git a/gcc/ifcvt.h b/gcc/ifcvt.h > > index be1385aabe4..83420fe1207 100644 > > --- a/gcc/ifcvt.h > > +++ b/gcc/ifcvt.h > > @@ -40,6 +40,22 @@ struct ce_if_block > > int pass; /* Pass number. */ > > }; > > > > +struct noce_multiple_sets_info > > +{ > > + /* A list of indices to instructions that we need to rewire into thi= s > > + instruction when we replace them with temporary conditional moves= . */ > > + auto_vec rewired_src; > > + /* The true targets for a conditional move. */ > > + rtx target; > > + /* The temporary introduced for this conditional move. */ > > + rtx temporary; > > + /* The original instruction that we're replacing. */ > > + rtx_insn *unmodified_insn; > > + /* True if a conditional move is needed. > > + False if a simple move can be used instead. */ > > + bool need_cmov; > > +}; > > + > > /* Used by noce_process_if_block to communicate with its subroutines. > > > > The subroutines know that A and B may be evaluated freely. They