From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pj1-x102b.google.com (mail-pj1-x102b.google.com [IPv6:2607:f8b0:4864:20::102b]) by sourceware.org (Postfix) with ESMTPS id 86ECB3858D38 for ; Tue, 23 Apr 2024 10:58:51 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 86ECB3858D38 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=vrull.eu Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=vrull.eu ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 86ECB3858D38 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::102b ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1713869934; cv=none; b=aeef05J+1xAqinbWzUGemS4Png6jVT92K/0aYUAeMBlKkXJzL422moxEQ7fQ0vyUQ7wZHJShqc5b9u5hLZVK9UjplZO+TiYGq/6hHQaZV+T7cIUaYfFhO8lKTJa3bVfTdcAr72CuoZLtS2QoG6TzUpOIDC4Q8MAuIX4bREqOfxU= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1713869934; c=relaxed/simple; bh=IcqXlFcurI3I3eSpmPDGdMgGPxiUOXGSVOptcJ3SwkE=; h=DKIM-Signature:MIME-Version:From:Date:Message-ID:Subject:To; b=fuMdiDmrM1J6zdZVTant786B+5hjU11l8i0uNebDEngHRFk1UTZbepybJTMDBRVAYdTjCqmT4AnjzdBE4xjhr4jOiCo6bzwtLYrl+QjWDAmpW8AQ4loufTopig7q/Ov8kOnmnX87RPijcU8l1FiV49H3y8PyLtVAeyYWuHwp7Cs= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-pj1-x102b.google.com with SMTP id 98e67ed59e1d1-2a559928f46so3555044a91.0 for ; Tue, 23 Apr 2024 03:58:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=vrull.eu; s=google; t=1713869930; x=1714474730; darn=gcc.gnu.org; h=content-transfer-encoding:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=Vc2kntulC9vAOowN5ajk/9GV/rvPTWzPqejHqeOPYNQ=; b=TbY4UuxnYAB+qJIZqIg4VVt4JdKKaaM4fz6xEQEj7rQ+yLVpF2cB/ymuEgfkbDKYIF p36zfss/K04Z//KHYDm7roygzzEpB5vyEnunR7rPjNHryAwfRWkaju9ewlzHzHrOsRgw 05ZFyQ0lLkrQeuqM5+HjKOlqH18XoL/6sNgAvbi2prz3wTn7qxxkDF6v9d4inV+TSozy q8wzpL74sr/VSaIE40SKwiIcYTBG3iujPXA5JYMvtCRDHoftND1Yz/wjjXoeihpWo670 K9PY+GLDqvikSUPHVDaD2ndkZwK42aeARN56GuEdYIfuXzTptj+5EgNQqM7CTwWYEEiw d2tw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1713869930; x=1714474730; h=content-transfer-encoding:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Vc2kntulC9vAOowN5ajk/9GV/rvPTWzPqejHqeOPYNQ=; b=qnijCKkXs0c0nS8FMT4IX/d79H7u0watigXSuhJxU8/Cxb2lqRaHzQdJDXWD3Jm97a OQcd0o7OscDcy/8SYsCDFNVuKNXjh382LSblIf2Ut3AjTPaqj3+RAF6CmBrL9R4tpNj3 qfKPMsNceGTAs5FaNv6sUeHU+HE7NWG1U0b5+b+UZf4ijwIifAwqO/suz45Z8T685LfA ZR+A9zUzFzRhtWdgqrDgQC/ph6012J9Jrj+h+4tyVRB+SAGESdv2sALAb9oO8RvzT8Ca T2PKmcJA2AlBKfNW+l0tzTrKQmUifaAhCqYI6LCFM0iGPa2/tmlsmNFLhgfI23X9UHPX o6ww== X-Forwarded-Encrypted: i=1; AJvYcCULIU58X5ivJixFJBdeCsL3jhhS+UOviVTUhRWCINAbFU0YES201CWij36b8y3DsX204Iham9HjF1NuiK5tFk1KVCNE7gdY8A== X-Gm-Message-State: AOJu0YxIZXeab/47rcJqovqTb+G5sE5eq4qO1eKvizKBVDm4SpFKA36/ cV97frg0wtWhoYVKxE8gmQYlnJqN1cd5iJIsPBeZoulU7V6oCbo7GvuzDqiDgZna3wfEdEq/MoX NQTmN91CZARJ+2mWOqpGfBfYZfk8ETiA5m7eF/A== X-Google-Smtp-Source: AGHT+IGg4rB+6FZ6MS+aDoox7a0h6cKdYnWmUF/+Isuuc70os8DlmgIFV9+yP/P/Ny6s7Lpy9H4oD5YPDDzW19JvrRg= X-Received: by 2002:a17:90a:9907:b0:2ac:5b11:9e46 with SMTP id b7-20020a17090a990700b002ac5b119e46mr13022475pjp.26.1713869930205; Tue, 23 Apr 2024 03:58:50 -0700 (PDT) MIME-Version: 1.0 References: <20230830101400.1539313-1-manolis.tsamis@vrull.eu> <20230830101400.1539313-2-manolis.tsamis@vrull.eu> In-Reply-To: From: Manolis Tsamis Date: Tue, 23 Apr 2024 13:58:14 +0300 Message-ID: Subject: Re: [PATCH v3 1/4] ifcvt: handle sequences that clobber flags in noce_convert_multiple_sets To: Manolis Tsamis , gcc-patches@gcc.gnu.org, Philipp Tomsich , Robin Dapp , Jakub Jelinek , richard.sandiford@arm.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-9.0 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,KAM_SHORT,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Thu, Oct 19, 2023 at 10:41=E2=80=AFPM Richard Sandiford wrote: > > Manolis Tsamis writes: > > This is an extension of what was done in PR106590. > > > > Currently if a sequence generated in noce_convert_multiple_sets clobber= s the > > condition rtx (cc_cmp or rev_cc_cmp) then only seq1 is used afterwards > > (sequences that emit the comparison itself). Since this applies only fr= om the > > next iteration it assumes that the sequences generated (in particular s= eq2) > > doesn't clobber the condition rtx itself before using it in the if_then= _else, > > which is only true in specific cases (currently only register/subregist= er moves > > are allowed). > > > > This patch changes this so it also tests if seq2 clobbers cc_cmp/rev_cc= _cmp in > > the current iteration. This makes it possible to include arithmetic ope= rations > > in noce_convert_multiple_sets. > > > > gcc/ChangeLog: > > > > * ifcvt.cc (check_for_cc_cmp_clobbers): Use modified_in_p instead= . > > (noce_convert_multiple_sets_1): Don't use seq2 if it clobbers cc_= cmp. > > > > Signed-off-by: Manolis Tsamis > > --- > > > > (no changes since v1) > > > > gcc/ifcvt.cc | 49 +++++++++++++++++++------------------------------ > > 1 file changed, 19 insertions(+), 30 deletions(-) > > Sorry for the slow review. TBH I was hoping someone else would pick > it up, since (a) I'm not very familiar with this code, and (b) I don't > really agree with the way that the current code works. I'm not sure the > current dependency checking is safe, so I'm nervous about adding even > more cases to it. And it feels like the different ifcvt techniques are > not now well distinguished, so that they're beginning to overlap and > compete with one another. None of that is your fault, of course. :) > > > diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc > > index a0af553b9ff..3273aeca125 100644 > > --- a/gcc/ifcvt.cc > > +++ b/gcc/ifcvt.cc > > @@ -3375,20 +3375,6 @@ noce_convert_multiple_sets (struct noce_if_info = *if_info) > > return true; > > } > > > > -/* Helper function for noce_convert_multiple_sets_1. If store to > > - DEST can affect P[0] or P[1], clear P[0]. Called via note_stores. = */ > > - > > -static void > > -check_for_cc_cmp_clobbers (rtx dest, const_rtx, void *p0) > > -{ > > - rtx *p =3D (rtx *) p0; > > - if (p[0] =3D=3D NULL_RTX) > > - return; > > - if (reg_overlap_mentioned_p (dest, p[0]) > > - || (p[1] && reg_overlap_mentioned_p (dest, p[1]))) > > - p[0] =3D NULL_RTX; > > -} > > - > > /* This goes through all relevant insns of IF_INFO->then_bb and tries = to > > create conditional moves. In case a simple move sufficis the insn > > should be listed in NEED_NO_CMOV. The rewired-src cases should be > > @@ -3552,9 +3538,17 @@ noce_convert_multiple_sets_1 (struct noce_if_inf= o *if_info, > > creating an additional compare for each. If successful, costing > > is easier and this sequence is usually preferred. */ > > if (cc_cmp) > > - seq2 =3D try_emit_cmove_seq (if_info, temp, cond, > > - new_val, old_val, need_cmov, > > - &cost2, &temp_dest2, cc_cmp, rev_cc_cm= p); > > + { > > + seq2 =3D try_emit_cmove_seq (if_info, temp, cond, > > + new_val, old_val, need_cmov, > > + &cost2, &temp_dest2, cc_cmp, rev_cc_= cmp); > > + > > + /* The if_then_else in SEQ2 may be affected when cc_cmp/rev_cc_= cmp is > > + clobbered. We can't safely use the sequence in this case. = */ > > + if (seq2 && (modified_in_p (cc_cmp, seq2) > > + || (rev_cc_cmp && modified_in_p (rev_cc_cmp, seq2)))) > > + seq2 =3D NULL; > > modified_in_p only checks the first instruction in seq2, not the whole > sequence. > > I think the unpatched approach is OK in cases where seq2 clobbers > cc_cmp/rev_cc_cmp in or after the last use, since instructions are > defined to operate on a read-all/compute/write-all basis. > > Soon after the snippet above, the unpatched code has this loop: > > /* The backend might have created a sequence that uses the > condition. Check this. */ > rtx_insn *walk =3D seq2; > while (walk) > { > rtx set =3D single_set (walk); > > if (!set || !SET_SRC (set)) > > This condition looks odd. A SET_SRC is never null. But more importantly= , > continuing means "assume the best", and I don't think we should assume > the best for (say) an insn with two parallel sets. > > It doesn't look like the series addresses this, but !set seems more > likely to occur if we extend the function to general operations. > > { > walk =3D NEXT_INSN (walk); > continue; > } > > rtx src =3D SET_SRC (set); > > if (XEXP (set, 1) && GET_CODE (XEXP (set, 1)) =3D=3D IF_THEN_EL= SE) > ; /* We assume that this is the cmove created by the backend = that > naturally uses the condition. Therefore we ignore it. = */ > > XEXP (set, 1) is src. But here too, I'm a bit nervous about assuming > that any if_then_else is safe to ignore, even currently. It seems > more dangerous if we extend the function to handle more cases. > > else > { > if (reg_mentioned_p (XEXP (cond, 0), src) > || reg_mentioned_p (XEXP (cond, 1), src)) > { > read_comparison =3D true; > break; > } > } > > walk =3D NEXT_INSN (walk); > } > > Maybe a safer way to check whether "insn" is sensitive to the values > in "val" is to use: > > subrtx_iterator::array_type array; > FOR_EACH_SUBRTX (iter, array, val, NONCONST) > if (OBJECT_P (*iter) && reg_overlap_mentioned_p (*iter, insn)) > return true; > return false; > > which doesn't assume that insn is a single set. > > But I'm not sure which cases this code is trying to catch. Is it trying > to catch cases where seq2 "spontaneously" uses registers that happen to > overlap with cond? If so, then when does that happen? And if it does > happen, wouldn't the sequence also have to set the registers first? > > If instead the code is trying to detect uses of cond that were fed > to seq2 outside of cond itself, via try_emit_cmove_seq, then wouldn't > it be enough to check old_val and new_val? > > Anyway, the point I was trying to get to was: we could use the > FOR_EACH_SUBRTX above to check whether an instruction in seq2 > is sensitive to the value of cc_cmp/rev_cc_cmp. And we can use > modified_in_p, as in your patch, to check whether an instruction > changes the value of cc_cmp/rev_cc_cmp. We could therefore record > whether we've seen a modification of cc_cmp/rev_cc_cmp, then reject > seq2 if we see a subsequent use. > Hi Richard, I have re-implemented this code based on all your suggestions and resubmitted it as an RFC: https://gcc.gnu.org/pipermail/gcc-patches/2024-April/649898.html The code is a bit more complicated but it handles a lot more cases. On a side note, I also observed that the original code was sometimes setting read_comparison =3D true more conservatively than needed due to a limitation of reg_mentioned_p. In the expression reg_mentioned_p (XEXP (cond, 1), src) both XEXP (cond, 1) and src can be (const_int 1) and reg_mentioned_p will return true, although it should return false. This is because it has a check if (reg =3D=3D in) return true; Which will return true. If we look at reg_overlap_mentioned_p instead, it has a check if (CONSTANT_P (in)) return false; and I believe we need something similar for reg_mentioned_p. Thanks, Manolis > Thanks, > Richard > > > + } > > > > /* The backend might have created a sequence that uses the > > condition. Check this. */ > > @@ -3609,21 +3603,16 @@ noce_convert_multiple_sets_1 (struct noce_if_in= fo *if_info, > > return false; > > } > > > > - if (cc_cmp) > > + if (cc_cmp && seq =3D=3D seq1) > > { > > - /* Check if SEQ can clobber registers mentioned in > > - cc_cmp and/or rev_cc_cmp. If yes, we need to use > > - only seq1 from that point on. */ > > - rtx cc_cmp_pair[2] =3D { cc_cmp, rev_cc_cmp }; > > - for (walk =3D seq; walk; walk =3D NEXT_INSN (walk)) > > + /* Check if SEQ can clobber registers mentioned in cc_cmp/rev_c= c_cmp. > > + If yes, we need to use only seq1 from that point on. > > + Only check when we use seq1 since we have already tested seq= 2. */ > > + if (modified_in_p (cc_cmp, seq) > > + || (rev_cc_cmp && modified_in_p (rev_cc_cmp, seq))) > > { > > - note_stores (walk, check_for_cc_cmp_clobbers, cc_cmp_pair); > > - if (cc_cmp_pair[0] =3D=3D NULL_RTX) > > - { > > - cc_cmp =3D NULL_RTX; > > - rev_cc_cmp =3D NULL_RTX; > > - break; > > - } > > + cc_cmp =3D NULL_RTX; > > + rev_cc_cmp =3D NULL_RTX; > > } > > }