From: "Roger Sayle"
Cc: "'Jeff Law'"
Subject: [middle-end PATCH take #2] Only call targetm.truly_noop_truncation for truncations.
Date: Sun, 31 Dec 2023 16:23:18 -0000
Message-ID: <04d301da3c05$aadbb740$009325c0$@nextmovesoftware.com>

Very many thanks (and a Happy New Year) to the pre-commit patch testing
folks at linaro.org.  Their testing has revealed that although my patch
is clean on x86_64, it triggers some problems on aarch64 and arm.
The issue (with the previous version of my patch) is that these platforms
require a paradoxical subreg to be generated by the middle-end, where we
were previously checking for truly_noop_truncation.

This has been fixed (in revision 2) below.  Where previously I had:

@@ -66,7 +66,9 @@ gen_lowpart_general (machine_mode mode, rtx x)
   scalar_int_mode xmode;
   if (is_a <scalar_int_mode> (GET_MODE (x), &xmode)
       && GET_MODE_SIZE (xmode) <= UNITS_PER_WORD
-      && TRULY_NOOP_TRUNCATION_MODES_P (mode, xmode)
+      && (known_lt (GET_MODE_SIZE (mode), GET_MODE_SIZE (xmode))
+          ? TRULY_NOOP_TRUNCATION_MODES_P (mode, xmode)
+          : known_eq (GET_MODE_SIZE (mode), GET_MODE_SIZE (xmode)))
       && !reload_completed)
     return gen_lowpart_general (mode, force_reg (xmode, x));

the correct change is:

   scalar_int_mode xmode;
   if (is_a <scalar_int_mode> (GET_MODE (x), &xmode)
       && GET_MODE_SIZE (xmode) <= UNITS_PER_WORD
-      && TRULY_NOOP_TRUNCATION_MODES_P (mode, xmode)
+      && (known_ge (GET_MODE_SIZE (mode), GET_MODE_SIZE (xmode))
+          || TRULY_NOOP_TRUNCATION_MODES_P (mode, xmode))
       && !reload_completed)
     return gen_lowpart_general (mode, force_reg (xmode, x));

i.e. we only call TRULY_NOOP_TRUNCATION_MODES_P when we know we have
a truncation, but the behaviour of non-truncations is preserved (no
longer depends upon unspecified behaviour) and gen_lowpart_general is
called to create the paradoxical SUBREG.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32}
with no new failures.  Ok for mainline?  Hopefully this revision tests
cleanly on the linaro.org CI pipeline.

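To make the difference between the two revisions of the gen_lowpart_general
condition concrete, here is a minimal standalone sketch.  It is not GCC code:
modes are reduced to plain byte sizes, and the names strict_hook, default_hook,
cond_original, cond_rev1 and cond_rev2 are hypothetical stand-ins invented for
this illustration.  The strict hook mimics the i386 testing tweak by rejecting
any call that is not a genuine truncation, and it assumes that every real
truncation is a no-op on this toy target.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for a target hook that, like the i386 testing
// tweak, refuses to be called for anything but a genuine truncation.
static bool strict_hook(unsigned out_size, unsigned in_size) {
  if (out_size >= in_size)
    throw std::logic_error("hook called for a non-truncation");
  return true;  // assumption: every real truncation is a no-op here
}

// Stand-in for the default hook: silently returns true for any input.
static bool default_hook(unsigned, unsigned) { return true; }

using hook_fn = bool (*)(unsigned, unsigned);

// Original condition: the hook is consulted unconditionally.
static bool cond_original(unsigned mode, unsigned xmode, hook_fn hook) {
  return hook(mode, xmode);
}

// Revision 1: consult the hook for truncations, otherwise require
// equal sizes.  A wider (paradoxical) mode therefore yields false.
static bool cond_rev1(unsigned mode, unsigned xmode, hook_fn hook) {
  return mode < xmode ? hook(mode, xmode) : mode == xmode;
}

// Revision 2: any non-truncation passes without touching the hook;
// only genuine truncations consult it.
static bool cond_rev2(unsigned mode, unsigned xmode, hook_fn hook) {
  return mode >= xmode || hook(mode, xmode);
}
```

In this model, a request for an 8-byte lowpart of a 4-byte value (the
paradoxical case) passes the original condition with the default hook and
passes revision 2 with the strict hook, but fails revision 1.  That mirrors
the change in behaviour for non-truncations described above.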

2023-12-31  Roger Sayle

gcc/ChangeLog
        * combine.cc (make_extraction): Confirm that OUTPREC is less than
        INPREC before calling TRULY_NOOP_TRUNCATION_MODES_P.
        * expmed.cc (store_bit_field_using_insv): Likewise.
        (extract_bit_field_using_extv): Likewise.
        (extract_bit_field_as_subreg): Likewise.
        * optabs-query.cc (get_best_extraction_insn): Likewise.
        * optabs.cc (expand_parity): Likewise.
        * rtlhooks.cc (gen_lowpart_general): Likewise.
        * simplify-rtx.cc (simplify_truncation): Disallow truncations
        to the same precision.
        (simplify_unary_operation_1) <case TRUNCATE>: Move optimization
        of truncations to the same mode earlier.


> -----Original Message-----
> From: Roger Sayle
> Sent: 28 December 2023 15:35
> To: 'gcc-patches@gcc.gnu.org'
> Cc: 'Jeff Law'
> Subject: [middle-end PATCH] Only call targetm.truly_noop_truncation for
> truncations.
>
>
> The truly_noop_truncation target hook is documented, in target.def, as "true if it
> is safe to convert a value of inprec bits to one of outprec bits (where outprec is
> smaller than inprec) by merely operating on it as if it had only outprec bits", i.e.
> the middle-end can use a SUBREG instead of a TRUNCATE.
>
> What's perhaps potentially a little ambiguous in the above description is whether
> it is the caller or the callee that's responsible for ensuring or checking whether
> "outprec < inprec".  The name TRULY_NOOP_TRUNCATION_P, like
> SUBREG_PROMOTED_P, may be prone to being understood as a predicate that
> confirms that something is a no-op truncation or a promoted subreg, when in fact
> the caller must first confirm this is a truncation/subreg and only then call the
> "classification" macro.
>
> Alas making the following minor tweak (for testing) to the i386 backend:
>
> static bool
> ix86_truly_noop_truncation (poly_uint64 outprec, poly_uint64 inprec) {
>   gcc_assert (outprec < inprec);
>   return true;
> }
>
> #undef TARGET_TRULY_NOOP_TRUNCATION
> #define TARGET_TRULY_NOOP_TRUNCATION ix86_truly_noop_truncation
>
> reveals that there are numerous callers in middle-end that rely on the default
> behaviour of silently returning true for any (invalid) input.
> These are fixed below.
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and
> make -k check, both with and without --target_board=unix{-m32} with no new
> failures.  Ok for mainline?
>
>
> 2023-12-28  Roger Sayle
>
> gcc/ChangeLog
>         * combine.cc (make_extraction): Confirm that OUTPREC is less than
>         INPREC before calling TRULY_NOOP_TRUNCATION_MODES_P.
>         * expmed.cc (store_bit_field_using_insv): Likewise.
>         (extract_bit_field_using_extv): Likewise.
>         (extract_bit_field_as_subreg): Likewise.
>         * optabs-query.cc (get_best_extraction_insn): Likewise.
>         * optabs.cc (expand_parity): Likewise.
>         * rtlhooks.cc (gen_lowpart_general): Likewise.
>         * simplify-rtx.cc (simplify_truncation): Disallow truncations
>         to the same precision.
>         (simplify_unary_operation_1) <case TRUNCATE>: Move optimization
>         of truncations to the same mode earlier.
>
>
> Thanks in advance,
> Roger
> --

diff --git a/gcc/combine.cc b/gcc/combine.cc
index f2c64a9..5aa2f57 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -7613,7 +7613,8 @@ make_extraction (machine_mode mode, rtx inner, HOST_WIDE_INT pos,
       && (pos == 0 || REG_P (inner))
       && (inner_mode == tmode
           || !REG_P (inner)
-          || TRULY_NOOP_TRUNCATION_MODES_P (tmode, inner_mode)
+          || (known_lt (GET_MODE_SIZE (tmode), GET_MODE_SIZE (inner_mode))
+              && TRULY_NOOP_TRUNCATION_MODES_P (tmode, inner_mode))
           || reg_truncated_to_mode (tmode, inner))
       && (! in_dest
           || (REG_P (inner)
@@ -7856,6 +7857,8 @@ make_extraction (machine_mode mode, rtx inner, HOST_WIDE_INT pos,
   /* On the LHS, don't create paradoxical subregs implicitely truncating
      the register unless TARGET_TRULY_NOOP_TRUNCATION. */
   if (in_dest
+      && known_lt (GET_MODE_SIZE (GET_MODE (inner)),
+                   GET_MODE_SIZE (wanted_inner_mode))
       && !TRULY_NOOP_TRUNCATION_MODES_P (GET_MODE (inner),
                                          wanted_inner_mode))
     return NULL_RTX;
diff --git a/gcc/expmed.cc b/gcc/expmed.cc
index 05331dd..6398bf9 100644
--- a/gcc/expmed.cc
+++ b/gcc/expmed.cc
@@ -651,6 +651,7 @@ store_bit_field_using_insv (const extraction_insn *insv, rtx op0,
          X) 0)) is (reg:N X). */
       if (GET_CODE (xop0) == SUBREG
           && REG_P (SUBREG_REG (xop0))
+          && paradoxical_subreg_p (xop0)
           && !TRULY_NOOP_TRUNCATION_MODES_P (GET_MODE (SUBREG_REG (xop0)),
                                              op_mode))
         {
@@ -1585,7 +1586,11 @@ extract_bit_field_using_extv (const extraction_insn *extv, rtx op0,
          mode. Instead, create a temporary and use convert_move to set
          the target. */
       if (REG_P (target)
-          && TRULY_NOOP_TRUNCATION_MODES_P (GET_MODE (target), ext_mode)
+          && (known_lt (GET_MODE_SIZE (GET_MODE (target)),
+                        GET_MODE_SIZE (ext_mode))
+              ? TRULY_NOOP_TRUNCATION_MODES_P (GET_MODE (target), ext_mode)
+              : known_eq (GET_MODE_SIZE (GET_MODE (target)),
+                          GET_MODE_SIZE (ext_mode)))
           && (temp = gen_lowpart_if_possible (ext_mode, target)))
         {
           target = temp;
@@ -1626,7 +1631,9 @@ extract_bit_field_as_subreg (machine_mode mode, rtx op0,
   if (multiple_p (bitnum, BITS_PER_UNIT, &bytenum)
       && known_eq (bitsize, GET_MODE_BITSIZE (mode))
       && lowpart_bit_field_p (bitnum, bitsize, op0_mode)
-      && TRULY_NOOP_TRUNCATION_MODES_P (mode, op0_mode))
+      && (known_lt (GET_MODE_SIZE (mode), GET_MODE_SIZE (op0_mode))
+          ? TRULY_NOOP_TRUNCATION_MODES_P (mode, op0_mode)
+          : known_eq (GET_MODE_SIZE (mode), GET_MODE_SIZE (op0_mode))))
     return simplify_gen_subreg (mode, op0, op0_mode, bytenum);
   return NULL_RTX;
 }
diff --git a/gcc/optabs-query.cc b/gcc/optabs-query.cc
index 947ccef..f33253f 100644
--- a/gcc/optabs-query.cc
+++ b/gcc/optabs-query.cc
@@ -213,7 +213,7 @@ get_best_extraction_insn (extraction_insn *insn,
           FOR_EACH_MODE_FROM (mode_iter, mode)
             {
               mode = mode_iter.require ();
-              if (maybe_gt (GET_MODE_SIZE (mode), GET_MODE_SIZE (field_mode))
+              if (maybe_ge (GET_MODE_SIZE (mode), GET_MODE_SIZE (field_mode))
                   || TRULY_NOOP_TRUNCATION_MODES_P (insn->field_mode,
                                                     field_mode))
                 break;
diff --git a/gcc/optabs.cc b/gcc/optabs.cc
index 6a34276..fad0d59 100644
--- a/gcc/optabs.cc
+++ b/gcc/optabs.cc
@@ -2954,7 +2954,11 @@ expand_parity (scalar_int_mode mode, rtx op0, rtx target)
               if (temp)
                 {
                   if (mclass != MODE_INT
-                      || !TRULY_NOOP_TRUNCATION_MODES_P (mode, wider_mode))
+                      || (known_lt (GET_MODE_SIZE (mode),
+                                    GET_MODE_SIZE (wider_mode))
+                          ? !TRULY_NOOP_TRUNCATION_MODES_P (mode, wider_mode)
+                          : maybe_ne (GET_MODE_SIZE (mode),
+                                      GET_MODE_SIZE (wider_mode))))
                     return convert_to_mode (mode, temp, 0);
                   else
                     return gen_lowpart (mode, temp);
diff --git a/gcc/rtlhooks.cc b/gcc/rtlhooks.cc
index 989d3c9..c3313fd 100644
--- a/gcc/rtlhooks.cc
+++ b/gcc/rtlhooks.cc
@@ -66,7 +66,8 @@ gen_lowpart_general (machine_mode mode, rtx x)
   scalar_int_mode xmode;
   if (is_a <scalar_int_mode> (GET_MODE (x), &xmode)
       && GET_MODE_SIZE (xmode) <= UNITS_PER_WORD
-      && TRULY_NOOP_TRUNCATION_MODES_P (mode, xmode)
+      && (known_ge (GET_MODE_SIZE (mode), GET_MODE_SIZE (xmode))
+          || TRULY_NOOP_TRUNCATION_MODES_P (mode, xmode))
       && !reload_completed)
     return gen_lowpart_general (mode, force_reg (xmode, x));

diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
index f3745d8..27518f5 100644
--- a/gcc/simplify-rtx.cc
+++ b/gcc/simplify-rtx.cc
@@ -617,7 +617,7 @@ simplify_context::simplify_truncation (machine_mode mode, rtx op,
   unsigned int op_precision = GET_MODE_UNIT_PRECISION (op_mode);
   scalar_int_mode int_mode, int_op_mode, subreg_mode;

-  gcc_assert (precision <= op_precision);
+  gcc_assert (precision < op_precision);

   /* Optimize truncations of zero and sign extended values. */
   if (GET_CODE (op) == ZERO_EXTEND
@@ -1207,6 +1207,10 @@ simplify_context::simplify_unary_operation_1 (rtx_code code, machine_mode mode,
       break;

     case TRUNCATE:
+      /* Check for useless truncation. */
+      if (GET_MODE (op) == mode)
+        return op;
+
       /* Don't optimize (lshiftrt (mult ...)) as it would interfere
          with the umulXi3_highpart patterns. */
       if (GET_CODE (op) == LSHIFTRT
@@ -1271,9 +1275,6 @@ simplify_context::simplify_unary_operation_1 (rtx_code code, machine_mode mode,
             return temp;
         }

-      /* Check for useless truncation. */
-      if (GET_MODE (op) == mode)
-        return op;
       break;

     case FLOAT_TRUNCATE:
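
The two simplify-rtx.cc hunks in the patch work as a pair: the assertion in
simplify_truncation becomes strict, and the "useless truncation" check moves
to the top of the TRUNCATE case.  The sketch below is one reading of why that
ordering matters, not GCC code.  It models modes by their precision alone
(so distinct modes of equal precision are not covered), and the names
toy_simplify_truncation, toy_truncate_case and Result are hypothetical.

```cpp
#include <cassert>
#include <stdexcept>

enum class Result { ReturnedOperand, Simplified, Unchanged };

// Toy stand-in for simplify_truncation after the patch: it insists on
// a strict truncation, as the tightened gcc_assert does.
static bool toy_simplify_truncation(unsigned precision, unsigned op_precision) {
  if (precision >= op_precision)
    throw std::logic_error("not a strict truncation");
  return false;  // assumption: this toy never finds a simplification
}

// Toy stand-in for the TRUNCATE case with the same-precision check
// performed first, so the strict helper is never reached for it.
static Result toy_truncate_case(unsigned precision, unsigned op_precision) {
  if (precision == op_precision)
    return Result::ReturnedOperand;  // useless truncation, handled up front
  if (toy_simplify_truncation(precision, op_precision))
    return Result::Simplified;
  return Result::Unchanged;
}
```

With the early return in place, a same-precision request never reaches the
strict helper.  If the check stayed at the end of the case, as before the
patch, the helper could see equal precisions and the tightened assertion
would fire.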