From: Jiufu Guo
To: David Edelsohn
Cc: gcc-patches@gcc.gnu.org, segher@kernel.crashing.org, linkw@gcc.gnu.org, bergner@linux.ibm.com
Subject: Re: [PATCH 4/4] rs6000: build constant via li/lis;rldic
References: <20230608015547.3432691-1-guojiufu@linux.ibm.com> <20230608015547.3432691-5-guojiufu@linux.ibm.com>
Date: Tue, 13 Jun 2023 17:18:07 +0800
In-Reply-To: (David Edelsohn's message of "Sat, 10 Jun 2023 21:37:53 -0400")
Message-ID: <7nsfavbu0g.fsf@ltcden2-lp1.aus.stglabs.ibm.com>

Hi David,

Thanks for your valuable comments!

David Edelsohn writes:

> On Wed, Jun 7, 2023 at 9:56 PM Jiufu Guo wrote:
>
> Hi,
>
> This patch checks if a constant is possible to be built by "li;rldic".
> We only need to take care of "negative li", other forms do not need to check.
> For example, "negative lis" is just a "negative li" with an additional shift.
>
> Bootstrap and regtest pass on ppc64{,le}.
> Is this ok for trunk?
>
> BR,
> Jeff (Jiufu)
>
> gcc/ChangeLog:
>
>         * config/rs6000/rs6000.cc (can_be_built_by_li_and_rldic): New function.
>         (rs6000_emit_set_long_const): Call can_be_built_by_li_and_rldic.
>
> This is okay.
>
> Do you have any measurement of how expensive it is to test all of these additional methods to generate a constant?  How much does this affect the
> compile time?

Yeap, thanks for this very good question!
This patch mostly uses bitwise operations and if-conditions, so it is
not expected to be expensive.

Test cases were checked.  For example:
A case with ~1000 constants, most of which hit this feature:
with the patch, compile time is slightly faster:
0m1.985s (without patch) vs. 0m1.874s (with patch).
(Note: rs6000_emit_set_long_const does not occur in hot perf functions.
So this time saving is probably not caused directly by this feature.)

A case with ~1000 constants, most of which do not hit this feature:
0m2.493s (without patch) vs. 0m2.558s (with patch).

For runtime, with the patch there seems to be no visible improvement in
SPEC2017.  Still, I feel this patch is doing the right thing: it uses
fewer instructions to build the constant.

BR,
Jeff (Jiufu Guo)

>
> Thanks, David
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/powerpc/const-build.c: Add more tests.
> ---
>  gcc/config/rs6000/rs6000.cc                   | 61 ++++++++++++++++++-
>  .../gcc.target/powerpc/const-build.c          | 28 +++++++++
>  2 files changed, 88 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index 2a3fa733b45..cd04b6b5c82 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -10387,6 +10387,64 @@ can_be_built_by_li_lis_and_rldicr (HOST_WIDE_INT c, int *shift,
>    return false;
>  }
>
> +/* Check if value C can be built by 2 instructions: one is 'li', another is
> +   rldic.
> +
> +   If so, *SHIFT is set to the 'shift' operand of rldic; and *MASK is set
> +   to the mask value about the 'mb' operand of rldic; and return true.
> +   Return false otherwise.  */
> +
> +static bool
> +can_be_built_by_li_and_rldic (HOST_WIDE_INT c, int *shift, HOST_WIDE_INT *mask)
> +{
> +  /* There are 49 successive ones in the negative value of 'li'.  */
> +  int ones = 49;
> +
> +  /* 1..1xx1..1: negative value of li --> 0..01..1xx0..0:
> +     right bits are shifted as 0's, and left 1's(and x's) are cleaned.  */
> +  int tz = ctz_hwi (c);
> +  int lz = clz_hwi (c);
> +  int middle_ones = clz_hwi (~(c << lz));
> +  if (tz + lz + middle_ones >= ones)
> +    {
> +      *mask = ((1LL << (HOST_BITS_PER_WIDE_INT - tz - lz)) - 1LL) << tz;
> +      *shift = tz;
> +      return true;
> +    }
> +
> +  /* 1..1xx1..1 --> 1..1xx0..01..1: some 1's(following x's) are cleaned.  */
> +  int leading_ones = clz_hwi (~c);
> +  int tailing_ones = ctz_hwi (~c);
> +  int middle_zeros = ctz_hwi (c >> tailing_ones);
> +  if (leading_ones + tailing_ones + middle_zeros >= ones)
> +    {
> +      *mask = ~(((1ULL << middle_zeros) - 1ULL) << tailing_ones);
> +      *shift = tailing_ones + middle_zeros;
> +      return true;
> +    }
> +
> +  /* xx1..1xx: --> xx0..01..1xx: some 1's(following x's) are cleaned.  */
> +  /* Get the position for the first bit of successive 1.
> +     The 24th bit would be in successive 0 or 1.  */
> +  HOST_WIDE_INT low_mask = (1LL << 24) - 1LL;
> +  int pos_first_1 = ((c & (low_mask + 1)) == 0)
> +                      ? clz_hwi (c & low_mask)
> +                      : HOST_BITS_PER_WIDE_INT - ctz_hwi (~(c | low_mask));
> +  middle_ones = clz_hwi (~c << pos_first_1);
> +  middle_zeros = ctz_hwi (c >> (HOST_BITS_PER_WIDE_INT - pos_first_1));
> +  if (pos_first_1 < HOST_BITS_PER_WIDE_INT
> +      && middle_ones + middle_zeros < HOST_BITS_PER_WIDE_INT
> +      && middle_ones + middle_zeros >= ones)
> +    {
> +      *mask = ~(((1ULL << middle_zeros) - 1LL)
> +                << (HOST_BITS_PER_WIDE_INT - pos_first_1));
> +      *shift = HOST_BITS_PER_WIDE_INT - pos_first_1 + middle_zeros;
> +      return true;
> +    }
> +
> +  return false;
> +}
> +
>  /* Subroutine of rs6000_emit_set_const, handling PowerPC64 DImode.
>     Output insns to set DEST equal to the constant C as a series of
>     lis, ori and shl instructions.  */
> @@ -10435,7 +10493,8 @@ rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT c)
>      }
>    else if (can_be_built_by_li_lis_and_rotldi (c, &shift, &mask)
>            || can_be_built_by_li_lis_and_rldicl (c, &shift, &mask)
> -          || can_be_built_by_li_lis_and_rldicr (c, &shift, &mask))
> +          || can_be_built_by_li_lis_and_rldicr (c, &shift, &mask)
> +          || can_be_built_by_li_and_rldic (c, &shift, &mask))
>      {
>        temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (DImode);
>        unsigned HOST_WIDE_INT imm = (c | ~mask);
> diff --git a/gcc/testsuite/gcc.target/powerpc/const-build.c b/gcc/testsuite/gcc.target/powerpc/const-build.c
> index 8c209921d41..b503ee31c7c 100644
> --- a/gcc/testsuite/gcc.target/powerpc/const-build.c
> +++ b/gcc/testsuite/gcc.target/powerpc/const-build.c
> @@ -82,6 +82,29 @@ lis_rldicr_12 (void)
>    return 0x5310000ffffffff0LL;
>  }
>
> +long long NOIPA
> +li_rldic_13 (void)
> +{
> +  return 0x000f853100000000LL;
> +}
> +long long NOIPA
> +li_rldic_14 (void)
> +{
> +  return 0xffff853100ffffffLL;
> +}
> +
> +long long NOIPA
> +li_rldic_15 (void)
> +{
> +  return 0x800000ffffffff31LL;
> +}
> +
> +long long NOIPA
> +li_rldic_16 (void)
> +{
> +  return 0x800000000fffff31LL;
> +}
> +
>  struct fun arr[] = {
>    {li_rotldi_1, 0x7531000000000LL},
>    {li_rotldi_2, 0x2100000000000064LL},
> @@ -95,11 +118,16 @@ struct fun arr[] = {
>    {li_rldicr_10, 0xffff8531fff00000LL},
>    {li_rldicr_11, 0x21fffffffff00000LL},
>    {lis_rldicr_12, 0x5310000ffffffff0LL},
> +  {li_rldic_13, 0x000f853100000000LL},
> +  {li_rldic_14, 0xffff853100ffffffLL},
> +  {li_rldic_15, 0x800000ffffffff31LL},
> +  {li_rldic_16, 0x800000000fffff31LL}
>  };
>
>  /* { dg-final { scan-assembler-times {\mrotldi\M} 6 } } */
>  /* { dg-final { scan-assembler-times {\mrldicl\M} 3 } } */
>  /* { dg-final { scan-assembler-times {\mrldicr\M} 3 } } */
> +/* { dg-final { scan-assembler-times {\mrldic\M} 4 } } */
>
>  int
>  main ()
> --
> 2.39.1
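
For anyone who wants to check the first pattern of can_be_built_by_li_and_rldic outside the compiler, here is a minimal standalone sketch.  It is not the GCC code: the helper names (clz64, ctz64, rotl64, li_rldic_case1, li_immediate, fits_li) are hypothetical, it relies on the GCC/Clang builtins __builtin_clzll and __builtin_ctzll, and it covers only the "0..01..1xx0..0" case.  The rotate-right step that forms the li immediate is an assumption about how imm is used after "imm = (c | ~mask)"; that part of rs6000_emit_set_long_const is not shown in the hunk above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Rotate a 64-bit value left by n bits (n is taken modulo 64). */
static uint64_t rotl64(uint64_t v, unsigned n)
{
    n &= 63;
    return n ? (v << n) | (v >> (64 - n)) : v;
}

/* Leading/trailing zero counts that return 64 for a zero input,
   because the builtins are undefined for 0. */
static int clz64(uint64_t v) { return v ? __builtin_clzll(v) : 64; }
static int ctz64(uint64_t v) { return v ? __builtin_ctzll(v) : 64; }

/* First pattern only: C looks like 0..01..1xx0..0, i.e. a negative
   'li' value shifted left with its upper bits cleared.  Unsigned
   arithmetic is used here to keep every shift well defined. */
static bool li_rldic_case1(uint64_t c, int *shift, uint64_t *mask)
{
    if (c == 0)
        return false;               /* zero needs no rldic; avoids 64-bit shifts */
    int tz = ctz64(c);
    int lz = clz64(c);
    int middle_ones = clz64(~(c << lz));
    if (tz + lz + middle_ones < 49)  /* 49 = leading ones of a negative li */
        return false;
    int width = 64 - tz - lz;        /* number of bits kept by the mask */
    *mask = (width == 64 ? ~UINT64_C(0) : ((UINT64_C(1) << width) - 1)) << tz;
    *shift = tz;
    return true;
}

/* Assumed follow-up step: the value 'li' must load so that a left
   rotate by SHIFT followed by AND with MASK gives back C. */
static uint64_t li_immediate(uint64_t c, int shift, uint64_t mask)
{
    uint64_t imm = c | ~mask;
    return rotl64(imm, 64 - (unsigned) shift);   /* rotate right by shift */
}

/* True if IMM is a sign-extended 16-bit value, i.e. loadable by 'li'. */
static bool fits_li(uint64_t imm)
{
    return imm + 0x8000u < 0x10000u;
}
```

With the li_rldic_13 test value 0x000f853100000000, this sketch gives shift 32 and mask 0x000fffff00000000, and the immediate 0xffffffffffff8531 (li -31439) rotated left by 32 and masked gives back the constant.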