From: Jiufu Guo
To: Segher Boessenkool
Cc: Richard Biener, gcc-patches@gcc.gnu.org, dje.gcc@gmail.com, jlaw@tachyum.com, wschmidt@linux.ibm.com
Subject: Re: [PATCH] Check if loading const from mem is faster
Date: Thu, 24 Feb 2022 15:48:54 +0800
In-Reply-To: <20220223212749.GI614@gate.crashing.org> (Segher Boessenkool's message of "Wed, 23 Feb 2022 15:27:49 -0600")

Segher Boessenkool writes:

> On Wed, Feb 23, 2022 at 07:32:55PM +0800, guojiufu wrote:
>> >We already have TARGET_INSN_COST which you could ask for a cost.
>> >Like if we'd have a single_set then just temporarily substitute
>> >the RHS with the candidate and cost the insns and compare against
>> >the original insn cost. So why exactly do you need a new hook
>> >for this particular situation?
>>
>> Thanks for pointing out this! Segher also mentioned this before.
>> Currently, CSE is using rtx_cost. Using insn_cost to replace
>> rtx_cost would be a good idea for all necessary places including CSE.
>
> I have updated many places that used rtx_cost to use insn_cost instead,
> over the years (as a fallback the generic insn_cost will use rtx_cost).
> CSE is the biggest remaining thing. There is a long tail left as well
> of course.
>
>> For this particular case: check the cost for constants.
>> I did not use insn_cost. Because to use insn_cost, we may need
>> to create a recognizable insn temporarily, and for some kind of
>> constants we may need to create a sequence instructions on some
>> platform, e.g. "li xx; ori ; sldi .." on ppc64, and check the
>> sum cost of those instructions. If only create one fake
>> instruction, the insn_cost may not return the accurate cost either.
>
> That is the problem yes. You need insns to call insn_cost on. You can
> look in combine.c:combine_validate_cost to see how this can be done; but
> you need to have some code to generate in the first place, and for CSE
> it isn't always clear what code to generate, it really is based on RTL
> expressions having a cost.

Hi Segher,

Thanks! combine_validate_cost is useful for evaluating the cost of
several instructions or replacements.
As you pointed out, at CSE it may not be clear exactly what insn
sequence will be generated. Actually, the same issue also exists for
RTL expressions: at CSE the exact cost may not be clear, since the real
instructions may be emitted in much later passes.

To get an accurate cost, we could analyze the constant in the hook
(insn_cost or rtx_cost), estimate the likely final instructions, and
then calculate their cost.

We discussed one idea: let the insn_cost hook accept any interim
instruction, estimate the real instructions based on that interim insn,
and return the estimated cost.
For example: pass the insn "r119:DI=0x100803004101001" to insn_cost; in
rs6000_insn_cost (for ppc), analyze the constant "0x100803004101001",
which would need 5 insns; rs6000_insn_cost then sums up the cost of
those 5 insns.

A minor concern: because we know that reading this constant from the
pool is faster than building it with insns, we will end up generating
instructions to load the constant from the pool, and will not emit 5
real instructions to build the value. So we are more interested in
whether loading from the pool is faster or not.

BR,
Jiufu

>
>
> Segher
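
To illustrate the idea above, here is a minimal standalone sketch in
plain C. It is not GCC code and not what rs6000_insn_cost does. The
function names, the simplified li/lis/ori/sldi/oris counting model, and
the pool load cost passed in by the caller are all assumptions made for
this illustration; a real backend would count insns more precisely. It
only shows the shape of the comparison: estimate how many insns are
needed to build the constant, sum their cost, and compare that against
the cost of a load from the pool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed cost unit, for illustration only: one simple insn costs 4,
   the same scale as GCC's COSTS_N_INSNS (1).  */
enum { SKETCH_COST_PER_INSN = 4 };

/* Hypothetical helper: insns needed for a value whose sign-extended
   32-bit pattern is BITS: "li" alone, "lis" alone, or "lis" + "ori".  */
static int sketch_insns_for_s32(uint32_t bits)
{
    uint32_t top = bits & 0xffff8000u;
    if (top == 0 || top == 0xffff8000u)
        return 1;                   /* fits in signed 16 bits: li */
    if ((bits & 0xffffu) == 0)
        return 1;                   /* low halfword is zero: lis */
    return 2;                       /* lis + ori */
}

/* Hypothetical helper: estimated insn count to build VALUE in a
   64-bit register under the simplified model above.  */
static int sketch_num_insns_for_const(int64_t value)
{
    uint64_t u = (uint64_t)value;
    uint32_t hi = (uint32_t)(u >> 32);
    uint32_t lo = (uint32_t)u;
    bool lo_negative = (lo & 0x80000000u) != 0;

    /* VALUE is a sign-extended 32-bit quantity.  */
    if ((hi == 0 && !lo_negative) || (hi == 0xffffffffu && lo_negative))
        return sketch_insns_for_s32(lo);

    /* Otherwise: build the high word, shift it up (sldi), then OR in
       the two low halfwords (oris, ori) when they are non-zero.  */
    int n = sketch_insns_for_s32(hi) + 1;
    if ((lo >> 16) != 0)
        n++;
    if ((lo & 0xffffu) != 0)
        n++;
    return n;
}

/* True if the summed cost of building VALUE with insns exceeds
   POOL_LOAD_COST, an assumed cost supplied by the caller.  */
static bool sketch_pool_load_is_faster(int64_t value, int pool_load_cost)
{
    assert(pool_load_cost > 0);
    return sketch_num_insns_for_const(value) * SKETCH_COST_PER_INSN
           > pool_load_cost;
}
```

In this model 0x100803004101001 needs 5 insns (lis, ori, sldi, oris,
ori), so with an assumed pool load cost of 12 the summed cost of 20
favors the load, while 0x12345678 needs 2 insns (cost 8) and would be
built inline.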