From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) by sourceware.org (Postfix) with ESMTPS id B4E7E3858C60 for ; Wed, 9 Mar 2022 04:37:45 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org B4E7E3858C60 Received: from pps.filterd (m0187473.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 2290uQ4U030558; Wed, 9 Mar 2022 04:37:44 GMT Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com with ESMTP id 3enu2sq1f3-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 09 Mar 2022 04:37:43 +0000 Received: from m0187473.ppops.net (m0187473.ppops.net [127.0.0.1]) by pps.reinject (8.16.0.43/8.16.0.43) with SMTP id 2294V4db004762; Wed, 9 Mar 2022 04:37:42 GMT Received: from ppma03dal.us.ibm.com (b.bd.3ea9.ip4.static.sl-reverse.com [169.62.189.11]) by mx0a-001b2d01.pphosted.com with ESMTP id 3enu2sq1eu-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 09 Mar 2022 04:37:42 +0000 Received: from pps.filterd (ppma03dal.us.ibm.com [127.0.0.1]) by ppma03dal.us.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 2294WhjT006264; Wed, 9 Mar 2022 04:37:42 GMT Received: from b01cxnp22036.gho.pok.ibm.com (b01cxnp22036.gho.pok.ibm.com [9.57.198.26]) by ppma03dal.us.ibm.com with ESMTP id 3emy8h5v6m-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 09 Mar 2022 04:37:42 +0000 Received: from b01ledav006.gho.pok.ibm.com (b01ledav006.gho.pok.ibm.com [9.57.199.111]) by b01cxnp22036.gho.pok.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 2294bfm23998218 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 9 Mar 2022 04:37:41 GMT Received: from b01ledav006.gho.pok.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 1963DAC064; Wed, 9 Mar 2022 04:37:41 +0000 (GMT) Received: from b01ledav006.gho.pok.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id B2E64AC059; Wed, 9 Mar 2022 04:37:40 +0000 (GMT) Received: from perkins (unknown [9.3.90.38]) by b01ledav006.gho.pok.ibm.com (Postfix) with ESMTPS; Wed, 9 Mar 2022 04:37:40 +0000 (GMT) From: Jiufu Guo To: Richard Biener , Segher Boessenkool Cc: gcc-patches@gcc.gnu.org, dje.gcc@gmail.com, jlaw@tachyum.com Subject: Re: [PATCH] Check if loading const from mem is faster References: <20220222065313.2040127-1-guojiufu@linux.ibm.com> <70r5oq10-988r-3rns-356-o3s79o292nn0@fhfr.qr> <1d471fba-a966-3e90-92ce-ae4707fe53b6@linux.ibm.com> <20220223212749.GI614@gate.crashing.org> <20220228164511.GB614@gate.crashing.org> <7p35k1u4pi.fsf@linux.ibm.com> <20220302202416.GG614@gate.crashing.org> <7pilsvqref.fsf@linux.ibm.com> <7pcziwy9cp.fsf@linux.ibm.com> <9s5op1r4-9nps-r156-95r-os98o6474153@fhfr.qr> Date: Wed, 09 Mar 2022 12:37:38 +0800 In-Reply-To: <9s5op1r4-9nps-r156-95r-os98o6474153@fhfr.qr> (Richard Biener's message of "Tue, 8 Mar 2022 12:50:07 +0100 (CET)") Message-ID: <7pwnh3wxkd.fsf@linux.ibm.com> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: 5w6l2UX6Q2MSmjn62w1KzFLhzGUYXJyi X-Proofpoint-GUID: bfl94eabAFEqhQ_1WnlunxBfFS5IB0Ve X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.816,Hydra:6.0.425,FMLib:17.11.64.514 definitions=2022-03-09_01,2022-03-04_01,2022-02-23_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 clxscore=1015 mlxscore=0 spamscore=0 lowpriorityscore=0 bulkscore=0 impostorscore=0 phishscore=0 suspectscore=0 malwarescore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2202240000 definitions=main-2203090025 X-Spam-Status: No, score=-10.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_MSPIKE_H5, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Mar 2022 04:37:47 -0000 Hi! Richard Biener writes: > On Tue, 8 Mar 2022, Jiufu Guo wrote: > >> Jiufu Guo writes: >> >> Hi! >> >> > Hi Sehger, >> > >> > Segher Boessenkool writes: >> > >> >> On Tue, Mar 01, 2022 at 10:28:57PM +0800, Jiufu Guo wrote: >> >>> Segher Boessenkool writes: >> >>> > No. insn_cost is only for correct, existing instructions, not for >> >>> > made-up nonsense. I created insn_cost precisely to get away from that >> >>> > aspect of rtx_cost (and some other issues, like, it is incredibly hard >> >>> > and cumbersome to write a correct rtx_cost). >> >>> >> >>> Thanks! The implementations of hook insn_cost are align with this >> >>> design, they are checking insn's attributes and COSTS_N_INSNS. >> >>> >> >>> One question on the speciall case: >> >>> For instruction: "r119:DI=0x100803004101001" >> >>> Would we treat it as valid instruction? >> >> >> >> Currently we do, alternative 6 in *movdi_internal64: we allow any r<-n. >> >> This is costed as 5 insns (cost=20). >> >> >> >> It generally is better to split things into patterns close to the >> >> eventual machine isntructions as early as possible: all the more generic >> >> optimisations can take advantage of that then. >> > Get it! >> >> >> >>> A patch, which is attached the end of this mail, accepts >> >>> "r119:DI=0x100803004101001" as input of insn_cost. >> >>> In this patch, >> >>> - A tmp instruction is generated via make_insn_raw. >> >>> - A few calls to rtx_cost (in cse_insn) is replaced by insn_cost. >> >>> - In hook of insn_cost, checking the special 'constant' instruction. >> >>> Are these make sense? >> >> >> >> I'll review that patch inline. >> >> I drafted a new patch that replace rtx_cost with insn_cost for cse.cc. >> Different from the previous partial patch, this patch replaces all usage >> of rtx_cost. It may be better/aggressive than previous one. > > I think there's no advantage for using insn_cost over rtx_cost for > the simple SET case. Thanks for your comments and raise this concern. For those targets which do not implement insn_cost, insn_cost calls rtx_cost through pattern_cost, then insn_cost is equal to rtx_cost. While, for those targets which have insn_cost, it seems insn_cost would be better(or say more accurate/consistent?) than rtx_cost. Since: - insn_cost recog the insn first, and compute cost through something (like length/cost attributes from .md file) for the 'machine insn'. - rtx_cost estimates the cost through analyzing the 'rtx content'. The accurate estimation relates to the context. For a special example: "%r100 = C", as a previous patch, by tunning target's rtx_cost hook, cost could be computed according to the value of C. insn_cost may just model the cost in the define of the machine instruction. These reasons are my initial thoughts. Segher may have better explain. :-) To replace rtx_cost with insn_cost, this patch build a SET instruction: "%r = rtx_expr", then using "%r = rtx_expr" from insn_cost to simulate the cost of "rtx_expr" from rtx_cost. BR, Jiufu > > Richard. > >> With this patch, bootstrap pass. >> From regtest, only output of fusion-p10-ldcmpi.c is changed, and the >> change seems as expected. >> >> >> BR, >> Jiufu >> >> diff --git a/gcc/cse.cc b/gcc/cse.cc >> index a18b599d324..e623ad298db 100644 >> --- a/gcc/cse.cc >> +++ b/gcc/cse.cc >> @@ -262,6 +262,9 @@ static struct qty_table_elem *qty_table; >> static rtx_insn *this_insn; >> static bool optimize_this_for_speed_p; >> >> +/* Used for insn_cost. */ >> +static rtx_insn *estimate_insn; >> + >> /* Index by register number, gives the number of the next (or >> previous) register in the chain of registers sharing the same >> value. >> @@ -445,7 +448,7 @@ struct table_elt >> /* Compute cost of X, as stored in the `cost' field of a table_elt. Fixed >> hard registers and pointers into the frame are the cheapest with a cost >> of 0. Next come pseudos with a cost of one and other hard registers with >> - a cost of 2. Aside from these special cases, call `rtx_cost'. */ >> + a cost of 2. Aside from these special cases, call `insn_cost'. */ >> >> #define CHEAP_REGNO(N) \ >> (REGNO_PTR_FRAME_P (N) \ >> @@ -698,18 +701,33 @@ preferable (int cost_a, int regcost_a, int cost_b, int regcost_b) >> from COST macro to keep it simple. */ >> >> static int >> -notreg_cost (rtx x, machine_mode mode, enum rtx_code outer, int opno) >> +notreg_cost (rtx x, machine_mode mode, enum rtx_code /*outer*/, int /*opno*/) >> { >> scalar_int_mode int_mode, inner_mode; >> - return ((GET_CODE (x) == SUBREG >> - && REG_P (SUBREG_REG (x)) >> - && is_int_mode (mode, &int_mode) >> - && is_int_mode (GET_MODE (SUBREG_REG (x)), &inner_mode) >> - && GET_MODE_SIZE (int_mode) < GET_MODE_SIZE (inner_mode) >> - && subreg_lowpart_p (x) >> - && TRULY_NOOP_TRUNCATION_MODES_P (int_mode, inner_mode)) >> - ? 0 >> - : rtx_cost (x, mode, outer, opno, optimize_this_for_speed_p) * 2); >> + if (GET_CODE (x) == SUBREG && REG_P (SUBREG_REG (x)) >> + && is_int_mode (mode, &int_mode) >> + && is_int_mode (GET_MODE (SUBREG_REG (x)), &inner_mode) >> + && GET_MODE_SIZE (int_mode) < GET_MODE_SIZE (inner_mode) >> + && subreg_lowpart_p (x) >> + && TRULY_NOOP_TRUNCATION_MODES_P (int_mode, inner_mode)) >> + return 0; >> + >> + if (estimate_insn == NULL) >> + { >> + estimate_insn = make_insn_raw ( >> + gen_rtx_SET (gen_rtx_REG (mode, LAST_VIRTUAL_REGISTER + 1), x)); >> + SET_PREV_INSN (estimate_insn) = NULL_RTX; >> + SET_NEXT_INSN (estimate_insn) = NULL_RTX; >> + INSN_LOCATION (estimate_insn) = 0; >> + } >> + else >> + { >> + /* Update for new context. */ >> + INSN_CODE (estimate_insn) = -1; >> + PUT_MODE (SET_DEST (PATTERN (estimate_insn)), mode); >> + SET_SRC (PATTERN (estimate_insn)) = x; >> + } >> + return insn_cost (estimate_insn, optimize_this_for_speed_p); >> } >> >> >> @@ -6667,6 +6685,7 @@ cse_main (rtx_insn *f ATTRIBUTE_UNUSED, int nregs) >> >> init_recog (); >> init_alias_analysis (); >> + estimate_insn = NULL; >> >> reg_eqv_table = XNEWVEC (struct reg_eqv_elem, nregs); >> >>