From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) by sourceware.org (Postfix) with ESMTPS id A023E3851343; Wed, 14 Dec 2022 09:46:15 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org A023E3851343 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=linux.ibm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linux.ibm.com Received: from pps.filterd (m0098404.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 2BE9dcga015780; Wed, 14 Dec 2022 09:46:14 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=message-id : date : subject : to : cc : references : from : in-reply-to : content-type : content-transfer-encoding : mime-version; s=pp1; bh=Sy1XdnkMogNKY/tGNf8lb1C/EumRhlr1gyrAhJOgm30=; b=QX9T6YaCjAh2PmMPC4nnTNCdufe+hB9de1F1I3gz5+Og4/Ql7YnZJGsjkphnUeoVr2dJ 551sc+GhcjFeNJI97vHYdoR5QxWt8RI/mxG10KE6jm54LbGV6KcvLCiUTnF3uGR/WrKx buzfo13D0cEZ8m7zjd2BVMrJ1tkGq1e6yDZMdOFB2GBmX2RjeHfYzjD6qbYLTEwTPFuo vRK7j6pR6DES0/BTqZgnsASzylzTYJTTVt7TEaybi7HfFmPXVzzgrjVWlVTILfAQfNt2 JZ78nzMEIi875Zm1Kvhd1Y4iVbzEvZrxp2wgiku/xxhzw1RV7AKaCQmVsI1xa+xva2e/ 7g== Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3mfaupa3u1-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 14 Dec 2022 09:46:13 +0000 Received: from m0098404.ppops.net (m0098404.ppops.net [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 2BE9gn30004610; Wed, 14 Dec 2022 09:46:13 GMT Received: from ppma03fra.de.ibm.com (6b.4a.5195.ip4.static.sl-reverse.com [149.81.74.107]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3mfaupa3st-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 14 Dec 2022 09:46:12 +0000 Received: from pps.filterd (ppma03fra.de.ibm.com [127.0.0.1]) by ppma03fra.de.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 2BDK3EfU026122; Wed, 14 Dec 2022 09:46:10 GMT Received: from smtprelay03.fra02v.mail.ibm.com ([9.218.2.224]) by ppma03fra.de.ibm.com (PPS) with ESMTPS id 3mf0380kp1-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 14 Dec 2022 09:46:10 +0000 Received: from smtpav01.fra02v.mail.ibm.com (smtpav01.fra02v.mail.ibm.com [10.20.54.100]) by smtprelay03.fra02v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 2BE9k8Ps47448448 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 14 Dec 2022 09:46:08 GMT Received: from smtpav01.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 2B9D520040; Wed, 14 Dec 2022 09:46:08 +0000 (GMT) Received: from smtpav01.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 1F99E20043; Wed, 14 Dec 2022 09:46:06 +0000 (GMT) Received: from [9.197.253.236] (unknown [9.197.253.236]) by smtpav01.fra02v.mail.ibm.com (Postfix) with ESMTP; Wed, 14 Dec 2022 09:46:05 +0000 (GMT) Message-ID: <76725c70-bdbb-e982-1ea8-4d11a1506711@linux.ibm.com> Date: Wed, 14 Dec 2022 17:46:04 +0800 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:91.0) Gecko/20100101 Thunderbird/91.6.1 Subject: Re: [PATCH V2] rs6000: Load high and low part of 64bit constant independently Content-Language: en-US To: Jiufu Guo Cc: segher@kernel.crashing.org, dje.gcc@gmail.com, linkw@gcc.gnu.org, gcc-patches@gcc.gnu.org References: <20221212014401.112147-1-guojiufu@linux.ibm.com> From: "Kewen.Lin" In-Reply-To: <20221212014401.112147-1-guojiufu@linux.ibm.com> Content-Type: text/plain; charset=UTF-8 X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: Ey5sLgOCwqSnWetV8DQ71QwcAHhH9Q6O X-Proofpoint-GUID: YAzrS4Yz9t80li_z3U5aIROPtKKZoCj8 Content-Transfer-Encoding: 7bit X-Proofpoint-UnRewURL: 0 URL was un-rewritten MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.923,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-12-14_04,2022-12-13_01,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 clxscore=1015 priorityscore=1501 lowpriorityscore=0 mlxlogscore=999 mlxscore=0 phishscore=0 adultscore=0 impostorscore=0 suspectscore=0 spamscore=0 bulkscore=0 malwarescore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2212070000 definitions=main-2212140075 X-Spam-Status: No, score=-11.3 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,GIT_PATCH_0,KAM_SHORT,NICE_REPLY_A,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Hi Jeff, on 2022/12/12 09:44, Jiufu Guo via Gcc-patches wrote: > Hi, > > Compare with previous patch, this patch updates accoding to comments; fixes > conflicts with trunk, and recheck bootstrap®test. > https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607333.html > > For a complicate 64bit constant, blow is one instruction-sequence to ~~~~ below > build: > lis 9,0x800a > ori 9,9,0xabcd > sldi 9,9,32 > oris 9,9,0xc167 > ori 9,9,0xfa16 > > while we can also use below sequence to build: > lis 9,0xc167 > lis 10,0x800a > ori 9,9,0xfa16 > ori 10,10,0xabcd > rldimi 9,10,32,0 > This sequence is using 2 registers to build high and low part firstly, > and then merge them. > > In parallel aspect, this sequence would be faster. (Ofcause, using 1 more > register with potential register pressure). > > The instruction sequence with two registers for parallel version can be > generated only if can_create_pseudo_p. Otherwise, the one register > sequence is generated. > > Bootstrap and regtest pass on ppc64{,le}. > Is this ok for trunk? > > > BR, > Jeff(Jiufu) > > > gcc/ChangeLog: > > * config/rs6000/rs6000.cc (rs6000_emit_set_long_const): Generate > more parallel code if can_create_pseudo_p. > > gcc/testsuite/ChangeLog: > > * gcc.target/powerpc/parall_5insn_const.c: New test. > > --- > gcc/config/rs6000/rs6000.cc | 37 +++++++++++++------ > .../gcc.target/powerpc/parall_5insn_const.c | 27 ++++++++++++++ > 2 files changed, 52 insertions(+), 12 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/powerpc/parall_5insn_const.c > > diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc > index b3a609f3aa3..3020d9780bc 100644 > --- a/gcc/config/rs6000/rs6000.cc > +++ b/gcc/config/rs6000/rs6000.cc > @@ -10322,19 +10322,32 @@ rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT c) > } > else > { > - temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (DImode); > - > - emit_move_insn (temp, GEN_INT (sext_hwi (ud4 << 16, 32))); > - if (ud3 != 0) > - emit_move_insn (temp, gen_rtx_IOR (DImode, temp, GEN_INT (ud3))); > + if (can_create_pseudo_p ()) > + { > + /* lis H,U4; ori H,U3; lis L,U2; ori L,U1; rldimi L,H,32,0. */ Nit: It's probably better to update the capitals with the actual variable names in upper case, since they are also short, ... > + rtx high = gen_reg_rtx (DImode); > + rtx low = gen_reg_rtx (DImode); > + HOST_WIDE_INT num = (ud2 << 16) | ud1; > + rs6000_emit_set_long_const (low, sext_hwi (num, 32)); > + num = (ud4 << 16) | ud3; > + rs6000_emit_set_long_const (high, sext_hwi (num, 32)); > + emit_insn (gen_rotldi3_insert_3 (dest, high, GEN_INT (32), low, > + GEN_INT (0xffffffff))); > + } > + else > + { > + /* lis A,U4; ori A,U3; rotl A,32; oris A,U2; ori A,U1. */ ... and here, the others look good to me. Thanks! BR, Kewen