From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx0a-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by sourceware.org (Postfix) with ESMTPS id 3F2C63858C74 for ; Tue, 1 Mar 2022 19:30:34 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 3F2C63858C74 Received: from pps.filterd (m0098413.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 221J8jOl009525; Tue, 1 Mar 2022 19:30:33 GMT Received: from pps.reinject (localhost [127.0.0.1]) by mx0b-001b2d01.pphosted.com with ESMTP id 3ehs6u8m9c-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 01 Mar 2022 19:30:33 +0000 Received: from m0098413.ppops.net (m0098413.ppops.net [127.0.0.1]) by pps.reinject (8.16.0.43/8.16.0.43) with SMTP id 221JBJln018538; Tue, 1 Mar 2022 19:30:32 GMT Received: from ppma02dal.us.ibm.com (a.bd.3ea9.ip4.static.sl-reverse.com [169.62.189.10]) by mx0b-001b2d01.pphosted.com with ESMTP id 3ehs6u8m92-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 01 Mar 2022 19:30:32 +0000 Received: from pps.filterd (ppma02dal.us.ibm.com [127.0.0.1]) by ppma02dal.us.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 221JK7RG016661; Tue, 1 Mar 2022 19:30:32 GMT Received: from b01cxnp22033.gho.pok.ibm.com (b01cxnp22033.gho.pok.ibm.com [9.57.198.23]) by ppma02dal.us.ibm.com with ESMTP id 3efbuaejvf-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 01 Mar 2022 19:30:32 +0000 Received: from b01ledav001.gho.pok.ibm.com (b01ledav001.gho.pok.ibm.com [9.57.199.106]) by b01cxnp22033.gho.pok.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 221JUTii32113030 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Tue, 1 Mar 2022 19:30:29 GMT Received: from b01ledav001.gho.pok.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 59F9A2805E; Tue, 1 Mar 2022 19:30:29 +0000 (GMT) Received: from b01ledav001.gho.pok.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 96C4028059; Tue, 1 Mar 2022 19:30:28 +0000 (GMT) Received: from sig-9-65-236-218.ibm.com (unknown [9.65.236.218]) by b01ledav001.gho.pok.ibm.com (Postfix) with ESMTP; Tue, 1 Mar 2022 19:30:28 +0000 (GMT) Message-ID: <70f7af05040a044c15ec082261387785ce4cabab.camel@vnet.ibm.com> Subject: Re: [PATCH] Optimize signed DImode -> TImode on power10, PR target/104698 From: will schmidt To: Michael Meissner , gcc-patches@gcc.gnu.org, Segher Boessenkool , David Edelsohn , Bill Schmidt , Peter Bergner Date: Tue, 01 Mar 2022 13:30:28 -0600 In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" X-Mailer: Evolution 3.28.5 (3.28.5-18.el8) Mime-Version: 1.0 Content-Transfer-Encoding: 7bit X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: _Qx-YtlvkHsArrcEASrmTfCAVkzfZqv_ X-Proofpoint-GUID: WC-FUaNM0K3WZkfewH94qG_8uuCDWEzY X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.816,Hydra:6.0.425,FMLib:17.11.64.514 definitions=2022-03-01_07,2022-02-26_01,2022-02-23_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 malwarescore=0 priorityscore=1501 spamscore=0 phishscore=0 bulkscore=0 adultscore=0 impostorscore=0 mlxlogscore=999 mlxscore=0 suspectscore=0 clxscore=1015 lowpriorityscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2201110000 definitions=main-2203010095 X-Spam-Status: No, score=-12.2 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_EF, GIT_PATCH_0, KAM_NUMSUBJECT, RCVD_IN_MSPIKE_H5, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 01 Mar 2022 19:30:36 -0000 On Mon, 2022-02-28 at 22:21 -0500, Michael Meissner wrote: > Optimize signed DImode -> TImode on power10, PR target/104698. > Hi, Logic seems OK to me, a few suggestions on the comments intermixed below. As always, i defer if there are counter arguments. :-) > On power10, GCC tries to optimize the signed conversion from DImode to > TImode by using the vextsd2q instruction. However to generate this > instruction, it would have to generate 3 direct moves (1 from the GPR > registers to the altivec registers, and 2 from the altivec registers to > the GPR register). > > This patch adds code back in to use the shift right immediate instruction > to do the conversion if the target/source is GPR registers. Perhaps drop "back in". If it's necessary to call out a previous commit that removed the code for whatever reason, certainly do so. It's not clear from context if that was the case. > > 2022-02-28 Michael Meissner > > gcc/ > PR target/104698 > * config/rs6000/vsx.md (mtvsrdd_diti_w1): Delete. > (extendditi2): Replace with code to deal with both GPR registers > and with altivec registers. Perhaps enhance with (extendditi2): Convert from define_expand to define_insn_and_split. Replace with code ... > --- > gcc/config/rs6000/vsx.md | 73 ++++++++++++++++++++++++++++------------ > 1 file changed, 52 insertions(+), 21 deletions(-) > > diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md > index b53de103872..62464f67f4d 100644 > --- a/gcc/config/rs6000/vsx.md > +++ b/gcc/config/rs6000/vsx.md > @@ -5023,15 +5023,58 @@ (define_expand "vsignextend_si_v2di" > DONE; > }) > > -;; ISA 3.1 vector sign extend > -;; Move DI value from GPR to TI mode in VSX register, word 1. > -(define_insn "mtvsrdd_diti_w1" > - [(set (match_operand:TI 0 "register_operand" "=wa") > - (unspec:TI [(match_operand:DI 1 "register_operand" "r")] > - UNSPEC_MTVSRD_DITI_W1))] > - "TARGET_POWERPC64 && TARGET_DIRECT_MOVE" > - "mtvsrdd %x0,0,%1" > - [(set_attr "type" "vecmove")]) > +;; Sign extend DI to TI. We provide both GPR targets and Altivec targets. If > +;; the register allocator prefers the GPRs, we won't have to move the value to > +;; the altivec registers, do the vextsd2q instruction and move it back. If we > +;; aren't compiling for 64-bit power10, don't provide the service and let the > +;; machine independent code handle the extension. So, the ".. we won't have to ..." applies to the altivec target path here? Describing in a way that indicates what code doesn't do doesn't seem right. If so, and perhaps even if not, i suggest rearranging the comment slightly so it can be read as an either or. If the register allocator prefers the GPRS, ... Otherwise, for altivec registers we dothe vextsd2q ... > +(define_insn_and_split "extendditi2" > + [(set (match_operand:TI 0 "register_operand" "=r,r,v,v,v") > + (sign_extend:TI (match_operand:DI 1 "input_operand" "r,m,r,wa,Z"))) > + (clobber (reg:DI CA_REGNO))] > + "TARGET_POWERPC64 && TARGET_POWER10" > + "#" > + "&& reload_completed" > + [(pc)] > +{ > + rtx dest = operands[0]; > + rtx src = operands[1]; > + int dest_regno = reg_or_subregno (dest); > + > + /* Handle conversion to GPR registers. Load up the low part and then do > + a sign extension to the upper part. */ > + if (INT_REGNO_P (dest_regno)) > + { > + rtx dest_hi = gen_highpart (DImode, dest); > + rtx dest_lo = gen_lowpart (DImode, dest); > + > + emit_move_insn (dest_lo, src); > + emit_insn (gen_ashrdi3 (dest_hi, dest_lo, GEN_INT (63))); > + DONE; > + } ok > + > + /* For conversion to Altivec register, generate either a splat operation or > + a load rightmost double word instruction. Both instructions gets the > + DImode value into the lower 64 bits, and then do the vextsd2q > + instruction. */ consider s/instruction. Both instructions gets/to get/ > + else if (ALTIVEC_REGNO_P (dest_regno)) > + { > + if (MEM_P (src)) > + emit_insn (gen_vsx_lxvrdx (dest, src)); > + else > + { > + rtx dest_v2di = gen_rtx_REG (V2DImode, dest_regno); > + emit_insn (gen_vsx_splat_v2di (dest_v2di, src)); > + } > + > + emit_insn (gen_extendditi2_vector (dest, dest)); > + DONE; > + } ok lgtm, thanks -Will > + > + else > + gcc_unreachable (); > +} > + [(set_attr "length" "8")]) > > ;; Sign extend 64-bit value in TI reg, word 1, to 128-bit value in TI reg > (define_insn "extendditi2_vector" > @@ -5042,18 +5085,6 @@ (define_insn "extendditi2_vector" > "vextsd2q %0,%1" > [(set_attr "type" "vecexts")]) > > -(define_expand "extendditi2" > - [(set (match_operand:TI 0 "gpc_reg_operand") > - (sign_extend:DI (match_operand:DI 1 "gpc_reg_operand")))] > - "TARGET_POWER10" > - { > - /* Move 64-bit src from GPR to vector reg and sign extend to 128-bits. */ > - rtx temp = gen_reg_rtx (TImode); > - emit_insn (gen_mtvsrdd_diti_w1 (temp, operands[1])); > - emit_insn (gen_extendditi2_vector (operands[0], temp)); > - DONE; > - }) > - > > ;; ISA 3.0 Binary Floating-Point Support > > -- > 2.35.1 > >