From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Mon, 10 Jul 2023 15:51:56 -0400
From: Michael Meissner
To: gcc-patches@gcc.gnu.org, Michael Meissner, Segher Boessenkool, "Kewen.Lin", David Edelsohn, Peter Bergner
Subject: [PATCH] Improve 64->128 bit zero extension on PowerPC (PR target/108958)

If we are converting an unsigned DImode value to a TImode value, and the
TImode value will go in a vector register, GCC currently does the DImode to
TImode conversion in GPR registers, and then moves the value to the vector
register via a mtvsrdd instruction.

This patch adds a new zero_extendditi2 insn which optimizes moving a GPR to a
vector register by using the mtvsrdd instruction with RA=0, and which uses
lxvrdx to load a 64-bit value from memory into the bottom 64 bits of the
vector register.

2023-07-10  Michael Meissner

gcc/

	PR target/108958
	* config/rs6000/rs6000.md (zero_extendditi2): New insn.

gcc/testsuite/

	PR target/108958
	* gcc.target/powerpc/pr108958.c: New test.
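
For readers less familiar with the two instructions, here is a minimal C
sketch of the intended semantics. It is not part of the patch. The
struct vsx_model type and the model_* helpers are hypothetical names made up
for illustration, and the sketch assumes a compiler that provides __uint128_t.
It models mtvsrdd with RA=0 and lxvrdx as both leaving zero in the upper
doubleword and the 64-bit source in the lower doubleword, which is the same
value the C-level zero extension produces.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 128-bit VSX register as two 64-bit doublewords.
   This struct is an illustration only, not a GCC or ISA definition.  */
struct vsx_model
{
  uint64_t hi;	/* Most significant 64 bits.  */
  uint64_t lo;	/* Least significant 64 bits.  */
};

/* Model of mtvsrdd with RA=0: the upper doubleword is set to zero and the
   lower doubleword receives the GPR source.  */
static struct vsx_model
model_mtvsrdd_ra0 (uint64_t rb)
{
  struct vsx_model v = { 0, rb };
  return v;
}

/* Model of lxvrdx: load 64 bits from memory into the lower doubleword and
   zero the rest of the register.  */
static struct vsx_model
model_lxvrdx (const uint64_t *p)
{
  struct vsx_model v = { 0, *p };
  return v;
}

/* Reassemble the modeled register as one 128-bit integer.  */
static __uint128_t
model_to_u128 (struct vsx_model v)
{
  return ((__uint128_t) v.hi << 64) | v.lo;
}

/* Check that both models agree with the C-level zero extension.  */
static void
check_zero_extend_models (uint64_t a)
{
  __uint128_t expected = a;	/* 64->128 bit zero extension.  */
  assert (model_to_u128 (model_mtvsrdd_ra0 (a)) == expected);
  assert (model_to_u128 (model_lxvrdx (&a)) == expected);
}
```

Calling check_zero_extend_models with any 64-bit value should pass, since in
this model a single instruction yields the full zero-extended TImode value
and no separate GPR-side extension is needed.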
---
 gcc/config/rs6000/rs6000.md                 | 52 +++++++++++++++++++
 gcc/testsuite/gcc.target/powerpc/pr108958.c | 57 +++++++++++++++++++++
 2 files changed, 109 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr108958.c

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index cdab49fbb91..1a3d6316eab 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -987,6 +987,58 @@ (define_insn_and_split "*zero_extendsi<mode>2_dot2"
    (set_attr "dot" "yes")
    (set_attr "length" "4,8")])
 
+(define_insn_and_split "zero_extendditi2"
+  [(set (match_operand:TI 0 "gpc_reg_operand" "=r,r,wa,wa,wa")
+        (zero_extend:TI
+         (match_operand:DI 1 "reg_or_mem_operand" "r,m,b,Z,wa")))
+   (clobber (match_scratch:DI 2 "=X,X,X,X,&wa"))]
+  "TARGET_POWERPC64 && TARGET_P9_VECTOR"
+  "@
+   #
+   #
+   mtvsrdd %x0,0,%1
+   lxvrdx %x0,%y1
+   #"
+  "&& reload_completed
+   && (int_reg_operand (operands[0], TImode)
+       || (vsx_register_operand (operands[0], TImode)
+           && vsx_register_operand (operands[1], DImode)))"
+  [(set (match_dup 2) (match_dup 1))
+   (set (match_dup 3) (const_int 0))]
+{
+  rtx dest = operands[0];
+  rtx src = operands[1];
+
+  /* If we are converting a VSX DImode to VSX TImode, we need to move the upper
+     64-bits (DImode) to the lower 64-bits.  We can't just do a xxpermdi
+     instruction to swap the two 64-bit words, because we can't rely on the
+     bottom 64-bits of the VSX register being 0.  Instead we create a 0 and do
+     the xxpermdi operation to combine the two registers.  */
+  if (vsx_register_operand (dest, TImode)
+      && vsx_register_operand (src, DImode))
+    {
+      rtx tmp = operands[2];
+      emit_move_insn (tmp, const0_rtx);
+
+      rtx hi = tmp;
+      rtx lo = src;
+      if (!BYTES_BIG_ENDIAN)
+        std::swap (hi, lo);
+
+      rtx dest_v2di = gen_rtx_REG (V2DImode, reg_or_subregno (dest));
+      emit_insn (gen_vsx_concat_v2di (dest_v2di, hi, lo));
+      DONE;
+    }
+
+  /* If we are zero extending to a GPR register either from a GPR register,
+     a VSX register or from memory, do the zero extend operation to the
+     lower DI register, and set the upper DI register to 0.  */
+  operands[2] = gen_lowpart (DImode, dest);
+  operands[3] = gen_highpart (DImode, dest);
+}
+  [(set_attr "type" "*,load,vecexts,vecload,vecperm")
+   (set_attr "isa" "*,*,p9v,p10,*")
+   (set_attr "length" "8,8,*,*,8")])
 
 (define_insn "extendqi<mode>2"
   [(set (match_operand:EXTQI 0 "gpc_reg_operand" "=r,?*v")
diff --git a/gcc/testsuite/gcc.target/powerpc/pr108958.c b/gcc/testsuite/gcc.target/powerpc/pr108958.c
new file mode 100644
index 00000000000..85ea0976f91
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr108958.c
@@ -0,0 +1,57 @@
+/* { dg-require-effective-target int128 } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+/* This test makes sure the various optimizations and code paths are used for
+   zero extending DImode to TImode on power10 (PR target/108958).  */
+
+__uint128_t
+gpr_to_gpr (unsigned long long a)
+{
+  return a;  /* li 4,0.  */
+}
+
+__uint128_t
+mem_to_gpr (unsigned long long *p)
+{
+  return *p;  /* ld 3,0(3); li 4,0.  */
+}
+
+__uint128_t
+vsx_to_gpr (double d)
+{
+  return (unsigned long long)d;  /* fctiduz 0,1; li 4,0; mfvsrd 3,0.  */
+}
+
+void
+gpr_to_vsx (__uint128_t *p, unsigned long long a)
+{
+  __uint128_t b = a;  /* mtvsrdd 0,0,4; stxv 0,0(3).  */
+  __asm__ (" # %x0" : "+wa" (b));
+  *p = b;
+}
+
+void
+mem_to_vsx (__uint128_t *p, unsigned long long *q)
+{
+  __uint128_t a = *q;  /* lxvrdx 0,0,4; stxv 0,0(3).  */
+  __asm__ (" # %x0" : "+wa" (a));
+  *p = a;
+}
+
+void
+vsx_to_vsx (__uint128_t *p, double d)
+{
+  /* fctiduz 1,1; xxspltib 0,0; xxpermdi 0,0,1,0; stxv 0,0(3).  */
+  __uint128_t a = (unsigned long long)d;
+  __asm__ (" # %x0" : "+wa" (a));
+  *p = a;
+}
+
+/* { dg-final { scan-assembler-times {\mld\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mli\M} 3 } } */
+/* { dg-final { scan-assembler-times {\mlxvrdx\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mmfvsrd\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mmtvsrdd\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mstxv\M} 3 } } */
+/* { dg-final { scan-assembler-times {\mxxpermdi\M} 1 } } */
-- 
2.41.0

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meissner@linux.ibm.com