Message-ID: <5a55cacd-539d-e3d6-3b3d-c7c30a76a564@linux.ibm.com>
Date: Mon, 1 Aug 2022 14:19:32 +0800
Subject: Re: [PATCH, rs6000] Add multiply-add expand pattern [PR103109]
To: HAO CHEN GUI
Cc: Segher Boessenkool, David, Peter Bergner, gcc-patches
From: "Kewen.Lin"

Hi Haochen,

Thanks for the patch. Some comments are inline.

on 2022/7/25 13:11, HAO CHEN GUI wrote:
> Hi,
>   This patch adds an expand and several insns for multiply-add with
> three 64bit operands.
>
>   Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
> Is this okay for trunk? Any recommendations? Thanks a lot.
>
> ChangeLog
> 2022-07-22  Haochen Gui
>
> gcc/
>     PR target/103109
>     * config/rs6000/rs6000.md (maddditi4): New pattern for
>     multiply-add.
>     (madddi4_lowpart): New.
>     (madddi4_lowpart_le): New.
>     (madddi4_highpart): New.
>     (madddi4_highpart_le): New.
>
> gcc/testsuite/
>     PR target/103109
>     * gcc.target/powerpc/pr103109.c: New.
>
> patch.diff
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index c55ee7e171a..4f3b56e103e 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -3226,6 +3226,97 @@ (define_insn "*maddld4"
>    "maddld %0,%1,%2,%3"
>    [(set_attr "type" "mul")])
>
> +(define_expand "maddditi4"
> +  [(set (match_operand:TI 0 "gpc_reg_operand")
> +        (plus:TI
> +          (mult:TI (any_extend:TI
> +                     (match_operand:DI 1 "gpc_reg_operand"))
> +                   (any_extend:TI
> +                     (match_operand:DI 2 "gpc_reg_operand")))
> +          (any_extend:TI
> +            (match_operand:DI 3 "gpc_reg_operand"))))]
> +  "TARGET_POWERPC64 && TARGET_MADDLD"
> +{
> +  rtx op0_lo = gen_rtx_SUBREG (DImode, operands[0], BYTES_BIG_ENDIAN ? 8 : 0);
> +  rtx op0_hi = gen_rtx_SUBREG (DImode, operands[0], BYTES_BIG_ENDIAN ? 0 : 8);
> +
> +  if (BYTES_BIG_ENDIAN)
> +    {
> +      emit_insn (gen_madddi4_lowpart (op0_lo, operands[1], operands[2],
> +                                      operands[3]));
> +      emit_insn (gen_madddi4_highpart (op0_hi, operands[1], operands[2],
> +                                       operands[3]));
> +    }
> +  else
> +    {
> +      emit_insn (gen_madddi4_lowpart_le (op0_lo, operands[1], operands[2],
> +                                         operands[3]));
> +      emit_insn (gen_madddi4_highpart_le (op0_hi, operands[1], operands[2],
> +                                          operands[3]));
> +    }
> +  DONE;
> +})
> +
> +(define_insn "madddi4_lowpart"
> +  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
> +        (subreg:DI
> +          (plus:TI
> +            (mult:TI (any_extend:TI
> +                       (match_operand:DI 1 "gpc_reg_operand" "r"))
> +                     (any_extend:TI
> +                       (match_operand:DI 2 "gpc_reg_operand" "r")))
> +            (any_extend:TI
> +              (match_operand:DI 3 "gpc_reg_operand" "r")))
> +         8))]
> +  "TARGET_POWERPC64 && TARGET_MADDLD && BYTES_BIG_ENDIAN"
> +  "maddld %0,%1,%2,%3"
> +  [(set_attr "type" "mul")])
> +
> +(define_insn "madddi4_lowpart_le"
> +  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
> +        (subreg:DI
> +          (plus:TI
> +            (mult:TI (any_extend:TI
> +                       (match_operand:DI 1 "gpc_reg_operand" "r"))
> +                     (any_extend:TI
> +                       (match_operand:DI 2 "gpc_reg_operand" "r")))
> +            (any_extend:TI
> +              (match_operand:DI 3 "gpc_reg_operand" "r")))
> +         0))]
> +  "TARGET_POWERPC64 && TARGET_MADDLD && !BYTES_BIG_ENDIAN"
> +  "maddld %0,%1,%2,%3"
> +  [(set_attr "type" "mul")])
> +
> +(define_insn "madddi4_highpart"
> +  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
> +        (subreg:DI
> +          (plus:TI
> +            (mult:TI (any_extend:TI
> +                       (match_operand:DI 1 "gpc_reg_operand" "r"))
> +                     (any_extend:TI
> +                       (match_operand:DI 2 "gpc_reg_operand" "r")))
> +            (any_extend:TI
> +              (match_operand:DI 3 "gpc_reg_operand" "r")))
> +         0))]
> +  "TARGET_POWERPC64 && TARGET_MADDLD && BYTES_BIG_ENDIAN"
> +  "maddhd %0,%1,%2,%3"
> +  [(set_attr "type" "mul")])
> +
> +(define_insn "madddi4_highpart_le"
> +  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
> +        (subreg:DI
> +          (plus:TI
> +            (mult:TI (any_extend:TI
> +                       (match_operand:DI 1 "gpc_reg_operand" "r"))
> +                     (any_extend:TI
> +                       (match_operand:DI 2 "gpc_reg_operand" "r")))
> +            (any_extend:TI
> +              (match_operand:DI 3 "gpc_reg_operand" "r")))
> +         8))]
> +  "TARGET_POWERPC64 && TARGET_MADDLD && !BYTES_BIG_ENDIAN"
> +  "maddhd %0,%1,%2,%3"
> +  [(set_attr "type" "mul")])
> +
>  (define_insn "udiv3"
>    [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
>          (udiv:GPR (match_operand:GPR 1 "gpc_reg_operand" "r")
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr103109.c b/gcc/testsuite/gcc.target/powerpc/pr103109.c
> new file mode 100644
> index 00000000000..256e05d5677
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr103109.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile { target { lp64 } } } */

Since the guard is TARGET_POWERPC64, should this use has_arch_ppc64 instead?

> +/* { dg-require-effective-target powerpc_p9modulo_ok } */

Does this need the int128 effective target as well?

> +/* { dg-options "-mdejagnu-cpu=power9 -O2" } */
> +/* { dg-final { scan-assembler-times {\mmaddld\M} 2 } } */
> +/* { dg-final { scan-assembler-times {\mmaddhd\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mmaddhdu\M} 1 } } */
> +
> +__int128 test (long a, long b, long c)
> +{
> +  return (__int128) a * (__int128) b + (__int128) c;
> +}
> +
> +unsigned __int128 testu (unsigned long a, unsigned long b, unsigned long c)
> +{
> +  return (unsigned __int128) a * (unsigned __int128) b + (unsigned __int128) c;
> +}

I'm not sure whether there is existing coverage for this kind of multiply-add
(operands promoted first, then multiplied and added). If not, it would be
better to add a runnable test case.

BR,
Kewen
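
P.S. To make the runnable-test suggestion concrete, here is a rough sketch of
the checks such a test could perform. It is only an illustration under my own
assumptions: the helper names (make128, run_checks) and the test values are
hypothetical and not part of the patch, and I use long long rather than long
so the sketch does not depend on LP64. The expected results are written as
hand-computed 64-bit halves, so the high and low parts are checked separately
from the 128-bit arithmetic under test.

```c
/* Sketch of checks for a runnable variant of pr103109.c.  Helper names
   and test values are illustrative, not taken from the patch.  */

typedef unsigned long long u64;

/* noinline keeps the compiler from folding the multiply-add at -O2.  */
__attribute__ ((noinline)) __int128
test (long long a, long long b, long long c)
{
  return (__int128) a * (__int128) b + (__int128) c;
}

__attribute__ ((noinline)) unsigned __int128
testu (u64 a, u64 b, u64 c)
{
  return (unsigned __int128) a * (unsigned __int128) b
	 + (unsigned __int128) c;
}

/* Build an expected 128-bit value from hand-computed 64-bit halves.  */
static unsigned __int128
make128 (u64 hi, u64 lo)
{
  return ((unsigned __int128) hi << 64) | lo;
}

/* Return the number of mismatches; 0 means every case passed.  */
int
run_checks (void)
{
  const long long max = 0x7fffffffffffffffLL;
  const long long min = -max - 1;
  const u64 umax = 0xffffffffffffffffULL;
  int fails = 0;

  /* -1 * 1 + 0 = -1: both halves are all ones.  */
  fails += (unsigned __int128) test (-1, 1, 0) != make128 (umax, umax);
  /* -1 * 1 + 1 = 0: the addend carries through the high half.  */
  fails += (unsigned __int128) test (-1, 1, 1) != make128 (0, 0);
  /* (2^63 - 1) * 2 + 1 = 2^64 - 1: the high half stays zero.  */
  fails += (unsigned __int128) test (max, 2, 1) != make128 (0, umax);
  /* -2^63 * 2 - 1 = -2^64 - 1: a negative addend borrows from the high half.  */
  fails += (unsigned __int128) test (min, 2, -1) != make128 (umax - 1, umax);
  /* (2^64 - 1)^2 + (2^64 - 1) = (2^64 - 1) * 2^64.  */
  fails += testu (umax, umax, umax) != make128 (umax, 0);
  /* (2^64 - 1) * 1 + 1 = 2^64: the addend carries into the high half.  */
  fails += testu (umax, 1, 1) != make128 (1, 0);

  return fails;
}
```

In an actual test, main would just call run_checks () and abort () on a
nonzero result. I would expect the file to need dg-do run plus the int128
effective target and a hardware check along the lines of p9modulo_hw, but
please double-check the exact directives.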