From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <84050fc8-c833-0b46-ce6c-6f0dd9869ba6@linux.ibm.com>
Date: Fri, 24 Feb 2023 13:41:49 +0530
Subject: Re: [PATCH] rs6000: fmr gets used instead of faster xxlor [PR93571]
To: Segher Boessenkool
Cc: gcc-patches, bergner@linux.ibm.com
References: <16fa34b8-ad8a-20f2-b285-3b3f5bf5d5b2@linux.ibm.com>
 <20230217172319.GL25951@gate.crashing.org>
 <5f800afe-f162-0d56-3c78-6ee93610e201@linux.ibm.com>
 <20230221110448.GP25951@gate.crashing.org>
 <46ec2582-b3a8-52c9-584a-9d282d26fb79@linux.ibm.com>
 <20230221140927.GQ25951@gate.crashing.org>
 <9099872d-8cd3-c610-5ae0-2484df6572e7@linux.ibm.com>
From: Ajit Agarwal
In-Reply-To: <9099872d-8cd3-c610-5ae0-2484df6572e7@linux.ibm.com>

Hello All:

Here is the patch that uses xxlor instead of fmr where possible.
Performance results show that fmr is better on power9 and power10
architectures, whereas xxlor is better on power7 and power8 architectures.

Bootstrapped and regtested on powerpc64-linux-gnu.

Thanks & Regards
Ajit

	rs6000: Use xxlor instead of fmr where possible

	This patch uses xxlor instead of fmr for power7 and power8
	architectures, and fmr instead of xxlor for power9 and
	power10 architectures.

	Perf measurement results:

	Power9 fmr:   201,847,661 cycles.
	Power9 xxlor: 201,877,78 cycles.
	Power8 fmr:   201,057,795 cycles.
	Power8 xxlor: 201,004,671 cycles.

	2023-02-24  Ajit Kumar Agarwal

gcc/ChangeLog:

	* config/rs6000/rs6000.md (*movdf_hardfloat64): Use xxlor
	for power7 and power8 and fmr for power9 and power10.
---
 gcc/config/rs6000/rs6000.md | 46 +++++++++++++++++++++++--------------
 1 file changed, 29 insertions(+), 17 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 81bffb04ceb..1253b8622a7 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -354,7 +354,7 @@ (define_attr "cpu"
   (const (symbol_ref "(enum attr_cpu) rs6000_tune")))

 ;; The ISA we implement.
-(define_attr "isa" "any,p5,p6,p7,p7v,p8v,p9,p9v,p9kf,p9tf,p10"
+(define_attr "isa" "any,p5,p6,p7,p7v,p8v,p9,p9v,p9kf,p9tf,p7p8,p10"
   (const_string "any"))

 ;; Is this alternative enabled for the current CPU/ISA/etc.?
@@ -402,6 +402,11 @@ (define_attr "enabled" ""
      (and (eq_attr "isa" "p10")
           (match_test "TARGET_POWER10"))
      (const_int 1)
+
+     (and (eq_attr "isa" "p7p8")
+          (match_test "TARGET_VSX && !TARGET_P9_VECTOR"))
+     (const_int 1)
+
+
     ] (const_int 0)))

 ;; If this instruction is microcoded on the CELL processor
@@ -8436,27 +8441,29 @@ (define_insn "*mov_softfloat32"

 (define_insn "*mov_hardfloat64"
   [(set (match_operand:FMOVE64 0 "nonimmediate_operand"
-           "=m, d, d, , wY,
-            , Z, , , !r,
-            YZ, r, !r, *c*l, !r,
-            *h, r, , wa")
+           "=m, d, , , wY,
+            , Z, wa, , !r,
+            YZ, r, !r, *c*l, !r,
+            *h, r, , d, wn,
+            wa")
         (match_operand:FMOVE64 1 "input_operand"
-            "d, m, d, wY, ,
-             Z, , , , ,
+            "d, m, , wY, ,
+             Z, , wa, , ,
              r, YZ, r, r, *h,
-             0, , r, eP"))]
+             0, , r, d, wn,
+             eP"))]
   "TARGET_POWERPC64 && TARGET_HARD_FLOAT
    && (gpc_reg_operand (operands[0], mode)
        || gpc_reg_operand (operands[1], mode))"
   "@
    stfd%U0%X0 %1,%0
    lfd%U1%X1 %0,%1
-   fmr %0,%1
+   xxlor %x0,%x1,%x1
    lxsd %0,%1
    stxsd %1,%0
    lxsdx %x0,%y1
    stxsdx %x1,%y0
-   xxlor %x0,%x1,%x1
+   fmr %0,%1
    xxlxor %x0,%x0,%x0
    li %0,0
    std%U0%X0 %1,%0
@@ -8467,23 +8474,28 @@ (define_insn "*mov_hardfloat64"
    nop
    mfvsrd %0,%x1
    mtvsrd %x0,%1
+   fmr %0,%1
+   fmr %0,%1
    #"
   [(set_attr "type"
-            "fpstore, fpload, fpsimple, fpload, fpstore,
+            "fpstore, fpload, veclogical, fpload, fpstore,
              fpload, fpstore, veclogical, veclogical, integer,
              store, load, *, mtjmpr, mfjmpr,
-             *, mfvsr, mtvsr, vecperm")
+             *, mfvsr, mtvsr, fpsimple, fpsimple,
+             vecperm")
    (set_attr "size" "64")
    (set_attr "isa"
-            "*, *, *, p9v, p9v,
-             p7v, p7v, *, *, *,
-             *, *, *, *, *,
-             *, p8v, p8v, p10")
+            "*, *, p7p8, p9v, p9v,
+             p7v, p7v, *, *, *,
+             *, *, *, *, *,
+             *, p8v, p8v, *, *,
+             p10")
    (set_attr "prefixed"
             "*, *, *, *, *,
              *, *, *, *, *,
              *, *, *, *, *,
-             *, *, *, *")])
+             *, *, *, *, *,
+             *")])

 ;; STD LD MR MT MF G-const
 ;; H-const F-const Special
-- 
2.31.1

On 22/02/23 3:58 pm, Ajit Agarwal via Gcc-patches wrote:
>
>
> On 21/02/23 7:39 pm, Segher Boessenkool wrote:
>> On Tue, Feb 21, 2023 at 06:00:52PM +0530, Ajit Agarwal wrote:
>>> On 21/02/23 4:34 pm, Segher Boessenkool wrote:
>>>> Please don't use a switch, it isn't needed.  Instead use the "isa"
>>>> attribute (with p7v here), and put the preferred alternative first.
>>>
>>> I am not sure how this is possible without switch and using only "isa".
>>
>> You have the "p7v" "xxlor" alternative earlier than the "*" "fmr"
>> alternative.  You can have an "xxlor" for constraints "d", but probably
>> the best (and certainly the easiest) is to just move the existing
>> xxlor to before fmr.
>>
>> Oh, the existing xxlor alternative is implicitly isa p7v, the "wa"
>> constraint causes that.  It may be nicer to mark it explicitly p7v as
>> well, nicer for the reader.
>>
>
> If I do the above, for power9 it selects xxlor instead of fmr.
>
>> Btw, please update the other similar patterns at the same time?  There
>> are eight patterns with fmr in rs6000.md (the four in dfp.md should
>> probably not be touched); not all are similar so should be in separate
>> patches, if changed at all, but a bunch are completely analogous so
>> should not diverge.
>>
>> (It is fine to first do this one pattern only, until we have worked out
>> all kinks, but all should be committed at the same time).
>>
>> Thanks,
>>
>>
>> Segher
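
For readers following the thread, here is a rough Python sketch of the idea discussed above: gate the xxlor alternative with an "isa" value and list it before the unconditional fmr alternative. It is only a simplified model and not how GCC is implemented. The names Target, isa_enabled, FPR_COPY_ALTERNATIVES and first_enabled are hypothetical, and the "earliest enabled alternative wins" rule is an assumption made for illustration; in GCC the register allocator also weighs constraints and costs when choosing an alternative. The two "enabled" conditions are the ones visible in the patch (p7p8 and p10).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    # Hypothetical stand-ins for the TARGET_* macros tested by "enabled".
    vsx: bool
    p9_vector: bool
    power10: bool = False


def isa_enabled(isa, target):
    # Mirrors only the "enabled" clauses that appear in the patch.
    if isa == "*":
        return True
    if isa == "p7p8":
        return target.vsx and not target.p9_vector
    if isa == "p10":
        return target.power10
    raise ValueError(f"isa value not modelled: {isa}")


# (output template, isa) for two register-copy alternatives, listed in
# the order they appear in the patched pattern.
FPR_COPY_ALTERNATIVES = [
    ("xxlor %x0,%x1,%x1", "p7p8"),
    ("fmr %0,%1", "*"),
]


def first_enabled(alternatives, target):
    # Assumption for this sketch: the earliest enabled alternative wins.
    for template, isa in alternatives:
        if isa_enabled(isa, target):
            return template
    return None


if __name__ == "__main__":
    power8 = Target(vsx=True, p9_vector=False)
    power9 = Target(vsx=True, p9_vector=True)
    print("power8:", first_enabled(FPR_COPY_ALTERNATIVES, power8))
    print("power9:", first_enabled(FPR_COPY_ALTERNATIVES, power9))
```

Under these assumptions the xxlor alternative is disabled as soon as TARGET_P9_VECTOR is set, so the later fmr alternative is the one left for power9 and power10, while power7 and power8 with VSX see xxlor first.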