From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <629b6d20-5db0-894f-6e28-1fca119c0b1a@linux.ibm.com>
Date: Sat, 25 Feb 2023 13:12:26 +0530
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:102.0)
 Gecko/20100101 Thunderbird/102.8.0
Subject: Re: [PATCH] rs6000: fmr gets used instead of faster xxlor [PR93571]
Content-Language: en-US
To: Segher Boessenkool
Cc: gcc-patches, bergner@linux.ibm.com
References: <16fa34b8-ad8a-20f2-b285-3b3f5bf5d5b2@linux.ibm.com>
 <20230217172319.GL25951@gate.crashing.org>
 <5f800afe-f162-0d56-3c78-6ee93610e201@linux.ibm.com>
 <20230221110448.GP25951@gate.crashing.org>
 <46ec2582-b3a8-52c9-584a-9d282d26fb79@linux.ibm.com>
 <20230221140927.GQ25951@gate.crashing.org>
 <9099872d-8cd3-c610-5ae0-2484df6572e7@linux.ibm.com>
 <84050fc8-c833-0b46-ce6c-6f0dd9869ba6@linux.ibm.com>
 <20230224151155.GX25951@gate.crashing.org>
From: Ajit Agarwal
In-Reply-To: <20230224151155.GX25951@gate.crashing.org>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit

Hello Segher:

On 24/02/23 8:41 pm, Segher Boessenkool wrote:
> Hi!
>
> For future patches: please don't send patches as replies to existing
> threads.  Just start a new thread for a new patch (series).  You can
> mark it as [PATCH v2] in the subject, if you want.
>
> On Fri, Feb 24, 2023 at 01:41:49PM +0530, Ajit Agarwal wrote:
>> Here is the patch that uses xxlor instead of fmr where possible.
>> Performance results show that fmr is better in power9 and
>> power10 architectures whereas xxlor is better in power7 and
>> power8 architectures.
>
> And fmr is the only option before p7.
>
>> rs6000: Use xxlor instead of fmr where possible
>>
>> This patch replaces fmr with xxlor instruction for power7
>> and power8 architectures whereas for power9 and power10
>> replaces xxlor with fmr instruction.
>
> Saying "this patch" in a commit message reads strangely.  Just "Replace
> fmr with" etc.?
>
I will correct this.

> The second part is just wrong, you cannot replace xxlor by fmr in
> general.
>
>> Perf measurement results:
>>
>> Power9 fmr: 201,847,661 cycles.
>> Power9 xxlor: 201,877,78 cycles.
>> Power8 fmr: 201,057,795 cycles.
>> Power8 xxlor: 201,004,671 cycles.
>
> What is this measuring?  100M insns back-to-back, each dependent on the
> previous one?
>
Yes.

> What are the results on p7 and p10?
>
> These numbers show there is no difference on p8 either.  Did you paste
> the wrong numbers maybe?
>
I will measure it again and update with a new patch.

>> 	* config/rs6000/rs6000.md (*movdf_hardfloat64): Use xxlor
>> 	for power7 and power8 and fmr for power9 and power10.
>
> Please don't break lines early.  Changelog lines can be 80 columns
> wide, just like source code lines.
>
>> --- a/gcc/config/rs6000/rs6000.md
>> +++ b/gcc/config/rs6000/rs6000.md
>> @@ -354,7 +354,7 @@ (define_attr "cpu"
>>    (const (symbol_ref "(enum attr_cpu) rs6000_tune")))
>>
>>  ;; The ISA we implement.
>> -(define_attr "isa" "any,p5,p6,p7,p7v,p8v,p9,p9v,p9kf,p9tf,p10"
>> +(define_attr "isa" "any,p5,p6,p7,p7v,p8v,p9,p9v,p9kf,p9tf,p7p8,p10"
>
> p78v, and sort it after p8v please.
>
>> +     (and (eq_attr "isa" "p7p8")
>> +          (match_test "TARGET_VSX && !TARGET_P9_VECTOR"))
>> +     (const_int 1)
>
> Okay.
>
>>  (define_insn "*mov_hardfloat64"
>>    [(set (match_operand:FMOVE64 0 "nonimmediate_operand"
>> -           "=m, d, d, , wY,
>> -             , Z, , , !r,
>> -             YZ, r, !r, *c*l, !r,
>> -            *h, r, , wa")
>> +           "=m, d, , , wY,
>> +             , Z, wa, , !r,
>> +             YZ, r, !r, *c*l, !r,
>> +            *h, r, , d, wn,
>> +            wa")
>>         (match_operand:FMOVE64 1 "input_operand"
>
> (You posted this mail as wrapping.  That means the patch cannot be
> applied non-manually, and that replies to your mail will be mangled.
> Just get a Real mail client, and configure it correctly :-) )
>
I am using Thunderbird as my mail client and the settings are all
correct. I have set mailnews.wrapLength to 0.

>> -            "d, m, d, wY, ,
>> -             Z, , , , ,
>> +            "d, m, , wY, ,
>> +             Z, , wa, , ,
>>               r, YZ, r, r, *h,
>> -             0, , r, eP"))]
>> +             0, , r, d, wn,
>> +             eP"))]
>
> No.  It is impossible to figure out what you changed here by just
> reading it.
>
> There is no requirement there should be exactly five alternatives per
> line, and/or that there should be the same number everywhere.
>
> If the indentation was incorrect, and you want to fix that, do that in a
> separate *earlier* patch in the series, please.
>
I will keep the indentation the same.
>>    "TARGET_POWERPC64 && TARGET_HARD_FLOAT
>>     && (gpc_reg_operand (operands[0], mode)
>>         || gpc_reg_operand (operands[1], mode))"
>>    "@
>>     stfd%U0%X0 %1,%0
>>     lfd%U1%X1 %0,%1
>> -   fmr %0,%1
>> +   xxlor %x0,%x1,%x1
>>     lxsd %0,%1
>>     stxsd %1,%0
>>     lxsdx %x0,%y1
>>     stxsdx %x1,%y0
>> -   xxlor %x0,%x1,%x1
>> +   fmr %0,%1
>>     xxlxor %x0,%x0,%x0
>>     li %0,0
>>     std%U0%X0 %1,%0
>> @@ -8467,23 +8474,28 @@ (define_insn "*mov_hardfloat64"
>>     nop
>>     mfvsrd %0,%x1
>>     mtvsrd %x0,%1
>> +   fmr %0,%1
>> +   fmr %0,%1
>>     #"
>>    [(set_attr "type"
>> -            "fpstore, fpload, fpsimple, fpload, fpstore,
>> +            "fpstore, fpload, veclogical, fpload, fpstore,
>>               fpload, fpstore, veclogical, veclogical, integer,
>>               store, load, *, mtjmpr, mfjmpr,
>> -             *, mfvsr, mtvsr, vecperm")
>> +             *, mfvsr, mtvsr, fpsimple, fpsimple,
>> +             vecperm")
>>     (set_attr "size" "64")
>>     (set_attr "isa"
>> -            "*, *, *, p9v, p9v,
>> -             p7v, p7v, *, *, *,
>> -             *, *, *, *, *,
>> -             *, p8v, p8v, p10")
>> +            "*, *, p7p8, p9v, p9v,
>> +             p7v, p7v, *, *, *,
>> +             *, *, *, *, *,
>> +             *, p8v, p8v, *, *,
>> +             p10")
>
> So, you swapped the xxlor and fmr entries, and added two extra fmr
> entries at the end?!
>
I have moved the xxlor alternative (isa "p7p8") ahead of the fmr
alternatives (isa "*"). The xxlor alternative with "p7p8" comes first,
then the wa alternative with fmr and "*", and the d and wn alternatives
with fmr and "*" are at the end.

>
>
> Segher

Thanks & Regards
Ajit
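
To illustrate the selection the "p7p8" isa value is meant to give, here
is a minimal Python sketch. It is only a model, not GCC code. The names
Target, isa_enabled, pick_move and TARGETS are hypothetical, and it
makes two loud assumptions: that the first enabled register-to-register
alternative is the one chosen (the real register allocator weighs costs
and constraints), and that each -mcpu value sets VSX and P9_VECTOR as
listed in the table.

```python
# Hypothetical model of how the "isa" attribute gates alternatives.
# This is an illustration only; none of these names exist in GCC.
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    vsx: bool        # stands in for TARGET_VSX
    p9_vector: bool  # stands in for TARGET_P9_VECTOR


def isa_enabled(isa: str, target: Target) -> bool:
    """Return True if an alternative tagged with `isa` is usable."""
    if isa == "*":
        return True  # no ISA restriction
    if isa == "p7p8":
        # Condition taken from the patch: TARGET_VSX && !TARGET_P9_VECTOR
        return target.vsx and not target.p9_vector
    raise ValueError(f"isa value not modelled: {isa}")


# Register-to-register alternatives in the order the patch lists them.
ALTERNATIVES = [("xxlor", "p7p8"), ("fmr", "*")]


def pick_move(target: Target) -> str:
    """Assumption: the first enabled alternative wins."""
    for mnemonic, isa in ALTERNATIVES:
        if isa_enabled(isa, target):
            return mnemonic
    raise RuntimeError("no enabled alternative")


# Assumed default flags per CPU; real settings depend on -m options.
TARGETS = {
    "power6": Target(vsx=False, p9_vector=False),
    "power7": Target(vsx=True, p9_vector=False),
    "power8": Target(vsx=True, p9_vector=False),
    "power9": Target(vsx=True, p9_vector=True),
    "power10": Target(vsx=True, p9_vector=True),
}

if __name__ == "__main__":
    for name, target in TARGETS.items():
        print(name, pick_move(target))
```

Under these assumptions the model picks xxlor only on power7 and
power8 and falls back to fmr everywhere else, including before p7
where fmr is the only option.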