From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from gate.crashing.org (gate.crashing.org [63.228.1.57]) by sourceware.org (Postfix) with ESMTP id E662B385840F for ; Fri, 24 Feb 2023 15:13:09 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org E662B385840F Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=kernel.crashing.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=kernel.crashing.org Received: from gate.crashing.org (localhost.localdomain [127.0.0.1]) by gate.crashing.org (8.14.1/8.14.1) with ESMTP id 31OFBuPJ008888; Fri, 24 Feb 2023 09:11:56 -0600 Received: (from segher@localhost) by gate.crashing.org (8.14.1/8.14.1/Submit) id 31OFBtCV008881; Fri, 24 Feb 2023 09:11:56 -0600 X-Authentication-Warning: gate.crashing.org: segher set sender to segher@kernel.crashing.org using -f Date: Fri, 24 Feb 2023 09:11:55 -0600 From: Segher Boessenkool To: Ajit Agarwal Cc: gcc-patches , bergner@linux.ibm.com Subject: Re: [PATCH] rs6000: fmr gets used instead of faster xxlor [PR93571] Message-ID: <20230224151155.GX25951@gate.crashing.org> References: <16fa34b8-ad8a-20f2-b285-3b3f5bf5d5b2@linux.ibm.com> <20230217172319.GL25951@gate.crashing.org> <5f800afe-f162-0d56-3c78-6ee93610e201@linux.ibm.com> <20230221110448.GP25951@gate.crashing.org> <46ec2582-b3a8-52c9-584a-9d282d26fb79@linux.ibm.com> <20230221140927.GQ25951@gate.crashing.org> <9099872d-8cd3-c610-5ae0-2484df6572e7@linux.ibm.com> <84050fc8-c833-0b46-ce6c-6f0dd9869ba6@linux.ibm.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <84050fc8-c833-0b46-ce6c-6f0dd9869ba6@linux.ibm.com> User-Agent: Mutt/1.4.2.3i X-Spam-Status: No, score=-2.9 required=5.0 tests=BAYES_00,JMQ_SPF_NEUTRAL,KAM_DMARC_STATUS,SPF_HELO_PASS,SPF_PASS,TXREP autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Hi! For future patches: please don't send patches as replies to existing threads. Just start a new thread for a new patch (series). You can mark it as [PATCH v2] in the subject, if you want. On Fri, Feb 24, 2023 at 01:41:49PM +0530, Ajit Agarwal wrote: > Here is the patch that uses xxlor instead of fmr where possible. > Performance results shows that fmr is better in power9 and > power10 architectures whereas xxlor is better in power7 and > power 8 architectures. And fmr is the only option before p7. > rs6000: Use xxlor instead of fmr where possible > > This patch replaces fmr with xxlor instruction for power7 > and power8 architectures whereas for power9 and power10 > replaces xxlor with fmr instruction. Saying "this patch" in a commit message reads strangely. Just "Replace fmr with" etc.? The second part is just wrong, you cannot replace xxlor by fmr in general. > Perf measurement results: > > Power9 fmr: 201,847,661 cycles. > Power9 xxlor: 201,877,78 cycles. > Power8 fmr: 201,057,795 cycles. > Power8 xxlor: 201,004,671 cycles. What is this measuring? 100M insns back-to-back, each dependent on the previous one? What are the results on p7 and p10? These numbers show there is no difference on p8 either. Did you paste the wrong numbers maybe? > * config/rs6000/rs6000.md (*movdf_hardfloat64): Use xxlor > for power7 and power8 and fmr for power9 and power10. Please don't break lines early. Changelogs lines can be 80 columns wide, just like source code lines. > --- a/gcc/config/rs6000/rs6000.md > +++ b/gcc/config/rs6000/rs6000.md > @@ -354,7 +354,7 @@ (define_attr "cpu" > (const (symbol_ref "(enum attr_cpu) rs6000_tune"))) > > ;; The ISA we implement. > -(define_attr "isa" "any,p5,p6,p7,p7v,p8v,p9,p9v,p9kf,p9tf,p10" > +(define_attr "isa" "any,p5,p6,p7,p7v,p8v,p9,p9v,p9kf,p9tf,p7p8,p10" p78v, and sort it after p8v please. > + (and (eq_attr "isa" "p7p8") > + (match_test "TARGET_VSX && !TARGET_P9_VECTOR")) > + (const_int 1) Okay. > (define_insn "*mov_hardfloat64" > [(set (match_operand:FMOVE64 0 "nonimmediate_operand" > - "=m, d, d, , wY, > - , Z, , , !r, > - YZ, r, !r, *c*l, !r, > - *h, r, , wa") > + "=m, d, , , wY, > + , Z, wa, , !r, > + YZ, r, !r, *c*l, !r, > + *h, r, , d, wn, > + wa") > (match_operand:FMOVE64 1 "input_operand" (You posted this mail as wrapping. That means the patch cannot be applied non-manually, and that replies to your mail will be mangled. Just get a Real mail client, and configure it correctly :-) ) > - "d, m, d, wY, , > - Z, , , , , > + "d, m, , wY, , > + Z, , wa, , , > r, YZ, r, r, *h, > - 0, , r, eP"))] > + 0, , r, d, wn, > + eP"))] No. It is impossible to figure out what you changed here by just reading it. There is no requirement there should be exactly five alternatives per line, and/or that there should be the same number everywhere. If the indentation was incorrect, and you want to fix that, do that in a separate *earlier* patch in the series, please. > "TARGET_POWERPC64 && TARGET_HARD_FLOAT > && (gpc_reg_operand (operands[0], mode) > || gpc_reg_operand (operands[1], mode))" > "@ > stfd%U0%X0 %1,%0 > lfd%U1%X1 %0,%1 > - fmr %0,%1 > + xxlor %x0,%x1,%x1 > lxsd %0,%1 > stxsd %1,%0 > lxsdx %x0,%y1 > stxsdx %x1,%y0 > - xxlor %x0,%x1,%x1 > + fmr %0,%1 > xxlxor %x0,%x0,%x0 > li %0,0 > std%U0%X0 %1,%0 > @@ -8467,23 +8474,28 @@ (define_insn "*mov_hardfloat64" > nop > mfvsrd %0,%x1 > mtvsrd %x0,%1 > + fmr %0,%1 > + fmr %0,%1 > #" > [(set_attr "type" > - "fpstore, fpload, fpsimple, fpload, fpstore, > + "fpstore, fpload, veclogical, fpload, fpstore, > fpload, fpstore, veclogical, veclogical, integer, > store, load, *, mtjmpr, mfjmpr, > - *, mfvsr, mtvsr, vecperm") > + *, mfvsr, mtvsr, fpsimple, fpsimple, > + vecperm") > (set_attr "size" "64") > (set_attr "isa" > - "*, *, *, p9v, p9v, > - p7v, p7v, *, *, *, > - *, *, *, *, *, > - *, p8v, p8v, p10") > + "*, *, p7p8, p9v, p9v, > + p7v, p7v, *, *, *, > + *, *, *, *, *, > + *, p8v, p8v, *, *, > + p10") So, you swapped the xxlor and fmr entries, and added two nextra fmr entries at the end?! Segher