From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <rep.dot.nop@gmail.com>
Received: from mail-ej1-x62d.google.com (mail-ej1-x62d.google.com
 [IPv6:2a00:1450:4864:20::62d])
 by sourceware.org (Postfix) with ESMTPS id 1265239DC4E0
 for <gcc-patches@gcc.gnu.org>; Fri,  9 Apr 2021 19:20:47 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 1265239DC4E0
Received: by mail-ej1-x62d.google.com with SMTP id n2so10323753ejy.7
 for <gcc-patches@gcc.gnu.org>; Fri, 09 Apr 2021 12:20:47 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:date:from:to:cc:subject:message-id:in-reply-to
 :references:mime-version:content-transfer-encoding;
 bh=I5thpx8PdkDByuOPKkWsYx85FONaNv4gkSeirPLYWZc=;
 b=IRO13Au0iWWADaUhSUk4kysF3+9y3HOX3g3jtVUaqRhiDWjtwbuYqBFR8W6N0nrOgE
 KqqyEa/hAKuAWRgvIsU3ON16Q1dFkZwBFzLLeyyaHcfa8Pt5x0TrkJk+03PTheSF6xsc
 ecSeTed4aah27HtYqbFj0flEaNHTUdr1W/NC0Qyz3ZvYTeiAygLYACIef8acCHOj8GO/
 g3EMEKJeDm88mBG/GnmtE0e0DF4QuCfaHFnNr5zZtqQjQIE7VQfOjJ787FxW5M1X+1B/
 NF4WHLtCsr6zmNJQ6+BWCsv6iHYwj3rcdo8jofDahADjFH58Mu/D4HYo9A4Lg2ooG0sa
 2pWw==
X-Gm-Message-State: AOAM531ehneTZdyRhcbqCBXK6nC8ey8TQqb1jBgfEFZkxsqmdmcpIbX5
 u9O7P9eS05w1E8QmlXs3dUJK4qNabvY=
X-Google-Smtp-Source: ABdhPJyHn73Jj6iek9CWRg1cn/5ExMYvkdXB4QA4561l4sIDhRWxE5ssjGE/0zPdVnuWlvgEZuX6aA==
X-Received: by 2002:a17:906:dfd6:: with SMTP id
 jt22mr1275408ejc.161.1617996046025; 
 Fri, 09 Apr 2021 12:20:46 -0700 (PDT)
Received: from nbbrfq (193-154-143-202.adsl.highway.telekom.at.
 [193.154.143.202])
 by smtp.gmail.com with ESMTPSA id t15sm2034922edw.84.2021.04.09.12.20.45
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Fri, 09 Apr 2021 12:20:45 -0700 (PDT)
Date: Fri, 9 Apr 2021 21:20:32 +0200
From: Bernhard Reutner-Fischer <rep.dot.nop@gmail.com>
To: Michael Meissner <meissner@linux.ibm.com>
Cc: rep.dot.nop@gmail.com, will schmidt via Gcc-patches
 <gcc-patches@gcc.gnu.org>, will schmidt <will_schmidt@vnet.ibm.com>, Segher
 Boessenkool <segher@kernel.crashing.org>, David Edelsohn
 <dje.gcc@gmail.com>, Bill Schmidt <wschmidt@linux.ibm.com>, Peter Bergner
 <bergner@linux.ibm.com>
Subject: Re: [PATCH 2/2] Add IEEE 128-bit min/max support on PowerPC
Message-ID: <20210409212032.1caa0552@nbbrfq>
In-Reply-To: <bd9a3856a3346da63e99c25efd3bb7cbc52f298b.camel@vnet.ibm.com>
References: <20210409143800.GA12782@ibm-toto.the-meissners.org>
 <20210409144358.GB14459@ibm-toto.the-meissners.org>
 <bd9a3856a3346da63e99c25efd3bb7cbc52f298b.camel@vnet.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-Spam-Status: No, score=-10.3 required=5.0 tests=BAYES_00, DKIM_SIGNED,
 DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0,
 RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS,
 TXREP autolearn=ham autolearn_force=no version=3.4.2
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Fri, 09 Apr 2021 19:20:50 -0000

On Fri, 09 Apr 2021 11:54:59 -0500
will schmidt via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:

> On Fri, 2021-04-09 at 10:43 -0400, Michael Meissner wrote:

> > gcc/
> > 2021-04-09 Michael Meissner  <meissner@linux.ibm.com>

> >         (mov<FPMASK:mode><FPMASK2:mode>cc_fpmask): Replace
> >         mov<SFDF:mode><SFDF2:mode>cc_p9.  Add IEEE 128-bit fp support.
> >         (mov<FPMASK:mode><FPMASK2:mode>cc_invert_fpmask): Replace
> >         mov<SFDF:mode><SFDF2:mode>cc_invert_p9.  Add IEEE 128-bit fp
> >         support.
> >         (fpmask<mode>): Add IEEE 128-bit fp support.  Enable generator to
> >         build te RTL.
> >         (xxsel<mode>): Add IEEE 128-bit fp support.  Enable generator to
> >         build te RTL.  
> ok

te? the?

> > diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> > index 17b2fdc1cdd..ca4a4d01f05 100644
> > --- a/gcc/config/rs6000/rs6000.md
> > +++ b/gcc/config/rs6000/rs6000.md


> >  ;; Handle inverting the fpmask comparisons.
> > -(define_insn_and_split "*mov<SFDF:mode><SFDF2:mode>cc_invert_p9"
> > -  [(set (match_operand:SFDF 0 "vsx_register_operand" "=&<SFDF:Fv>,<SFDF:Fv>")
> > -	(if_then_else:SFDF
> > +(define_insn_and_split "*mov<FPMASK:mode><FPMASK2:mode>cc_invert_fpmask"
> > +  [(set (match_operand:FPMASK 0 "vsx_register_operand" "=<FPMASK:Fv>")
> > +	(if_then_else:FPMASK
> >  	 (match_operator:CCFP 1 "invert_fpmask_comparison_operator"
> > -		[(match_operand:SFDF2 2 "vsx_register_operand" "<SFDF2:Fv>,<SFDF2:Fv>")
> > -		 (match_operand:SFDF2 3 "vsx_register_operand" "<SFDF2:Fv>,<SFDF2:Fv>")])
> > -	 (match_operand:SFDF 4 "vsx_register_operand" "<SFDF:Fv>,<SFDF:Fv>")
> > -	 (match_operand:SFDF 5 "vsx_register_operand" "<SFDF:Fv>,<SFDF:Fv>")))
> > -   (clobber (match_scratch:V2DI 6 "=0,&wa"))]
> > +	    [(match_operand:FPMASK2 2 "vsx_register_operand" "<FPMASK2:Fv>")
> > +	     (match_operand:FPMASK2 3 "vsx_register_operand" "<FPMASK2:Fv>")])
> > +	 (match_operand:FPMASK 4 "vsx_register_operand" "<FPMASK:Fv>")
> > +	 (match_operand:FPMASK 5 "vsx_register_operand" "<FPMASK:Fv>")))
> > +   (clobber (match_scratch:V2DI 6 "=&<FPMASK2:Fv>"))]
> >    "TARGET_P9_MINMAX"
> >    "#"
> >    "&& 1"
> > -  [(set (match_dup 6)
> > -	(if_then_else:V2DI (match_dup 9)
> > -			   (match_dup 7)
> > -			   (match_dup 8)))
> > -   (set (match_dup 0)
> > -	(if_then_else:SFDF (ne (match_dup 6)
> > -			       (match_dup 8))
> > -			   (match_dup 5)
> > -			   (match_dup 4)))]
> > +  [(pc)]
> >  {
> > -  rtx op1 = operands[1];
> > -  enum rtx_code cond = reverse_condition_maybe_unordered (GET_CODE (op1));
> > +  rtx dest = operands[0];
> > +  rtx old_cmp = operands[1];
> > +  rtx cmp_op1 = operands[2];
> > +  rtx cmp_op2 = operands[3];
> > +  enum rtx_code cond = reverse_condition_maybe_unordered (GET_CODE (old_cmp));

I think you can drop the enum keyword.
Didn't read the pattern in detail.

> > +  rtx cmp_rev = gen_rtx_fmt_ee (cond, CCFPmode, cmp_op1, cmp_op2);
> > +  rtx move_f = operands[4];
> > +  rtx move_t = operands[5];
> > +  rtx mask_reg = operands[6];
> > +  rtx mask_m1 = CONSTM1_RTX (V2DImode);
> > +  rtx mask_0 = CONST0_RTX (V2DImode);
> > +  machine_mode move_mode = <FPMASK:MODE>mode;
> > +  machine_mode compare_mode = <FPMASK2:MODE>mode;
> > 
> > -  if (GET_CODE (operands[6]) == SCRATCH)
> > -    operands[6] = gen_reg_rtx (V2DImode);
> > +  if (GET_CODE (mask_reg) == SCRATCH)
> > +    mask_reg = gen_reg_rtx (V2DImode);
> > 
> > -  operands[7] = CONSTM1_RTX (V2DImode);
> > -  operands[8] = CONST0_RTX (V2DImode);
> > +  /* Emit the compare and set mask instruction.  */
> > +  emit_insn (gen_fpmask<FPMASK2:mode> (mask_reg, cmp_rev, cmp_op1, cmp_op2,
> > +				       mask_m1, mask_0));
> > 
> > -  operands[9] = gen_rtx_fmt_ee (cond, CCFPmode, operands[2], operands[3]);
> > +  /* If we have a 64-bit comparison, but an 128-bit move, we need to extend the
> > +     mask.  Because we are using the splat builtin to extend the V2DImode, we
> > +     need to use element 1 on little endian systems.  */
> > +  if (!FLOAT128_IEEE_P (compare_mode) && FLOAT128_IEEE_P (move_mode))
> > +    {
> > +      rtx element = WORDS_BIG_ENDIAN ? const0_rtx : const1_rtx;
> > +      emit_insn (gen_vsx_xxspltd_v2di (mask_reg, mask_reg, element));
> > +    }
> > +
> > +  /* Now emit the XXSEL insn.  */
> > +  emit_insn (gen_xxsel<FPMASK:mode> (dest, mask_reg, mask_0, move_t, move_f));
> > +  DONE;
> >  }
> > - [(set_attr "length" "8")
> > + ;; length is 12 in case we need to add XXPERMDI
> > + [(set_attr "length" "12")
> >    (set_attr "type" "vecperm")])
> > 
> > -(define_insn "*fpmask<mode>"
> > -  [(set (match_operand:V2DI 0 "vsx_register_operand" "=wa")
> > +(define_insn "fpmask<mode>"
> > +  [(set (match_operand:V2DI 0 "vsx_register_operand" "=<Fv>")
> >  	(if_then_else:V2DI
> >  	 (match_operator:CCFP 1 "fpmask_comparison_operator"
> > -		[(match_operand:SFDF 2 "vsx_register_operand" "<Fv>")
> > -		 (match_operand:SFDF 3 "vsx_register_operand" "<Fv>")])
> > +		[(match_operand:FPMASK 2 "vsx_register_operand" "<Fv>")
> > +		 (match_operand:FPMASK 3 "vsx_register_operand" "<Fv>")])
> >  	 (match_operand:V2DI 4 "all_ones_constant" "")
> >  	 (match_operand:V2DI 5 "zero_constant" "")))]
> >    "TARGET_P9_MINMAX"
> > -  "xscmp%V1dp %x0,%x2,%x3"
> > +{
> > +  return (FLOAT128_IEEE_P (<MODE>mode)
> > +	  ? "xscmp%V1qp %0,%2,%3"
> > +	  : "xscmp%V1dp %x0,%x2,%x3");
> > +}
> >    [(set_attr "type" "fpcompare")])
> > 
> > -(define_insn "*xxsel<mode>"
> > -  [(set (match_operand:SFDF 0 "vsx_register_operand" "=<Fv>")
> > -	(if_then_else:SFDF (ne (match_operand:V2DI 1 "vsx_register_operand" "wa")
> > -			       (match_operand:V2DI 2 "zero_constant" ""))
> > -			   (match_operand:SFDF 3 "vsx_register_operand" "<Fv>")
> > -			   (match_operand:SFDF 4 "vsx_register_operand" "<Fv>")))]
> > +(define_insn "xxsel<mode>"
> > +  [(set (match_operand:FPMASK 0 "vsx_register_operand" "=wa")
> > +	(if_then_else:FPMASK
> > +	 (ne (match_operand:V2DI 1 "vsx_register_operand" "wa")
> > +	     (match_operand:V2DI 2 "zero_constant" ""))
> > +	 (match_operand:FPMASK 3 "vsx_register_operand" "wa")
> > +	 (match_operand:FPMASK 4 "vsx_register_operand" "wa")))]
> >    "TARGET_P9_MINMAX"
> >    "xxsel %x0,%x4,%x3,%x1"
> >    [(set_attr "type" "vecmove")])
> > diff --git a/gcc/testsuite/gcc.target/powerpc/float128-cmove.c b/gcc/testsuite/gcc.target/powerpc/float128-cmove.c
> > new file mode 100644
> > index 00000000000..639d5a77087
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/powerpc/float128-cmove.c
> > @@ -0,0 +1,93 @@
> > +/* { dg-do compile } */
> > +/* { dg-require-effective-target ppc_float128_hw } */
> > +/* { dg-require-effective-target power10_ok } */
> > +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
> > +/* { dg-final { scan-assembler     {\mxscmpeq[dq]p\M} } } */
> > +/* { dg-final { scan-assembler     {\mxxpermdi\M}     } } */
> > +/* { dg-final { scan-assembler     {\mxxsel\M}        } } */
> > +/* { dg-final { scan-assembler-not {\mxscmpu[dq]p\M}  } } */
> > +/* { dg-final { scan-assembler-not {\mfcmp[uo]\M}     } } */
> > +/* { dg-final { scan-assembler-not {\mfsel\M}         } } */

I'd have expected scan-assembler-times fwiw.

> > +
> > +/* This series of tests tests whether you can do a conditional move where the
> > +   test is one floating point type, and the result is another floating point
> > +   type.
> > +
> > +   If the comparison type is SF/DFmode, and the move type is IEEE 128-bit
> > +   floating point, we have to duplicate the mask in the lower 64-bits with
> > +   XXPERMDI because XSCMPEQDP clears the bottom 64-bits of the mask register.
> > +
> > +   Going the other way (IEEE 128-bit comparsion, 64-bit move) is fine as the
> > +   mask word will be 128-bits.  */
> > +
> > +float
> > +eq_f_d (float a, float b, double x, double y)
> > +{
> > +  return (x == y) ? a : b;
> > +}
> > +
> > +double
> > +eq_d_f (double a, double b, float x, float y)
> > +{
> > +  return (x == y) ? a : b;
> > +}
> > +
> > +float
> > +eq_f_f128 (float a, float b, __float128 x, __float128 y)
> > +{
> > +  return (x == y) ? a : b;
> > +}
> > +
> > +double
> > +eq_d_f128 (double a, double b, __float128 x, __float128 y)
> > +{
> > +  return (x == y) ? a : b;
> > +}
> > +
> > +__float128
> > +eq_f128_f (__float128 a, __float128 b, float x, float y)
> > +{
> > +  return (x == y) ? a : b;
> > +}
> > +
> > +__float128
> > +eq_f128_d (__float128 a, __float128 b, double x, double y)
> > +{
> > +  return (x != y) ? a : b;
> > +}

I would think the above should be == since it's named eq_ and
the body would be redundant to ne_f128_d below as is.

thanks,

> > +
> > +float
> > +ne_f_d (float a, float b, double x, double y)
> > +{
> > +  return (x != y) ? a : b;
> > +}
> > +
> > +double
> > +ne_d_f (double a, double b, float x, float y)
> > +{
> > +  return (x != y) ? a : b;
> > +}
> > +
> > +float
> > +ne_f_f128 (float a, float b, __float128 x, __float128 y)
> > +{
> > +  return (x != y) ? a : b;
> > +}
> > +
> > +double
> > +ne_d_f128 (double a, double b, __float128 x, __float128 y)
> > +{
> > +  return (x != y) ? a : b;
> > +}
> > +
> > +__float128
> > +ne_f128_f (__float128 a, __float128 b, float x, float y)
> > +{
> > +  return (x != y) ? a : b;
> > +}
> > +
> > +__float128
> > +ne_f128_d (__float128 a, __float128 b, double x, double y)
> > +{
> > +  return (x != y) ? a : b;
> > +}