From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ej1-x62d.google.com (mail-ej1-x62d.google.com [IPv6:2a00:1450:4864:20::62d]) by sourceware.org (Postfix) with ESMTPS id 1265239DC4E0 for ; Fri, 9 Apr 2021 19:20:47 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 1265239DC4E0 Received: by mail-ej1-x62d.google.com with SMTP id n2so10323753ejy.7 for ; Fri, 09 Apr 2021 12:20:47 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=I5thpx8PdkDByuOPKkWsYx85FONaNv4gkSeirPLYWZc=; b=IRO13Au0iWWADaUhSUk4kysF3+9y3HOX3g3jtVUaqRhiDWjtwbuYqBFR8W6N0nrOgE KqqyEa/hAKuAWRgvIsU3ON16Q1dFkZwBFzLLeyyaHcfa8Pt5x0TrkJk+03PTheSF6xsc ecSeTed4aah27HtYqbFj0flEaNHTUdr1W/NC0Qyz3ZvYTeiAygLYACIef8acCHOj8GO/ g3EMEKJeDm88mBG/GnmtE0e0DF4QuCfaHFnNr5zZtqQjQIE7VQfOjJ787FxW5M1X+1B/ NF4WHLtCsr6zmNJQ6+BWCsv6iHYwj3rcdo8jofDahADjFH58Mu/D4HYo9A4Lg2ooG0sa 2pWw== X-Gm-Message-State: AOAM531ehneTZdyRhcbqCBXK6nC8ey8TQqb1jBgfEFZkxsqmdmcpIbX5 u9O7P9eS05w1E8QmlXs3dUJK4qNabvY= X-Google-Smtp-Source: ABdhPJyHn73Jj6iek9CWRg1cn/5ExMYvkdXB4QA4561l4sIDhRWxE5ssjGE/0zPdVnuWlvgEZuX6aA== X-Received: by 2002:a17:906:dfd6:: with SMTP id jt22mr1275408ejc.161.1617996046025; Fri, 09 Apr 2021 12:20:46 -0700 (PDT) Received: from nbbrfq (193-154-143-202.adsl.highway.telekom.at. [193.154.143.202]) by smtp.gmail.com with ESMTPSA id t15sm2034922edw.84.2021.04.09.12.20.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 09 Apr 2021 12:20:45 -0700 (PDT) Date: Fri, 9 Apr 2021 21:20:32 +0200 From: Bernhard Reutner-Fischer To: Michael Meissner Cc: rep.dot.nop@gmail.com, will schmidt via Gcc-patches , will schmidt , Segher Boessenkool , David Edelsohn , Bill Schmidt , Peter Bergner Subject: Re: [PATCH 2/2] Add IEEE 128-bit min/max support on PowerPC Message-ID: <20210409212032.1caa0552@nbbrfq> In-Reply-To: References: <20210409143800.GA12782@ibm-toto.the-meissners.org> <20210409144358.GB14459@ibm-toto.the-meissners.org> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-10.3 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 09 Apr 2021 19:20:50 -0000 On Fri, 09 Apr 2021 11:54:59 -0500 will schmidt via Gcc-patches wrote: > On Fri, 2021-04-09 at 10:43 -0400, Michael Meissner wrote: > > gcc/ > > 2021-04-09 Michael Meissner > > (movcc_fpmask): Replace > > movcc_p9. Add IEEE 128-bit fp support. > > (movcc_invert_fpmask): Replace > > movcc_invert_p9. Add IEEE 128-bit fp > > support. > > (fpmask): Add IEEE 128-bit fp support. Enable generator to > > build te RTL. > > (xxsel): Add IEEE 128-bit fp support. Enable generator to > > build te RTL. > ok te? the? > > diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md > > index 17b2fdc1cdd..ca4a4d01f05 100644 > > --- a/gcc/config/rs6000/rs6000.md > > +++ b/gcc/config/rs6000/rs6000.md > > ;; Handle inverting the fpmask comparisons. > > -(define_insn_and_split "*movcc_invert_p9" > > - [(set (match_operand:SFDF 0 "vsx_register_operand" "=&,") > > - (if_then_else:SFDF > > +(define_insn_and_split "*movcc_invert_fpmask" > > + [(set (match_operand:FPMASK 0 "vsx_register_operand" "=") > > + (if_then_else:FPMASK > > (match_operator:CCFP 1 "invert_fpmask_comparison_operator" > > - [(match_operand:SFDF2 2 "vsx_register_operand" ",") > > - (match_operand:SFDF2 3 "vsx_register_operand" ",")]) > > - (match_operand:SFDF 4 "vsx_register_operand" ",") > > - (match_operand:SFDF 5 "vsx_register_operand" ","))) > > - (clobber (match_scratch:V2DI 6 "=0,&wa"))] > > + [(match_operand:FPMASK2 2 "vsx_register_operand" "") > > + (match_operand:FPMASK2 3 "vsx_register_operand" "")]) > > + (match_operand:FPMASK 4 "vsx_register_operand" "") > > + (match_operand:FPMASK 5 "vsx_register_operand" ""))) > > + (clobber (match_scratch:V2DI 6 "=&"))] > > "TARGET_P9_MINMAX" > > "#" > > "&& 1" > > - [(set (match_dup 6) > > - (if_then_else:V2DI (match_dup 9) > > - (match_dup 7) > > - (match_dup 8))) > > - (set (match_dup 0) > > - (if_then_else:SFDF (ne (match_dup 6) > > - (match_dup 8)) > > - (match_dup 5) > > - (match_dup 4)))] > > + [(pc)] > > { > > - rtx op1 = operands[1]; > > - enum rtx_code cond = reverse_condition_maybe_unordered (GET_CODE (op1)); > > + rtx dest = operands[0]; > > + rtx old_cmp = operands[1]; > > + rtx cmp_op1 = operands[2]; > > + rtx cmp_op2 = operands[3]; > > + enum rtx_code cond = reverse_condition_maybe_unordered (GET_CODE (old_cmp)); I think you can drop the enum keyword. Didn't read the pattern in detail. > > + rtx cmp_rev = gen_rtx_fmt_ee (cond, CCFPmode, cmp_op1, cmp_op2); > > + rtx move_f = operands[4]; > > + rtx move_t = operands[5]; > > + rtx mask_reg = operands[6]; > > + rtx mask_m1 = CONSTM1_RTX (V2DImode); > > + rtx mask_0 = CONST0_RTX (V2DImode); > > + machine_mode move_mode = mode; > > + machine_mode compare_mode = mode; > > > > - if (GET_CODE (operands[6]) == SCRATCH) > > - operands[6] = gen_reg_rtx (V2DImode); > > + if (GET_CODE (mask_reg) == SCRATCH) > > + mask_reg = gen_reg_rtx (V2DImode); > > > > - operands[7] = CONSTM1_RTX (V2DImode); > > - operands[8] = CONST0_RTX (V2DImode); > > + /* Emit the compare and set mask instruction. */ > > + emit_insn (gen_fpmask (mask_reg, cmp_rev, cmp_op1, cmp_op2, > > + mask_m1, mask_0)); > > > > - operands[9] = gen_rtx_fmt_ee (cond, CCFPmode, operands[2], operands[3]); > > + /* If we have a 64-bit comparison, but an 128-bit move, we need to extend the > > + mask. Because we are using the splat builtin to extend the V2DImode, we > > + need to use element 1 on little endian systems. */ > > + if (!FLOAT128_IEEE_P (compare_mode) && FLOAT128_IEEE_P (move_mode)) > > + { > > + rtx element = WORDS_BIG_ENDIAN ? const0_rtx : const1_rtx; > > + emit_insn (gen_vsx_xxspltd_v2di (mask_reg, mask_reg, element)); > > + } > > + > > + /* Now emit the XXSEL insn. */ > > + emit_insn (gen_xxsel (dest, mask_reg, mask_0, move_t, move_f)); > > + DONE; > > } > > - [(set_attr "length" "8") > > + ;; length is 12 in case we need to add XXPERMDI > > + [(set_attr "length" "12") > > (set_attr "type" "vecperm")]) > > > > -(define_insn "*fpmask" > > - [(set (match_operand:V2DI 0 "vsx_register_operand" "=wa") > > +(define_insn "fpmask" > > + [(set (match_operand:V2DI 0 "vsx_register_operand" "=") > > (if_then_else:V2DI > > (match_operator:CCFP 1 "fpmask_comparison_operator" > > - [(match_operand:SFDF 2 "vsx_register_operand" "") > > - (match_operand:SFDF 3 "vsx_register_operand" "")]) > > + [(match_operand:FPMASK 2 "vsx_register_operand" "") > > + (match_operand:FPMASK 3 "vsx_register_operand" "")]) > > (match_operand:V2DI 4 "all_ones_constant" "") > > (match_operand:V2DI 5 "zero_constant" "")))] > > "TARGET_P9_MINMAX" > > - "xscmp%V1dp %x0,%x2,%x3" > > +{ > > + return (FLOAT128_IEEE_P (mode) > > + ? "xscmp%V1qp %0,%2,%3" > > + : "xscmp%V1dp %x0,%x2,%x3"); > > +} > > [(set_attr "type" "fpcompare")]) > > > > -(define_insn "*xxsel" > > - [(set (match_operand:SFDF 0 "vsx_register_operand" "=") > > - (if_then_else:SFDF (ne (match_operand:V2DI 1 "vsx_register_operand" "wa") > > - (match_operand:V2DI 2 "zero_constant" "")) > > - (match_operand:SFDF 3 "vsx_register_operand" "") > > - (match_operand:SFDF 4 "vsx_register_operand" "")))] > > +(define_insn "xxsel" > > + [(set (match_operand:FPMASK 0 "vsx_register_operand" "=wa") > > + (if_then_else:FPMASK > > + (ne (match_operand:V2DI 1 "vsx_register_operand" "wa") > > + (match_operand:V2DI 2 "zero_constant" "")) > > + (match_operand:FPMASK 3 "vsx_register_operand" "wa") > > + (match_operand:FPMASK 4 "vsx_register_operand" "wa")))] > > "TARGET_P9_MINMAX" > > "xxsel %x0,%x4,%x3,%x1" > > [(set_attr "type" "vecmove")]) > > diff --git a/gcc/testsuite/gcc.target/powerpc/float128-cmove.c b/gcc/testsuite/gcc.target/powerpc/float128-cmove.c > > new file mode 100644 > > index 00000000000..639d5a77087 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/powerpc/float128-cmove.c > > @@ -0,0 +1,93 @@ > > +/* { dg-do compile } */ > > +/* { dg-require-effective-target ppc_float128_hw } */ > > +/* { dg-require-effective-target power10_ok } */ > > +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */ > > +/* { dg-final { scan-assembler {\mxscmpeq[dq]p\M} } } */ > > +/* { dg-final { scan-assembler {\mxxpermdi\M} } } */ > > +/* { dg-final { scan-assembler {\mxxsel\M} } } */ > > +/* { dg-final { scan-assembler-not {\mxscmpu[dq]p\M} } } */ > > +/* { dg-final { scan-assembler-not {\mfcmp[uo]\M} } } */ > > +/* { dg-final { scan-assembler-not {\mfsel\M} } } */ I'd have expected scan-assembler-times fwiw. > > + > > +/* This series of tests tests whether you can do a conditional move where the > > + test is one floating point type, and the result is another floating point > > + type. > > + > > + If the comparison type is SF/DFmode, and the move type is IEEE 128-bit > > + floating point, we have to duplicate the mask in the lower 64-bits with > > + XXPERMDI because XSCMPEQDP clears the bottom 64-bits of the mask register. > > + > > + Going the other way (IEEE 128-bit comparsion, 64-bit move) is fine as the > > + mask word will be 128-bits. */ > > + > > +float > > +eq_f_d (float a, float b, double x, double y) > > +{ > > + return (x == y) ? a : b; > > +} > > + > > +double > > +eq_d_f (double a, double b, float x, float y) > > +{ > > + return (x == y) ? a : b; > > +} > > + > > +float > > +eq_f_f128 (float a, float b, __float128 x, __float128 y) > > +{ > > + return (x == y) ? a : b; > > +} > > + > > +double > > +eq_d_f128 (double a, double b, __float128 x, __float128 y) > > +{ > > + return (x == y) ? a : b; > > +} > > + > > +__float128 > > +eq_f128_f (__float128 a, __float128 b, float x, float y) > > +{ > > + return (x == y) ? a : b; > > +} > > + > > +__float128 > > +eq_f128_d (__float128 a, __float128 b, double x, double y) > > +{ > > + return (x != y) ? a : b; > > +} I would think the above should be == since it's named eq_ and the body would be redundant to ne_f128_d below as is. thanks, > > + > > +float > > +ne_f_d (float a, float b, double x, double y) > > +{ > > + return (x != y) ? a : b; > > +} > > + > > +double > > +ne_d_f (double a, double b, float x, float y) > > +{ > > + return (x != y) ? a : b; > > +} > > + > > +float > > +ne_f_f128 (float a, float b, __float128 x, __float128 y) > > +{ > > + return (x != y) ? a : b; > > +} > > + > > +double > > +ne_d_f128 (double a, double b, __float128 x, __float128 y) > > +{ > > + return (x != y) ? a : b; > > +} > > + > > +__float128 > > +ne_f128_f (__float128 a, __float128 b, float x, float y) > > +{ > > + return (x != y) ? a : b; > > +} > > + > > +__float128 > > +ne_f128_d (__float128 a, __float128 b, double x, double y) > > +{ > > + return (x != y) ? a : b; > > +}