public inbox for gcc-patches@gcc.gnu.org
* [nvptx PATCH] Implement isfinite and isnormal optabs in nvptx.md.
@ 2024-07-27 18:18 Roger Sayle
  2024-09-27  8:51 ` Thomas Schwinge
  2024-09-27 13:45 ` Thomas Schwinge
  0 siblings, 2 replies; 3+ messages in thread
From: Roger Sayle @ 2024-07-27 18:18 UTC
  To: gcc-patches; +Cc: 'Thomas Schwinge', 'Tom de Vries'

[-- Attachment #1: Type: text/plain, Size: 3341 bytes --]


Firstly, thanks to Haochen Gui for recently adding optab support for
isfinite and isnormal to the middle-end.  This patch adds define_expand
for both these functions to the nvptx backend, which conveniently has
special instructions to simplify their implementation.  As this patch
adds UNSPEC_ISFINITE and UNSPEC_ISNORMAL, I've also taken the opportunity
to include/repost my tweak to clean-up/eliminate UNSPEC_COPYSIGN.

Previously, for isfinite, GCC on nvptx-none with -O2 would generate:

                mov.f64 %r26, %ar0;
                abs.f64 %r28, %r26;
                setp.gtu.f64    %r31, %r28, 0d7fefffffffffffff;
                selp.u32        %value, 0, 1, %r31;

and with this patch, we now generate:

                mov.f64 %r23, %ar0;
                testp.finite.f64        %r24, %r23;
                selp.u32        %value, 1, 0, %r24;
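
For illustration only (this is not part of the patch), the old sequence
can be modelled in C.  The sketch below assumes IEEE binary64 doubles and
the GCC/Clang builtins; the helper names double_from_bits,
isfinite_old_model and isfinite_self_check are hypothetical.  It shows
that 0d7fefffffffffffff is DBL_MAX and that the unordered "gtu" compare
also rejects NaN, so the old code computes the same result as
__builtin_isfinite.

```c
#include <assert.h>
#include <float.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 64-bit pattern as a double, like a PTX 0d... literal.  */
static double
double_from_bits (uint64_t bits)
{
  double d;
  memcpy (&d, &bits, sizeof d);
  return d;
}

/* Hypothetical model of the old sequence: abs.f64, then setp.gtu.f64
   against 0d7fefffffffffffff, then selp.u32 0, 1.  "gtu" is true when
   the first operand is greater than the second OR either is a NaN.  */
static int
isfinite_old_model (double x)
{
  double limit = double_from_bits (UINT64_C (0x7fefffffffffffff));
  double ax = __builtin_fabs (x);
  int gtu = !(ax <= limit);     /* True for +Inf and for NaN.  */
  return gtu ? 0 : 1;
}

/* Compare the model with the builtin on a few representative inputs.  */
static void
isfinite_self_check (void)
{
  const double inputs[] = { 0.0, -0.0, 1.0, -DBL_MAX, DBL_MAX,
                            INFINITY, -INFINITY, NAN };
  assert (double_from_bits (UINT64_C (0x7fefffffffffffff)) == DBL_MAX);
  for (size_t i = 0; i < sizeof inputs / sizeof inputs[0]; i++)
    assert (isfinite_old_model (inputs[i])
            == (__builtin_isfinite (inputs[i]) != 0));
}
```

Calling isfinite_self_check () from a test driver exercises the
comparison; the single testp.finite instruction replaces the abs/setp
pair modelled here.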

Previously, for isnormal, GCC -O2 would generate:

                mov.f64 %r28, %ar0;
                abs.f64 %r22, %r28;
                setp.gtu.f64    %r32, %r22, 0d7fefffffffffffff;
                setp.ltu.f64    %r35, %r22, 0d0010000000000000;
                or.pred %r43, %r35, %r32;
                selp.u32        %value, 0, 1, %r43;

and with this patch becomes:

                mov.f64 %r23, %ar0;
                setp.neu.f64    %r24, %r23, 0d0000000000000000;
                testp.normal.f64        %r25, %r23;
                and.pred        %r26, %r24, %r25;
                selp.u32        %value, 1, 0, %r26;

Notice that although nvptx provides a testp.normal.f{32,64} instruction,
the semantics don't quite match those required of libm [+0.0 and -0.0
are considered normal by this instruction, but need to return false
for __builtin_isnormal, hence the additional logic, which is still
better than the original].
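
As a rough C sketch of the above (illustrative only, not part of the
patch), assuming IEEE binary64 doubles and the GCC/Clang builtins: the
function testp_normal_model is a hypothetical stand-in for
testp.normal.f64 that treats both zeros as normal, and the other helper
names are likewise made up for this example.  The self-check compares
the old and new sequences against __builtin_isnormal.

```c
#include <assert.h>
#include <float.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical model of PTX testp.normal.f64: true for normal numbers
   and, as a special case, for +0.0 and -0.0.  */
static int
testp_normal_model (double x)
{
  double ax = __builtin_fabs (x);
  return ax == 0.0 || (ax >= DBL_MIN && ax <= DBL_MAX);
}

/* New sequence: setp.neu.f64 against 0.0, testp.normal.f64, and.pred.
   C's != is true for unordered operands, matching "neu".  */
static int
isnormal_new_model (double x)
{
  int neu = (x != 0.0);
  return (neu & testp_normal_model (x)) ? 1 : 0;
}

/* Old sequence: abs.f64, setp.gtu against DBL_MAX, setp.ltu against
   DBL_MIN (0d0010000000000000), or.pred, then selp.u32 0, 1.  */
static int
isnormal_old_model (double x)
{
  double ax = __builtin_fabs (x);
  int gtu = !(ax <= DBL_MAX);   /* Greater than, or unordered.  */
  int ltu = !(ax >= DBL_MIN);   /* Less than, or unordered.  */
  return (gtu | ltu) ? 0 : 1;
}

/* Both models should agree with the builtin, including for zeros,
   subnormals, infinities and NaN.  */
static void
isnormal_self_check (void)
{
  const double inputs[] = { 0.0, -0.0, 1.0, -1.0, DBL_MIN, DBL_MIN / 2,
                            DBL_MAX, INFINITY, -INFINITY, NAN };
  assert (testp_normal_model (0.0) && testp_normal_model (-0.0));
  for (size_t i = 0; i < sizeof inputs / sizeof inputs[0]; i++)
    {
      int expected = __builtin_isnormal (inputs[i]) != 0;
      assert (isnormal_new_model (inputs[i]) == expected);
      assert (isnormal_old_model (inputs[i]) == expected);
    }
}
```

Without the extra "neu" test, the model would return 1 for +0.0 and
-0.0, which is exactly the mismatch described above.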

This patch has been tested on nvptx-none hosted by x86_64-pc-linux-gnu
using make and make -k check, with only one new failure in the testsuite.
The test case g++.dg/opt/pr107569.C exposes a latent bug in the middle-end
(actually a missed optimization) as evrp fails to bound the results of
isfinite.  This issue is independent of the back-end, as the tree-ssa
evrp pass is run long before __builtin_finite is expanded by the backend,
and the existence of an (any) isfinite optab is sufficient to expose it.
Fortunately, Haochen Gui has already posted/proposed a fix at
https://gcc.gnu.org/pipermail/gcc-patches/2024-July/657881.html
[which I'm sad to see is taking a while to review/get approved].

Ok for mainline?


2024-07-27  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
        * config/nvptx/nvptx.md (UNSPEC_COPYSIGN): No longer required.
        (UNSPEC_ISFINITE): New UNSPEC.
        (UNSPEC_ISNORMAL): Likewise.
        (*cmp<mode>): Rename to...
        (cmp<mode>): Remove '*' prefix to generate gen_cmp{s,d}f.
        (copysign<mode>3): Replace UNSPEC_COPYSIGN with copysign RTX.
        (setcc_isfinite<mode>): New define_insn using UNSPEC_ISFINITE.
        (isfinite<mode>2): Expand isfinite.
        (setcc_isnormal<mode>): New define_insn using UNSPEC_ISNORMAL.
        (isnormal<mode>2): Expand isnormal.

gcc/testsuite/ChangeLog
        * gcc.target/nvptx/isfinite.c: New test case.
        * gcc.target/nvptx/isnormal.c: Likewise.


Thanks in advance (p.s. don't forget the nvptx_rtx_costs patch),
Roger
--


[-- Attachment #2: patchcs3.txt --]
[-- Type: text/plain, Size: 3993 bytes --]

diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
index 7878a3b..ae711bb 100644
--- a/gcc/config/nvptx/nvptx.md
+++ b/gcc/config/nvptx/nvptx.md
@@ -21,13 +21,14 @@
 (define_c_enum "unspec" [
    UNSPEC_ARG_REG
 
-   UNSPEC_COPYSIGN
    UNSPEC_LOG2
    UNSPEC_EXP2
    UNSPEC_SIN
    UNSPEC_COS
    UNSPEC_TANH
    UNSPEC_ISINF
+   UNSPEC_ISFINITE
+   UNSPEC_ISNORMAL
 
    UNSPEC_FPINT_FLOOR
    UNSPEC_FPINT_BTRUNC
@@ -888,7 +889,7 @@
   ""
   "%.\\tsetp%c1\\t%0, %2, %3;")
 
-(define_insn "*cmp<mode>"
+(define_insn "cmp<mode>"
   [(set (match_operand:BI 0 "nvptx_register_operand" "=R")
 	(match_operator:BI 1 "nvptx_float_comparison_operator"
 	   [(match_operand:SDFM 2 "nvptx_register_operand" "R")
@@ -1253,9 +1254,8 @@
 
 (define_insn "copysign<mode>3"
   [(set (match_operand:SDFM 0 "nvptx_register_operand" "=R")
-	(unspec:SDFM [(match_operand:SDFM 1 "nvptx_nonmemory_operand" "RF")
-		      (match_operand:SDFM 2 "nvptx_nonmemory_operand" "RF")]
-		      UNSPEC_COPYSIGN))]
+	(copysign:SDFM (match_operand:SDFM 1 "nvptx_nonmemory_operand" "RF")
+		       (match_operand:SDFM 2 "nvptx_nonmemory_operand" "RF")))]
   ""
   "%.\\tcopysign%t0\\t%0, %2, %1;")
 
@@ -1330,6 +1330,8 @@
   "flag_unsafe_math_optimizations"
   "%.\\tex2.approx%t0\\t%0, %1;")
 
+;; FP classify predicates
+
 (define_insn "setcc_isinf<mode>"
   [(set (match_operand:BI 0 "nvptx_register_operand" "=R")
 	(unspec:BI [(match_operand:SDFM 1 "nvptx_register_operand" "R")]
@@ -1349,6 +1351,50 @@
   DONE;
 })
 
+(define_insn "setcc_isfinite<mode>"
+  [(set (match_operand:BI 0 "nvptx_register_operand" "=R")
+	(unspec:BI [(match_operand:SDFM 1 "nvptx_register_operand" "R")]
+		   UNSPEC_ISFINITE))]
+  ""
+  "%.\\ttestp.finite%t1\\t%0, %1;")
+
+(define_expand "isfinite<mode>2"
+  [(set (match_operand:SI 0 "nvptx_register_operand" "=R")
+	(unspec:SI [(match_operand:SDFM 1 "nvptx_register_operand" "R")]
+		   UNSPEC_ISFINITE))]
+  ""
+{
+  rtx pred = gen_reg_rtx (BImode);
+  emit_insn (gen_setcc_isfinite<mode> (pred, operands[1]));
+  emit_insn (gen_setccsi_from_bi (operands[0], pred));
+  DONE;
+})
+
+(define_insn "setcc_isnormal<mode>"
+  [(set (match_operand:BI 0 "nvptx_register_operand" "=R")
+	(unspec:BI [(match_operand:SDFM 1 "nvptx_register_operand" "R")]
+		   UNSPEC_ISNORMAL))]
+  ""
+  "%.\\ttestp.normal%t1\\t%0, %1;")
+
+(define_expand "isnormal<mode>2"
+  [(set (match_operand:SI 0 "nvptx_register_operand" "=R")
+	(unspec:SI [(match_operand:SDFM 1 "nvptx_register_operand" "R")]
+		   UNSPEC_ISNORMAL))]
+  ""
+{
+  rtx pred1 = gen_reg_rtx (BImode);
+  rtx pred2 = gen_reg_rtx (BImode);
+  rtx pred3 = gen_reg_rtx (BImode);
+  rtx zero = CONST0_RTX (<MODE>mode);
+  rtx cmp = gen_rtx_fmt_ee (NE, BImode, operands[1], zero);
+  emit_insn (gen_cmp<mode> (pred1, cmp, operands[1], zero));
+  emit_insn (gen_setcc_isnormal<mode> (pred2, operands[1]));
+  emit_insn (gen_andbi3 (pred3, pred1, pred2));
+  emit_insn (gen_setccsi_from_bi (operands[0], pred3));
+  DONE;
+})
+
 ;; HFmode floating point arithmetic.
 
 (define_insn "addhf3"
diff --git a/gcc/testsuite/gcc.target/nvptx/isfinite.c b/gcc/testsuite/gcc.target/nvptx/isfinite.c
new file mode 100644
index 0000000..83099fc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nvptx/isfinite.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int foo(double x)
+{
+  return __builtin_isfinite(x);
+}
+
+/* { dg-final { scan-assembler-times "testp.finite.f64" 1 } } */
diff --git a/gcc/testsuite/gcc.target/nvptx/isnormal.c b/gcc/testsuite/gcc.target/nvptx/isnormal.c
new file mode 100644
index 0000000..83c4fb6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nvptx/isnormal.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int isnormal(double x)
+{
+  return __builtin_isnormal(x);
+}
+
+/* { dg-final { scan-assembler-times "testp.normal.f64" 1 } } */


* Re: [nvptx PATCH] Implement isfinite and isnormal optabs in nvptx.md.
  2024-07-27 18:18 [nvptx PATCH] Implement isfinite and isnormal optabs in nvptx.md Roger Sayle
@ 2024-09-27  8:51 ` Thomas Schwinge
  2024-09-27 13:45 ` Thomas Schwinge
  1 sibling, 0 replies; 3+ messages in thread
From: Thomas Schwinge @ 2024-09-27  8:51 UTC
  To: Roger Sayle; +Cc: gcc-patches, Tom de Vries

Hi Roger!

On 2024-07-27T19:18:35+0100, "Roger Sayle" <roger@nextmovesoftware.com> wrote:
> Firstly, thanks to Haochen Gui for recently adding optab support for
> isfinite and isnormal to the middle-end.

Do we, by the way, have documentation (I suppose that should be in
"GNU Compiler Collection (GCC) Internals"?) about the rationale and
subsequent optimization opportunities for having vs. not having
representations of "codes" (like, 'isfinite') in the various GCC IRs
etc., like builtins, internal functions, GIMPLE, optabs, RTL (..., and
I've probably missed some more)?

Of course, a lot of it can be inferred from the context or otherwise,
like having builtins corresponding to C library functions and then be
able to optimize according to their defined semantics, but others are not
always clear to me: like, why do we have 'copysign' RTL but not
'isnormal'?

> This patch adds define_expand
> for both these functions to the nvptx backend, which conveniently has
> special instructions to simplify their implementation.

ACK.

> As this patch
> adds UNSPEC_ISFINITE and UNSPEC_ISNORMAL, I've also taken the opportunity
> to include/repost my tweak to clean-up/eliminate UNSPEC_COPYSIGN.

I'd seen your 2023 "Add RTX codes for [...] COPYSIGN", but not yet seen a
patch to use it for nvptx -- but indeed have stumbled over nvptx
'UNSPEC_COPYSIGN' a while ago; ACK.

> Previously, for isfinite, GCC on nvptx-none with -O2 would generate:
>
>                 mov.f64 %r26, %ar0;
>                 abs.f64 %r28, %r26;
>                 setp.gtu.f64    %r31, %r28, 0d7fefffffffffffff;
>                 selp.u32        %value, 0, 1, %r31;
>
> and with this patch, we now generate:
>
>                 mov.f64 %r23, %ar0;
>                 testp.finite.f64        %r24, %r23;
>                 selp.u32        %value, 1, 0, %r24;

Nice!

> Previously, for isnormal, GCC -O2 would generate:
>
>                 mov.f64 %r28, %ar0;
>                 abs.f64 %r22, %r28;
>                 setp.gtu.f64    %r32, %r22, 0d7fefffffffffffff;
>                 setp.ltu.f64    %r35, %r22, 0d0010000000000000;
>                 or.pred %r43, %r35, %r32;
>                 selp.u32        %value, 0, 1, %r43;
>
> and with this patch becomes:
>
>                 mov.f64 %r23, %ar0;
>                 setp.neu.f64    %r24, %r23, 0d0000000000000000;
>                 testp.normal.f64        %r25, %r23;
>                 and.pred        %r26, %r24, %r25;
>                 selp.u32        %value, 1, 0, %r26;
>
> Notice that although nvptx provides a testp.normal.f{32,64} instruction,
> the semantics don't quite match those required of libm [+0.0 and -0.0
> are considered normal by this instruction, but need to return false
> for __builtin_isnormal, hence the additional logic

Ugh.  ;-)

> which is still
> better than the original].

ACK.

> This patch has been tested on nvptx-none hosted by x86_64-pc-linux-gnu
> using make and make -k check, with only one new failure in the testsuite.
> The test case g++.dg/opt/pr107569.C exposes a latent bug in the middle-end
> (actually a missed optimization) as evrp fails to bound the results of
> isfinite.  This issue is independent of the back-end, as the tree-ssa
> evrp pass is run long before __builtin_finite is expanded by the backend,
> and the existence of an (any) isfinite optab is sufficient to expose it.
> Fortunately, Haochen Gui has already posted/proposed a fix at
> https://gcc.gnu.org/pipermail/gcc-patches/2024-July/657881.html
> [which I'm sad to see is taking a while to review/get approved].

Well, now this nvptx one here took me even longer to look into, so the
'g++.dg/opt/pr107569.C' regression is resolved by now.  ;-\

> Ok for mainline?

Just minor items: generally, I do like seeing logically separate changes
as separate commits (like, the 'copysign' cleanup is not conceptually
related to the 'isfinite', 'isnormal' enhancements).  However, that's my
own ambition; I do acknowledge that others do things differently, like
mixing in small cleanups with other changes.  Also, I personally strive
to go one step further with enhancing test suite coverage (for example,
move towards using 'check-function-bodies' instead of 'scan-assembler',
and first push the current/"bad" test case as its own commit, possibly
partly XFAILed, and as part of the code-changes commit then "fix up" the
test case, so that the latter changes are visible in the commit history).
But again, that's my own ambition; I do acknowledge that others do things
differently.

All that said, the patch is OK as is, with just one small enhancement,
see below.  Thank you!

> Thanks in advance (p.s. don't forget the nvptx_rtx_costs patch),

Aye-aye!

> --- a/gcc/config/nvptx/nvptx.md
> +++ b/gcc/config/nvptx/nvptx.md

> +(define_insn "setcc_isnormal<mode>"
> +  [(set (match_operand:BI 0 "nvptx_register_operand" "=R")
> +	(unspec:BI [(match_operand:SDFM 1 "nvptx_register_operand" "R")]
> +		   UNSPEC_ISNORMAL))]
> +  ""
> +  "%.\\ttestp.normal%t1\\t%0, %1;")
> +
> +(define_expand "isnormal<mode>2"
> +  [(set (match_operand:SI 0 "nvptx_register_operand" "=R")
> +	(unspec:SI [(match_operand:SDFM 1 "nvptx_register_operand" "R")]
> +		   UNSPEC_ISNORMAL))]
> +  ""
> +{
> +  rtx pred1 = gen_reg_rtx (BImode);
> +  rtx pred2 = gen_reg_rtx (BImode);
> +  rtx pred3 = gen_reg_rtx (BImode);
> +  rtx zero = CONST0_RTX (<MODE>mode);
> +  rtx cmp = gen_rtx_fmt_ee (NE, BImode, operands[1], zero);
> +  emit_insn (gen_cmp<mode> (pred1, cmp, operands[1], zero));
> +  emit_insn (gen_setcc_isnormal<mode> (pred2, operands[1]));
> +  emit_insn (gen_andbi3 (pred3, pred1, pred2));
> +  emit_insn (gen_setccsi_from_bi (operands[0], pred3));
> +  DONE;
> +})

Doesn't this special "+/-0.0" handling conceptually belong in
'setcc_isnormal<mode>' instead of 'isnormal<mode>2'?  Or, simpler,
'setcc_isnormal<mode>' could be renamed to 'setcc_nvptx_isnormal<mode>'
(or similar), to make it clear that the extra logic is there to fix up
the PTX 'testp' behavior: "As a special case, positive and negative zero
are considered normal numbers".  Please add a comment to that effect to
the code block in 'isnormal<mode>2', to help the next unaware reader
(like me, in a few months).

Also, do you know whether we have execution test coverage for
'isnormal([+/-0.0])' (in the generic parts of the GCC test suite), or
should we add something?  (OK to do incrementally.)  I suppose we've got
something as part of 'gcc/testsuite/gcc.dg/c99-math.h',
'gcc/testsuite/gcc.dg/tg-tests.h',
'gcc/testsuite/gcc.dg/torture/floatn-tg.h' -- but these, curiously, all
seem to skip the '-0.0' case?
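
As a starting point, here is a minimal sketch of what such an execution
check could look like (illustrative only; the function names are
hypothetical, and a real testsuite file would additionally need a
'dg-do run' directive and a 'main' calling the check).  The 'volatile'
is there so that the builtin is evaluated at run time instead of being
folded by the compiler:

```c
#include <assert.h>
#include <float.h>

/* Evaluate __builtin_isnormal at run time; the volatile copy keeps the
   compiler from constant-folding the call.  */
static int
isnormal_at_runtime (double value)
{
  volatile double x = value;
  double y = x;
  return __builtin_isnormal (y) != 0;
}

/* Zeros (of either sign) and subnormals are not normal; 1.0 and
   DBL_MIN are.  */
static void
check_isnormal_zeros (void)
{
  assert (__builtin_signbit (-0.0));    /* Really a negative zero.  */
  assert (!isnormal_at_runtime (0.0));
  assert (!isnormal_at_runtime (-0.0));
  assert (!isnormal_at_runtime (DBL_MIN / 2));
  assert (isnormal_at_runtime (1.0));
  assert (isnormal_at_runtime (DBL_MIN));
}
```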

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/nvptx/isnormal.c
> @@ -0,0 +1,9 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +int isnormal(double x)
> +{
> +  return __builtin_isnormal(x);
> +}
> +
> +/* { dg-final { scan-assembler-times "testp.normal.f64" 1 } } */

For example, here, we could use 'check-function-bodies' to test the whole
PTX instruction sequence instead of just 'testp.normal.f64'.  (But again,
not a requirement; more like something to keep in mind going forward.)


Grüße
 Thomas


* Re: [nvptx PATCH] Implement isfinite and isnormal optabs in nvptx.md.
  2024-07-27 18:18 [nvptx PATCH] Implement isfinite and isnormal optabs in nvptx.md Roger Sayle
  2024-09-27  8:51 ` Thomas Schwinge
@ 2024-09-27 13:45 ` Thomas Schwinge
  1 sibling, 0 replies; 3+ messages in thread
From: Thomas Schwinge @ 2024-09-27 13:45 UTC
  To: Roger Sayle; +Cc: gcc-patches, Tom de Vries

[-- Attachment #1: Type: text/plain, Size: 3211 bytes --]

Hi Roger!

If you don't mind, I could use your help here (but: low priority!):

On 2024-07-27T19:18:35+0100, "Roger Sayle" <roger@nextmovesoftware.com> wrote:
> Previously, for isnormal, GCC -O2 would generate: [...]
> and with this patch becomes:
>
>                 mov.f64 %r23, %ar0;
>                 setp.neu.f64    %r24, %r23, 0d0000000000000000;
>                 testp.normal.f64        %r25, %r23;
>                 and.pred        %r26, %r24, %r25;
>                 selp.u32        %value, 1, 0, %r26;

Looking at this, shouldn't we be able to optimize ("combine") this into
something like (untested):

    mov.f64 %r23, %ar0;
    testp.normal.f64        %r25, %r23;
    setp.neu.and.f64    %r26, %r23, 0d0000000000000000, %r25;
    selp.u32        %value, 1, 0, %r26;

(I hope I correctly understood PTX 'setp', 'combine [...] with a
predicate value by applying a Boolean operator'!)

That is, "combine":

    CmpOp = { eq, ne, lt, le, gt, ge, lo, ls, hi, hs, equ, neu, ltu, leu, gtu, geu, num, nan };

    BoolOp = { and, or, xor };

    setp.CmpOp.TYPE %3, %2, %1;
    BoolOp.pred %5, %3, %4

... into:

    setp.CmpOp.BoolOp.TYPE %5, %2, %1, %4;
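
To spell out the intended equivalence for the 'neu'/'and' case at hand,
here is a small C model (illustrative only).  It assumes that the fused
form computes "(a CmpOp b) BoolOp c", which is my reading of the PTX
'setp' description; the function names are hypothetical.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stddef.h>

/* setp.neu.f64 p, a, b: not-equal or unordered.  C's != already
   yields true when either operand is a NaN.  */
static bool
setp_neu (double a, double b)
{
  return a != b;
}

/* and.pred r, p, q.  */
static bool
and_pred (bool p, bool q)
{
  return p & q;
}

/* Assumed semantics of setp.neu.and.f64 r, a, b, c:
   r = (a neu b) and c.  */
static bool
setp_neu_and (double a, double b, bool c)
{
  return (a != b) & c;
}

/* The two-instruction form and the fused form should agree for all
   combinations of these inputs.  */
static void
setp_combine_self_check (void)
{
  const double values[] = { 0.0, -0.0, 1.0, INFINITY, NAN };
  const size_t n = sizeof values / sizeof values[0];
  for (size_t i = 0; i < n; i++)
    for (int c = 0; c <= 1; c++)
      assert (and_pred (setp_neu (values[i], 0.0), c != 0)
              == setp_neu_and (values[i], 0.0, c != 0));
}
```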

I tried adding a corresponding 'define_insn' for just the 'and' case at
hand (eventually to be generalized to 'BoolOp'), see the attached
"WIP nvptx: 'setp', 'combine [...] with a predicate value by applying a Boolean operator'".
This does do the expected transformation for quite a number of instances
in the GCC/nvptx target libraries (again: completely untested!) -- but it
doesn't for the new 'gcc.target/nvptx/isnormal.c', and I don't know how
to read '-fdump-rtl-combine-all', to understand, why.  Any "RTFM" or
other pointers gladly accepted, guidance about how to approach such an
issue.  (Or tell me it's just 'TARGET_RTX_COSTS'...)


Grüße
 Thomas


[-- Attachment #2: 0001-WIP-nvptx-setp-combine-.-with-a-predicate-value-by-a.patch --]
[-- Type: text/x-diff, Size: 4480 bytes --]

From c4c389a6bd262356023202adab08a48f044e59b2 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <tschwinge@baylibre.com>
Date: Fri, 27 Sep 2024 15:14:19 +0200
Subject: [PATCH] WIP nvptx: 'setp', 'combine [...] with a predicate value by
 applying a Boolean operator'

Re "Implement isfinite and isnormal optabs in nvptx.md"

    mov.f64 %r23, %ar0;
    setp.neu.f64    %r24, %r23, 0d0000000000000000;
    testp.normal.f64        %r25, %r23;
    and.pred        %r26, %r24, %r25;
    selp.u32        %value, 1, 0, %r26;

Can we optimize this into something like (untested):

    mov.f64 %r23, %ar0;
    testp.normal.f64        %r25, %r23;
    setp.neu.and.f64    %r26, %r23, 0d0000000000000000, %r25;
    selp.u32        %value, 1, 0, %r26;

That is, "combine":

    CmpOp = { eq, ne, lt, le, gt, ge, lo, ls, hi, hs, equ, neu, ltu, leu, gtu, geu, num, nan };

    BoolOp = { and, or, xor };

    setp.CmpOp.TYPE %3, %2, %1;
    BoolOp.pred %5, %3, %4

..., into:

    setp.CmpOp.BoolOp.TYPE %5, %2, %1, %4;
---
 gcc/config/nvptx/nvptx.cc |  3 +++
 gcc/config/nvptx/nvptx.md | 23 ++++++++++++++++-------
 2 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/gcc/config/nvptx/nvptx.cc b/gcc/config/nvptx/nvptx.cc
index 96a1134220e..b4c4f9ff021 100644
--- a/gcc/config/nvptx/nvptx.cc
+++ b/gcc/config/nvptx/nvptx.cc
@@ -3080,6 +3080,9 @@ nvptx_print_operand (FILE *file, rtx x, int code)
 	default:
 	  gcc_unreachable ();
 	}
+      break;
+    case /*TODO*/ 'C':
+      mode = GET_MODE (XEXP (x, 0));
       if (FLOAT_MODE_P (mode)
 	  || x_code == EQ || x_code == NE
 	  || x_code == GEU || x_code == GTU
diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
index ae711bbd250..ce2603eeccb 100644
--- a/gcc/config/nvptx/nvptx.md
+++ b/gcc/config/nvptx/nvptx.md
@@ -881,13 +881,22 @@
 
 ;; Comparisons and branches
 
+(define_insn ""
+  [(set (match_operand:BI 0 "nvptx_register_operand" "=R")
+	(and:BI (match_operator:BI 1 "nvptx_comparison_operator"
+		   [(match_operand:HSDIM 2 "nvptx_register_operand" "R")
+		    (match_operand:HSDIM 3 "nvptx_nonmemory_operand" "Ri")])
+		(match_operand:BI 4 "nvptx_register_operand" "R")))]
+  ""
+  "%.\\tsetp%c1.and%C1\\t%0, %2, %3, %4;")
+
 (define_insn "cmp<mode>"
   [(set (match_operand:BI 0 "nvptx_register_operand" "=R")
 	(match_operator:BI 1 "nvptx_comparison_operator"
 	   [(match_operand:HSDIM 2 "nvptx_register_operand" "R")
 	    (match_operand:HSDIM 3 "nvptx_nonmemory_operand" "Ri")]))]
   ""
-  "%.\\tsetp%c1\\t%0, %2, %3;")
+  "%.\\tsetp%c1%C1\\t%0, %2, %3;")
 
 (define_insn "cmp<mode>"
   [(set (match_operand:BI 0 "nvptx_register_operand" "=R")
@@ -895,7 +904,7 @@
 	   [(match_operand:SDFM 2 "nvptx_register_operand" "R")
 	    (match_operand:SDFM 3 "nvptx_nonmemory_operand" "RF")]))]
   ""
-  "%.\\tsetp%c1\\t%0, %2, %3;")
+  "%.\\tsetp%c1%C1\\t%0, %2, %3;")
 
 (define_insn "*cmphf"
   [(set (match_operand:BI 0 "nvptx_register_operand" "=R")
@@ -903,7 +912,7 @@
 	   [(match_operand:HF 2 "nvptx_register_operand" "R")
 	    (match_operand:HF 3 "nvptx_nonmemory_operand" "RF")]))]
   "TARGET_SM53"
-  "%.\\tsetp%c1\\t%0, %2, %3;")
+  "%.\\tsetp%c1%C1\\t%0, %2, %3;")
 
 (define_insn "jump"
   [(set (pc)
@@ -1095,7 +1104,7 @@
 	    [(match_operand:HSDIM 2 "nvptx_register_operand" "R")
 	     (match_operand:HSDIM 3 "nvptx_nonmemory_operand" "Ri")])))]
   ""
-  "%.\\tset%t0%c1\\t%0, %2, %3;")
+  "%.\\tset%t0%c1%C1\\t%0, %2, %3;")
 
 (define_insn "*setcc_int<mode>"
   [(set (match_operand:SI 0 "nvptx_register_operand" "=R")
@@ -1104,7 +1113,7 @@
 	    [(match_operand:SDFM 2 "nvptx_register_operand" "R")
 	     (match_operand:SDFM 3 "nvptx_nonmemory_operand" "RF")])))]
   ""
-  "%.\\tset%t0%c1\\t%0, %2, %3;")
+  "%.\\tset%t0%c1%C1\\t%0, %2, %3;")
 
 (define_insn "setcc_float<mode>"
   [(set (match_operand:SF 0 "nvptx_register_operand" "=R")
@@ -1112,7 +1121,7 @@
 	   [(match_operand:HSDIM 2 "nvptx_register_operand" "R")
 	    (match_operand:HSDIM 3 "nvptx_nonmemory_operand" "Ri")]))]
   ""
-  "%.\\tset%t0%c1\\t%0, %2, %3;")
+  "%.\\tset%t0%c1%C1\\t%0, %2, %3;")
 
 (define_insn "setcc_float<mode>"
   [(set (match_operand:SF 0 "nvptx_register_operand" "=R")
@@ -1120,7 +1129,7 @@
 	   [(match_operand:SDFM 2 "nvptx_register_operand" "R")
 	    (match_operand:SDFM 3 "nvptx_nonmemory_operand" "RF")]))]
   ""
-  "%.\\tset%t0%c1\\t%0, %2, %3;")
+  "%.\\tset%t0%c1%C1\\t%0, %2, %3;")
 
 (define_expand "cstore<mode>4"
   [(set (match_operand:SI 0 "nvptx_register_operand")
-- 
2.34.1


