public inbox for gcc-bugs@sourceware.org
* [Bug target/104440] New: nvptx: FAIL: gcc.c-torture/execute/pr53465.c  execution test
@ 2022-02-08  8:06 vries at gcc dot gnu.org
  2022-02-08  8:07 ` [Bug target/104440] " vries at gcc dot gnu.org
                   ` (12 more replies)
  0 siblings, 13 replies; 14+ messages in thread
From: vries at gcc dot gnu.org @ 2022-02-08  8:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104440

            Bug ID: 104440
           Summary: nvptx: FAIL: gcc.c-torture/execute/pr53465.c
                    execution test
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vries at gcc dot gnu.org
  Target Milestone: ---

With nvptx target, driver version 510.47.03 and board GT 1030 I run into:
...
FAIL: gcc.c-torture/execute/pr53465.c   -O1  execution test
FAIL: gcc.c-torture/execute/pr53465.c   -O2  execution test
FAIL: gcc.c-torture/execute/pr53465.c   -O3 -g  execution test
...

Passes with nvptx-none-run -O0:
...
$ ( export NVPTX_NONE_RUN="$(pwd -P)/install/bin/nvptx-none-run -O0" ; ./test.sh )
                === gcc Summary ===

# of expected passes            12
$
...

I can minimize it at -O1 to:
...
void __attribute__((noinline, noclone))
foo (int y)
{
  int i;
  int c;
  for (i = 0; i < y; i++)
    {
      int d = i + 1;
      if (i && d <= c)
        __builtin_abort ();
      c = d;
    }
}

int
main ()
{
  foo (2);
  return 0;
}
...

I can make the test pass by initializing c with any value (or by doing the
equivalent at ptx level).

Note that the read of c in the loop body only happens in the second iteration,
by which time it's initialized, so the example is valid.

GCC, however, translates this at GIMPLE level to:
...
    _1 = i != 0;
    _2 = d <= c;
    _3 = _1 & _2;
...
which does imply a read of c while it's undefined.

We can prevent this by using --param=logical-op-non-short-circuit=0, and that
makes the minimized example pass.  But not the original example.
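
To illustrate the difference between the two forms, here is a minimal sketch
in plain C.  It is not GCC output: read_c, c_reads and the two function names
are hypothetical helpers introduced only to count reads of c, and c is given
a dummy initial value purely so that the sketch itself stays well-defined.

```c
#include <stdbool.h>

/* Hypothetical instrumentation: counts how often c is read.  */
static int c_reads;

static int
read_c (const int *c)
{
  c_reads++;
  return *c;
}

/* Source-level form: 'i && d <= c' reads c only when i is nonzero.  */
int
reads_short_circuit (int y)
{
  int c = -1;	/* Dummy value, only to keep this sketch well-defined.  */
  c_reads = 0;
  for (int i = 0; i < y; i++)
    {
      int d = i + 1;
      if (i && d <= read_c (&c))
	return -1;
      c = d;
    }
  return c_reads;
}

/* Non-short-circuit form: both operands are evaluated unconditionally and
   then combined with a bitwise and, as in the gimple fragment above.  */
int
reads_non_short_circuit (int y)
{
  int c = -1;	/* Dummy value, see above.  */
  c_reads = 0;
  for (int i = 0; i < y; i++)
    {
      int d = i + 1;
      bool t1 = i != 0;
      bool t2 = d <= read_c (&c);
      if (t1 & t2)
	return -1;
      c = d;
    }
  return c_reads;
}
```

For y == 2 the first function reads c once (second iteration only), while the
second reads it twice, including the read in the first iteration that the
source never performs.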

If we translate the example into cuda, we see that the loop's first iteration
is peeled off, even at -O0.  This has the effect that there are two "d <= c"
tests.  The first one has an undefined input, but is dead code.  The second one
has its inputs defined on both loop entry and backedge.
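
As a rough picture of what peeling the first iteration does, here is a
hand-written C sketch.  It is an assumption about the shape of the result,
not what the cuda compiler actually emits; foo_peeled and foo_reference are
hypothetical names, and both return 1 where the original would abort.

```c
/* Reference version with c given a dummy initial value.  */
int
foo_reference (int y)
{
  int c = 0;
  for (int i = 0; i < y; i++)
    {
      int d = i + 1;
      if (i && d <= c)
	return 1;
      c = d;
    }
  return 0;
}

/* First iteration peeled by hand: for i == 0 the comparison is dead, so
   only 'c = d' (with d == 1) remains.  Inside the loop c is then defined
   both on loop entry and on the back edge.  */
int
foo_peeled (int y)
{
  int c;
  if (y <= 0)
    return 0;
  c = 1;			/* Peeled iteration: d = 0 + 1; c = d.  */
  for (int i = 1; i < y; i++)
    {
      int d = i + 1;
      if (d <= c)		/* 'i &&' dropped: i is always nonzero here.  */
	return 1;
      c = d;
    }
  return 0;
}
```

In the peeled form no comparison ever sees c before it has been written,
which is the invariant described above.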

We could try to report this to nvidia, but I'm not sure they want to fix this.
They've pushed back on examples involving reads from uninitialized regs before,
and looking at what cuda does, it seems they try to ensure this invariant.

Unfortunately, pass_initialize_regs does not insert the required init.

So, it looks like we'll have to fix this in the compiler.

* [Bug target/104440] nvptx: FAIL: gcc.c-torture/execute/pr53465.c  execution test
From: vries at gcc dot gnu.org @ 2022-02-08  8:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104440

Tom de Vries <vries at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |nvptx
           Keywords|                            |testsuite-fail

--- Comment #1 from Tom de Vries <vries at gcc dot gnu.org> ---
Tentative patch that fixes example:
...
diff --git a/gcc/config/nvptx/nvptx.cc b/gcc/config/nvptx/nvptx.cc
index 5b26c0f4c7dd..4dc154434853 100644
--- a/gcc/config/nvptx/nvptx.cc
+++ b/gcc/config/nvptx/nvptx.cc
@@ -1565,6 +1565,23 @@ nvptx_declare_function_name (FILE *file, const char *name, const_tree decl)
          fprintf (file, "\t.reg%s ", nvptx_ptx_type_from_mode (mode, true));
          output_reg (file, i, split, -2);
          fprintf (file, ";\n");
+         switch (mode)
+           {
+           case HImode:
+             fprintf (file, "\tmov.u16 %%r%d, 0;\n", i);
+             break;
+           case SImode:
+             fprintf (file, "\tmov.u32 %%r%d, 0;\n", i);
+             break;
+           case DImode:
+             fprintf (file, "\tmov.u64 %%r%d, 0;\n", i);
+             break;
+           case BImode:
+             fprintf (file, "\tsetp.ne.u32 %%r%d,0,0;\n", i);
+             break;
+           default:
+             gcc_unreachable ();
+           }
        }
     }

...

* [Bug target/104440] nvptx: FAIL: gcc.c-torture/execute/pr53465.c  execution test
From: pinskia at gcc dot gnu.org @ 2022-02-08  8:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104440

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I thought there was another bug that reported a similar issue.

* [Bug target/104440] nvptx: FAIL: gcc.c-torture/execute/pr53465.c  execution test
From: pinskia at gcc dot gnu.org @ 2022-02-08  8:11 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104440

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
>Unfortunately, pass_initialize_regs does not insert the required init.

There are some ideas about getting rid of pass_initialize_regs for GCC 13 too.

* [Bug target/104440] nvptx: FAIL: gcc.c-torture/execute/pr53465.c  execution test
From: vries at gcc dot gnu.org @ 2022-02-08  8:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104440

--- Comment #4 from Tom de Vries <vries at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #2)
> I thought there was another bug that reported a similar issue.

You mean related to nvptx, or in general?

FWIW, I do remember looking at this issue before in the nvptx context, but I
couldn't find a related PR.

* [Bug target/104440] nvptx: FAIL: gcc.c-torture/execute/pr53465.c  execution test
From: pinskia at gcc dot gnu.org @ 2022-02-08  8:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104440

--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Tom de Vries from comment #4)
> (In reply to Andrew Pinski from comment #2)
> > I thought there was another bug that reported a similar issue.
> 
> You mean related to nvptx, or in general?

It was in general. PR 21111 is related but not the same issue.


PR 61810 is the one pointing out the problems with init-regs and talking about
removing it.

* [Bug target/104440] nvptx: FAIL: gcc.c-torture/execute/pr53465.c  execution test
From: rguenth at gcc dot gnu.org @ 2022-02-08  9:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104440

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2022-02-08
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #5)
> (In reply to Tom de Vries from comment #4)
> > (In reply to Andrew Pinski from comment #2)
> > > I thought there was another bug that reported a similar issue.
> > 
> > You mean related to nvptx, or in general?
> 
> It was in general. PR 21111 is related but not the same issue.
> 
> 
> PR 61810 is the one pointing out the problems with init-regs and talking
> about removing it.

I think there's a bug that ifcombine produces the situation and that
valgrind complains about uninitialized uses.

Note that indeed the init-regs pass should go away.

It's somewhat infeasible to compute the must-initialized regs, so the issue
is really hard to avoid.  But nobody tried yet (it would also come at a cost
of course).  It would definitely inhibit all early short-circuiting on
GENERIC (a good thing, but with a lot of fallout I think).

That said, --param logical-op-non-short-circuit=0 is only a workaround until
you hit ifcombine doing similar transforms.

LOGICAL_OP_NON_SHORT_CIRCUIT is a target macro btw, so you could arrange it
to be zero for nvptx (but that's of course too late since the host's
LOGICAL_OP_NON_SHORT_CIRCUIT value will be used).

* [Bug target/104440] nvptx: FAIL: gcc.c-torture/execute/pr53465.c  execution test
From: rguenth at gcc dot gnu.org @ 2022-02-08  9:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104440

--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #6)
> LOGICAL_OP_NON_SHORT_CIRCUIT is a target macro btw, so you could arrange it
> to be zero for nvptx (but that's of course too late since the hosts
> LOGICAL_OP_NON_SHORT_CIRCUIT value will be used).

But it would at least prevent ifcombine from doing the transform (ifcombine
only runs on the offload side).

* [Bug target/104440] nvptx: FAIL: gcc.c-torture/execute/pr53465.c  execution test
From: vries at gcc dot gnu.org @ 2022-02-17  7:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104440

--- Comment #8 from Tom de Vries <vries at gcc dot gnu.org> ---
Created attachment 52456
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52456&action=edit
Tentative patch, introducing -minit-regs=<0|1|2>

This patch fixes the problem, and survived a standalone build and gcc testsuite
run.

Currently testing in offloading setting.

Uses DF_MIR_IN, otherwise only used by pass_ree.

Effect of -minit-regs=1 (insert init at function entry):
...
        .reg.pred %r50;
        .reg.pred %r51;
        .reg.u32 %r52;
+               mov.u32 %r26, 0;
                mov.u64 %r33, %ar0;
                mov.u32 %r34, %ar1;
                setp.le.s32     %r35, %r34, 0;
...

Effect of -minit-regs=2 (insert init close to use):
...
                add.u64 %r32, %r33, %r37;
                mov.u32 %r31, 0;
                mov.u32 %r52, 1;
+               mov.u32 %r26, 0;
 $L4:
                mov.u32 %r30, %r26;
                ld.u32  %r26, [%r25];
...

* [Bug target/104440] nvptx: FAIL: gcc.c-torture/execute/pr53465.c  execution test
From: vries at gcc dot gnu.org @ 2022-02-17  7:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104440

--- Comment #9 from Tom de Vries <vries at gcc dot gnu.org> ---
(In reply to Tom de Vries from comment #1)
> Tentative patch that fixes example:
> ...
> diff --git a/gcc/config/nvptx/nvptx.cc b/gcc/config/nvptx/nvptx.cc
> index 5b26c0f4c7dd..4dc154434853 100644
> --- a/gcc/config/nvptx/nvptx.cc
> +++ b/gcc/config/nvptx/nvptx.cc
> @@ -1565,6 +1565,23 @@ nvptx_declare_function_name (FILE *file, const char *name, const_tree decl)
>           fprintf (file, "\t.reg%s ", nvptx_ptx_type_from_mode (mode, true));
>           output_reg (file, i, split, -2);
>           fprintf (file, ";\n");
> +         switch (mode)
> +           {
> +           case HImode:
> +             fprintf (file, "\tmov.u16 %%r%d, 0;\n", i);
> +             break;
> +           case SImode:
> +             fprintf (file, "\tmov.u32 %%r%d, 0;\n", i);
> +             break;
> +           case DImode:
> +             fprintf (file, "\tmov.u64 %%r%d, 0;\n", i);
> +             break;
> +           case BImode:
> +             fprintf (file, "\tsetp.ne.u32 %%r%d,0,0;\n", i);
> +             break;
> +           default:
> +             gcc_unreachable ();
> +           }
>         }
>      }
>  
> ...

FWIW, I've tested this patch (extended a bit to handle all cases) but ran into
trouble in the libgomp testsuite, with running out of resources.  So this
approach is too resource-hungry.

* [Bug target/104440] nvptx: FAIL: gcc.c-torture/execute/pr53465.c  execution test
From: vries at gcc dot gnu.org @ 2022-02-17  7:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104440

--- Comment #10 from Tom de Vries <vries at gcc dot gnu.org> ---
A good thing to note at this point: why doesn't init-regs work here?

The pass works per insn, and when hitting the insn with the problematic use:
...
(gdb) call debug_rtx (insn)
(insn 18 17 19 4 (set (reg/v:SI 30 [ c ])
        (reg/v:SI 26 [ d ])) 6 {*movsi_insn}
     (expr_list:REG_DEAD (reg/v:SI 26 [ d ])
        (nil)))
...
and dealing with the use of reg 26 we test:
...
              /* A use is MUST uninitialized if it reaches the top of           
                 the block from the inside of the block (the lr test)           
                 and no def for it reaches the top of the block from            
                 outside of the block (the ur test).  */
              if (bitmap_bit_p (lr, regno)
                  && (!bitmap_bit_p (ur, regno)))
...
where:
...
      bitmap lr = DF_LR_IN (bb);
      bitmap ur = DF_LIVE_IN (bb);
...

But we have:
...
(gdb) p bitmap_bit_p (lr, regno)
$1 = true
(gdb) p bitmap_bit_p (ur, regno)
$2 = true
...
so the test fails.

In terms of the rtl-dump, the insn is here:
...
(code_label 43 42 17 4 4 (nil) [1 uses])
(note 17 43 18 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
(insn 18 17 19 4 (set (reg/v:SI 30 [ c ])
        (reg/v:SI 26 [ d ])) 6 {*movsi_insn}
     (expr_list:REG_DEAD (reg/v:SI 26 [ d ])
        (nil)))
...
and the corresponding df info is:
...
( 7 3 )->[4]->( 8 5 )
;; bb 4 artificial_defs: { }
;; bb 4 artificial_uses: { u-1(1){ }u-1(2){ }u-1(3){ }}
;; lr  in        1 [%stack] 2 [%frame] 3 [%args] 25 26 31 32 52
;; lr  use       1 [%stack] 2 [%frame] 3 [%args] 25 26
;; lr  def       26 30 38
;; live  in      1 [%stack] 2 [%frame] 3 [%args] 25 26 31 32 52
;; live  gen     26 30 38
;; live  kill
;; lr  out       1 [%stack] 2 [%frame] 3 [%args] 25 26 30 31 32 52
;; live  out     1 [%stack] 2 [%frame] 3 [%args] 25 26 30 31 32 52
...

In short, init-regs doesn't work for this example, because reg 26 is defined on
some incoming path, and ptx (or the JIT implementation) needs it to be defined
on all incoming paths.
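
The distinction between "defined on some incoming path" and "defined on all
incoming paths" can be sketched with plain bit sets.  This is a simplified
model, not GCC's df implementation: regset, bit, may_init_in, must_init_in
and needs_init are hypothetical names.  The union stands in for the
may-initialized information that init-regs tests against, and the
intersection is in the spirit of the must-initialized information
(DF_MIR_IN) mentioned in comment #8.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register set: bit n set means pseudo register n is in the
   set (sufficient for register numbers below 64).  */
typedef uint64_t regset;

static regset
bit (int regno)
{
  return (regset) 1 << regno;
}

/* Registers defined on at least one incoming path: union over the
   predecessors' out sets.  */
regset
may_init_in (const regset *pred_out, int n_preds)
{
  regset in = 0;
  for (int i = 0; i < n_preds; i++)
    in |= pred_out[i];
  return in;
}

/* Registers defined on every incoming path: intersection over the
   predecessors' out sets (empty if there are no predecessors).  */
regset
must_init_in (const regset *pred_out, int n_preds)
{
  regset in = ~(regset) 0;
  for (int i = 0; i < n_preds; i++)
    in &= pred_out[i];
  return n_preds ? in : 0;
}

/* A register is flagged if it is live on entry to the block but missing
   from the given "initialized" set.  */
bool
needs_init (regset live_in, regset init_in, int regno)
{
  return (live_in & bit (regno)) != 0 && (init_in & bit (regno)) == 0;
}
```

With made-up predecessor sets in which reg 26 is defined on the back edge but
not on the loop entry path, needs_init reports nothing for reg 26 against
may_init_in (the init-regs situation) but does flag it against must_init_in.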

* [Bug target/104440] nvptx: FAIL: gcc.c-torture/execute/pr53465.c  execution test
From: vries at gcc dot gnu.org @ 2022-02-20 22:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104440

--- Comment #11 from Tom de Vries <vries at gcc dot gnu.org> ---
Posted patch:
https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590627.html

* [Bug target/104440] nvptx: FAIL: gcc.c-torture/execute/pr53465.c  execution test
  12 siblings, 0 replies; 14+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-02-21 15:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104440

--- Comment #12 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Tom de Vries <vries@gcc.gnu.org>:

https://gcc.gnu.org/g:02aedc6f269b5e3c1f354edcf5b84d27b0a15946

commit r12-7312-g02aedc6f269b5e3c1f354edcf5b84d27b0a15946
Author: Tom de Vries <tdevries@suse.de>
Date:   Wed Feb 16 17:09:11 2022 +0100

    [nvptx] Initialize ptx regs

    With nvptx target, driver version 510.47.03 and board GT 1030, we run
    into:
    ...
    FAIL: gcc.c-torture/execute/pr53465.c -O1 execution test
    FAIL: gcc.c-torture/execute/pr53465.c -O2 execution test
    FAIL: gcc.c-torture/execute/pr53465.c -O3 -g execution test
    ...
    while the test-cases pass with nvptx-none-run -O0.

    The problem is that the generated ptx contains a read from an uninitialized
    ptx register, and the driver JIT doesn't handle this well.

    For -O2 and -O3, we can get rid of the FAIL using --param
    logical-op-non-short-circuit=0.  But not for -O1.

    At -O1, the test-case minimizes to:
    ...
    void __attribute__((noinline, noclone))
    foo (int y) {
      int c;
      for (int i = 0; i < y; i++)
        {
          int d = i + 1;
          if (i && d <= c)
            __builtin_abort ();
          c = d;
        }
    }

    int main () {
      foo (2); return 0;
    }
    ...

    Note that the test-case does not contain an uninitialized use.  In the
    first iteration, i is 0 and consequently c is not read.  In the second
    iteration, c is read, but by that time it's already initialized by
    'c = d' from the first iteration.

    AFAICT the problem is introduced as follows: the conditional use of c in
    the loop body is translated into an unconditional use of c in the loop
    header:
    ...
      # c_1 = PHI <c_4(D)(2), c_9(6)>
    ...
    into which forwprop1 propagates the 'c_9 = d_7' assignment:
    ...
      # c_1 = PHI <c_4(D)(2), d_7(6)>
    ...
    which ends up being translated by expand into an unconditional:
    ...
    (insn 13 12 0 (set (reg/v:SI 22 [ c ])
            (reg/v:SI 23 [ d ])) -1
         (nil))
    ...
    at the start of the loop body, creating an uninitialized read of d on the
    path from loop entry.

    By disabling coalesce_ssa_name, we get the more usual copies on the
    incoming edges.  The copy on the loop entry path still does an
    uninitialized read, but that one's now initialized by init-regs.  The
    test-case passes, also when disabling init-regs, so it's possible that
    the JIT driver doesn't object to this type of uninitialized read.

    Now that we've characterized the problem to some degree, we need to fix
    this, because either:
    - we're violating an undocumented ptx invariant, and this is a compiler
      bug, or
    - this is a driver JIT bug and we need to work around it.

    There are essentially two strategies to address this:
    - stop the compiler from creating uninitialized reads
    - patch up uninitialized reads using additional initialization

    The former will probably involve:
    - making some optimizations more conservative in the presence of
      uninitialized reads, and
    - disabling some other optimizations (where making them more conservative
      is not possible, or cannot easily be achieved).
    This will probably have a cost penalty for code that does not suffer from
    the original problem.

    The latter has the problem that it may paper over uninitialized reads
    in the source code, or indeed over ones that were incorrectly introduced
    by the compiler.  But it has the advantage that it allows for the problem
    to be addressed at a single location.

    There's an existing pass, init-regs, which implements a form of the latter,
    but it doesn't work for this example because it only inserts additional
    initialization for uses that have no reaching definition at all.

    Fix this by adding initialization of uninitialized ptx regs in reorg.

    Control the new functionality using -minit-regs=<0|1|2|3>, meaning:
    - 0: disabled.
    - 1: add initialization of all regs at the entry bb
    - 2: add initialization of uninitialized regs at the entry bb
    - 3: add initialization of uninitialized regs close to the use
    and defaulting to 3.

    Tested on nvptx.

    gcc/ChangeLog:

    2022-02-17  Tom de Vries  <tdevries@suse.de>

            PR target/104440
            * config/nvptx/nvptx.cc (workaround_uninit_method_1)
            (workaround_uninit_method_2, workaround_uninit_method_3)
            (workaround_uninit): New function.
            (nvptx_reorg): Use workaround_uninit.
            * config/nvptx/nvptx.opt (minit-regs): New option.

* [Bug target/104440] nvptx: FAIL: gcc.c-torture/execute/pr53465.c  execution test
From: vries at gcc dot gnu.org @ 2022-02-21 15:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104440

Tom de Vries <vries at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED
   Target Milestone|---                         |12.0

--- Comment #13 from Tom de Vries <vries at gcc dot gnu.org> ---
Fixed by "[nvptx] Initialize ptx regs".
