public inbox for gcc-bugs@sourceware.org
* [Bug target/104440] New: nvptx: FAIL: gcc.c-torture/execute/pr53465.c  execution test
@ 2022-02-08  8:06 vries at gcc dot gnu.org
  2022-02-08  8:07 ` [Bug target/104440] " vries at gcc dot gnu.org
                   ` (12 more replies)
  0 siblings, 13 replies; 14+ messages in thread
From: vries at gcc dot gnu.org @ 2022-02-08  8:06 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104440

            Bug ID: 104440
           Summary: nvptx: FAIL: gcc.c-torture/execute/pr53465.c
                    execution test
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vries at gcc dot gnu.org
  Target Milestone: ---

With the nvptx target, driver version 510.47.03, and a GT 1030 board, I run into:
...
FAIL: gcc.c-torture/execute/pr53465.c   -O1  execution test
FAIL: gcc.c-torture/execute/pr53465.c   -O2  execution test
FAIL: gcc.c-torture/execute/pr53465.c   -O3 -g  execution test
...

The test passes when run with nvptx-none-run -O0:
...
$ ( export NVPTX_NONE_RUN="$(pwd -P)/install/bin/nvptx-none-run -O0" ;
./test.sh )
                === gcc Summary ===

# of expected passes            12
$
...

I can minimize it at -O1 to:
...
void __attribute__((noinline, noclone))
foo (int y)
{
  int i;
  int c;
  for (i = 0; i < y; i++)
    {
      int d = i + 1;
      if (i && d <= c)
        __builtin_abort ();
      c = d;
    }
}

int
main ()
{
  foo (2);
  return 0;
}
...

I can make the test pass by initializing c with any value (or by doing the
equivalent at ptx level).
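
A minimal sketch of that source-level workaround, for illustration only.  The
name foo_init, the initial value 0, and the int return value (added so the
result can be checked) are assumptions, not part of the original test case:

```c
#include <stdlib.h>

/* Same loop as the minimized test case, but with c initialized.
   The value 0 is arbitrary: c is overwritten at the end of the first
   iteration, before the source-level read in the second iteration.  */
__attribute__((noinline)) int
foo_init (int y)
{
  int i;
  int c = 0;
  for (i = 0; i < y; i++)
    {
      int d = i + 1;
      if (i && d <= c)
        abort ();
      c = d;
    }
  /* Hypothetical addition: return the last value stored in c.  */
  return c;
}
```

Calling foo_init (2) runs both iterations without reaching abort ().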

Note that the source-level read of c only happens from the second iteration on
(the && short-circuits while i is 0), by which time c has been assigned, so the
example is valid.

GCC, however, translates this at the GIMPLE level to:
...
    _1 = i != 0;
    _2 = d <= c;
    _3 = _1 & _2;
...
which does imply a read of c in the first iteration, while it is still
undefined.
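
To make the difference concrete, here is a sketch in C of the two shapes of the
condition.  The function names and the temporaries t1..t3 are hypothetical, and
c is passed as a parameter so that the sketch itself never reads an undefined
value:

```c
/* Source-level form: && short-circuits, so c is only read when i != 0.  */
int
cond_short_circuit (int i, int d, int c)
{
  return i && d <= c;
}

/* Shape of the flattened form: both operands are evaluated
   unconditionally and combined with a bitwise AND, so c is read
   even when i == 0.  */
int
cond_flattened (int i, int d, int c)
{
  int t1 = i != 0;
  int t2 = d <= c;
  int t3 = t1 & t2;
  return t3;
}
```

For defined inputs the two functions return the same result; they only differ
in whether c is read when i is 0.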

We can prevent this by using --param=logical-op-non-short-circuit=0, which
makes the minimized example pass, but not the original test case.

If we translate the example into CUDA, we see that the loop's first iteration
is peeled off, even at -O0.  As a result, there are two "d <= c" tests.  The
first one has an undefined input, but is dead code.  The second one has its
inputs defined on both loop entry and backedge.
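
A hand-written C sketch of that peeled shape, as an illustration only; it is
not actual CUDA compiler output.  The name foo_peeled and the int return value
are assumptions.  The dead first "d <= c" test is dropped here rather than
kept, so the sketch never reads c before it is written:

```c
#include <stdlib.h>

int
foo_peeled (int y)
{
  int i;
  int c;

  if (y <= 0)
    return 0;

  /* Peeled first iteration (i == 0): "i && d <= c" is known to be
     false, so the comparison that would read the undefined c is dead
     and has been dropped.  What remains is c = d with d = i + 1 = 1.  */
  c = 1;

  /* Remaining iterations: i != 0 always holds, and c is defined both
     on loop entry and on the backedge.  */
  for (i = 1; i < y; i++)
    {
      int d = i + 1;
      if (d <= c)
        abort ();
      c = d;
    }
  /* Hypothetical addition: return the last value stored in c.  */
  return c;
}
```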

We could try to report this to NVIDIA, but I'm not sure they would want to fix
it.  They've pushed back on examples involving reads from uninitialized
registers before, and judging by what CUDA does, they seem to try to ensure
that no register is read before it has been written.

Unfortunately, pass_initialize_regs does not insert the required init.

So, it looks like we'll have to fix this in the compiler.

Thread overview: 14+ messages
2022-02-08  8:06 [Bug target/104440] New: nvptx: FAIL: gcc.c-torture/execute/pr53465.c execution test vries at gcc dot gnu.org
2022-02-08  8:07 ` [Bug target/104440] " vries at gcc dot gnu.org
2022-02-08  8:09 ` pinskia at gcc dot gnu.org
2022-02-08  8:11 ` pinskia at gcc dot gnu.org
2022-02-08  8:16 ` vries at gcc dot gnu.org
2022-02-08  8:22 ` pinskia at gcc dot gnu.org
2022-02-08  9:25 ` rguenth at gcc dot gnu.org
2022-02-08  9:26 ` rguenth at gcc dot gnu.org
2022-02-17  7:20 ` vries at gcc dot gnu.org
2022-02-17  7:37 ` vries at gcc dot gnu.org
2022-02-17  7:58 ` vries at gcc dot gnu.org
2022-02-20 22:48 ` vries at gcc dot gnu.org
2022-02-21 15:51 ` cvs-commit at gcc dot gnu.org
2022-02-21 15:52 ` vries at gcc dot gnu.org
