public inbox for gcc-bugs@sourceware.org
* [Bug target/104440] New: nvptx: FAIL: gcc.c-torture/execute/pr53465.c execution test
@ 2022-02-08 8:06 vries at gcc dot gnu.org
2022-02-08 8:07 ` [Bug target/104440] " vries at gcc dot gnu.org
` (12 more replies)
0 siblings, 13 replies; 14+ messages in thread
From: vries at gcc dot gnu.org @ 2022-02-08 8:06 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104440
Bug ID: 104440
Summary: nvptx: FAIL: gcc.c-torture/execute/pr53465.c execution test
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: vries at gcc dot gnu.org
Target Milestone: ---
With nvptx target, driver version 510.47.03 and board GT 1030 I run into:
...
FAIL: gcc.c-torture/execute/pr53465.c -O1 execution test
FAIL: gcc.c-torture/execute/pr53465.c -O2 execution test
FAIL: gcc.c-torture/execute/pr53465.c -O3 -g execution test
...
Passes with nvptx-none-run -O0:
...
$ ( export NVPTX_NONE_RUN="$(pwd -P)/install/bin/nvptx-none-run -O0" ; ./test.sh )
=== gcc Summary ===
# of expected passes 12
$
...
I can minimize it at -O1 to:
...
void __attribute__((noinline, noclone))
foo (int y)
{
int i;
int c;
for (i = 0; i < y; i++)
{
int d = i + 1;
if (i && d <= c)
__builtin_abort ();
c = d;
}
}
int
main ()
{
foo (2);
return 0;
}
...
I can make the test pass by initializing c with any value (or by doing the
equivalent at ptx level).
Note that the read of c in the loop body only happens in the second iteration,
by which time it's initialized, so the example is valid.
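The source-level workaround described above can be sketched as follows. This is only an illustration: the name foo_init, the initial value 0 and the returned iteration count are assumptions added here so the behaviour can be checked; the testsuite file itself is not changed this way.

```c
#include <stdlib.h>

/* Variant of foo with c initialized.  The value is arbitrary: c is
   overwritten by 'c = d' before the first iteration that reads it.  */
int
foo_init (int y)
{
  int i;
  int c = 0; /* arbitrary initial value (assumption for illustration) */
  int iterations = 0;
  for (i = 0; i < y; i++)
    {
      int d = i + 1;
      if (i && d <= c)
        abort ();
      c = d;
      iterations++;
    }
  return iterations;
}
```

Calling foo_init (2) runs both iterations without aborting, as the uninitialized original does on targets that tolerate the undefined read.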
Gcc however translates this at gimple level to:
...
_1 = i != 0;
_2 = d <= c;
_3 = _1 & _2;
...
which does imply a read of c while it's undefined.
We can prevent this by using --param=logical-op-non-short-circuit=0, and that
makes the minimized example pass. But not the original example.
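To make the difference concrete, here is a small C sketch of the two forms of the condition. The function names and temporaries are made up for illustration. Both functions return the same value for any defined c; the only difference is that the second one reads c even when i is 0.

```c
#include <stdbool.h>

/* Short-circuit form: c is only evaluated when i is non-zero.  */
bool
cond_short_circuit (int i, int d, int c)
{
  return i && d <= c;
}

/* Non-short-circuit form, mirroring the gimple above: both operands
   are evaluated unconditionally and combined with a bitwise and.  */
bool
cond_non_short_circuit (int i, int d, int c)
{
  bool t1 = i != 0;
  bool t2 = d <= c; /* reads c even when i == 0 */
  bool t3 = t1 & t2;
  return t3;
}
```

Because c is a by-value parameter here, it is always defined in this sketch; in the minimized example the unconditional read is the one that touches the undefined register.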
If we translate the example into cuda, we see that the loop's first iteration
is peeled off, even at -O0. This has the effect that there are two "d <= c"
tests. The first one has an undefined input, but is dead code. The second one
has its inputs defined on both loop entry and backedge.
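A hand-written C sketch of that peeling, as an illustration only (this is not the actual cuda output; foo_peeled and the checks counter are names introduced here):

```c
#include <stdlib.h>

/* Hand-peeled variant of foo: the first iteration (i == 0) is split off,
   so the loop proper only reads c after it has been assigned.  */
int
foo_peeled (int y)
{
  int c;
  int checks = 0; /* number of 'd <= c' tests actually executed */
  if (y <= 0)
    return checks;

  /* Peeled iteration, i == 0: 'i && d <= c' is known to be false, so the
     test on the undefined c is dead.  Only 'c = d' with d == 1 remains.  */
  c = 1;

  for (int i = 1; i < y; i++)
    {
      int d = i + 1;
      checks++;
      if (d <= c)
        abort ();
      c = d;
    }
  return checks;
}
```

In this form c is defined both on loop entry and on the backedge, which is the invariant the peeled cuda code appears to maintain.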
We could try to report this to nvidia, but I'm not sure they want to fix this.
They've pushed back on examples involving reads from uninitialized regs before,
and looking at what cuda does, it seems they try to ensure this invariant.
Unfortunately, pass_initialize_regs does not insert the required init.
So, it looks like we'll have to fix this in the compiler.
* [Bug target/104440] nvptx: FAIL: gcc.c-torture/execute/pr53465.c execution test
From: vries at gcc dot gnu.org @ 2022-02-08 8:07 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104440
Tom de Vries <vries at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target| |nvptx
Keywords| |testsuite-fail
--- Comment #1 from Tom de Vries <vries at gcc dot gnu.org> ---
Tentative patch that fixes example:
...
diff --git a/gcc/config/nvptx/nvptx.cc b/gcc/config/nvptx/nvptx.cc
index 5b26c0f4c7dd..4dc154434853 100644
--- a/gcc/config/nvptx/nvptx.cc
+++ b/gcc/config/nvptx/nvptx.cc
@@ -1565,6 +1565,23 @@ nvptx_declare_function_name (FILE *file, const char *name, const_tree decl)
fprintf (file, "\t.reg%s ", nvptx_ptx_type_from_mode (mode, true));
output_reg (file, i, split, -2);
fprintf (file, ";\n");
+ switch (mode)
+ {
+ case HImode:
+ fprintf (file, "\tmov.u16 %%r%d, 0;\n", i);
+ break;
+ case SImode:
+ fprintf (file, "\tmov.u32 %%r%d, 0;\n", i);
+ break;
+ case DImode:
+ fprintf (file, "\tmov.u64 %%r%d, 0;\n", i);
+ break;
+ case BImode:
+ fprintf (file, "\tsetp.ne.u32 %%r%d,0,0;\n", i);
+ break;
+ default:
+ gcc_unreachable ();
+ }
}
}
...
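For readers who want to try the per-mode mapping outside of GCC, the switch in the patch can be modeled standalone. This is a sketch under assumptions: the enum sketch_mode and the function sketch_init_insn are hypothetical stand-ins and not part of nvptx.cc; only the emitted strings are taken from the patch.

```c
#include <stdio.h>

/* Hypothetical stand-in for the machine modes handled by the patch.  */
enum sketch_mode { SK_HImode, SK_SImode, SK_DImode, SK_BImode };

/* Write the ptx init instruction for register number regno into buf.
   Returns the snprintf result, or -1 for an unknown mode.  */
int
sketch_init_insn (char *buf, size_t size, enum sketch_mode mode, int regno)
{
  switch (mode)
    {
    case SK_HImode:
      return snprintf (buf, size, "\tmov.u16 %%r%d, 0;\n", regno);
    case SK_SImode:
      return snprintf (buf, size, "\tmov.u32 %%r%d, 0;\n", regno);
    case SK_DImode:
      return snprintf (buf, size, "\tmov.u64 %%r%d, 0;\n", regno);
    case SK_BImode:
      /* The patch initializes predicate registers with a comparison
         that is always false.  */
      return snprintf (buf, size, "\tsetp.ne.u32 %%r%d,0,0;\n", regno);
    }
  return -1;
}
```

Note that the patch emits one such instruction for every declared register, regardless of whether it is ever read uninitialized.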
* [Bug target/104440] nvptx: FAIL: gcc.c-torture/execute/pr53465.c execution test
From: pinskia at gcc dot gnu.org @ 2022-02-08 8:09 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104440
--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I thought there was another bug that reported a similar issue.
* [Bug target/104440] nvptx: FAIL: gcc.c-torture/execute/pr53465.c execution test
From: pinskia at gcc dot gnu.org @ 2022-02-08 8:11 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104440
--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
>Unfortunately, pass_initialize_regs does not insert the required init.
There are some ideas about getting rid of pass_initialize_regs for GCC 13 too.
* [Bug target/104440] nvptx: FAIL: gcc.c-torture/execute/pr53465.c execution test
From: vries at gcc dot gnu.org @ 2022-02-08 8:16 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104440
--- Comment #4 from Tom de Vries <vries at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #2)
> I thought there was another bug that reported a similar issue.
You mean related to nvptx, or in general?
FWIW, I do remember looking at this issue before in the nvptx context, but I
couldn't find a related PR.
* [Bug target/104440] nvptx: FAIL: gcc.c-torture/execute/pr53465.c execution test
From: pinskia at gcc dot gnu.org @ 2022-02-08 8:22 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104440
--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Tom de Vries from comment #4)
> (In reply to Andrew Pinski from comment #2)
> > I thought there was another bug that reported a similar issue.
>
> You mean related to nvptx, or in general?
It was in general. PR 21111 is related but not the same issue.
PR 61810 is the one pointing out the problems with init-regs and talking about
removing it.
* [Bug target/104440] nvptx: FAIL: gcc.c-torture/execute/pr53465.c execution test
From: rguenth at gcc dot gnu.org @ 2022-02-08 9:25 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104440
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Last reconfirmed| |2022-02-08
Ever confirmed|0 |1
Status|UNCONFIRMED |NEW
--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #5)
> (In reply to Tom de Vries from comment #4)
> > (In reply to Andrew Pinski from comment #2)
> > > I thought there was another bug that reported a similar issue.
> >
> > You mean related to nvptx, or in general?
>
> It was in general. PR 21111 is related but not the same issue.
>
>
> PR 61810 is the one pointing out the problems with init-regs and talking
> about removing it.
I think there's a bug that ifcombine produces the situation and that
valgrind complains about uninitialized uses.
Note that indeed the init-regs pass should go away.
It's somewhat unfeasible to compute a must-initialized regs set, so the issue
is really hard to avoid. But nobody tried yet (it would also come at a cost
of course). It would definitely inhibit all early short-circuiting on
GENERIC (a good thing, but with a lot of fallout I think).
That said, --param logical-op-non-short-circuit=0 is only a workaround until
you hit ifcombine doing similar transforms.
LOGICAL_OP_NON_SHORT_CIRCUIT is a target macro btw, so you could arrange it
to be zero for nvptx (but that's of course too late since the host's
LOGICAL_OP_NON_SHORT_CIRCUIT value will be used).
* [Bug target/104440] nvptx: FAIL: gcc.c-torture/execute/pr53465.c execution test
From: rguenth at gcc dot gnu.org @ 2022-02-08 9:26 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104440
--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #6)
> (In reply to Andrew Pinski from comment #5)
> > (In reply to Tom de Vries from comment #4)
> > > (In reply to Andrew Pinski from comment #2)
> > > > I thought there was another bug that reported a similar issue.
> > >
> > > You mean related to nvptx, or in general?
> >
> > It was in general. PR 21111 is related but not the same issue.
> >
> >
> > PR 61810 is the one pointing out the problems with init-regs and talking
> > about removing it.
>
> I think there's a bug that ifcombine produces the situation and that
> valgrind complains about uninitialized uses.
>
> Note that indeed the init-regs pass should go away.
>
> It's somewhat unfeasible to compute a must-initialized regs set, so the issue
> is really hard to avoid. But nobody tried yet (it would also come at a cost
> of course). It would definitely inhibit all early short-circuiting on
> GENERIC (a good thing, but with a lot of fallout I think).
>
> That said, --param logical-op-non-short-circuit=0 is only a workaround until
> you hit ifcombine doing similar transforms.
>
> LOGICAL_OP_NON_SHORT_CIRCUIT is a target macro btw, so you could arrange it
> to be zero for nvptx (but that's of course too late since the host's
> LOGICAL_OP_NON_SHORT_CIRCUIT value will be used).
But it would at least prevent ifcombine from doing the transform (ifcombine
only runs on the offload side).
* [Bug target/104440] nvptx: FAIL: gcc.c-torture/execute/pr53465.c execution test
From: vries at gcc dot gnu.org @ 2022-02-17 7:20 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104440
--- Comment #8 from Tom de Vries <vries at gcc dot gnu.org> ---
Created attachment 52456
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52456&action=edit
Tentative patch, introducing -minit-regs=<0|1|2>
This patch fixes the problem, and survived a standalone build and gcc testsuite
run.
Currently testing in offloading setting.
Uses DF_MIR_IN, otherwise only used by pass_ree.
Effect of -minit-regs=1 (insert init at function entry):
...
.reg.pred %r50;
.reg.pred %r51;
.reg.u32 %r52;
+ mov.u32 %r26, 0;
mov.u64 %r33, %ar0;
mov.u32 %r34, %ar1;
setp.le.s32 %r35, %r34, 0;
...
Effect of -minit-regs=2 (insert init close to use):
...
add.u64 %r32, %r33, %r37;
mov.u32 %r31, 0;
mov.u32 %r52, 1;
+ mov.u32 %r26, 0;
$L4:
mov.u32 %r30, %r26;
ld.u32 %r26, [%r25];
...
* [Bug target/104440] nvptx: FAIL: gcc.c-torture/execute/pr53465.c execution test
From: vries at gcc dot gnu.org @ 2022-02-17 7:37 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104440
--- Comment #9 from Tom de Vries <vries at gcc dot gnu.org> ---
(In reply to Tom de Vries from comment #1)
> Tentative patch that fixes example:
> ...
> diff --git a/gcc/config/nvptx/nvptx.cc b/gcc/config/nvptx/nvptx.cc
> index 5b26c0f4c7dd..4dc154434853 100644
> --- a/gcc/config/nvptx/nvptx.cc
> +++ b/gcc/config/nvptx/nvptx.cc
> @@ -1565,6 +1565,23 @@ nvptx_declare_function_name (FILE *file, const char *name, const_tree decl)
> fprintf (file, "\t.reg%s ", nvptx_ptx_type_from_mode (mode, true));
> output_reg (file, i, split, -2);
> fprintf (file, ";\n");
> + switch (mode)
> + {
> + case HImode:
> + fprintf (file, "\tmov.u16 %%r%d, 0;\n", i);
> + break;
> + case SImode:
> + fprintf (file, "\tmov.u32 %%r%d, 0;\n", i);
> + break;
> + case DImode:
> + fprintf (file, "\tmov.u64 %%r%d, 0;\n", i);
> + break;
> + case BImode:
> + fprintf (file, "\tsetp.ne.u32 %%r%d,0,0;\n", i);
> + break;
> + default:
> + gcc_unreachable ();
> + }
> }
> }
>
> ...
FWIW, I've tested this patch (extended a bit to handle all cases) but ran into
trouble in the libgomp testsuite, with running out of resources. So this
approach is too resource-hungry.
* [Bug target/104440] nvptx: FAIL: gcc.c-torture/execute/pr53465.c execution test
From: vries at gcc dot gnu.org @ 2022-02-17 7:58 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104440
--- Comment #10 from Tom de Vries <vries at gcc dot gnu.org> ---
A good thing to note at this point: why doesn't init-regs work here?
The pass works per insn, and when hitting the insn with the problematic use:
...
(gdb) call debug_rtx (insn)
(insn 18 17 19 4 (set (reg/v:SI 30 [ c ])
(reg/v:SI 26 [ d ])) 6 {*movsi_insn}
(expr_list:REG_DEAD (reg/v:SI 26 [ d ])
(nil)))
...
and dealing with the use of reg 26 we test:
...
/* A use is MUST uninitialized if it reaches the top of
the block from the inside of the block (the lr test)
and no def for it reaches the top of the block from
outside of the block (the ur test). */
if (bitmap_bit_p (lr, regno)
&& (!bitmap_bit_p (ur, regno)))
...
where:
...
bitmap lr = DF_LR_IN (bb);
bitmap ur = DF_LIVE_IN (bb);
...
But we have:
...
(gdb) p bitmap_bit_p (lr, regno)
$1 = true
(gdb) p bitmap_bit_p (ur, regno)
$2 = true
...
so the test fails.
In terms of the rtl-dump, the insn is here:
...
(code_label 43 42 17 4 4 (nil) [1 uses])
(note 17 43 18 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
(insn 18 17 19 4 (set (reg/v:SI 30 [ c ])
(reg/v:SI 26 [ d ])) 6 {*movsi_insn}
(expr_list:REG_DEAD (reg/v:SI 26 [ d ])
(nil)))
...
and the corresponding df info is:
...
( 7 3 )->[4]->( 8 5 )
;; bb 4 artificial_defs: { }
;; bb 4 artificial_uses: { u-1(1){ }u-1(2){ }u-1(3){ }}
;; lr in 1 [%stack] 2 [%frame] 3 [%args] 25 26 31 32 52
;; lr use 1 [%stack] 2 [%frame] 3 [%args] 25 26
;; lr def 26 30 38
;; live in 1 [%stack] 2 [%frame] 3 [%args] 25 26 31 32 52
;; live gen 26 30 38
;; live kill
;; lr out 1 [%stack] 2 [%frame] 3 [%args] 25 26 30 31 32 52
;; live out 1 [%stack] 2 [%frame] 3 [%args] 25 26 30 31 32 52
...
In short, init-regs doesn't work for this example, because reg 26 is defined on
some incoming path, and ptx (or the JIT implementation) needs it to be defined
on all incoming paths.
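The distinction between "defined on some incoming path" and "defined on all incoming paths" can be sketched with a toy dataflow solver. Everything below is a hypothetical miniature (the sk_ names, the three-block CFG, the bitmask representation) and only loosely mirrors the df framework: may_in plays the role of the ur bitmap in the test quoted above, and must_in the role of a must-initialized set such as DF_MIR_IN.

```c
#include <stdbool.h>

#define SK_NBLOCKS 3
#define SK_MAXPREDS 2

/* Hypothetical miniature CFG: each block lists its predecessors and the
   set of registers it defines (one bit per register).  */
struct sk_block
{
  int npreds;
  int preds[SK_MAXPREDS];
  unsigned gen;
};

/* For every block, compute the registers defined on SOME incoming path
   (may_in) and on ALL incoming paths (must_in).  Block 0 is the entry.  */
void
sk_solve (const struct sk_block *bb, unsigned *may_in, unsigned *must_in)
{
  unsigned may_out[SK_NBLOCKS], must_out[SK_NBLOCKS];
  for (int b = 0; b < SK_NBLOCKS; b++)
    {
      may_in[b] = may_out[b] = 0;      /* "may" sets start empty and grow */
      must_in[b] = must_out[b] = ~0u;  /* "must" sets start full and shrink */
    }

  bool changed = true;
  while (changed)
    {
      changed = false;
      for (int b = 0; b < SK_NBLOCKS; b++)
        {
          /* Nothing is defined at function entry.  */
          unsigned may = 0, must = b == 0 ? 0 : ~0u;
          for (int p = 0; p < bb[b].npreds; p++)
            {
              may |= may_out[bb[b].preds[p]];   /* union over preds */
              must &= must_out[bb[b].preds[p]]; /* intersection over preds */
            }
          unsigned new_may_out = may | bb[b].gen;
          unsigned new_must_out = must | bb[b].gen;
          if (may != may_in[b] || must != must_in[b]
              || new_may_out != may_out[b] || new_must_out != must_out[b])
            changed = true;
          may_in[b] = may;
          must_in[b] = must;
          may_out[b] = new_may_out;
          must_out[b] = new_must_out;
        }
    }
}

/* init-regs style test: live at block entry and defined on NO path.  */
bool
sk_init_regs_would_init (unsigned lr_in, unsigned may_in, int regno)
{
  return ((lr_in >> regno) & 1) && !((may_in >> regno) & 1);
}

/* Stricter test: live at block entry and not defined on ALL paths.  */
bool
sk_needs_init (unsigned lr_in, unsigned must_in, int regno)
{
  return ((lr_in >> regno) & 1) && !((must_in >> regno) & 1);
}
```

With an entry block 0, a loop block 1 that defines bit 26 and has predecessors 0 and 1, and an exit block 2, bit 26 ends up in may_in[1] but not in must_in[1]. The first predicate is therefore false for the loop block while the second is true, which matches the behaviour described above.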
* [Bug target/104440] nvptx: FAIL: gcc.c-torture/execute/pr53465.c execution test
From: vries at gcc dot gnu.org @ 2022-02-20 22:48 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104440
--- Comment #11 from Tom de Vries <vries at gcc dot gnu.org> ---
Posted patch:
https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590627.html
* [Bug target/104440] nvptx: FAIL: gcc.c-torture/execute/pr53465.c execution test
From: cvs-commit at gcc dot gnu.org @ 2022-02-21 15:51 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104440
--- Comment #12 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Tom de Vries <vries@gcc.gnu.org>:
https://gcc.gnu.org/g:02aedc6f269b5e3c1f354edcf5b84d27b0a15946
commit r12-7312-g02aedc6f269b5e3c1f354edcf5b84d27b0a15946
Author: Tom de Vries <tdevries@suse.de>
Date: Wed Feb 16 17:09:11 2022 +0100
[nvptx] Initialize ptx regs
With nvptx target, driver version 510.47.03 and board GT 1030, we run into:
...
FAIL: gcc.c-torture/execute/pr53465.c -O1 execution test
FAIL: gcc.c-torture/execute/pr53465.c -O2 execution test
FAIL: gcc.c-torture/execute/pr53465.c -O3 -g execution test
...
while the test-cases pass with nvptx-none-run -O0.
The problem is that the generated ptx contains a read from an uninitialized
ptx register, and the driver JIT doesn't handle this well.
For -O2 and -O3, we can get rid of the FAIL using --param
logical-op-non-short-circuit=0. But not for -O1.
At -O1, the test-case minimizes to:
...
void __attribute__((noinline, noclone))
foo (int y) {
int c;
for (int i = 0; i < y; i++)
{
int d = i + 1;
if (i && d <= c)
__builtin_abort ();
c = d;
}
}
int main () {
foo (2); return 0;
}
...
Note that the test-case does not contain an uninitialized use. In the first
iteration, i is 0 and consequently c is not read. In the second iteration, c
is read, but by that time it's already initialized by 'c = d' from the first
iteration.
AFAICT the problem is introduced as follows: the conditional use of c in the
loop body is translated into an unconditional use of c in the loop header:
...
# c_1 = PHI <c_4(D)(2), c_9(6)>
...
into which forwprop1 propagates the 'c_9 = d_7' assignment, giving:
...
# c_1 = PHI <c_4(D)(2), d_7(6)>
...
which ends up being translated by expand into an unconditional:
...
(insn 13 12 0 (set (reg/v:SI 22 [ c ])
(reg/v:SI 23 [ d ])) -1
(nil))
...
at the start of the loop body, creating an uninitialized read of d on the
path from loop entry.
By disabling coalesce_ssa_name, we get the more usual copies on the incoming
edges. The copy on the loop entry path still does an uninitialized read, but
that one's now initialized by init-regs. The test-case passes, also when
disabling init-regs, so it's possible that the JIT driver doesn't object to
this type of uninitialized read.
Now that we characterized the problem to some degree, we need to fix this,
because either:
- we're violating an undocumented ptx invariant, and this is a compiler bug,
  or
- this is a driver JIT bug and we need to work around it.
There are essentially two strategies to address this:
- stop the compiler from creating uninitialized reads
- patch up uninitialized reads using additional initialization
The former will probably involve:
- making some optimizations more conservative in the presence of
  uninitialized reads, and
- disabling some other optimizations (where making them more conservative is
  not possible, or cannot easily be achieved).
This will probably have a cost penalty for code that does not suffer from
the original problem.
The latter has the problem that it may paper over uninitialized reads
in the source code, or indeed over ones that were incorrectly introduced
by the compiler. But it has the advantage that it allows for the problem to
be addressed at a single location.
There's an existing pass, init-regs, which implements a form of the latter,
but it doesn't work for this example because it only inserts additional
initialization for uses that have no reaching definition at all.
Fix this by adding initialization of uninitialized ptx regs in reorg.
Control the new functionality using -minit-regs=<0|1|2|3>, meaning:
- 0: disabled.
- 1: add initialization of all regs at the entry bb
- 2: add initialization of uninitialized regs at the entry bb
- 3: add initialization of uninitialized regs close to the use
and defaulting to 3.
Tested on nvptx.
gcc/ChangeLog:
2022-02-17 Tom de Vries <tdevries@suse.de>
PR target/104440
* config/nvptx/nvptx.cc (workaround_uninit_method_1)
(workaround_uninit_method_2, workaround_uninit_method_3)
(workaround_uninit): New function.
(nvptx_reorg): Use workaround_uninit.
* config/nvptx/nvptx.opt (minit-regs): New option.
* [Bug target/104440] nvptx: FAIL: gcc.c-torture/execute/pr53465.c execution test
From: vries at gcc dot gnu.org @ 2022-02-21 15:52 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104440
Tom de Vries <vries at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution|--- |FIXED
Target Milestone|--- |12.0
--- Comment #13 from Tom de Vries <vries at gcc dot gnu.org> ---
Fixed by "[nvptx] Initialize ptx regs".
Thread overview: 14+ messages
2022-02-08 8:06 [Bug target/104440] New: nvptx: FAIL: gcc.c-torture/execute/pr53465.c execution test vries at gcc dot gnu.org
2022-02-08 8:07 ` [Bug target/104440] " vries at gcc dot gnu.org
2022-02-08 8:09 ` pinskia at gcc dot gnu.org
2022-02-08 8:11 ` pinskia at gcc dot gnu.org
2022-02-08 8:16 ` vries at gcc dot gnu.org
2022-02-08 8:22 ` pinskia at gcc dot gnu.org
2022-02-08 9:25 ` rguenth at gcc dot gnu.org
2022-02-08 9:26 ` rguenth at gcc dot gnu.org
2022-02-17 7:20 ` vries at gcc dot gnu.org
2022-02-17 7:37 ` vries at gcc dot gnu.org
2022-02-17 7:58 ` vries at gcc dot gnu.org
2022-02-20 22:48 ` vries at gcc dot gnu.org
2022-02-21 15:51 ` cvs-commit at gcc dot gnu.org
2022-02-21 15:52 ` vries at gcc dot gnu.org