public inbox for gcc-bugs@sourceware.org
* [Bug target/100232] New: [OpenMP][nvptx] Reduction fails with optimization and 'loop'/'for simd' but not with 'for'
From: burnus at gcc dot gnu.org @ 2021-04-23 13:20 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100232
Bug ID: 100232
Summary: [OpenMP][nvptx] Reduction fails with optimization and
'loop'/'for simd' but not with 'for'
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Keywords: openmp, wrong-code
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: burnus at gcc dot gnu.org
CC: vries at gcc dot gnu.org
Target Milestone: ---
Target: nvptx-none
Created attachment 50661
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50661&action=edit
Testcase: gcc -fopenmp -O1 (fails, -O0 works) - to be run with nvptx
offloading
(Based on https://github.com/SOLLVE/sollve_vv/ 's
tests/5.0/loop/test_loop_reduction_{and,or}_device.c )
The code works with nvptx offloading with -O0 but fails with -O1 and higher.
(It also works on AMD GCN or with host fallback.)
A logical-AND reduction over 'result = result && 1' yields 0 instead of the expected 1.
It works with 'for' but fails with 'loop' and 'for simd'; hence, I
think it might be related to SIMT (→ some other PRs about SIMT).
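For readers without the attachment, here is a minimal sketch of the kind of reduction described above. It is not the attached testcase: the function names, the array size N and the map clauses are assumptions made for illustration, and whether this reduced form reproduces the nvptx failure has not been checked. Built without -fopenmp, or run with host fallback, both variants should return 1.

```c
#include <assert.h>

#define N 1024 /* illustrative size, not taken from the attachment */

/* Logical-AND reduction with the 'loop' construct, the form reported
   to fail with nvptx offloading at -O1.  All inputs are 1, so the
   expected result is 1.  */
int and_reduction_loop(void)
{
  int a[N];
  int result = 1;

  for (int i = 0; i < N; i++)
    a[i] = 1;

#pragma omp target parallel map(to: a) map(tofrom: result)
  {
#pragma omp loop reduction(&&: result)
    for (int i = 0; i < N; i++)
      result = result && a[i];
  }
  return result;
}

/* The same reduction with 'for', the form reported to work.  */
int and_reduction_for(void)
{
  int a[N];
  int result = 1;

  for (int i = 0; i < N; i++)
    a[i] = 1;

#pragma omp target parallel map(to: a) map(tofrom: result)
  {
#pragma omp for reduction(&&: result)
    for (int i = 0; i < N; i++)
      result = result && a[i];
  }
  return result;
}

/* Host-side check: both variants should agree on 1.  */
void check_reductions(void)
{
  assert(and_reduction_loop() == 1);
  assert(and_reduction_for() == 1);
}
```

Comparing the two functions at -O0 and -O1 on an offloading build is one way to narrow such a failure down to the directive in use.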
From: vries at gcc dot gnu.org @ 2021-04-23 14:19 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100232
--- Comment #1 from Tom de Vries <vries at gcc dot gnu.org> ---
Can you try the patch for PR81778?
It's possible you're looking at a duplicate.
From: burnus at gcc dot gnu.org @ 2021-04-23 15:27 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100232
--- Comment #2 from Tobias Burnus <burnus at gcc dot gnu.org> ---
(In reply to Tom de Vries from comment #1)
> Can you try the patch for PR81778?
> It's possible you're looking at a duplicate.
Unfortunately, it does not seem to make a difference; it still fails.
From: vries at gcc dot gnu.org @ 2021-04-28 12:51 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100232
Tom de Vries <vries at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amonakov at gcc dot gnu.org
--- Comment #3 from Tom de Vries <vries at gcc dot gnu.org> ---
In expand_GOMP_SIMT_XCHG_BFLY, we have a subreg target:
...
(gdb) call debug_rtx ( target )
(subreg/s/u:QI (reg:SI 40 [ _61 ]) 0)
...
During expand_insn, the operands are legitimized, and this changes the state of
the output operand to:
...
(gdb) call debug_rtx ( ops[0].value )
(reg:QI 57)
...
So the value is written to reg 57, but never actually copied back to reg 40.
Tentative fix:
...
diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index dd7173126fb..28ae3ed167a 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -361,6 +361,8 @@ expand_GOMP_SIMT_XCHG_BFLY (internal_fn, gcall *stmt)
   create_input_operand (&ops[2], idx, SImode);
   gcc_assert (targetm.have_omp_simt_xchg_bfly ());
   expand_insn (targetm.code_for_omp_simt_xchg_bfly, 3, ops);
+  if (ops[0].value != target)
+    emit_move_insn (target, ops[0].value);
 }
 /* Exchange between SIMT lanes according to given source lane index.  */
...
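The effect described above can be sketched with a toy model. This is not GCC code: struct loc, legitimize_output and expand_model are hypothetical stand-ins, and the rule "a subreg is replaced by a fresh register" is an assumption chosen to mirror the debug output shown in this comment. The model only illustrates why a result written to a substituted location is lost unless it is copied back.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for RTL locations; not GCC's real data structures.  */
enum loc_kind { LOC_REG, LOC_SUBREG };

struct loc {
  enum loc_kind kind;
  int *storage; /* where a write to this location lands */
};

/* Hypothetical legitimization: the insn is assumed to accept only a
   plain register as output, so a subreg is replaced by a fresh temp.  */
static struct loc legitimize_output(struct loc suggested, int *fresh_temp)
{
  if (suggested.kind == LOC_REG)
    return suggested;
  struct loc tmp = { LOC_REG, fresh_temp };
  return tmp;
}

/* Model one expansion.  With copy_back false the result stays in the
   temporary; with copy_back true it is moved to the requested target.  */
int expand_model(bool copy_back)
{
  int reg40 = 0; /* the register later code reads */
  int reg57 = 0; /* temporary picked during legitimization */
  struct loc target = { LOC_SUBREG, &reg40 };

  struct loc out = legitimize_output(target, &reg57);
  *out.storage = 1; /* the expanded insn writes its result */

  if (copy_back && out.storage != target.storage)
    *target.storage = *out.storage; /* analogue of the added move */

  return reg40;
}

/* Without the copy the target keeps its old value; with it, the result.  */
void check_model(void)
{
  assert(expand_model(false) == 0);
  assert(expand_model(true) == 1);
}
```

In the model, as in the tentative fix, the copy is only needed when the location actually written differs from the one requested.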
From: vries at gcc dot gnu.org @ 2021-04-28 13:03 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100232
--- Comment #4 from Tom de Vries <vries at gcc dot gnu.org> ---
This commit:
...
commit 3af3bec2e4d344bd54a134d8b2263f44d788c3d8
Author: Richard Sandiford <richard.sandiford@arm.com>
Date: Mon May 4 21:21:16 2020 +0100
internal-fn: Avoid dropping the lhs of some calls [PR94941]
...
adds:
...
   expand_insn (get_multi_vector_move (type, optab), 2, ops);
+  if (!rtx_equal_p (target, ops[0].value))
+    emit_move_insn (target, ops[0].value);
...
in expand_load_lanes_optab_fn and mentions:
...
create_output_operand coerces an output operand to the insn's
predicates, using a suggested rtx location if convenient.
But if that rtx location is actually required rather than
optional, the builder of the insn has to emit a move afterwards.
(We could instead add a new interface that does this automatically,
but that's future work.)
This PR shows that we were failing to emit the move for some of the
vector load internal functions. I think there are other routines in
internal-fn.c that potentially have the same problem, but this patch is
supposed to be a conservative subset suitable for backporting to GCC 10.
...
From: vries at gcc dot gnu.org @ 2021-04-28 14:31 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100232
--- Comment #5 from Tom de Vries <vries at gcc dot gnu.org> ---
https://gcc.gnu.org/pipermail/gcc-patches/2021-April/569038.html
From: cvs-commit at gcc dot gnu.org @ 2021-04-29 7:55 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100232
--- Comment #6 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Tom de Vries <vries@gcc.gnu.org>:
https://gcc.gnu.org/g:4d7c874e2c64ebf7631049ace642d246843febae
commit r12-249-g4d7c874e2c64ebf7631049ace642d246843febae
Author: Tom de Vries <tdevries@suse.de>
Date: Wed Apr 28 16:00:01 2021 +0200
[omp, simt] Fix expand_GOMP_SIMT_*
When running the test-case included in this patch using an
nvptx accelerator, it fails in execution.
The problem is that the expansion of GOMP_SIMT_XCHG_BFLY is optimized away
during pass_jump as "trivially dead insns".
This is caused by this code in expand_GOMP_SIMT_XCHG_BFLY:
...
class expand_operand ops[3];
create_output_operand (&ops[0], target, mode);
...
expand_insn (targetm.code_for_omp_simt_xchg_bfly, 3, ops);
...
which doesn't guarantee that target is assigned to by the expanded insn.
For instance, if target is:
...
(gdb) call debug_rtx ( target )
(subreg/s/u:QI (reg:SI 40 [ _61 ]) 0)
...
then after expand_insn, we have:
...
(gdb) call debug_rtx ( ops[0].value )
(reg:QI 57)
...
See commit 3af3bec2e4d "internal-fn: Avoid dropping the lhs of some
calls [PR94941]" for a similar problem.
Fix this in the same way, by adding:
...
if (!rtx_equal_p (target, ops[0].value))
  emit_move_insn (target, ops[0].value);
...
where applicable in the expand_GOMP_SIMT_* functions.
Tested libgomp on x86_64 with nvptx accelerator.
gcc/ChangeLog:
2021-04-28 Tom de Vries <tdevries@suse.de>
PR target/100232
* internal-fn.c (expand_GOMP_SIMT_ENTER_ALLOC)
(expand_GOMP_SIMT_LAST_LANE, expand_GOMP_SIMT_ORDERED_PRED)
(expand_GOMP_SIMT_VOTE_ANY, expand_GOMP_SIMT_XCHG_BFLY)
(expand_GOMP_SIMT_XCHG_IDX): Ensure target is assigned to.
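As background on why a dropped exchange matters for a reduction, the following is a host-side sketch of a butterfly-style all-lanes reduction. It is a simulation under stated assumptions, not the code GCC generates: the lane count of 32, the helper names xchg_bfly and bfly_and_reduce, and the "lane ^ mask" partner rule are illustrative choices. Each step combines a lane's value with the value exchanged from its partner, so the final result is only right if every exchanged value really reaches the place that is read afterwards.

```c
#include <assert.h>

#define LANES 32 /* assumed SIMT width, chosen for illustration */

/* One butterfly exchange step: lane i receives the value held by
   lane i ^ mask.  */
static void xchg_bfly(const int in[LANES], int out[LANES], int mask)
{
  for (int lane = 0; lane < LANES; lane++)
    out[lane] = in[lane ^ mask];
}

/* All-lanes logical-AND reduction built from log2(LANES) exchange
   steps.  Returns lane 0's value; after the last step every lane
   holds the same result.  */
int bfly_and_reduce(const int init[LANES])
{
  int val[LANES], partner[LANES];

  for (int lane = 0; lane < LANES; lane++)
    val[lane] = init[lane];

  for (int mask = 1; mask < LANES; mask *= 2) {
    xchg_bfly(val, partner, mask);
    for (int lane = 0; lane < LANES; lane++)
      val[lane] = val[lane] && partner[lane];
  }
  return val[0];
}

/* All ones reduce to 1; a single 0 in any lane reduces to 0.  */
void check_bfly(void)
{
  int v[LANES];

  for (int lane = 0; lane < LANES; lane++)
    v[lane] = 1;
  assert(bfly_and_reduce(v) == 1);

  v[17] = 0;
  assert(bfly_and_reduce(v) == 0);
}
```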
From: cvs-commit at gcc dot gnu.org @ 2021-04-29 8:40 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100232
--- Comment #7 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-11 branch has been updated by Tom de Vries
<vries@gcc.gnu.org>:
https://gcc.gnu.org/g:f94c6caac7f03815c26c03a532f834c37517519c
commit r11-8324-gf94c6caac7f03815c26c03a532f834c37517519c
Author: Tom de Vries <tdevries@suse.de>
Date: Wed Apr 28 16:00:01 2021 +0200
[omp, simt] Fix expand_GOMP_SIMT_*
(cherry picked from commit 4d7c874e2c64ebf7631049ace642d246843febae)
From: vries at gcc dot gnu.org @ 2021-04-29 9:08 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100232
Tom de Vries <vries at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
   Target Milestone|---                         |11.2
         Resolution|---                         |FIXED
--- Comment #8 from Tom de Vries <vries at gcc dot gnu.org> ---
I tried backporting to releases/gcc-10, but ran into:
...
FAIL: libgomp.c/target-43.c (test for excess errors)
Excess errors:
unresolved symbol __sync_val_compare_and_swap_1
mkoffload: fatal error:
/home/vries/oacc/trunk/install/offload-nvptx-none/bin//x86_64-pc-linux-gnu-accel-nvptx-none-gcc
returned 1 exit status
compilation terminated.
...
So I guess backporting stops at gcc-11.