* [Bug target/100321] [OpenMP][nvptx] (Con't) Reduction fails with optimization and 'loop'/'for simd' but not with 'for'
2021-04-28 15:36 [Bug target/100321] New: [OpenMP][nvptx] (Con't) Reduction fails with optimization and 'loop'/'for simd' but not with 'for' burnus at gcc dot gnu.org
@ 2021-04-28 21:15 ` vries at gcc dot gnu.org
2021-04-29 6:24 ` [Bug target/100321] [OpenMP][nvptx, SIMT] " vries at gcc dot gnu.org
` (5 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: vries at gcc dot gnu.org @ 2021-04-28 21:15 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100321
Tom de Vries <vries at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amonakov at gcc dot gnu.org
--- Comment #1 from Tom de Vries <vries at gcc dot gnu.org> ---
There's no intra-warp exchange code generated, i.e. no GOMP_SIMT_XCHG_* calls:
...
$ grep GOMP_SIMT test.xnvptx-none.mkoffload.244t.optimized
_43 = .GOMP_SIMT_ENTER_ALLOC (16, 4);
_45 = .GOMP_SIMT_LANE ();
.GOMP_SIMT_EXIT (_43);
$
...
* [Bug target/100321] [OpenMP][nvptx, SIMT] (Con't) Reduction fails with optimization and 'loop'/'for simd' but not with 'for'
From: vries at gcc dot gnu.org @ 2021-04-29 6:24 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100321
Tom de Vries <vries at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[OpenMP][nvptx] (Con't)     |[OpenMP][nvptx, SIMT]
                   |Reduction fails with        |(Con't) Reduction fails
                   |optimization and            |with optimization and
                   |'loop'/'for simd' but not   |'loop'/'for simd' but not
                   |with 'for'                  |with 'for'
--- Comment #2 from Tom de Vries <vries at gcc dot gnu.org> ---
FTR, example minimized to:
...
// { dg-additional-options "-foffload=-latomic" }
#include <iostream>
#include <cstdlib>
#include <cmath>
#include <complex>
using std::complex;
#pragma omp declare reduction(+: complex<int>: omp_out += omp_in)
int
main (void)
{
  const int N0 { 32768 };
  const complex<int> expected_value { N0, 0 };
  complex<int> counter_N0 { 0, 0 };
#pragma omp target
#pragma omp for simd reduction(+: counter_N0)
  for (int i0 = 0 ; i0 < N0 ; i0++ )
    counter_N0 += complex<int> { 1, 0 };
  std::cerr << "Expected: " << expected_value << std::endl;
  std::cerr << "Got : " << counter_N0 << std::endl;
  return 0;
}
...
* [Bug target/100321] [OpenMP][nvptx, SIMT] (Con't) Reduction fails with optimization and 'loop'/'for simd' but not with 'for'
From: vries at gcc dot gnu.org @ 2021-04-29 15:09 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100321
--- Comment #3 from Tom de Vries <vries at gcc dot gnu.org> ---
C example:
...
/* { dg-additional-options "-foffload=-latomic" } */
#include <stdio.h>
struct s
{
  int i;
};
#pragma omp declare reduction(+: struct s: omp_out.i += omp_in.i)
int
main (void)
{
  const int N0 = 32768;
  printf ("Expected: %d\n", N0);
  struct s counter_N0 = { 0 };
#pragma omp target
#pragma omp for simd reduction(+: counter_N0)
  for (int i0 = 0 ; i0 < N0 ; i0++ )
    counter_N0.i += 1;
  printf ("Got : %d\n", counter_N0.i);
  return 0;
}
...
* [Bug target/100321] [OpenMP][nvptx, SIMT] (Con't) Reduction fails with optimization and 'loop'/'for simd' but not with 'for'
From: vries at gcc dot gnu.org @ 2021-04-29 15:18 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100321
--- Comment #4 from Tom de Vries <vries at gcc dot gnu.org> ---
During lower_rec_input_clauses in omp-low.c, the reduction clause is handled:
...
        case OMP_CLAUSE_REDUCTION:
        case OMP_CLAUSE_IN_REDUCTION:
          /* OpenACC reductions are initialized using the
             GOACC_REDUCTION internal function.  */
          if (is_gimple_omp_oacc (ctx->stmt))
            break;
          if (OMP_CLAUSE_REDUCTION_PLACEHOLDER (c))
...
AFAICT, the problem is that the SIMT handling code is added only in the
!OMP_CLAUSE_REDUCTION_PLACEHOLDER (c) case, whereas this test-case takes the
OMP_CLAUSE_REDUCTION_PLACEHOLDER (c) path instead.
So, something like this reflects the current state:
...
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 7b122059c6e..a0561800977 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -6005,6 +6005,11 @@ lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist,
   tree placeholder = OMP_CLAUSE_REDUCTION_PLACEHOLDER (c);
   gimple *tseq;
   tree ptype = TREE_TYPE (placeholder);
+  if (sctx.is_simt)
+    {
+      sorry ("SIMT not fully implemented");
+      abort ();
+    }
   if (cond)
     {
       x = error_mark_node;
...
* [Bug target/100321] [OpenMP][nvptx, SIMT] (Con't) Reduction fails with optimization and 'loop'/'for simd' but not with 'for'
From: vries at gcc dot gnu.org @ 2021-05-03 10:26 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100321
--- Comment #5 from Tom de Vries <vries at gcc dot gnu.org> ---
(In reply to Tom de Vries from comment #4)
> So, something like this reflects the current state:
> ...
> diff --git a/gcc/omp-low.c b/gcc/omp-low.c
> index 7b122059c6e..a0561800977 100644
> --- a/gcc/omp-low.c
> +++ b/gcc/omp-low.c
> @@ -6005,6 +6005,11 @@ lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist,
>    tree placeholder = OMP_CLAUSE_REDUCTION_PLACEHOLDER (c);
>    gimple *tseq;
>    tree ptype = TREE_TYPE (placeholder);
> +  if (sctx.is_simt)
> +    {
> +      sorry ("SIMT not fully implemented");
> +      abort ();
> +    }
>    if (cond)
>      {
>        x = error_mark_node;
> ...
Submitted a patch that does something similar (but using error rather than
sorry/abort): https://gcc.gnu.org/pipermail/gcc-patches/2021-May/569421.html
* [Bug target/100321] [OpenMP][nvptx, SIMT] (Con't) Reduction fails with optimization and 'loop'/'for simd' but not with 'for'
From: cvs-commit at gcc dot gnu.org @ 2021-05-03 21:14 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100321
--- Comment #6 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Tom de Vries <vries@gcc.gnu.org>:
https://gcc.gnu.org/g:f87990a2a8fc9e20d30462a0a4c9047582af0cd9
commit r12-395-gf87990a2a8fc9e20d30462a0a4c9047582af0cd9
Author: Tom de Vries <tdevries@suse.de>
Date: Mon May 3 11:36:14 2021 +0200
[openmp, simt] Disable SIMT for user-defined reduction
The test-case included in this patch contains this target region:
...
  for (int i0 = 0 ; i0 < N0 ; i0++ )
    counter_N0.i += 1;
...
When running with an nvptx accelerator, the counter variable is expected to
be N0 after the region, but instead is N0 / 32. The problem is that rather
than getting the result for all warp lanes, we get it for just one lane.
This is caused by the implementation of SIMT being incomplete. It handles
regular reductions, but apparently not user-defined reductions.
For now, handle this by disabling SIMT in this case, specifically by setting
sctx->max_vf to 1.
Tested libgomp on x86_64-linux with nvptx accelerator.
gcc/ChangeLog:
2021-05-03 Tom de Vries <tdevries@suse.de>
PR target/100321
* omp-low.c (lower_rec_input_clauses): Disable SIMT for user-defined
reduction.
libgomp/ChangeLog:
2021-05-03 Tom de Vries <tdevries@suse.de>
PR target/100321
* testsuite/libgomp.c/target-44.c: New test.
* [Bug target/100321] [OpenMP][nvptx, SIMT] (Con't) Reduction fails with optimization and 'loop'/'for simd' but not with 'for'
From: vries at gcc dot gnu.org @ 2021-05-03 21:15 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100321
Tom de Vries <vries at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
   Target Milestone|---                         |12.0
         Resolution|---                         |FIXED
--- Comment #7 from Tom de Vries <vries at gcc dot gnu.org> ---
Patch committed, marking resolved-fixed.