public inbox for gcc-bugs@sourceware.org
* [Bug target/100321] New: [OpenMP][nvptx] (Con't) Reduction fails with optimization and 'loop'/'for simd' but not with 'for'
@ 2021-04-28 15:36 burnus at gcc dot gnu.org
  2021-04-28 21:15 ` [Bug target/100321] " vries at gcc dot gnu.org
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: burnus at gcc dot gnu.org @ 2021-04-28 15:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100321

            Bug ID: 100321
           Summary: [OpenMP][nvptx] (Con't) Reduction fails with
                    optimization and 'loop'/'for simd' but not with 'for'
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Keywords: openmp, wrong-code
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: burnus at gcc dot gnu.org
                CC: vries at gcc dot gnu.org
  Target Milestone: ---
            Target: nvptx-none

Created attachment 50703
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50703&action=edit
target_parallel_for_simd.cpp - compile with g++ -fopenmp -O1 (and nvptx
offloading)

Similar to PR target/100232.
I had hoped that the posted patch would solve this issue as well, but it does
not :-/
[ https://gcc.gnu.org/pipermail/gcc-patches/2021-April/569038.html ]
(However, it does solve the two sollve_vv issues I mentioned in PR100232 :-)
Thanks!)

Namely, https://github.com/TApplencourt/OvO 's
test_src/cpp/hierarchical_parallelism/reduction_add-complex_double/target_parallel_for_simd.cpp

(also attached) works on the host and AMD GCN, but with nvptx:

  g++ -fopenmp -O1 target_parallel_for_simd.cpp -foffload=-latomic

it fails with

  Expected: (32768,0) Got: (1024,0)

(with exit status code 112)

The -O1 is needed due to the missing .alias.

When removing the 'simd' from
    #pragma omp target parallel for simd map(tofrom: counter_N0) reduction(+: counter_N0)
it does work.


* [Bug target/100321] [OpenMP][nvptx] (Con't) Reduction fails with optimization and 'loop'/'for simd' but not with 'for'
@ 2021-04-28 21:15 ` vries at gcc dot gnu.org
From: vries at gcc dot gnu.org @ 2021-04-28 21:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100321

Tom de Vries <vries at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amonakov at gcc dot gnu.org

--- Comment #1 from Tom de Vries <vries at gcc dot gnu.org> ---
No intra-warp exchange code is generated; there are no GOMP_SIMT_XCHG_* calls:
...
$ grep GOMP_SIMT test.xnvptx-none.mkoffload.244t.optimized
  _43 = .GOMP_SIMT_ENTER_ALLOC (16, 4);
  _45 = .GOMP_SIMT_LANE ();
  .GOMP_SIMT_EXIT (_43);
$
...
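
To illustrate why a missing intra-warp exchange matters, here is a minimal
host-side model in plain C. It is not GCC's implementation: the function names
are hypothetical, the warp size of 32 is taken from the "N0 / 32" observation
later in this thread, and the round-robin split of iterations across lanes is
an assumption made for the sketch. Each lane keeps a private partial sum; a
butterfly exchange (the kind of step a GOMP_SIMT_XCHG_* call would provide)
is what combines them.

```c
#define WARP_SIZE 32  /* lanes per warp; assumption taken from the N0 / 32 result */

/* Each lane handles every WARP_SIZE-th iteration and keeps a private sum.  */
static void
accumulate_per_lane (int n, int partial[WARP_SIZE])
{
  for (int lane = 0; lane < WARP_SIZE; lane++)
    {
      partial[lane] = 0;
      for (int i = lane; i < n; i += WARP_SIZE)
        partial[lane] += 1;
    }
}

/* Models the failing case: no exchange, only lane 0's sum is copied out.  */
int
reduce_without_exchange (int n)
{
  int partial[WARP_SIZE];
  accumulate_per_lane (n, partial);
  return partial[0];
}

/* Models a butterfly exchange: in each step lane L adds the value held by
   lane L ^ mask.  After log2 (WARP_SIZE) steps every lane holds the total.  */
int
reduce_with_butterfly (int n)
{
  int partial[WARP_SIZE], next[WARP_SIZE];
  accumulate_per_lane (n, partial);
  for (int mask = WARP_SIZE / 2; mask > 0; mask /= 2)
    {
      for (int lane = 0; lane < WARP_SIZE; lane++)
        next[lane] = partial[lane] + partial[lane ^ mask];
      for (int lane = 0; lane < WARP_SIZE; lane++)
        partial[lane] = next[lane];
    }
  return partial[0];
}
```

With n = 32768, reduce_without_exchange returns 1024, matching the reported
"Got" value, while reduce_with_butterfly returns 32768.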


* [Bug target/100321] [OpenMP][nvptx, SIMT] (Con't) Reduction fails with optimization and 'loop'/'for simd' but not with 'for'
@ 2021-04-29  6:24 ` vries at gcc dot gnu.org
From: vries at gcc dot gnu.org @ 2021-04-29  6:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100321

Tom de Vries <vries at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[OpenMP][nvptx] (Con't)     |[OpenMP][nvptx, SIMT]
                   |Reduction fails with        |(Con't) Reduction fails
                   |optimization and            |with optimization and
                   |'loop'/'for simd' but not   |'loop'/'for simd' but not
                   |with 'for'                  |with 'for'

--- Comment #2 from Tom de Vries <vries at gcc dot gnu.org> ---
FTR, example minimized to:
...
// { dg-additional-options "-foffload=-latomic" }

#include <iostream>
#include <cstdlib>
#include <cmath>
#include <complex>

using std::complex;

#pragma omp declare reduction(+: complex<int>: omp_out += omp_in)

int
main (void)
{
  const int N0 { 32768 };
  const complex<int> expected_value { N0, 0 };
  complex<int> counter_N0 { 0, 0 };

#pragma omp target
#pragma omp for simd reduction(+: counter_N0)
  for (int i0 = 0 ; i0 < N0 ; i0++ )
    counter_N0 += complex<int> { 1, 0 };

  std::cerr << "Expected: " << expected_value  << std::endl;
  std::cerr << "Got     : " << counter_N0 << std::endl;

  return 0;
}
...


* [Bug target/100321] [OpenMP][nvptx, SIMT] (Con't) Reduction fails with optimization and 'loop'/'for simd' but not with 'for'
@ 2021-04-29 15:09 ` vries at gcc dot gnu.org
From: vries at gcc dot gnu.org @ 2021-04-29 15:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100321

--- Comment #3 from Tom de Vries <vries at gcc dot gnu.org> ---
C example:
...
/* { dg-additional-options "-foffload=-latomic" } */

#include <stdio.h>

struct s
{
  int i;
};

#pragma omp declare reduction(+: struct s: omp_out.i += omp_in.i)

int
main (void)
{
  const int N0 = 32768;

  printf ("Expected: %d\n", N0);

  struct s counter_N0 = { 0 };
#pragma omp target
#pragma omp for simd reduction(+: counter_N0)
  for (int i0 = 0 ; i0 < N0 ; i0++ )
    counter_N0.i += 1;
  printf ("Got     : %d\n", counter_N0.i);

  return 0;
}
...


* [Bug target/100321] [OpenMP][nvptx, SIMT] (Con't) Reduction fails with optimization and 'loop'/'for simd' but not with 'for'
@ 2021-04-29 15:18 ` vries at gcc dot gnu.org
From: vries at gcc dot gnu.org @ 2021-04-29 15:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100321

--- Comment #4 from Tom de Vries <vries at gcc dot gnu.org> ---
During lower_rec_input_clauses in omp-low.c, the reduction clause is handled:
...
            case OMP_CLAUSE_REDUCTION:
            case OMP_CLAUSE_IN_REDUCTION:
              /* OpenACC reductions are initialized using the                   
                 GOACC_REDUCTION internal function.  */
              if (is_gimple_omp_oacc (ctx->stmt))
                break;
              if (OMP_CLAUSE_REDUCTION_PLACEHOLDER (c))
...

AFAICT, the problem is that the SIMT handling code is added only in the
!OMP_CLAUSE_REDUCTION_PLACEHOLDER (c) case.

For this test-case, the OMP_CLAUSE_REDUCTION_PLACEHOLDER (c) path is taken
instead.

So, something like this reflects the current state:
...
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 7b122059c6e..a0561800977 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -6005,6 +6005,11 @@ lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist,
                  tree placeholder = OMP_CLAUSE_REDUCTION_PLACEHOLDER (c);
                  gimple *tseq;
                  tree ptype = TREE_TYPE (placeholder);
+                 if (sctx.is_simt)
+                   {
+                     sorry ("SIMT not fully implemented");
+                     abort ();
+                   }
                  if (cond)
                    {
                      x = error_mark_node;
...
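
For readers unfamiliar with the two paths: as the commit message later in this
thread indicates, the placeholder path corresponds to user-defined reductions
('declare reduction'), while built-in reductions take the other path. The
sketch below puts one reduction of each kind side by side. The function names
are hypothetical, and whether the first variant avoids the problem on an
unfixed nvptx compiler is an assumption, not something tested here.

```c
struct s
{
  int i;
};

/* User-defined reduction, as in the C test case of this PR.  */
#pragma omp declare reduction(+: struct s: omp_out.i += omp_in.i)

/* Built-in '+' reduction on a scalar; no user-defined combiner is involved.
   Scalars are firstprivate on 'target' by default, hence the explicit map.  */
int
count_builtin (int n)
{
  int counter = 0;
#pragma omp target map(tofrom: counter)
#pragma omp for simd reduction(+: counter)
  for (int i = 0; i < n; i++)
    counter += 1;
  return counter;
}

/* User-defined reduction on a struct; the combiner comes from the
   'declare reduction' above.  Aggregates are mapped tofrom implicitly.  */
int
count_udr (int n)
{
  struct s counter = { 0 };
#pragma omp target
#pragma omp for simd reduction(+: counter)
  for (int i = 0; i < n; i++)
    counter.i += 1;
  return counter.i;
}
```

On the host both functions are expected to return n; the difference reported
in this PR shows up only in the second form with nvptx offloading.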


* [Bug target/100321] [OpenMP][nvptx, SIMT] (Con't) Reduction fails with optimization and 'loop'/'for simd' but not with 'for'
@ 2021-05-03 10:26 ` vries at gcc dot gnu.org
From: vries at gcc dot gnu.org @ 2021-05-03 10:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100321

--- Comment #5 from Tom de Vries <vries at gcc dot gnu.org> ---
(In reply to Tom de Vries from comment #4)
> So, something like this reflects the current state:
> ...
> diff --git a/gcc/omp-low.c b/gcc/omp-low.c
> index 7b122059c6e..a0561800977 100644
> --- a/gcc/omp-low.c
> +++ b/gcc/omp-low.c
> @@ -6005,6 +6005,11 @@ lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist,
>                   tree placeholder = OMP_CLAUSE_REDUCTION_PLACEHOLDER (c);
>                   gimple *tseq;
>                   tree ptype = TREE_TYPE (placeholder);
> +                 if (sctx.is_simt)
> +                   {
> +                     sorry ("SIMT not fully implemented");
> +                     abort ();
> +                   }
>                   if (cond)
>                     {
>                       x = error_mark_node;
> ...

Submitted a patch that does something similar (but using error rather than
sorry/abort): https://gcc.gnu.org/pipermail/gcc-patches/2021-May/569421.html


* [Bug target/100321] [OpenMP][nvptx, SIMT] (Con't) Reduction fails with optimization and 'loop'/'for simd' but not with 'for'
@ 2021-05-03 21:14 ` cvs-commit at gcc dot gnu.org
From: cvs-commit at gcc dot gnu.org @ 2021-05-03 21:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100321

--- Comment #6 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Tom de Vries <vries@gcc.gnu.org>:

https://gcc.gnu.org/g:f87990a2a8fc9e20d30462a0a4c9047582af0cd9

commit r12-395-gf87990a2a8fc9e20d30462a0a4c9047582af0cd9
Author: Tom de Vries <tdevries@suse.de>
Date:   Mon May 3 11:36:14 2021 +0200

    [openmp, simt] Disable SIMT for user-defined reduction

    The test-case included in this patch contains this target region:
    ...
      for (int i0 = 0 ; i0 < N0 ; i0++ )
        counter_N0.i += 1;
    ...

    When running with nvptx accelerator, the counter variable is expected to
    be N0 after the region, but instead is N0 / 32.  The problem is that rather
    than getting the result for all warp lanes, we get it for just one lane.

    This is caused by the implementation of SIMT being incomplete.  It handles
    regular reductions, but apparently not user-defined reductions.

    For now, handle this by disabling SIMT in this case, specifically by
    setting sctx->max_vf to 1.

    Tested libgomp on x86_64-linux with nvptx accelerator.

    gcc/ChangeLog:

    2021-05-03  Tom de Vries  <tdevries@suse.de>

            PR target/100321
            * omp-low.c (lower_rec_input_clauses): Disable SIMT for
            user-defined reduction.

    libgomp/ChangeLog:

    2021-05-03  Tom de Vries  <tdevries@suse.de>

            PR target/100321
            * testsuite/libgomp.c/target-44.c: New test.
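
The effect of limiting the vectorization factor can be pictured with a small
host-side model. This is a hypothetical sketch, not GCC code: the function
name is made up, and it assumes that vf lanes share the iterations round-robin
and that, with no cross-lane exchange, only lane 0's private count survives.

```c
/* Hypothetical model: lane 0 runs iterations 0, vf, 2*vf, ... and its
   private count is the only one copied back.  */
int
lane0_count (int n, int vf)
{
  int count = 0;
  for (int i = 0; i < n; i += vf)
    count += 1;
  return count;
}
```

In this model lane0_count (32768, 32) yields 1024, the N0 / 32 result
described in the commit message, while a factor of 1 makes lane 0 execute
every iteration, so lane0_count (32768, 1) yields 32768.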


* [Bug target/100321] [OpenMP][nvptx, SIMT] (Con't) Reduction fails with optimization and 'loop'/'for simd' but not with 'for'
@ 2021-05-03 21:15 ` vries at gcc dot gnu.org
From: vries at gcc dot gnu.org @ 2021-05-03 21:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100321

Tom de Vries <vries at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
   Target Milestone|---                         |12.0
         Resolution|---                         |FIXED

--- Comment #7 from Tom de Vries <vries at gcc dot gnu.org> ---
Patch committed, marking resolved-fixed.
