public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Thomas Schwinge <thomas@codesourcery.com>
To: Tobias Burnus <tobias@codesourcery.com>, <gcc-patches@gcc.gnu.org>
Cc: Andrew Stubbs <ams@codesourcery.com>
Subject: Fix OpenACC gang-redundant execution in 'libgomp.oacc-fortran/privatized-ref-2.f90' (was: Add 'libgomp.oacc-fortran/privatized-ref-2.f90')
Date: Tue, 22 Feb 2022 17:39:42 +0100	[thread overview]
Message-ID: <87zgmionxt.fsf@euler.schwinge.homeip.net> (raw)
In-Reply-To: <87mtsow47q.fsf@euler.schwinge.homeip.net>

[-- Attachment #1: Type: text/plain, Size: 4267 bytes --]

Hi!

On 2021-05-21T16:28:57+0200, I wrote:
> This came into existance internally, when the og10 branch was set up.
>
> On 2020-06-03T17:23:51+0200, Tobias Burnus <Tobias_Burnus@mentor.com> wrote:
>> This fixes [...] on OG10 (og10_prerelease); it will be
>> later applied to gcn/… to fix the issue. (Upstream is unaffected.)
>> [...]
>
> However, that means that your testcase does work on master branch (and
> would regress if certain commits got pushed there).  As the testcase has
> got a property useful for a thing I'm currently working on, I've pushed
> to master branch "Add 'libgomp.oacc-fortran/privatized-ref-2.f90'" in
> commit 61796dc03befa9b7426d5bc7c336cca585944143

After commit a78b1ab1df9ca44acc5638e8f9d0ae2e62bd65ed
"amdgcn: Tune default OpenMP/OpenACC GPU utilization", we'd seen this
test case regress (only) on our AMD GPU amd-instinct1/'-march=gfx908'
system:

    {+WARNING: program timed out.+}
    [-PASS:-]{+FAIL:+} libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O0  execution test

Same for other optimization levels.  Nothing more in 'libgomp.log'.

I have determined this is a latent problem in the original test case,
which contains a few instances of code as follows:

    !$acc parallel copyout(array)
    array = [(-i, i = 1, nn)]
    !$acc loop gang private(array)
    do i = 1, 10
      array(i) = i
    end do
    if (any (array /= [(-i, i = 1, nn)])) error stop 1
    !$acc end parallel

Given the '!$acc loop gang', the whole containing '!$acc parallel' region
is launched with gang parallelism.  The '!$acc loop gang' executes in
gang-partitioned mode, but the 'array' assignment before and checks after
don't execute in a (hypothetical) gang-single mode, but instead in
gang-redundant mode, meaning that each gang executes these concurrently,
giving rise to data races and other mischief.  Thus, we have to make sure
that we're not executing non-parallelized code in gang-redundant mode, by
putting these parts into their own 'parallel' constructs, which then
default to 'num_gangs(1)'.  Pushed to master branch
commit f8187b5c0d22723c8e0a3d13d0ea5dd7ecfeff75 "Fix OpenACC
gang-redundant execution in 'libgomp.oacc-fortran/privatized-ref-2.f90'",
see attached.


Grüße
 Thomas


> I confirm that "FIXME: Fails due to PR middle-end/95499" is still a
> problem.
>
> And, GCC '-O' reports:
>
>     [...]/libgomp.oacc-fortran/privatized-ref-2.f90:147:21:
>
>       147 |   subroutine foobar15 (scalar)
>           |                     ^
>     Warning: ‘foobar15’ defined but not used [-Wunused-function]
>     [...]/libgomp.oacc-fortran/privatized-ref-2.f90: In function ‘MAIN__’:
>     [...]/libgomp.oacc-fortran/privatized-ref-2.f90:31:22: warning: ‘a.offset’ is used uninitialized [-Wuninitialized]
>        31 |   A = [(3*j, j=1, 10)]
>           |                      ^
>     [...]/libgomp.oacc-fortran/privatized-ref-2.f90:27:30: note: ‘a’ declared here
>        27 |   integer, allocatable :: A(:)
>           |                              ^
>     [...]/libgomp.oacc-fortran/privatized-ref-2.f90:31:22: warning: ‘a.dim[0].lbound’ is used uninitialized [-Wuninitialized]
>        31 |   A = [(3*j, j=1, 10)]
>           |                      ^
>     [...]/libgomp.oacc-fortran/privatized-ref-2.f90:27:30: note: ‘a’ declared here
>        27 |   integer, allocatable :: A(:)
>           |                              ^
>     [...]/libgomp.oacc-fortran/privatized-ref-2.f90:31:22: warning: ‘a.dim[0].ubound’ is used uninitialized [-Wuninitialized]
>        31 |   A = [(3*j, j=1, 10)]
>           |                      ^
>     [...]/libgomp.oacc-fortran/privatized-ref-2.f90:27:30: note: ‘a’ declared here
>        27 |   integer, allocatable :: A(:)
>           |                              ^
>
> I haven't looked into these.
>
>
> Grüße
>  Thomas


-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Fix-OpenACC-gang-redundant-execution-in-libgomp.oacc.patch --]
[-- Type: text/x-diff, Size: 9289 bytes --]

From f8187b5c0d22723c8e0a3d13d0ea5dd7ecfeff75 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Fri, 21 Jan 2022 14:58:23 +0100
Subject: [PATCH] Fix OpenACC gang-redundant execution in
 'libgomp.oacc-fortran/privatized-ref-2.f90'

This was a latent problem, and this commit here now resolves a regression that
after recent commit a78b1ab1df9ca44acc5638e8f9d0ae2e62bd65ed
"amdgcn: Tune default OpenMP/OpenACC GPU utilization" we had (only) seen on a
GCN offloading '-march=gfx908' system:

    {+WARNING: program timed out.+}
    [-PASS:-]{+FAIL:+} libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O0  execution test

Same for other optimization levels.

Make sure that we're not executing non-parallelized code in gang-redundant
mode, by putting these parts into their own 'parallel' constructs, which then
default to 'num_gangs(1)'.

	libgomp/
	* testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Fix OpenACC
	gang-redundant execution.
---
 .../libgomp.oacc-fortran/privatized-ref-2.f90 | 42 ++++++++++++++-----
 1 file changed, 32 insertions(+), 10 deletions(-)

diff --git a/libgomp/testsuite/libgomp.oacc-fortran/privatized-ref-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/privatized-ref-2.f90
index f4a6af986e8..6bd17148911 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/privatized-ref-2.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/privatized-ref-2.f90
@@ -53,12 +53,10 @@ contains
     integer :: array(nn)
 
     !$acc parallel copyout(array) ! { dg-line l_compute[incr c_compute] }
-    ! { dg-note {variable 'atmp\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
-    ! { dg-note {variable 'shadow_loopvar\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
-    ! { dg-note {variable 'offset\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     ! { dg-note {variable 'S\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
-    ! { dg-note {variable 'test\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     array = [(-i, i = 1, nn)]
+    !$acc end parallel
+    !$acc parallel copy(array)
     !$acc loop gang private(array) ! { dg-line l_loop[incr c_loop] }
     ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
     ! { dg-note {variable 'array' in 'private' clause potentially has improper OpenACC privatization level: 'parm_decl'} "" { target *-*-* } l_loop$c_loop }
@@ -66,6 +64,13 @@ contains
     do i = 1, 10
       array(i) = i
     end do
+    !$acc end parallel
+    !$acc parallel copyin(array) ! { dg-line l_compute[incr c_compute] }
+    ! { dg-note {variable 'test\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
+    ! { dg-note {variable 'atmp\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
+    ! { dg-note {variable 'shadow_loopvar\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
+    ! { dg-note {variable 'offset\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
+    ! { dg-note {variable 'S\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     if (any (array /= [(-i, i = 1, nn)])) error stop 1
     !$acc end parallel
   end subroutine foo
@@ -74,14 +79,10 @@ contains
     integer :: array(:)
 
     !$acc parallel copyout(array) ! { dg-line l_compute[incr c_compute] }
-    ! { dg-note {variable 'atmp\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
-    ! { dg-note {variable 'shadow_loopvar\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
-    ! { dg-note {variable 'offset\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     ! { dg-note {variable 'S\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
-    ! { dg-note {variable 'test\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
-    ! { dg-note {variable 'parm\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
-    ! { dg-note {variable 'A\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: static} "" { target *-*-* } l_compute$c_compute }
     array = [(-2*i, i = 1, size(array))]
+    !$acc end parallel
+    !$acc parallel copy(array)
     !$acc loop gang private(array) ! { dg-line l_loop[incr c_loop] }
     ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
     ! { dg-note {variable 'array\.[0-9]+' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_loop$c_loop }
@@ -91,6 +92,11 @@ contains
     do i = 1, 10
       array(i) = 9*i
     end do
+    !$acc end parallel
+    !$acc parallel copyin(array) ! { dg-line l_compute[incr c_compute] }
+    ! { dg-note {variable 'test\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
+    ! { dg-note {variable 'A\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: static} "" { target *-*-* } l_compute$c_compute }
+    ! { dg-note {variable 'S\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
     if (any (array /= [(-2*i, i = 1, 10)])) error stop 2
     !$acc end parallel
   end subroutine bar
@@ -100,6 +106,8 @@ contains
 
     !$acc parallel copyout(str)
     str = "abcdefghij"
+    !$acc end parallel
+    !$acc parallel copy(str)
     !$acc loop gang private(str) ! { dg-line l_loop[incr c_loop] }
     ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
     ! { dg-note {variable 'str' in 'private' clause potentially has improper OpenACC privatization level: 'parm_decl'} "" { target *-*-* } l_loop$c_loop }
@@ -110,6 +118,8 @@ contains
     do i = 1, 10
       str(i:i) = achar(ichar('A') + i)
     end do
+    !$acc end parallel
+    !$acc parallel copyin(str)
     if (str /= "abcdefghij") error stop 3
     !$acc end parallel
   end
@@ -122,10 +132,14 @@ contains
 ! ***************************************
     !!$acc parallel copyout(str)
     str = "abcdefghij"
+    !!$acc end parallel
+    !!$acc parallel copy(str)
     !!$acc loop gang private(str)
     !do i = 1, 10
     !  str(i:i) = achar(ichar('A') + i)
     !end do
+    !!$acc end parallel
+    !!$acc parallel copyin(str)
     if (str /= "abcdefghij") error stop 5
     !!$acc end parallel
   end
@@ -135,6 +149,8 @@ contains
 
     !$acc parallel copyout(scalar)
     scalar = "abcdefghi-12345"
+    !$acc end parallel
+    !$acc parallel copy(scalar)
     !$acc loop gang private(scalar) ! { dg-line l_loop[incr c_loop] }
     ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
     ! { dg-note {variable 'scalar' in 'private' clause potentially has improper OpenACC privatization level: 'parm_decl'} "" { target *-*-* } l_loop$c_loop }
@@ -145,7 +161,9 @@ contains
       scalar(i:i) = achar(ichar('A') + i)
     end do
     !$acc end parallel
+    !$acc parallel copyin(scalar)
     if (scalar /= "abcdefghi-12345") error stop 6
+    !$acc end parallel
   end subroutine foobar
   subroutine foobar15 (scalar)
     integer :: i
@@ -153,11 +171,15 @@ contains
 
     !$acc parallel copyout(scalar)
     scalar = "abcdefghi-12345"
+    !$acc end parallel
+    !$acc parallel copy(scalar)
     !$acc loop gang private(scalar)
     do i = 1, 15
       scalar(i:i) = achar(ichar('A') + i)
     end do
     !$acc end parallel
+    !$acc parallel copyin(scalar)
     if (scalar /= "abcdefghi-12345") error stop 1
+    !$acc end parallel
   end subroutine foobar15
 end
-- 
2.34.1


      reply	other threads:[~2022-02-22 16:39 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <fc0dfbcc-0f5a-e63d-3943-80fb7b7cf03e@mentor.com>
2020-09-16 17:34 ` [PATCH] [OG10] Xfail libgomp.oacc-fortran/privatized-ref-2.f90 when offloading to nvptx Kwok Cheung Yeung
     [not found]   ` <0968ff25a4dc4d9895682ce0669345c5@svr-orw-mbx-02.mgc.mentorg.com>
2021-05-21 14:28     ` Add 'libgomp.oacc-fortran/privatized-ref-2.f90' Thomas Schwinge
2022-02-22 16:39       ` Thomas Schwinge [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87zgmionxt.fsf@euler.schwinge.homeip.net \
    --to=thomas@codesourcery.com \
    --cc=ams@codesourcery.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=tobias@codesourcery.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).