* Re: [PATCH] fortran: Fix up gfc_trans_oacc_construct [PR104717]
[not found] <YmA9iaVvWhwVGFmT@tucnak>
@ 2022-04-25 21:19 ` Thomas Schwinge
2022-04-26 17:25 ` Fix up 'libgomp.oacc-fortran/print-1.f90' GCN offloading compilation [PR104717] (was: [PATCH] fortran: Fix up gfc_trans_oacc_construct [PR104717]) Thomas Schwinge
0 siblings, 1 reply; 3+ messages in thread
From: Thomas Schwinge @ 2022-04-25 21:19 UTC (permalink / raw)
To: Jakub Jelinek, gcc-patches; +Cc: Tobias Burnus, fortran
[-- Attachment #1: Type: text/plain, Size: 4575 bytes --]
Hi!
On 2022-04-20T19:06:17+0200, Jakub Jelinek <jakub@redhat.com> wrote:
> So that move_sese_region_to_fn works properly, OpenMP/OpenACC constructs
> for which that function is invoked need an extra artificial BIND_EXPR
> around their body so that we move all variables of the bodies.
>
> The C/C++ FEs do that both for OpenMP constructs like OMP_PARALLEL, OMP_TASK
> or OMP_TARGET and for OpenACC constructs that behave similarly to
> OMP_TARGET, but the Fortran FE only does that for OpenMP constructs.
>
> The following patch does that for OpenACC constructs too.
> This fixes ICE on the attached testcase.
ACK, thanks.
I do see that over the past years, there have been a number of similar
fixes been applied for other OMP constructs -- maybe that interface
should generally be made more obvious and fool-proof?
> Unfortunately, it also regresses
> FAIL: gfortran.dg/goacc/privatization-1-compute-loop.f90 -O (test for excess errors)
> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -O0 (test for excess errors)
> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -O1 (test for excess errors)
> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -O2 (test for excess errors)
> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors)
> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -O3 -g (test for excess errors)
> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -Os (test for excess errors)
> Those emits emit tons of various messages and now there are some extra ones,
I've fixed these up.
> the previous as well as new ones are mostly on artificial variables created
> by the compiler, so I wonder if we should emit those at all.
Yes, I did wonder about that, too. (This affects a lot of test cases.)
> Anyway, here it is the patch, appart from those regressions passed
> bootstrap/regtested on powerpc64le-linux.
>
> 2022-04-20 Jakub Jelinek <jakub@redhat.com>
>
> PR fortran/104717
> * trans-openmp.cc (gfc_trans_oacc_construct): Wrap construct body
> in an extra BIND_EXPR.
>
> * gfortran.dg/goacc/pr104717.f90: New test.
>
> --- gcc/fortran/trans-openmp.cc.jj 2022-04-06 09:59:32.729654664 +0200
> +++ gcc/fortran/trans-openmp.cc 2022-04-20 12:48:19.773402677 +0200
> @@ -4444,7 +4444,9 @@ gfc_trans_oacc_construct (gfc_code *code
> gfc_start_block (&block);
> oacc_clauses = gfc_trans_omp_clauses (&block, code->ext.omp_clauses,
> code->loc, false, true);
> + pushlevel ();
> stmt = gfc_trans_omp_code (code->block->next, true);
> + stmt = build3_v (BIND_EXPR, NULL, stmt, poplevel (1, 0));
> stmt = build2_loc (gfc_get_location (&code->loc), construct_code,
> void_type_node, stmt, oacc_clauses);
> gfc_add_expr_to_block (&block, stmt);
> --- gcc/testsuite/gfortran.dg/goacc/pr104717.f90.jj 2022-04-20 12:53:54.002748265 +0200
> +++ gcc/testsuite/gfortran.dg/goacc/pr104717.f90 2022-04-20 12:53:12.811321862 +0200
> @@ -0,0 +1,22 @@
> +! PR fortran/104717
> +! { dg-do compile }
> +! { dg-options "-O1 -fopenacc -fstack-arrays" }
As the ICE was "during IPA pass: pta", I've added an explicit '-fipa-pta'
here.
> +
> +program main
> + implicit none (type, external)
> + integer :: j
> + integer, allocatable :: A(:)
> +
> + A = [(3*j, j=1, 10)]
> + call foo (A, size(A))
> + deallocate (A)
> +contains
> + subroutine foo (array, nn)
> + integer :: i, nn
> + integer :: array(nn)
> +
> + !$acc parallel copyout(array)
> + array = [(-i, i = 1, nn)]
> + !$acc end parallel
> + end subroutine foo
> +end
Pushed to master branch commit b2202431910e30d8505c94d1cb9341cac7080d10
"fortran: Fix up gfc_trans_oacc_construct [PR104717]", see attached.
Grüße
Thomas
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-fortran-Fix-up-gfc_trans_oacc_construct-PR104717.patch --]
[-- Type: text/x-diff, Size: 7504 bytes --]
From b2202431910e30d8505c94d1cb9341cac7080d10 Mon Sep 17 00:00:00 2001
From: Jakub Jelinek <jakub@redhat.com>
Date: Wed, 20 Apr 2022 19:06:17 +0200
Subject: [PATCH] fortran: Fix up gfc_trans_oacc_construct [PR104717]
So that move_sese_region_to_fn works properly, OpenMP/OpenACC constructs
for which that function is invoked need an extra artificial BIND_EXPR
around their body so that we move all variables of the bodies.
The C/C++ FEs do that both for OpenMP constructs like OMP_PARALLEL, OMP_TASK
or OMP_TARGET and for OpenACC constructs that behave similarly to
OMP_TARGET, but the Fortran FE only does that for OpenMP constructs.
The following patch does that for OpenACC constructs too.
PR fortran/104717
gcc/fortran/
* trans-openmp.cc (gfc_trans_oacc_construct): Wrap construct body
in an extra BIND_EXPR.
gcc/testsuite/
* gfortran.dg/goacc/pr104717.f90: New test.
* gfortran.dg/goacc/privatization-1-compute-loop.f90: Adjust.
libgomp/
* testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Adjust.
Co-authored-by: Thomas Schwinge <thomas@codesourcery.com>
---
gcc/fortran/trans-openmp.cc | 2 ++
gcc/testsuite/gfortran.dg/goacc/pr104717.f90 | 22 +++++++++++++++++++
.../goacc/privatization-1-compute-loop.f90 | 7 +++---
.../libgomp.oacc-fortran/privatized-ref-2.f90 | 7 ++++++
4 files changed, 35 insertions(+), 3 deletions(-)
create mode 100644 gcc/testsuite/gfortran.dg/goacc/pr104717.f90
diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc
index 25dde826146..43d59abe9e0 100644
--- a/gcc/fortran/trans-openmp.cc
+++ b/gcc/fortran/trans-openmp.cc
@@ -4444,7 +4444,9 @@ gfc_trans_oacc_construct (gfc_code *code)
gfc_start_block (&block);
oacc_clauses = gfc_trans_omp_clauses (&block, code->ext.omp_clauses,
code->loc, false, true);
+ pushlevel ();
stmt = gfc_trans_omp_code (code->block->next, true);
+ stmt = build3_v (BIND_EXPR, NULL, stmt, poplevel (1, 0));
stmt = build2_loc (gfc_get_location (&code->loc), construct_code,
void_type_node, stmt, oacc_clauses);
gfc_add_expr_to_block (&block, stmt);
diff --git a/gcc/testsuite/gfortran.dg/goacc/pr104717.f90 b/gcc/testsuite/gfortran.dg/goacc/pr104717.f90
new file mode 100644
index 00000000000..4ef16187c84
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/pr104717.f90
@@ -0,0 +1,22 @@
+! Extracted from 'libgomp.oacc-fortran/privatized-ref-2.f90'.
+
+! { dg-additional-options "-O1 -fstack-arrays -fipa-pta" }
+
+program main
+ implicit none (type, external)
+ integer :: j
+ integer, allocatable :: A(:)
+
+ A = [(3*j, j=1, 10)]
+ call foo (A, size(A))
+ deallocate (A)
+contains
+ subroutine foo (array, nn)
+ integer :: i, nn
+ integer :: array(nn)
+
+ !$acc parallel copyout(array)
+ array = [(-i, i = 1, nn)]
+ !$acc end parallel
+ end subroutine foo
+end
diff --git a/gcc/testsuite/gfortran.dg/goacc/privatization-1-compute-loop.f90 b/gcc/testsuite/gfortran.dg/goacc/privatization-1-compute-loop.f90
index 4dfeb7e07a2..13772c185ce 100644
--- a/gcc/testsuite/gfortran.dg/goacc/privatization-1-compute-loop.f90
+++ b/gcc/testsuite/gfortran.dg/goacc/privatization-1-compute-loop.f90
@@ -13,7 +13,7 @@
! passed to 'incr' may be unset, and in that case, it will be set to [...]",
! so to maintain compatibility with earlier Tcl releases, we manually
! initialize counter variables:
-! { dg-line l_dummy[variable c_loop 0] }
+! { dg-line l_dummy[variable c_compute 0 c_loop 0] }
! { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
! "WARNING: dg-line var l_dummy defined, but not used".
@@ -26,7 +26,7 @@ contains
integer, parameter :: c = 3
integer, external :: g
- !$acc parallel
+ !$acc parallel ! { dg-line l_compute[incr c_compute] }
!$acc loop collapse(2) private(a) private(x, y) ! { dg-line l_loop[incr c_loop] }
do i = 1, 20
do j = 1, 25
@@ -46,6 +46,8 @@ contains
y = a
end do
end do
+ !$acc end parallel
+ ! { dg-note {variable 'count\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
! { dg-note {variable 'count\.[0-9]+' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
@@ -54,6 +56,5 @@ contains
! { dg-note {variable 'y' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_loop$c_loop }
! { dg-note {variable 'C\.[0-9]+' declared in block potentially has improper OpenACC privatization level: 'const_decl'} "TODO" { target *-*-* } l_loop$c_loop }
! { dg-note {variable 'y' ought to be adjusted for OpenACC privatization level: 'vector'} "" { target *-*-* } l_loop$c_loop }
- !$acc end parallel
end subroutine f
end module m
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/privatized-ref-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/privatized-ref-2.f90
index 1d91e115d9f..b31f4066152 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/privatized-ref-2.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/privatized-ref-2.f90
@@ -53,6 +53,9 @@ contains
integer :: array(nn)
!$acc parallel copyout(array) ! { dg-line l_compute[incr c_compute] }
+ ! { dg-note {variable 'atmp\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
+ ! { dg-note {variable 'shadow_loopvar\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
+ ! { dg-note {variable 'offset\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
! { dg-note {variable 'S\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
array = [(-i, i = 1, nn)]
!$acc end parallel
@@ -81,6 +84,10 @@ contains
integer :: array(:)
!$acc parallel copyout(array) ! { dg-line l_compute[incr c_compute] }
+ ! { dg-note {variable 'atmp\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
+ ! { dg-note {variable 'parm\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
+ ! { dg-note {variable 'shadow_loopvar\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
+ ! { dg-note {variable 'offset\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
! { dg-note {variable 'S\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
array = [(-2*i, i = 1, size(array))]
!$acc end parallel
--
2.25.1
^ permalink raw reply [flat|nested] 3+ messages in thread
* Fix up 'libgomp.oacc-fortran/print-1.f90' GCN offloading compilation [PR104717] (was: [PATCH] fortran: Fix up gfc_trans_oacc_construct [PR104717])
2022-04-25 21:19 ` [PATCH] fortran: Fix up gfc_trans_oacc_construct [PR104717] Thomas Schwinge
@ 2022-04-26 17:25 ` Thomas Schwinge
2022-04-26 17:51 ` Thomas Schwinge
0 siblings, 1 reply; 3+ messages in thread
From: Thomas Schwinge @ 2022-04-26 17:25 UTC (permalink / raw)
To: gcc-patches, Julian Brown, Andrew Stubbs
Cc: Jakub Jelinek, Tobias Burnus, fortran
[-- Attachment #1: Type: text/plain, Size: 5305 bytes --]
Hi!
On 2022-04-25T23:19:26+0200, I wrote:
> On 2022-04-20T19:06:17+0200, Jakub Jelinek <jakub@redhat.com> wrote:
>> So that move_sese_region_to_fn works properly, OpenMP/OpenACC constructs
>> for which that function is invoked need an extra artificial BIND_EXPR
>> around their body so that we move all variables of the bodies.
>>
>> The C/C++ FEs do that both for OpenMP constructs like OMP_PARALLEL, OMP_TASK
>> or OMP_TARGET and for OpenACC constructs that behave similarly to
>> OMP_TARGET, but the Fortran FE only does that for OpenMP constructs.
>>
>> The following patch does that for OpenACC constructs too.
>> This fixes ICE on the attached testcase.
>
> ACK, thanks.
>> Unfortunately, it also regresses
>> FAIL: gfortran.dg/goacc/privatization-1-compute-loop.f90 -O (test for excess errors)
>> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -O0 (test for excess errors)
>> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -O1 (test for excess errors)
>> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -O2 (test for excess errors)
>> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors)
>> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -O3 -g (test for excess errors)
>> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -Os (test for excess errors)
>> Those emits emit tons of various messages and now there are some extra ones,
>
> I've fixed these up.
One more issue became apparent, where the code changes pushed actually do
lead to a GCN offloading compilation failure:
[...]/libgomp.oacc-fortran/print-1.f90: In function ‘MAIN__._omp_fn.0’:
[...]/libgomp.oacc-fortran/print-1.f90:13:14: error: 512 bytes of gang-private data-share memory exhausted (increase with ‘-mgang-private-size=560’, for example)
13 | !$acc parallel
| ^
In my configuration, I may indeed fix GCN offloading compilation with
'-foffload-options=amdgcn-amdhsa=-mgang-private-size=560', but I don't
think that's generally correct/sufficient, so in the the attached
"Fix up 'libgomp.oacc-fortran/print-1.f90' GCN offloading compilation
[PR104717]", I instead "raise '-mgang-private-size' to an arbitrary high
value". This avoids having to route the actual 'sizeof' from GCC build
down to the test suite harness (which ought to be doable, but
non-trivial). OK to push that:
+! For GCN offloading compilation, when gang-privatizing 'dt_parm.N'
+! (see below), we run into an 'gang-private data-share memory exhausted'
+! error: the default '-mgang-private-size' is too small. Per
+! 'gcc/fortran/trans-io.cc'/'libgfortran/io/io.h', that one is
+! 'struct st_parameter_dt', which indeed is rather big. Instead of
+! working out its exact size (which may vary per GCC configuration),
+! raise '-mgang-private-size' to an arbitrary high value.
+! { dg-additional-options "-foffload-options=amdgcn-amdhsa=-mgang-private-size=13579" { target openacc_radeon_accel_selected } }
... to master branch? (This doubles the use/testing of the
'-mgang-private-size' option!) ;-)
We've currently not been doing OpenACC privatization scanning in
'libgomp.oacc-fortran/print-1.f90', which I've now added, to help
document the issue; no need to review that.
Of course, the issue could alternatively be fixed by adding more logic to
the GCN back end to auto-scale the allocation, or be fixed by adding more
logic to the compiler to avoid gang-privatizing varibales such as
'dt_parm.N' in such cases, but that's not something I'm going to look
into at this point.
Or, of course, be avoided by re-writing the test case to not require
gang-privatizing 'dt_parm.N', but the test case is correct as it is.
Grüße
Thomas
> PR fortran/104717
> gcc/fortran/
> * trans-openmp.cc (gfc_trans_oacc_construct): Wrap construct body
> in an extra BIND_EXPR.
> --- a/gcc/fortran/trans-openmp.cc
> +++ b/gcc/fortran/trans-openmp.cc
> @@ -4444,7 +4444,9 @@ gfc_trans_oacc_construct (gfc_code *code)
> gfc_start_block (&block);
> oacc_clauses = gfc_trans_omp_clauses (&block, code->ext.omp_clauses,
> code->loc, false, true);
> + pushlevel ();
> stmt = gfc_trans_omp_code (code->block->next, true);
> + stmt = build3_v (BIND_EXPR, NULL, stmt, poplevel (1, 0));
> stmt = build2_loc (gfc_get_location (&code->loc), construct_code,
> void_type_node, stmt, oacc_clauses);
> gfc_add_expr_to_block (&block, stmt);
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Fix-up-libgomp.oacc-fortran-print-1.f90-GCN-offloadi.patch --]
[-- Type: text/x-diff, Size: 3300 bytes --]
From 3dfea06371aa9bcc84ad75a2bc821a45e131dca6 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Tue, 26 Apr 2022 18:41:23 +0200
Subject: [PATCH] Fix up 'libgomp.oacc-fortran/print-1.f90' GCN offloading
compilation [PR104717]
That got broken by recent commit b2202431910e30d8505c94d1cb9341cac7080d10
"fortran: Fix up gfc_trans_oacc_construct [PR104717]".
PR fortran/104717
libgomp/
* testsuite/libgomp.oacc-fortran/print-1.f90: Add OpenACC
privatization scanning. For GCN offloading compilation, raise
'-mgang-private-size'.
---
.../libgomp.oacc-fortran/print-1.f90 | 30 ++++++++++++++++++-
1 file changed, 29 insertions(+), 1 deletion(-)
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/print-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/print-1.f90
index 7b7f73741fe..42a8538e1fb 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/print-1.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/print-1.f90
@@ -6,11 +6,39 @@
! Separate file 'print-1-nvptx.f90' for nvptx offloading.
! { dg-skip-if "separate file" { offload_target_nvptx } }
+! For GCN offloading compilation, when gang-privatizing 'dt_parm.N'
+! (see below), we run into an 'gang-private data-share memory exhausted'
+! error: the default '-mgang-private-size' is too small. Per
+! 'gcc/fortran/trans-io.cc'/'libgfortran/io/io.h', that one is
+! 'struct st_parameter_dt', which indeed is rather big. Instead of
+! working out its exact size (which may vary per GCC configuration),
+! raise '-mgang-private-size' to an arbitrary high value.
+! { dg-additional-options "-foffload-options=amdgcn-amdhsa=-mgang-private-size=13579" { target openacc_radeon_accel_selected } }
+
+! { dg-additional-options "-fopt-info-note-omp" }
+! { dg-additional-options "-foffload=-fopt-info-note-omp" }
+
+! { dg-additional-options "--param=openacc-privatization=noisy" }
+! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
+! Prune a few: uninteresting, and potentially varying depending on GCC configuration (data types):
+! { dg-prune-output {note: variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} } */
+
+! It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+! passed to 'incr' may be unset, and in that case, it will be set to [...]",
+! so to maintain compatibility with earlier Tcl releases, we manually
+! initialize counter variables:
+! { dg-line l_dummy[variable c_compute 0] }
+! { dg-message dummy {} { target iN-VAl-Id } l_dummy } to avoid
+! "WARNING: dg-line var l_dummy defined, but not used".
+
program main
implicit none
integer :: var = 42
-!$acc parallel
+!$acc parallel ! { dg-line l_compute[incr c_compute] }
+ ! { dg-note {variable 'dt_parm\.[0-9]+' declared in block is candidate for adjusting OpenACC privatization level} {} { target *-*-* } l_compute$c_compute }
+ ! { dg-note {variable 'dt_parm\.[0-9]+' ought to be adjusted for OpenACC privatization level: 'gang'} {} { target *-*-* } l_compute$c_compute }
+ ! { dg-note {variable 'dt_parm\.[0-9]+' adjusted for OpenACC privatization level: 'gang'} {} { target { ! openacc_host_selected } } l_compute$c_compute }
write (0, '("The answer is ", I2)') var
!$acc end parallel
--
2.25.1
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: Fix up 'libgomp.oacc-fortran/print-1.f90' GCN offloading compilation [PR104717] (was: [PATCH] fortran: Fix up gfc_trans_oacc_construct [PR104717])
2022-04-26 17:25 ` Fix up 'libgomp.oacc-fortran/print-1.f90' GCN offloading compilation [PR104717] (was: [PATCH] fortran: Fix up gfc_trans_oacc_construct [PR104717]) Thomas Schwinge
@ 2022-04-26 17:51 ` Thomas Schwinge
0 siblings, 0 replies; 3+ messages in thread
From: Thomas Schwinge @ 2022-04-26 17:51 UTC (permalink / raw)
To: gcc-patches, Julian Brown, Andrew Stubbs
Cc: Jakub Jelinek, Tobias Burnus, fortran
Hi!
On 2022-04-26T19:25:31+0200, I wrote:
> On 2022-04-25T23:19:26+0200, I wrote:
>> On 2022-04-20T19:06:17+0200, Jakub Jelinek <jakub@redhat.com> wrote:
>>> So that move_sese_region_to_fn works properly, OpenMP/OpenACC constructs
>>> for which that function is invoked need an extra artificial BIND_EXPR
>>> around their body so that we move all variables of the bodies.
>>>
>>> The C/C++ FEs do that both for OpenMP constructs like OMP_PARALLEL, OMP_TASK
>>> or OMP_TARGET and for OpenACC constructs that behave similarly to
>>> OMP_TARGET, but the Fortran FE only does that for OpenMP constructs.
>>>
>>> The following patch does that for OpenACC constructs too.
>>> This fixes ICE on the attached testcase.
>>
>> ACK, thanks.
>
>>> Unfortunately, it also regresses
>>> FAIL: gfortran.dg/goacc/privatization-1-compute-loop.f90 -O (test for excess errors)
>>> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -O0 (test for excess errors)
>>> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -O1 (test for excess errors)
>>> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -O2 (test for excess errors)
>>> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors)
>>> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -O3 -g (test for excess errors)
>>> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -Os (test for excess errors)
>>> Those emits emit tons of various messages and now there are some extra ones,
>>
>> I've fixed these up.
>
> One more issue became apparent, where the code changes pushed actually do
> lead to a GCN offloading compilation failure:
>
> [...]/libgomp.oacc-fortran/print-1.f90: In function ‘MAIN__._omp_fn.0’:
> [...]/libgomp.oacc-fortran/print-1.f90:13:14: error: 512 bytes of gang-private data-share memory exhausted (increase with ‘-mgang-private-size=560’, for example)
> 13 | !$acc parallel
> | ^
>
> In my configuration, I may indeed fix GCN offloading compilation with
> '-foffload-options=amdgcn-amdhsa=-mgang-private-size=560', but I don't
> think that's generally correct/sufficient, so in the the attached
> "Fix up 'libgomp.oacc-fortran/print-1.f90' GCN offloading compilation
> [PR104717]", I instead "raise '-mgang-private-size' to an arbitrary high
> value". This avoids having to route the actual 'sizeof' from GCC build
> down to the test suite harness (which ought to be doable, but
> non-trivial). OK to push that:
>
> +! For GCN offloading compilation, when gang-privatizing 'dt_parm.N'
> +! (see below), we run into an 'gang-private data-share memory exhausted'
> +! error: the default '-mgang-private-size' is too small. Per
> +! 'gcc/fortran/trans-io.cc'/'libgfortran/io/io.h', that one is
> +! 'struct st_parameter_dt', which indeed is rather big. Instead of
> +! working out its exact size (which may vary per GCC configuration),
> +! raise '-mgang-private-size' to an arbitrary high value.
> +! { dg-additional-options "-foffload-options=amdgcn-amdhsa=-mgang-private-size=13579" { target openacc_radeon_accel_selected } }
>
> ... to master branch? (This doubles the use/testing of the
> '-mgang-private-size' option!) ;-)
Eh. That only works with the default GCN multilib '-march=fiji', testing
on gfx803 amdfury2 system. For all of '-march=gfx900' (amdnano2),
'-march=gfx906' (amd_ryzen3), '-march=gfx908' (amd-instinct1), I get:
libgomp: GCN fatal error: Asynchronous queue error
Runtime message: HSA_STATUS_ERROR_MEMORY_APERTURE_VIOLATION: The agent attempted to access memory beyond the largest legal address.
..., and I still get that if lowering the allocation to the minimum,
'-foffload-options=amdgcn-amdhsa=-mgang-private-size=560'.
This is a really simple OpenACC 'parallel' construct:
!$acc parallel
write (0, '("The answer is ", I2)') var
!$acc end parallel
..., which ought to launch a 1-gang x 1-worker x 1-vector GPU kernel, so
I'd assume '-mgang-private-size=560' (or '-mgang-private-size=13579' in
fact) is not a problem?
Help?
Grüße
Thomas
> We've currently not been doing OpenACC privatization scanning in
> 'libgomp.oacc-fortran/print-1.f90', which I've now added, to help
> document the issue; no need to review that.
>
> Of course, the issue could alternatively be fixed by adding more logic to
> the GCN back end to auto-scale the allocation, or be fixed by adding more
> logic to the compiler to avoid gang-privatizing varibales such as
> 'dt_parm.N' in such cases, but that's not something I'm going to look
> into at this point.
>
> Or, of course, be avoided by re-writing the test case to not require
> gang-privatizing 'dt_parm.N', but the test case is correct as it is.
>
>
> Grüße
> Thomas
>
>
>> PR fortran/104717
>> gcc/fortran/
>> * trans-openmp.cc (gfc_trans_oacc_construct): Wrap construct body
>> in an extra BIND_EXPR.
>
>> --- a/gcc/fortran/trans-openmp.cc
>> +++ b/gcc/fortran/trans-openmp.cc
>> @@ -4444,7 +4444,9 @@ gfc_trans_oacc_construct (gfc_code *code)
>> gfc_start_block (&block);
>> oacc_clauses = gfc_trans_omp_clauses (&block, code->ext.omp_clauses,
>> code->loc, false, true);
>> + pushlevel ();
>> stmt = gfc_trans_omp_code (code->block->next, true);
>> + stmt = build3_v (BIND_EXPR, NULL, stmt, poplevel (1, 0));
>> stmt = build2_loc (gfc_get_location (&code->loc), construct_code,
>> void_type_node, stmt, oacc_clauses);
>> gfc_add_expr_to_block (&block, stmt);
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2022-04-26 17:51 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <YmA9iaVvWhwVGFmT@tucnak>
2022-04-25 21:19 ` [PATCH] fortran: Fix up gfc_trans_oacc_construct [PR104717] Thomas Schwinge
2022-04-26 17:25 ` Fix up 'libgomp.oacc-fortran/print-1.f90' GCN offloading compilation [PR104717] (was: [PATCH] fortran: Fix up gfc_trans_oacc_construct [PR104717]) Thomas Schwinge
2022-04-26 17:51 ` Thomas Schwinge
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).