From mboxrd@z Thu Jan 1 00:00:00 1970
From: Thomas Schwinge
CC: Kwok Cheung Yeung
Subject: Re: Gang-level reductions in OpenACC routine
In-Reply-To: <83eadd47-9619-3e48-41ca-861ace07eacb@mentor.com>
References: <91521931-373f-a3a6-7a3f-1752f08858a1@codesourcery.com> <83eadd47-9619-3e48-41ca-861ace07eacb@mentor.com>
Date: Tue, 30 Nov 2021 13:20:01 +0100
Message-ID: <87fsrdx2by.fsf@euler.schwinge.homeip.net>
List-Id: Gcc-patches mailing list
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="

--=-=-=
Content-Type: text/plain; charset="utf-8"

Hi!

On 2020-03-19T17:12:02+0000, Kwok Cheung Yeung wrote:
> On 18/03/2020 11:34 pm, Kwok Cheung Yeung wrote:
>> I was looking at the regression in c-c++-common/goacc/nested-reductions.c, which
>> has the following excess warnings in acc_routine:
>>
>> /scratch/kyeung/openacc/og10/nvidia/src/gcc-og10-branch/gcc/testsuite/c-c++-common/goacc/nested-reductions.c:360:15:
>> warning: insufficient partitioning available to parallelize loop
>> /scratch/kyeung/openacc/og10/nvidia/src/gcc-og10-branch/gcc/testsuite/c-c++-common/goacc/nested-reductions.c:369:17:
>> warning: insufficient partitioning available to parallelize loop
>> /scratch/kyeung/openacc/og10/nvidia/src/gcc-og10-branch/gcc/testsuite/c-c++-common/goacc/nested-reductions.c:375:17:
>> warning: insufficient partitioning available to parallelize loop
>> /scratch/kyeung/openacc/og10/nvidia/src/gcc-og10-branch/gcc/testsuite/c-c++-common/goacc/nested-reductions.c:320:6:
>> warning: region is gang partitioned but does not contain gang partitioned code
>>
>> It is caused by the following code in the patch 'Make OpenACC orphan
>> gang reductions errors' (originally by Cesar):
>>
>> +      /* Orphan reductions cannot have gang partitioning.  */
>> +      if ((loop->flags & OLF_REDUCTION)
>> +          && oacc_get_fn_attrib (current_function_decl)
>> +          && !lookup_attribute ("omp target entrypoint",
>> +                                DECL_ATTRIBUTES (current_function_decl)))
>> +        this_mask = GOMP_DIM_MASK (GOMP_DIM_WORKER);

Right.  However, that code doesn't implement what the OpenACC
specification actually says.  ;-)

>> The problem is that acc_routine is not declared with 'omp target entrypoint',
>> but it does have '#pragma acc_routine gang' applied to it. From what I
>> understand of the OpenACC spec, this means that the function can be called from
>> the accelerator, and may contain a loop at the gang-level.

Right.

>> So is allowing gang
>> reductions for functions with '#pragma acc_routine gang' (but not for worker or
>> vector) the right thing to do here?

No, that's precisely the thing that the compiler needs to diagnose.

See OpenACC 2.6, 2.9.11. "reduction clause", which places a restriction
such that "The 'reduction' clause may not be specified on an orphaned
'loop' construct with the 'gang' clause, or on an orphaned 'loop'
construct that will generate gang parallelism in a procedure that is
compiled with the 'routine gang' clause."

Cesar apparently read the last part to mean that inside a 'routine gang', a
'loop reduction' with implicit 'gang' level of parallelism should be
demoted to 'worker' level of parallelism.  But what actually is meant,
simply, is that in such cases we raise the same "gang reduction on an
orphan loop" error diagnostic that we raise for explicit 'gang' level of
parallelism.  (..., and adjust our offending test cases).

Now, re your og10 etc. change:

>     Allow gang-level reductions in OpenACC routines with gang-level parallelism

>     gcc/
>     * omp-offload.c (oacc_loop_auto_partitions): Check for 'omp declare
>     target' attributes with a gang clause attached.
> --- a/gcc/omp-offload.c
> +++ b/gcc/omp-offload.c
> @@ -1374,14 +1374,32 @@ oacc_loop_auto_partitions (oacc_loop *loop, unsigned outer_mask,
>        /* Orphan reductions cannot have gang partitioning.  */
>        if ((loop->flags & OLF_REDUCTION)
> -          && oacc_get_fn_attrib (current_function_decl)
> -          && !lookup_attribute ("omp target entrypoint",
> +          && oacc_get_fn_attrib (current_function_decl))
> +        {
> +          bool gang_p = false;
> +          tree attr
> +            = lookup_attribute ("omp declare target",
> +                                DECL_ATTRIBUTES (current_function_decl));
> +
> +          if (attr)
> +            for (tree c = TREE_VALUE (attr); c; c = OMP_CLAUSE_CHAIN (c))
> +              if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_GANG)
> +                {
> +                  gang_p = true;
> +                  break;
> +                }
> +
> +          if (lookup_attribute ("omp target entrypoint",
>                                  DECL_ATTRIBUTES (current_function_decl)))
> -        this_mask = GOMP_DIM_MASK (GOMP_DIM_WORKER);
> +            gang_p = true;
> +
> +          if (!gang_p)
> +            this_mask = GOMP_DIM_MASK (GOMP_DIM_WORKER);
> +        }

..., I don't understand what exactly that is meant to do: as far as I can
tell, we always get 'gang_p == true' from that code?

Instead, I've pushed to master branch
commit 365cd5f9ba812c389b404a53d99ab5dded5097f4
'[OpenACC] Remove erroneous "Orphan reductions cannot have gang partitioning" handling',
see attached.  This implements the desired
"gang reduction on an orphan loop" error diagnostics also for these
implicit 'gang' cases, via the middle-end checking that I've just added
in commit 77d24d43644909852998043335b5a0e09d1e8f02
'Consolidate OpenACC "gang reduction on an orphan loop" checking'.
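For illustration, here is a minimal sketch of how I understand the
restriction to play out in user code.  It is not taken from the testsuite;
the function names are made up, and the '#if 0' block only mirrors the
shape of the adjusted 'orphan-reductions-2.c' cases.  Without '-fopenacc'
the pragmas are ignored and this is plain C.

```c
#include <assert.h>

/* Hypothetical example, not part of the patch.  */

/* The 'reduction' sits on a 'loop' that is combined with a compute
   construct, so the loop is not orphaned; 'gang' is fine here.  */
int
sum_in_parallel (const int *a, int n)
{
  int sum = 0;
  assert (n >= 0);
#pragma acc parallel loop gang reduction(+:sum) copyin(a[0:n])
  for (int i = 0; i < n; i++)
    sum += a[i];
  return sum;
}

/* An orphaned 'loop' with 'reduction', but explicitly at 'worker' level
   inside a 'routine worker': no gang parallelism is involved.  */
#pragma acc routine worker
int
sum_in_routine_worker (const int *a, int n)
{
  int sum = 0;
#pragma acc loop worker reduction(+:sum)
  for (int i = 0; i < n; i++)
    sum += a[i];
  return sum;
}

#if 0
/* Same shape as the adjusted test cases: an orphaned 'loop reduction'
   without explicit level of parallelism inside a 'routine gang'.  The
   compiler would assign 'gang' here, so this is now expected to be
   rejected with "gang reduction on an orphan loop".  */
#pragma acc routine gang
int
sum_in_routine_gang (const int *a, int n)
{
  int sum = 0;
#pragma acc loop reduction(+:sum)
  for (int i = 0; i < n; i++)
    sum += a[i];
  return sum;
}
#endif
```

If a gang-level reduction is what's wanted, moving the 'reduction' to the
compute construct in the caller (as in the first function) is one way to
stay within the specification.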

Regards
 Thomas

--=-=-=
Content-Type: text/x-diff
Content-Disposition: inline;
 filename="0001-OpenACC-Remove-erroneous-Orphan-reductions-cannot-ha.patch"

>From 365cd5f9ba812c389b404a53d99ab5dded5097f4 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge
Date: Fri, 26 Nov 2021 13:11:16 +0100
Subject: [PATCH] [OpenACC] Remove erroneous "Orphan reductions cannot have
 gang partitioning" handling

That is:

    -/* Ensure that the middle end does not assign gang level parallelism
    -   to orphan loop containing reductions.  */
    +/* Verify that we diagnose "gang reduction on an orphan loop" for automatically
    +   assigned gang level of parallelism.  */

... to implement what the OpenACC specification actually says.

Fix-up for preceding commit 2b7dac2c0dcb087da9e4018943c023c0678234a3
"Make OpenACC orphan gang reductions errors".

	gcc/
	* omp-offload.c (oacc_loop_auto_partitions): Remove erroneous
	"Orphan reductions cannot have gang partitioning" handling.
	gcc/testsuite/
	* c-c++-common/goacc/nested-reductions-1-routine.c: Adjust.
	* c-c++-common/goacc/nested-reductions-2-routine.c: Adjust.
	* c-c++-common/goacc/orphan-reductions-2.c: Adjust.
	* gfortran.dg/goacc/nested-reductions-1-routine.f90: Adjust.
	* gfortran.dg/goacc/nested-reductions-2-routine.f90: Adjust.
	* gfortran.dg/goacc/orphan-reductions-1.f90: Adjust.
	* gfortran.dg/goacc/orphan-reductions-2.f90: Adjust.
---
 gcc/omp-offload.c                             |  7 ------
 .../goacc/nested-reductions-1-routine.c       | 10 +++++---
 .../goacc/nested-reductions-2-routine.c       | 17 +++++++------
 .../c-c++-common/goacc/orphan-reductions-2.c  | 24 +++++++++++--------
 .../goacc/nested-reductions-1-routine.f90     | 10 +++++---
 .../goacc/nested-reductions-2-routine.f90     | 17 +++++++------
 .../gfortran.dg/goacc/orphan-reductions-1.f90 |  4 ++++
 .../gfortran.dg/goacc/orphan-reductions-2.f90 | 24 +++++++++++--------
 8 files changed, 62 insertions(+), 51 deletions(-)

diff --git a/gcc/omp-offload.c b/gcc/omp-offload.c
index 5110a424584..5cdb57d9132 100644
--- a/gcc/omp-offload.c
+++ b/gcc/omp-offload.c
@@ -1623,13 +1623,6 @@ oacc_loop_auto_partitions (oacc_loop *loop, unsigned outer_mask,
          non-innermost available level.  */
       unsigned this_mask = GOMP_DIM_MASK (GOMP_DIM_GANG);
 
-      /* Orphan reductions cannot have gang partitioning.  */
-      if ((loop->flags & OLF_REDUCTION)
-          && oacc_get_fn_attrib (current_function_decl)
-          && !lookup_attribute ("omp target entrypoint",
-                                DECL_ATTRIBUTES (current_function_decl)))
-        this_mask = GOMP_DIM_MASK (GOMP_DIM_WORKER);
-
       /* Find the first outermost available partition.
*/ while (this_mask <= outer_mask) this_mask <<= 1; diff --git a/gcc/testsuite/c-c++-common/goacc/nested-reductions-1-routine.c b/gcc/testsuite/c-c++-common/goacc/nested-reductions-1-routine.c index 9e34614eb15..45b8cf3451f 100644 --- a/gcc/testsuite/c-c++-common/goacc/nested-reductions-1-routine.c +++ b/gcc/testsuite/c-c++-common/goacc/nested-reductions-1-routine.c @@ -8,18 +8,21 @@ void acc_routine (void) int i, j, k, sum, diff; { + /* { dg-error "gang reduction on an orphan loop" "" { target *-*-* } .+1 } */ #pragma acc loop reduction(+:sum) for (i = 0; i < 10; i++) for (j = 0; j < 10; j++) for (k = 0; k < 10; k++) sum = 1; + /* { dg-error "gang reduction on an orphan loop" "" { target *-*-* } .+1 } */ #pragma acc loop collapse(2) reduction(+:sum) for (i = 0; i < 10; i++) for (j = 0; j < 10; j++) for (k = 0; k < 10; k++) sum = 1; + /* { dg-error "gang reduction on an orphan loop" "" { target *-*-* } .+1 } */ #pragma acc loop reduction(+:sum) for (i = 0; i < 10; i++) #pragma acc loop reduction(+:sum) @@ -27,6 +30,7 @@ void acc_routine (void) for (k = 0; k < 10; k++) sum = 1; + /* { dg-error "gang reduction on an orphan loop" "" { target *-*-* } .+1 } */ #pragma acc loop reduction(+:sum) for (i = 0; i < 10; i++) #pragma acc loop collapse(2) reduction(+:sum) @@ -34,6 +38,7 @@ void acc_routine (void) for (k = 0; k < 10; k++) sum = 1; + /* { dg-error "gang reduction on an orphan loop" "" { target *-*-* } .+1 } */ #pragma acc loop reduction(+:sum) for (i = 0; i < 10; i++) for (j = 0; j < 10; j++) @@ -41,27 +46,26 @@ void acc_routine (void) for (k = 0; k < 10; k++) sum = 1; + /* { dg-error "gang reduction on an orphan loop" "" { target *-*-* } .+1 } */ #pragma acc loop reduction(+:sum) for (i = 0; i < 10; i++) #pragma acc loop reduction(+:sum) - // { dg-warning "insufficient partitioning available to parallelize loop" "" { target *-*-* } .-1 } for (j = 0; j < 10; j++) #pragma acc loop reduction(+:sum) for (k = 0; k < 10; k++) sum = 1; + /* { dg-error "gang reduction on 
an orphan loop" "" { target *-*-* } .+1 } */ #pragma acc loop reduction(+:sum) reduction(-:diff) for (i = 0; i < 10; i++) { #pragma acc loop reduction(+:sum) - // { dg-warning "insufficient partitioning available to parallelize loop" "" { target *-*-* } .-1 } for (j = 0; j < 10; j++) #pragma acc loop reduction(+:sum) for (k = 0; k < 10; k++) sum = 1; #pragma acc loop reduction(-:diff) - // { dg-warning "insufficient partitioning available to parallelize loop" "" { target *-*-* } .-1 } for (j = 0; j < 10; j++) #pragma acc loop reduction(-:diff) for (k = 0; k < 10; k++) diff --git a/gcc/testsuite/c-c++-common/goacc/nested-reductions-2-routine.c b/gcc/testsuite/c-c++-common/goacc/nested-reductions-2-routine.c index 9bd79dea4cf..3b2b0275ec8 100644 --- a/gcc/testsuite/c-c++-common/goacc/nested-reductions-2-routine.c +++ b/gcc/testsuite/c-c++-common/goacc/nested-reductions-2-routine.c @@ -8,29 +8,29 @@ void acc_routine (void) int i, j, k, l, sum, diff; { + /* { dg-error "gang reduction on an orphan loop" "" { target *-*-* } .+1 } */ #pragma acc loop reduction(+:sum) for (i = 0; i < 10; i++) #pragma acc loop // { dg-warning "nested loop in reduction needs reduction clause for .sum." } - // { dg-warning "insufficient partitioning available to parallelize loop" "" { target *-*-* } .-1 } for (j = 0; j < 10; j++) #pragma acc loop reduction(+:sum) for (k = 0; k < 10; k++) sum = 1; + /* { dg-error "gang reduction on an orphan loop" "" { target *-*-* } .+1 } */ #pragma acc loop reduction(+:sum) for (i = 0; i < 10; i++) #pragma acc loop collapse(2) // { dg-warning "nested loop in reduction needs reduction clause for .sum." 
} - // { dg-warning "insufficient partitioning available to parallelize loop" "" { target *-*-* } .-1 } for (j = 0; j < 10; j++) for (k = 0; k < 10; k++) #pragma acc loop reduction(+:sum) for (l = 0; l < 10; l++) sum = 1; + /* { dg-error "gang reduction on an orphan loop" "" { target *-*-* } .+1 } */ #pragma acc loop reduction(+:sum) for (i = 0; i < 10; i++) #pragma acc loop // { dg-warning "nested loop in reduction needs reduction clause for .sum." } - // { dg-warning "insufficient partitioning available to parallelize loop" "" { target *-*-* } .-1 } for (j = 0; j < 10; j++) #pragma acc loop // { dg-warning "nested loop in reduction needs reduction clause for .sum." } // { dg-warning "insufficient partitioning available to parallelize loop" "" { target *-*-* } .-1 } @@ -39,28 +39,28 @@ void acc_routine (void) for (l = 0; l < 10; l++) sum = 1; + /* { dg-error "gang reduction on an orphan loop" "" { target *-*-* } .+1 } */ #pragma acc loop reduction(+:sum) for (i = 0; i < 10; i++) #pragma acc loop reduction(-:sum) // { dg-warning "conflicting reduction operations for .sum." } - // { dg-warning "insufficient partitioning available to parallelize loop" "" { target *-*-* } .-1 } for (j = 0; j < 10; j++) #pragma acc loop reduction(+:sum) // { dg-warning "conflicting reduction operations for .sum." } for (k = 0; k < 10; k++) sum = 1; + /* { dg-error "gang reduction on an orphan loop" "" { target *-*-* } .+1 } */ #pragma acc loop reduction(+:sum) for (i = 0; i < 10; i++) #pragma acc loop reduction(-:sum) // { dg-warning "conflicting reduction operations for .sum." 
} - // { dg-warning "insufficient partitioning available to parallelize loop" "" { target *-*-* } .-1 } for (j = 0; j < 10; j++) #pragma acc loop reduction(-:sum) for (k = 0; k < 10; k++) sum = 1; + /* { dg-error "gang reduction on an orphan loop" "" { target *-*-* } .+1 } */ #pragma acc loop reduction(+:sum) for (i = 0; i < 10; i++) #pragma acc loop reduction(-:sum) // { dg-warning "conflicting reduction operations for .sum." } - // { dg-warning "insufficient partitioning available to parallelize loop" "" { target *-*-* } .-1 } for (j = 0; j < 10; j++) #pragma acc loop // { dg-warning "nested loop in reduction needs reduction clause for .sum." } // { dg-warning "insufficient partitioning available to parallelize loop" "" { target *-*-* } .-1 } @@ -69,10 +69,10 @@ void acc_routine (void) for (l = 0; l < 10; l++) sum = 1; + /* { dg-error "gang reduction on an orphan loop" "" { target *-*-* } .+1 } */ #pragma acc loop reduction(+:sum) for (i = 0; i < 10; i++) #pragma acc loop reduction(-:sum) // { dg-warning "conflicting reduction operations for .sum." } - // { dg-warning "insufficient partitioning available to parallelize loop" "" { target *-*-* } .-1 } for (j = 0; j < 10; j++) #pragma acc loop reduction(+:sum) // { dg-warning "conflicting reduction operations for .sum." }) // { dg-warning "insufficient partitioning available to parallelize loop" "" { target *-*-* } .-1 } @@ -81,18 +81,17 @@ void acc_routine (void) for (l = 0; l < 10; l++) sum = 1; + /* { dg-error "gang reduction on an orphan loop" "" { target *-*-* } .+1 } */ #pragma acc loop reduction(+:sum) reduction(-:diff) for (i = 0; i < 10; i++) { #pragma acc loop reduction(-:diff) // { dg-warning "nested loop in reduction needs reduction clause for .sum." 
} - // { dg-warning "insufficient partitioning available to parallelize loop" "" { target *-*-* } .-1 } for (j = 0; j < 10; j++) #pragma acc loop reduction(+:sum) for (k = 0; k < 10; k++) sum = 1; #pragma acc loop reduction(+:sum) // { dg-warning "nested loop in reduction needs reduction clause for .diff." } - // { dg-warning "insufficient partitioning available to parallelize loop" "" { target *-*-* } .-1 } for (j = 0; j < 10; j++) #pragma acc loop reduction(-:diff) for (k = 0; k < 10; k++) diff --git a/gcc/testsuite/c-c++-common/goacc/orphan-reductions-2.c b/gcc/testsuite/c-c++-common/goacc/orphan-reductions-2.c index 941e5c6126a..6ff8698a35c 100644 --- a/gcc/testsuite/c-c++-common/goacc/orphan-reductions-2.c +++ b/gcc/testsuite/c-c++-common/goacc/orphan-reductions-2.c @@ -1,17 +1,20 @@ -/* Ensure that the middle end does not assign gang level parallelism - to orphan loop containing reductions. */ +/* Verify that we diagnose "gang reduction on an orphan loop" for automatically + assigned gang level of parallelism. */ /* { dg-do compile } */ /* { dg-additional-options "-fopt-info-optimized-omp" } */ /* { dg-additional-options "-Wopenacc-parallelism" } */ #pragma acc routine gang +/* { dg-bogus "warning: region is worker partitioned but does not contain worker partitioned code" "TODO default 'gang' 'vector'" { xfail *-*-* } .+3 } + TODO It's the compiler's own decision to not use 'worker' parallelism here, so it doesn't make sense to bother the user about it. 
*/ int -f1 () /* { dg-warning "region is gang partitioned but does not contain gang partitioned code" } */ +f1 () { int sum = 0, i; -#pragma acc loop reduction (+:sum) /* { dg-optimized "assigned OpenACC worker vector loop parallelism" } */ + /* { dg-error "gang reduction on an orphan loop" "" { target *-*-* } .+1 } */ +#pragma acc loop reduction (+:sum) /* { dg-optimized "assigned OpenACC gang vector loop parallelism" } */ for (i = 0; i < 100; i++) sum++; @@ -20,11 +23,12 @@ f1 () /* { dg-warning "region is gang partitioned but does not contain gang part #pragma acc routine gang int -f2 () /* { dg-warning "region is gang partitioned but does not contain gang partitioned code" } */ +f2 () { int sum = 0, i, j; -#pragma acc loop reduction (+:sum) /* { dg-optimized "assigned OpenACC worker loop parallelism" } */ + /* { dg-error "gang reduction on an orphan loop" "" { target *-*-* } .+1 } */ +#pragma acc loop reduction (+:sum) /* { dg-optimized "assigned OpenACC gang worker loop parallelism" } */ for (i = 0; i < 100; i++) #pragma acc loop reduction (+:sum) /* { dg-optimized "assigned OpenACC vector loop parallelism" } */ for (j = 0; j < 100; j++) @@ -35,14 +39,14 @@ f2 () /* { dg-warning "region is gang partitioned but does not contain gang part #pragma acc routine gang int -f3 () /* { dg-warning "region is gang partitioned but does not contain gang partitioned code" } */ +f3 () { int sum = 0, i, j, k; -#pragma acc loop reduction (+:sum) /* { dg-optimized "assigned OpenACC worker loop parallelism" } */ + /* { dg-error "gang reduction on an orphan loop" "" { target *-*-* } .+1 } */ +#pragma acc loop reduction (+:sum) /* { dg-optimized "assigned OpenACC gang loop parallelism" } */ for (i = 0; i < 100; i++) -#pragma acc loop reduction (+:sum) /* { dg-optimized "assigned OpenACC seq loop parallelism" } */ - /* { dg-warning "insufficient partitioning available to parallelize loop" "" { target *-*-* } .-1 } */ +#pragma acc loop reduction (+:sum) /* { dg-optimized "assigned 
OpenACC worker loop parallelism" } */ for (j = 0; j < 100; j++) #pragma acc loop reduction (+:sum) /* { dg-optimized "assigned OpenACC vector loop parallelism" } */ for (k = 0; k < 100; k++) diff --git a/gcc/testsuite/gfortran.dg/goacc/nested-reductions-1-routine.f90 b/gcc/testsuite/gfortran.dg/goacc/nested-reductions-1-routine.f90 index e8264114714..e1b0a0202d4 100644 --- a/gcc/testsuite/gfortran.dg/goacc/nested-reductions-1-routine.f90 +++ b/gcc/testsuite/gfortran.dg/goacc/nested-reductions-1-routine.f90 @@ -8,6 +8,7 @@ subroutine acc_routine () integer :: i, j, k, sum, diff + ! { dg-error "gang reduction on an orphan loop" "" { target *-*-* } .+1 } !$acc loop reduction(+:sum) do i = 1, 10 do j = 1, 10 @@ -17,6 +18,7 @@ subroutine acc_routine () end do end do + ! { dg-error "gang reduction on an orphan loop" "" { target *-*-* } .+1 } !$acc loop collapse(2) reduction(+:sum) do i = 1, 10 do j = 1, 10 @@ -26,6 +28,7 @@ subroutine acc_routine () end do end do + ! { dg-error "gang reduction on an orphan loop" "" { target *-*-* } .+1 } !$acc loop reduction(+:sum) do i = 1, 10 !$acc loop reduction(+:sum) @@ -36,6 +39,7 @@ subroutine acc_routine () end do end do + ! { dg-error "gang reduction on an orphan loop" "" { target *-*-* } .+1 } !$acc loop reduction(+:sum) do i = 1, 10 !$acc loop collapse(2) reduction(+:sum) @@ -46,6 +50,7 @@ subroutine acc_routine () end do end do + ! { dg-error "gang reduction on an orphan loop" "" { target *-*-* } .+1 } !$acc loop reduction(+:sum) do i = 1, 10 do j = 1, 10 @@ -56,10 +61,10 @@ subroutine acc_routine () end do end do + ! { dg-error "gang reduction on an orphan loop" "" { target *-*-* } .+1 } !$acc loop reduction(+:sum) do i = 1, 10 !$acc loop reduction(+:sum) - ! { dg-warning "insufficient partitioning available to parallelize loop" "" { target *-*-* } .-1 } do j = 1, 10 !$acc loop reduction(+:sum) do k = 1, 10 @@ -68,10 +73,10 @@ subroutine acc_routine () end do end do + ! 
{ dg-error "gang reduction on an orphan loop" "" { target *-*-* } .+1 } !$acc loop reduction(+:sum) reduction(-:diff) do i = 1, 10 !$acc loop reduction(+:sum) - ! { dg-warning "insufficient partitioning available to parallelize loop" "" { target *-*-* } .-1 } do j = 1, 10 !$acc loop reduction(+:sum) do k = 1, 10 @@ -80,7 +85,6 @@ subroutine acc_routine () end do !$acc loop reduction(-:diff) - ! { dg-warning "insufficient partitioning available to parallelize loop" "" { target *-*-* } .-1 } do j = 1, 10 !$acc loop reduction(-:diff) do k = 1, 10 diff --git a/gcc/testsuite/gfortran.dg/goacc/nested-reductions-2-routine.f90 b/gcc/testsuite/gfortran.dg/goacc/nested-reductions-2-routine.f90 index 98b1aa641c0..73a05207910 100644 --- a/gcc/testsuite/gfortran.dg/goacc/nested-reductions-2-routine.f90 +++ b/gcc/testsuite/gfortran.dg/goacc/nested-reductions-2-routine.f90 @@ -7,10 +7,10 @@ subroutine acc_routine () !$acc routine gang integer :: i, j, k, l, sum, diff + ! { dg-error "gang reduction on an orphan loop" "" { target *-*-* } .+1 } !$acc loop reduction(+:sum) do i = 1, 10 !$acc loop ! { dg-warning "nested loop in reduction needs reduction clause for .sum." } - ! { dg-warning "insufficient partitioning available to parallelize loop" "" { target *-*-* } .-1 } do j = 1, 10 !$acc loop reduction(+:sum) do k = 1, 10 @@ -19,10 +19,10 @@ subroutine acc_routine () end do end do + ! { dg-error "gang reduction on an orphan loop" "" { target *-*-* } .+1 } !$acc loop reduction(+:sum) do i = 1, 10 !$acc loop collapse(2) ! { dg-warning "nested loop in reduction needs reduction clause for .sum." } - ! { dg-warning "insufficient partitioning available to parallelize loop" "" { target *-*-* } .-1 } do j = 1, 10 do k = 1, 10 !$acc loop reduction(+:sum) @@ -33,10 +33,10 @@ subroutine acc_routine () end do end do + ! { dg-error "gang reduction on an orphan loop" "" { target *-*-* } .+1 } !$acc loop reduction(+:sum) do i = 1, 10 !$acc loop ! 
{ dg-warning "nested loop in reduction needs reduction clause for .sum." } - ! { dg-warning "insufficient partitioning available to parallelize loop" "" { target *-*-* } .-1 } do j = 1, 10 !$acc loop ! { dg-warning "nested loop in reduction needs reduction clause for .sum." } ! { dg-warning "insufficient partitioning available to parallelize loop" "" { target *-*-* } .-1 } @@ -49,10 +49,10 @@ subroutine acc_routine () end do end do + ! { dg-error "gang reduction on an orphan loop" "" { target *-*-* } .+1 } !$acc loop reduction(+:sum) do i = 1, 10 !$acc loop reduction(-:sum) ! { dg-warning "conflicting reduction operations for .sum." } - ! { dg-warning "insufficient partitioning available to parallelize loop" "" { target *-*-* } .-1 } do j = 1, 10 !$acc loop reduction(+:sum) ! { dg-warning "conflicting reduction operations for .sum." } do k = 1, 10 @@ -61,10 +61,10 @@ subroutine acc_routine () end do end do + ! { dg-error "gang reduction on an orphan loop" "" { target *-*-* } .+1 } !$acc loop reduction(+:sum) do i = 1, 10 !$acc loop reduction(-:sum) ! { dg-warning "conflicting reduction operations for .sum." } - ! { dg-warning "insufficient partitioning available to parallelize loop" "" { target *-*-* } .-1 } do j = 1, 10 !$acc loop reduction(-:sum) do k = 1, 10 @@ -73,10 +73,10 @@ subroutine acc_routine () end do end do + ! { dg-error "gang reduction on an orphan loop" "" { target *-*-* } .+1 } !$acc loop reduction(+:sum) do i = 1, 10 !$acc loop reduction(-:sum) ! { dg-warning "conflicting reduction operations for .sum." } - ! { dg-warning "insufficient partitioning available to parallelize loop" "" { target *-*-* } .-1 } do j = 1, 10 !$acc loop ! { dg-warning "nested loop in reduction needs reduction clause for .sum." } ! { dg-warning "insufficient partitioning available to parallelize loop" "" { target *-*-* } .-1 } @@ -89,10 +89,10 @@ subroutine acc_routine () end do end do + ! 
{ dg-error "gang reduction on an orphan loop" "" { target *-*-* } .+1 } !$acc loop reduction(+:sum) do i = 1, 10 !$acc loop reduction(-:sum) ! { dg-warning "conflicting reduction operations for .sum." } - ! { dg-warning "insufficient partitioning available to parallelize loop" "" { target *-*-* } .-1 } do j = 1, 10 !$acc loop reduction(+:sum) ! { dg-warning "conflicting reduction operations for .sum." } ! { dg-warning "insufficient partitioning available to parallelize loop" "" { target *-*-* } .-1 } @@ -105,10 +105,10 @@ subroutine acc_routine () end do end do + ! { dg-error "gang reduction on an orphan loop" "" { target *-*-* } .+1 } !$acc loop reduction(+:sum) reduction(-:diff) do i = 1, 10 !$acc loop reduction(-:diff) ! { dg-warning "nested loop in reduction needs reduction clause for .sum." } - ! { dg-warning "insufficient partitioning available to parallelize loop" "" { target *-*-* } .-1 } do j = 1, 10 !$acc loop reduction(+:sum) do k = 1, 10 @@ -117,7 +117,6 @@ subroutine acc_routine () end do !$acc loop reduction(+:sum) ! { dg-warning "nested loop in reduction needs reduction clause for .diff." } - ! { dg-warning "insufficient partitioning available to parallelize loop" "" { target *-*-* } .-1 } do j = 1, 10 !$acc loop reduction(-:diff) do k = 1, 10 diff --git a/gcc/testsuite/gfortran.dg/goacc/orphan-reductions-1.f90 b/gcc/testsuite/gfortran.dg/goacc/orphan-reductions-1.f90 index 464dee1260a..8eed080a128 100644 --- a/gcc/testsuite/gfortran.dg/goacc/orphan-reductions-1.f90 +++ b/gcc/testsuite/gfortran.dg/goacc/orphan-reductions-1.f90 @@ -42,6 +42,7 @@ subroutine s2 end do !$acc loop reduction(+:sum) + ! { dg-warning "insufficient partitioning available to parallelize loop" "" { target *-*-* } .-1 } do i = 1, n !$acc loop gang reduction(+:sum) ! { dg-error "gang reduction on an orphan loop" } do j = 1, n @@ -92,6 +93,7 @@ integer function f2 () end do !$acc loop reduction(+:sum) + ! 
{ dg-warning "insufficient partitioning available to parallelize loop" "" { target *-*-* } .-1 } do i = 1, n !$acc loop gang reduction(+:sum) ! { dg-error "gang reduction on an orphan loop" } do j = 1, n @@ -144,6 +146,7 @@ contains end do !$acc loop reduction(+:sum) + ! { dg-warning "insufficient partitioning available to parallelize loop" "" { target *-*-* } .-1 } do i = 1, n !$acc loop gang reduction(+:sum) ! { dg-error "gang reduction on an orphan loop" } do j = 1, n @@ -194,6 +197,7 @@ contains end do !$acc loop reduction(+:sum) + ! { dg-warning "insufficient partitioning available to parallelize loop" "" { target *-*-* } .-1 } do i = 1, n !$acc loop gang reduction(+:sum) ! { dg-error "gang reduction on an orphan loop" } do j = 1, n diff --git a/gcc/testsuite/gfortran.dg/goacc/orphan-reductions-2.f90 b/gcc/testsuite/gfortran.dg/goacc/orphan-reductions-2.f90 index 7ff0a57e620..47f8a326607 100644 --- a/gcc/testsuite/gfortran.dg/goacc/orphan-reductions-2.f90 +++ b/gcc/testsuite/gfortran.dg/goacc/orphan-reductions-2.f90 @@ -1,29 +1,33 @@ -! Ensure that the middle end does not assign gang level parallelism to -! orphan loop containing reductions. +! Verify that we diagnose "gang reduction on an orphan loop" for automatically +! assigned gang level of parallelism. ! { dg-do compile } ! { dg-additional-options "-fopt-info-optimized-omp" } ! { dg-additional-options "-Wopenacc-parallelism" } -subroutine s1 ! { dg-warning "region is gang partitioned but does not contain gang partitioned code" } +subroutine s1 implicit none !$acc routine gang + ! { dg-bogus "\[Ww\]arning: region is worker partitioned but does not contain worker partitioned code" "TODO default 'gang' 'vector'" { xfail *-*-* } .-3 } + !TODO It's the compiler's own decision to not use 'worker' parallelism here, so it doesn't make sense to bother the user about it. integer i, sum sum = 0 - !$acc loop reduction (+:sum) ! { dg-optimized "assigned OpenACC worker vector loop parallelism" } + ! 
{ dg-error "gang reduction on an orphan loop" "" { target *-*-* } .+1 } + !$acc loop reduction (+:sum) ! { dg-optimized "assigned OpenACC gang vector loop parallelism" } do i = 1, 10 sum = sum + 1 end do end subroutine s1 -subroutine s2 ! { dg-warning "region is gang partitioned but does not contain gang partitioned code" } +subroutine s2 implicit none !$acc routine gang integer i, j, sum sum = 0 - !$acc loop reduction (+:sum) ! { dg-optimized "assigned OpenACC worker loop parallelism" } + ! { dg-error "gang reduction on an orphan loop" "" { target *-*-* } .+1 } + !$acc loop reduction (+:sum) ! { dg-optimized "assigned OpenACC gang worker loop parallelism" } do i = 1, 10 !$acc loop reduction (+:sum) ! { dg-optimized "assigned OpenACC vector loop parallelism" } do j = 1, 10 @@ -32,16 +36,16 @@ subroutine s2 ! { dg-warning "region is gang partitioned but does not contain ga end do end subroutine s2 -subroutine s3 ! { dg-warning "region is gang partitioned but does not contain gang partitioned code" } +subroutine s3 implicit none !$acc routine gang integer i, j, k, sum sum = 0 - !$acc loop reduction (+:sum) ! { dg-optimized "assigned OpenACC worker loop parallelism" } + ! { dg-error "gang reduction on an orphan loop" "" { target *-*-* } .+1 } + !$acc loop reduction (+:sum) ! { dg-optimized "assigned OpenACC gang loop parallelism" } do i = 1, 10 - !$acc loop reduction (+:sum) ! { dg-optimized "assigned OpenACC seq loop parallelism" } - ! { dg-warning "insufficient partitioning available to parallelize loop" "" { target *-*-* } .-1 } + !$acc loop reduction (+:sum) ! { dg-optimized "assigned OpenACC worker loop parallelism" } do j = 1, 10 !$acc loop reduction (+:sum) ! { dg-optimized "assigned OpenACC vector loop parallelism" } do k = 1, 10 -- 2.33.0 --=-=-=--