From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from esa1.mentor.iphmx.com (esa1.mentor.iphmx.com [68.232.129.153]) by sourceware.org (Postfix) with ESMTPS id E72653858CDB for ; Thu, 20 Jul 2023 13:33:36 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org E72653858CDB Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=mentor.com X-IronPort-AV: E=Sophos;i="6.01,218,1684828800"; d="scan'208";a="13927108" Received: from orw-gwy-02-in.mentorg.com ([192.94.38.167]) by esa1.mentor.iphmx.com with ESMTP; 20 Jul 2023 05:33:34 -0800 IronPort-SDR: EKeO/tQ4AEbUgE77aXe262IHvpsNENJ2d8h34HHhudlJPzVzobJi04BaJ//83mRa925U9Mhgcx JeMgbWVZ2f7eUJSAsQ6skqHwJaAG3KapPXNHNE1Ud5vSULsmOfFb+hmOT3mBmmgQs3xq5bv6Ll niH4+i7IFy4D/1ZN+e4hCHNAVCjLxY3qUv3y7q6JhSp8co8JiOOh2b3aVGHK5g7lcKvMquvrDW i1R8nw1qbZqaVe7LL25qRKq4bNvAS+TZvx73OFwWqGXEkLTLp8wF4JTArQAQEVf9K4ijmatltc JZI= From: Thomas Schwinge To: Chung-Lin Tang , Tobias Burnus CC: , Catherine Moore Subject: Re: [PATCH, OpenACC 2.7] readonly modifier support in front-ends In-Reply-To: References: User-Agent: Notmuch/0.29.3+94~g74c3f1b (https://notmuchmail.org) Emacs/28.2 (x86_64-pc-linux-gnu) Date: Thu, 20 Jul 2023 15:33:24 +0200 Message-ID: <87lefaaesb.fsf@euler.schwinge.homeip.net> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Originating-IP: [137.202.0.90] X-ClientProxiedBy: svr-ies-mbx-10.mgc.mentorg.com (139.181.222.10) To svr-ies-mbx-10.mgc.mentorg.com (139.181.222.10) X-Spam-Status: No, score=-5.9 required=5.0 tests=BAYES_00,HEADER_FROM_DIFFERENT_DOMAINS,KAM_DMARC_STATUS,SPF_HELO_PASS,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Hi Chung-Lin, Tobias! On 2023-07-11T02:33:58+0800, Chung-Lin Tang wro= te: > this patch contains support for the 'readonly' modifier in copyin clauses > and the cache directive. Thanks! > As we discussed earlier, the work for actually linking this to middle-end > points-to analysis is a somewhat non-trivial issue. This first patch allo= ws > the language feature to be used in OpenACC directives first (with no effe= ct for now). > The middle-end changes are probably going to be a later patch. ACK. > (Also CCing Tobias because of the Fortran bits) A few specific GCC/Fortran questions for Tobias below, and some more review comments for Chung-Lin: > --- a/gcc/c/c-parser.cc > +++ b/gcc/c/c-parser.cc > @@ -14059,7 +14059,8 @@ c_parser_omp_variable_list (c_parser *parser, > > static tree > c_parser_omp_var_list_parens (c_parser *parser, enum omp_clause_code kin= d, > - tree list, bool allow_deref =3D false) > + tree list, bool allow_deref =3D false, > + bool *readonly =3D NULL) > { > /* The clauses location. */ > location_t loc =3D c_parser_peek_token (parser)->location; > @@ -14067,6 +14068,20 @@ c_parser_omp_var_list_parens (c_parser *parser, = enum omp_clause_code kind, > matching_parens parens; > if (parens.require_open (parser)) > { > + if (readonly !=3D NULL) > + { > + c_token *token =3D c_parser_peek_token (parser); > + if (token->type =3D=3D CPP_NAME > + && !strcmp (IDENTIFIER_POINTER (token->value), "readonly") > + && c_parser_peek_2nd_token (parser)->type =3D=3D CPP_COLON) > + { > + c_parser_consume_token (parser); > + c_parser_consume_token (parser); > + *readonly =3D true; > + } > + else > + *readonly =3D false; > + } > list =3D c_parser_omp_variable_list (parser, loc, kind, list, allo= w_deref); > parens.skip_until_found_close (parser); > } Instead of doing this in 'c_parser_omp_var_list_parens', I think it's clearer to have this special 'readonly :' parsing logic in the two places where it's used. For example (random), like 'ancestor :' is parsed in 'c_parser_omp_clause_device', or 'conditional :' is parsed in 'c_parser_omp_clause_lastprivate'. (Yes, this does duplicate a bit of code, but that's easy enough to follow along.) The existing 'enum omp_clause_code kind', 'bool allow_deref' actually affect the parsing process; the new 'bool readonly' only propagates a flag. > @@ -14084,7 +14099,11 @@ c_parser_omp_var_list_parens (c_parser *parser, = enum omp_clause_code kind, > OpenACC 2.6: > no_create ( variable-list ) > attach ( variable-list ) > - detach ( variable-list ) */ > + detach ( variable-list ) > + > + OpenACC 2.7: > + copyin (readonly : variable-list ) > + */ > > static tree > c_parser_oacc_data_clause (c_parser *parser, pragma_omp_clause c_kind, > @@ -14135,11 +14154,22 @@ c_parser_oacc_data_clause (c_parser *parser, pr= agma_omp_clause c_kind, > default: > gcc_unreachable (); > } > + > + /* Turn on readonly modifier parsing for copyin clause. */ > + bool readonly =3D false, *readonly_ptr =3D NULL; > + if (c_kind =3D=3D PRAGMA_OACC_CLAUSE_COPYIN) > + readonly_ptr =3D &readonly; > + > tree nl, c; > - nl =3D c_parser_omp_var_list_parens (parser, OMP_CLAUSE_MAP, list, tru= e); > + nl =3D c_parser_omp_var_list_parens (parser, OMP_CLAUSE_MAP, list, tru= e, > + readonly_ptr); That is, similar to 'c_parser_omp_clause_device', or 'c_parser_omp_clause_lastprivate', inline 'c_parser_omp_var_list_parens' here, and only for 'PRAGMA_OACC_CLAUSE_COPYIN' parse 'readonly :', then (for all) use 'c_parser_omp_variable_list' etc. instead of 'c_parser_omp_var_list_parens', then set 'readonly': > for (c =3D nl; c !=3D list; c =3D OMP_CLAUSE_CHAIN (c)) > - OMP_CLAUSE_SET_MAP_KIND (c, kind); > + { > + OMP_CLAUSE_SET_MAP_KIND (c, kind); > + if (readonly) > + OMP_CLAUSE_MAP_READONLY (c) =3D 1; > + } > > return nl; > @@ -18212,6 +18242,9 @@ c_parser_omp_structured_block (c_parser *parser, = bool *if_p) > /* OpenACC 2.0: > # pragma acc cache (variable-list) new-line > > + OpenACC 2.7: > + # pragma acc cache (readonly: variable-list) new-line > + > LOC is the location of the #pragma token. > */ > > @@ -18219,8 +18252,14 @@ static tree > c_parser_oacc_cache (location_t loc, c_parser *parser) > { > tree stmt, clauses; > + bool readonly; > + > + clauses =3D c_parser_omp_var_list_parens (parser, OMP_CLAUSE__CACHE_, = NULL, > + false, &readonly); > + if (readonly) > + for (tree c =3D clauses; c; c =3D OMP_CLAUSE_CHAIN (c)) > + OMP_CLAUSE__CACHE__READONLY (c) =3D 1; > > - clauses =3D c_parser_omp_var_list_parens (parser, OMP_CLAUSE__CACHE_, = NULL); > clauses =3D c_finish_omp_clauses (clauses, C_ORT_ACC); > > c_parser_skip_to_pragma_eol (parser); Similarly. > --- a/gcc/cp/parser.cc > +++ b/gcc/cp/parser.cc Similarly. > --- a/gcc/fortran/gfortran.h > +++ b/gcc/fortran/gfortran.h > @@ -1360,7 +1360,11 @@ typedef struct gfc_omp_namelist > { > gfc_omp_reduction_op reduction_op; > gfc_omp_depend_doacross_op depend_doacross_op; > - gfc_omp_map_op map_op; > + struct > + { > + ENUM_BITFIELD (gfc_omp_map_op) map_op:8; > + bool readonly; > + }; > gfc_expr *align; > struct > { I did wonder whether the 'readonly' flag should live in the 'gfc_omp_namelist' (as done here -- similar to 'lastprivate_conditional', for example), or in 'gfc_omp_clauses' (similar to 'ancestor', for example). Then I realized/remembered that 'gfc_omp_clauses' exists only once per directive (which is sufficient for 'ancestor', for example, as there may be only one OpenMP 'device' clause), whereas 'gfc_omp_namelist' exists once per list item -- which is what we need for 'readonly'. Thus, the above looks good to me. > --- a/gcc/fortran/openmp.cc > +++ b/gcc/fortran/openmp.cc > @@ -1196,7 +1196,7 @@ omp_inv_mask::omp_inv_mask (const omp_mask &m) : om= p_mask (m) > > static bool > gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op= , > - bool allow_common, bool allow_derived) > + bool allow_common, bool allow_derived, bool reado= nly =3D false) > { > gfc_omp_namelist **head =3D NULL; > if (gfc_match_omp_variable_list ("", list, allow_common, NULL, &head, = true, > @@ -1205,7 +1205,10 @@ gfc_match_omp_map_clause (gfc_omp_namelist **list,= gfc_omp_map_op map_op, > { > gfc_omp_namelist *n; > for (n =3D *head; n; n =3D n->next) > - n->u.map_op =3D map_op; > + { > + n->u.map_op =3D map_op; > + n->u.readonly =3D readonly; > + } > return true; > } Similar to 'c_parser_omp_var_list_parens' above, the existing 'bool allow_common', 'bool allow_derived' actually affect the parsing process; the new 'bool readonly' only propagates a flag. Which I acknowledge the existing 'gfc_omp_map_op map_op' also only does, but that one's applicable to a lot more instances than 'readonly'. So I again wonder if we should keep the latter out of 'gfc_match_omp_map_clause', and instead set the flag when parsing the 'copyin' clauses; again, for example (random), like 'ancestor :', or 'conditional :' are parsed -- which you're mostly already doing: > @@ -2079,11 +2082,16 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, cons= t omp_mask mask, > { > if (openacc) > { > - if (gfc_match ("copyin ( ") =3D=3D MATCH_YES > - && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], > - OMP_MAP_TO, true, > - allow_derived)) > - continue; > + if (gfc_match ("copyin ( ") =3D=3D MATCH_YES) > + { > + bool readonly =3D false; > + if (gfc_match ("readonly : ") =3D=3D MATCH_YES) > + readonly =3D true; > + if (gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP]= , > + OMP_MAP_TO, true, > + allow_derived, readonly= )) > + continue; > + } > } ..., so you'd just set 'readonly' here, instead of having 'gfc_match_omp_map_clause' do that. Care has to be taken to only do that for the current list items, which you'll need 'gfc_omp_namelist *head' for, or similar. Hmm. Effectively inline 'gfc_match_omp_map_clause' here, or do add the 'bool readonly' argument to the latter, or something else? Or, we could add a new 'gcc/fortran/gfortran.h:gfc_omp_map_op' item 'OMP_MAP_TO_READONLY', which eventually translates into 'OMP_MAP_TO' with 'readonly' set? Then we'd just here call the (unaltered) 'gfc_match_omp_map_clause', with 'readonly ? OMP_MAP_TO_READONLY : OMP_MAP_TO'? Per 'git grep --cached '[^G]OMP_MAP_TO[^F]' -- gcc/fortran/' not a lot of places need adjusting for that (most of the 'gcc/fortran/openmp.cc' ones are not applicable). Tobias? > @@ -4008,20 +4016,35 @@ gfc_match_oacc_wait (void) > match > gfc_match_oacc_cache (void) > { > + bool readonly =3D false; > gfc_omp_clauses *c =3D gfc_get_omp_clauses (); > /* The OpenACC cache directive explicitly only allows "array elements = or > subarrays", which we're currently not checking here. Either check = this > after the call of gfc_match_omp_variable_list, or add something lik= e a > only_sections variant next to its allow_sections parameter. */ > - match m =3D gfc_match_omp_variable_list (" (", > - &c->lists[OMP_LIST_CACHE], true, > - NULL, NULL, true); > + match m =3D gfc_match (" ( "); > if (m !=3D MATCH_YES) > { > gfc_free_omp_clauses(c); > return m; > } > > + if (gfc_match ("readonly :") =3D=3D MATCH_YES) I note this one does not have a space after ':' in 'gfc_match', but the one above in 'gfc_match_omp_clauses' does. I don't know off-hand if that makes a difference in parsing -- probably not, as all of 'gcc/fortran/openmp.cc' generally doesn't seem to be very consistent about these two variants? > + readonly =3D true; > + > + gfc_omp_namelist **head =3D NULL; > + m =3D gfc_match_omp_variable_list ("", &c->lists[OMP_LIST_CACHE], true= , > + NULL, &head, true); > + if (m !=3D MATCH_YES) > + { > + gfc_free_omp_clauses(c); > + return m; > + } > + > + if (readonly) > + for (gfc_omp_namelist *n =3D *head; n; n =3D n->next) > + n->u.readonly =3D true; This already looks like how I thought it should look like. > --- a/gcc/fortran/trans-openmp.cc > +++ b/gcc/fortran/trans-openmp.cc > @@ -3067,6 +3067,9 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_= clauses *clauses, > || (n->expr && gfc_expr_attr (n->expr).pointer))) > always_modifier =3D true; > > + if (n->u.readonly) > + OMP_CLAUSE_MAP_READONLY (node) =3D 1; > + > switch (n->u.map_op) > { > case OMP_MAP_ALLOC: > @@ -3920,6 +3923,8 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_= clauses *clauses, > } > if (n->u.present_modifier) > OMP_CLAUSE_MOTION_PRESENT (node) =3D 1; > + if (list =3D=3D OMP_LIST_CACHE && n->u.readonly) > + OMP_CLAUSE__CACHE__READONLY (node) =3D 1; > omp_clauses =3D gfc_trans_add_clause (node, omp_clauses); > } > break; > --- /dev/null > +++ b/gcc/testsuite/c-c++-common/goacc/readonly-1.c > @@ -0,0 +1,27 @@ > +/* { dg-additional-options "-fdump-tree-original" } */ > + > +struct S > +{ > + int *ptr; > + float f; > +}; > + > + > +int main (void) > +{ > + int x[32]; > + struct S s =3D {x, 0}; > + > + #pragma acc parallel copyin(readonly: x[:32], s.ptr[:16]) > + { > + #pragma acc cache (readonly: x[:32]) > + } > + return 0; > +} > + > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(rea= donly,to:\\*s.ptr \\\[len: 64\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[le= n: 128\\\]\\)" 1 "original" { target { c } } } } */ > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(rea= donly,to:\\*NON_LVALUE_EXPR \\\[len: 64\\\]\\) .+ map\\(readonly,to= :x\\\[0\\\] \\\[len: 128\\\]\\)" 1 "original" { target { c++ } } } } */ > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(readonly:= x\\\[0\\\] \\\[len: 128\\\]\\);$" 1 "original" } } */ Are 'len: 64' etc. also correct for targets where 'sizeof (int) !=3D 4'? Maybe just mask these out; they're not the important thing we're testing here? > --- /dev/null > +++ b/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90 > @@ -0,0 +1,28 @@ > +! { dg-additional-options "-fdump-tree-original" } > + > +subroutine foo (a, n) > + integer :: n, a(:) > + integer :: i, b(n) > + !$acc parallel copyin(readonly: a(:), b(:n)) > + do i =3D 1,32 > + !$acc cache (readonly: a(:), b(:n)) > + enddo > + !$acc end parallel > +end subroutine foo > + > +program main > + integer :: i, n =3D 32, a(32) > + integer :: b(32) > + !$acc parallel copyin(readonly: a(:32), b(:n)) > + do i =3D 1,32 > + !$acc cache (readonly: a(:), b(:n)) > + enddo > + !$acc end parallel > +end program main > + > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(read= only,to:\\*\\(integer\\(kind=3D4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+= \\\]\\) .+ map\\(readonly,to:\\*\\(integer\\(kind=3D4\\)\\\[0:\\\] \\*\\) p= arm.*data \\\[len: .+\\\]\\)" 1 "original" } } > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(read= only,to:a\\\[\\(\\(integer\\(kind=3D8\\)\\) parm.*data - \\(integer\\(kind= =3D8\\)\\) &a\\) / 4\\\] \\\[len: .+\\\]\\) .+ map\\(readonly,to:b\\\[\\(\\= (integer\\(kind=3D8\\)\\) parm.*data - \\(integer\\(kind=3D8\\)\\) &b\\) / = 4\\\] \\\[len: .+\\\]\\)" 1 "original" } } > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(readonly:\= \*\\(integer\\(kind=3D4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\) \= \(readonly:\\*\\(integer\\(kind=3D4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len:= .+\\\]\\);" 2 "original" } } You're scanning only one of the two 'cache' directives? If that's intentional, please add a comment, why. If not, add the missing scanning. Given the peculiarities of the Fortran parsing, where first all directive's clauses are collected and then translated en bloc, I suggest to extent the 'copyin' test cases to have several 'copyin' clauses, some with, some without 'readonly' modifier, so we make sure that 'readonly' is set only for the appropriate ones. Generally, in addition to just 'parallel' compute construct, please spread this out a bit, to also cover 'kernels', 'serial' compute constructs, and the 'data' construct. Generally, please also add testing for the 'declare' directive with 'copyin' with 'readonly' modifier -- and implement handling in case that's not implicitly covered? (..., but please don't let you be dragged into a number of pre-existing issues with OpenACC 'declare' -- I hope the 'readonly' handling is straightforward to test for.) Given that per the implementation in the front ends, the handling of 'readonly' obviously -- famous last words? ;-) -- is specific to 'copyin', it's probably OK to not have test cases to verify that the 'readonly' modifier is rejected for other data clauses? > --- a/gcc/tree-pretty-print.cc > +++ b/gcc/tree-pretty-print.cc > @@ -905,6 +905,8 @@ dump_omp_clause (pretty_printer *pp, tree clause, int= spc, dump_flags_t flags) > > case OMP_CLAUSE_MAP: > pp_string (pp, "map("); > + if (OMP_CLAUSE_MAP_READONLY (clause)) > + pp_string (pp, "readonly,"); > switch (OMP_CLAUSE_MAP_KIND (clause)) > { > case GOMP_MAP_ALLOC: > @@ -1075,6 +1077,8 @@ dump_omp_clause (pretty_printer *pp, tree clause, i= nt spc, dump_flags_t flags) > > case OMP_CLAUSE__CACHE_: > pp_string (pp, "("); > + if (OMP_CLAUSE__CACHE__READONLY (clause)) > + pp_string (pp, "readonly:"); > dump_generic_node (pp, OMP_CLAUSE_DECL (clause), > spc, flags, false); > goto print_clause_size; > --- a/gcc/tree.h > +++ b/gcc/tree.h > @@ -1813,6 +1813,14 @@ class auto_suppress_location_wrappers > #define OMP_CLAUSE_MAP_DECL_MAKE_ADDRESSABLE(NODE) \ > (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP)->base.addressable_fla= g) > > +/* Nonzero if OpenACC 'readonly' modifier set, used for 'copyin'. */ > +#define OMP_CLAUSE_MAP_READONLY(NODE) \ > + TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP)) > + > +/* Same as above, for use in OpenACC cache directives. */ > +#define OMP_CLAUSE__CACHE__READONLY(NODE) \ > + TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE__CACHE_)) I'm not sure if these special accessor functions are actually useful, or we should just directly use 'TREE_READONLY' instead? We're only using them in contexts where it's clear that the 'OMP_CLAUSE_SUBCODE_CHECK' is satisfied, for example. Also, for the new use for OMP clauses, update 'gcc/tree.h:TREE_READONLY', and in 'gcc/tree-core.h' for 'readonly_flag' the "table lists the uses of each of the above flags". Setting 'TREE_READONLY' of the 'OMP_CLAUSE_DECL' instead of the clause itself isn't the right thing to do -- or is it, and might already indicate to the middle end the desired semantics? But does it maybe conflict with front end/language-level use of 'TREE_READONLY' for 'const' etc. (I suppose), and thus diagnostics for mismatches? I mean: int a; #pragma acc parallel copyin(readonly: a) { int *b =3D &a; ... should still continue to work (valid as long as '*b' isn't written to), so should not raise any "warning: initialization discards =E2=80=98const=E2=80=99 qualifier from po= inter target type" diagnostics. But if that's not a problem (I don't know how 'TREE_READONLY' is used elsewhere), maybe that's something to give a thought to? Or, early in the middle end, propagate 'TREE_READONLY' from the clause to its 'OMP_CLAUSE_DECL'? Might need to 'unshare_expr' the latter for modification and use in the associated region only? Just some quick thoughts, obviously without any detailed analysis. ;-) Gr=C3=BC=C3=9Fe Thomas ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstra=C3=9Fe 201= , 80634 M=C3=BCnchen; Gesellschaft mit beschr=C3=A4nkter Haftung; Gesch=C3= =A4ftsf=C3=BChrer: Thomas Heurung, Frank Th=C3=BCrauf; Sitz der Gesellschaf= t: M=C3=BCnchen; Registergericht M=C3=BCnchen, HRB 106955