On Thu, 11 Apr 2024, Thomas Schwinge wrote: > Hi Chung-Lin, Richard! > > From me just a few mechanical pieces, see below. Richard, are you able > to again comment on Chung-Lin's general strategy, as I'm not at all > familiar with those parts of the code? I've queued all stage1 material and will be only able to slowly look at it after we released. > On 2024-04-03T19:50:55+0800, Chung-Lin Tang wrote: > > On 2023/10/30 8:46 PM, Richard Biener wrote: > >>> > >>> What Chung-Lin's first patch does is mark the OMP clause for 'x' (not the > >>> 'x' decl itself!) as 'readonly', via a new 'OMP_CLAUSE_MAP_READONLY' > >>> flag. > >>> > >>> The actual optimization then is done in this second patch. Chung-Lin > >>> found that he could use 'SSA_NAME_POINTS_TO_READONLY_MEMORY' for that. > >>> I don't have much experience with most of the following generic code, so > >>> would appreciate a helping hand, whether that conceptually makes sense as > >>> well as from the implementation point of view: > > > > First of all, I have removed all of the gimplify-stage scanning and setting of > > DECL_POINTS_TO_READONLY and SSA_NAME_POINTS_TO_READONLY_MEMORY (so no changes to > > gimplify.cc now) > > > > I remember this code was an artifact of earlier attempts to allow struct-member > > pointer mappings to also work (e.g. map(readonly:rec.ptr[:N])), but failed anyways. > > I think the omp_data_* member accesses when building child function side > > receiver_refs is blocking points-to analysis from working (didn't try digging deeper) > > > > Also during gimplify, VAR_DECLs appeared to be reused (at least in some cases) for map > > clause decl reference building, so hoping that the variables "happen to be" single-use and > > DECL_POINTS_TO_READONLY relaying into SSA_NAME_POINTS_TO_READONLY_MEMORY does appear to be > > a little risky. > > > > However, for firstprivate pointers processed during omp-low, it appears to be somewhat different. > > (see below description) > > > >> No, I don't think you can use that flag on non-default-defs, nor > >> preserve it on copying. So > >> it also doesn't nicely extend to DECLs as done by the patch. We > >> currently _only_ use it > >> for incoming parameters. When used on arbitrary code you can get to for example > >> > >> ptr1(points-to-readony-memory) = &p->x; > >> ... access via ptr1 ... > >> ptr2 = &p->x; > >> ... access via ptr2 ... > >> > >> where both are your OMP regions differently constrained (the constrain is on the > >> code in the region, _not_ on the actual protections of the pointed to > >> data, much like > >> for the fortran case). But now CSE comes along and happily replaces all ptr2 > >> with ptr2 in the second region and ... oops! > > > > Richard, I assume what you meant was "happily replaces all ptr2 with ptr1 in the second region"? > > > > That doesn't happen, because during omp-lower/expand, OMP target regions (which is all that > > this applies currently) is separated into different individual child functions. > > > > (Currently, the only "effective" use of DECL_POINTS_TO_READONLY is during omp-lower, when > > for firstprivate pointers (i.e. 'a' here) we set this bit when constructing the first load > > of this pointer) > > > > #pragma acc parallel copyin(readonly: a[:32]) copyout(r) > > { > > foo (a, a[8]); > > r = a[8]; > > } > > #pragma acc parallel copyin(readonly: a[:32]) copyout(r) > > { > > foo (a, a[12]); > > r = a[12]; > > } > > > > After omp-expand (before SSA): > > > > __attribute__((oacc parallel, omp target entrypoint, noclone)) > > void main._omp_fn.1 (const struct .omp_data_t.3 & restrict .omp_data_i) > > { > > ... > > : > > D.2962 = .omp_data_i->D.2947; > > a.8 = D.2962; > > r.1 = (*a.8)[12]; > > foo (a.8, r.1); > > r.1 = (*a.8)[12]; > > D.2965 = .omp_data_i->r; > > *D.2965 = r.1; > > return; > > } > > > > __attribute__((oacc parallel, omp target entrypoint, noclone)) > > void main._omp_fn.0 (const struct .omp_data_t.2 & restrict .omp_data_i) > > { > > ... > > : > > D.2968 = .omp_data_i->D.2939; > > a.4 = D.2968; > > r.0 = (*a.4)[8]; > > foo (a.4, r.0); > > r.0 = (*a.4)[8]; > > D.2971 = .omp_data_i->r; > > *D.2971 = r.0; > > return; > > } > > > > So actually, the creating of DECL_POINTS_TO_READONLY and its relaying to > > SSA_NAME_POINTS_TO_READONLY_MEMORY here, is actually quite similar to a default-def > > for an PARM_DECL, at least conceptually. > > > > (If offloading was structured significantly differently, say if child functions > > were separated much earlier before omp-lowering, than this readonly-modifier might > > possibly be a direct application of 'r' in the "fn spec" attribute) > > > > Other changes since first version of patch include: > > 1) update of C/C++ FE changes to new style in c-family/c-omp.cc > > 2) merging of two if cases in fortran/trans-openmp.cc like Thomas suggested > > 3) Update of readonly-2.c testcase to scan before/after "fre1" pass, to verify removal of a MEM load, also as Thomas suggested. > > Thanks! > > > I have re-tested this patch using mainline, with no regressions. Is this okay for mainline? > > > 2024-04-03 Chung-Lin Tang > > > > gcc/c-family/ChangeLog: > > > > * c-omp.cc (c_omp_address_inspector::expand_array_base): > > Set OMP_CLAUSE_MAP_POINTS_TO_READONLY on pointer clause. > > (c_omp_address_inspector::expand_component_selector): Likewise. > > > > gcc/fortran/ChangeLog: > > > > * trans-openmp.cc (gfc_trans_omp_array_section): > > Set OMP_CLAUSE_MAP_POINTS_TO_READONLY on pointer clause. > > > > gcc/ChangeLog: > > > > * gimple-expr.cc (copy_var_decl): Copy DECL_POINTS_TO_READONLY > > for VAR_DECLs. > > * omp-low.cc (lower_omp_target): Set DECL_POINTS_TO_READONLY for > > variables of receiver refs. > > * tree-pretty-print.cc (dump_omp_clause): > > Print OMP_CLAUSE_MAP_POINTS_TO_READONLY. > > (dump_generic_node): Print SSA_NAME_POINTS_TO_READONLY_MEMORY. > > * tree-ssanames.cc (make_ssa_name_fn): Set > > SSA_NAME_POINTS_TO_READONLY_MEMORY if DECL_POINTS_TO_READONLY is set. > > * tree.h (DECL_POINTS_TO_READONLY): New macro. > > (OMP_CLAUSE_MAP_POINTS_TO_READONLY): New macro. > > > > gcc/testsuite/ChangeLog: > > > > * c-c++-common/goacc/readonly-1.c: Adjust testcase. > > * c-c++-common/goacc/readonly-2.c: New testcase. > > * gfortran.dg/goacc/readonly-1.f90: Adjust testcase. > > > --- a/gcc/c-family/c-omp.cc > > +++ b/gcc/c-family/c-omp.cc > > @@ -3907,6 +3907,8 @@ c_omp_address_inspector::expand_array_base (tree c, > > } > > else if (c2) > > { > > + if (OMP_CLAUSE_MAP_READONLY (c)) > > + OMP_CLAUSE_MAP_POINTS_TO_READONLY (c2) = 1; > > OMP_CLAUSE_CHAIN (c2) = OMP_CLAUSE_CHAIN (c); > > OMP_CLAUSE_CHAIN (c) = c2; > > if (implicit_p) > > @@ -4051,6 +4053,8 @@ c_omp_address_inspector::expand_component_selector (tree c, > > } > > else if (c2) > > { > > + if (OMP_CLAUSE_MAP_READONLY (c)) > > + OMP_CLAUSE_MAP_POINTS_TO_READONLY (c2) = 1; > > OMP_CLAUSE_CHAIN (c2) = OMP_CLAUSE_CHAIN (c); > > OMP_CLAUSE_CHAIN (c) = c2; > > c = c2; > > (So this replaces the 'gcc/c/c-typeck.cc:handle_omp_array_sections', > 'gcc/cp/semantics.cc:handle_omp_array_sections' changes of the previous > patch revision?) > > Are we sure that really only the 'else if (c2)' branches need to handle > this, and explicitly not the preceding 'if (c3)' branches, too? I > suggest we add a comment and/or handling, as necessary. If that makes > sense, maybe handle for both 'c3', 'c2' via a 'bool readonly_p = [...]', > similar to the existing 'bool implicit_p'? > > > --- a/gcc/fortran/trans-openmp.cc > > +++ b/gcc/fortran/trans-openmp.cc > > @@ -2561,6 +2561,8 @@ gfc_trans_omp_array_section (stmtblock_t *block, gfc_exec_op op, > > ptr2 = fold_convert (ptrdiff_type_node, ptr2); > > OMP_CLAUSE_SIZE (node3) = fold_build2 (MINUS_EXPR, ptrdiff_type_node, > > ptr, ptr2); > > + if (n->u.map.readonly) > > + OMP_CLAUSE_MAP_POINTS_TO_READONLY (node3) = 1; > > } > > > > static tree > > > --- a/gcc/gimple-expr.cc > > +++ b/gcc/gimple-expr.cc > > @@ -385,6 +385,8 @@ copy_var_decl (tree var, tree name, tree type) > > DECL_CONTEXT (copy) = DECL_CONTEXT (var); > > TREE_USED (copy) = 1; > > DECL_SEEN_IN_BIND_EXPR_P (copy) = 1; > > + if (VAR_P (var)) > > + DECL_POINTS_TO_READONLY (copy) = DECL_POINTS_TO_READONLY (var); > > DECL_ATTRIBUTES (copy) = DECL_ATTRIBUTES (var); > > if (DECL_USER_ALIGN (var)) > > { > > > --- a/gcc/omp-low.cc > > +++ b/gcc/omp-low.cc > > @@ -14003,6 +14003,8 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx) > > if (ref_to_array) > > x = fold_convert_loc (clause_loc, TREE_TYPE (new_var), x); > > gimplify_expr (&x, &new_body, NULL, is_gimple_val, fb_rvalue); > > + if (OMP_CLAUSE_MAP_POINTS_TO_READONLY (c) && VAR_P (x)) > > + DECL_POINTS_TO_READONLY (x) = 1; > > if ((is_ref && !ref_to_array) > > || ref_to_ptr) > > { > > > --- a/gcc/testsuite/c-c++-common/goacc/readonly-1.c > > +++ b/gcc/testsuite/c-c++-common/goacc/readonly-1.c > > @@ -48,17 +48,17 @@ int main (void) > > > > /* { dg-final { scan-tree-dump-times "(?n)#pragma acc declare map\\(to:y\\) map\\(readonly,to:s\\) map\\(readonly,to:x\\)" 1 "original" } } */ > > > > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */ > > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */ > > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */ > > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */ > > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */ > > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,firstprivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 "original" { target { c } } } } */ > > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,firstprivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 "original" { target { c } } } } */ > > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,firstprivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 "original" { target { c } } } } */ > > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */ > > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */ > > > > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*NON_LVALUE_EXPR \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */ > > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*NON_LVALUE_EXPR \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */ > > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*NON_LVALUE_EXPR \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */ > > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*NON_LVALUE_EXPR \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */ > > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*NON_LVALUE_EXPR \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */ > > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*NON_LVALUE_EXPR \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,firstprivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 "original" { target { c++ } } } } */ > > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*NON_LVALUE_EXPR \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,firstprivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 "original" { target { c++ } } } } */ > > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*NON_LVALUE_EXPR \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,firstprivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 "original" { target { c++ } } } } */ > > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*NON_LVALUE_EXPR \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */ > > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*NON_LVALUE_EXPR \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */ > > > > /* { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(readonly:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\);$" 4 "original" } } */ > > /* { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\);$" 4 "original" } } */ > > > --- /dev/null > > +++ b/gcc/testsuite/c-c++-common/goacc/readonly-2.c > > @@ -0,0 +1,16 @@ > > +/* { dg-additional-options "-O -fdump-tree-phiprop -fdump-tree-fre" } */ > > + > > +#pragma acc routine > > +extern void foo (int *ptr, int val); > > + > > +int main (void) > > +{ > > + int r, a[32]; > > + #pragma acc parallel copyin(readonly: a[:32]) copyout(r) > > + { > > + foo (a, a[8]); > > + r = a[8]; > > + } > > +} > > +/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ = MEM\\\[\[^_\]+_\[0-9\]+\\(ptro\\)\\\]\\\[8\\\];" 2 "phiprop1" } } */ > > +/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ = MEM\\\[\[^_\]+_\[0-9\]+\\(ptro\\)\\\]\\\[8\\\];" 1 "fre1" } } */ > > In the tree where I've been testing your patch, I've not been seeing > 'MEM[x]' but '(*x)', therefore: > > -/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ = MEM\\\[\[^_\]+_\[0-9\]+\\(ptro\\)\\\]\\\[8\\\];" 2 "phiprop1" } } */ > -/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ = MEM\\\[\[^_\]+_\[0-9\]+\\(ptro\\)\\\]\\\[8\\\];" 1 "fre1" } } */ > +/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ = \\(\\\*_\[0-9\]+\\(ptro\\)\\)\\\[8\\\];" 2 "phiprop1" } } */ > +/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ = \\(\\\*_\[0-9\]+\\(ptro\\)\\)\\\[8\\\];" 1 "fre1" } } */ > > Maybe that's due to something else in my (long...) Git branch; just make > sure you've got PASSes here, eventually. > > > --- a/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90 > > +++ b/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90 > > @@ -80,16 +80,16 @@ end program main > > ! The front end turns OpenACC 'declare' into OpenACC 'data'. > > ! { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(readonly,to:\\*b\\) map\\(alloc:b.+ map\\(to:\\*c\\) map\\(alloc:c.+" 1 "original" } } > > ! { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(readonly,to:g\\) map\\(to:h\\)" 1 "original" } } > > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } } > > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } } > > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } } > > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(readonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } } > > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } } > > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(readonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } } > > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } } > > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(readonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } } > > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } } > > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(readonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } } > > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } } > > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:a.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:b.+ map\\(pt_readonly,alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } } > > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } } > > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(readonly,to:a.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:b.+ map\\(pt_readonly,alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } } > > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } } > > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(readonly,to:a.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:b.+ map\\(pt_readonly,alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } } > > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } } > > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(readonly,to:a.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:b.+ map\\(pt_readonly,alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } } > > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } } > > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(readonly,to:a.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:b.+ map\\(pt_readonly,alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } } > > > > ! { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(readonly:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\) \\(readonly:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\);" 8 "original" } } > > ! { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\);" 8 "original" } } > > Can we also get an OpenACC/Fortran test case à la > 'c-c++-common/goacc/readonly-2.c' to demonstrate this doing something? > > > --- a/gcc/testsuite/gfortran.dg/pr67170.f90 > > +++ b/gcc/testsuite/gfortran.dg/pr67170.f90 > > @@ -28,4 +28,4 @@ end subroutine foo > > end module test_module > > end program > > > > -! { dg-final { scan-tree-dump-times "= \\*arg_\[0-9\]+\\(D\\);" 1 "fre1" } } > > +! { dg-final { scan-tree-dump-times "= \\*arg_\[0-9\]+\\(D\\)\\(ptro\\);" 1 "fre1" } } > > Is it understood what's happening here, that this is the correct > behavior? I suppose so -- there's no actual change in behavior -- as > this here doesn't trigger for OpenACC 'readonly' modifier, but just the > pretty printer change for 'SSA_NAME_POINTS_TO_READONLY_MEMORY': > > > --- a/gcc/tree-pretty-print.cc > > +++ b/gcc/tree-pretty-print.cc > > @@ -915,6 +915,8 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, dump_flags_t flags) > > pp_string (pp, "map("); > > if (OMP_CLAUSE_MAP_READONLY (clause)) > > pp_string (pp, "readonly,"); > > + if (OMP_CLAUSE_MAP_POINTS_TO_READONLY (clause)) > > + pp_string (pp, "pt_readonly,"); > > switch (OMP_CLAUSE_MAP_KIND (clause)) > > { > > case GOMP_MAP_ALLOC: > > @@ -3620,6 +3622,8 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags, > > pp_string (pp, "(D)"); > > if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (node)) > > pp_string (pp, "(ab)"); > > + if (SSA_NAME_POINTS_TO_READONLY_MEMORY (node)) > > + pp_string (pp, "(ptro)"); > > break; > > > > case WITH_SIZE_EXPR: > > > --- a/gcc/tree-ssanames.cc > > +++ b/gcc/tree-ssanames.cc > > @@ -402,6 +402,9 @@ make_ssa_name_fn (struct function *fn, tree var, gimple *stmt, > > else > > SSA_NAME_RANGE_INFO (t) = NULL; > > > > + if (VAR_P (var) && DECL_POINTS_TO_READONLY (var)) > > + SSA_NAME_POINTS_TO_READONLY_MEMORY (t) = 1; > > + > > SSA_NAME_IN_FREE_LIST (t) = 0; > > SSA_NAME_IS_DEFAULT_DEF (t) = 0; > > init_ssa_name_imm_use (t); > > > --- a/gcc/tree.h > > +++ b/gcc/tree.h > > @@ -1036,6 +1036,13 @@ extern void omp_clause_range_check_failed (const_tree, const char *, int, > > #define DECL_HIDDEN_STRING_LENGTH(NODE) \ > > (TREE_CHECK (NODE, PARM_DECL)->decl_common.decl_nonshareable_flag) > > > > +/* In a VAR_DECL, set for variables regarded as pointing to memory not written > > + to. SSA_NAME_POINTS_TO_READONLY_MEMORY gets set for SSA_NAMEs created from > > + such VAR_DECLs. Currently used by OpenACC 'readonly' modifier in copyin > > + clauses. */ > > +#define DECL_POINTS_TO_READONLY(NODE) \ > > + (TREE_CHECK (NODE, VAR_DECL)->decl_common.decl_not_flexarray) > > Again update the table for the flag uses are listed? > > (There is a 'VAR_DECL_CHECK', which hopefully means the same thing.) > > > + > > /* In a CALL_EXPR, means that the call is the jump from a thunk to the > > thunked-to function. Be careful to avoid using this macro when one of the > > next two applies instead. */ > > @@ -1845,6 +1852,10 @@ class auto_suppress_location_wrappers > > #define OMP_CLAUSE_MAP_READONLY(NODE) \ > > TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP)) > > > > +/* Set if 'OMP_CLAUSE_DECL (NODE)' points to read-only memory. */ > > +#define OMP_CLAUSE_MAP_POINTS_TO_READONLY(NODE) \ > > + TREE_CONSTANT (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP)) > > + > > /* Same as above, for use in OpenACC cache directives. */ > > #define OMP_CLAUSE__CACHE__READONLY(NODE) \ > > TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE__CACHE_)) > > (Note, corresponding 'OMP_CLAUSE_MAP_POINTS_TO_READONLY' doesn't exist > yet, due to missing actual handling of the OpenACC 'cache' directive; > 'gcc/gimplify.cc:gimplify_oacc_cache'.) > > > Grüße > Thomas > -- Richard Biener SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)