Hi Richard, Thomas, On 2023/10/30 8:46 PM, Richard Biener wrote: >> >> What Chung-Lin's first patch does is mark the OMP clause for 'x' (not the >> 'x' decl itself!) as 'readonly', via a new 'OMP_CLAUSE_MAP_READONLY' >> flag. >> >> The actual optimization then is done in this second patch. Chung-Lin >> found that he could use 'SSA_NAME_POINTS_TO_READONLY_MEMORY' for that. >> I don't have much experience with most of the following generic code, so >> would appreciate a helping hand, whether that conceptually makes sense as >> well as from the implementation point of view: First of all, I have removed all of the gimplify-stage scanning and setting of DECL_POINTS_TO_READONLY and SSA_NAME_POINTS_TO_READONLY_MEMORY (so no changes to gimplify.cc now) I remember this code was an artifact of earlier attempts to allow struct-member pointer mappings to also work (e.g. map(readonly:rec.ptr[:N])), but failed anyways. I think the omp_data_* member accesses when building child function side receiver_refs is blocking points-to analysis from working (didn't try digging deeper) Also during gimplify, VAR_DECLs appeared to be reused (at least in some cases) for map clause decl reference building, so hoping that the variables "happen to be" single-use and DECL_POINTS_TO_READONLY relaying into SSA_NAME_POINTS_TO_READONLY_MEMORY does appear to be a little risky. However, for firstprivate pointers processed during omp-low, it appears to be somewhat different. (see below description) > No, I don't think you can use that flag on non-default-defs, nor > preserve it on copying. So > it also doesn't nicely extend to DECLs as done by the patch. We > currently _only_ use it > for incoming parameters. When used on arbitrary code you can get to for example > > ptr1(points-to-readony-memory) = &p->x; > ... access via ptr1 ... > ptr2 = &p->x; > ... access via ptr2 ... > > where both are your OMP regions differently constrained (the constrain is on the > code in the region, _not_ on the actual protections of the pointed to > data, much like > for the fortran case). But now CSE comes along and happily replaces all ptr2 > with ptr2 in the second region and ... oops! Richard, I assume what you meant was "happily replaces all ptr2 with ptr1 in the second region"? That doesn't happen, because during omp-lower/expand, OMP target regions (which is all that this applies currently) is separated into different individual child functions. (Currently, the only "effective" use of DECL_POINTS_TO_READONLY is during omp-lower, when for firstprivate pointers (i.e. 'a' here) we set this bit when constructing the first load of this pointer) #pragma acc parallel copyin(readonly: a[:32]) copyout(r) { foo (a, a[8]); r = a[8]; } #pragma acc parallel copyin(readonly: a[:32]) copyout(r) { foo (a, a[12]); r = a[12]; } After omp-expand (before SSA): __attribute__((oacc parallel, omp target entrypoint, noclone)) void main._omp_fn.1 (const struct .omp_data_t.3 & restrict .omp_data_i) { ... : D.2962 = .omp_data_i->D.2947; a.8 = D.2962; r.1 = (*a.8)[12]; foo (a.8, r.1); r.1 = (*a.8)[12]; D.2965 = .omp_data_i->r; *D.2965 = r.1; return; } __attribute__((oacc parallel, omp target entrypoint, noclone)) void main._omp_fn.0 (const struct .omp_data_t.2 & restrict .omp_data_i) { ... : D.2968 = .omp_data_i->D.2939; a.4 = D.2968; r.0 = (*a.4)[8]; foo (a.4, r.0); r.0 = (*a.4)[8]; D.2971 = .omp_data_i->r; *D.2971 = r.0; return; } So actually, the creating of DECL_POINTS_TO_READONLY and its relaying to SSA_NAME_POINTS_TO_READONLY_MEMORY here, is actually quite similar to a default-def for an PARM_DECL, at least conceptually. (If offloading was structured significantly differently, say if child functions were separated much earlier before omp-lowering, than this readonly-modifier might possibly be a direct application of 'r' in the "fn spec" attribute) Other changes since first version of patch include: 1) update of C/C++ FE changes to new style in c-family/c-omp.cc 2) merging of two if cases in fortran/trans-openmp.cc like Thomas suggested 3) Update of readonly-2.c testcase to scan before/after "fre1" pass, to verify removal of a MEM load, also as Thomas suggested. I have re-tested this patch using mainline, with no regressions. Is this okay for mainline? Thanks, Chung-Lin 2024-04-03 Chung-Lin Tang gcc/c-family/ChangeLog: * c-omp.cc (c_omp_address_inspector::expand_array_base): Set OMP_CLAUSE_MAP_POINTS_TO_READONLY on pointer clause. (c_omp_address_inspector::expand_component_selector): Likewise. gcc/fortran/ChangeLog: * trans-openmp.cc (gfc_trans_omp_array_section): Set OMP_CLAUSE_MAP_POINTS_TO_READONLY on pointer clause. gcc/ChangeLog: * gimple-expr.cc (copy_var_decl): Copy DECL_POINTS_TO_READONLY for VAR_DECLs. * omp-low.cc (lower_omp_target): Set DECL_POINTS_TO_READONLY for variables of receiver refs. * tree-pretty-print.cc (dump_omp_clause): Print OMP_CLAUSE_MAP_POINTS_TO_READONLY. (dump_generic_node): Print SSA_NAME_POINTS_TO_READONLY_MEMORY. * tree-ssanames.cc (make_ssa_name_fn): Set SSA_NAME_POINTS_TO_READONLY_MEMORY if DECL_POINTS_TO_READONLY is set. * tree.h (DECL_POINTS_TO_READONLY): New macro. (OMP_CLAUSE_MAP_POINTS_TO_READONLY): New macro. gcc/testsuite/ChangeLog: * c-c++-common/goacc/readonly-1.c: Adjust testcase. * c-c++-common/goacc/readonly-2.c: New testcase. * gfortran.dg/goacc/readonly-1.f90: Adjust testcase.