From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-lf1-x130.google.com (mail-lf1-x130.google.com [IPv6:2a00:1450:4864:20::130]) by sourceware.org (Postfix) with ESMTPS id 790BA3858D20; Mon, 30 Oct 2023 12:46:27 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 790BA3858D20 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 790BA3858D20 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2a00:1450:4864:20::130 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1698669992; cv=none; b=E0ZQ308ZM71yeBOtbQR+ljXpy69ru5FPplUmuSLzWCMF8m+LsB27PoSTqovGFcuwD2vb5LSvNsIS1skLHCbc6PrccYNMRSwPX8HxE9NO9oOyDAB4xhaB+UHshNOV4eVj/UP1yj6JYCXQoujv9MFxjjhqPkoYzmsDrWvDjDKiIn0= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1698669992; c=relaxed/simple; bh=AHlwCHZddUb7xCcM6C7OWU3Qz16EcN1odyecAIgFYy0=; h=DKIM-Signature:MIME-Version:From:Date:Message-ID:Subject:To; b=cO4MMg7GnUrwuMKajJOs97KiS4zewURF2mz9iFHA9fGvYsHf3IfxRa8OTwIh5rWULJtUvz4lmXU2nGR4nmuyTZdHEuNGmhjL2TLGeQwfVrw2aS4Tlmnk2Kl42PHTSbDy2uf52PB6Ge868RORLFJVZdRayiV+OUJ0VQYJNCK7iBg= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-lf1-x130.google.com with SMTP id 2adb3069b0e04-5079f6efd64so6153983e87.2; Mon, 30 Oct 2023 05:46:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1698669986; x=1699274786; darn=gcc.gnu.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=LYu7tKL9dIQpikDo6aw35trCX7FdEaUih2ffeur0LIg=; b=R5rUm5eiS5odOCbMHNme6+qHNN6u59he7SuSW+q9CPR1dK7tcSxLeKuJEkFXxaihT/ HkJkDVURpOHBiWAvdbOllm9Sibrd81/x2N2hpC8JRvB8tIrLO/cFxG7wkrLjD1GpphHC BKWWl3Zkw/KFR3/nSHzJxQp4V+qxeTBd/daT3uzhcZoouBt8WtQN2EO8ct0JPUW91klN +BPWx32ZkZjKB2QwbYeNaNMZbpTpxlqxJp9CI+RU8mCljR4hQg/L27iH+mygw+/MZ2qa FMSJF98ccTwhtaZZMNz0i6oRcut0eK4ns0350egjLqypnYeWxP0KM66mi7EJho5fdI5X aGNg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698669986; x=1699274786; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=LYu7tKL9dIQpikDo6aw35trCX7FdEaUih2ffeur0LIg=; b=Cr4tHtWVA9bOh76WqCEdliD+QJvMu7IqLCQpqps40rsUJe/A9H58c2ioLy96BXB62F 2PKjbzTCQg84aMA7Uy1m4VOI5tWTYO5by9eWrNEidQhM6f9nFnVcS2+hv3chBIX6ppga c7rVnAXbouQIO2OCP0YdbebQVmM6TGuBv0aZOooM/6PBwMqwYE/6T/gTvE1tqr34o0VW D/nFLW4pdxODYdoQGtA6OrZnYnvmLp+LCHCTqiouRAQZpboNlYVKRyWeWpd0016HmH71 lBuuGh7KIwl5kR+c5J+w3TfdGmVoFaVUg2ZGA7CfDTWl+n/qk69hf2cslZ+PJZvP5gWZ ismQ== X-Gm-Message-State: AOJu0Ywfxn4uavv/4xWS1Lji01N5KbArHRNlBFpoyXzSX7Yes6E6Uviw PrUS2dYSBXLMol5+b0d5aa+LhDH5MJgHxKjljb8= X-Google-Smtp-Source: AGHT+IGjVX6posKuB8v4BK+zkIqvM9kKC7HP6s4nosEPe9Ywj7lznycbs0exmy7Fu+fLjRkcsJSSQu5HRRXfXLyKtDc= X-Received: by 2002:a05:6512:2810:b0:504:30eb:f2ac with SMTP id cf16-20020a056512281000b0050430ebf2acmr8156256lfb.68.1698669985384; Mon, 30 Oct 2023 05:46:25 -0700 (PDT) MIME-Version: 1.0 References: <5196826c-e81a-ab5c-63e9-bd8509232da0@siemens.com> <87msw4uolq.fsf@euler.schwinge.homeip.net> In-Reply-To: <87msw4uolq.fsf@euler.schwinge.homeip.net> From: Richard Biener Date: Mon, 30 Oct 2023 13:46:13 +0100 Message-ID: Subject: Re: [PATCH, OpenACC 2.7] Connect readonly modifier to points-to analysis To: Thomas Schwinge Cc: Chung-Lin Tang , gcc-patches@gcc.gnu.org, fortran@gcc.gnu.org, Catherine Moore , Tobias Burnus Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-1.4 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Fri, Oct 27, 2023 at 4:28=E2=80=AFPM Thomas Schwinge wrote: > > Hi! > > Richard, as the original author of 'SSA_NAME_POINTS_TO_READONLY_MEMORY': > 2018 commit 6214d5c7e7470bdd5ecbeae668c2522551bfebbc (Subversion r263958) > "Move const_parm trick to generic code"; 'gcc/tree.h': > > /* Nonzero if this SSA_NAME is known to point to memory that may not > be written to. This is set for default defs of function parameter= s > that have a corresponding r or R specification in the functions > fn spec attribute. This is used by alias analysis. */ > #define SSA_NAME_POINTS_TO_READONLY_MEMORY(NODE) \ > SSA_NAME_CHECK (NODE)->base.deprecated_flag > > ..., may I ask you to please help review the following patch > (full-quoted)? > > For context: this patch here ("second patch") depends on a first patch: > > "[PATCH, OpenACC 2.7] readonly modifier support in front-ends". That one > is still under review/rework; so you're not able to apply this second > patch here. > > In a nutshell: a 'readonly' modifier has been added to the OpenACC > 'copyin' clause (copy host to device memory, don't copy back at end of > region): > > | If the optional 'readonly' modifier appears, then the implementation ma= y assume that the data > | referenced by _var-list_ is never written to within the applicable regi= on. > > That is, for example (untested): > > #pragma acc routine > void escape(int *); > > int x[32] =3D [...]; > #pragma acc parallel copyin(readonly: x) > { > int a1 =3D x[3]; > escape(x); > int a2 =3D x[3]; // Per 'readonly', don't need to reload 'x[3]' her= e. > //x[22] =3D 0; // Invalid -- but no diagnostic mandated. > } > > What Chung-Lin's first patch does is mark the OMP clause for 'x' (not the > 'x' decl itself!) as 'readonly', via a new 'OMP_CLAUSE_MAP_READONLY' > flag. > > The actual optimization then is done in this second patch. Chung-Lin > found that he could use 'SSA_NAME_POINTS_TO_READONLY_MEMORY' for that. > I don't have much experience with most of the following generic code, so > would appreciate a helping hand, whether that conceptually makes sense as > well as from the implementation point of view: No, I don't think you can use that flag on non-default-defs, nor preserve it on copying. So it also doesn't nicely extend to DECLs as done by the patch. We currently _only_ use it for incoming parameters. When used on arbitrary code you can get to for ex= ample ptr1(points-to-readony-memory) =3D &p->x; ... access via ptr1 ... ptr2 =3D &p->x; ... access via ptr2 ... where both are your OMP regions differently constrained (the constrain is o= n the code in the region, _not_ on the actual protections of the pointed to data, much like for the fortran case). But now CSE comes along and happily replaces all pt= r2 with ptr2 in the second region and ... oops! So no, re-using SSA_NAME_POINTS_TO_READONLY_MEMORY doesn't look good. Richard. > On 2023-07-25T23:52:06+0800, Chung-Lin Tang via Gcc-patches wrote: > > On 2023/7/11 2:33 AM, Chung-Lin Tang via Gcc-patches wrote: > >> As we discussed earlier, the work for actually linking this to middle-= end > >> points-to analysis is a somewhat non-trivial issue. This first patch a= llows > >> the language feature to be used in OpenACC directives first (with no e= ffect for now). > >> The middle-end changes are probably going to be a later patch. > > > > This second patch tries to link the readonly modifier to points-to anal= ysis. > > > > There already exists SSA_NAME_POINTS_TO_READONLY_MEMORY and it's suppor= t in the > > alias oracle routines in tree-ssa-alias.cc, so basically what this patc= h does is > > try to make the variables holding the array section base pointers to ha= ve this > > flag set. > > > > There is an another OMP_CLAUSE_MAP_POINTS_TO_READONLY set by front-ends= on the > > associated pointer clauses if OMP_CLAUSE_MAP_READONLY is set. > > Also a DECL_POINTS_TO_READONLY flag is set for VAR_DECLs when creating = the tmp > > vars carrying these receiver references on the offloaded side. These > > eventually get translated to SSA_NAME_POINTS_TO_READONLY_MEMORY. > > > > This still doesn't always work as expected in terms of optimization: > > struct pointer fields and Fortran arrays (kind of like C structs) which= have > > several accesses to create the pointer access on the receive/offloaded = side, > > and SRA appears to not work on these sequences, so gets in the way of m= uch > > redundancy elimination. > > I understand correctly that this is left as future work? Please add the = test > cases you have, XFAILed in some reasonable way. > > > > Currently have one testcase where we can demonstrate 'readonly' can avo= id > > a clobber by function call. > > :-) > > > > --- a/gcc/c/c-typeck.cc > > +++ b/gcc/c/c-typeck.cc > > @@ -14258,6 +14258,8 @@ handle_omp_array_sections (tree c, enum c_omp_r= egion_type ort) > > OMP_CLAUSE_SET_MAP_KIND (c2, GOMP_MAP_ATTACH_DETACH); > > else > > OMP_CLAUSE_SET_MAP_KIND (c2, GOMP_MAP_FIRSTPRIVATE_POINTER); > > + if (OMP_CLAUSE_MAP_READONLY (c)) > > + OMP_CLAUSE_MAP_POINTS_TO_READONLY (c2) =3D 1; > > OMP_CLAUSE_MAP_IMPLICIT (c2) =3D OMP_CLAUSE_MAP_IMPLICIT (c); > > if (OMP_CLAUSE_MAP_KIND (c2) !=3D GOMP_MAP_FIRSTPRIVATE_POINTER > > && !c_mark_addressable (t)) > > > --- a/gcc/cp/semantics.cc > > +++ b/gcc/cp/semantics.cc > > @@ -5872,6 +5872,8 @@ handle_omp_array_sections (tree c, enum c_omp_reg= ion_type ort) > > } > > else > > OMP_CLAUSE_SET_MAP_KIND (c2, GOMP_MAP_FIRSTPRIVATE_POINTER); > > + if (OMP_CLAUSE_MAP_READONLY (c)) > > + OMP_CLAUSE_MAP_POINTS_TO_READONLY (c2) =3D 1; > > OMP_CLAUSE_MAP_IMPLICIT (c2) =3D OMP_CLAUSE_MAP_IMPLICIT (c); > > if (OMP_CLAUSE_MAP_KIND (c2) !=3D GOMP_MAP_FIRSTPRIVATE_POINTER > > && !cxx_mark_addressable (t)) > > > --- a/gcc/fortran/trans-openmp.cc > > +++ b/gcc/fortran/trans-openmp.cc > > @@ -2524,6 +2524,8 @@ gfc_trans_omp_array_section (stmtblock_t *block, = gfc_exec_op op, > > node3 =3D build_omp_clause (input_location, OMP_CLAUSE_MAP); > > OMP_CLAUSE_SET_MAP_KIND (node3, ptr_kind); > > OMP_CLAUSE_DECL (node3) =3D gfc_conv_descriptor_data_get (decl); > > + if (n->u.readonly) > > + OMP_CLAUSE_MAP_POINTS_TO_READONLY (node3) =3D 1; > > /* This purposely does not include GOMP_MAP_ALWAYS_POINTER. The= extra > > cast prevents gimplify.cc from recognising it as being part of t= he > > struct - and adding an 'alloc: for the 'desc.data' pointer, whic= h > > @@ -2559,6 +2561,8 @@ gfc_trans_omp_array_section (stmtblock_t *block, = gfc_exec_op op, > > OMP_CLAUSE_MAP); > > OMP_CLAUSE_SET_MAP_KIND (node3, ptr_kind); > > OMP_CLAUSE_DECL (node3) =3D decl; > > + if (n->u.readonly) > > + OMP_CLAUSE_MAP_POINTS_TO_READONLY (node3) =3D 1; > > } > > Could combine these two into one, after > 'if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (decl)))' reconverges here, like > where 'OMP_CLAUSE_SIZE (node3)' is set: > > > ptr2 =3D fold_convert (ptrdiff_type_node, ptr2); > > OMP_CLAUSE_SIZE (node3) =3D fold_build2 (MINUS_EXPR, ptrdiff_type_no= de, > > Is 'n->u.readonly =3D=3D OMP_CLAUSE_MAP_READONLY (node)'? If yes, would = the > latter be clearer to use as the 'if' expression (like in C, C++ front > ends)? > > I see further additional 'OMP_CLAUSE_MAP' clauses synthesized, for > example in 'gcc/cp/semantics.cc:handle_omp_array_sections', or > 'gcc/fortran/trans-openmp.cc:gfc_trans_omp_array_section', also > 'gcc/gimplify.cc'. I assume these are not relevant to have > 'OMP_CLAUSE_MAP_READONLY' -> 'OMP_CLAUSE_MAP_POINTS_TO_READONLY' > propagated? Actually, per your changes (see below), there is one > 'OMP_CLAUSE_MAP_POINTS_TO_READONLY' propagation in > 'gcc/gimplify.cc:build_omp_struct_comp_nodes'. > > Is the current situation re flag setting/propagation what was empirically > necessary to make the test case work, or is it a systematic review? (The > former is fine; I'd just like to know.) > > > --- a/gcc/gimple-expr.cc > > +++ b/gcc/gimple-expr.cc > > @@ -376,6 +376,8 @@ copy_var_decl (tree var, tree name, tree type) > > DECL_CONTEXT (copy) =3D DECL_CONTEXT (var); > > TREE_USED (copy) =3D 1; > > DECL_SEEN_IN_BIND_EXPR_P (copy) =3D 1; > > + if (VAR_P (var)) > > + DECL_POINTS_TO_READONLY (copy) =3D DECL_POINTS_TO_READONLY (var); > > DECL_ATTRIBUTES (copy) =3D DECL_ATTRIBUTES (var); > > if (DECL_USER_ALIGN (var)) > > { > > > --- a/gcc/gimplify.cc > > +++ b/gcc/gimplify.cc > > @@ -221,6 +221,7 @@ struct gimplify_omp_ctx > > splay_tree variables; > > hash_set *privatized_types; > > tree clauses; > > + hash_set *pt_readonly_ptrs; > > /* Iteration variables in an OMP_FOR. */ > > vec loop_iter_var; > > location_t location; > > @@ -628,6 +629,15 @@ internal_get_tmp_var (tree val, gimple_seq *pre_p,= gimple_seq *post_p, > > gimplify_expr (&val, pre_p, post_p, is_gimple_reg_rhs_or_call, > > fb_rvalue); > > > > + bool pt_readonly =3D false; > > + if (gimplify_omp_ctxp && gimplify_omp_ctxp->pt_readonly_ptrs) > > + { > > + tree ptr =3D val; > > + if (TREE_CODE (ptr) =3D=3D POINTER_PLUS_EXPR) > > + ptr =3D TREE_OPERAND (ptr, 0); > > + pt_readonly =3D gimplify_omp_ctxp->pt_readonly_ptrs->contains (p= tr); > > + } > > 'POINTER_PLUS_EXPR' is the only special thing we may run into, here? > (Generally, I prefer 'if', 'else if, [...], 'else gcc_unreachable ()'.) > > > + > > if (allow_ssa > > && gimplify_ctxp->into_ssa > > && is_gimple_reg_type (TREE_TYPE (val))) > > @@ -639,9 +649,18 @@ internal_get_tmp_var (tree val, gimple_seq *pre_p,= gimple_seq *post_p, > > if (name) > > SET_SSA_NAME_VAR_OR_IDENTIFIER (t, create_tmp_var_name (name)= ); > > } > > + if (pt_readonly) > > + SSA_NAME_POINTS_TO_READONLY_MEMORY (t) =3D 1; > > } > > else > > - t =3D lookup_tmp_var (val, is_formal, not_gimple_reg); > > + { > > + t =3D lookup_tmp_var (val, is_formal, not_gimple_reg); > > + if (pt_readonly) > > + { > > + DECL_POINTS_TO_READONLY (t) =3D 1; > > + gimplify_omp_ctxp->pt_readonly_ptrs->add (t); > > + } > > + } > > > > mod =3D build2 (INIT_EXPR, TREE_TYPE (t), t, unshare_expr (val)); > > > > @@ -8906,6 +8925,8 @@ build_omp_struct_comp_nodes (enum tree_code code,= tree grp_start, tree grp_end, > > OMP_CLAUSE_SET_MAP_KIND (c2, mkind); > > OMP_CLAUSE_DECL (c2) =3D unshare_expr (OMP_CLAUSE_DECL (grp_end)); > > OMP_CLAUSE_CHAIN (c2) =3D NULL_TREE; > > + if (OMP_CLAUSE_MAP_POINTS_TO_READONLY (grp_end)) > > + OMP_CLAUSE_MAP_POINTS_TO_READONLY (c2) =3D 1; > > tree grp_mid =3D NULL_TREE; > > if (OMP_CLAUSE_CHAIN (grp_start) !=3D grp_end) > > grp_mid =3D OMP_CLAUSE_CHAIN (grp_start); > > For my understanding, is this empirically necessary, or a systematic > review? > > > @@ -11741,6 +11762,16 @@ gimplify_scan_omp_clauses (tree *list_p, gimpl= e_seq *pre_p, > > > > gimplify_omp_ctxp =3D outer_ctx; > > } > > + else if (OMP_CLAUSE_CODE (c) =3D=3D OMP_CLAUSE_MAP > > + && (code =3D=3D OACC_PARALLEL > > + || code =3D=3D OACC_KERNELS > > + || code =3D=3D OACC_SERIAL) > > + && OMP_CLAUSE_MAP_POINTS_TO_READONLY (c)) > > + { > > + if (ctx->pt_readonly_ptrs =3D=3D NULL) > > + ctx->pt_readonly_ptrs =3D new hash_set= (); > > + ctx->pt_readonly_ptrs->add (OMP_CLAUSE_DECL (c)); > > + } > > if (notice_outer) > > goto do_notice; > > break; > > Also need to 'delete ctx->pt_readonly_ptrs;' somewhere. > > > --- a/gcc/omp-low.cc > > +++ b/gcc/omp-low.cc > > @@ -14098,6 +14098,8 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, = omp_context *ctx) > > if (ref_to_array) > > x =3D fold_convert_loc (clause_loc, TREE_TYPE (new_var)= , x); > > gimplify_expr (&x, &new_body, NULL, is_gimple_val, fb_rva= lue); > > + if (OMP_CLAUSE_MAP_POINTS_TO_READONLY (c) && VAR_P (x)) > > + DECL_POINTS_TO_READONLY (x) =3D 1; > > if ((is_ref && !ref_to_array) > > || ref_to_ptr) > > { > > This is in the middle of the > "Handle GOMP_MAP_FIRSTPRIVATE_{POINTER,REFERENCE} in second pass" code > block. Again, for my understanding, is this empirically necessary, or a > systematic review? > > > --- a/gcc/testsuite/c-c++-common/goacc/readonly-1.c > > +++ b/gcc/testsuite/c-c++-common/goacc/readonly-1.c > > @@ -19,8 +19,8 @@ int main (void) > > return 0; > > } > > > > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(r= eadonly,to:\\*s.ptr \\\[len: 64\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[= len: 128\\\]\\)" 1 "original" { target { c } } } } */ > > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(r= eadonly,to:\\*NON_LVALUE_EXPR \\\[len: 64\\\]\\) .+ map\\(readonly,= to:x\\\[0\\\] \\\[len: 128\\\]\\)" 1 "original" { target { c++ } } } } */ > > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(r= eadonly,to:\\*s.ptr \\\[len: 64\\\]\\) map\\(pt_readonly,attach_detach:s.pt= r \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: 128\\\]\\) map\\= (pt_readonly,firstprivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 "original= " { target { c } } } } */ > > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(r= eadonly,to:\\*NON_LVALUE_EXPR \\\[len: 64\\\]\\) map\\(pt_readonly,= attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len= : 128\\\]\\) map\\(pt_readonly,firstprivate:x \\\[pointer assign, bias: 0\\= \]\\)" 1 "original" { target { c++ } } } } */ > > /* { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(readonl= y:x\\\[0\\\] \\\[len: 128\\\]\\);$" 1 "original" } } */ > > I suppose the new 'map(pt_readonly,attach_detach:s.ptr [bias: 0])' clause > was previously "hidden" in '.+'? Please then change that in the first > patch "[PATCH, OpenACC 2.7] readonly modifier support in front-ends", so > that we can see here what actually is changing (only 'pt_readonly', I > suppose). > > > --- /dev/null > > +++ b/gcc/testsuite/c-c++-common/goacc/readonly-2.c > > @@ -0,0 +1,15 @@ > > +/* { dg-additional-options "-O -fdump-tree-fre" } */ > > + > > +#pragma acc routine > > +extern void foo (int *ptr, int val); > > + > > +int main (void) > > +{ > > + int r, a[32]; > > + #pragma acc parallel copyin(readonly: a[:32]) copyout(r) > > + { > > + foo (a, a[8]); > > + r =3D a[8]; > > + } > > +} > > +/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ =3D MEM\\\[\[^_\]+_= \[0-9\]+\\(ptro\\)\\\]\\\[8\\\];" 1 "fre1" } } */ > > Please add a comment why 'fre1', and what generally is being checked > here; that's not obvious to the casual reader. (That is, me in a few > weeks.) ;-) > > Also add a scan for "before the optimization": two 'MEM's, I suppose? > > > --- a/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90 > > +++ b/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90 > > @@ -20,8 +20,8 @@ program main > > !$acc end parallel > > end program main > > > > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(re= adonly,to:\\*\\(integer\\(kind=3D4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: = .+\\\]\\) .+ map\\(readonly,to:\\*\\(integer\\(kind=3D4\\)\\\[0:\\\] \\*\\)= parm.*data \\\[len: .+\\\]\\)" 1 "original" } } > > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(re= adonly,to:a\\\[\\(\\(integer\\(kind=3D8\\)\\) parm.*data - \\(integer\\(kin= d=3D8\\)\\) &a\\) / 4\\\] \\\[len: .+\\\]\\) .+ map\\(readonly,to:b\\\[\\(\= \(integer\\(kind=3D8\\)\\) parm.*data - \\(integer\\(kind=3D8\\)\\) &b\\) /= 4\\\] \\\[len: .+\\\]\\)" 1 "original" } } > > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(re= adonly,to:\\*\\(integer\\(kind=3D4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: = .+\\\]\\) map\\(pt_readonly,alloc:a.0 \\\[pointer assign, bias: \\(integer\= \(kind=3D8\\)\\) parm.*data - \\(integer\\(kind=3D8\\)\\) a.0\\\]\\) map\\(= readonly,to:\\*\\(integer\\(kind=3D4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len= : .+\\\]\\) map\\(pt_readonly,alloc:b \\\[pointer assign, bias: \\(integer\= \(kind=3D8\\)\\) parm.*data - \\(integer\\(kind=3D8\\)\\) b\\\]\\)" 1 "orig= inal" } } > > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(re= adonly,to:a\\\[\\(\\(integer\\(kind=3D8\\)\\) parm.*data - \\(integer\\(kin= d=3D8\\)\\) &a\\) / 4\\\] \\\[len: .+\\\]\\) map\\(pt_readonly,alloc:a \\\[= pointer assign, bias: \\(integer\\(kind=3D8\\)\\) parm.*data - \\(integer\\= (kind=3D8\\)\\) &a\\\]\\) map\\(readonly,to:b\\\[\\(\\(integer\\(kind=3D8\\= )\\) parm.*data - \\(integer\\(kind=3D8\\)\\) &b\\) / 4\\\] \\\[len: .+\\\]= \\) map\\(pt_readonly,alloc:b \\\[pointer assign, bias: \\(integer\\(kind= =3D8\\)\\) parm.*data - \\(integer\\(kind=3D8\\)\\) &b\\\]\\)" 1 "original"= } } > > ! { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(readonly= :\\*\\(integer\\(kind=3D4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\)= \\(readonly:\\*\\(integer\\(kind=3D4\\)\\\[0:\\\] \\*\\) parm.*data \\\[le= n: .+\\\]\\);" 2 "original" } } > > Same comment as for 'c-c++-common/goacc/readonly-1.c'. > > > --- a/gcc/tree-pretty-print.cc > > +++ b/gcc/tree-pretty-print.cc > > @@ -907,6 +907,8 @@ dump_omp_clause (pretty_printer *pp, tree clause, i= nt spc, dump_flags_t flags) > > pp_string (pp, "map("); > > if (OMP_CLAUSE_MAP_READONLY (clause)) > > pp_string (pp, "readonly,"); > > + if (OMP_CLAUSE_MAP_POINTS_TO_READONLY (clause)) > > + pp_string (pp, "pt_readonly,"); > > switch (OMP_CLAUSE_MAP_KIND (clause)) > > { > > case GOMP_MAP_ALLOC: > > @@ -3436,6 +3438,8 @@ dump_generic_node (pretty_printer *pp, tree node,= int spc, dump_flags_t flags, > > pp_string (pp, "(D)"); > > if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (node)) > > pp_string (pp, "(ab)"); > > + if (SSA_NAME_POINTS_TO_READONLY_MEMORY (node)) > > + pp_string (pp, "(ptro)"); > > break; > > > > case WITH_SIZE_EXPR: > > > --- a/gcc/tree-ssanames.cc > > +++ b/gcc/tree-ssanames.cc > > @@ -402,6 +402,9 @@ make_ssa_name_fn (struct function *fn, tree var, gi= mple *stmt, > > else > > SSA_NAME_RANGE_INFO (t) =3D NULL; > > > > + if (VAR_P (var) && DECL_POINTS_TO_READONLY (var)) > > + SSA_NAME_POINTS_TO_READONLY_MEMORY (t) =3D 1; > > + > > SSA_NAME_IN_FREE_LIST (t) =3D 0; > > SSA_NAME_IS_DEFAULT_DEF (t) =3D 0; > > init_ssa_name_imm_use (t); > > > --- a/gcc/tree.h > > +++ b/gcc/tree.h > > @@ -1021,6 +1021,13 @@ extern void omp_clause_range_check_failed (const= _tree, const char *, int, > > #define DECL_HIDDEN_STRING_LENGTH(NODE) \ > > (TREE_CHECK (NODE, PARM_DECL)->decl_common.decl_nonshareable_flag) > > > > +/* In a VAR_DECL, set for variables regarded as pointing to memory not= written > > + to. SSA_NAME_POINTS_TO_READONLY_MEMORY gets set for SSA_NAMEs creat= ed from > > + such VAR_DECLs. Currently used by OpenACC 'readonly' modifier in co= pyin > > + clauses. */ > > +#define DECL_POINTS_TO_READONLY(NODE) \ > > + (TREE_CHECK (NODE, VAR_DECL)->decl_common.decl_not_flexarray) > > + > > /* In a CALL_EXPR, means that the call is the jump from a thunk to the > > thunked-to function. Be careful to avoid using this macro when one= of the > > next two applies instead. */ > > @@ -1815,6 +1822,10 @@ class auto_suppress_location_wrappers > > #define OMP_CLAUSE_MAP_READONLY(NODE) \ > > TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP)) > > > > +/* Set if 'OMP_CLAUSE_DECL (NODE)' points to read-only memory. */ > > +#define OMP_CLAUSE_MAP_POINTS_TO_READONLY(NODE) \ > > + TREE_CONSTANT (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP)) > > + > > /* Same as above, for use in OpenACC cache directives. */ > > #define OMP_CLAUSE__CACHE__READONLY(NODE) \ > > TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE__CACHE_)) > > As in my "[PATCH, OpenACC 2.7] readonly modifier support in front-ends" > review, please document how certain flags are used for OMP clauses. > > > I note you're not actually using 'OMP_CLAUSE__CACHE__READONLY' anywhere > -- but that's OK given the current 'gcc/gimplify.cc:gimplify_oacc_cache'. > ;-) > > > Gr=C3=BC=C3=9Fe > Thomas > ----------------- > Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstra=C3=9Fe 2= 01, 80634 M=C3=BCnchen; Gesellschaft mit beschr=C3=A4nkter Haftung; Gesch= =C3=A4ftsf=C3=BChrer: Thomas Heurung, Frank Th=C3=BCrauf; Sitz der Gesellsc= haft: M=C3=BCnchen; Registergericht M=C3=BCnchen, HRB 106955