From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wm1-x32e.google.com (mail-wm1-x32e.google.com [IPv6:2a00:1450:4864:20::32e]) by sourceware.org (Postfix) with ESMTPS id A932B385841C for ; Thu, 11 Apr 2024 14:29:13 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org A932B385841C Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=baylibre.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=baylibre.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org A932B385841C Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2a00:1450:4864:20::32e ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1712845759; cv=none; b=oXIAyGgbGEx+VU5K+xupYA4pAbcQs7btAmMi0UnjhnTBsF4qwITxXSCUMsYKN3RmudQPhKXAn1xnyeNymhXWCIDoKXRkHDjb9zat7bel3q01kpkkDWX1QjNXT2iZmdiRvKQvO9dF5Ef69iaU/FZMx9KEUFyOykvcN/nA95nGGVo= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1712845759; c=relaxed/simple; bh=d7/dyTv/b5PiE3sS/nbCebSbI82rVT7iDd+LyhdMBLI=; h=DKIM-Signature:From:To:Subject:Date:Message-ID:MIME-Version; b=xF4FDmiiFqif1ur4u105UjAyhJe9B+VgF03qYtH6LH4d3aaaVV1pSszqrs1hOkZtC9RgByX2ZC7oUeQA/v5LGz9Gkx9mXvjj9K/deSD/F5PQ8wmeVCnkjZe15fDrMnyUOCKMCBbRaA7CG5GEHXJ+2JjLv2oYusPZ8NO4sIseaiI= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-wm1-x32e.google.com with SMTP id 5b1f17b1804b1-416c4767b07so14290935e9.0 for ; Thu, 11 Apr 2024 07:29:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=baylibre-com.20230601.gappssmtp.com; s=20230601; t=1712845752; x=1713450552; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:message-id:date:user-agent :references:in-reply-to:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=OLQwHgwU+7sXseEEVNzQZ/nnt0GjGjkS74moS78mHjA=; b=MlZHc0VRK3QynT8bER1+KcyQNmCN3o1fhgGxs58TZNR9ePHe5x/z5atAA3XIYtbnKC OWHcD79pivuZbvsQUFjsWFRtopaPsarL5SKaz4hbaHMIKyWqPm5ck6D/H7EDS2nKGKIp 1pRCD3tjgSJeE/ZwDpwvN+VOKFzDV/K8rmfBPzbA2aE1xrrr3N2tTB+GvtEs2I90sQin Xyj4RMNojAmkwl7NsFmhHhfw/WrbbcIe6+pCmd+Gw00PL+VBgt6LfMB7Kk1deO6mouvk 5SIpinc08h4zFz5Bs0hy6OPr6z/kux+lmyEXmxx+kZyaf+/JWmh1JZx+oa31DU57bw0U QakA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1712845752; x=1713450552; h=content-transfer-encoding:mime-version:message-id:date:user-agent :references:in-reply-to:subject:cc:to:from:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=OLQwHgwU+7sXseEEVNzQZ/nnt0GjGjkS74moS78mHjA=; b=E8/sSF4nIYin6ynkxIRflVPSySYIPthZmO0hfv4JDjZPIeLpJYQRKRFrr0H+3XcFiv xkyWbqFfnqzBjt4q/SW6F5B3akxuuT+LeOsBvRhxr31e4muDE4OXl3XOg7/EQCp8mZIT m8zHxmb1SqDbzCgxA7q14l5bSjxlrsiudz3YaVphkdosZCPCvi9TTPvC6Yj69hDNOR1w 0O3mNj64Ogx2Yz2M0sfAQ5j/lu/9GXTuxV15xP87GxE3Kkytub7eqztC+P0sfdSlq40z ryRy6OygNDATZz24FkG+fFniKOL+sdinCgejjI0rbax6Egkvbxr16MykXQM+/7PZ3oRt 37nw== X-Gm-Message-State: AOJu0Yy10c+0Y+RDqCw6lBu2HTemdE1C1AlaTc47W2LTfd9sSMYs4803 FCVWXVzS/wyuDLfCqv75KOpVkfOOGlJq6T6cCBJAmXNsn/v+UjKp6bz8c1o7qvc= X-Google-Smtp-Source: AGHT+IFy2uBPyo3yA8iu6GTO8j+wnA3HVjpqZvazagRJJIMu9e+5/cdRbpBaOF0LVinh1Bl/zPY+2Q== X-Received: by 2002:a05:600c:4fc8:b0:416:8a5d:1792 with SMTP id o8-20020a05600c4fc800b004168a5d1792mr4258026wmq.30.1712845752091; Thu, 11 Apr 2024 07:29:12 -0700 (PDT) Received: from euler.schwinge.homeip.net (p200300c8b70ce600fbf8323f8abe0da4.dip0.t-ipconnect.de. [2003:c8:b70c:e600:fbf8:323f:8abe:da4]) by smtp.gmail.com with ESMTPSA id b19-20020a05600c4e1300b004169f147ceasm2493064wmq.41.2024.04.11.07.29.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 11 Apr 2024 07:29:11 -0700 (PDT) From: Thomas Schwinge To: Chung-Lin Tang , Richard Biener Cc: gcc-patches@gcc.gnu.org Subject: Re: [PATCH, OpenACC 2.7] Connect readonly modifier to points-to analysis In-Reply-To: <4abdfb2f-30e8-44eb-82a4-250a3641d9e9@pllab.cs.nthu.edu.tw> References: <5196826c-e81a-ab5c-63e9-bd8509232da0@siemens.com> <87msw4uolq.fsf@euler.schwinge.homeip.net> <4abdfb2f-30e8-44eb-82a4-250a3641d9e9@pllab.cs.nthu.edu.tw> User-Agent: Notmuch/0.30+8~g47a4bad (https://notmuchmail.org) Emacs/29.2 (x86_64-pc-linux-gnu) Date: Thu, 11 Apr 2024 16:29:09 +0200 Message-ID: <87pluwdlca.fsf@euler.schwinge.ddns.net> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-5.2 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Hi Chung-Lin, Richard! >From me just a few mechanical pieces, see below. Richard, are you able to again comment on Chung-Lin's general strategy, as I'm not at all familiar with those parts of the code? On 2024-04-03T19:50:55+0800, Chung-Lin Tang w= rote: > On 2023/10/30 8:46 PM, Richard Biener wrote: >>> >>> What Chung-Lin's first patch does is mark the OMP clause for 'x' (not t= he >>> 'x' decl itself!) as 'readonly', via a new 'OMP_CLAUSE_MAP_READONLY' >>> flag. >>> >>> The actual optimization then is done in this second patch. Chung-Lin >>> found that he could use 'SSA_NAME_POINTS_TO_READONLY_MEMORY' for that. >>> I don't have much experience with most of the following generic code, so >>> would appreciate a helping hand, whether that conceptually makes sense = as >>> well as from the implementation point of view: > > First of all, I have removed all of the gimplify-stage scanning and setti= ng of > DECL_POINTS_TO_READONLY and SSA_NAME_POINTS_TO_READONLY_MEMORY (so no cha= nges to > gimplify.cc now) > > I remember this code was an artifact of earlier attempts to allow struct-= member > pointer mappings to also work (e.g. map(readonly:rec.ptr[:N])), but faile= d anyways. > I think the omp_data_* member accesses when building child function side > receiver_refs is blocking points-to analysis from working (didn't try dig= ging deeper) > > Also during gimplify, VAR_DECLs appeared to be reused (at least in some c= ases) for map > clause decl reference building, so hoping that the variables "happen to b= e" single-use and > DECL_POINTS_TO_READONLY relaying into SSA_NAME_POINTS_TO_READONLY_MEMORY = does appear to be > a little risky. > > However, for firstprivate pointers processed during omp-low, it appears t= o be somewhat different. > (see below description) > >> No, I don't think you can use that flag on non-default-defs, nor >> preserve it on copying. So >> it also doesn't nicely extend to DECLs as done by the patch. We >> currently _only_ use it >> for incoming parameters. When used on arbitrary code you can get to for= example >>=20 >> ptr1(points-to-readony-memory) =3D &p->x; >> ... access via ptr1 ... >> ptr2 =3D &p->x; >> ... access via ptr2 ... >>=20 >> where both are your OMP regions differently constrained (the constrain i= s on the >> code in the region, _not_ on the actual protections of the pointed to >> data, much like >> for the fortran case). But now CSE comes along and happily replaces all= ptr2 >> with ptr2 in the second region and ... oops! > > Richard, I assume what you meant was "happily replaces all ptr2 with ptr1= in the second region"? > > That doesn't happen, because during omp-lower/expand, OMP target regions = (which is all that > this applies currently) is separated into different individual child func= tions. > > (Currently, the only "effective" use of DECL_POINTS_TO_READONLY is during= omp-lower, when > for firstprivate pointers (i.e. 'a' here) we set this bit when constructi= ng the first load > of this pointer) > > #pragma acc parallel copyin(readonly: a[:32]) copyout(r) > { > foo (a, a[8]); > r =3D a[8]; > } > #pragma acc parallel copyin(readonly: a[:32]) copyout(r) > { > foo (a, a[12]); > r =3D a[12]; > } > > After omp-expand (before SSA): > > __attribute__((oacc parallel, omp target entrypoint, noclone)) > void main._omp_fn.1 (const struct .omp_data_t.3 & restrict .omp_data_i) > { > ... > : > D.2962 =3D .omp_data_i->D.2947; > a.8 =3D D.2962; > r.1 =3D (*a.8)[12]; > foo (a.8, r.1); > r.1 =3D (*a.8)[12]; > D.2965 =3D .omp_data_i->r; > *D.2965 =3D r.1; > return; > } > > __attribute__((oacc parallel, omp target entrypoint, noclone)) > void main._omp_fn.0 (const struct .omp_data_t.2 & restrict .omp_data_i) > { > ... > : > D.2968 =3D .omp_data_i->D.2939; > a.4 =3D D.2968; > r.0 =3D (*a.4)[8]; > foo (a.4, r.0); > r.0 =3D (*a.4)[8]; > D.2971 =3D .omp_data_i->r; > *D.2971 =3D r.0; > return; > } > > So actually, the creating of DECL_POINTS_TO_READONLY and its relaying to > SSA_NAME_POINTS_TO_READONLY_MEMORY here, is actually quite similar to a d= efault-def > for an PARM_DECL, at least conceptually. > > (If offloading was structured significantly differently, say if child fun= ctions > were separated much earlier before omp-lowering, than this readonly-modif= ier might > possibly be a direct application of 'r' in the "fn spec" attribute) > > Other changes since first version of patch include: > 1) update of C/C++ FE changes to new style in c-family/c-omp.cc > 2) merging of two if cases in fortran/trans-openmp.cc like Thomas suggest= ed > 3) Update of readonly-2.c testcase to scan before/after "fre1" pass, to v= erify removal of a MEM load, also as Thomas suggested. Thanks! > I have re-tested this patch using mainline, with no regressions. Is this = okay for mainline? > 2024-04-03 Chung-Lin Tang > > gcc/c-family/ChangeLog: > > * c-omp.cc (c_omp_address_inspector::expand_array_base): > Set OMP_CLAUSE_MAP_POINTS_TO_READONLY on pointer clause. > (c_omp_address_inspector::expand_component_selector): Likewise. > > gcc/fortran/ChangeLog: > > * trans-openmp.cc (gfc_trans_omp_array_section): > Set OMP_CLAUSE_MAP_POINTS_TO_READONLY on pointer clause. > > gcc/ChangeLog: > > * gimple-expr.cc (copy_var_decl): Copy DECL_POINTS_TO_READONLY > for VAR_DECLs. > * omp-low.cc (lower_omp_target): Set DECL_POINTS_TO_READONLY for > variables of receiver refs. > * tree-pretty-print.cc (dump_omp_clause): > Print OMP_CLAUSE_MAP_POINTS_TO_READONLY. > (dump_generic_node): Print SSA_NAME_POINTS_TO_READONLY_MEMORY. > * tree-ssanames.cc (make_ssa_name_fn): Set > SSA_NAME_POINTS_TO_READONLY_MEMORY if DECL_POINTS_TO_READONLY is set. > * tree.h (DECL_POINTS_TO_READONLY): New macro. > (OMP_CLAUSE_MAP_POINTS_TO_READONLY): New macro. > > gcc/testsuite/ChangeLog: > > * c-c++-common/goacc/readonly-1.c: Adjust testcase. > * c-c++-common/goacc/readonly-2.c: New testcase. > * gfortran.dg/goacc/readonly-1.f90: Adjust testcase. > --- a/gcc/c-family/c-omp.cc > +++ b/gcc/c-family/c-omp.cc > @@ -3907,6 +3907,8 @@ c_omp_address_inspector::expand_array_base (tree c, > } > else if (c2) > { > + if (OMP_CLAUSE_MAP_READONLY (c)) > + OMP_CLAUSE_MAP_POINTS_TO_READONLY (c2) =3D 1; > OMP_CLAUSE_CHAIN (c2) =3D OMP_CLAUSE_CHAIN (c); > OMP_CLAUSE_CHAIN (c) =3D c2; > if (implicit_p) > @@ -4051,6 +4053,8 @@ c_omp_address_inspector::expand_component_selector = (tree c, > } > else if (c2) > { > + if (OMP_CLAUSE_MAP_READONLY (c)) > + OMP_CLAUSE_MAP_POINTS_TO_READONLY (c2) =3D 1; > OMP_CLAUSE_CHAIN (c2) =3D OMP_CLAUSE_CHAIN (c); > OMP_CLAUSE_CHAIN (c) =3D c2; > c =3D c2; (So this replaces the 'gcc/c/c-typeck.cc:handle_omp_array_sections', 'gcc/cp/semantics.cc:handle_omp_array_sections' changes of the previous patch revision?) Are we sure that really only the 'else if (c2)' branches need to handle this, and explicitly not the preceding 'if (c3)' branches, too? I suggest we add a comment and/or handling, as necessary. If that makes sense, maybe handle for both 'c3', 'c2' via a 'bool readonly_p =3D [...]', similar to the existing 'bool implicit_p'? > --- a/gcc/fortran/trans-openmp.cc > +++ b/gcc/fortran/trans-openmp.cc > @@ -2561,6 +2561,8 @@ gfc_trans_omp_array_section (stmtblock_t *block, gf= c_exec_op op, > ptr2 =3D fold_convert (ptrdiff_type_node, ptr2); > OMP_CLAUSE_SIZE (node3) =3D fold_build2 (MINUS_EXPR, ptrdiff_type_node, > ptr, ptr2); > + if (n->u.map.readonly) > + OMP_CLAUSE_MAP_POINTS_TO_READONLY (node3) =3D 1; > } >=20=20 > static tree > --- a/gcc/gimple-expr.cc > +++ b/gcc/gimple-expr.cc > @@ -385,6 +385,8 @@ copy_var_decl (tree var, tree name, tree type) > DECL_CONTEXT (copy) =3D DECL_CONTEXT (var); > TREE_USED (copy) =3D 1; > DECL_SEEN_IN_BIND_EXPR_P (copy) =3D 1; > + if (VAR_P (var)) > + DECL_POINTS_TO_READONLY (copy) =3D DECL_POINTS_TO_READONLY (var); > DECL_ATTRIBUTES (copy) =3D DECL_ATTRIBUTES (var); > if (DECL_USER_ALIGN (var)) > { > --- a/gcc/omp-low.cc > +++ b/gcc/omp-low.cc > @@ -14003,6 +14003,8 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, om= p_context *ctx) > if (ref_to_array) > x =3D fold_convert_loc (clause_loc, TREE_TYPE (new_var), x); > gimplify_expr (&x, &new_body, NULL, is_gimple_val, fb_rvalue); > + if (OMP_CLAUSE_MAP_POINTS_TO_READONLY (c) && VAR_P (x)) > + DECL_POINTS_TO_READONLY (x) =3D 1; > if ((is_ref && !ref_to_array) > || ref_to_ptr) > { > --- a/gcc/testsuite/c-c++-common/goacc/readonly-1.c > +++ b/gcc/testsuite/c-c++-common/goacc/readonly-1.c > @@ -48,17 +48,17 @@ int main (void) >=20=20 > /* { dg-final { scan-tree-dump-times "(?n)#pragma acc declare map\\(to:y= \\) map\\(readonly,to:s\\) map\\(readonly,to:x\\)" 1 "original" } } */ >=20=20 > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(to:= y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr \\\[len: = \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1= "original" { target { c } } } } */ > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(to:y= \\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr \\\[len: \= [0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 = "original" { target { c } } } } */ > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(to:y\= \\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr \\\[len: \[= 0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "= original" { target { c } } } } */ > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(to:y\\\= [0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]= +\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "origi= nal" { target { c } } } } */ > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(t= o:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*s.ptr \\\[len: \= [0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 = "original" { target { c } } } } */ > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(to:= y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr \\\[len: = \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) ma= p\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,first= private:x \\\[pointer assign, bias: 0\\\]\\)" 1 "original" { target { c } }= } } */ > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(to:y= \\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr \\\[len: \= [0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map= \\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,firstp= rivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 "original" { target { c } } = } } */ > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(to:y\= \\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr \\\[len: \[= 0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\= \(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,firstpr= ivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 "original" { target { c } } }= } */ > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(to:y\\\= [0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]= +\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(rea= donly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } }= } } */ > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(t= o:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*s.ptr \\\[len: \= [0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map= \\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target {= c } } } } */ >=20=20 > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(to:= y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*NON_LVALUE_EXPR= \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: = \[0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */ > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(to:y= \\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*NON_LVALUE_EXPR = \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \= [0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */ > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(to:y\= \\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*NON_LVALUE_EXPR <= s.ptr> \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[= 0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */ > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(to:y\\\= [0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*NON_LVALUE_EXPR \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]= +\\\]\\)" 1 "original" { target { c++ } } } } */ > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(t= o:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*NON_LVALUE_EXPR = \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \= [0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */ > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(to:= y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*NON_LVALUE_EXPR= \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\= [bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\= (pt_readonly,firstprivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 "original= " { target { c++ } } } } */ > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(to:y= \\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*NON_LVALUE_EXPR = \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[= bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(= pt_readonly,firstprivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 "original"= { target { c++ } } } } */ > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(to:y\= \\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*NON_LVALUE_EXPR <= s.ptr> \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[b= ias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(p= t_readonly,firstprivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 "original" = { target { c++ } } } } */ > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(to:y\\\= [0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*NON_LVALUE_EXPR \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: = 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original= " { target { c++ } } } } */ > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(t= o:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*NON_LVALUE_EXPR = \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[= bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "or= iginal" { target { c++ } } } } */ >=20=20 > /* { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(readonly:= x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\);$" 4 "original" } } */ > /* { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(y\\\[0\\\= ] \\\[len: \[0-9\]+\\\]\\);$" 4 "original" } } */ > --- /dev/null > +++ b/gcc/testsuite/c-c++-common/goacc/readonly-2.c > @@ -0,0 +1,16 @@ > +/* { dg-additional-options "-O -fdump-tree-phiprop -fdump-tree-fre" } */ > + > +#pragma acc routine > +extern void foo (int *ptr, int val); > + > +int main (void) > +{ > + int r, a[32]; > + #pragma acc parallel copyin(readonly: a[:32]) copyout(r) > + { > + foo (a, a[8]); > + r =3D a[8]; > + } > +} > +/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ =3D MEM\\\[\[^_\]+_\[= 0-9\]+\\(ptro\\)\\\]\\\[8\\\];" 2 "phiprop1" } } */ > +/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ =3D MEM\\\[\[^_\]+_\[= 0-9\]+\\(ptro\\)\\\]\\\[8\\\];" 1 "fre1" } } */ In the tree where I've been testing your patch, I've not been seeing 'MEM[x]' but '(*x)', therefore: -/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ =3D MEM\\\[\[^_\]+_= \[0-9\]+\\(ptro\\)\\\]\\\[8\\\];" 2 "phiprop1" } } */ -/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ =3D MEM\\\[\[^_\]+_= \[0-9\]+\\(ptro\\)\\\]\\\[8\\\];" 1 "fre1" } } */ +/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ =3D \\(\\\*_\[0-9\]= +\\(ptro\\)\\)\\\[8\\\];" 2 "phiprop1" } } */ +/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ =3D \\(\\\*_\[0-9\]= +\\(ptro\\)\\)\\\[8\\\];" 1 "fre1" } } */ Maybe that's due to something else in my (long...) Git branch; just make sure you've got PASSes here, eventually. > --- a/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90 > +++ b/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90 > @@ -80,16 +80,16 @@ end program main > ! The front end turns OpenACC 'declare' into OpenACC 'data'. > ! { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(readon= ly,to:\\*b\\) map\\(alloc:b.+ map\\(to:\\*c\\) map\\(alloc:c.+" 1 "original= " } } > ! { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(readon= ly,to:g\\) map\\(to:h\\)" 1 "original" } } > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(read= only,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map\\= (to:\\*.+ map\\(alloc:c.+" 1 "original" } } > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(read= only,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ map\\(to:= c.+ map\\(alloc:c.+" 1 "original" } } > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(reado= nly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map\\(= to:\\*.+ map\\(alloc:c.+" 1 "original" } } > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(reado= nly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ map\\(to:c= .+ map\\(alloc:c.+" 1 "original" } } > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(readon= ly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map\\(t= o:\\*.+ map\\(alloc:c.+" 1 "original" } } > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(readon= ly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ map\\(to:c.= + map\\(alloc:c.+" 1 "original" } } > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(readonly= ,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map\\(to:= \\*.+ map\\(alloc:c.+" 1 "original" } } > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(readonly= ,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ map\\(to:c.+ = map\\(alloc:c.+" 1 "original" } } > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(re= adonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map= \\(to:\\*.+ map\\(alloc:c.+" 1 "original" } } > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(re= adonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ map\\(t= o:c.+ map\\(alloc:c.+" 1 "original" } } > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(read= only,to:\\*.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:\\*.+ map\\(pt_= readonly,alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } } > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(read= only,to:a.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:b.+ map\\(pt_read= only,alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } } > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(reado= nly,to:\\*.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:\\*.+ map\\(pt_r= eadonly,alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } } > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(reado= nly,to:a.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:b.+ map\\(pt_reado= nly,alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } } > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(readon= ly,to:\\*.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:\\*.+ map\\(pt_re= adonly,alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } } > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(readon= ly,to:a.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:b.+ map\\(pt_readon= ly,alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } } > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(readonly= ,to:\\*.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:\\*.+ map\\(pt_read= only,alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } } > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(readonly= ,to:a.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:b.+ map\\(pt_readonly= ,alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } } > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(re= adonly,to:\\*.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:\\*.+ map\\(p= t_readonly,alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } } > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(re= adonly,to:a.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:b.+ map\\(pt_re= adonly,alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } } >=20=20 > ! { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(readonly:\= \*\\(integer\\(kind=3D4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\) \= \(readonly:\\*\\(integer\\(kind=3D4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len:= .+\\\]\\);" 8 "original" } } > ! { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(\\*\\(inte= ger\\(kind=3D4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\);" 8 "origi= nal" } } Can we also get an OpenACC/Fortran test case =C3=A0 la 'c-c++-common/goacc/readonly-2.c' to demonstrate this doing something? > --- a/gcc/testsuite/gfortran.dg/pr67170.f90 > +++ b/gcc/testsuite/gfortran.dg/pr67170.f90 > @@ -28,4 +28,4 @@ end subroutine foo > end module test_module > end program >=20=20 > -! { dg-final { scan-tree-dump-times "=3D \\*arg_\[0-9\]+\\(D\\);" 1 "fre= 1" } } > +! { dg-final { scan-tree-dump-times "=3D \\*arg_\[0-9\]+\\(D\\)\\(ptro\\= );" 1 "fre1" } } Is it understood what's happening here, that this is the correct behavior? I suppose so -- there's no actual change in behavior -- as this here doesn't trigger for OpenACC 'readonly' modifier, but just the pretty printer change for 'SSA_NAME_POINTS_TO_READONLY_MEMORY': > --- a/gcc/tree-pretty-print.cc > +++ b/gcc/tree-pretty-print.cc > @@ -915,6 +915,8 @@ dump_omp_clause (pretty_printer *pp, tree clause, int= spc, dump_flags_t flags) > pp_string (pp, "map("); > if (OMP_CLAUSE_MAP_READONLY (clause)) > pp_string (pp, "readonly,"); > + if (OMP_CLAUSE_MAP_POINTS_TO_READONLY (clause)) > + pp_string (pp, "pt_readonly,"); > switch (OMP_CLAUSE_MAP_KIND (clause)) > { > case GOMP_MAP_ALLOC: > @@ -3620,6 +3622,8 @@ dump_generic_node (pretty_printer *pp, tree node, i= nt spc, dump_flags_t flags, > pp_string (pp, "(D)"); > if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (node)) > pp_string (pp, "(ab)"); > + if (SSA_NAME_POINTS_TO_READONLY_MEMORY (node)) > + pp_string (pp, "(ptro)"); > break; >=20=20 > case WITH_SIZE_EXPR: > --- a/gcc/tree-ssanames.cc > +++ b/gcc/tree-ssanames.cc > @@ -402,6 +402,9 @@ make_ssa_name_fn (struct function *fn, tree var, gimp= le *stmt, > else > SSA_NAME_RANGE_INFO (t) =3D NULL; >=20=20 > + if (VAR_P (var) && DECL_POINTS_TO_READONLY (var)) > + SSA_NAME_POINTS_TO_READONLY_MEMORY (t) =3D 1; > + > SSA_NAME_IN_FREE_LIST (t) =3D 0; > SSA_NAME_IS_DEFAULT_DEF (t) =3D 0; > init_ssa_name_imm_use (t); > --- a/gcc/tree.h > +++ b/gcc/tree.h > @@ -1036,6 +1036,13 @@ extern void omp_clause_range_check_failed (const_t= ree, const char *, int, > #define DECL_HIDDEN_STRING_LENGTH(NODE) \ > (TREE_CHECK (NODE, PARM_DECL)->decl_common.decl_nonshareable_flag) >=20=20 > +/* In a VAR_DECL, set for variables regarded as pointing to memory not w= ritten > + to. SSA_NAME_POINTS_TO_READONLY_MEMORY gets set for SSA_NAMEs created= from > + such VAR_DECLs. Currently used by OpenACC 'readonly' modifier in copy= in > + clauses. */ > +#define DECL_POINTS_TO_READONLY(NODE) \ > + (TREE_CHECK (NODE, VAR_DECL)->decl_common.decl_not_flexarray) Again update the table for the flag uses are listed? (There is a 'VAR_DECL_CHECK', which hopefully means the same thing.) > + > /* In a CALL_EXPR, means that the call is the jump from a thunk to the > thunked-to function. Be careful to avoid using this macro when one o= f the > next two applies instead. */ > @@ -1845,6 +1852,10 @@ class auto_suppress_location_wrappers > #define OMP_CLAUSE_MAP_READONLY(NODE) \ > TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP)) >=20=20 > +/* Set if 'OMP_CLAUSE_DECL (NODE)' points to read-only memory. */ > +#define OMP_CLAUSE_MAP_POINTS_TO_READONLY(NODE) \ > + TREE_CONSTANT (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP)) > + > /* Same as above, for use in OpenACC cache directives. */ > #define OMP_CLAUSE__CACHE__READONLY(NODE) \ > TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE__CACHE_)) (Note, corresponding 'OMP_CLAUSE_MAP_POINTS_TO_READONLY' doesn't exist yet, due to missing actual handling of the OpenACC 'cache' directive; 'gcc/gimplify.cc:gimplify_oacc_cache'.) Gr=C3=BC=C3=9Fe Thomas