* [PATCH, OpenACC 2.7] readonly modifier support in front-ends
@ 2023-07-10 18:33 Chung-Lin Tang
2023-07-11 7:00 ` Tobias Burnus
` (2 more replies)
0 siblings, 3 replies; 18+ messages in thread
From: Chung-Lin Tang @ 2023-07-10 18:33 UTC
To: gcc-patches, Thomas Schwinge, Catherine Moore, Tobias Burnus
[-- Attachment #1: Type: text/plain, Size: 2628 bytes --]
Hi Thomas,
this patch contains support for the 'readonly' modifier in copyin clauses
and the cache directive.
As we discussed earlier, the work for actually linking this to middle-end
points-to analysis is a somewhat non-trivial issue. This first patch allows
the language feature to be used in OpenACC directives first (with no effect for now).
The middle-end changes are probably going to be a later patch.
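A minimal usage sketch of the modifier described above, for illustration only. The function name sum_readonly is hypothetical and not part of the patch; when compiled without -fopenacc the pragma is ignored and the loop simply runs on the host.

```c
#include <assert.h>

/* Hypothetical example, not part of the patch: sum the first N elements
   of X.  The 'readonly' modifier on 'copyin' states that X is not
   written inside the region.  */
int
sum_readonly (const int *x, int n)
{
  assert (n >= 0);
  int sum = 0;
#pragma acc parallel loop copyin(readonly: x[:n]) reduction(+:sum)
  for (int i = 0; i < n; i++)
    sum += x[i];
  return sum;
}
```

With this front-end-only patch the modifier is accepted and recorded on the clause, but it is not expected to change the generated code yet.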
(Also CCing Tobias because of the Fortran bits)
Tested on powerpc64le-linux with nvptx offloading. Is this okay for trunk?
Thanks,
Chung-Lin
2023-07-10 Chung-Lin Tang <cltang@codesourcery.com>
gcc/c/ChangeLog:
* c-parser.cc (c_parser_omp_var_list_parens):
Add 'bool *readonly = NULL' parameter, add readonly modifier parsing
support.
(c_parser_oacc_data_clause): Adjust c_parser_omp_var_list_parens call
to turn on readonly modifier parsing for copyin clause, set
OMP_CLAUSE_MAP_READONLY if readonly modifier found, update comments.
(c_parser_oacc_cache): Adjust c_parser_omp_var_list_parens call
to turn on readonly modifier parsing, set OMP_CLAUSE__CACHE__READONLY
if readonly modifier found, update comments.
gcc/cp/ChangeLog:
* parser.cc (cp_parser_omp_var_list):
Add 'bool *readonly = NULL' parameter, add readonly modifier parsing
support.
(cp_parser_oacc_data_clause): Adjust cp_parser_omp_var_list call
to turn on readonly modifier parsing for copyin clause, set
OMP_CLAUSE_MAP_READONLY if readonly modifier found, update comments.
(cp_parser_oacc_cache): Adjust cp_parser_omp_var_list call
to turn on readonly modifier parsing, set OMP_CLAUSE__CACHE__READONLY
if readonly modifier found, update comments.
gcc/fortran/ChangeLog:
* gfortran.h (typedef struct gfc_omp_namelist): Adjust map_op as
ENUM_BITFIELD field, add 'bool readonly' field.
* openmp.cc (gfc_match_omp_map_clause): Add 'bool readonly = false'
parameter, set n->u.readonly field.
(gfc_match_omp_clauses): Add readonly modifier parsing for OpenACC
copyin clause, adjust call to gfc_match_omp_map_clause.
(gfc_match_oacc_cache): Add readonly modifier parsing for OpenACC
cache directive, adjust call to gfc_match_omp_map_clause.
* trans-openmp.cc (gfc_trans_omp_clauses): Set OMP_CLAUSE_MAP_READONLY,
OMP_CLAUSE__CACHE__READONLY to 1 when readonly is set.
gcc/ChangeLog:
* tree-pretty-print.cc (dump_omp_clause): Add support for printing
OMP_CLAUSE_MAP_READONLY and OMP_CLAUSE__CACHE__READONLY.
* tree.h (OMP_CLAUSE_MAP_READONLY): New macro.
(OMP_CLAUSE__CACHE__READONLY): New macro.
gcc/testsuite/ChangeLog:
* c-c++-common/goacc/readonly-1.c: New test.
* gfortran.dg/goacc/readonly-1.f90: New test.
[-- Attachment #2: openacc-readonly-modifier.patch --]
[-- Type: text/plain, Size: 14732 bytes --]
diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index d4b98d5d8b6..09e1e89d793 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -14059,7 +14059,8 @@ c_parser_omp_variable_list (c_parser *parser,
static tree
c_parser_omp_var_list_parens (c_parser *parser, enum omp_clause_code kind,
- tree list, bool allow_deref = false)
+ tree list, bool allow_deref = false,
+ bool *readonly = NULL)
{
/* The clauses location. */
location_t loc = c_parser_peek_token (parser)->location;
@@ -14067,6 +14068,20 @@ c_parser_omp_var_list_parens (c_parser *parser, enum omp_clause_code kind,
matching_parens parens;
if (parens.require_open (parser))
{
+ if (readonly != NULL)
+ {
+ c_token *token = c_parser_peek_token (parser);
+ if (token->type == CPP_NAME
+ && !strcmp (IDENTIFIER_POINTER (token->value), "readonly")
+ && c_parser_peek_2nd_token (parser)->type == CPP_COLON)
+ {
+ c_parser_consume_token (parser);
+ c_parser_consume_token (parser);
+ *readonly = true;
+ }
+ else
+ *readonly = false;
+ }
list = c_parser_omp_variable_list (parser, loc, kind, list, allow_deref);
parens.skip_until_found_close (parser);
}
@@ -14084,7 +14099,11 @@ c_parser_omp_var_list_parens (c_parser *parser, enum omp_clause_code kind,
OpenACC 2.6:
no_create ( variable-list )
attach ( variable-list )
- detach ( variable-list ) */
+ detach ( variable-list )
+
+ OpenACC 2.7:
+ copyin (readonly : variable-list )
+ */
static tree
c_parser_oacc_data_clause (c_parser *parser, pragma_omp_clause c_kind,
@@ -14135,11 +14154,22 @@ c_parser_oacc_data_clause (c_parser *parser, pragma_omp_clause c_kind,
default:
gcc_unreachable ();
}
+
+ /* Turn on readonly modifier parsing for copyin clause. */
+ bool readonly = false, *readonly_ptr = NULL;
+ if (c_kind == PRAGMA_OACC_CLAUSE_COPYIN)
+ readonly_ptr = &readonly;
+
tree nl, c;
- nl = c_parser_omp_var_list_parens (parser, OMP_CLAUSE_MAP, list, true);
+ nl = c_parser_omp_var_list_parens (parser, OMP_CLAUSE_MAP, list, true,
+ readonly_ptr);
for (c = nl; c != list; c = OMP_CLAUSE_CHAIN (c))
- OMP_CLAUSE_SET_MAP_KIND (c, kind);
+ {
+ OMP_CLAUSE_SET_MAP_KIND (c, kind);
+ if (readonly)
+ OMP_CLAUSE_MAP_READONLY (c) = 1;
+ }
return nl;
}
@@ -18212,6 +18242,9 @@ c_parser_omp_structured_block (c_parser *parser, bool *if_p)
/* OpenACC 2.0:
# pragma acc cache (variable-list) new-line
+ OpenACC 2.7:
+ # pragma acc cache (readonly: variable-list) new-line
+
LOC is the location of the #pragma token.
*/
@@ -18219,8 +18252,14 @@ static tree
c_parser_oacc_cache (location_t loc, c_parser *parser)
{
tree stmt, clauses;
+ bool readonly;
+
+ clauses = c_parser_omp_var_list_parens (parser, OMP_CLAUSE__CACHE_, NULL,
+ false, &readonly);
+ if (readonly)
+ for (tree c = clauses; c; c = OMP_CLAUSE_CHAIN (c))
+ OMP_CLAUSE__CACHE__READONLY (c) = 1;
- clauses = c_parser_omp_var_list_parens (parser, OMP_CLAUSE__CACHE_, NULL);
clauses = c_finish_omp_clauses (clauses, C_ORT_ACC);
c_parser_skip_to_pragma_eol (parser);
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index acd1bd48af5..0f51289539b 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -37727,11 +37727,27 @@ cp_parser_omp_var_list_no_open (cp_parser *parser, enum omp_clause_code kind,
static tree
cp_parser_omp_var_list (cp_parser *parser, enum omp_clause_code kind, tree list,
- bool allow_deref = false)
+ bool allow_deref = false, bool *readonly = NULL)
{
if (cp_parser_require (parser, CPP_OPEN_PAREN, RT_OPEN_PAREN))
- return cp_parser_omp_var_list_no_open (parser, kind, list, NULL,
- allow_deref);
+ {
+ if (readonly != NULL)
+ {
+ cp_token *token = cp_lexer_peek_token (parser->lexer);
+ if (token->type == CPP_NAME
+ && !strcmp (IDENTIFIER_POINTER (token->u.value), "readonly")
+ && cp_lexer_nth_token_is (parser->lexer, 2, CPP_COLON))
+ {
+ cp_lexer_consume_token (parser->lexer);
+ cp_lexer_consume_token (parser->lexer);
+ *readonly = true;
+ }
+ else
+ *readonly = false;
+ }
+ return cp_parser_omp_var_list_no_open (parser, kind, list, NULL,
+ allow_deref);
+ }
return list;
}
@@ -37746,7 +37762,11 @@ cp_parser_omp_var_list (cp_parser *parser, enum omp_clause_code kind, tree list,
OpenACC 2.6:
no_create ( variable-list )
attach ( variable-list )
- detach ( variable-list ) */
+ detach ( variable-list )
+
+ OpenACC 2.7:
+ copyin (readonly : variable-list )
+ */
static tree
cp_parser_oacc_data_clause (cp_parser *parser, pragma_omp_clause c_kind,
@@ -37797,11 +37817,22 @@ cp_parser_oacc_data_clause (cp_parser *parser, pragma_omp_clause c_kind,
default:
gcc_unreachable ();
}
+
+ /* Turn on readonly modifier parsing for copyin clause. */
+ bool readonly = false, *readonly_ptr = NULL;
+ if (c_kind == PRAGMA_OACC_CLAUSE_COPYIN)
+ readonly_ptr = &readonly;
+
tree nl, c;
- nl = cp_parser_omp_var_list (parser, OMP_CLAUSE_MAP, list, true);
+ nl = cp_parser_omp_var_list (parser, OMP_CLAUSE_MAP, list, true,
+ readonly_ptr);
for (c = nl; c != list; c = OMP_CLAUSE_CHAIN (c))
- OMP_CLAUSE_SET_MAP_KIND (c, kind);
+ {
+ OMP_CLAUSE_SET_MAP_KIND (c, kind);
+ if (readonly)
+ OMP_CLAUSE_MAP_READONLY (c) = 1;
+ }
return nl;
}
@@ -45875,6 +45906,9 @@ cp_parser_omp_target (cp_parser *parser, cp_token *pragma_tok,
/* OpenACC 2.0:
# pragma acc cache (variable-list) new-line
+
+ OpenACC 2.7:
+ # pragma acc cache (readonly: variable-list) new-line
*/
static tree
@@ -45885,8 +45919,14 @@ cp_parser_oacc_cache (cp_parser *parser, cp_token *pragma_tok)
auto_suppress_location_wrappers sentinel;
tree stmt, clauses;
+ bool readonly;
+
+ clauses = cp_parser_omp_var_list (parser, OMP_CLAUSE__CACHE_, NULL,
+ false, &readonly);
+ if (readonly)
+ for (tree c = clauses; c; c = OMP_CLAUSE_CHAIN (c))
+ OMP_CLAUSE__CACHE__READONLY (c) = 1;
- clauses = cp_parser_omp_var_list (parser, OMP_CLAUSE__CACHE_, NULL_TREE);
clauses = finish_omp_clauses (clauses, C_ORT_ACC);
cp_parser_require_pragma_eol (parser, cp_lexer_peek_token (parser->lexer));
diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index cc7ba7c8846..9fa8962d63f 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -1360,7 +1360,11 @@ typedef struct gfc_omp_namelist
{
gfc_omp_reduction_op reduction_op;
gfc_omp_depend_doacross_op depend_doacross_op;
- gfc_omp_map_op map_op;
+ struct
+ {
+ ENUM_BITFIELD (gfc_omp_map_op) map_op:8;
+ bool readonly;
+ };
gfc_expr *align;
struct
{
diff --git a/gcc/fortran/openmp.cc b/gcc/fortran/openmp.cc
index 038907baa48..acd1428d2d7 100644
--- a/gcc/fortran/openmp.cc
+++ b/gcc/fortran/openmp.cc
@@ -1196,7 +1196,7 @@ omp_inv_mask::omp_inv_mask (const omp_mask &m) : omp_mask (m)
static bool
gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op,
- bool allow_common, bool allow_derived)
+ bool allow_common, bool allow_derived, bool readonly = false)
{
gfc_omp_namelist **head = NULL;
if (gfc_match_omp_variable_list ("", list, allow_common, NULL, &head, true,
@@ -1205,7 +1205,10 @@ gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op,
{
gfc_omp_namelist *n;
for (n = *head; n; n = n->next)
- n->u.map_op = map_op;
+ {
+ n->u.map_op = map_op;
+ n->u.readonly = readonly;
+ }
return true;
}
@@ -2079,11 +2082,16 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
{
if (openacc)
{
- if (gfc_match ("copyin ( ") == MATCH_YES
- && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
- OMP_MAP_TO, true,
- allow_derived))
- continue;
+ if (gfc_match ("copyin ( ") == MATCH_YES)
+ {
+ bool readonly = false;
+ if (gfc_match ("readonly : ") == MATCH_YES)
+ readonly = true;
+ if (gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
+ OMP_MAP_TO, true,
+ allow_derived, readonly))
+ continue;
+ }
}
else if (gfc_match_omp_variable_list ("copyin (",
&c->lists[OMP_LIST_COPYIN],
@@ -4008,20 +4016,35 @@ gfc_match_oacc_wait (void)
match
gfc_match_oacc_cache (void)
{
+ bool readonly = false;
gfc_omp_clauses *c = gfc_get_omp_clauses ();
/* The OpenACC cache directive explicitly only allows "array elements or
subarrays", which we're currently not checking here. Either check this
after the call of gfc_match_omp_variable_list, or add something like a
only_sections variant next to its allow_sections parameter. */
- match m = gfc_match_omp_variable_list (" (",
- &c->lists[OMP_LIST_CACHE], true,
- NULL, NULL, true);
+ match m = gfc_match (" ( ");
if (m != MATCH_YES)
{
gfc_free_omp_clauses(c);
return m;
}
+ if (gfc_match ("readonly :") == MATCH_YES)
+ readonly = true;
+
+ gfc_omp_namelist **head = NULL;
+ m = gfc_match_omp_variable_list ("", &c->lists[OMP_LIST_CACHE], true,
+ NULL, &head, true);
+ if (m != MATCH_YES)
+ {
+ gfc_free_omp_clauses(c);
+ return m;
+ }
+
+ if (readonly)
+ for (gfc_omp_namelist *n = *head; n; n = n->next)
+ n->u.readonly = true;
+
if (gfc_current_state() != COMP_DO
&& gfc_current_state() != COMP_DO_CONCURRENT)
{
diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc
index 0f8323901d7..87d0b5e0cdf 100644
--- a/gcc/fortran/trans-openmp.cc
+++ b/gcc/fortran/trans-openmp.cc
@@ -3067,6 +3067,9 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
|| (n->expr && gfc_expr_attr (n->expr).pointer)))
always_modifier = true;
+ if (n->u.readonly)
+ OMP_CLAUSE_MAP_READONLY (node) = 1;
+
switch (n->u.map_op)
{
case OMP_MAP_ALLOC:
@@ -3920,6 +3923,8 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
}
if (n->u.present_modifier)
OMP_CLAUSE_MOTION_PRESENT (node) = 1;
+ if (list == OMP_LIST_CACHE && n->u.readonly)
+ OMP_CLAUSE__CACHE__READONLY (node) = 1;
omp_clauses = gfc_trans_add_clause (node, omp_clauses);
}
break;
diff --git a/gcc/testsuite/c-c++-common/goacc/readonly-1.c b/gcc/testsuite/c-c++-common/goacc/readonly-1.c
new file mode 100644
index 00000000000..171f96c08db
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/readonly-1.c
@@ -0,0 +1,27 @@
+/* { dg-additional-options "-fdump-tree-original" } */
+
+struct S
+{
+ int *ptr;
+ float f;
+};
+
+
+int main (void)
+{
+ int x[32];
+ struct S s = {x, 0};
+
+ #pragma acc parallel copyin(readonly: x[:32], s.ptr[:16])
+ {
+ #pragma acc cache (readonly: x[:32])
+ }
+ return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:\\*s.ptr \\\[len: 64\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: 128\\\]\\)" 1 "original" { target { c } } } } */
+/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: 64\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: 128\\\]\\)" 1 "original" { target { c++ } } } } */
+/* { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(readonly:x\\\[0\\\] \\\[len: 128\\\]\\);$" 1 "original" } } */
+
+
+
diff --git a/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90 b/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90
new file mode 100644
index 00000000000..069fec0a0d5
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90
@@ -0,0 +1,28 @@
+! { dg-additional-options "-fdump-tree-original" }
+
+subroutine foo (a, n)
+ integer :: n, a(:)
+ integer :: i, b(n)
+ !$acc parallel copyin(readonly: a(:), b(:n))
+ do i = 1,32
+ !$acc cache (readonly: a(:), b(:n))
+ enddo
+ !$acc end parallel
+end subroutine foo
+
+program main
+ integer :: i, n = 32, a(32)
+ integer :: b(32)
+ !$acc parallel copyin(readonly: a(:32), b(:n))
+ do i = 1,32
+ !$acc cache (readonly: a(:), b(:n))
+ enddo
+ !$acc end parallel
+end program main
+
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\) .+ map\\(readonly,to:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:a\\\[\\(\\(integer\\(kind=8\\)\\) parm.*data - \\(integer\\(kind=8\\)\\) &a\\) / 4\\\] \\\[len: .+\\\]\\) .+ map\\(readonly,to:b\\\[\\(\\(integer\\(kind=8\\)\\) parm.*data - \\(integer\\(kind=8\\)\\) &b\\) / 4\\\] \\\[len: .+\\\]\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(readonly:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\) \\(readonly:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\);" 2 "original" } }
+
+
+
diff --git a/gcc/tree-pretty-print.cc b/gcc/tree-pretty-print.cc
index a743e3cdfd8..6a9812c2253 100644
--- a/gcc/tree-pretty-print.cc
+++ b/gcc/tree-pretty-print.cc
@@ -905,6 +905,8 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, dump_flags_t flags)
case OMP_CLAUSE_MAP:
pp_string (pp, "map(");
+ if (OMP_CLAUSE_MAP_READONLY (clause))
+ pp_string (pp, "readonly,");
switch (OMP_CLAUSE_MAP_KIND (clause))
{
case GOMP_MAP_ALLOC:
@@ -1075,6 +1077,8 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, dump_flags_t flags)
case OMP_CLAUSE__CACHE_:
pp_string (pp, "(");
+ if (OMP_CLAUSE__CACHE__READONLY (clause))
+ pp_string (pp, "readonly:");
dump_generic_node (pp, OMP_CLAUSE_DECL (clause),
spc, flags, false);
goto print_clause_size;
diff --git a/gcc/tree.h b/gcc/tree.h
index 3eebf5709b7..a79260e48eb 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -1813,6 +1813,14 @@ class auto_suppress_location_wrappers
#define OMP_CLAUSE_MAP_DECL_MAKE_ADDRESSABLE(NODE) \
(OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP)->base.addressable_flag)
+/* Nonzero if OpenACC 'readonly' modifier set, used for 'copyin'. */
+#define OMP_CLAUSE_MAP_READONLY(NODE) \
+ TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP))
+
+/* Same as above, for use in OpenACC cache directives. */
+#define OMP_CLAUSE__CACHE__READONLY(NODE) \
+ TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE__CACHE_))
+
/* True on an OMP_CLAUSE_USE_DEVICE_PTR with an OpenACC 'if_present'
clause. */
#define OMP_CLAUSE_USE_DEVICE_PTR_IF_PRESENT(NODE) \
* Re: [PATCH, OpenACC 2.7] readonly modifier support in front-ends
2023-07-10 18:33 [PATCH, OpenACC 2.7] readonly modifier support in front-ends Chung-Lin Tang
@ 2023-07-11 7:00 ` Tobias Burnus
2023-07-20 13:33 ` Thomas Schwinge
2023-07-25 15:52 ` [PATCH, OpenACC 2.7] Connect readonly modifier to points-to analysis Chung-Lin Tang
2 siblings, 0 replies; 18+ messages in thread
From: Tobias Burnus @ 2023-07-11 7:00 UTC
To: cltang, gcc-patches, Thomas Schwinge; +Cc: Catherine Moore
Hi,
just a remark regarding OpenMP. With
  ...omp ... firstprivate(var) allocator(omp_const_mem_alloc: var)
one can also create constant memory in OpenMP.
Likewise with a custom allocator that uses the memory space
omp_const_mem_space, which is then a run-time thing. I don't think
that's particularly useful on the host, as the !PROT_WRITE property is a
memory-page thing, which requires allocating a multiple of the page size
(and after writing the value, mprotect can make it read-only). But I
think it can be useful on the device (cf. OpenACC).

OpenMP and OpenACC likely differ in terms of whether an entry is in the
mapping table (firstprivate vs copy) and in the ref count. In any case,
it would be good to have the code written such that both OpenACC's and
OpenMP's use cases can share as much code as possible, even if only
OpenACC is initially supported.

Tobias

PS: I should eventually have a closer look at your patch!
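As an aside on the host-side remark above (whole pages are written first and then protected), a minimal POSIX sketch might look as follows. The helper name make_readonly_copy is hypothetical; this is not how GCC or libgomp implement anything, only an illustration of the page-granularity constraint.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: copy SIZE bytes from SRC into freshly mapped
   pages, then drop write permission.  The mapping length is rounded up
   to a whole number of pages and stored in *MAPPED_LEN for a later
   munmap.  Returns NULL on failure.  */
void *
make_readonly_copy (const void *src, size_t size, size_t *mapped_len)
{
  assert (src != NULL && size > 0 && mapped_len != NULL);
  long page = sysconf (_SC_PAGESIZE);
  if (page <= 0)
    return NULL;
  size_t pg = (size_t) page;
  size_t len = (size + pg - 1) / pg * pg;
  void *p = mmap (NULL, len, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED)
    return NULL;
  /* Write the value while the pages are still writable...  */
  memcpy (p, src, size);
  /* ... then make the whole mapping read-only.  */
  if (mprotect (p, len, PROT_READ) != 0)
    {
      munmap (p, len);
      return NULL;
    }
  *mapped_len = len;
  return p;
}
```

Even a four-byte object costs a full page here, which is the reason given above for doubting the usefulness on the host.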
On 10.07.23 20:33, Chung-Lin Tang wrote:
> this patch contains support for the 'readonly' modifier in copyin clauses
> and the cache directive.
>
> As we discussed earlier, the work for actually linking this to middle-end
> points-to analysis is a somewhat non-trivial issue. This first patch allows
> the language feature to be used in OpenACC directives first (with no effect for now).
> The middle-end changes are probably going to be a later patch.
* Re: [PATCH, OpenACC 2.7] readonly modifier support in front-ends
2023-07-10 18:33 [PATCH, OpenACC 2.7] readonly modifier support in front-ends Chung-Lin Tang
2023-07-11 7:00 ` Tobias Burnus
@ 2023-07-20 13:33 ` Thomas Schwinge
2023-07-20 15:08 ` Tobias Burnus
2023-07-25 15:52 ` [PATCH, OpenACC 2.7] Connect readonly modifier to points-to analysis Chung-Lin Tang
2 siblings, 1 reply; 18+ messages in thread
From: Thomas Schwinge @ 2023-07-20 13:33 UTC
To: Chung-Lin Tang, Tobias Burnus; +Cc: gcc-patches, Catherine Moore
Hi Chung-Lin, Tobias!
On 2023-07-11T02:33:58+0800, Chung-Lin Tang <chunglin.tang@siemens.com> wrote:
> this patch contains support for the 'readonly' modifier in copyin clauses
> and the cache directive.
Thanks!
> As we discussed earlier, the work for actually linking this to middle-end
> points-to analysis is a somewhat non-trivial issue. This first patch allows
> the language feature to be used in OpenACC directives first (with no effect for now).
> The middle-end changes are probably going to be a later patch.
ACK.
> (Also CCing Tobias because of the Fortran bits)
A few specific GCC/Fortran questions for Tobias below, and some more
review comments for Chung-Lin:
> --- a/gcc/c/c-parser.cc
> +++ b/gcc/c/c-parser.cc
> @@ -14059,7 +14059,8 @@ c_parser_omp_variable_list (c_parser *parser,
>
> static tree
> c_parser_omp_var_list_parens (c_parser *parser, enum omp_clause_code kind,
> - tree list, bool allow_deref = false)
> + tree list, bool allow_deref = false,
> + bool *readonly = NULL)
> {
> /* The clauses location. */
> location_t loc = c_parser_peek_token (parser)->location;
> @@ -14067,6 +14068,20 @@ c_parser_omp_var_list_parens (c_parser *parser, enum omp_clause_code kind,
> matching_parens parens;
> if (parens.require_open (parser))
> {
> + if (readonly != NULL)
> + {
> + c_token *token = c_parser_peek_token (parser);
> + if (token->type == CPP_NAME
> + && !strcmp (IDENTIFIER_POINTER (token->value), "readonly")
> + && c_parser_peek_2nd_token (parser)->type == CPP_COLON)
> + {
> + c_parser_consume_token (parser);
> + c_parser_consume_token (parser);
> + *readonly = true;
> + }
> + else
> + *readonly = false;
> + }
> list = c_parser_omp_variable_list (parser, loc, kind, list, allow_deref);
> parens.skip_until_found_close (parser);
> }
Instead of doing this in 'c_parser_omp_var_list_parens', I think it's
clearer to have this special 'readonly :' parsing logic in the two places
where it's used. For example (random), like 'ancestor :' is parsed in
'c_parser_omp_clause_device', or 'conditional :' is parsed in
'c_parser_omp_clause_lastprivate'. (Yes, this does duplicate a bit of
code, but that's easy enough to follow along.)
The existing 'enum omp_clause_code kind', 'bool allow_deref' actually
affect the parsing process; the new 'bool readonly' only propagates a
flag.
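The suggested shape can be sketched with a toy token stream. Everything below (toy_parser, toy_peek, toy_variable_list, toy_parse_copyin) is hypothetical and only stands in for the real c_parser interfaces; the point is that the generic list helper stays unaware of the modifier, while the clause-specific caller peeks two tokens ahead and consumes 'readonly :' itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy token stream; a stand-in for the real lexer, purely illustrative.  */
struct toy_parser
{
  const char *const *tokens;
  size_t count;
  size_t pos;
};

/* Return the token AHEAD positions past the current one, or "" at EOF.  */
static const char *
toy_peek (const struct toy_parser *p, size_t ahead)
{
  return p->pos + ahead < p->count ? p->tokens[p->pos + ahead] : "";
}

/* Generic helper: counts list items up to ")"; knows nothing about
   modifiers.  */
static int
toy_variable_list (struct toy_parser *p)
{
  int n = 0;
  while (p->pos < p->count && strcmp (p->tokens[p->pos], ")") != 0)
    {
      if (strcmp (p->tokens[p->pos], ",") != 0)
        n++;
      p->pos++;
    }
  return n;
}

/* Clause-specific caller: "(" [ "readonly" ":" ] list ")".  Returns the
   number of list items, or -1 on a syntax error.  */
int
toy_parse_copyin (struct toy_parser *p, bool *readonly)
{
  assert (p != NULL && readonly != NULL);
  *readonly = false;
  if (strcmp (toy_peek (p, 0), "(") != 0)
    return -1;
  p->pos++;
  /* Only treat 'readonly' as a modifier when a colon follows.  */
  if (strcmp (toy_peek (p, 0), "readonly") == 0
      && strcmp (toy_peek (p, 1), ":") == 0)
    {
      p->pos += 2;
      *readonly = true;
    }
  int n = toy_variable_list (p);
  if (strcmp (toy_peek (p, 0), ")") != 0)
    return -1;
  p->pos++;
  return n;
}
```

The two-token lookahead keeps a variable that happens to be named readonly usable as an ordinary list item.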
> @@ -14084,7 +14099,11 @@ c_parser_omp_var_list_parens (c_parser *parser, enum omp_clause_code kind,
> OpenACC 2.6:
> no_create ( variable-list )
> attach ( variable-list )
> - detach ( variable-list ) */
> + detach ( variable-list )
> +
> + OpenACC 2.7:
> + copyin (readonly : variable-list )
> + */
>
> static tree
> c_parser_oacc_data_clause (c_parser *parser, pragma_omp_clause c_kind,
> @@ -14135,11 +14154,22 @@ c_parser_oacc_data_clause (c_parser *parser, pragma_omp_clause c_kind,
> default:
> gcc_unreachable ();
> }
> +
> + /* Turn on readonly modifier parsing for copyin clause. */
> + bool readonly = false, *readonly_ptr = NULL;
> + if (c_kind == PRAGMA_OACC_CLAUSE_COPYIN)
> + readonly_ptr = &readonly;
> +
> tree nl, c;
> - nl = c_parser_omp_var_list_parens (parser, OMP_CLAUSE_MAP, list, true);
> + nl = c_parser_omp_var_list_parens (parser, OMP_CLAUSE_MAP, list, true,
> + readonly_ptr);
That is, similar to 'c_parser_omp_clause_device', or
'c_parser_omp_clause_lastprivate', inline 'c_parser_omp_var_list_parens'
here, and only for 'PRAGMA_OACC_CLAUSE_COPYIN' parse 'readonly :', then
(for all) use 'c_parser_omp_variable_list' etc. instead of
'c_parser_omp_var_list_parens', then set 'readonly':
> for (c = nl; c != list; c = OMP_CLAUSE_CHAIN (c))
> - OMP_CLAUSE_SET_MAP_KIND (c, kind);
> + {
> + OMP_CLAUSE_SET_MAP_KIND (c, kind);
> + if (readonly)
> + OMP_CLAUSE_MAP_READONLY (c) = 1;
> + }
>
> return nl;
> @@ -18212,6 +18242,9 @@ c_parser_omp_structured_block (c_parser *parser, bool *if_p)
> /* OpenACC 2.0:
> # pragma acc cache (variable-list) new-line
>
> + OpenACC 2.7:
> + # pragma acc cache (readonly: variable-list) new-line
> +
> LOC is the location of the #pragma token.
> */
>
> @@ -18219,8 +18252,14 @@ static tree
> c_parser_oacc_cache (location_t loc, c_parser *parser)
> {
> tree stmt, clauses;
> + bool readonly;
> +
> + clauses = c_parser_omp_var_list_parens (parser, OMP_CLAUSE__CACHE_, NULL,
> + false, &readonly);
> + if (readonly)
> + for (tree c = clauses; c; c = OMP_CLAUSE_CHAIN (c))
> + OMP_CLAUSE__CACHE__READONLY (c) = 1;
>
> - clauses = c_parser_omp_var_list_parens (parser, OMP_CLAUSE__CACHE_, NULL);
> clauses = c_finish_omp_clauses (clauses, C_ORT_ACC);
>
> c_parser_skip_to_pragma_eol (parser);
Similarly.
> --- a/gcc/cp/parser.cc
> +++ b/gcc/cp/parser.cc
Similarly.
> --- a/gcc/fortran/gfortran.h
> +++ b/gcc/fortran/gfortran.h
> @@ -1360,7 +1360,11 @@ typedef struct gfc_omp_namelist
> {
> gfc_omp_reduction_op reduction_op;
> gfc_omp_depend_doacross_op depend_doacross_op;
> - gfc_omp_map_op map_op;
> + struct
> + {
> + ENUM_BITFIELD (gfc_omp_map_op) map_op:8;
> + bool readonly;
> + };
> gfc_expr *align;
> struct
> {
I did wonder whether the 'readonly' flag should live in the
'gfc_omp_namelist' (as done here -- similar to 'lastprivate_conditional',
for example), or in 'gfc_omp_clauses' (similar to 'ancestor', for
example). Then I realized/remembered that 'gfc_omp_clauses' exists only
once per directive (which is sufficient for 'ancestor', for example, as
there may be only one OpenMP 'device' clause), whereas 'gfc_omp_namelist'
exists once per list item -- which is what we need for 'readonly'. Thus,
the above looks good to me.
> --- a/gcc/fortran/openmp.cc
> +++ b/gcc/fortran/openmp.cc
> @@ -1196,7 +1196,7 @@ omp_inv_mask::omp_inv_mask (const omp_mask &m) : omp_mask (m)
>
> static bool
> gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op,
> - bool allow_common, bool allow_derived)
> + bool allow_common, bool allow_derived, bool readonly = false)
> {
> gfc_omp_namelist **head = NULL;
> if (gfc_match_omp_variable_list ("", list, allow_common, NULL, &head, true,
> @@ -1205,7 +1205,10 @@ gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op,
> {
> gfc_omp_namelist *n;
> for (n = *head; n; n = n->next)
> - n->u.map_op = map_op;
> + {
> + n->u.map_op = map_op;
> + n->u.readonly = readonly;
> + }
> return true;
> }
Similar to 'c_parser_omp_var_list_parens' above, the existing
'bool allow_common', 'bool allow_derived' actually affect the parsing
process; the new 'bool readonly' only propagates a flag. Which I
acknowledge the existing 'gfc_omp_map_op map_op' also only does, but that
one's applicable to a lot more instances than 'readonly'. So I again
wonder if we should keep the latter out of 'gfc_match_omp_map_clause',
and instead set the flag when parsing the 'copyin' clauses; again, for
example (random), like 'ancestor :', or 'conditional :' are parsed --
which you're mostly already doing:
> @@ -2079,11 +2082,16 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
> {
> if (openacc)
> {
> - if (gfc_match ("copyin ( ") == MATCH_YES
> - && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
> - OMP_MAP_TO, true,
> - allow_derived))
> - continue;
> + if (gfc_match ("copyin ( ") == MATCH_YES)
> + {
> + bool readonly = false;
> + if (gfc_match ("readonly : ") == MATCH_YES)
> + readonly = true;
> + if (gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
> + OMP_MAP_TO, true,
> + allow_derived, readonly))
> + continue;
> + }
> }
..., so you'd just set 'readonly' here, instead of having
'gfc_match_omp_map_clause' do that. Care has to be taken to only do that
for the current list items, which you'll need 'gfc_omp_namelist *head'
for, or similar. Hmm. Effectively inline 'gfc_match_omp_map_clause'
here, or do add the 'bool readonly' argument to the latter, or something
else?
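The "only the current list items" concern can be illustrated with a simplified linked list. The names toy_item, toy_append and toy_mark_readonly are hypothetical and do not correspond to gfortran's data structures; the sketch only assumes that all clauses of one directive end up on one shared list and that the matcher can hand back a pointer to where the new items start.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for a per-list-item record; illustrative only.  */
struct toy_item
{
  const char *name;
  bool readonly;
  struct toy_item *next;
};

/* Append the COUNT items in ITEMS to *LIST.  Return a pointer to the
   link that now refers to the first appended item (the 'head' of the
   current clause).  */
struct toy_item **
toy_append (struct toy_item **list, struct toy_item *items, size_t count)
{
  assert (list != NULL && items != NULL && count > 0);
  struct toy_item **tail = list;
  while (*tail)
    tail = &(*tail)->next;
  struct toy_item **head = tail;
  for (size_t i = 0; i < count; i++)
    {
      items[i].next = NULL;
      *tail = &items[i];
      tail = &items[i].next;
    }
  return head;
}

/* Flag only the items of the current clause, starting at *HEAD; items
   from earlier clauses on the same list are left untouched.  */
void
toy_mark_readonly (struct toy_item **head)
{
  assert (head != NULL);
  for (struct toy_item *n = *head; n; n = n->next)
    n->readonly = true;
}
```

Walking from the start of the shared list instead of from the head would wrongly flag items of an earlier 'copyin' clause that had no modifier.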
Or, we could add a new 'gcc/fortran/gfortran.h:gfc_omp_map_op' item
'OMP_MAP_TO_READONLY', which eventually translates into 'OMP_MAP_TO' with
'readonly' set? Then we'd just here call the (unaltered)
'gfc_match_omp_map_clause', with
'readonly ? OMP_MAP_TO_READONLY : OMP_MAP_TO'? Per
'git grep --cached '[^G]OMP_MAP_TO[^F]' -- gcc/fortran/' not a lot of
places need adjusting for that (most of the 'gcc/fortran/openmp.cc' ones
are not applicable).
Tobias?
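A rough sketch of that alternative, using a toy enum rather than the real gfc_omp_map_op (all toy_* names are hypothetical): the front end picks a pseudo-operator while parsing, and translation folds it back into plain 'to' plus a flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy enum; not gfortran's gfc_omp_map_op.  */
enum toy_map_op
{
  TOY_MAP_ALLOC,
  TOY_MAP_TO,
  TOY_MAP_TO_READONLY
};

/* What translation would produce: a map kind plus a separate flag.  */
struct toy_lowered
{
  enum toy_map_op kind;
  bool readonly;
};

/* Parsing side: choose the operator for a 'copyin' clause.  */
enum toy_map_op
toy_copyin_op (bool readonly)
{
  return readonly ? TOY_MAP_TO_READONLY : TOY_MAP_TO;
}

/* Translation side: fold the pseudo-operator back into 'to' + flag.  */
struct toy_lowered
toy_lower (enum toy_map_op op)
{
  assert (op == TOY_MAP_ALLOC || op == TOY_MAP_TO
          || op == TOY_MAP_TO_READONLY);
  struct toy_lowered r = { op, false };
  if (op == TOY_MAP_TO_READONLY)
    {
      r.kind = TOY_MAP_TO;
      r.readonly = true;
    }
  return r;
}
```

The trade-off is that every place that switches over the operator has to learn about the extra enumerator, in exchange for leaving the matcher's signature alone.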
> @@ -4008,20 +4016,35 @@ gfc_match_oacc_wait (void)
> match
> gfc_match_oacc_cache (void)
> {
> + bool readonly = false;
> gfc_omp_clauses *c = gfc_get_omp_clauses ();
> /* The OpenACC cache directive explicitly only allows "array elements or
> subarrays", which we're currently not checking here. Either check this
> after the call of gfc_match_omp_variable_list, or add something like a
> only_sections variant next to its allow_sections parameter. */
> - match m = gfc_match_omp_variable_list (" (",
> - &c->lists[OMP_LIST_CACHE], true,
> - NULL, NULL, true);
> + match m = gfc_match (" ( ");
> if (m != MATCH_YES)
> {
> gfc_free_omp_clauses(c);
> return m;
> }
>
> + if (gfc_match ("readonly :") == MATCH_YES)
I note this one does not have a space after ':' in 'gfc_match', but the
one above in 'gfc_match_omp_clauses' does. I don't know off-hand if that
makes a difference in parsing -- probably not, as all of
'gcc/fortran/openmp.cc' generally doesn't seem to be very consistent
about these two variants?
> + readonly = true;
> +
> + gfc_omp_namelist **head = NULL;
> + m = gfc_match_omp_variable_list ("", &c->lists[OMP_LIST_CACHE], true,
> + NULL, &head, true);
> + if (m != MATCH_YES)
> + {
> + gfc_free_omp_clauses(c);
> + return m;
> + }
> +
> + if (readonly)
> + for (gfc_omp_namelist *n = *head; n; n = n->next)
> + n->u.readonly = true;
This already looks the way I thought it should.
> --- a/gcc/fortran/trans-openmp.cc
> +++ b/gcc/fortran/trans-openmp.cc
> @@ -3067,6 +3067,9 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
> || (n->expr && gfc_expr_attr (n->expr).pointer)))
> always_modifier = true;
>
> + if (n->u.readonly)
> + OMP_CLAUSE_MAP_READONLY (node) = 1;
> +
> switch (n->u.map_op)
> {
> case OMP_MAP_ALLOC:
> @@ -3920,6 +3923,8 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
> }
> if (n->u.present_modifier)
> OMP_CLAUSE_MOTION_PRESENT (node) = 1;
> + if (list == OMP_LIST_CACHE && n->u.readonly)
> + OMP_CLAUSE__CACHE__READONLY (node) = 1;
> omp_clauses = gfc_trans_add_clause (node, omp_clauses);
> }
> break;
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/goacc/readonly-1.c
> @@ -0,0 +1,27 @@
> +/* { dg-additional-options "-fdump-tree-original" } */
> +
> +struct S
> +{
> + int *ptr;
> + float f;
> +};
> +
> +
> +int main (void)
> +{
> + int x[32];
> + struct S s = {x, 0};
> +
> + #pragma acc parallel copyin(readonly: x[:32], s.ptr[:16])
> + {
> + #pragma acc cache (readonly: x[:32])
> + }
> + return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:\\*s.ptr \\\[len: 64\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: 128\\\]\\)" 1 "original" { target { c } } } } */
> +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: 64\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: 128\\\]\\)" 1 "original" { target { c++ } } } } */
> +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(readonly:x\\\[0\\\] \\\[len: 128\\\]\\);$" 1 "original" } } */
Are 'len: 64' etc. also correct for targets where 'sizeof (int) != 4'?
Maybe just mask these out; they're not the important thing we're testing
here?
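To spell out the arithmetic behind the question: the lengths in the dump are byte counts, so they follow from the element count times sizeof (int). The helper name toy_subarray_len is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Byte length of a subarray of COUNT ints, as a 'len:' field would
   report it.  */
size_t
toy_subarray_len (size_t count)
{
  assert (count <= (size_t) -1 / sizeof (int));
  return count * sizeof (int);
}
```

Only with a 4-byte int do x[:32] and s.ptr[:16] give 128 and 64; the Fortran test in this patch sidesteps the issue by matching 'len: .+' instead.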
> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90
> @@ -0,0 +1,28 @@
> +! { dg-additional-options "-fdump-tree-original" }
> +
> +subroutine foo (a, n)
> + integer :: n, a(:)
> + integer :: i, b(n)
> + !$acc parallel copyin(readonly: a(:), b(:n))
> + do i = 1,32
> + !$acc cache (readonly: a(:), b(:n))
> + enddo
> + !$acc end parallel
> +end subroutine foo
> +
> +program main
> + integer :: i, n = 32, a(32)
> + integer :: b(32)
> + !$acc parallel copyin(readonly: a(:32), b(:n))
> + do i = 1,32
> + !$acc cache (readonly: a(:), b(:n))
> + enddo
> + !$acc end parallel
> +end program main
> +
> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\) .+ map\\(readonly,to:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\)" 1 "original" } }
> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:a\\\[\\(\\(integer\\(kind=8\\)\\) parm.*data - \\(integer\\(kind=8\\)\\) &a\\) / 4\\\] \\\[len: .+\\\]\\) .+ map\\(readonly,to:b\\\[\\(\\(integer\\(kind=8\\)\\) parm.*data - \\(integer\\(kind=8\\)\\) &b\\) / 4\\\] \\\[len: .+\\\]\\)" 1 "original" } }
> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(readonly:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\) \\(readonly:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\);" 2 "original" } }
You're scanning only one of the two 'cache' directives? If that's
intentional, please add a comment, why. If not, add the missing
scanning.
Given the peculiarities of the Fortran parsing, where first all
directive's clauses are collected and then translated en bloc, I
suggest extending the 'copyin' test cases to have several 'copyin'
clauses, some with, some without 'readonly' modifier, so we make sure
that 'readonly' is set only for the appropriate ones.
Generally, in addition to just 'parallel' compute construct, please
spread this out a bit, to also cover 'kernels', 'serial' compute
constructs, and the 'data' construct.
Generally, please also add testing for the 'declare' directive with
'copyin' with 'readonly' modifier -- and implement handling in case
that's not implicitly covered? (..., but please don't let yourself be dragged
into a number of pre-existing issues with OpenACC 'declare' -- I hope the
'readonly' handling is straightforward to test for.)
Given that per the implementation in the front ends, the handling of
'readonly' obviously -- famous last words? ;-) -- is specific to
'copyin', it's probably OK to not have test cases to verify that the
'readonly' modifier is rejected for other data clauses?
> --- a/gcc/tree-pretty-print.cc
> +++ b/gcc/tree-pretty-print.cc
> @@ -905,6 +905,8 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, dump_flags_t flags)
>
> case OMP_CLAUSE_MAP:
> pp_string (pp, "map(");
> + if (OMP_CLAUSE_MAP_READONLY (clause))
> + pp_string (pp, "readonly,");
> switch (OMP_CLAUSE_MAP_KIND (clause))
> {
> case GOMP_MAP_ALLOC:
> @@ -1075,6 +1077,8 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, dump_flags_t flags)
>
> case OMP_CLAUSE__CACHE_:
> pp_string (pp, "(");
> + if (OMP_CLAUSE__CACHE__READONLY (clause))
> + pp_string (pp, "readonly:");
> dump_generic_node (pp, OMP_CLAUSE_DECL (clause),
> spc, flags, false);
> goto print_clause_size;
> --- a/gcc/tree.h
> +++ b/gcc/tree.h
> @@ -1813,6 +1813,14 @@ class auto_suppress_location_wrappers
> #define OMP_CLAUSE_MAP_DECL_MAKE_ADDRESSABLE(NODE) \
> (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP)->base.addressable_flag)
>
> +/* Nonzero if OpenACC 'readonly' modifier set, used for 'copyin'. */
> +#define OMP_CLAUSE_MAP_READONLY(NODE) \
> + TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP))
> +
> +/* Same as above, for use in OpenACC cache directives. */
> +#define OMP_CLAUSE__CACHE__READONLY(NODE) \
> + TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE__CACHE_))
I'm not sure if these special accessor functions are actually useful, or
we should just directly use 'TREE_READONLY' instead? We're only using
them in contexts where it's clear that the 'OMP_CLAUSE_SUBCODE_CHECK' is
satisfied, for example.
Also, for the new use for OMP clauses, update 'gcc/tree.h:TREE_READONLY',
and in 'gcc/tree-core.h' for 'readonly_flag' the
"table lists the uses of each of the above flags".
Setting 'TREE_READONLY' of the 'OMP_CLAUSE_DECL' instead of the clause
itself isn't the right thing to do -- or is it, and might already
indicate to the middle end the desired semantics? But does it maybe
conflict with front end/language-level use of 'TREE_READONLY' for 'const'
etc. (I suppose), and thus diagnostics for mismatches? I mean:
int a;
#pragma acc parallel copyin(readonly: a)
{
int *b = &a;
... should still continue to work (valid as long as '*b' isn't written
to), so should not raise any
"warning: initialization discards ‘const’ qualifier from pointer target type"
diagnostics. But if that's not a problem (I don't know how
'TREE_READONLY' is used elsewhere), maybe that's something to give a
thought to?
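For reference, a self-contained version of the fragment above might look as follows. This is only a sketch: the function name and the 'copyout' variable are made up for illustration. Without '-fopenacc' the pragma is simply ignored and the code runs on the host.

```c
/* Hypothetical complete form of the quoted fragment.  'a' is not
   const-qualified in the source, so taking its address into a plain
   'int *' inside the region should stay diagnostic-free even though the
   clause carries the 'readonly' modifier.  */
int
read_through_pointer (void)
{
  int a = 42;
  int result = 0;
#pragma acc parallel copyin(readonly: a) copyout(result)
  {
    int *b = &a;   /* Valid as long as '*b' is never written to.  */
    result = *b;   /* Read-only use of the mapped data.  */
  }
  return result;
}
```

If 'TREE_READONLY' were set on the declaration itself, the concern is that 'int *b = &a;' could start to look like a 'const' violation to the front end.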
Or, early in the middle end, propagate 'TREE_READONLY' from the clause to
its 'OMP_CLAUSE_DECL'? Might need to 'unshare_expr' the latter for
modification and use in the associated region only?
Just some quick thoughts, obviously without any detailed analysis. ;-)
Grüße
Thomas
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
* Re: [PATCH, OpenACC 2.7] readonly modifier support in front-ends
2023-07-20 13:33 ` Thomas Schwinge
@ 2023-07-20 15:08 ` Tobias Burnus
2023-08-07 13:58 ` [PATCH, OpenACC 2.7, v2] " Chung-Lin Tang
0 siblings, 1 reply; 18+ messages in thread
From: Tobias Burnus @ 2023-07-20 15:08 UTC (permalink / raw)
To: Thomas Schwinge, Chung-Lin Tang; +Cc: gcc-patches, Catherine Moore
Hi Thomas & Chung-Lin,
On 20.07.23 15:33, Thomas Schwinge wrote:
> On 2023-07-11T02:33:58+0800, Chung-Lin Tang
> <chunglin.tang@siemens.com> wrote:
>> +++ b/gcc/c/c-parser.cc
>> @@ -14059,7 +14059,8 @@ c_parser_omp_variable_list (c_parser *parser,
>>
>> static tree
>> c_parser_omp_var_list_parens (c_parser *parser, enum omp_clause_code kind,
>> - tree list, bool allow_deref = false)
>> + tree list, bool allow_deref = false,
>> + bool *readonly = NULL)
>> ...
> Instead of doing this in 'c_parser_omp_var_list_parens', I think it's
> clearer to have this special 'readonly :' parsing logic in the two places
> where it's used.
I concur. The same issue also occurred for OpenMP's
c_parser_omp_clause_to, and c_parser_omp_clause_from and the 'present'
modifier. For it, I created a combined function but the main reason for
that is that OpenMP also permits more modifiers (like 'iterators'),
which would cause more duplication of code ('iterator' is not yet
supported).
For something as simple to parse as this modifier, I would just do it at
the two places – as Thomas suggested.
>> +++ b/gcc/fortran/gfortran.h
>> @@ -1360,7 +1360,11 @@ typedef struct gfc_omp_namelist
>> {
>> gfc_omp_reduction_op reduction_op;
>> gfc_omp_depend_doacross_op depend_doacross_op;
>> - gfc_omp_map_op map_op;
>> + struct
>> + {
>> + ENUM_BITFIELD (gfc_omp_map_op) map_op:8;
>> + bool readonly;
>> + };
>> gfc_expr *align;
>> struct
>> {
> [...] Thus, the above looks good to me.
I concur but I wonder whether it would be cleaner to name the struct;
this makes it also more obvious what belongs together in the union.
Namely, naming the struct 'map' and then changing the 45 users from
'u.map_op' to 'u.map.op' and the new 'u.readonly' to 'u.map.readonly'. –
this seems to be cleaner.
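A toy model of the named-struct layout suggested here, reduced to plain C. All 'toy_' names are hypothetical stand-ins, and a plain 'unsigned int' bit-field replaces 'ENUM_BITFIELD'; the real 'gfc_omp_namelist' has many more members.

```c
#include <stdbool.h>

/* Hypothetical stand-in for gfc_omp_map_op; only two values are shown.  */
enum toy_map_op { TOY_MAP_ALLOC, TOY_MAP_TO };

/* Toy model of the union inside gfc_omp_namelist, with the struct named
   'map' so that users write 'u.map.op' and 'u.map.readonly'.  */
struct toy_namelist
{
  union
  {
    int reduction_op;        /* Placeholder for the other union members.  */
    struct
    {
      unsigned int op : 8;   /* Map operation stored in an 8-bit bit-field.  */
      bool readonly;         /* OpenACC 'readonly' modifier seen.  */
    } map;
    void *align;             /* Placeholder for 'gfc_expr *align'.  */
  } u;
};

/* Mark N as a 'to' mapping that carries the 'readonly' modifier.  */
static void
toy_set_readonly_to (struct toy_namelist *n)
{
  n->u.map.op = TOY_MAP_TO;
  n->u.map.readonly = true;
}
```

Grouping both fields under 'u.map' makes it visible at each use which union member is active.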
>> static bool
>> gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op,
>> - bool allow_common, bool allow_derived)
>> + bool allow_common, bool allow_derived, bool readonly = false)
>> {
> Similar to 'c_parser_omp_var_list_parens' above,
I concur that not doing it here is cleaner.
> again, for
> example (random), like 'ancestor :', or 'conditional :' are parsed --
> which you're mostly already doing
I think OpenMP's "present" (as a modifier to "omp target update"'s
"to"/"from") is a better example than "ancestor", as for "present" we also
have a list. See gfc_match_motion_var_list for how to handle the headp.
(There, an extra function was used because other modifiers such as
'iterator' will also be used in the future.)
However, as Thomas noted, the patch contains also an example (see
further down in Thomas' email, not quoted here).
> Or, we could add a new 'gcc/fortran/gfortran.h:gfc_omp_map_op' item
> 'OMP_MAP_TO_READONLY', which eventually translates into 'OMP_MAP_TO' with
> 'readonly' set?
I think having the additional flag is easier to understand - and at least
memory-wise we do not save anything, as it is in a union. The advantage
of not having a union is that accessing the int-enum is faster than accessing
a char-wide bitset enum.
In terms of code changes (and without having a closer look), the two
approaches seem to be similar.
Hence, using OMP_MAP_TO_READONLY for OpenACC would be fine, too. And
I do not have a strong preference for either.
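The 'OMP_MAP_TO_READONLY' alternative could be pictured roughly like this. It is a sketch only: every 'toy_' name is hypothetical, and the real translation in 'gcc/fortran/trans-openmp.cc' handles many more map kinds.

```c
#include <stdbool.h>

/* Hypothetical front-end map operations, including the proposed extra item.  */
enum toy_fe_map_op { TOY_FE_MAP_ALLOC, TOY_FE_MAP_TO, TOY_FE_MAP_TO_READONLY };

/* Hypothetical middle-end view: a map kind plus a separate readonly flag.  */
enum toy_me_map_kind { TOY_ME_MAP_ALLOC, TOY_ME_MAP_TO };

struct toy_clause
{
  enum toy_me_map_kind kind;
  bool readonly;
};

/* Translate a front-end map operation into a clause: the extra enum item
   becomes an ordinary 'to' mapping with the readonly flag set.  */
static struct toy_clause
toy_translate (enum toy_fe_map_op op)
{
  struct toy_clause c = { TOY_ME_MAP_ALLOC, false };
  switch (op)
    {
    case TOY_FE_MAP_ALLOC:
      break;
    case TOY_FE_MAP_TO:
      c.kind = TOY_ME_MAP_TO;
      break;
    case TOY_FE_MAP_TO_READONLY:
      c.kind = TOY_ME_MAP_TO;
      c.readonly = true;
      break;
    }
  return c;
}
```

Either way the tree clause ends up identical; the two approaches differ only in how the front end remembers the modifier.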
* * *
I did wonder about the following, but I now believe it won't affect
the choice. Namely, we want to handle at some point the following:
!$omp target firstprivate(var) allocator(omp_const_mem_alloc: var)
This could be turned into GOMP_MAP_FIRSTPRIVATE... + OMP_.*READONLY flag.
But if we don't do it in the FE, the internal Fortran representation
does not matter.
Advantage for doing it in the ME: Only one code location, especially as
we might use the opportunity to also check that the omp_const_mem_alloc
is only used with privatization (in OpenMP).
Difference: OpenMP uses 'firstprivate' (i.e. private copy, no reference count bump,
only permitted for 'target') while OpenACC uses 'copy', which implies reference
counting and is permitted in 'acc (enter/exit) data' and not only for compute constructs.
OpenMP in principle also permits a user-defined allocator with a constant
memory space - I am not completely sure whether/when it can be used with
omp target firstprivate(...) allocator(my_alloc : ...)
> Then we'd just here call the (unaltered)
> 'gfc_match_omp_map_clause', with
> 'readonly ? OMP_MAP_TO_READONLY : OMP_MAP_TO'? Per
> 'git grep --cached '[^G]OMP_MAP_TO[^F]' -- gcc/fortran/' not a lot of
> places need adjusting for that (most of the 'gcc/fortran/openmp.cc' ones
> are not applicable).
I think either would work. – I have no strong feeling what's better.
But you still need to handle it for clause resolution.
> + if (gfc_match ("readonly :") == MATCH_YES)
> I note this one does not have a space after ':' in 'gfc_match', but the
> one above in 'gfc_match_omp_clauses' does. I don't know off-hand if that
> makes a difference in parsing -- probably not, as all of
> 'gcc/fortran/openmp.cc' generally doesn't seem to be very consistent
> about these two variants?
It *does* make a difference. And for obvious reasons. You don't want to permit:
!$acc kernels asynccopy(a)
but require at least one space (or comma) between "async" and "copy".
(In fixed form Fortran, it would be fine - as would be "!$acc k e rnelsasy nc co p y(a)".)
A " " matches zero or more whitespaces, but with gfc_match_space you can find out
whether there was whitespace or not.
Whether the trailing " " in the gfc_match matters or not depends on what comes next.
If there is a "gfc_gobble_whitespace ();", everything is fine. If not, the next pattern to match
has to start with a " ", which is usually ugly; an exception is " , " or " )", which is still
somewhat fine.
I think that it is mostly implemented correctly, but I wouldn't be surprised if a
space is missing in some matches - be it a trailing whitespace or e.g. in "foo:" before
the colon.
BTW: One reason for stripping trailing spaces before matching a non-whitespace: the
associated location is the one before the parsing; thus, for a match error or when saving
the old_locus, pointing to the first non-whitespace looks nicer than pointing to the
(first of the) whitespace character(s).
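The whitespace rule described above can be modelled with a small standalone matcher. This is a simplified sketch and NOT gfortran's gfc_match: 'toy_match' is a hypothetical name, and only the "a space in the pattern matches zero or more whitespace characters" behaviour is modelled.

```c
#include <ctype.h>

/* Simplified model (not gfortran's gfc_match): a ' ' in PATTERN matches
   zero or more whitespace characters in INPUT; every other pattern
   character must match exactly.  Returns the number of input characters
   consumed, or -1 on mismatch.  */
static int
toy_match (const char *pattern, const char *input)
{
  const char *in = input;
  for (const char *p = pattern; *p; p++)
    {
      if (*p == ' ')
        {
          /* Optional whitespace: skip as much as is present.  */
          while (isspace ((unsigned char) *in))
            in++;
        }
      else if (*in == *p)
        in++;
      else
        return -1;
    }
  return (int) (in - input);
}
```

With the input "readonly : a", the pattern "readonly :" stops right after the colon and leaves the following space for the next matcher, whereas "readonly : " also consumes that space, so the next match starts at the first non-whitespace character.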
>> + if (readonly)
>> + for (gfc_omp_namelist *n = *head; n; n = n->next)
>> + n->u.readonly = true;
> This already looks like how I thought it should look like.
Indeed.
>> --- a/gcc/tree.h
>> +++ b/gcc/tree.h
>> @@ -1813,6 +1813,14 @@ class auto_suppress_location_wrappers
>> #define OMP_CLAUSE_MAP_DECL_MAKE_ADDRESSABLE(NODE) \
>> (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP)->base.addressable_flag)
>>
>> +/* Nonzero if OpenACC 'readonly' modifier set, used for 'copyin'. */
>> +#define OMP_CLAUSE_MAP_READONLY(NODE) \
>> + TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP))
>> +
>> +/* Same as above, for use in OpenACC cache directives. */
>> +#define OMP_CLAUSE__CACHE__READONLY(NODE) \
>> + TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE__CACHE_))
> I'm not sure if these special accessor functions are actually useful, or
> we should just directly use 'TREE_READONLY' instead? We're only using
> them in contexts where it's clear that the 'OMP_CLAUSE_SUBCODE_CHECK' is
> satisfied, for example.
I find directly using TREE_READONLY confusing.
> Setting 'TREE_READONLY' of the 'OMP_CLAUSE_DECL' instead of the clause
> itself isn't the right thing to do -- or is it, and might already
> indicate to the middle end the desired semantics? But does it maybe
> conflict with front end/language-level use of 'TREE_READONLY' for 'const'
> etc. (I suppose), and thus diagnostics for mismatches?
I think it is cleaner not to use one flag to mean two different things.
In particular, wouldn't the following cause issues, if you mark 'a' as TREE_READONLY?
int a;
#pragma acc parallel copyin(readonly : a)
{...}
a = 5;
> Or, early in the middle end, propagate 'TREE_READONLY' from the clause to
> its 'OMP_CLAUSE_DECL'? Might need to 'unshare_expr' the latter for
> modification and use in the associated region only?
Unsharing a tree would surely help – but it is still ugly and, for
declarations, unshare_expr does not create a copy!
Tobias
* [PATCH, OpenACC 2.7] Connect readonly modifier to points-to analysis
2023-07-10 18:33 [PATCH, OpenACC 2.7] readonly modifier support in front-ends Chung-Lin Tang
2023-07-11 7:00 ` Tobias Burnus
2023-07-20 13:33 ` Thomas Schwinge
@ 2023-07-25 15:52 ` Chung-Lin Tang
2023-10-27 14:28 ` Thomas Schwinge
2 siblings, 1 reply; 18+ messages in thread
From: Chung-Lin Tang @ 2023-07-25 15:52 UTC (permalink / raw)
To: cltang, gcc-patches, Thomas Schwinge, Catherine Moore, Tobias Burnus
[-- Attachment #1: Type: text/plain, Size: 3101 bytes --]
On 2023/7/11 2:33 AM, Chung-Lin Tang via Gcc-patches wrote:
> As we discussed earlier, the work for actually linking this to middle-end
> points-to analysis is a somewhat non-trivial issue. This first patch allows
> the language feature to be used in OpenACC directives first (with no effect for now).
> The middle-end changes are probably going to be a later patch.
This second patch tries to link the readonly modifier to points-to analysis.
SSA_NAME_POINTS_TO_READONLY_MEMORY already exists, along with its support in the
alias oracle routines in tree-ssa-alias.cc, so basically what this patch does is
try to make the variables holding the array section base pointers have this
flag set.
There is another flag, OMP_CLAUSE_MAP_POINTS_TO_READONLY, set by front-ends on the
associated pointer clauses if OMP_CLAUSE_MAP_READONLY is set.
Also, a DECL_POINTS_TO_READONLY flag is set for VAR_DECLs when creating the tmp
vars carrying these receiver references on the offloaded side. These
eventually get translated to SSA_NAME_POINTS_TO_READONLY_MEMORY.
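As a rough picture of the flag chain just described, here is a toy model in plain C. None of this is GCC code: the 'toy_' structs and functions are hypothetical and only mirror the order in which the flags are handed on.

```c
#include <stdbool.h>

/* Toy stand-ins for a map clause, a VAR_DECL and an SSA_NAME.  */
struct toy_map_clause { bool readonly; bool points_to_readonly; };
struct toy_var_decl { bool points_to_readonly; };
struct toy_ssa_name { bool points_to_readonly_memory; };

/* Front end: the pointer clause C2 inherits from the data clause C
   (models OMP_CLAUSE_MAP_READONLY -> OMP_CLAUSE_MAP_POINTS_TO_READONLY).  */
static void
toy_front_end (const struct toy_map_clause *c, struct toy_map_clause *c2)
{
  if (c->readonly)
    c2->points_to_readonly = true;
}

/* Lowering: the receiver-side temporary inherits from the pointer clause
   (models OMP_CLAUSE_MAP_POINTS_TO_READONLY -> DECL_POINTS_TO_READONLY).  */
static void
toy_lower (const struct toy_map_clause *c2, struct toy_var_decl *recv)
{
  if (c2->points_to_readonly)
    recv->points_to_readonly = true;
}

/* SSA creation: names made from such a variable carry the flag
   (models DECL_POINTS_TO_READONLY -> SSA_NAME_POINTS_TO_READONLY_MEMORY).  */
static struct toy_ssa_name
toy_make_ssa_name (const struct toy_var_decl *var)
{
  struct toy_ssa_name t = { var->points_to_readonly };
  return t;
}
```

Only the last flag in the chain is what the alias oracle actually consults; the earlier ones exist to carry the information that far.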
This still doesn't always work as expected in terms of optimization:
struct pointer fields and Fortran arrays (kind of like C structs) need
several accesses to create the pointer access on the receiver/offloaded side,
and SRA appears not to work on these sequences, which gets in the way of much
redundancy elimination.
I currently have one testcase where we can demonstrate that 'readonly' can avoid
a clobber by a function call. Tested on powerpc64le-linux/nvptx.
Note this patch is created atop the front-end patch.
(will respond to the other front-end patch comments later)
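The "avoid a clobber by function call" effect mentioned above can be pictured in plain C as the difference between the two functions below. This is an illustration only, with hypothetical names; the real testcase in the patch relies on the OpenACC 'readonly' modifier, not on anything shown here.

```c
/* Counts calls where the callee saw the expected value; only here so the
   callee has an observable effect.  */
static int toy_calls;

/* Hypothetical callee that honours the 'readonly' promise: it never
   writes through PTR.  */
static void
toy_foo (const int *ptr, int val)
{
  toy_calls += (ptr[8] == val);
}

/* As written in the source: a[8] is loaded before and after the call,
   because in general the callee might have modified it.  */
static int
as_written (int *a)
{
  toy_foo (a, a[8]);
  return a[8];
}

/* What redundancy elimination may do once the memory is known to be
   read-only: reuse the first load instead of reloading after the call.  */
static int
after_fre (int *a)
{
  int tmp = a[8];
  toy_foo (a, tmp);
  return tmp;
}
```

Both functions return the same value as long as the callee really does not write to the array, which is exactly the guarantee the 'readonly' modifier is meant to convey.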
Thanks,
Chung-Lin
2023-07-25 Chung-Lin Tang <cltang@codesourcery.com>
gcc/c/ChangeLog:
* c-typeck.cc (handle_omp_array_sections):
Set OMP_CLAUSE_MAP_POINTS_TO_READONLY on pointer clause.
gcc/cp/ChangeLog:
* semantics.cc (handle_omp_array_sections):
Set OMP_CLAUSE_MAP_POINTS_TO_READONLY on pointer clause.
gcc/fortran/ChangeLog:
* trans-openmp.cc (gfc_trans_omp_array_section):
Set OMP_CLAUSE_MAP_POINTS_TO_READONLY on pointer clause.
gcc/ChangeLog:
* gimple-expr.cc (copy_var_decl): Copy DECL_POINTS_TO_READONLY
for VAR_DECLs.
* gimplify.cc (struct gimplify_omp_ctx):
Add 'hash_set<tree_operand_hash> *pt_readonly_ptrs' field.
(internal_get_tmp_var): Set
DECL_POINTS_TO_READONLY/SSA_NAME_POINTS_TO_READONLY_MEMORY for
new temp vars.
(build_omp_struct_comp_nodes):
Set OMP_CLAUSE_MAP_POINTS_TO_READONLY on pointer clause.
(gimplify_scan_omp_clauses): Collect OMP_CLAUSE_MAP_POINTS_TO_READONLY
to ctx->pt_readonly_ptrs.
* omp-low.cc (lower_omp_target): Set DECL_POINTS_TO_READONLY for
variables of receiver refs.
* tree-pretty-print.cc (dump_omp_clause):
Print OMP_CLAUSE_MAP_POINTS_TO_READONLY.
(dump_generic_node): Print SSA_NAME_POINTS_TO_READONLY_MEMORY.
* tree.h (DECL_POINTS_TO_READONLY): New macro.
(OMP_CLAUSE_MAP_POINTS_TO_READONLY): New macro.
gcc/testsuite/ChangeLog:
* c-c++-common/goacc/readonly-1.c: Adjust testcase.
* c-c++-common/goacc/readonly-2.c: New testcase.
* gfortran.dg/goacc/readonly-1.f90: Adjust testcase.
[-- Attachment #2: oacc-points-to-readonly.patch --]
[-- Type: text/plain, Size: 13090 bytes --]
diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index 7cf411155c6..42591e4029a 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -14258,6 +14258,8 @@ handle_omp_array_sections (tree c, enum c_omp_region_type ort)
OMP_CLAUSE_SET_MAP_KIND (c2, GOMP_MAP_ATTACH_DETACH);
else
OMP_CLAUSE_SET_MAP_KIND (c2, GOMP_MAP_FIRSTPRIVATE_POINTER);
+ if (OMP_CLAUSE_MAP_READONLY (c))
+ OMP_CLAUSE_MAP_POINTS_TO_READONLY (c2) = 1;
OMP_CLAUSE_MAP_IMPLICIT (c2) = OMP_CLAUSE_MAP_IMPLICIT (c);
if (OMP_CLAUSE_MAP_KIND (c2) != GOMP_MAP_FIRSTPRIVATE_POINTER
&& !c_mark_addressable (t))
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 8fb47fd179e..6ab467e1140 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -5872,6 +5872,8 @@ handle_omp_array_sections (tree c, enum c_omp_region_type ort)
}
else
OMP_CLAUSE_SET_MAP_KIND (c2, GOMP_MAP_FIRSTPRIVATE_POINTER);
+ if (OMP_CLAUSE_MAP_READONLY (c))
+ OMP_CLAUSE_MAP_POINTS_TO_READONLY (c2) = 1;
OMP_CLAUSE_MAP_IMPLICIT (c2) = OMP_CLAUSE_MAP_IMPLICIT (c);
if (OMP_CLAUSE_MAP_KIND (c2) != GOMP_MAP_FIRSTPRIVATE_POINTER
&& !cxx_mark_addressable (t))
diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc
index 2253d559f9c..d7cd65af1bb 100644
--- a/gcc/fortran/trans-openmp.cc
+++ b/gcc/fortran/trans-openmp.cc
@@ -2524,6 +2524,8 @@ gfc_trans_omp_array_section (stmtblock_t *block, gfc_exec_op op,
node3 = build_omp_clause (input_location, OMP_CLAUSE_MAP);
OMP_CLAUSE_SET_MAP_KIND (node3, ptr_kind);
OMP_CLAUSE_DECL (node3) = gfc_conv_descriptor_data_get (decl);
+ if (n->u.readonly)
+ OMP_CLAUSE_MAP_POINTS_TO_READONLY (node3) = 1;
/* This purposely does not include GOMP_MAP_ALWAYS_POINTER. The extra
cast prevents gimplify.cc from recognising it as being part of the
struct - and adding an 'alloc: for the 'desc.data' pointer, which
@@ -2559,6 +2561,8 @@ gfc_trans_omp_array_section (stmtblock_t *block, gfc_exec_op op,
OMP_CLAUSE_MAP);
OMP_CLAUSE_SET_MAP_KIND (node3, ptr_kind);
OMP_CLAUSE_DECL (node3) = decl;
+ if (n->u.readonly)
+ OMP_CLAUSE_MAP_POINTS_TO_READONLY (node3) = 1;
}
ptr2 = fold_convert (ptrdiff_type_node, ptr2);
OMP_CLAUSE_SIZE (node3) = fold_build2 (MINUS_EXPR, ptrdiff_type_node,
diff --git a/gcc/gimple-expr.cc b/gcc/gimple-expr.cc
index f15cc0ba715..42c0f6469b1 100644
--- a/gcc/gimple-expr.cc
+++ b/gcc/gimple-expr.cc
@@ -376,6 +376,8 @@ copy_var_decl (tree var, tree name, tree type)
DECL_CONTEXT (copy) = DECL_CONTEXT (var);
TREE_USED (copy) = 1;
DECL_SEEN_IN_BIND_EXPR_P (copy) = 1;
+ if (VAR_P (var))
+ DECL_POINTS_TO_READONLY (copy) = DECL_POINTS_TO_READONLY (var);
DECL_ATTRIBUTES (copy) = DECL_ATTRIBUTES (var);
if (DECL_USER_ALIGN (var))
{
diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
index 36e5df050b9..394e40fead2 100644
--- a/gcc/gimplify.cc
+++ b/gcc/gimplify.cc
@@ -221,6 +221,7 @@ struct gimplify_omp_ctx
splay_tree variables;
hash_set<tree> *privatized_types;
tree clauses;
+ hash_set<tree_operand_hash> *pt_readonly_ptrs;
/* Iteration variables in an OMP_FOR. */
vec<tree> loop_iter_var;
location_t location;
@@ -628,6 +629,15 @@ internal_get_tmp_var (tree val, gimple_seq *pre_p, gimple_seq *post_p,
gimplify_expr (&val, pre_p, post_p, is_gimple_reg_rhs_or_call,
fb_rvalue);
+ bool pt_readonly = false;
+ if (gimplify_omp_ctxp && gimplify_omp_ctxp->pt_readonly_ptrs)
+ {
+ tree ptr = val;
+ if (TREE_CODE (ptr) == POINTER_PLUS_EXPR)
+ ptr = TREE_OPERAND (ptr, 0);
+ pt_readonly = gimplify_omp_ctxp->pt_readonly_ptrs->contains (ptr);
+ }
+
if (allow_ssa
&& gimplify_ctxp->into_ssa
&& is_gimple_reg_type (TREE_TYPE (val)))
@@ -639,9 +649,18 @@ internal_get_tmp_var (tree val, gimple_seq *pre_p, gimple_seq *post_p,
if (name)
SET_SSA_NAME_VAR_OR_IDENTIFIER (t, create_tmp_var_name (name));
}
+ if (pt_readonly)
+ SSA_NAME_POINTS_TO_READONLY_MEMORY (t) = 1;
}
else
- t = lookup_tmp_var (val, is_formal, not_gimple_reg);
+ {
+ t = lookup_tmp_var (val, is_formal, not_gimple_reg);
+ if (pt_readonly)
+ {
+ DECL_POINTS_TO_READONLY (t) = 1;
+ gimplify_omp_ctxp->pt_readonly_ptrs->add (t);
+ }
+ }
mod = build2 (INIT_EXPR, TREE_TYPE (t), t, unshare_expr (val));
@@ -8906,6 +8925,8 @@ build_omp_struct_comp_nodes (enum tree_code code, tree grp_start, tree grp_end,
OMP_CLAUSE_SET_MAP_KIND (c2, mkind);
OMP_CLAUSE_DECL (c2) = unshare_expr (OMP_CLAUSE_DECL (grp_end));
OMP_CLAUSE_CHAIN (c2) = NULL_TREE;
+ if (OMP_CLAUSE_MAP_POINTS_TO_READONLY (grp_end))
+ OMP_CLAUSE_MAP_POINTS_TO_READONLY (c2) = 1;
tree grp_mid = NULL_TREE;
if (OMP_CLAUSE_CHAIN (grp_start) != grp_end)
grp_mid = OMP_CLAUSE_CHAIN (grp_start);
@@ -11741,6 +11762,16 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
gimplify_omp_ctxp = outer_ctx;
}
+ else if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
+ && (code == OACC_PARALLEL
+ || code == OACC_KERNELS
+ || code == OACC_SERIAL)
+ && OMP_CLAUSE_MAP_POINTS_TO_READONLY (c))
+ {
+ if (ctx->pt_readonly_ptrs == NULL)
+ ctx->pt_readonly_ptrs = new hash_set<tree_operand_hash> ();
+ ctx->pt_readonly_ptrs->add (OMP_CLAUSE_DECL (c));
+ }
if (notice_outer)
goto do_notice;
break;
diff --git a/gcc/omp-low.cc b/gcc/omp-low.cc
index b882df048ef..204fc72ca2d 100644
--- a/gcc/omp-low.cc
+++ b/gcc/omp-low.cc
@@ -14098,6 +14098,8 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
if (ref_to_array)
x = fold_convert_loc (clause_loc, TREE_TYPE (new_var), x);
gimplify_expr (&x, &new_body, NULL, is_gimple_val, fb_rvalue);
+ if (OMP_CLAUSE_MAP_POINTS_TO_READONLY (c) && VAR_P (x))
+ DECL_POINTS_TO_READONLY (x) = 1;
if ((is_ref && !ref_to_array)
|| ref_to_ptr)
{
diff --git a/gcc/testsuite/c-c++-common/goacc/readonly-1.c b/gcc/testsuite/c-c++-common/goacc/readonly-1.c
index 171f96c08db..1f10fd25e46 100644
--- a/gcc/testsuite/c-c++-common/goacc/readonly-1.c
+++ b/gcc/testsuite/c-c++-common/goacc/readonly-1.c
@@ -19,8 +19,8 @@ int main (void)
return 0;
}
-/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:\\*s.ptr \\\[len: 64\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: 128\\\]\\)" 1 "original" { target { c } } } } */
-/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: 64\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: 128\\\]\\)" 1 "original" { target { c++ } } } } */
+/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:\\*s.ptr \\\[len: 64\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: 128\\\]\\) map\\(pt_readonly,firstprivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 "original" { target { c } } } } */
+/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: 64\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: 128\\\]\\) map\\(pt_readonly,firstprivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 "original" { target { c++ } } } } */
/* { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(readonly:x\\\[0\\\] \\\[len: 128\\\]\\);$" 1 "original" } } */
diff --git a/gcc/testsuite/c-c++-common/goacc/readonly-2.c b/gcc/testsuite/c-c++-common/goacc/readonly-2.c
new file mode 100644
index 00000000000..d32d3362000
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/readonly-2.c
@@ -0,0 +1,15 @@
+/* { dg-additional-options "-O -fdump-tree-fre" } */
+
+#pragma acc routine
+extern void foo (int *ptr, int val);
+
+int main (void)
+{
+ int r, a[32];
+ #pragma acc parallel copyin(readonly: a[:32]) copyout(r)
+ {
+ foo (a, a[8]);
+ r = a[8];
+ }
+}
+/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ = MEM\\\[\[^_\]+_\[0-9\]+\\(ptro\\)\\\]\\\[8\\\];" 1 "fre1" } } */
diff --git a/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90 b/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90
index 069fec0a0d5..1e5e60f9744 100644
--- a/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90
+++ b/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90
@@ -20,8 +20,8 @@ program main
!$acc end parallel
end program main
-! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\) .+ map\\(readonly,to:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\)" 1 "original" } }
-! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:a\\\[\\(\\(integer\\(kind=8\\)\\) parm.*data - \\(integer\\(kind=8\\)\\) &a\\) / 4\\\] \\\[len: .+\\\]\\) .+ map\\(readonly,to:b\\\[\\(\\(integer\\(kind=8\\)\\) parm.*data - \\(integer\\(kind=8\\)\\) &b\\) / 4\\\] \\\[len: .+\\\]\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\) map\\(pt_readonly,alloc:a.0 \\\[pointer assign, bias: \\(integer\\(kind=8\\)\\) parm.*data - \\(integer\\(kind=8\\)\\) a.0\\\]\\) map\\(readonly,to:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\) map\\(pt_readonly,alloc:b \\\[pointer assign, bias: \\(integer\\(kind=8\\)\\) parm.*data - \\(integer\\(kind=8\\)\\) b\\\]\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:a\\\[\\(\\(integer\\(kind=8\\)\\) parm.*data - \\(integer\\(kind=8\\)\\) &a\\) / 4\\\] \\\[len: .+\\\]\\) map\\(pt_readonly,alloc:a \\\[pointer assign, bias: \\(integer\\(kind=8\\)\\) parm.*data - \\(integer\\(kind=8\\)\\) &a\\\]\\) map\\(readonly,to:b\\\[\\(\\(integer\\(kind=8\\)\\) parm.*data - \\(integer\\(kind=8\\)\\) &b\\) / 4\\\] \\\[len: .+\\\]\\) map\\(pt_readonly,alloc:b \\\[pointer assign, bias: \\(integer\\(kind=8\\)\\) parm.*data - \\(integer\\(kind=8\\)\\) &b\\\]\\)" 1 "original" } }
! { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(readonly:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\) \\(readonly:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\);" 2 "original" } }
diff --git a/gcc/tree-pretty-print.cc b/gcc/tree-pretty-print.cc
index 9604c3eecc5..1a8b121f30b 100644
--- a/gcc/tree-pretty-print.cc
+++ b/gcc/tree-pretty-print.cc
@@ -907,6 +907,8 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, dump_flags_t flags)
pp_string (pp, "map(");
if (OMP_CLAUSE_MAP_READONLY (clause))
pp_string (pp, "readonly,");
+ if (OMP_CLAUSE_MAP_POINTS_TO_READONLY (clause))
+ pp_string (pp, "pt_readonly,");
switch (OMP_CLAUSE_MAP_KIND (clause))
{
case GOMP_MAP_ALLOC:
@@ -3436,6 +3438,8 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags,
pp_string (pp, "(D)");
if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (node))
pp_string (pp, "(ab)");
+ if (SSA_NAME_POINTS_TO_READONLY_MEMORY (node))
+ pp_string (pp, "(ptro)");
break;
case WITH_SIZE_EXPR:
diff --git a/gcc/tree-ssanames.cc b/gcc/tree-ssanames.cc
index 23387b90fe3..32d35a29dfc 100644
--- a/gcc/tree-ssanames.cc
+++ b/gcc/tree-ssanames.cc
@@ -402,6 +402,9 @@ make_ssa_name_fn (struct function *fn, tree var, gimple *stmt,
else
SSA_NAME_RANGE_INFO (t) = NULL;
+ if (VAR_P (var) && DECL_POINTS_TO_READONLY (var))
+ SSA_NAME_POINTS_TO_READONLY_MEMORY (t) = 1;
+
SSA_NAME_IN_FREE_LIST (t) = 0;
SSA_NAME_IS_DEFAULT_DEF (t) = 0;
init_ssa_name_imm_use (t);
diff --git a/gcc/tree.h b/gcc/tree.h
index ac563de1fc3..880ffb367a3 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -1021,6 +1021,13 @@ extern void omp_clause_range_check_failed (const_tree, const char *, int,
#define DECL_HIDDEN_STRING_LENGTH(NODE) \
(TREE_CHECK (NODE, PARM_DECL)->decl_common.decl_nonshareable_flag)
+/* In a VAR_DECL, set for variables regarded as pointing to memory not written
+ to. SSA_NAME_POINTS_TO_READONLY_MEMORY gets set for SSA_NAMEs created from
+ such VAR_DECLs. Currently used by OpenACC 'readonly' modifier in copyin
+ clauses. */
+#define DECL_POINTS_TO_READONLY(NODE) \
+ (TREE_CHECK (NODE, VAR_DECL)->decl_common.decl_not_flexarray)
+
/* In a CALL_EXPR, means that the call is the jump from a thunk to the
thunked-to function. Be careful to avoid using this macro when one of the
next two applies instead. */
@@ -1815,6 +1822,10 @@ class auto_suppress_location_wrappers
#define OMP_CLAUSE_MAP_READONLY(NODE) \
TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP))
+/* Set if 'OMP_CLAUSE_DECL (NODE)' points to read-only memory. */
+#define OMP_CLAUSE_MAP_POINTS_TO_READONLY(NODE) \
+ TREE_CONSTANT (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP))
+
/* Same as above, for use in OpenACC cache directives. */
#define OMP_CLAUSE__CACHE__READONLY(NODE) \
TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE__CACHE_))
* [PATCH, OpenACC 2.7, v2] readonly modifier support in front-ends
2023-07-20 15:08 ` Tobias Burnus
@ 2023-08-07 13:58 ` Chung-Lin Tang
2023-10-26 9:43 ` Thomas Schwinge
0 siblings, 1 reply; 18+ messages in thread
From: Chung-Lin Tang @ 2023-08-07 13:58 UTC (permalink / raw)
To: Tobias Burnus, Thomas Schwinge, Chung-Lin Tang
Cc: gcc-patches, Catherine Moore
[-- Attachment #1: Type: text/plain, Size: 6963 bytes --]
Hi Thomas, Tobias,
here's the updated v2 of the readonly modifier front-end patch.
On 2023/7/20 11:08 PM, Tobias Burnus wrote:
>>> +++ b/gcc/c/c-parser.cc
>>> @@ -14059,7 +14059,8 @@ c_parser_omp_variable_list (c_parser *parser,
>>>
>>> static tree
>>> c_parser_omp_var_list_parens (c_parser *parser, enum omp_clause_code kind,
>>> - tree list, bool allow_deref = false)
>>> + tree list, bool allow_deref = false,
>>> + bool *readonly = NULL)
>>> ...
>> Instead of doing this in 'c_parser_omp_var_list_parens', I think it's
>> clearer to have this special 'readonly :' parsing logic in the two places
>> where it's used.
> I concur. The same issue also occurred for OpenMP's
> c_parser_omp_clause_to, and c_parser_omp_clause_from and the 'present'
> modifier. For it, I created a combined function but the main reason for
> that is that OpenMP also permits more modifiers (like 'iterators'),
> which would cause more duplication of code ('iterator' is not yet
> supported).
>
> For something as simple to parse as this modifier, I would just do it at
> the two places – as Thomas suggested.
Okay, I've changed the C/C++ parser parts to have the parsing logic directly
added.
>>> +++ b/gcc/fortran/gfortran.h
>>> @@ -1360,7 +1360,11 @@ typedef struct gfc_omp_namelist
>>> {
>>> gfc_omp_reduction_op reduction_op;
>>> gfc_omp_depend_doacross_op depend_doacross_op;
>>> - gfc_omp_map_op map_op;
>>> + struct
>>> + {
>>> + ENUM_BITFIELD (gfc_omp_map_op) map_op:8;
>>> + bool readonly;
>>> + };
>>> gfc_expr *align;
>>> struct
>>> {
>> [...] Thus, the above looks good to me.
> I concur but I wonder whether it would be cleaner to name the struct;
> this makes it also more obvious what belongs together in the union.
>
> Namely, naming the struct 'map' and then changing the 45 users from
> 'u.map_op' to 'u.map.op' and the new 'u.readonly' to 'u.map.readonly'. –
> this seems to be cleaner.
I've adjusted 'u.map' to be a named struct now, and updated the references.
>> + if (gfc_match ("readonly :") == MATCH_YES)
>> I note this one does not have a space after ':' in 'gfc_match', but the
>> one above in 'gfc_match_omp_clauses' does. I don't know off-hand if that
>> makes a difference in parsing -- probably not, as all of
>> 'gcc/fortran/openmp.cc' generally doesn't seem to be very consistent
>> about these two variants?
> It *does* make a difference. And for obvious reasons. You don't want to permit:
>
> !$acc kernels asynccopy(a)
>
> but require at least one space (or comma) between "async" and "copy".
> (In fixed form Fortran, it would be fine - as would be "!$acc k e rnelsasy nc co p y(a)".)
>
> A " " matches zero or more whitespaces, but with gfc_match_space you can find out
> whether there was whitespace or not.
Okay, made sure both are 'gfc_match ("readonly : ")'. Thanks for catching that, didn't
realize that space was significant.
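To make the rule Tobias describes concrete, here is a toy matcher. It is an
assumption-laden sketch and not gfortran's gfc_match: 'toy_match' is a
hypothetical helper that only models "a space in the pattern matches zero or
more whitespace characters, everything else must match exactly".

```cpp
#include <cctype>

// Toy matcher, NOT gfortran's gfc_match.  A ' ' in PATTERN matches zero
// or more whitespace characters in INPUT; any other pattern character
// must match the next input character exactly.  Returns the number of
// INPUT characters consumed, or -1 on mismatch.
inline int
toy_match (const char *pattern, const char *input)
{
  const char *in = input;
  for (const char *p = pattern; *p; ++p)
    {
      if (*p == ' ')
        while (std::isspace (static_cast<unsigned char> (*in)))
          ++in;
      else if (*in == *p)
        ++in;
      else
        return -1;
    }
  return static_cast<int> (in - input);
}
```

In this model "readonly : " accepts both "readonly:a" and "readonly :  a",
and the trailing space in the pattern also consumes any whitespace after the
colon, whereas "readonly :" stops right after the colon.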
>>> +++ b/gcc/tree.h
>>> @@ -1813,6 +1813,14 @@ class auto_suppress_location_wrappers
>>> #define OMP_CLAUSE_MAP_DECL_MAKE_ADDRESSABLE(NODE) \
>>> (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP)->base.addressable_flag)
>>>
>>> +/* Nonzero if OpenACC 'readonly' modifier set, used for 'copyin'. */
>>> +#define OMP_CLAUSE_MAP_READONLY(NODE) \
>>> + TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP))
>>> +
>>> +/* Same as above, for use in OpenACC cache directives. */
>>> +#define OMP_CLAUSE__CACHE__READONLY(NODE) \
>>> + TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE__CACHE_))
>> I'm not sure if these special accessor functions are actually useful, or
>> we should just directly use 'TREE_READONLY' instead? We're only using
>> them in contexts where it's clear that the 'OMP_CLAUSE_SUBCODE_CHECK' is
>> satisfied, for example.
> I find directly using TREE_READONLY confusing.
FWIW, I've changed to use TREE_NOTHROW instead, if it can give a better sense of safety :P
I think there's a misunderstanding here anyways: we are not relying on a DECL marked
TREE_READONLY here. We merely need the OMP_CLAUSE_MAP to be marked as OMP_CLAUSE_MAP_READONLY == 1.
The other points-to patch (also in the front-ends) then takes OMP_CLAUSE_MAP_READONLY
to mark the clauses for "base-pointers of array-sections" as OMP_CLAUSE_MAP_POINTS_TO_READONLY,
and this later gets relayed to the alias oracle routines in tree-ssa-alias.cc.
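To spell out what "marking the clause" means, here is a toy model of a
checked accessor over a reused flag bit. It is only a sketch: 'toy_clause',
'toy_subcode_check', 'TOY_CLAUSE_MAP_READONLY' and 'toy_mark_readonly' are
hypothetical stand-ins, not the real tree.h definitions. The flag lives on
the clause node itself, not on the decl the clause refers to.

```cpp
#include <cassert>

enum toy_clause_code { TOY_CLAUSE_MAP, TOY_CLAUSE_CACHE, TOY_CLAUSE_OTHER };

// Toy clause node; 'generic_flag' stands in for a reused base flag bit.
struct toy_clause
{
  toy_clause_code code;
  unsigned generic_flag : 1;
  toy_clause *chain;
};

// Return C after checking that it has the expected clause code.
inline toy_clause *
toy_subcode_check (toy_clause *c, toy_clause_code expected)
{
  assert (c->code == expected);
  return c;
}

// Checked accessor: only valid on map clauses.
#define TOY_CLAUSE_MAP_READONLY(NODE) \
  (toy_subcode_check ((NODE), TOY_CLAUSE_MAP)->generic_flag)

// Mark the newly parsed clauses, from FIRST up to but excluding STOP
// (the head of the list that existed before parsing), as readonly.
inline void
toy_mark_readonly (toy_clause *first, toy_clause *stop)
{
  for (toy_clause *c = first; c != stop; c = c->chain)
    TOY_CLAUSE_MAP_READONLY (c) = 1;
}
```

The loop mirrors the 'for (c = nl; c != list; ...)' pattern in the parser
changes: only the clauses created for this copyin clause get the flag, and
clauses that were already on the list are left alone.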
Re-tested this v2 patch on powerpc64le-linux/nvptx. Okay for trunk?
Thanks,
Chung-Lin
2023-08-07 Chung-Lin Tang <cltang@codesourcery.com>
gcc/c/ChangeLog:
* c-parser.cc (c_parser_oacc_data_clause): Add parsing support for
'readonly' modifier, set OMP_CLAUSE_MAP_READONLY if readonly modifier
found, update comments.
(c_parser_oacc_cache): Add parsing support for 'readonly' modifier,
set OMP_CLAUSE__CACHE__READONLY if readonly modifier found, update
comments.
gcc/cp/ChangeLog:
* parser.cc (cp_parser_oacc_data_clause): Add parsing support for
'readonly' modifier, set OMP_CLAUSE_MAP_READONLY if readonly modifier
found, update comments.
(cp_parser_oacc_cache): Add parsing support for 'readonly' modifier,
set OMP_CLAUSE__CACHE__READONLY if readonly modifier found, update
comments.
gcc/fortran/ChangeLog:
* dump-parse-tree.cc (show_omp_namelist): Print "readonly," for
OMP_LIST_MAP and OMP_LIST_CACHE if n->u.map.readonly is set.
Adjust 'n->u.map_op' to 'n->u.map.op'.
* gfortran.h (typedef struct gfc_omp_namelist): Adjust map_op as
'ENUM_BITFIELD (gfc_omp_map_op) op:8', add 'bool readonly' field,
change to named struct field 'map'.
* openmp.cc (gfc_match_omp_map_clause): Add 'bool readonly = false'
parameter, set n->u.map.readonly field. Adjust 'n->u.map_op' to
'n->u.map.op'.
(gfc_match_omp_clause_reduction): Adjust 'n->u.map_op' to 'n->u.map.op'.
(gfc_match_omp_clauses): Add readonly modifier parsing for OpenACC
copyin clause, adjust call to gfc_match_omp_map_clause.
Adjust 'n->u.map_op' to 'n->u.map.op'.
(gfc_match_oacc_declare): Adjust 'n->u.map_op' to 'n->u.map.op'.
(gfc_match_oacc_cache): Add readonly modifier parsing for OpenACC
cache directive.
(resolve_omp_clauses): Adjust 'n->u.map_op' to 'n->u.map.op'.
* trans-decl.cc (add_clause): Adjust 'n->u.map_op' to 'n->u.map.op'.
(finish_oacc_declare): Likewise.
* trans-openmp.cc (gfc_trans_omp_clauses): Set OMP_CLAUSE_MAP_READONLY,
OMP_CLAUSE__CACHE__READONLY to 1 when readonly is set. Adjust
'n->u.map_op' to 'n->u.map.op'.
(gfc_add_clause_implicitly): Adjust 'n->u.map_op' to 'n->u.map.op'.
gcc/ChangeLog:
* tree-pretty-print.cc (dump_omp_clause): Add support for printing
OMP_CLAUSE_MAP_READONLY and OMP_CLAUSE__CACHE__READONLY.
* tree.h (OMP_CLAUSE_MAP_READONLY): New macro.
(OMP_CLAUSE__CACHE__READONLY): New macro.
gcc/testsuite/ChangeLog:
* c-c++-common/goacc/readonly-1.c: New test.
* gfortran.dg/goacc/readonly-1.f90: New test.
[-- Attachment #2: readonly-fe-v2.patch --]
[-- Type: text/plain, Size: 26539 bytes --]
diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index 24a6eb6e459..5779f499ae1 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -14084,7 +14084,11 @@ c_parser_omp_var_list_parens (c_parser *parser, enum omp_clause_code kind,
OpenACC 2.6:
no_create ( variable-list )
attach ( variable-list )
- detach ( variable-list ) */
+ detach ( variable-list )
+
+ OpenACC 2.7:
+ copyin (readonly : variable-list )
+ */
static tree
c_parser_oacc_data_clause (c_parser *parser, pragma_omp_clause c_kind,
@@ -14135,11 +14139,36 @@ c_parser_oacc_data_clause (c_parser *parser, pragma_omp_clause c_kind,
default:
gcc_unreachable ();
}
- tree nl, c;
- nl = c_parser_omp_var_list_parens (parser, OMP_CLAUSE_MAP, list, true);
- for (c = nl; c != list; c = OMP_CLAUSE_CHAIN (c))
- OMP_CLAUSE_SET_MAP_KIND (c, kind);
+ tree nl = list;
+ bool readonly = false;
+ matching_parens parens;
+ if (parens.require_open (parser))
+ {
+ /* Turn on readonly modifier parsing for copyin clause. */
+ if (c_kind == PRAGMA_OACC_CLAUSE_COPYIN)
+ {
+ c_token *token = c_parser_peek_token (parser);
+ if (token->type == CPP_NAME
+ && !strcmp (IDENTIFIER_POINTER (token->value), "readonly")
+ && c_parser_peek_2nd_token (parser)->type == CPP_COLON)
+ {
+ c_parser_consume_token (parser);
+ c_parser_consume_token (parser);
+ readonly = true;
+ }
+ }
+ location_t loc = c_parser_peek_token (parser)->location;
+ nl = c_parser_omp_variable_list (parser, loc, OMP_CLAUSE_MAP, list, true);
+ parens.skip_until_found_close (parser);
+ }
+
+ for (tree c = nl; c != list; c = OMP_CLAUSE_CHAIN (c))
+ {
+ OMP_CLAUSE_SET_MAP_KIND (c, kind);
+ if (readonly)
+ OMP_CLAUSE_MAP_READONLY (c) = 1;
+ }
return nl;
}
@@ -18161,15 +18190,40 @@ c_parser_omp_structured_block (c_parser *parser, bool *if_p)
/* OpenACC 2.0:
# pragma acc cache (variable-list) new-line
+ OpenACC 2.7:
+ # pragma acc cache (readonly: variable-list) new-line
+
LOC is the location of the #pragma token.
*/
static tree
c_parser_oacc_cache (location_t loc, c_parser *parser)
{
- tree stmt, clauses;
+ tree stmt, clauses = NULL_TREE;
+ bool readonly = false;
+ matching_parens parens;
+
+ if (parens.require_open (parser))
+ {
+ c_token *token = c_parser_peek_token (parser);
+ if (token->type == CPP_NAME
+ && !strcmp (IDENTIFIER_POINTER (token->value), "readonly")
+ && c_parser_peek_2nd_token (parser)->type == CPP_COLON)
+ {
+ c_parser_consume_token (parser);
+ c_parser_consume_token (parser);
+ readonly = true;
+ }
+ location_t loc = c_parser_peek_token (parser)->location;
+ clauses = c_parser_omp_variable_list (parser, loc, OMP_CLAUSE__CACHE_,
+ NULL_TREE);
+ parens.skip_until_found_close (parser);
+ }
+
+ if (readonly)
+ for (tree c = clauses; c; c = OMP_CLAUSE_CHAIN (c))
+ OMP_CLAUSE__CACHE__READONLY (c) = 1;
- clauses = c_parser_omp_var_list_parens (parser, OMP_CLAUSE__CACHE_, NULL);
clauses = c_finish_omp_clauses (clauses, C_ORT_ACC);
c_parser_skip_to_pragma_eol (parser);
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index d7ef5b34d42..ac8a656874a 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -37750,7 +37750,11 @@ cp_parser_omp_var_list (cp_parser *parser, enum omp_clause_code kind, tree list,
OpenACC 2.6:
no_create ( variable-list )
attach ( variable-list )
- detach ( variable-list ) */
+ detach ( variable-list )
+
+ OpenACC 2.7:
+ copyin (readonly : variable-list )
+ */
static tree
cp_parser_oacc_data_clause (cp_parser *parser, pragma_omp_clause c_kind,
@@ -37801,11 +37805,33 @@ cp_parser_oacc_data_clause (cp_parser *parser, pragma_omp_clause c_kind,
default:
gcc_unreachable ();
}
- tree nl, c;
- nl = cp_parser_omp_var_list (parser, OMP_CLAUSE_MAP, list, true);
+ tree nl = list;
+ bool readonly = false;
+ if (cp_parser_require (parser, CPP_OPEN_PAREN, RT_OPEN_PAREN))
+ {
+ /* Turn on readonly modifier parsing for copyin clause. */
+ if (c_kind == PRAGMA_OACC_CLAUSE_COPYIN)
+ {
+ cp_token *token = cp_lexer_peek_token (parser->lexer);
+ if (token->type == CPP_NAME
+ && !strcmp (IDENTIFIER_POINTER (token->u.value), "readonly")
+ && cp_lexer_peek_nth_token (parser->lexer, 2)->type == CPP_COLON)
+ {
+ cp_lexer_consume_token (parser->lexer);
+ cp_lexer_consume_token (parser->lexer);
+ readonly = true;
+ }
+ }
+ nl = cp_parser_omp_var_list_no_open (parser, OMP_CLAUSE_MAP, list, NULL,
+ true);
+ }
- for (c = nl; c != list; c = OMP_CLAUSE_CHAIN (c))
- OMP_CLAUSE_SET_MAP_KIND (c, kind);
+ for (tree c = nl; c != list; c = OMP_CLAUSE_CHAIN (c))
+ {
+ OMP_CLAUSE_SET_MAP_KIND (c, kind);
+ if (readonly)
+ OMP_CLAUSE_MAP_READONLY (c) = 1;
+ }
return nl;
}
@@ -45825,6 +45851,9 @@ cp_parser_omp_target (cp_parser *parser, cp_token *pragma_tok,
/* OpenACC 2.0:
# pragma acc cache (variable-list) new-line
+
+ OpenACC 2.7:
+ # pragma acc cache (readonly: variable-list) new-line
*/
static tree
@@ -45834,9 +45863,28 @@ cp_parser_oacc_cache (cp_parser *parser, cp_token *pragma_tok)
clauses. */
auto_suppress_location_wrappers sentinel;
- tree stmt, clauses;
+ tree stmt, clauses = NULL_TREE;
+ bool readonly = false;
+
+ if (cp_parser_require (parser, CPP_OPEN_PAREN, RT_OPEN_PAREN))
+ {
+ cp_token *token = cp_lexer_peek_token (parser->lexer);
+ if (token->type == CPP_NAME
+ && !strcmp (IDENTIFIER_POINTER (token->u.value), "readonly")
+ && cp_lexer_peek_nth_token (parser->lexer, 2)->type == CPP_COLON)
+ {
+ cp_lexer_consume_token (parser->lexer);
+ cp_lexer_consume_token (parser->lexer);
+ readonly = true;
+ }
+ clauses = cp_parser_omp_var_list_no_open (parser, OMP_CLAUSE__CACHE_,
+ NULL, NULL);
+ }
+
+ if (readonly)
+ for (tree c = clauses; c; c = OMP_CLAUSE_CHAIN (c))
+ OMP_CLAUSE__CACHE__READONLY (c) = 1;
- clauses = cp_parser_omp_var_list (parser, OMP_CLAUSE__CACHE_, NULL_TREE);
clauses = finish_omp_clauses (clauses, C_ORT_ACC);
cp_parser_require_pragma_eol (parser, cp_lexer_peek_token (parser->lexer));
diff --git a/gcc/fortran/dump-parse-tree.cc b/gcc/fortran/dump-parse-tree.cc
index 68122e3e6fd..0e888fafe7b 100644
--- a/gcc/fortran/dump-parse-tree.cc
+++ b/gcc/fortran/dump-parse-tree.cc
@@ -1398,6 +1398,9 @@ show_omp_namelist (int list_type, gfc_omp_namelist *n)
fputs (") ALLOCATE(", dumpfile);
continue;
}
+ if ((list_type == OMP_LIST_MAP || list_type == OMP_LIST_CACHE)
+ && n->u.map.readonly)
+ fputs ("readonly,", dumpfile);
if (list_type == OMP_LIST_REDUCTION)
switch (n->u.reduction_op)
{
@@ -1465,7 +1468,7 @@ show_omp_namelist (int list_type, gfc_omp_namelist *n)
default: break;
}
else if (list_type == OMP_LIST_MAP)
- switch (n->u.map_op)
+ switch (n->u.map.op)
{
case OMP_MAP_ALLOC: fputs ("alloc:", dumpfile); break;
case OMP_MAP_TO: fputs ("to:", dumpfile); break;
diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index 9a00e6dea6f..a8667a6e6d3 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -1360,7 +1360,11 @@ typedef struct gfc_omp_namelist
{
gfc_omp_reduction_op reduction_op;
gfc_omp_depend_doacross_op depend_doacross_op;
- gfc_omp_map_op map_op;
+ struct
+ {
+ ENUM_BITFIELD (gfc_omp_map_op) op:8;
+ bool readonly;
+ } map;
gfc_expr *align;
struct
{
diff --git a/gcc/fortran/openmp.cc b/gcc/fortran/openmp.cc
index 2952cd300ac..af769d9efbd 100644
--- a/gcc/fortran/openmp.cc
+++ b/gcc/fortran/openmp.cc
@@ -1197,7 +1197,7 @@ omp_inv_mask::omp_inv_mask (const omp_mask &m) : omp_mask (m)
static bool
gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op,
- bool allow_common, bool allow_derived)
+ bool allow_common, bool allow_derived, bool readonly = false)
{
gfc_omp_namelist **head = NULL;
if (gfc_match_omp_variable_list ("", list, allow_common, NULL, &head, true,
@@ -1206,7 +1206,10 @@ gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op,
{
gfc_omp_namelist *n;
for (n = *head; n; n = n->next)
- n->u.map_op = map_op;
+ {
+ n->u.map.op = map_op;
+ n->u.map.readonly = readonly;
+ }
return true;
}
@@ -1520,7 +1523,7 @@ gfc_match_omp_clause_reduction (char pc, gfc_omp_clauses *c, bool openacc,
gfc_omp_namelist *p = gfc_get_omp_namelist (), **tl;
p->sym = n->sym;
p->where = p->where;
- p->u.map_op = OMP_MAP_ALWAYS_TOFROM;
+ p->u.map.op = OMP_MAP_ALWAYS_TOFROM;
tl = &c->lists[OMP_LIST_MAP];
while (*tl)
@@ -2180,11 +2183,16 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
{
if (openacc)
{
- if (gfc_match ("copyin ( ") == MATCH_YES
- && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
- OMP_MAP_TO, true,
- allow_derived))
- continue;
+ if (gfc_match ("copyin ( ") == MATCH_YES)
+ {
+ bool readonly = false;
+ if (gfc_match ("readonly : ") == MATCH_YES)
+ readonly = true;
+ if (gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
+ OMP_MAP_TO, true,
+ allow_derived, readonly))
+ continue;
+ }
}
else if (gfc_match_omp_variable_list ("copyin (",
&c->lists[OMP_LIST_COPYIN],
@@ -3101,7 +3109,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
{
gfc_omp_namelist *n;
for (n = *head; n; n = n->next)
- n->u.map_op = map_op;
+ n->u.map.op = map_op;
continue;
}
gfc_current_locus = old_loc;
@@ -3942,7 +3950,7 @@ gfc_match_oacc_declare (void)
if (gfc_current_ns->proc_name
&& gfc_current_ns->proc_name->attr.flavor == FL_MODULE)
{
- if (n->u.map_op != OMP_MAP_ALLOC && n->u.map_op != OMP_MAP_TO)
+ if (n->u.map.op != OMP_MAP_ALLOC && n->u.map.op != OMP_MAP_TO)
{
gfc_error ("Invalid clause in module with !$ACC DECLARE at %L",
&where);
@@ -3976,7 +3984,7 @@ gfc_match_oacc_declare (void)
return MATCH_ERROR;
}
- switch (n->u.map_op)
+ switch (n->u.map.op)
{
case OMP_MAP_FORCE_ALLOC:
case OMP_MAP_ALLOC:
@@ -4091,20 +4099,35 @@ gfc_match_oacc_wait (void)
match
gfc_match_oacc_cache (void)
{
+ bool readonly = false;
gfc_omp_clauses *c = gfc_get_omp_clauses ();
/* The OpenACC cache directive explicitly only allows "array elements or
subarrays", which we're currently not checking here. Either check this
after the call of gfc_match_omp_variable_list, or add something like a
only_sections variant next to its allow_sections parameter. */
- match m = gfc_match_omp_variable_list (" (",
- &c->lists[OMP_LIST_CACHE], true,
- NULL, NULL, true);
+ match m = gfc_match (" ( ");
if (m != MATCH_YES)
{
gfc_free_omp_clauses(c);
return m;
}
+ if (gfc_match ("readonly : ") == MATCH_YES)
+ readonly = true;
+
+ gfc_omp_namelist **head = NULL;
+ m = gfc_match_omp_variable_list ("", &c->lists[OMP_LIST_CACHE], true,
+ NULL, &head, true);
+ if (m != MATCH_YES)
+ {
+ gfc_free_omp_clauses(c);
+ return m;
+ }
+
+ if (readonly)
+ for (gfc_omp_namelist *n = *head; n; n = n->next)
+ n->u.map.readonly = true;
+
if (gfc_current_state() != COMP_DO
&& gfc_current_state() != COMP_DO_CONCURRENT)
{
@@ -8142,8 +8165,8 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
}
if (openacc
&& list == OMP_LIST_MAP
- && (n->u.map_op == OMP_MAP_ATTACH
- || n->u.map_op == OMP_MAP_DETACH))
+ && (n->u.map.op == OMP_MAP_ATTACH
+ || n->u.map.op == OMP_MAP_DETACH))
{
symbol_attribute attr;
if (n->expr)
@@ -8153,7 +8176,7 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
if (!attr.pointer && !attr.allocatable)
gfc_error ("%qs clause argument must be ALLOCATABLE or "
"a POINTER at %L",
- (n->u.map_op == OMP_MAP_ATTACH) ? "attach"
+ (n->u.map.op == OMP_MAP_ATTACH) ? "attach"
: "detach", &n->where);
}
if (lastref
@@ -8224,7 +8247,7 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
else if (openacc)
{
if (list == OMP_LIST_MAP
- && n->u.map_op == OMP_MAP_FORCE_DEVICEPTR)
+ && n->u.map.op == OMP_MAP_FORCE_DEVICEPTR)
resolve_oacc_deviceptr_clause (n->sym, n->where, name);
else
resolve_oacc_data_clauses (n->sym, n->where, name);
@@ -8246,7 +8269,7 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
{
case EXEC_OMP_TARGET:
case EXEC_OMP_TARGET_DATA:
- switch (n->u.map_op)
+ switch (n->u.map.op)
{
case OMP_MAP_TO:
case OMP_MAP_ALWAYS_TO:
@@ -8273,7 +8296,7 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
}
break;
case EXEC_OMP_TARGET_ENTER_DATA:
- switch (n->u.map_op)
+ switch (n->u.map.op)
{
case OMP_MAP_TO:
case OMP_MAP_ALWAYS_TO:
@@ -8283,16 +8306,16 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
case OMP_MAP_PRESENT_ALLOC:
break;
case OMP_MAP_TOFROM:
- n->u.map_op = OMP_MAP_TO;
+ n->u.map.op = OMP_MAP_TO;
break;
case OMP_MAP_ALWAYS_TOFROM:
- n->u.map_op = OMP_MAP_ALWAYS_TO;
+ n->u.map.op = OMP_MAP_ALWAYS_TO;
break;
case OMP_MAP_PRESENT_TOFROM:
- n->u.map_op = OMP_MAP_PRESENT_TO;
+ n->u.map.op = OMP_MAP_PRESENT_TO;
break;
case OMP_MAP_ALWAYS_PRESENT_TOFROM:
- n->u.map_op = OMP_MAP_ALWAYS_PRESENT_TO;
+ n->u.map.op = OMP_MAP_ALWAYS_PRESENT_TO;
break;
default:
gfc_error ("TARGET ENTER DATA with map-type other "
@@ -8302,7 +8325,7 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
}
break;
case EXEC_OMP_TARGET_EXIT_DATA:
- switch (n->u.map_op)
+ switch (n->u.map.op)
{
case OMP_MAP_FROM:
case OMP_MAP_ALWAYS_FROM:
@@ -8312,16 +8335,16 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
case OMP_MAP_DELETE:
break;
case OMP_MAP_TOFROM:
- n->u.map_op = OMP_MAP_FROM;
+ n->u.map.op = OMP_MAP_FROM;
break;
case OMP_MAP_ALWAYS_TOFROM:
- n->u.map_op = OMP_MAP_ALWAYS_FROM;
+ n->u.map.op = OMP_MAP_ALWAYS_FROM;
break;
case OMP_MAP_PRESENT_TOFROM:
- n->u.map_op = OMP_MAP_PRESENT_FROM;
+ n->u.map.op = OMP_MAP_PRESENT_FROM;
break;
case OMP_MAP_ALWAYS_PRESENT_TOFROM:
- n->u.map_op = OMP_MAP_ALWAYS_PRESENT_FROM;
+ n->u.map.op = OMP_MAP_ALWAYS_PRESENT_FROM;
break;
default:
gfc_error ("TARGET EXIT DATA with map-type other "
diff --git a/gcc/fortran/trans-decl.cc b/gcc/fortran/trans-decl.cc
index b0fd25e92a3..1ff1dda026a 100644
--- a/gcc/fortran/trans-decl.cc
+++ b/gcc/fortran/trans-decl.cc
@@ -6614,7 +6614,7 @@ add_clause (gfc_symbol *sym, gfc_omp_map_op map_op)
n = gfc_get_omp_namelist ();
n->sym = sym;
- n->u.map_op = map_op;
+ n->u.map.op = map_op;
if (!module_oacc_clauses)
module_oacc_clauses = gfc_get_omp_clauses ();
@@ -6716,10 +6716,10 @@ finish_oacc_declare (gfc_namespace *ns, gfc_symbol *sym, bool block)
for (n = omp_clauses->lists[OMP_LIST_MAP]; n; n = n->next)
{
- switch (n->u.map_op)
+ switch (n->u.map.op)
{
case OMP_MAP_DEVICE_RESIDENT:
- n->u.map_op = OMP_MAP_FORCE_ALLOC;
+ n->u.map.op = OMP_MAP_FORCE_ALLOC;
break;
default:
diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc
index cf741cebf91..a4628e460bd 100644
--- a/gcc/fortran/trans-openmp.cc
+++ b/gcc/fortran/trans-openmp.cc
@@ -3067,7 +3067,10 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
|| (n->expr && gfc_expr_attr (n->expr).pointer)))
always_modifier = true;
- switch (n->u.map_op)
+ if (n->u.map.readonly)
+ OMP_CLAUSE_MAP_READONLY (node) = 1;
+
+ switch (n->u.map.op)
{
case OMP_MAP_ALLOC:
OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_ALLOC);
@@ -3194,8 +3197,8 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
&& n->sym->attr.omp_declare_target
&& (always_modifier || n->sym->attr.pointer)
&& op != EXEC_OMP_TARGET_EXIT_DATA
- && n->u.map_op != OMP_MAP_DELETE
- && n->u.map_op != OMP_MAP_RELEASE)
+ && n->u.map.op != OMP_MAP_DELETE
+ && n->u.map.op != OMP_MAP_RELEASE)
{
gcc_assert (n->sym->ts.u.cl->backend_decl);
node5 = build_omp_clause (input_location, OMP_CLAUSE_MAP);
@@ -3261,7 +3264,7 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
{
enum gomp_map_kind gmk = GOMP_MAP_POINTER;
if (op == EXEC_OMP_TARGET_EXIT_DATA
- && n->u.map_op == OMP_MAP_DELETE)
+ && n->u.map.op == OMP_MAP_DELETE)
gmk = GOMP_MAP_DELETE;
else if (op == EXEC_OMP_TARGET_EXIT_DATA)
gmk = GOMP_MAP_RELEASE;
@@ -3284,7 +3287,7 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
{
enum gomp_map_kind gmk;
if (op == EXEC_OMP_TARGET_EXIT_DATA
- && n->u.map_op == OMP_MAP_DELETE)
+ && n->u.map.op == OMP_MAP_DELETE)
gmk = GOMP_MAP_DELETE;
else if (op == EXEC_OMP_TARGET_EXIT_DATA)
gmk = GOMP_MAP_RELEASE;
@@ -3316,18 +3319,18 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
node2 = build_omp_clause (input_location, OMP_CLAUSE_MAP);
OMP_CLAUSE_DECL (node2) = decl;
OMP_CLAUSE_SIZE (node2) = TYPE_SIZE_UNIT (type);
- if (n->u.map_op == OMP_MAP_DELETE)
+ if (n->u.map.op == OMP_MAP_DELETE)
map_kind = GOMP_MAP_DELETE;
else if (op == EXEC_OMP_TARGET_EXIT_DATA
- || n->u.map_op == OMP_MAP_RELEASE)
+ || n->u.map.op == OMP_MAP_RELEASE)
map_kind = GOMP_MAP_RELEASE;
else
map_kind = GOMP_MAP_TO_PSET;
OMP_CLAUSE_SET_MAP_KIND (node2, map_kind);
if (op != EXEC_OMP_TARGET_EXIT_DATA
- && n->u.map_op != OMP_MAP_DELETE
- && n->u.map_op != OMP_MAP_RELEASE)
+ && n->u.map.op != OMP_MAP_DELETE
+ && n->u.map.op != OMP_MAP_RELEASE)
{
node3 = build_omp_clause (input_location,
OMP_CLAUSE_MAP);
@@ -3345,7 +3348,7 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
= gfc_conv_descriptor_data_get (decl);
OMP_CLAUSE_SIZE (node3) = size_int (0);
- if (n->u.map_op == OMP_MAP_ATTACH)
+ if (n->u.map.op == OMP_MAP_ATTACH)
{
/* Standalone attach clauses used with arrays with
descriptors must copy the descriptor to the
@@ -3361,7 +3364,7 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
node3 = NULL;
goto finalize_map_clause;
}
- else if (n->u.map_op == OMP_MAP_DETACH)
+ else if (n->u.map.op == OMP_MAP_DETACH)
{
OMP_CLAUSE_SET_MAP_KIND (node3, GOMP_MAP_DETACH);
/* Similarly to above, we don't want to unmap PTR
@@ -3553,8 +3556,8 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
to perform a single attach/detach operation, of the
pointer itself, not of the pointed-to object. */
if (openacc
- && (n->u.map_op == OMP_MAP_ATTACH
- || n->u.map_op == OMP_MAP_DETACH))
+ && (n->u.map.op == OMP_MAP_ATTACH
+ || n->u.map.op == OMP_MAP_DETACH))
{
OMP_CLAUSE_DECL (node)
= build_fold_addr_expr (OMP_CLAUSE_DECL (node));
@@ -3585,7 +3588,7 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
fold_convert (size_type_node,
se.string_length),
TYPE_SIZE_UNIT (tmp));
- if (n->u.map_op == OMP_MAP_DELETE)
+ if (n->u.map.op == OMP_MAP_DELETE)
kind = GOMP_MAP_DELETE;
else if (op == EXEC_OMP_TARGET_EXIT_DATA)
kind = GOMP_MAP_RELEASE;
@@ -3642,8 +3645,8 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
to perform a single attach/detach operation, of the
pointer itself, not of the pointed-to object. */
if (openacc
- && (n->u.map_op == OMP_MAP_ATTACH
- || n->u.map_op == OMP_MAP_DETACH))
+ && (n->u.map.op == OMP_MAP_ATTACH
+ || n->u.map.op == OMP_MAP_DETACH))
{
OMP_CLAUSE_DECL (node)
= build_fold_addr_expr (inner);
@@ -3689,8 +3692,8 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
{
/* Bare attach and detach clauses don't want any
additional nodes. */
- if ((n->u.map_op == OMP_MAP_ATTACH
- || n->u.map_op == OMP_MAP_DETACH)
+ if ((n->u.map.op == OMP_MAP_ATTACH
+ || n->u.map.op == OMP_MAP_DETACH)
&& (POINTER_TYPE_P (TREE_TYPE (inner))
|| GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (inner))))
{
@@ -3724,8 +3727,8 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
map_kind = ((GOMP_MAP_ALWAYS_P (map_kind)
|| gfc_expr_attr (n->expr).pointer)
? GOMP_MAP_ALWAYS_TO : GOMP_MAP_TO);
- else if (n->u.map_op == OMP_MAP_RELEASE
- || n->u.map_op == OMP_MAP_DELETE)
+ else if (n->u.map.op == OMP_MAP_RELEASE
+ || n->u.map.op == OMP_MAP_DELETE)
;
else if (op == EXEC_OMP_TARGET_EXIT_DATA)
map_kind = GOMP_MAP_RELEASE;
@@ -3920,6 +3923,8 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
}
if (n->u.present_modifier)
OMP_CLAUSE_MOTION_PRESENT (node) = 1;
+ if (list == OMP_LIST_CACHE && n->u.map.readonly)
+ OMP_CLAUSE__CACHE__READONLY (node) = 1;
omp_clauses = gfc_trans_add_clause (node, omp_clauses);
}
break;
@@ -6333,7 +6338,7 @@ gfc_add_clause_implicitly (gfc_omp_clauses *clauses_out,
n2->where = n->where;
n2->sym = n->sym;
if (is_target)
- n2->u.map_op = OMP_MAP_TOFROM;
+ n2->u.map.op = OMP_MAP_TOFROM;
if (tail)
{
tail->next = n2;
diff --git a/gcc/testsuite/c-c++-common/goacc/readonly-1.c b/gcc/testsuite/c-c++-common/goacc/readonly-1.c
new file mode 100644
index 00000000000..171f96c08db
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/readonly-1.c
@@ -0,0 +1,27 @@
+/* { dg-additional-options "-fdump-tree-original" } */
+
+struct S
+{
+ int *ptr;
+ float f;
+};
+
+
+int main (void)
+{
+ int x[32];
+ struct S s = {x, 0};
+
+ #pragma acc parallel copyin(readonly: x[:32], s.ptr[:16])
+ {
+ #pragma acc cache (readonly: x[:32])
+ }
+ return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:\\*s.ptr \\\[len: 64\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: 128\\\]\\)" 1 "original" { target { c } } } } */
+/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: 64\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: 128\\\]\\)" 1 "original" { target { c++ } } } } */
+/* { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(readonly:x\\\[0\\\] \\\[len: 128\\\]\\);$" 1 "original" } } */
+
+
+
diff --git a/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90 b/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90
new file mode 100644
index 00000000000..069fec0a0d5
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90
@@ -0,0 +1,28 @@
+! { dg-additional-options "-fdump-tree-original" }
+
+subroutine foo (a, n)
+ integer :: n, a(:)
+ integer :: i, b(n)
+ !$acc parallel copyin(readonly: a(:), b(:n))
+ do i = 1,32
+ !$acc cache (readonly: a(:), b(:n))
+ enddo
+ !$acc end parallel
+end subroutine foo
+
+program main
+ integer :: i, n = 32, a(32)
+ integer :: b(32)
+ !$acc parallel copyin(readonly: a(:32), b(:n))
+ do i = 1,32
+ !$acc cache (readonly: a(:), b(:n))
+ enddo
+ !$acc end parallel
+end program main
+
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\) .+ map\\(readonly,to:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:a\\\[\\(\\(integer\\(kind=8\\)\\) parm.*data - \\(integer\\(kind=8\\)\\) &a\\) / 4\\\] \\\[len: .+\\\]\\) .+ map\\(readonly,to:b\\\[\\(\\(integer\\(kind=8\\)\\) parm.*data - \\(integer\\(kind=8\\)\\) &b\\) / 4\\\] \\\[len: .+\\\]\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(readonly:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\) \\(readonly:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\);" 2 "original" } }
+
+
+
diff --git a/gcc/tree-pretty-print.cc b/gcc/tree-pretty-print.cc
index 25d191b10fd..9604c3eecc5 100644
--- a/gcc/tree-pretty-print.cc
+++ b/gcc/tree-pretty-print.cc
@@ -905,6 +905,8 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, dump_flags_t flags)
case OMP_CLAUSE_MAP:
pp_string (pp, "map(");
+ if (OMP_CLAUSE_MAP_READONLY (clause))
+ pp_string (pp, "readonly,");
switch (OMP_CLAUSE_MAP_KIND (clause))
{
case GOMP_MAP_ALLOC:
@@ -1075,6 +1077,8 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, dump_flags_t flags)
case OMP_CLAUSE__CACHE_:
pp_string (pp, "(");
+ if (OMP_CLAUSE__CACHE__READONLY (clause))
+ pp_string (pp, "readonly:");
dump_generic_node (pp, OMP_CLAUSE_DECL (clause),
spc, flags, false);
goto print_clause_size;
diff --git a/gcc/tree.h b/gcc/tree.h
index 4c04245e2b1..1301491587f 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -1811,6 +1811,14 @@ class auto_suppress_location_wrappers
#define OMP_CLAUSE_MAP_DECL_MAKE_ADDRESSABLE(NODE) \
(OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP)->base.addressable_flag)
+/* Nonzero if OpenACC 'readonly' modifier set, used for 'copyin'. */
+#define OMP_CLAUSE_MAP_READONLY(NODE) \
+ TREE_NOTHROW (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP))
+
+/* Same as above, for use in OpenACC cache directives. */
+#define OMP_CLAUSE__CACHE__READONLY(NODE) \
+ TREE_NOTHROW (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE__CACHE_))
+
/* True on an OMP_CLAUSE_USE_DEVICE_PTR with an OpenACC 'if_present'
clause. */
#define OMP_CLAUSE_USE_DEVICE_PTR_IF_PRESENT(NODE) \
* Re: [PATCH, OpenACC 2.7, v2] readonly modifier support in front-ends
2023-08-07 13:58 ` [PATCH, OpenACC 2.7, v2] " Chung-Lin Tang
@ 2023-10-26 9:43 ` Thomas Schwinge
2024-03-07 8:02 ` Chung-Lin Tang
0 siblings, 1 reply; 18+ messages in thread
From: Thomas Schwinge @ 2023-10-26 9:43 UTC (permalink / raw)
To: Chung-Lin Tang, Tobias Burnus; +Cc: gcc-patches, Catherine Moore, fortran
Hi!
On 2023-08-07T21:58:27+0800, Chung-Lin Tang <chunglin.tang@siemens.com> wrote:
> here's the updated v2 of the readonly modifier front-end patch.
Thanks.
>>>> +++ b/gcc/c/c-parser.cc
>>>> @@ -14059,7 +14059,8 @@ c_parser_omp_variable_list (c_parser *parser,
>>>>
>>>> static tree
>>>> c_parser_omp_var_list_parens (c_parser *parser, enum omp_clause_code kind,
>>>> - tree list, bool allow_deref = false)
>>>> + tree list, bool allow_deref = false,
>>>> + bool *readonly = NULL)
>>>> ...
>>> Instead of doing this in 'c_parser_omp_var_list_parens', I think it's
>>> clearer to have this special 'readonly :' parsing logic in the two places
>>> where it's used.
> On 2023/7/20 11:08 PM, Tobias Burnus wrote:
>> I concur. [...]
>
> Okay, I've changed the C/C++ parser parts to have the parsing logic directly
> added.
These parts now look good to me, with one remark for the C front end
changes, see below.
>>>> +++ b/gcc/fortran/gfortran.h
>>>> @@ -1360,7 +1360,11 @@ typedef struct gfc_omp_namelist
>>>> {
>>>> gfc_omp_reduction_op reduction_op;
>>>> gfc_omp_depend_doacross_op depend_doacross_op;
>>>> - gfc_omp_map_op map_op;
>>>> + struct
>>>> + {
>>>> + ENUM_BITFIELD (gfc_omp_map_op) map_op:8;
>>>> + bool readonly;
>>>> + };
>>>> gfc_expr *align;
>>>> struct
>>>> {
>>> [...] Thus, the above looks good to me.
>> I concur but I wonder whether it would be cleaner to name the struct;
>> this makes it also more obvious what belongs together in the union.
>>
>> Namely, naming the struct 'map' and then changing the 45 users from
>> 'u.map_op' to 'u.map.op' and the new 'u.readonly' to 'u.map.readonly'. –
>> this seems to be cleaner.
>
> I've adjusted 'u.map' to be a named struct now, and updated the references.
I like that, thanks. (Tobias, to reduce the volume of this patch here,
please let us know if the 'map_op' -> 'map.op' mass-change should be done
separately and go into master branch already, instead of as part of this
patch.)
>>> + if (gfc_match ("readonly :") == MATCH_YES)
>>> I note this one does not have a space after ':' in 'gfc_match', but the
>>> one above in 'gfc_match_omp_clauses' does. I don't know off-hand if that
>>> makes a difference in parsing -- probably not, as all of
>>> 'gcc/fortran/openmp.cc' generally doesn't seem to be very consistent
>>> about these two variants?
>> It *does* make a difference. And for obvious reasons. You don't want to permit:
>>
>> !$acc kernels asynccopy(a)
>>
>> but require at least one space (or comma) between "async" and "copy".
>> (In fixed form Fortran, it would be fine - as would be "!$acc k e rnelsasy nc co p y(a)".)
>>
>> A " " matches zero or more whitespaces, but with gfc_match_space you can find out
>> whether there was whitespace or not.
OK, I generally follow -- but does this rationale also apply in this case
here, concerning space after ':'?
> Okay, made sure both are 'gfc_match ("readonly : ")'. Thanks for catching that, didn't
> realize that space was significant.
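To make the "a space matches zero or more whitespace characters" rule concrete, here is a toy matcher. It is only a sketch of that one rule, not the gfortran implementation; 'toy_match' is a made-up name and none of the other 'gfc_match' format features are modelled.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>

/* Toy matcher: a ' ' in PATTERN matches zero or more whitespace
   characters in INPUT, any other character must match literally.
   Returns the number of input characters consumed, or -1 if there is
   no match.  */
static int
toy_match (const char *pattern, const char *input)
{
  const char *in = input;
  for (const char *p = pattern; *p; p++)
    {
      if (*p == ' ')
        while (*in && isspace ((unsigned char) *in))
          in++;
      else if (*in == *p)
        in++;
      else
        return -1;
    }
  return (int) (in - input);
}
```

In this toy, "readonly :" and "readonly : " accept the same inputs; the trailing space only changes how much whitespace after the ':' is consumed, that is, where the next match would start.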
>>>> +++ b/gcc/tree.h
>>>> @@ -1813,6 +1813,14 @@ class auto_suppress_location_wrappers
>>>> #define OMP_CLAUSE_MAP_DECL_MAKE_ADDRESSABLE(NODE) \
>>>> (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP)->base.addressable_flag)
>>>>
>>>> +/* Nonzero if OpenACC 'readonly' modifier set, used for 'copyin'. */
>>>> +#define OMP_CLAUSE_MAP_READONLY(NODE) \
>>>> + TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP))
>>>> +
>>>> +/* Same as above, for use in OpenACC cache directives. */
>>>> +#define OMP_CLAUSE__CACHE__READONLY(NODE) \
>>>> + TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE__CACHE_))
>>> I'm not sure if these special accessor functions are actually useful, or
>>> we should just directly use 'TREE_READONLY' instead? We're only using
>>> them in contexts where it's clear that the 'OMP_CLAUSE_SUBCODE_CHECK' is
>>> satisfied, for example.
>> I find directly using TREE_READONLY confusing.
>
> FWIW, I've changed to use TREE_NOTHROW instead, if it can give a better sense of safety :P
I don't understand that, why not use 'TREE_READONLY'?
> I think there's a misunderstanding here anyways: we are not relying on a DECL marked
> TREE_READONLY here. We merely need the OMP_CLAUSE_MAP to be marked as OMP_CLAUSE_MAP_READONLY == 1.
Yes, I understand that. My question was why we don't just use
'TREE_READONLY (c)', where 'c' is the
'OMP_CLAUSE_MAP'/'OMP_CLAUSE__CACHE_' clause (not its decl), and avoid
the indirection through
'#define OMP_CLAUSE_MAP_READONLY'/'#define OMP_CLAUSE__CACHE__READONLY',
given that we're only using them in contexts where it's clear that the
'OMP_CLAUSE_SUBCODE_CHECK' is satisfied. I don't have a strong
preference, though.
Either way, you still need to document this:
| Also, for the new use for OMP clauses, update 'gcc/tree.h:TREE_READONLY',
| and in 'gcc/tree-core.h' for 'readonly_flag' the
| "table lists the uses of each of the above flags".
Then, my idea of "Setting 'TREE_READONLY' of the 'OMP_CLAUSE_DECL'
instead of the clause itself" was just that: an idea, so if you conclude
that doesn't make sense, don't follow it further. In particular, Tobias
said:
| In particular, wouldn't the following cause issues, if you mark 'a' as TREE_READONLY?
|
| int a;
| #pragma acc parallel copyin(readonly : a)
| {...}
| a = 5;
|
| > Or, early in the middle end, propagate 'TREE_READONLY' from the clause to
| > its 'OMP_CLAUSE_DECL'? Might need to 'unshare_expr' the latter for
| > modification and use in the associated region only?
|
| Unsharing a tree would surely help – but it is still ugly and, for
| declarations, unshare_expr does not create a copy!
Aha, my thinking was that we'd have a separate decl inside the compute
region, that is, the host-side 'a' not affected by the 'readonly'
modifier, and thus host-side 'a = 5;' continue to work as expected.
But you're of course right: we cannot set 'TREE_READONLY' early (front
end, before OMP function split off), for the very reason you've cited.
So we definitely need a separate flag, and then it's probably easier
(less invasive) to have it on the clause instead of its decl. (... as
you've implemented.)
As I said:
| Just some quick thoughts, obviously without any detailed analysis. ;-)
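The decl-versus-clause point can be sketched with a toy model. These are made-up types ('toy_decl', 'toy_clause'), not GCC trees; the only assumption is that one decl object is shared between the host code and the region's clause.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for a decl and a map clause; not GCC's tree types.  */
struct toy_decl { const char *name; bool readonly; };
struct toy_clause { struct toy_decl *decl; bool map_readonly; };

/* A store to DECL is rejected if the decl itself is marked read-only.  */
static bool
toy_store_allowed (const struct toy_decl *decl)
{
  return !decl->readonly;
}

/* Variant 1: mark the shared decl.  Host-side code after the region
   refers to the same decl, so it is affected as well.  */
static void
toy_mark_decl (struct toy_clause *c)
{
  c->decl->readonly = true;
}

/* Variant 2: mark only the clause.  The decl, and with it host-side
   code such as 'a = 5;', stays untouched.  */
static void
toy_mark_clause (struct toy_clause *c)
{
  c->map_readonly = true;
}
```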
Another thing, I did wonder: there are cases where for one source-level
OpenACC clause we synthesize several actual clauses (in the front ends,
but possibly also during gimplification?). Do we understand how such
additionally synthesized clauses react to an original clause's 'readonly'
modifier (that is, do they get it propagated, do they also get
'OMP_CLAUSE_MAP_READONLY'/'OMP_CLAUSE__CACHE__READONLY' set, or not?),
and test cases to verify/document that?
Later I found that's part of your follow-on
"[PATCH, OpenACC 2.7] Connect readonly modifier to points-to analysis", as you've
also written here:
> The other points-to patch then (also in front-ends) takes the OMP_CLAUSE_MAP_READONLY
> to mark the clauses of "base-pointers of array-sections" as OMP_CLAUSE_MAP_POINTS_TO_READONLY,
> and later this gradually gets relayed to alias oracle routines in tree-ssa-alias.cc
> Re-tested this v2 patch on powerpc64le-linux/nvptx. Okay for trunk?
In addition to a few individual comments above and below, you've also not
yet responded to my requests re test cases.
> --- a/gcc/c/c-parser.cc
> +++ b/gcc/c/c-parser.cc
> @@ -14084,7 +14084,11 @@ c_parser_omp_var_list_parens (c_parser *parser, enum omp_clause_code kind,
> OpenACC 2.6:
> no_create ( variable-list )
> attach ( variable-list )
> - detach ( variable-list ) */
> + detach ( variable-list )
> +
> + OpenACC 2.7:
> + copyin (readonly : variable-list )
> + */
>
> static tree
> c_parser_oacc_data_clause (c_parser *parser, pragma_omp_clause c_kind,
> @@ -14135,11 +14139,36 @@ c_parser_oacc_data_clause (c_parser *parser, pragma_omp_clause c_kind,
> default:
> gcc_unreachable ();
> }
> - tree nl, c;
> - nl = c_parser_omp_var_list_parens (parser, OMP_CLAUSE_MAP, list, true);
>
> - for (c = nl; c != list; c = OMP_CLAUSE_CHAIN (c))
> - OMP_CLAUSE_SET_MAP_KIND (c, kind);
> + tree nl = list;
> + bool readonly = false;
> + matching_parens parens;
> + if (parens.require_open (parser))
> + {
> + /* Turn on readonly modifier parsing for copyin clause. */
> + if (c_kind == PRAGMA_OACC_CLAUSE_COPYIN)
> + {
> + c_token *token = c_parser_peek_token (parser);
> + if (token->type == CPP_NAME
> + && !strcmp (IDENTIFIER_POINTER (token->value), "readonly")
> + && c_parser_peek_2nd_token (parser)->type == CPP_COLON)
> + {
> + c_parser_consume_token (parser);
> + c_parser_consume_token (parser);
> + readonly = true;
> + }
> + }
> + location_t loc = c_parser_peek_token (parser)->location;
I suppose 'loc' here now points to after the opening '(' or after the
'readonly :'? This is different from what 'c_parser_omp_var_list_parens'
does, and indeed, 'c_parser_omp_variable_list' states that "CLAUSE_LOC is
the location of the clause", not the location of the variable-list? As
this, I suppose, may change diagnostics, please restore the original
behavior. (This appears to be different in the C++ front end, huh.)
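A toy sketch of the ordering concern: record the location before any token is consumed, then do the two-token lookahead for 'readonly :'. The token types and 'toy_parse_list_parens' are made up for illustration, and the toy simply assumes that "the location of the clause" is whatever token is next when the function is entered.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy token stream; 'col' plays the role of a location_t.  The stream
   must be terminated by a TOY_EOF token.  */
enum toy_type { TOY_NAME, TOY_OPEN, TOY_COLON, TOY_CLOSE, TOY_EOF };
struct toy_token { enum toy_type type; const char *text; int col; };

struct toy_result { int clause_col; int list_col; bool readonly; };

/* Parse "( [readonly :] ..." starting at TOKS[0].  */
static struct toy_result
toy_parse_list_parens (const struct toy_token *toks)
{
  /* Record the location first, before any token is consumed.  */
  struct toy_result r = { toks[0].col, -1, false };
  int i = 0;
  if (toks[i].type != TOY_OPEN)
    return r;
  i++;
  /* Two-token lookahead: only treat 'readonly' as a modifier if a ':'
     follows, so a variable named 'readonly' still parses as a variable.  */
  if (toks[i].type == TOY_NAME
      && strcmp (toks[i].text, "readonly") == 0
      && toks[i + 1].type == TOY_COLON)
    {
      i += 2;
      r.readonly = true;
    }
  /* Location of the first token of the variable list.  */
  r.list_col = toks[i].col;
  return r;
}
```

With the modifier present, 'clause_col' and 'list_col' differ by the width of '(readonly :', which is the kind of shift in diagnostics locations the comment above is worried about.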
> + nl = c_parser_omp_variable_list (parser, loc, OMP_CLAUSE_MAP, list, true);
> + parens.skip_until_found_close (parser);
> + }
> +
> + for (tree c = nl; c != list; c = OMP_CLAUSE_CHAIN (c))
> + {
> + OMP_CLAUSE_SET_MAP_KIND (c, kind);
> + if (readonly)
> + OMP_CLAUSE_MAP_READONLY (c) = 1;
> + }
>
> return nl;
> }
> @@ -18161,15 +18190,40 @@ c_parser_omp_structured_block (c_parser *parser, bool *if_p)
> /* OpenACC 2.0:
> # pragma acc cache (variable-list) new-line
>
> + OpenACC 2.7:
> + # pragma acc cache (readonly: variable-list) new-line
> +
> LOC is the location of the #pragma token.
> */
>
> static tree
> c_parser_oacc_cache (location_t loc, c_parser *parser)
> {
> - tree stmt, clauses;
> + tree stmt, clauses = NULL_TREE;
> + bool readonly = false;
> + matching_parens parens;
> +
> + if (parens.require_open (parser))
> + {
> + c_token *token = c_parser_peek_token (parser);
> + if (token->type == CPP_NAME
> + && !strcmp (IDENTIFIER_POINTER (token->value), "readonly")
> + && c_parser_peek_2nd_token (parser)->type == CPP_COLON)
> + {
> + c_parser_consume_token (parser);
> + c_parser_consume_token (parser);
> + readonly = true;
> + }
> + location_t loc = c_parser_peek_token (parser)->location;
Similar. (That is, here, location of the directive.)
> + clauses = c_parser_omp_variable_list (parser, loc, OMP_CLAUSE__CACHE_,
> + NULL_TREE);
> + parens.skip_until_found_close (parser);
> + }
> +
> + if (readonly)
> + for (tree c = clauses; c; c = OMP_CLAUSE_CHAIN (c))
> + OMP_CLAUSE__CACHE__READONLY (c) = 1;
>
> - clauses = c_parser_omp_var_list_parens (parser, OMP_CLAUSE__CACHE_, NULL);
> clauses = c_finish_omp_clauses (clauses, C_ORT_ACC);
>
> c_parser_skip_to_pragma_eol (parser);
> --- a/gcc/fortran/openmp.cc
> +++ b/gcc/fortran/openmp.cc
> @@ -1197,7 +1197,7 @@ omp_inv_mask::omp_inv_mask (const omp_mask &m) : omp_mask (m)
>
> static bool
> gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op,
> - bool allow_common, bool allow_derived)
> + bool allow_common, bool allow_derived, bool readonly = false)
> {
> gfc_omp_namelist **head = NULL;
> if (gfc_match_omp_variable_list ("", list, allow_common, NULL, &head, true,
> @@ -1206,7 +1206,10 @@ gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op,
> {
> gfc_omp_namelist *n;
> for (n = *head; n; n = n->next)
> - n->u.map_op = map_op;
> + {
> + n->u.map.op = map_op;
> + n->u.map.readonly = readonly;
> + }
> return true;
> }
Didn't we conclude that "not doing it here is cleaner" (Tobias' words),
and instead do this "Similar to 'c_parser_omp_var_list_parens'" (my
words)? That is, not add the 'bool readonly' formal parameter to
'gfc_match_omp_map_clause'.
(..., but don't do the 'OMP_MAP_TO_READONLY' way that I considered, but
instead keep the 'readonly' flag.)
Regards
Thomas
-----------------
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634 München; limited liability company; Managing Directors: Thomas Heurung, Frank Thürauf; Registered office: München; Registration court: München, HRB 106955
* Re: [PATCH, OpenACC 2.7] Connect readonly modifier to points-to analysis
2023-07-25 15:52 ` [PATCH, OpenACC 2.7] Connect readonly modifier to points-to analysis Chung-Lin Tang
@ 2023-10-27 14:28 ` Thomas Schwinge
2023-10-30 12:46 ` Richard Biener
0 siblings, 1 reply; 18+ messages in thread
From: Thomas Schwinge @ 2023-10-27 14:28 UTC (permalink / raw)
To: Chung-Lin Tang, Richard Biener
Cc: gcc-patches, fortran, Catherine Moore, Tobias Burnus
Hi!
Richard, as the original author of 'SSA_NAME_POINTS_TO_READONLY_MEMORY':
2018 commit 6214d5c7e7470bdd5ecbeae668c2522551bfebbc (Subversion r263958)
"Move const_parm trick to generic code"; 'gcc/tree.h':
/* Nonzero if this SSA_NAME is known to point to memory that may not
be written to. This is set for default defs of function parameters
that have a corresponding r or R specification in the functions
fn spec attribute. This is used by alias analysis. */
#define SSA_NAME_POINTS_TO_READONLY_MEMORY(NODE) \
SSA_NAME_CHECK (NODE)->base.deprecated_flag
..., may I ask you to please help review the following patch
(full-quoted)?
For context: this patch here ("second patch") depends on a first patch:
<inbox.sourceware.org/d0e6013f-ca38-b98d-dc01-b30adbd5901a@siemens.com>
"[PATCH, OpenACC 2.7] readonly modifier support in front-ends". That one
is still under review/rework; so you're not able to apply this second
patch here.
In a nutshell: a 'readonly' modifier has been added to the OpenACC
'copyin' clause (copy host to device memory, don't copy back at end of
region):
| If the optional 'readonly' modifier appears, then the implementation may assume that the data
| referenced by _var-list_ is never written to within the applicable region.
That is, for example (untested):
#pragma acc routine
void escape(int *);
int x[32] = [...];
#pragma acc parallel copyin(readonly: x)
{
int a1 = x[3];
escape(x);
int a2 = x[3]; // Per 'readonly', don't need to reload 'x[3]' here.
//x[22] = 0; // Invalid -- but no diagnostic mandated.
}
What Chung-Lin's first patch does is mark the OMP clause for 'x' (not the
'x' decl itself!) as 'readonly', via a new 'OMP_CLAUSE_MAP_READONLY'
flag.
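The intended effect on the example can be modelled with a toy redundancy check. This is not GCC's FRE or alias oracle, just a sketch under simple assumptions: straight-line code, one array with at most 32 elements, and calls that are fully opaque; all 'toy_*' names are made up.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy straight-line IR: loads from one array and opaque calls.
   'index' is only used for loads and must be in the range 0..31.  */
enum toy_op { TOY_LOAD, TOY_CALL };
struct toy_insn { enum toy_op op; int index; };

/* Count how many loads must really be executed.  A load is redundant if
   the same index was loaded before and nothing since may have written
   to the array.  An opaque call may write to the array unless the array
   is known to be read-only in this region.  */
static int
toy_count_loads (const struct toy_insn *code, int n, bool array_readonly)
{
  bool available[32] = { false };   /* index -> value already loaded */
  int loads = 0;
  for (int i = 0; i < n; i++)
    {
      if (code[i].op == TOY_CALL)
        {
          /* Without 'readonly', the call clobbers everything we know.  */
          if (!array_readonly)
            for (int j = 0; j < 32; j++)
              available[j] = false;
        }
      else if (!available[code[i].index])
        {
          available[code[i].index] = true;
          loads++;
        }
    }
  return loads;
}
```

For the sequence "load x[3]; call escape; load x[3]", the toy needs both loads without the read-only assumption and only the first one with it.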
The actual optimization then is done in this second patch. Chung-Lin
found that he could use 'SSA_NAME_POINTS_TO_READONLY_MEMORY' for that.
I don't have much experience with most of the following generic code, so
would appreciate a helping hand, whether that conceptually makes sense as
well as from the implementation point of view:
On 2023-07-25T23:52:06+0800, Chung-Lin Tang via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> On 2023/7/11 2:33 AM, Chung-Lin Tang via Gcc-patches wrote:
>> As we discussed earlier, the work for actually linking this to middle-end
>> points-to analysis is a somewhat non-trivial issue. This first patch allows
>> the language feature to be used in OpenACC directives first (with no effect for now).
>> The middle-end changes are probably going to be a later patch.
>
> This second patch tries to link the readonly modifier to points-to analysis.
>
> There already exists SSA_NAME_POINTS_TO_READONLY_MEMORY and its support in the
> alias oracle routines in tree-ssa-alias.cc, so basically what this patch does is
> try to make the variables holding the array section base pointers to have this
> flag set.
>
> There is another OMP_CLAUSE_MAP_POINTS_TO_READONLY set by front-ends on the
> associated pointer clauses if OMP_CLAUSE_MAP_READONLY is set.
> Also a DECL_POINTS_TO_READONLY flag is set for VAR_DECLs when creating the tmp
> vars carrying these receiver references on the offloaded side. These
> eventually get translated to SSA_NAME_POINTS_TO_READONLY_MEMORY.
> This still doesn't always work as expected in terms of optimization:
> struct pointer fields and Fortran arrays (kind of like C structs) which have
> several accesses to create the pointer access on the receive/offloaded side,
> and SRA appears to not work on these sequences, so gets in the way of much
> redundancy elimination.
Do I understand correctly that this is left as future work? Please add the
test cases you have, XFAILed in some reasonable way.
> Currently have one testcase where we can demonstrate 'readonly' can avoid
> a clobber by function call.
:-)
> --- a/gcc/c/c-typeck.cc
> +++ b/gcc/c/c-typeck.cc
> @@ -14258,6 +14258,8 @@ handle_omp_array_sections (tree c, enum c_omp_region_type ort)
> OMP_CLAUSE_SET_MAP_KIND (c2, GOMP_MAP_ATTACH_DETACH);
> else
> OMP_CLAUSE_SET_MAP_KIND (c2, GOMP_MAP_FIRSTPRIVATE_POINTER);
> + if (OMP_CLAUSE_MAP_READONLY (c))
> + OMP_CLAUSE_MAP_POINTS_TO_READONLY (c2) = 1;
> OMP_CLAUSE_MAP_IMPLICIT (c2) = OMP_CLAUSE_MAP_IMPLICIT (c);
> if (OMP_CLAUSE_MAP_KIND (c2) != GOMP_MAP_FIRSTPRIVATE_POINTER
> && !c_mark_addressable (t))
> --- a/gcc/cp/semantics.cc
> +++ b/gcc/cp/semantics.cc
> @@ -5872,6 +5872,8 @@ handle_omp_array_sections (tree c, enum c_omp_region_type ort)
> }
> else
> OMP_CLAUSE_SET_MAP_KIND (c2, GOMP_MAP_FIRSTPRIVATE_POINTER);
> + if (OMP_CLAUSE_MAP_READONLY (c))
> + OMP_CLAUSE_MAP_POINTS_TO_READONLY (c2) = 1;
> OMP_CLAUSE_MAP_IMPLICIT (c2) = OMP_CLAUSE_MAP_IMPLICIT (c);
> if (OMP_CLAUSE_MAP_KIND (c2) != GOMP_MAP_FIRSTPRIVATE_POINTER
> && !cxx_mark_addressable (t))
> --- a/gcc/fortran/trans-openmp.cc
> +++ b/gcc/fortran/trans-openmp.cc
> @@ -2524,6 +2524,8 @@ gfc_trans_omp_array_section (stmtblock_t *block, gfc_exec_op op,
> node3 = build_omp_clause (input_location, OMP_CLAUSE_MAP);
> OMP_CLAUSE_SET_MAP_KIND (node3, ptr_kind);
> OMP_CLAUSE_DECL (node3) = gfc_conv_descriptor_data_get (decl);
> + if (n->u.readonly)
> + OMP_CLAUSE_MAP_POINTS_TO_READONLY (node3) = 1;
> /* This purposely does not include GOMP_MAP_ALWAYS_POINTER. The extra
> cast prevents gimplify.cc from recognising it as being part of the
> struct - and adding an 'alloc: for the 'desc.data' pointer, which
> @@ -2559,6 +2561,8 @@ gfc_trans_omp_array_section (stmtblock_t *block, gfc_exec_op op,
> OMP_CLAUSE_MAP);
> OMP_CLAUSE_SET_MAP_KIND (node3, ptr_kind);
> OMP_CLAUSE_DECL (node3) = decl;
> + if (n->u.readonly)
> + OMP_CLAUSE_MAP_POINTS_TO_READONLY (node3) = 1;
> }
Could combine these two into one, after
'if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (decl)))' reconverges here, like
where 'OMP_CLAUSE_SIZE (node3)' is set:
> ptr2 = fold_convert (ptrdiff_type_node, ptr2);
> OMP_CLAUSE_SIZE (node3) = fold_build2 (MINUS_EXPR, ptrdiff_type_node,
Is 'n->u.readonly == OMP_CLAUSE_MAP_READONLY (node)'? If yes, would the
latter be clearer to use as the 'if' expression (like in C, C++ front
ends)?
I see further additional 'OMP_CLAUSE_MAP' clauses synthesized, for
example in 'gcc/cp/semantics.cc:handle_omp_array_sections', or
'gcc/fortran/trans-openmp.cc:gfc_trans_omp_array_section', also
'gcc/gimplify.cc'. I assume these are not relevant to have
'OMP_CLAUSE_MAP_READONLY' -> 'OMP_CLAUSE_MAP_POINTS_TO_READONLY'
propagated? Actually, per your changes (see below), there is one
'OMP_CLAUSE_MAP_POINTS_TO_READONLY' propagation in
'gcc/gimplify.cc:build_omp_struct_comp_nodes'.
Is the current situation re flag setting/propagation what was empirically
necessary to make the test case work, or is it a systematic review? (The
former is fine; I'd just like to know.)
> --- a/gcc/gimple-expr.cc
> +++ b/gcc/gimple-expr.cc
> @@ -376,6 +376,8 @@ copy_var_decl (tree var, tree name, tree type)
> DECL_CONTEXT (copy) = DECL_CONTEXT (var);
> TREE_USED (copy) = 1;
> DECL_SEEN_IN_BIND_EXPR_P (copy) = 1;
> + if (VAR_P (var))
> + DECL_POINTS_TO_READONLY (copy) = DECL_POINTS_TO_READONLY (var);
> DECL_ATTRIBUTES (copy) = DECL_ATTRIBUTES (var);
> if (DECL_USER_ALIGN (var))
> {
> --- a/gcc/gimplify.cc
> +++ b/gcc/gimplify.cc
> @@ -221,6 +221,7 @@ struct gimplify_omp_ctx
> splay_tree variables;
> hash_set<tree> *privatized_types;
> tree clauses;
> + hash_set<tree_operand_hash> *pt_readonly_ptrs;
> /* Iteration variables in an OMP_FOR. */
> vec<tree> loop_iter_var;
> location_t location;
> @@ -628,6 +629,15 @@ internal_get_tmp_var (tree val, gimple_seq *pre_p, gimple_seq *post_p,
> gimplify_expr (&val, pre_p, post_p, is_gimple_reg_rhs_or_call,
> fb_rvalue);
>
> + bool pt_readonly = false;
> + if (gimplify_omp_ctxp && gimplify_omp_ctxp->pt_readonly_ptrs)
> + {
> + tree ptr = val;
> + if (TREE_CODE (ptr) == POINTER_PLUS_EXPR)
> + ptr = TREE_OPERAND (ptr, 0);
> + pt_readonly = gimplify_omp_ctxp->pt_readonly_ptrs->contains (ptr);
> + }
'POINTER_PLUS_EXPR' is the only special thing we may run into, here?
(Generally, I prefer 'if', 'else if', [...], 'else gcc_unreachable ()'.)
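A toy sketch of that exhaustive style, with made-up expression nodes rather than GCC trees and 'abort ()' standing in for 'gcc_unreachable ()'. Which forms actually need to be handled at this point in the gimplifier is exactly the open question; the toy just lists two.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy expression nodes; not GCC trees.  */
enum toy_code { TOY_VAR, TOY_POINTER_PLUS };
struct toy_expr { enum toy_code code; struct toy_expr *op0; int offset; };

/* Return the pointer a value is based on.  Every expected form is
   listed explicitly; anything else aborts instead of being silently
   accepted.  */
static struct toy_expr *
toy_base_pointer (struct toy_expr *val)
{
  if (val->code == TOY_VAR)
    return val;
  else if (val->code == TOY_POINTER_PLUS)
    return val->op0;
  else
    abort ();   /* plays the role of gcc_unreachable () */
}
```

The benefit is that an unanticipated expression form shows up as an immediate failure rather than as a silently missed lookup in the set.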
> +
> if (allow_ssa
> && gimplify_ctxp->into_ssa
> && is_gimple_reg_type (TREE_TYPE (val)))
> @@ -639,9 +649,18 @@ internal_get_tmp_var (tree val, gimple_seq *pre_p, gimple_seq *post_p,
> if (name)
> SET_SSA_NAME_VAR_OR_IDENTIFIER (t, create_tmp_var_name (name));
> }
> + if (pt_readonly)
> + SSA_NAME_POINTS_TO_READONLY_MEMORY (t) = 1;
> }
> else
> - t = lookup_tmp_var (val, is_formal, not_gimple_reg);
> + {
> + t = lookup_tmp_var (val, is_formal, not_gimple_reg);
> + if (pt_readonly)
> + {
> + DECL_POINTS_TO_READONLY (t) = 1;
> + gimplify_omp_ctxp->pt_readonly_ptrs->add (t);
> + }
> + }
>
> mod = build2 (INIT_EXPR, TREE_TYPE (t), t, unshare_expr (val));
>
> @@ -8906,6 +8925,8 @@ build_omp_struct_comp_nodes (enum tree_code code, tree grp_start, tree grp_end,
> OMP_CLAUSE_SET_MAP_KIND (c2, mkind);
> OMP_CLAUSE_DECL (c2) = unshare_expr (OMP_CLAUSE_DECL (grp_end));
> OMP_CLAUSE_CHAIN (c2) = NULL_TREE;
> + if (OMP_CLAUSE_MAP_POINTS_TO_READONLY (grp_end))
> + OMP_CLAUSE_MAP_POINTS_TO_READONLY (c2) = 1;
> tree grp_mid = NULL_TREE;
> if (OMP_CLAUSE_CHAIN (grp_start) != grp_end)
> grp_mid = OMP_CLAUSE_CHAIN (grp_start);
For my understanding, is this empirically necessary, or a systematic
review?
> @@ -11741,6 +11762,16 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
>
> gimplify_omp_ctxp = outer_ctx;
> }
> + else if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
> + && (code == OACC_PARALLEL
> + || code == OACC_KERNELS
> + || code == OACC_SERIAL)
> + && OMP_CLAUSE_MAP_POINTS_TO_READONLY (c))
> + {
> + if (ctx->pt_readonly_ptrs == NULL)
> + ctx->pt_readonly_ptrs = new hash_set<tree_operand_hash> ();
> + ctx->pt_readonly_ptrs->add (OMP_CLAUSE_DECL (c));
> + }
> if (notice_outer)
> goto do_notice;
> break;
Also need to 'delete ctx->pt_readonly_ptrs;' somewhere.
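The pairing being asked for, lazy allocation on first use plus release when the context goes away, looks roughly like this in a toy C model. The names are made up and a fixed-size array stands in for the hash set; where exactly the release belongs in 'gcc/gimplify.cc' is not shown here.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy context with a lazily allocated set (here just a small array).  */
struct toy_set { const void *items[16]; int count; };
struct toy_ctx { struct toy_set *pt_readonly_ptrs; };

/* Create a context; the set pointer starts out NULL.  */
static struct toy_ctx *
toy_new_ctx (void)
{
  return calloc (1, sizeof (struct toy_ctx));
}

/* Allocate the set on first use, then record PTR in it.  Entries beyond
   the toy capacity of 16 are silently dropped.  */
static void
toy_add_readonly_ptr (struct toy_ctx *ctx, const void *ptr)
{
  if (ctx->pt_readonly_ptrs == NULL)
    ctx->pt_readonly_ptrs = calloc (1, sizeof (struct toy_set));
  struct toy_set *s = ctx->pt_readonly_ptrs;
  if (s != NULL && s->count < 16)
    s->items[s->count++] = ptr;
}

/* Release the set together with the context, whether or not it was
   ever allocated.  */
static void
toy_delete_ctx (struct toy_ctx *ctx)
{
  free (ctx->pt_readonly_ptrs);   /* free (NULL) is a no-op */
  free (ctx);
}
```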
> --- a/gcc/omp-low.cc
> +++ b/gcc/omp-low.cc
> @@ -14098,6 +14098,8 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
> if (ref_to_array)
> x = fold_convert_loc (clause_loc, TREE_TYPE (new_var), x);
> gimplify_expr (&x, &new_body, NULL, is_gimple_val, fb_rvalue);
> + if (OMP_CLAUSE_MAP_POINTS_TO_READONLY (c) && VAR_P (x))
> + DECL_POINTS_TO_READONLY (x) = 1;
> if ((is_ref && !ref_to_array)
> || ref_to_ptr)
> {
This is in the middle of the
"Handle GOMP_MAP_FIRSTPRIVATE_{POINTER,REFERENCE} in second pass" code
block. Again, for my understanding, is this empirically necessary, or a
systematic review?
> --- a/gcc/testsuite/c-c++-common/goacc/readonly-1.c
> +++ b/gcc/testsuite/c-c++-common/goacc/readonly-1.c
> @@ -19,8 +19,8 @@ int main (void)
> return 0;
> }
>
> -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:\\*s.ptr \\\[len: 64\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: 128\\\]\\)" 1 "original" { target { c } } } } */
> -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: 64\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: 128\\\]\\)" 1 "original" { target { c++ } } } } */
> +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:\\*s.ptr \\\[len: 64\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: 128\\\]\\) map\\(pt_readonly,firstprivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 "original" { target { c } } } } */
> +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: 64\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: 128\\\]\\) map\\(pt_readonly,firstprivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 "original" { target { c++ } } } } */
> /* { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(readonly:x\\\[0\\\] \\\[len: 128\\\]\\);$" 1 "original" } } */
I suppose the new 'map(pt_readonly,attach_detach:s.ptr [bias: 0])' clause
was previously "hidden" in '.+'? Please then change that in the first
patch "[PATCH, OpenACC 2.7] readonly modifier support in front-ends", so
that we can see here what actually is changing (only 'pt_readonly', I
suppose).
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/goacc/readonly-2.c
> @@ -0,0 +1,15 @@
> +/* { dg-additional-options "-O -fdump-tree-fre" } */
> +
> +#pragma acc routine
> +extern void foo (int *ptr, int val);
> +
> +int main (void)
> +{
> + int r, a[32];
> + #pragma acc parallel copyin(readonly: a[:32]) copyout(r)
> + {
> + foo (a, a[8]);
> + r = a[8];
> + }
> +}
> +/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ = MEM\\\[\[^_\]+_\[0-9\]+\\(ptro\\)\\\]\\\[8\\\];" 1 "fre1" } } */
Please add a comment why 'fre1', and what generally is being checked
here; that's not obvious to the casual reader. (That is, me in a few
weeks.) ;-)
Also add a scan for "before the optimization": two 'MEM's, I suppose?
> --- a/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90
> +++ b/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90
> @@ -20,8 +20,8 @@ program main
> !$acc end parallel
> end program main
>
> -! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\) .+ map\\(readonly,to:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\)" 1 "original" } }
> -! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:a\\\[\\(\\(integer\\(kind=8\\)\\) parm.*data - \\(integer\\(kind=8\\)\\) &a\\) / 4\\\] \\\[len: .+\\\]\\) .+ map\\(readonly,to:b\\\[\\(\\(integer\\(kind=8\\)\\) parm.*data - \\(integer\\(kind=8\\)\\) &b\\) / 4\\\] \\\[len: .+\\\]\\)" 1 "original" } }
> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\) map\\(pt_readonly,alloc:a.0 \\\[pointer assign, bias: \\(integer\\(kind=8\\)\\) parm.*data - \\(integer\\(kind=8\\)\\) a.0\\\]\\) map\\(readonly,to:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\) map\\(pt_readonly,alloc:b \\\[pointer assign, bias: \\(integer\\(kind=8\\)\\) parm.*data - \\(integer\\(kind=8\\)\\) b\\\]\\)" 1 "original" } }
> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:a\\\[\\(\\(integer\\(kind=8\\)\\) parm.*data - \\(integer\\(kind=8\\)\\) &a\\) / 4\\\] \\\[len: .+\\\]\\) map\\(pt_readonly,alloc:a \\\[pointer assign, bias: \\(integer\\(kind=8\\)\\) parm.*data - \\(integer\\(kind=8\\)\\) &a\\\]\\) map\\(readonly,to:b\\\[\\(\\(integer\\(kind=8\\)\\) parm.*data - \\(integer\\(kind=8\\)\\) &b\\) / 4\\\] \\\[len: .+\\\]\\) map\\(pt_readonly,alloc:b \\\[pointer assign, bias: \\(integer\\(kind=8\\)\\) parm.*data - \\(integer\\(kind=8\\)\\) &b\\\]\\)" 1 "original" } }
> ! { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(readonly:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\) \\(readonly:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\);" 2 "original" } }
Same comment as for 'c-c++-common/goacc/readonly-1.c'.
> --- a/gcc/tree-pretty-print.cc
> +++ b/gcc/tree-pretty-print.cc
> @@ -907,6 +907,8 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, dump_flags_t flags)
> pp_string (pp, "map(");
> if (OMP_CLAUSE_MAP_READONLY (clause))
> pp_string (pp, "readonly,");
> + if (OMP_CLAUSE_MAP_POINTS_TO_READONLY (clause))
> + pp_string (pp, "pt_readonly,");
> switch (OMP_CLAUSE_MAP_KIND (clause))
> {
> case GOMP_MAP_ALLOC:
> @@ -3436,6 +3438,8 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags,
> pp_string (pp, "(D)");
> if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (node))
> pp_string (pp, "(ab)");
> + if (SSA_NAME_POINTS_TO_READONLY_MEMORY (node))
> + pp_string (pp, "(ptro)");
> break;
>
> case WITH_SIZE_EXPR:
> --- a/gcc/tree-ssanames.cc
> +++ b/gcc/tree-ssanames.cc
> @@ -402,6 +402,9 @@ make_ssa_name_fn (struct function *fn, tree var, gimple *stmt,
> else
> SSA_NAME_RANGE_INFO (t) = NULL;
>
> + if (VAR_P (var) && DECL_POINTS_TO_READONLY (var))
> + SSA_NAME_POINTS_TO_READONLY_MEMORY (t) = 1;
> +
> SSA_NAME_IN_FREE_LIST (t) = 0;
> SSA_NAME_IS_DEFAULT_DEF (t) = 0;
> init_ssa_name_imm_use (t);
> --- a/gcc/tree.h
> +++ b/gcc/tree.h
> @@ -1021,6 +1021,13 @@ extern void omp_clause_range_check_failed (const_tree, const char *, int,
> #define DECL_HIDDEN_STRING_LENGTH(NODE) \
> (TREE_CHECK (NODE, PARM_DECL)->decl_common.decl_nonshareable_flag)
>
> +/* In a VAR_DECL, set for variables regarded as pointing to memory not written
> + to. SSA_NAME_POINTS_TO_READONLY_MEMORY gets set for SSA_NAMEs created from
> + such VAR_DECLs. Currently used by OpenACC 'readonly' modifier in copyin
> + clauses. */
> +#define DECL_POINTS_TO_READONLY(NODE) \
> + (TREE_CHECK (NODE, VAR_DECL)->decl_common.decl_not_flexarray)
> +
> /* In a CALL_EXPR, means that the call is the jump from a thunk to the
> thunked-to function. Be careful to avoid using this macro when one of the
> next two applies instead. */
> @@ -1815,6 +1822,10 @@ class auto_suppress_location_wrappers
> #define OMP_CLAUSE_MAP_READONLY(NODE) \
> TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP))
>
> +/* Set if 'OMP_CLAUSE_DECL (NODE)' points to read-only memory. */
> +#define OMP_CLAUSE_MAP_POINTS_TO_READONLY(NODE) \
> + TREE_CONSTANT (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP))
> +
> /* Same as above, for use in OpenACC cache directives. */
> #define OMP_CLAUSE__CACHE__READONLY(NODE) \
> TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE__CACHE_))
As in my "[PATCH, OpenACC 2.7] readonly modifier support in front-ends"
review, please document how certain flags are used for OMP clauses.
I note you're not actually using 'OMP_CLAUSE__CACHE__READONLY' anywhere
-- but that's OK given the current 'gcc/gimplify.cc:gimplify_oacc_cache'.
;-)
Regards
Thomas
* Re: [PATCH, OpenACC 2.7] Connect readonly modifier to points-to analysis
2023-10-27 14:28 ` Thomas Schwinge
@ 2023-10-30 12:46 ` Richard Biener
2024-04-03 11:50 ` Chung-Lin Tang
0 siblings, 1 reply; 18+ messages in thread
From: Richard Biener @ 2023-10-30 12:46 UTC (permalink / raw)
To: Thomas Schwinge
Cc: Chung-Lin Tang, gcc-patches, fortran, Catherine Moore, Tobias Burnus
On Fri, Oct 27, 2023 at 4:28 PM Thomas Schwinge <thomas@codesourcery.com> wrote:
>
> Hi!
>
> Richard, as the original author of 'SSA_NAME_POINTS_TO_READONLY_MEMORY':
> 2018 commit 6214d5c7e7470bdd5ecbeae668c2522551bfebbc (Subversion r263958)
> "Move const_parm trick to generic code"; 'gcc/tree.h':
>
> /* Nonzero if this SSA_NAME is known to point to memory that may not
> be written to. This is set for default defs of function parameters
> that have a corresponding r or R specification in the functions
> fn spec attribute. This is used by alias analysis. */
> #define SSA_NAME_POINTS_TO_READONLY_MEMORY(NODE) \
> SSA_NAME_CHECK (NODE)->base.deprecated_flag
>
> ..., may I ask you to please help review the following patch
> (full-quoted)?
>
> For context: this patch here ("second patch") depends on a first patch:
> <inbox.sourceware.org/d0e6013f-ca38-b98d-dc01-b30adbd5901a@siemens.com>
> "[PATCH, OpenACC 2.7] readonly modifier support in front-ends". That one
> is still under review/rework; so you're not able to apply this second
> patch here.
>
> In a nutshell: a 'readonly' modifier has been added to the OpenACC
> 'copyin' clause (copy host to device memory, don't copy back at end of
> region):
>
> | If the optional 'readonly' modifier appears, then the implementation may assume that the data
> | referenced by _var-list_ is never written to within the applicable region.
>
> That is, for example (untested):
>
> #pragma acc routine
> void escape(int *);
>
> int x[32] = [...];
> #pragma acc parallel copyin(readonly: x)
> {
> int a1 = x[3];
> escape(x);
> int a2 = x[3]; // Per 'readonly', don't need to reload 'x[3]' here.
> //x[22] = 0; // Invalid -- but no diagnostic mandated.
> }
>
> What Chung-Lin's first patch does is mark the OMP clause for 'x' (not the
> 'x' decl itself!) as 'readonly', via a new 'OMP_CLAUSE_MAP_READONLY'
> flag.
>
> The actual optimization then is done in this second patch. Chung-Lin
> found that he could use 'SSA_NAME_POINTS_TO_READONLY_MEMORY' for that.
> I don't have much experience with most of the following generic code, so
> would appreciate a helping hand, whether that conceptually makes sense as
> well as from the implementation point of view:
No, I don't think you can use that flag on non-default-defs, nor
preserve it on copying. So it also doesn't nicely extend to DECLs as
done by the patch. We currently _only_ use it for incoming parameters.
When used on arbitrary code you can get to for example
ptr1(points-to-readonly-memory) = &p->x;
... access via ptr1 ...
ptr2 = &p->x;
... access via ptr2 ...
where the two accesses are in OMP regions that are differently constrained
(the constraint is on the code in the region, _not_ on the actual
protections of the pointed-to data, much like for the Fortran case). But
now CSE comes along and happily replaces all ptr2 with ptr1 in the second
region and ... oops!
So no, re-using SSA_NAME_POINTS_TO_READONLY_MEMORY doesn't look good.
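A toy model of that failure mode, assuming a CSE that picks the earliest name computing the same value as the leader and ignores the flag when doing so. The types and functions are made up for illustration and are not GCC's value numbering.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy SSA names: 'value' identifies the computed address (both names in
   the example compute &p->x), 'pt_readonly' is the per-name flag.  */
struct toy_name { int value; bool pt_readonly; };

/* Toy CSE: return the index of the earliest name computing the same
   value as NAMES[i]; uses of NAMES[i] are rewritten to that leader.
   The flag is not part of the value, so it plays no role in the
   lookup.  */
static int
toy_cse_leader (const struct toy_name *names, int i)
{
  for (int j = 0; j < i; j++)
    if (names[j].value == names[i].value)
      return j;
  return i;
}

/* After CSE, is an access that originally went through NAMES[i] made
   through a name that is flagged as pointing to read-only memory?  */
static bool
toy_flagged_after_cse (const struct toy_name *names, int i)
{
  return names[toy_cse_leader (names, i)].pt_readonly;
}
```

With ptr1 flagged and ptr2 not, the access via ptr2 ends up going through ptr1 after CSE and so inherits a read-only assumption that was never made for the second region.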
Richard.
> On 2023-07-25T23:52:06+0800, Chung-Lin Tang via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> > On 2023/7/11 2:33 AM, Chung-Lin Tang via Gcc-patches wrote:
> >> As we discussed earlier, the work for actually linking this to middle-end
> >> points-to analysis is a somewhat non-trivial issue. This first patch allows
> >> the language feature to be used in OpenACC directives first (with no effect for now).
> >> The middle-end changes are probably going to be a later patch.
> >
> > This second patch tries to link the readonly modifier to points-to analysis.
> >
> > There already exists SSA_NAME_POINTS_TO_READONLY_MEMORY and its support in the
> > alias oracle routines in tree-ssa-alias.cc, so basically what this patch does is
> > try to make the variables holding the array section base pointers to have this
> > flag set.
> >
> > There is an another OMP_CLAUSE_MAP_POINTS_TO_READONLY set by front-ends on the
> > associated pointer clauses if OMP_CLAUSE_MAP_READONLY is set.
> > Also a DECL_POINTS_TO_READONLY flag is set for VAR_DECLs when creating the tmp
> > vars carrying these receiver references on the offloaded side. These
> > eventually get translated to SSA_NAME_POINTS_TO_READONLY_MEMORY.
>
>
> > This still doesn't always work as expected in terms of optimization:
> > struct pointer fields and Fortran arrays (which are kind of like C structs)
> > need several accesses to form the pointer on the receiving/offloaded side,
> > and SRA appears not to work on these sequences, which gets in the way of
> > much redundancy elimination.
>
> Do I understand correctly that this is left as future work? Please add the test
> cases you have, XFAILed in some reasonable way.
>
>
> > Currently have one testcase where we can demonstrate 'readonly' can avoid
> > a clobber by function call.
>
> :-)
>
>
> > --- a/gcc/c/c-typeck.cc
> > +++ b/gcc/c/c-typeck.cc
> > @@ -14258,6 +14258,8 @@ handle_omp_array_sections (tree c, enum c_omp_region_type ort)
> > OMP_CLAUSE_SET_MAP_KIND (c2, GOMP_MAP_ATTACH_DETACH);
> > else
> > OMP_CLAUSE_SET_MAP_KIND (c2, GOMP_MAP_FIRSTPRIVATE_POINTER);
> > + if (OMP_CLAUSE_MAP_READONLY (c))
> > + OMP_CLAUSE_MAP_POINTS_TO_READONLY (c2) = 1;
> > OMP_CLAUSE_MAP_IMPLICIT (c2) = OMP_CLAUSE_MAP_IMPLICIT (c);
> > if (OMP_CLAUSE_MAP_KIND (c2) != GOMP_MAP_FIRSTPRIVATE_POINTER
> > && !c_mark_addressable (t))
>
> > --- a/gcc/cp/semantics.cc
> > +++ b/gcc/cp/semantics.cc
> > @@ -5872,6 +5872,8 @@ handle_omp_array_sections (tree c, enum c_omp_region_type ort)
> > }
> > else
> > OMP_CLAUSE_SET_MAP_KIND (c2, GOMP_MAP_FIRSTPRIVATE_POINTER);
> > + if (OMP_CLAUSE_MAP_READONLY (c))
> > + OMP_CLAUSE_MAP_POINTS_TO_READONLY (c2) = 1;
> > OMP_CLAUSE_MAP_IMPLICIT (c2) = OMP_CLAUSE_MAP_IMPLICIT (c);
> > if (OMP_CLAUSE_MAP_KIND (c2) != GOMP_MAP_FIRSTPRIVATE_POINTER
> > && !cxx_mark_addressable (t))
>
> > --- a/gcc/fortran/trans-openmp.cc
> > +++ b/gcc/fortran/trans-openmp.cc
> > @@ -2524,6 +2524,8 @@ gfc_trans_omp_array_section (stmtblock_t *block, gfc_exec_op op,
> > node3 = build_omp_clause (input_location, OMP_CLAUSE_MAP);
> > OMP_CLAUSE_SET_MAP_KIND (node3, ptr_kind);
> > OMP_CLAUSE_DECL (node3) = gfc_conv_descriptor_data_get (decl);
> > + if (n->u.readonly)
> > + OMP_CLAUSE_MAP_POINTS_TO_READONLY (node3) = 1;
> > /* This purposely does not include GOMP_MAP_ALWAYS_POINTER. The extra
> > cast prevents gimplify.cc from recognising it as being part of the
> > struct - and adding an 'alloc: for the 'desc.data' pointer, which
> > @@ -2559,6 +2561,8 @@ gfc_trans_omp_array_section (stmtblock_t *block, gfc_exec_op op,
> > OMP_CLAUSE_MAP);
> > OMP_CLAUSE_SET_MAP_KIND (node3, ptr_kind);
> > OMP_CLAUSE_DECL (node3) = decl;
> > + if (n->u.readonly)
> > + OMP_CLAUSE_MAP_POINTS_TO_READONLY (node3) = 1;
> > }
>
> Could combine these two into one, after
> 'if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (decl)))' reconverges here, like
> where 'OMP_CLAUSE_SIZE (node3)' is set:
>
> > ptr2 = fold_convert (ptrdiff_type_node, ptr2);
> > OMP_CLAUSE_SIZE (node3) = fold_build2 (MINUS_EXPR, ptrdiff_type_node,
>
> Is 'n->u.readonly == OMP_CLAUSE_MAP_READONLY (node)'? If yes, would the
> latter be clearer to use as the 'if' expression (like in C, C++ front
> ends)?
>
> I see further additional 'OMP_CLAUSE_MAP' clauses synthesized, for
> example in 'gcc/cp/semantics.cc:handle_omp_array_sections', or
> 'gcc/fortran/trans-openmp.cc:gfc_trans_omp_array_section', also
> 'gcc/gimplify.cc'. I assume these are not relevant to have
> 'OMP_CLAUSE_MAP_READONLY' -> 'OMP_CLAUSE_MAP_POINTS_TO_READONLY'
> propagated? Actually, per your changes (see below), there is one
> 'OMP_CLAUSE_MAP_POINTS_TO_READONLY' propagation in
> 'gcc/gimplify.cc:build_omp_struct_comp_nodes'.
>
> Is the current situation re flag setting/propagation what was empirically
> necessary to make the test case work, or is it a systematic review? (The
> former is fine; I'd just like to know.)
>
> > --- a/gcc/gimple-expr.cc
> > +++ b/gcc/gimple-expr.cc
> > @@ -376,6 +376,8 @@ copy_var_decl (tree var, tree name, tree type)
> > DECL_CONTEXT (copy) = DECL_CONTEXT (var);
> > TREE_USED (copy) = 1;
> > DECL_SEEN_IN_BIND_EXPR_P (copy) = 1;
> > + if (VAR_P (var))
> > + DECL_POINTS_TO_READONLY (copy) = DECL_POINTS_TO_READONLY (var);
> > DECL_ATTRIBUTES (copy) = DECL_ATTRIBUTES (var);
> > if (DECL_USER_ALIGN (var))
> > {
>
> > --- a/gcc/gimplify.cc
> > +++ b/gcc/gimplify.cc
> > @@ -221,6 +221,7 @@ struct gimplify_omp_ctx
> > splay_tree variables;
> > hash_set<tree> *privatized_types;
> > tree clauses;
> > + hash_set<tree_operand_hash> *pt_readonly_ptrs;
> > /* Iteration variables in an OMP_FOR. */
> > vec<tree> loop_iter_var;
> > location_t location;
> > @@ -628,6 +629,15 @@ internal_get_tmp_var (tree val, gimple_seq *pre_p, gimple_seq *post_p,
> > gimplify_expr (&val, pre_p, post_p, is_gimple_reg_rhs_or_call,
> > fb_rvalue);
> >
> > + bool pt_readonly = false;
> > + if (gimplify_omp_ctxp && gimplify_omp_ctxp->pt_readonly_ptrs)
> > + {
> > + tree ptr = val;
> > + if (TREE_CODE (ptr) == POINTER_PLUS_EXPR)
> > + ptr = TREE_OPERAND (ptr, 0);
> > + pt_readonly = gimplify_omp_ctxp->pt_readonly_ptrs->contains (ptr);
> > + }
>
> 'POINTER_PLUS_EXPR' is the only special thing we may run into, here?
> (Generally, I prefer 'if', 'else if', [...], 'else gcc_unreachable ()'.)
>
> > +
> > if (allow_ssa
> > && gimplify_ctxp->into_ssa
> > && is_gimple_reg_type (TREE_TYPE (val)))
> > @@ -639,9 +649,18 @@ internal_get_tmp_var (tree val, gimple_seq *pre_p, gimple_seq *post_p,
> > if (name)
> > SET_SSA_NAME_VAR_OR_IDENTIFIER (t, create_tmp_var_name (name));
> > }
> > + if (pt_readonly)
> > + SSA_NAME_POINTS_TO_READONLY_MEMORY (t) = 1;
> > }
> > else
> > - t = lookup_tmp_var (val, is_formal, not_gimple_reg);
> > + {
> > + t = lookup_tmp_var (val, is_formal, not_gimple_reg);
> > + if (pt_readonly)
> > + {
> > + DECL_POINTS_TO_READONLY (t) = 1;
> > + gimplify_omp_ctxp->pt_readonly_ptrs->add (t);
> > + }
> > + }
> >
> > mod = build2 (INIT_EXPR, TREE_TYPE (t), t, unshare_expr (val));
> >
> > @@ -8906,6 +8925,8 @@ build_omp_struct_comp_nodes (enum tree_code code, tree grp_start, tree grp_end,
> > OMP_CLAUSE_SET_MAP_KIND (c2, mkind);
> > OMP_CLAUSE_DECL (c2) = unshare_expr (OMP_CLAUSE_DECL (grp_end));
> > OMP_CLAUSE_CHAIN (c2) = NULL_TREE;
> > + if (OMP_CLAUSE_MAP_POINTS_TO_READONLY (grp_end))
> > + OMP_CLAUSE_MAP_POINTS_TO_READONLY (c2) = 1;
> > tree grp_mid = NULL_TREE;
> > if (OMP_CLAUSE_CHAIN (grp_start) != grp_end)
> > grp_mid = OMP_CLAUSE_CHAIN (grp_start);
>
> For my understanding, is this empirically necessary, or a systematic
> review?
>
> > @@ -11741,6 +11762,16 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
> >
> > gimplify_omp_ctxp = outer_ctx;
> > }
> > + else if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
> > + && (code == OACC_PARALLEL
> > + || code == OACC_KERNELS
> > + || code == OACC_SERIAL)
> > + && OMP_CLAUSE_MAP_POINTS_TO_READONLY (c))
> > + {
> > + if (ctx->pt_readonly_ptrs == NULL)
> > + ctx->pt_readonly_ptrs = new hash_set<tree_operand_hash> ();
> > + ctx->pt_readonly_ptrs->add (OMP_CLAUSE_DECL (c));
> > + }
> > if (notice_outer)
> > goto do_notice;
> > break;
>
> Also need to 'delete ctx->pt_readonly_ptrs;' somewhere.
>
> > --- a/gcc/omp-low.cc
> > +++ b/gcc/omp-low.cc
> > @@ -14098,6 +14098,8 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
> > if (ref_to_array)
> > x = fold_convert_loc (clause_loc, TREE_TYPE (new_var), x);
> > gimplify_expr (&x, &new_body, NULL, is_gimple_val, fb_rvalue);
> > + if (OMP_CLAUSE_MAP_POINTS_TO_READONLY (c) && VAR_P (x))
> > + DECL_POINTS_TO_READONLY (x) = 1;
> > if ((is_ref && !ref_to_array)
> > || ref_to_ptr)
> > {
>
> This is in the middle of the
> "Handle GOMP_MAP_FIRSTPRIVATE_{POINTER,REFERENCE} in second pass" code
> block. Again, for my understanding, is this empirically necessary, or a
> systematic review?
>
> > --- a/gcc/testsuite/c-c++-common/goacc/readonly-1.c
> > +++ b/gcc/testsuite/c-c++-common/goacc/readonly-1.c
> > @@ -19,8 +19,8 @@ int main (void)
> > return 0;
> > }
> >
> > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:\\*s.ptr \\\[len: 64\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: 128\\\]\\)" 1 "original" { target { c } } } } */
> > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: 64\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: 128\\\]\\)" 1 "original" { target { c++ } } } } */
> > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:\\*s.ptr \\\[len: 64\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: 128\\\]\\) map\\(pt_readonly,firstprivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 "original" { target { c } } } } */
> > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: 64\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: 128\\\]\\) map\\(pt_readonly,firstprivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 "original" { target { c++ } } } } */
> > /* { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(readonly:x\\\[0\\\] \\\[len: 128\\\]\\);$" 1 "original" } } */
>
> I suppose the new 'map(pt_readonly,attach_detach:s.ptr [bias: 0])' clause
> was previously "hidden" in '.+'? Please then change that in the first
> patch "[PATCH, OpenACC 2.7] readonly modifier support in front-ends", so
> that we can see here what actually is changing (only 'pt_readonly', I
> suppose).
>
> > --- /dev/null
> > +++ b/gcc/testsuite/c-c++-common/goacc/readonly-2.c
> > @@ -0,0 +1,15 @@
> > +/* { dg-additional-options "-O -fdump-tree-fre" } */
> > +
> > +#pragma acc routine
> > +extern void foo (int *ptr, int val);
> > +
> > +int main (void)
> > +{
> > + int r, a[32];
> > + #pragma acc parallel copyin(readonly: a[:32]) copyout(r)
> > + {
> > + foo (a, a[8]);
> > + r = a[8];
> > + }
> > +}
> > +/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ = MEM\\\[\[^_\]+_\[0-9\]+\\(ptro\\)\\\]\\\[8\\\];" 1 "fre1" } } */
>
> Please add a comment why 'fre1', and what generally is being checked
> here; that's not obvious to the casual reader. (That is, me in a few
> weeks.) ;-)
>
> Also add a scan for "before the optimization": two 'MEM's, I suppose?
>
> > --- a/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90
> > +++ b/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90
> > @@ -20,8 +20,8 @@ program main
> > !$acc end parallel
> > end program main
> >
> > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\) .+ map\\(readonly,to:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\)" 1 "original" } }
> > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:a\\\[\\(\\(integer\\(kind=8\\)\\) parm.*data - \\(integer\\(kind=8\\)\\) &a\\) / 4\\\] \\\[len: .+\\\]\\) .+ map\\(readonly,to:b\\\[\\(\\(integer\\(kind=8\\)\\) parm.*data - \\(integer\\(kind=8\\)\\) &b\\) / 4\\\] \\\[len: .+\\\]\\)" 1 "original" } }
> > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\) map\\(pt_readonly,alloc:a.0 \\\[pointer assign, bias: \\(integer\\(kind=8\\)\\) parm.*data - \\(integer\\(kind=8\\)\\) a.0\\\]\\) map\\(readonly,to:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\) map\\(pt_readonly,alloc:b \\\[pointer assign, bias: \\(integer\\(kind=8\\)\\) parm.*data - \\(integer\\(kind=8\\)\\) b\\\]\\)" 1 "original" } }
> > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:a\\\[\\(\\(integer\\(kind=8\\)\\) parm.*data - \\(integer\\(kind=8\\)\\) &a\\) / 4\\\] \\\[len: .+\\\]\\) map\\(pt_readonly,alloc:a \\\[pointer assign, bias: \\(integer\\(kind=8\\)\\) parm.*data - \\(integer\\(kind=8\\)\\) &a\\\]\\) map\\(readonly,to:b\\\[\\(\\(integer\\(kind=8\\)\\) parm.*data - \\(integer\\(kind=8\\)\\) &b\\) / 4\\\] \\\[len: .+\\\]\\) map\\(pt_readonly,alloc:b \\\[pointer assign, bias: \\(integer\\(kind=8\\)\\) parm.*data - \\(integer\\(kind=8\\)\\) &b\\\]\\)" 1 "original" } }
> > ! { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(readonly:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\) \\(readonly:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\);" 2 "original" } }
>
> Same comment as for 'c-c++-common/goacc/readonly-1.c'.
>
> > --- a/gcc/tree-pretty-print.cc
> > +++ b/gcc/tree-pretty-print.cc
> > @@ -907,6 +907,8 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, dump_flags_t flags)
> > pp_string (pp, "map(");
> > if (OMP_CLAUSE_MAP_READONLY (clause))
> > pp_string (pp, "readonly,");
> > + if (OMP_CLAUSE_MAP_POINTS_TO_READONLY (clause))
> > + pp_string (pp, "pt_readonly,");
> > switch (OMP_CLAUSE_MAP_KIND (clause))
> > {
> > case GOMP_MAP_ALLOC:
> > @@ -3436,6 +3438,8 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags,
> > pp_string (pp, "(D)");
> > if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (node))
> > pp_string (pp, "(ab)");
> > + if (SSA_NAME_POINTS_TO_READONLY_MEMORY (node))
> > + pp_string (pp, "(ptro)");
> > break;
> >
> > case WITH_SIZE_EXPR:
>
> > --- a/gcc/tree-ssanames.cc
> > +++ b/gcc/tree-ssanames.cc
> > @@ -402,6 +402,9 @@ make_ssa_name_fn (struct function *fn, tree var, gimple *stmt,
> > else
> > SSA_NAME_RANGE_INFO (t) = NULL;
> >
> > + if (VAR_P (var) && DECL_POINTS_TO_READONLY (var))
> > + SSA_NAME_POINTS_TO_READONLY_MEMORY (t) = 1;
> > +
> > SSA_NAME_IN_FREE_LIST (t) = 0;
> > SSA_NAME_IS_DEFAULT_DEF (t) = 0;
> > init_ssa_name_imm_use (t);
>
> > --- a/gcc/tree.h
> > +++ b/gcc/tree.h
> > @@ -1021,6 +1021,13 @@ extern void omp_clause_range_check_failed (const_tree, const char *, int,
> > #define DECL_HIDDEN_STRING_LENGTH(NODE) \
> > (TREE_CHECK (NODE, PARM_DECL)->decl_common.decl_nonshareable_flag)
> >
> > +/* In a VAR_DECL, set for variables regarded as pointing to memory not written
> > + to. SSA_NAME_POINTS_TO_READONLY_MEMORY gets set for SSA_NAMEs created from
> > + such VAR_DECLs. Currently used by OpenACC 'readonly' modifier in copyin
> > + clauses. */
> > +#define DECL_POINTS_TO_READONLY(NODE) \
> > + (TREE_CHECK (NODE, VAR_DECL)->decl_common.decl_not_flexarray)
> > +
> > /* In a CALL_EXPR, means that the call is the jump from a thunk to the
> > thunked-to function. Be careful to avoid using this macro when one of the
> > next two applies instead. */
> > @@ -1815,6 +1822,10 @@ class auto_suppress_location_wrappers
> > #define OMP_CLAUSE_MAP_READONLY(NODE) \
> > TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP))
> >
> > +/* Set if 'OMP_CLAUSE_DECL (NODE)' points to read-only memory. */
> > +#define OMP_CLAUSE_MAP_POINTS_TO_READONLY(NODE) \
> > + TREE_CONSTANT (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP))
> > +
> > /* Same as above, for use in OpenACC cache directives. */
> > #define OMP_CLAUSE__CACHE__READONLY(NODE) \
> > TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE__CACHE_))
>
> As in my "[PATCH, OpenACC 2.7] readonly modifier support in front-ends"
> review, please document how certain flags are used for OMP clauses.
>
>
> I note you're not actually using 'OMP_CLAUSE__CACHE__READONLY' anywhere
> -- but that's OK given the current 'gcc/gimplify.cc:gimplify_oacc_cache'.
> ;-)
>
>
> Grüße
> Thomas
* Re: [PATCH, OpenACC 2.7, v2] readonly modifier support in front-ends
2023-10-26 9:43 ` Thomas Schwinge
@ 2024-03-07 8:02 ` Chung-Lin Tang
2024-03-13 9:12 ` Thomas Schwinge
0 siblings, 1 reply; 18+ messages in thread
From: Chung-Lin Tang @ 2024-03-07 8:02 UTC (permalink / raw)
To: Thomas Schwinge, Tobias Burnus, Chung-Lin Tang; +Cc: gcc-patches, fortran
[-- Attachment #1: Type: text/plain, Size: 7876 bytes --]
Hi Thomas, Tobias,
On 2023/10/26 6:43 PM, Thomas Schwinge wrote:
>>>>> +++ b/gcc/tree.h
>>>>> @@ -1813,6 +1813,14 @@ class auto_suppress_location_wrappers
>>>>> #define OMP_CLAUSE_MAP_DECL_MAKE_ADDRESSABLE(NODE) \
>>>>> (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP)->base.addressable_flag)
>>>>>
>>>>> +/* Nonzero if OpenACC 'readonly' modifier set, used for 'copyin'. */
>>>>> +#define OMP_CLAUSE_MAP_READONLY(NODE) \
>>>>> + TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP))
>>>>> +
>>>>> +/* Same as above, for use in OpenACC cache directives. */
>>>>> +#define OMP_CLAUSE__CACHE__READONLY(NODE) \
>>>>> + TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE__CACHE_))
>>>> I'm not sure if these special accessor functions are actually useful, or
>>>> we should just directly use 'TREE_READONLY' instead? We're only using
>>>> them in contexts where it's clear that the 'OMP_CLAUSE_SUBCODE_CHECK' is
>>>> satisfied, for example.
>>> I find directly using TREE_READONLY confusing.
>>
>> FWIW, I've changed to use TREE_NOTHROW instead, if it can give a better sense of safety :P
>
> I don't understand that, why not use 'TREE_READONLY'?
>
>> I think there's a misunderstanding here anyways: we are not relying on a DECL marked
>> TREE_READONLY here. We merely need the OMP_CLAUSE_MAP to be marked as OMP_CLAUSE_MAP_READONLY == 1.
>
> Yes, I understand that. My question was why we don't just use
> 'TREE_READONLY (c)', where 'c' is the
> 'OMP_CLAUSE_MAP'/'OMP_CLAUSE__CACHE_' clause (not its decl), and avoid
> the indirection through
> '#define OMP_CLAUSE_MAP_READONLY'/'#define OMP_CLAUSE__CACHE__READONLY',
> given that we're only using them in contexts where it's clear that the
> 'OMP_CLAUSE_SUBCODE_CHECK' is satisfied. I don't have a strong
> preference, though.
After further re-testing using TREE_NOTHROW, I have reverted to using TREE_READONLY, because TREE_NOTHROW clashes
with OMP_CLAUSE_RELEASE_DESCRIPTOR (which doesn't use the OMP_CLAUSE_MAP_* naming convention and is
not documented in gcc/tree-core.h either, hmmm...)
I have added the comment adjustments in gcc/tree-core.h for the new uses of TREE_READONLY/readonly_flag.
We use OMP_CLAUSE_SUBCODE_CHECK-based accessor macros for OpenMP clause expressions throughout,
so I don't see a reason to diverge from that style (even when the context is clear).
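[Editorial sketch] The clash mentioned above is presumably the usual one that arises when two accessor macros are layered over the same flag bit of a node. A toy model, assuming a clause with just two flag bits; 'toy_clause' and the 'TOY_*' macros are hypothetical and are not the GCC definitions:

```c
#include <stdbool.h>

/* Toy clause with two independent flag bits; NOT GCC's tree_base.  */
struct toy_clause
{
  unsigned readonly_flag : 1;
  unsigned nothrow_flag : 1;
};

/* Hypothetical accessors.  The first two share one bit; the third has
   a bit to itself.  */
#define TOY_RELEASE_DESCRIPTOR(C) ((C)->nothrow_flag)
#define TOY_READONLY_ON_NOTHROW(C) ((C)->nothrow_flag)
#define TOY_READONLY_ON_READONLY(C) ((C)->readonly_flag)

/* Mark a fresh clause read-only through the accessor that shares a bit,
   then report whether it now also reads as "release descriptor".  */
static bool
toy_clash_with_shared_bit (void)
{
  struct toy_clause c = { 0, 0 };
  TOY_READONLY_ON_NOTHROW (&c) = 1;
  return TOY_RELEASE_DESCRIPTOR (&c);
}

/* Same experiment through the accessor that owns its bit.  */
static bool
toy_clash_with_own_bit (void)
{
  struct toy_clause c = { 0, 0 };
  TOY_READONLY_ON_READONLY (&c) = 1;
  return TOY_RELEASE_DESCRIPTOR (&c);
}
```

A table of per-flag uses, as requested in the review, makes this kind of overlap visible before it is hit in testing.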
> Either way, you still need to document this:
>
> | Also, for the new use for OMP clauses, update 'gcc/tree.h:TREE_READONLY',
> | and in 'gcc/tree-core.h' for 'readonly_flag' the
> | "table lists the uses of each of the above flags".
Okay, done as mentioned above.
> In addition to a few individual comments above and below, you've also not
> yet responded to my requests re test cases.
I have greatly expanded the test scan patterns to cover parallel/kernels/serial/data/enter data,
as well as a non-readonly copyin clause combined with a readonly one.
I also added simple 'declare' tests, although there is nothing to scan for in the 'tree-original' dump.
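[Editorial sketch] For readers who have not seen the syntax, a combination of a readonly and a plain copyin clause on one construct might look as follows. This is an illustration only, not one of the test cases in the patch; 'dot_product' is a hypothetical function. Compiled without OpenACC support the pragma is ignored and the loop runs on the host.

```c
/* Dot product of A and B over N elements.  A is only read inside the
   region, so it is passed with the 'readonly' modifier; B uses a plain
   copyin clause for comparison.  */
static int
dot_product (const int *a, const int *b, int n)
{
  int r = 0;
  #pragma acc parallel loop copyin(readonly: a[:n]) copyin(b[:n]) reduction(+:r)
  for (int i = 0; i < n; i++)
    r += a[i] * b[i];
  return r;
}
```

With the front-end patch alone the modifier is accepted and recorded on the clause but, as stated at the start of the thread, has no effect on code generation yet.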
>> + tree nl = list;
>> + bool readonly = false;
>> + matching_parens parens;
>> + if (parens.require_open (parser))
>> + {
>> + /* Turn on readonly modifier parsing for copyin clause. */
>> + if (c_kind == PRAGMA_OACC_CLAUSE_COPYIN)
>> + {
>> + c_token *token = c_parser_peek_token (parser);
>> + if (token->type == CPP_NAME
>> + && !strcmp (IDENTIFIER_POINTER (token->value), "readonly")
>> + && c_parser_peek_2nd_token (parser)->type == CPP_COLON)
>> + {
>> + c_parser_consume_token (parser);
>> + c_parser_consume_token (parser);
>> + readonly = true;
>> + }
>> + }
>> + location_t loc = c_parser_peek_token (parser)->location;
>
> I suppose 'loc' here now points to after the opening '(' or after the
> 'readonly :'? This is different from what 'c_parser_omp_var_list_parens'
> does, and indeed, 'c_parser_omp_variable_list' states that "CLAUSE_LOC is
> the location of the clause", not the location of the variable-list? As
> this, I suppose, may change diagnostics, please restore the original
> behavior. (This appears to be different in the C++ front end, huh.)
Thanks for catching this! Fixed.
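[Editorial sketch] The quoted C hunk only treats 'readonly' as a modifier when the next token is a colon, so a variable that happens to be named 'readonly' still parses as a list item. A toy version of that two-token lookahead, assuming tokens are plain strings in a NULL-terminated array; 'toy_parse_readonly_modifier' is hypothetical and not a GCC function:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* If TOKENS[*POS] is "readonly" and the following token is ":", consume
   both and return true.  Otherwise leave *POS untouched and return
   false, so the caller parses an ordinary variable list.  */
static bool
toy_parse_readonly_modifier (const char *const *tokens, size_t *pos)
{
  const char *first = tokens[*pos];
  const char *second = first ? tokens[*pos + 1] : NULL;
  if (first && second
      && strcmp (first, "readonly") == 0
      && strcmp (second, ":") == 0)
    {
      *pos += 2;
      return true;
    }
  return false;
}
```

For the token sequence 'readonly', ':', 'a' the helper consumes two tokens; for 'readonly', ',', 'a' it consumes none and the name is left for the variable-list parser.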
>> --- a/gcc/fortran/openmp.cc
>> +++ b/gcc/fortran/openmp.cc
>> @@ -1197,7 +1197,7 @@ omp_inv_mask::omp_inv_mask (const omp_mask &m) : omp_mask (m)
>>
>> static bool
>> gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op,
>> - bool allow_common, bool allow_derived)
>> + bool allow_common, bool allow_derived, bool readonly = false)
>> {
>> gfc_omp_namelist **head = NULL;
>> if (gfc_match_omp_variable_list ("", list, allow_common, NULL, &head, true,
>> @@ -1206,7 +1206,10 @@ gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op,
>> {
>> gfc_omp_namelist *n;
>> for (n = *head; n; n = n->next)
>> - n->u.map_op = map_op;
>> + {
>> + n->u.map.op = map_op;
>> + n->u.map.readonly = readonly;
>> + }
>> return true;
>> }
>
> Didn't we conclude that "not doing it here is cleaner" (Tobias' words),
> and instead do this "Similar to 'c_parser_omp_var_list_parens'" (my
> words)? That is, not add the 'bool readonly' formal parameter to
> 'gfc_match_omp_map_clause'.
Fixed in this v3 patch.
Again, tested on x86_64-linux + nvptx offloading. Okay for mainline?
Thanks,
Chung-Lin
gcc/c/ChangeLog:
* c-parser.cc (c_parser_oacc_data_clause): Add parsing support for
'readonly' modifier, set OMP_CLAUSE_MAP_READONLY if readonly modifier
found, update comments.
(c_parser_oacc_cache): Add parsing support for 'readonly' modifier,
set OMP_CLAUSE__CACHE__READONLY if readonly modifier found, update
comments.
gcc/cp/ChangeLog:
* parser.cc (cp_parser_oacc_data_clause): Add parsing support for
'readonly' modifier, set OMP_CLAUSE_MAP_READONLY if readonly modifier
found, update comments.
(cp_parser_oacc_cache): Add parsing support for 'readonly' modifier,
set OMP_CLAUSE__CACHE__READONLY if readonly modifier found, update
comments.
gcc/fortran/ChangeLog:
* dump-parse-tree.cc (show_omp_namelist): Print "readonly," for
OMP_LIST_MAP and OMP_LIST_CACHE if n->u.map.readonly is set.
Adjust 'n->u.map_op' to 'n->u.map.op'.
* gfortran.h (typedef struct gfc_omp_namelist): Adjust map_op as
'ENUM_BITFIELD (gfc_omp_map_op) op:8', add 'bool readonly' field,
change to named struct field 'map'.
* openmp.cc (gfc_match_omp_map_clause): Adjust 'n->u.map_op' to
'n->u.map.op'.
(gfc_match_omp_clause_reduction): Likewise.
(gfc_match_omp_clauses): Add readonly modifier parsing for OpenACC
copyin clause, set 'n->u.map.op' and 'n->u.map.readonly' for parsed
clause. Adjust 'n->u.map_op' to 'n->u.map.op'.
(gfc_match_oacc_declare): Adjust 'n->u.map_op' to 'n->u.map.op'.
(gfc_match_oacc_cache): Add readonly modifier parsing for OpenACC
cache directive.
(resolve_omp_clauses): Adjust 'n->u.map_op' to 'n->u.map.op'.
* trans-decl.cc (add_clause): Adjust 'n->u.map_op' to 'n->u.map.op'.
(finish_oacc_declare): Likewise.
* trans-openmp.cc (gfc_trans_omp_clauses): Set OMP_CLAUSE_MAP_READONLY,
OMP_CLAUSE__CACHE__READONLY to 1 when readonly is set. Adjust
'n->u.map_op' to 'n->u.map.op'.
(gfc_add_clause_implicitly): Adjust 'n->u.map_op' to 'n->u.map.op'.
gcc/ChangeLog:
* tree.h (OMP_CLAUSE_MAP_READONLY): New macro.
(OMP_CLAUSE__CACHE__READONLY): New macro.
* tree-core.h (struct GTY(()) tree_base): Adjust comments for new
uses of readonly_flag bit in OMP_CLAUSE_MAP_READONLY and
OMP_CLAUSE__CACHE__READONLY.
* tree-pretty-print.cc (dump_omp_clause): Add support for printing
OMP_CLAUSE_MAP_READONLY and OMP_CLAUSE__CACHE__READONLY.
gcc/testsuite/ChangeLog:
* c-c++-common/goacc/readonly-1.c: New test.
* gfortran.dg/goacc/readonly-1.f90: New test.
[-- Attachment #2: readonly-fe-v3.patch --]
[-- Type: text/plain, Size: 32644 bytes --]
diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index 53e99aa29d9..00f8bf4376e 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -15627,7 +15627,11 @@ c_parser_omp_var_list_parens (c_parser *parser, enum omp_clause_code kind,
OpenACC 2.6:
no_create ( variable-list )
attach ( variable-list )
- detach ( variable-list ) */
+ detach ( variable-list )
+
+ OpenACC 2.7:
+ copyin (readonly : variable-list )
+ */
static tree
c_parser_oacc_data_clause (c_parser *parser, pragma_omp_clause c_kind,
@@ -15680,11 +15684,37 @@ c_parser_oacc_data_clause (c_parser *parser, pragma_omp_clause c_kind,
default:
gcc_unreachable ();
}
- tree nl, c;
- nl = c_parser_omp_var_list_parens (parser, OMP_CLAUSE_MAP, list, false);
- for (c = nl; c != list; c = OMP_CLAUSE_CHAIN (c))
- OMP_CLAUSE_SET_MAP_KIND (c, kind);
+ tree nl = list;
+ bool readonly = false;
+ location_t open_loc = c_parser_peek_token (parser)->location;
+ matching_parens parens;
+ if (parens.require_open (parser))
+ {
+ /* Turn on readonly modifier parsing for copyin clause. */
+ if (c_kind == PRAGMA_OACC_CLAUSE_COPYIN)
+ {
+ c_token *token = c_parser_peek_token (parser);
+ if (token->type == CPP_NAME
+ && !strcmp (IDENTIFIER_POINTER (token->value), "readonly")
+ && c_parser_peek_2nd_token (parser)->type == CPP_COLON)
+ {
+ c_parser_consume_token (parser);
+ c_parser_consume_token (parser);
+ readonly = true;
+ }
+ }
+ nl = c_parser_omp_variable_list (parser, open_loc, OMP_CLAUSE_MAP, list,
+ false);
+ parens.skip_until_found_close (parser);
+ }
+
+ for (tree c = nl; c != list; c = OMP_CLAUSE_CHAIN (c))
+ {
+ OMP_CLAUSE_SET_MAP_KIND (c, kind);
+ if (readonly)
+ OMP_CLAUSE_MAP_READONLY (c) = 1;
+ }
return nl;
}
@@ -19821,15 +19851,39 @@ c_parser_omp_structured_block (c_parser *parser, bool *if_p)
/* OpenACC 2.0:
# pragma acc cache (variable-list) new-line
+ OpenACC 2.7:
+ # pragma acc cache (readonly: variable-list) new-line
+
LOC is the location of the #pragma token.
*/
static tree
c_parser_oacc_cache (location_t loc, c_parser *parser)
{
- tree stmt, clauses;
+ tree stmt, clauses = NULL_TREE;
+ bool readonly = false;
+ location_t open_loc = c_parser_peek_token (parser)->location;
+ matching_parens parens;
+ if (parens.require_open (parser))
+ {
+ c_token *token = c_parser_peek_token (parser);
+ if (token->type == CPP_NAME
+ && !strcmp (IDENTIFIER_POINTER (token->value), "readonly")
+ && c_parser_peek_2nd_token (parser)->type == CPP_COLON)
+ {
+ c_parser_consume_token (parser);
+ c_parser_consume_token (parser);
+ readonly = true;
+ }
+ clauses = c_parser_omp_variable_list (parser, open_loc,
+ OMP_CLAUSE__CACHE_, NULL_TREE);
+ parens.skip_until_found_close (parser);
+ }
+
+ if (readonly)
+ for (tree c = clauses; c; c = OMP_CLAUSE_CHAIN (c))
+ OMP_CLAUSE__CACHE__READONLY (c) = 1;
- clauses = c_parser_omp_var_list_parens (parser, OMP_CLAUSE__CACHE_, NULL);
clauses = c_finish_omp_clauses (clauses, C_ORT_ACC);
c_parser_skip_to_pragma_eol (parser);
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index e32acfc30a2..4fe27fb07b2 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -38544,7 +38544,11 @@ cp_parser_omp_var_list (cp_parser *parser, enum omp_clause_code kind, tree list,
OpenACC 2.6:
no_create ( variable-list )
attach ( variable-list )
- detach ( variable-list ) */
+ detach ( variable-list )
+
+ OpenACC 2.7:
+ copyin (readonly : variable-list )
+ */
static tree
cp_parser_oacc_data_clause (cp_parser *parser, pragma_omp_clause c_kind,
@@ -38597,11 +38601,34 @@ cp_parser_oacc_data_clause (cp_parser *parser, pragma_omp_clause c_kind,
default:
gcc_unreachable ();
}
- tree nl, c;
- nl = cp_parser_omp_var_list (parser, OMP_CLAUSE_MAP, list, false);
- for (c = nl; c != list; c = OMP_CLAUSE_CHAIN (c))
- OMP_CLAUSE_SET_MAP_KIND (c, kind);
+ tree nl = list;
+ bool readonly = false;
+ if (cp_parser_require (parser, CPP_OPEN_PAREN, RT_OPEN_PAREN))
+ {
+ /* Turn on readonly modifier parsing for copyin clause. */
+ if (c_kind == PRAGMA_OACC_CLAUSE_COPYIN)
+ {
+ cp_token *token = cp_lexer_peek_token (parser->lexer);
+ if (token->type == CPP_NAME
+ && !strcmp (IDENTIFIER_POINTER (token->u.value), "readonly")
+ && cp_lexer_peek_nth_token (parser->lexer, 2)->type == CPP_COLON)
+ {
+ cp_lexer_consume_token (parser->lexer);
+ cp_lexer_consume_token (parser->lexer);
+ readonly = true;
+ }
+ }
+ nl = cp_parser_omp_var_list_no_open (parser, OMP_CLAUSE_MAP, list, NULL,
+ false);
+ }
+
+ for (tree c = nl; c != list; c = OMP_CLAUSE_CHAIN (c))
+ {
+ OMP_CLAUSE_SET_MAP_KIND (c, kind);
+ if (readonly)
+ OMP_CLAUSE_MAP_READONLY (c) = 1;
+ }
return nl;
}
@@ -47178,6 +47205,9 @@ cp_parser_omp_target (cp_parser *parser, cp_token *pragma_tok,
/* OpenACC 2.0:
# pragma acc cache (variable-list) new-line
+
+ OpenACC 2.7:
+ # pragma acc cache (readonly: variable-list) new-line
*/
static tree
@@ -47187,9 +47217,28 @@ cp_parser_oacc_cache (cp_parser *parser, cp_token *pragma_tok)
clauses. */
auto_suppress_location_wrappers sentinel;
- tree stmt, clauses;
+ tree stmt, clauses = NULL_TREE;
+ bool readonly = false;
+
+ if (cp_parser_require (parser, CPP_OPEN_PAREN, RT_OPEN_PAREN))
+ {
+ cp_token *token = cp_lexer_peek_token (parser->lexer);
+ if (token->type == CPP_NAME
+ && !strcmp (IDENTIFIER_POINTER (token->u.value), "readonly")
+ && cp_lexer_peek_nth_token (parser->lexer, 2)->type == CPP_COLON)
+ {
+ cp_lexer_consume_token (parser->lexer);
+ cp_lexer_consume_token (parser->lexer);
+ readonly = true;
+ }
+ clauses = cp_parser_omp_var_list_no_open (parser, OMP_CLAUSE__CACHE_,
+ NULL, NULL);
+ }
+
+ if (readonly)
+ for (tree c = clauses; c; c = OMP_CLAUSE_CHAIN (c))
+ OMP_CLAUSE__CACHE__READONLY (c) = 1;
- clauses = cp_parser_omp_var_list (parser, OMP_CLAUSE__CACHE_, NULL_TREE);
clauses = finish_omp_clauses (clauses, C_ORT_ACC);
cp_parser_require_pragma_eol (parser, cp_lexer_peek_token (parser->lexer));
diff --git a/gcc/fortran/dump-parse-tree.cc b/gcc/fortran/dump-parse-tree.cc
index 7b154eb3ca7..db84b06289b 100644
--- a/gcc/fortran/dump-parse-tree.cc
+++ b/gcc/fortran/dump-parse-tree.cc
@@ -1400,6 +1400,9 @@ show_omp_namelist (int list_type, gfc_omp_namelist *n)
fputs (") ALLOCATE(", dumpfile);
continue;
}
+ if ((list_type == OMP_LIST_MAP || list_type == OMP_LIST_CACHE)
+ && n->u.map.readonly)
+ fputs ("readonly,", dumpfile);
if (list_type == OMP_LIST_REDUCTION)
switch (n->u.reduction_op)
{
@@ -1467,7 +1470,7 @@ show_omp_namelist (int list_type, gfc_omp_namelist *n)
default: break;
}
else if (list_type == OMP_LIST_MAP)
- switch (n->u.map_op)
+ switch (n->u.map.op)
{
case OMP_MAP_ALLOC: fputs ("alloc:", dumpfile); break;
case OMP_MAP_TO: fputs ("to:", dumpfile); break;
diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index ebba2336e12..32b792f85fb 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -1363,7 +1363,11 @@ typedef struct gfc_omp_namelist
{
gfc_omp_reduction_op reduction_op;
gfc_omp_depend_doacross_op depend_doacross_op;
- gfc_omp_map_op map_op;
+ struct
+ {
+ ENUM_BITFIELD (gfc_omp_map_op) op:8;
+ bool readonly;
+ } map;
gfc_expr *align;
struct
{
diff --git a/gcc/fortran/openmp.cc b/gcc/fortran/openmp.cc
index 38de60238c0..5c44e666eb9 100644
--- a/gcc/fortran/openmp.cc
+++ b/gcc/fortran/openmp.cc
@@ -1210,7 +1210,7 @@ gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op,
{
gfc_omp_namelist *n;
for (n = *head; n; n = n->next)
- n->u.map_op = map_op;
+ n->u.map.op = map_op;
return true;
}
@@ -1524,7 +1524,7 @@ gfc_match_omp_clause_reduction (char pc, gfc_omp_clauses *c, bool openacc,
gfc_omp_namelist *p = gfc_get_omp_namelist (), **tl;
p->sym = n->sym;
p->where = p->where;
- p->u.map_op = OMP_MAP_ALWAYS_TOFROM;
+ p->u.map.op = OMP_MAP_ALWAYS_TOFROM;
tl = &c->lists[OMP_LIST_MAP];
while (*tl)
@@ -2181,11 +2181,25 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
{
if (openacc)
{
- if (gfc_match ("copyin ( ") == MATCH_YES
- && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
- OMP_MAP_TO, true,
- allow_derived))
- continue;
+ if (gfc_match ("copyin ( ") == MATCH_YES)
+ {
+ bool readonly = gfc_match ("readonly : ") == MATCH_YES;
+ head = NULL;
+ if (gfc_match_omp_variable_list ("",
+ &c->lists[OMP_LIST_MAP],
+ true, NULL, &head, true,
+ allow_derived)
+ == MATCH_YES)
+ {
+ gfc_omp_namelist *n;
+ for (n = *head; n; n = n->next)
+ {
+ n->u.map.op = OMP_MAP_TO;
+ n->u.map.readonly = readonly;
+ }
+ continue;
+ }
+ }
}
else if (gfc_match_omp_variable_list ("copyin (",
&c->lists[OMP_LIST_COPYIN],
@@ -3134,7 +3148,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
{
gfc_omp_namelist *n;
for (n = *head; n; n = n->next)
- n->u.map_op = map_op;
+ n->u.map.op = map_op;
continue;
}
gfc_current_locus = old_loc;
@@ -4002,7 +4016,7 @@ gfc_match_oacc_declare (void)
if (gfc_current_ns->proc_name
&& gfc_current_ns->proc_name->attr.flavor == FL_MODULE)
{
- if (n->u.map_op != OMP_MAP_ALLOC && n->u.map_op != OMP_MAP_TO)
+ if (n->u.map.op != OMP_MAP_ALLOC && n->u.map.op != OMP_MAP_TO)
{
gfc_error ("Invalid clause in module with !$ACC DECLARE at %L",
&where);
@@ -4036,7 +4050,7 @@ gfc_match_oacc_declare (void)
return MATCH_ERROR;
}
- switch (n->u.map_op)
+ switch (n->u.map.op)
{
case OMP_MAP_FORCE_ALLOC:
case OMP_MAP_ALLOC:
@@ -4151,21 +4165,36 @@ gfc_match_oacc_wait (void)
match
gfc_match_oacc_cache (void)
{
+ bool readonly = false;
gfc_omp_clauses *c = gfc_get_omp_clauses ();
/* The OpenACC cache directive explicitly only allows "array elements or
subarrays", which we're currently not checking here. Either check this
after the call of gfc_match_omp_variable_list, or add something like a
only_sections variant next to its allow_sections parameter. */
- match m = gfc_match_omp_variable_list (" (",
- &c->lists[OMP_LIST_CACHE], true,
- NULL, NULL, true);
+ match m = gfc_match (" ( ");
if (m != MATCH_YES)
{
gfc_free_omp_clauses(c);
return m;
}
- if (gfc_current_state() != COMP_DO
+ if (gfc_match ("readonly : ") == MATCH_YES)
+ readonly = true;
+
+ gfc_omp_namelist **head = NULL;
+ m = gfc_match_omp_variable_list ("", &c->lists[OMP_LIST_CACHE], true,
+ NULL, &head, true);
+ if (m != MATCH_YES)
+ {
+ gfc_free_omp_clauses(c);
+ return m;
+ }
+
+ if (readonly)
+ for (gfc_omp_namelist *n = *head; n; n = n->next)
+ n->u.map.readonly = true;
+
+ if (gfc_current_state() != COMP_DO
&& gfc_current_state() != COMP_DO_CONCURRENT)
{
gfc_error ("ACC CACHE directive must be inside of loop %C");
@@ -8436,8 +8465,8 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
}
if (openacc
&& list == OMP_LIST_MAP
- && (n->u.map_op == OMP_MAP_ATTACH
- || n->u.map_op == OMP_MAP_DETACH))
+ && (n->u.map.op == OMP_MAP_ATTACH
+ || n->u.map.op == OMP_MAP_DETACH))
{
symbol_attribute attr;
if (n->expr)
@@ -8447,7 +8476,7 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
if (!attr.pointer && !attr.allocatable)
gfc_error ("%qs clause argument must be ALLOCATABLE or "
"a POINTER at %L",
- (n->u.map_op == OMP_MAP_ATTACH) ? "attach"
+ (n->u.map.op == OMP_MAP_ATTACH) ? "attach"
: "detach", &n->where);
}
if (lastref
@@ -8518,7 +8547,7 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
else if (openacc)
{
if (list == OMP_LIST_MAP
- && n->u.map_op == OMP_MAP_FORCE_DEVICEPTR)
+ && n->u.map.op == OMP_MAP_FORCE_DEVICEPTR)
resolve_oacc_deviceptr_clause (n->sym, n->where, name);
else
resolve_oacc_data_clauses (n->sym, n->where, name);
@@ -8540,7 +8569,7 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
{
case EXEC_OMP_TARGET:
case EXEC_OMP_TARGET_DATA:
- switch (n->u.map_op)
+ switch (n->u.map.op)
{
case OMP_MAP_TO:
case OMP_MAP_ALWAYS_TO:
@@ -8567,7 +8596,7 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
}
break;
case EXEC_OMP_TARGET_ENTER_DATA:
- switch (n->u.map_op)
+ switch (n->u.map.op)
{
case OMP_MAP_TO:
case OMP_MAP_ALWAYS_TO:
@@ -8577,16 +8606,16 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
case OMP_MAP_PRESENT_ALLOC:
break;
case OMP_MAP_TOFROM:
- n->u.map_op = OMP_MAP_TO;
+ n->u.map.op = OMP_MAP_TO;
break;
case OMP_MAP_ALWAYS_TOFROM:
- n->u.map_op = OMP_MAP_ALWAYS_TO;
+ n->u.map.op = OMP_MAP_ALWAYS_TO;
break;
case OMP_MAP_PRESENT_TOFROM:
- n->u.map_op = OMP_MAP_PRESENT_TO;
+ n->u.map.op = OMP_MAP_PRESENT_TO;
break;
case OMP_MAP_ALWAYS_PRESENT_TOFROM:
- n->u.map_op = OMP_MAP_ALWAYS_PRESENT_TO;
+ n->u.map.op = OMP_MAP_ALWAYS_PRESENT_TO;
break;
default:
gfc_error ("TARGET ENTER DATA with map-type other "
@@ -8596,7 +8625,7 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
}
break;
case EXEC_OMP_TARGET_EXIT_DATA:
- switch (n->u.map_op)
+ switch (n->u.map.op)
{
case OMP_MAP_FROM:
case OMP_MAP_ALWAYS_FROM:
@@ -8606,16 +8635,16 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
case OMP_MAP_DELETE:
break;
case OMP_MAP_TOFROM:
- n->u.map_op = OMP_MAP_FROM;
+ n->u.map.op = OMP_MAP_FROM;
break;
case OMP_MAP_ALWAYS_TOFROM:
- n->u.map_op = OMP_MAP_ALWAYS_FROM;
+ n->u.map.op = OMP_MAP_ALWAYS_FROM;
break;
case OMP_MAP_PRESENT_TOFROM:
- n->u.map_op = OMP_MAP_PRESENT_FROM;
+ n->u.map.op = OMP_MAP_PRESENT_FROM;
break;
case OMP_MAP_ALWAYS_PRESENT_TOFROM:
- n->u.map_op = OMP_MAP_ALWAYS_PRESENT_FROM;
+ n->u.map.op = OMP_MAP_ALWAYS_PRESENT_FROM;
break;
default:
gfc_error ("TARGET EXIT DATA with map-type other "
diff --git a/gcc/fortran/trans-decl.cc b/gcc/fortran/trans-decl.cc
index 6d463036966..b7dea11461f 100644
--- a/gcc/fortran/trans-decl.cc
+++ b/gcc/fortran/trans-decl.cc
@@ -6744,7 +6744,7 @@ add_clause (gfc_symbol *sym, gfc_omp_map_op map_op)
n = gfc_get_omp_namelist ();
n->sym = sym;
- n->u.map_op = map_op;
+ n->u.map.op = map_op;
if (!module_oacc_clauses)
module_oacc_clauses = gfc_get_omp_clauses ();
@@ -6846,10 +6846,10 @@ finish_oacc_declare (gfc_namespace *ns, gfc_symbol *sym, bool block)
for (n = omp_clauses->lists[OMP_LIST_MAP]; n; n = n->next)
{
- switch (n->u.map_op)
+ switch (n->u.map.op)
{
case OMP_MAP_DEVICE_RESIDENT:
- n->u.map_op = OMP_MAP_FORCE_ALLOC;
+ n->u.map.op = OMP_MAP_FORCE_ALLOC;
break;
default:
diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc
index a2bf15665b3..fa1bfd41380 100644
--- a/gcc/fortran/trans-openmp.cc
+++ b/gcc/fortran/trans-openmp.cc
@@ -3139,7 +3139,10 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
|| (n->expr && gfc_expr_attr (n->expr).pointer)))
always_modifier = true;
- switch (n->u.map_op)
+ if (n->u.map.readonly)
+ OMP_CLAUSE_MAP_READONLY (node) = 1;
+
+ switch (n->u.map.op)
{
case OMP_MAP_ALLOC:
OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_ALLOC);
@@ -3266,8 +3269,8 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
&& n->sym->attr.omp_declare_target
&& (always_modifier || n->sym->attr.pointer)
&& op != EXEC_OMP_TARGET_EXIT_DATA
- && n->u.map_op != OMP_MAP_DELETE
- && n->u.map_op != OMP_MAP_RELEASE)
+ && n->u.map.op != OMP_MAP_DELETE
+ && n->u.map.op != OMP_MAP_RELEASE)
{
gcc_assert (n->sym->ts.u.cl->backend_decl);
node5 = build_omp_clause (input_location, OMP_CLAUSE_MAP);
@@ -3333,7 +3336,7 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
{
enum gomp_map_kind gmk = GOMP_MAP_POINTER;
if (op == EXEC_OMP_TARGET_EXIT_DATA
- && n->u.map_op == OMP_MAP_DELETE)
+ && n->u.map.op == OMP_MAP_DELETE)
gmk = GOMP_MAP_DELETE;
else if (op == EXEC_OMP_TARGET_EXIT_DATA)
gmk = GOMP_MAP_RELEASE;
@@ -3356,7 +3359,7 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
{
enum gomp_map_kind gmk;
if (op == EXEC_OMP_TARGET_EXIT_DATA
- && n->u.map_op == OMP_MAP_DELETE)
+ && n->u.map.op == OMP_MAP_DELETE)
gmk = GOMP_MAP_DELETE;
else if (op == EXEC_OMP_TARGET_EXIT_DATA)
gmk = GOMP_MAP_RELEASE;
@@ -3388,18 +3391,18 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
node2 = build_omp_clause (input_location, OMP_CLAUSE_MAP);
OMP_CLAUSE_DECL (node2) = decl;
OMP_CLAUSE_SIZE (node2) = TYPE_SIZE_UNIT (type);
- if (n->u.map_op == OMP_MAP_DELETE)
+ if (n->u.map.op == OMP_MAP_DELETE)
map_kind = GOMP_MAP_DELETE;
else if (op == EXEC_OMP_TARGET_EXIT_DATA
- || n->u.map_op == OMP_MAP_RELEASE)
+ || n->u.map.op == OMP_MAP_RELEASE)
map_kind = GOMP_MAP_RELEASE;
else
map_kind = GOMP_MAP_TO_PSET;
OMP_CLAUSE_SET_MAP_KIND (node2, map_kind);
if (op != EXEC_OMP_TARGET_EXIT_DATA
- && n->u.map_op != OMP_MAP_DELETE
- && n->u.map_op != OMP_MAP_RELEASE)
+ && n->u.map.op != OMP_MAP_DELETE
+ && n->u.map.op != OMP_MAP_RELEASE)
{
node3 = build_omp_clause (input_location,
OMP_CLAUSE_MAP);
@@ -3417,7 +3420,7 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
= gfc_conv_descriptor_data_get (decl);
OMP_CLAUSE_SIZE (node3) = size_int (0);
- if (n->u.map_op == OMP_MAP_ATTACH)
+ if (n->u.map.op == OMP_MAP_ATTACH)
{
/* Standalone attach clauses used with arrays with
descriptors must copy the descriptor to the
@@ -3433,7 +3436,7 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
node3 = NULL;
goto finalize_map_clause;
}
- else if (n->u.map_op == OMP_MAP_DETACH)
+ else if (n->u.map.op == OMP_MAP_DETACH)
{
OMP_CLAUSE_SET_MAP_KIND (node3, GOMP_MAP_DETACH);
/* Similarly to above, we don't want to unmap PTR
@@ -3626,8 +3629,8 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
to perform a single attach/detach operation, of the
pointer itself, not of the pointed-to object. */
if (openacc
- && (n->u.map_op == OMP_MAP_ATTACH
- || n->u.map_op == OMP_MAP_DETACH))
+ && (n->u.map.op == OMP_MAP_ATTACH
+ || n->u.map.op == OMP_MAP_DETACH))
{
OMP_CLAUSE_DECL (node)
= build_fold_addr_expr (OMP_CLAUSE_DECL (node));
@@ -3656,7 +3659,7 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
se.string_length),
TYPE_SIZE_UNIT (tmp));
gomp_map_kind kind;
- if (n->u.map_op == OMP_MAP_DELETE)
+ if (n->u.map.op == OMP_MAP_DELETE)
kind = GOMP_MAP_DELETE;
else if (op == EXEC_OMP_TARGET_EXIT_DATA)
kind = GOMP_MAP_RELEASE;
@@ -3713,8 +3716,8 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
to perform a single attach/detach operation, of the
pointer itself, not of the pointed-to object. */
if (openacc
- && (n->u.map_op == OMP_MAP_ATTACH
- || n->u.map_op == OMP_MAP_DETACH))
+ && (n->u.map.op == OMP_MAP_ATTACH
+ || n->u.map.op == OMP_MAP_DETACH))
{
OMP_CLAUSE_DECL (node)
= build_fold_addr_expr (inner);
@@ -3806,8 +3809,8 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
{
/* Bare attach and detach clauses don't want any
additional nodes. */
- if ((n->u.map_op == OMP_MAP_ATTACH
- || n->u.map_op == OMP_MAP_DETACH)
+ if ((n->u.map.op == OMP_MAP_ATTACH
+ || n->u.map.op == OMP_MAP_DETACH)
&& (POINTER_TYPE_P (TREE_TYPE (inner))
|| GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (inner))))
{
@@ -3840,8 +3843,8 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
map_kind = ((GOMP_MAP_ALWAYS_P (map_kind)
|| gfc_expr_attr (n->expr).pointer)
? GOMP_MAP_ALWAYS_TO : GOMP_MAP_TO);
- else if (n->u.map_op == OMP_MAP_RELEASE
- || n->u.map_op == OMP_MAP_DELETE)
+ else if (n->u.map.op == OMP_MAP_RELEASE
+ || n->u.map.op == OMP_MAP_DELETE)
;
else if (op == EXEC_OMP_TARGET_EXIT_DATA
|| op == EXEC_OACC_EXIT_DATA)
@@ -4088,6 +4091,8 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
}
if (n->u.present_modifier)
OMP_CLAUSE_MOTION_PRESENT (node) = 1;
+ if (list == OMP_LIST_CACHE && n->u.map.readonly)
+ OMP_CLAUSE__CACHE__READONLY (node) = 1;
omp_clauses = gfc_trans_add_clause (node, omp_clauses);
}
break;
@@ -6561,7 +6566,7 @@ gfc_add_clause_implicitly (gfc_omp_clauses *clauses_out,
n2->where = n->where;
n2->sym = n->sym;
if (is_target)
- n2->u.map_op = OMP_MAP_TOFROM;
+ n2->u.map.op = OMP_MAP_TOFROM;
if (tail)
{
tail->next = n2;
diff --git a/gcc/testsuite/c-c++-common/goacc/readonly-1.c b/gcc/testsuite/c-c++-common/goacc/readonly-1.c
new file mode 100644
index 00000000000..34fc92c24d5
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/readonly-1.c
@@ -0,0 +1,59 @@
+/* { dg-additional-options "-fdump-tree-original" } */
+
+struct S
+{
+ int *ptr;
+ float f;
+};
+
+int a[32], b[32];
+#pragma acc declare copyin(readonly: a) copyin(b)
+
+int main (void)
+{
+ int x[32], y[32];
+ struct S s = {x, 0};
+
+ #pragma acc parallel copyin(readonly: x[:32], s.ptr[:16]) copyin(y[:32])
+ {
+ #pragma acc cache (readonly: x[:32])
+ #pragma acc cache (y[:32])
+ }
+
+ #pragma acc kernels copyin(readonly: x[:32], s.ptr[:16]) copyin(y[:32])
+ {
+ #pragma acc cache (readonly: x[:32])
+ #pragma acc cache (y[:32])
+ }
+
+ #pragma acc serial copyin(readonly: x[:32], s.ptr[:16]) copyin(y[:32])
+ {
+ #pragma acc cache (readonly: x[:32])
+ #pragma acc cache (y[:32])
+ }
+
+ #pragma acc data copyin(readonly: x[:32], s.ptr[:16]) copyin(y[:32])
+ {
+ #pragma acc cache (readonly: x[:32])
+ #pragma acc cache (y[:32])
+ }
+
+ #pragma acc enter data copyin(readonly: x[:32], s.ptr[:16]) copyin(y[:32])
+
+ return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */
+/* { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */
+/* { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */
+/* { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */
+/* { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */
+
+/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */
+/* { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */
+/* { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */
+/* { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */
+/* { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */
+
+/* { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(readonly:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\);$" 4 "original" } } */
+/* { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\);$" 4 "original" } } */
diff --git a/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90 b/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90
new file mode 100644
index 00000000000..696ebd08321
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90
@@ -0,0 +1,89 @@
+! { dg-additional-options "-fdump-tree-original" }
+
+subroutine foo (a, n)
+ integer :: n, a(:)
+ integer :: i, b(n), c(n)
+ !$acc parallel copyin(readonly: a(:), b(:n)) copyin(c(:))
+ do i = 1,32
+ !$acc cache (readonly: a(:), b(:n))
+ !$acc cache (c(:))
+ enddo
+ !$acc end parallel
+
+ !$acc kernels copyin(readonly: a(:), b(:n)) copyin(c(:))
+ do i = 1,32
+ !$acc cache (readonly: a(:), b(:n))
+ !$acc cache (c(:))
+ enddo
+ !$acc end kernels
+
+ !$acc serial copyin(readonly: a(:), b(:n)) copyin(c(:))
+ do i = 1,32
+ !$acc cache (readonly: a(:), b(:n))
+ !$acc cache (c(:))
+ enddo
+ !$acc end serial
+
+ !$acc data copyin(readonly: a(:), b(:n)) copyin(c(:))
+ do i = 1,32
+ !$acc cache (readonly: a(:), b(:n))
+ !$acc cache (c(:))
+ enddo
+ !$acc end data
+
+ !$acc enter data copyin(readonly: a(:), b(:n)) copyin(c(:))
+
+end subroutine foo
+
+program main
+ integer :: g(32), h(32)
+ integer :: i, n = 32, a(32)
+ integer :: b(32), c(32)
+
+ !$acc declare copyin(readonly: g), copyin(h)
+
+ !$acc parallel copyin(readonly: a(:32), b(:n)) copyin(c(:))
+ do i = 1,32
+ !$acc cache (readonly: a(:), b(:n))
+ !$acc cache (c(:))
+ enddo
+ !$acc end parallel
+
+ !$acc kernels copyin(readonly: a(:), b(:n)) copyin(c(:))
+ do i = 1,32
+ !$acc cache (readonly: a(:), b(:n))
+ !$acc cache (c(:))
+ enddo
+ !$acc end kernels
+
+ !$acc serial copyin(readonly: a(:), b(:n)) copyin(c(:))
+ do i = 1,32
+ !$acc cache (readonly: a(:), b(:n))
+ !$acc cache (c(:))
+ enddo
+ !$acc end serial
+
+ !$acc data copyin(readonly: a(:), b(:n)) copyin(c(:))
+ do i = 1,32
+ !$acc cache (readonly: a(:), b(:n))
+ !$acc cache (c(:))
+ enddo
+ !$acc end data
+
+ !$acc enter data copyin(readonly: a(:), b(:n)) copyin(c(:))
+
+end program main
+
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(readonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(readonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(readonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(readonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
+
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(readonly:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\) \\(readonly:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\);" 8 "original" } }
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\);" 8 "original" } }
diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index 8a89462bd7e..d529712306d 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -1344,6 +1344,12 @@ struct GTY(()) tree_base {
TYPE_READONLY in
all types
+ OMP_CLAUSE_MAP_READONLY in
+ OMP_CLAUSE_MAP
+
+ OMP_CLAUSE__CACHE__READONLY in
+ OMP_CLAUSE__CACHE_
+
constant_flag:
TREE_CONSTANT in
diff --git a/gcc/tree-pretty-print.cc b/gcc/tree-pretty-print.cc
index 654f5247e3a..926f7e006a7 100644
--- a/gcc/tree-pretty-print.cc
+++ b/gcc/tree-pretty-print.cc
@@ -913,6 +913,8 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, dump_flags_t flags)
case OMP_CLAUSE_MAP:
pp_string (pp, "map(");
+ if (OMP_CLAUSE_MAP_READONLY (clause))
+ pp_string (pp, "readonly,");
switch (OMP_CLAUSE_MAP_KIND (clause))
{
case GOMP_MAP_ALLOC:
@@ -1095,6 +1097,8 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, dump_flags_t flags)
case OMP_CLAUSE__CACHE_:
pp_string (pp, "(");
+ if (OMP_CLAUSE__CACHE__READONLY (clause))
+ pp_string (pp, "readonly:");
dump_generic_node (pp, OMP_CLAUSE_DECL (clause),
spc, flags, false);
goto print_clause_size;
diff --git a/gcc/tree.h b/gcc/tree.h
index e1fc6c2221d..b67a37d6522 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -1841,6 +1841,14 @@ class auto_suppress_location_wrappers
#define OMP_CLAUSE_MAP_DECL_MAKE_ADDRESSABLE(NODE) \
(OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP)->base.addressable_flag)
+/* Nonzero if OpenACC 'readonly' modifier set, used for 'copyin'. */
+#define OMP_CLAUSE_MAP_READONLY(NODE) \
+ TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP))
+
+/* Same as above, for use in OpenACC cache directives. */
+#define OMP_CLAUSE__CACHE__READONLY(NODE) \
+ TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE__CACHE_))
+
/* True on an OMP_CLAUSE_USE_DEVICE_PTR with an OpenACC 'if_present'
clause. */
#define OMP_CLAUSE_USE_DEVICE_PTR_IF_PRESENT(NODE) \
* Re: [PATCH, OpenACC 2.7, v2] readonly modifier support in front-ends
2024-03-07 8:02 ` Chung-Lin Tang
@ 2024-03-13 9:12 ` Thomas Schwinge
2024-03-14 15:09 ` OpenACC 2.7: front-end support for readonly modifier: Add basic OpenACC 'declare' testing (was: [PATCH, OpenACC 2.7, v2] readonly modifier support in front-ends) Thomas Schwinge
0 siblings, 1 reply; 18+ messages in thread
From: Thomas Schwinge @ 2024-03-13 9:12 UTC (permalink / raw)
To: Chung-Lin Tang; +Cc: Tobias Burnus, gcc-patches, fortran
Hi Chung-Lin!
On 2024-03-07T17:02:02+0900, Chung-Lin Tang <cltang@pllab.cs.nthu.edu.tw> wrote:
> On 2023/10/26 6:43 PM, Thomas Schwinge wrote:
>>>>>> +++ b/gcc/tree.h
>>>>>> @@ -1813,6 +1813,14 @@ class auto_suppress_location_wrappers
>>>>>> #define OMP_CLAUSE_MAP_DECL_MAKE_ADDRESSABLE(NODE) \
>>>>>> (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP)->base.addressable_flag)
>>>>>>
>>>>>> +/* Nonzero if OpenACC 'readonly' modifier set, used for 'copyin'. */
>>>>>> +#define OMP_CLAUSE_MAP_READONLY(NODE) \
>>>>>> + TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP))
>>>>>> +
>>>>>> +/* Same as above, for use in OpenACC cache directives. */
>>>>>> +#define OMP_CLAUSE__CACHE__READONLY(NODE) \
>>>>>> + TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE__CACHE_))
>>>>> I'm not sure if these special accessor functions are actually useful, or
>>>>> we should just directly use 'TREE_READONLY' instead? We're only using
>>>>> them in contexts where it's clear that the 'OMP_CLAUSE_SUBCODE_CHECK' is
>>>>> satisfied, for example.
>>>> I find directly using TREE_READONLY confusing.
>>>
>>> FWIW, I've changed to use TREE_NOTHROW instead, if it can give a better sense of safety :P
>>
>> I don't understand that, why not use 'TREE_READONLY'?
>>
>>> I think there's a misunderstanding here anyways: we are not relying on a DECL marked
>>> TREE_READONLY here. We merely need the OMP_CLAUSE_MAP to be marked as OMP_CLAUSE_MAP_READONLY == 1.
>>
>> Yes, I understand that. My question was why we don't just use
>> 'TREE_READONLY (c)', where 'c' is the
>> 'OMP_CLAUSE_MAP'/'OMP_CLAUSE__CACHE_' clause (not its decl), and avoid
>> the indirection through
>> '#define OMP_CLAUSE_MAP_READONLY'/'#define OMP_CLAUSE__CACHE__READONLY',
>> given that we're only using them in contexts where it's clear that the
>> 'OMP_CLAUSE_SUBCODE_CHECK' is satisfied. I don't have a strong
>> preference, though.
>
> After further re-testing using TREE_NOTHROW, I have reverted to using TREE_READONLY
ACK, thanks.
> because TREE_NOTHROW clashes
> with OMP_CLAUSE_RELEASE_DESCRIPTOR (which doesn't use the OMP_CLAUSE_MAP_* naming convention and is
> not documented in gcc/tree-core.h either, hmmm...)
Yeah, it's a mess... The same bits of information spread over three
different places.
(One day I'll turn 'tree's into a proper C++ class hierarchy, with
accessor methods for such flags, statically checked at compile-time, and
thus documented in a single place. Etc.)
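(Editorial aside for readers following the thread: the clash described above comes from several per-clause accessor macros sharing the same few flag bits of a node. The following is a minimal sketch of that situation, not GCC code. The names 'struct node', 'subcode_check', 'MAP_READONLY', 'MAP_RELEASE_DESCRIPTOR' and 'MAP_READONLY_CLASHING' are hypothetical stand-ins, and the assumption that the existing accessor occupies the "nothrow" bit is taken only from the remark quoted above.)

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for a tree node and its clause codes.  */
enum clause_code { CLAUSE_MAP, CLAUSE_CACHE };

struct node
{
  enum clause_code code;
  unsigned readonly_flag : 1;
  unsigned nothrow_flag : 1;
};

/* Checked access: abort if N does not carry the expected clause code.
   This mimics the role of a subcode-check macro.  */
struct node *
subcode_check (struct node *n, enum clause_code expected)
{
  assert (n->code == expected);
  return n;
}

/* An existing accessor, assumed here to occupy the "nothrow" bit.  */
#define MAP_RELEASE_DESCRIPTOR(N) \
  (subcode_check ((N), CLAUSE_MAP)->nothrow_flag)

/* New accessor on a bit that no other map-clause accessor uses.  */
#define MAP_READONLY(N) \
  (subcode_check ((N), CLAUSE_MAP)->readonly_flag)

/* The same accessor placed on the already-occupied bit.  */
#define MAP_READONLY_CLASHING(N) \
  (subcode_check ((N), CLAUSE_MAP)->nothrow_flag)

/* Setting 'readonly' on its own bit leaves the other flag untouched.  */
bool
distinct_bits_demo (void)
{
  struct node n = { CLAUSE_MAP, 0, 0 };
  MAP_READONLY (&n) = 1;
  return MAP_RELEASE_DESCRIPTOR (&n);
}

/* Setting 'readonly' on the shared bit silently sets the other flag too.  */
bool
clashing_bits_demo (void)
{
  struct node n = { CLAUSE_MAP, 0, 0 };
  MAP_READONLY_CLASHING (&n) = 1;
  return MAP_RELEASE_DESCRIPTOR (&n);
}
```

In this sketch 'distinct_bits_demo' returns false and 'clashing_bits_demo' returns true. Nothing in the macro definitions themselves reveals the overlap, which is why a single documented list of flag uses matters.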
> I have added the comment adjustments in gcc/tree-core.h for the new uses of TREE_READONLY/readonly_flag.
>
> We basically all use OMP_CLAUSE_SUBCODE_CHECK macros for OpenMP clause expressions exclusively,
> so I don't see a reason to diverge from that style (even when context is clear).
ACK.
> I have greatly expanded the test scan patterns to include parallel/kernels/serial/data/enter data,
> as well as non-readonly copyin clause together with readonly.
Thanks.
> Also added simple 'declare' tests, but there is not anything to scan in the 'tree-original' dump though.
Yeah, the current OpenACC 'declare' implementation is "special".
>>> --- a/gcc/fortran/openmp.cc
>>> +++ b/gcc/fortran/openmp.cc
>>> @@ -1197,7 +1197,7 @@ omp_inv_mask::omp_inv_mask (const omp_mask &m) : omp_mask (m)
>>>
>>> static bool
>>> gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op,
>>> - bool allow_common, bool allow_derived)
>>> + bool allow_common, bool allow_derived, bool readonly = false)
>>> {
>>> gfc_omp_namelist **head = NULL;
>>> if (gfc_match_omp_variable_list ("", list, allow_common, NULL, &head, true,
>>> @@ -1206,7 +1206,10 @@ gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op,
>>> {
>>> gfc_omp_namelist *n;
>>> for (n = *head; n; n = n->next)
>>> - n->u.map_op = map_op;
>>> + {
>>> + n->u.map.op = map_op;
>>> + n->u.map.readonly = readonly;
>>> + }
>>> return true;
>>> }
>>
>> Didn't we conclude that "not doing it here is cleaner" (Tobias' words),
>> and instead do this "Similar to 'c_parser_omp_var_list_parens'" (my
>> words)? That is, not add the 'bool readonly' formal parameter to
>> 'gfc_match_omp_map_clause'.
>
> Fixed in this v3 patch.
Thanks.
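(Editorial aside: the approach agreed on above keeps the list-matching helper unaware of the modifier. The caller looks ahead for 'readonly' followed by ':', parses the plain variable list, and only then tags the parsed entries. A minimal sketch of that shape follows, assuming a pre-split token array. 'struct var_entry' and 'parse_var_list' are hypothetical names for illustration, not front-end functions.)

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical entry of a parsed variable list.  */
struct var_entry
{
  const char *name;
  bool readonly;
};

/* Parse the tokens found between the parentheses of a copyin or cache
   list.  TOKS holds NTOKS token strings; at most CAP entries are stored
   in OUT.  Returns the number of variables stored.  */
size_t
parse_var_list (const char *const *toks, size_t ntoks,
		struct var_entry *out, size_t cap)
{
  size_t pos = 0, count = 0;
  bool readonly = false;

  assert (toks != NULL && out != NULL);

  /* Two-token lookahead: only "readonly" followed by ":" is the
     modifier, so a variable that happens to be named 'readonly'
     still parses as a variable.  */
  if (ntoks >= 2
      && strcmp (toks[0], "readonly") == 0
      && strcmp (toks[1], ":") == 0)
    {
      pos = 2;
      readonly = true;
    }

  /* Parse the plain list; this part knows nothing about modifiers.  */
  for (; pos < ntoks && count < cap; pos++)
    {
      if (strcmp (toks[pos], ",") == 0)
	continue;
      out[count].name = toks[pos];
      out[count].readonly = false;
      count++;
    }

  /* Tag the freshly parsed entries afterwards.  */
  if (readonly)
    for (size_t i = 0; i < count; i++)
      out[i].readonly = true;

  return count;
}
```

Keeping the modifier handling in the caller means the generic list parser needs no extra parameter, and other clauses that reuse it are unaffected.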
> Again, tested on x86_64-linux + nvptx offloading. Okay for mainline?
Yes, thanks.
Regards
Thomas
> gcc/c/ChangeLog:
>
> * c-parser.cc (c_parser_oacc_data_clause): Add parsing support for
> 'readonly' modifier, set OMP_CLAUSE_MAP_READONLY if readonly modifier
> found, update comments.
> (c_parser_oacc_cache): Add parsing support for 'readonly' modifier,
> set OMP_CLAUSE__CACHE__READONLY if readonly modifier found, update
> comments.
>
> gcc/cp/ChangeLog:
>
> * parser.cc (cp_parser_oacc_data_clause): Add parsing support for
> 'readonly' modifier, set OMP_CLAUSE_MAP_READONLY if readonly modifier
> found, update comments.
> (cp_parser_oacc_cache): Add parsing support for 'readonly' modifier,
> set OMP_CLAUSE__CACHE__READONLY if readonly modifier found, update
> comments.
>
> gcc/fortran/ChangeLog:
>
> * dump-parse-tree.cc (show_omp_namelist): Print "readonly," for
> OMP_LIST_MAP and OMP_LIST_CACHE if n->u.map.readonly is set.
> Adjust 'n->u.map_op' to 'n->u.map.op'.
> * gfortran.h (typedef struct gfc_omp_namelist): Adjust map_op as
> 'ENUM_BITFIELD (gfc_omp_map_op) op:8', add 'bool readonly' field,
> change to named struct field 'map'.
>
> * openmp.cc (gfc_match_omp_map_clause): Adjust 'n->u.map_op' to
> 'n->u.map.op'.
> (gfc_match_omp_clause_reduction): Likewise.
>
> (gfc_match_omp_clauses): Add readonly modifier parsing for OpenACC
> copyin clause, set 'n->u.map.op' and 'n->u.map.readonly' for parsed
> clause. Adjust 'n->u.map_op' to 'n->u.map.op'.
> (gfc_match_oacc_declare): Adjust 'n->u.map_op' to 'n->u.map.op'.
> (gfc_match_oacc_cache): Add readonly modifier parsing for OpenACC
> cache directive.
> (resolve_omp_clauses): Adjust 'n->u.map_op' to 'n->u.map.op'.
> * trans-decl.cc (add_clause): Adjust 'n->u.map_op' to 'n->u.map.op'.
> (finish_oacc_declare): Likewise.
> * trans-openmp.cc (gfc_trans_omp_clauses): Set OMP_CLAUSE_MAP_READONLY,
> OMP_CLAUSE__CACHE__READONLY to 1 when readonly is set. Adjust
> 'n->u.map_op' to 'n->u.map.op'.
> (gfc_add_clause_implicitly): Adjust 'n->u.map_op' to 'n->u.map.op'.
>
> gcc/ChangeLog:
> * tree.h (OMP_CLAUSE_MAP_READONLY): New macro.
> (OMP_CLAUSE__CACHE__READONLY): New macro.
> * tree-core.h (struct GTY(()) tree_base): Adjust comments for new
> uses of readonly_flag bit in OMP_CLAUSE_MAP_READONLY and
> OMP_CLAUSE__CACHE__READONLY.
> * tree-pretty-print.cc (dump_omp_clause): Add support for printing
> OMP_CLAUSE_MAP_READONLY and OMP_CLAUSE__CACHE__READONLY.
>
> gcc/testsuite/ChangeLog:
>
> * c-c++-common/goacc/readonly-1.c: New test.
> * gfortran.dg/goacc/readonly-1.f90: New test.
>
> diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
> index 53e99aa29d9..00f8bf4376e 100644
> --- a/gcc/c/c-parser.cc
> +++ b/gcc/c/c-parser.cc
> @@ -15627,7 +15627,11 @@ c_parser_omp_var_list_parens (c_parser *parser, enum omp_clause_code kind,
> OpenACC 2.6:
> no_create ( variable-list )
> attach ( variable-list )
> - detach ( variable-list ) */
> + detach ( variable-list )
> +
> + OpenACC 2.7:
> + copyin (readonly : variable-list )
> + */
>
> static tree
> c_parser_oacc_data_clause (c_parser *parser, pragma_omp_clause c_kind,
> @@ -15680,11 +15684,37 @@ c_parser_oacc_data_clause (c_parser *parser, pragma_omp_clause c_kind,
> default:
> gcc_unreachable ();
> }
> - tree nl, c;
> - nl = c_parser_omp_var_list_parens (parser, OMP_CLAUSE_MAP, list, false);
>
> - for (c = nl; c != list; c = OMP_CLAUSE_CHAIN (c))
> - OMP_CLAUSE_SET_MAP_KIND (c, kind);
> + tree nl = list;
> + bool readonly = false;
> + location_t open_loc = c_parser_peek_token (parser)->location;
> + matching_parens parens;
> + if (parens.require_open (parser))
> + {
> + /* Turn on readonly modifier parsing for copyin clause. */
> + if (c_kind == PRAGMA_OACC_CLAUSE_COPYIN)
> + {
> + c_token *token = c_parser_peek_token (parser);
> + if (token->type == CPP_NAME
> + && !strcmp (IDENTIFIER_POINTER (token->value), "readonly")
> + && c_parser_peek_2nd_token (parser)->type == CPP_COLON)
> + {
> + c_parser_consume_token (parser);
> + c_parser_consume_token (parser);
> + readonly = true;
> + }
> + }
> + nl = c_parser_omp_variable_list (parser, open_loc, OMP_CLAUSE_MAP, list,
> + false);
> + parens.skip_until_found_close (parser);
> + }
> +
> + for (tree c = nl; c != list; c = OMP_CLAUSE_CHAIN (c))
> + {
> + OMP_CLAUSE_SET_MAP_KIND (c, kind);
> + if (readonly)
> + OMP_CLAUSE_MAP_READONLY (c) = 1;
> + }
>
> return nl;
> }
> @@ -19821,15 +19851,39 @@ c_parser_omp_structured_block (c_parser *parser, bool *if_p)
> /* OpenACC 2.0:
> # pragma acc cache (variable-list) new-line
>
> + OpenACC 2.7:
> + # pragma acc cache (readonly: variable-list) new-line
> +
> LOC is the location of the #pragma token.
> */
>
> static tree
> c_parser_oacc_cache (location_t loc, c_parser *parser)
> {
> - tree stmt, clauses;
> + tree stmt, clauses = NULL_TREE;
> + bool readonly = false;
> + location_t open_loc = c_parser_peek_token (parser)->location;
> + matching_parens parens;
> + if (parens.require_open (parser))
> + {
> + c_token *token = c_parser_peek_token (parser);
> + if (token->type == CPP_NAME
> + && !strcmp (IDENTIFIER_POINTER (token->value), "readonly")
> + && c_parser_peek_2nd_token (parser)->type == CPP_COLON)
> + {
> + c_parser_consume_token (parser);
> + c_parser_consume_token (parser);
> + readonly = true;
> + }
> + clauses = c_parser_omp_variable_list (parser, open_loc,
> + OMP_CLAUSE__CACHE_, NULL_TREE);
> + parens.skip_until_found_close (parser);
> + }
> +
> + if (readonly)
> + for (tree c = clauses; c; c = OMP_CLAUSE_CHAIN (c))
> + OMP_CLAUSE__CACHE__READONLY (c) = 1;
>
> - clauses = c_parser_omp_var_list_parens (parser, OMP_CLAUSE__CACHE_, NULL);
> clauses = c_finish_omp_clauses (clauses, C_ORT_ACC);
>
> c_parser_skip_to_pragma_eol (parser);
> diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
> index e32acfc30a2..4fe27fb07b2 100644
> --- a/gcc/cp/parser.cc
> +++ b/gcc/cp/parser.cc
> @@ -38544,7 +38544,11 @@ cp_parser_omp_var_list (cp_parser *parser, enum omp_clause_code kind, tree list,
> OpenACC 2.6:
> no_create ( variable-list )
> attach ( variable-list )
> - detach ( variable-list ) */
> + detach ( variable-list )
> +
> + OpenACC 2.7:
> + copyin (readonly : variable-list )
> + */
>
> static tree
> cp_parser_oacc_data_clause (cp_parser *parser, pragma_omp_clause c_kind,
> @@ -38597,11 +38601,34 @@ cp_parser_oacc_data_clause (cp_parser *parser, pragma_omp_clause c_kind,
> default:
> gcc_unreachable ();
> }
> - tree nl, c;
> - nl = cp_parser_omp_var_list (parser, OMP_CLAUSE_MAP, list, false);
>
> - for (c = nl; c != list; c = OMP_CLAUSE_CHAIN (c))
> - OMP_CLAUSE_SET_MAP_KIND (c, kind);
> + tree nl = list;
> + bool readonly = false;
> + if (cp_parser_require (parser, CPP_OPEN_PAREN, RT_OPEN_PAREN))
> + {
> + /* Turn on readonly modifier parsing for copyin clause. */
> + if (c_kind == PRAGMA_OACC_CLAUSE_COPYIN)
> + {
> + cp_token *token = cp_lexer_peek_token (parser->lexer);
> + if (token->type == CPP_NAME
> + && !strcmp (IDENTIFIER_POINTER (token->u.value), "readonly")
> + && cp_lexer_peek_nth_token (parser->lexer, 2)->type == CPP_COLON)
> + {
> + cp_lexer_consume_token (parser->lexer);
> + cp_lexer_consume_token (parser->lexer);
> + readonly = true;
> + }
> + }
> + nl = cp_parser_omp_var_list_no_open (parser, OMP_CLAUSE_MAP, list, NULL,
> + false);
> + }
> +
> + for (tree c = nl; c != list; c = OMP_CLAUSE_CHAIN (c))
> + {
> + OMP_CLAUSE_SET_MAP_KIND (c, kind);
> + if (readonly)
> + OMP_CLAUSE_MAP_READONLY (c) = 1;
> + }
>
> return nl;
> }
> @@ -47178,6 +47205,9 @@ cp_parser_omp_target (cp_parser *parser, cp_token *pragma_tok,
>
> /* OpenACC 2.0:
> # pragma acc cache (variable-list) new-line
> +
> + OpenACC 2.7:
> + # pragma acc cache (readonly: variable-list) new-line
> */
>
> static tree
> @@ -47187,9 +47217,28 @@ cp_parser_oacc_cache (cp_parser *parser, cp_token *pragma_tok)
> clauses. */
> auto_suppress_location_wrappers sentinel;
>
> - tree stmt, clauses;
> + tree stmt, clauses = NULL_TREE;
> + bool readonly = false;
> +
> + if (cp_parser_require (parser, CPP_OPEN_PAREN, RT_OPEN_PAREN))
> + {
> + cp_token *token = cp_lexer_peek_token (parser->lexer);
> + if (token->type == CPP_NAME
> + && !strcmp (IDENTIFIER_POINTER (token->u.value), "readonly")
> + && cp_lexer_peek_nth_token (parser->lexer, 2)->type == CPP_COLON)
> + {
> + cp_lexer_consume_token (parser->lexer);
> + cp_lexer_consume_token (parser->lexer);
> + readonly = true;
> + }
> + clauses = cp_parser_omp_var_list_no_open (parser, OMP_CLAUSE__CACHE_,
> + NULL, NULL);
> + }
> +
> + if (readonly)
> + for (tree c = clauses; c; c = OMP_CLAUSE_CHAIN (c))
> + OMP_CLAUSE__CACHE__READONLY (c) = 1;
>
> - clauses = cp_parser_omp_var_list (parser, OMP_CLAUSE__CACHE_, NULL_TREE);
> clauses = finish_omp_clauses (clauses, C_ORT_ACC);
>
> cp_parser_require_pragma_eol (parser, cp_lexer_peek_token (parser->lexer));
> diff --git a/gcc/fortran/dump-parse-tree.cc b/gcc/fortran/dump-parse-tree.cc
> index 7b154eb3ca7..db84b06289b 100644
> --- a/gcc/fortran/dump-parse-tree.cc
> +++ b/gcc/fortran/dump-parse-tree.cc
> @@ -1400,6 +1400,9 @@ show_omp_namelist (int list_type, gfc_omp_namelist *n)
> fputs (") ALLOCATE(", dumpfile);
> continue;
> }
> + if ((list_type == OMP_LIST_MAP || list_type == OMP_LIST_CACHE)
> + && n->u.map.readonly)
> + fputs ("readonly,", dumpfile);
> if (list_type == OMP_LIST_REDUCTION)
> switch (n->u.reduction_op)
> {
> @@ -1467,7 +1470,7 @@ show_omp_namelist (int list_type, gfc_omp_namelist *n)
> default: break;
> }
> else if (list_type == OMP_LIST_MAP)
> - switch (n->u.map_op)
> + switch (n->u.map.op)
> {
> case OMP_MAP_ALLOC: fputs ("alloc:", dumpfile); break;
> case OMP_MAP_TO: fputs ("to:", dumpfile); break;
> diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
> index ebba2336e12..32b792f85fb 100644
> --- a/gcc/fortran/gfortran.h
> +++ b/gcc/fortran/gfortran.h
> @@ -1363,7 +1363,11 @@ typedef struct gfc_omp_namelist
> {
> gfc_omp_reduction_op reduction_op;
> gfc_omp_depend_doacross_op depend_doacross_op;
> - gfc_omp_map_op map_op;
> + struct
> + {
> + ENUM_BITFIELD (gfc_omp_map_op) op:8;
> + bool readonly;
> + } map;
> gfc_expr *align;
> struct
> {
> diff --git a/gcc/fortran/openmp.cc b/gcc/fortran/openmp.cc
> index 38de60238c0..5c44e666eb9 100644
> --- a/gcc/fortran/openmp.cc
> +++ b/gcc/fortran/openmp.cc
> @@ -1210,7 +1210,7 @@ gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op,
> {
> gfc_omp_namelist *n;
> for (n = *head; n; n = n->next)
> - n->u.map_op = map_op;
> + n->u.map.op = map_op;
> return true;
> }
>
> @@ -1524,7 +1524,7 @@ gfc_match_omp_clause_reduction (char pc, gfc_omp_clauses *c, bool openacc,
> gfc_omp_namelist *p = gfc_get_omp_namelist (), **tl;
> p->sym = n->sym;
> p->where = p->where;
> - p->u.map_op = OMP_MAP_ALWAYS_TOFROM;
> + p->u.map.op = OMP_MAP_ALWAYS_TOFROM;
>
> tl = &c->lists[OMP_LIST_MAP];
> while (*tl)
> @@ -2181,11 +2181,25 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
> {
> if (openacc)
> {
> - if (gfc_match ("copyin ( ") == MATCH_YES
> - && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
> - OMP_MAP_TO, true,
> - allow_derived))
> - continue;
> + if (gfc_match ("copyin ( ") == MATCH_YES)
> + {
> + bool readonly = gfc_match ("readonly : ") == MATCH_YES;
> + head = NULL;
> + if (gfc_match_omp_variable_list ("",
> + &c->lists[OMP_LIST_MAP],
> + true, NULL, &head, true,
> + allow_derived)
> + == MATCH_YES)
> + {
> + gfc_omp_namelist *n;
> + for (n = *head; n; n = n->next)
> + {
> + n->u.map.op = OMP_MAP_TO;
> + n->u.map.readonly = readonly;
> + }
> + continue;
> + }
> + }
> }
> else if (gfc_match_omp_variable_list ("copyin (",
> &c->lists[OMP_LIST_COPYIN],
> @@ -3134,7 +3148,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
> {
> gfc_omp_namelist *n;
> for (n = *head; n; n = n->next)
> - n->u.map_op = map_op;
> + n->u.map.op = map_op;
> continue;
> }
> gfc_current_locus = old_loc;
> @@ -4002,7 +4016,7 @@ gfc_match_oacc_declare (void)
> if (gfc_current_ns->proc_name
> && gfc_current_ns->proc_name->attr.flavor == FL_MODULE)
> {
> - if (n->u.map_op != OMP_MAP_ALLOC && n->u.map_op != OMP_MAP_TO)
> + if (n->u.map.op != OMP_MAP_ALLOC && n->u.map.op != OMP_MAP_TO)
> {
> gfc_error ("Invalid clause in module with !$ACC DECLARE at %L",
> &where);
> @@ -4036,7 +4050,7 @@ gfc_match_oacc_declare (void)
> return MATCH_ERROR;
> }
>
> - switch (n->u.map_op)
> + switch (n->u.map.op)
> {
> case OMP_MAP_FORCE_ALLOC:
> case OMP_MAP_ALLOC:
> @@ -4151,21 +4165,36 @@ gfc_match_oacc_wait (void)
> match
> gfc_match_oacc_cache (void)
> {
> + bool readonly = false;
> gfc_omp_clauses *c = gfc_get_omp_clauses ();
> /* The OpenACC cache directive explicitly only allows "array elements or
> subarrays", which we're currently not checking here. Either check this
> after the call of gfc_match_omp_variable_list, or add something like a
> only_sections variant next to its allow_sections parameter. */
> - match m = gfc_match_omp_variable_list (" (",
> - &c->lists[OMP_LIST_CACHE], true,
> - NULL, NULL, true);
> + match m = gfc_match (" ( ");
> if (m != MATCH_YES)
> {
> gfc_free_omp_clauses(c);
> return m;
> }
>
> - if (gfc_current_state() != COMP_DO
> + if (gfc_match ("readonly : ") == MATCH_YES)
> + readonly = true;
> +
> + gfc_omp_namelist **head = NULL;
> + m = gfc_match_omp_variable_list ("", &c->lists[OMP_LIST_CACHE], true,
> + NULL, &head, true);
> + if (m != MATCH_YES)
> + {
> + gfc_free_omp_clauses(c);
> + return m;
> + }
> +
> + if (readonly)
> + for (gfc_omp_namelist *n = *head; n; n = n->next)
> + n->u.map.readonly = true;
> +
> + if (gfc_current_state() != COMP_DO
> && gfc_current_state() != COMP_DO_CONCURRENT)
> {
> gfc_error ("ACC CACHE directive must be inside of loop %C");
> @@ -8436,8 +8465,8 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
> }
> if (openacc
> && list == OMP_LIST_MAP
> - && (n->u.map_op == OMP_MAP_ATTACH
> - || n->u.map_op == OMP_MAP_DETACH))
> + && (n->u.map.op == OMP_MAP_ATTACH
> + || n->u.map.op == OMP_MAP_DETACH))
> {
> symbol_attribute attr;
> if (n->expr)
> @@ -8447,7 +8476,7 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
> if (!attr.pointer && !attr.allocatable)
> gfc_error ("%qs clause argument must be ALLOCATABLE or "
> "a POINTER at %L",
> - (n->u.map_op == OMP_MAP_ATTACH) ? "attach"
> + (n->u.map.op == OMP_MAP_ATTACH) ? "attach"
> : "detach", &n->where);
> }
> if (lastref
> @@ -8518,7 +8547,7 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
> else if (openacc)
> {
> if (list == OMP_LIST_MAP
> - && n->u.map_op == OMP_MAP_FORCE_DEVICEPTR)
> + && n->u.map.op == OMP_MAP_FORCE_DEVICEPTR)
> resolve_oacc_deviceptr_clause (n->sym, n->where, name);
> else
> resolve_oacc_data_clauses (n->sym, n->where, name);
> @@ -8540,7 +8569,7 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
> {
> case EXEC_OMP_TARGET:
> case EXEC_OMP_TARGET_DATA:
> - switch (n->u.map_op)
> + switch (n->u.map.op)
> {
> case OMP_MAP_TO:
> case OMP_MAP_ALWAYS_TO:
> @@ -8567,7 +8596,7 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
> }
> break;
> case EXEC_OMP_TARGET_ENTER_DATA:
> - switch (n->u.map_op)
> + switch (n->u.map.op)
> {
> case OMP_MAP_TO:
> case OMP_MAP_ALWAYS_TO:
> @@ -8577,16 +8606,16 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
> case OMP_MAP_PRESENT_ALLOC:
> break;
> case OMP_MAP_TOFROM:
> - n->u.map_op = OMP_MAP_TO;
> + n->u.map.op = OMP_MAP_TO;
> break;
> case OMP_MAP_ALWAYS_TOFROM:
> - n->u.map_op = OMP_MAP_ALWAYS_TO;
> + n->u.map.op = OMP_MAP_ALWAYS_TO;
> break;
> case OMP_MAP_PRESENT_TOFROM:
> - n->u.map_op = OMP_MAP_PRESENT_TO;
> + n->u.map.op = OMP_MAP_PRESENT_TO;
> break;
> case OMP_MAP_ALWAYS_PRESENT_TOFROM:
> - n->u.map_op = OMP_MAP_ALWAYS_PRESENT_TO;
> + n->u.map.op = OMP_MAP_ALWAYS_PRESENT_TO;
> break;
> default:
> gfc_error ("TARGET ENTER DATA with map-type other "
> @@ -8596,7 +8625,7 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
> }
> break;
> case EXEC_OMP_TARGET_EXIT_DATA:
> - switch (n->u.map_op)
> + switch (n->u.map.op)
> {
> case OMP_MAP_FROM:
> case OMP_MAP_ALWAYS_FROM:
> @@ -8606,16 +8635,16 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
> case OMP_MAP_DELETE:
> break;
> case OMP_MAP_TOFROM:
> - n->u.map_op = OMP_MAP_FROM;
> + n->u.map.op = OMP_MAP_FROM;
> break;
> case OMP_MAP_ALWAYS_TOFROM:
> - n->u.map_op = OMP_MAP_ALWAYS_FROM;
> + n->u.map.op = OMP_MAP_ALWAYS_FROM;
> break;
> case OMP_MAP_PRESENT_TOFROM:
> - n->u.map_op = OMP_MAP_PRESENT_FROM;
> + n->u.map.op = OMP_MAP_PRESENT_FROM;
> break;
> case OMP_MAP_ALWAYS_PRESENT_TOFROM:
> - n->u.map_op = OMP_MAP_ALWAYS_PRESENT_FROM;
> + n->u.map.op = OMP_MAP_ALWAYS_PRESENT_FROM;
> break;
> default:
> gfc_error ("TARGET EXIT DATA with map-type other "
> diff --git a/gcc/fortran/trans-decl.cc b/gcc/fortran/trans-decl.cc
> index 6d463036966..b7dea11461f 100644
> --- a/gcc/fortran/trans-decl.cc
> +++ b/gcc/fortran/trans-decl.cc
> @@ -6744,7 +6744,7 @@ add_clause (gfc_symbol *sym, gfc_omp_map_op map_op)
>
> n = gfc_get_omp_namelist ();
> n->sym = sym;
> - n->u.map_op = map_op;
> + n->u.map.op = map_op;
>
> if (!module_oacc_clauses)
> module_oacc_clauses = gfc_get_omp_clauses ();
> @@ -6846,10 +6846,10 @@ finish_oacc_declare (gfc_namespace *ns, gfc_symbol *sym, bool block)
>
> for (n = omp_clauses->lists[OMP_LIST_MAP]; n; n = n->next)
> {
> - switch (n->u.map_op)
> + switch (n->u.map.op)
> {
> case OMP_MAP_DEVICE_RESIDENT:
> - n->u.map_op = OMP_MAP_FORCE_ALLOC;
> + n->u.map.op = OMP_MAP_FORCE_ALLOC;
> break;
>
> default:
> diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc
> index a2bf15665b3..fa1bfd41380 100644
> --- a/gcc/fortran/trans-openmp.cc
> +++ b/gcc/fortran/trans-openmp.cc
> @@ -3139,7 +3139,10 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
> || (n->expr && gfc_expr_attr (n->expr).pointer)))
> always_modifier = true;
>
> - switch (n->u.map_op)
> + if (n->u.map.readonly)
> + OMP_CLAUSE_MAP_READONLY (node) = 1;
> +
> + switch (n->u.map.op)
> {
> case OMP_MAP_ALLOC:
> OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_ALLOC);
> @@ -3266,8 +3269,8 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
> && n->sym->attr.omp_declare_target
> && (always_modifier || n->sym->attr.pointer)
> && op != EXEC_OMP_TARGET_EXIT_DATA
> - && n->u.map_op != OMP_MAP_DELETE
> - && n->u.map_op != OMP_MAP_RELEASE)
> + && n->u.map.op != OMP_MAP_DELETE
> + && n->u.map.op != OMP_MAP_RELEASE)
> {
> gcc_assert (n->sym->ts.u.cl->backend_decl);
> node5 = build_omp_clause (input_location, OMP_CLAUSE_MAP);
> @@ -3333,7 +3336,7 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
> {
> enum gomp_map_kind gmk = GOMP_MAP_POINTER;
> if (op == EXEC_OMP_TARGET_EXIT_DATA
> - && n->u.map_op == OMP_MAP_DELETE)
> + && n->u.map.op == OMP_MAP_DELETE)
> gmk = GOMP_MAP_DELETE;
> else if (op == EXEC_OMP_TARGET_EXIT_DATA)
> gmk = GOMP_MAP_RELEASE;
> @@ -3356,7 +3359,7 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
> {
> enum gomp_map_kind gmk;
> if (op == EXEC_OMP_TARGET_EXIT_DATA
> - && n->u.map_op == OMP_MAP_DELETE)
> + && n->u.map.op == OMP_MAP_DELETE)
> gmk = GOMP_MAP_DELETE;
> else if (op == EXEC_OMP_TARGET_EXIT_DATA)
> gmk = GOMP_MAP_RELEASE;
> @@ -3388,18 +3391,18 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
> node2 = build_omp_clause (input_location, OMP_CLAUSE_MAP);
> OMP_CLAUSE_DECL (node2) = decl;
> OMP_CLAUSE_SIZE (node2) = TYPE_SIZE_UNIT (type);
> - if (n->u.map_op == OMP_MAP_DELETE)
> + if (n->u.map.op == OMP_MAP_DELETE)
> map_kind = GOMP_MAP_DELETE;
> else if (op == EXEC_OMP_TARGET_EXIT_DATA
> - || n->u.map_op == OMP_MAP_RELEASE)
> + || n->u.map.op == OMP_MAP_RELEASE)
> map_kind = GOMP_MAP_RELEASE;
> else
> map_kind = GOMP_MAP_TO_PSET;
> OMP_CLAUSE_SET_MAP_KIND (node2, map_kind);
>
> if (op != EXEC_OMP_TARGET_EXIT_DATA
> - && n->u.map_op != OMP_MAP_DELETE
> - && n->u.map_op != OMP_MAP_RELEASE)
> + && n->u.map.op != OMP_MAP_DELETE
> + && n->u.map.op != OMP_MAP_RELEASE)
> {
> node3 = build_omp_clause (input_location,
> OMP_CLAUSE_MAP);
> @@ -3417,7 +3420,7 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
> = gfc_conv_descriptor_data_get (decl);
> OMP_CLAUSE_SIZE (node3) = size_int (0);
>
> - if (n->u.map_op == OMP_MAP_ATTACH)
> + if (n->u.map.op == OMP_MAP_ATTACH)
> {
> /* Standalone attach clauses used with arrays with
> descriptors must copy the descriptor to the
> @@ -3433,7 +3436,7 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
> node3 = NULL;
> goto finalize_map_clause;
> }
> - else if (n->u.map_op == OMP_MAP_DETACH)
> + else if (n->u.map.op == OMP_MAP_DETACH)
> {
> OMP_CLAUSE_SET_MAP_KIND (node3, GOMP_MAP_DETACH);
> /* Similarly to above, we don't want to unmap PTR
> @@ -3626,8 +3629,8 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
> to perform a single attach/detach operation, of the
> pointer itself, not of the pointed-to object. */
> if (openacc
> - && (n->u.map_op == OMP_MAP_ATTACH
> - || n->u.map_op == OMP_MAP_DETACH))
> + && (n->u.map.op == OMP_MAP_ATTACH
> + || n->u.map.op == OMP_MAP_DETACH))
> {
> OMP_CLAUSE_DECL (node)
> = build_fold_addr_expr (OMP_CLAUSE_DECL (node));
> @@ -3656,7 +3659,7 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
> se.string_length),
> TYPE_SIZE_UNIT (tmp));
> gomp_map_kind kind;
> - if (n->u.map_op == OMP_MAP_DELETE)
> + if (n->u.map.op == OMP_MAP_DELETE)
> kind = GOMP_MAP_DELETE;
> else if (op == EXEC_OMP_TARGET_EXIT_DATA)
> kind = GOMP_MAP_RELEASE;
> @@ -3713,8 +3716,8 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
> to perform a single attach/detach operation, of the
> pointer itself, not of the pointed-to object. */
> if (openacc
> - && (n->u.map_op == OMP_MAP_ATTACH
> - || n->u.map_op == OMP_MAP_DETACH))
> + && (n->u.map.op == OMP_MAP_ATTACH
> + || n->u.map.op == OMP_MAP_DETACH))
> {
> OMP_CLAUSE_DECL (node)
> = build_fold_addr_expr (inner);
> @@ -3806,8 +3809,8 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
> {
> /* Bare attach and detach clauses don't want any
> additional nodes. */
> - if ((n->u.map_op == OMP_MAP_ATTACH
> - || n->u.map_op == OMP_MAP_DETACH)
> + if ((n->u.map.op == OMP_MAP_ATTACH
> + || n->u.map.op == OMP_MAP_DETACH)
> && (POINTER_TYPE_P (TREE_TYPE (inner))
> || GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (inner))))
> {
> @@ -3840,8 +3843,8 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
> map_kind = ((GOMP_MAP_ALWAYS_P (map_kind)
> || gfc_expr_attr (n->expr).pointer)
> ? GOMP_MAP_ALWAYS_TO : GOMP_MAP_TO);
> - else if (n->u.map_op == OMP_MAP_RELEASE
> - || n->u.map_op == OMP_MAP_DELETE)
> + else if (n->u.map.op == OMP_MAP_RELEASE
> + || n->u.map.op == OMP_MAP_DELETE)
> ;
> else if (op == EXEC_OMP_TARGET_EXIT_DATA
> || op == EXEC_OACC_EXIT_DATA)
> @@ -4088,6 +4091,8 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
> }
> if (n->u.present_modifier)
> OMP_CLAUSE_MOTION_PRESENT (node) = 1;
> + if (list == OMP_LIST_CACHE && n->u.map.readonly)
> + OMP_CLAUSE__CACHE__READONLY (node) = 1;
> omp_clauses = gfc_trans_add_clause (node, omp_clauses);
> }
> break;
> @@ -6561,7 +6566,7 @@ gfc_add_clause_implicitly (gfc_omp_clauses *clauses_out,
> n2->where = n->where;
> n2->sym = n->sym;
> if (is_target)
> - n2->u.map_op = OMP_MAP_TOFROM;
> + n2->u.map.op = OMP_MAP_TOFROM;
> if (tail)
> {
> tail->next = n2;
> diff --git a/gcc/testsuite/c-c++-common/goacc/readonly-1.c b/gcc/testsuite/c-c++-common/goacc/readonly-1.c
> new file mode 100644
> index 00000000000..34fc92c24d5
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/goacc/readonly-1.c
> @@ -0,0 +1,59 @@
> +/* { dg-additional-options "-fdump-tree-original" } */
> +
> +struct S
> +{
> + int *ptr;
> + float f;
> +};
> +
> +int a[32], b[32];
> +#pragma acc declare copyin(readonly: a) copyin(b)
> +
> +int main (void)
> +{
> + int x[32], y[32];
> + struct S s = {x, 0};
> +
> + #pragma acc parallel copyin(readonly: x[:32], s.ptr[:16]) copyin(y[:32])
> + {
> + #pragma acc cache (readonly: x[:32])
> + #pragma acc cache (y[:32])
> + }
> +
> + #pragma acc kernels copyin(readonly: x[:32], s.ptr[:16]) copyin(y[:32])
> + {
> + #pragma acc cache (readonly: x[:32])
> + #pragma acc cache (y[:32])
> + }
> +
> + #pragma acc serial copyin(readonly: x[:32], s.ptr[:16]) copyin(y[:32])
> + {
> + #pragma acc cache (readonly: x[:32])
> + #pragma acc cache (y[:32])
> + }
> +
> + #pragma acc data copyin(readonly: x[:32], s.ptr[:16]) copyin(y[:32])
> + {
> + #pragma acc cache (readonly: x[:32])
> + #pragma acc cache (y[:32])
> + }
> +
> + #pragma acc enter data copyin(readonly: x[:32], s.ptr[:16]) copyin(y[:32])
> +
> + return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */
> +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */
> +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */
> +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */
> +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */
> +
> +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */
> +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */
> +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */
> +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */
> +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */
> +
> +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(readonly:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\);$" 4 "original" } } */
> +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\);$" 4 "original" } } */
> diff --git a/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90 b/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90
> new file mode 100644
> index 00000000000..696ebd08321
> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90
> @@ -0,0 +1,89 @@
> +! { dg-additional-options "-fdump-tree-original" }
> +
> +subroutine foo (a, n)
> + integer :: n, a(:)
> + integer :: i, b(n), c(n)
> + !$acc parallel copyin(readonly: a(:), b(:n)) copyin(c(:))
> + do i = 1,32
> + !$acc cache (readonly: a(:), b(:n))
> + !$acc cache (c(:))
> + enddo
> + !$acc end parallel
> +
> + !$acc kernels copyin(readonly: a(:), b(:n)) copyin(c(:))
> + do i = 1,32
> + !$acc cache (readonly: a(:), b(:n))
> + !$acc cache (c(:))
> + enddo
> + !$acc end kernels
> +
> + !$acc serial copyin(readonly: a(:), b(:n)) copyin(c(:))
> + do i = 1,32
> + !$acc cache (readonly: a(:), b(:n))
> + !$acc cache (c(:))
> + enddo
> + !$acc end serial
> +
> + !$acc data copyin(readonly: a(:), b(:n)) copyin(c(:))
> + do i = 1,32
> + !$acc cache (readonly: a(:), b(:n))
> + !$acc cache (c(:))
> + enddo
> + !$acc end data
> +
> + !$acc enter data copyin(readonly: a(:), b(:n)) copyin(c(:))
> +
> +end subroutine foo
> +
> +program main
> + integer :: g(32), h(32)
> + integer :: i, n = 32, a(32)
> + integer :: b(32), c(32)
> +
> + !$acc declare copyin(readonly: g), copyin(h)
> +
> + !$acc parallel copyin(readonly: a(:32), b(:n)) copyin(c(:))
> + do i = 1,32
> + !$acc cache (readonly: a(:), b(:n))
> + !$acc cache (c(:))
> + enddo
> + !$acc end parallel
> +
> + !$acc kernels copyin(readonly: a(:), b(:n)) copyin(c(:))
> + do i = 1,32
> + !$acc cache (readonly: a(:), b(:n))
> + !$acc cache (c(:))
> + enddo
> + !$acc end kernels
> +
> + !$acc serial copyin(readonly: a(:), b(:n)) copyin(c(:))
> + do i = 1,32
> + !$acc cache (readonly: a(:), b(:n))
> + !$acc cache (c(:))
> + enddo
> + !$acc end serial
> +
> + !$acc data copyin(readonly: a(:), b(:n)) copyin(c(:))
> + do i = 1,32
> + !$acc cache (readonly: a(:), b(:n))
> + !$acc cache (c(:))
> + enddo
> + !$acc end data
> +
> + !$acc enter data copyin(readonly: a(:), b(:n)) copyin(c(:))
> +
> +end program main
> +
> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(readonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(readonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(readonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(readonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
> +
> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(readonly:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\) \\(readonly:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\);" 8 "original" } }
> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\);" 8 "original" } }
> diff --git a/gcc/tree-core.h b/gcc/tree-core.h
> index 8a89462bd7e..d529712306d 100644
> --- a/gcc/tree-core.h
> +++ b/gcc/tree-core.h
> @@ -1344,6 +1344,12 @@ struct GTY(()) tree_base {
> TYPE_READONLY in
> all types
>
> + OMP_CLAUSE_MAP_READONLY in
> + OMP_CLAUSE_MAP
> +
> + OMP_CLAUSE__CACHE__READONLY in
> + OMP_CLAUSE__CACHE_
> +
> constant_flag:
>
> TREE_CONSTANT in
> diff --git a/gcc/tree-pretty-print.cc b/gcc/tree-pretty-print.cc
> index 654f5247e3a..926f7e006a7 100644
> --- a/gcc/tree-pretty-print.cc
> +++ b/gcc/tree-pretty-print.cc
> @@ -913,6 +913,8 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, dump_flags_t flags)
>
> case OMP_CLAUSE_MAP:
> pp_string (pp, "map(");
> + if (OMP_CLAUSE_MAP_READONLY (clause))
> + pp_string (pp, "readonly,");
> switch (OMP_CLAUSE_MAP_KIND (clause))
> {
> case GOMP_MAP_ALLOC:
> @@ -1095,6 +1097,8 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, dump_flags_t flags)
>
> case OMP_CLAUSE__CACHE_:
> pp_string (pp, "(");
> + if (OMP_CLAUSE__CACHE__READONLY (clause))
> + pp_string (pp, "readonly:");
> dump_generic_node (pp, OMP_CLAUSE_DECL (clause),
> spc, flags, false);
> goto print_clause_size;
> diff --git a/gcc/tree.h b/gcc/tree.h
> index e1fc6c2221d..b67a37d6522 100644
> --- a/gcc/tree.h
> +++ b/gcc/tree.h
> @@ -1841,6 +1841,14 @@ class auto_suppress_location_wrappers
> #define OMP_CLAUSE_MAP_DECL_MAKE_ADDRESSABLE(NODE) \
> (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP)->base.addressable_flag)
>
> +/* Nonzero if OpenACC 'readonly' modifier set, used for 'copyin'. */
> +#define OMP_CLAUSE_MAP_READONLY(NODE) \
> + TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP))
> +
> +/* Same as above, for use in OpenACC cache directives. */
> +#define OMP_CLAUSE__CACHE__READONLY(NODE) \
> + TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE__CACHE_))
> +
> /* True on an OMP_CLAUSE_USE_DEVICE_PTR with an OpenACC 'if_present'
> clause. */
> #define OMP_CLAUSE_USE_DEVICE_PTR_IF_PRESENT(NODE) \
* OpenACC 2.7: front-end support for readonly modifier: Add basic OpenACC 'declare' testing (was: [PATCH, OpenACC 2.7, v2] readonly modifier support in front-ends)
2024-03-13 9:12 ` Thomas Schwinge
@ 2024-03-14 15:09 ` Thomas Schwinge
From: Thomas Schwinge @ 2024-03-14 15:09 UTC (permalink / raw)
To: Chung-Lin Tang, gcc-patches; +Cc: Tobias Burnus, fortran
Hi!
On 2024-03-13T10:12:17+0100, I wrote:
> On 2024-03-07T17:02:02+0900, Chung-Lin Tang <cltang@pllab.cs.nthu.edu.tw> wrote:
>> Also added simple 'declare' tests, but there is not anything to scan in the 'tree-original' dump though.
>
> Yeah, the current OpenACC 'declare' implementation is "special".
Actually, see the attached commit 38958ac987dc3e6162e2ddaba3c7e7f41381e079
"OpenACC 2.7: front-end support for readonly modifier: Add basic OpenACC 'declare' testing".
But I realized another thing: don't we have to handle the 'readonly'
modifier also in Fortran module files, that is, next to the OpenACC
'declare' 'copyin' handling in 'gcc/fortran/module.cc':
'AB_OACC_DECLARE_COPYIN' etc.? Chung-Lin, please check, via test cases.
'gfortran.dg/goacc/routine-module*', for example, should provide some
guidance on how to achieve actual module file use; the new test cases can then
do the same 'scan-tree-dump' as the current 'readonly' modifier test cases.
I suppose the code changes would look similar to
commit a61f6afbee370785cf091fe46e2e022748528307
"OpenACC 'nohost' clause", for example. By means of only emitting a tag
in the module file if the 'readonly' modifier is specified, we should
maintain compatibility with the current 'MOD_VERSION'.
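To illustrate the compatibility argument only, here is a minimal, self-contained C sketch. It is not the code in 'gcc/fortran/module.cc': the struct, the function names and the 'OACC_DECLARE_COPYIN_READONLY' tag name are all hypothetical, and the module file is modelled as a plain space-separated list of tags. The point is that a writer which emits the new tag only when the flag is set produces unchanged output for existing code, so a reader that does not know the tag still accepts those files.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define TAG_COPYIN "OACC_DECLARE_COPYIN"
/* Hypothetical tag name, chosen for illustration only.  */
#define TAG_READONLY "OACC_DECLARE_COPYIN_READONLY"

/* Toy attribute set for one symbol; not gfortran's symbol_attribute.  */
struct sym_attr
{
  bool copyin;
  bool readonly;
};

/* Write the tags for A into BUF.  A tag is emitted only if its flag is
   set, so symbols without 'readonly' serialize exactly as before.  */
void
write_attr (const struct sym_attr *a, char *buf, size_t size)
{
  assert (size > 0);
  snprintf (buf, size, "%s%s",
            a->copyin ? TAG_COPYIN " " : "",
            a->readonly ? TAG_READONLY " " : "");
}

/* Parse TEXT into A.  KNOWS_READONLY models whether the reader was built
   with support for the new tag.  Any unrecognized tag makes the read
   fail, as a strict reader would.  */
bool
read_attr (const char *text, struct sym_attr *a, bool knows_readonly)
{
  char copy[128];

  a->copyin = a->readonly = false;
  snprintf (copy, sizeof copy, "%s", text);
  for (char *tag = strtok (copy, " "); tag != NULL; tag = strtok (NULL, " "))
    {
      if (strcmp (tag, TAG_COPYIN) == 0)
        a->copyin = true;
      else if (knows_readonly && strcmp (tag, TAG_READONLY) == 0)
        a->readonly = true;
      else
        return false;
    }
  return true;
}
```

In this model, a reader without 'readonly' support still reads every file that does not use the modifier, and only files that actually use 'readonly' require the newer reader. That is the sense in which no version bump would be needed.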
Regards
Thomas
>> diff --git a/gcc/fortran/dump-parse-tree.cc b/gcc/fortran/dump-parse-tree.cc
>> index 7b154eb3ca7..db84b06289b 100644
>> --- a/gcc/fortran/dump-parse-tree.cc
>> +++ b/gcc/fortran/dump-parse-tree.cc
>> @@ -1400,6 +1400,9 @@ show_omp_namelist (int list_type, gfc_omp_namelist *n)
>> fputs (") ALLOCATE(", dumpfile);
>> continue;
>> }
>> + if ((list_type == OMP_LIST_MAP || list_type == OMP_LIST_CACHE)
>> + && n->u.map.readonly)
>> + fputs ("readonly,", dumpfile);
>> if (list_type == OMP_LIST_REDUCTION)
>> switch (n->u.reduction_op)
>> {
>> @@ -1467,7 +1470,7 @@ show_omp_namelist (int list_type, gfc_omp_namelist *n)
>> default: break;
>> }
>> else if (list_type == OMP_LIST_MAP)
>> - switch (n->u.map_op)
>> + switch (n->u.map.op)
>> {
>> case OMP_MAP_ALLOC: fputs ("alloc:", dumpfile); break;
>> case OMP_MAP_TO: fputs ("to:", dumpfile); break;
>> diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
>> index ebba2336e12..32b792f85fb 100644
>> --- a/gcc/fortran/gfortran.h
>> +++ b/gcc/fortran/gfortran.h
>> @@ -1363,7 +1363,11 @@ typedef struct gfc_omp_namelist
>> {
>> gfc_omp_reduction_op reduction_op;
>> gfc_omp_depend_doacross_op depend_doacross_op;
>> - gfc_omp_map_op map_op;
>> + struct
>> + {
>> + ENUM_BITFIELD (gfc_omp_map_op) op:8;
>> + bool readonly;
>> + } map;
>> gfc_expr *align;
>> struct
>> {
>> diff --git a/gcc/fortran/openmp.cc b/gcc/fortran/openmp.cc
>> index 38de60238c0..5c44e666eb9 100644
>> --- a/gcc/fortran/openmp.cc
>> +++ b/gcc/fortran/openmp.cc
>> @@ -1210,7 +1210,7 @@ gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op,
>> {
>> gfc_omp_namelist *n;
>> for (n = *head; n; n = n->next)
>> - n->u.map_op = map_op;
>> + n->u.map.op = map_op;
>> return true;
>> }
>>
>> @@ -1524,7 +1524,7 @@ gfc_match_omp_clause_reduction (char pc, gfc_omp_clauses *c, bool openacc,
>> gfc_omp_namelist *p = gfc_get_omp_namelist (), **tl;
>> p->sym = n->sym;
>> p->where = p->where;
>> - p->u.map_op = OMP_MAP_ALWAYS_TOFROM;
>> + p->u.map.op = OMP_MAP_ALWAYS_TOFROM;
>>
>> tl = &c->lists[OMP_LIST_MAP];
>> while (*tl)
>> @@ -2181,11 +2181,25 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
>> {
>> if (openacc)
>> {
>> - if (gfc_match ("copyin ( ") == MATCH_YES
>> - && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
>> - OMP_MAP_TO, true,
>> - allow_derived))
>> - continue;
>> + if (gfc_match ("copyin ( ") == MATCH_YES)
>> + {
>> + bool readonly = gfc_match ("readonly : ") == MATCH_YES;
>> + head = NULL;
>> + if (gfc_match_omp_variable_list ("",
>> + &c->lists[OMP_LIST_MAP],
>> + true, NULL, &head, true,
>> + allow_derived)
>> + == MATCH_YES)
>> + {
>> + gfc_omp_namelist *n;
>> + for (n = *head; n; n = n->next)
>> + {
>> + n->u.map.op = OMP_MAP_TO;
>> + n->u.map.readonly = readonly;
>> + }
>> + continue;
>> + }
>> + }
>> }
>> else if (gfc_match_omp_variable_list ("copyin (",
>> &c->lists[OMP_LIST_COPYIN],
>> @@ -3134,7 +3148,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
>> {
>> gfc_omp_namelist *n;
>> for (n = *head; n; n = n->next)
>> - n->u.map_op = map_op;
>> + n->u.map.op = map_op;
>> continue;
>> }
>> gfc_current_locus = old_loc;
>> @@ -4002,7 +4016,7 @@ gfc_match_oacc_declare (void)
>> if (gfc_current_ns->proc_name
>> && gfc_current_ns->proc_name->attr.flavor == FL_MODULE)
>> {
>> - if (n->u.map_op != OMP_MAP_ALLOC && n->u.map_op != OMP_MAP_TO)
>> + if (n->u.map.op != OMP_MAP_ALLOC && n->u.map.op != OMP_MAP_TO)
>> {
>> gfc_error ("Invalid clause in module with !$ACC DECLARE at %L",
>> &where);
>> @@ -4036,7 +4050,7 @@ gfc_match_oacc_declare (void)
>> return MATCH_ERROR;
>> }
>>
>> - switch (n->u.map_op)
>> + switch (n->u.map.op)
>> {
>> case OMP_MAP_FORCE_ALLOC:
>> case OMP_MAP_ALLOC:
>> @@ -4151,21 +4165,36 @@ gfc_match_oacc_wait (void)
>> match
>> gfc_match_oacc_cache (void)
>> {
>> + bool readonly = false;
>> gfc_omp_clauses *c = gfc_get_omp_clauses ();
>> /* The OpenACC cache directive explicitly only allows "array elements or
>> subarrays", which we're currently not checking here. Either check this
>> after the call of gfc_match_omp_variable_list, or add something like a
>> only_sections variant next to its allow_sections parameter. */
>> - match m = gfc_match_omp_variable_list (" (",
>> - &c->lists[OMP_LIST_CACHE], true,
>> - NULL, NULL, true);
>> + match m = gfc_match (" ( ");
>> if (m != MATCH_YES)
>> {
>> gfc_free_omp_clauses(c);
>> return m;
>> }
>>
>> - if (gfc_current_state() != COMP_DO
>> + if (gfc_match ("readonly : ") == MATCH_YES)
>> + readonly = true;
>> +
>> + gfc_omp_namelist **head = NULL;
>> + m = gfc_match_omp_variable_list ("", &c->lists[OMP_LIST_CACHE], true,
>> + NULL, &head, true);
>> + if (m != MATCH_YES)
>> + {
>> + gfc_free_omp_clauses(c);
>> + return m;
>> + }
>> +
>> + if (readonly)
>> + for (gfc_omp_namelist *n = *head; n; n = n->next)
>> + n->u.map.readonly = true;
>> +
>> + if (gfc_current_state() != COMP_DO
>> && gfc_current_state() != COMP_DO_CONCURRENT)
>> {
>> gfc_error ("ACC CACHE directive must be inside of loop %C");
>> @@ -8436,8 +8465,8 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
>> }
>> if (openacc
>> && list == OMP_LIST_MAP
>> - && (n->u.map_op == OMP_MAP_ATTACH
>> - || n->u.map_op == OMP_MAP_DETACH))
>> + && (n->u.map.op == OMP_MAP_ATTACH
>> + || n->u.map.op == OMP_MAP_DETACH))
>> {
>> symbol_attribute attr;
>> if (n->expr)
>> @@ -8447,7 +8476,7 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
>> if (!attr.pointer && !attr.allocatable)
>> gfc_error ("%qs clause argument must be ALLOCATABLE or "
>> "a POINTER at %L",
>> - (n->u.map_op == OMP_MAP_ATTACH) ? "attach"
>> + (n->u.map.op == OMP_MAP_ATTACH) ? "attach"
>> : "detach", &n->where);
>> }
>> if (lastref
>> @@ -8518,7 +8547,7 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
>> else if (openacc)
>> {
>> if (list == OMP_LIST_MAP
>> - && n->u.map_op == OMP_MAP_FORCE_DEVICEPTR)
>> + && n->u.map.op == OMP_MAP_FORCE_DEVICEPTR)
>> resolve_oacc_deviceptr_clause (n->sym, n->where, name);
>> else
>> resolve_oacc_data_clauses (n->sym, n->where, name);
>> @@ -8540,7 +8569,7 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
>> {
>> case EXEC_OMP_TARGET:
>> case EXEC_OMP_TARGET_DATA:
>> - switch (n->u.map_op)
>> + switch (n->u.map.op)
>> {
>> case OMP_MAP_TO:
>> case OMP_MAP_ALWAYS_TO:
>> @@ -8567,7 +8596,7 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
>> }
>> break;
>> case EXEC_OMP_TARGET_ENTER_DATA:
>> - switch (n->u.map_op)
>> + switch (n->u.map.op)
>> {
>> case OMP_MAP_TO:
>> case OMP_MAP_ALWAYS_TO:
>> @@ -8577,16 +8606,16 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
>> case OMP_MAP_PRESENT_ALLOC:
>> break;
>> case OMP_MAP_TOFROM:
>> - n->u.map_op = OMP_MAP_TO;
>> + n->u.map.op = OMP_MAP_TO;
>> break;
>> case OMP_MAP_ALWAYS_TOFROM:
>> - n->u.map_op = OMP_MAP_ALWAYS_TO;
>> + n->u.map.op = OMP_MAP_ALWAYS_TO;
>> break;
>> case OMP_MAP_PRESENT_TOFROM:
>> - n->u.map_op = OMP_MAP_PRESENT_TO;
>> + n->u.map.op = OMP_MAP_PRESENT_TO;
>> break;
>> case OMP_MAP_ALWAYS_PRESENT_TOFROM:
>> - n->u.map_op = OMP_MAP_ALWAYS_PRESENT_TO;
>> + n->u.map.op = OMP_MAP_ALWAYS_PRESENT_TO;
>> break;
>> default:
>> gfc_error ("TARGET ENTER DATA with map-type other "
>> @@ -8596,7 +8625,7 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
>> }
>> break;
>> case EXEC_OMP_TARGET_EXIT_DATA:
>> - switch (n->u.map_op)
>> + switch (n->u.map.op)
>> {
>> case OMP_MAP_FROM:
>> case OMP_MAP_ALWAYS_FROM:
>> @@ -8606,16 +8635,16 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
>> case OMP_MAP_DELETE:
>> break;
>> case OMP_MAP_TOFROM:
>> - n->u.map_op = OMP_MAP_FROM;
>> + n->u.map.op = OMP_MAP_FROM;
>> break;
>> case OMP_MAP_ALWAYS_TOFROM:
>> - n->u.map_op = OMP_MAP_ALWAYS_FROM;
>> + n->u.map.op = OMP_MAP_ALWAYS_FROM;
>> break;
>> case OMP_MAP_PRESENT_TOFROM:
>> - n->u.map_op = OMP_MAP_PRESENT_FROM;
>> + n->u.map.op = OMP_MAP_PRESENT_FROM;
>> break;
>> case OMP_MAP_ALWAYS_PRESENT_TOFROM:
>> - n->u.map_op = OMP_MAP_ALWAYS_PRESENT_FROM;
>> + n->u.map.op = OMP_MAP_ALWAYS_PRESENT_FROM;
>> break;
>> default:
>> gfc_error ("TARGET EXIT DATA with map-type other "
>> diff --git a/gcc/fortran/trans-decl.cc b/gcc/fortran/trans-decl.cc
>> index 6d463036966..b7dea11461f 100644
>> --- a/gcc/fortran/trans-decl.cc
>> +++ b/gcc/fortran/trans-decl.cc
>> @@ -6744,7 +6744,7 @@ add_clause (gfc_symbol *sym, gfc_omp_map_op map_op)
>>
>> n = gfc_get_omp_namelist ();
>> n->sym = sym;
>> - n->u.map_op = map_op;
>> + n->u.map.op = map_op;
>>
>> if (!module_oacc_clauses)
>> module_oacc_clauses = gfc_get_omp_clauses ();
>> @@ -6846,10 +6846,10 @@ finish_oacc_declare (gfc_namespace *ns, gfc_symbol *sym, bool block)
>>
>> for (n = omp_clauses->lists[OMP_LIST_MAP]; n; n = n->next)
>> {
>> - switch (n->u.map_op)
>> + switch (n->u.map.op)
>> {
>> case OMP_MAP_DEVICE_RESIDENT:
>> - n->u.map_op = OMP_MAP_FORCE_ALLOC;
>> + n->u.map.op = OMP_MAP_FORCE_ALLOC;
>> break;
>>
>> default:
>> diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc
>> index a2bf15665b3..fa1bfd41380 100644
>> --- a/gcc/fortran/trans-openmp.cc
>> +++ b/gcc/fortran/trans-openmp.cc
>> @@ -3139,7 +3139,10 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
>> || (n->expr && gfc_expr_attr (n->expr).pointer)))
>> always_modifier = true;
>>
>> - switch (n->u.map_op)
>> + if (n->u.map.readonly)
>> + OMP_CLAUSE_MAP_READONLY (node) = 1;
>> +
>> + switch (n->u.map.op)
>> {
>> case OMP_MAP_ALLOC:
>> OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_ALLOC);
>> @@ -3266,8 +3269,8 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
>> && n->sym->attr.omp_declare_target
>> && (always_modifier || n->sym->attr.pointer)
>> && op != EXEC_OMP_TARGET_EXIT_DATA
>> - && n->u.map_op != OMP_MAP_DELETE
>> - && n->u.map_op != OMP_MAP_RELEASE)
>> + && n->u.map.op != OMP_MAP_DELETE
>> + && n->u.map.op != OMP_MAP_RELEASE)
>> {
>> gcc_assert (n->sym->ts.u.cl->backend_decl);
>> node5 = build_omp_clause (input_location, OMP_CLAUSE_MAP);
>> @@ -3333,7 +3336,7 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
>> {
>> enum gomp_map_kind gmk = GOMP_MAP_POINTER;
>> if (op == EXEC_OMP_TARGET_EXIT_DATA
>> - && n->u.map_op == OMP_MAP_DELETE)
>> + && n->u.map.op == OMP_MAP_DELETE)
>> gmk = GOMP_MAP_DELETE;
>> else if (op == EXEC_OMP_TARGET_EXIT_DATA)
>> gmk = GOMP_MAP_RELEASE;
>> @@ -3356,7 +3359,7 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
>> {
>> enum gomp_map_kind gmk;
>> if (op == EXEC_OMP_TARGET_EXIT_DATA
>> - && n->u.map_op == OMP_MAP_DELETE)
>> + && n->u.map.op == OMP_MAP_DELETE)
>> gmk = GOMP_MAP_DELETE;
>> else if (op == EXEC_OMP_TARGET_EXIT_DATA)
>> gmk = GOMP_MAP_RELEASE;
>> @@ -3388,18 +3391,18 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
>> node2 = build_omp_clause (input_location, OMP_CLAUSE_MAP);
>> OMP_CLAUSE_DECL (node2) = decl;
>> OMP_CLAUSE_SIZE (node2) = TYPE_SIZE_UNIT (type);
>> - if (n->u.map_op == OMP_MAP_DELETE)
>> + if (n->u.map.op == OMP_MAP_DELETE)
>> map_kind = GOMP_MAP_DELETE;
>> else if (op == EXEC_OMP_TARGET_EXIT_DATA
>> - || n->u.map_op == OMP_MAP_RELEASE)
>> + || n->u.map.op == OMP_MAP_RELEASE)
>> map_kind = GOMP_MAP_RELEASE;
>> else
>> map_kind = GOMP_MAP_TO_PSET;
>> OMP_CLAUSE_SET_MAP_KIND (node2, map_kind);
>>
>> if (op != EXEC_OMP_TARGET_EXIT_DATA
>> - && n->u.map_op != OMP_MAP_DELETE
>> - && n->u.map_op != OMP_MAP_RELEASE)
>> + && n->u.map.op != OMP_MAP_DELETE
>> + && n->u.map.op != OMP_MAP_RELEASE)
>> {
>> node3 = build_omp_clause (input_location,
>> OMP_CLAUSE_MAP);
>> @@ -3417,7 +3420,7 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
>> = gfc_conv_descriptor_data_get (decl);
>> OMP_CLAUSE_SIZE (node3) = size_int (0);
>>
>> - if (n->u.map_op == OMP_MAP_ATTACH)
>> + if (n->u.map.op == OMP_MAP_ATTACH)
>> {
>> /* Standalone attach clauses used with arrays with
>> descriptors must copy the descriptor to the
>> @@ -3433,7 +3436,7 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
>> node3 = NULL;
>> goto finalize_map_clause;
>> }
>> - else if (n->u.map_op == OMP_MAP_DETACH)
>> + else if (n->u.map.op == OMP_MAP_DETACH)
>> {
>> OMP_CLAUSE_SET_MAP_KIND (node3, GOMP_MAP_DETACH);
>> /* Similarly to above, we don't want to unmap PTR
>> @@ -3626,8 +3629,8 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
>> to perform a single attach/detach operation, of the
>> pointer itself, not of the pointed-to object. */
>> if (openacc
>> - && (n->u.map_op == OMP_MAP_ATTACH
>> - || n->u.map_op == OMP_MAP_DETACH))
>> + && (n->u.map.op == OMP_MAP_ATTACH
>> + || n->u.map.op == OMP_MAP_DETACH))
>> {
>> OMP_CLAUSE_DECL (node)
>> = build_fold_addr_expr (OMP_CLAUSE_DECL (node));
>> @@ -3656,7 +3659,7 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
>> se.string_length),
>> TYPE_SIZE_UNIT (tmp));
>> gomp_map_kind kind;
>> - if (n->u.map_op == OMP_MAP_DELETE)
>> + if (n->u.map.op == OMP_MAP_DELETE)
>> kind = GOMP_MAP_DELETE;
>> else if (op == EXEC_OMP_TARGET_EXIT_DATA)
>> kind = GOMP_MAP_RELEASE;
>> @@ -3713,8 +3716,8 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
>> to perform a single attach/detach operation, of the
>> pointer itself, not of the pointed-to object. */
>> if (openacc
>> - && (n->u.map_op == OMP_MAP_ATTACH
>> - || n->u.map_op == OMP_MAP_DETACH))
>> + && (n->u.map.op == OMP_MAP_ATTACH
>> + || n->u.map.op == OMP_MAP_DETACH))
>> {
>> OMP_CLAUSE_DECL (node)
>> = build_fold_addr_expr (inner);
>> @@ -3806,8 +3809,8 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
>> {
>> /* Bare attach and detach clauses don't want any
>> additional nodes. */
>> - if ((n->u.map_op == OMP_MAP_ATTACH
>> - || n->u.map_op == OMP_MAP_DETACH)
>> + if ((n->u.map.op == OMP_MAP_ATTACH
>> + || n->u.map.op == OMP_MAP_DETACH)
>> && (POINTER_TYPE_P (TREE_TYPE (inner))
>> || GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (inner))))
>> {
>> @@ -3840,8 +3843,8 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
>> map_kind = ((GOMP_MAP_ALWAYS_P (map_kind)
>> || gfc_expr_attr (n->expr).pointer)
>> ? GOMP_MAP_ALWAYS_TO : GOMP_MAP_TO);
>> - else if (n->u.map_op == OMP_MAP_RELEASE
>> - || n->u.map_op == OMP_MAP_DELETE)
>> + else if (n->u.map.op == OMP_MAP_RELEASE
>> + || n->u.map.op == OMP_MAP_DELETE)
>> ;
>> else if (op == EXEC_OMP_TARGET_EXIT_DATA
>> || op == EXEC_OACC_EXIT_DATA)
>> @@ -4088,6 +4091,8 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
>> }
>> if (n->u.present_modifier)
>> OMP_CLAUSE_MOTION_PRESENT (node) = 1;
>> + if (list == OMP_LIST_CACHE && n->u.map.readonly)
>> + OMP_CLAUSE__CACHE__READONLY (node) = 1;
>> omp_clauses = gfc_trans_add_clause (node, omp_clauses);
>> }
>> break;
>> @@ -6561,7 +6566,7 @@ gfc_add_clause_implicitly (gfc_omp_clauses *clauses_out,
>> n2->where = n->where;
>> n2->sym = n->sym;
>> if (is_target)
>> - n2->u.map_op = OMP_MAP_TOFROM;
>> + n2->u.map.op = OMP_MAP_TOFROM;
>> if (tail)
>> {
>> tail->next = n2;
>> diff --git a/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90 b/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90
>> new file mode 100644
>> index 00000000000..696ebd08321
>> --- /dev/null
>> +++ b/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90
>> @@ -0,0 +1,89 @@
>> +! { dg-additional-options "-fdump-tree-original" }
>> +
>> +subroutine foo (a, n)
>> + integer :: n, a(:)
>> + integer :: i, b(n), c(n)
>> + !$acc parallel copyin(readonly: a(:), b(:n)) copyin(c(:))
>> + do i = 1,32
>> + !$acc cache (readonly: a(:), b(:n))
>> + !$acc cache (c(:))
>> + enddo
>> + !$acc end parallel
>> +
>> + !$acc kernels copyin(readonly: a(:), b(:n)) copyin(c(:))
>> + do i = 1,32
>> + !$acc cache (readonly: a(:), b(:n))
>> + !$acc cache (c(:))
>> + enddo
>> + !$acc end kernels
>> +
>> + !$acc serial copyin(readonly: a(:), b(:n)) copyin(c(:))
>> + do i = 1,32
>> + !$acc cache (readonly: a(:), b(:n))
>> + !$acc cache (c(:))
>> + enddo
>> + !$acc end serial
>> +
>> + !$acc data copyin(readonly: a(:), b(:n)) copyin(c(:))
>> + do i = 1,32
>> + !$acc cache (readonly: a(:), b(:n))
>> + !$acc cache (c(:))
>> + enddo
>> + !$acc end data
>> +
>> + !$acc enter data copyin(readonly: a(:), b(:n)) copyin(c(:))
>> +
>> +end subroutine foo
>> +
>> +program main
>> + integer :: g(32), h(32)
>> + integer :: i, n = 32, a(32)
>> + integer :: b(32), c(32)
>> +
>> + !$acc declare copyin(readonly: g), copyin(h)
>> +
>> + !$acc parallel copyin(readonly: a(:32), b(:n)) copyin(c(:))
>> + do i = 1,32
>> + !$acc cache (readonly: a(:), b(:n))
>> + !$acc cache (c(:))
>> + enddo
>> + !$acc end parallel
>> +
>> + !$acc kernels copyin(readonly: a(:), b(:n)) copyin(c(:))
>> + do i = 1,32
>> + !$acc cache (readonly: a(:), b(:n))
>> + !$acc cache (c(:))
>> + enddo
>> + !$acc end kernels
>> +
>> + !$acc serial copyin(readonly: a(:), b(:n)) copyin(c(:))
>> + do i = 1,32
>> + !$acc cache (readonly: a(:), b(:n))
>> + !$acc cache (c(:))
>> + enddo
>> + !$acc end serial
>> +
>> + !$acc data copyin(readonly: a(:), b(:n)) copyin(c(:))
>> + do i = 1,32
>> + !$acc cache (readonly: a(:), b(:n))
>> + !$acc cache (c(:))
>> + enddo
>> + !$acc end data
>> +
>> + !$acc enter data copyin(readonly: a(:), b(:n)) copyin(c(:))
>> +
>> +end program main
>> +
>> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
>> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
>> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
>> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(readonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
>> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
>> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(readonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
>> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
>> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(readonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
>> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
>> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(readonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
>> +
>> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(readonly:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\) \\(readonly:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\);" 8 "original" } }
>> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\);" 8 "original" } }
[-- Attachment #2: 0001-OpenACC-2.7-front-end-support-for-readonly-modifier-.patch --]
[-- Type: text/x-diff, Size: 3887 bytes --]
From 38958ac987dc3e6162e2ddaba3c7e7f41381e079 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <tschwinge@baylibre.com>
Date: Thu, 14 Mar 2024 15:01:01 +0100
Subject: [PATCH] OpenACC 2.7: front-end support for readonly modifier: Add
basic OpenACC 'declare' testing
... to complement commit ddf852dac2abaca317c10b8323f338123b0585c8
"OpenACC 2.7: front-end support for readonly modifier".
gcc/testsuite/
* c-c++-common/goacc/readonly-1.c: Add basic OpenACC 'declare'
testing.
* gfortran.dg/goacc/readonly-1.f90: Likewise.
---
gcc/testsuite/c-c++-common/goacc/readonly-1.c | 5 +++++
gcc/testsuite/gfortran.dg/goacc/readonly-1.f90 | 6 ++++++
2 files changed, 11 insertions(+)
diff --git a/gcc/testsuite/c-c++-common/goacc/readonly-1.c b/gcc/testsuite/c-c++-common/goacc/readonly-1.c
index 34fc92c24d5..300464c92e3 100644
--- a/gcc/testsuite/c-c++-common/goacc/readonly-1.c
+++ b/gcc/testsuite/c-c++-common/goacc/readonly-1.c
@@ -8,12 +8,15 @@ struct S
int a[32], b[32];
#pragma acc declare copyin(readonly: a) copyin(b)
+/* Not visible in 'original' dump; handled via 'offload_vars'. */
int main (void)
{
int x[32], y[32];
struct S s = {x, 0};
+ #pragma acc declare copyin(readonly: x/*[:32]*/, s/*.ptr[:16]*/) copyin(y/*[:32]*/)
+
#pragma acc parallel copyin(readonly: x[:32], s.ptr[:16]) copyin(y[:32])
{
#pragma acc cache (readonly: x[:32])
@@ -43,6 +46,8 @@ int main (void)
return 0;
}
+/* { dg-final { scan-tree-dump-times "(?n)#pragma acc declare map\\(to:y\\) map\\(readonly,to:s\\) map\\(readonly,to:x\\)" 1 "original" } } */
+
/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */
/* { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */
/* { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */
diff --git a/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90 b/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90
index 696ebd08321..fc1e2719e67 100644
--- a/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90
+++ b/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90
@@ -3,6 +3,9 @@
subroutine foo (a, n)
integer :: n, a(:)
integer :: i, b(n), c(n)
+ !!$acc declare copyin(readonly: a(:), b(:n)) copyin(c(:))
+ !$acc declare copyin(readonly: b) copyin(c)
+
!$acc parallel copyin(readonly: a(:), b(:n)) copyin(c(:))
do i = 1,32
!$acc cache (readonly: a(:), b(:n))
@@ -74,6 +77,9 @@ program main
end program main
+! The front end turns OpenACC 'declare' into OpenACC 'data'.
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(readonly,to:\\*b\\) map\\(alloc:b.+ map\\(to:\\*c\\) map\\(alloc:c.+" 1 "original" } }
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(readonly,to:g\\) map\\(to:h\\)" 1 "original" } }
! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
! { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
--
2.34.1
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: OpenACC 2.7: front-end support for readonly modifier: Add basic OpenACC 'declare' testing
2024-03-14 15:09 ` OpenACC 2.7: front-end support for readonly modifier: Add basic OpenACC 'declare' testing (was: [PATCH, OpenACC 2.7, v2] readonly modifier support in front-ends) Thomas Schwinge
@ 2024-03-14 16:55 ` Tobias Burnus
2024-03-14 16:55 ` Tobias Burnus
1 sibling, 0 replies; 18+ messages in thread
From: Tobias Burnus @ 2024-03-14 16:55 UTC (permalink / raw)
To: Thomas Schwinge, Chung-Lin Tang, gcc-patches; +Cc: Tobias Burnus, fortran
[-- Attachment #1: Type: text/plain, Size: 1351 bytes --]
Hi all, hi Thomas & Chung-Lin,
Thomas Schwinge wrote:
> But I realized another thing: don't we have to handle the 'readonly'
> modifier also in Fortran module files, that is, next to the OpenACC
> 'declare' 'copyin' handling in 'gcc/fortran/module.cc':
> 'AB_OACC_DECLARE_COPYIN' etc.?
I bet so; it is not as bad as with the others as it is "only" an
optimization hint, but it makes sense to make it available.
Note that when you place the 'module' in the same file as the module
users ('use'), the compiler might know things because they are in the
same translation unit / file not because it is in the module ...
> Chung-Lin, please check, via test cases.
> 'gfortran.dg/goacc/routine-module*', for example, should provide some
> guidance of how to achieve actual module file use, and then do the same
> 'scan-tree-dump' as in the current 'readonly' modifier test cases.
...
> By means of only emitting a tag
> in the module file if the 'readonly' modifier is specified, we should
> maintain compatibility with the current 'MOD_VERSION'.
That was the idea: If only new information gets added (if used), older
compilers still work. This has huge limitations and does not work as
well as imagined but here it should work: Older .mod will work with new
compilers, even though the reverse might not be true.
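
To make the compatibility argument concrete, here is a minimal sketch in C. It is a toy model, not gfortran's module.cc: the struct, the function names and the tag string "OACC_DECLARE_COPYIN_READONLY" are all hypothetical. It only shows why emitting a tag solely when 'readonly' is used keeps old files readable by new readers, while a file that carries the new tag is rejected by a reader that predates it.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical symbol attributes; not gfortran's real data structures.  */
struct toy_attr
{
  bool oacc_declare_copyin;
  bool oacc_declare_copyin_readonly;  /* hypothetical new flag */
};

/* Write the attributes as space-separated tags.  The new tag is written
   only when its flag is set, so the output for code that never uses
   'readonly' is identical to what an older writer would produce.  */
static void
toy_write_attrs (const struct toy_attr *a, char *buf, size_t len)
{
  assert (len > 0);
  buf[0] = '\0';
  if (a->oacc_declare_copyin)
    strncat (buf, "OACC_DECLARE_COPYIN ", len - strlen (buf) - 1);
  if (a->oacc_declare_copyin_readonly)
    strncat (buf, "OACC_DECLARE_COPYIN_READONLY ", len - strlen (buf) - 1);
}

/* Read the tags back.  KNOWS_READONLY selects whether this reader
   understands the new tag; a reader that does not treats it as an
   unknown tag and returns false.  */
static bool
toy_read_attrs (const char *text, bool knows_readonly, struct toy_attr *out)
{
  char copy[256];

  memset (out, 0, sizeof *out);
  strncpy (copy, text, sizeof copy - 1);
  copy[sizeof copy - 1] = '\0';

  for (char *tok = strtok (copy, " "); tok; tok = strtok (NULL, " "))
    {
      if (strcmp (tok, "OACC_DECLARE_COPYIN") == 0)
        out->oacc_declare_copyin = true;
      else if (knows_readonly
               && strcmp (tok, "OACC_DECLARE_COPYIN_READONLY") == 0)
        out->oacc_declare_copyin_readonly = true;
      else
        return false;  /* unknown tag */
    }
  return true;
}
```

In this model a file written without the new tag is accepted by both kinds of reader, and a file written with it is accepted only by the reader that knows the tag, which is the asymmetry described above.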
Tobias
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH, OpenACC 2.7] Connect readonly modifier to points-to analysis
2023-10-30 12:46 ` Richard Biener
@ 2024-04-03 11:50 ` Chung-Lin Tang
2024-04-11 14:29 ` Thomas Schwinge
2024-05-16 12:36 ` Richard Biener
0 siblings, 2 replies; 18+ messages in thread
From: Chung-Lin Tang @ 2024-04-03 11:50 UTC (permalink / raw)
To: gcc-patches, Richard Biener, Thomas Schwinge; +Cc: Chung-Lin Tang
[-- Attachment #1: Type: text/plain, Size: 5518 bytes --]
Hi Richard, Thomas,
On 2023/10/30 8:46 PM, Richard Biener wrote:
>>
>> What Chung-Lin's first patch does is mark the OMP clause for 'x' (not the
>> 'x' decl itself!) as 'readonly', via a new 'OMP_CLAUSE_MAP_READONLY'
>> flag.
>>
>> The actual optimization then is done in this second patch. Chung-Lin
>> found that he could use 'SSA_NAME_POINTS_TO_READONLY_MEMORY' for that.
>> I don't have much experience with most of the following generic code, so
>> would appreciate a helping hand, whether that conceptually makes sense as
>> well as from the implementation point of view:
First of all, I have removed all of the gimplify-stage scanning and setting of
DECL_POINTS_TO_READONLY and SSA_NAME_POINTS_TO_READONLY_MEMORY (so there are no changes to
gimplify.cc now).
I remember this code was an artifact of earlier attempts to allow struct-member
pointer mappings to also work (e.g. map(readonly:rec.ptr[:N])), but those attempts failed anyway.
I think the omp_data_* member accesses made when building the child-function-side
receiver_refs are blocking points-to analysis from working (I didn't try digging deeper).
Also, during gimplify, VAR_DECLs appeared to be reused (at least in some cases) for map
clause decl reference building, so hoping that the variables "happen to be" single-use and
relaying DECL_POINTS_TO_READONLY into SSA_NAME_POINTS_TO_READONLY_MEMORY does appear to be
a little risky.
However, for firstprivate pointers processed during omp-low, the situation appears to be somewhat different
(see the description below).
> No, I don't think you can use that flag on non-default-defs, nor
> preserve it on copying. So
> it also doesn't nicely extend to DECLs as done by the patch. We
> currently _only_ use it
> for incoming parameters. When used on arbitrary code you can get to for example
>
> ptr1(points-to-readonly-memory) = &p->x;
> ... access via ptr1 ...
> ptr2 = &p->x;
> ... access via ptr2 ...
>
> where both are your OMP regions differently constrained (the constraint is on the
> code in the region, _not_ on the actual protections of the pointed to
> data, much like
> for the fortran case). But now CSE comes along and happily replaces all ptr2
> with ptr2 in the second region and ... oops!
Richard, I assume what you meant was "happily replaces all ptr2 with ptr1 in the second region"?
That doesn't happen, because during omp-lower/expand, OMP target regions (which are all that
this currently applies to) are separated into different individual child functions.
(Currently, the only "effective" use of DECL_POINTS_TO_READONLY is during omp-lower, when
for firstprivate pointers (i.e. 'a' here) we set this bit when constructing the first load
of this pointer.)
#pragma acc parallel copyin(readonly: a[:32]) copyout(r)
{
foo (a, a[8]);
r = a[8];
}
#pragma acc parallel copyin(readonly: a[:32]) copyout(r)
{
foo (a, a[12]);
r = a[12];
}
After omp-expand (before SSA):
__attribute__((oacc parallel, omp target entrypoint, noclone))
void main._omp_fn.1 (const struct .omp_data_t.3 & restrict .omp_data_i)
{
...
<bb 5> :
D.2962 = .omp_data_i->D.2947;
a.8 = D.2962;
r.1 = (*a.8)[12];
foo (a.8, r.1);
r.1 = (*a.8)[12];
D.2965 = .omp_data_i->r;
*D.2965 = r.1;
return;
}
__attribute__((oacc parallel, omp target entrypoint, noclone))
void main._omp_fn.0 (const struct .omp_data_t.2 & restrict .omp_data_i)
{
...
<bb 3> :
D.2968 = .omp_data_i->D.2939;
a.4 = D.2968;
r.0 = (*a.4)[8];
foo (a.4, r.0);
r.0 = (*a.4)[8];
D.2971 = .omp_data_i->r;
*D.2971 = r.0;
return;
}
So the creation of DECL_POINTS_TO_READONLY and its relaying to
SSA_NAME_POINTS_TO_READONLY_MEMORY here is actually quite similar to a default-def
for a PARM_DECL, at least conceptually.
(If offloading were structured significantly differently, say if child functions
were separated much earlier, before omp-lowering, then this readonly modifier might
possibly be a direct application of 'r' in the "fn spec" attribute.)
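
The flag relay described here can be sketched as a toy model in C. This is only an illustration under stated assumptions: the toy_* types and functions are hypothetical stand-ins, not GCC's tree nodes, copy_var_decl or make_ssa_name_fn. The sketch shows the two steps the ChangeLog below names: a copied variable inherits the "points to readonly" bit, and an SSA name created for a flagged variable gets the corresponding SSA-level bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a VAR_DECL carrying a DECL_POINTS_TO_READONLY-like bit.  */
struct toy_decl
{
  const char *name;
  bool points_to_readonly;
};

/* Toy stand-in for an SSA name carrying a
   SSA_NAME_POINTS_TO_READONLY_MEMORY-like bit.  */
struct toy_ssa_name
{
  const struct toy_decl *var;  /* NULL models an anonymous SSA name */
  unsigned version;
  bool points_to_readonly_memory;
};

/* Copying a variable preserves the flag, so a copy made for a child
   function still records that it points to read-only data.  */
static struct toy_decl
toy_copy_var_decl (const struct toy_decl *var, const char *name)
{
  assert (var != NULL);
  struct toy_decl copy = { name, var->points_to_readonly };
  return copy;
}

/* Creating an SSA name relays the flag from the underlying variable,
   if there is one.  */
static struct toy_ssa_name
toy_make_ssa_name (const struct toy_decl *var, unsigned version)
{
  struct toy_ssa_name n
    = { var, version, var != NULL && var->points_to_readonly };
  return n;
}
```

In this model the bit originates in exactly one place (the flagged variable) and every SSA name derived from it inherits the bit, which is the sense in which the situation resembles a default-def of a parameter.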
Other changes since the first version of the patch include:
1) update of C/C++ FE changes to new style in c-family/c-omp.cc
2) merging of two if cases in fortran/trans-openmp.cc like Thomas suggested
3) Update of readonly-2.c testcase to scan before/after "fre1" pass, to verify removal of a MEM load, also as Thomas suggested.
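
For readers unfamiliar with what "removal of a MEM load" means in item 3, here is a small plain-C sketch. It is not the readonly-2.c testcase and contains no OpenACC directives; the toy_* names are hypothetical. The first function reloads a[8] after a call, as the source is written; the second reuses the value loaded before the call, which is the form an optimizer may produce once it knows the callee cannot modify the pointed-to data.

```c
#include <assert.h>

/* Counter so that the call has an observable effect in this toy.  */
static int toy_calls;

/* Stand-in for the callee 'foo' from the example above; it does not
   write through A, which is what 'readonly' promises for the region.  */
static void
toy_foo (const int *a, int v)
{
  (void) a;
  toy_calls += v;
}

/* As written in the source: a[8] is loaded again after the call.  */
static int
toy_region_reload (const int *a)
{
  toy_foo (a, a[8]);
  return a[8];
}

/* After redundancy elimination: the value loaded before the call is
   reused, so only one load of a[8] remains.  */
static int
toy_region_reuse (const int *a)
{
  int t = a[8];
  toy_foo (a, t);
  return t;
}
```

Both functions return the same value as long as nothing writes to the array during the call. Without that guarantee the compiler has to keep the second load.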
I have re-tested this patch using mainline, with no regressions. Is this okay for mainline?
Thanks,
Chung-Lin
2024-04-03 Chung-Lin Tang <cltang@baylibre.com>
gcc/c-family/ChangeLog:
* c-omp.cc (c_omp_address_inspector::expand_array_base):
Set OMP_CLAUSE_MAP_POINTS_TO_READONLY on pointer clause.
(c_omp_address_inspector::expand_component_selector): Likewise.
gcc/fortran/ChangeLog:
* trans-openmp.cc (gfc_trans_omp_array_section):
Set OMP_CLAUSE_MAP_POINTS_TO_READONLY on pointer clause.
gcc/ChangeLog:
* gimple-expr.cc (copy_var_decl): Copy DECL_POINTS_TO_READONLY
for VAR_DECLs.
* omp-low.cc (lower_omp_target): Set DECL_POINTS_TO_READONLY for
variables of receiver refs.
* tree-pretty-print.cc (dump_omp_clause):
Print OMP_CLAUSE_MAP_POINTS_TO_READONLY.
(dump_generic_node): Print SSA_NAME_POINTS_TO_READONLY_MEMORY.
* tree-ssanames.cc (make_ssa_name_fn): Set
SSA_NAME_POINTS_TO_READONLY_MEMORY if DECL_POINTS_TO_READONLY is set.
* tree.h (DECL_POINTS_TO_READONLY): New macro.
(OMP_CLAUSE_MAP_POINTS_TO_READONLY): New macro.
gcc/testsuite/ChangeLog:
* c-c++-common/goacc/readonly-1.c: Adjust testcase.
* c-c++-common/goacc/readonly-2.c: New testcase.
* gfortran.dg/goacc/readonly-1.f90: Adjust testcase.
[-- Attachment #2: readonly-pt-v2.patch --]
[-- Type: text/plain, Size: 18246 bytes --]
diff --git a/gcc/c-family/c-omp.cc b/gcc/c-family/c-omp.cc
index c0e02aa422f..458df1434ed 100644
--- a/gcc/c-family/c-omp.cc
+++ b/gcc/c-family/c-omp.cc
@@ -3907,6 +3907,8 @@ c_omp_address_inspector::expand_array_base (tree c,
}
else if (c2)
{
+ if (OMP_CLAUSE_MAP_READONLY (c))
+ OMP_CLAUSE_MAP_POINTS_TO_READONLY (c2) = 1;
OMP_CLAUSE_CHAIN (c2) = OMP_CLAUSE_CHAIN (c);
OMP_CLAUSE_CHAIN (c) = c2;
if (implicit_p)
@@ -4051,6 +4053,8 @@ c_omp_address_inspector::expand_component_selector (tree c,
}
else if (c2)
{
+ if (OMP_CLAUSE_MAP_READONLY (c))
+ OMP_CLAUSE_MAP_POINTS_TO_READONLY (c2) = 1;
OMP_CLAUSE_CHAIN (c2) = OMP_CLAUSE_CHAIN (c);
OMP_CLAUSE_CHAIN (c) = c2;
c = c2;
diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc
index f867e2240bf..1b4bdb90cb6 100644
--- a/gcc/fortran/trans-openmp.cc
+++ b/gcc/fortran/trans-openmp.cc
@@ -2561,6 +2561,8 @@ gfc_trans_omp_array_section (stmtblock_t *block, gfc_exec_op op,
ptr2 = fold_convert (ptrdiff_type_node, ptr2);
OMP_CLAUSE_SIZE (node3) = fold_build2 (MINUS_EXPR, ptrdiff_type_node,
ptr, ptr2);
+ if (n->u.map.readonly)
+ OMP_CLAUSE_MAP_POINTS_TO_READONLY (node3) = 1;
}
static tree
diff --git a/gcc/gimple-expr.cc b/gcc/gimple-expr.cc
index f8d7185530c..35aca9dc979 100644
--- a/gcc/gimple-expr.cc
+++ b/gcc/gimple-expr.cc
@@ -385,6 +385,8 @@ copy_var_decl (tree var, tree name, tree type)
DECL_CONTEXT (copy) = DECL_CONTEXT (var);
TREE_USED (copy) = 1;
DECL_SEEN_IN_BIND_EXPR_P (copy) = 1;
+ if (VAR_P (var))
+ DECL_POINTS_TO_READONLY (copy) = DECL_POINTS_TO_READONLY (var);
DECL_ATTRIBUTES (copy) = DECL_ATTRIBUTES (var);
if (DECL_USER_ALIGN (var))
{
diff --git a/gcc/omp-low.cc b/gcc/omp-low.cc
index 4d003f42098..3c1024d563a 100644
--- a/gcc/omp-low.cc
+++ b/gcc/omp-low.cc
@@ -14003,6 +14003,8 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
if (ref_to_array)
x = fold_convert_loc (clause_loc, TREE_TYPE (new_var), x);
gimplify_expr (&x, &new_body, NULL, is_gimple_val, fb_rvalue);
+ if (OMP_CLAUSE_MAP_POINTS_TO_READONLY (c) && VAR_P (x))
+ DECL_POINTS_TO_READONLY (x) = 1;
if ((is_ref && !ref_to_array)
|| ref_to_ptr)
{
diff --git a/gcc/testsuite/c-c++-common/goacc/readonly-1.c b/gcc/testsuite/c-c++-common/goacc/readonly-1.c
index 300464c92e3..88b6bb9efcf 100644
--- a/gcc/testsuite/c-c++-common/goacc/readonly-1.c
+++ b/gcc/testsuite/c-c++-common/goacc/readonly-1.c
@@ -48,17 +48,17 @@ int main (void)
/* { dg-final { scan-tree-dump-times "(?n)#pragma acc declare map\\(to:y\\) map\\(readonly,to:s\\) map\\(readonly,to:x\\)" 1 "original" } } */
-/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */
-/* { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */
-/* { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */
-/* { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */
-/* { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */
+/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,firstprivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 "original" { target { c } } } } */
+/* { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,firstprivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 "original" { target { c } } } } */
+/* { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,firstprivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 "original" { target { c } } } } */
+/* { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */
+/* { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */
-/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */
-/* { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */
-/* { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */
-/* { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */
-/* { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */
+/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,firstprivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 "original" { target { c++ } } } } */
+/* { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,firstprivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 "original" { target { c++ } } } } */
+/* { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,firstprivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 "original" { target { c++ } } } } */
+/* { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */
+/* { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */
/* { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(readonly:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\);$" 4 "original" } } */
/* { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\);$" 4 "original" } } */
diff --git a/gcc/testsuite/c-c++-common/goacc/readonly-2.c b/gcc/testsuite/c-c++-common/goacc/readonly-2.c
new file mode 100644
index 00000000000..3f52a9f6afb
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/readonly-2.c
@@ -0,0 +1,16 @@
+/* { dg-additional-options "-O -fdump-tree-phiprop -fdump-tree-fre" } */
+
+#pragma acc routine
+extern void foo (int *ptr, int val);
+
+int main (void)
+{
+ int r, a[32];
+ #pragma acc parallel copyin(readonly: a[:32]) copyout(r)
+ {
+ foo (a, a[8]);
+ r = a[8];
+ }
+}
+/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ = MEM\\\[\[^_\]+_\[0-9\]+\\(ptro\\)\\\]\\\[8\\\];" 2 "phiprop1" } } */
+/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ = MEM\\\[\[^_\]+_\[0-9\]+\\(ptro\\)\\\]\\\[8\\\];" 1 "fre1" } } */
diff --git a/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90 b/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90
index fc1e2719e67..cad449e6d40 100644
--- a/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90
+++ b/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90
@@ -80,16 +80,16 @@ end program main
! The front end turns OpenACC 'declare' into OpenACC 'data'.
! { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(readonly,to:\\*b\\) map\\(alloc:b.+ map\\(to:\\*c\\) map\\(alloc:c.+" 1 "original" } }
! { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(readonly,to:g\\) map\\(to:h\\)" 1 "original" } }
-! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
-! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
-! { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
-! { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(readonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
-! { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
-! { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(readonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
-! { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
-! { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(readonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
-! { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
-! { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(readonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:a.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:b.+ map\\(pt_readonly,alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(readonly,to:a.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:b.+ map\\(pt_readonly,alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(readonly,to:a.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:b.+ map\\(pt_readonly,alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(readonly,to:a.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:b.+ map\\(pt_readonly,alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(readonly,to:a.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:b.+ map\\(pt_readonly,alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
! { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(readonly:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\) \\(readonly:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\);" 8 "original" } }
! { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\);" 8 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/pr67170.f90 b/gcc/testsuite/gfortran.dg/pr67170.f90
index 80236470f42..d7c33a4c3db 100644
--- a/gcc/testsuite/gfortran.dg/pr67170.f90
+++ b/gcc/testsuite/gfortran.dg/pr67170.f90
@@ -28,4 +28,4 @@ end subroutine foo
end module test_module
end program
-! { dg-final { scan-tree-dump-times "= \\*arg_\[0-9\]+\\(D\\);" 1 "fre1" } }
+! { dg-final { scan-tree-dump-times "= \\*arg_\[0-9\]+\\(D\\)\\(ptro\\);" 1 "fre1" } }
diff --git a/gcc/tree-pretty-print.cc b/gcc/tree-pretty-print.cc
index 926f7e006a7..62411a97ab9 100644
--- a/gcc/tree-pretty-print.cc
+++ b/gcc/tree-pretty-print.cc
@@ -915,6 +915,8 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, dump_flags_t flags)
pp_string (pp, "map(");
if (OMP_CLAUSE_MAP_READONLY (clause))
pp_string (pp, "readonly,");
+ if (OMP_CLAUSE_MAP_POINTS_TO_READONLY (clause))
+ pp_string (pp, "pt_readonly,");
switch (OMP_CLAUSE_MAP_KIND (clause))
{
case GOMP_MAP_ALLOC:
@@ -3620,6 +3622,8 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags,
pp_string (pp, "(D)");
if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (node))
pp_string (pp, "(ab)");
+ if (SSA_NAME_POINTS_TO_READONLY_MEMORY (node))
+ pp_string (pp, "(ptro)");
break;
case WITH_SIZE_EXPR:
diff --git a/gcc/tree-ssanames.cc b/gcc/tree-ssanames.cc
index 1753a421a0b..cbdf4b11769 100644
--- a/gcc/tree-ssanames.cc
+++ b/gcc/tree-ssanames.cc
@@ -402,6 +402,9 @@ make_ssa_name_fn (struct function *fn, tree var, gimple *stmt,
else
SSA_NAME_RANGE_INFO (t) = NULL;
+ if (VAR_P (var) && DECL_POINTS_TO_READONLY (var))
+ SSA_NAME_POINTS_TO_READONLY_MEMORY (t) = 1;
+
SSA_NAME_IN_FREE_LIST (t) = 0;
SSA_NAME_IS_DEFAULT_DEF (t) = 0;
init_ssa_name_imm_use (t);
diff --git a/gcc/tree.h b/gcc/tree.h
index b67a37d6522..1c5b883bc82 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -1036,6 +1036,13 @@ extern void omp_clause_range_check_failed (const_tree, const char *, int,
#define DECL_HIDDEN_STRING_LENGTH(NODE) \
(TREE_CHECK (NODE, PARM_DECL)->decl_common.decl_nonshareable_flag)
+/* In a VAR_DECL, set for variables regarded as pointing to memory not written
+ to. SSA_NAME_POINTS_TO_READONLY_MEMORY gets set for SSA_NAMEs created from
+ such VAR_DECLs. Currently used by OpenACC 'readonly' modifier in copyin
+ clauses. */
+#define DECL_POINTS_TO_READONLY(NODE) \
+ (TREE_CHECK (NODE, VAR_DECL)->decl_common.decl_not_flexarray)
+
/* In a CALL_EXPR, means that the call is the jump from a thunk to the
thunked-to function. Be careful to avoid using this macro when one of the
next two applies instead. */
@@ -1845,6 +1852,10 @@ class auto_suppress_location_wrappers
#define OMP_CLAUSE_MAP_READONLY(NODE) \
TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP))
+/* Set if 'OMP_CLAUSE_DECL (NODE)' points to read-only memory. */
+#define OMP_CLAUSE_MAP_POINTS_TO_READONLY(NODE) \
+ TREE_CONSTANT (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP))
+
/* Same as above, for use in OpenACC cache directives. */
#define OMP_CLAUSE__CACHE__READONLY(NODE) \
TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE__CACHE_))
* Re: [PATCH, OpenACC 2.7] Connect readonly modifier to points-to analysis
2024-04-03 11:50 ` Chung-Lin Tang
@ 2024-04-11 14:29 ` Thomas Schwinge
2024-04-12 6:17 ` Richard Biener
2024-05-16 12:36 ` Richard Biener
1 sibling, 1 reply; 18+ messages in thread
From: Thomas Schwinge @ 2024-04-11 14:29 UTC (permalink / raw)
To: Chung-Lin Tang, Richard Biener; +Cc: gcc-patches
Hi Chung-Lin, Richard!
From me just a few mechanical pieces, see below. Richard, are you able
to again comment on Chung-Lin's general strategy, as I'm not at all
familiar with those parts of the code?
On 2024-04-03T19:50:55+0800, Chung-Lin Tang <cltang@pllab.cs.nthu.edu.tw> wrote:
> On 2023/10/30 8:46 PM, Richard Biener wrote:
>>>
>>> What Chung-Lin's first patch does is mark the OMP clause for 'x' (not the
>>> 'x' decl itself!) as 'readonly', via a new 'OMP_CLAUSE_MAP_READONLY'
>>> flag.
>>>
>>> The actual optimization then is done in this second patch. Chung-Lin
>>> found that he could use 'SSA_NAME_POINTS_TO_READONLY_MEMORY' for that.
>>> I don't have much experience with most of the following generic code, so
>>> would appreciate a helping hand, whether that conceptually makes sense as
>>> well as from the implementation point of view:
>
> First of all, I have removed all of the gimplify-stage scanning and setting of
> DECL_POINTS_TO_READONLY and SSA_NAME_POINTS_TO_READONLY_MEMORY (so no changes to
> gimplify.cc now)
>
> I remember this code was an artifact of earlier attempts to allow struct-member
> pointer mappings to also work (e.g. map(readonly:rec.ptr[:N])), but those failed anyway.
> I think the omp_data_* member accesses when building child function side
> receiver_refs are blocking points-to analysis from working (didn't try digging deeper)
>
> Also during gimplify, VAR_DECLs appeared to be reused (at least in some cases) for map
> clause decl reference building, so relying on the variables "happening to be" single-use
> when relaying DECL_POINTS_TO_READONLY into SSA_NAME_POINTS_TO_READONLY_MEMORY does appear
> to be a little risky.
>
> However, for firstprivate pointers processed during omp-low, it appears to be somewhat different.
> (see below description)
>
>> No, I don't think you can use that flag on non-default-defs, nor
>> preserve it on copying. So
>> it also doesn't nicely extend to DECLs as done by the patch. We
>> currently _only_ use it
>> for incoming parameters. When used on arbitrary code you can get to for example
>>
>> ptr1(points-to-readony-memory) = &p->x;
>> ... access via ptr1 ...
>> ptr2 = &p->x;
>> ... access via ptr2 ...
>>
>> where both are your OMP regions differently constrained (the constrain is on the
>> code in the region, _not_ on the actual protections of the pointed to
>> data, much like
>> for the fortran case). But now CSE comes along and happily replaces all ptr2
>> with ptr2 in the second region and ... oops!
>
> Richard, I assume what you meant was "happily replaces all ptr2 with ptr1 in the second region"?
>
> That doesn't happen, because during omp-lower/expand, OMP target regions (which is all that
> this currently applies to) are separated into individual child functions.
>
> (Currently, the only "effective" use of DECL_POINTS_TO_READONLY is during omp-lower, when
> for firstprivate pointers (i.e. 'a' here) we set this bit when constructing the first load
> of this pointer)
>
> #pragma acc parallel copyin(readonly: a[:32]) copyout(r)
> {
> foo (a, a[8]);
> r = a[8];
> }
> #pragma acc parallel copyin(readonly: a[:32]) copyout(r)
> {
> foo (a, a[12]);
> r = a[12];
> }
>
> After omp-expand (before SSA):
>
> __attribute__((oacc parallel, omp target entrypoint, noclone))
> void main._omp_fn.1 (const struct .omp_data_t.3 & restrict .omp_data_i)
> {
> ...
> <bb 5> :
> D.2962 = .omp_data_i->D.2947;
> a.8 = D.2962;
> r.1 = (*a.8)[12];
> foo (a.8, r.1);
> r.1 = (*a.8)[12];
> D.2965 = .omp_data_i->r;
> *D.2965 = r.1;
> return;
> }
>
> __attribute__((oacc parallel, omp target entrypoint, noclone))
> void main._omp_fn.0 (const struct .omp_data_t.2 & restrict .omp_data_i)
> {
> ...
> <bb 3> :
> D.2968 = .omp_data_i->D.2939;
> a.4 = D.2968;
> r.0 = (*a.4)[8];
> foo (a.4, r.0);
> r.0 = (*a.4)[8];
> D.2971 = .omp_data_i->r;
> *D.2971 = r.0;
> return;
> }
>
> So the creation of DECL_POINTS_TO_READONLY and its relaying to
> SSA_NAME_POINTS_TO_READONLY_MEMORY here is quite similar to a default-def
> for a PARM_DECL, at least conceptually.
>
> (If offloading were structured significantly differently, say if child functions
> were separated much earlier, before omp-lowering, then this readonly modifier might
> possibly be a direct application of 'r' in the "fn spec" attribute)
>
> Other changes since the first version of the patch include:
> 1) Update of the C/C++ FE changes to the new style in c-family/c-omp.cc.
> 2) Merging of two if cases in fortran/trans-openmp.cc, as Thomas suggested.
> 3) Update of the readonly-2.c testcase to scan before/after the "fre1" pass, to verify removal of a MEM load, also as Thomas suggested.
Thanks!
> I have re-tested this patch using mainline, with no regressions. Is this okay for mainline?
> 2024-04-03 Chung-Lin Tang <cltang@baylibre.com>
>
> gcc/c-family/ChangeLog:
>
> * c-omp.cc (c_omp_address_inspector::expand_array_base):
> Set OMP_CLAUSE_MAP_POINTS_TO_READONLY on pointer clause.
> (c_omp_address_inspector::expand_component_selector): Likewise.
>
> gcc/fortran/ChangeLog:
>
> * trans-openmp.cc (gfc_trans_omp_array_section):
> Set OMP_CLAUSE_MAP_POINTS_TO_READONLY on pointer clause.
>
> gcc/ChangeLog:
>
> * gimple-expr.cc (copy_var_decl): Copy DECL_POINTS_TO_READONLY
> for VAR_DECLs.
> * omp-low.cc (lower_omp_target): Set DECL_POINTS_TO_READONLY for
> variables of receiver refs.
> * tree-pretty-print.cc (dump_omp_clause):
> Print OMP_CLAUSE_MAP_POINTS_TO_READONLY.
> (dump_generic_node): Print SSA_NAME_POINTS_TO_READONLY_MEMORY.
> * tree-ssanames.cc (make_ssa_name_fn): Set
> SSA_NAME_POINTS_TO_READONLY_MEMORY if DECL_POINTS_TO_READONLY is set.
> * tree.h (DECL_POINTS_TO_READONLY): New macro.
> (OMP_CLAUSE_MAP_POINTS_TO_READONLY): New macro.
>
> gcc/testsuite/ChangeLog:
>
> * c-c++-common/goacc/readonly-1.c: Adjust testcase.
> * c-c++-common/goacc/readonly-2.c: New testcase.
> * gfortran.dg/goacc/readonly-1.f90: Adjust testcase.
> --- a/gcc/c-family/c-omp.cc
> +++ b/gcc/c-family/c-omp.cc
> @@ -3907,6 +3907,8 @@ c_omp_address_inspector::expand_array_base (tree c,
> }
> else if (c2)
> {
> + if (OMP_CLAUSE_MAP_READONLY (c))
> + OMP_CLAUSE_MAP_POINTS_TO_READONLY (c2) = 1;
> OMP_CLAUSE_CHAIN (c2) = OMP_CLAUSE_CHAIN (c);
> OMP_CLAUSE_CHAIN (c) = c2;
> if (implicit_p)
> @@ -4051,6 +4053,8 @@ c_omp_address_inspector::expand_component_selector (tree c,
> }
> else if (c2)
> {
> + if (OMP_CLAUSE_MAP_READONLY (c))
> + OMP_CLAUSE_MAP_POINTS_TO_READONLY (c2) = 1;
> OMP_CLAUSE_CHAIN (c2) = OMP_CLAUSE_CHAIN (c);
> OMP_CLAUSE_CHAIN (c) = c2;
> c = c2;
(So this replaces the 'gcc/c/c-typeck.cc:handle_omp_array_sections',
'gcc/cp/semantics.cc:handle_omp_array_sections' changes of the previous
patch revision?)
Are we sure that really only the 'else if (c2)' branches need to handle
this, and explicitly not the preceding 'if (c3)' branches, too? I
suggest we add a comment and/or handling, as necessary. If that makes
sense, maybe handle it for both 'c3' and 'c2' via a 'bool readonly_p = [...]',
similar to the existing 'bool implicit_p'?
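Just to illustrate the shape I have in mind, here is a stand-alone toy model
in plain C, not a patch against 'c-omp.cc'. 'struct toy_clause' and
'toy_chain_pointer_clauses' are hypothetical names made up for this sketch,
the chaining order in the 'c3' case is my assumption, and whether the 'c3'
case should get the flag at all is exactly the open question above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an OMP_CLAUSE_MAP node; not GCC's tree type.  */
struct toy_clause
{
  bool readonly;		/* Models OMP_CLAUSE_MAP_READONLY.  */
  bool points_to_readonly;	/* Models OMP_CLAUSE_MAP_POINTS_TO_READONLY.  */
  struct toy_clause *chain;	/* Models OMP_CLAUSE_CHAIN.  */
};

/* Chain the pointer clauses C2 (and C3, if present) after C, relaying the
   read-only property to them.  Returns the last clause that was chained
   in, or C if there was nothing to chain.  */
static struct toy_clause *
toy_chain_pointer_clauses (struct toy_clause *c, struct toy_clause *c2,
			   struct toy_clause *c3)
{
  /* Computed once up front, like the existing 'implicit_p'.  */
  const bool readonly_p = c->readonly;

  if (c2 && c3)
    {
      /* Assumption: both pointer clauses get the flag in this case.  */
      if (readonly_p)
	{
	  c2->points_to_readonly = true;
	  c3->points_to_readonly = true;
	}
      c3->chain = c->chain;
      c2->chain = c3;
      c->chain = c2;
      return c3;
    }
  else if (c2)
    {
      if (readonly_p)
	c2->points_to_readonly = true;
      c2->chain = c->chain;
      c->chain = c2;
      return c2;
    }
  return c;
}
```

If it turns out that the 'c3' case must not be flagged, the same
'readonly_p' variable would still give us one obvious place to put the
comment explaining why.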
> --- a/gcc/fortran/trans-openmp.cc
> +++ b/gcc/fortran/trans-openmp.cc
> @@ -2561,6 +2561,8 @@ gfc_trans_omp_array_section (stmtblock_t *block, gfc_exec_op op,
> ptr2 = fold_convert (ptrdiff_type_node, ptr2);
> OMP_CLAUSE_SIZE (node3) = fold_build2 (MINUS_EXPR, ptrdiff_type_node,
> ptr, ptr2);
> + if (n->u.map.readonly)
> + OMP_CLAUSE_MAP_POINTS_TO_READONLY (node3) = 1;
> }
>
> static tree
> --- a/gcc/gimple-expr.cc
> +++ b/gcc/gimple-expr.cc
> @@ -385,6 +385,8 @@ copy_var_decl (tree var, tree name, tree type)
> DECL_CONTEXT (copy) = DECL_CONTEXT (var);
> TREE_USED (copy) = 1;
> DECL_SEEN_IN_BIND_EXPR_P (copy) = 1;
> + if (VAR_P (var))
> + DECL_POINTS_TO_READONLY (copy) = DECL_POINTS_TO_READONLY (var);
> DECL_ATTRIBUTES (copy) = DECL_ATTRIBUTES (var);
> if (DECL_USER_ALIGN (var))
> {
> --- a/gcc/omp-low.cc
> +++ b/gcc/omp-low.cc
> @@ -14003,6 +14003,8 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
> if (ref_to_array)
> x = fold_convert_loc (clause_loc, TREE_TYPE (new_var), x);
> gimplify_expr (&x, &new_body, NULL, is_gimple_val, fb_rvalue);
> + if (OMP_CLAUSE_MAP_POINTS_TO_READONLY (c) && VAR_P (x))
> + DECL_POINTS_TO_READONLY (x) = 1;
> if ((is_ref && !ref_to_array)
> || ref_to_ptr)
> {
> --- a/gcc/testsuite/c-c++-common/goacc/readonly-1.c
> +++ b/gcc/testsuite/c-c++-common/goacc/readonly-1.c
> @@ -48,17 +48,17 @@ int main (void)
>
> /* { dg-final { scan-tree-dump-times "(?n)#pragma acc declare map\\(to:y\\) map\\(readonly,to:s\\) map\\(readonly,to:x\\)" 1 "original" } } */
>
> -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */
> -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */
> -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */
> -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */
> -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */
> +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,firstprivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 "original" { target { c } } } } */
> +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,firstprivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 "original" { target { c } } } } */
> +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,firstprivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 "original" { target { c } } } } */
> +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */
> +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */
>
> -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */
> -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */
> -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */
> -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */
> -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */
> +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,firstprivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 "original" { target { c++ } } } } */
> +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,firstprivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 "original" { target { c++ } } } } */
> +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,firstprivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 "original" { target { c++ } } } } */
> +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */
> +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */
>
> /* { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(readonly:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\);$" 4 "original" } } */
> /* { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\);$" 4 "original" } } */
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/goacc/readonly-2.c
> @@ -0,0 +1,16 @@
> +/* { dg-additional-options "-O -fdump-tree-phiprop -fdump-tree-fre" } */
> +
> +#pragma acc routine
> +extern void foo (int *ptr, int val);
> +
> +int main (void)
> +{
> + int r, a[32];
> + #pragma acc parallel copyin(readonly: a[:32]) copyout(r)
> + {
> + foo (a, a[8]);
> + r = a[8];
> + }
> +}
> +/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ = MEM\\\[\[^_\]+_\[0-9\]+\\(ptro\\)\\\]\\\[8\\\];" 2 "phiprop1" } } */
> +/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ = MEM\\\[\[^_\]+_\[0-9\]+\\(ptro\\)\\\]\\\[8\\\];" 1 "fre1" } } */
In the tree where I've been testing your patch, I've not been seeing
'MEM[x]' but '(*x)', therefore:
-/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ = MEM\\\[\[^_\]+_\[0-9\]+\\(ptro\\)\\\]\\\[8\\\];" 2 "phiprop1" } } */
-/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ = MEM\\\[\[^_\]+_\[0-9\]+\\(ptro\\)\\\]\\\[8\\\];" 1 "fre1" } } */
+/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ = \\(\\\*_\[0-9\]+\\(ptro\\)\\)\\\[8\\\];" 2 "phiprop1" } } */
+/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ = \\(\\\*_\[0-9\]+\\(ptro\\)\\)\\\[8\\\];" 1 "fre1" } } */
Maybe that's due to something else in my (long...) Git branch; just make
sure you've got PASSes here, eventually.
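
A minimal sketch for checking the two expression shapes outside of DejaGnu, assuming the directive regexes reduce to the POSIX extended patterns below once the Tcl quoting layer is removed. The helper name 'line_matches' and the sample dump lines in the note are hypothetical, not taken from an actual dump:

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if LINE matches the POSIX extended regex PATTERN.
   Hypothetical helper for illustration; not part of the GCC testsuite.  */
static bool
line_matches (const char *pattern, const char *line)
{
  regex_t re;
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return false;
  bool ok = regexec (&re, line, 0, NULL, 0) == 0;
  regfree (&re);
  return ok;
}

/* Assumed equivalents of the two scan-tree-dump-times patterns after Tcl
   has processed the backslashes in the double-quoted directive string.  */
static const char mem_form[]
  = "r.[_0-9]+ = MEM\\[[^_]+_[0-9]+\\(ptro\\)\\]\\[8\\];";
static const char deref_form[]
  = "r.[_0-9]+ = \\(\\*_[0-9]+\\(ptro\\)\\)\\[8\\];";
```

With these, a made-up line such as 'r.0_7 = (*_5(ptro))[8];' is accepted by 'deref_form' only, and 'r.0_7 = MEM[(int[32] *)_5(ptro)][8];' by 'mem_form' only, which is why the expected form has to agree with what the tree under test actually prints.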
> --- a/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90
> +++ b/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90
> @@ -80,16 +80,16 @@ end program main
> ! The front end turns OpenACC 'declare' into OpenACC 'data'.
> ! { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(readonly,to:\\*b\\) map\\(alloc:b.+ map\\(to:\\*c\\) map\\(alloc:c.+" 1 "original" } }
> ! { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(readonly,to:g\\) map\\(to:h\\)" 1 "original" } }
> -! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
> -! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
> -! { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
> -! { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(readonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
> -! { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
> -! { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(readonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
> -! { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
> -! { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(readonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
> -! { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
> -! { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(readonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:a.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:b.+ map\\(pt_readonly,alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(readonly,to:a.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:b.+ map\\(pt_readonly,alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(readonly,to:a.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:b.+ map\\(pt_readonly,alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(readonly,to:a.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:b.+ map\\(pt_readonly,alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(readonly,to:a.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:b.+ map\\(pt_readonly,alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
>
> ! { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(readonly:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\) \\(readonly:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\);" 8 "original" } }
> ! { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\);" 8 "original" } }
Can we also get an OpenACC/Fortran test case à la
'c-c++-common/goacc/readonly-2.c' to demonstrate this doing something?
> --- a/gcc/testsuite/gfortran.dg/pr67170.f90
> +++ b/gcc/testsuite/gfortran.dg/pr67170.f90
> @@ -28,4 +28,4 @@ end subroutine foo
> end module test_module
> end program
>
> -! { dg-final { scan-tree-dump-times "= \\*arg_\[0-9\]+\\(D\\);" 1 "fre1" } }
> +! { dg-final { scan-tree-dump-times "= \\*arg_\[0-9\]+\\(D\\)\\(ptro\\);" 1 "fre1" } }
Is it understood what's happening here, that this is the correct
behavior? I suppose so -- there's no actual change in behavior -- as
this here doesn't trigger for OpenACC 'readonly' modifier, but just the
pretty printer change for 'SSA_NAME_POINTS_TO_READONLY_MEMORY':
> --- a/gcc/tree-pretty-print.cc
> +++ b/gcc/tree-pretty-print.cc
> @@ -915,6 +915,8 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, dump_flags_t flags)
> pp_string (pp, "map(");
> if (OMP_CLAUSE_MAP_READONLY (clause))
> pp_string (pp, "readonly,");
> + if (OMP_CLAUSE_MAP_POINTS_TO_READONLY (clause))
> + pp_string (pp, "pt_readonly,");
> switch (OMP_CLAUSE_MAP_KIND (clause))
> {
> case GOMP_MAP_ALLOC:
> @@ -3620,6 +3622,8 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags,
> pp_string (pp, "(D)");
> if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (node))
> pp_string (pp, "(ab)");
> + if (SSA_NAME_POINTS_TO_READONLY_MEMORY (node))
> + pp_string (pp, "(ptro)");
> break;
>
> case WITH_SIZE_EXPR:
> --- a/gcc/tree-ssanames.cc
> +++ b/gcc/tree-ssanames.cc
> @@ -402,6 +402,9 @@ make_ssa_name_fn (struct function *fn, tree var, gimple *stmt,
> else
> SSA_NAME_RANGE_INFO (t) = NULL;
>
> + if (VAR_P (var) && DECL_POINTS_TO_READONLY (var))
> + SSA_NAME_POINTS_TO_READONLY_MEMORY (t) = 1;
> +
> SSA_NAME_IN_FREE_LIST (t) = 0;
> SSA_NAME_IS_DEFAULT_DEF (t) = 0;
> init_ssa_name_imm_use (t);
> --- a/gcc/tree.h
> +++ b/gcc/tree.h
> @@ -1036,6 +1036,13 @@ extern void omp_clause_range_check_failed (const_tree, const char *, int,
> #define DECL_HIDDEN_STRING_LENGTH(NODE) \
> (TREE_CHECK (NODE, PARM_DECL)->decl_common.decl_nonshareable_flag)
>
> +/* In a VAR_DECL, set for variables regarded as pointing to memory not written
> + to. SSA_NAME_POINTS_TO_READONLY_MEMORY gets set for SSA_NAMEs created from
> + such VAR_DECLs. Currently used by OpenACC 'readonly' modifier in copyin
> + clauses. */
> +#define DECL_POINTS_TO_READONLY(NODE) \
> + (TREE_CHECK (NODE, VAR_DECL)->decl_common.decl_not_flexarray)
Again, update the table where the flag uses are listed?
(There is a 'VAR_DECL_CHECK', which hopefully means the same thing.)
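
To make the intended relay explicit, here is a toy model of the decl-to-SSA-name propagation described in the comment above. All 'toy_*' names are hypothetical stand-ins, not GCC's real tree data structures; only the condition mirrors the 'make_ssa_name_fn' hunk:

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for tree codes; hypothetical, for illustration only.  */
enum toy_code { TOY_VAR_DECL, TOY_PARM_DECL };

/* Models a decl carrying the proposed DECL_POINTS_TO_READONLY bit.  */
struct toy_decl
{
  enum toy_code code;
  bool points_to_readonly;
};

/* Models an SSA name carrying SSA_NAME_POINTS_TO_READONLY_MEMORY.  */
struct toy_ssa_name
{
  const struct toy_decl *var;
  bool points_to_readonly_memory;
};

/* Mirrors the patch's check in make_ssa_name_fn: the SSA-name flag is set
   only when the underlying decl is a VAR_DECL with the decl flag set.  */
static struct toy_ssa_name
toy_make_ssa_name (const struct toy_decl *var)
{
  struct toy_ssa_name t = { var, false };
  if (var != NULL && var->code == TOY_VAR_DECL && var->points_to_readonly)
    t.points_to_readonly_memory = true;
  return t;
}
```

In this model every SSA name created from a flagged VAR_DECL inherits the bit, while other decl kinds never do, which is the behavior the quoted comment states.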
> +
> /* In a CALL_EXPR, means that the call is the jump from a thunk to the
> thunked-to function. Be careful to avoid using this macro when one of the
> next two applies instead. */
> @@ -1845,6 +1852,10 @@ class auto_suppress_location_wrappers
> #define OMP_CLAUSE_MAP_READONLY(NODE) \
> TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP))
>
> +/* Set if 'OMP_CLAUSE_DECL (NODE)' points to read-only memory. */
> +#define OMP_CLAUSE_MAP_POINTS_TO_READONLY(NODE) \
> + TREE_CONSTANT (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP))
> +
> /* Same as above, for use in OpenACC cache directives. */
> #define OMP_CLAUSE__CACHE__READONLY(NODE) \
> TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE__CACHE_))
(Note: a counterpart to 'OMP_CLAUSE_MAP_POINTS_TO_READONLY' for the
'cache' directive doesn't exist yet, because the OpenACC 'cache' directive
is not actually handled yet; see 'gcc/gimplify.cc:gimplify_oacc_cache'.)
Regards
Thomas
* Re: [PATCH, OpenACC 2.7] Connect readonly modifier to points-to analysis
2024-04-11 14:29 ` Thomas Schwinge
@ 2024-04-12 6:17 ` Richard Biener
0 siblings, 0 replies; 18+ messages in thread
From: Richard Biener @ 2024-04-12 6:17 UTC (permalink / raw)
To: Thomas Schwinge; +Cc: Chung-Lin Tang, gcc-patches
[-- Attachment #1: Type: text/plain, Size: 26477 bytes --]
On Thu, 11 Apr 2024, Thomas Schwinge wrote:
> Hi Chung-Lin, Richard!
>
> From me just a few mechanical pieces, see below. Richard, are you able
> to again comment on Chung-Lin's general strategy, as I'm not at all
> familiar with those parts of the code?
I've queued all stage1 material and will only be able to look at it,
slowly, after the release.
> On 2024-04-03T19:50:55+0800, Chung-Lin Tang <cltang@pllab.cs.nthu.edu.tw> wrote:
> > On 2023/10/30 8:46 PM, Richard Biener wrote:
> >>>
> >>> What Chung-Lin's first patch does is mark the OMP clause for 'x' (not the
> >>> 'x' decl itself!) as 'readonly', via a new 'OMP_CLAUSE_MAP_READONLY'
> >>> flag.
> >>>
> >>> The actual optimization then is done in this second patch. Chung-Lin
> >>> found that he could use 'SSA_NAME_POINTS_TO_READONLY_MEMORY' for that.
> >>> I don't have much experience with most of the following generic code, so
> >>> would appreciate a helping hand, whether that conceptually makes sense as
> >>> well as from the implementation point of view:
> >
> > First of all, I have removed all of the gimplify-stage scanning and setting of
> > DECL_POINTS_TO_READONLY and SSA_NAME_POINTS_TO_READONLY_MEMORY (so no changes to
> > gimplify.cc now)
> >
> > I remember this code was an artifact of earlier attempts to allow struct-member
> > pointer mappings to also work (e.g. map(readonly:rec.ptr[:N])), which failed anyway.
> > I think the omp_data_* member accesses generated when building the child-function-side
> > receiver_refs are blocking points-to analysis from working (I didn't try digging deeper).
> >
> > Also during gimplify, VAR_DECLs appeared to be reused (at least in some cases) for map
> > clause decl reference building, so hoping that the variables "happen to be" single-use and
> > DECL_POINTS_TO_READONLY relaying into SSA_NAME_POINTS_TO_READONLY_MEMORY does appear to be
> > a little risky.
> >
> > However, for firstprivate pointers processed during omp-low, it appears to be somewhat different.
> > (see below description)
> >
> >> No, I don't think you can use that flag on non-default-defs, nor
> >> preserve it on copying. So
> >> it also doesn't nicely extend to DECLs as done by the patch. We
> >> currently _only_ use it
> >> for incoming parameters. When used on arbitrary code you can get to for example
> >>
> >> ptr1(points-to-readonly-memory) = &p->x;
> >> ... access via ptr1 ...
> >> ptr2 = &p->x;
> >> ... access via ptr2 ...
> >>
> >> where both are your OMP regions, differently constrained (the constraint is
> >> on the code in the region, _not_ on the actual protections of the pointed-to
> >> data, much like for the Fortran case). But now CSE comes along and happily
> >> replaces all ptr2 with ptr2 in the second region and ... oops!
> >
> > Richard, I assume what you meant was "happily replaces all ptr2 with ptr1 in the second region"?
> >
> > That doesn't happen, because during omp-lower/expand, OMP target regions (which is all that
> > this currently applies to) are separated into individual child functions.
> >
> > (Currently, the only "effective" use of DECL_POINTS_TO_READONLY is during omp-lower, when
> > for firstprivate pointers (i.e. 'a' here) we set this bit when constructing the first load
> > of this pointer)
> >
> > #pragma acc parallel copyin(readonly: a[:32]) copyout(r)
> > {
> > foo (a, a[8]);
> > r = a[8];
> > }
> > #pragma acc parallel copyin(readonly: a[:32]) copyout(r)
> > {
> > foo (a, a[12]);
> > r = a[12];
> > }
> >
> > After omp-expand (before SSA):
> >
> > __attribute__((oacc parallel, omp target entrypoint, noclone))
> > void main._omp_fn.1 (const struct .omp_data_t.3 & restrict .omp_data_i)
> > {
> > ...
> > <bb 5> :
> > D.2962 = .omp_data_i->D.2947;
> > a.8 = D.2962;
> > r.1 = (*a.8)[12];
> > foo (a.8, r.1);
> > r.1 = (*a.8)[12];
> > D.2965 = .omp_data_i->r;
> > *D.2965 = r.1;
> > return;
> > }
> >
> > __attribute__((oacc parallel, omp target entrypoint, noclone))
> > void main._omp_fn.0 (const struct .omp_data_t.2 & restrict .omp_data_i)
> > {
> > ...
> > <bb 3> :
> > D.2968 = .omp_data_i->D.2939;
> > a.4 = D.2968;
> > r.0 = (*a.4)[8];
> > foo (a.4, r.0);
> > r.0 = (*a.4)[8];
> > D.2971 = .omp_data_i->r;
> > *D.2971 = r.0;
> > return;
> > }
> >
> > So the creation of DECL_POINTS_TO_READONLY and its relaying to
> > SSA_NAME_POINTS_TO_READONLY_MEMORY here is actually quite similar to a default-def
> > for a PARM_DECL, at least conceptually.
> >
> > (If offloading were structured significantly differently, say if child functions
> > were separated much earlier, before omp-lowering, then this readonly modifier might
> > possibly be a direct application of 'r' in the "fn spec" attribute.)
> >
> > Other changes since first version of patch include:
> > 1) update of C/C++ FE changes to new style in c-family/c-omp.cc
> > 2) merging of two if cases in fortran/trans-openmp.cc like Thomas suggested
> > 3) Update of readonly-2.c testcase to scan before/after "fre1" pass, to verify removal of a MEM load, also as Thomas suggested.
>
> Thanks!
>
> > I have re-tested this patch using mainline, with no regressions. Is this okay for mainline?
>
> > 2024-04-03 Chung-Lin Tang <cltang@baylibre.com>
> >
> > gcc/c-family/ChangeLog:
> >
> > * c-omp.cc (c_omp_address_inspector::expand_array_base):
> > Set OMP_CLAUSE_MAP_POINTS_TO_READONLY on pointer clause.
> > (c_omp_address_inspector::expand_component_selector): Likewise.
> >
> > gcc/fortran/ChangeLog:
> >
> > * trans-openmp.cc (gfc_trans_omp_array_section):
> > Set OMP_CLAUSE_MAP_POINTS_TO_READONLY on pointer clause.
> >
> > gcc/ChangeLog:
> >
> > * gimple-expr.cc (copy_var_decl): Copy DECL_POINTS_TO_READONLY
> > for VAR_DECLs.
> > * omp-low.cc (lower_omp_target): Set DECL_POINTS_TO_READONLY for
> > variables of receiver refs.
> > * tree-pretty-print.cc (dump_omp_clause):
> > Print OMP_CLAUSE_MAP_POINTS_TO_READONLY.
> > (dump_generic_node): Print SSA_NAME_POINTS_TO_READONLY_MEMORY.
> > * tree-ssanames.cc (make_ssa_name_fn): Set
> > SSA_NAME_POINTS_TO_READONLY_MEMORY if DECL_POINTS_TO_READONLY is set.
> > * tree.h (DECL_POINTS_TO_READONLY): New macro.
> > (OMP_CLAUSE_MAP_POINTS_TO_READONLY): New macro.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * c-c++-common/goacc/readonly-1.c: Adjust testcase.
> > * c-c++-common/goacc/readonly-2.c: New testcase.
> > * gfortran.dg/goacc/readonly-1.f90: Adjust testcase.
>
> > --- a/gcc/c-family/c-omp.cc
> > +++ b/gcc/c-family/c-omp.cc
> > @@ -3907,6 +3907,8 @@ c_omp_address_inspector::expand_array_base (tree c,
> > }
> > else if (c2)
> > {
> > + if (OMP_CLAUSE_MAP_READONLY (c))
> > + OMP_CLAUSE_MAP_POINTS_TO_READONLY (c2) = 1;
> > OMP_CLAUSE_CHAIN (c2) = OMP_CLAUSE_CHAIN (c);
> > OMP_CLAUSE_CHAIN (c) = c2;
> > if (implicit_p)
> > @@ -4051,6 +4053,8 @@ c_omp_address_inspector::expand_component_selector (tree c,
> > }
> > else if (c2)
> > {
> > + if (OMP_CLAUSE_MAP_READONLY (c))
> > + OMP_CLAUSE_MAP_POINTS_TO_READONLY (c2) = 1;
> > OMP_CLAUSE_CHAIN (c2) = OMP_CLAUSE_CHAIN (c);
> > OMP_CLAUSE_CHAIN (c) = c2;
> > c = c2;
>
> (So this replaces the 'gcc/c/c-typeck.cc:handle_omp_array_sections',
> 'gcc/cp/semantics.cc:handle_omp_array_sections' changes of the previous
> patch revision?)
>
> Are we sure that really only the 'else if (c2)' branches need to handle
> this, and explicitly not the preceding 'if (c3)' branches, too? I
> suggest we add a comment and/or handling, as necessary. If that makes
> sense, maybe handle for both 'c3', 'c2' via a 'bool readonly_p = [...]',
> similar to the existing 'bool implicit_p'?
>
> > --- a/gcc/fortran/trans-openmp.cc
> > +++ b/gcc/fortran/trans-openmp.cc
> > @@ -2561,6 +2561,8 @@ gfc_trans_omp_array_section (stmtblock_t *block, gfc_exec_op op,
> > ptr2 = fold_convert (ptrdiff_type_node, ptr2);
> > OMP_CLAUSE_SIZE (node3) = fold_build2 (MINUS_EXPR, ptrdiff_type_node,
> > ptr, ptr2);
> > + if (n->u.map.readonly)
> > + OMP_CLAUSE_MAP_POINTS_TO_READONLY (node3) = 1;
> > }
> >
> > static tree
>
> > --- a/gcc/gimple-expr.cc
> > +++ b/gcc/gimple-expr.cc
> > @@ -385,6 +385,8 @@ copy_var_decl (tree var, tree name, tree type)
> > DECL_CONTEXT (copy) = DECL_CONTEXT (var);
> > TREE_USED (copy) = 1;
> > DECL_SEEN_IN_BIND_EXPR_P (copy) = 1;
> > + if (VAR_P (var))
> > + DECL_POINTS_TO_READONLY (copy) = DECL_POINTS_TO_READONLY (var);
> > DECL_ATTRIBUTES (copy) = DECL_ATTRIBUTES (var);
> > if (DECL_USER_ALIGN (var))
> > {
>
> > --- a/gcc/omp-low.cc
> > +++ b/gcc/omp-low.cc
> > @@ -14003,6 +14003,8 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
> > if (ref_to_array)
> > x = fold_convert_loc (clause_loc, TREE_TYPE (new_var), x);
> > gimplify_expr (&x, &new_body, NULL, is_gimple_val, fb_rvalue);
> > + if (OMP_CLAUSE_MAP_POINTS_TO_READONLY (c) && VAR_P (x))
> > + DECL_POINTS_TO_READONLY (x) = 1;
> > if ((is_ref && !ref_to_array)
> > || ref_to_ptr)
> > {
>
> > --- a/gcc/testsuite/c-c++-common/goacc/readonly-1.c
> > +++ b/gcc/testsuite/c-c++-common/goacc/readonly-1.c
> > @@ -48,17 +48,17 @@ int main (void)
> >
> > /* { dg-final { scan-tree-dump-times "(?n)#pragma acc declare map\\(to:y\\) map\\(readonly,to:s\\) map\\(readonly,to:x\\)" 1 "original" } } */
> >
> > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */
> > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */
> > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */
> > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */
> > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */
> > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,firstprivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 "original" { target { c } } } } */
> > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,firstprivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 "original" { target { c } } } } */
> > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,firstprivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 "original" { target { c } } } } */
> > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */
> > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*s.ptr \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */
> >
> > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */
> > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */
> > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */
> > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */
> > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */
> > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,firstprivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 "original" { target { c++ } } } } */
> > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,firstprivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 "original" { target { c++ } } } } */
> > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,firstprivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 "original" { target { c++ } } } } */
> > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */
> > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { target { c++ } } } } */
> >
> > /* { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(readonly:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\);$" 4 "original" } } */
> > /* { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\);$" 4 "original" } } */
>
> > --- /dev/null
> > +++ b/gcc/testsuite/c-c++-common/goacc/readonly-2.c
> > @@ -0,0 +1,16 @@
> > +/* { dg-additional-options "-O -fdump-tree-phiprop -fdump-tree-fre" } */
> > +
> > +#pragma acc routine
> > +extern void foo (int *ptr, int val);
> > +
> > +int main (void)
> > +{
> > + int r, a[32];
> > + #pragma acc parallel copyin(readonly: a[:32]) copyout(r)
> > + {
> > + foo (a, a[8]);
> > + r = a[8];
> > + }
> > +}
> > +/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ = MEM\\\[\[^_\]+_\[0-9\]+\\(ptro\\)\\\]\\\[8\\\];" 2 "phiprop1" } } */
> > +/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ = MEM\\\[\[^_\]+_\[0-9\]+\\(ptro\\)\\\]\\\[8\\\];" 1 "fre1" } } */
>
> In the tree where I've been testing your patch, I've not been seeing
> 'MEM[x]' but '(*x)', therefore:
>
> -/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ = MEM\\\[\[^_\]+_\[0-9\]+\\(ptro\\)\\\]\\\[8\\\];" 2 "phiprop1" } } */
> -/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ = MEM\\\[\[^_\]+_\[0-9\]+\\(ptro\\)\\\]\\\[8\\\];" 1 "fre1" } } */
> +/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ = \\(\\\*_\[0-9\]+\\(ptro\\)\\)\\\[8\\\];" 2 "phiprop1" } } */
> +/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ = \\(\\\*_\[0-9\]+\\(ptro\\)\\)\\\[8\\\];" 1 "fre1" } } */
>
> Maybe that's due to something else in my (long...) Git branch; just make
> sure you've got PASSes here, eventually.
>
> > --- a/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90
> > +++ b/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90
> > @@ -80,16 +80,16 @@ end program main
> > ! The front end turns OpenACC 'declare' into OpenACC 'data'.
> > ! { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(readonly,to:\\*b\\) map\\(alloc:b.+ map\\(to:\\*c\\) map\\(alloc:c.+" 1 "original" } }
> > ! { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(readonly,to:g\\) map\\(to:h\\)" 1 "original" } }
> > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
> > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
> > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
> > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(readonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
> > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
> > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(readonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
> > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
> > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(readonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
> > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
> > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(readonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
> > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
> > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel map\\(readonly,to:a.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:b.+ map\\(pt_readonly,alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
> > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
> > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels map\\(readonly,to:a.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:b.+ map\\(pt_readonly,alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
> > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
> > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc serial map\\(readonly,to:a.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:b.+ map\\(pt_readonly,alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
> > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
> > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc data map\\(readonly,to:a.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:b.+ map\\(pt_readonly,alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
> > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
> > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data map\\(readonly,to:a.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:b.+ map\\(pt_readonly,alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
> >
> > ! { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(readonly:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\) \\(readonly:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\);" 8 "original" } }
> > ! { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: .+\\\]\\);" 8 "original" } }
>
> Can we also get an OpenACC/Fortran test case à la
> 'c-c++-common/goacc/readonly-2.c' to demonstrate this doing something?
>
> > --- a/gcc/testsuite/gfortran.dg/pr67170.f90
> > +++ b/gcc/testsuite/gfortran.dg/pr67170.f90
> > @@ -28,4 +28,4 @@ end subroutine foo
> > end module test_module
> > end program
> >
> > -! { dg-final { scan-tree-dump-times "= \\*arg_\[0-9\]+\\(D\\);" 1 "fre1" } }
> > +! { dg-final { scan-tree-dump-times "= \\*arg_\[0-9\]+\\(D\\)\\(ptro\\);" 1 "fre1" } }
>
> Is it understood what's happening here, that this is the correct
> behavior? I suppose so -- there's no actual change in behavior -- as
> this here doesn't trigger for OpenACC 'readonly' modifier, but just the
> pretty printer change for 'SSA_NAME_POINTS_TO_READONLY_MEMORY':
>
> > --- a/gcc/tree-pretty-print.cc
> > +++ b/gcc/tree-pretty-print.cc
> > @@ -915,6 +915,8 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, dump_flags_t flags)
> > pp_string (pp, "map(");
> > if (OMP_CLAUSE_MAP_READONLY (clause))
> > pp_string (pp, "readonly,");
> > + if (OMP_CLAUSE_MAP_POINTS_TO_READONLY (clause))
> > + pp_string (pp, "pt_readonly,");
> > switch (OMP_CLAUSE_MAP_KIND (clause))
> > {
> > case GOMP_MAP_ALLOC:
> > @@ -3620,6 +3622,8 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags,
> > pp_string (pp, "(D)");
> > if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (node))
> > pp_string (pp, "(ab)");
> > + if (SSA_NAME_POINTS_TO_READONLY_MEMORY (node))
> > + pp_string (pp, "(ptro)");
> > break;
> >
> > case WITH_SIZE_EXPR:
>
> > --- a/gcc/tree-ssanames.cc
> > +++ b/gcc/tree-ssanames.cc
> > @@ -402,6 +402,9 @@ make_ssa_name_fn (struct function *fn, tree var, gimple *stmt,
> > else
> > SSA_NAME_RANGE_INFO (t) = NULL;
> >
> > + if (VAR_P (var) && DECL_POINTS_TO_READONLY (var))
> > + SSA_NAME_POINTS_TO_READONLY_MEMORY (t) = 1;
> > +
> > SSA_NAME_IN_FREE_LIST (t) = 0;
> > SSA_NAME_IS_DEFAULT_DEF (t) = 0;
> > init_ssa_name_imm_use (t);
>
> > --- a/gcc/tree.h
> > +++ b/gcc/tree.h
> > @@ -1036,6 +1036,13 @@ extern void omp_clause_range_check_failed (const_tree, const char *, int,
> > #define DECL_HIDDEN_STRING_LENGTH(NODE) \
> > (TREE_CHECK (NODE, PARM_DECL)->decl_common.decl_nonshareable_flag)
> >
> > +/* In a VAR_DECL, set for variables regarded as pointing to memory not written
> > + to. SSA_NAME_POINTS_TO_READONLY_MEMORY gets set for SSA_NAMEs created from
> > + such VAR_DECLs. Currently used by OpenACC 'readonly' modifier in copyin
> > + clauses. */
> > +#define DECL_POINTS_TO_READONLY(NODE) \
> > + (TREE_CHECK (NODE, VAR_DECL)->decl_common.decl_not_flexarray)
>
> Again, update the table where the flag uses are listed?
>
> (There is a 'VAR_DECL_CHECK', which hopefully means the same thing.)
>
> > +
> > /* In a CALL_EXPR, means that the call is the jump from a thunk to the
> > thunked-to function. Be careful to avoid using this macro when one of the
> > next two applies instead. */
> > @@ -1845,6 +1852,10 @@ class auto_suppress_location_wrappers
> > #define OMP_CLAUSE_MAP_READONLY(NODE) \
> > TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP))
> >
> > +/* Set if 'OMP_CLAUSE_DECL (NODE)' points to read-only memory. */
> > +#define OMP_CLAUSE_MAP_POINTS_TO_READONLY(NODE) \
> > + TREE_CONSTANT (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP))
> > +
> > /* Same as above, for use in OpenACC cache directives. */
> > #define OMP_CLAUSE__CACHE__READONLY(NODE) \
> > TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE__CACHE_))
>
> (Note, corresponding 'OMP_CLAUSE_MAP_POINTS_TO_READONLY' doesn't exist
> yet, due to missing actual handling of the OpenACC 'cache' directive;
> 'gcc/gimplify.cc:gimplify_oacc_cache'.)
>
>
> Grüße
> Thomas
>
--
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
* Re: [PATCH, OpenACC 2.7] Connect readonly modifier to points-to analysis
2024-04-03 11:50 ` Chung-Lin Tang
2024-04-11 14:29 ` Thomas Schwinge
@ 2024-05-16 12:36 ` Richard Biener
1 sibling, 0 replies; 18+ messages in thread
From: Richard Biener @ 2024-05-16 12:36 UTC (permalink / raw)
To: Chung-Lin Tang; +Cc: gcc-patches, Thomas Schwinge, Chung-Lin Tang
On Wed, 3 Apr 2024, Chung-Lin Tang wrote:
> Hi Richard, Thomas,
>
> On 2023/10/30 8:46 PM, Richard Biener wrote:
> >>
> >> What Chung-Lin's first patch does is mark the OMP clause for 'x' (not the
> >> 'x' decl itself!) as 'readonly', via a new 'OMP_CLAUSE_MAP_READONLY'
> >> flag.
> >>
> >> The actual optimization then is done in this second patch. Chung-Lin
> >> found that he could use 'SSA_NAME_POINTS_TO_READONLY_MEMORY' for that.
> >> I don't have much experience with most of the following generic code, so
> >> would appreciate a helping hand, whether that conceptually makes sense as
> >> well as from the implementation point of view:
>
> First of all, I have removed all of the gimplify-stage scanning and setting of
> DECL_POINTS_TO_READONLY and SSA_NAME_POINTS_TO_READONLY_MEMORY (so no changes to
> gimplify.cc now)
>
> I remember this code was an artifact of earlier attempts to allow struct-member
> pointer mappings to also work (e.g. map(readonly:rec.ptr[:N])), but failed anyways.
> I think the omp_data_* member accesses when building child function side
> receiver_refs are blocking points-to analysis from working (I didn't try digging deeper).
>
> Also during gimplify, VAR_DECLs appeared to be reused (at least in some cases) for map
> clause decl reference building, so hoping that the variables "happen to be" single-use and
> DECL_POINTS_TO_READONLY relaying into SSA_NAME_POINTS_TO_READONLY_MEMORY does appear to be
> a little risky.
>
> However, for firstprivate pointers processed during omp-low, it appears to be somewhat different.
> (see below description)
>
> > No, I don't think you can use that flag on non-default-defs, nor
> > preserve it on copying. So
> > it also doesn't nicely extend to DECLs as done by the patch. We
> > currently _only_ use it
> > for incoming parameters. When used on arbitrary code you can get to for example
> >
> > ptr1(points-to-readonly-memory) = &p->x;
> > ... access via ptr1 ...
> > ptr2 = &p->x;
> > ... access via ptr2 ...
> >
> > where both are your OMP regions differently constrained (the constraint is on the
> > code in the region, _not_ on the actual protections of the pointed to
> > data, much like
> > for the fortran case). But now CSE comes along and happily replaces all ptr2
> > with ptr2 in the second region and ... oops!
>
> Richard, I assume what you meant was "happily replaces all ptr2 with ptr1 in the second region"?
>
> That doesn't happen, because during omp-lower/expand, OMP target regions (which are all that
> this currently applies to) are separated into individual child functions.
>
> (Currently, the only "effective" use of DECL_POINTS_TO_READONLY is during omp-lower, when
> for firstprivate pointers (i.e. 'a' here) we set this bit when constructing the first load
> of this pointer)
>
> #pragma acc parallel copyin(readonly: a[:32]) copyout(r)
> {
> foo (a, a[8]);
> r = a[8];
> }
> #pragma acc parallel copyin(readonly: a[:32]) copyout(r)
> {
> foo (a, a[12]);
> r = a[12];
> }
>
> After omp-expand (before SSA):
>
> __attribute__((oacc parallel, omp target entrypoint, noclone))
> void main._omp_fn.1 (const struct .omp_data_t.3 & restrict .omp_data_i)
> {
> ...
> <bb 5> :
> D.2962 = .omp_data_i->D.2947;
> a.8 = D.2962;
So 'readonly: a[:32]' is put in .omp_data_i->D.2947 in the caller
and extracted here. And you arrange for 'a.8' to have
DECL_POINTS_TO_READONLY set by "magic"? Looking at this I wonder
if it would be more useful to "const qualify" (but "really", not
in the C sense) .omp_data_i->D.2947 instead? Thus have a
FIELD_POINTS_TO_READONLY_MEMORY flag on the FIELD_DECL.
Points-to analysis should then be able to handle this similar to how
it handles loads of restrict qualified pointers. Well, of course not
as simple since it now adds "qualifiers" to storage since I presume
the same object can be both readonly and not readonly like via
#pragma acc parallel copyin(readonly: a[:32], a[33:64]) copyout(r)
? That is, currently there's only one "readonly" object kind in
points-to, that's STRING_CSTs which get all globbed to string_id
and "ignored" for alias purposes since you can't change them.
So possibly you want to combine this with restrict qualifying the
pointer so we know there's no other (read-write) access to the memory
possible. But then you might get all the good stuff already by
_just_ doing that restrict qualification and ignoring the readonly-ness?
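To make the restrict-only idea concrete in plain C, here is a minimal sketch.
The names 'opaque_store' and 'sum_around_call' are made up for illustration
and are not anything in GCC or the patch. Because 'a' is restrict qualified
and is not passed to the callee, the callee cannot legally modify the object
'a' points to, so a compiler is permitted (not required) to reuse the first
load for the second one.

```c
#include <assert.h>

/* Hypothetical opaque callee.  It may write to other memory, but it has no
   pointer based on 'a', so under the restrict rules it cannot modify the
   object that 'a' points to.  */
static int counter;

static void
opaque_store (int v)
{
  counter += v;
}

/* 'a' is restrict qualified: if the pointed-to object were modified during
   this function, every access to it would have to go through 'a'.  */
int
sum_around_call (const int *restrict a, int idx)
{
  assert (idx >= 0);

  int first = a[idx];
  opaque_store (first);
  int second = a[idx];	/* A compiler may reuse 'first' here.  */
  return first + second;
}
```

Whether a given optimizer actually removes the second load depends on its
alias analysis; the sketch only shows that the language rules allow it
without any separate "readonly" marking.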
> r.1 = (*a.8)[12];
> foo (a.8, r.1);
> r.1 = (*a.8)[12];
> D.2965 = .omp_data_i->r;
> *D.2965 = r.1;
> return;
> }
>
> __attribute__((oacc parallel, omp target entrypoint, noclone))
> void main._omp_fn.0 (const struct .omp_data_t.2 & restrict .omp_data_i)
> {
> ...
> <bb 3> :
> D.2968 = .omp_data_i->D.2939;
> a.4 = D.2968;
> r.0 = (*a.4)[8];
> foo (a.4, r.0);
> r.0 = (*a.4)[8];
> D.2971 = .omp_data_i->r;
> *D.2971 = r.0;
> return;
> }
>
> So the setting of DECL_POINTS_TO_READONLY and its relaying to
> SSA_NAME_POINTS_TO_READONLY_MEMORY here is actually quite similar to a default-def
> for a PARM_DECL, at least conceptually.
>
> (If offloading was structured significantly differently, say if child functions
> were separated much earlier before omp-lowering, then this readonly-modifier might
> possibly be a direct application of 'r' in the "fn spec" attribute)
>
> Other changes since first version of patch include:
> 1) update of C/C++ FE changes to new style in c-family/c-omp.cc
> 2) merging of two if cases in fortran/trans-openmp.cc like Thomas suggested
> 3) Update of readonly-2.c testcase to scan before/after "fre1" pass, to verify removal of a MEM load, also as Thomas suggested.
>
> I have re-tested this patch using mainline, with no regressions. Is this
> okay for mainline?
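For reference, a self-contained sketch of the kind of source the two quoted
regions come from. The function 'run_region', the empty body of 'foo' and the
array contents are assumptions made for illustration; they are not taken from
the patch or from readonly-2.c. The 'readonly' modifier asserts that the
region does not write to 'a', which is what would allow the second load of
a[idx] to be replaced by the first.

```c
#include <assert.h>

/* Hypothetical stand-in for the opaque 'foo' in the quoted example.  It only
   receives the pointer and a value and does not write through 'p'.  */
#pragma acc routine
static void
foo (const int *p, int v)
{
  (void) p;
  (void) v;
}

/* Read a[idx] before and after the call, as in the quoted regions.  */
int
run_region (int idx)
{
  int a[32];
  int r = 0;

  assert (idx >= 0 && idx < 32);
  for (int i = 0; i < 32; i++)
    a[i] = i * 2;

#pragma acc parallel copyin(readonly: a[:32]) copyout(r)
  {
    foo (a, a[idx]);
    r = a[idx];
  }
  return r;
}
```

Built without -fopenacc the pragmas are ignored and the code runs on the
host, where run_region (8) evaluates to 16.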
+/* In a VAR_DECL, set for variables regarded as pointing to memory not written
+ to. SSA_NAME_POINTS_TO_READONLY_MEMORY gets set for SSA_NAMEs created from
+ such VAR_DECLs. Currently used by OpenACC 'readonly' modifier in copyin
+ clauses. */
+#define DECL_POINTS_TO_READONLY(NODE) \
+ (TREE_CHECK (NODE, VAR_DECL)->decl_common.decl_not_flexarray)
You need to document uses of flags in tree-core.h to avoid clashes.
Also since this doesn't apply to all DECLs it should be named
VAR_POINTS_TO_...
I still think this is too fragile: there are no real constraints
on which VAR_DECLs we create SSA names from, so the automatic propagation
in make_ssa_name_fn and especially copy_var_decl (and, via copy_node,
copy_decl_no_change, thus during inlining) means your arguments
only hold for the OpenMP use, but nothing above hints that
this is only usable there, which is asking for trouble.
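A toy model of that concern, as a sketch only. The 'toy_*' types and
functions below are hypothetical stand-ins and are NOT GCC's tree structures
or the real copy_var_decl / make_ssa_name_fn. They only mimic the two
automatic steps in the patch: a decl copy preserves the flag, and an SSA name
created from a flagged decl inherits it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for a VAR_DECL and an SSA_NAME.  */
struct toy_decl
{
  const char *name;
  bool points_to_readonly;
};

struct toy_ssa_name
{
  const struct toy_decl *var;
  bool points_to_readonly_memory;
};

/* Mimics a decl copy that preserves the flag.  */
static struct toy_decl
toy_copy_var_decl (const struct toy_decl *d, const char *new_name)
{
  assert (d);
  struct toy_decl copy = *d;
  copy.name = new_name;
  return copy;
}

/* Mimics SSA name creation inheriting the flag from the decl.  */
static struct toy_ssa_name
toy_make_ssa_name (const struct toy_decl *d)
{
  assert (d);
  struct toy_ssa_name n = { d, d->points_to_readonly };
  return n;
}

/* Returns true if an SSA name made from a copy of a flagged decl ends up
   flagged, although nobody decided that for the copy explicitly.  */
bool
flag_leaks_through_copy (void)
{
  struct toy_decl a8 = { "a.8", true };
  struct toy_decl copied = toy_copy_var_decl (&a8, "a.8_copy");
  struct toy_ssa_name n = toy_make_ssa_name (&copied);
  return n.points_to_readonly_memory;
}
```

In this model the flag reaches the new SSA name without any check of where
the copy is used, which is the kind of silent propagation being objected to.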
Sorry for the delay,
Richard.
> Thanks,
> Chung-Lin
>
> 2024-04-03 Chung-Lin Tang <cltang@baylibre.com>
>
> gcc/c-family/ChangeLog:
>
> * c-omp.cc (c_omp_address_inspector::expand_array_base):
> Set OMP_CLAUSE_MAP_POINTS_TO_READONLY on pointer clause.
> (c_omp_address_inspector::expand_component_selector): Likewise.
>
> gcc/fortran/ChangeLog:
>
> * trans-openmp.cc (gfc_trans_omp_array_section):
> Set OMP_CLAUSE_MAP_POINTS_TO_READONLY on pointer clause.
>
> gcc/ChangeLog:
>
> * gimple-expr.cc (copy_var_decl): Copy DECL_POINTS_TO_READONLY
> for VAR_DECLs.
> * omp-low.cc (lower_omp_target): Set DECL_POINTS_TO_READONLY for
> variables of receiver refs.
> * tree-pretty-print.cc (dump_omp_clause):
> Print OMP_CLAUSE_MAP_POINTS_TO_READONLY.
> (dump_generic_node): Print SSA_NAME_POINTS_TO_READONLY_MEMORY.
> * tree-ssanames.cc (make_ssa_name_fn): Set
> SSA_NAME_POINTS_TO_READONLY_MEMORY if DECL_POINTS_TO_READONLY is set.
> * tree.h (DECL_POINTS_TO_READONLY): New macro.
> (OMP_CLAUSE_MAP_POINTS_TO_READONLY): New macro.
>
> gcc/testsuite/ChangeLog:
>
> * c-c++-common/goacc/readonly-1.c: Adjust testcase.
> * c-c++-common/goacc/readonly-2.c: New testcase.
> * gfortran.dg/goacc/readonly-1.f90: Adjust testcase.
>
>
--
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
Thread overview: 18+ messages
2023-07-10 18:33 [PATCH, OpenACC 2.7] readonly modifier support in front-ends Chung-Lin Tang
2023-07-11 7:00 ` Tobias Burnus
2023-07-20 13:33 ` Thomas Schwinge
2023-07-20 15:08 ` Tobias Burnus
2023-08-07 13:58 ` [PATCH, OpenACC 2.7, v2] " Chung-Lin Tang
2023-10-26 9:43 ` Thomas Schwinge
2024-03-07 8:02 ` Chung-Lin Tang
2024-03-13 9:12 ` Thomas Schwinge
2024-03-14 15:09 ` OpenACC 2.7: front-end support for readonly modifier: Add basic OpenACC 'declare' testing (was: [PATCH, OpenACC 2.7, v2] readonly modifier support in front-ends) Thomas Schwinge
2024-03-14 16:55 ` OpenACC 2.7: front-end support for readonly modifier: Add basic OpenACC 'declare' testing Tobias Burnus
2024-03-14 16:55 ` Tobias Burnus
2023-07-25 15:52 ` [PATCH, OpenACC 2.7] Connect readonly modifier to points-to analysis Chung-Lin Tang
2023-10-27 14:28 ` Thomas Schwinge
2023-10-30 12:46 ` Richard Biener
2024-04-03 11:50 ` Chung-Lin Tang
2024-04-11 14:29 ` Thomas Schwinge
2024-04-12 6:17 ` Richard Biener
2024-05-16 12:36 ` Richard Biener