* [Patch] OpenMP: Handle descriptors in target's firstprivate [PR104949]
@ 2022-05-11 17:33 Tobias Burnus
2022-05-19 13:59 ` Jakub Jelinek
2023-02-28 10:56 ` Thomas Schwinge
0 siblings, 2 replies; 5+ messages in thread
From: Tobias Burnus @ 2022-05-11 17:33 UTC (permalink / raw)
To: gcc-patches, fortran, Jakub Jelinek
[-- Attachment #1: Type: text/plain, Size: 2862 bytes --]
Dear all, dear Jakub,
this patch handles (for target regions)
firstprivate(array_descriptor)
by not only firstprivatizing the descriptor but also the data
it points to. This is done by turning it in omp-low.cc the clause
into
firstprivate(descr) firstprivate(descr.data)
and then attaching the latter to the former. That works by
adding an 'attach' after the last firstprivate (and checking
for it in libgomp). The attached-to device address for a
previous (here: the first) firstprivate is obtained by returning
the device address inside the hostaddrs[i] alias omp_arr array,
i.e. the compiler generates:
omp_arr.1 = &descr; /* firstprivate */
omp_arr.2 = descr.data; /* firstprivate */
omp_arr.3 = &omp_arr.1; /* attach; bias: &desc.data-&desc */
and libgomp then knows that the device address is in the
pointer.
Not implemented, but this scheme can also be used for
type
integer, allocatable :: A(:),B(:)
end type
where multiple attachments have to be done to the same
privatized variable.
Side effect: For has_device_addr(array_descr) the pre-patch code
changes this to firstprivate – relying on the shallow copying. Thus,
has_device_addr had to be modified to still be shallow.
OK?
* * *
Note: The code is not active for OpenACC. The existing code uses, e.g.,
'goto oacc_firstprivate' – thus, the new code would be
partially active. I went for making it completely inactive for OpenACC
by adding one '!is_gimple_omp_oacc'. I bet that a deep copy would be
also useful for OpenACC, but I have neither checked what the current
code does nor what the OpenACC spec says about this.
* * *
Some crossrefs:
* https://gcc.gnu.org/PR104949 - the PR to this patch.
* has_device_addr + array descriptor, see clarification
for TR11/OpenMP 6 (passed 2nd vote): OpenMP Spec Issue #3180 / Pull Req. #3204
(related to 'firstprivate' above)
* For a pending is_device_ptr(non-c_ptr) -> has_device_addr issue,
see https://gcc.gnu.org/PR105318
* Regarding issues with reallocation of firstprivate, see:
https://gcc.gnu.org/PR105538
(Not completely clear whether the code is valid; there are
rules related (re,de)allocation for data mapping but not
for firstprivate + issue about deallocation at the end of
the scope in this case.)
* Regarding array constructors with non-const length but
constant items, see https://gcc.gnu.org/PR91544
(and testcase)
* Deep mapping patch (but not firstprivate):
https://gcc.gnu.org/pipermail/gcc-patches/2022-April/593704.html
Tobias
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
[-- Attachment #2: omp-firstprivate-alloc-v3.diff --]
[-- Type: text/x-patch, Size: 23865 bytes --]
OpenMP: Handle descriptors in target's firstprivate [PR104949]
For allocatable/pointer arrays, a firstprivate to a device
not only needs to privatize the descriptor but also the actual
data. This is implemented as:
firstprivate(x) firstprivate(x.data) attach(x [bias: &x.data-&x)
where the address of x in device memory is saved in hostaddrs[i]
by libgomp and the middle end actually passes hostaddrs[i]' to
attach.
As side effect, has_device_addr(array_desc) had to be changed:
before, it was converted to firstprivate in the front end; now
it is handled in omp-low.cc as has_device_addr requires a shallow
firstprivate (not touching the data pointer) while the normal
firstprivate requires (now) a deep firstprivate.
gcc/fortran/ChangeLog:
PR fortran/104949
* f95-lang.cc (LANG_HOOKS_OMP_ARRAY_SIZE): Redefine.
* trans-openmp.cc (gfc_omp_array_size): New.
(gfc_trans_omp_variable_list): Never turn has_device_addr
to firstprivate.
* trans.h (gfc_omp_array_size): New.
gcc/ChangeLog:
PR fortran/104949
* langhooks-def.h (lhd_omp_array_size): New.
(LANG_HOOKS_OMP_ARRAY_SIZE): Define
(LANG_HOOKS_DECLS): Add it.
* langhooks.cc (lhd_omp_array_size): New.
* langhooks.h (struct lang_hooks_for_decls): Add hook.
* omp-low.cc (scan_sharing_clauses, lower_omp_target):
Handle GOMP_MAP_FIRSTPRIVATE for array descriptors.
libgomp/ChangeLog:
PR fortran/104949
* target.c (gomp_map_vars_internal, copy_firstprivate_data):
Support attach for GOMP_MAP_FIRSTPRIVATE.
* testsuite/libgomp.fortran/target-firstprivate-1.f90: New test.
* testsuite/libgomp.fortran/target-firstprivate-2.f90: New test.
* testsuite/libgomp.fortran/target-firstprivate-3.f90: New test.
gcc/fortran/f95-lang.cc | 2 +
gcc/fortran/trans-openmp.cc | 53 ++++++++--
gcc/fortran/trans.h | 1 +
gcc/langhooks-def.h | 3 +
gcc/langhooks.cc | 8 ++
gcc/langhooks.h | 5 +
gcc/omp-low.cc | 102 ++++++++++++++++++-
libgomp/target.c | 22 ++++
.../libgomp.fortran/target-firstprivate-1.f90 | 33 ++++++
.../libgomp.fortran/target-firstprivate-2.f90 | 113 +++++++++++++++++++++
.../libgomp.fortran/target-firstprivate-3.f90 | 24 +++++
11 files changed, 355 insertions(+), 11 deletions(-)
diff --git a/gcc/fortran/f95-lang.cc b/gcc/fortran/f95-lang.cc
index 1a895a25132..e83fef378bb 100644
--- a/gcc/fortran/f95-lang.cc
+++ b/gcc/fortran/f95-lang.cc
@@ -114,6 +114,7 @@ static const struct attribute_spec gfc_attribute_table[] =
#undef LANG_HOOKS_TYPE_FOR_SIZE
#undef LANG_HOOKS_INIT_TS
#undef LANG_HOOKS_OMP_ARRAY_DATA
+#undef LANG_HOOKS_OMP_ARRAY_SIZE
#undef LANG_HOOKS_OMP_IS_ALLOCATABLE_OR_PTR
#undef LANG_HOOKS_OMP_CHECK_OPTIONAL_ARGUMENT
#undef LANG_HOOKS_OMP_PRIVATIZE_BY_REFERENCE
@@ -152,6 +153,7 @@ static const struct attribute_spec gfc_attribute_table[] =
#define LANG_HOOKS_TYPE_FOR_SIZE gfc_type_for_size
#define LANG_HOOKS_INIT_TS gfc_init_ts
#define LANG_HOOKS_OMP_ARRAY_DATA gfc_omp_array_data
+#define LANG_HOOKS_OMP_ARRAY_SIZE gfc_omp_array_size
#define LANG_HOOKS_OMP_IS_ALLOCATABLE_OR_PTR gfc_omp_is_allocatable_or_ptr
#define LANG_HOOKS_OMP_CHECK_OPTIONAL_ARGUMENT gfc_omp_check_optional_argument
#define LANG_HOOKS_OMP_PRIVATIZE_BY_REFERENCE gfc_omp_privatize_by_reference
diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc
index baa45f78a0e..5c133ab7fe0 100644
--- a/gcc/fortran/trans-openmp.cc
+++ b/gcc/fortran/trans-openmp.cc
@@ -169,6 +169,48 @@ gfc_omp_array_data (tree decl, bool type_only)
return decl;
}
+/* Return the byte-size of the passed array descriptor. */
+
+tree
+gfc_omp_array_size (tree decl, gimple_seq *pre_p)
+{
+ stmtblock_t block;
+ if (POINTER_TYPE_P (TREE_TYPE (decl)))
+ decl = build_fold_indirect_ref (decl);
+ tree type = TREE_TYPE (decl);
+ gcc_assert (GFC_DESCRIPTOR_TYPE_P (type));
+ bool allocatable = (GFC_TYPE_ARRAY_AKIND (type) == GFC_ARRAY_ALLOCATABLE
+ || GFC_TYPE_ARRAY_AKIND (type) == GFC_ARRAY_POINTER
+ || GFC_TYPE_ARRAY_AKIND (type) == GFC_ARRAY_POINTER_CONT);
+ gfc_init_block (&block);
+ tree size = gfc_full_array_size (&block, decl,
+ GFC_TYPE_ARRAY_RANK (TREE_TYPE (decl)));
+ size = fold_convert (size_type_node, size);
+ tree elemsz = gfc_get_element_type (TREE_TYPE (decl));
+ if (TREE_CODE (elemsz) == ARRAY_TYPE && TYPE_STRING_FLAG (elemsz))
+ elemsz = gfc_conv_descriptor_elem_len (decl);
+ else
+ elemsz = TYPE_SIZE_UNIT (elemsz);
+ size = fold_build2 (MULT_EXPR, size_type_node, size, elemsz);
+ if (!allocatable)
+ gimplify_and_add (gfc_finish_block (&block), pre_p);
+ else
+ {
+ tree var = create_tmp_var (size_type_node);
+ gfc_add_expr_to_block (&block, build2 (MODIFY_EXPR, sizetype, var, size));
+ tree tmp = fold_build2_loc (input_location, NE_EXPR, boolean_type_node,
+ gfc_conv_descriptor_data_get (decl),
+ null_pointer_node);
+ tmp = build3_loc (input_location, COND_EXPR, void_type_node, tmp,
+ gfc_finish_block (&block),
+ build2 (MODIFY_EXPR, sizetype, var, size_zero_node));
+ gimplify_and_add (tmp, pre_p);
+ size = var;
+ }
+ return size;
+}
+
+
/* True if OpenMP should privatize what this DECL points to rather
than the DECL itself. */
@@ -1922,16 +1964,7 @@ gfc_trans_omp_variable_list (enum omp_clause_code code,
if (t != error_mark_node)
{
tree node;
- /* For HAS_DEVICE_ADDR of an array descriptor, firstprivatize the
- descriptor such that the bounds are available; its data component
- is unmodified; it is handled as device address inside target. */
- if (code == OMP_CLAUSE_HAS_DEVICE_ADDR
- && (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (t))
- || (POINTER_TYPE_P (TREE_TYPE (t))
- && GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (TREE_TYPE (t))))))
- node = build_omp_clause (input_location, OMP_CLAUSE_FIRSTPRIVATE);
- else
- node = build_omp_clause (input_location, code);
+ node = build_omp_clause (input_location, code);
OMP_CLAUSE_DECL (node) = t;
list = gfc_trans_add_clause (node, list);
diff --git a/gcc/fortran/trans.h b/gcc/fortran/trans.h
index 623aceed520..03d5288aad2 100644
--- a/gcc/fortran/trans.h
+++ b/gcc/fortran/trans.h
@@ -808,6 +808,7 @@ bool gfc_get_array_descr_info (const_tree, struct array_descr_info *);
bool gfc_omp_is_allocatable_or_ptr (const_tree);
tree gfc_omp_check_optional_argument (tree, bool);
tree gfc_omp_array_data (tree, bool);
+tree gfc_omp_array_size (tree, gimple_seq *);
bool gfc_omp_privatize_by_reference (const_tree);
enum omp_clause_default_kind gfc_omp_predetermined_sharing (tree);
enum omp_clause_defaultmap_kind gfc_omp_predetermined_mapping (tree);
diff --git a/gcc/langhooks-def.h b/gcc/langhooks-def.h
index e2263951709..95d8dec8cc3 100644
--- a/gcc/langhooks-def.h
+++ b/gcc/langhooks-def.h
@@ -84,6 +84,7 @@ extern enum omp_clause_default_kind lhd_omp_predetermined_sharing (tree);
extern enum omp_clause_defaultmap_kind lhd_omp_predetermined_mapping (tree);
extern tree lhd_omp_assignment (tree, tree, tree);
extern void lhd_omp_finish_clause (tree, gimple_seq *, bool);
+extern tree lhd_omp_array_size (tree, gimple_seq *);
struct gimplify_omp_ctx;
extern void lhd_omp_firstprivatize_type_sizes (struct gimplify_omp_ctx *,
tree);
@@ -257,6 +258,7 @@ extern tree lhd_unit_size_without_reusable_padding (tree);
#define LANG_HOOKS_POST_COMPILATION_PARSING_CLEANUPS NULL
#define LANG_HOOKS_DECL_OK_FOR_SIBCALL lhd_decl_ok_for_sibcall
#define LANG_HOOKS_OMP_ARRAY_DATA hook_tree_tree_bool_null
+#define LANG_HOOKS_OMP_ARRAY_SIZE lhd_omp_array_size
#define LANG_HOOKS_OMP_IS_ALLOCATABLE_OR_PTR hook_bool_const_tree_false
#define LANG_HOOKS_OMP_CHECK_OPTIONAL_ARGUMENT hook_tree_tree_bool_null
#define LANG_HOOKS_OMP_PRIVATIZE_BY_REFERENCE hook_bool_const_tree_false
@@ -290,6 +292,7 @@ extern tree lhd_unit_size_without_reusable_padding (tree);
LANG_HOOKS_POST_COMPILATION_PARSING_CLEANUPS, \
LANG_HOOKS_DECL_OK_FOR_SIBCALL, \
LANG_HOOKS_OMP_ARRAY_DATA, \
+ LANG_HOOKS_OMP_ARRAY_SIZE, \
LANG_HOOKS_OMP_IS_ALLOCATABLE_OR_PTR, \
LANG_HOOKS_OMP_CHECK_OPTIONAL_ARGUMENT, \
LANG_HOOKS_OMP_PRIVATIZE_BY_REFERENCE, \
diff --git a/gcc/langhooks.cc b/gcc/langhooks.cc
index df970678a08..97e51396521 100644
--- a/gcc/langhooks.cc
+++ b/gcc/langhooks.cc
@@ -634,6 +634,14 @@ lhd_omp_finish_clause (tree, gimple_seq *, bool)
{
}
+/* Return array size; cf. omp_array_data. */
+
+tree
+lhd_omp_array_size (tree, gimple_seq *)
+{
+ return NULL_TREE;
+}
+
/* Return true if DECL is a scalar variable (for the purpose of
implicit firstprivatization & mapping). Only if alloc_ptr_ok
are allocatables and pointers accepted. */
diff --git a/gcc/langhooks.h b/gcc/langhooks.h
index 4731f089a2e..75025550aa4 100644
--- a/gcc/langhooks.h
+++ b/gcc/langhooks.h
@@ -246,6 +246,11 @@ struct lang_hooks_for_decls
is true, only the TREE_TYPE is returned without generating a new tree. */
tree (*omp_array_data) (tree, bool);
+ /* Return a tree for the actual data of an array descriptor - or NULL_TREE
+ if original tree is not an array descriptor. If the second argument
+ is true, only the TREE_TYPE is returned without generating a new tree. */
+ tree (*omp_array_size) (tree, gimple_seq *pre_p);
+
/* True if OpenMP should regard this DECL as being a scalar which has Fortran's
allocatable or pointer attribute. */
bool (*omp_is_allocatable_or_ptr) (const_tree);
diff --git a/gcc/omp-low.cc b/gcc/omp-low.cc
index e7818a9af5f..add99a42e90 100644
--- a/gcc/omp-low.cc
+++ b/gcc/omp-low.cc
@@ -1372,7 +1372,9 @@ scan_sharing_clauses (tree clauses, omp_context *ctx)
|| OMP_CLAUSE_CODE (c) == OMP_CLAUSE_HAS_DEVICE_ADDR)
&& is_gimple_omp_offloaded (ctx->stmt))
{
- if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_FIRSTPRIVATE)
+ if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_FIRSTPRIVATE
+ || (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_HAS_DEVICE_ADDR
+ && lang_hooks.decls.omp_array_data (decl, true)))
{
by_ref = !omp_privatize_by_reference (decl);
install_var_field (decl, by_ref, 3, ctx);
@@ -1424,6 +1426,15 @@ scan_sharing_clauses (tree clauses, omp_context *ctx)
install_var_field (decl, by_ref, 3, ctx);
}
install_var_local (decl, ctx);
+ /* For descr arrays on target: firstprivatize data + attach ptr. */
+ if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_FIRSTPRIVATE
+ && is_gimple_omp_offloaded (ctx->stmt)
+ && !is_gimple_omp_oacc (ctx->stmt)
+ && lang_hooks.decls.omp_array_data (decl, true))
+ {
+ install_var_field (decl, false, 16 | 3, ctx);
+ install_var_field (decl, true, 8 | 3, ctx);
+ }
break;
case OMP_CLAUSE_USE_DEVICE_PTR:
@@ -12825,6 +12836,7 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
break;
case OMP_CLAUSE_FIRSTPRIVATE:
+ omp_firstprivate_recv:
gcc_checking_assert (offloaded);
if (is_gimple_omp_oacc (ctx->stmt))
{
@@ -12856,6 +12868,10 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
SET_DECL_VALUE_EXPR (new_var, x);
DECL_HAS_VALUE_EXPR_P (new_var) = 1;
}
+ /* Fortran array descriptors: firstprivate of data + attach. */
+ if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_HAS_DEVICE_ADDR
+ && lang_hooks.decls.omp_array_data (var, true))
+ map_cnt += 2;
break;
case OMP_CLAUSE_PRIVATE:
@@ -12895,6 +12911,8 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
while (TREE_CODE (var) == INDIRECT_REF
|| TREE_CODE (var) == ARRAY_REF)
var = TREE_OPERAND (var, 0);
+ if (lang_hooks.decls.omp_array_data (var, true))
+ goto omp_firstprivate_recv;
}
map_cnt++;
if (is_variable_sized (var))
@@ -13308,6 +13326,7 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
break;
case OMP_CLAUSE_FIRSTPRIVATE:
+ omp_has_device_addr_descr:
if (is_gimple_omp_oacc (ctx->stmt))
goto oacc_firstprivate_map;
ovar = OMP_CLAUSE_DECL (c);
@@ -13373,6 +13392,82 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
<= tree_to_uhwi (TYPE_MAX_VALUE (tkind_type)));
CONSTRUCTOR_APPEND_ELT (vkind, purpose,
build_int_cstu (tkind_type, tkind));
+ /* Fortran array descriptors: firstprivate of data + attach. */
+ if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_HAS_DEVICE_ADDR
+ && lang_hooks.decls.omp_array_data (ovar, true))
+ {
+ tree not_null_lb, null_lb, after_lb;
+ tree var1, var2, size1, size2;
+ tree present = omp_check_optional_argument (ovar, true);
+ if (present)
+ {
+ location_t clause_loc = OMP_CLAUSE_LOCATION (c);
+ not_null_lb = create_artificial_label (clause_loc);
+ null_lb = create_artificial_label (clause_loc);
+ after_lb = create_artificial_label (clause_loc);
+ gimple_seq seq = NULL;
+ present = force_gimple_operand (present, &seq, true,
+ NULL_TREE);
+ gimple_seq_add_seq (&ilist, seq);
+ gimple_seq_add_stmt (&ilist,
+ gimple_build_cond_from_tree (present,
+ not_null_lb, null_lb));
+ gimple_seq_add_stmt (&ilist,
+ gimple_build_label (not_null_lb));
+ }
+ var1 = lang_hooks.decls.omp_array_data (var, false);
+ size1 = lang_hooks.decls.omp_array_size (var, &ilist);
+ var2 = build_fold_addr_expr (x);
+ if (!POINTER_TYPE_P (TREE_TYPE (var)))
+ var = build_fold_addr_expr (var);
+ size2 = fold_build2 (POINTER_DIFF_EXPR, ssizetype,
+ build_fold_addr_expr (var1), var);
+ size2 = fold_convert (sizetype, size2);
+ if (present)
+ {
+ tree tmp = create_tmp_var (TREE_TYPE (var1));
+ gimplify_assign (tmp, var1, &ilist);
+ var1 = tmp;
+ tmp = create_tmp_var (TREE_TYPE (var2));
+ gimplify_assign (tmp, var2, &ilist);
+ var2 = tmp;
+ tmp = create_tmp_var (TREE_TYPE (size1));
+ gimplify_assign (tmp, size1, &ilist);
+ size1 = tmp;
+ tmp = create_tmp_var (TREE_TYPE (size2));
+ gimplify_assign (tmp, size2, &ilist);
+ size2 = tmp;
+ gimple_seq_add_stmt (&ilist, gimple_build_goto (after_lb));
+ gimple_seq_add_stmt (&ilist, gimple_build_label (null_lb));
+ gimplify_assign (var1, null_pointer_node, &ilist);
+ gimplify_assign (var2, null_pointer_node, &ilist);
+ gimplify_assign (size1, size_zero_node, &ilist);
+ gimplify_assign (size2, size_zero_node, &ilist);
+ gimple_seq_add_stmt (&ilist, gimple_build_label (after_lb));
+ }
+ x = build_sender_ref ((splay_tree_key) &DECL_NAME (ovar), ctx);
+ gimplify_assign (x, var1, &ilist);
+ tkind = GOMP_MAP_FIRSTPRIVATE;
+ talign = DECL_ALIGN_UNIT (ovar);
+ talign = ceil_log2 (talign);
+ tkind |= talign << talign_shift;
+ gcc_checking_assert (tkind
+ <= tree_to_uhwi (
+ TYPE_MAX_VALUE (tkind_type)));
+ purpose = size_int (map_idx++);
+ CONSTRUCTOR_APPEND_ELT (vsize, purpose, size1);
+ if (TREE_CODE (size1) != INTEGER_CST)
+ TREE_STATIC (TREE_VEC_ELT (t, 1)) = 0;
+ CONSTRUCTOR_APPEND_ELT (vkind, purpose,
+ build_int_cstu (tkind_type, tkind));
+ x = build_sender_ref ((splay_tree_key) &DECL_UID (ovar), ctx);
+ gimplify_assign (x, var2, &ilist);
+ tkind = GOMP_MAP_ATTACH;
+ purpose = size_int (map_idx++);
+ CONSTRUCTOR_APPEND_ELT (vsize, purpose, size2);
+ CONSTRUCTOR_APPEND_ELT (vkind, purpose,
+ build_int_cstu (tkind_type, tkind));
+ }
break;
case OMP_CLAUSE_USE_DEVICE_PTR:
@@ -13382,6 +13477,8 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
ovar = OMP_CLAUSE_DECL (c);
if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_HAS_DEVICE_ADDR)
{
+ if (lang_hooks.decls.omp_array_data (ovar, true))
+ goto omp_has_device_addr_descr;
while (TREE_CODE (ovar) == INDIRECT_REF
|| TREE_CODE (ovar) == ARRAY_REF)
ovar = TREE_OPERAND (ovar, 0);
@@ -13548,6 +13645,7 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
default:
break;
case OMP_CLAUSE_FIRSTPRIVATE:
+ omp_firstprivatize_data_region:
if (is_gimple_omp_oacc (ctx->stmt))
break;
var = OMP_CLAUSE_DECL (c);
@@ -13642,6 +13740,8 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
do_optional_check = false;
var = OMP_CLAUSE_DECL (c);
is_array_data = lang_hooks.decls.omp_array_data (var, true) != NULL;
+ if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_HAS_DEVICE_ADDR && is_array_data)
+ goto omp_firstprivatize_data_region;
if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_IS_DEVICE_PTR
&& OMP_CLAUSE_CODE (c) != OMP_CLAUSE_HAS_DEVICE_ADDR)
diff --git a/libgomp/target.c b/libgomp/target.c
index 4d62efdf526..89e7b7b7b0b 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -1350,7 +1350,24 @@ gomp_map_vars_internal (struct gomp_device_descr *devicep,
gomp_copy_host2dev (devicep, aq,
(void *) (tgt->tgt_start + tgt_size),
(void *) hostaddrs[i], len, false, cbufp);
+ /* Save device address in hostaddr to permit latter availablity
+ when doing a deep-firstprivate with pointer attach. */
+ hostaddrs[i] = (void *) (tgt->tgt_start + tgt_size);
tgt_size += len;
+
+ /* If followed by GOMP_MAP_ATTACH, pointer assign this
+ firstprivate to hostaddrs[i+1], which is assumed to contain a
+ device address. */
+ if (i + 1 < mapnum
+ && (GOMP_MAP_ATTACH
+ == (typemask & get_kind (short_mapkind, kinds, i+1))))
+ {
+ uintptr_t target = (uintptr_t) hostaddrs[i];
+ void *devptr = *(void**) hostaddrs[i+1] + sizes[i+1];
+ gomp_copy_host2dev (devicep, aq, devptr, &target,
+ sizeof (void *), false, cbufp);
+ ++i;
+ }
continue;
case GOMP_MAP_FIRSTPRIVATE_INT:
case GOMP_MAP_ZERO_LEN_ARRAY_SECTION:
@@ -2517,6 +2534,11 @@ copy_firstprivate_data (char *tgt, size_t mapnum, void **hostaddrs,
memcpy (tgt + tgt_size, hostaddrs[i], sizes[i]);
hostaddrs[i] = tgt + tgt_size;
tgt_size = tgt_size + sizes[i];
+ if (i + 1 < mapnum && (kinds[i+1] & 0xff) == GOMP_MAP_ATTACH)
+ {
+ *(*(uintptr_t**) hostaddrs[i+1] + sizes[i+1]) = (uintptr_t) hostaddrs[i];
+ ++i;
+ }
}
}
diff --git a/libgomp/testsuite/libgomp.fortran/target-firstprivate-1.f90 b/libgomp/testsuite/libgomp.fortran/target-firstprivate-1.f90
new file mode 100644
index 00000000000..7b77992a21d
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/target-firstprivate-1.f90
@@ -0,0 +1,33 @@
+! PR fortran/104949
+
+implicit none (type,external)
+integer, allocatable :: A(:)
+A = [1,2,3,4,5,6]
+
+!$omp parallel firstprivate(A)
+!$omp master
+ if (any (A /= [1,2,3,4,5])) error stop
+ A(:) = [99,88,77,66,55]
+!$omp end master
+!$omp end parallel
+
+!$omp target firstprivate(A)
+ if (any (A /= [1,2,3,4,5])) error stop
+ A(:) = [99,88,77,66,55]
+!$omp end target
+if (any (A /= [1,2,3,4,5])) error stop
+
+!$omp parallel default(firstprivate)
+!$omp master
+ if (any (A /= [1,2,3,4,5])) error stop
+ A(:) = [99,88,77,66,55]
+!$omp end master
+!$omp end parallel
+if (any (A /= [1,2,3,4,5])) error stop
+
+!$omp target defaultmap(firstprivate)
+ if (any (A /= [1,2,3,4,5])) error stop
+ A(:) = [99,88,77,66,55]
+!$omp end target
+if (any (A /= [1,2,3,4,5])) error stop
+end
diff --git a/libgomp/testsuite/libgomp.fortran/target-firstprivate-2.f90 b/libgomp/testsuite/libgomp.fortran/target-firstprivate-2.f90
new file mode 100644
index 00000000000..d00b4070c11
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/target-firstprivate-2.f90
@@ -0,0 +1,113 @@
+! PR fortran/104949
+
+module m
+use omp_lib
+implicit none (type, external)
+
+contains
+subroutine one
+ integer, allocatable :: x(:)
+ integer :: i
+
+ do i = 1, omp_get_num_devices() + 1
+ !$omp target firstprivate(x)
+ if (allocated(x)) error stop
+ !$omp end target
+ if (allocated(x)) error stop
+ end do
+
+ do i = 1, omp_get_num_devices() + 1
+ !$omp target firstprivate(x, i)
+ if (allocated(x)) error stop
+ x = [10,20,30,40] + i
+ if (any (x /= [10,20,30,40] + i)) error stop
+ ! This leaks memory!
+ ! deallocate(x)
+ !$omp end target
+ if (allocated(x)) error stop
+ end do
+
+ x = [1,2,3,4]
+
+ do i = 1, omp_get_num_devices() + 1
+ !$omp target firstprivate(x, i)
+ if (i <= 0) error stop
+ if (.not.allocated(x)) error stop
+ if (size(x) /= 4) error stop
+ if (lbound(x,1) /= 1) error stop
+ if (any (x /= [1,2,3,4])) error stop
+ ! no reallocation, just malloced + assignment
+ x = [10,20,30,40] + i
+ if (any (x /= [10,20,30,40] + i)) error stop
+ ! This leaks memory!
+ ! deallocate(x)
+ !$omp end target
+ if (.not.allocated(x)) error stop
+ if (size(x) /= 4) error stop
+ if (lbound(x,1) /= 1) error stop
+ if (any (x /= [1,2,3,4])) error stop
+ end do
+ deallocate(x)
+end
+
+subroutine two
+ character(len=:), allocatable :: x(:)
+ character(len=5) :: str
+ integer :: i
+
+ str = "abcde" ! work around for PR fortran/91544
+ do i = 1, omp_get_num_devices() + 1
+ !$omp target firstprivate(x)
+ if (allocated(x)) error stop
+ !$omp end target
+ if (allocated(x)) error stop
+ end do
+
+ do i = 1, omp_get_num_devices() + 1
+ !$omp target firstprivate(x, i)
+ if (allocated(x)) error stop
+ ! no reallocation, just malloced + assignment
+ x = [character(len=2+i) :: str,"fhji","klmno"]
+ if (len(x) /= 2+i) error stop
+ if (any (x /= [character(len=2+i) :: str,"fhji","klmno"])) error stop
+ ! This leaks memory!
+ ! deallocate(x)
+ !$omp end target
+ if (allocated(x)) error stop
+ end do
+
+ x = [character(len=4) :: "ABCDE","FHJI","KLMNO"]
+
+ do i = 1, omp_get_num_devices() + 1
+ !$omp target firstprivate(x, i)
+ if (i <= 0) error stop
+ if (.not.allocated(x)) error stop
+ if (size(x) /= 3) error stop
+ if (lbound(x,1) /= 1) error stop
+ if (len(x) /= 4) error stop
+ if (any (x /= [character(len=4) :: "ABCDE","FHJI","KLMNO"])) error stop
+ !! Reallocation runs into the issue PR fortran/105538
+ !!
+ !!x = [character(len=2+i) :: str,"fhji","klmno"]
+ !!if (len(x) /= 2+i) error stop
+ !!if (any (x /= [character(len=2+i) :: str,"fhji","klmno"])) error stop
+ !! This leaks memory!
+ !! deallocate(x)
+ ! Just assign:
+ x = [character(len=4) :: "abcde","fhji","klmno"]
+ if (any (x /= [character(len=4) :: "abcde","fhji","klmno"])) error stop
+ !$omp end target
+ if (.not.allocated(x)) error stop
+ if (lbound(x,1) /= 1) error stop
+ if (size(x) /= 3) error stop
+ if (len(x) /= 4) error stop
+ if (any (x /= [character(len=4) :: "ABCDE","FHJI","KLMNO"])) error stop
+ end do
+ deallocate(x)
+end
+end module m
+
+use m
+call one
+call two
+end
diff --git a/libgomp/testsuite/libgomp.fortran/target-firstprivate-3.f90 b/libgomp/testsuite/libgomp.fortran/target-firstprivate-3.f90
new file mode 100644
index 00000000000..7406cdc4e41
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/target-firstprivate-3.f90
@@ -0,0 +1,24 @@
+implicit none
+ integer, allocatable :: x(:)
+ x = [1,2,3,4]
+ call foo(x)
+ if (any (x /= [1,2,3,4])) error stop
+ call foo()
+contains
+subroutine foo(c)
+ integer, allocatable, optional :: c(:)
+ logical :: is_present
+ is_present = present (c)
+ !$omp target firstprivate(c)
+ if (is_present) then
+ if (.not. allocated(c)) error stop
+ if (any (c /= [1,2,3,4])) error stop
+ c = [99,88,77,66]
+ if (any (c /= [99,88,77,66])) error stop
+ end if
+ !$omp end target
+ if (is_present) then
+ if (any (c /= [1,2,3,4])) error stop
+ end if
+end
+end
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Patch] OpenMP: Handle descriptors in target's firstprivate [PR104949]
2022-05-11 17:33 [Patch] OpenMP: Handle descriptors in target's firstprivate [PR104949] Tobias Burnus
@ 2022-05-19 13:59 ` Jakub Jelinek
2022-05-23 9:00 ` Tobias Burnus
2023-02-28 10:56 ` Thomas Schwinge
1 sibling, 1 reply; 5+ messages in thread
From: Jakub Jelinek @ 2022-05-19 13:59 UTC (permalink / raw)
To: Tobias Burnus; +Cc: gcc-patches, fortran
On Wed, May 11, 2022 at 07:33:00PM +0200, Tobias Burnus wrote:
> gcc/fortran/ChangeLog:
>
> PR fortran/104949
> * f95-lang.cc (LANG_HOOKS_OMP_ARRAY_SIZE): Redefine.
> * trans-openmp.cc (gfc_omp_array_size): New.
> (gfc_trans_omp_variable_list): Never turn has_device_addr
> to firstprivate.
> * trans.h (gfc_omp_array_size): New.
>
> gcc/ChangeLog:
>
> PR fortran/104949
> * langhooks-def.h (lhd_omp_array_size): New.
> (LANG_HOOKS_OMP_ARRAY_SIZE): Define
Missing full stop above.
> (LANG_HOOKS_DECLS): Add it.
> * langhooks.cc (lhd_omp_array_size): New.
> * langhooks.h (struct lang_hooks_for_decls): Add hook.
> * omp-low.cc (scan_sharing_clauses, lower_omp_target):
> Handle GOMP_MAP_FIRSTPRIVATE for array descriptors.
>
> libgomp/ChangeLog:
>
> PR fortran/104949
> * target.c (gomp_map_vars_internal, copy_firstprivate_data):
> Support attach for GOMP_MAP_FIRSTPRIVATE.
> * testsuite/libgomp.fortran/target-firstprivate-1.f90: New test.
> * testsuite/libgomp.fortran/target-firstprivate-2.f90: New test.
> * testsuite/libgomp.fortran/target-firstprivate-3.f90: New test.
I guess ok like this for now, but handling the further deep copy cases
(allocatable members of derived types) wouldn't be very nice, I think
generally we need a target hook to handle the stuff that is target specific
and express it say in further clauses or their modified copies (perhaps some
flags on them, or new clause types) which will allow the pointer attachments
to be done.
Generally, for firstprivate on constructs other than target we invoke
a copy constructor or its language equivalent (memcpy for C, perhaps some
deep copying for Fortran), which takes care of stuff like in C++ embedded
reference type members etc.
But for target we don't have such a luxury, we don't have copy ctors between
different devices and so we need to do something different, for now it has
been mainly just bitwise copying the aggregate between devices.
But eventually it would be nice to have a target hook that emulates the
cross-device copy construction. And we probably need also something to
emulate destruction...
Jakub
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Patch] OpenMP: Handle descriptors in target's firstprivate [PR104949]
2022-05-19 13:59 ` Jakub Jelinek
@ 2022-05-23 9:00 ` Tobias Burnus
0 siblings, 0 replies; 5+ messages in thread
From: Tobias Burnus @ 2022-05-23 9:00 UTC (permalink / raw)
To: Jakub Jelinek; +Cc: gcc-patches, fortran
Hi Jakub,
On 19.05.22 15:59, Jakub Jelinek wrote:
> I guess ok like this for now, but handling the further deep copy cases
> (allocatable members of derived types) wouldn't be very nice, I think
> generally we need a target hook to handle the stuff that is target specific
> and express it say in further clauses or their modified copies (perhaps some
> flags on them, or new clause types) which will allow the pointer attachments
> to be done.
I concur – although, the question is how to to it best – i.e. what is statically
known vs. only known at run time. The current patch requires some in-depth
knowledge both of the internal structure (array size) and also the handling of
what is passed to libgomp. But it can be done statically.
Thus, I think it is okay to handle this case of firstprivate differently from:
For the Fortran patch regarding deep-copying of derived types, it is different:
it is a complicated deeply nested structure and with polymorphic types or
recursive types – or array derived types with allocatable derived components.
In this case, omp-low.cc only calls a three lang hooks and defers most to the
language hooks. Namely: Has deep copying, how many (run-time determined) - do
a malloc - and last hook: fill the three arrays (data, sizes, kinds).
I think that patch can be extended to handle deep firstprivate as well. As the
FE lang hook code controls the data/sizes/kinds array handling, it can also
handle the firstprivate bits.
(I need at some point to cleanup the patch and submit it piecewise, starting
with some generic Fortran patches.)
(Cross ref: See omp-low.cc changes at
https://gcc.gnu.org/pipermail/gcc-patches/2022-April/593562.html)
> But eventually it would be nice to have a target hook that emulates the
> cross-device copy construction. And we probably need also something to
> emulate destruction...
Yes – we also need something for map as OpenMP 5.x (x=1 or 2, I forgot; to be
extended in 6.0) permits more with regards to dynamic types and calling virtual
functions. (Likewise, but not relevant to mapping: Also dereferencing function
pointers.)
The submitted patch was now committed as
https://gcc.gnu.org/r13-706-g49d1a2f91325fa8cc011149e27e5093a988b3a49.
Thanks for the comments!
Tobias
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Patch] OpenMP: Handle descriptors in target's firstprivate [PR104949]
2022-05-11 17:33 [Patch] OpenMP: Handle descriptors in target's firstprivate [PR104949] Tobias Burnus
2022-05-19 13:59 ` Jakub Jelinek
@ 2023-02-28 10:56 ` Thomas Schwinge
2023-03-24 16:18 ` Add caveat/safeguard to OpenMP: Handle descriptors in target's firstprivate [PR104949] (was: [Patch] OpenMP: Handle descriptors in target's firstprivate [PR104949]) Thomas Schwinge
1 sibling, 1 reply; 5+ messages in thread
From: Thomas Schwinge @ 2023-02-28 10:56 UTC (permalink / raw)
To: Tobias Burnus, Jakub Jelinek, Julian Brown; +Cc: gcc-patches, fortran
Hi!
I'm currently reviewing 'gomp_copy_host2dev', 'ephemeral' in a different
context, and a question came up here;
commit r13-706-g49d1a2f91325fa8cc011149e27e5093a988b3a49
"OpenMP: Handle descriptors in target's firstprivate [PR104949]":
On 2022-05-11T19:33:00+0200, Tobias Burnus <tobias@codesourcery.com> wrote:
> this patch handles (for target regions)
> firstprivate(array_descriptor)
> by not only firstprivatizing the descriptor but also the data
> it points to. This is done by turning it in omp-low.cc the clause
> into
> firstprivate(descr) firstprivate(descr.data)
> and then attaching the latter to the former. That works by
> adding an 'attach' after the last firstprivate (and checking
> for it in libgomp). The attached-to device address for a
> previous (here: the first) firstprivate is obtained by returning
> the device address inside the hostaddrs[i] alias omp_arr array,
> i.e. the compiler generates:
> omp_arr.1 = &descr; /* firstprivate */
> omp_arr.2 = descr.data; /* firstprivate */
> omp_arr.3 = &omp_arr.1; /* attach; bias: &desc.data-&desc */
> and libgomp then knows that the device address is in the
> pointer.
> Note: The code is not active for OpenACC. The existing code uses, e.g.,
> 'goto oacc_firstprivate' – thus, the new code would be
> partially active. I went for making it completely inactive for OpenACC
> by adding one '!is_gimple_omp_oacc'.
ACK.
> I bet that a deep copy would be
> also useful for OpenACC, but I have neither checked what the current
> code does nor what the OpenACC spec says about this.
Instead of adding corresponding handling to the OpenACC 'firstprivate'
special code paths later on, I suggest that we first address known issues
with OpenACC 'firstprivate' -- which probably may largely be achieved by
in fact removing those 'goto oacc_firstprivate's and other special code
paths? For example, see <https://gcc.gnu.org/PR92036>
"OpenACC 'firstprivate' clause: initial value".
That means, the following code currently isn't active for OpenACC, and
given that OpenMP 'target' doesn't do asynchronous device execution
(meaning: not in the way/implementation of OpenACC 'async'), it thus
doesn't care about the 'ephemeral' argument to 'gomp_copy_host2dev', but
still, for correctness (and once that code gets used for OpenACC):
> OpenMP: Handle descriptors in target's firstprivate [PR104949]
>
> For allocatable/pointer arrays, a firstprivate to a device
> not only needs to privatize the descriptor but also the actual
> data. This is implemented as:
> firstprivate(x) firstprivate(x.data) attach(x [bias: &x.data-&x)
> where the address of x in device memory is saved in hostaddrs[i]
> by libgomp and the middle end actually passes hostaddrs[i]' to
> attach.
> --- a/libgomp/target.c
> +++ b/libgomp/target.c
> @@ -1350,7 +1350,24 @@ gomp_map_vars_internal (struct gomp_device_descr *devicep,
> gomp_copy_host2dev (devicep, aq,
> (void *) (tgt->tgt_start + tgt_size),
> (void *) hostaddrs[i], len, false, cbufp);
Here, passing 'ephemeral <- false' is correct, as 'h <- hostaddrs[i]'
points to non-ephemeral data.
> + /* Save device address in hostaddr to permit latter availablity
> + when doing a deep-firstprivate with pointer attach. */
> + hostaddrs[i] = (void *) (tgt->tgt_start + tgt_size);
Here, we modify 'hostaddrs[i]' (itself -- *not* the data that the
original 'hostaddrs[i]' points to), so the above 'gomp_copy_host2dev'
with 'ephemeral <- false' is still correct, right?
> tgt_size += len;
> +
> + /* If followed by GOMP_MAP_ATTACH, pointer assign this
> + firstprivate to hostaddrs[i+1], which is assumed to contain a
> + device address. */
> + if (i + 1 < mapnum
> + && (GOMP_MAP_ATTACH
> + == (typemask & get_kind (short_mapkind, kinds, i+1))))
> + {
> + uintptr_t target = (uintptr_t) hostaddrs[i];
> + void *devptr = *(void**) hostaddrs[i+1] + sizes[i+1];
> + gomp_copy_host2dev (devicep, aq, devptr, &target,
> + sizeof (void *), false, cbufp);
However, 'h <- &target' here points to data in the local frame
('target'), which potentially goes out of scope before an asynchronous
'gomp_copy_host2dev' has completed. Thus, don't we have to pass here
'ephemeral <- true' instead of 'ephemeral <- false'? Or, actually
instead of '&target', pass '&hostaddrs[i]', which then again points to
non-ephemeral data? Is the latter safe to do, or are we potentially
further down the line modifying the data that '&hostaddrs[i]' points to?
(I got a bit lost in the use of 'hostaddrs[i]' here.)
> + ++i;
> + }
> continue;
> case GOMP_MAP_FIRSTPRIVATE_INT:
> case GOMP_MAP_ZERO_LEN_ARRAY_SECTION:
> @@ -2517,6 +2534,11 @@ copy_firstprivate_data (char *tgt, size_t mapnum, void **hostaddrs,
> memcpy (tgt + tgt_size, hostaddrs[i], sizes[i]);
> hostaddrs[i] = tgt + tgt_size;
> tgt_size = tgt_size + sizes[i];
> + if (i + 1 < mapnum && (kinds[i+1] & 0xff) == GOMP_MAP_ATTACH)
> + {
> + *(*(uintptr_t**) hostaddrs[i+1] + sizes[i+1]) = (uintptr_t) hostaddrs[i];
> + ++i;
> + }
> }
> }
For reference, the 'gomp_copy_host2dev' code that I highlighted above
still triggers for the following test cases only:
> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.fortran/target-firstprivate-1.f90
> @@ -0,0 +1,33 @@
> +! PR fortran/104949
> +
> +implicit none (type,external)
> +integer, allocatable :: A(:)
> +A = [1,2,3,4,5,6]
> +
> +!$omp parallel firstprivate(A)
> +!$omp master
> + if (any (A /= [1,2,3,4,5])) error stop
> + A(:) = [99,88,77,66,55]
> +!$omp end master
> +!$omp end parallel
> +
> +!$omp target firstprivate(A)
> + if (any (A /= [1,2,3,4,5])) error stop
> + A(:) = [99,88,77,66,55]
> +!$omp end target
> +if (any (A /= [1,2,3,4,5])) error stop
> +
> +!$omp parallel default(firstprivate)
> +!$omp master
> + if (any (A /= [1,2,3,4,5])) error stop
> + A(:) = [99,88,77,66,55]
> +!$omp end master
> +!$omp end parallel
> +if (any (A /= [1,2,3,4,5])) error stop
> +
> +!$omp target defaultmap(firstprivate)
> + if (any (A /= [1,2,3,4,5])) error stop
> + A(:) = [99,88,77,66,55]
> +!$omp end target
> +if (any (A /= [1,2,3,4,5])) error stop
> +end
> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.fortran/target-firstprivate-2.f90
> @@ -0,0 +1,113 @@
> +! PR fortran/104949
> +
> +module m
> +use omp_lib
> +implicit none (type, external)
> +
> +contains
> +subroutine one
> + integer, allocatable :: x(:)
> + integer :: i
> +
> + do i = 1, omp_get_num_devices() + 1
> + !$omp target firstprivate(x)
> + if (allocated(x)) error stop
> + !$omp end target
> + if (allocated(x)) error stop
> + end do
> +
> + do i = 1, omp_get_num_devices() + 1
> + !$omp target firstprivate(x, i)
> + if (allocated(x)) error stop
> + x = [10,20,30,40] + i
> + if (any (x /= [10,20,30,40] + i)) error stop
> + ! This leaks memory!
> + ! deallocate(x)
> + !$omp end target
> + if (allocated(x)) error stop
> + end do
> +
> + x = [1,2,3,4]
> +
> + do i = 1, omp_get_num_devices() + 1
> + !$omp target firstprivate(x, i)
> + if (i <= 0) error stop
> + if (.not.allocated(x)) error stop
> + if (size(x) /= 4) error stop
> + if (lbound(x,1) /= 1) error stop
> + if (any (x /= [1,2,3,4])) error stop
> + ! no reallocation, just malloced + assignment
> + x = [10,20,30,40] + i
> + if (any (x /= [10,20,30,40] + i)) error stop
> + ! This leaks memory!
> + ! deallocate(x)
> + !$omp end target
> + if (.not.allocated(x)) error stop
> + if (size(x) /= 4) error stop
> + if (lbound(x,1) /= 1) error stop
> + if (any (x /= [1,2,3,4])) error stop
> + end do
> + deallocate(x)
> +end
> +
> +subroutine two
> + character(len=:), allocatable :: x(:)
> + character(len=5) :: str
> + integer :: i
> +
> + str = "abcde" ! work around for PR fortran/91544
> + do i = 1, omp_get_num_devices() + 1
> + !$omp target firstprivate(x)
> + if (allocated(x)) error stop
> + !$omp end target
> + if (allocated(x)) error stop
> + end do
> +
> + do i = 1, omp_get_num_devices() + 1
> + !$omp target firstprivate(x, i)
> + if (allocated(x)) error stop
> + ! no reallocation, just malloced + assignment
> + x = [character(len=2+i) :: str,"fhji","klmno"]
> + if (len(x) /= 2+i) error stop
> + if (any (x /= [character(len=2+i) :: str,"fhji","klmno"])) error stop
> + ! This leaks memory!
> + ! deallocate(x)
> + !$omp end target
> + if (allocated(x)) error stop
> + end do
> +
> + x = [character(len=4) :: "ABCDE","FHJI","KLMNO"]
> +
> + do i = 1, omp_get_num_devices() + 1
> + !$omp target firstprivate(x, i)
> + if (i <= 0) error stop
> + if (.not.allocated(x)) error stop
> + if (size(x) /= 3) error stop
> + if (lbound(x,1) /= 1) error stop
> + if (len(x) /= 4) error stop
> + if (any (x /= [character(len=4) :: "ABCDE","FHJI","KLMNO"])) error stop
> + !! Reallocation runs into the issue PR fortran/105538
> + !!
> + !!x = [character(len=2+i) :: str,"fhji","klmno"]
> + !!if (len(x) /= 2+i) error stop
> + !!if (any (x /= [character(len=2+i) :: str,"fhji","klmno"])) error stop
> + !! This leaks memory!
> + !! deallocate(x)
> + ! Just assign:
> + x = [character(len=4) :: "abcde","fhji","klmno"]
> + if (any (x /= [character(len=4) :: "abcde","fhji","klmno"])) error stop
> + !$omp end target
> + if (.not.allocated(x)) error stop
> + if (lbound(x,1) /= 1) error stop
> + if (size(x) /= 3) error stop
> + if (len(x) /= 4) error stop
> + if (any (x /= [character(len=4) :: "ABCDE","FHJI","KLMNO"])) error stop
> + end do
> + deallocate(x)
> +end
> +end module m
> +
> +use m
> +call one
> +call two
> +end
> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.fortran/target-firstprivate-3.f90
> @@ -0,0 +1,24 @@
> +implicit none
> + integer, allocatable :: x(:)
> + x = [1,2,3,4]
> + call foo(x)
> + if (any (x /= [1,2,3,4])) error stop
> + call foo()
> +contains
> +subroutine foo(c)
> + integer, allocatable, optional :: c(:)
> + logical :: is_present
> + is_present = present (c)
> + !$omp target firstprivate(c)
> + if (is_present) then
> + if (.not. allocated(c)) error stop
> + if (any (c /= [1,2,3,4])) error stop
> + c = [99,88,77,66]
> + if (any (c /= [99,88,77,66])) error stop
> + end if
> + !$omp end target
> + if (is_present) then
> + if (any (c /= [1,2,3,4])) error stop
> + end if
> +end
> +end
Grüße
Thomas
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
^ permalink raw reply [flat|nested] 5+ messages in thread
* Add caveat/safeguard to OpenMP: Handle descriptors in target's firstprivate [PR104949] (was: [Patch] OpenMP: Handle descriptors in target's firstprivate [PR104949])
2023-02-28 10:56 ` Thomas Schwinge
@ 2023-03-24 16:18 ` Thomas Schwinge
0 siblings, 0 replies; 5+ messages in thread
From: Thomas Schwinge @ 2023-03-24 16:18 UTC (permalink / raw)
To: gcc-patches, Tobias Burnus, Jakub Jelinek, Julian Brown; +Cc: fortran
[-- Attachment #1: Type: text/plain, Size: 12236 bytes --]
Hi!
On 2023-02-28T11:56:01+0100, I wrote:
> I'm currently reviewing 'gomp_copy_host2dev', 'ephemeral' in a different
> context, and a question came up here;
> commit r13-706-g49d1a2f91325fa8cc011149e27e5093a988b3a49
> "OpenMP: Handle descriptors in target's firstprivate [PR104949]":
It doesn't seem as if we're going to address this question anytime soon,
but it also isn't necessary; the worrysome condition currently cannot
arise. I've therefore now documented that via 'assert (!aq)', and pushed
to master branch commit e8fec6998b656dac02d4bc6c69b35a0fb5611e87
"Add caveat/safeguard to OpenMP: Handle descriptors in target's firstprivate [PR104949]",
see attached.
Grüße
Thomas
> On 2022-05-11T19:33:00+0200, Tobias Burnus <tobias@codesourcery.com> wrote:
>> this patch handles (for target regions)
>> firstprivate(array_descriptor)
>> by not only firstprivatizing the descriptor but also the data
>> it points to. This is done by turning it in omp-low.cc the clause
>> into
>> firstprivate(descr) firstprivate(descr.data)
>> and then attaching the latter to the former. That works by
>> adding an 'attach' after the last firstprivate (and checking
>> for it in libgomp). The attached-to device address for a
>> previous (here: the first) firstprivate is obtained by returning
>> the device address inside the hostaddrs[i] alias omp_arr array,
>> i.e. the compiler generates:
>> omp_arr.1 = &descr; /* firstprivate */
>> omp_arr.2 = descr.data; /* firstprivate */
>> omp_arr.3 = &omp_arr.1; /* attach; bias: &desc.data-&desc */
>> and libgomp then knows that the device address is in the
>> pointer.
>
>> Note: The code is not active for OpenACC. The existing code uses, e.g.,
>> 'goto oacc_firstprivate' – thus, the new code would be
>> partially active. I went for making it completely inactive for OpenACC
>> by adding one '!is_gimple_omp_oacc'.
>
> ACK.
>
>> I bet that a deep copy would be
>> also useful for OpenACC, but I have neither checked what the current
>> code does nor what the OpenACC spec says about this.
>
> Instead of adding corresponding handling to the OpenACC 'firstprivate'
> special code paths later on, I suggest that we first address known issues
> with OpenACC 'firstprivate' -- which probably may largely be achieved by
> in fact removing those 'goto oacc_firstprivate's and other special code
> paths? For example, see <https://gcc.gnu.org/PR92036>
> "OpenACC 'firstprivate' clause: initial value".
>
> That means, the following code currently isn't active for OpenACC, and
> given that OpenMP 'target' doesn't do asynchronous device execution
> (meaning: not in the way/implementation of OpenACC 'async'), it thus
> doesn't care about the 'ephemeral' argument to 'gomp_copy_host2dev', but
> still, for correctness (and once that code gets used for OpenACC):
>
>> OpenMP: Handle descriptors in target's firstprivate [PR104949]
>>
>> For allocatable/pointer arrays, a firstprivate to a device
>> not only needs to privatize the descriptor but also the actual
>> data. This is implemented as:
>> firstprivate(x) firstprivate(x.data) attach(x [bias: &x.data-&x)
>> where the address of x in device memory is saved in hostaddrs[i]
>> by libgomp and the middle end actually passes hostaddrs[i]' to
>> attach.
>
>> --- a/libgomp/target.c
>> +++ b/libgomp/target.c
>> @@ -1350,7 +1350,24 @@ gomp_map_vars_internal (struct gomp_device_descr *devicep,
>> gomp_copy_host2dev (devicep, aq,
>> (void *) (tgt->tgt_start + tgt_size),
>> (void *) hostaddrs[i], len, false, cbufp);
>
> Here, passing 'ephemeral <- false' is correct, as 'h <- hostaddrs[i]'
> points to non-ephemeral data.
>
>> + /* Save device address in hostaddr to permit latter availablity
>> + when doing a deep-firstprivate with pointer attach. */
>> + hostaddrs[i] = (void *) (tgt->tgt_start + tgt_size);
>
> Here, we modify 'hostaddrs[i]' (itself -- *not* the data that the
> original 'hostaddrs[i]' points to), so the above 'gomp_copy_host2dev'
> with 'ephemeral <- false' is still correct, right?
>
>> tgt_size += len;
>> +
>> + /* If followed by GOMP_MAP_ATTACH, pointer assign this
>> + firstprivate to hostaddrs[i+1], which is assumed to contain a
>> + device address. */
>> + if (i + 1 < mapnum
>> + && (GOMP_MAP_ATTACH
>> + == (typemask & get_kind (short_mapkind, kinds, i+1))))
>> + {
>> + uintptr_t target = (uintptr_t) hostaddrs[i];
>> + void *devptr = *(void**) hostaddrs[i+1] + sizes[i+1];
>> + gomp_copy_host2dev (devicep, aq, devptr, &target,
>> + sizeof (void *), false, cbufp);
>
> However, 'h <- &target' here points to data in the local frame
> ('target'), which potentially goes out of scope before an asynchronous
> 'gomp_copy_host2dev' has completed. Thus, don't we have to pass here
> 'ephemeral <- true' instead of 'ephemeral <- false'? Or, actually
> instead of '&target', pass '&hostaddrs[i]', which then again points to
> non-ephemeral data? Is the latter safe to do, or are we potentially
> further down the line modifying the data that '&hostaddrs[i]' points to?
> (I got a bit lost in the use of 'hostaddrs[i]' here.)
>
>> + ++i;
>> + }
>> continue;
>> case GOMP_MAP_FIRSTPRIVATE_INT:
>> case GOMP_MAP_ZERO_LEN_ARRAY_SECTION:
>> @@ -2517,6 +2534,11 @@ copy_firstprivate_data (char *tgt, size_t mapnum, void **hostaddrs,
>> memcpy (tgt + tgt_size, hostaddrs[i], sizes[i]);
>> hostaddrs[i] = tgt + tgt_size;
>> tgt_size = tgt_size + sizes[i];
>> + if (i + 1 < mapnum && (kinds[i+1] & 0xff) == GOMP_MAP_ATTACH)
>> + {
>> + *(*(uintptr_t**) hostaddrs[i+1] + sizes[i+1]) = (uintptr_t) hostaddrs[i];
>> + ++i;
>> + }
>> }
>> }
>
> For reference, the 'gomp_copy_host2dev' code that I highlighted above
> still triggers for the following test cases only:
>
>> --- /dev/null
>> +++ b/libgomp/testsuite/libgomp.fortran/target-firstprivate-1.f90
>> @@ -0,0 +1,33 @@
>> +! PR fortran/104949
>> +
>> +implicit none (type,external)
>> +integer, allocatable :: A(:)
>> +A = [1,2,3,4,5,6]
>> +
>> +!$omp parallel firstprivate(A)
>> +!$omp master
>> + if (any (A /= [1,2,3,4,5])) error stop
>> + A(:) = [99,88,77,66,55]
>> +!$omp end master
>> +!$omp end parallel
>> +
>> +!$omp target firstprivate(A)
>> + if (any (A /= [1,2,3,4,5])) error stop
>> + A(:) = [99,88,77,66,55]
>> +!$omp end target
>> +if (any (A /= [1,2,3,4,5])) error stop
>> +
>> +!$omp parallel default(firstprivate)
>> +!$omp master
>> + if (any (A /= [1,2,3,4,5])) error stop
>> + A(:) = [99,88,77,66,55]
>> +!$omp end master
>> +!$omp end parallel
>> +if (any (A /= [1,2,3,4,5])) error stop
>> +
>> +!$omp target defaultmap(firstprivate)
>> + if (any (A /= [1,2,3,4,5])) error stop
>> + A(:) = [99,88,77,66,55]
>> +!$omp end target
>> +if (any (A /= [1,2,3,4,5])) error stop
>> +end
>
>> --- /dev/null
>> +++ b/libgomp/testsuite/libgomp.fortran/target-firstprivate-2.f90
>> @@ -0,0 +1,113 @@
>> +! PR fortran/104949
>> +
>> +module m
>> +use omp_lib
>> +implicit none (type, external)
>> +
>> +contains
>> +subroutine one
>> + integer, allocatable :: x(:)
>> + integer :: i
>> +
>> + do i = 1, omp_get_num_devices() + 1
>> + !$omp target firstprivate(x)
>> + if (allocated(x)) error stop
>> + !$omp end target
>> + if (allocated(x)) error stop
>> + end do
>> +
>> + do i = 1, omp_get_num_devices() + 1
>> + !$omp target firstprivate(x, i)
>> + if (allocated(x)) error stop
>> + x = [10,20,30,40] + i
>> + if (any (x /= [10,20,30,40] + i)) error stop
>> + ! This leaks memory!
>> + ! deallocate(x)
>> + !$omp end target
>> + if (allocated(x)) error stop
>> + end do
>> +
>> + x = [1,2,3,4]
>> +
>> + do i = 1, omp_get_num_devices() + 1
>> + !$omp target firstprivate(x, i)
>> + if (i <= 0) error stop
>> + if (.not.allocated(x)) error stop
>> + if (size(x) /= 4) error stop
>> + if (lbound(x,1) /= 1) error stop
>> + if (any (x /= [1,2,3,4])) error stop
>> + ! no reallocation, just malloced + assignment
>> + x = [10,20,30,40] + i
>> + if (any (x /= [10,20,30,40] + i)) error stop
>> + ! This leaks memory!
>> + ! deallocate(x)
>> + !$omp end target
>> + if (.not.allocated(x)) error stop
>> + if (size(x) /= 4) error stop
>> + if (lbound(x,1) /= 1) error stop
>> + if (any (x /= [1,2,3,4])) error stop
>> + end do
>> + deallocate(x)
>> +end
>> +
>> +subroutine two
>> + character(len=:), allocatable :: x(:)
>> + character(len=5) :: str
>> + integer :: i
>> +
>> + str = "abcde" ! work around for PR fortran/91544
>> + do i = 1, omp_get_num_devices() + 1
>> + !$omp target firstprivate(x)
>> + if (allocated(x)) error stop
>> + !$omp end target
>> + if (allocated(x)) error stop
>> + end do
>> +
>> + do i = 1, omp_get_num_devices() + 1
>> + !$omp target firstprivate(x, i)
>> + if (allocated(x)) error stop
>> + ! no reallocation, just malloced + assignment
>> + x = [character(len=2+i) :: str,"fhji","klmno"]
>> + if (len(x) /= 2+i) error stop
>> + if (any (x /= [character(len=2+i) :: str,"fhji","klmno"])) error stop
>> + ! This leaks memory!
>> + ! deallocate(x)
>> + !$omp end target
>> + if (allocated(x)) error stop
>> + end do
>> +
>> + x = [character(len=4) :: "ABCDE","FHJI","KLMNO"]
>> +
>> + do i = 1, omp_get_num_devices() + 1
>> + !$omp target firstprivate(x, i)
>> + if (i <= 0) error stop
>> + if (.not.allocated(x)) error stop
>> + if (size(x) /= 3) error stop
>> + if (lbound(x,1) /= 1) error stop
>> + if (len(x) /= 4) error stop
>> + if (any (x /= [character(len=4) :: "ABCDE","FHJI","KLMNO"])) error stop
>> + !! Reallocation runs into the issue PR fortran/105538
>> + !!
>> + !!x = [character(len=2+i) :: str,"fhji","klmno"]
>> + !!if (len(x) /= 2+i) error stop
>> + !!if (any (x /= [character(len=2+i) :: str,"fhji","klmno"])) error stop
>> + !! This leaks memory!
>> + !! deallocate(x)
>> + ! Just assign:
>> + x = [character(len=4) :: "abcde","fhji","klmno"]
>> + if (any (x /= [character(len=4) :: "abcde","fhji","klmno"])) error stop
>> + !$omp end target
>> + if (.not.allocated(x)) error stop
>> + if (lbound(x,1) /= 1) error stop
>> + if (size(x) /= 3) error stop
>> + if (len(x) /= 4) error stop
>> + if (any (x /= [character(len=4) :: "ABCDE","FHJI","KLMNO"])) error stop
>> + end do
>> + deallocate(x)
>> +end
>> +end module m
>> +
>> +use m
>> +call one
>> +call two
>> +end
>
>> --- /dev/null
>> +++ b/libgomp/testsuite/libgomp.fortran/target-firstprivate-3.f90
>> @@ -0,0 +1,24 @@
>> +implicit none
>> + integer, allocatable :: x(:)
>> + x = [1,2,3,4]
>> + call foo(x)
>> + if (any (x /= [1,2,3,4])) error stop
>> + call foo()
>> +contains
>> +subroutine foo(c)
>> + integer, allocatable, optional :: c(:)
>> + logical :: is_present
>> + is_present = present (c)
>> + !$omp target firstprivate(c)
>> + if (is_present) then
>> + if (.not. allocated(c)) error stop
>> + if (any (c /= [1,2,3,4])) error stop
>> + c = [99,88,77,66]
>> + if (any (c /= [99,88,77,66])) error stop
>> + end if
>> + !$omp end target
>> + if (is_present) then
>> + if (any (c /= [1,2,3,4])) error stop
>> + end if
>> +end
>> +end
>
>
> Grüße
> Thomas
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Add-caveat-safeguard-to-OpenMP-Handle-descriptors-in.patch --]
[-- Type: text/x-diff, Size: 1288 bytes --]
From e8fec6998b656dac02d4bc6c69b35a0fb5611e87 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Thu, 23 Mar 2023 12:32:35 +0100
Subject: [PATCH] Add caveat/safeguard to OpenMP: Handle descriptors in
target's firstprivate [PR104949]
Follow-up to commit 49d1a2f91325fa8cc011149e27e5093a988b3a49
"OpenMP: Handle descriptors in target's firstprivate [PR104949]".
PR fortran/104949
libgomp/
* target.c (gomp_map_vars_internal) <GOMP_MAP_FIRSTPRIVATE>: Add
caveat/safeguard.
---
libgomp/target.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/libgomp/target.c b/libgomp/target.c
index 90b4204133a..b30c6a50c7e 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -1396,6 +1396,11 @@ gomp_map_vars_internal (struct gomp_device_descr *devicep,
{
uintptr_t target = (uintptr_t) hostaddrs[i];
void *devptr = *(void**) hostaddrs[i+1] + sizes[i+1];
+ /* Per
+ <https://inbox.sourceware.org/gcc-patches/87o7pe12ke.fsf@euler.schwinge.homeip.net>
+ "OpenMP: Handle descriptors in target's firstprivate [PR104949]"
+ this probably needs revision for 'aq' usage. */
+ assert (!aq);
gomp_copy_host2dev (devicep, aq, devptr, &target,
sizeof (void *), false, cbufp);
++i;
--
2.25.1
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2023-03-24 16:18 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-11 17:33 [Patch] OpenMP: Handle descriptors in target's firstprivate [PR104949] Tobias Burnus
2022-05-19 13:59 ` Jakub Jelinek
2022-05-23 9:00 ` Tobias Burnus
2023-02-28 10:56 ` Thomas Schwinge
2023-03-24 16:18 ` Add caveat/safeguard to OpenMP: Handle descriptors in target's firstprivate [PR104949] (was: [Patch] OpenMP: Handle descriptors in target's firstprivate [PR104949]) Thomas Schwinge
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).