From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: by sourceware.org (Postfix, from userid 1534) id B5930382FD8C; Tue, 30 Aug 2022 19:54:11 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org B5930382FD8C DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1661889251; bh=dRt4547ByZhYwGX9KHSvlfieKBX6VJM1VH8YJ6pZYFs=; h=From:To:Subject:Date:From; b=ZxleTUmbQoCJ+YoIAfW+95Fg1PJxXTq9MCiggybpi/DFP2elMQLYeLLBRz6kYre4E /0ej/P2+4TEebQjmB/5i7sbpr9a9hK47LzSXEM5MFKK5TVremMVM58xtG0m42DK2AA yTGRmzp5VaFGCrMxrND/EaAIfMCPboxVQClBSUJw= Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit From: Tobias Burnus To: gcc-cvs@gcc.gnu.org Subject: [gcc/devel/omp/gcc-12] OpenMP: Support reverse offload (middle end part) X-Act-Checkin: gcc X-Git-Author: Tobias Burnus X-Git-Refname: refs/heads/devel/omp/gcc-12 X-Git-Oldrev: 2d8aa5cc5d34fa333df2b215a89063e37d483eaf X-Git-Newrev: b0256655fb402f87c921cd782b873dd301760ebd Message-Id: <20220830195411.B5930382FD8C@sourceware.org> Date: Tue, 30 Aug 2022 19:54:11 +0000 (GMT) List-Id: https://gcc.gnu.org/g:b0256655fb402f87c921cd782b873dd301760ebd commit b0256655fb402f87c921cd782b873dd301760ebd Author: Tobias Burnus Date: Tue Aug 30 21:44:10 2022 +0200 OpenMP: Support reverse offload (middle end part) gcc/ChangeLog: * internal-fn.cc (expand_GOMP_TARGET_REV): New. * internal-fn.def (GOMP_TARGET_REV): New. * lto-cgraph.cc (lto_output_node, verify_node_partition): Mark 'omp target device_ancestor_host' as in_other_partition and don't error if absent. * omp-low.cc (create_omp_child_function): Mark as 'noclone'. * omp-expand.cc (expand_omp_target): For reverse offload, remove sorry, use device = GOMP_DEVICE_HOST_FALLBACK and create empty-body nohost function. * omp-offload.cc (execute_omp_device_lower): Handle IFN_GOMP_TARGET_REV. (pass_omp_target_link::execute): For ACCEL_COMPILER, don't nullify fn argument for reverse offload libgomp/ChangeLog: * libgomp.texi (OpenMP 5.0): Mark 'ancestor' as implemented but refer to 'requires'. * testsuite/libgomp.c-c++-common/reverse-offload-1-aux.c: New test. * testsuite/libgomp.c-c++-common/reverse-offload-1.c: New test. * testsuite/libgomp.fortran/reverse-offload-1-aux.f90: New test. * testsuite/libgomp.fortran/reverse-offload-1.f90: New test. gcc/testsuite/ChangeLog: * c-c++-common/gomp/reverse-offload-1.c: Remove dg-sorry. * c-c++-common/gomp/target-device-ancestor-4.c: Likewise. * gfortran.dg/gomp/target-device-ancestor-4.f90: Likewise. * gfortran.dg/gomp/target-device-ancestor-5.f90: Likewise. * c-c++-common/goacc/classify-kernels-parloops.c: Add 'noclone' to scan-tree-dump-times. * c-c++-common/goacc/classify-kernels-unparallelized-parloops.c: Likewise. * c-c++-common/goacc/classify-kernels-unparallelized.c: Likewise. * c-c++-common/goacc/classify-kernels.c: Likewise. * c-c++-common/goacc/classify-parallel.c: Likewise. * c-c++-common/goacc/classify-serial.c: Likewise. * c-c++-common/goacc/kernels-counter-vars-function-scope.c: Likewise. * c-c++-common/goacc/kernels-loop-2.c: Likewise. * c-c++-common/goacc/kernels-loop-3.c: Likewise. * c-c++-common/goacc/kernels-loop-data-2.c: Likewise. * c-c++-common/goacc/kernels-loop-data-enter-exit-2.c: Likewise. * c-c++-common/goacc/kernels-loop-data-enter-exit.c: Likewise. * c-c++-common/goacc/kernels-loop-data-update.c: Likewise. * c-c++-common/goacc/kernels-loop-data.c: Likewise. * c-c++-common/goacc/kernels-loop-g.c: Likewise. * c-c++-common/goacc/kernels-loop-mod-not-zero.c: Likewise. * c-c++-common/goacc/kernels-loop-n.c: Likewise. * c-c++-common/goacc/kernels-loop-nest.c: Likewise. * c-c++-common/goacc/kernels-loop.c: Likewise. * c-c++-common/goacc/kernels-one-counter-var.c: Likewise. * c-c++-common/goacc/kernels-parallel-loop-data-enter-exit.c: Likewise. * gfortran.dg/goacc/classify-kernels-parloops.f95: Likewise. * gfortran.dg/goacc/classify-kernels-unparallelized-parloops.f95: Likewise. * gfortran.dg/goacc/classify-kernels-unparallelized.f95: Likewise. * gfortran.dg/goacc/classify-kernels.f95: Likewise. * gfortran.dg/goacc/classify-parallel.f95: Likewise. * gfortran.dg/goacc/classify-serial.f95: Likewise. * gfortran.dg/goacc/kernels-loop-2.f95: Likewise. * gfortran.dg/goacc/kernels-loop-data-2.f95: Likewise. * gfortran.dg/goacc/kernels-loop-data-enter-exit-2.f95: Likewise. * gfortran.dg/goacc/kernels-loop-data-enter-exit.f95: Likewise. * gfortran.dg/goacc/kernels-loop-data-update.f95: Likewise. * gfortran.dg/goacc/kernels-loop-data.f95: Likewise. * gfortran.dg/goacc/kernels-loop-n.f95: Likewise. * gfortran.dg/goacc/kernels-loop.f95: Likewise. * gfortran.dg/goacc/kernels-parallel-loop-data-enter-exit.f95: Likewise. (cherry picked from commit d6621a2f3176dd6a593d4f5fa7f85db0234b40d2) Diff: --- gcc/ChangeLog.omp | 19 ++++ gcc/internal-fn.cc | 8 ++ gcc/internal-fn.def | 1 + gcc/lto-cgraph.cc | 21 +++- gcc/omp-expand.cc | 106 +++++++++++++++++++-- gcc/omp-low.cc | 5 + gcc/omp-offload.cc | 50 ++++++++++ gcc/testsuite/ChangeLog.omp | 49 ++++++++++ .../c-c++-common/goacc/classify-kernels-parloops.c | 6 +- .../classify-kernels-unparallelized-parloops.c | 6 +- .../c-c++-common/goacc/classify-kernels.c | 2 +- .../c-c++-common/goacc/classify-parallel.c | 4 +- gcc/testsuite/c-c++-common/goacc/classify-serial.c | 4 +- .../goacc/kernels-counter-vars-function-scope.c | 2 +- gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c | 2 +- gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c | 2 +- .../c-c++-common/goacc/kernels-loop-data-2.c | 2 +- .../goacc/kernels-loop-data-enter-exit-2.c | 2 +- .../goacc/kernels-loop-data-enter-exit.c | 2 +- .../c-c++-common/goacc/kernels-loop-data-update.c | 2 +- .../c-c++-common/goacc/kernels-loop-data.c | 2 +- gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c | 2 +- .../c-c++-common/goacc/kernels-loop-mod-not-zero.c | 2 +- gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c | 2 +- .../c-c++-common/goacc/kernels-loop-nest.c | 2 +- gcc/testsuite/c-c++-common/goacc/kernels-loop.c | 2 +- .../c-c++-common/goacc/kernels-one-counter-var.c | 2 +- .../goacc/kernels-parallel-loop-data-enter-exit.c | 2 +- .../c-c++-common/gomp/reverse-offload-1.c | 2 +- .../c-c++-common/gomp/target-device-ancestor-4.c | 2 +- .../goacc/classify-kernels-parloops.f95 | 6 +- .../classify-kernels-unparallelized-parloops.f95 | 6 +- .../goacc/classify-kernels-unparallelized.f95 | 4 +- .../gfortran.dg/goacc/classify-kernels.f95 | 2 +- .../gfortran.dg/goacc/classify-parallel.f95 | 4 +- .../gfortran.dg/goacc/classify-serial.f95 | 4 +- gcc/testsuite/gfortran.dg/goacc/kernels-loop-2.f95 | 2 +- .../gfortran.dg/goacc/kernels-loop-data-2.f95 | 2 +- .../goacc/kernels-loop-data-enter-exit-2.f95 | 2 +- .../goacc/kernels-loop-data-enter-exit.f95 | 2 +- .../gfortran.dg/goacc/kernels-loop-data-update.f95 | 2 +- .../gfortran.dg/goacc/kernels-loop-data.f95 | 2 +- gcc/testsuite/gfortran.dg/goacc/kernels-loop-n.f95 | 2 +- gcc/testsuite/gfortran.dg/goacc/kernels-loop.f95 | 2 +- .../kernels-parallel-loop-data-enter-exit.f95 | 2 +- .../gfortran.dg/gomp/target-device-ancestor-4.f90 | 2 +- .../gfortran.dg/gomp/target-device-ancestor-5.f90 | 2 +- libgomp/ChangeLog.omp | 12 +++ libgomp/libgomp.texi | 2 +- .../libgomp.c-c++-common/reverse-offload-1-aux.c | 10 ++ .../libgomp.c-c++-common/reverse-offload-1.c | 83 ++++++++++++++++ .../libgomp.fortran/reverse-offload-1-aux.f90 | 12 +++ .../libgomp.fortran/reverse-offload-1.f90 | 88 +++++++++++++++++ 53 files changed, 505 insertions(+), 65 deletions(-) diff --git a/gcc/ChangeLog.omp b/gcc/ChangeLog.omp index a665cc19ef0..ff9d34b5c7c 100644 --- a/gcc/ChangeLog.omp +++ b/gcc/ChangeLog.omp @@ -1,3 +1,22 @@ +2022-08-30 Tobias Burnus + + Backport from mainline: + 2022-08-26 Tobias Burnus + + * internal-fn.cc (expand_GOMP_TARGET_REV): New. + * internal-fn.def (GOMP_TARGET_REV): New. + * lto-cgraph.cc (lto_output_node, verify_node_partition): Mark + 'omp target device_ancestor_host' as in_other_partition and don't + error if absent. + * omp-low.cc (create_omp_child_function): Mark as 'noclone'. + * omp-expand.cc (expand_omp_target): For reverse offload, remove + sorry, use device = GOMP_DEVICE_HOST_FALLBACK and create + empty-body nohost function. + * omp-offload.cc (execute_omp_device_lower): Handle + IFN_GOMP_TARGET_REV. + (pass_omp_target_link::execute): For ACCEL_COMPILER, don't + nullify fn argument for reverse offload + 2022-08-22 Tobias Burnus Backport from mainline: diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc index 3de40f4d905..a2227270d8e 100644 --- a/gcc/internal-fn.cc +++ b/gcc/internal-fn.cc @@ -287,6 +287,14 @@ expand_GOMP_SIMT_VF (internal_fn, gcall *) gcc_unreachable (); } +/* This should get expanded in omp_device_lower pass. */ + +static void +expand_GOMP_TARGET_REV (internal_fn, gcall *) +{ + gcc_unreachable (); +} + /* Lane index of the first SIMT lane that supplies a non-zero argument. This is a SIMT counterpart to GOMP_SIMD_LAST_LANE, used to represent the lane that executed the last iteration for handling OpenMP lastprivate. */ diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def index d2d550d3586..f45c2af044c 100644 --- a/gcc/internal-fn.def +++ b/gcc/internal-fn.def @@ -313,6 +313,7 @@ DEF_INTERNAL_INT_FN (FFS, ECF_CONST | ECF_NOTHROW, ffs, unary) DEF_INTERNAL_INT_FN (PARITY, ECF_CONST | ECF_NOTHROW, parity, unary) DEF_INTERNAL_INT_FN (POPCOUNT, ECF_CONST | ECF_NOTHROW, popcount, unary) +DEF_INTERNAL_FN (GOMP_TARGET_REV, ECF_NOVOPS | ECF_LEAF | ECF_NOTHROW, NULL) DEF_INTERNAL_FN (GOMP_USE_SIMT, ECF_NOVOPS | ECF_LEAF | ECF_NOTHROW, NULL) DEF_INTERNAL_FN (GOMP_SIMT_ENTER, ECF_LEAF | ECF_NOTHROW, NULL) DEF_INTERNAL_FN (GOMP_SIMT_ENTER_ALLOC, ECF_LEAF | ECF_NOTHROW, NULL) diff --git a/gcc/lto-cgraph.cc b/gcc/lto-cgraph.cc index 39af9c1bd07..95d2774fc45 100644 --- a/gcc/lto-cgraph.cc +++ b/gcc/lto-cgraph.cc @@ -430,6 +430,14 @@ lto_output_node (struct lto_simple_output_block *ob, struct cgraph_node *node, after reading back. */ in_other_partition = 1; } + else if (__builtin_expect ( + lto_stream_offload_p + && lookup_attribute ("omp target device_ancestor_host", + DECL_ATTRIBUTES (node->decl)), 0)) + /* This symbol is only used as argument to IFN_GOMP_TARGET_REV; this IFN + is ignored on ACCEL_COMPILER. Thus, mark it as in_other_partition to silence + verify_node_partition diagnostic. */ + in_other_partition = 1; clone_of = node->clone_of; while (clone_of @@ -1140,10 +1148,15 @@ verify_node_partition (symtab_node *node) if (node->in_other_partition) { if (TREE_CODE (node->decl) == FUNCTION_DECL) - error_at (DECL_SOURCE_LOCATION (node->decl), - "function %qs has been referenced in offloaded code but" - " hasn%'t been marked to be included in the offloaded code", - node->name ()); + { + if (lookup_attribute ("omp target device_ancestor_host", + DECL_ATTRIBUTES (node->decl)) != NULL) + return; + error_at (DECL_SOURCE_LOCATION (node->decl), + "function %qs has been referenced in offloaded code but" + " hasn%'t been marked to be included in the offloaded code", + node->name ()); + } else if (VAR_P (node->decl)) error_at (DECL_SOURCE_LOCATION (node->decl), "variable %qs has been referenced in offloaded code but" diff --git a/gcc/omp-expand.cc b/gcc/omp-expand.cc index 1e85ac285e2..9f887c2d0e1 100644 --- a/gcc/omp-expand.cc +++ b/gcc/omp-expand.cc @@ -9734,7 +9734,7 @@ expand_omp_target (struct omp_region *region) { basic_block entry_bb, exit_bb, new_bb; struct function *child_cfun; - tree child_fn, block, t; + tree child_fn, child_fn2, block, t, c; gimple_stmt_iterator gsi; gomp_target *entry_stmt; gimple *stmt; @@ -9776,10 +9776,16 @@ expand_omp_target (struct omp_region *region) gcc_unreachable (); } - child_fn = NULL_TREE; + tree clauses = gimple_omp_target_clauses (entry_stmt); + + bool is_ancestor = false; + child_fn = child_fn2 = NULL_TREE; child_cfun = NULL; if (offloaded) { + c = omp_find_clause (clauses, OMP_CLAUSE_DEVICE); + if (ENABLE_OFFLOADING && c) + is_ancestor = OMP_CLAUSE_DEVICE_ANCESTOR (c); child_fn = gimple_omp_target_child_fn (entry_stmt); child_cfun = DECL_STRUCT_FUNCTION (child_fn); } @@ -9979,7 +9985,8 @@ expand_omp_target (struct omp_region *region) { if (in_lto_p) DECL_PRESERVE_P (child_fn) = 1; - vec_safe_push (offload_funcs, child_fn); + if (!is_ancestor) + vec_safe_push (offload_funcs, child_fn); } bool need_asm = DECL_ASSEMBLER_NAME_SET_P (current_function_decl) @@ -10018,11 +10025,88 @@ expand_omp_target (struct omp_region *region) } adjust_context_and_scope (region, gimple_block (entry_stmt), child_fn); + + /* Handle the case that an inner ancestor:1 target is called by an outer + target region. */ + if (!is_ancestor) + cgraph_node::get (child_fn)->calls_declare_variant_alt + |= cgraph_node::get (cfun->decl)->calls_declare_variant_alt; + else /* Duplicate function to create empty nonhost variant. */ + { + /* Enable pass_omp_device_lower pass. */ + cgraph_node::get (cfun->decl)->calls_declare_variant_alt = 1; + cgraph_node *fn2_node; + child_fn2 = build_decl (DECL_SOURCE_LOCATION (child_fn), + FUNCTION_DECL, + clone_function_name (child_fn, "nohost"), + TREE_TYPE (child_fn)); + if (in_lto_p) + DECL_PRESERVE_P (child_fn2) = 1; + TREE_STATIC (child_fn2) = 1; + DECL_ARTIFICIAL (child_fn2) = 1; + DECL_IGNORED_P (child_fn2) = 0; + TREE_PUBLIC (child_fn2) = 0; + DECL_UNINLINABLE (child_fn2) = 1; + DECL_EXTERNAL (child_fn2) = 0; + DECL_CONTEXT (child_fn2) = NULL_TREE; + DECL_INITIAL (child_fn2) = make_node (BLOCK); + BLOCK_SUPERCONTEXT (DECL_INITIAL (child_fn2)) = child_fn2; + DECL_ATTRIBUTES (child_fn) + = remove_attribute ("omp target entrypoint", + DECL_ATTRIBUTES (child_fn)); + DECL_ATTRIBUTES (child_fn2) + = tree_cons (get_identifier ("omp target device_ancestor_nohost"), + NULL_TREE, copy_list (DECL_ATTRIBUTES (child_fn))); + DECL_ATTRIBUTES (child_fn) + = tree_cons (get_identifier ("omp target device_ancestor_host"), + NULL_TREE, DECL_ATTRIBUTES (child_fn)); + DECL_FUNCTION_SPECIFIC_OPTIMIZATION (child_fn2) + = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (current_function_decl); + DECL_FUNCTION_SPECIFIC_TARGET (child_fn2) + = DECL_FUNCTION_SPECIFIC_TARGET (current_function_decl); + DECL_FUNCTION_VERSIONED (child_fn2) + = DECL_FUNCTION_VERSIONED (current_function_decl); + + fn2_node = cgraph_node::get_create (child_fn2); + fn2_node->offloadable = 1; + fn2_node->force_output = 1; + node->offloadable = 0; + + t = build_decl (DECL_SOURCE_LOCATION (child_fn), + RESULT_DECL, NULL_TREE, void_type_node); + DECL_ARTIFICIAL (t) = 1; + DECL_IGNORED_P (t) = 1; + DECL_CONTEXT (t) = child_fn2; + DECL_RESULT (child_fn2) = t; + DECL_SAVED_TREE (child_fn2) = build1 (RETURN_EXPR, + void_type_node, NULL); + tree tmp = DECL_ARGUMENTS (child_fn); + t = build_decl (DECL_SOURCE_LOCATION (child_fn), PARM_DECL, + DECL_NAME (tmp), TREE_TYPE (tmp)); + DECL_ARTIFICIAL (t) = 1; + DECL_NAMELESS (t) = 1; + DECL_ARG_TYPE (t) = ptr_type_node; + DECL_CONTEXT (t) = current_function_decl; + TREE_USED (t) = 1; + TREE_READONLY (t) = 1; + DECL_ARGUMENTS (child_fn2) = t; + gcc_assert (TREE_CHAIN (tmp) == NULL_TREE); + + gimplify_function_tree (child_fn2); + cgraph_node::add_new_function (child_fn2, true); + + vec_safe_push (offload_funcs, child_fn2); + if (dump_file && !gimple_in_ssa_p (cfun)) + { + dump_function_header (dump_file, child_fn2, dump_flags); + dump_function_to_file (child_fn2, dump_file, dump_flags); + } + } } /* Emit a library call to launch the offloading region, or do data transfers. */ - tree t1, t2, t3, t4, depend, c, clauses; + tree t1, t2, t3, t4, depend; enum built_in_function start_ix; unsigned int flags_i = 0; @@ -10073,8 +10157,6 @@ expand_omp_target (struct omp_region *region) gcc_unreachable (); } - clauses = gimple_omp_target_clauses (entry_stmt); - tree device = NULL_TREE; location_t device_loc = UNKNOWN_LOCATION; tree goacc_flags = NULL_TREE; @@ -10106,7 +10188,8 @@ expand_omp_target (struct omp_region *region) need_device_adjustment = true; device_loc = OMP_CLAUSE_LOCATION (c); if (OMP_CLAUSE_DEVICE_ANCESTOR (c)) - sorry_at (device_loc, "% not yet supported"); + device = build_int_cst (integer_type_node, + GOMP_DEVICE_HOST_FALLBACK); } else { @@ -10294,7 +10377,7 @@ expand_omp_target (struct omp_region *region) else args.quick_push (device); if (offloaded) - args.quick_push (build_fold_addr_expr (child_fn)); + args.quick_push (build_fold_addr_expr (child_fn2 ? child_fn2 : child_fn)); args.quick_push (t1); args.quick_push (t2); args.quick_push (t3); @@ -10436,6 +10519,13 @@ expand_omp_target (struct omp_region *region) for (int i = 3; i < TREE_VEC_LENGTH (t); i++) args.safe_push (TREE_VEC_ELT (t, i)); } + if (child_fn2) + { + g = gimple_build_call_internal (IFN_GOMP_TARGET_REV, 1, + build_fold_addr_expr (child_fn)); + gimple_set_location (g, gimple_location (entry_stmt)); + gsi_insert_before (&gsi, g, GSI_SAME_STMT); + } g = gimple_build_call_vec (builtin_decl_explicit (start_ix), args); gimple_set_location (g, gimple_location (entry_stmt)); diff --git a/gcc/omp-low.cc b/gcc/omp-low.cc index 0ec33c73bca..ea51ff9e697 100644 --- a/gcc/omp-low.cc +++ b/gcc/omp-low.cc @@ -2410,6 +2410,11 @@ create_omp_child_function (omp_context *ctx, bool task_copy) else target_attr = NULL; } + if (target_attr + && is_gimple_omp_offloaded (ctx->stmt) + && lookup_attribute ("noclone", DECL_ATTRIBUTES (decl)) == NULL_TREE) + DECL_ATTRIBUTES (decl) = tree_cons (get_identifier ("noclone"), + NULL_TREE, DECL_ATTRIBUTES (decl)); if (target_attr) DECL_ATTRIBUTES (decl) = tree_cons (get_identifier (target_attr), diff --git a/gcc/omp-offload.cc b/gcc/omp-offload.cc index ac236482db5..979bc0badb2 100644 --- a/gcc/omp-offload.cc +++ b/gcc/omp-offload.cc @@ -3305,6 +3305,47 @@ execute_omp_device_lower () tree type = lhs ? TREE_TYPE (lhs) : integer_type_node; switch (gimple_call_internal_fn (stmt)) { + case IFN_GOMP_TARGET_REV: + { +#ifndef ACCEL_COMPILER + gimple_stmt_iterator gsi2 = gsi; + gsi_next (&gsi2); + gcc_assert (!gsi_end_p (gsi2)); + gcc_assert (gimple_call_builtin_p (gsi_stmt (gsi2), + BUILT_IN_GOMP_TARGET)); + tree old_decl + = TREE_OPERAND (gimple_call_arg (gsi_stmt (gsi2), 1), 0); + tree new_decl = gimple_call_arg (gsi_stmt (gsi), 0); + gimple_call_set_arg (gsi_stmt (gsi2), 1, new_decl); + update_stmt (gsi_stmt (gsi2)); + new_decl = TREE_OPERAND (new_decl, 0); + unsigned i; + unsigned num_funcs = vec_safe_length (offload_funcs); + for (i = 0; i < num_funcs; i++) + { + if ((*offload_funcs)[i] == old_decl) + { + (*offload_funcs)[i] = new_decl; + break; + } + else if ((*offload_funcs)[i] == new_decl) + break; /* This can happen due to inlining. */ + } + gcc_assert (i < num_funcs); +#else + tree old_decl = TREE_OPERAND (gimple_call_arg (gsi_stmt (gsi), 0), + 0); +#endif + /* FIXME: Find a way to actually prevent outputting the empty-body + old_decl as debug symbol + function in the assembly file. */ + cgraph_node *node = cgraph_node::get (old_decl); + node->address_taken = false; + node->need_lto_streaming = false; + node->offloadable = false; + + unlink_stmt_vdef (stmt); + } + break; case IFN_GOMP_USE_SIMT: rhs = vf == 1 ? integer_zero_node : integer_one_node; break; @@ -3481,6 +3522,15 @@ pass_omp_target_link::execute (function *fun) { if (gimple_call_builtin_p (gsi_stmt (gsi), BUILT_IN_GOMP_TARGET)) { + tree dev = gimple_call_arg (gsi_stmt (gsi), 0); + tree fn = gimple_call_arg (gsi_stmt (gsi), 1); + if (POINTER_TYPE_P (TREE_TYPE (fn))) + fn = TREE_OPERAND (fn, 0); + if (TREE_CODE (dev) == INTEGER_CST + && wi::to_wide (dev) == GOMP_DEVICE_HOST_FALLBACK + && lookup_attribute ("omp target device_ancestor_nohost", + DECL_ATTRIBUTES (fn)) != NULL_TREE) + continue; /* ancestor:1 */ /* Nullify the second argument of __builtin_GOMP_target_ext. */ gimple_call_set_arg (gsi_stmt (gsi), 1, null_pointer_node); update_stmt (gsi_stmt (gsi)); diff --git a/gcc/testsuite/ChangeLog.omp b/gcc/testsuite/ChangeLog.omp index 12300cb52be..76600f3c77d 100644 --- a/gcc/testsuite/ChangeLog.omp +++ b/gcc/testsuite/ChangeLog.omp @@ -1,3 +1,52 @@ +2022-08-30 Tobias Burnus + + Backport from mainline: + 2022-08-26 Tobias Burnus + + * c-c++-common/gomp/reverse-offload-1.c: Remove dg-sorry. + * c-c++-common/gomp/target-device-ancestor-4.c: Likewise. + * gfortran.dg/gomp/target-device-ancestor-4.f90: Likewise. + * gfortran.dg/gomp/target-device-ancestor-5.f90: Likewise. + * c-c++-common/goacc/classify-kernels-parloops.c: Add 'noclone' to + scan-tree-dump-times. + * c-c++-common/goacc/classify-kernels-unparallelized-parloops.c: + Likewise. + * c-c++-common/goacc/classify-kernels-unparallelized.c: Likewise. + * c-c++-common/goacc/classify-kernels.c: Likewise. + * c-c++-common/goacc/classify-parallel.c: Likewise. + * c-c++-common/goacc/classify-serial.c: Likewise. + * c-c++-common/goacc/kernels-counter-vars-function-scope.c: Likewise. + * c-c++-common/goacc/kernels-loop-2.c: Likewise. + * c-c++-common/goacc/kernels-loop-3.c: Likewise. + * c-c++-common/goacc/kernels-loop-data-2.c: Likewise. + * c-c++-common/goacc/kernels-loop-data-enter-exit-2.c: Likewise. + * c-c++-common/goacc/kernels-loop-data-enter-exit.c: Likewise. + * c-c++-common/goacc/kernels-loop-data-update.c: Likewise. + * c-c++-common/goacc/kernels-loop-data.c: Likewise. + * c-c++-common/goacc/kernels-loop-g.c: Likewise. + * c-c++-common/goacc/kernels-loop-mod-not-zero.c: Likewise. + * c-c++-common/goacc/kernels-loop-n.c: Likewise. + * c-c++-common/goacc/kernels-loop-nest.c: Likewise. + * c-c++-common/goacc/kernels-loop.c: Likewise. + * c-c++-common/goacc/kernels-one-counter-var.c: Likewise. + * c-c++-common/goacc/kernels-parallel-loop-data-enter-exit.c: Likewise. + * gfortran.dg/goacc/classify-kernels-parloops.f95: Likewise. + * gfortran.dg/goacc/classify-kernels-unparallelized-parloops.f95: + Likewise. + * gfortran.dg/goacc/classify-kernels-unparallelized.f95: Likewise. + * gfortran.dg/goacc/classify-kernels.f95: Likewise. + * gfortran.dg/goacc/classify-parallel.f95: Likewise. + * gfortran.dg/goacc/classify-serial.f95: Likewise. + * gfortran.dg/goacc/kernels-loop-2.f95: Likewise. + * gfortran.dg/goacc/kernels-loop-data-2.f95: Likewise. + * gfortran.dg/goacc/kernels-loop-data-enter-exit-2.f95: Likewise. + * gfortran.dg/goacc/kernels-loop-data-enter-exit.f95: Likewise. + * gfortran.dg/goacc/kernels-loop-data-update.f95: Likewise. + * gfortran.dg/goacc/kernels-loop-data.f95: Likewise. + * gfortran.dg/goacc/kernels-loop-n.f95: Likewise. + * gfortran.dg/goacc/kernels-loop.f95: Likewise. + * gfortran.dg/goacc/kernels-parallel-loop-data-enter-exit.f95: Likewise. + 2022-08-30 Tobias Burnus * gfortran.dg/gomp/depend-6.f90: Update expected tree dumps. diff --git a/gcc/testsuite/c-c++-common/goacc/classify-kernels-parloops.c b/gcc/testsuite/c-c++-common/goacc/classify-kernels-parloops.c index 5f470eb86bc..fc2b6375002 100644 --- a/gcc/testsuite/c-c++-common/goacc/classify-kernels-parloops.c +++ b/gcc/testsuite/c-c++-common/goacc/classify-kernels-parloops.c @@ -27,16 +27,16 @@ void KERNELS () } /* Check the offloaded function's attributes. - { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels, omp target entrypoint\\)\\)" 1 "ompexp" } } */ + { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels, omp target entrypoint, noclone\\)\\)" 1 "ompexp" } } */ /* Check that exactly one OpenACC kernels construct is analyzed, and that it can be parallelized. { dg-final { scan-tree-dump-times "SUCCESS: may be parallelized" 1 "parloops1" } } - { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint\\)\\)" 1 "parloops1" } } + { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint, noclone\\)\\)" 1 "parloops1" } } { dg-final { scan-tree-dump-not "FAILED:" "parloops1" } } */ /* Check the offloaded function's classification and compute dimensions (will always be 1 x 1 x 1 for non-offloading compilation). { dg-final { scan-tree-dump-times "(?n)Function is parallelized OpenACC kernels offload" 1 "oaccloops" } } { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccloops" } } - { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint\\)\\)" 1 "oaccloops" } } */ + { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint, noclone\\)\\)" 1 "oaccloops" } } */ diff --git a/gcc/testsuite/c-c++-common/goacc/classify-kernels-unparallelized-parloops.c b/gcc/testsuite/c-c++-common/goacc/classify-kernels-unparallelized-parloops.c index a2cc5947697..3792a7c0919 100644 --- a/gcc/testsuite/c-c++-common/goacc/classify-kernels-unparallelized-parloops.c +++ b/gcc/testsuite/c-c++-common/goacc/classify-kernels-unparallelized-parloops.c @@ -33,16 +33,16 @@ void KERNELS () } /* Check the offloaded function's attributes. - { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels, omp target entrypoint\\)\\)" 1 "ompexp" } } */ + { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels, omp target entrypoint, noclone\\)\\)" 1 "ompexp" } } */ /* Check that exactly one OpenACC kernels construct is analyzed, and that it can't be parallelized. { dg-final { scan-tree-dump-times "FAILED:" 1 "parloops1" } } - { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(, , \\), oacc kernels, omp target entrypoint\\)\\)" 1 "parloops1" } } + { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(, , \\), oacc kernels, omp target entrypoint, noclone\\)\\)" 1 "parloops1" } } { dg-final { scan-tree-dump-not "SUCCESS: may be parallelized" "parloops1" } } */ /* Check the offloaded function's classification and compute dimensions (will always be 1 x 1 x 1 for non-offloading compilation). { dg-final { scan-tree-dump-times "(?n)Function is unparallelized OpenACC kernels offload" 1 "oaccloops1" } } { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccloops1" } } - { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), oacc kernels, omp target entrypoint\\)\\)" 1 "oaccloops1" } } */ + { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), oacc kernels, omp target entrypoint, noclone\\)\\)" 1 "oaccloops1" } } */ diff --git a/gcc/testsuite/c-c++-common/goacc/classify-kernels.c b/gcc/testsuite/c-c++-common/goacc/classify-kernels.c index 6e51447b401..b1266d45c83 100644 --- a/gcc/testsuite/c-c++-common/goacc/classify-kernels.c +++ b/gcc/testsuite/c-c++-common/goacc/classify-kernels.c @@ -38,4 +38,4 @@ void KERNELS () always be 1 x 1 x 1 for non-offloading compilation). { dg-final { scan-tree-dump-times "(?n)Function is parallel_kernels_graphite OpenACC kernels offload" 1 "oaccloops1" } } { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccloops1" } } - { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), oacc parallel_kernels_graphite, omp target entrypoint\\)\\)" 1 "oaccloops1" } } */ + { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), oacc parallel_kernels_graphite, omp target entrypoint, noclone\\)\\)" 1 "oaccloops1" } } */ diff --git a/gcc/testsuite/c-c++-common/goacc/classify-parallel.c b/gcc/testsuite/c-c++-common/goacc/classify-parallel.c index d00d2b036f2..14e392b0341 100644 --- a/gcc/testsuite/c-c++-common/goacc/classify-parallel.c +++ b/gcc/testsuite/c-c++-common/goacc/classify-parallel.c @@ -24,10 +24,10 @@ void PARALLEL () } /* Check the offloaded function's attributes. - { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc parallel, omp target entrypoint\\)\\)" 1 "ompexp" } } */ + { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc parallel, omp target entrypoint, noclone\\)\\)" 1 "ompexp" } } */ /* Check the offloaded function's classification and compute dimensions (will always be 1 x 1 x 1 for non-offloading compilation). { dg-final { scan-tree-dump-times "(?n)Function is OpenACC parallel offload" 1 "oaccloops1" } } { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccloops1" } } - { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), oacc parallel, omp target entrypoint\\)\\)" 1 "oaccloops1" } } */ + { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), oacc parallel, omp target entrypoint, noclone\\)\\)" 1 "oaccloops1" } } */ diff --git a/gcc/testsuite/c-c++-common/goacc/classify-serial.c b/gcc/testsuite/c-c++-common/goacc/classify-serial.c index eedad55d49b..f892009de35 100644 --- a/gcc/testsuite/c-c++-common/goacc/classify-serial.c +++ b/gcc/testsuite/c-c++-common/goacc/classify-serial.c @@ -27,10 +27,10 @@ void SERIAL () } /* Check the offloaded function's attributes. - { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc serial, omp target entrypoint\\)\\)" 1 "ompexp" } } */ + { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc serial, omp target entrypoint, noclone\\)\\)" 1 "ompexp" } } */ /* Check the offloaded function's classification and compute dimensions (will always be 1 x 1 x 1 for non-offloading compilation). { dg-final { scan-tree-dump-times "(?n)Function is OpenACC serial offload" 1 "oaccloops1" } } { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccloops1" } } - { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), oacc serial, omp target entrypoint\\)\\)" 1 "oaccloops1" } } */ + { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), oacc serial, omp target entrypoint, noclone\\)\\)" 1 "oaccloops1" } } */ diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c b/gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c index e14ea2c4c72..142ebc07ffa 100644 --- a/gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c +++ b/gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c @@ -48,7 +48,7 @@ main (void) /* Check that only one loop is analyzed, and that it can be parallelized. */ /* { dg-final { scan-tree-dump-times "SUCCESS: may be parallelized" 1 "parloops1" } } */ -/* { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint\\)\\)" 1 "parloops1" } } */ +/* { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint, noclone\\)\\)" 1 "parloops1" } } */ /* { dg-final { scan-tree-dump-not "FAILED:" "parloops1" } } */ /* Check that the loop has been split off into a function. */ diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c index eb95dddedd5..7909ca461cc 100644 --- a/gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c @@ -62,7 +62,7 @@ main (void) /* Check that only three loops are analyzed, and that all can be parallelized. */ /* { dg-final { scan-tree-dump-times "SUCCESS: may be parallelized" 3 "parloops1" } } */ -/* { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint\\)\\)" 3 "parloops1" } } */ +/* { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint, noclone\\)\\)" 3 "parloops1" } } */ /* { dg-final { scan-tree-dump-not "FAILED:" "parloops1" } } */ /* Check that the loop has been split off into a function. */ diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c index ae812583cf9..3e9db6f5145 100644 --- a/gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c @@ -42,7 +42,7 @@ main (void) /* Check that only one loop is analyzed, and that it can be parallelized. */ /* { dg-final { scan-tree-dump-times "SUCCESS: may be parallelized" 1 "parloops1" } } */ -/* { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint\\)\\)" 1 "parloops1" } } */ +/* { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint, noclone\\)\\)" 1 "parloops1" } } */ /* { dg-final { scan-tree-dump-not "FAILED:" "parloops1" } } */ /* Check that the loop has been split off into a function. */ diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-2.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-2.c index b9e0458eab1..37197abb5c8 100644 --- a/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-2.c +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-2.c @@ -62,7 +62,7 @@ main (void) /* Check that only three loops are analyzed, and that all can be parallelized. */ /* { dg-final { scan-tree-dump-times "SUCCESS: may be parallelized" 3 "parloops1" } } */ -/* { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint\\)\\)" 3 "parloops1" } } */ +/* { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint, noclone\\)\\)" 3 "parloops1" } } */ /* { dg-final { scan-tree-dump-not "FAILED:" "parloops1" } } */ /* Check that the loop has been split off into a function. */ diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-enter-exit-2.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-enter-exit-2.c index 9a88e8e0d39..1bf9635d512 100644 --- a/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-enter-exit-2.c +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-enter-exit-2.c @@ -60,7 +60,7 @@ main (void) /* Check that only three loops are analyzed, and that all can be parallelized. */ /* { dg-final { scan-tree-dump-times "SUCCESS: may be parallelized" 3 "parloops1" } } */ -/* { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint\\)\\)" 3 "parloops1" } } */ +/* { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint, noclone\\)\\)" 3 "parloops1" } } */ /* { dg-final { scan-tree-dump-not "FAILED:" "parloops1" } } */ /* Check that the loop has been split off into a function. */ diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-enter-exit.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-enter-exit.c index 0a018820ca1..b0e2c4835c7 100644 --- a/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-enter-exit.c +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-enter-exit.c @@ -57,7 +57,7 @@ main (void) /* Check that only three loops are analyzed, and that all can be parallelized. */ /* { dg-final { scan-tree-dump-times "SUCCESS: may be parallelized" 3 "parloops1" } } */ -/* { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint\\)\\)" 3 "parloops1" } } */ +/* { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint, noclone\\)\\)" 3 "parloops1" } } */ /* { dg-final { scan-tree-dump-not "FAILED:" "parloops1" } } */ /* Check that the loop has been split off into a function. */ diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-update.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-update.c index 4821cb9675e..38cb55830ea 100644 --- a/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-update.c +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-update.c @@ -58,7 +58,7 @@ main (void) /* Check that only two loops are analyzed, and that both can be parallelized. */ /* { dg-final { scan-tree-dump-times "SUCCESS: may be parallelized" 2 "parloops1" } } */ -/* { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint\\)\\)" 2 "parloops1" } } */ +/* { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint, noclone\\)\\)" 2 "parloops1" } } */ /* { dg-final { scan-tree-dump-not "FAILED:" "parloops1" } } */ /* Check that the loop has been split off into a function. */ diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-data.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-data.c index d650bfa9718..27eb18aa94a 100644 --- a/gcc/testsuite/c-c++-common/goacc/kernels-loop-data.c +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-data.c @@ -56,7 +56,7 @@ main (void) /* Check that only three loops are analyzed, and that all can be parallelized. */ /* { dg-final { scan-tree-dump-times "SUCCESS: may be parallelized" 3 "parloops1" } } */ -/* { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint\\)\\)" 3 "parloops1" } } */ +/* { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint, noclone\\)\\)" 3 "parloops1" } } */ /* { dg-final { scan-tree-dump-not "FAILED:" "parloops1" } } */ /* Check that the loop has been split off into a function. */ diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c index ca0bc2e59a4..ca3ca5cdd97 100644 --- a/gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c @@ -11,7 +11,7 @@ /* Check that only one loop is analyzed, and that it can be parallelized. */ /* { dg-final { scan-tree-dump-times "SUCCESS: may be parallelized" 1 "parloops1" } } */ -/* { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint\\)\\)" 1 "parloops1" } } */ +/* { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint, noclone\\)\\)" 1 "parloops1" } } */ /* { dg-final { scan-tree-dump-not "FAILED:" "parloops1" } } */ /* Check that the loop has been split off into a function. */ diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-mod-not-zero.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-mod-not-zero.c index bcc25558373..b38484a007f 100644 --- a/gcc/testsuite/c-c++-common/goacc/kernels-loop-mod-not-zero.c +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-mod-not-zero.c @@ -46,7 +46,7 @@ main (void) /* Check that only one loop is analyzed, and that it can be parallelized. */ /* { dg-final { scan-tree-dump-times "SUCCESS: may be parallelized" 1 "parloops1" } } */ -/* { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint\\)\\)" 1 "parloops1" } } */ +/* { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint, noclone\\)\\)" 1 "parloops1" } } */ /* { dg-final { scan-tree-dump-not "FAILED:" "parloops1" } } */ /* Check that the loop has been split off into a function. */ diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c index b9ffc11bb1c..aaa3318ac87 100644 --- a/gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c @@ -49,7 +49,7 @@ foo (COUNTERTYPE n) /* Check that only one loop is analyzed, and that it can be parallelized. */ /* { dg-final { scan-tree-dump-times "SUCCESS: may be parallelized" 1 "parloops1" } } */ -/* { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint\\)\\)" 1 "parloops1" } } */ +/* { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint, noclone\\)\\)" 1 "parloops1" } } */ /* { dg-final { scan-tree-dump-not "FAILED:" "parloops1" } } */ /* Check that the loop has been split off into a function. */ diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-nest.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-nest.c index a7cec7fab9e..a97e8a9520c 100644 --- a/gcc/testsuite/c-c++-common/goacc/kernels-loop-nest.c +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-nest.c @@ -33,7 +33,7 @@ main (void) /* Check that only one loop is analyzed, and that it can be parallelized. */ /* { dg-final { scan-tree-dump-times "SUCCESS: may be parallelized" 1 "parloops1" } } */ -/* { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint\\)\\)" 1 "parloops1" } } */ +/* { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint, noclone\\)\\)" 1 "parloops1" } } */ /* { dg-final { scan-tree-dump-not "FAILED:" "parloops1" } } */ /* Check that the loop has been split off into a function. */ diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop.c index 954902edbd9..771b88db06b 100644 --- a/gcc/testsuite/c-c++-common/goacc/kernels-loop.c +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop.c @@ -49,7 +49,7 @@ main (void) /* Check that only one loop is analyzed, and that it can be parallelized. */ /* { dg-final { scan-tree-dump-times "SUCCESS: may be parallelized" 1 "parloops1" } } */ -/* { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint\\)\\)" 1 "parloops1" } } */ +/* { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint, noclone\\)\\)" 1 "parloops1" } } */ /* { dg-final { scan-tree-dump-not "FAILED:" "parloops1" } } */ /* Check that the loop has been split off into a function. */ diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-one-counter-var.c b/gcc/testsuite/c-c++-common/goacc/kernels-one-counter-var.c index e2f27af76ed..34d3dd05664 100644 --- a/gcc/testsuite/c-c++-common/goacc/kernels-one-counter-var.c +++ b/gcc/testsuite/c-c++-common/goacc/kernels-one-counter-var.c @@ -47,7 +47,7 @@ main (void) /* Check that only one loop is analyzed, and that it can be parallelized. */ /* { dg-final { scan-tree-dump-times "SUCCESS: may be parallelized" 1 "parloops1" } } */ -/* { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint\\)\\)" 1 "parloops1" } } */ +/* { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint, noclone\\)\\)" 1 "parloops1" } } */ /* { dg-final { scan-tree-dump-not "FAILED:" "parloops1" } } */ /* Check that the loop has been split off into a function. */ diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-parallel-loop-data-enter-exit.c b/gcc/testsuite/c-c++-common/goacc/kernels-parallel-loop-data-enter-exit.c index 305974b88ce..abbfaa34f70 100644 --- a/gcc/testsuite/c-c++-common/goacc/kernels-parallel-loop-data-enter-exit.c +++ b/gcc/testsuite/c-c++-common/goacc/kernels-parallel-loop-data-enter-exit.c @@ -60,7 +60,7 @@ main (void) // FIXME: OpenACC kernels stopped working with the firstprivate subarray // changes. /* { dg-final { scan-tree-dump-times "SUCCESS: may be parallelized" 2 "parloops1" { xfail *-*-* } } } */ -/* { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint\\)\\)" 2 "parloops1" { xfail *-*-* } } } */ +/* { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint, noclone\\)\\)" 2 "parloops1" { xfail *-*-* } } } */ /* { dg-final { scan-tree-dump-not "FAILED:" "parloops1" { xfail *-*-* } } } */ /* Check that the loop has been split off into a function. */ diff --git a/gcc/testsuite/c-c++-common/gomp/reverse-offload-1.c b/gcc/testsuite/c-c++-common/gomp/reverse-offload-1.c index 3452156f948..9a3fa5230f8 100644 --- a/gcc/testsuite/c-c++-common/gomp/reverse-offload-1.c +++ b/gcc/testsuite/c-c++-common/gomp/reverse-offload-1.c @@ -43,7 +43,7 @@ tg_fn (int *x, int *y) x2 = x2 + 2 + called_in_target1 (); y2 = y2 + 7; - #pragma omp target device(ancestor : 1) map(tofrom: x2) /* { dg-message "sorry, unimplemented: 'ancestor' not yet supported" } */ + #pragma omp target device(ancestor : 1) map(tofrom: x2) check_offload(&x2, &y2); if (x2 != 2+2+3+42 || y2 != 3 + 7) diff --git a/gcc/testsuite/c-c++-common/gomp/target-device-ancestor-4.c b/gcc/testsuite/c-c++-common/gomp/target-device-ancestor-4.c index 37cd1a0f1d3..87ac7548c23 100644 --- a/gcc/testsuite/c-c++-common/gomp/target-device-ancestor-4.c +++ b/gcc/testsuite/c-c++-common/gomp/target-device-ancestor-4.c @@ -9,7 +9,7 @@ void foo (void) { - #pragma omp target device (ancestor: 1) /* { dg-message "" "sorry, unimplemented: 'ancestor' not yet supported" { xfail *-*-* } } */ + #pragma omp target device (ancestor: 1) ; } diff --git a/gcc/testsuite/gfortran.dg/goacc/classify-kernels-parloops.f95 b/gcc/testsuite/gfortran.dg/goacc/classify-kernels-parloops.f95 index 96814a1697d..5dd763faffe 100644 --- a/gcc/testsuite/gfortran.dg/goacc/classify-kernels-parloops.f95 +++ b/gcc/testsuite/gfortran.dg/goacc/classify-kernels-parloops.f95 @@ -29,16 +29,16 @@ program main end program main ! Check the offloaded function's attributes. -! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels, omp target entrypoint\\)\\)" 1 "ompexp" } } +! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels, omp target entrypoint, noclone\\)\\)" 1 "ompexp" } } ! Check that exactly one OpenACC kernels construct is analyzed, and that it ! can be parallelized. ! { dg-final { scan-tree-dump-times "SUCCESS: may be parallelized" 1 "parloops1" } } -! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint\\)\\)" 1 "parloops1" } } +! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint, noclone\\)\\)" 1 "parloops1" } } ! { dg-final { scan-tree-dump-not "FAILED:" "parloops1" } } ! Check the offloaded function's classification and compute dimensions (will ! always be 1 x 1 x 1 for non-offloading compilation). ! { dg-final { scan-tree-dump-times "(?n)Function is parallelized OpenACC kernels offload" 1 "oaccloops" } } ! { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccloops" } } -! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint\\)\\)" 1 "oaccloops" } } +! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint, noclone\\)\\)" 1 "oaccloops" } } diff --git a/gcc/testsuite/gfortran.dg/goacc/classify-kernels-unparallelized-parloops.f95 b/gcc/testsuite/gfortran.dg/goacc/classify-kernels-unparallelized-parloops.f95 index 3da7f2c711d..691d33913ab 100644 --- a/gcc/testsuite/gfortran.dg/goacc/classify-kernels-unparallelized-parloops.f95 +++ b/gcc/testsuite/gfortran.dg/goacc/classify-kernels-unparallelized-parloops.f95 @@ -30,16 +30,16 @@ program main end program main ! Check the offloaded function's attributes. -! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels, omp target entrypoint\\)\\)" 1 "ompexp" } } +! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels, omp target entrypoint, noclone\\)\\)" 1 "ompexp" } } ! Check that exactly one OpenACC kernels construct is analyzed, and that it ! can't be parallelized. ! { dg-final { scan-tree-dump-times "FAILED:" 1 "parloops1" } } -! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(, , \\), oacc kernels, omp target entrypoint\\)\\)" 1 "parloops1" } } +! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(, , \\), oacc kernels, omp target entrypoint, noclone\\)\\)" 1 "parloops1" } } ! { dg-final { scan-tree-dump-not "SUCCESS: may be parallelized" "parloops1" } } ! Check the offloaded function's classification and compute dimensions (will ! always be 1 x 1 x 1 for non-offloading compilation). ! { dg-final { scan-tree-dump-times "(?n)Function is unparallelized OpenACC kernels offload" 1 "oaccloops1" } } ! { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccloops1" } } -! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), oacc kernels, omp target entrypoint\\)\\)" 1 "oaccloops1" } } +! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), oacc kernels, omp target entrypoint, noclone\\)\\)" 1 "oaccloops1" } } diff --git a/gcc/testsuite/gfortran.dg/goacc/classify-kernels-unparallelized.f95 b/gcc/testsuite/gfortran.dg/goacc/classify-kernels-unparallelized.f95 index 297ef48b9e5..a132b47f27c 100644 --- a/gcc/testsuite/gfortran.dg/goacc/classify-kernels-unparallelized.f95 +++ b/gcc/testsuite/gfortran.dg/goacc/classify-kernels-unparallelized.f95 @@ -33,10 +33,10 @@ program main end program main ! Check the offloaded function's attributes. -! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc parallel_kernels_graphite, omp target entrypoint\\)\\)" 1 "ompexp" } } +! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc parallel_kernels_graphite, omp target entrypoint, noclone\\)\\)" 1 "ompexp" } } ! Check the offloaded function's classification and compute dimensions (will ! always be 1 x 1 x 1 for non-offloading compilation). ! { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccloops1" } } ! { dg-final { scan-tree-dump-not "^assigned OpenACC.*?loop parallelism$" "oaccloops1" } } -! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), oacc parallel_kernels_graphite, omp target entrypoint\\)\\)" 1 "oaccloops1" } } +! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), oacc parallel_kernels_graphite, omp target entrypoint, noclone\\)\\)" 1 "oaccloops1" } } diff --git a/gcc/testsuite/gfortran.dg/goacc/classify-kernels.f95 b/gcc/testsuite/gfortran.dg/goacc/classify-kernels.f95 index 411bc45390f..97486ba61d8 100644 --- a/gcc/testsuite/gfortran.dg/goacc/classify-kernels.f95 +++ b/gcc/testsuite/gfortran.dg/goacc/classify-kernels.f95 @@ -33,4 +33,4 @@ end program main ! always be 1 x 1 x 1 for non-offloading compilation). ! { dg-final { scan-tree-dump-times "(?n)Function is parallel_kernels_graphite OpenACC kernels offload" 1 "oaccloops1" } } ! { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccloops1" } } -! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), oacc parallel_kernels_graphite, omp target entrypoint\\)\\)" 1 "oaccloops1" } } +! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), oacc parallel_kernels_graphite, omp target entrypoint, noclone\\)\\)" 1 "oaccloops1" } } diff --git a/gcc/testsuite/gfortran.dg/goacc/classify-parallel.f95 b/gcc/testsuite/gfortran.dg/goacc/classify-parallel.f95 index c18ab5a2e57..dc35ac7a090 100644 --- a/gcc/testsuite/gfortran.dg/goacc/classify-parallel.f95 +++ b/gcc/testsuite/gfortran.dg/goacc/classify-parallel.f95 @@ -26,10 +26,10 @@ program main end program main ! Check the offloaded function's attributes. -! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc parallel, omp target entrypoint\\)\\)" 1 "ompexp" } } +! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc parallel, omp target entrypoint, noclone\\)\\)" 1 "ompexp" } } ! Check the offloaded function's classification and compute dimensions (will ! always be 1 x 1 x 1 for non-offloading compilation). ! { dg-final { scan-tree-dump-times "(?n)Function is OpenACC parallel offload" 1 "oaccloops1" } } ! { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccloops1" } } -! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), oacc parallel, omp target entrypoint\\)\\)" 1 "oaccloops1" } } +! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), oacc parallel, omp target entrypoint, noclone\\)\\)" 1 "oaccloops1" } } diff --git a/gcc/testsuite/gfortran.dg/goacc/classify-serial.f95 b/gcc/testsuite/gfortran.dg/goacc/classify-serial.f95 index 233b94acdb1..425a41768b8 100644 --- a/gcc/testsuite/gfortran.dg/goacc/classify-serial.f95 +++ b/gcc/testsuite/gfortran.dg/goacc/classify-serial.f95 @@ -26,10 +26,10 @@ program main end program main ! Check the offloaded function's attributes. -! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc serial, omp target entrypoint\\)\\)" 1 "ompexp" } } +! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc serial, omp target entrypoint, noclone\\)\\)" 1 "ompexp" } } ! Check the offloaded function's classification and compute dimensions (will ! always be 1 x 1 x 1 for non-offloading compilation). ! { dg-final { scan-tree-dump-times "(?n)Function is OpenACC serial offload" 1 "oaccloops1" } } ! { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccloops1" } } -! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), oacc serial, omp target entrypoint\\)\\)" 1 "oaccloops1" } } +! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), oacc serial, omp target entrypoint, noclone\\)\\)" 1 "oaccloops1" } } diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-2.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-2.f95 index f3ab71da5f8..206879540de 100644 --- a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-2.f95 +++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-2.f95 @@ -37,7 +37,7 @@ end program main ! Check that only three loops are analyzed, and that all can be parallelized. ! { dg-final { scan-tree-dump-times "loop has no data-dependences" 6 "graphite" } } ! Two CFG loops per OpenACC loop ! { dg-final { scan-tree-dump-not "loop has data-dependences" "graphite" } } -! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(, , \\), oacc parallel_kernels_graphite, omp target entrypoint\\)\\)" 3 "graphite" } } +! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(, , \\), oacc parallel_kernels_graphite, omp target entrypoint, noclone\\)\\)" 3 "graphite" } } ! Check that the loop has been split off into a function. ! { dg-final { scan-tree-dump-times "(?n);; Function MAIN__._omp_fn.0 " 1 "optimized" } } diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-2.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-2.f95 index 348a2e186d8..edb570e3cb4 100644 --- a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-2.f95 +++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-2.f95 @@ -42,7 +42,7 @@ end program main ! Check that only three loops are analyzed, and that all can be parallelized. -! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(, , \\), oacc parallel_kernels_graphite, omp target entrypoint\\)\\)" 3 "graphite" } } +! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(, , \\), oacc parallel_kernels_graphite, omp target entrypoint, noclone\\)\\)" 3 "graphite" } } ! { dg-final { scan-tree-dump-times "loop has no data-dependences" 6 "graphite" } } ! Two CFG loops per OpenACC loop ! Check that the loop has been split off into a function. diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-enter-exit-2.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-enter-exit-2.f95 index 461279d7572..d748b58b965 100644 --- a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-enter-exit-2.f95 +++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-enter-exit-2.f95 @@ -43,7 +43,7 @@ end program main ! Check that only three loops are analyzed, and that all can be parallelized. ! { dg-final { scan-tree-dump-times "SUCCESS: may be parallelized" 3 "parloops1" } } -! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint\\)\\)" 3 "parloops1" } } +! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint, noclone\\)\\)" 3 "parloops1" } } ! { dg-final { scan-tree-dump-not "FAILED:" "parloops1" } } ! Check that the loop has been split off into a function. diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-enter-exit.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-enter-exit.f95 index cfb3803501b..2afbf6497b8 100644 --- a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-enter-exit.f95 +++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-enter-exit.f95 @@ -41,7 +41,7 @@ end program main ! Check that only three loops are analyzed, and that all can be parallelized. ! { dg-final { scan-tree-dump-times "SUCCESS: may be parallelized" 3 "parloops1" } } -! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint\\)\\)" 3 "parloops1" } } +! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint, noclone\\)\\)" 3 "parloops1" } } ! { dg-final { scan-tree-dump-not "FAILED:" "parloops1" } } ! Check that the loop has been split off into a function. diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-update.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-update.f95 index a9ec77b4db6..869bc2f5ea6 100644 --- a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-update.f95 +++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-update.f95 @@ -41,7 +41,7 @@ end program main ! Check that only three loops are analyzed, and that all can be parallelized. ! { dg-final { scan-tree-dump-times "SUCCESS: may be parallelized" 2 "parloops1" } } -! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint\\)\\)" 2 "parloops1" } } +! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint, noclone\\)\\)" 2 "parloops1" } } ! { dg-final { scan-tree-dump-not "FAILED:" "parloops1" } } ! Check that the loop has been split off into a function. diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data.f95 index 53a082b782e..c123699d27a 100644 --- a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data.f95 +++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data.f95 @@ -41,7 +41,7 @@ end program main ! Check that only three loops are analyzed, and that all can be parallelized. ! { dg-final { scan-tree-dump-times "SUCCESS: may be parallelized" 3 "parloops1" } } -! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint\\)\\)" 3 "parloops1" } } +! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint, noclone\\)\\)" 3 "parloops1" } } ! { dg-final { scan-tree-dump-not "FAILED:" "parloops1" } } ! Check that the loop has been split off into a function. diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-n.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-n.f95 index 0d1021e3023..78cfa7d79e8 100644 --- a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-n.f95 +++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-n.f95 @@ -36,7 +36,7 @@ end module test ! Check that only one loop is analyzed, and that it can be parallelized. ! { dg-final { scan-tree-dump-times "SUCCESS: may be parallelized" 1 "parloops1" } } ! TODO, PR70545. -! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint\\)\\)" 1 "parloops1" { xfail *-*-* } } } +! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint, noclone\\)\\)" 1 "parloops1" { xfail *-*-* } } } ! { dg-final { scan-tree-dump-not "FAILED:" "parloops1" } } ! Check that the loop has been split off into a function. diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop.f95 index 7ac3831294f..48e17d6ebe3 100644 --- a/gcc/testsuite/gfortran.dg/goacc/kernels-loop.f95 +++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop.f95 @@ -31,7 +31,7 @@ program main end program main ! Check that only one loop is analyzed, and that it can be parallelized. -! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(, , \\), oacc parallel_kernels_graphite, omp target entrypoint\\)\\)" 1 "graphite" } } +! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(, , \\), oacc parallel_kernels_graphite, omp target entrypoint, noclone\\)\\)" 1 "graphite" } } ! { dg-final { scan-tree-dump-times "loop has no data-dependences" 2 "graphite" } } ! Two CFG loops per OpenACC loop ! Check that the loop has been split off into a function. diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-parallel-loop-data-enter-exit.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-parallel-loop-data-enter-exit.f95 index 7e9589de799..552bb618cbc 100644 --- a/gcc/testsuite/gfortran.dg/goacc/kernels-parallel-loop-data-enter-exit.f95 +++ b/gcc/testsuite/gfortran.dg/goacc/kernels-parallel-loop-data-enter-exit.f95 @@ -42,7 +42,7 @@ end program main ! Check that only three loops are analyzed, and that all can be parallelized. ! { dg-final { scan-tree-dump-times "SUCCESS: may be parallelized" 2 "parloops1" { xfail *-*-* } } } -! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint\\)\\)" 2 "parloops1" { xfail *-*-* } } } +! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint, noclone\\)\\)" 2 "parloops1" { xfail *-*-* } } } ! { dg-final { scan-tree-dump-not "FAILED:" "parloops1" { xfail *-*-* } } } ! Check that the loop has been split off into a function. diff --git a/gcc/testsuite/gfortran.dg/gomp/target-device-ancestor-4.f90 b/gcc/testsuite/gfortran.dg/gomp/target-device-ancestor-4.f90 index 56aff24df50..d73adf2c5a7 100644 --- a/gcc/testsuite/gfortran.dg/gomp/target-device-ancestor-4.f90 +++ b/gcc/testsuite/gfortran.dg/gomp/target-device-ancestor-4.f90 @@ -6,7 +6,7 @@ !$omp requires reverse_offload -!$omp target device (ancestor : 1) ! { dg-message "" "sorry, unimplemented: 'ancestor' not yet supported" } +!$omp target device (ancestor : 1) !$omp end target end diff --git a/gcc/testsuite/gfortran.dg/gomp/target-device-ancestor-5.f90 b/gcc/testsuite/gfortran.dg/gomp/target-device-ancestor-5.f90 index ca8d4b282a0..9596d61f6fa 100644 --- a/gcc/testsuite/gfortran.dg/gomp/target-device-ancestor-5.f90 +++ b/gcc/testsuite/gfortran.dg/gomp/target-device-ancestor-5.f90 @@ -17,7 +17,7 @@ contains block block block - !$omp target device(ancestor:1) ! { dg-message "sorry, unimplemented: 'ancestor' not yet supported" } + !$omp target device(ancestor:1) !$omp end target end block end block diff --git a/libgomp/ChangeLog.omp b/libgomp/ChangeLog.omp index baa861ccff4..316e5ee4bbe 100644 --- a/libgomp/ChangeLog.omp +++ b/libgomp/ChangeLog.omp @@ -1,3 +1,15 @@ +2022-08-30 Tobias Burnus + + Backport from mainline: + 2022-08-26 Tobias Burnus + + * libgomp.texi (OpenMP 5.0): Mark 'ancestor' as implemented but + refer to 'requires'. + * testsuite/libgomp.c-c++-common/reverse-offload-1-aux.c: New test. + * testsuite/libgomp.c-c++-common/reverse-offload-1.c: New test. + * testsuite/libgomp.fortran/reverse-offload-1-aux.f90: New test. + * testsuite/libgomp.fortran/reverse-offload-1.f90: New test. + 2022-08-17 Tobias Burnus Backport from mainline: diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi index 527d457d8a9..77987619cf8 100644 --- a/libgomp/libgomp.texi +++ b/libgomp/libgomp.texi @@ -225,7 +225,7 @@ The OpenMP 4.5 specification is fully supported. @item @code{allocate} clause @tab P @tab Initial support @item @code{use_device_addr} clause on @code{target data} @tab Y @tab @item @code{ancestor} modifier on @code{device} clause - @tab P @tab Reverse offload unsupported + @tab Y @tab See comment for @code{requires} @item Implicit declare target directive @tab Y @tab @item Discontiguous array section with @code{target update} construct @tab N @tab diff --git a/libgomp/testsuite/libgomp.c-c++-common/reverse-offload-1-aux.c b/libgomp/testsuite/libgomp.c-c++-common/reverse-offload-1-aux.c new file mode 100644 index 00000000000..b3a331d12da --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/reverse-offload-1-aux.c @@ -0,0 +1,10 @@ +/* { dg-do compile { target skip-all-targets } } */ + +/* Declare the following function in a separare translation unit + to ensure it won't have a device version. */ + +int +add_3 (int x) +{ + return x + 3; +} diff --git a/libgomp/testsuite/libgomp.c-c++-common/reverse-offload-1.c b/libgomp/testsuite/libgomp.c-c++-common/reverse-offload-1.c new file mode 100644 index 00000000000..976e129f560 --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/reverse-offload-1.c @@ -0,0 +1,83 @@ +/* { dg-do run } */ +/* { dg-additional-sources reverse-offload-1-aux.c } */ + +/* Check that reverse offload works in particular: + - no code is generated on the device side (i.e. no + implicit declare target of called functions and no + code gen for the target-region body) + -> would otherwise fail due to 'add_3' symbol + - Plus the usual (compiles, runs, produces correct result) + + Note: Running also the non-reverse-offload target regions + on the host (host fallback) is valid and will pass. */ + +#pragma omp requires reverse_offload + +extern int add_3 (int); + +static int global_var = 5; + +void +check_offload (int *x, int *y) +{ + *x = add_3 (*x); + *y = add_3 (*y); +} + +#pragma omp declare target +void +tg_fn (int *x, int *y) +{ + int x2 = *x, y2 = *y; + if (x2 != 2 || y2 != 3) + __builtin_abort (); + x2 = x2 + 2; + y2 = y2 + 7; + + #pragma omp target device(ancestor : 1) map(tofrom: x2) + check_offload(&x2, &y2); + + if (x2 != 2+2+3 || y2 != 3 + 7) + __builtin_abort (); + *x = x2, *y = y2; +} +#pragma omp end declare target + +void +my_func (int *x, int *y) +{ + if (global_var != 5) + __builtin_abort (); + global_var = 242; + *x = 2*add_3(*x); + *y = 3*add_3(*y); +} + +int +main () +{ + #pragma omp target + { + int x = 2, y = 3; + tg_fn (&x, &y); + } + + #pragma omp target + { + int x = -2, y = -1; + #pragma omp target device ( ancestor:1 ) firstprivate(y) map(tofrom:x) + { + if (x != -2 || y != -1) + __builtin_abort (); + my_func (&x, &y); + if (x != 2*(3-2) || y != 3*(3-1)) + __builtin_abort (); + } + if (x != 2*(3-2) || y != -1) + __builtin_abort (); + } + + if (global_var != 242) + __builtin_abort (); + return 0; +} diff --git a/libgomp/testsuite/libgomp.fortran/reverse-offload-1-aux.f90 b/libgomp/testsuite/libgomp.fortran/reverse-offload-1-aux.f90 new file mode 100644 index 00000000000..1807f063d5a --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/reverse-offload-1-aux.f90 @@ -0,0 +1,12 @@ +! { dg-do compile { target skip-all-targets } } + +! Declare the following function in a separare translation unit +! to ensure it won't have a device version. + + +integer function add_3 (x) + implicit none + integer, value :: x + + add_3 = x + 3 +end function diff --git a/libgomp/testsuite/libgomp.fortran/reverse-offload-1.f90 b/libgomp/testsuite/libgomp.fortran/reverse-offload-1.f90 new file mode 100644 index 00000000000..7cfb8b6552e --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/reverse-offload-1.f90 @@ -0,0 +1,88 @@ +! { dg-do run } +! { dg-additional-sources reverse-offload-1-aux.f90 } + +! Check that reverse offload works in particular: +! - no code is generated on the device side (i.e. no +! implicit declare target of called functions and no +! code gen for the target-region body) +! -> would otherwise fail due to 'add_3' symbol +! - Plus the usual (compiles, runs, produces correct result) + +! Note: Running also the non-reverse-offload target regions +! on the host (host fallback) is valid and will pass. + +module m + interface + integer function add_3 (x) + implicit none + integer, value :: x + end function + end interface + integer :: global_var = 5 +end module m + +module m2 + use m + !$omp requires reverse_offload + implicit none (type, external) +contains + subroutine check_offload (x, y) + integer :: x, y + x = add_3(x) + y = add_3(y) + end subroutine check_offload + subroutine m2_tg_fn(x, y) + integer :: x, y + !$omp declare target + if (x /= 2 .or. y /= 3) stop 1 + x = x + 2 + y = y + 7 + !$omp target device(ancestor : 1) map(tofrom: x) + call check_offload(x, y) + !$omp end target + if (x /= 2+2+3 .or. y /= 3 + 7) stop 2 + end subroutine +end module m2 + +program main + use m + !$omp requires reverse_offload + implicit none (type, external) + + integer :: prog_var = 99 + + !$omp target + block + use m2 + integer :: x, y + x = 2; y = 3 + call m2_tg_fn (x, y) + end block + + !$omp target + block + use m2 + integer :: x, y + x = -2; y = -1 + !$omp target device ( ancestor:1 ) firstprivate(y) map(tofrom:x) + if (x /= -2 .or. y /= -1) stop 3 + call my_func (x, y) + if (x /= 2*(3-2) .or. y /= 3*(3-1)) stop 5 + !$omp end target + if (x /= 2*(3-2) .or. y /= -1) stop 6 + end block + + if (prog_var /= 41 .or. global_var /= 242) stop 7 + +contains + + subroutine my_func(x, y) + integer :: x, y + if (prog_var /= 99) stop 8 + if (global_var /= 5) stop 9 + prog_var = 41 + global_var = 242 + x = 2*add_3(x) + y = 3*add_3(y) + end subroutine my_func +end