From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 1534)
	id B6098385840A; Mon, 24 Oct 2022 15:16:53 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org B6098385840A
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1666624616;
	bh=rn5dKIBJFG+QU7qkHGs5AJo3XVmsRK2+stNVpvecc8E=;
	h=From:To:Subject:Date:From;
	b=HD/hvJJV19/A7Fp/P1jxOxK97j592EoCiTUEaQOEn4eG6DqLB3vUeY1daFmm+wdkL
	 kxTK2MSOrgeY8YRtPa1wdcMqKDuLXfqv5HVGiRRLDQuxD5hmhAS1a+yVZXgvshN+Ld
	 Sopx4NqlxrMs67HlUdJ5JEC0F3LgtOjlgdPTflI8=
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
From: Tobias Burnus
To: gcc-cvs@gcc.gnu.org
Subject: [gcc/devel/omp/gcc-12] OpenMP: Fix reverse offload GOMP_TARGET_REV IFN corner cases [PR107236]
X-Act-Checkin: gcc
X-Git-Author: Tobias Burnus
X-Git-Refname: refs/heads/devel/omp/gcc-12
X-Git-Oldrev: 4b96309e90b3e0723bec411cd68c99983686ebd3
X-Git-Newrev: 497b44832fcef226515bb55271b87ecd12985a60
Message-Id: <20221024151656.B6098385840A@sourceware.org>
Date: Mon, 24 Oct 2022 15:16:53 +0000 (GMT)
List-Id:

https://gcc.gnu.org/g:497b44832fcef226515bb55271b87ecd12985a60

commit 497b44832fcef226515bb55271b87ecd12985a60
Author: Tobias Burnus
Date:   Mon Oct 24 15:23:43 2022 +0200

    OpenMP: Fix reverse offload GOMP_TARGET_REV IFN corner cases [PR107236]

    For 'target parallel' and similarly nested directives, cgraph_node's
    calls_declare_variant_alt was not set in the parent region node but in
    cfun->decl. Hence, pass_omp_device_lower did not handle the internal
    function GOMP_TARGET_REV. The solution is to set it on the DECL_CONTEXT,
    which is set in adjust_context_and_scope.

    The cgraph_node::create_clone issue is exposed with -O2 for the existing
    libgomp.fortran/reverse-offload-1.f90.

            PR middle-end/107236

    gcc/ChangeLog:

            * omp-expand.cc (expand_omp_target): Set calls_declare_variant_alt
            in DECL_CONTEXT and not to cfun->decl.
            * cgraphclones.cc (cgraph_node::create_clone): Copy also the
            node's calls_declare_variant_alt value.

    gcc/testsuite/ChangeLog:

            * gfortran.dg/gomp/target-device-ancestor-6.f90: New test.

    (cherry picked from commit 178ac530fe67e4f2fc439cc4ce89bc19d571ca31)

Diff:
---
 gcc/ChangeLog.omp                                       | 11 +++++++++++
 gcc/cgraphclones.cc                                     |  1 +
 gcc/omp-expand.cc                                       | 13 ++++++-------
 gcc/testsuite/ChangeLog.omp                             |  8 ++++++++
 .../gfortran.dg/gomp/target-device-ancestor-6.f90       | 17 +++++++++++++++++
 5 files changed, 43 insertions(+), 7 deletions(-)

diff --git a/gcc/ChangeLog.omp b/gcc/ChangeLog.omp
index e032c05148b..8fc8e06e9ff 100644
--- a/gcc/ChangeLog.omp
+++ b/gcc/ChangeLog.omp
@@ -1,3 +1,14 @@
+2022-10-24  Tobias Burnus
+
+	Backported from master:
+	2022-10-24  Tobias Burnus
+
+	PR middle-end/107236
+	* omp-expand.cc (expand_omp_target): Set calls_declare_variant_alt
+	in DECL_CONTEXT and not to cfun->decl.
+	* cgraphclones.cc (cgraph_node::create_clone): Copy also the
+	node's calls_declare_variant_alt value.
+
 2022-10-21  Tobias Burnus
 
 	* omp-oacc-kernels-decompose.cc (top_level_omp_for_in_stmt,
diff --git a/gcc/cgraphclones.cc b/gcc/cgraphclones.cc
index eb0fa87b554..bb4b3c5407d 100644
--- a/gcc/cgraphclones.cc
+++ b/gcc/cgraphclones.cc
@@ -375,6 +375,7 @@ cgraph_node::create_clone (tree new_decl, profile_count prof_count,
   if (!new_inlined_to)
     prof_count = count.combine_with_ipa_count (prof_count);
   new_node->count = prof_count;
+  new_node->calls_declare_variant_alt = this->calls_declare_variant_alt;
 
   /* Update IPA profile.  Local profiles need no updating in original.  */
   if (update_original)
diff --git a/gcc/omp-expand.cc b/gcc/omp-expand.cc
index 6529f63362b..37c397089b4 100644
--- a/gcc/omp-expand.cc
+++ b/gcc/omp-expand.cc
@@ -10143,13 +10143,8 @@ expand_omp_target (struct omp_region *region)
 
       /* Handle the case that an inner ancestor:1 target is called by an outer
 	 target region. */
-      if (!is_ancestor)
-	cgraph_node::get (child_fn)->calls_declare_variant_alt
-	  |= cgraph_node::get (cfun->decl)->calls_declare_variant_alt;
-      else  /* Duplicate function to create empty nonhost variant.  */
+      if (is_ancestor)
 	{
-	  /* Enable pass_omp_device_lower pass.  */
-	  cgraph_node::get (cfun->decl)->calls_declare_variant_alt = 1;
 	  cgraph_node *fn2_node;
 	  child_fn2 = build_decl (DECL_SOURCE_LOCATION (child_fn),
 				  FUNCTION_DECL,
@@ -10163,7 +10158,7 @@ expand_omp_target (struct omp_region *region)
 	  TREE_PUBLIC (child_fn2) = 0;
 	  DECL_UNINLINABLE (child_fn2) = 1;
 	  DECL_EXTERNAL (child_fn2) = 0;
-	  DECL_CONTEXT (child_fn2) = NULL_TREE;
+	  DECL_CONTEXT (child_fn2) = DECL_CONTEXT (child_fn);
 	  DECL_INITIAL (child_fn2) = make_node (BLOCK);
 	  BLOCK_SUPERCONTEXT (DECL_INITIAL (child_fn2)) = child_fn2;
 	  DECL_ATTRIBUTES (child_fn)
@@ -10187,6 +10182,10 @@ expand_omp_target (struct omp_region *region)
 	  fn2_node->force_output = 1;
 	  node->offloadable = 0;
 
+	  /* Enable pass_omp_device_lower pass.  */
+	  fn2_node = cgraph_node::get (DECL_CONTEXT (child_fn));
+	  fn2_node->calls_declare_variant_alt = 1;
+
 	  t = build_decl (DECL_SOURCE_LOCATION (child_fn),
 			  RESULT_DECL, NULL_TREE, void_type_node);
 	  DECL_ARTIFICIAL (t) = 1;
diff --git a/gcc/testsuite/ChangeLog.omp b/gcc/testsuite/ChangeLog.omp
index 0ded9edf750..5970a3baf7b 100644
--- a/gcc/testsuite/ChangeLog.omp
+++ b/gcc/testsuite/ChangeLog.omp
@@ -1,3 +1,11 @@
+2022-10-24  Tobias Burnus
+
+	Backported from master:
+	2022-10-24  Tobias Burnus
+
+	PR middle-end/107236
+	* gfortran.dg/gomp/target-device-ancestor-6.f90: New test.
+
 2022-10-21  Marcel Vollweiler
 
 	* c-c++-common/goacc/classify-kernels-unparallelized-graphite.c:
diff --git a/gcc/testsuite/gfortran.dg/gomp/target-device-ancestor-6.f90 b/gcc/testsuite/gfortran.dg/gomp/target-device-ancestor-6.f90
new file mode 100644
index 00000000000..821e7852e85
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/gomp/target-device-ancestor-6.f90
@@ -0,0 +1,17 @@
+! PR middle-end/107236
+
+! Did ICE before because IFN .GOMP_TARGET_REV was not
+! processed in omp-offload.cc.
+! Note: Test required ENABLE_OFFLOADING being true inside GCC.
+
+implicit none
+!$omp requires reverse_offload
+!$omp target parallel num_threads(4)
+  !$omp target device(ancestor:1)
+    call foo()
+  !$omp end target
+!$omp end target parallel
+contains
+  subroutine foo
+  end
+end
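
The two fixes above can be illustrated with a small standalone model. This is only a sketch under stated assumptions: ToyNode, mark_enclosing_function, clone_without_flag, clone_with_flag and device_lower_would_run are hypothetical names invented for this illustration and are not GCC's cgraph_node API. The gate shown is a simplification that assumes the device-lowering pass looks only at the flag.

```cpp
#include <string>
#include <utility>

// Toy stand-in for a call-graph node.  NOT GCC's cgraph_node; every name
// in this block is hypothetical and exists only for illustration.
struct ToyNode
{
  std::string name;
  ToyNode *context;                 // enclosing function, like DECL_CONTEXT
  bool calls_declare_variant_alt;   // flag the lowering pass is assumed to check

  explicit ToyNode (std::string n, ToyNode *ctx = nullptr)
    : name (std::move (n)), context (ctx), calls_declare_variant_alt (false)
  {}
};

// Set the flag on the function that encloses CHILD, mirroring the idea of
// setting it on the node of DECL_CONTEXT (child_fn) instead of cfun->decl.
inline void
mark_enclosing_function (ToyNode &child)
{
  if (child.context)
    child.context->calls_declare_variant_alt = true;
}

// Clone that drops the flag; models the behavior before the one-line fix.
inline ToyNode
clone_without_flag (const ToyNode &orig, const std::string &new_name)
{
  return ToyNode (new_name, orig.context);
}

// Clone that copies the flag; models the added assignment in create_clone.
inline ToyNode
clone_with_flag (const ToyNode &orig, const std::string &new_name)
{
  ToyNode copy (new_name, orig.context);
  copy.calls_declare_variant_alt = orig.calls_declare_variant_alt;
  return copy;
}

// Simplified gate: the pass would visit NODE only when the flag is set.
inline bool
device_lower_would_run (const ToyNode &node)
{
  return node.calls_declare_variant_alt;
}
```

In this model, marking the enclosing function makes the gate accept it, while a clone made with clone_without_flag is silently skipped; that corresponds to the -O2 case the commit message attributes to cgraph_node::create_clone.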