From: Tom de Vries <Tom_deVries@mentor.com>
To: Richard Biener <rguenther@suse.de>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>,
Jakub Jelinek <jakub@redhat.com>,
Thomas Schwinge <Thomas_Schwinge@mentor.com>,
<ebotcazou@adacore.com>
Subject: Re: [PATCH, 8/8] Do simple omp lowering for no address taken var
Date: Mon, 24 Nov 2014 11:55:00 -0000
Message-ID: <5473171F.2040708@mentor.com>
In-Reply-To: <54731678.9090207@mentor.com>
[-- Attachment #1: Type: text/plain, Size: 5217 bytes --]
On 24-11-14 12:28, Tom de Vries wrote:
> On 17-11-14 11:13, Richard Biener wrote:
>> On Sat, 15 Nov 2014, Tom de Vries wrote:
>>
>>> >On 15-11-14 13:14, Tom de Vries wrote:
>>>> > >Hi,
>>>> > >
>>>> > >I'm submitting a patch series with initial support for the oacc kernels
>>>> > >directive.
>>>> > >
>>>> > >The patch series uses pass_parallelize_loops to implement parallelization of
>>>> > >loops in the oacc kernels region.
>>>> > >
>>>> > >The patch series consists of these 8 patches:
>>>> > >...
>>>> > > 1 Expand oacc kernels after pass_build_ealias
>>>> > > 2 Add pass_oacc_kernels
>>>> > > 3 Add pass_ch_oacc_kernels to pass_oacc_kernels
>>>> > > 4 Add pass_tree_loop_{init,done} to pass_oacc_kernels
>>>> > > 5 Add pass_loop_im to pass_oacc_kernels
>>>> > > 6 Add pass_ccp to pass_oacc_kernels
>>>> > > 7 Add pass_parloops_oacc_kernels to pass_oacc_kernels
>>>> > > 8 Do simple omp lowering for no address taken var
>>>> > >...
>>> >
>>> >This patch lowers integer variables that do not have their address taken
>>> >to local variables. We use a copy at region entry and exit to copy the
>>> >value in and out.
>>> >
>>> >In the context of reduction handling in a kernels region, this allows the
>>> >parloops reduction analysis to recognize the reduction, even after oacc
>>> >lowering has been done in pass_lower_omp.
>>> >
>>> >In more detail, without this patch, the omp_data_i loads and stores are
>>> >generated in place (in this case, in the loop):
>>> >...
>>> > {
>>> > .omp_data_iD.2201 = &.omp_data_arr.15D.2220;
>>> > {
>>> > unsigned intD.9 iD.2146;
>>> >
>>> > iD.2146 = 0;
>>> > goto <D.2207>;
>>> > <D.2208>:
>>> > D.2216 = .omp_data_iD.2201->cD.2203;
>>> > c.9D.2176 = *D.2216;
>>> > D.2177 = (long unsigned intD.10) iD.2146;
>>> > D.2178 = D.2177 * 4;
>>> > D.2179 = c.9D.2176 + D.2178;
>>> > D.2180 = *D.2179;
>>> > D.2217 = .omp_data_iD.2201->sumD.2205;
>>> > D.2218 = *D.2217;
>>> > D.2217 = .omp_data_iD.2201->sumD.2205;
>>> > D.2219 = D.2180 + D.2218;
>>> > *D.2217 = D.2219;
>>> > iD.2146 = iD.2146 + 1;
>>> > <D.2207>:
>>> > if (iD.2146 <= 524287) goto <D.2208>; else goto <D.2209>;
>>> > <D.2209>:
>>> > }
>>> >...
>>> >
>>> >With this patch, the omp_data_i load and store for sum are generated at
>>> >entry and exit:
>>> >...
>>> > {
>>> > .omp_data_iD.2201 = &.omp_data_arr.15D.2218;
>>> > D.2216 = .omp_data_iD.2201->sumD.2205;
>>> > sumD.2206 = *D.2216;
>>> > {
>>> > unsigned intD.9 iD.2146;
>>> >
>>> > iD.2146 = 0;
>>> > goto <D.2207>;
>>> > <D.2208>:
>>> > D.2217 = .omp_data_iD.2201->cD.2203;
>>> > c.9D.2176 = *D.2217;
>>> > D.2177 = (long unsigned intD.10) iD.2146;
>>> > D.2178 = D.2177 * 4;
>>> > D.2179 = c.9D.2176 + D.2178;
>>> > D.2180 = *D.2179;
>>> > sumD.2206 = D.2180 + sumD.2206;
>>> > iD.2146 = iD.2146 + 1;
>>> > <D.2207>:
>>> > if (iD.2146 <= 524287) goto <D.2208>; else goto <D.2209>;
>>> > <D.2209>:
>>> > }
>>> > *D.2216 = sumD.2206;
>>> > #pragma omp return
>>> > }
>>> >...
>>> >
>>> >
>>> >So, without the patch the reduction operation looks like this:
>>> >...
>>> > *(.omp_data_iD.2201->sumD.2205) = *(.omp_data_iD.2201->sumD.2205) + x
>>> >...
>>> >
>>> >And with this patch the reduction operation is simply:
>>> >...
>>> > sumD.2206 = sumD.2206 + x
>>> >...
>>> >
>>> >OK for trunk?
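
To make the difference concrete, the two lowerings can be approximated at
source level. This is only a hand-written sketch, not compiler output:
struct omp_data stands in for the record behind .omp_data_i, and the names
region_in_place, region_copy_in_out and check_regions are made up for
illustration. It assumes that sum does not alias any element of c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the record behind .omp_data_i: it holds the
   addresses of the variables shared with the region.  */
struct omp_data
{
  unsigned int **c;
  unsigned int *sum;
};

/* Without the patch: sum is loaded and stored through the record in
   every iteration.  */
void
region_in_place (struct omp_data *omp_data_i, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      unsigned int *c = *omp_data_i->c;
      *omp_data_i->sum = c[i] + *omp_data_i->sum;
    }
}

/* With the patch: sum lives in a local variable inside the region.  */
void
region_copy_in_out (struct omp_data *omp_data_i, size_t n)
{
  unsigned int sum = *omp_data_i->sum;	/* copy in at region entry */
  for (size_t i = 0; i < n; i++)
    {
      unsigned int *c = *omp_data_i->c;
      sum = c[i] + sum;			/* reduction on a plain local */
    }
  *omp_data_i->sum = sum;		/* copy out at region exit */
}

/* Usage: both variants must produce the same result.  */
void
check_regions (void)
{
  unsigned int data[4] = { 1, 2, 3, 4 };
  unsigned int *c = data;
  unsigned int sum1 = 10, sum2 = 10;
  struct omp_data d1 = { &c, &sum1 };
  struct omp_data d2 = { &c, &sum2 };

  region_in_place (&d1, 4);
  region_copy_in_out (&d2, 4);
  assert (sum1 == sum2);
}
```

The second form is the shape in which a reduction on a scalar is easy to
see: the loop body only updates the local sum, with no memory access for it.
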
>> I presume the reason you are trying to do that here is that otherwise
>> it happens too late? What you do is what loop store motion would
>> do.
>
> Richard,
>
> Thanks for the hint. I've built a reduction example:
> ...
> void __attribute__((noinline))
> f (unsigned int *__restrict__ a, unsigned int *__restrict__ sum, unsigned int n)
> {
> unsigned int i;
> for (i = 0; i < n; ++i)
> *sum += a[i];
> }
> ...
> and observed that store motion of the *sum store is done by pass_loop_im,
> provided the *sum load is taken out of the loop by pass_pre first.
>
> So alternatively, we could use pass_pre and pass_loop_im to achieve the same
> effect.
>
> When I tried adding pass_pre to the pass group pass_oacc_kernels, I found
> that pass_copy_prop was also required to get parloops to recognize the
> reduction.
>
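
For reference, the combined effect of hoisting the *sum load and sinking
the *sum store on the reduction example can be written out by hand. This is
a source-level approximation, not a dump; f_after_motion and check_f are
illustrative names only, and the comments naming the passes restate the
observation quoted above.

```c
#include <assert.h>

/* The reduction example, unchanged.  */
void __attribute__((noinline))
f (unsigned int *__restrict__ a, unsigned int *__restrict__ sum, unsigned int n)
{
  unsigned int i;
  for (i = 0; i < n; ++i)
    *sum += a[i];
}

/* Hand-written equivalent after load hoisting and store motion.  */
void __attribute__((noinline))
f_after_motion (unsigned int *__restrict__ a, unsigned int *__restrict__ sum,
		unsigned int n)
{
  unsigned int i;
  unsigned int tmp = *sum;	/* load taken out of the loop (pass_pre) */
  for (i = 0; i < n; ++i)
    tmp += a[i];		/* reduction on a scalar */
  *sum = tmp;			/* store moved after the loop (pass_loop_im) */
}

/* Usage: both versions must agree.  */
void
check_f (void)
{
  unsigned int a[4] = { 1, 2, 3, 4 };
  unsigned int s1 = 10, s2 = 10;

  f (a, &s1, 4);
  f_after_motion (a, &s2, 4);
  assert (s1 == s2);
}
```
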
The attached patch adds pass_copy_prop to the pass group pass_oacc_kernels.
Bootstrapped and reg-tested in the same way as before.
OK for trunk?
Thanks,
- Tom
[-- Attachment #2: 0008-Add-pass_copy_prop-in-pass_oacc_kernels.patch --]
[-- Type: text/x-patch, Size: 1441 bytes --]
2014-11-23  Tom de Vries  <tom@codesourcery.com>

	* passes.def: Add pass_copy_prop to pass group pass_oacc_kernels.
	* tree-ssa-copy.c (stmt_may_generate_copy): Handle .omp_data_i init
	conservatively.
---
gcc/passes.def | 1 +
gcc/tree-ssa-copy.c | 4 ++++
2 files changed, 5 insertions(+)
diff --git a/gcc/passes.def b/gcc/passes.def
index 3a7b096..8c663b0 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -95,6 +95,7 @@ along with GCC; see the file COPYING3. If not see
NEXT_PASS (pass_tree_loop_init);
NEXT_PASS (pass_lim);
NEXT_PASS (pass_ccp);
+ NEXT_PASS (pass_copy_prop);
NEXT_PASS (pass_tree_loop_done);
POP_INSERT_PASSES ()
NEXT_PASS (pass_expand_omp_ssa);
diff --git a/gcc/tree-ssa-copy.c b/gcc/tree-ssa-copy.c
index 7c22c5e..d6eb7a7 100644
--- a/gcc/tree-ssa-copy.c
+++ b/gcc/tree-ssa-copy.c
@@ -55,6 +55,7 @@ along with GCC; see the file COPYING3. If not see
#include "tree-scalar-evolution.h"
#include "tree-ssa-dom.h"
#include "tree-ssa-loop-niter.h"
+#include "omp-low.h"
/* This file implements the copy propagation pass and provides a
@@ -110,6 +111,9 @@ stmt_may_generate_copy (gimple stmt)
if (gimple_has_volatile_ops (stmt))
return false;
+ if (gimple_stmt_omp_data_i_init_p (stmt))
+ return false;
+
/* Statements with loads and/or stores will never generate a useful copy. */
if (gimple_vuse (stmt))
return false;
--
1.9.1