public inbox for gcc-patches@gcc.gnu.org
From: Tom de Vries <Tom_deVries@mentor.com>
To: Richard Biener <rguenther@suse.de>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>,
	Jakub Jelinek <jakub@redhat.com>,
	Thomas Schwinge <Thomas_Schwinge@mentor.com>,
	<ebotcazou@adacore.com>
Subject: Re: [PATCH, 8/8] Do simple omp lowering for no address taken var
Date: Mon, 24 Nov 2014 11:55:00 -0000
Message-ID: <5473171F.2040708@mentor.com>
In-Reply-To: <54731678.9090207@mentor.com>

[-- Attachment #1: Type: text/plain, Size: 5217 bytes --]

On 24-11-14 12:28, Tom de Vries wrote:
> On 17-11-14 11:13, Richard Biener wrote:
>> On Sat, 15 Nov 2014, Tom de Vries wrote:
>>
>>> >On 15-11-14 13:14, Tom de Vries wrote:
>>>> > >Hi,
>>>> > >
>>>> > >I'm submitting a patch series with initial support for the oacc kernels
>>>> > >directive.
>>>> > >
>>>> > >The patch series uses pass_parallelize_loops to implement parallelization of
>>>> > >loops in the oacc kernels region.
>>>> > >
>>>> > >The patch series consists of these 8 patches:
>>>> > >...
>>>> > >      1  Expand oacc kernels after pass_build_ealias
>>>> > >      2  Add pass_oacc_kernels
>>>> > >      3  Add pass_ch_oacc_kernels to pass_oacc_kernels
>>>> > >      4  Add pass_tree_loop_{init,done} to pass_oacc_kernels
>>>> > >      5  Add pass_loop_im to pass_oacc_kernels
>>>> > >      6  Add pass_ccp to pass_oacc_kernels
>>>> > >      7  Add pass_parloops_oacc_kernels to pass_oacc_kernels
>>>> > >      8  Do simple omp lowering for no address taken var
>>>> > >...
>>> >
>>> >This patch lowers integer variables that do not have their address taken as
>>> >local variables.  We use a copy at region entry and exit to copy the value in
>>> >and out.
>>> >
>>> >In the context of reduction handling in a kernels region, this allows the
>>> >parloops reduction analysis to recognize the reduction, even after oacc
>>> >lowering has been done in pass_lower_omp.
>>> >
>>> >In more detail, without this patch, the omp_data_i loads and stores are
>>> >generated in place (in this case, in the loop):
>>> >...
>>> >                 {
>>> >                   .omp_data_iD.2201 = &.omp_data_arr.15D.2220;
>>> >                   {
>>> >                     unsigned intD.9 iD.2146;
>>> >
>>> >                     iD.2146 = 0;
>>> >                     goto <D.2207>;
>>> >                     <D.2208>:
>>> >                     D.2216 = .omp_data_iD.2201->cD.2203;
>>> >                     c.9D.2176 = *D.2216;
>>> >                     D.2177 = (long unsigned intD.10) iD.2146;
>>> >                     D.2178 = D.2177 * 4;
>>> >                     D.2179 = c.9D.2176 + D.2178;
>>> >                     D.2180 = *D.2179;
>>> >                     D.2217 = .omp_data_iD.2201->sumD.2205;
>>> >                     D.2218 = *D.2217;
>>> >                     D.2217 = .omp_data_iD.2201->sumD.2205;
>>> >                     D.2219 = D.2180 + D.2218;
>>> >                     *D.2217 = D.2219;
>>> >                     iD.2146 = iD.2146 + 1;
>>> >                     <D.2207>:
>>> >                     if (iD.2146 <= 524287) goto <D.2208>; else goto <D.2209>;
>>> >                     <D.2209>:
>>> >                   }
>>> >...
>>> >
>>> >With this patch, the omp_data_i load and store for sum are generated at entry
>>> >and exit:
>>> >...
>>> >                 {
>>> >                   .omp_data_iD.2201 = &.omp_data_arr.15D.2218;
>>> >                   D.2216 = .omp_data_iD.2201->sumD.2205;
>>> >                   sumD.2206 = *D.2216;
>>> >                   {
>>> >                     unsigned intD.9 iD.2146;
>>> >
>>> >                     iD.2146 = 0;
>>> >                     goto <D.2207>;
>>> >                     <D.2208>:
>>> >                     D.2217 = .omp_data_iD.2201->cD.2203;
>>> >                     c.9D.2176 = *D.2217;
>>> >                     D.2177 = (long unsigned intD.10) iD.2146;
>>> >                     D.2178 = D.2177 * 4;
>>> >                     D.2179 = c.9D.2176 + D.2178;
>>> >                     D.2180 = *D.2179;
>>> >                     sumD.2206 = D.2180 + sumD.2206;
>>> >                     iD.2146 = iD.2146 + 1;
>>> >                     <D.2207>:
>>> >                     if (iD.2146 <= 524287) goto <D.2208>; else goto <D.2209>;
>>> >                     <D.2209>:
>>> >                   }
>>> >                   *D.2216 = sumD.2206;
>>> >                   #pragma omp return
>>> >                 }
>>> >...
>>> >
>>> >
>>> >So, without the patch the reduction operation looks like this:
>>> >...
>>> >     *(.omp_data_iD.2201->sumD.2205) = *(.omp_data_iD.2201->sumD.2205) + x
>>> >...
>>> >
>>> >And with this patch the reduction operation is simply:
>>> >...
>>> >     sumD.2206 = sumD.2206 + x
>>> >...
>>> >
>>> >OK for trunk?
>> I presume the reason you are trying to do that here is that otherwise
>> it happens too late?  What you do is what loop store motion would
>> do.
>
> Richard,
>
> Thanks for the hint. I've built a reduction example:
> ...
> void __attribute__((noinline))
> f (unsigned int *__restrict__ a, unsigned int *__restrict__ sum, unsigned int n)
> {
>    unsigned int i;
>    for (i = 0; i < n; ++i)
>      *sum += a[i];
> }
> ...
> and observed that store motion of the *sum store is done by pass_loop_im,
> provided the *sum load is taken out of the loop by pass_pre first.
>
> So alternatively, we could use pass_pre and pass_loop_im to achieve the same
> effect.
>
> When I tried adding pass_pre to the pass group pass_oacc_kernels, I found
> that pass_copy_prop was also required to get parloops to recognize the
> reduction.
>
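
To make the quoted observation concrete, here is a hand-written C sketch (not
compiler output, and not part of the patch) of the effect that hoisting the
*sum load and doing store motion on the *sum store have on the example.  The
function f is the example from above; f_hoisted is a made-up name that shows,
by hand, the shape the loop should have afterwards: the reduction runs on a
local scalar, with one load before the loop and one store after it.

```c
/* The reduction example from above: *sum is loaded and stored in every
   iteration of the loop.  */
void
f (unsigned int *__restrict__ a, unsigned int *__restrict__ sum,
   unsigned int n)
{
  unsigned int i;
  for (i = 0; i < n; ++i)
    *sum += a[i];
}

/* Hand-hoisted equivalent (hypothetical name, written by hand for
   illustration): the load of *sum is moved in front of the loop and the
   store behind it, so the loop body only updates a local scalar.  */
void
f_hoisted (unsigned int *__restrict__ a, unsigned int *__restrict__ sum,
	   unsigned int n)
{
  unsigned int i;
  unsigned int sum_local = *sum;	/* load before the loop */
  for (i = 0; i < n; ++i)
    sum_local += a[i];			/* reduction on a local scalar */
  *sum = sum_local;			/* store after the loop */
}
```

Calling both functions on the same input should leave the same value in *sum;
the second form has the "sum = sum + x" shape shown earlier in the thread.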

The attached patch adds pass_copy_prop to the pass group pass_oacc_kernels.

Bootstrapped and reg-tested in the same way as before.

OK for trunk?

Thanks,
- Tom

[-- Attachment #2: 0008-Add-pass_copy_prop-in-pass_oacc_kernels.patch --]
[-- Type: text/x-patch, Size: 1441 bytes --]

2014-11-23  Tom de Vries  <tom@codesourcery.com>

	* passes.def: Add pass_copy_prop to pass group pass_oacc_kernels.
	* tree-ssa-copy.c (stmt_may_generate_copy): Handle .omp_data_i init
	conservatively.
---
 gcc/passes.def      | 1 +
 gcc/tree-ssa-copy.c | 4 ++++
 2 files changed, 5 insertions(+)

diff --git a/gcc/passes.def b/gcc/passes.def
index 3a7b096..8c663b0 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -95,6 +95,7 @@ along with GCC; see the file COPYING3.  If not see
 	      NEXT_PASS (pass_tree_loop_init);
 	      NEXT_PASS (pass_lim);
 	      NEXT_PASS (pass_ccp);
+	      NEXT_PASS (pass_copy_prop);
 	      NEXT_PASS (pass_tree_loop_done);
 	  POP_INSERT_PASSES ()
 	  NEXT_PASS (pass_expand_omp_ssa);
diff --git a/gcc/tree-ssa-copy.c b/gcc/tree-ssa-copy.c
index 7c22c5e..d6eb7a7 100644
--- a/gcc/tree-ssa-copy.c
+++ b/gcc/tree-ssa-copy.c
@@ -55,6 +55,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-scalar-evolution.h"
 #include "tree-ssa-dom.h"
 #include "tree-ssa-loop-niter.h"
+#include "omp-low.h"
 
 
 /* This file implements the copy propagation pass and provides a
@@ -110,6 +111,9 @@ stmt_may_generate_copy (gimple stmt)
   if (gimple_has_volatile_ops (stmt))
     return false;
 
+  if (gimple_stmt_omp_data_i_init_p (stmt))
+    return false;
+
   /* Statements with loads and/or stores will never generate a useful copy.  */
   if (gimple_vuse (stmt))
     return false;
-- 
1.9.1
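
As a reading aid for the tree-ssa-copy.c hunk above, the following is a toy C
model of the order of the early exits in stmt_may_generate_copy after the
patch.  It is only a sketch: struct toy_stmt and toy_stmt_may_generate_copy
are hypothetical names, the boolean fields stand in for the real predicates
on gimple statements, and the checks that follow the vuse test in the real
function (not visible in the hunk) are replaced by a plain "return true".

```c
#include <stdbool.h>

/* Toy stand-in for a GIMPLE statement.  The fields are hypothetical and
   only model the three properties tested in the hunk.  */
struct toy_stmt
{
  bool has_volatile_ops;	/* models gimple_has_volatile_ops */
  bool is_omp_data_i_init;	/* models gimple_stmt_omp_data_i_init_p */
  bool has_vuse;		/* models gimple_vuse being non-null */
};

/* Models the early exits only; the remaining checks of the real function
   are not shown in the hunk and are assumed away here.  */
bool
toy_stmt_may_generate_copy (const struct toy_stmt *stmt)
{
  if (stmt->has_volatile_ops)
    return false;

  /* The check added by the patch: the .omp_data_i init is never treated
     as a copy.  */
  if (stmt->is_omp_data_i_init)
    return false;

  /* Statements with loads and/or stores never generate a useful copy.  */
  if (stmt->has_vuse)
    return false;

  return true;
}
```

In this model, a statement flagged as the .omp_data_i init is rejected even
when it has no volatile operands and no vuse, which is the only behavioral
change the hunk makes.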

