From: Richard Biener <rguenther@suse.de>
To: Tom de Vries <Tom_deVries@mentor.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>,
Jakub Jelinek <jakub@redhat.com>,
Thomas Schwinge <Thomas_Schwinge@mentor.com>,
ebotcazou@adacore.com
Subject: Re: [PATCH, 8/8] Do simple omp lowering for no address taken var
Date: Mon, 17 Nov 2014 10:29:00 -0000
Message-ID: <alpine.LSU.2.11.1411171104160.374@zhemvz.fhfr.qr>
In-Reply-To: <54678C29.40006@mentor.com>
On Sat, 15 Nov 2014, Tom de Vries wrote:
> On 15-11-14 13:14, Tom de Vries wrote:
> > Hi,
> >
> > I'm submitting a patch series with initial support for the oacc kernels
> > directive.
> >
> > The patch series uses pass_parallelize_loops to implement parallelization of
> > loops in the oacc kernels region.
> >
> > The patch series consists of these 8 patches:
> > ...
> > 1 Expand oacc kernels after pass_build_ealias
> > 2 Add pass_oacc_kernels
> > 3 Add pass_ch_oacc_kernels to pass_oacc_kernels
> > 4 Add pass_tree_loop_{init,done} to pass_oacc_kernels
> > 5 Add pass_loop_im to pass_oacc_kernels
> > 6 Add pass_ccp to pass_oacc_kernels
> > 7 Add pass_parloops_oacc_kernels to pass_oacc_kernels
> > 8 Do simple omp lowering for no address taken var
> > ...
>
> This patch lowers integer variables that do not have their address taken as
> local variables. We use a copy at region entry and exit to copy the value in
> and out.
>
> In the context of reduction handling in a kernels region, this allows the
> parloops reduction analysis to recognize the reduction, even after oacc
> lowering has been done in pass_lower_omp.
>
> In more detail, without this patch, the omp_data_i loads and stores are
> generated in place (in this case, in the loop):
> ...
> {
>   .omp_data_iD.2201 = &.omp_data_arr.15D.2220;
>   {
>     unsigned intD.9 iD.2146;
>
>     iD.2146 = 0;
>     goto <D.2207>;
>     <D.2208>:
>     D.2216 = .omp_data_iD.2201->cD.2203;
>     c.9D.2176 = *D.2216;
>     D.2177 = (long unsigned intD.10) iD.2146;
>     D.2178 = D.2177 * 4;
>     D.2179 = c.9D.2176 + D.2178;
>     D.2180 = *D.2179;
>     D.2217 = .omp_data_iD.2201->sumD.2205;
>     D.2218 = *D.2217;
>     D.2217 = .omp_data_iD.2201->sumD.2205;
>     D.2219 = D.2180 + D.2218;
>     *D.2217 = D.2219;
>     iD.2146 = iD.2146 + 1;
>     <D.2207>:
>     if (iD.2146 <= 524287) goto <D.2208>; else goto <D.2209>;
>     <D.2209>:
>   }
> ...
>
> With this patch, the omp_data_i loads and stores for sum are generated at
> entry and exit:
> ...
> {
>   .omp_data_iD.2201 = &.omp_data_arr.15D.2218;
>   D.2216 = .omp_data_iD.2201->sumD.2205;
>   sumD.2206 = *D.2216;
>   {
>     unsigned intD.9 iD.2146;
>
>     iD.2146 = 0;
>     goto <D.2207>;
>     <D.2208>:
>     D.2217 = .omp_data_iD.2201->cD.2203;
>     c.9D.2176 = *D.2217;
>     D.2177 = (long unsigned intD.10) iD.2146;
>     D.2178 = D.2177 * 4;
>     D.2179 = c.9D.2176 + D.2178;
>     D.2180 = *D.2179;
>     sumD.2206 = D.2180 + sumD.2206;
>     iD.2146 = iD.2146 + 1;
>     <D.2207>:
>     if (iD.2146 <= 524287) goto <D.2208>; else goto <D.2209>;
>     <D.2209>:
>   }
>   *D.2216 = sumD.2206;
>   #pragma omp return
> }
> ...
>
>
> So, without the patch the reduction operation looks like this:
> ...
> *(.omp_data_iD.2201->sumD.2205) = *(.omp_data_iD.2201->sumD.2205) + x
> ...
>
> And with this patch the reduction operation is simply:
> ...
> sumD.2206 = sumD.2206 + x
> ...
>
> OK for trunk?
I presume the reason you are trying to do that here is that otherwise
it happens too late? What you do is what loop store motion would
do.
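
To make the store-motion comparison concrete, here is a rough C-level sketch of the two shapes shown in the dumps above. It is not code from the patch: the struct omp_data record and both function names are invented for illustration, and the record is simplified (in the real dump the c field is itself reached through one more pointer).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the .omp_data_i record: it carries pointers
   to the shared variables rather than their values.  */
struct omp_data
{
  const unsigned int *c;
  unsigned int *sum;
};

/* Shape without the patch: every iteration loads and stores *data->sum,
   so the reduction is expressed through memory.  */
void
reduce_in_memory (struct omp_data *data, size_t n)
{
  assert (data != NULL && data->sum != NULL);
  for (size_t i = 0; i < n; i++)
    *data->sum = data->c[i] + *data->sum;
}

/* Shape with the patch, which is also what store motion would produce:
   copy in at region entry, reduce on a local scalar, copy out at exit.  */
void
reduce_in_local (struct omp_data *data, size_t n)
{
  assert (data != NULL && data->sum != NULL);
  unsigned int sum = *data->sum;	/* copy-in at entry */
  for (size_t i = 0; i < n; i++)
    sum = data->c[i] + sum;
  *data->sum = sum;			/* copy-out at exit */
}
```

Both functions leave the same value in *data->sum as long as nothing else reads or writes sum during the loop, which is exactly the condition the address-taken check is meant to establish.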
Now - I can see how that is easily confused by the static chain
being address-taken. But I also remember that Eric did some
preparatory work to fix that, for nested functions, that is,
possibly setting DECL_NONADDRESSABLE_P? Don't remember exactly.

That said - the gimple_seq_ior_addresses_taken_op callback looks
completely broken. Consider &a.x which you'd fail to mark as
address-taken. It looks like the body is not yet in CFG form
when you apply all this?
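
The &a.x case can be illustrated with a small C sketch. It is only an illustration with made-up names (struct pair, bump, read_x), not GCC code; the presumed failure mode is that the operand of the address expression is a component reference rather than the variable itself, so a check that looks only for a bare variable never sees a.

```c
#include <assert.h>
#include <stddef.h>

struct pair { int x; int y; };

/* Writes through a pointer that the caller formed with &a.x.  */
static void
bump (int *p)
{
  assert (p != NULL);
  *p += 1;
}

/* Returns a.x as seen either through a by-value copy made before the
   call (as a copy-in at region entry would do) or through a itself.  */
int
read_x (int use_cached_copy)
{
  struct pair a = { 1, 2 };
  struct pair cached = a;	/* pretend copy-in of a */
  bump (&a.x);			/* takes the address of a via one field */
  return use_cached_copy ? cached.x : a.x;
}
```

Here read_x (0) sees the update made through the pointer, while read_x (1) returns the stale value from the copy. Treating a as not address-taken and keeping it in a local copy would therefore be wrong, so the base variable of &a.x has to be marked as well.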
That said - the functions do not belong in gimple.[ch], at least
as they are not going to work in general. I also question
why they are necessary - you do

+ if (gimple_code (stmt) == GIMPLE_OACC_KERNELS
+     && !bitmap_bit_p (addresses_taken, DECL_UID (var))
+     && INTEGRAL_TYPE_P (TREE_TYPE (var)))

but why don't you simply check TREE_ADDRESSABLE (var)? TREE_ADDRESSABLE
is conservatively correct here.

And the above won't help for float reductions. So, if at all, you
should probably test is_gimple_reg_type (TREE_TYPE (var)) instead
of INTEGRAL_TYPE_P, and you definitely should limit the number of
vars treated this way.

Oh - and the optimization should be somewhere more general - after
all it applies to all nested functions (thus move it to tree-nested.c?)
and to autopar loops as well. Not sure how much code the omp
lowering shares with unnesting - but hopefully enough.

Richard.

--
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendoerffer, HRB 21284
(AG Nuernberg)
Maxfeldstrasse 5, 90409 Nuernberg, Germany