From: Richard Biener <rguenther@suse.de>
To: Tom de Vries <Tom_deVries@mentor.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>, Jakub Jelinek <jakub@redhat.com>
Subject: Re: [PING][PATCH][PR65443] Add transform_to_exit_first_loop_alt
Date: Tue, 26 May 2015 11:05:00 -0000 [thread overview]
Message-ID: <alpine.LSU.2.11.1505261232500.30088@zhemvz.fhfr.qr> (raw)
In-Reply-To: <55548A19.3000302@mentor.com>
On Thu, 14 May 2015, Tom de Vries wrote:
> On 20-04-15 14:25, Richard Biener wrote:
> > On Wed, 15 Apr 2015, Tom de Vries wrote:
> >
> > > On 03-04-15 14:39, Tom de Vries wrote:
> > > > On 27-03-15 15:10, Tom de Vries wrote:
> > > > > Hi,
> > > > >
> > > > > this patch fixes PR65443, a todo in the parloops pass for function
> > > > > transform_to_exit_first_loop:
> > > > > ...
> > > > > TODO: the common case is that latch of the loop is empty and
> > > > > immediately
> > > > > follows the loop exit. In this case, it would be better not to
> > > > > copy
> > > > > the
> > > > > body of the loop, but only move the entry of the loop directly
> > > > > before
> > > > > the
> > > > > exit check and increase the number of iterations of the loop by
> > > > > one.
> > > > > This may need some additional preconditioning in case NIT = ~0.
> > > > > ...
> > > > >
> > > > > The current implementation of transform_to_exit_first_loop transforms
> > > > > the
> > > > > loop
> > > > > by moving and duplicating the loop body. This patch transforms the
> > > > > loop
> > > > > into the
> > > > > same normal form, but without the duplication, if that's possible
> > > > > (determined by
> > > > > try_transform_to_exit_first_loop_alt).
> > > > >
> > > > > The actual transformation, done by transform_to_exit_first_loop_alt
> > > > > transforms
> > > > > this loop:
> > > > > ...
> > > > > bb preheader:
> > > > > ...
> > > > > goto <bb header>
> > > > >
> > > > > <bb header>:
> > > > > ...
> > > > > if (ivtmp < n)
> > > > > goto <bb latch>;
> > > > > else
> > > > > goto <bb exit>;
> > > > >
> > > > > <bb latch>:
> > > > > ivtmp = ivtmp + inc;
> > > > > goto <bb header>
> > > > > ...
> > > > >
> > > > > into this one:
> > > > > ...
> > > > > bb preheader:
> > > > > ...
> > > > > goto <bb newheader>
> > > > >
> > > > > <bb header>:
> > > > > ...
> > > > > goto <bb latch>;
> > > > >
> > > > > <bb newheader>:
> > > > > if (ivtmp < n + 1)
> > > > > goto <bb header>;
> > > > > else
> > > > > goto <bb exit>;
> > > > >
> > > > > <bb latch>:
> > > > > ivtmp = ivtmp + inc;
> > > > > goto <bb newheader>
> > > > > ...
> > > > >
> > > >
> > > > Updated patch, now using redirect_edge_var_map and flush_pending_stmts.
> > > >
> > > > Bootstrapped and reg-tested on x86_64.
> > > >
> > > > OK for stage1 trunk?
> > > >
> > >
> > > Ping.
> >
>
> Hi Richard,
>
> thanks for the review.
Sorry for the delay (too many things to review, too much other work...)
> > +static void
> > +replace_imm_uses (tree val, imm_use_iterator *imm_iter)
> > +{
> > + use_operand_p use_p;
> > +
> > + FOR_EACH_IMM_USE_ON_STMT (use_p, *imm_iter)
> > + SET_USE (use_p, val);
> >
> > Use propagate_value. Why this odd interface passing a imm_iter?!
> >
>
> The replace_imm_uses function is common code factored out of
> replace_uses_in_bb_by and replace_uses_in_bbs_by. I'm not sure what is odd
> about the interface of the replace_imm_uses function, but I've removed the
> function.
>
> I tried using propagate_value, but that one doesn't like virtuals.
>
> > In fact most of the "repair SSA" in transform_to_exit_first_loop_alt
> > looks too ad-hoc to me ... it also looks somewhat familiar to other
> > code ...
> >
> > Unfortunately the comment before the function isn't in SSA form
> > so it's hard to follow the transform.
> >
>
> I've added the ssa bits in the transformation comment before the function, and
> updated variable names and comments in the function.
>
> > I consider the parloops code bitrotten, no repair possible, so
> > I might be convinced to not care about new spaghetti in there...
> >
> > + /* Fix up loop structure. TODO: Check whether this is sufficient. */
> > + loop->header = new_header;
> > +
> >
> > no, surely not. Number of iterations (and estimates) are off
> > after the transform
>
> The loop is cancelled at the end of gen_parallel_loop, and the estimations are
> removed. We only seem to be using a limited set of the loop data until the
> loop is cancelled. Updated comment accordingly.
>
> > and loop->latch might also need updating.
> >
>
> That one actually stays the same. Updated comment accordingly.
>
> > "Safest" is to simply schedule a fixup (but you'll lose any
> > loop annotation in that process).
> >
>
> Since the loop is cancelled, AFAIU we don't need that, but... You're
> referring to the annotations mentioned in
> replace_loop_annotate_in_block? Losing the annotations seems to be a
> generic problem of scheduling such a fixup, not particular to this
> patch. Do you have a concern related to the patch and these annotations?
For example such annotations (or others added by OpenMP like the
safe_len stuff). Generally fixups preserve those _iff_ the loop
header stays the same. So in your case doing
loop->header = new_header;
set_loops_state (LOOPS_NEED_FIXUP);
would probably "work". But yes, if the loop is cancelled anyway
there is no point jumping through hoops.
> > + /* Figure out whether nit + 1 overflows. */
> > + if (TREE_CODE (nit) == INTEGER_CST)
> > + {
> > + if (!tree_int_cst_equal (nit, TYPE_MAXVAL (nit_type)))
> >
> > in case nit_type is a pointer type TYPE_MAXVAL will be NULL I think.
> >
>
> We've done ivcanon just before this point, so AFAIU we can assume we're
> dealing with an unsigned integer.
>
> > Is the whole exercise for performance?
>
> I'm trying to fix the todo in the code for parloops, which is about
> performance, though I don't expect a large gain.
>
> OTOH, my main focus is a related oacc kernels issue. For an oacc kernels
> region, the entire loop is enclosed in a kernels region. Peeling of the last
> iteration means I have to guard that iteration to run on only one thread,
> which currently isn't done, so in that sense it's a correctness issue as well.
But as the patch doesn't cover all cases you still have a correctness
issue to solve, no? It seems to me the last iteration should simply
_not_ be in the oacc kernels region?
> > In that case using an
> > entirely new, unsigned IV, that runs from 0 to niter should be easiest and
>
> Introducting such an IV would mean that we don't have to update the IV, but
> still we have to connect the new IV to the uses of the old IV.
>
> The current special handling of the IV variable is actually not that
> elaborate:
> ...
> + /* Replace ivtmp_a with ivtmp_c in condition 'if (ivtmp_a < n)'. */
> + replace_uses_in_bb_by (res_a, res_c, new_header);
> ...
> So I'm not sure it's easier.
>
> just run-time guard that niter == +INF case?
>
> That doesn't work out nicely for the oacc kernels case. I need to know
> compile-time wheter I can parallelize or not.
Hmm, I see.
> > For the graphite case, can't you make graphite generate the
> > loops exit-first in the first place?
> >
>
> Perhaps, but that doesn't make a difference for the oacc kernels case.
>
> > The testcases are just correctness ones? That is, there isn't
> > any testcase that checks whether the new code is exercised?
> > (no extra debugging dumping?)
> >
>
> There are 3 new test patterns, each with a libgomp/testsuite/libgomp.c and a
> gcc/testsuite/gcc.dg variant, so 6 new test-cases in total. The
> gcc/testsuite/gcc.dg variant checks that the new code is exercised. The
> libgomp/testsuite/libgomp.c variant checks that the new code generates correct
> code.
>
> Reposting updated patch (for brevity, without testcases) below.
I'm ok with the patch and count on you to fix eventual fallout ;)
Thanks,
Richard.
next prev parent reply other threads:[~2015-05-26 10:39 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-03-27 14:10 [PATCH, stage1][PR65443] " Tom de Vries
2015-04-03 12:40 ` Tom de Vries
2015-04-15 13:39 ` [PING][PATCH][PR65443] " Tom de Vries
2015-04-20 12:26 ` Richard Biener
2015-05-14 11:47 ` Tom de Vries
2015-05-26 11:05 ` Richard Biener [this message]
2015-06-04 8:37 ` Tom de Vries
2015-06-08 10:44 ` Tom de Vries
2015-06-08 11:08 ` Richard Biener
2015-06-08 15:59 ` Thomas Schwinge
2015-06-08 22:12 ` Tom de Vries
2015-06-09 7:47 ` Tom de Vries
2015-06-01 8:12 ` [gomp4,committed,PR65443] Add transform_to_exit_first_loop_al Tom de Vries
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=alpine.LSU.2.11.1505261232500.30088@zhemvz.fhfr.qr \
--to=rguenther@suse.de \
--cc=Tom_deVries@mentor.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=jakub@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).