public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Biener <richard.guenther@gmail.com>
To: Aldy Hernandez <aldyh@redhat.com>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PR middle-end/70359] uncoalesce IVs outside of loops
Date: Tue, 20 Mar 2018 13:43:00 -0000	[thread overview]
Message-ID: <CAFiYyc2KK1mAhBYQ62Ys41-07c1i66jKSCUJ+Y4LQHx2D=d1FQ@mail.gmail.com> (raw)
In-Reply-To: <9f418de9-7a65-7a25-2bf4-d4bec6681211@redhat.com>

On Mon, Mar 19, 2018 at 6:08 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
> Hi Richard.
>
> As discussed in the PR, the problem here is that we have two different
> iterations of an IV live outside of a loop.  This inhibits us from using
> autoinc/dec addressing on ARM, and causes extra lea's on x86.
>
> An abbreviated example is this:
>
> loop:
>   # p_9 = PHI <p_17(2), p_20(3)>
>   p_20 = p_9 + 18446744073709551615;
> goto loop
>   p_24 = p_9 + 18446744073709551614;
>   MEM[(char *)p_20 + -1B] = 45;
>
> Here we have both the previous IV (p_9) and the current IV (p_20) used
> outside of the loop.  On Arm this keeps us from using auto-dec addressing,
> because one use is -2 and the other one is -1.
>
> With the attached patch we attempt to rewrite out-of-loop uses of the IV in
> terms of the current/last IV (p_20 in the case above).  With it, we end up
> with:
>
>   p_24 = p_20 + 18446744073709551615;
>   *p_24 = 45;
>
> ...which helps both x86 and Arm.
>
> As you have suggested in comment 38 on the PR, I handle specially
> out-of-loop IV uses of the form IV+CST and propagate those accordingly
> (along with the MEM_REF above).  Otherwise, in less specific cases, we un-do
> the IV increment, and use that value in all out-of-loop uses.  For instance,
> in the attached testcase, we rewrite:
>
>   george (p_9);
>
> into
>
>   _26 = p_20 + 1;
>   ...
>   george (_26);
>
> The attached testcase tests the IV+CST specific case, as well as the more
> generic case with george().
>
> Although the original PR was for ARM, this behavior can be noticed on x86,
> so I tested on x86 with a full bootstrap + tests.  I also ran the specific
> test on an x86 cross ARM build and made sure we had 2 auto-dec with the
> test.  For the original test (slightly different than the testcase in this
> patch), with this patch we are at 104 bytes versus 116 without it.  There is
> still the issue of a division optimization which would further reduce the
> code size.  I will discuss this separately as it is independent from this
> patch.
>
> Oh yeah, we could make this more generic, and maybe handle any multiple of
> the constant, or perhaps *= and /=.  Perhaps something for next stage1...
>
> OK for trunk?

Sorry for noticing so late but you use loop properties like ->latch and
->loop_father (via flow_bb_inside_loop_p) that may not be valid at
this point given expand doesn't do loop_optimizer_init/finalize.  Generally
you may face loops with multiple latches (->latch == NULL) here and
loops may have multiple exits.  You probably are not hitting any
of those problems because is_iv_plus_constant is quite restrictive
(doesn't handle PLUS_EXPR or MINUS_EXPR).

Also the latch doesn't need to be the block with the exit conditional.

So first of all you'd need to wrap insert_backedge_copies () with

 loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
...
 loop_optimizer_finalize ();

that is required to fixup an eventually broken state.  Then you
probably should restrict handling to PHIs in BBs with
bb->loop_father->header == bb (aka loop header PHIs),
use get_loop_exit_edges when the IV increment is of OK form
and then use dominator info to query in which exit block to
insert compensation (or simply refuse to do anything for
loops with multiple exits).

So if you have more cycles to work on this that would be nice,
otherwise I can certainly take over from here...

Thanks,
Richard.

> Aldy

  reply	other threads:[~2018-03-20 13:36 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-03-19 17:55 Aldy Hernandez
2018-03-20 13:43 ` Richard Biener [this message]
2018-03-28 11:52   ` Richard Biener
2018-03-28 14:35     ` Richard Biener
2018-03-20 17:12 ` Bin.Cheng
2018-03-20 18:00   ` Richard Biener
2018-03-20 18:18     ` Bin.Cheng
2018-03-21  9:28       ` Richard Biener

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAFiYyc2KK1mAhBYQ62Ys41-07c1i66jKSCUJ+Y4LQHx2D=d1FQ@mail.gmail.com' \
    --to=richard.guenther@gmail.com \
    --cc=aldyh@redhat.com \
    --cc=gcc-patches@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).