public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
From: "Bin.Cheng" <amker.cheng@gmail.com>
To: Richard Biener <rguenther@suse.de>
Cc: Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org>,
	GCC Development <gcc@gcc.gnu.org>, 	Bin Cheng <Bin.Cheng@arm.com>,
	Thomas Preudhomme <thomas.preudhomme@linaro.org>
Subject: Re: PR80155: Code hoisting and register pressure
Date: Wed, 23 May 2018 09:40:00 -0000	[thread overview]
Message-ID: <CAHFci28j6ZdxJex5miL=Li=xJ11y49K6T_20wuwpTZo6tO=7pA@mail.gmail.com> (raw)
In-Reply-To: <alpine.LSU.2.20.1805231011350.24704@zhemvz.fhfr.qr>

On Wed, May 23, 2018 at 9:28 AM, Richard Biener <rguenther@suse.de> wrote:
> On Wed, 23 May 2018, Prathamesh Kulkarni wrote:
>
>> Hi,
>> I am trying to work on PR80155, which exposes a problem with code
>> hoisting and register pressure on a leading embedded benchmark for ARM
>> cortex-m7, where code-hoisting causes an extra register spill.
>>
>> I have attached two test-cases which (hopefully) are representative of
>> the original test-case.
>> The first one (trans_dfa.c) is bigger and somewhat similar to the
>> original test-case and trans_dfa_2.c is hand-reduced version of
>> trans_dfa.c. There's 2 spills caused with trans_dfa.c
>> and one spill with trans_dfa_2.c due to lesser amount of cases.
>> The test-cases in the PR are probably not relevant.
>>
>> Initially I thought the spill was happening because of "too many
>> hoistings" taking place in original test-case thus increasing the
>> register pressure, but it seems the spill is possibly caused because
>> expression gets hoisted out of a block that is on loop exit.
>>
>> For example, the following hoistings take place with trans_dfa_2.c:
>>
>> (1) Inserting expression in block 4 for code hoisting:
>> {mem_ref<0B>,tab_20(D)}@.MEM_45 (0005)
>>
>> (2) Inserting expression in block 4 for code hoisting: {plus_expr,_4,1} (0006)
>>
>> (3) Inserting expression in block 4 for code hoisting:
>> {pointer_plus_expr,s_33,1} (0023)
>>
>> (4) Inserting expression in block 3 for code hoisting:
>> {pointer_plus_expr,s_33,1} (0023)
>>
>> The issue seems to be hoisting of (*tab + 1) which consists of first
>> two hoistings in block 4
>> from blocks 5 and 9, which causes the extra spill. I verified that by
>> disabling hoisting into block 4,
>> which resulted in no extra spills.
>>
>> I wonder if that's because the expression (*tab + 1) is getting
>> hoisted from blocks 5 and 9,
>> which are on loop exit ? So the expression that was previously
>> computed in a block on loop exit, gets hoisted outside that block
>> which possibly makes the allocator more defensive ? Similarly
>> disabling hoisting of expressions which appeared in blocks on loop
>> exit in original test-case prevented the extra spill. The other
>> hoistings didn't seem to matter.
>
> I think that's simply co-incidence.  The only thing that makes
> a block that also exits from the loop special is that an
> expression could be sunk out of the loop and hoisting (commoning
> with another path) could prevent that.  But that isn't what is
> happening here and it would be a pass ordering issue as
> the sinking pass runs only after hoisting (no idea why exactly
> but I guess there are cases where we want to prefer CSE over
> sinking).  So you could try if re-ordering PRE and sinking helps
> your testcase.
>
> What I do see is a missed opportunity to merge the successors
> of BB 4.  After PRE we have
>
> <bb 4> [local count: 159303558]:
> <L1>:
> pretmp_123 = *tab_37(D);
> _87 = pretmp_123 + 1;
> if (c_36 == 65)
>   goto <bb 5>; [34.00%]
> else
>   goto <bb 8>; [66.00%]
>
> <bb 5> [local count: 54163210]:
> *tab_37(D) = _87;
> _96 = MEM[(char *)s_57 + 1B];
> if (_96 != 0)
>   goto <bb 7>; [89.00%]
> else
>   goto <bb 6>; [11.00%]
>
> <bb 8> [local count: 105140348]:
> *tab_37(D) = _87;
> _56 = MEM[(char *)s_57 + 1B];
> if (_56 != 0)
>   goto <bb 10>; [89.00%]
> else
>   goto <bb 9>; [11.00%]
>
> here at least the stores and loads can be hoisted.  Note this
> may also point at the real issue of the code hoisting which is
> tearing apart the RMW operation?
>
>> If that's the case, would it make sense to add another constraint to
>> hoisting to not hoist expression if it appears in a block that is on
>> loop exit (exception is if block has only single successor), at-least
>> for targets like cortex-m7 where the effect of spill is more
>> pronounced ?
>>
>> I tried to write an untested prototype patch (approach-8.diff) on
>> these lines, to refuse hoisting if an expression appears in block on
>> loop exit. The patch adds a new map pre_expr_block_d, that maps
>> pre_expr to the set of blocks (as a bitmap) it appears in and are on
>> loop exit, which is computed during compute_avail.
>> During do_hoist_insertion, it simply checks if the bitmap of blocks is
>> not empty.
>> It works for the attached test-cases and passes ssa-pre-*.c and
>> ssa-hoist-*.c tests.
>
> As said to me the heuristic doesn't make much sense.  There is
> btw loop_exits_from_bb_p ().
>
> If the issue in the end is a RA one (that is, there _is_ a better
> allocation possible?) then you may also want to look at out-of-SSA
> coalescing.
>
> So overall I'm not convinced the hoisting decision is wrong.
> It doesn't increase register pressure at all.  It does expose
Not quite.  There are two hoisting in case of trans_dfa_2.c

For the first one:

Inserting expression in block 4 for code hoisting:
{mem_ref<0B>,tab_20(D)}@.MEM_45 (0005)
Inserted pretmp_30 = *tab_20(D);
 in predecessor 4 (0005)
Inserting expression in block 4 for code hoisting: {plus_expr,_4,1} (0006)
Inserted _12 = pretmp_30 + 1;
 in predecessor 4 (0006)

I agree hoisting separates RM from W, but for the second one:

Inserting expression in block 4 for code hoisting:
{pointer_plus_expr,s_33,1} (0023)
Inserted _14 = s_33 + 1;
 in predecessor 4 (0023)
Inserting expression in block 3 for code hoisting:
{pointer_plus_expr,s_33,1} (0023)
Inserted _62 = s_33 + 1;
 in predecessor 3 (0023)

It does increase register pressure because _62(originally s_34 and
s_27) is extended and s_33 is still alive.

Thanks,
bin

> a missed store hoisting and load CSE opportunity (but we don't
> have a way to "PRE" or hoist stores at the moment).  Stores
> do not fit data flow problems well given they need to be kept
> in order with respect to other stores and loads and appearantly
> *tab aliases *s (yeah, s is char *...  make tab restrict and I
> guess things will work much smoother).
>
> Richard.
>
>> Alternatively, we could restrict replacing expression by it's leader
>> in eliminate_dom_walker::before_dom_children if the expression appears
>> in a block on loop exit.
>> In principle this is more general than hoisting, but I have restricted
>> it to only hoisted expressions to avoid being overly conservative.
>> Similarly, this constrained could be made conditional, only for
>> targets like cortex-m7. I have attached a prototype patch based on
>> this approach (approach-9.diff). Although it works for attached
>> test-cases, unfortunately it ICE's with the original test-case in
>> tail-merge, I am working on that.
>>
>> I am not sure though if either of these approaches are in the correct
>> direction and would
>> be grateful for suggestions on the issue!
>>
>> Thanks,
>> Prathamesh
>>
>
> --
> Richard Biener <rguenther@suse.de>
> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

      parent reply	other threads:[~2018-05-23  9:40 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-05-23  7:27 Prathamesh Kulkarni
2018-05-23  8:29 ` Richard Biener
2018-05-23  9:20   ` Prathamesh Kulkarni
2018-05-23 13:07     ` Jeff Law
2018-05-25  9:23       ` Prathamesh Kulkarni
2018-05-25  9:49         ` Bin.Cheng
2018-05-25  9:58           ` Richard Biener
2018-05-25 16:57           ` Jeff Law
2018-05-25 17:54             ` Richard Biener
2018-05-25 19:26               ` Jeff Law
2018-05-26  6:10                 ` Richard Biener
2018-05-26 18:07               ` Bin.Cheng
2018-05-28 15:22                 ` Richard Biener
2018-05-23  9:40   ` Bin.Cheng [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAHFci28j6ZdxJex5miL=Li=xJ11y49K6T_20wuwpTZo6tO=7pA@mail.gmail.com' \
    --to=amker.cheng@gmail.com \
    --cc=Bin.Cheng@arm.com \
    --cc=gcc@gcc.gnu.org \
    --cc=prathamesh.kulkarni@linaro.org \
    --cc=rguenther@suse.de \
    --cc=thomas.preudhomme@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).