public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Biener <rguenther@suse.de>
To: Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org>
Cc: gcc@gcc.gnu.org, Bin Cheng <Bin.Cheng@arm.com>,
	    Thomas Preudhomme <thomas.preudhomme@linaro.org>
Subject: Re: PR80155: Code hoisting and register pressure
Date: Wed, 23 May 2018 08:29:00 -0000	[thread overview]
Message-ID: <alpine.LSU.2.20.1805231011350.24704@zhemvz.fhfr.qr> (raw)
In-Reply-To: <CAAgBjMndG3T40NnF6-SZkzyF_UzQwzXGT6HcnMJr0HfBD5mTKg@mail.gmail.com>

On Wed, 23 May 2018, Prathamesh Kulkarni wrote:

> Hi,
> I am trying to work on PR80155, which exposes a problem with code
> hoisting and register pressure on a leading embedded benchmark for ARM
> cortex-m7, where code-hoisting causes an extra register spill.
> 
> I have attached two test-cases which (hopefully) are representative of
> the original test-case.
> The first one (trans_dfa.c) is bigger and somewhat similar to the
> original test-case and trans_dfa_2.c is hand-reduced version of
> trans_dfa.c. There's 2 spills caused with trans_dfa.c
> and one spill with trans_dfa_2.c due to lesser amount of cases.
> The test-cases in the PR are probably not relevant.
> 
> Initially I thought the spill was happening because of "too many
> hoistings" taking place in original test-case thus increasing the
> register pressure, but it seems the spill is possibly caused because
> expression gets hoisted out of a block that is on loop exit.
> 
> For example, the following hoistings take place with trans_dfa_2.c:
> 
> (1) Inserting expression in block 4 for code hoisting:
> {mem_ref<0B>,tab_20(D)}@.MEM_45 (0005)
> 
> (2) Inserting expression in block 4 for code hoisting: {plus_expr,_4,1} (0006)
> 
> (3) Inserting expression in block 4 for code hoisting:
> {pointer_plus_expr,s_33,1} (0023)
> 
> (4) Inserting expression in block 3 for code hoisting:
> {pointer_plus_expr,s_33,1} (0023)
> 
> The issue seems to be hoisting of (*tab + 1) which consists of first
> two hoistings in block 4
> from blocks 5 and 9, which causes the extra spill. I verified that by
> disabling hoisting into block 4,
> which resulted in no extra spills.
> 
> I wonder if that's because the expression (*tab + 1) is getting
> hoisted from blocks 5 and 9,
> which are on loop exit ? So the expression that was previously
> computed in a block on loop exit, gets hoisted outside that block
> which possibly makes the allocator more defensive ? Similarly
> disabling hoisting of expressions which appeared in blocks on loop
> exit in original test-case prevented the extra spill. The other
> hoistings didn't seem to matter.

I think that's simply co-incidence.  The only thing that makes
a block that also exits from the loop special is that an
expression could be sunk out of the loop and hoisting (commoning
with another path) could prevent that.  But that isn't what is
happening here and it would be a pass ordering issue as
the sinking pass runs only after hoisting (no idea why exactly
but I guess there are cases where we want to prefer CSE over
sinking).  So you could try if re-ordering PRE and sinking helps
your testcase.

What I do see is a missed opportunity to merge the successors
of BB 4.  After PRE we have

<bb 4> [local count: 159303558]:
<L1>:
pretmp_123 = *tab_37(D);
_87 = pretmp_123 + 1;
if (c_36 == 65)
  goto <bb 5>; [34.00%]
else
  goto <bb 8>; [66.00%]

<bb 5> [local count: 54163210]:
*tab_37(D) = _87;
_96 = MEM[(char *)s_57 + 1B];
if (_96 != 0)
  goto <bb 7>; [89.00%]
else
  goto <bb 6>; [11.00%]

<bb 8> [local count: 105140348]:
*tab_37(D) = _87;
_56 = MEM[(char *)s_57 + 1B];
if (_56 != 0)
  goto <bb 10>; [89.00%]
else
  goto <bb 9>; [11.00%]

here at least the stores and loads can be hoisted.  Note this
may also point at the real issue of the code hoisting which is
tearing apart the RMW operation?

> If that's the case, would it make sense to add another constraint to
> hoisting to not hoist expression if it appears in a block that is on
> loop exit (exception is if block has only single successor), at-least
> for targets like cortex-m7 where the effect of spill is more
> pronounced ?
> 
> I tried to write an untested prototype patch (approach-8.diff) on
> these lines, to refuse hoisting if an expression appears in block on
> loop exit. The patch adds a new map pre_expr_block_d, that maps
> pre_expr to the set of blocks (as a bitmap) it appears in and are on
> loop exit, which is computed during compute_avail.
> During do_hoist_insertion, it simply checks if the bitmap of blocks is
> not empty.
> It works for the attached test-cases and passes ssa-pre-*.c and
> ssa-hoist-*.c tests.

As said to me the heuristic doesn't make much sense.  There is
btw loop_exits_from_bb_p ().

If the issue in the end is a RA one (that is, there _is_ a better
allocation possible?) then you may also want to look at out-of-SSA
coalescing.

So overall I'm not convinced the hoisting decision is wrong.
It doesn't increase register pressure at all.  It does expose
a missed store hoisting and load CSE opportunity (but we don't
have a way to "PRE" or hoist stores at the moment).  Stores
do not fit data flow problems well given they need to be kept
in order with respect to other stores and loads and appearantly
*tab aliases *s (yeah, s is char *...  make tab restrict and I
guess things will work much smoother).

Richard.

> Alternatively, we could restrict replacing expression by it's leader
> in eliminate_dom_walker::before_dom_children if the expression appears
> in a block on loop exit.
> In principle this is more general than hoisting, but I have restricted
> it to only hoisted expressions to avoid being overly conservative.
> Similarly, this constrained could be made conditional, only for
> targets like cortex-m7. I have attached a prototype patch based on
> this approach (approach-9.diff). Although it works for attached
> test-cases, unfortunately it ICE's with the original test-case in
> tail-merge, I am working on that.
> 
> I am not sure though if either of these approaches are in the correct
> direction and would
> be grateful for suggestions on the issue!
> 
> Thanks,
> Prathamesh
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

  reply	other threads:[~2018-05-23  8:29 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-05-23  7:27 Prathamesh Kulkarni
2018-05-23  8:29 ` Richard Biener [this message]
2018-05-23  9:20   ` Prathamesh Kulkarni
2018-05-23 13:07     ` Jeff Law
2018-05-25  9:23       ` Prathamesh Kulkarni
2018-05-25  9:49         ` Bin.Cheng
2018-05-25  9:58           ` Richard Biener
2018-05-25 16:57           ` Jeff Law
2018-05-25 17:54             ` Richard Biener
2018-05-25 19:26               ` Jeff Law
2018-05-26  6:10                 ` Richard Biener
2018-05-26 18:07               ` Bin.Cheng
2018-05-28 15:22                 ` Richard Biener
2018-05-23  9:40   ` Bin.Cheng

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.LSU.2.20.1805231011350.24704@zhemvz.fhfr.qr \
    --to=rguenther@suse.de \
    --cc=Bin.Cheng@arm.com \
    --cc=gcc@gcc.gnu.org \
    --cc=prathamesh.kulkarni@linaro.org \
    --cc=thomas.preudhomme@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).