[Bug rtl-optimization/98782] [11/12 Regression] Bad interaction between IPA frequences and IRA resulting in spills due to changes in BB frequencies

From: "tnfchris at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/98782] [11/12 Regression] Bad interaction between IPA frequences and IRA resulting in spills due to changes in BB frequencies
Date: Tue, 07 Dec 2021 23:52:49 +0000
Message-ID: <bug-98782-4-x3OX8uNnn7@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-98782-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98782

--- Comment #15 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
> That is, we're trading two memory accesses in the call branch
> (if we allocate R) against one memory access in both branches
> (if we spill R).  As the call branch gets more likely,
> the cost of doing two memory accesses there gets higher
> relative to the cost of doing one memory access in both branches.
> And that seems like the right behaviour in principle.
> 
> From that point of view, it doesn't look like the memory and register
> costs of R are too wrong here.  The things being costed are the store
> and load around the call (which do exist if we allocate a call-clobbered
> register) and the loads at each use site (which do exist if we spill R).

Indeed, I don't think the heuristics are wrong.  The problem is that one
frequency, CALL_FREQ, grows much quicker than BB_FREQ, and at the smaller values
both are quite sensitive to any change.  The edge probabilities can barely
change while the BB_FREQ changes dramatically.
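
To make the quoted trade-off concrete, here is a minimal sketch in Python of the
per-iteration cost comparison.  It is a toy model, not IRA's actual costing: the
function names and the unit cost of 1 per memory access are assumptions made
purely for illustration.

```python
def allocate_cost(call_prob, mem_cost=1.0):
    """R gets a call-clobbered register: a store before and a load after
    the call, paid only on the branch that contains the call."""
    return 2 * mem_cost * call_prob


def spill_cost(call_prob, mem_cost=1.0):
    """R is spilled: one load at the use site in each of the two branches."""
    return mem_cost * call_prob + mem_cost * (1.0 - call_prob)


def cheaper_choice(call_prob):
    """Pick whichever option has the lower expected cost per loop iteration."""
    if allocate_cost(call_prob) < spill_cost(call_prob):
        return "allocate"
    return "spill"
```

In this model the break-even point is a call probability of 0.5, so a shift
from 0.49 to 0.51 is already enough to flip the decision from allocating R to
spilling it.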

> 
> Like Feng Xue says in comment 1, I think the main missed optimisation
> opportunity here is that foo + 1024 is invariant, so if we allocate
> a call-clobbered register, we could save R once outside the loop
> and reload it after each call.  That would give:
> 
> - a store of R outside the loop (low execution count)
> - a load of R inside the loop (after the call) with freq 0.51 * loop iters
> 

Yes, that is the ideal solution, but also requires more changes to RA.
Instead I've chosen a middle ground here (same as yours but done in
ira_tune_allocno_costs instead), which is to store and load only inside
the loop, but to do so only in the BB which contains the call.

This is a major improvement over the current situation, because when you
have several nested loops and the value is invariant across a number of them,
you run into problems when each of these BBs naturally has very high register
pressure.

As you say:

> - a store of R outside the loop (low execution count)
> - a load of R inside the loop (after the call) with freq 0.51 * loop iters
> - a load of R inside the loop with freq 0.49 * loop iters

and if the loop has several BBs (like a long if/then/elseif/else chain), the
load has to happen in every BB in the loop.  That's why we currently get such a
large number of spills.

By forcing it to spill only in the BB with the call inside the loop the other
BBs are freed from all the loads.
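
As a rough illustration of why this helps, the sketch below compares the two
strategies on a loop body made of several BBs.  It is a toy model under stated
assumptions: the `Block` type, its fields, and the frequencies are hypothetical
and are not IRA data structures, and every memory access is given the same cost.

```python
from dataclasses import dataclass


@dataclass
class Block:
    name: str
    freq: float      # relative execution frequency of the BB (hypothetical)
    uses_r: bool     # the BB reads R
    has_call: bool   # the BB contains a call that clobbers R's register


def mem_ops_if_spilled(blocks):
    """R lives in memory: every BB that uses R has to reload it."""
    return sum(b.freq for b in blocks if b.uses_r)


def mem_ops_if_saved_around_call(blocks):
    """R stays in a call-clobbered register: one store and one load,
    but only in the BBs that actually contain a call."""
    return sum(2 * b.freq for b in blocks if b.has_call)


# A long if/elseif/else chain where only one arm contains the call.
chain = [
    Block("if", 0.3, True, False),
    Block("elseif", 0.3, True, False),
    Block("else", 0.3, True, False),
    Block("call", 0.1, True, True),
]
```

For `chain`, spilling R costs a load in all four BBs, while saving and
restoring around the call only touches the BB with frequency 0.1.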

> If we force R to be allocated a call-clobbered register instead
> of being spilled (and changing nothing else, via a hack to
> ira-color.c:improve_allocation) then we generate:
> 
> - a store of R inside the loop (before the call) with freq 0.51 * loop iters
> - a load of R inside the loop (after the call) with freq 0.51 * loop iters

I essentially did the same thing, but I think in a more conservative way.  When
there is just a single call inside the entire loop, I force it to be assigned a
call-clobbered register if it needs one.  This removes the loads with freq 0.49
but leaves the ones with freq 0.51.

I use call counts as the measure here because with 1 call and multiple BB
inside the live range you know that at most 1 BB will have a call and so the
rest won't have any.
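
The reasoning behind using the call count can be sketched as follows.  This is
only an illustration of the argument: `call_free_blocks` and its input mapping
are hypothetical, and the actual change works on IRA's own data, which is not
shown here.

```python
def call_free_blocks(calls_per_block):
    """calls_per_block maps each BB in R's live range to the number of
    calls it contains.  With exactly one call in total, exactly one BB has
    a call, so every other BB needs no save/restore of R.  Returns those
    call-free BBs, or None when the single-call shortcut does not apply."""
    total_calls = sum(calls_per_block.values())
    if total_calls != 1:
        return None
    return [bb for bb, calls in calls_per_block.items() if calls == 0]
```

For example, `call_free_blocks({"bb2": 0, "bb3": 1, "bb4": 0})` yields
`["bb2", "bb4"]`, whereas a live range crossing two calls returns `None`.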

Essentially, if you have high register pressure, make only the part that
increases it pay for the spills.  As you say it's not perfect, but it's a
conservative improvement over the current situation.

> which is cheaper than both the current approaches.  We don't do that
> optimisation yet though, so the current costing seems to reflect what we
> currently generate.

On many (if not most) architectures, stores are significantly cheaper than
loads though.  So the store before the call doesn't end up making that much of a
difference, but yes, it adds up if you have many of them.

So indeed removing it is optimal, but that seems like a very hard one to do.  I
would assume that the live range for the loop starts at the body of the loop.
So I would imagine it's very hard to tell reload to spill outside of the
allocnos it's currently allocating for?

> 
> I don't know how well the above translates to the original example
> though.  Are some of the spilled values in exchange loop-invariant
> as well?

Yes, I believe so.  It's a bit hard for me to tell since the functions are huge
and have many nested loops...  But in RTL the BBs are quite large as well after
the constprop and recursive inlining stuff.

But the behaviour is consistent with the minimal problem here.
