public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug rtl-optimization/58461] New: [MIPS] Using LRA instead of reload increases code size for mips16
@ 2013-09-18 11:56 matthew.fortune at imgtec dot com
2013-09-18 11:57 ` [Bug rtl-optimization/58461] " matthew.fortune at imgtec dot com
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: matthew.fortune at imgtec dot com @ 2013-09-18 11:56 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58461
Bug ID: 58461
Summary: [MIPS] Using LRA instead of reload increases code size
for mips16
Product: gcc
Version: 4.9.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: matthew.fortune at imgtec dot com
Created attachment 30852
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30852&action=edit
Test case to trigger LRA reload issue
While working on enabling LRA for MIPS (mips16 in particular) I have observed
that LRA is not producing as optimal code as classic reload. The underlying
cause of this is that the register allocation decisions made in IRA were
sub-optimal but classic reload 'fixes' them whereas LRA does not. Regardless of
fixing the IRA issues there is probably something to fix in LRA.
I have attached a patch that applies to top of tree to enable LRA for
mips/mips16 and exposes two options to demonstrate how LRA differs from classic
reload. I have also attached a test case (reload_test_mips16.i) which is a
function from libjpeg.
The two options added by the patch are -mreload and -mfix-regalloc. LRA is
default on with the patch applied:
* mips-sde-elf-gcc -Os -mips16 -march=m4kec reload_test_mips16.i
LRA introduces a number of reloads that involve $24 which is inaccessible to
most mips16 instructions leading to an increase in code size.
* mips-sde-elf-gcc -Os -mips16 -march=m4kec -mreload ...
Classic reload manages to avoid the reloads of $24 as its reloads converge to
use the same reload register and eliminate $24 altogether.
* mips-sde-elf-gcc -Os -mips16 -march=m4kec -mfix-regalloc ...
LRA now outperforms classic reload as the initial register allocation by IRA is
better so LRA does not hit the problem I am reporting.
The original register allocation from IRA is:
Disposition:
26:r245 l0 2 2:r246 l0 4 28:r249 l0 2 3:r250 l0 16
4:r251 l0 17 5:r252 l0 8 6:r253 l0 9 19:r260 l0 11
15:r266 l0 24 12:r275 l0 3 11:r276 l0 2 10:r278 l0 10
27:r281 l0 4 7:r282 l0 5 8:r283 l0 6 1:r284 l0 7
29:r285 l0 mem 24:r286 l0 24 25:r287 l0 2 23:r288 l0 24
22:r289 l0 24 21:r290 l0 24 20:r291 l0 24 18:r292 l0 24
17:r293 l0 11 16:r294 l0 11 14:r295 l0 11 13:r296 l0 24
9:r297 l0 24 0:r298 l0 24
The fixed register allocation from IRA is as follows, note the mems instead of
hard registers 8-11,24:
Disposition:
26:r245 l0 2 2:r246 l0 4 28:r249 l0 2 3:r250 l0 mem
4:r251 l0 mem 5:r252 l0 16 6:r253 l0 17 19:r260 l0 mem
15:r266 l0 mem 12:r275 l0 3 11:r276 l0 2 10:r278 l0 mem
27:r281 l0 4 7:r282 l0 5 8:r283 l0 6 1:r284 l0 7
29:r285 l0 mem 24:r286 l0 mem 25:r287 l0 2 23:r288 l0 mem
22:r289 l0 mem 21:r290 l0 mem 20:r291 l0 mem 18:r292 l0 mem
17:r293 l0 mem 16:r294 l0 mem 14:r295 l0 mem 13:r296 l0 mem
9:r297 l0 24 0:r298 l0 24
So the issue (I believe) is that reloads from LRA do not converge as well as
reloads introduced by classic reload. While this example can (and should) be
fixed in the original register allocation it feels as though there is a problem
to fix in LRA.
==============
[mfortune@mfortune-linux lra_bugreport]$ /althome/mips/tk/bin/mips-sde-elf-gcc
-v
Using built-in specs.
COLLECT_GCC=/althome/mips/tk/bin/mips-sde-elf-gcc
COLLECT_LTO_WRAPPER=/althome/mips/tk/libexec/gcc/mips-sde-elf/4.9.0/lto-wrapper
Target: mips-sde-elf
Configured with: /althome/mips/git_br/gcc/configure --prefix=/althome/mips/tk
--target=mips-sde-elf --with-gnu-as --with-gnu-ld --with-arch=mips32r2
--with-mips-plt --with-synci --with-llsc --with-newlib
target_alias=mips-sde-elf --enable-languages=c,c++,lto
Thread model: single
gcc version 4.9.0 20130918 (experimental) (GCC)
^ permalink raw reply [flat|nested] 4+ messages in thread
* [Bug rtl-optimization/58461] [MIPS] Using LRA instead of reload increases code size for mips16
2013-09-18 11:56 [Bug rtl-optimization/58461] New: [MIPS] Using LRA instead of reload increases code size for mips16 matthew.fortune at imgtec dot com
@ 2013-09-18 11:57 ` matthew.fortune at imgtec dot com
2013-09-19 18:40 ` rsandifo at gcc dot gnu.org
2013-09-25 20:21 ` matthew.fortune at imgtec dot com
2 siblings, 0 replies; 4+ messages in thread
From: matthew.fortune at imgtec dot com @ 2013-09-18 11:57 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58461
--- Comment #1 from Matthew Fortune <matthew.fortune at imgtec dot com> ---
Created attachment 30853
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30853&action=edit
Patch to enable LRA for mips16
^ permalink raw reply [flat|nested] 4+ messages in thread
* [Bug rtl-optimization/58461] [MIPS] Using LRA instead of reload increases code size for mips16
2013-09-18 11:56 [Bug rtl-optimization/58461] New: [MIPS] Using LRA instead of reload increases code size for mips16 matthew.fortune at imgtec dot com
2013-09-18 11:57 ` [Bug rtl-optimization/58461] " matthew.fortune at imgtec dot com
@ 2013-09-19 18:40 ` rsandifo at gcc dot gnu.org
2013-09-25 20:21 ` matthew.fortune at imgtec dot com
2 siblings, 0 replies; 4+ messages in thread
From: rsandifo at gcc dot gnu.org @ 2013-09-19 18:40 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58461
rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |rsandifo at gcc dot gnu.org
--- Comment #2 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
Thanks for working on this. As you mentioned in the gcc@ message,
there are two ways of looking at the problem: the backend costs are
wrong or LRA isn't handling the reloads as well (or both, for a third).
I think it'd be wrong for the backend to say that moves between
MIPS16 registers and other general registers are more expensive
than memory though. Doing that could in theory pessimise post-
reload optimisation (postreload.c also compares memory and
register costs). Other passes might look at the costs too in
future. And a move in or out of a MIPS16 register really is as
cheap as a move between MIPS16 registers.
But of course, although the moves are cheap, registers other than
$2-$7, $16, $17 and $24 are only useful as spill space. So if
it's better for IRA to ignore them and LRA to use them via
TARGET_SPILL_CLASS, perhaps we should enforce that directly,
by hiding other registers from IRA. I suppose that's like
restoring the old cover classes hook, but only as an optional
feature.
I realise this isn't the point of the bug report or the attachment,
but just FWIW: the constraints shouldn't be matching the fake
FRAME_POINTER_REGNUM. They should wait for it to be eliminated
to either STACK_POINTER_REGNUM or HARD_FRAME_POINTER_REGNUM and
match that.
^ permalink raw reply [flat|nested] 4+ messages in thread
* [Bug rtl-optimization/58461] [MIPS] Using LRA instead of reload increases code size for mips16
2013-09-18 11:56 [Bug rtl-optimization/58461] New: [MIPS] Using LRA instead of reload increases code size for mips16 matthew.fortune at imgtec dot com
2013-09-18 11:57 ` [Bug rtl-optimization/58461] " matthew.fortune at imgtec dot com
2013-09-19 18:40 ` rsandifo at gcc dot gnu.org
@ 2013-09-25 20:21 ` matthew.fortune at imgtec dot com
2 siblings, 0 replies; 4+ messages in thread
From: matthew.fortune at imgtec dot com @ 2013-09-25 20:21 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58461
--- Comment #3 from Matthew Fortune <matthew.fortune at imgtec dot com> ---
(In reply to rsandifo@gcc.gnu.org from comment #2)
> I think it'd be wrong for the backend to say that moves between
> MIPS16 registers and other general registers are more expensive
> than memory though.
I'm not surprised by that comment. I agree. It was just an option I found that
avoided the need for changes to IRA.
> But of course, although the moves are cheap, registers other than
> $2-$7, $16, $17 and $24 are only useful as spill space. So if
> it's better for IRA to ignore them and LRA to use them via
> TARGET_SPILL_CLASS, perhaps we should enforce that directly,
> by hiding other registers from IRA. I suppose that's like
> restoring the old cover classes hook, but only as an optional
> feature.
One of the problems here is that IRA is coming up with M16_REGS as the
preferred class and GR_REGS as the alternate class for most pseudos. If there
is no cover class to translate GR_REGS into then that seems like a problem
unless it would work to use M16_REGS as a cover class for GR_REGS.
My initial attempt at this problem was to stop IRA from using the superunion of
preferred and alternate classes for an allocno and instead just using the
preferred class. This has the same effect as the change to register move costs
which was to get the alternate class to be NO_REGS (or ignored).
> I realise this isn't the point of the bug report or the attachment,
> but just FWIW: the constraints shouldn't be matching the fake
> FRAME_POINTER_REGNUM. They should wait for it to be eliminated
> to either STACK_POINTER_REGNUM or HARD_FRAME_POINTER_REGNUM and
> match that.
I'll look at that when putting an 'enable LRA' patch together. There's lots of
testing to do yet.
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2013-09-25 20:21 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-09-18 11:56 [Bug rtl-optimization/58461] New: [MIPS] Using LRA instead of reload increases code size for mips16 matthew.fortune at imgtec dot com
2013-09-18 11:57 ` [Bug rtl-optimization/58461] " matthew.fortune at imgtec dot com
2013-09-19 18:40 ` rsandifo at gcc dot gnu.org
2013-09-25 20:21 ` matthew.fortune at imgtec dot com
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).