public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug rtl-optimization/58461] New: [MIPS] Using LRA instead of reload increases code size for mips16
@ 2013-09-18 11:56 matthew.fortune at imgtec dot com
  2013-09-18 11:57 ` [Bug rtl-optimization/58461] " matthew.fortune at imgtec dot com
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: matthew.fortune at imgtec dot com @ 2013-09-18 11:56 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58461

            Bug ID: 58461
           Summary: [MIPS] Using LRA instead of reload increases code size
                    for mips16
           Product: gcc
           Version: 4.9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: matthew.fortune at imgtec dot com

Created attachment 30852
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30852&action=edit
Test case to trigger LRA reload issue

While working on enabling LRA for MIPS (mips16 in particular) I have observed
that LRA is not producing as optimal code as classic reload. The underlying
cause of this is that the register allocation decisions made in IRA were
sub-optimal but classic reload 'fixes' them whereas LRA does not. Regardless of
fixing the IRA issues there is probably something to fix in LRA.

I have attached a patch that applies to top of tree to enable LRA for
mips/mips16 and exposes two options to demonstrate how LRA differs from classic
reload. I have also attached a test case (reload_test_mips16.i) which is a
function from libjpeg.

The two options added by the patch are -mreload and -mfix-regalloc. LRA is
default on with the patch applied:

* mips-sde-elf-gcc -Os -mips16 -march=m4kec reload_test_mips16.i

LRA introduces a number of reloads that involve $24 which is inaccessible to
most mips16 instructions leading to an increase in code size. 

* mips-sde-elf-gcc -Os -mips16 -march=m4kec -mreload ...

Classic reload manages to avoid the reloads of $24 as its reloads converge to
use the same reload register and eliminate $24 altogether.

* mips-sde-elf-gcc -Os -mips16 -march=m4kec -mfix-regalloc ...

LRA now outperforms classic reload as the initial register allocation by IRA is
better so LRA does not hit the problem I am reporting.

The original register allocation from IRA is:
Disposition:
   26:r245 l0     2    2:r246 l0     4   28:r249 l0     2    3:r250 l0    16
    4:r251 l0    17    5:r252 l0     8    6:r253 l0     9   19:r260 l0    11
   15:r266 l0    24   12:r275 l0     3   11:r276 l0     2   10:r278 l0    10
   27:r281 l0     4    7:r282 l0     5    8:r283 l0     6    1:r284 l0     7
   29:r285 l0   mem   24:r286 l0    24   25:r287 l0     2   23:r288 l0    24
   22:r289 l0    24   21:r290 l0    24   20:r291 l0    24   18:r292 l0    24
   17:r293 l0    11   16:r294 l0    11   14:r295 l0    11   13:r296 l0    24
    9:r297 l0    24    0:r298 l0    24

The fixed register allocation from IRA is as follows, note the mems instead of
hard registers 8-11,24:
Disposition:
   26:r245 l0     2    2:r246 l0     4   28:r249 l0     2    3:r250 l0   mem
    4:r251 l0   mem    5:r252 l0    16    6:r253 l0    17   19:r260 l0   mem
   15:r266 l0   mem   12:r275 l0     3   11:r276 l0     2   10:r278 l0   mem
   27:r281 l0     4    7:r282 l0     5    8:r283 l0     6    1:r284 l0     7
   29:r285 l0   mem   24:r286 l0   mem   25:r287 l0     2   23:r288 l0   mem
   22:r289 l0   mem   21:r290 l0   mem   20:r291 l0   mem   18:r292 l0   mem
   17:r293 l0   mem   16:r294 l0   mem   14:r295 l0   mem   13:r296 l0   mem
    9:r297 l0    24    0:r298 l0    24

So the issue (I believe) is that reloads from LRA do not converge as well as
reloads introduced by classic reload. While this example can (and should) be
fixed in the original register allocation it feels as though there is a problem
to fix in LRA.

==============

[mfortune@mfortune-linux lra_bugreport]$ /althome/mips/tk/bin/mips-sde-elf-gcc
-v
Using built-in specs.
COLLECT_GCC=/althome/mips/tk/bin/mips-sde-elf-gcc
COLLECT_LTO_WRAPPER=/althome/mips/tk/libexec/gcc/mips-sde-elf/4.9.0/lto-wrapper
Target: mips-sde-elf
Configured with: /althome/mips/git_br/gcc/configure --prefix=/althome/mips/tk
--target=mips-sde-elf --with-gnu-as --with-gnu-ld --with-arch=mips32r2
--with-mips-plt --with-synci --with-llsc --with-newlib
target_alias=mips-sde-elf --enable-languages=c,c++,lto
Thread model: single
gcc version 4.9.0 20130918 (experimental) (GCC)


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Bug rtl-optimization/58461] [MIPS] Using LRA instead of reload increases code size for mips16
  2013-09-18 11:56 [Bug rtl-optimization/58461] New: [MIPS] Using LRA instead of reload increases code size for mips16 matthew.fortune at imgtec dot com
@ 2013-09-18 11:57 ` matthew.fortune at imgtec dot com
  2013-09-19 18:40 ` rsandifo at gcc dot gnu.org
  2013-09-25 20:21 ` matthew.fortune at imgtec dot com
  2 siblings, 0 replies; 4+ messages in thread
From: matthew.fortune at imgtec dot com @ 2013-09-18 11:57 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58461

--- Comment #1 from Matthew Fortune <matthew.fortune at imgtec dot com> ---
Created attachment 30853
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30853&action=edit
Patch to enable LRA for mips16


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Bug rtl-optimization/58461] [MIPS] Using LRA instead of reload increases code size for mips16
  2013-09-18 11:56 [Bug rtl-optimization/58461] New: [MIPS] Using LRA instead of reload increases code size for mips16 matthew.fortune at imgtec dot com
  2013-09-18 11:57 ` [Bug rtl-optimization/58461] " matthew.fortune at imgtec dot com
@ 2013-09-19 18:40 ` rsandifo at gcc dot gnu.org
  2013-09-25 20:21 ` matthew.fortune at imgtec dot com
  2 siblings, 0 replies; 4+ messages in thread
From: rsandifo at gcc dot gnu.org @ 2013-09-19 18:40 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58461

rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rsandifo at gcc dot gnu.org

--- Comment #2 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
Thanks for working on this.  As you mentioned in the gcc@ message,
there are two ways of looking at the problem: the backend costs are
wrong or LRA isn't handling the reloads as well (or both, for a third).

I think it'd be wrong for the backend to say that moves between
MIPS16 registers and other general registers are more expensive
than memory though.  Doing that could in theory pessimise post-
reload optimisation (postreload.c also compares memory and
register costs).  Other passes might look at the costs too in
future.  And a move in or out of a MIPS16 register really is as
cheap as a move between MIPS16 registers.

But of course, although the moves are cheap, registers other than
$2-$7, $16, $17 and $24 are only useful as spill space.  So if
it's better for IRA to ignore them and LRA to use them via
TARGET_SPILL_CLASS, perhaps we should enforce that directly,
by hiding other registers from IRA.  I suppose that's like
restoring the old cover classes hook, but only as an optional
feature.

I realise this isn't the point of the bug report or the attachment,
but just FWIW: the constraints shouldn't be matching the fake
FRAME_POINTER_REGNUM.  They should wait for it to be eliminated
to either STACK_POINTER_REGNUM or HARD_FRAME_POINTER_REGNUM and
match that.


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Bug rtl-optimization/58461] [MIPS] Using LRA instead of reload increases code size for mips16
  2013-09-18 11:56 [Bug rtl-optimization/58461] New: [MIPS] Using LRA instead of reload increases code size for mips16 matthew.fortune at imgtec dot com
  2013-09-18 11:57 ` [Bug rtl-optimization/58461] " matthew.fortune at imgtec dot com
  2013-09-19 18:40 ` rsandifo at gcc dot gnu.org
@ 2013-09-25 20:21 ` matthew.fortune at imgtec dot com
  2 siblings, 0 replies; 4+ messages in thread
From: matthew.fortune at imgtec dot com @ 2013-09-25 20:21 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58461

--- Comment #3 from Matthew Fortune <matthew.fortune at imgtec dot com> ---
(In reply to rsandifo@gcc.gnu.org from comment #2)
> I think it'd be wrong for the backend to say that moves between
> MIPS16 registers and other general registers are more expensive
> than memory though.  

I'm not surprised by that comment. I agree. It was just an option I found that
avoided the need for changes to IRA.

> But of course, although the moves are cheap, registers other than
> $2-$7, $16, $17 and $24 are only useful as spill space.  So if
> it's better for IRA to ignore them and LRA to use them via
> TARGET_SPILL_CLASS, perhaps we should enforce that directly,
> by hiding other registers from IRA.  I suppose that's like
> restoring the old cover classes hook, but only as an optional
> feature.

One of the problems here is that IRA is coming up with M16_REGS as the
preferred class and GR_REGS as the alternate class for most pseudos. If there
is no cover class to translate GR_REGS into then that seems like a problem
unless it would work to use M16_REGS as a cover class for GR_REGS.
My initial attempt at this problem was to stop IRA from using the superunion of
preferred and alternate classes for an allocno and instead just using the
preferred class. This has the same effect as the change to register move costs
which was to get the alternate class to be NO_REGS (or ignored).

> I realise this isn't the point of the bug report or the attachment,
> but just FWIW: the constraints shouldn't be matching the fake
> FRAME_POINTER_REGNUM.  They should wait for it to be eliminated
> to either STACK_POINTER_REGNUM or HARD_FRAME_POINTER_REGNUM and
> match that.

I'll look at that when putting an 'enable LRA' patch together. There's lots of
testing to do yet.


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2013-09-25 20:21 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-09-18 11:56 [Bug rtl-optimization/58461] New: [MIPS] Using LRA instead of reload increases code size for mips16 matthew.fortune at imgtec dot com
2013-09-18 11:57 ` [Bug rtl-optimization/58461] " matthew.fortune at imgtec dot com
2013-09-19 18:40 ` rsandifo at gcc dot gnu.org
2013-09-25 20:21 ` matthew.fortune at imgtec dot com

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).