public inbox for gcc@gcc.gnu.org
* Indirect memory addresses vs. lra
@ 2019-08-04 19:18 John Darrington
  2019-08-08 16:25 ` Vladimir Makarov
  0 siblings, 1 reply; 38+ messages in thread
From: John Darrington @ 2019-08-04 19:18 UTC (permalink / raw)
  To: gcc

I'm trying to write a back-end for an architecture (s12z - the ISA you can 
download from [1]).  This arch accepts indirect memory addresses.   That is to 
say, those of the form (mem (mem (...)))  and although my TARGET_LEGITIMATE_ADDRESS_P
function returns true for such addresses, LRA insists on reloading them out of 
existence.

For example, when compiling a code fragment:

  volatile unsigned char *led = (volatile unsigned char *) 0x2F2;
  *led = 1;

the ira dump file shows:

(insn 7 6 8 2 (set (mem/f/c:PSI (reg/f:PSI 9 y) [3 led+0 S4 A8])
        (const_int 754 [0x2f2])) "/home/jmd/MemMem/memmem.c":15:27 96 {movpsi}
     (nil))
(insn 8 7 14 2 (set (mem/v:QI (mem/f/c:PSI (reg/f:PSI 9 y) [3 led+0 S4 A8]) [0 *led_7+0 S1 A8])
        (const_int 1 [0x1])) "/home/jmd/MemMem/memmem.c":16:8 98 {movqi}
     (nil))

which is a perfectly valid insn, and the most efficient assembler for it is:
mov.p #0x2f2, y
mov.b #1, [0,y]

However the reload dump shows this has been changed to:

(insn 7 6 22 2 (set (mem/f/c:PSI (reg/f:PSI 9 y) [3 led+0 S4 A8])
        (const_int 754 [0x2f2])) "/home/jmd/MemMem/memmem.c":15:27 96 {movpsi}
     (nil))
(insn 22 7 8 2 (set (reg:PSI 8 x [22])
        (mem/f/c:PSI (reg/f:PSI 9 y) [3 led+0 S4 A8])) "/home/jmd/MemMem/memmem.c":16:8 96 {movpsi}
     (nil))
(insn 8 22 14 2 (set (mem/v:QI (reg:PSI 8 x [22]) [0 *led_7+0 S1 A8])
        (const_int 1 [0x1])) "/home/jmd/MemMem/memmem.c":16:8 98 {movqi}
     (nil))

and ends up as:

mov.p #0x2f2, y
mov.p (0,y), x
mov.b #1, (0,x)

So this wastes a register (which leads to other issues which I don't want to go 
into in this email).

After a lot of debugging I tracked down the part of lra which is doing this 
reload to the function process_addr_reg at lra-constraints.c:1378

 if (! REG_P (reg))
    {
      if (check_only_p)
        return true;
      /* Always reload memory in an address even if the target supports such addresses.  */
      new_reg = lra_create_new_reg_with_unique_value (mode, reg, cl, "address");
      before_p = true;
    }

Changing this to

 if (! REG_P (reg))
    {
      if (check_only_p)
        return true;
      return false;
    }

solves my immediate problem.  However I imagine there was a reason for doing 
this reload, and presumably a better way of avoiding it.

Can someone explain the reason for this reload, and how I can best ensure that 
indirect memory operands are left in the compiled code?



[1] https://www.nxp.com/docs/en/reference-manual/S12ZCPU_RM_V1.pdf

-- 
Avoid eavesdropping.  Send strong encrypted email.
PGP Public key ID: 1024D/2DE827B3 
fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
See http://sks-keyservers.net or any PGP keyserver for public key.



* Re: Indirect memory addresses vs. lra
  2019-08-04 19:18 Indirect memory addresses vs. lra John Darrington
@ 2019-08-08 16:25 ` Vladimir Makarov
  2019-08-08 16:44   ` Paul Koning
  0 siblings, 1 reply; 38+ messages in thread
From: Vladimir Makarov @ 2019-08-08 16:25 UTC (permalink / raw)
  To: John Darrington, gcc


On 2019-08-04 3:18 p.m., John Darrington wrote:
> I'm trying to write a back-end for an architecture (s12z - the ISA you can
> download from [1]).  This arch accepts indirect memory addresses.   That is to
> say, those of the form (mem (mem (...)))  and although my TARGET_LEGITIMATE_ADDRESS_P
> function returns true for such addresses, LRA insists on reloading them out of
> existence.
>
> For example, when compiling a code fragment:
>
>    volatile unsigned char *led = (volatile unsigned char *) 0x2F2;
>    *led = 1;
>
> the ira dump file shows:
>
> (insn 7 6 8 2 (set (mem/f/c:PSI (reg/f:PSI 9 y) [3 led+0 S4 A8])
>          (const_int 754 [0x2f2])) "/home/jmd/MemMem/memmem.c":15:27 96 {movpsi}
>       (nil))
> (insn 8 7 14 2 (set (mem/v:QI (mem/f/c:PSI (reg/f:PSI 9 y) [3 led+0 S4 A8]) [0 *led_7+0 S1 A8])
>          (const_int 1 [0x1])) "/home/jmd/MemMem/memmem.c":16:8 98 {movqi}
>       (nil))
>
> which is a perfectly valid insn, and the most efficient assembler for it is:
> mov.p #0x2f2, y
> mov.b #1, [0,y]
>
> However the reload dump shows this has been changed to:
>
> (insn 7 6 22 2 (set (mem/f/c:PSI (reg/f:PSI 9 y) [3 led+0 S4 A8])
>          (const_int 754 [0x2f2])) "/home/jmd/MemMem/memmem.c":15:27 96 {movpsi}
>       (nil))
> (insn 22 7 8 2 (set (reg:PSI 8 x [22])
>          (mem/f/c:PSI (reg/f:PSI 9 y) [3 led+0 S4 A8])) "/home/jmd/MemMem/memmem.c":16:8 96 {movpsi}
>       (nil))
> (insn 8 22 14 2 (set (mem/v:QI (reg:PSI 8 x [22]) [0 *led_7+0 S1 A8])
>          (const_int 1 [0x1])) "/home/jmd/MemMem/memmem.c":16:8 98 {movqi}
>       (nil))
>
> and ends up as:
>
> mov.p #0x2f2, y
> mov.p (0,y), x
> mov.b #1, (0,x)
>
> So this wastes a register (which leads to other issues which I don't want to go
> into in this email).
>
> After a lot of debugging I tracked down the part of lra which is doing this
> reload to the function process_addr_reg at lra-constraints.c:1378
>
>   if (! REG_P (reg))
>      {
>        if (check_only_p)
>          return true;
>        /* Always reload memory in an address even if the target supports such addresses.  */
>        new_reg = lra_create_new_reg_with_unique_value (mode, reg, cl, "address");
>        before_p = true;
>      }
>
> Changing this to
>
>   if (! REG_P (reg))
>      {
>        if (check_only_p)
>          return true;
>        return false;
>      }
>
> solves my immediate problem.  However I imagine there was a reason for doing
> this reload, and presumably a better way of avoiding it.
>
> Can someone explain the reason for this reload, and how I can best ensure that
> indirect memory operands are left in the compiled code?
>
The old reload (reload[1].c) supports such addressing.  As modern 
mainstream architectures do not have this kind of addressing, it was not 
implemented in LRA.

I don't think the above simple change will work fully.  For example, you 
need to constrain memory nesting.  The constraints should be described; 
maybe some hooks should be implemented (or maybe not, and 
TARGET_LEGITIMATE_ADDRESS_P will be enough); maybe additional address 
analysis and transformations should be implemented in LRA, etc.  But 
maybe implementing this is not hard either.

It is also difficult for me to say whether it is worth doing.  Removing such 
addressing helps to remove redundant memory reads.  On the other hand, 
using it can decrease the number of insns, save registers for better RA, and 
utilize hardware whose design took a lot of effort.

In any case, if somebody implements this, it can be included in LRA.

>
> [1] https://www.nxp.com/docs/en/reference-manual/S12ZCPU_RM_V1.pdf
>


* Re: Indirect memory addresses vs. lra
  2019-08-08 16:25 ` Vladimir Makarov
@ 2019-08-08 16:44   ` Paul Koning
  2019-08-08 17:21     ` Segher Boessenkool
  2019-08-08 18:46     ` Indirect memory addresses vs. lra Vladimir Makarov
  0 siblings, 2 replies; 38+ messages in thread
From: Paul Koning @ 2019-08-08 16:44 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: John Darrington, gcc



> On Aug 8, 2019, at 12:25 PM, Vladimir Makarov <vmakarov@redhat.com> wrote:
> 
> 
> On 2019-08-04 3:18 p.m., John Darrington wrote:
>> I'm trying to write a back-end for an architecture (s12z - the ISA you can
>> download from [1]).  This arch accepts indirect memory addresses.   That is to
>> say, those of the form (mem (mem (...)))  and although my TARGET_LEGITIMATE_ADDRESS_P
>> function returns true for such addresses, LRA insists on reloading them out of
>> existence.
>> ...
> The old reload (reload[1].c) supports such addressing.  As modern mainstream architectures do not have this kind of addressing, it was not implemented in LRA.

Is LRA only intended for "modern mainstream architectures"?

If yes, why is the old reload being deprecated?  You can't have it both ways.  Unless you want to obsolete all "not modern mainstream architectures" in GCC, it doesn't make sense to get rid of core functionality used by those architectures.

Indirect addressing is a key feature in size-optimized code.

	paul


* Re: Indirect memory addresses vs. lra
  2019-08-08 16:44   ` Paul Koning
@ 2019-08-08 17:21     ` Segher Boessenkool
  2019-08-08 17:25       ` Paul Koning
  2019-08-08 17:30       ` Paul Koning
  2019-08-08 18:46     ` Indirect memory addresses vs. lra Vladimir Makarov
  1 sibling, 2 replies; 38+ messages in thread
From: Segher Boessenkool @ 2019-08-08 17:21 UTC (permalink / raw)
  To: Paul Koning; +Cc: Vladimir Makarov, John Darrington, gcc

On Thu, Aug 08, 2019 at 12:43:52PM -0400, Paul Koning wrote:
> > On Aug 8, 2019, at 12:25 PM, Vladimir Makarov <vmakarov@redhat.com> wrote:
> > The old reload (reload[1].c) supports such addressing.  As modern mainstream architectures do not have this kind of addressing, it was not implemented in LRA.
> 
> Is LRA only intended for "modern mainstream architectures"?

I sure hope not!  But it has only been *used* and *tested* much on such,
so far.  Things are designed to work well for modern archs.

> If yes, why is the old reload being deprecated?  You can't have it both ways.  Unless you want to obsolete all "not modern mainstream architectures" in GCC, it doesn't make sense to get rid of core functionality used by those architectures.
> 
> Indirect addressing is a key feature in size-optimized code.

That doesn't mean that LRA has to support it, btw, not necessarily; it
may well be possible to do a good job of this in the later passes?
Maybe postreload, maybe some peepholes, etc.?


Segher


* Re: Indirect memory addresses vs. lra
  2019-08-08 17:21     ` Segher Boessenkool
@ 2019-08-08 17:25       ` Paul Koning
  2019-08-08 19:09         ` Segher Boessenkool
  2019-08-08 17:30       ` Paul Koning
  1 sibling, 1 reply; 38+ messages in thread
From: Paul Koning @ 2019-08-08 17:25 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Vladimir Makarov, John Darrington, gcc



> On Aug 8, 2019, at 1:21 PM, Segher Boessenkool <segher@kernel.crashing.org> wrote:
> 
> On Thu, Aug 08, 2019 at 12:43:52PM -0400, Paul Koning wrote:
>>> On Aug 8, 2019, at 12:25 PM, Vladimir Makarov <vmakarov@redhat.com> wrote:
>>> The old reload (reload[1].c) supports such addressing.  As modern mainstream architectures do not have this kind of addressing, it was not implemented in LRA.
>> 
>> Is LRA only intended for "modern mainstream architectures"?
> 
> I sure hope not!  But it has only been *used* and *tested* much on such,
> so far.  Things are designed to work well for modern archs.
> 
>> If yes, why is the old reload being deprecated?  You can't have it both ways.  Unless you want to obsolete all "not modern mainstream architectures" in GCC, it doesn't make sense to get rid of core functionality used by those architectures.
>> 
>> Indirect addressing is a key feature in size-optimized code.
> 
> That doesn't mean that LRA has to support it, btw, not necessarily; it
> may well be possible to do a good job of this in the later passes?
> Maybe postreload, maybe some peepholes, etc.?

Possibly.  But as Vladimir points out, indirect addressing affects register allocation (reducing register pressure).  In older architectures that implement indirect addressing, that is one of the key ways in which the feature reduces code size.  While I can see how peephole optimization can convert an address load plus a register indirect into a memory indirect instruction, does that help the register become available for other uses or is post-LRA too late for that?  My impression is that it is too late, since at this point we're dealing with hard registers and making one free via peephole helps no one else.

	paul



* Re: Indirect memory addresses vs. lra
  2019-08-08 17:21     ` Segher Boessenkool
  2019-08-08 17:25       ` Paul Koning
@ 2019-08-08 17:30       ` Paul Koning
  2019-08-08 19:19         ` Segher Boessenkool
  1 sibling, 1 reply; 38+ messages in thread
From: Paul Koning @ 2019-08-08 17:30 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Vladimir Makarov, John Darrington, gcc



> On Aug 8, 2019, at 1:21 PM, Segher Boessenkool <segher@kernel.crashing.org> wrote:
> 
> On Thu, Aug 08, 2019 at 12:43:52PM -0400, Paul Koning wrote:
>>> On Aug 8, 2019, at 12:25 PM, Vladimir Makarov <vmakarov@redhat.com> wrote:
>>> The old reload (reload[1].c) supports such addressing.  As modern mainstream architectures do not have this kind of addressing, it was not implemented in LRA.
>> 
>> Is LRA only intended for "modern mainstream architectures"?
> 
> I sure hope not!  But it has only been *used* and *tested* much on such,
> so far. 

That's not entirely accurate.  At the prodding of people pushing for the removal of CC0 and reload, I've added LRA support to pdp11 in the V9 cycle.  And it works pretty well, in the sense of passing the compile tests.  But I haven't yet examined the code quality vs. the old one in any detail.

	paul


* Re: Indirect memory addresses vs. lra
  2019-08-08 16:44   ` Paul Koning
  2019-08-08 17:21     ` Segher Boessenkool
@ 2019-08-08 18:46     ` Vladimir Makarov
  1 sibling, 0 replies; 38+ messages in thread
From: Vladimir Makarov @ 2019-08-08 18:46 UTC (permalink / raw)
  To: Paul Koning; +Cc: John Darrington, gcc


On 2019-08-08 12:43 p.m., Paul Koning wrote:
>
>> On Aug 8, 2019, at 12:25 PM, Vladimir Makarov <vmakarov@redhat.com> wrote:
>>
>>
>> On 2019-08-04 3:18 p.m., John Darrington wrote:
>>> I'm trying to write a back-end for an architecture (s12z - the ISA you can
>>> download from [1]).  This arch accepts indirect memory addresses.   That is to
>>> say, those of the form (mem (mem (...)))  and although my TARGET_LEGITIMATE_ADDRESS_P
>>> function returns true for such addresses, LRA insists on reloading them out of
>>> existence.
>>> ...
>> The old reload (reload[1].c) supports such addressing.  As modern mainstream architectures do not have this kind of addressing, it was not implemented in LRA.
> Is LRA only intended for "modern mainstream architectures"?


No.  As I wrote, patches implementing indirect addressing are welcome.  
It is hard for one person to implement everything at once.


> If yes, why is the old reload being deprecated?
>    You can't have it both ways.  Unless you want to obsolete all "not modern mainstream architectures" in GCC, it doesn't make sense to get rid of core functionality used by those architectures.
>
> Indirect addressing is a key feature in size-optimized code.
>
> 	


* Re: Indirect memory addresses vs. lra
  2019-08-08 17:25       ` Paul Koning
@ 2019-08-08 19:09         ` Segher Boessenkool
  0 siblings, 0 replies; 38+ messages in thread
From: Segher Boessenkool @ 2019-08-08 19:09 UTC (permalink / raw)
  To: Paul Koning; +Cc: Vladimir Makarov, John Darrington, gcc

On Thu, Aug 08, 2019 at 01:25:27PM -0400, Paul Koning wrote:
> > On Aug 8, 2019, at 1:21 PM, Segher Boessenkool <segher@kernel.crashing.org> wrote:
> > On Thu, Aug 08, 2019 at 12:43:52PM -0400, Paul Koning wrote:
> >> Indirect addressing is a key feature in size-optimized code.
> > 
> > That doesn't mean that LRA has to support it, btw, not necessarily; it
> > may well be possible to do a good job of this in the later passes?
> > Maybe postreload, maybe some peepholes, etc.?
> 
> Possibly.  But as Vladimir points out, indirect addressing affects
> register allocation (reducing register pressure).

Yeah, good point, esp. if you have only one or two registers that you
can use for addressing at all.  So it will have to happen during (or
before?) RA, alright.


Segher


* Re: Indirect memory addresses vs. lra
  2019-08-08 17:30       ` Paul Koning
@ 2019-08-08 19:19         ` Segher Boessenkool
  2019-08-08 19:57           ` Jeff Law
  0 siblings, 1 reply; 38+ messages in thread
From: Segher Boessenkool @ 2019-08-08 19:19 UTC (permalink / raw)
  To: Paul Koning; +Cc: Vladimir Makarov, John Darrington, gcc

On Thu, Aug 08, 2019 at 01:30:41PM -0400, Paul Koning wrote:
> 
> 
> > On Aug 8, 2019, at 1:21 PM, Segher Boessenkool <segher@kernel.crashing.org> wrote:
> > 
> > On Thu, Aug 08, 2019 at 12:43:52PM -0400, Paul Koning wrote:
> >>> On Aug 8, 2019, at 12:25 PM, Vladimir Makarov <vmakarov@redhat.com> wrote:
> >>> The old reload (reload[1].c) supports such addressing.  As modern mainstream architectures do not have this kind of addressing, it was not implemented in LRA.
> >> 
> >> Is LRA only intended for "modern mainstream architectures"?
> > 
> > I sure hope not!  But it has only been *used* and *tested* much on such,
> > so far. 
> 
> That's not entirely accurate.  At the prodding of people pushing for
> the removal of CC0 and reload, I've added LRA support to pdp11 in the
> V9 cycle.

I said "much" :-)

Pretty much all design input so far has been from "modern mainstream
architectures", as far as I can make out.  Now one of those has the
most "interesting" (for RA) features that many less mainstream archs
have (a not-so-very-flat register file), so it should still work pretty
well hopefully.

> And it works pretty well, in the sense of passing the
> compile tests.  But I haven't yet examined the code quality vs. the
> old one in any detail.

That would be quite interesting to see, also for the other ports that
still need conversion: how much (if any) degradation should you expect
from a straight-up conversion of a port to LRA, without any retuning?


Segher


* Re: Indirect memory addresses vs. lra
  2019-08-08 19:19         ` Segher Boessenkool
@ 2019-08-08 19:57           ` Jeff Law
  2019-08-09  8:14             ` John Darrington
  0 siblings, 1 reply; 38+ messages in thread
From: Jeff Law @ 2019-08-08 19:57 UTC (permalink / raw)
  To: Segher Boessenkool, Paul Koning; +Cc: Vladimir Makarov, John Darrington, gcc

On 8/8/19 1:19 PM, Segher Boessenkool wrote:
> On Thu, Aug 08, 2019 at 01:30:41PM -0400, Paul Koning wrote:
>>
>>
>>> On Aug 8, 2019, at 1:21 PM, Segher Boessenkool <segher@kernel.crashing.org> wrote:
>>>
>>> On Thu, Aug 08, 2019 at 12:43:52PM -0400, Paul Koning wrote:
>>>>> On Aug 8, 2019, at 12:25 PM, Vladimir Makarov <vmakarov@redhat.com> wrote:
>>>>> The old reload (reload[1].c) supports such addressing.  As modern mainstream architectures do not have this kind of addressing, it was not implemented in LRA.
>>>>
>>>> Is LRA only intended for "modern mainstream architectures"?
>>>
>>> I sure hope not!  But it has only been *used* and *tested* much on such,
>>> so far. 
>>
>> That's not entirely accurate.  At the prodding of people pushing for
>> the removal of CC0 and reload, I've added LRA support to pdp11 in the
>> V9 cycle.
> 
> I said "much" :-)
> 
> Pretty much all design input so far has been from "modern mainstream
> architectures", as far as I can make out.  Now one of those has the
> most "interesting" (for RA) features that many less mainstream archs
> have (a not-so-very-flat register file), so it should still work pretty
> well hopefully.
Yea, it's certainly designed with the more mainstream architectures in
mind.  The double-indirect case that's being talked about here is well
out of the mainstream and not a feature of anything LRA has targeted to
date.  So I'm not surprised it's not working.

My suggestion would be to ignore the double-indirect aspect of the
architecture right now, get the port working, then come back and try to
make double-indirect addressing modes work.

> 
>> And it works pretty well, in the sense of passing the
>> compile tests.  But I haven't yet examined the code quality vs. the
>> old one in any detail.
> 
> That would be quite interesting to see, also for the other ports that
> still need conversion: how much (if any) degradation should you expect
> from a straight-up conversion of a port to LRA, without any retuning?
I did the v850 last year where it was a wash or perhaps a slight
improvement for codesize, which is a reasonable approximation for
performance on that target.

I was working a bit on converting the H8 away from cc0 with an eye
towards LRA as well.  Given how registers overlap on the H8, the most
straightforward port should end up with properties much like 32-bit x86.
I suspect the independent addressing of the high/low register parts
might be better handled by LRA, but I wasn't going to do anything beyond
the "just make it work".

jeff


* Re: Indirect memory addresses vs. lra
  2019-08-08 19:57           ` Jeff Law
@ 2019-08-09  8:14             ` John Darrington
  2019-08-09 14:17               ` Segher Boessenkool
                                 ` (2 more replies)
  0 siblings, 3 replies; 38+ messages in thread
From: John Darrington @ 2019-08-09  8:14 UTC (permalink / raw)
  To: Jeff Law
  Cc: Segher Boessenkool, Paul Koning, Vladimir Makarov, John Darrington, gcc

On Thu, Aug 08, 2019 at 01:57:41PM -0600, Jeff Law wrote:

     Yea, it's certainly designed with the more mainstream architectures in
     mind.  The double-indirect case that's being talked about here is well
     out of the mainstream and not a feature of anything LRA has targeted to
     date.  So I'm not surprised it's not working.
     
     My suggestion would be to ignore the double-indirect aspect of the
     architecture right now, get the port working, then come back and try to
     make double-indirect addressing modes work.
     
This sounds like sensible advice.  However I wonder if this issue is
related to the other major outstanding problem I have, viz: the large 
number of test failures which report "Unable to find a register to
spill" - So far, nobody has been able to explain how to solve that
issue and even the people who appear to be more knowledgeable have
expressed surprise that it is even happening at all.

Even if it should turn out not to be related, the message I've been
receiving in this thread is lra should not be expected to work for
non "mainstream" backends.  So perhaps there is another, yet to be
discovered, restriction which prevents my backend from ever working?

On the other hand, given my lack of experience with gcc,  it could be
that lra is working perfectly, and I have simply done something
incorrectly.    But the uncertainty voiced in this thread means that it
is hard to be sure that I'm not trying to do something which is
currently unsupported.

J'


* Re: Indirect memory addresses vs. lra
  2019-08-09  8:14             ` John Darrington
@ 2019-08-09 14:17               ` Segher Boessenkool
  2019-08-09 14:23                 ` Paul Koning
  2019-08-10  6:10                 ` John Darrington
  2019-08-09 16:07               ` Jeff Law
  2019-08-09 17:34               ` Vladimir Makarov
  2 siblings, 2 replies; 38+ messages in thread
From: Segher Boessenkool @ 2019-08-09 14:17 UTC (permalink / raw)
  To: John Darrington; +Cc: Jeff Law, Paul Koning, Vladimir Makarov, gcc

Hi!

On Fri, Aug 09, 2019 at 10:14:39AM +0200, John Darrington wrote:
> On Thu, Aug 08, 2019 at 01:57:41PM -0600, Jeff Law wrote:
> 
>      Yea, it's certainly designed with the more mainstream architectures in
>      mind.  The double-indirect case that's being talked about here is well
>      out of the mainstream and not a feature of anything LRA has targeted to
>      date.  So I'm not surprised it's not working.
>      
>      My suggestion would be to ignore the double-indirect aspect of the
>      architecture right now, get the port working, then come back and try to
>      make double-indirect addressing modes work.
>      
> This sounds like sensible advice.  However I wonder if this issue is
> related to the other major outstanding problem I have, viz: the large 
> number of test failures which report "Unable to find a register to
> spill" - So far, nobody has been able to explain how to solve that
> issue and even the people who appear to be more knowledgeable have
> expressed surprise that it is even happening at all.

No one is surprised.  It is just the funny way that LRA says "whoops I
am going in circles, there is no progress and there will never be, I'd
better stop that".  Everyone doing new ports / new conversions to LRA
sees that error all the time.

The error could be pretty much *anywhere* in your port.  You have to
look at what LRA did, and why, and why that is wrong, and fix that.

> Even if it should turn out not to be related, the message I've been
> receiving in this thread is lra should not be expected to work for
> non "mainstream" backends.

LRA is more likely to have problems in situations where it has not been
tested before.  You can replace LRA by anything else, and this isn't
limited to GCC (or software, or human endeavours, or humanity even).

> So perhaps there is another, yet to be
> discovered, restriction which prevents my backend from ever working?

From ever?  Nah, we can patch.  Also, Occam's razor says there likely
is an error in your backend you haven't found yet.

> On the other hand, given my lack of experience with gcc,  it could be
> that lra is working perfectly, and I have simply done something
> incorrectly.    But the uncertainty voiced in this thread means that it
> is hard to be sure that I'm not trying to do something which is
> currently unsupported.

Is your code in some branch in our git?  Or in some other public git?
Do you have a representative testcase?


Segher


* Re: Indirect memory addresses vs. lra
  2019-08-09 14:17               ` Segher Boessenkool
@ 2019-08-09 14:23                 ` Paul Koning
  2019-08-10  6:10                 ` John Darrington
  1 sibling, 0 replies; 38+ messages in thread
From: Paul Koning @ 2019-08-09 14:23 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: John Darrington, Jeff Law, Vladimir Makarov, gcc



> On Aug 9, 2019, at 10:16 AM, Segher Boessenkool <segher@kernel.crashing.org> wrote:
> 
> Hi!
> 
> On Fri, Aug 09, 2019 at 10:14:39AM +0200, John Darrington wrote:
>> On Thu, Aug 08, 2019 at 01:57:41PM -0600, Jeff Law wrote:
>> 
>>  ...  However I wonder if this issue is
>> related to the other major outstanding problem I have, viz: the large 
>> number of test failures which report "Unable to find a register to
>> spill" - So far, nobody has been able to explain how to solve that
>> issue and even the people who appear to be more knowledgeable have
>> expressed surprise that it is even happening at all.
> 
> No one is surprised.  It is just the funny way that LRA says "whoops I
> am going in circles, there is no progress and there will never be, I'd
> better stop that".  Everyone doing new ports / new conversions to LRA
> sees that error all the time.
> 
> The error could be pretty much *anywhere* in your port.  You have to
> look at what LRA did, and why, and why that is wrong, and fix that.

I've run into this a number of times.  The difficulty is that, for someone who understands the back end and the documented rules but not the internals of LRA, it tends to be hard to figure out what the problem is.  And since the causes tend to be obscure and undocumented, I find myself having to relearn the analysis from time to time. 

It has been stated that LRA is more dependent on correct back end definitions than Reload is, but unfortunately the precise definition of "correct" can be less than obvious to a back end maintainer.

	paul



* Re: Indirect memory addresses vs. lra
  2019-08-09  8:14             ` John Darrington
  2019-08-09 14:17               ` Segher Boessenkool
@ 2019-08-09 16:07               ` Jeff Law
  2019-08-09 17:34               ` Vladimir Makarov
  2 siblings, 0 replies; 38+ messages in thread
From: Jeff Law @ 2019-08-09 16:07 UTC (permalink / raw)
  To: John Darrington; +Cc: Segher Boessenkool, Paul Koning, Vladimir Makarov, gcc

On 8/9/19 2:14 AM, John Darrington wrote:
> On Thu, Aug 08, 2019 at 01:57:41PM -0600, Jeff Law wrote:
> 
>      Yea, it's certainly designed with the more mainstream architectures in
>      mind.  The double-indirect case that's being talked about here is well
>      out of the mainstream and not a feature of anything LRA has targeted to
>      date.  So I'm not surprised it's not working.
>      
>      My suggestion would be to ignore the double-indirect aspect of the
>      architecture right now, get the port working, then come back and try to
>      make double-indirect addressing modes work.
>      
> This sounds like sensible advice.  However I wonder if this issue is
> related to the other major outstanding problem I have, viz: the large 
> number of test failures which report "Unable to find a register to
> spill" - So far, nobody has been able to explain how to solve that
> issue and even the people who appear to be more knowledgeable have
> expressed surprise that it is even happening at all.
You're going to have to debug what LRA is doing and why.  There are really
no shortcuts here.  We can't really do it for you.  Even if you weren't
using LRA you'd be doing the same process, just on an even more difficult
to understand codebase.

> 
> Even if it should turn out not to be related, the message I've been
> receiving in this thread is lra should not be expected to work for
> non "mainstream" backends.  So perhaps there is another, yet to be
> discovered, restriction which prevents my backend from ever working?
It's possible.  But that's not really any different than reload.
There's certainly various aspects of architectures that reload can't
handle as well -- even on architectures that were mainstream processors
when reload was under active development and maintenance.  There's even
a good chance reload won't handle double-indirect addressing modes well
-- they were far from mainstream and as a result the code which does
purport to handle double-indirect addressing modes hasn't been
used/tested all that much over the last 25+ years.

> 
> On the other hand, given my lack of experience with gcc,  it could be
> that lra is working perfectly, and I have simply done something
> incorrectly.    But the uncertainty voiced in this thread means that it
> is hard to be sure that I'm not trying to do something which is
> currently unsupported.
My recommendation is to continue with the LRA path.

jeff


* Re: Indirect memory addresses vs. lra
  2019-08-09  8:14             ` John Darrington
  2019-08-09 14:17               ` Segher Boessenkool
  2019-08-09 16:07               ` Jeff Law
@ 2019-08-09 17:34               ` Vladimir Makarov
  2019-08-10  6:06                 ` John Darrington
  2 siblings, 1 reply; 38+ messages in thread
From: Vladimir Makarov @ 2019-08-09 17:34 UTC (permalink / raw)
  To: John Darrington, Jeff Law; +Cc: Segher Boessenkool, Paul Koning, gcc


On 2019-08-09 4:14 a.m., John Darrington wrote:
> On Thu, Aug 08, 2019 at 01:57:41PM -0600, Jeff Law wrote:
>
>       Yea, it's certainly designed with the more mainstream architectures in
>       mind.  The double-indirect case that's being talked about here is well
>       out of the mainstream and not a feature of anything LRA has targeted to
>       date.  So I'm not surprised it's not working.
>       
>       My suggestion would be to ignore the double-indirect aspect of the
>       architecture right now, get the port working, then come back and try to
>       make double-indirect addressing modes work.
>       
> This sounds like sensible advice.  However I wonder if this issue is
> related to the other major outstanding problem I have, viz: the large
> number of test failures which report "Unable to find a register to
> spill" - So far, nobody has been able to explain how to solve that
> issue and even the people who appear to be more knowledgeable have
> expressed surprise that it is even happening at all.

Basically, LRA behaves here like the older reload.  If an RTL insn needs
hard regs and there are no free regs, LRA/reload puts pseudos that are
assigned to hard regs and live through the insn into memory.  So it is
very hard to run into the problem "unable to find a register to spill"
if the insn needs fewer regs than the architecture provides.  That is why
people are surprised.  Still, it can happen, as one RTL insn can be
implemented by a few machine insns.  The most frequent case here is GCC
asm insns requiring a lot of input, output, and clobbered regs/operands.
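
A toy model of the behaviour described above may help. It is only a sketch and not LRA's actual algorithm: the function name is hypothetical, every pseudo is assumed to occupy exactly one hard reg, and register classes and spill costs are ignored.

```python
def pseudos_to_spill(num_hard_regs, live_through, needed):
    """Toy model: pick pseudos to spill so that `needed` hard regs are free.

    num_hard_regs: number of allocatable hard registers (assumed)
    live_through:  pseudos assigned to hard regs and living through the insn
    needed:        number of hard regs the insn itself requires
    """
    # Only an insn that needs more regs than exist cannot be satisfied.
    if needed > num_hard_regs:
        raise RuntimeError("unable to find a register to spill")
    free = num_hard_regs - len(live_through)
    spilled = []
    # Spill in list order; a real allocator would choose by cost.
    for pseudo in live_through:
        if free >= needed:
            break
        spilled.append(pseudo)
        free += 1
    return spilled
```

In this model the error can only occur when a single insn asks for more regs than exist, which is why it is rarely expected on real targets.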

If you provide an LRA dump for such a test (it is better to use
-fira-verbose=15 to output full RA info to stderr), I could probably
say more.

The fewer regs the architecture has, the easier it is to run into such an
error message if something is described wrongly in the back-end.  I see
your architecture is a 16-bit micro-controller with only 8 regs, some of
which are specialized.  So your architecture is really register constrained.

> Even if it should turn out not to be related, the message I've been
> receiving in this thread is lra should not be expected to work for
> non "mainstream" backends.  So perhaps there is another, yet to be
> discovered, restriction which prevents my backend from ever working?
>
> On the other hand, given my lack of experience with gcc,  it could be
> that lra is working perfectly, and I have simply done something
> incorrectly.    But the uncertainty voiced in this thread means that it
> is hard to be sure that I'm not trying to do something which is
> currently unsupported.

LRA/reload is the most machine-dependent machine-independent pass in
GCC.  It is connected to machine-dependent code in numerous ways.  A big
part of making a new backend is getting LRA/reload and the
machine-dependent code to communicate in the right way.

Sometimes it is hard to decide who is responsible for RA-related bugs:
the RA or the back-end.  Sometimes an innocent change in the RA that
solves one problem for a particular target might result in numerous new
bugs for other targets.  Therefore it is very difficult to say whether
your small change to permit indirect memory addressing will work in the
general case.


* Re: Indirect memory addresses vs. lra
  2019-08-09 17:34               ` Vladimir Makarov
@ 2019-08-10  6:06                 ` John Darrington
  2019-08-10 16:12                   ` Segher Boessenkool
                                     ` (2 more replies)
  0 siblings, 3 replies; 38+ messages in thread
From: John Darrington @ 2019-08-10  6:06 UTC (permalink / raw)
  To: Vladimir Makarov
  Cc: John Darrington, Jeff Law, Segher Boessenkool, Paul Koning, gcc


[-- Attachment #1.1: Type: text/plain, Size: 1126 bytes --]

On Fri, Aug 09, 2019 at 01:34:36PM -0400, Vladimir Makarov wrote:
     
     If you provide an LRA dump for such a test (it is better to use
     -fira-verbose=15 to output full RA info to stderr), I could probably
     say more.

I've attached such a dump (generated from gcc/testsuite/gcc.c-torture/compile/pr53410-2.c).
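
For reading a dump of this size, a small script can list which alternative LRA chose for each insn and how often each pseudo was reloaded. This is only a sketch: it assumes the line formats seen in this dump ("Choosing alt N in insn M" and "Creating newreg=N from oldreg=M"), and the function names are hypothetical.

```python
import re
from collections import Counter

# Line formats assumed from the attached dump, not from any documented spec.
CHOICE = re.compile(r"Choosing alt (\d+) in insn (\d+):.*\{([^}]+)\}")
NEWREG = re.compile(r"Creating newreg=\d+ from oldreg=(\d+)")

def chosen_alternatives(dump_text):
    """Return a list of (insn, alternative, pattern name) tuples."""
    found = []
    for line in dump_text.splitlines():
        m = CHOICE.search(line)
        if m:
            found.append((int(m.group(2)), int(m.group(1)), m.group(3)))
    return found

def reload_counts(dump_text):
    """Count how many reload pseudos were created from each original pseudo."""
    return Counter(int(m.group(1)) for m in NEWREG.finditer(dump_text))
```

Running `reload_counts` over the captured stderr shows which pseudos are reloaded most often, which is usually a good place to start reading.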
     
     The fewer regs the architecture has, the easier it is to run into such an
     error message if something is described wrongly in the back-end.  I see
     your architecture is a 16-bit micro-controller with only 8 regs, some of
     which are specialized.  So your architecture is really register constrained.

That's not quite correct.  It is a 24-bit micro-controller (the address
space is 24 bits wide).  There are 2 address registers (plus stack
pointer and program counter) and there are 8 general purpose data
registers (of differing sizes).
     

J'

-- 
Avoid eavesdropping.  Send strong encrypted email.
PGP Public key ID: 1024D/2DE827B3 
fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
See http://sks-keyservers.net or any PGP keyserver for public key.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.2: ira-verbose=15.txt --]
[-- Type: text/plain; charset=unknown-8bit, Size: 81442 bytes --]

Building IRA IR

Pass 0 for finding pseudo/allocno costs

    r36: preferred X_REG, alternative NO_REGS, allocno X_REG
    a0 (r36,l0) best X_REG, allocno X_REG
    r35: preferred X_REG, alternative NO_REGS, allocno X_REG
    a10 (r35,l0) best X_REG, allocno X_REG
    r34: preferred X_REG, alternative NO_REGS, allocno X_REG
    a1 (r34,l0) best X_REG, allocno X_REG
    r33: preferred DATA_REGS, alternative NO_REGS, allocno DATA_REGS
    a11 (r33,l0) best DATA_REGS, allocno DATA_REGS
    r32: preferred DATA_REGS, alternative NO_REGS, allocno DATA_REGS
    a12 (r32,l0) best DATA_REGS, allocno DATA_REGS
    r31: preferred DATA_REGS, alternative NO_REGS, allocno DATA_REGS
    a14 (r31,l0) best DATA_REGS, allocno DATA_REGS
    r30: preferred NO_REGS, alternative NO_REGS, allocno NO_REGS
    a13 (r30,l0) best NO_REGS, allocno NO_REGS
    r29: preferred X_REG, alternative NO_REGS, allocno X_REG
    a15 (r29,l0) best X_REG, allocno X_REG
    r28: preferred X_REG, alternative NO_REGS, allocno X_REG
    a16 (r28,l0) best X_REG, allocno X_REG
    r27: preferred X_REG, alternative NO_REGS, allocno X_REG
    a17 (r27,l0) best X_REG, allocno X_REG
    r26: preferred DATA_REGS, alternative NO_REGS, allocno DATA_REGS
    a2 (r26,l0) best DATA_REGS, allocno DATA_REGS
    r25: preferred DATA_REGS, alternative NO_REGS, allocno DATA_REGS
    a4 (r25,l0) best DATA_REGS, allocno DATA_REGS
    r24: preferred DATA_REGS, alternative NO_REGS, allocno DATA_REGS
    a3 (r24,l0) best DATA_REGS, allocno DATA_REGS
    r23: preferred DATA_REGS, alternative NO_REGS, allocno DATA_REGS
    a5 (r23,l0) best DATA_REGS, allocno DATA_REGS
    r22: preferred DATA_REGS, alternative NO_REGS, allocno DATA_REGS
    a6 (r22,l0) best DATA_REGS, allocno DATA_REGS
    r21: preferred DATA_REGS, alternative NO_REGS, allocno DATA_REGS
    a8 (r21,l0) best DATA_REGS, allocno DATA_REGS
    r20: preferred DATA_REGS, alternative NO_REGS, allocno DATA_REGS
    a7 (r20,l0) best DATA_REGS, allocno DATA_REGS
    r19: preferred DATA_REGS, alternative NO_REGS, allocno DATA_REGS
    a9 (r19,l0) best DATA_REGS, allocno DATA_REGS

  a0(r36,l0) costs: X_REG:0 MEM:5000
  a1(r34,l0) costs: X_REG:0 MEM:84000
  a2(r26,l0) costs: DATA_REGS:0 MEM:5000
  a3(r24,l0) costs: DATA_REGS:0 MEM:5000
  a4(r25,l0) costs: DATA_REGS:0 MEM:5000
  a5(r23,l0) costs: DATA_REGS:0 MEM:5000
  a6(r22,l0) costs: DATA_REGS:0 MEM:5000
  a7(r20,l0) costs: DATA_REGS:0 MEM:5000
  a8(r21,l0) costs: DATA_REGS:0 MEM:5000
  a9(r19,l0) costs: DATA_REGS:0 MEM:5000
  a10(r35,l0) costs: X_REG:0 MEM:5000
  a11(r33,l0) costs: DATA_REGS:0 MEM:8000
  a12(r32,l0) costs: DATA_REGS:0 MEM:7000
  a13(r30,l0) costs: MEM:8000
  a14(r31,l0) costs: DATA_REGS:0 MEM:7000
  a15(r29,l0) costs: X_REG:0 MEM:8000
  a16(r28,l0) costs: X_REG:0 MEM:8000
  a17(r27,l0) costs: X_REG:2000 MEM:8000

   Insn 43(l0): point = 0
   Insn 39(l0): point = 3
   Insn 38(l0): point = 5
   Insn 37(l0): point = 7
   Insn 36(l0): point = 9
   Insn 35(l0): point = 11
   Insn 34(l0): point = 13
   Insn 33(l0): point = 15
   Insn 32(l0): point = 17
   Insn 31(l0): point = 19
   Insn 30(l0): point = 21
   Insn 29(l0): point = 23
   Insn 28(l0): point = 25
   Insn 27(l0): point = 27
   Insn 26(l0): point = 29
   Insn 25(l0): point = 31
   Insn 24(l0): point = 33
   Insn 23(l0): point = 35
   Insn 22(l0): point = 37
   Insn 21(l0): point = 39
   Insn 20(l0): point = 41
   Insn 19(l0): point = 43
   Insn 18(l0): point = 45
   Insn 17(l0): point = 47
   Insn 16(l0): point = 49
   Insn 15(l0): point = 51
   Insn 14(l0): point = 53
   Insn 9(l0): point = 55
   Insn 8(l0): point = 57
   Insn 7(l0): point = 59
   Insn 6(l0): point = 61
   Insn 5(l0): point = 63
   Insn 4(l0): point = 65
   Insn 3(l0): point = 67
   Insn 2(l0): point = 69
   Insn 10(l0): point = 71
 a0(r36): [4..5]
 a1(r34): [4..55]
 a2(r26): [18..21]
 a3(r24): [20..25]
 a4(r25): [22..23]
 a5(r23): [26..27]
 a6(r22): [38..41]
 a7(r20): [40..45]
 a8(r21): [42..43]
 a9(r19): [46..47]
 a10(r35): [50..51]
 a11(r33): [56..57]
 a12(r32): [58..59]
 a13(r30): [60..61]
 a14(r31): [62..63]
 a15(r29): [64..65]
 a16(r28): [66..67]
 a17(r27): [68..69]
Compressing live ranges: from 74 to 30 - 40%
Ranges after the compression:
 a0(r36): [0..1]
 a1(r34): [0..15]
 a2(r26): [2..3]
 a3(r24): [2..5]
 a4(r25): [4..5]
 a5(r23): [6..7]
 a6(r22): [8..9]
 a7(r20): [8..11]
 a8(r21): [10..11]
 a9(r19): [12..13]
 a10(r35): [14..15]
 a11(r33): [16..17]
 a12(r32): [18..19]
 a13(r30): [20..21]
 a14(r31): [22..23]
 a15(r29): [24..25]
 a16(r28): [26..27]
 a17(r27): [28..29]
  regions=1, blocks=4, points=30
    allocnos=18 (big 0), copies=0, conflicts=0, ranges=18
Disposition:
    9:r19  l0     6    7:r20  l0     7    8:r21  l0     6    6:r22  l0     6
    5:r23  l0     6    3:r24  l0     7    4:r25  l0     6    2:r26  l0     6
   17:r27  l0     8   16:r28  l0     8   15:r29  l0     8   13:r30  l0   mem
   14:r31  l0     6   12:r32  l0     6   11:r33  l0     6    1:r34  l0   mem
   10:r35  l0     8    0:r36  l0     8
+++Costs: overall 94000, reg 2000, mem 92000, ld 0, st 0, move 0
+++       move loops 0, new jumps 0

********** Local #1: **********

	   Spilling non-eliminable hard regs: 9
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            alt=0: Bad operand -- refuse
            alt=1: Bad operand -- refuse
            alt=2: Bad operand -- refuse
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            alt=3: Bad operand -- refuse
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=4,overall=609,losers=1,rld_nregs=1
            1 Non-pseudo reload: reject+=2
            1 Spill pseudo into memory: reject+=3
            Using memory insn operand 1: reject+=3
          alt=5,overall=14,losers=1,rld_nregs=0
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            1 Non-pseudo reload: reject+=2
            1 Spill pseudo into memory: reject+=3
            Using memory insn operand 1: reject+=3
            alt=6,overall=23,losers=2 -- refuse
          alt=7,overall=0,losers=0,rld_nregs=0
	 Choosing alt 7 in insn 10:  (0) m  (1) Q {movpsi}
          alt=0,overall=0,losers=0,rld_nregs=0
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
            alt=0,overall=9,losers=1 -- refuse
	 Choosing alt 0 in insn 3:  (0) =Q  (1) %Q  (2) n {addpsi3}
          alt=0,overall=0,losers=0,rld_nregs=0
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
            alt=0,overall=9,losers=1 -- refuse
	 Choosing alt 0 in insn 4:  (0) =Q  (1) %Q  (2) n {addpsi3}
          alt=0,overall=0,losers=0,rld_nregs=0
	 Choosing alt 0 in insn 5:  (0) =Q  (1) B {zero_extendpsisi2}
            0 Non input pseudo reload: reject++
          alt=0,overall=7,losers=1,rld_nregs=1
            0 Non input pseudo reload: reject++
          alt=1,overall=7,losers=1,rld_nregs=1
	 Choosing alt 0 in insn 6:  (0) =D  (1) mr  (2) i {lshrsi3}
      Creating newreg=37 from oldreg=30, assigning class DATA_REGS to r37
    6: r37:SI=r31:SI 0>>0x3
      REG_DEAD r31:SI
      REG_EQUAL udiv(r31:SI,0x8)
    Inserting insn reload after:
   45: r30:SI=r37:SI

            0 Non input pseudo reload: reject++
            alt=0: Bad operand -- refuse
            0 Non pseudo reload: reject++
            alt=1: Bad operand -- refuse
            0 Non pseudo reload: reject++
            alt=2: Bad operand -- refuse
            0 Non input pseudo reload: reject++
            alt=3: Bad operand -- refuse
            0 Non input pseudo reload: reject++
            1 Non pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=4,overall=608,losers=1,rld_nregs=1
            0 Non pseudo reload: reject++
            Using memory insn operand 1: reject+=3
            Cycle danger: overall += LRA_MAX_REJECT
          alt=5,overall=610,losers=1,rld_nregs=0
            0 Non input pseudo reload: reject++
            Using memory insn operand 1: reject+=3
            Cycle danger: overall += LRA_MAX_REJECT
          alt=6,overall=616,losers=2,rld_nregs=1
            0 Non pseudo reload: reject++
            1 Non pseudo reload: reject++
          alt=7,overall=2,losers=0,rld_nregs=0
	 Choosing alt 7 in insn 45:  (0) m  (1) r {*movsi}
            1 Small class reload: reject+=3
          alt=0,overall=9,losers=1,rld_nregs=1
	 Choosing alt 0 in insn 7:  (0) =Q  (1) B {zero_extendpsisi2}
      Creating newreg=38, assigning class BASE_REGS to r38
    7: r32:SI=zero_extend(r38:PSI)
      REG_DEAD r30:SI
    Inserting insn reload before:
   46: r38:PSI=r30:SI#0

            0 Non pseudo reload: reject++
            alt=0: Bad operand -- refuse
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=1: Bad operand -- refuse
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=2: Bad operand -- refuse
            0 Non pseudo reload: reject++
            alt=3: Bad operand -- refuse
            0 Non pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=4,overall=607,losers=1,rld_nregs=1
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            1 Non pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=5,overall=611,losers=1,rld_nregs=0
            0 Non pseudo reload: reject++
            1 Non pseudo reload: reject++
          alt=6,overall=2,losers=0,rld_nregs=0
	 Choosing alt 6 in insn 46:  (0) Q  (1) m {movpsi}
          alt=0,overall=0,losers=0,rld_nregs=0
	 Choosing alt 0 in insn 8:  (0) =D  (1) mr  (2) i {ashlsi3}
            0 Non input pseudo reload: reject++
          alt=0,overall=7,losers=1,rld_nregs=1
	 Choosing alt 0 in insn 9:  (0) =Q  (1) Q {truncsipsi2}
      Creating newreg=39 from oldreg=34, assigning class ALL_REGS to r39
    9: r39:PSI=trunc(r33:SI)
      REG_DEAD r33:SI
    Inserting insn reload after:
   47: r34:PSI=r39:PSI

            0 Non input pseudo reload: reject++
            alt=0: Bad operand -- refuse
            0 Non pseudo reload: reject++
            alt=1: Bad operand -- refuse
            0 Non pseudo reload: reject++
            alt=2: Bad operand -- refuse
            0 Non input pseudo reload: reject++
            alt=3: Bad operand -- refuse
            0 Non input pseudo reload: reject++
            1 Non pseudo reload: reject++
            1 Non pseudo costly reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=4,overall=609,losers=1,rld_nregs=1
            0 Non pseudo reload: reject++
            Using memory insn operand 1: reject+=3
            Cycle danger: overall += LRA_MAX_REJECT
          alt=5,overall=610,losers=1,rld_nregs=0
            0 Non input pseudo reload: reject++
            Using memory insn operand 1: reject+=3
            Cycle danger: overall += LRA_MAX_REJECT
          alt=6,overall=616,losers=2,rld_nregs=1
            0 Non pseudo reload: reject++
            1 Non pseudo reload: reject++
            1 Non pseudo costly reload: reject++
          alt=7,overall=3,losers=0,rld_nregs=0
	 Choosing alt 7 in insn 47:  (0) m  (1) Q {movpsi}
      Creating newreg=40 from oldreg=34, assigning class GENERAL_REGS to address r40
      Creating newreg=41, assigning class GENERAL_REGS to address r41
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            alt=0: Bad operand -- refuse
            alt=1: Bad operand -- refuse
            alt=2: Bad operand -- refuse
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            alt=3: Bad operand -- refuse
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
          alt=4,overall=18,losers=2,rld_nregs=2
          alt=5,overall=0,losers=0,rld_nregs=0
	 Choosing alt 5 in insn 14:  (0) m  (1) m {*movsi}
   14: [r40:PSI+0x20]=[r41:PSI]
    Inserting insn reload before:
   48: r40:PSI=r34:PSI
   49: r41:PSI=[y:PSI+0x2f]

            0 Non pseudo reload: reject++
            alt=0: Bad operand -- refuse
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=1: Bad operand -- refuse
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=2: Bad operand -- refuse
            0 Non pseudo reload: reject++
            alt=3: Bad operand -- refuse
            0 Non pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=4,overall=607,losers=1,rld_nregs=1
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            1 Non pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=5,overall=611,losers=1,rld_nregs=0
            0 Non pseudo reload: reject++
            1 Non pseudo reload: reject++
          alt=6,overall=2,losers=0,rld_nregs=0
	 Choosing alt 6 in insn 48:  (0) Q  (1) m {movpsi}
            0 Non pseudo reload: reject++
            alt=0: Bad operand -- refuse
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=1: Bad operand -- refuse
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=2: Bad operand -- refuse
            0 Non pseudo reload: reject++
            alt=3: Bad operand -- refuse
            0 Non pseudo reload: reject++
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=4,overall=610,losers=1,rld_nregs=1
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
          alt=5,overall=10,losers=1,rld_nregs=0
            0 Non pseudo reload: reject++
          alt=6,overall=1,losers=0,rld_nregs=0
	 Choosing alt 6 in insn 49:  (0) Q  (1) m {movpsi}
            alt=0: Bad operand -- refuse
            0 Spill pseudo into memory: reject+=3
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=1: Bad operand -- refuse
            0 Spill pseudo into memory: reject+=3
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=2: Bad operand -- refuse
            alt=3: Bad operand -- refuse
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=4,overall=609,losers=1,rld_nregs=1
            0 Spill pseudo into memory: reject+=3
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
          alt=5,overall=13,losers=1,rld_nregs=0
          alt=6,overall=0,losers=0,rld_nregs=0
	 Choosing alt 6 in insn 15:  (0) Q  (1) m {movpsi}
      Creating newreg=42 from oldreg=34, assigning class GENERAL_REGS to address r42
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            alt=0: Bad operand -- refuse
            alt=1: Bad operand -- refuse
            alt=2: Bad operand -- refuse
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            alt=3: Bad operand -- refuse
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
          alt=4,overall=18,losers=2,rld_nregs=2
          alt=5,overall=0,losers=0,rld_nregs=0
	 Choosing alt 5 in insn 16:  (0) m  (1) m {*movsi}
   16: [r42:PSI+0x24]=[r35:PSI+0x4]
      REG_DEAD r35:PSI
    Inserting insn reload before:
   50: r42:PSI=r34:PSI

            0 Non pseudo reload: reject++
            alt=0: Bad operand -- refuse
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=1: Bad operand -- refuse
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=2: Bad operand -- refuse
            0 Non pseudo reload: reject++
            alt=3: Bad operand -- refuse
            0 Non pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=4,overall=607,losers=1,rld_nregs=1
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            1 Non pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=5,overall=611,losers=1,rld_nregs=0
            0 Non pseudo reload: reject++
            1 Non pseudo reload: reject++
          alt=6,overall=2,losers=0,rld_nregs=0
	 Choosing alt 6 in insn 50:  (0) Q  (1) m {movpsi}
      Creating newreg=43 from oldreg=34, assigning class GENERAL_REGS to address r43
            alt=0: Bad operand -- refuse
            0 Spill pseudo into memory: reject+=3
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=1: Bad operand -- refuse
            0 Spill pseudo into memory: reject+=3
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=2: Bad operand -- refuse
            alt=3: Bad operand -- refuse
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=4,overall=609,losers=1,rld_nregs=1
            0 Spill pseudo into memory: reject+=3
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
          alt=5,overall=13,losers=1,rld_nregs=0
          alt=6,overall=0,losers=0,rld_nregs=0
	 Choosing alt 6 in insn 17:  (0) r  (1) m {*movsi}
   17: r19:SI=[r43:PSI+0x20]
    Inserting insn reload before:
   51: r43:PSI=r34:PSI

            0 Non pseudo reload: reject++
            alt=0: Bad operand -- refuse
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=1: Bad operand -- refuse
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=2: Bad operand -- refuse
            0 Non pseudo reload: reject++
            alt=3: Bad operand -- refuse
            0 Non pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=4,overall=607,losers=1,rld_nregs=1
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            1 Non pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=5,overall=611,losers=1,rld_nregs=0
            0 Non pseudo reload: reject++
            1 Non pseudo reload: reject++
          alt=6,overall=2,losers=0,rld_nregs=0
	 Choosing alt 6 in insn 51:  (0) Q  (1) m {movpsi}
          alt=0,overall=6,losers=1,rld_nregs=1
            2 Non-pseudo reload: reject+=2
            2 Non input pseudo reload: reject++
            alt=1,overall=15,losers=2 -- refuse
            1 Matching alt: reject+=2
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
            alt=0,overall=11,losers=1 -- refuse
            1 Matching alt: reject+=2
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
            alt=1,overall=11,losers=1 -- refuse
	 Choosing alt 0 in insn 18:  (0) =D  (1) %0  (2) i {andsi3}
      Creating newreg=44 from oldreg=19, assigning class DATA_REGS to r44
   18: r44:SI=r44:SI&0x10001
      REG_DEAD r19:SI
    Inserting insn reload before:
   52: r44:SI=r19:SI
    Inserting insn reload after:
   53: r20:SI=r44:SI

      Creating newreg=45 from oldreg=34, assigning class GENERAL_REGS to address r45
            alt=0: Bad operand -- refuse
            0 Spill pseudo into memory: reject+=3
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=1: Bad operand -- refuse
            0 Spill pseudo into memory: reject+=3
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=2: Bad operand -- refuse
            alt=3: Bad operand -- refuse
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=4,overall=609,losers=1,rld_nregs=1
            0 Spill pseudo into memory: reject+=3
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
          alt=5,overall=13,losers=1,rld_nregs=0
          alt=6,overall=0,losers=0,rld_nregs=0
	 Choosing alt 6 in insn 19:  (0) r  (1) m {*movsi}
   19: r21:SI=[r45:PSI+0x24]
    Inserting insn reload before:
   54: r45:PSI=r34:PSI

            0 Non pseudo reload: reject++
            alt=0: Bad operand -- refuse
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=1: Bad operand -- refuse
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=2: Bad operand -- refuse
            0 Non pseudo reload: reject++
            alt=3: Bad operand -- refuse
            0 Non pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=4,overall=607,losers=1,rld_nregs=1
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            1 Non pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=5,overall=611,losers=1,rld_nregs=0
            0 Non pseudo reload: reject++
            1 Non pseudo reload: reject++
          alt=6,overall=2,losers=0,rld_nregs=0
	 Choosing alt 6 in insn 54:  (0) Q  (1) m {movpsi}
          alt=0,overall=0,losers=0,rld_nregs=0
            1 Matching alt: reject+=2
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
            alt=0,overall=11,losers=1 -- refuse
            1 Matching alt: reject+=2
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
            alt=1,overall=11,losers=1 -- refuse
	 Choosing alt 0 in insn 20:  (0) =D  (1) %0  (2) i {andsi3}
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            alt=0: Bad operand -- refuse
            alt=1: Bad operand -- refuse
            alt=2: Bad operand -- refuse
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            alt=3: Bad operand -- refuse
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=4,overall=609,losers=1,rld_nregs=1
            1 Spill pseudo into memory: reject+=3
            Using memory insn operand 1: reject+=3
          alt=5,overall=12,losers=1,rld_nregs=0
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            1 Spill pseudo into memory: reject+=3
            Using memory insn operand 1: reject+=3
            alt=6,overall=21,losers=2 -- refuse
          alt=7,overall=0,losers=0,rld_nregs=0
	 Choosing alt 7 in insn 21:  (0) m  (1) r {*movsi}
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            alt=0: Bad operand -- refuse
            alt=1: Bad operand -- refuse
            alt=2: Bad operand -- refuse
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            alt=3: Bad operand -- refuse
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=4,overall=609,losers=1,rld_nregs=1
            1 Spill pseudo into memory: reject+=3
            Using memory insn operand 1: reject+=3
          alt=5,overall=12,losers=1,rld_nregs=0
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            1 Spill pseudo into memory: reject+=3
            Using memory insn operand 1: reject+=3
            alt=6,overall=21,losers=2 -- refuse
          alt=7,overall=0,losers=0,rld_nregs=0
	 Choosing alt 7 in insn 22:  (0) m  (1) r {*movsi}
      Creating newreg=46 from oldreg=34, assigning class GENERAL_REGS to address r46
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            alt=0: Bad operand -- refuse
            alt=1: Bad operand -- refuse
            alt=2: Bad operand -- refuse
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            alt=3: Bad operand -- refuse
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
          alt=4,overall=18,losers=2,rld_nregs=2
          alt=5,overall=0,losers=0,rld_nregs=0
	 Choosing alt 5 in insn 23:  (0) m  (1) m {*movsi}
   23: [r46:PSI+0x8]=[y:PSI+0x33]
    Inserting insn reload before:
   55: r46:PSI=r34:PSI

            0 Non pseudo reload: reject++
            alt=0: Bad operand -- refuse
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=1: Bad operand -- refuse
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=2: Bad operand -- refuse
            0 Non pseudo reload: reject++
            alt=3: Bad operand -- refuse
            0 Non pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=4,overall=607,losers=1,rld_nregs=1
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            1 Non pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=5,overall=611,losers=1,rld_nregs=0
            0 Non pseudo reload: reject++
            1 Non pseudo reload: reject++
          alt=6,overall=2,losers=0,rld_nregs=0
	 Choosing alt 6 in insn 55:  (0) Q  (1) m {movpsi}
      Creating newreg=47 from oldreg=34, assigning class GENERAL_REGS to address r47
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            alt=0: Bad operand -- refuse
            alt=1: Bad operand -- refuse
            alt=2: Bad operand -- refuse
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            alt=3: Bad operand -- refuse
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
          alt=4,overall=18,losers=2,rld_nregs=2
          alt=5,overall=0,losers=0,rld_nregs=0
	 Choosing alt 5 in insn 24:  (0) m  (1) m {*movsi}
   24: [r47:PSI+0xc]=[y:PSI+0x37]
    Inserting insn reload before:
   56: r47:PSI=r34:PSI

            0 Non pseudo reload: reject++
            alt=0: Bad operand -- refuse
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=1: Bad operand -- refuse
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=2: Bad operand -- refuse
            0 Non pseudo reload: reject++
            alt=3: Bad operand -- refuse
            0 Non pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=4,overall=607,losers=1,rld_nregs=1
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            1 Non pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=5,overall=611,losers=1,rld_nregs=0
            0 Non pseudo reload: reject++
            1 Non pseudo reload: reject++
          alt=6,overall=2,losers=0,rld_nregs=0
	 Choosing alt 6 in insn 56:  (0) Q  (1) m {movpsi}
      Creating newreg=48 from oldreg=34, assigning class GENERAL_REGS to address r48
	 Reuse r48 for reload r34:PSI
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            alt=0: Bad operand -- refuse
            alt=1: Bad operand -- refuse
            alt=2: Bad operand -- refuse
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            alt=3: Bad operand -- refuse
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
          alt=4,overall=18,losers=2,rld_nregs=2
          alt=5,overall=0,losers=0,rld_nregs=0
	 Choosing alt 5 in insn 25:  (0) m  (1) m {*movsi}
   25: [r48:PSI+0x18]=[r48:PSI+0x8]
    Inserting insn reload before:
   57: r48:PSI=r34:PSI

            0 Non pseudo reload: reject++
            alt=0: Bad operand -- refuse
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=1: Bad operand -- refuse
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=2: Bad operand -- refuse
            0 Non pseudo reload: reject++
            alt=3: Bad operand -- refuse
            0 Non pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=4,overall=607,losers=1,rld_nregs=1
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            1 Non pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=5,overall=611,losers=1,rld_nregs=0
            0 Non pseudo reload: reject++
            1 Non pseudo reload: reject++
          alt=6,overall=2,losers=0,rld_nregs=0
	 Choosing alt 6 in insn 57:  (0) Q  (1) m {movpsi}
      Creating newreg=49 from oldreg=34, assigning class GENERAL_REGS to address r49
	 Reuse r49 for reload r34:PSI
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            alt=0: Bad operand -- refuse
            alt=1: Bad operand -- refuse
            alt=2: Bad operand -- refuse
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            alt=3: Bad operand -- refuse
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
          alt=4,overall=18,losers=2,rld_nregs=2
          alt=5,overall=0,losers=0,rld_nregs=0
	 Choosing alt 5 in insn 26:  (0) m  (1) m {*movsi}
   26: [r49:PSI+0x1c]=[r49:PSI+0xc]
    Inserting insn reload before:
   58: r49:PSI=r34:PSI

            0 Non pseudo reload: reject++
            alt=0: Bad operand -- refuse
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=1: Bad operand -- refuse
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=2: Bad operand -- refuse
            0 Non pseudo reload: reject++
            alt=3: Bad operand -- refuse
            0 Non pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=4,overall=607,losers=1,rld_nregs=1
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            1 Non pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=5,overall=611,losers=1,rld_nregs=0
            0 Non pseudo reload: reject++
            1 Non pseudo reload: reject++
          alt=6,overall=2,losers=0,rld_nregs=0
	 Choosing alt 6 in insn 58:  (0) Q  (1) m {movpsi}
      Creating newreg=50 from oldreg=34, assigning class GENERAL_REGS to address r50
            alt=0: Bad operand -- refuse
            0 Spill pseudo into memory: reject+=3
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=1: Bad operand -- refuse
            0 Spill pseudo into memory: reject+=3
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=2: Bad operand -- refuse
            alt=3: Bad operand -- refuse
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=4,overall=609,losers=1,rld_nregs=1
            0 Spill pseudo into memory: reject+=3
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
          alt=5,overall=13,losers=1,rld_nregs=0
          alt=6,overall=0,losers=0,rld_nregs=0
	 Choosing alt 6 in insn 27:  (0) r  (1) m {*movsi}
   27: r23:SI=[r50:PSI+0x18]
    Inserting insn reload before:
   59: r50:PSI=r34:PSI

            0 Non pseudo reload: reject++
            alt=0: Bad operand -- refuse
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=1: Bad operand -- refuse
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=2: Bad operand -- refuse
            0 Non pseudo reload: reject++
            alt=3: Bad operand -- refuse
            0 Non pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=4,overall=607,losers=1,rld_nregs=1
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            1 Non pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=5,overall=611,losers=1,rld_nregs=0
            0 Non pseudo reload: reject++
            1 Non pseudo reload: reject++
          alt=6,overall=2,losers=0,rld_nregs=0
	 Choosing alt 6 in insn 59:  (0) Q  (1) m {movpsi}
          alt=0,overall=6,losers=1,rld_nregs=1
            2 Non-pseudo reload: reject+=2
            2 Non input pseudo reload: reject++
            alt=1,overall=15,losers=2 -- refuse
            1 Matching alt: reject+=2
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
            alt=0,overall=11,losers=1 -- refuse
            1 Matching alt: reject+=2
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
            alt=1,overall=11,losers=1 -- refuse
	 Choosing alt 0 in insn 28:  (0) =D  (1) %0  (2) i {xorsi3}
      Creating newreg=51 from oldreg=23, assigning class DATA_REGS to r51
   28: r51:SI=r51:SI^0x10001
      REG_DEAD r23:SI
    Inserting insn reload before:
   60: r51:SI=r23:SI
    Inserting insn reload after:
   61: r24:SI=r51:SI

      Creating newreg=52 from oldreg=34, assigning class GENERAL_REGS to address r52
            alt=0: Bad operand -- refuse
            0 Spill pseudo into memory: reject+=3
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=1: Bad operand -- refuse
            0 Spill pseudo into memory: reject+=3
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=2: Bad operand -- refuse
            alt=3: Bad operand -- refuse
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=4,overall=609,losers=1,rld_nregs=1
            0 Spill pseudo into memory: reject+=3
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
          alt=5,overall=13,losers=1,rld_nregs=0
          alt=6,overall=0,losers=0,rld_nregs=0
	 Choosing alt 6 in insn 29:  (0) r  (1) m {*movsi}
   29: r25:SI=[r52:PSI+0x1c]
    Inserting insn reload before:
   62: r52:PSI=r34:PSI

            0 Non pseudo reload: reject++
            alt=0: Bad operand -- refuse
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=1: Bad operand -- refuse
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=2: Bad operand -- refuse
            0 Non pseudo reload: reject++
            alt=3: Bad operand -- refuse
            0 Non pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=4,overall=607,losers=1,rld_nregs=1
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            1 Non pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=5,overall=611,losers=1,rld_nregs=0
            0 Non pseudo reload: reject++
            1 Non pseudo reload: reject++
          alt=6,overall=2,losers=0,rld_nregs=0
	 Choosing alt 6 in insn 62:  (0) Q  (1) m {movpsi}
          alt=0,overall=0,losers=0,rld_nregs=0
            1 Matching alt: reject+=2
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
            alt=0,overall=11,losers=1 -- refuse
            1 Matching alt: reject+=2
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
            alt=1,overall=11,losers=1 -- refuse
	 Choosing alt 0 in insn 30:  (0) =D  (1) %0  (2) i {xorsi3}
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            alt=0: Bad operand -- refuse
            alt=1: Bad operand -- refuse
            alt=2: Bad operand -- refuse
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            alt=3: Bad operand -- refuse
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=4,overall=609,losers=1,rld_nregs=1
            1 Spill pseudo into memory: reject+=3
            Using memory insn operand 1: reject+=3
          alt=5,overall=12,losers=1,rld_nregs=0
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            1 Spill pseudo into memory: reject+=3
            Using memory insn operand 1: reject+=3
            alt=6,overall=21,losers=2 -- refuse
          alt=7,overall=0,losers=0,rld_nregs=0
	 Choosing alt 7 in insn 31:  (0) m  (1) r {*movsi}
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            alt=0: Bad operand -- refuse
            alt=1: Bad operand -- refuse
            alt=2: Bad operand -- refuse
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            alt=3: Bad operand -- refuse
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=4,overall=609,losers=1,rld_nregs=1
            1 Spill pseudo into memory: reject+=3
            Using memory insn operand 1: reject+=3
          alt=5,overall=12,losers=1,rld_nregs=0
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            1 Spill pseudo into memory: reject+=3
            Using memory insn operand 1: reject+=3
            alt=6,overall=21,losers=2 -- refuse
          alt=7,overall=0,losers=0,rld_nregs=0
	 Choosing alt 7 in insn 32:  (0) m  (1) r {*movsi}
      Creating newreg=53 from oldreg=34, assigning class GENERAL_REGS to address r53
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            alt=0: Bad operand -- refuse
            alt=1: Bad operand -- refuse
            alt=2: Bad operand -- refuse
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            alt=3: Bad operand -- refuse
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
          alt=4,overall=18,losers=2,rld_nregs=2
          alt=5,overall=0,losers=0,rld_nregs=0
	 Choosing alt 5 in insn 33:  (0) m  (1) m {*movsi}
   33: [r53:PSI]=[y:PSI+0x33]
    Inserting insn reload before:
   63: r53:PSI=r34:PSI

            0 Non pseudo reload: reject++
            alt=0: Bad operand -- refuse
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=1: Bad operand -- refuse
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=2: Bad operand -- refuse
            0 Non pseudo reload: reject++
            alt=3: Bad operand -- refuse
            0 Non pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=4,overall=607,losers=1,rld_nregs=1
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            1 Non pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=5,overall=611,losers=1,rld_nregs=0
            0 Non pseudo reload: reject++
            1 Non pseudo reload: reject++
          alt=6,overall=2,losers=0,rld_nregs=0
	 Choosing alt 6 in insn 63:  (0) Q  (1) m {movpsi}
      Creating newreg=54 from oldreg=34, assigning class GENERAL_REGS to address r54
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            alt=0: Bad operand -- refuse
            alt=1: Bad operand -- refuse
            alt=2: Bad operand -- refuse
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            alt=3: Bad operand -- refuse
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
          alt=4,overall=18,losers=2,rld_nregs=2
          alt=5,overall=0,losers=0,rld_nregs=0
	 Choosing alt 5 in insn 34:  (0) m  (1) m {*movsi}
   34: [r54:PSI+0x4]=[y:PSI+0x37]
    Inserting insn reload before:
   64: r54:PSI=r34:PSI

            0 Non pseudo reload: reject++
            alt=0: Bad operand -- refuse
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=1: Bad operand -- refuse
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=2: Bad operand -- refuse
            0 Non pseudo reload: reject++
            alt=3: Bad operand -- refuse
            0 Non pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=4,overall=607,losers=1,rld_nregs=1
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            1 Non pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=5,overall=611,losers=1,rld_nregs=0
            0 Non pseudo reload: reject++
            1 Non pseudo reload: reject++
          alt=6,overall=2,losers=0,rld_nregs=0
	 Choosing alt 6 in insn 64:  (0) Q  (1) m {movpsi}
      Creating newreg=55 from oldreg=34, assigning class GENERAL_REGS to address r55
	 Reuse r55 for reload r34:PSI
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            alt=0: Bad operand -- refuse
            alt=1: Bad operand -- refuse
            alt=2: Bad operand -- refuse
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            alt=3: Bad operand -- refuse
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
          alt=4,overall=18,losers=2,rld_nregs=2
          alt=5,overall=0,losers=0,rld_nregs=0
	 Choosing alt 5 in insn 35:  (0) m  (1) m {*movsi}
   35: [r55:PSI+0x10]=[r55:PSI]
    Inserting insn reload before:
   65: r55:PSI=r34:PSI

            0 Non pseudo reload: reject++
            alt=0: Bad operand -- refuse
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=1: Bad operand -- refuse
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=2: Bad operand -- refuse
            0 Non pseudo reload: reject++
            alt=3: Bad operand -- refuse
            0 Non pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=4,overall=607,losers=1,rld_nregs=1
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            1 Non pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=5,overall=611,losers=1,rld_nregs=0
            0 Non pseudo reload: reject++
            1 Non pseudo reload: reject++
          alt=6,overall=2,losers=0,rld_nregs=0
	 Choosing alt 6 in insn 65:  (0) Q  (1) m {movpsi}
      Creating newreg=56 from oldreg=34, assigning class GENERAL_REGS to address r56
	 Reuse r56 for reload r34:PSI
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            alt=0: Bad operand -- refuse
            alt=1: Bad operand -- refuse
            alt=2: Bad operand -- refuse
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            alt=3: Bad operand -- refuse
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
          alt=4,overall=18,losers=2,rld_nregs=2
          alt=5,overall=0,losers=0,rld_nregs=0
	 Choosing alt 5 in insn 36:  (0) m  (1) m {*movsi}
   36: [r56:PSI+0x14]=[r56:PSI+0x4]
    Inserting insn reload before:
   66: r56:PSI=r34:PSI

            0 Non pseudo reload: reject++
            alt=0: Bad operand -- refuse
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=1: Bad operand -- refuse
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=2: Bad operand -- refuse
            0 Non pseudo reload: reject++
            alt=3: Bad operand -- refuse
            0 Non pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=4,overall=607,losers=1,rld_nregs=1
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            1 Non pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=5,overall=611,losers=1,rld_nregs=0
            0 Non pseudo reload: reject++
            1 Non pseudo reload: reject++
          alt=6,overall=2,losers=0,rld_nregs=0
	 Choosing alt 6 in insn 66:  (0) Q  (1) m {movpsi}
      Creating newreg=57, assigning class GENERAL_REGS to address r57
      Creating newreg=58 from oldreg=34, assigning class GENERAL_REGS to address r58
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            alt=0: Bad operand -- refuse
            alt=1: Bad operand -- refuse
            alt=2: Bad operand -- refuse
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            alt=3: Bad operand -- refuse
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
          alt=4,overall=18,losers=2,rld_nregs=2
          alt=5,overall=0,losers=0,rld_nregs=0
	 Choosing alt 5 in insn 37:  (0) m  (1) m {*movsi}
   37: [r57:PSI]=[r58:PSI+0x10]
    Inserting insn reload before:
   67: r57:PSI=[y:PSI+0x2f]
   68: r58:PSI=r34:PSI

            0 Non pseudo reload: reject++
            alt=0: Bad operand -- refuse
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=1: Bad operand -- refuse
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=2: Bad operand -- refuse
            0 Non pseudo reload: reject++
            alt=3: Bad operand -- refuse
            0 Non pseudo reload: reject++
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=4,overall=610,losers=1,rld_nregs=1
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
          alt=5,overall=10,losers=1,rld_nregs=0
            0 Non pseudo reload: reject++
          alt=6,overall=1,losers=0,rld_nregs=0
	 Choosing alt 6 in insn 67:  (0) Q  (1) m {movpsi}
            0 Non pseudo reload: reject++
            alt=0: Bad operand -- refuse
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=1: Bad operand -- refuse
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=2: Bad operand -- refuse
            0 Non pseudo reload: reject++
            alt=3: Bad operand -- refuse
            0 Non pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=4,overall=607,losers=1,rld_nregs=1
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            1 Non pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=5,overall=611,losers=1,rld_nregs=0
            0 Non pseudo reload: reject++
            1 Non pseudo reload: reject++
          alt=6,overall=2,losers=0,rld_nregs=0
	 Choosing alt 6 in insn 68:  (0) Q  (1) m {movpsi}
            alt=0: Bad operand -- refuse
            0 Spill pseudo into memory: reject+=3
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=1: Bad operand -- refuse
            0 Spill pseudo into memory: reject+=3
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=2: Bad operand -- refuse
            alt=3: Bad operand -- refuse
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=4,overall=609,losers=1,rld_nregs=1
            0 Spill pseudo into memory: reject+=3
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
          alt=5,overall=13,losers=1,rld_nregs=0
          alt=6,overall=0,losers=0,rld_nregs=0
	 Choosing alt 6 in insn 38:  (0) Q  (1) m {movpsi}
      Creating newreg=59 from oldreg=34, assigning class GENERAL_REGS to address r59
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            alt=0: Bad operand -- refuse
            alt=1: Bad operand -- refuse
            alt=2: Bad operand -- refuse
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            alt=3: Bad operand -- refuse
            0 Non-pseudo reload: reject+=2
            0 Non input pseudo reload: reject++
            1 Non-pseudo reload: reject+=2
            1 Non input pseudo reload: reject++
          alt=4,overall=18,losers=2,rld_nregs=2
          alt=5,overall=0,losers=0,rld_nregs=0
	 Choosing alt 5 in insn 39:  (0) m  (1) m {*movsi}
   39: [r36:PSI+0x4]=[r59:PSI+0x14]
      REG_DEAD r36:PSI
      REG_DEAD r34:PSI
    Inserting insn reload before:
   69: r59:PSI=r34:PSI

            0 Non pseudo reload: reject++
            alt=0: Bad operand -- refuse
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=1: Bad operand -- refuse
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            alt=2: Bad operand -- refuse
            0 Non pseudo reload: reject++
            alt=3: Bad operand -- refuse
            0 Non pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=4,overall=607,losers=1,rld_nregs=1
            Using memory insn operand 0: reject+=3
            0 Non input pseudo reload: reject++
            1 Non pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=5,overall=611,losers=1,rld_nregs=0
            0 Non pseudo reload: reject++
            1 Non pseudo reload: reject++
          alt=6,overall=2,losers=0,rld_nregs=0
	 Choosing alt 6 in insn 69:  (0) Q  (1) m {movpsi}
	   Spilling non-eliminable hard regs: 9

********** Inheritance #1: **********

EBB 2 3
    <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
      Creating newreg=60 from oldreg=34, assigning class X_REG to inheritance r60
    Original reg change 34->60 (bb2):
   68: r58:PSI=r60:PSI
    Add inheritance<-original before:
   70: r60:PSI=r34:PSI

    Inheritance reuse change 34->60 (bb2):
   69: r59:PSI=r60:PSI
	  >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
    <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
      Creating newreg=61 from oldreg=34, assigning class X_REG to inheritance r61
    Original reg change 34->61 (bb2):
   66: r56:PSI=r61:PSI
    Add inheritance<-original before:
   71: r61:PSI=r34:PSI

    Inheritance reuse change 34->61 (bb2):
   70: r60:PSI=r61:PSI
	  >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
    <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
      Creating newreg=62 from oldreg=34, assigning class X_REG to inheritance r62
    Original reg change 34->62 (bb2):
   65: r55:PSI=r62:PSI
    Add inheritance<-original before:
   72: r62:PSI=r34:PSI

    Inheritance reuse change 34->62 (bb2):
   71: r61:PSI=r62:PSI
	  >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
    <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
      Creating newreg=63 from oldreg=34, assigning class X_REG to inheritance r63
    Original reg change 34->63 (bb2):
   64: r54:PSI=r63:PSI
    Add inheritance<-original before:
   73: r63:PSI=r34:PSI

    Inheritance reuse change 34->63 (bb2):
   72: r62:PSI=r63:PSI
	  >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
    <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
      Creating newreg=64 from oldreg=34, assigning class X_REG to inheritance r64
    Original reg change 34->64 (bb2):
   63: r53:PSI=r64:PSI
    Add inheritance<-original before:
   74: r64:PSI=r34:PSI

    Inheritance reuse change 34->64 (bb2):
   73: r63:PSI=r64:PSI
	  >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
    <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
      Creating newreg=65 from oldreg=34, assigning class X_REG to inheritance r65
    Original reg change 34->65 (bb2):
   62: r52:PSI=r65:PSI
    Add inheritance<-original before:
   75: r65:PSI=r34:PSI

    Inheritance reuse change 34->65 (bb2):
   74: r64:PSI=r65:PSI
	  >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
    <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
      Creating newreg=66 from oldreg=34, assigning class X_REG to inheritance r66
    Original reg change 34->66 (bb2):
   59: r50:PSI=r66:PSI
    Add inheritance<-original before:
   76: r66:PSI=r34:PSI

    Inheritance reuse change 34->66 (bb2):
   75: r65:PSI=r66:PSI
	  >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
    <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
      Creating newreg=67 from oldreg=34, assigning class X_REG to inheritance r67
    Original reg change 34->67 (bb2):
   58: r49:PSI=r67:PSI
    Add inheritance<-original before:
   77: r67:PSI=r34:PSI

    Inheritance reuse change 34->67 (bb2):
   76: r66:PSI=r67:PSI
	  >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
    <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
      Creating newreg=68 from oldreg=34, assigning class X_REG to inheritance r68
    Original reg change 34->68 (bb2):
   57: r48:PSI=r68:PSI
    Add inheritance<-original before:
   78: r68:PSI=r34:PSI

    Inheritance reuse change 34->68 (bb2):
   77: r67:PSI=r68:PSI
	  >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
    <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
      Creating newreg=69 from oldreg=34, assigning class X_REG to inheritance r69
    Original reg change 34->69 (bb2):
   56: r47:PSI=r69:PSI
    Add inheritance<-original before:
   79: r69:PSI=r34:PSI

    Inheritance reuse change 34->69 (bb2):
   78: r68:PSI=r69:PSI
	  >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
    <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
      Creating newreg=70 from oldreg=34, assigning class X_REG to inheritance r70
    Original reg change 34->70 (bb2):
   55: r46:PSI=r70:PSI
    Add inheritance<-original before:
   80: r70:PSI=r34:PSI

    Inheritance reuse change 34->70 (bb2):
   79: r69:PSI=r70:PSI
	  >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
    <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
      Creating newreg=71 from oldreg=34, assigning class X_REG to inheritance r71
    Original reg change 34->71 (bb2):
   54: r45:PSI=r71:PSI
    Add inheritance<-original before:
   81: r71:PSI=r34:PSI

    Inheritance reuse change 34->71 (bb2):
   80: r70:PSI=r71:PSI
	  >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
    <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
      Creating newreg=72 from oldreg=34, assigning class X_REG to inheritance r72
    Original reg change 34->72 (bb2):
   51: r43:PSI=r72:PSI
    Add inheritance<-original before:
   82: r72:PSI=r34:PSI

    Inheritance reuse change 34->72 (bb2):
   81: r71:PSI=r72:PSI
	  >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
    <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
      Creating newreg=73 from oldreg=34, assigning class X_REG to inheritance r73
    Original reg change 34->73 (bb2):
   50: r42:PSI=r73:PSI
    Add inheritance<-original before:
   83: r73:PSI=r34:PSI

    Inheritance reuse change 34->73 (bb2):
   82: r72:PSI=r73:PSI
	  >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
    <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
      Creating newreg=74 from oldreg=34, assigning class X_REG to inheritance r74
    Original reg change 34->74 (bb2):
   48: r40:PSI=r74:PSI
    Add inheritance<-original before:
   84: r74:PSI=r34:PSI

    Inheritance reuse change 34->74 (bb2):
   83: r73:PSI=r74:PSI
	  >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
    <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
      Creating newreg=75 from oldreg=34, assigning class X_REG to inheritance r75
    Original reg change 34->75 (bb2):
   47: r75:PSI=r39:PSI
    Add original<-inheritance after:
   85: r34:PSI=r75:PSI

    Inheritance reuse change 34->75 (bb2):
   84: r74:PSI=r75:PSI
	  >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
    <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
    Rejecting inheritance for 30 because of disjoint classes DATA_REGS and NO_REGS
    >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
	    Removing dead insn:
    85: r34:PSI=r75:PSI

********** Pseudo live ranges #1: **********

  BB 3
   Insn 43: point = 0, n_alt = -1
  BB 2
   Insn 39: point = 0, n_alt = 5
   Insn 69: point = 1, n_alt = 6
	   Creating copy r59<-r60@1000
   Insn 38: point = 3, n_alt = 6
   Insn 37: point = 4, n_alt = 5
   Insn 68: point = 5, n_alt = 6
	   Creating copy r58<-r60@1000
   Insn 70: point = 6, n_alt = -1
	   Creating copy r60<-r61@1000
   Insn 67: point = 8, n_alt = 6
   Insn 36: point = 9, n_alt = 5
   Insn 66: point = 10, n_alt = 6
	   Creating copy r56<-r61@1000
   Insn 71: point = 11, n_alt = -1
	   Creating copy r61<-r62@1000
   Insn 35: point = 13, n_alt = 5
   Insn 65: point = 14, n_alt = 6
	   Creating copy r55<-r62@1000
   Insn 72: point = 15, n_alt = -1
	   Creating copy r62<-r63@1000
   Insn 34: point = 17, n_alt = 5
   Insn 64: point = 18, n_alt = 6
	   Creating copy r54<-r63@1000
   Insn 73: point = 19, n_alt = -1
	   Creating copy r63<-r64@1000
   Insn 33: point = 21, n_alt = 5
   Insn 63: point = 22, n_alt = 6
	   Creating copy r53<-r64@1000
   Insn 74: point = 23, n_alt = -1
	   Creating copy r64<-r65@1000
   Insn 32: point = 25, n_alt = 7
   Insn 31: point = 26, n_alt = 7
   Insn 30: point = 27, n_alt = 0
   Insn 29: point = 29, n_alt = 6
   Insn 62: point = 31, n_alt = 6
	   Creating copy r52<-r65@1000
   Insn 75: point = 32, n_alt = -1
	   Creating copy r65<-r66@1000
   Insn 61: point = 34, n_alt = -2
	Hard reg 7 is preferable by r51 with profit 1000
   Insn 28: point = 36, n_alt = 0
   Insn 60: point = 37, n_alt = -2
	Hard reg 7 is preferable by r51 with profit 1000
	Hard reg 6 is preferable by r51 with profit 1000
   Insn 27: point = 39, n_alt = 6
   Insn 59: point = 41, n_alt = 6
	   Creating copy r50<-r66@1000
   Insn 76: point = 42, n_alt = -1
	   Creating copy r66<-r67@1000
   Insn 26: point = 44, n_alt = 5
   Insn 58: point = 45, n_alt = 6
	   Creating copy r49<-r67@1000
   Insn 77: point = 46, n_alt = -1
	   Creating copy r67<-r68@1000
   Insn 25: point = 48, n_alt = 5
   Insn 57: point = 49, n_alt = 6
	   Creating copy r48<-r68@1000
   Insn 78: point = 50, n_alt = -1
	   Creating copy r68<-r69@1000
   Insn 24: point = 52, n_alt = 5
   Insn 56: point = 53, n_alt = 6
	   Creating copy r47<-r69@1000
   Insn 79: point = 54, n_alt = -1
	   Creating copy r69<-r70@1000
   Insn 23: point = 56, n_alt = 5
   Insn 55: point = 57, n_alt = 6
	   Creating copy r46<-r70@1000
   Insn 80: point = 58, n_alt = -1
	   Creating copy r70<-r71@1000
   Insn 22: point = 60, n_alt = 7
   Insn 21: point = 61, n_alt = 7
   Insn 20: point = 62, n_alt = 0
   Insn 19: point = 64, n_alt = 6
   Insn 54: point = 66, n_alt = 6
	   Creating copy r45<-r71@1000
   Insn 81: point = 67, n_alt = -1
	   Creating copy r71<-r72@1000
   Insn 53: point = 69, n_alt = -2
	Hard reg 7 is preferable by r44 with profit 1000
   Insn 18: point = 71, n_alt = 0
   Insn 52: point = 72, n_alt = -2
	Hard reg 7 is preferable by r44 with profit 1000
	Hard reg 6 is preferable by r44 with profit 1000
   Insn 17: point = 74, n_alt = 6
   Insn 51: point = 76, n_alt = 6
	   Creating copy r43<-r72@1000
   Insn 82: point = 77, n_alt = -1
	   Creating copy r72<-r73@1000
   Insn 16: point = 79, n_alt = 5
   Insn 50: point = 80, n_alt = 6
	   Creating copy r42<-r73@1000
   Insn 83: point = 81, n_alt = -1
	   Creating copy r73<-r74@1000
   Insn 15: point = 83, n_alt = 6
   Insn 14: point = 84, n_alt = 5
   Insn 49: point = 85, n_alt = 6
   Insn 48: point = 86, n_alt = 6
	   Creating copy r40<-r74@1000
   Insn 84: point = 87, n_alt = -1
	   Creating copy r74<-r75@1000
   Insn 47: point = 89, n_alt = 7
	   Creating copy r39->r75@1000
   Insn 9: point = 91, n_alt = 0
   Insn 8: point = 93, n_alt = 0
   Insn 7: point = 95, n_alt = 0
   Insn 46: point = 97, n_alt = 6
   Insn 45: point = 99, n_alt = 7
   Insn 6: point = 101, n_alt = 0
   Insn 5: point = 103, n_alt = 0
   Insn 4: point = 105, n_alt = 0
   Insn 3: point = 107, n_alt = 0
   Insn 2: point = 109, n_alt = -2
   Insn 10: point = 110, n_alt = 7
 r19: [73..74]
 r20: [61..69]
 r21: [63..64]
 r22: [60..62]
 r23: [38..39]
 r24: [26..34]
 r25: [28..29]
 r26: [25..27]
 r27: [108..109]
 r28: [106..107]
 r29: [104..105]
 r30: [98..99]
 r31: [102..103]
 r32: [94..95]
 r33: [92..93]
 r35: [79..83]
 r36: [0..3]
 r37: [100..101]
 r38: [96..97]
 r39: [90..91]
 r40: [84..86]
 r41: [84..85]
 r42: [79..80]
 r43: [75..76]
 r44: [70..72]
 r45: [65..66]
 r46: [56..57]
 r47: [52..53]
 r48: [48..49]
 r49: [44..45]
 r50: [40..41]
 r51: [35..37]
 r52: [30..31]
 r53: [21..22]
 r54: [17..18]
 r55: [13..14]
 r56: [9..10]
 r57: [4..8]
 r58: [4..5]
 r59: [0..1]
 r60: [2..6]
 r61: [7..11]
 r62: [12..15]
 r63: [16..19]
 r64: [20..23]
 r65: [24..32]
 r66: [33..42]
 r67: [43..46]
 r68: [47..50]
 r69: [51..54]
 r70: [55..58]
 r71: [59..67]
 r72: [68..77]
 r73: [78..81]
 r74: [82..87]
 r75: [88..89]
Compressing live ranges: from 110 to 80 - 72%
Ranges after the compression:
 r19: [48..49]
 r20: [38..45]
 r21: [40..41]
 r22: [38..39]
 r23: [26..27]
 r24: [16..23]
 r25: [18..19]
 r26: [16..17]
 r27: [78..79]
 r28: [76..77]
 r29: [74..75]
 r30: [68..69]
 r31: [72..73]
 r32: [64..65]
 r33: [62..63]
 r35: [52..55]
 r36: [0..3]
 r37: [70..71]
 r38: [66..67]
 r39: [60..61]
 r40: [56..57]
 r41: [56..57]
 r42: [52..53]
 r43: [50..51]
 r44: [46..47]
 r45: [42..43]
 r46: [36..37]
 r47: [34..35]
 r48: [32..33]
 r49: [30..31]
 r50: [28..29]
 r51: [24..25]
 r52: [20..21]
 r53: [14..15]
 r54: [12..13]
 r55: [10..11]
 r56: [8..9]
 r57: [4..7]
 r58: [4..5]
 r59: [0..1]
 r60: [2..5]
 r61: [6..9]
 r62: [10..11]
 r63: [12..13]
 r64: [14..15]
 r65: [16..21]
 r66: [22..29]
 r67: [30..31]
 r68: [32..33]
 r69: [34..35]
 r70: [36..37]
 r71: [38..43]
 r72: [44..51]
 r73: [52..53]
 r74: [54..57]
 r75: [58..59]

********** Assignment #1: **********

	 Assigning to 66 (cl=X_REG, orig=34, freq=3000, tfirst=60, tfreq=17000)...
	   Assign 8 to inheritance r66 (freq=3000)
	Hard reg 8 is preferable by r67 with profit 1000
	Hard reg 8 is preferable by r68 with profit 500
	Hard reg 8 is preferable by r69 with profit 250
	Hard reg 8 is preferable by r70 with profit 125
	Hard reg 8 is preferable by r71 with profit 62
	Hard reg 8 is preferable by r72 with profit 31
	Hard reg 8 is preferable by r45 with profit 31
	Hard reg 8 is preferable by r46 with profit 62
	Hard reg 8 is preferable by r47 with profit 125
	Hard reg 8 is preferable by r48 with profit 250
	Hard reg 8 is preferable by r49 with profit 500
	Hard reg 8 is preferable by r50 with profit 1000
	Hard reg 8 is preferable by r65 with profit 1000
	Hard reg 8 is preferable by r52 with profit 500
	Hard reg 8 is preferable by r64 with profit 500
	Hard reg 8 is preferable by r53 with profit 250
	Hard reg 8 is preferable by r63 with profit 250
	Hard reg 8 is preferable by r54 with profit 125
	Hard reg 8 is preferable by r62 with profit 125
	Hard reg 8 is preferable by r55 with profit 62
	Hard reg 8 is preferable by r61 with profit 62
	Hard reg 8 is preferable by r56 with profit 31
	Hard reg 8 is preferable by r60 with profit 31
	 Assigning to 72 (cl=X_REG, orig=34, freq=3000, tfirst=60, tfreq=17000)...
	   Assign 8 to inheritance r72 (freq=3000)
	Hard reg 8 is preferable by r73 with profit 1000
	Hard reg 8 is preferable by r74 with profit 500
	Hard reg 8 is preferable by r75 with profit 250
	Hard reg 8 is preferable by r39 with profit 125
	Hard reg 8 is preferable by r40 with profit 250
	Hard reg 8 is preferable by r42 with profit 500
	Hard reg 8 is preferable by r43 with profit 1000
	Hard reg 8 is preferable by r71 with profit 1062
	Hard reg 8 is preferable by r45 with profit 531
	Hard reg 8 is preferable by r70 with profit 625
	Hard reg 8 is preferable by r46 with profit 312
	Hard reg 8 is preferable by r69 with profit 500
	Hard reg 8 is preferable by r47 with profit 250
	Hard reg 8 is preferable by r68 with profit 625
	Hard reg 8 is preferable by r48 with profit 312
	Hard reg 8 is preferable by r67 with profit 1062
	Hard reg 8 is preferable by r49 with profit 531
	 Assigning to 65 (cl=X_REG, orig=34, freq=3000, tfirst=60, tfreq=17000)...
	   Assign 8 to inheritance r65 (freq=3000)
	Hard reg 8 is preferable by r52 with profit 1500
	Hard reg 8 is preferable by r64 with profit 1500
	Hard reg 8 is preferable by r53 with profit 750
	Hard reg 8 is preferable by r63 with profit 750
	Hard reg 8 is preferable by r54 with profit 375
	Hard reg 8 is preferable by r62 with profit 375
	Hard reg 8 is preferable by r55 with profit 187
	Hard reg 8 is preferable by r61 with profit 187
	Hard reg 8 is preferable by r56 with profit 93
	Hard reg 8 is preferable by r60 with profit 93
	Hard reg 8 is preferable by r58 with profit 31
	Hard reg 8 is preferable by r59 with profit 31
	 Assigning to 71 (cl=X_REG, orig=34, freq=3000, tfirst=60, tfreq=17000)...
	   Assign 8 to inheritance r71 (freq=3000)
	Hard reg 8 is preferable by r45 with profit 1531
	Hard reg 8 is preferable by r70 with profit 1625
	Hard reg 8 is preferable by r46 with profit 812
	Hard reg 8 is preferable by r69 with profit 1000
	Hard reg 8 is preferable by r47 with profit 500
	Hard reg 8 is preferable by r68 with profit 875
	Hard reg 8 is preferable by r48 with profit 437
	Hard reg 8 is preferable by r67 with profit 1187
	Hard reg 8 is preferable by r49 with profit 593
	 Assigning to 60 (cl=X_REG, orig=34, freq=3000, tfirst=60, tfreq=17000)...
	 Assigning to 61 (cl=X_REG, orig=34, freq=3000, tfirst=60, tfreq=17000)...
	   Assign 8 to inheritance r61 (freq=3000)
	Hard reg 8 is preferable by r62 with profit 1375
	Hard reg 8 is preferable by r63 with profit 1250
	Hard reg 8 is preferable by r64 with profit 1750
	Hard reg 8 is preferable by r53 with profit 875
	Hard reg 8 is preferable by r54 with profit 625
	Hard reg 8 is preferable by r55 with profit 687
	Hard reg 8 is preferable by r56 with profit 1093
	Hard reg 8 is preferable by r60 with profit 1093
	Hard reg 8 is preferable by r58 with profit 531
	Hard reg 8 is preferable by r59 with profit 531
	 Assigning to 74 (cl=X_REG, orig=34, freq=3000, tfirst=60, tfreq=17000)...
	 Assigning to 62 (cl=X_REG, orig=34, freq=3000, tfirst=60, tfreq=17000)...
	   Assign 8 to inheritance r62 (freq=3000)
	Hard reg 8 is preferable by r63 with profit 2250
	Hard reg 8 is preferable by r64 with profit 2250
	Hard reg 8 is preferable by r53 with profit 1125
	Hard reg 8 is preferable by r54 with profit 1125
	Hard reg 8 is preferable by r55 with profit 1687
	 Assigning to 63 (cl=X_REG, orig=34, freq=3000, tfirst=60, tfreq=17000)...
	   Assign 8 to inheritance r63 (freq=3000)
	Hard reg 8 is preferable by r64 with profit 3250
	Hard reg 8 is preferable by r53 with profit 1625
	Hard reg 8 is preferable by r54 with profit 2125
	 Assigning to 64 (cl=X_REG, orig=34, freq=3000, tfirst=60, tfreq=17000)...
	   Assign 8 to inheritance r64 (freq=3000)
	Hard reg 8 is preferable by r53 with profit 2625
	 Assigning to 67 (cl=X_REG, orig=34, freq=3000, tfirst=60, tfreq=17000)...
	   Assign 8 to inheritance r67 (freq=3000)
	Hard reg 8 is preferable by r68 with profit 1875
	Hard reg 8 is preferable by r69 with profit 1500
	Hard reg 8 is preferable by r70 with profit 1875
	Hard reg 8 is preferable by r46 with profit 937
	Hard reg 8 is preferable by r47 with profit 750
	Hard reg 8 is preferable by r48 with profit 937
	Hard reg 8 is preferable by r49 with profit 1593
	 Assigning to 68 (cl=X_REG, orig=34, freq=3000, tfirst=60, tfreq=17000)...
	   Assign 8 to inheritance r68 (freq=3000)
	Hard reg 8 is preferable by r69 with profit 2500
	Hard reg 8 is preferable by r70 with profit 2375
	Hard reg 8 is preferable by r46 with profit 1187
	Hard reg 8 is preferable by r47 with profit 1250
	Hard reg 8 is preferable by r48 with profit 1937
	 Assigning to 69 (cl=X_REG, orig=34, freq=3000, tfirst=60, tfreq=17000)...
	   Assign 8 to inheritance r69 (freq=3000)
	Hard reg 8 is preferable by r70 with profit 3375
	Hard reg 8 is preferable by r46 with profit 1687
	Hard reg 8 is preferable by r47 with profit 2250
	 Assigning to 70 (cl=X_REG, orig=34, freq=3000, tfirst=60, tfreq=17000)...
	   Assign 8 to inheritance r70 (freq=3000)
	Hard reg 8 is preferable by r46 with profit 2687
	 Assigning to 73 (cl=X_REG, orig=34, freq=3000, tfirst=60, tfreq=17000)...
	 Assigning to 75 (cl=X_REG, orig=34, freq=2000, tfirst=60, tfreq=17000)...
	   Assign 8 to inheritance r75 (freq=2000)
	Hard reg 8 is preferable by r39 with profit 1125
	Hard reg 8 is preferable by r74 with profit 1500
	Hard reg 8 is preferable by r40 with profit 750
	Hard reg 8 is preferable by r73 with profit 1500
	Hard reg 8 is preferable by r42 with profit 750
	 Assigning to 38 (cl=BASE_REGS, orig=38, freq=2000, tfirst=38, tfreq=2000)...
	   Assign 8 to reload r38 (freq=2000)
	 Assigning to 44 (cl=DATA_REGS, orig=19, freq=3000, tfirst=44, tfreq=3000)...
	   Assign 6 to reload r44 (freq=3000)
	 Assigning to 51 (cl=DATA_REGS, orig=23, freq=3000, tfirst=51, tfreq=3000)...
	   Assign 6 to reload r51 (freq=3000)
	 Assigning to 37 (cl=DATA_REGS, orig=30, freq=2000, tfirst=37, tfreq=2000)...
	   Assign 6 to reload r37 (freq=2000)
	 Assigning to 39 (cl=ALL_REGS, orig=34, freq=2000, tfirst=39, tfreq=2000)...
	   Assign 8 to reload r39 (freq=2000)
	 Assigning to 40 (cl=GENERAL_REGS, orig=34, freq=2000, tfirst=40, tfreq=2000)...
	   Assign 8 to reload r40 (freq=2000)
	Hard reg 8 is preferable by r74 with profit 2500
	Hard reg 8 is preferable by r73 with profit 2000
	Hard reg 8 is preferable by r42 with profit 1000
	 Assigning to 41 (cl=GENERAL_REGS, orig=41, freq=2000, tfirst=41, tfreq=2000)...
	 Trying 0:
	 Trying 1:
	 Trying 2:
	 Trying 3:
	 Trying 4:
	 Trying 5:
	 Trying 6:
	 Trying 7:
	 Assigning to 42 (cl=GENERAL_REGS, orig=34, freq=2000, tfirst=42, tfreq=2000)...
	 Trying 0:
	 Trying 1:
	 Trying 2:
	 Trying 3:
	 Trying 4:
	 Trying 5:
	 Trying 6:
	 Trying 7:
	 Assigning to 43 (cl=GENERAL_REGS, orig=34, freq=2000, tfirst=43, tfreq=2000)...
	   Assign 8 to reload r43 (freq=2000)
	 Assigning to 45 (cl=GENERAL_REGS, orig=34, freq=2000, tfirst=45, tfreq=2000)...
	   Assign 8 to reload r45 (freq=2000)
	 Assigning to 46 (cl=GENERAL_REGS, orig=34, freq=2000, tfirst=46, tfreq=2000)...
	   Assign 8 to reload r46 (freq=2000)
	 Assigning to 47 (cl=GENERAL_REGS, orig=34, freq=2000, tfirst=47, tfreq=2000)...
	   Assign 8 to reload r47 (freq=2000)
	 Assigning to 48 (cl=GENERAL_REGS, orig=34, freq=2000, tfirst=48, tfreq=2000)...
	   Assign 8 to reload r48 (freq=2000)
	 Assigning to 49 (cl=GENERAL_REGS, orig=34, freq=2000, tfirst=49, tfreq=2000)...
	   Assign 8 to reload r49 (freq=2000)
	 Assigning to 50 (cl=GENERAL_REGS, orig=34, freq=2000, tfirst=50, tfreq=2000)...
	   Assign 8 to reload r50 (freq=2000)
	 Assigning to 52 (cl=GENERAL_REGS, orig=34, freq=2000, tfirst=52, tfreq=2000)...
	   Assign 8 to reload r52 (freq=2000)
	 Assigning to 53 (cl=GENERAL_REGS, orig=34, freq=2000, tfirst=53, tfreq=2000)...
	   Assign 8 to reload r53 (freq=2000)
	 Assigning to 54 (cl=GENERAL_REGS, orig=34, freq=2000, tfirst=54, tfreq=2000)...
	   Assign 8 to reload r54 (freq=2000)
	 Assigning to 55 (cl=GENERAL_REGS, orig=34, freq=2000, tfirst=55, tfreq=2000)...
	   Assign 8 to reload r55 (freq=2000)
	 Assigning to 56 (cl=GENERAL_REGS, orig=34, freq=2000, tfirst=56, tfreq=2000)...
	   Assign 8 to reload r56 (freq=2000)
	 Assigning to 57 (cl=GENERAL_REGS, orig=57, freq=2000, tfirst=57, tfreq=2000)...
	 Trying 0:
	 Trying 1:
	 Trying 2:
	 Trying 3:
	 Trying 4:
	 Trying 5:
	 Trying 6:
	 Trying 7:
	 Trying 8: spill 61(freq=3000)	 Now best 8(cost=2811, bad_spills=0, insn_pseudos=0)

      Spill inheritance r61(hr=8, freq=3000) for r57
	   Assign 8 to reload r57 (freq=2000)
	 Assigning to 58 (cl=GENERAL_REGS, orig=34, freq=2000, tfirst=58, tfreq=2000)...
	 Trying 0:
	 Trying 1:
	 Trying 2:
	 Trying 3:
	 Trying 4:
	 Trying 5:
	 Trying 6:
	 Trying 7:
	 Assigning to 59 (cl=GENERAL_REGS, orig=34, freq=2000, tfirst=59, tfreq=2000)...
	 Trying 0:
	 Trying 1:
	 Trying 2:
	 Trying 3:
	 Trying 4:
	 Trying 5:
	 Trying 6:
	 Trying 7:
  2nd iter for reload pseudo assignments:
	 Reload r41 assignment failure
	 Reload r42 assignment failure
	 Reload r58 assignment failure
	 Reload r59 assignment failure
	  Spill reload  r40(hr=8, freq=2000)
	  Spill  r35(hr=8, freq=2000)
	  Spill reload  r57(hr=8, freq=2000)
	  Spill  r36(hr=8, freq=2000)
	 Assigning to 74 (cl=X_REG, orig=34, freq=3000, tfirst=60, tfreq=17000)...
	   Assign 8 to inheritance r74 (freq=3000)
	Hard reg 8 is preferable by r40 with profit 1750
	Hard reg 8 is preferable by r73 with profit 3000
	Hard reg 8 is preferable by r42 with profit 1500
	 Assigning to 73 (cl=X_REG, orig=34, freq=3000, tfirst=60, tfreq=17000)...
	   Assign 8 to inheritance r73 (freq=3000)
	Hard reg 8 is preferable by r42 with profit 2500
	 Assigning to 40 (cl=GENERAL_REGS, orig=34, freq=2000, tfirst=40, tfreq=2000)...
	   Assign 8 to reload r40 (freq=2000)
	 Assigning to 41 (cl=GENERAL_REGS, orig=41, freq=2000, tfirst=41, tfreq=2000)...
	 Trying 0:
	 Trying 1:
	 Trying 2:
	 Trying 3:
	 Trying 4:
	 Trying 5:
	 Trying 6:
	 Trying 7:
	 Assigning to 42 (cl=GENERAL_REGS, orig=34, freq=2000, tfirst=42, tfreq=2000)...
	   Assign 8 to reload r42 (freq=2000)
	 Assigning to 57 (cl=GENERAL_REGS, orig=57, freq=2000, tfirst=57, tfreq=2000)...
	   Assign 8 to reload r57 (freq=2000)
	 Assigning to 58 (cl=GENERAL_REGS, orig=34, freq=2000, tfirst=58, tfreq=2000)...
	 Trying 0:
	 Trying 1:
	 Trying 2:
	 Trying 3:
	 Trying 4:
	 Trying 5:
	 Trying 6:
	 Trying 7:
	 Assigning to 59 (cl=GENERAL_REGS, orig=34, freq=2000, tfirst=59, tfreq=2000)...
	   Assign 8 to reload r59 (freq=2000)
	Hard reg 8 is preferable by r60 with profit 2093
	Hard reg 8 is preferable by r61 with profit 687
	Hard reg 8 is preferable by r58 with profit 1031
  Reassigning non-reload pseudos

********** Undoing inheritance #1: **********

Inherit 14 out of 16 (87.50%)
   Insn after restoring regs:
   69: r59:PSI=r34:PSI
      REG_DEAD r34:PSI
   Insn after restoring regs:
   68: r58:PSI=r34:PSI
	   Removing inheritance:
   70: r60:PSI=r61:PSI
      REG_DEAD r61:PSI
    Change reload insn:
   66: r56:PSI=r62:PSI
   Insn after restoring regs:
   71: r34:PSI=r62:PSI
      REG_DEAD r62:PSI

****** Splitting a hard reg after assignment #1: ******

	Hard reg 8 is preferable by r60 with profit 2093
	Hard reg 0 is preferable by r60 with profit 1000
	Hard reg 8 is preferable by r61 with profit 687
	Hard reg 0 is preferable by r61 with profit 500
/home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c: In function ‘f1’:
/home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c:10:1: error: unable to find a register to spill
/home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c:10:1: error: this is the insn:
(insn 14 49 15 2 (set (mem:SI (plus:PSI (reg/f:PSI 40 [34])
                (const_int 32 [0x20])) [2  S4 A64])
        (mem:SI (reg:PSI 41) [2 *p_5(D)+0 S4 A8])) "/home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c":9:9 95 {*movsi}
     (expr_list:REG_DEAD (reg:PSI 41)
        (expr_list:REG_DEAD (reg/f:PSI 40 [34])
            (nil))))
during RTL pass: reload
/home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c:10:1: internal compiler error: in lra_split_hard_reg_for, at lra-assigns.c:1841
0x10f2889b _fatal_insn(char const*, rtx_def const*, char const*, int, char const*)
	/home/jmd/Source/GCC2/gcc/rtl-error.c:108
0x10c738ab lra_split_hard_reg_for()
	/home/jmd/Source/GCC2/gcc/lra-assigns.c:1841
0x10c68633 lra(_IO_FILE*)
	/home/jmd/Source/GCC2/gcc/lra.c:2555
0x10bcc9bb do_reload
	/home/jmd/Source/GCC2/gcc/ira.c:5522
0x10bcd187 execute
	/home/jmd/Source/GCC2/gcc/ira.c:5706
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Indirect memory addresses vs. lra
  2019-08-09 14:17               ` Segher Boessenkool
  2019-08-09 14:23                 ` Paul Koning
@ 2019-08-10  6:10                 ` John Darrington
  2019-08-10 16:15                   ` Segher Boessenkool
  1 sibling, 1 reply; 38+ messages in thread
From: John Darrington @ 2019-08-10  6:10 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: John Darrington, Jeff Law, Paul Koning, Vladimir Makarov, gcc

[-- Attachment #1: Type: text/plain, Size: 740 bytes --]

On Fri, Aug 09, 2019 at 09:16:44AM -0500, Segher Boessenkool wrote:

     Is your code in some branch in our git?  

No.  But it could be pushed there if people think it would be
appropriate to do so, and if I'm given the permissions to do so.
     
     Or in some other public git?

It's in my repo on gcc135 ~jmd/gcc-s12z (branch s12z)


     Do you have a representative testcase?

I think gcc/testsuite/gcc.c-torture/compile/pr53410-2.c is as
representative as any.
     

J'

     

-- 
Avoid eavesdropping.  Send strong encrypted email.
PGP Public key ID: 1024D/2DE827B3 
fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
See http://sks-keyservers.net or any PGP keyserver for public key.


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Indirect memory addresses vs. lra
  2019-08-10  6:06                 ` John Darrington
@ 2019-08-10 16:12                   ` Segher Boessenkool
  2019-08-12  6:47                     ` John Darrington
  2019-08-12 13:35                   ` Vladimir Makarov
  2019-08-15 16:29                   ` Vladimir Makarov
  2 siblings, 1 reply; 38+ messages in thread
From: Segher Boessenkool @ 2019-08-10 16:12 UTC (permalink / raw)
  To: John Darrington; +Cc: Vladimir Makarov, Jeff Law, Paul Koning, gcc

Hi!

On Sat, Aug 10, 2019 at 08:05:53AM +0200, John Darrington wrote:
> 	 Choosing alt 5 in insn 14:  (0) m  (1) m {*movsi}
>    14: [r40:PSI+0x20]=[r41:PSI]
>     Inserting insn reload before:
>    48: r40:PSI=r34:PSI
>    49: r41:PSI=[y:PSI+0x2f]

insn 14 is a mem-to-mem move (another feature not many more modern /
more RISCy CPUs have).  That requires both of your address registers.
So far, so good.  The reloads (insn 48 and 49) require address
registers themselves; that isn't necessarily a problem either.  But
this requires careful juggling.  Maybe you will need some backend code
for this, or to optimise this (although right now you just want it to
*work* :-) )

For some reason LRA didn't manage.  Register inheritance seems to be
implicated (but that might be a red herring).  Vladimir will probably
find out more, and/or correct me :-)


Segher

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Indirect memory addresses vs. lra
  2019-08-10  6:10                 ` John Darrington
@ 2019-08-10 16:15                   ` Segher Boessenkool
  0 siblings, 0 replies; 38+ messages in thread
From: Segher Boessenkool @ 2019-08-10 16:15 UTC (permalink / raw)
  To: John Darrington; +Cc: Jeff Law, Paul Koning, Vladimir Makarov, gcc

On Sat, Aug 10, 2019 at 08:10:27AM +0200, John Darrington wrote:
> On Fri, Aug 09, 2019 at 09:16:44AM -0500, Segher Boessenkool wrote:
> 
>      Is your code in some branch in our git?  
> 
> No.  But it could be pushed there if people think it would be
> appropriate to do so, and if I'm given the permissions to do so.
>      
>      Or in some other public git?
> 
> It's in my repo on gcc135 ~jmd/gcc-s12z (branch s12z)

That will work fine, for me at least.

>      Do you have a representative testcase?
> 
> I think gcc/testsuite/gcc.c-torture/compile/pr53410-2.c is as
> representative as any.

Okido, thanks!


Segher

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Indirect memory addresses vs. lra
  2019-08-10 16:12                   ` Segher Boessenkool
@ 2019-08-12  6:47                     ` John Darrington
  2019-08-12  8:40                       ` Segher Boessenkool
  0 siblings, 1 reply; 38+ messages in thread
From: John Darrington @ 2019-08-12  6:47 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: John Darrington, Vladimir Makarov, Jeff Law, Paul Koning, gcc

On Sat, Aug 10, 2019 at 11:12:18AM -0500, Segher Boessenkool wrote:
     Hi!
     
     On Sat, Aug 10, 2019 at 08:05:53AM +0200, John Darrington wrote:
     > 	 Choosing alt 5 in insn 14:  (0) m  (1) m {*movsi}
     >    14: [r40:PSI+0x20]=[r41:PSI]
     >     Inserting insn reload before:
     >    48: r40:PSI=r34:PSI
     >    49: r41:PSI=[y:PSI+0x2f]
     
     insn 14 is a mem-to-mem move (another feature not many more modern /
     more RISCy CPUs have).  That requires both of your address registers.
     So far, so good.  The reloads (insn 48 and 49) require address
     registers themselves; that isn't necessarily a problem either.

So far as I can see, insn 48 is completely redundant.  It's copying a
pseudo reg (74) into another pseudo reg (40).
This is pointless and a waste, since insn 14 does not modify 74.
I don't understand why lra feels the need to do it.

If lra knew about (mem (mem ...)) style addressing, then insn 49 would
also be redundant (which is why I raised the topic).

In summary, what we have is:

(insn 48 84 49 2 (set (reg/f:PSI 40 [34])
        (reg/f:PSI 74 [34]))
     (nil))
(insn 49 48 14 2 (set (reg:PSI 41)
        (mem/f/c:PSI (plus:PSI (reg/f:PSI 9 y)
                (const_int 47 [0x2f])) [3 p+0 S4 A8]))
     (nil))
(insn 14 49 15 2 (set (mem:SI (plus:PSI (reg/f:PSI 40 [34])
                (const_int 32 [0x20])) [2  S4 A64])
        (mem:SI (reg:PSI 41) [2 *p_5(D)+0 S4 A8])) 

where, like you say, insns 48 and 49 are reloads.  But these two reloads 
are unnecessary and cause the machine to run out of PSImode registers.
The above could be done more simply and efficiently as:

(insn 14 11 15 2 (set 
	(mem:SI (plus:PSI (reg/f:PSI 74 [34]) (const_int 32 [0x20])) [2  S4 A64])
        (mem/f/c:PSI (mem:PSI (plus:PSI (reg/f:PSI 9 y)
                (const_int 47 [0x2f])) [3 p+0 S4 A8])))


This is exactly what we had before lra messed with things.  It can be
represented in the ISA with one assembler instruction: 
  mov.p (32, x), [47, y]
and if I'm not mistaken, alternative 5 of my "movpsi" pattern should do
this just fine.
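
To make the addressing question concrete, here is a toy model of an
address check that accepts one level of indirection, i.e. the
(mem (mem ...)) shape behind an operand like [47, y].  This is only a
sketch: the struct, the enum and both function names are hypothetical
stand-ins, not GCC's rtx or any real target hook.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for RTL; these types are hypothetical, not GCC's rtx.  */
enum code { REG, CONST_INT, PLUS, MEM };

struct expr {
  enum code code;
  const struct expr *op0;   /* first operand, or NULL */
  const struct expr *op1;   /* second operand, or NULL */
};

/* reg, or (plus reg const_int): a plain base+offset address.
   Assumes a well-formed tree (PLUS has two operands, MEM has one).  */
static bool
direct_address_p (const struct expr *x)
{
  if (x->code == REG)
    return true;
  return x->code == PLUS
         && x->op0->code == REG
         && x->op1->code == CONST_INT;
}

/* Accept a direct address, or exactly one level of indirection:
   (mem <direct address>) used as an address.  */
static bool
legitimate_address_p (const struct expr *x)
{
  if (direct_address_p (x))
    return true;
  return x->code == MEM && direct_address_p (x->op0);
}
```

In this model (plus y 47) and (mem (plus y 47)) are both accepted as
addresses, while a second level of indirection is rejected.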


     But
     this requires careful juggling.  Maybe you will need some backend code

Could you give a hint into which set of hooks/constraints/predicates
this backend code should go?
     


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Indirect memory addresses vs. lra
  2019-08-12  6:47                     ` John Darrington
@ 2019-08-12  8:40                       ` Segher Boessenkool
  0 siblings, 0 replies; 38+ messages in thread
From: Segher Boessenkool @ 2019-08-12  8:40 UTC (permalink / raw)
  To: John Darrington; +Cc: Vladimir Makarov, Jeff Law, Paul Koning, gcc

Hi John,

On Mon, Aug 12, 2019 at 08:47:43AM +0200, John Darrington wrote:
> On Sat, Aug 10, 2019 at 11:12:18AM -0500, Segher Boessenkool wrote:
>      On Sat, Aug 10, 2019 at 08:05:53AM +0200, John Darrington wrote:
>      > 	 Choosing alt 5 in insn 14:  (0) m  (1) m {*movsi}
>      >    14: [r40:PSI+0x20]=[r41:PSI]
>      >     Inserting insn reload before:
>      >    48: r40:PSI=r34:PSI
>      >    49: r41:PSI=[y:PSI+0x2f]
>      
>      insn 14 is a mem-to-mem move (another feature not many more modern /
>      more RISCy CPUs have).  That requires both of your address registers.
>      So far, so good.  The reloads (insn 48 and 49) require address
>      registers themselves; that isn't necessarily a problem either.
> 
> So far as I can see, insn 48 is completely redundant.  It's copying a
> pseudo reg (74) into another pseudo reg (40).
> This is pointless and a waste, since insn 14 does not modify 74.
> I don't understand why lra feels the need to do it.

LRA always does this, I think...  it reloads all inputs to all insns
that may need reloading.  It later optimises most of that away again,
but this gives it a lot of freedom to move things around.

Or that is what it always looked like to me.  I haven't looked at the
code to see if that is the real reason, blush.
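
A toy illustration of that "reload everything first, clean up later"
idea, and emphatically not LRA's actual code: once hard registers are
assigned, a copy between two pseudos that landed in the same hard
register does nothing and can be dropped.  The struct and function names
are hypothetical.  In the dump posted earlier, r40 and r74 are both
assigned hard reg 8, so a copy like insn 48 would fall into this
category.

```c
#include <stddef.h>

/* Toy model, not LRA code: a copy insn "dst = src" between pseudos.  */
struct copy { int insn; int dst; int src; };

/* Compact COPIES in place, dropping every copy whose two pseudos were
   given the same hard register in HARD_REG (indexed by pseudo number).
   Returns the number of copies kept.  */
static size_t
drop_noop_copies (struct copy *copies, size_t n, const int *hard_reg)
{
  size_t kept = 0;
  for (size_t i = 0; i < n; i++)
    if (hard_reg[copies[i].dst] != hard_reg[copies[i].src])
      copies[kept++] = copies[i];
  return kept;
}
```

The sketch only covers the cleanup step; it says nothing about whether
the assignment itself succeeds, which is where this testcase fails.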

> If lra knew about (mem (mem ...)) style addressing, then insn 49 would
> also be redundant (which is why I raised the topic).

Yes.  But it probably should be able to deal with things like this, too,
or some other testcases will die a horrible death.

> In summary, what we have is:
> 
> (insn 48 84 49 2 (set (reg/f:PSI 40 [34])
>         (reg/f:PSI 74 [34]))
>      (nil))
> (insn 49 48 14 2 (set (reg:PSI 41)
>         (mem/f/c:PSI (plus:PSI (reg/f:PSI 9 y)
>                 (const_int 47 [0x2f])) [3 p+0 S4 A8]))
>      (nil))
> (insn 14 49 15 2 (set (mem:SI (plus:PSI (reg/f:PSI 40 [34])
>                 (const_int 32 [0x20])) [2  S4 A64])
>         (mem:SI (reg:PSI 41) [2 *p_5(D)+0 S4 A8])) 
> 
> where, like you say, insns 48 and 49 are reloads.  But these two reloads 
> are unnecessary and cause the machine to run out of PSImode registers.

Anyway, please have patience, and see what Vladimir comes up with.  These
things take time.


Segher

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Indirect memory addresses vs. lra
  2019-08-10  6:06                 ` John Darrington
  2019-08-10 16:12                   ` Segher Boessenkool
@ 2019-08-12 13:35                   ` Vladimir Makarov
  2019-08-15 16:29                   ` Vladimir Makarov
  2 siblings, 0 replies; 38+ messages in thread
From: Vladimir Makarov @ 2019-08-12 13:35 UTC (permalink / raw)
  To: John Darrington; +Cc: Jeff Law, Segher Boessenkool, Paul Koning, gcc


On 2019-08-10 2:05 a.m., John Darrington wrote:
> On Fri, Aug 09, 2019 at 01:34:36PM -0400, Vladimir Makarov wrote:
>       
>       If you provide LRA dump for such test (it is better to use
>       -fira-verbose=15 to output full RA info into stderr), I probably could
>       say more.
>
> I've attached such a dump (generated from gcc/testsuite/gcc.c-torture/compile/pr53410-2.c).

Unfortunately, this info is not enough for me to say what the problem
is.  The only suspicious thing I found is that LRA tries to assign a few
registers to a pseudo register and fails, even though these registers
are not assigned to anything.  Probably HARD_REGNO_MODE_OK prevents
this.  So it would be interesting to know how many registers of Pmode
are actually available.
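
The count being asked about can be pictured with a small toy model.
This is a sketch under loud assumptions: the mode bitmask, the register
struct and the function name are invented for illustration and do not
describe the s12z register file or GCC's real HARD_REGNO_MODE_OK
interface.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: the modes and whatever register table the caller
   passes in are hypothetical, not taken from any real back end.  */
enum mode { QI = 1 << 0, HI = 1 << 1, PSI = 1 << 2, SI = 1 << 3 };

struct hard_reg {
  unsigned ok_modes;   /* bitmask of modes this register may hold */
  bool fixed;          /* true if the allocator may not use it */
};

/* Count the non-fixed registers in REGS that may hold mode M.  */
static size_t
count_regs_for_mode (const struct hard_reg *regs, size_t n, enum mode m)
{
  size_t count = 0;
  for (size_t i = 0; i < n; i++)
    if (!regs[i].fixed && (regs[i].ok_modes & m))
      count++;
  return count;
}
```

If such a count for the pointer mode comes out smaller than the number
of address registers one insn needs at the same time, assignment
failures like the ones in the dump are to be expected.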

In any case, I'll try to look at this problem more this week, using
the gcc you built on gcc135.

>       
>       The fewer regs the architecture has, the easier it is to run into such an
>       error message if something is described wrong in the back-end.  I see your
>       architecture is a 16-bit micro-controller with only 8 regs, some of them
>       specialized.  So your architecture is really register constrained.
>
> That's not quite correct.  It is a 24-bit micro-controller (the address
> space is 24 bits wide).  There are 2 address registers (plus stack
> pointer and program counter) and there are 8 general purpose data
> registers (of differing sizes).
>       
>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Indirect memory addresses vs. lra
  2019-08-10  6:06                 ` John Darrington
  2019-08-10 16:12                   ` Segher Boessenkool
  2019-08-12 13:35                   ` Vladimir Makarov
@ 2019-08-15 16:29                   ` Vladimir Makarov
  2019-08-15 16:38                     ` Richard Biener
  2019-08-15 17:36                     ` John Darrington
  2 siblings, 2 replies; 38+ messages in thread
From: Vladimir Makarov @ 2019-08-15 16:29 UTC (permalink / raw)
  To: John Darrington; +Cc: Jeff Law, Segher Boessenkool, Paul Koning, gcc

On 8/10/19 2:05 AM, John Darrington wrote:
> On Fri, Aug 09, 2019 at 01:34:36PM -0400, Vladimir Makarov wrote:
>       
>       If you provide LRA dump for such test (it is better to use
>       -fira-verbose=15 to output full RA info into stderr), I probably could
>       say more.
>
> I've attached such a dump (generated from gcc/testsuite/gcc.c-torture/compile/pr53410-2.c).
>       
>       The fewer regs the architecture has, the easier it is to run into such an error
>       message if something is described wrong in the back-end.  I see your
>       architecture is a 16-bit micro-controller with only 8 regs, some of which are
>       specialized.  So your architecture is really register constrained.
>
> That's not quite correct.  It is a 24-bit micro-controller (the address
> space is 24 bits wide).  There are 2 address registers (plus stack
> pointer and program counter) and there are 8 general purpose data
> registers (of differing sizes).
>       
>
> J'
>
Thank you for providing the sources.  It helped me to understand what is 
going on.  So the test crashes on

/home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c: In function ‘f1’:
/home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c:10:1: error: unable to find a register to spill
/home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c:10:1: error: this is the insn:
(insn 14 49 15 2 (set (mem:SI (plus:PSI (reg/f:PSI 40 [34])
                 (const_int 32 [0x20])) [2  S4 A64])
         (mem:SI (reg:PSI 41) [2 *p_5(D)+0 S4 A8])) "/home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c":9:9 95 {*movsi}
      (expr_list:REG_DEAD (reg:PSI 41)
         (expr_list:REG_DEAD (reg/f:PSI 40 [34])
             (nil))))

Your target has only 2 non-fixed addr registers (r8, r9).  One (r9) is defined as the hard frame pointer. Honestly, I never saw a target with such register constraints.

-O0 assumes -fno-omit-frame-pointer.  So in -O0 mode we have only *one* free addr reg for an insn which requires *2* of them.  That is why the GCC port crashes on this test.  If you add -fomit-frame-pointer, the test succeeds.

But even if you use -fomit-frame-pointer, it is not guaranteed that the hard frame pointer will be replaced by the stack pointer.  There are many cases where this is not possible (e.g. when alloca is used).
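
The counting argument above can be sketched as a small standalone C model.  This is not GCC code; the function names and the rule that alloca forces a frame pointer are simplifying assumptions taken from the text, and only the figure of 2 address registers comes from the message itself:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not GCC code.  The port has two non-fixed
   address registers, r8 and r9.  */
enum { NUM_ADDR_REGS = 2 };

/* A frame pointer is kept unless it is omitted, and alloca usage
   forces one even when omission was requested (simplified).  */
static bool
frame_pointer_needed (bool omit_frame_pointer, bool uses_alloca)
{
  return !omit_frame_pointer || uses_alloca;
}

/* Address registers left over for the operands of a single insn.  */
static int
free_addr_regs (bool fp_needed)
{
  return NUM_ADDR_REGS - (fp_needed ? 1 : 0);
}

/* True if an insn needing ADDR_REGS_REQUIRED address registers at
   the same time can be satisfied.  */
static bool
insn_allocatable (int addr_regs_required, bool omit_frame_pointer,
		  bool uses_alloca)
{
  assert (addr_regs_required >= 0);
  bool fp = frame_pointer_needed (omit_frame_pointer, uses_alloca);
  return addr_regs_required <= free_addr_regs (fp);
}
```

In this model insn_allocatable (2, false, false) corresponds to the failing -O0 case, and insn_allocatable (2, true, true) to the alloca case where -fomit-frame-pointer does not help.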

So what can be done, imho.  The simplest solution would be preventing insns with more than one memory operand.  The more difficult solution would be permitting two memory operands, one addressed by a pseudo and the other by the stack pointer.

I think only after solving this problem, you could think about implementing indirect memory addressing.

  

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Indirect memory addresses vs. lra
  2019-08-15 16:29                   ` Vladimir Makarov
@ 2019-08-15 16:38                     ` Richard Biener
  2019-08-15 17:41                       ` John Darrington
  2019-08-15 18:30                       ` Vladimir Makarov
  2019-08-15 17:36                     ` John Darrington
  1 sibling, 2 replies; 38+ messages in thread
From: Richard Biener @ 2019-08-15 16:38 UTC (permalink / raw)
  To: gcc, Vladimir Makarov, John Darrington
  Cc: Jeff Law, Segher Boessenkool, Paul Koning

On August 15, 2019 6:29:13 PM GMT+02:00, Vladimir Makarov <vmakarov@redhat.com> wrote:
>On 8/10/19 2:05 AM, John Darrington wrote:
>> On Fri, Aug 09, 2019 at 01:34:36PM -0400, Vladimir Makarov wrote:
>>       
>>       If you provide LRA dump for such test (it is better to use
>>       -fira-verbose=15 to output full RA info into stderr), I
>probably could
>>       say more.
>>
>> I've attached such a dump (generated from
>gcc/testsuite/gcc.c-torture/compile/pr53410-2.c).
>>       
>>       The fewer regs the architecture has, the easier it is to run into
>such an error
>>       message if something is described wrong in the back-end.  I see
>your
>>       architecture is a 16-bit micro-controller with only 8 regs, some
>of which are
>>       specialized.  So your architecture is really register
>constrained.
>>
>> That's not quite correct.  It is a 24-bit micro-controller (the
>address
>> space is 24 bits wide).  There are 2 address registers (plus stack
>> pointer and program counter) and there are 8 general purpose data
>> registers (of differing sizes).
>>       
>>
>> J'
>>
>Thank you for providing the sources.  It helped me to understand what
>is 
>going on.  So the test crashes on
>
>/home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c:
>In function ‘f1’:
>/home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c:10:1:
>error: unable to find a register to spill
>/home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c:10:1:
>error: this is the insn:
>(insn 14 49 15 2 (set (mem:SI (plus:PSI (reg/f:PSI 40 [34])
>                 (const_int 32 [0x20])) [2  S4 A64])
>(mem:SI (reg:PSI 41) [2 *p_5(D)+0 S4 A8]))
>"/home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c":9:9
>95 {*movsi}
>      (expr_list:REG_DEAD (reg:PSI 41)
>         (expr_list:REG_DEAD (reg/f:PSI 40 [34])
>             (nil))))
>
>Your target has only 2 non-fixed addr registers (r8, r9).  One (r9) is
>defined as the hard frame pointer. Honestly, I never saw a target
>with such register constraints.
>
>-O0 assumes -fno-omit-frame-pointer.  So in -O0 mode we have only *one*
>free addr reg for insn which requires *2* of them.  That is why the GCC
>port crashes on this test.  If you add -fomit-frame-pointer, the test
>succeeds.
>
>But even if use -fomit-frame-pointer,  it is not guaranteed that hard
>reg pointer will be substituted by stack pointer.  There are many cases
>where it is not possible (e.g. in case of alloca usage).
>
>So what can be done, imho.  The simplest solution would be preventing
>insns with more than one memory operand.  The more difficult solution would
>be permitting two memory operands, one addressed by a pseudo and the other
>by the stack pointer.

Couldn't we spill the frame pointer? Basically we should be able to compute the first address into a reg, spill that, do the second (both could require the frame pointer), spill the frame pointer, reload the first computed address from the stack, execute the insn and then reload the frame pointer.
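
A minimal sketch of that sequence on a toy machine, written as standalone C.  Everything here is hypothetical (one scratch address register x, a frame pointer fp, a downward-growing stack, word-addressed memory); it is not the real s12z and not what LRA emits, it only illustrates the order of the spills and reloads described above:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy machine, not the real s12z.  */
enum { MEM_WORDS = 64 };

struct machine
{
  uint32_t mem[MEM_WORDS];
  uint32_t x, fp, sp;	/* scratch address reg, frame pointer, stack pointer */
};

static void
push (struct machine *m, uint32_t v)
{
  assert (m->sp > 0);
  m->mem[--m->sp] = v;
}

static uint32_t
pop (struct machine *m)
{
  assert (m->sp < MEM_WORDS);
  return m->mem[m->sp++];
}

/* Performs mem[P + dst_off] = mem[Q], where the pointers P and Q live
   in the frame slots fp[dst_slot] and fp[src_slot].  */
static void
move_mem_to_mem (struct machine *m, uint32_t dst_slot, uint32_t dst_off,
		 uint32_t src_slot)
{
  /* 1. Compute the first address into x and spill it.  */
  m->x = m->mem[m->fp + dst_slot] + dst_off;
  push (m, m->x);
  /* 2. Compute the second address into x; fp is still usable here.  */
  m->x = m->mem[m->fp + src_slot];
  /* 3. Spill fp, which frees it as a second address register.  */
  push (m, m->fp);
  /* 4. Reload the first address; it sits just above the saved fp.  */
  m->fp = m->mem[m->sp + 1];
  /* 5. Execute the insn with both addresses in registers.  */
  m->mem[m->fp] = m->mem[m->x];
  /* 6. Reload fp and drop the spilled address.  */
  m->fp = pop (m);
  (void) pop (m);
}
```

After the call, fp and sp have their original values again, so the frame is intact for the following insns.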

Maybe the frame pointer can also be implemented 'virtually' in an index register that you keep updated so that sp + reg
is the FP. Or frame accesses can use a
stack slot as FP and the indirect memory 
addressing... (is there an indirect lea?) 

>I think only after solving this problem, you could think about
>implementing indirect memory addressing.
>
>  

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Indirect memory addresses vs. lra
  2019-08-15 16:29                   ` Vladimir Makarov
  2019-08-15 16:38                     ` Richard Biener
@ 2019-08-15 17:36                     ` John Darrington
  2019-08-15 18:23                       ` Vladimir Makarov
  1 sibling, 1 reply; 38+ messages in thread
From: John Darrington @ 2019-08-15 17:36 UTC (permalink / raw)
  To: Vladimir Makarov
  Cc: John Darrington, Jeff Law, Segher Boessenkool, Paul Koning, gcc

On Thu, Aug 15, 2019 at 12:29:13PM -0400, Vladimir Makarov wrote:


     Thank you for providing the sources.  It helped me to understand what is
     going on.  So the test crashes on
     
     /home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c: In function ‘f1’:
     /home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c:10:1: error: unable to find a register to spill
     /home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c:10:1: error: this is the insn:
     (insn 14 49 15 2 (set (mem:SI (plus:PSI (reg/f:PSI 40 [34])
                     (const_int 32 [0x20])) [2  S4 A64])
             (mem:SI (reg:PSI 41) [2 *p_5(D)+0 S4 A8])) "/home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c":9:9 95 {*movsi}
          (expr_list:REG_DEAD (reg:PSI 41)
             (expr_list:REG_DEAD (reg/f:PSI 40 [34])
                 (nil))))

Thanks for taking a look.
     
     Your target has only 2 non-fixed addr registers (r8, r9).  One (r9) is defined as the hard frame pointer.

That is correct.

     Honestly, I never saw a target with such register constraints.

My recollection is that MC68HC11 was the same.
     
     So what can be done, imho.  The simplest solution would be preventing insns with more than one memory operand.

I tried this solution earlier.  But unfortunately it makes things worse.  What happens is that libgcc cannot
even be built -- ICEs occur on a memory from address reg insn such as:
     
(insn 117 2981 3697 5 (set (mem/f:PSI (plus:PSI (reg:PSI 1309)
                (const_int 102 [0x66])) [3 fs_129(D)->pc+0 S4 A8])
		        (reg:PSI 1310)) "/home/jmd/Source/GCC2/libgcc/unwind-dw2.c":977:9 96 {movpsi}


J'
     

-- 
Avoid eavesdropping.  Send strong encrypted email.
PGP Public key ID: 1024D/2DE827B3 
fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
See http://sks-keyservers.net or any PGP keyserver for public key.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Indirect memory addresses vs. lra
  2019-08-15 16:38                     ` Richard Biener
@ 2019-08-15 17:41                       ` John Darrington
  2019-08-15 18:30                       ` Vladimir Makarov
  1 sibling, 0 replies; 38+ messages in thread
From: John Darrington @ 2019-08-15 17:41 UTC (permalink / raw)
  To: Richard Biener
  Cc: gcc, Vladimir Makarov, John Darrington, Jeff Law,
	Segher Boessenkool, Paul Koning

On Thu, Aug 15, 2019 at 06:38:30PM +0200, Richard Biener wrote:

   Couldn't we spill the frame pointer? Basically we should be able to
   compute the first address into a reg, spill that, do the second
   (both could require the frame pointer), spill the frame pointer,
   reload the first computed address from the stack, execute the insn
   and then reload the frame pointer. 
     
     Maybe the frame pointer can also be implemented 'virtually' in an index register that you keep updated so that sp + reg
     is the FP. Or frame accesses can use a stack slot as FP and the indirect memory 
     addressing... (is there an indirect lea?)

Yes.  lea x, [4,x] is a valid instruction.

J'

-- 
Avoid eavesdropping.  Send strong encrypted email.
PGP Public key ID: 1024D/2DE827B3 
fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
See http://sks-keyservers.net or any PGP keyserver for public key.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Indirect memory addresses vs. lra
  2019-08-15 17:36                     ` John Darrington
@ 2019-08-15 18:23                       ` Vladimir Makarov
  2019-08-16 11:24                         ` Special Memory Constraint [was Re: Indirect memory addresses vs. lra] John Darrington
  0 siblings, 1 reply; 38+ messages in thread
From: Vladimir Makarov @ 2019-08-15 18:23 UTC (permalink / raw)
  To: gcc

On 8/15/19 1:35 PM, John Darrington wrote:
> On Thu, Aug 15, 2019 at 12:29:13PM -0400, Vladimir Makarov wrote:
>
>
>       Thank you for providing the sources.  It helped me to understand what is
>       going on.  So the test crashes on
>       
>       /home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c: In function ‘f1’:
>       /home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c:10:1: error: unable to find a register to spill
>       /home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c:10:1: error: this is the insn:
>       (insn 14 49 15 2 (set (mem:SI (plus:PSI (reg/f:PSI 40 [34])
>                       (const_int 32 [0x20])) [2  S4 A64])
>               (mem:SI (reg:PSI 41) [2 *p_5(D)+0 S4 A8])) "/home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c":9:9 95 {*movsi}
>            (expr_list:REG_DEAD (reg:PSI 41)
>               (expr_list:REG_DEAD (reg/f:PSI 40 [34])
>                   (nil))))
>
> Thanks for taking a look.
>       
>       Your target has only 2 non-fixed addr registers (r8, r9).  One (r9) is defined as the hard frame pointer.
>
> That is correct.
>
>       Honestly, I never saw a target with such register constraints.
>
> My recollection is that MC68HC11 was the same.
>       
>       So what can be done, imho.  The simplest solution would be preventing insns with more than one memory operand.
>
> I tried this solution earlier.  But unfortunately it makes things worse.  What happens is that libgcc cannot
> even be built -- ICEs occur on a memory from address reg insn such as:
>       
> (insn 117 2981 3697 5 (set (mem/f:PSI (plus:PSI (reg:PSI 1309)
>                  (const_int 102 [0x66])) [3 fs_129(D)->pc+0 S4 A8])
> 		        (reg:PSI 1310)) "/home/jmd/Source/GCC2/libgcc/unwind-dw2.c":977:9 96 {movpsi}
>
I see.  Then for the insn, you could try to create a pattern 
"memory,special memory constraint".  The special memory constraint 
should be satisfied only by a spilled pseudo (a pseudo with reg_renumber == -1).  I 
believe lra-constraints.c can spill the pseudo, and at the end you will have 
mem[disp1 + r8|r9|sp] = mem[disp1+sp].

It might work.  If it does not, we could modify LRA to do this.

Another solution would be adding a nonexistent register Z, and for mem:psi 
[psi:r] = Z you could emit an assembler insn: mem[psi:r] = the stack slot 
corresponding to Z.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Indirect memory addresses vs. lra
  2019-08-15 16:38                     ` Richard Biener
  2019-08-15 17:41                       ` John Darrington
@ 2019-08-15 18:30                       ` Vladimir Makarov
  2019-08-15 21:22                         ` Segher Boessenkool
  1 sibling, 1 reply; 38+ messages in thread
From: Vladimir Makarov @ 2019-08-15 18:30 UTC (permalink / raw)
  To: Richard Biener, gcc, John Darrington
  Cc: Jeff Law, Segher Boessenkool, Paul Koning

On 8/15/19 12:38 PM, Richard Biener wrote:
> On August 15, 2019 6:29:13 PM GMT+02:00, Vladimir Makarov <vmakarov@redhat.com> wrote:
>> On 8/10/19 2:05 AM, John Darrington wrote:
>>> On Fri, Aug 09, 2019 at 01:34:36PM -0400, Vladimir Makarov wrote:
>>>        
>>>        If you provide LRA dump for such test (it is better to use
>>>        -fira-verbose=15 to output full RA info into stderr), I
>> probably could
>>>        say more.
>>>
>>> I've attached such a dump (generated from
>> gcc/testsuite/gcc.c-torture/compile/pr53410-2.c).
>>>        
>>>        The fewer regs the architecture has, the easier it is to run into
>> such an error
>>>        message if something is described wrong in the back-end.  I see
>> your
>>>        architecture is a 16-bit micro-controller with only 8 regs, some
>> of which are
>>>        specialized.  So your architecture is really register
>> constrained.
>>> That's not quite correct.  It is a 24-bit micro-controller (the
>> address
>>> space is 24 bits wide).  There are 2 address registers (plus stack
>>> pointer and program counter) and there are 8 general purpose data
>>> registers (of differing sizes).
>>>        
>>>
>>> J'
>>>
>> Thank you for providing the sources.  It helped me to understand what
>> is
>> going on.  So the test crashes on
>>
>> /home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c:
>> In function ‘f1’:
>> /home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c:10:1:
>> error: unable to find a register to spill
>> /home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c:10:1:
>> error: this is the insn:
>> (insn 14 49 15 2 (set (mem:SI (plus:PSI (reg/f:PSI 40 [34])
>>                  (const_int 32 [0x20])) [2  S4 A64])
>> (mem:SI (reg:PSI 41) [2 *p_5(D)+0 S4 A8]))
>> "/home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c":9:9
>> 95 {*movsi}
>>       (expr_list:REG_DEAD (reg:PSI 41)
>>          (expr_list:REG_DEAD (reg/f:PSI 40 [34])
>>              (nil))))
>>
>> Your target has only 2 non-fixed addr registers (r8, r9).  One (r9) is
>> defined as the hard frame pointer. Honestly, I never saw a target
>> with such register constraints.
>>
>> -O0 assumes -fno-omit-frame-pointer.  So in -O0 mode we have only *one*
>> free addr reg for insn which requires *2* of them.  That is why the GCC
>> port crashes on this test.  If you add -fomit-frame-pointer, the test
>> succeeds.
>>
>> But even if use -fomit-frame-pointer,  it is not guaranteed that hard
>> reg pointer will be substituted by stack pointer.  There are many cases
>> where it is not possible (e.g. in case of alloca usage).
>>
>> So what can be done, imho.  The simplest solution would be preventing
>> insns with more than one memory operand.  The more difficult solution would
>> be permitting two memory operands, one addressed by a pseudo and the other
>> by the stack pointer.
> Couldn't we spill the frame pointer? Basically we should be able to compute the first address into a reg, spill that, do the second (both could require the frame pointer), spill the frame pointer, reload the first computed address from the stack, execute the insn and then reload the frame pointer.
>
> Maybe the frame pointer can also be implemented 'virtually' in an index register that you keep updated so that sp + reg
> is the FP. Or frame accesses can use a
> stack slot as FP and the indirect memory
> addressing... (is there an indirect lea?)
>
Yes, it could be a solution.  It just needs some target maintainer 
creativity.  There are a lot of things (tricks) that can be done in 
machine-dependent code which would not require RA changes.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Indirect memory addresses vs. lra
  2019-08-15 18:30                       ` Vladimir Makarov
@ 2019-08-15 21:22                         ` Segher Boessenkool
  0 siblings, 0 replies; 38+ messages in thread
From: Segher Boessenkool @ 2019-08-15 21:22 UTC (permalink / raw)
  To: Vladimir Makarov
  Cc: Richard Biener, gcc, John Darrington, Jeff Law, Paul Koning

On Thu, Aug 15, 2019 at 02:30:19PM -0400, Vladimir Makarov wrote:
> >Couldn't we spill the frame pointer? Basically we should be able to 
> >compute the first address into a reg, spill that, do the second (both 
> >could require the frame pointer), spill the frame pointer, reload the 
> >first computed address from the stack, execute the insn and then reload 
> >the frame pointer.
> >
> >Maybe the frame pointer can also be implemented 'virtually' in an index 
> >register that you keep updated so that sp + reg
> >is the FP. Or frame accesses can use a
> >stack slot as FP and the indirect memory
> >addressing... (is there an indirect lea?)
> >
> Yes, it could be a solution.  It just needs some target maintainer 
> creativity.  There are a lot of things (tricks) that can be done in 
> machine-dependent code which would not require RA changes.

You can even go as far as not having the hard frame pointer be a machine
register at all.  In RTL it will still be a reg, but that doesn't mean
the machine code you emit should be like that; you can use a special
fixed memory location for it, for example.


Segher

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Special Memory Constraint [was Re: Indirect memory addresses vs. lra]
  2019-08-15 18:23                       ` Vladimir Makarov
@ 2019-08-16 11:24                         ` John Darrington
  2019-08-16 14:50                           ` Vladimir Makarov
  0 siblings, 1 reply; 38+ messages in thread
From: John Darrington @ 2019-08-16 11:24 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: gcc

On Thu, Aug 15, 2019 at 02:23:45PM -0400, Vladimir Makarov wrote:

     > I tried this solution earlier.  But unfortunately it makes things worse.  What happens is that libgcc cannot
     > even be built -- ICEs occur on a memory from address reg insn such as:
     > (insn 117 2981 3697 5 (set (mem/f:PSI (plus:PSI (reg:PSI 1309)
     >                  (const_int 102 [0x66])) [3 fs_129(D)->pc+0 S4 A8])
     > 		        (reg:PSI 1310)) "/home/jmd/Source/GCC2/libgcc/unwind-dw2.c":977:9 96 {movpsi}
     > 
     I see.  Then for the insn, you could try to create a pattern
     "memory,special memory constraint".  The special memory constraint
     should be satisfied only by a spilled pseudo (a pseudo with reg_renumber == -1).  I
     believe lra-constraints.c can spill the pseudo, and at the end you will have
     mem[disp1 + r8|r9|sp] = mem[disp1+sp].

You mean something like this:

(define_special_memory_constraint "a"
 "My special memory constraint"
 (match_operand 0 "my_special_predicate")
)

(define_predicate "my_special_predicate"
		    (match_operand 0 "memory_operand")
 {
  debug_rtx (op);
  if (MEM_P (op))
  {
    op = XEXP (op, 0);
    if (GET_CODE (op) == PLUS)
      {
	op = XEXP (op, 0);
	if (REG_P (op))
	  {
	    fprintf (stderr, "Reg number is %d\n", REGNO (op));
	    if (REGNO (op) >= 0)
	      return false;
	  }
      }
  }
  return true;
})

When I use this I get lots of the following ICEs

     "internal compiler error: maximum number of generated reload insns per insn achieved (90)"

It seems logical to me that this would happen since the constraint is not going to match any
operand with resolved registers.  Thus it will continually reload.

... which makes me think I've probably misunderstood what you are saying.
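
The reasoning can be illustrated with a toy C model.  This is only a sketch of the argument, not of what lra-constraints.c actually does: the limit of 90 is taken from the ICE message, every name is hypothetical, and a "reload" is modelled simply as moving the address into the next register:

```c
#include <assert.h>
#include <stdbool.h>

/* Limit quoted in the ICE message; the name is hypothetical.  */
enum { MAX_RELOADS_PER_INSN = 90 };

typedef bool (*address_pred) (unsigned regno);

/* Stand-in for a predicate that rejects every register it is shown.  */
static bool
never_matches (unsigned regno)
{
  (void) regno;
  return false;
}

/* Stand-in for a predicate that one register (here: 8) can satisfy.  */
static bool
matches_reg_8 (unsigned regno)
{
  return regno == 8;
}

/* Returns the number of reloads needed before PRED accepts the
   address register, or -1 once the per-insn limit is exhausted.  */
static int
reloads_until_match (address_pred pred, unsigned regno)
{
  for (int n = 0; n <= MAX_RELOADS_PER_INSN; n++, regno++)
    if (pred (regno))
      return n;
  return -1;
}
```

With never_matches the loop always runs into the limit, whatever register it starts from; with a predicate that some register can satisfy it stops after a bounded number of steps.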

J'


-- 
Avoid eavesdropping.  Send strong encrypted email.
PGP Public key ID: 1024D/2DE827B3 
fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
See http://sks-keyservers.net or any PGP keyserver for public key.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Special Memory Constraint [was Re: Indirect memory addresses vs. lra]
  2019-08-16 11:24                         ` Special Memory Constraint [was Re: Indirect memory addresses vs. lra] John Darrington
@ 2019-08-16 14:50                           ` Vladimir Makarov
  2019-08-19  7:36                             ` John Darrington
  0 siblings, 1 reply; 38+ messages in thread
From: Vladimir Makarov @ 2019-08-16 14:50 UTC (permalink / raw)
  To: John Darrington; +Cc: gcc


On 2019-08-16 7:23 a.m., John Darrington wrote:
> On Thu, Aug 15, 2019 at 02:23:45PM -0400, Vladimir Makarov wrote:
>
>       > I tried this solution earlier.  But unfortunately it makes things worse.  What happens is that libgcc cannot
>       > even be built -- ICEs occur on a memory from address reg insn such as:
>       > (insn 117 2981 3697 5 (set (mem/f:PSI (plus:PSI (reg:PSI 1309)
>       >                  (const_int 102 [0x66])) [3 fs_129(D)->pc+0 S4 A8])
>       > 		        (reg:PSI 1310)) "/home/jmd/Source/GCC2/libgcc/unwind-dw2.c":977:9 96 {movpsi}
>       >
>       I see.  Then for the insn, you could try to create a pattern
>       "memory,special memory constraint".  The special memory constraint
>       should be satisfied only by a spilled pseudo (a pseudo with reg_renumber == -1).  I
>       believe lra-constraints.c can spill the pseudo, and at the end you will have
>       mem[disp1 + r8|r9|sp] = mem[disp1+sp].
>
> You mean something like this:
>
> (define_special_memory_constraint "a"
>   "My special memory constraint"
>   (match_operand 0 "my_special_predicate")
> )
>
> (define_predicate "my_special_predicate"
> 		    (match_operand 0 "memory_operand")
>   {
>    debug_rtx (op);
>    if (MEM_P (op))
>    {
>      op = XEXP (op, 0);
>      if (GET_CODE (op) == PLUS)
>        {
> 	op = XEXP (op, 0);
> 	if (REG_P (op))
> 	  {
> 	    fprintf (stderr, "Reg number is %d\n", REGNO (op));
> 	    if (REGNO (op) >= 0)
> 	      return false;
> 	  }
>        }
>    }
>    return true;
> })

No, I meant something like this:

(define_special_memory_constraint "a" ...)
(define_predicate "my_special_predicate" ...
		
  {
    if (lra_in_progress)
      return REG_P (op) && REGNO (op) >= FIRST_PSEUDO_REGISTER && reg_renumber[REGNO(op)] < 0;
    return true if memory with sp addressing;
})

I think LRA will spill the pseudo-register, and it will become memory 
addressed by sp at the end of LRA.
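
To make the two branches of that predicate concrete, here is a standalone C model of the logic.  It is a sketch, not GCC code: the globals only mimic GCC's lra_in_progress and reg_renumber, and the values of FIRST_PSEUDO_REGISTER, SP_REGNUM and the operand representation are assumptions made for illustration:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for GCC internals; values are illustrative.  */
enum { FIRST_PSEUDO_REGISTER = 16, MAX_REGS = 64, SP_REGNUM = 10 };

static bool lra_in_progress;
static int reg_renumber[MAX_REGS];	/* < 0 means "spilled, no hard reg" */

enum operand_kind { OP_REG, OP_MEM_BASE_REG };

struct operand
{
  enum operand_kind kind;
  int regno;	/* the register itself, or the base register of a MEM */
};

static bool
special_mem_ok (const struct operand *op)
{
  /* During LRA: accept only a pseudo that got no hard register.  */
  if (lra_in_progress)
    return (op->kind == OP_REG
	    && op->regno >= FIRST_PSEUDO_REGISTER
	    && reg_renumber[op->regno] < 0);
  /* Otherwise: accept only memory addressed through sp.  */
  return op->kind == OP_MEM_BASE_REG && op->regno == SP_REGNUM;
}
```

Note that in this reading the operand accepted during LRA is the spilled pseudo itself (a REG), not a MEM whose address contains a register.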

> When I use this I get lots of the following ICEs
>
>       "internal compiler error: maximum number of generated reload insns per insn achieved (90)"
>
> It seems logical to me that this would happen since the constraint is not going to match any
> operand with resolved registers.  Thus it will continually reload.
>
> ... which makes me think I've probably misunderstood what you are saying.
>
> J'
>
>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Special Memory Constraint [was Re: Indirect memory addresses vs. lra]
  2019-08-16 14:50                           ` Vladimir Makarov
@ 2019-08-19  7:36                             ` John Darrington
  2019-08-19 13:14                               ` Vladimir Makarov
  0 siblings, 1 reply; 38+ messages in thread
From: John Darrington @ 2019-08-19  7:36 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: John Darrington, gcc

On Fri, Aug 16, 2019 at 10:50:13AM -0400, Vladimir Makarov wrote:
     
     
     No I meant something like that
     
     (define_special_memory_constraint "a" ...)
     (define_predicate "my_special_predicate" ...
     		
      {
        if (lra_in_progress)
          return REG_P (op) && REGNO (op) >= FIRST_PSEUDO_REGISTER && reg_renumber[REGNO(op)] < 0;
        return true if memory with sp addressing;
     })
     
     I think LRA spills pseudo-register and it will be memory addressed by sp
     at the end of LRA.

What I've done is this:

(define_predicate "my_special_predicate"
		    (match_operand 0 "memory_operand")
 {
   debug_rtx (op);
   gcc_assert (MEM_P (op));
   op = XEXP (op, 0);
   if (GET_CODE (op) == PLUS)
     op = XEXP (op, 0);

   if (lra_in_progress)
     {
       fprintf (stderr, "%s:%d\n", __FILE__, __LINE__);
       return REG_P (op) && REGNO (op) >= FIRST_PSEUDO_REGISTER && reg_renumber[REGNO(op)] < 0;
     }


   if (REG_P (op))
     {
       int regno = REGNO (op);
       return (regno == 10); // register is the stack pointer
     }

   return true;
 })

 (and many variations)  Unfortunately, any moderately complicated input
 still results in a (mem (reg) ) insn repeatedly entering the
 lra_in_progress case and returning false, and eventually terminating with
     
 "internal compiler error: maximum number of generated reload insns per insn achieved (90)"


Any other ideas?

J'

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Special Memory Constraint [was Re: Indirect memory addresses vs. lra]
  2019-08-19  7:36                             ` John Darrington
@ 2019-08-19 13:14                               ` Vladimir Makarov
  2019-08-19 15:07                                 ` Segher Boessenkool
  0 siblings, 1 reply; 38+ messages in thread
From: Vladimir Makarov @ 2019-08-19 13:14 UTC (permalink / raw)
  To: gcc


On 2019-08-19 3:35 a.m., John Darrington wrote:
> On Fri, Aug 16, 2019 at 10:50:13AM -0400, Vladimir Makarov wrote:
>       
>       
>       No I meant something like that
>       
>       (define_special_memory_constraint "a" ...)
>       (define_predicate "my_special_predicate" ...
>       		
>        {
>          if (lra_in_progress)
>            return REG_P (op) && REGNO (op) >= FIRST_PSEUDO_REGISTER && reg_renumber[REGNO(op)] < 0;
>          return true if memory with sp addressing;
>       })
>       
>       I think LRA spills pseudo-register and it will be memory addressed by sp
>       at the end of LRA.
>
> What I've done is this:
>
> (define_predicate "my_special_predicate"
> 		    (match_operand 0 "memory_operand")
>   {
>     debug_rtx (op);
>     gcc_assert (MEM_P (op));
>     op = XEXP (op, 0);
>     if (GET_CODE (op) == PLUS)
>       op = XEXP (op, 0);
>
>     if (lra_in_progress)
>       {
>         fprintf (stderr, "%s:%d\n", __FILE__, __LINE__);
>         return REG_P (op) && REGNO (op) >= FIRST_PSEUDO_REGISTER && reg_renumber[REGNO(op)] < 0;
>       }
>
>
>     if (REG_P (op))
>       {
>         int regno = REGNO (op);
>         return (regno == 10); // register is the stack pointer
>       }
>
>     return true;
>   })
>
>   (and many variations)  Unfortunately, any moderately complicated input
>   still results in a (mem (reg) ) insn repeatedly entering the
>   lra_in_progress case and returning false, and eventually terminating with
>       
>   "internal compiler error: maximum number of generated reload insns per insn achieved (90)"
>
>
> Any other ideas?
   As I remember there were a few other ideas from Richard Biener and 
Segher Boessenkool.  I also proposed adding a new address register which 
will always be a fixed stack memory slot at the end.  Unfortunately I am 
not familiar enough with the target and the port to say in detail how to do 
it.  But I think it is worth trying.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Special Memory Constraint [was Re: Indirect memory addresses vs. lra]
  2019-08-19 13:14                               ` Vladimir Makarov
@ 2019-08-19 15:07                                 ` Segher Boessenkool
  2019-08-19 18:06                                   ` John Darrington
  0 siblings, 1 reply; 38+ messages in thread
From: Segher Boessenkool @ 2019-08-19 15:07 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: gcc

On Mon, Aug 19, 2019 at 09:14:22AM -0400, Vladimir Makarov wrote:
> On 2019-08-19 3:35 a.m., John Darrington wrote:
> >On Fri, Aug 16, 2019 at 10:50:13AM -0400, Vladimir Makarov wrote:
> >      No I meant something like that
> >      
> >      (define_special_memory_constraint "a" ...)
> >      (define_predicate "my_special_predicate" ...
> >      		
> >       {
> >         if (lra_in_progress)
> >           return REG_P (op) && REGNO (op) >= FIRST_PSEUDO_REGISTER && 
> >           reg_renumber[REGNO(op)] < 0;
> >         return true if memory with sp addressing;
> >      })
> >      
> >      I think LRA spills pseudo-register and it will be memory addressed 
> >      by sp
> >      at the end of LRA.
> >
> >What I've done is this:
> >
> >(define_predicate "my_special_predicate"
> >		    (match_operand 0 "memory_operand")
> >  {
> >    debug_rtx (op);
> >    gcc_assert (MEM_P (op));
> >    op = XEXP (op, 0);
> >    if (GET_CODE (op) == PLUS)
> >      op = XEXP (op, 0);
> >
> >    if (lra_in_progress)
> >      {
> >        fprintf (stderr, "%s:%d\n", __FILE__, __LINE__);
> >        return REG_P (op) && REGNO (op) >= FIRST_PSEUDO_REGISTER && 
> >        reg_renumber[REGNO(op)] < 0;
> >      }
> >
> >
> >    if (REG_P (op))
> >      {
> >        int regno = REGNO (op);
> >        return (regno == 10); // register is the stack pointer
> >      }
> >
> >    return true;
> >  })
> >
> >  (and many variations)  Unfortunately, any moderately complicated input
> >  still results in a (mem (reg) ) insn repeatedly entering the
> >  lra_in_progress case and returning false, and eventually terminating with
> >      
> >  "internal compiler error: maximum number of generated reload insns per 
> >  insn achieved (90)"
> >
> >
> >Any other ideas?
>   As I remember there were a few other ideas from Richard Biener and 
> Segher Boessenkool.  I also proposed adding a new address register which 
> will always be a fixed stack memory slot at the end.  Unfortunately I am 
> not familiar enough with the target and the port to say in detail how to do 
> it.  But I think it is worth trying.

The m68hc11 port used the fake Z register approach, and I believe it had
some special machine pass to get rid of it right before assembler output.

(r171302 is when it was removed -- last version was
https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/m68hc11/m68hc11.c;h=1e414102c3f1fed985e4fb8db7954342e965190b;hb=bae8bb65d842d7ffefe990c1f0ac004491f3c105#l4061
for the machine reorg stuff).

No idea how well it works...  But it's only needed if you are forced to
have a frame pointer IIUC?


Segher

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Special Memory Constraint [was Re: Indirect memory addresses vs. lra]
  2019-08-19 15:07                                 ` Segher Boessenkool
@ 2019-08-19 18:06                                   ` John Darrington
  2019-08-20  6:56                                     ` Richard Biener
  0 siblings, 1 reply; 38+ messages in thread
From: John Darrington @ 2019-08-19 18:06 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Vladimir Makarov, gcc

On Mon, Aug 19, 2019 at 10:07:11AM -0500, Segher Boessenkool wrote:

     >   As I remember there were a few other ideas from Richard Biener and 
     > Segher Boessenkool.  I also proposed to add a new address register which 
     > will be always a fixed stack memory slot at the end. Unfortunately I am 
     > not familiar with the target and the port to say in details how to do 
     > it.  But I think it is worth to try.
     
     The m68hc11 port used the fake Z register approach, and I believe it had
     some special machine pass to get rid of it right before assembler output.
     
     (r171302 is when it was removed -- last version was
     https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/m68hc11/m68hc11.c;h=1e414102c3f1fed985e4fb8db7954342e965190b;hb=bae8bb65d842d7ffefe990c1f0ac004491f3c105#l4061
     for the machine reorg stuff).
     
     No idea how well it works...  But it's only needed if you are forced to
     have a frame pointer IIUC?
     
     
     Segher


Most of these suggestions involve adding some sort of virtual registers.
So I hacked the machine description to add two new registers, Z1 and Z2,
with the same mode as X and Y.

Obviously the assembler balks at this.  However the compiler still
ICEs at the same place as before.

So this suggests that our original diagnosis, viz. that there are not
enough address registers, was not accurate, and that in fact there is
some other problem?

J'

-- 
Avoid eavesdropping.  Send strong encrypted email.
PGP Public key ID: 1024D/2DE827B3 
fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
See http://sks-keyservers.net or any PGP keyserver for public key.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Special Memory Constraint [was Re: Indirect memory addresses vs. lra]
  2019-08-19 18:06                                   ` John Darrington
@ 2019-08-20  6:56                                     ` Richard Biener
  2019-08-20  7:07                                       ` John Darrington
  0 siblings, 1 reply; 38+ messages in thread
From: Richard Biener @ 2019-08-20  6:56 UTC (permalink / raw)
  To: John Darrington; +Cc: Segher Boessenkool, Vladimir Makarov, GCC Development

On Mon, Aug 19, 2019 at 8:06 PM John Darrington
<john@darrington.wattle.id.au> wrote:
>
> On Mon, Aug 19, 2019 at 10:07:11AM -0500, Segher Boessenkool wrote:
>
>      >   As I remember there were a few other ideas from Richard Biener and
>      > Segher Boessenkool.  I also proposed to add a new address register which
>      > will be always a fixed stack memory slot at the end. Unfortunately I am
>      > not familiar with the target and the port to say in details how to do
>      > it.  But I think it is worth to try.
>
>      The m68hc11 port used the fake Z register approach, and I believe it had
>      some special machine pass to get rid of it right before assembler output.
>
>      (r171302 is when it was removed -- last version was
>      https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/m68hc11/m68hc11.c;h=1e414102c3f1fed985e4fb8db7954342e965190b;hb=bae8bb65d842d7ffefe990c1f0ac004491f3c105#l4061
>      for the machine reorg stuff).
>
>      No idea how well it works...  But it's only needed if you are forced to
>      have a frame pointer IIUC?
>
>
>      Segher
>
>
> Most of these suggestions involve adding some sort of virtual registers
> So I hacked the machine description to add two new registers Z1 and Z2
> with the same mode as X and Y.
>
> Obviously the assembler balks at this.  However the compiler still
> ICEs at the same place as before.
>
> So this suggests that our original diagnosis, viz: there are not enough
> address registers was not accurate, and in fact there is some other
> problem?

That sounds likely.  Given you have indirect addressing you could
simulate N virtual regs by placing them in a virtual reg table in memory
and accessed via a fixed address register (assuming all instructions
that would need an address reg also can take that indirect from memory).
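
As a rough illustration of the table idea (a toy C model, not s12z code;
every name below is made up for the example, and the byte order is chosen
arbitrarily), each virtual register is just a slot at a fixed offset from
one base address, and a store "through" a virtual register is a single
indirect access:

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only: a flat byte-addressed memory plus one fixed base
   address.  All names and sizes here are hypothetical.  */
enum { MEM_SIZE = 1024, VREG_TABLE = 16, NUM_VREGS = 4, VREG_BYTES = 4 };

static uint8_t memory[MEM_SIZE];

/* Stands in for the fixed address register that points at the table.  */
static const uint32_t fixed_base = VREG_TABLE;

/* Address of the memory slot backing virtual register N.  */
static uint32_t vreg_slot (unsigned n)
{
  assert (n < NUM_VREGS);
  return fixed_base + n * VREG_BYTES;
}

/* Write VALUE into virtual register N (little-endian in this model).  */
static void vreg_write (unsigned n, uint32_t value)
{
  uint32_t slot = vreg_slot (n);
  for (unsigned i = 0; i < VREG_BYTES; i++)
    memory[slot + i] = (uint8_t) (value >> (8 * i));
}

/* Read virtual register N back from its slot.  */
static uint32_t vreg_read (unsigned n)
{
  uint32_t slot = vreg_slot (n);
  uint32_t value = 0;
  for (unsigned i = 0; i < VREG_BYTES; i++)
    value |= (uint32_t) memory[slot + i] << (8 * i);
  return value;
}

/* Store BYTE at the address held in virtual register N.  This models
   an indirect store, i.e. a (mem (mem (plus base offset))) access.  */
static void store_byte_indirect (unsigned n, uint8_t byte)
{
  uint32_t addr = vreg_read (n);
  assert (addr < MEM_SIZE);
  memory[addr] = byte;
}
```

In this model, vreg_write (1, 0x2F2) followed by store_byte_indirect (1, 1)
sets memory[0x2F2] to 1, which mirrors the led example from the start of
the thread.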

Richard.

> J'
>
> --
> Avoid eavesdropping.  Send strong encrypted email.
> PGP Public key ID: 1024D/2DE827B3
> fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
> See http://sks-keyservers.net or any PGP keyserver for public key.
>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Special Memory Constraint [was Re: Indirect memory addresses vs. lra]
  2019-08-20  6:56                                     ` Richard Biener
@ 2019-08-20  7:07                                       ` John Darrington
  2019-08-20  7:30                                         ` Richard Biener
  0 siblings, 1 reply; 38+ messages in thread
From: John Darrington @ 2019-08-20  7:07 UTC (permalink / raw)
  To: Richard Biener
  Cc: John Darrington, Segher Boessenkool, Vladimir Makarov, GCC Development

On Tue, Aug 20, 2019 at 08:56:39AM +0200, Richard Biener wrote:

     > Most of these suggestions involve adding some sort of virtual registers
     > So I hacked the machine description to add two new registers Z1 and Z2
     > with the same mode as X and Y.
     >
     > Obviously the assembler balks at this.  However the compiler still
     > ICEs at the same place as before.
     >
     > So this suggests that our original diagnosis, viz: there are not enough
     > address registers was not accurate, and in fact there is some other
     > problem?
     
     That sounds likely.  Given you have indirect addressing you could
     simulate N virtual regs by placing them in a virtual reg table in memory
     and accessed via a fixed address register (assuming all instructions
     that would need an address reg also can take that indirect from memory).
     
That was my plan.  Accordingly, extending the md to provide N additional
regs (N currently = 2) was the first step.  Having doubled the number
of available address registers, I had expected this would fix most of the 
ICEs (but cause a lot of assembler errors).

However it hasn't eliminated any ICEs.  lra is still complaining 
"unable to find a register to spill".  So the plan seems to have fallen
over at the first hurdle.  Why can it still not spill registers despite
having a lot more of them?

J'

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Special Memory Constraint [was Re: Indirect memory addresses vs. lra]
  2019-08-20  7:07                                       ` John Darrington
@ 2019-08-20  7:30                                         ` Richard Biener
  0 siblings, 0 replies; 38+ messages in thread
From: Richard Biener @ 2019-08-20  7:30 UTC (permalink / raw)
  To: John Darrington; +Cc: Segher Boessenkool, Vladimir Makarov, GCC Development

On Tue, Aug 20, 2019 at 9:07 AM John Darrington
<john@darrington.wattle.id.au> wrote:
>
> On Tue, Aug 20, 2019 at 08:56:39AM +0200, Richard Biener wrote:
>
>      > Most of these suggestions involve adding some sort of virtual registers
>      > So I hacked the machine description to add two new registers Z1 and Z2
>      > with the same mode as X and Y.
>      >
>      > Obviously the assembler balks at this.  However the compiler still
>      > ICEs at the same place as before.
>      >
>      > So this suggests that our original diagnosis, viz: there are not enough
>      > address registers was not accurate, and in fact there is some other
>      > problem?
>
>      That sounds likely.  Given you have indirect addressing you could
>      simulate N virtual regs by placing them in a virtual reg table in memory
>      and accessed via a fixed address register (assuming all instructions
>      that would need an address reg also can take that indirect from memory).
>
> That was my plan.  Accordingly, extending the md to provide N additional
> regs (N currently = 2) was the first step.  Having doubled the number
> of available address registers, I had expected this would fix most of the
> ICEs (but cause a lot of assembler errors).
>
> However it hasn't eliminated any ICEs.  lra is still complaining
> "unable to find a register to spill" So the plan seems to have fallen
> over at the first hurdle.  Why can it still not spill registers despite
> having a lot more of them?

You really have to sit down and trace the LRA code with a debugger
to tell...  unfortunately the dumps aren't verbose enough.
Usually, after spilling, the insn constraints still cannot be satisfied;
the main question is why.
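
One thing that may be worth checking while tracing (a toy C model with
made-up register numbers and class contents, not taken from the s12z port):
extra registers only help an operand if they belong to the class its
constraint accepts.  The sketch below treats register sets as bitmasks, in
the spirit of REG_CLASS_CONTENTS:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One bit per hard register.  */
typedef uint32_t reg_set;

/* Hypothetical register numbers, for illustration only.  */
enum { REG_X = 8, REG_Y = 9, REG_Z1 = 11, REG_Z2 = 12 };

/* Bitmask containing just register REGNO.  */
static reg_set reg_bit (unsigned regno)
{
  assert (regno < 32);
  return (reg_set) 1 << regno;
}

/* Return true if class CLS contains a register that is not in BUSY.  */
static bool class_has_free_reg (reg_set cls, reg_set busy)
{
  return (cls & ~busy) != 0;
}
```

If X and Y are both busy, class_has_free_reg finds nothing in a class that
contains only X and Y, however many Z registers exist elsewhere; it finds
one as soon as Z1 and Z2 are added to that class.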

Richard.

> J'

^ permalink raw reply	[flat|nested] 38+ messages in thread

Thread overview: 38+ messages
2019-08-04 19:18 Indirect memory addresses vs. lra John Darrington
2019-08-08 16:25 ` Vladimir Makarov
2019-08-08 16:44   ` Paul Koning
2019-08-08 17:21     ` Segher Boessenkool
2019-08-08 17:25       ` Paul Koning
2019-08-08 19:09         ` Segher Boessenkool
2019-08-08 17:30       ` Paul Koning
2019-08-08 19:19         ` Segher Boessenkool
2019-08-08 19:57           ` Jeff Law
2019-08-09  8:14             ` John Darrington
2019-08-09 14:17               ` Segher Boessenkool
2019-08-09 14:23                 ` Paul Koning
2019-08-10  6:10                 ` John Darrington
2019-08-10 16:15                   ` Segher Boessenkool
2019-08-09 16:07               ` Jeff Law
2019-08-09 17:34               ` Vladimir Makarov
2019-08-10  6:06                 ` John Darrington
2019-08-10 16:12                   ` Segher Boessenkool
2019-08-12  6:47                     ` John Darrington
2019-08-12  8:40                       ` Segher Boessenkool
2019-08-12 13:35                   ` Vladimir Makarov
2019-08-15 16:29                   ` Vladimir Makarov
2019-08-15 16:38                     ` Richard Biener
2019-08-15 17:41                       ` John Darrington
2019-08-15 18:30                       ` Vladimir Makarov
2019-08-15 21:22                         ` Segher Boessenkool
2019-08-15 17:36                     ` John Darrington
2019-08-15 18:23                       ` Vladimir Makarov
2019-08-16 11:24                         ` Special Memory Constraint [was Re: Indirect memory addresses vs. lra] John Darrington
2019-08-16 14:50                           ` Vladimir Makarov
2019-08-19  7:36                             ` John Darrington
2019-08-19 13:14                               ` Vladimir Makarov
2019-08-19 15:07                                 ` Segher Boessenkool
2019-08-19 18:06                                   ` John Darrington
2019-08-20  6:56                                     ` Richard Biener
2019-08-20  7:07                                       ` John Darrington
2019-08-20  7:30                                         ` Richard Biener
2019-08-08 18:46     ` Indirect memory addresses vs. lra Vladimir Makarov
