splitting load immediates using high and lo

public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed

* splitting load immediates using high and lo_sum
@ 2005-07-21 23:36 Tabony, Charles
  2005-07-21 23:49 ` Dale Johannesen
  0 siblings, 1 reply; 6+ messages in thread
From: Tabony, Charles @ 2005-07-21 23:36 UTC (permalink / raw)
  To: gcc

Hi,

I am working on a port for a processor that has 32 bit registers but can
only load 16 bit immediates.  I have tried several ways to split moves
with larger immediates into two RTL insns.  One is using a
define_expand:

-------------------------code-----------------------
(define_expand "movsi"
  [(set (match_operand:SI 0 "nonimmediate_operand" "")
        (match_operand:SI 1 "general_operand" ""))]
  ""
  {
    if (GET_CODE(operands[0]) != REG) {
      operands[1] = force_reg(SImode, operands[1]);
    }
    else if(s17to32_const_int_operand(operands[1],
GET_MODE(operands[1]))){
      emit_move_insn(operands[0],
                     gen_rtx_HIGH(GET_MODE(operands[1]), operands[1]));
      emit_move_insn(operands[0],
                     gen_rtx_LO_SUM(GET_MODE(operands[1]),
                                    operands[0], operands[1]));
      DONE;
    }
  })
------------------------/code-----------------------

With the corresponding define_insns:

-------------------------code-----------------------
(define_insn "movsi_high"
  [(set (match_operand:SI 0 "register_operand" "=r")
        (high:SI (match_operand:SI 1 "immediate_operand" "i")))]
  ""
  "%0.h = #HI(%1)")

(define_insn "movsi_lo_sum"
  [(set (match_operand:SI 0 "register_operand" "+r")
        (lo_sum:SI (match_dup 0)
                   (match_operand:SI 1 "immediate_operand" "i")))]
  ""
  "%0.l = #LO(%1)")
------------------------/code-----------------------

but using this method, I get the following error:

-------------------------error-----------------------
./libgcc2.c:470: error: unrecognizable insn:
(insn 100 99 86 0 ./libgcc2.c:464 (set (reg:SI 10 r10)
        (lo_sum (reg:SI 10 r10)
            (const_int 65536 [0x10000]))) -1 (nil)
    (nil))
./libgcc2.c:470: internal compiler error: in extract_insn, at
recog.c:2083
------------------------/error-----------------------

Why would that RTL not match my movsi_lo_sum define_insn?

I also tried using a define_split:

-------------------------code-----------------------
(define_split
  [(set (match_operand:SI 0 "register_operand" "")
	(match_operand:SI 1 "s17to32_const_int_operand" ""))]
  "reload_completed"
  [(set (match_dup 0)
	(high:SI (match_dup 1)))
   (set (match_dup 0)
	(lo_sum:SI (match_dup 0)
		(match_dup 1)))]
  "")
------------------------/code-----------------------

along with the same define_insns, but then I get the following error:

-------------------------error-----------------------
./crtstuff.c:288: error: insn does not satisfy its constraints:
(insn 103 12 11 0 (set (reg:SI 0 r0)
        (symbol_ref/u:SI ("*.LC0") [flags 0x2])) 18 {movsi_real} (nil)
    (nil))
./crtstuff.c:288: internal compiler error: in
reload_cse_simplify_operands, at postreload.c:378
------------------------/error-----------------------

In other words, that RTL never matches my define_split, even though I
placed it before the more general movsi define_insn and
s17to32_const_int_operand should return 1 for a symbol_ref.

Do you have any idea why either of these attempts do not work?  Which
method do you think is better?  In case you were wondering, here is the
code for s17to32_const_int_operand.  I modified the function
int_2word_operand from the frv port.

-------------------------code-----------------------
int s17to32_const_int_operand(rtx op, enum machine_mode mode
ATTRIBUTE_UNUSED)
{
  HOST_WIDE_INT value;
  REAL_VALUE_TYPE rv;
  long l;

  switch (GET_CODE (op))
    {
    default:
      break;

    case LABEL_REF:
    case SYMBOL_REF:
    case CONST:
      return 1;

    case CONST_INT:
      return ! IN_RANGE_P (INTVAL (op), -0x8000, 0x7FFF);

    case CONST_DOUBLE:
      if (GET_MODE (op) == SFmode)
	{
	  REAL_VALUE_FROM_CONST_DOUBLE (rv, op);
	  REAL_VALUE_TO_TARGET_SINGLE (rv, l);
	  value = l;
	  return ! IN_RANGE_P (value, -0x8000, 0x7FFF);
	}
      else if (GET_MODE (op) == VOIDmode)
	{
	  value = CONST_DOUBLE_LOW (op);
	  return ! IN_RANGE_P (value, -0x8000, 0x7FFF);
	}
      break;
    }

  return 0;
}
------------------------/code-----------------------

Thank you,
Charles J. Tabony

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: splitting load immediates using high and lo_sum
  2005-07-21 23:36 splitting load immediates using high and lo_sum Tabony, Charles
@ 2005-07-21 23:49 ` Dale Johannesen
  0 siblings, 0 replies; 6+ messages in thread
From: Dale Johannesen @ 2005-07-21 23:49 UTC (permalink / raw)
  To: Tabony, Charles; +Cc: gcc, Dale Johannesen


On Jul 21, 2005, at 4:36 PM, Tabony, Charles wrote:

> Hi,
>
> I am working on a port for a processor that has 32 bit registers but 
> can
> only load 16 bit immediates.
>   ""
>   "%0.h = #HI(%1)")

What are the semantics of this?  Low bits zeroed, or untouched?
If the former, your semantics are identical to Sparc; look at that.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: splitting load immediates using high and lo_sum
@ 2005-08-02 17:53 Tabony, Charles
  0 siblings, 0 replies; 6+ messages in thread
From: Tabony, Charles @ 2005-08-02 17:53 UTC (permalink / raw)
  To: Dale Johannesen; +Cc: gcc

> From: Dale Johannesen [mailto:dalej@apple.com] 
> 
> On Jul 21, 2005, at 5:04 PM, Tabony, Charles wrote:
> 
> >> From: Dale Johannesen [mailto:dalej@apple.com]
> >>
> >> On Jul 21, 2005, at 4:36 PM, Tabony, Charles wrote:
> >>
> >>> Hi,
> >>>
> >>> I am working on a port for a processor that has 32 bit 
> registers but
> >>> can
> >>> only load 16 bit immediates.
> >>>   ""
> >>>   "%0.h = #HI(%1)")
> >>
> >> What are the semantics of this?  Low bits zeroed, or untouched?
> >> If the former, your semantics are identical to Sparc; look at that.
> >
> > The low bits are untouched.  However, I would expect the compiler to
> > always follow setting the high bits with setting the low bits.
> 
> OK, if you're willing to accept that limitation (your 
> architecture could
> handle putting the LO first, which Sparc can't) then Sparc is still a
> good model to look at.  What it does should work for you.

Earlier I was able to successfully split load immediates into high and
lo_sum insns, and that has worked great as far as scheduling.  However,
I noticed that now instead of loading the address of a constant such as
a string, compiled programs will load the address of a constant that is
the address of that string and then dereference it.  My guess is that
this is caused by the constant in the high/lo_sum pair being hidden from
CSE.

I looked at the way SPARC and MIPS handle the problem, but I don't think
that will work for me.  If I understand correctly, they split the move
into a load immediate that has the lower bits cleared, corresponding to
a sethi or lui instruction, and an ior immediate.  The semantics of the
instructions I am working with, "R0.H = #HI(CONSTANT)" and "R0.L =
#LO(CONSTANT)" are that the half of the register not being set is
unmodified.

Since I can not use an ior immediate like SPARC and MIPS, how can I
split move immediate insns so that they can be effeciently scheduled but
still eliminate the unnecessary indirection?  Also, does the method used
by SPARC and MIPS work for symbols?

Thank you,
Charles

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: splitting load immediates using high and lo_sum
@ 2005-07-22  0:45 Tabony, Charles
  0 siblings, 0 replies; 6+ messages in thread
From: Tabony, Charles @ 2005-07-22  0:45 UTC (permalink / raw)
  To: Dale Johannesen; +Cc: gcc

> From: Dale Johannesen [mailto:dalej@apple.com]
> 
> On Jul 21, 2005, at 5:04 PM, Tabony, Charles wrote:
> 
> >> From: Dale Johannesen [mailto:dalej@apple.com]
> >>
> >> On Jul 21, 2005, at 4:36 PM, Tabony, Charles wrote:
> >>
> >>> Hi,
> >>>
> >>> I am working on a port for a processor that has 32 bit registers
but
> >>> can
> >>> only load 16 bit immediates.
> >>>   ""
> >>>   "%0.h = #HI(%1)")
> >>
> >> What are the semantics of this?  Low bits zeroed, or untouched?
> >> If the former, your semantics are identical to Sparc; look at that.
> >
> > The low bits are untouched.  However, I would expect the compiler to
> > always follow setting the high bits with setting the low bits.
> 
> OK, if you're willing to accept that limitation (your architecture
could
> handle putting the LO first, which Sparc can't) then Sparc is still a
> good model to look at.  What it does should work for you.

Aha!  I looked at the SPARC code and distilled it down to what I needed
and the difference is that it sets the mode of the high and lo_sum
expressions to the mode of operand 0, while I was setting it to the mode
of operand 1.  Now mine works great.

Thank you,
Charles J. Tabony

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: splitting load immediates using high and lo_sum
  2005-07-22  0:04 Tabony, Charles
@ 2005-07-22  0:16 ` Dale Johannesen
  0 siblings, 0 replies; 6+ messages in thread
From: Dale Johannesen @ 2005-07-22  0:16 UTC (permalink / raw)
  To: Tabony, Charles; +Cc: gcc, Dale Johannesen


On Jul 21, 2005, at 5:04 PM, Tabony, Charles wrote:

>> From: Dale Johannesen [mailto:dalej@apple.com]
>>
>> On Jul 21, 2005, at 4:36 PM, Tabony, Charles wrote:
>>
>>> Hi,
>>>
>>> I am working on a port for a processor that has 32 bit registers but
>>> can
>>> only load 16 bit immediates.
>>>   ""
>>>   "%0.h = #HI(%1)")
>>
>> What are the semantics of this?  Low bits zeroed, or untouched?
>> If the former, your semantics are identical to Sparc; look at that.
>
> The low bits are untouched.  However, I would expect the compiler to
> always follow setting the high bits with setting the low bits.

OK, if you're willing to accept that limitation (your architecture could
handle putting the LO first, which Sparc can't) then Sparc is still a
good model to look at.  What it does should work for you.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: splitting load immediates using high and lo_sum
@ 2005-07-22  0:04 Tabony, Charles
  2005-07-22  0:16 ` Dale Johannesen
  0 siblings, 1 reply; 6+ messages in thread
From: Tabony, Charles @ 2005-07-22  0:04 UTC (permalink / raw)
  To: Dale Johannesen; +Cc: gcc

> From: Dale Johannesen [mailto:dalej@apple.com]
> 
> On Jul 21, 2005, at 4:36 PM, Tabony, Charles wrote:
> 
> > Hi,
> >
> > I am working on a port for a processor that has 32 bit registers but
> > can
> > only load 16 bit immediates.
> >   ""
> >   "%0.h = #HI(%1)")
> 
> What are the semantics of this?  Low bits zeroed, or untouched?
> If the former, your semantics are identical to Sparc; look at that.

The low bits are untouched.  However, I would expect the compiler to
always follow setting the high bits with setting the low bits.  The
point of splitting them is that I want the insn setting the high bits to
be scheduled in a vliw packet with the preceding insns and the insn
setting the low bits to be scheduled with the following insns whenever
possible.  The processor cannot execute both instructions in the same
cycle.

-Charles

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2005-08-02 17:53 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-07-21 23:36 splitting load immediates using high and lo_sum Tabony, Charles
2005-07-21 23:49 ` Dale Johannesen
2005-07-22  0:04 Tabony, Charles
2005-07-22  0:16 ` Dale Johannesen
2005-07-22  0:45 Tabony, Charles
2005-08-02 17:53 Tabony, Charles

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).