public inbox for gcc@gcc.gnu.org
* How to split 40bit data types load/store?
From: Mohamed Shafi @ 2009-09-14 14:24 UTC
  To: GCC

Hello all,

I am doing a port for a 32-bit target in GCC 4.4.0. I have to support a
40-bit data type (_Accum) in the port. The target has 40-bit registers,
which are GPRs and work as 32-bit registers in other modes. The load and
store for _Accum happen in two steps: the lower 32 bits in one instruction
and the upper 8 bits in the next. I want to split the instruction
after reload. I tried a pattern (for load) like this:

(define_insn "fn_load_ext_sa"
 [(set (unspec:SA [(match_operand:DA 0 "register_operand" "")]
	            UNSPEC_FN_EXT)
       (match_operand:SA 1 "memory_operand" ""))]

(define_insn "fn_load_sa"
 [(set (unspec:SA [(match_operand:DA 0 "register_operand" "")]
                    UNSPEC_FN)
       (match_operand:SA 1 "memory_operand" ""))]


The above patterns work at -O0, but with optimization enabled I get an
ICE. It seems that GCC won't accept an unspec as the destination
operand. So how can I split the patterns for the load and store of these
data types?

Regards,
Shafi


* Re: How to split 40bit data types load/store?
From: Richard Henderson @ 2009-09-14 14:59 UTC
  To: Mohamed Shafi; +Cc: GCC

On 09/14/2009 07:24 AM, Mohamed Shafi wrote:
> Hello all,
>
> I am doing a port for a 32bit target in GCC 4.4.0. I have to support a
> 40bit data (_Accum) in the port. The target has 40bit registers which
> is a GPR and works as 32bit reg in other modes. The load and store for
> _Accum happens in two step. The lower 32bit in one instruction and the
> upper 8bit in the next instruction. I want to split the instruction
> after reload. I tried to have a pattern (for load) like this:
>
> (define_insn "fn_load_ext_sa"
>   [(set (unspec:SA [(match_operand:DA 0 "register_operand" "")]
> 	            UNSPEC_FN_EXT)
>         (match_operand:SA 1 "memory_operand" ""))]
>
> (define_insn "fn_load_sa"
>   [(set (unspec:SA [(match_operand:DA 0 "register_operand" "")]
>                      UNSPEC_FN)
>         (match_operand:SA 1 "memory_operand" ""))]

Unspec on the left-hand-side isn't something that's supposed to happen, 
and is more than likely the cause of your problems.  Try moving the 
unspec to the right-hand-side like:

   (set (reg:SI reg) (mem:SI addr))

   (set (reg:SA reg)
        (unspec:SA [(reg:SI reg) (mem:QI addr)]
                   UNSPEC_ACCUM_INSERT))

and

   (set (mem:SI addr) (reg:SI reg))

   (set (mem:QI addr)
        (unspec:QI [(reg:SA reg)]
                   UNSPEC_ACCUM_EXTRACT))

Note that after reload it's perfectly acceptable for a hard register to 
appear in two different modes, here SImode and SAmode.

It's probably not too hard to define this with zero_extract sequences 
instead of unspecs, but given that these only appear after reload, it 
may not be worth the effort.
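
To make the intent of the two unspecs concrete, here is a minimal C model
of the split load and store. It is only a sketch of the data movement, not
GCC code: the type accum40, the helper names, and the memory layout (low 32
bits in the first word, upper 8 bits in the low byte of the next word) are
assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a 40-bit accumulator register held in the low 40 bits of a
   uint64_t.  Purely illustrative; the names are hypothetical.  */
typedef uint64_t accum40;

#define ACCUM40_MASK ((UINT64_C(1) << 40) - 1)

/* Step 1 of the load: (set (reg:SI reg) (mem:SI addr)).  */
static accum40 load_low32(const uint32_t *addr)
{
    return (accum40)addr[0];
}

/* Step 2 of the load, modelling UNSPEC_ACCUM_INSERT: keep the low 32 bits
   of REG and replace bits 32..39 with the byte read from memory.  */
static accum40 accum_insert(accum40 reg, uint8_t hi)
{
    return (reg & UINT64_C(0xFFFFFFFF)) | ((accum40)hi << 32);
}

/* Step 2 of the store, modelling UNSPEC_ACCUM_EXTRACT: return bits 32..39.  */
static uint8_t accum_extract(accum40 reg)
{
    assert((reg & ~ACCUM40_MASK) == 0);  /* only 40 bits are meaningful */
    return (uint8_t)(reg >> 32);
}

/* Full load: low word first, then the upper byte is inserted.  */
static accum40 load_accum(const uint32_t *addr)
{
    accum40 reg = load_low32(addr);
    return accum_insert(reg, (uint8_t)(addr[1] & 0xFF));
}

/* Full store: low word first, then the extracted upper byte.  */
static void store_accum(uint32_t *addr, accum40 reg)
{
    addr[0] = (uint32_t)reg;        /* (set (mem:SI addr) (reg:SI reg)) */
    addr[1] = accum_extract(reg);   /* upper 8 bits, zero-padded here */
}
```

Note that accum_insert takes the old register value as an input. That is
the point of putting the unspec on the right-hand side: the second
instruction both reads and writes the same hard register.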


r~


* Re: How to split 40bit data types load/store?
From: Mohamed Shafi @ 2009-10-05 14:03 UTC
  To: Richard Henderson; +Cc: GCC

2009/9/14 Richard Henderson <rth@redhat.com>:
> On 09/14/2009 07:24 AM, Mohamed Shafi wrote:
>>
>> Hello all,
>>
>> I am doing a port for a 32bit target in GCC 4.4.0. I have to support a
>> 40bit data (_Accum) in the port. The target has 40bit registers which
>> is a GPR and works as 32bit reg in other modes. The load and store for
>> _Accum happens in two step. The lower 32bit in one instruction and the
>> upper 8bit in the next instruction. I want to split the instruction
>> after reload. I tried to have a pattern (for load) like this:
>>
>> (define_insn "fn_load_ext_sa"
>>  [(set (unspec:SA [(match_operand:DA 0 "register_operand" "")]
>>                    UNSPEC_FN_EXT)
>>        (match_operand:SA 1 "memory_operand" ""))]
>>
>> (define_insn "fn_load_sa"
>>  [(set (unspec:SA [(match_operand:DA 0 "register_operand" "")]
>>                     UNSPEC_FN)
>>        (match_operand:SA 1 "memory_operand" ""))]
>
> Unspec on the left-hand-side isn't something that's supposed to happen, and
> is more than likely the cause of your problems.  Try moving the unspec to
> the right-hand-side like:
>
>  (set (reg:SI reg) (mem:SI addr))
>
>  (set (reg:SA reg)
>       (unspec:SA [(reg:SI reg) (mem:QI addr)]
>                  UNSPEC_ACCUM_INSERT))
>
> and
>
>  (set (mem:SI addr) (reg:SI reg))
>
>  (set (mem:QI addr)
>       (unspec:QI [(reg:SA reg)]
>                  UNSPEC_ACCUM_EXTRACT))
>
> Note that after reload it's perfectly acceptable for a hard register to
> appear with the different SI and SAmodes.
>
> It's probably not too hard to define this with zero_extract sequences
> instead of unspecs, but given that these only appear after reload, it may
> not be worth the effort.
>

I was able to implement this with unspecs. But now it seems that I
need to split the pattern before reload as well, so I am thinking of
removing this and doing the split before reload. The issue is that there
is no support for the register-indirect addressing mode when accessing
the upper eight bits of the 40-bit register. The only addressing mode
supported for accessing this section is (SP+offset). So what I thought
was to allow this addressing mode and fix it up during reload, in a
secondary reload, with the help of a scratch register and a
scratch memory. But it seems that in GCC it is not possible to have
both a scratch memory and a scratch register for the same operation. Am
I right?

So what I did was to implement this at the define_expand stage itself.
The idea is to generate the following sequence:

For load (R0), D0, generate:

load (R0), D0                 // 32-bit mode, SAmode move
load (R0+4), scratch_reg      // 32-bit mode, SAmode
store scratch_reg, (SP+off)   // 32-bit mode, SAmode
load.ext (SP+off), D0.u8

and similarly for store.
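
The load sequence above can be modelled on a toy machine in C. This is a
hedged sketch only: struct machine, the word-addressed memory, and the
function names are hypothetical, and how the real target treats the upper
8 bits during a 32-bit load is not stated here, so the model simply clears
them before the final step overwrites them.

```c
#include <assert.h>
#include <stdint.h>

enum { MEM_WORDS = 16 };

/* Toy machine state; every name here is hypothetical.  */
struct machine {
    uint32_t mem[MEM_WORDS];  /* word-addressed data memory */
    uint32_t sp;              /* stack pointer, as a word index */
    uint64_t d0;              /* 40-bit register in the low 40 bits */
};

/* load.ext (SP+off), D0.u8: the upper 8 bits can only be loaded through
   an SP-relative address.  The low 32 bits of D0 are preserved.  */
static void load_ext_sp(struct machine *m, uint32_t off)
{
    uint64_t hi;

    assert(m->sp + off < MEM_WORDS);
    hi = m->mem[m->sp + off] & 0xFF;
    m->d0 = (m->d0 & UINT64_C(0xFFFFFFFF)) | (hi << 32);
}

/* The four-step load of a 40-bit value whose word address is in r0.
   The byte offset 4 of the original sequence is word index r0 + 1.  */
static void load_accum_via_stack(struct machine *m, uint32_t r0, uint32_t off)
{
    uint32_t scratch_reg;

    assert(r0 + 1 < MEM_WORDS);
    assert(m->sp + off < MEM_WORDS);

    m->d0 = m->mem[r0];                 /* load (R0), D0 */
    scratch_reg = m->mem[r0 + 1];       /* load (R0+4), scratch_reg */
    m->mem[m->sp + off] = scratch_reg;  /* store scratch_reg, (SP+off) */
    load_ext_sp(m, off);                /* load.ext (SP+off), D0.u8 */
}
```

In this model the last step depends on both the stack slot written by the
third step and the low half written by the first step, which is the data
dependence the RTL patterns have to express.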
Here are the patterns that I used for this purpose:

(define_expand "movda"
 [(set (match_operand:DA 0 "nonimmediate_operand" "")
       (match_operand:DA 1 "nonimmediate_operand" ""))]
 ""
 "{
  /* A load through a plain register address (other than the frame base)
     is expanded into the four-instruction sequence.  */
  if (MEM_P (operands[1]) && REG_P (XEXP (operands[1], 0))
      && XEXP (operands[1], 0) != virtual_stack_vars_rtx)
    {
      rtx lo_half, hi_half;
      rtx scratch_mem, scratch_reg, subreg;

      gcc_assert (can_create_pseudo_p ());
      scratch_reg = gen_reg_rtx (SAmode);
      scratch_mem = assign_stack_temp (SAmode, GET_MODE_SIZE (SAmode), 0);
      subreg = gen_rtx_SUBREG (SAmode, operands[0], 0);

      lo_half = adjust_address (operands[1], SAmode, 0);
      hi_half = adjust_address (operands[1], SAmode, 4);
      /* A SET rtx conventionally has VOIDmode; its operands carry the
         modes.  */
      emit_insn (gen_rtx_SET (VOIDmode, subreg, lo_half));
      emit_insn (gen_rtx_SET (VOIDmode, scratch_reg, hi_half));
      emit_insn (gen_rtx_SET (VOIDmode, scratch_mem, scratch_reg));
      emit_insn (gen_load_reg_ext (operands[0], scratch_mem));
      DONE;
    }
   /* and similarly for store operation */
 }"
)

(define_insn "load_reg_ext"
 [(set (subreg:SA (zero_extract:DA (match_operand:DA 0 "register_operand" "=d")
                        (const_int 8)
                        (const_int 24)) 4)
       (match_operand:SA 1 "memory_operand" "Sd3"))]

(define_insn "store_reg_ext"
 [(set (match_operand:SA 0 "memory_operand" "=Sd3")
       (zero_extract:SA (match_operand:DA 1 "register_operand" "d")
                        (const_int 8)
                        (const_int 24)))]

(define_insn "*movsa_internal"
 [(set (match_operand:SA 0 "nonimmediate_operand" "=m,d,d")
         (match_operand:SA 1 "nonimmediate_operand" "d,m,d"))]


By default, -fomit-frame-pointer is passed to the compiler. Without
optimization the compiler generates the expected output, but with
optimization it does not. It seems that the patterns I have written
above are not correct. For a simple function like the
following

_Accum foo (_Accum *a)
{
   _Accum b = *a;
   return b;
}

with optimization enabled the compiler generates only

load (R0), D0                        // 32bit mode , SAmode move

which is the first instruction of the expected four-instruction sequence.
How can I write the patterns properly?

Regards
Shafi


* Re: How to split 40bit data types load/store?
From: Richard Henderson @ 2009-10-06  3:10 UTC
  To: Mohamed Shafi; +Cc: GCC

On 10/05/2009 07:02 AM, Mohamed Shafi wrote:
> . But now it seems that i
> need to split the pattern before reload also.

Oh?  Why?

> The only addressing mode
> supported for accessing this section is (SP+offset).

Ouch.  Is there no general register to high-8bit move either?
So you can't do

   load (R0), D0
   load (R0+4), R1
   move.ext R1, D0.u8

> (define_expand "movda"
>   [(set (match_operand:DA 0 "nonimmediate_operand" "")
>         (match_operand:DA 1 "nonimmediate_operand" ""))]

So, DAmode is your 40-bit value.  What's SAmode?

> (define_insn "load_reg_ext"
>   [(set (subreg:SA (zero_extract:DA (match_operand:DA 0 "register_operand" "=d")
>                          (const_int 8)
>                          (const_int 24)) 4)
>         (match_operand:SA 1 "memory_operand" "Sd3"))]

This pattern doesn't look kosher with the subreg outside the 
zero_extract.  I know I mentioned using zero_extract in an earlier 
message, but that may not actually work with an object larger than word 
size.  You may be better off with an unspec:

   (set (match_operand:DA 0 "register_operand" "=d")
        (unspec:DA [(match_operand:DA 1 "register_operand" "0")
                    (match_operand:SA 2 "memory_operand" "Sd3")]
                   UNSPEC_LOAD_REG_EXT))

or whatever you actually need.


r~

