public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [commit, spu] Improve address generation for large stack frames
@ 2011-08-16 20:08 Ulrich Weigand
  2011-08-19  4:18 ` Richard Henderson
  0 siblings, 1 reply; 3+ messages in thread
From: Ulrich Weigand @ 2011-08-16 20:08 UTC (permalink / raw)
  To: gcc-patches

Hello,

since switching the frame growth direction on SPU to downward, the
displacements on stack addresses using an eliminable base registers
(e.g. frame_pointer_rtx) grow into the opposite direction from the
displacements that will be used in the final addresses which use
the real stack pointer register.

This means that checking those displacements in spu_legitimate_address_p
is not just pointless (and unnecessary), but can be rather counter-
productive.  This check should be defered until reload, when the
registers are eliminated and the final displacement is known.

This exposed a weakness in reload: when confronted with an address
of the form register + const_in where the displacement is out of
range, reload will reload in integer only on platforms where the
"double_reg_address_ok" flag is set.  However, this is only the
case if the machine actually support base + index + displacement
addressing, which SPU does not.

(I have experimented with changing double_reg_address_ok to only
require base + index addressing, but that unfortunately runs into
fundamental problems elsewhere, e.g. on rs6000, due to the fact
that reg + const_int will continue to be considered "offsettable"
even after the reload for the const_int has been pushed ...)

Thus the patch below simply fixes the issue locally on the SPU
by providing an appropriate LEGITIMIZE_RELOAD_ADDRESS macro.

This gets code generated for stack addressing if the stack is
larger than the biggest displacement back to the same quality
we had before the FRAME_GROWS_DOWNWARDS change.

Tested on spu-elf.
Committed to mainline.

Bye,
Ulrich


ChangeLog:

	* config/spu/spu.h (LEGITIMIZE_RELOAD_ADDRESS): New macro.
	* config/spu/spu-protos.h (spu_legitimize_reload_address): Add
	prototype.
	* config/spu/spu.c (spu_legitimize_reload_address): New function.
	(spu_legitimate_address_p): Do not check displacement is the base
	is an eliminable stack register.

Index: gcc/config/spu/spu-protos.h
===================================================================
*** gcc/config/spu/spu-protos.h	(revision 177595)
--- gcc/config/spu/spu-protos.h	(working copy)
*************** extern void spu_builtin_insert (rtx ops[
*** 76,81 ****
--- 76,82 ----
  extern void spu_builtin_promote (rtx ops[]);
  extern void spu_expand_sign_extend (rtx ops[]);
  extern void spu_expand_vector_init (rtx target, rtx vals);
+ extern rtx spu_legitimize_reload_address (rtx, enum machine_mode, int, int);
  #endif /* RTX_CODE  */
  
  extern void spu_init_expanders (void);
Index: gcc/config/spu/spu.c
===================================================================
*** gcc/config/spu/spu.c	(revision 177595)
--- gcc/config/spu/spu.c	(working copy)
*************** spu_legitimate_address_p (enum machine_m
*** 3803,3810 ****
  	if (GET_CODE (op0) == REG
  	    && INT_REG_OK_FOR_BASE_P (op0, reg_ok_strict)
  	    && GET_CODE (op1) == CONST_INT
! 	    && INTVAL (op1) >= -0x2000
! 	    && INTVAL (op1) <= 0x1fff
  	    && (!aligned || (INTVAL (op1) & 15) == 0))
  	  return TRUE;
  	if (GET_CODE (op0) == REG
--- 3803,3816 ----
  	if (GET_CODE (op0) == REG
  	    && INT_REG_OK_FOR_BASE_P (op0, reg_ok_strict)
  	    && GET_CODE (op1) == CONST_INT
! 	    && ((INTVAL (op1) >= -0x2000 && INTVAL (op1) <= 0x1fff)
! 		/* If virtual registers are involved, the displacement will
! 		   change later on anyway, so checking would be premature.
! 		   Reload will make sure the final displacement after
! 		   register elimination is OK.  */
! 		|| op0 == arg_pointer_rtx
! 		|| op0 == frame_pointer_rtx
! 		|| op0 == virtual_stack_vars_rtx)
  	    && (!aligned || (INTVAL (op1) & 15) == 0))
  	  return TRUE;
  	if (GET_CODE (op0) == REG
*************** spu_addr_space_legitimize_address (rtx x
*** 3877,3882 ****
--- 3883,3927 ----
    return spu_legitimize_address (x, oldx, mode);
  }
  
+ /* Reload reg + const_int for out-of-range displacements.  */
+ rtx
+ spu_legitimize_reload_address (rtx ad, enum machine_mode mode ATTRIBUTE_UNUSED,
+ 			       int opnum, int type)
+ {
+   bool removed_and = false;
+ 
+   if (GET_CODE (ad) == AND
+       && CONST_INT_P (XEXP (ad, 1))
+       && INTVAL (XEXP (ad, 1)) == (HOST_WIDE_INT) - 16)
+     {
+       ad = XEXP (ad, 0);
+       removed_and = true;
+     }
+ 
+   if (GET_CODE (ad) == PLUS
+       && REG_P (XEXP (ad, 0))
+       && CONST_INT_P (XEXP (ad, 1))
+       && !(INTVAL (XEXP (ad, 1)) >= -0x2000
+ 	   && INTVAL (XEXP (ad, 1)) <= 0x1fff))
+     {
+       /* Unshare the sum.  */
+       ad = copy_rtx (ad);
+ 
+       /* Reload the displacement.  */
+       push_reload (XEXP (ad, 1), NULL_RTX, &XEXP (ad, 1), NULL,
+ 		   BASE_REG_CLASS, GET_MODE (ad), VOIDmode, 0, 0,
+ 		   opnum, (enum reload_type) type);
+ 
+       /* Add back AND for alignment if we stripped it.  */
+       if (removed_and)
+ 	ad = gen_rtx_AND (GET_MODE (ad), ad, GEN_INT (-16));
+ 
+       return ad;
+     }
+ 
+   return NULL_RTX;
+ }
+ 
  /* Handle an attribute requiring a FUNCTION_DECL; arguments as in
     struct attribute_spec.handler.  */
  static tree
Index: gcc/config/spu/spu.h
===================================================================
*** gcc/config/spu/spu.h	(revision 177595)
--- gcc/config/spu/spu.h	(working copy)
*************** targetm.resolve_overloaded_builtin = spu
*** 390,395 ****
--- 390,406 ----
  
  #define MAX_REGS_PER_ADDRESS 2
  
+ #define LEGITIMIZE_RELOAD_ADDRESS(AD, MODE, OPNUM, TYPE, IND, WIN)	\
+ do {									\
+   rtx new_rtx = spu_legitimize_reload_address (AD, MODE, OPNUM,		\
+ 					       (int)(TYPE));		\
+   if (new_rtx)								\
+     {									\
+       (AD) = new_rtx;							\
+       goto WIN;								\
+     }									\
+ } while (0)
+ 
  \f
  /* Costs */
  
-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  Ulrich.Weigand@de.ibm.com

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [commit, spu] Improve address generation for large stack frames
  2011-08-16 20:08 [commit, spu] Improve address generation for large stack frames Ulrich Weigand
@ 2011-08-19  4:18 ` Richard Henderson
  2011-08-19 12:51   ` Ulrich Weigand
  0 siblings, 1 reply; 3+ messages in thread
From: Richard Henderson @ 2011-08-19  4:18 UTC (permalink / raw)
  To: Ulrich Weigand; +Cc: gcc-patches

On 08/16/2011 11:35 AM, Ulrich Weigand wrote:
> +       /* Reload the displacement.  */
> +       push_reload (XEXP (ad, 1), NULL_RTX, &XEXP (ad, 1), NULL,
> + 		   BASE_REG_CLASS, GET_MODE (ad), VOIDmode, 0, 0,
> + 		   opnum, (enum reload_type) type);

Are you sure you want to reload it this way, and not

  (plus (plus base const-large) const-small)

?  If you push that inner reload, it seems like it would be 
sharable/cse-able with other variable references within the block.


r~

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [commit, spu] Improve address generation for large stack frames
  2011-08-19  4:18 ` Richard Henderson
@ 2011-08-19 12:51   ` Ulrich Weigand
  0 siblings, 0 replies; 3+ messages in thread
From: Ulrich Weigand @ 2011-08-19 12:51 UTC (permalink / raw)
  To: Richard Henderson; +Cc: gcc-patches

Richard Henderson wrote:
> On 08/16/2011 11:35 AM, Ulrich Weigand wrote:
> > +       /* Reload the displacement.  */
> > +       push_reload (XEXP (ad, 1), NULL_RTX, &XEXP (ad, 1), NULL,
> > + 		   BASE_REG_CLASS, GET_MODE (ad), VOIDmode, 0, 0,
> > + 		   opnum, (enum reload_type) type);
> 
> Are you sure you want to reload it this way, and not
> 
>   (plus (plus base const-large) const-small)
> 
> ?  If you push that inner reload, it seems like it would be 
> sharable/cse-able with other variable references within the block.

Well, yes, but I don't have an instruction to implement the
inner plus (SPU add immediate is restricted to +/- 512), so
I'd still have to reload the large constant as well and would
get another instruction.

This results in actually worse code if no sharing ends up
possible, so I'm not sure if it's worth it overall ...

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  Ulrich.Weigand@de.ibm.com

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2011-08-19 12:27 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-08-16 20:08 [commit, spu] Improve address generation for large stack frames Ulrich Weigand
2011-08-19  4:18 ` Richard Henderson
2011-08-19 12:51   ` Ulrich Weigand

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).