public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [RS6000] Stack tie fix.
@ 2012-04-03  8:40 Alan Modra
  2012-04-03 14:35 ` Olivier Hainque
  0 siblings, 1 reply; 17+ messages in thread
From: Alan Modra @ 2012-04-03  8:40 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Edelsohn

This patch merges the rs6000 stack_tie and frame_tie rtl, so that we
now should use a tie insn that mentions all base regs involved in
register saves and restores.  That should avoid any alias analysis
problems eg. http://gcc.gnu.org/ml/gcc-patches/2011-11/msg01156.html.
I chose to put the regs of interest on the destination of the fake
set, rather than in the source as is currently done, because I figure
that relying on the compiler to not reorder reads is fragile.  Here's
an example of the new stack tie:

(set (mem:BLK (unspec:SI [(reg:SI 1) (reg:SI 12)] UNSPEC_TIE))
     (const_int 0))

Some notes:
- I tried putting the mem on both source and destination of the set,
  but that runs afoul of rtl dead code elimination.
- The rs6000_emit_epilogue places that called gen_frame_tie and
  gen_stack_tie both used alias set 0 mems.  I believe this was
  unnecessarily restrictive.
- CODE_FOR_stack_tie is mentioned in rs6000.c but not
  CODE_FOR_frame_tie.  Removing frame_tie fixes this sched bug too.

Bootstrapped and regression tested powerpc-linux with usual -O2
BOOT_CFLAGS and also -Os, and testcases from pr44199, pr30282, pr52828,
the url above and http://gcc.gnu.org/ml/gcc/2011-03/msg00123.html
examined for sanity.  OK to apply?

	PR target/52828
	* config/rs6000/rs6000.c (rs6000_emit_stack_tie): Rewrite with
	tie regs on destination of set.  Delete forward declaration.
	(rs6000_emit_stack_reset): Update rs6000_emit_stack_tie calls.
	(rs6000_emit_prologue): Likewise.
	(rs6000_emit_epilogue): Likewise.  Use in place of gen_frame_tie
	and gen_stack_tie.
	* config/rs6000/predicates.md (tie_operand): New.
	* config/rs6000/rs6000.md (restore_stack_block): Generate new
	stack tie rtl.
	(restore_stack_nonlocal): Likewise.
	(stack_tie): Update.
	(frame_tie): Delete.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 185830)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -925,7 +925,6 @@ static const char *rs6000_invalid_within_doloop (c
 static bool rs6000_legitimate_address_p (enum machine_mode, rtx, bool);
 static bool rs6000_debug_legitimate_address_p (enum machine_mode, rtx, bool);
 static rtx rs6000_generate_compare (rtx, enum machine_mode);
-static void rs6000_emit_stack_tie (void);
 static bool spe_func_has_64bit_regs_p (void);
 static rtx gen_frame_mem_offset (enum machine_mode, rtx, int);
 static unsigned rs6000_hash_constant (rtx);
@@ -18517,12 +18516,28 @@ rs6000_aix_asm_output_dwarf_table_ref (char * fram
    and the change to the stack pointer.  */
 
 static void
-rs6000_emit_stack_tie (void)
+rs6000_emit_stack_tie (rtx fp, bool hard_frame_needed)
 {
-  rtx mem = gen_frame_mem (BLKmode,
-			   gen_rtx_REG (Pmode, STACK_POINTER_REGNUM));
+  rtx u;
+  rtvec p;
+  int i;
 
-  emit_insn (gen_stack_tie (mem));
+  if (REGNO (fp) == STACK_POINTER_REGNUM
+      || (hard_frame_needed
+	  && REGNO (fp) == HARD_FRAME_POINTER_REGNUM))
+    fp = NULL_RTX;
+
+  p = rtvec_alloc (1 + hard_frame_needed + (fp != NULL_RTX));
+
+  i = 0;
+  RTVEC_ELT (p, i++) = gen_rtx_REG (Pmode, STACK_POINTER_REGNUM);
+  if (hard_frame_needed)
+    RTVEC_ELT (p, i++) = gen_rtx_REG (Pmode, HARD_FRAME_POINTER_REGNUM);
+  if (fp != NULL_RTX)
+    RTVEC_ELT (p, i++) = fp;
+  u = gen_rtx_UNSPEC (Pmode, p, UNSPEC_TIE);
+
+  emit_insn (gen_stack_tie (gen_frame_mem (BLKmode, u)));
 }
 
 /* Emit the correct code for allocating stack space, as insns.
@@ -19142,7 +19157,7 @@ rs6000_emit_stack_reset (rs6000_stack_t *info,
       || (TARGET_SPE_ABI
 	  && info->spe_64bit_regs_used != 0
 	  && info->first_gp_reg_save != 32))
-    rs6000_emit_stack_tie ();
+    rs6000_emit_stack_tie (frame_reg_rtx, frame_pointer_needed);
   
   if (frame_reg_rtx != sp_reg_rtx)
     {
@@ -19362,7 +19377,7 @@ rs6000_emit_prologue (void)
 	}
       rs6000_emit_allocate_stack (info->total_size, copy_reg);
       if (frame_reg_rtx != sp_reg_rtx)
-	rs6000_emit_stack_tie ();
+	rs6000_emit_stack_tie (frame_reg_rtx, false);
     }
 
   /* Handle world saves specially here.  */
@@ -19866,7 +19881,7 @@ rs6000_emit_prologue (void)
 	sp_offset = info->total_size;
       rs6000_emit_allocate_stack (info->total_size, copy_reg);
       if (frame_reg_rtx != sp_reg_rtx)
-	rs6000_emit_stack_tie ();
+	rs6000_emit_stack_tie (frame_reg_rtx, false);
     }
 
   /* Set frame pointer, if needed.  */
@@ -20437,13 +20452,7 @@ rs6000_emit_epilogue (int sibcall)
       /* Prevent reordering memory accesses against stack pointer restore.  */
       else if (cfun->calls_alloca
 	       || offset_below_red_zone_p (-info->total_size))
-	{
-	  rtx mem1 = gen_rtx_MEM (BLKmode, hard_frame_pointer_rtx);
-	  rtx mem2 = gen_rtx_MEM (BLKmode, sp_reg_rtx);
-	  MEM_NOTRAP_P (mem1) = 1;
-	  MEM_NOTRAP_P (mem2) = 1;
-	  emit_insn (gen_frame_tie (mem1, mem2));
-	}
+	rs6000_emit_stack_tie (frame_reg_rtx, true);
 
       insn = emit_insn (gen_add3_insn (frame_reg_rtx, hard_frame_pointer_rtx,
 				       GEN_INT (info->total_size)));
@@ -20456,11 +20465,7 @@ rs6000_emit_epilogue (int sibcall)
       /* Prevent reordering memory accesses against stack pointer restore.  */
       if (cfun->calls_alloca
 	  || offset_below_red_zone_p (-info->total_size))
-	{
-	  rtx mem = gen_rtx_MEM (BLKmode, sp_reg_rtx);
-	  MEM_NOTRAP_P (mem) = 1;
-	  emit_insn (gen_stack_tie (mem));
-	}
+	rs6000_emit_stack_tie (frame_reg_rtx, false);
       insn = emit_insn (gen_add3_insn (sp_reg_rtx, sp_reg_rtx,
 				       GEN_INT (info->total_size)));
       sp_offset = 0;
Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md	(revision 185830)
+++ gcc/config/rs6000/predicates.md	(working copy)
@@ -1451,3 +1451,11 @@
 
   return 1;
 })
+
+;; Return 1 if OP is a stack tie operand.
+(define_predicate "tie_operand"
+  (match_code "mem")
+{
+  return (GET_CODE (XEXP (op, 0)) == UNSPEC
+	  && XINT (XEXP (op, 0), 1) == UNSPEC_TIE);
+})
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 185830)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -11989,17 +11989,20 @@
 (define_expand "restore_stack_block"
   [(set (match_dup 2) (match_dup 3))
    (set (match_dup 4) (match_dup 2))
-   (set (match_dup 5) (unspec:BLK [(match_dup 5)] UNSPEC_TIE))
+   (set (match_dup 5) (const_int 0))
    (set (match_operand 0 "register_operand" "")
 	(match_operand 1 "register_operand" ""))]
   ""
   "
 {
+  rtx u;
+
   operands[1] = force_reg (Pmode, operands[1]);
   operands[2] = gen_reg_rtx (Pmode);
   operands[3] = gen_frame_mem (Pmode, operands[0]);
   operands[4] = gen_frame_mem (Pmode, operands[1]);
-  operands[5] = gen_frame_mem (BLKmode, operands[0]);
+  u = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, operands[0]), UNSPEC_TIE);
+  operands[5] = gen_frame_mem (BLKmode, u);
 }")
 
 (define_expand "save_stack_nonlocal"
@@ -12022,12 +12025,13 @@
   [(set (match_dup 2) (match_operand 1 "memory_operand" ""))
    (set (match_dup 3) (match_dup 4))
    (set (match_dup 5) (match_dup 2))
-   (set (match_dup 6) (unspec:BLK [(match_dup 6)] UNSPEC_TIE))
+   (set (match_dup 6) (const_int 0))
    (set (match_operand 0 "register_operand" "") (match_dup 3))]
   ""
   "
 {
   int units_per_word = (TARGET_32BIT) ? 4 : 8;
+  rtx u;
 
   /* Restore the backchain from the first word, sp from the second.  */
   operands[2] = gen_reg_rtx (Pmode);
@@ -12035,7 +12039,8 @@
   operands[1] = adjust_address_nv (operands[1], Pmode, 0);
   operands[4] = adjust_address_nv (operands[1], Pmode, units_per_word);
   operands[5] = gen_frame_mem (Pmode, operands[3]);
-  operands[6] = gen_frame_mem (BLKmode, operands[0]);
+  u = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, operands[0]), UNSPEC_TIE);
+  operands[6] = gen_frame_mem (BLKmode, u);
 }")
 \f
 ;; TOC register handling.
@@ -15899,25 +15904,15 @@
   [(set_attr "type" "branch")
    (set_attr "length" "4")])
 
-; These are to explain that changes to the stack pointer should
-; not be moved over stores to stack memory.
+; This is to explain that changes to the stack pointer should
+; not be moved over loads from or stores to stack memory.
 (define_insn "stack_tie"
-  [(set (match_operand:BLK 0 "memory_operand" "+m")
-        (unspec:BLK [(match_dup 0)] UNSPEC_TIE))]
+  [(set (match_operand:BLK 0 "tie_operand" "")
+	(const_int 0))]
   ""
   ""
   [(set_attr "length" "0")])
 
-; Like stack_tie, but depend on both fp and sp based memory.
-(define_insn "frame_tie"
-  [(set (match_operand:BLK 0 "memory_operand" "+m")
-	(unspec:BLK [(match_dup 0)
-		     (match_operand:BLK 1 "memory_operand" "m")] UNSPEC_TIE))]
-  ""
-  ""
-  [(set_attr "length" "0")])
-
-
 (define_expand "epilogue"
   [(use (const_int 0))]
   ""

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RS6000] Stack tie fix.
  2012-04-03  8:40 [RS6000] Stack tie fix Alan Modra
@ 2012-04-03 14:35 ` Olivier Hainque
  2012-04-03 14:56   ` Olivier Hainque
  0 siblings, 1 reply; 17+ messages in thread
From: Olivier Hainque @ 2012-04-03 14:35 UTC (permalink / raw)
  To: Alan Modra; +Cc: Olivier Hainque, gcc-patches, David Edelsohn

Hello Alan,

Thanks a lot for following up on this one. Coincidentally, I was just
about to submit the alternate approach we had discussed about, after
David's comment at http://gcc.gnu.org/ml/gcc-patches/2011-11/msg01842.html.

We have experienced with it for a while on our gcc-4.5 series of
compilers and we are working on a transition to gcc 4.7.

This is of course a much heavier hammer so it would be nice if we can
indeed have a subtler way out :-)

Olivier

On Apr 3, 2012, at 10:40 , Alan Modra wrote:

> This patch merges the rs6000 stack_tie and frame_tie rtl, so that we
> now should use a tie insn that mentions all base regs involved in
> register saves and restores.  That should avoid any alias analysis
> problems eg. http://gcc.gnu.org/ml/gcc-patches/2011-11/msg01156.html.
> I chose to put the regs of interest on the destination of the fake
> set, rather than in the source as is currently done, because I figure
> that relying on the compiler to not reorder reads is fragile.  Here's
> an example of the new stack tie:
> 
> (set (mem:BLK (unspec:SI [(reg:SI 1) (reg:SI 12)] UNSPEC_TIE))
>     (const_int 0))
> 
> Some notes:
> - I tried putting the mem on both source and destination of the set,
>  but that runs afoul of rtl dead code elimination.
> - The rs6000_emit_epilogue places that called gen_frame_tie and
>  gen_stack_tie both used alias set 0 mems.  I believe this was
>  unnecessarily restrictive.
> - CODE_FOR_stack_tie is mentioned in rs6000.c but not
>  CODE_FOR_frame_tie.  Removing frame_tie fixes this sched bug too.
> 
> Bootstrapped and regression tested powerpc-linux with usual -O2
> BOOT_CFLAGS and also -Os, and testcases from pr44199, pr30282, pr52828,
> the url above and http://gcc.gnu.org/ml/gcc/2011-03/msg00123.html
> examined for sanity.  OK to apply?
> 
> 	PR target/52828
> 	* config/rs6000/rs6000.c (rs6000_emit_stack_tie): Rewrite with
> 	tie regs on destination of set.  Delete forward declaration.
> 	(rs6000_emit_stack_reset): Update rs6000_emit_stack_tie calls.
> 	(rs6000_emit_prologue): Likewise.
> 	(rs6000_emit_epilogue): Likewise.  Use in place of gen_frame_tie
> 	and gen_stack_tie.
> 	* config/rs6000/predicates.md (tie_operand): New.
> 	* config/rs6000/rs6000.md (restore_stack_block): Generate new
> 	stack tie rtl.
> 	(restore_stack_nonlocal): Likewise.
> 	(stack_tie): Update.
> 	(frame_tie): Delete.
> 
> Index: gcc/config/rs6000/rs6000.c
> ===================================================================
> --- gcc/config/rs6000/rs6000.c	(revision 185830)
> +++ gcc/config/rs6000/rs6000.c	(working copy)
> @@ -925,7 +925,6 @@ static const char *rs6000_invalid_within_doloop (c
> static bool rs6000_legitimate_address_p (enum machine_mode, rtx, bool);
> static bool rs6000_debug_legitimate_address_p (enum machine_mode, rtx, bool);
> static rtx rs6000_generate_compare (rtx, enum machine_mode);
> -static void rs6000_emit_stack_tie (void);
> static bool spe_func_has_64bit_regs_p (void);
> static rtx gen_frame_mem_offset (enum machine_mode, rtx, int);
> static unsigned rs6000_hash_constant (rtx);
> @@ -18517,12 +18516,28 @@ rs6000_aix_asm_output_dwarf_table_ref (char * fram
>    and the change to the stack pointer.  */
> 
> static void
> -rs6000_emit_stack_tie (void)
> +rs6000_emit_stack_tie (rtx fp, bool hard_frame_needed)
> {
> -  rtx mem = gen_frame_mem (BLKmode,
> -			   gen_rtx_REG (Pmode, STACK_POINTER_REGNUM));
> +  rtx u;
> +  rtvec p;
> +  int i;
> 
> -  emit_insn (gen_stack_tie (mem));
> +  if (REGNO (fp) == STACK_POINTER_REGNUM
> +      || (hard_frame_needed
> +	  && REGNO (fp) == HARD_FRAME_POINTER_REGNUM))
> +    fp = NULL_RTX;
> +
> +  p = rtvec_alloc (1 + hard_frame_needed + (fp != NULL_RTX));
> +
> +  i = 0;
> +  RTVEC_ELT (p, i++) = gen_rtx_REG (Pmode, STACK_POINTER_REGNUM);
> +  if (hard_frame_needed)
> +    RTVEC_ELT (p, i++) = gen_rtx_REG (Pmode, HARD_FRAME_POINTER_REGNUM);
> +  if (fp != NULL_RTX)
> +    RTVEC_ELT (p, i++) = fp;
> +  u = gen_rtx_UNSPEC (Pmode, p, UNSPEC_TIE);
> +
> +  emit_insn (gen_stack_tie (gen_frame_mem (BLKmode, u)));
> }
> 
> /* Emit the correct code for allocating stack space, as insns.
> @@ -19142,7 +19157,7 @@ rs6000_emit_stack_reset (rs6000_stack_t *info,
>       || (TARGET_SPE_ABI
> 	  && info->spe_64bit_regs_used != 0
> 	  && info->first_gp_reg_save != 32))
> -    rs6000_emit_stack_tie ();
> +    rs6000_emit_stack_tie (frame_reg_rtx, frame_pointer_needed);
> 
>   if (frame_reg_rtx != sp_reg_rtx)
>     {
> @@ -19362,7 +19377,7 @@ rs6000_emit_prologue (void)
> 	}
>       rs6000_emit_allocate_stack (info->total_size, copy_reg);
>       if (frame_reg_rtx != sp_reg_rtx)
> -	rs6000_emit_stack_tie ();
> +	rs6000_emit_stack_tie (frame_reg_rtx, false);
>     }
> 
>   /* Handle world saves specially here.  */
> @@ -19866,7 +19881,7 @@ rs6000_emit_prologue (void)
> 	sp_offset = info->total_size;
>       rs6000_emit_allocate_stack (info->total_size, copy_reg);
>       if (frame_reg_rtx != sp_reg_rtx)
> -	rs6000_emit_stack_tie ();
> +	rs6000_emit_stack_tie (frame_reg_rtx, false);
>     }
> 
>   /* Set frame pointer, if needed.  */
> @@ -20437,13 +20452,7 @@ rs6000_emit_epilogue (int sibcall)
>       /* Prevent reordering memory accesses against stack pointer restore.  */
>       else if (cfun->calls_alloca
> 	       || offset_below_red_zone_p (-info->total_size))
> -	{
> -	  rtx mem1 = gen_rtx_MEM (BLKmode, hard_frame_pointer_rtx);
> -	  rtx mem2 = gen_rtx_MEM (BLKmode, sp_reg_rtx);
> -	  MEM_NOTRAP_P (mem1) = 1;
> -	  MEM_NOTRAP_P (mem2) = 1;
> -	  emit_insn (gen_frame_tie (mem1, mem2));
> -	}
> +	rs6000_emit_stack_tie (frame_reg_rtx, true);
> 
>       insn = emit_insn (gen_add3_insn (frame_reg_rtx, hard_frame_pointer_rtx,
> 				       GEN_INT (info->total_size)));
> @@ -20456,11 +20465,7 @@ rs6000_emit_epilogue (int sibcall)
>       /* Prevent reordering memory accesses against stack pointer restore.  */
>       if (cfun->calls_alloca
> 	  || offset_below_red_zone_p (-info->total_size))
> -	{
> -	  rtx mem = gen_rtx_MEM (BLKmode, sp_reg_rtx);
> -	  MEM_NOTRAP_P (mem) = 1;
> -	  emit_insn (gen_stack_tie (mem));
> -	}
> +	rs6000_emit_stack_tie (frame_reg_rtx, false);
>       insn = emit_insn (gen_add3_insn (sp_reg_rtx, sp_reg_rtx,
> 				       GEN_INT (info->total_size)));
>       sp_offset = 0;
> Index: gcc/config/rs6000/predicates.md
> ===================================================================
> --- gcc/config/rs6000/predicates.md	(revision 185830)
> +++ gcc/config/rs6000/predicates.md	(working copy)
> @@ -1451,3 +1451,11 @@
> 
>   return 1;
> })
> +
> +;; Return 1 if OP is a stack tie operand.
> +(define_predicate "tie_operand"
> +  (match_code "mem")
> +{
> +  return (GET_CODE (XEXP (op, 0)) == UNSPEC
> +	  && XINT (XEXP (op, 0), 1) == UNSPEC_TIE);
> +})
> Index: gcc/config/rs6000/rs6000.md
> ===================================================================
> --- gcc/config/rs6000/rs6000.md	(revision 185830)
> +++ gcc/config/rs6000/rs6000.md	(working copy)
> @@ -11989,17 +11989,20 @@
> (define_expand "restore_stack_block"
>   [(set (match_dup 2) (match_dup 3))
>    (set (match_dup 4) (match_dup 2))
> -   (set (match_dup 5) (unspec:BLK [(match_dup 5)] UNSPEC_TIE))
> +   (set (match_dup 5) (const_int 0))
>    (set (match_operand 0 "register_operand" "")
> 	(match_operand 1 "register_operand" ""))]
>   ""
>   "
> {
> +  rtx u;
> +
>   operands[1] = force_reg (Pmode, operands[1]);
>   operands[2] = gen_reg_rtx (Pmode);
>   operands[3] = gen_frame_mem (Pmode, operands[0]);
>   operands[4] = gen_frame_mem (Pmode, operands[1]);
> -  operands[5] = gen_frame_mem (BLKmode, operands[0]);
> +  u = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, operands[0]), UNSPEC_TIE);
> +  operands[5] = gen_frame_mem (BLKmode, u);
> }")
> 
> (define_expand "save_stack_nonlocal"
> @@ -12022,12 +12025,13 @@
>   [(set (match_dup 2) (match_operand 1 "memory_operand" ""))
>    (set (match_dup 3) (match_dup 4))
>    (set (match_dup 5) (match_dup 2))
> -   (set (match_dup 6) (unspec:BLK [(match_dup 6)] UNSPEC_TIE))
> +   (set (match_dup 6) (const_int 0))
>    (set (match_operand 0 "register_operand" "") (match_dup 3))]
>   ""
>   "
> {
>   int units_per_word = (TARGET_32BIT) ? 4 : 8;
> +  rtx u;
> 
>   /* Restore the backchain from the first word, sp from the second.  */
>   operands[2] = gen_reg_rtx (Pmode);
> @@ -12035,7 +12039,8 @@
>   operands[1] = adjust_address_nv (operands[1], Pmode, 0);
>   operands[4] = adjust_address_nv (operands[1], Pmode, units_per_word);
>   operands[5] = gen_frame_mem (Pmode, operands[3]);
> -  operands[6] = gen_frame_mem (BLKmode, operands[0]);
> +  u = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, operands[0]), UNSPEC_TIE);
> +  operands[6] = gen_frame_mem (BLKmode, u);
> }")
> 
> ;; TOC register handling.
> @@ -15899,25 +15904,15 @@
>   [(set_attr "type" "branch")
>    (set_attr "length" "4")])
> 
> -; These are to explain that changes to the stack pointer should
> -; not be moved over stores to stack memory.
> +; This is to explain that changes to the stack pointer should
> +; not be moved over loads from or stores to stack memory.
> (define_insn "stack_tie"
> -  [(set (match_operand:BLK 0 "memory_operand" "+m")
> -        (unspec:BLK [(match_dup 0)] UNSPEC_TIE))]
> +  [(set (match_operand:BLK 0 "tie_operand" "")
> +	(const_int 0))]
>   ""
>   ""
>   [(set_attr "length" "0")])
> 
> -; Like stack_tie, but depend on both fp and sp based memory.
> -(define_insn "frame_tie"
> -  [(set (match_operand:BLK 0 "memory_operand" "+m")
> -	(unspec:BLK [(match_dup 0)
> -		     (match_operand:BLK 1 "memory_operand" "m")] UNSPEC_TIE))]
> -  ""
> -  ""
> -  [(set_attr "length" "0")])
> -
> -
> (define_expand "epilogue"
>   [(use (const_int 0))]
>   ""
> 
> -- 
> Alan Modra
> Australia Development Lab, IBM
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RS6000] Stack tie fix.
  2012-04-03 14:35 ` Olivier Hainque
@ 2012-04-03 14:56   ` Olivier Hainque
  2012-04-03 17:05     ` David Edelsohn
  0 siblings, 1 reply; 17+ messages in thread
From: Olivier Hainque @ 2012-04-03 14:56 UTC (permalink / raw)
  To: Alan Modra; +Cc: Olivier Hainque, gcc-patches, David Edelsohn


On Apr 3, 2012, at 16:34 , Olivier Hainque wrote:
> Thanks a lot for following up on this one. Coincidentally, I was just
> about to submit the alternate approach we had discussed about, after
> David's comment at http://gcc.gnu.org/ml/gcc-patches/2011-11/msg01842.html.

> This is of course a much heavier hammer so it would be nice if we can
> indeed have a subtler way out :-)

 I realized that this last sentence is ambiguous ...

 To clarify: the heavier approach is the one I was about to submit
 (minor variation of Joseph's proposal in the thread just referenced),
 and the subtler way out is the one you are proposing here.

 Olivier

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RS6000] Stack tie fix.
  2012-04-03 14:56   ` Olivier Hainque
@ 2012-04-03 17:05     ` David Edelsohn
  2012-04-04  1:22       ` Alan Modra
  0 siblings, 1 reply; 17+ messages in thread
From: David Edelsohn @ 2012-04-03 17:05 UTC (permalink / raw)
  To: Olivier Hainque, Alan Modra; +Cc: GCC Patches

On Tue, Apr 3, 2012 at 10:55 AM, Olivier Hainque <hainque@adacore.com> wrote:
>
> On Apr 3, 2012, at 16:34 , Olivier Hainque wrote:
>> Thanks a lot for following up on this one. Coincidentally, I was just
>> about to submit the alternate approach we had discussed about, after
>> David's comment at http://gcc.gnu.org/ml/gcc-patches/2011-11/msg01842.html.
>
>> This is of course a much heavier hammer so it would be nice if we can
>> indeed have a subtler way out :-)
>
>  To clarify: the heavier approach is the one I was about to submit
>  (minor variation of Joseph's proposal in the thread just referenced),
>  and the subtler way out is the one you are proposing here.

We can give Alan's patch a try.  I'm not sure if it is sufficient
given the experience of IBM's XL compiler.  I also would rather not
use the heavier hammer, but I don't want to leave a latent bug.

Thanks, David

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RS6000] Stack tie fix.
  2012-04-03 17:05     ` David Edelsohn
@ 2012-04-04  1:22       ` Alan Modra
  2012-04-05 10:36         ` Olivier Hainque
  0 siblings, 1 reply; 17+ messages in thread
From: Alan Modra @ 2012-04-04  1:22 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Olivier Hainque, GCC Patches

On Tue, Apr 03, 2012 at 01:05:08PM -0400, David Edelsohn wrote:
> On Tue, Apr 3, 2012 at 10:55 AM, Olivier Hainque <hainque@adacore.com> wrote:
> >
> > On Apr 3, 2012, at 16:34 , Olivier Hainque wrote:
> >> Thanks a lot for following up on this one. Coincidentally, I was just
> >> about to submit the alternate approach we had discussed about, after
> >> David's comment at http://gcc.gnu.org/ml/gcc-patches/2011-11/msg01842.html.
> >
> >> This is of course a much heavier hammer so it would be nice if we can
> >> indeed have a subtler way out :-)
> >
> >  To clarify: the heavier approach is the one I was about to submit
> >  (minor variation of Joseph's proposal in the thread just referenced),
> >  and the subtler way out is the one you are proposing here.
> 
> We can give Alan's patch a try.  I'm not sure if it is sufficient
> given the experience of IBM's XL compiler.  I also would rather not
> use the heavier hammer, but I don't want to leave a latent bug.

I'll see whether my approach fixes pr30282 for gcc-4.4 which has known
deficiencies in alias analysis.  Olivier, would you be kind enough to
backport and test against other versions of gcc that needed your
bigger hammer?

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RS6000] Stack tie fix.
  2012-04-04  1:22       ` Alan Modra
@ 2012-04-05 10:36         ` Olivier Hainque
  2012-04-05 14:03           ` Alan Modra
  0 siblings, 1 reply; 17+ messages in thread
From: Olivier Hainque @ 2012-04-05 10:36 UTC (permalink / raw)
  To: Alan Modra; +Cc: Olivier Hainque, David Edelsohn, GCC Patches

[-- Attachment #1: Type: text/plain, Size: 1635 bytes --]

Hello Alan,

On Apr 4, 2012, at 03:22 , Alan Modra wrote:
> I'll see whether my approach fixes pr30282 for gcc-4.4 which has known
> deficiencies in alias analysis.  Olivier, would you be kind enough to
> backport and test against other versions of gcc that needed your
> bigger hammer?

 Sure. I can debug the paths taken for the testcases where we had
 observed problems and see. I'll see if I can do a few tests comparing
 generated code on some source base. The problems are hard to observe
 at runtime because they are very timing/event/environment dependant.

 I still feel a bit unclear/concerned on a couple of aspects. One is:
 
 There are lots of places in the emit_epilogue code that assign
 frame_reg_rtx with different possible values, (hard_fp, sp, r11)
 before we first get to points where we emit ties. There are also
 multiple places where we do emit ties.

 It is not at all obvious to me that the all places where we emit ties
 have an accurate perception of what frame_reg's were possibly used before.

 IOW, I am not 100% convinced that we cannot have a bad case where
 we emit a tie that misses a reg reference. The various paths are all
 controlled by intricate conditions, so getting that 100% conviction
 (at least on paper) isn't easy to me.

 Is it clearer to you ?

 FWIW, I had made an experiment at trying to extract subfunctions,
 which might help such reasoning.  Set of patches attached. This doesn't
 apply to the current mainline, would need to refined, and we probably
 couldn't/shouldn't put something like this on the path to the resolution
 of our current issue, so this is JIC you are curious.


[-- Attachment #2: ppcepi-split.tar.gz --]
[-- Type: application/x-gzip, Size: 14681 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RS6000] Stack tie fix.
  2012-04-05 10:36         ` Olivier Hainque
@ 2012-04-05 14:03           ` Alan Modra
  2012-04-05 17:23             ` Olivier Hainque
  0 siblings, 1 reply; 17+ messages in thread
From: Alan Modra @ 2012-04-05 14:03 UTC (permalink / raw)
  To: Olivier Hainque; +Cc: David Edelsohn, GCC Patches

Hi Olivier,

On Thu, Apr 05, 2012 at 12:36:01PM +0200, Olivier Hainque wrote:
> On Apr 4, 2012, at 03:22 , Alan Modra wrote:
> > I'll see whether my approach fixes pr30282 for gcc-4.4 which has known
> > deficiencies in alias analysis.  Olivier, would you be kind enough to
> > backport and test against other versions of gcc that needed your
> > bigger hammer?

Well it turns out that gcc-4.4 still gets this testcase from pr30282
wrong.

int find_num(int i)
{
    int arr[5] = {0, 1, 2, 3, 4};
    return arr[i];
}

The read from arr[i] is scheduled past the frame teardown.

[snip]
>  There are lots of places in the emit_epilogue code that assign
>  frame_reg_rtx with different possible values, (hard_fp, sp, r11)

r12 too

>  before we first get to points where we emit ties. There are also
>  multiple places where we do emit ties.
> 
>  It is not at all obvious to me that the all places where we emit ties
>  have an accurate perception of what frame_reg's were possibly used before.
> 
>  IOW, I am not 100% convinced that we cannot have a bad case where
>  we emit a tie that misses a reg reference. The various paths are all
>  controlled by intricate conditions, so getting that 100% conviction
>  (at least on paper) isn't easy to me.
> 
>  Is it clearer to you ?

I spent quite some time looking at it. ;)  Have you spotted an error
somewhere, apart from spe_save_area_ptr not being mentioned in the
stack tie?  I left that one for a later fix since the SPE reg saves
use alias set 0 mems (which is another bug).

Hmm, I see I accidentally left out an assert from the stack tie
patch.  This one may make you feel a little better.  :)

  /* Update stack and set back pointer unless this is V.4,
     for which it was done previously.  */
  if (!WORLD_SAVE_P (info) && info->push_p
      && !(DEFAULT_ABI == ABI_V4 || crtl->calls_eh_return))
    {
      rtx copy_reg = NULL;

      /* If using some other frame reg previously, then we ought to
	 ensure it is mentioned in the stack tie emitted below.  */
      gcc_checking_assert (REGNO (frame_reg_rtx) == 1
			   || REGNO (frame_reg_rtx) == 12);


>  FWIW, I had made an experiment at trying to extract subfunctions,
>  which might help such reasoning.  Set of patches attached. This doesn't
>  apply to the current mainline, would need to refined, and we probably
>  couldn't/shouldn't put something like this on the path to the resolution
>  of our current issue, so this is JIC you are curious.

I think this is worthwhile doing, but more important to try to make
the logic simpler.  For example, this

  /* If we need to save CR, put it into r12 or r11.  */
  if (!WORLD_SAVE_P (info) && info->cr_save_p && frame_reg_rtx != frame_ptr_rtx)
    {
      rtx set;

      cr_save_rtx
	= gen_rtx_REG (SImode, DEFAULT_ABI == ABI_AIX && !saving_GPRs_inline
		       ? 11 : 12);

is better written as

  /* If we need to save CR, put it into r12 or r11.  */
  cr_save_regno = DEFAULT_ABI == ABI_AIX && !saving_GPRs_inline ? 11 : 12;
  if (!WORLD_SAVE_P (info)
      && info->cr_save_p
      && REGNO (frame_reg_rtx) != cr_save_regno)
    {
      rtx set;

      cr_save_rtx = gen_rtx_REG (SImode, cr_save_regno);

This way it's obvious why there is a test of frame_reg_rtx, and that
the test is correct.  ie. you aren't clobbering frame_reg_rtx with cr.

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RS6000] Stack tie fix.
  2012-04-05 14:03           ` Alan Modra
@ 2012-04-05 17:23             ` Olivier Hainque
  2012-04-06  0:33               ` Alan Modra
  0 siblings, 1 reply; 17+ messages in thread
From: Olivier Hainque @ 2012-04-05 17:23 UTC (permalink / raw)
  To: Alan Modra; +Cc: Olivier Hainque, David Edelsohn, GCC Patches


On Apr 5, 2012, at 16:03 , Alan Modra wrote:

> Well it turns out that gcc-4.4 still gets this testcase from pr30282
> wrong.

 Hmph :-(

>> There are lots of places in the emit_epilogue code that assign
>> frame_reg_rtx with different possible values, (hard_fp, sp, r11)
> 
> r12 too

 Right ... 

>> It is not at all obvious to me that the all places where we emit ties
>> have an accurate perception of what frame_reg's were possibly used before.
>> Is it clearer to you ?
> 
> I spent quite some time looking at it. ;)

 Oh, sure, that's why I'm asking :)

> Have you spotted an error
> somewhere, apart from spe_save_area_ptr not being mentioned in the
> stack tie?

 No, not really ... (more below)

> Hmm, I see I accidentally left out an assert from the stack tie
> patch.  This one may make you feel a little better.  :)
> 
>  /* Update stack and set back pointer unless this is V.4,
>     for which it was done previously.  */

>      /* If using some other frame reg previously, then we ought to
> 	 ensure it is mentioned in the stack tie emitted below.  */
>      gcc_checking_assert (REGNO (frame_reg_rtx) == 1
> 			   || REGNO (frame_reg_rtx) == 12);

 My concern (apart from possible remaining aliasing issues) was exactly
 what the comment above expresses, but right away I don't see how the assert
 does the check expressed by the comment, and there are other places where
 we emit ties.

 I need to review the function with your changes applied in greater detail.
 
 IIUC (please correct me if I'm wrong), every time we need to emit a tie, we
 must ensure that it references all the base regs that were used before to
 restore regs from the frame (to prevent the sp adjustment past the tie from
 moving prior to any of these accesses)
 
 Since we have several possible uses of different registers ahead, controlled
 by intricate conditions, I would have thought we'd need to maintain a list
 of these registers, to which we add every time we use a new one to access, and
 which we'd use to feed the tie references.

 If we don't do that, I still find it difficult to see that the ties alway do
 reference the proper set of regs (all those used to access the frame earlier).

 It might just be that there's part of the logic I still haven't grasped, so ...

>> FWIW, I had made an experiment at trying to extract subfunctions,
>> which might help such reasoning.

> I think this is worthwhile doing, but more important to try to make
> the logic simpler.
[...]

Entirely agreed! The exercise was just aimed at helping me understand
some of the logic :-)

Thanks a lot for your feedback, much appreciated,

Olivier

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RS6000] Stack tie fix.
  2012-04-05 17:23             ` Olivier Hainque
@ 2012-04-06  0:33               ` Alan Modra
  2012-04-06 15:25                 ` Olivier Hainque
  0 siblings, 1 reply; 17+ messages in thread
From: Alan Modra @ 2012-04-06  0:33 UTC (permalink / raw)
  To: Olivier Hainque; +Cc: David Edelsohn, GCC Patches

On Thu, Apr 05, 2012 at 07:23:20PM +0200, Olivier Hainque wrote:
>  IIUC (please correct me if I'm wrong), every time we need to emit a tie, we
>  must ensure that it references all the base regs that were used before to
>  restore regs from the frame (to prevent the sp adjustment past the tie from
>  moving prior to any of these accesses)

The minimal requirement is:  In the prologue, the next frame write
must be prevented from moving before the stack adjust.  In the
epilogue, the previous frame read must be prevented from moving after
the stack adjust.  If the adjacent write/read uses r1 as a base reg,
then we don't need a stack tie at all.

Writes/reads further away won't be reordered over the adjacent
write/read.  At least, I've never seen gcc do so, even with older
versions known to have bugs in aliasing analysis.  Every single case
I've seen of improper reordering had a stack tie that didn't mention
the base reg used by the adjacent write/read (or no tie).

(Hah!  In writing this, I remember why I took out that assert.  What
happens prior to the tie in the prologue is of no concern.)

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RS6000] Stack tie fix.
  2012-04-06  0:33               ` Alan Modra
@ 2012-04-06 15:25                 ` Olivier Hainque
  2012-04-11  4:16                   ` Alan Modra
  0 siblings, 1 reply; 17+ messages in thread
From: Olivier Hainque @ 2012-04-06 15:25 UTC (permalink / raw)
  To: Alan Modra; +Cc: Olivier Hainque, David Edelsohn, GCC Patches

Hello Alan,

On Apr 6, 2012, at 02:32 , Alan Modra wrote:
[...]
> In the epilogue, the previous frame read must be prevented from moving after
> the stack adjust.  If the adjacent write/read uses r1 as a base reg,
> then we don't need a stack tie at all.

Right

> Writes/reads further away won't be reordered over the adjacent
> write/read.  At least, I've never seen gcc do so, even with older
> versions known to have bugs in aliasing analysis.

 Ah, I hadn't understood that there was this implicit assumption.
 I understand better that part of the logic then. (would be worth
 a comment IMHO) Thanks!

> (Hah!  In writing this, I remember why I took out that assert.  What
> happens prior to the tie in the prologue is of no concern.)

 :-)

 I have been able to verify that your patch helps our previously
 failing case, starting from the svn rev and failure mode analyzed per 
 http://gcc.gnu.org/ml/gcc-patches/2011-11/msg01156.html.

 We get through the same compiler circuits, down to base_alias_check
 now with

  x = (unspec:SI [
        (reg/f:SI 1 1)
        (reg/f:SI 31 31)
        (reg:SI 11 11)
    ] UNSPEC_TIE)

 instead of (reg:SI 1).

 x_base is now null and such that we rapidly conclude that "we don't
 know anything about aliasing", so the outer mem accesses are declared
 conflicting.

 Have you discovered where the problem you are still observing
 is coming from (with another case on 4.4) ? I was planning to look
 into it and realized that maybe you were already doing so.

 Thanks much for your feedback,

 Olivier


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RS6000] Stack tie fix.
  2012-04-06 15:25                 ` Olivier Hainque
@ 2012-04-11  4:16                   ` Alan Modra
  2012-04-11  9:00                     ` Richard Guenther
  0 siblings, 1 reply; 17+ messages in thread
From: Alan Modra @ 2012-04-11  4:16 UTC (permalink / raw)
  To: Olivier Hainque; +Cc: David Edelsohn, GCC Patches

On Fri, Apr 06, 2012 at 05:25:15PM +0200, Olivier Hainque wrote:
>  Have you discovered where the problem you are still observing
>  is coming from (with another case on 4.4) ? I was planning to look
>  into it and realized that maybe you were already doing so.

In gcc-4.4, alias.c:write_dependence_p has

  if (DIFFERENT_ALIAS_SETS_P (x, mem))
    return 0;

immediately after the tests for (mem:BLK (scratch)) and
ALIAS_SET_MEMORY_BARRIER.  On find_num.c the read from the stack arr[]
is alias set 3, the stack tie is alias set 2, so they are seen as not
conflicting.

For mainline we have the same alias sets but with the addresses
involved, (unspec [reg 1] UNSPEC_TIE) and (plus (reg 9) (reg 3)), we
run all the way down to rtx_refs_may_alias_p where ao_ref_from_mem for
the stack tie returns false.  So anti_dependence returns true.

Hmm, the fact that alias analysis on mainline treats my fancy barriers
like this says I may as well have not bothered.  (mem:BLK (scratch))
performs identically at the moment.

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RS6000] Stack tie fix.
  2012-04-11  4:16                   ` Alan Modra
@ 2012-04-11  9:00                     ` Richard Guenther
  2012-04-12 13:23                       ` Alan Modra
  0 siblings, 1 reply; 17+ messages in thread
From: Richard Guenther @ 2012-04-11  9:00 UTC (permalink / raw)
  To: Olivier Hainque, David Edelsohn, GCC Patches

On Wed, Apr 11, 2012 at 6:15 AM, Alan Modra <amodra@gmail.com> wrote:
> On Fri, Apr 06, 2012 at 05:25:15PM +0200, Olivier Hainque wrote:
>>  Have you discovered where the problem you are still observing
>>  is coming from (with another case on 4.4) ? I was planning to look
>>  into it and realized that maybe you were already doing so.
>
> In gcc-4.4, alias.c:write_dependence_p has
>
>  if (DIFFERENT_ALIAS_SETS_P (x, mem))
>    return 0;
>
> immediately after the tests for (mem:BLK (scratch)) and
> ALIAS_SET_MEMORY_BARRIER.  On find_num.c the read from the stack arr[]
> is alias set 3, the stack tie is alias set 2, so they are seen as not
> conflicting.
>
> For mainline we have the same alias sets but with the addresses
> involved, (unspec [reg 1] UNSPEC_TIE) and (plus (reg 9) (reg 3)), we
> run all the way down to rtx_refs_may_alias_p where ao_ref_from_mem for
> the stack tie returns false.  So anti_dependence returns true.
>
> Hmm, the fact that alias analysis on mainline treats my fancy barriers
> like this says I may as well have not bothered.  (mem:BLK (scratch))
> performs identically at the moment.

Well - you are expecting generic code to understand your arch-specific
UNSPEC.  It of course cannot.  From reading back a bit I understand
that you want alias analysis to consider this a barrier for all memory
accesses that all mentioned registers possibly point to?  I'd have used

 (USE (mem:BLK (reg 1)))
 (CLOBBER (mem:BLK (reg 1)))

repeated, for each register (maybe you can avoid the USE if the CLOBBER
is implicitely considered a use, or maybe we don't need to care because
we do not specify what piece of memory is possibly clobbered, we just
specify its base address).

Oh, and the above would need to be handled explicitely anyway in alias.c

Richard.

> --
> Alan Modra
> Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RS6000] Stack tie fix.
  2012-04-11  9:00                     ` Richard Guenther
@ 2012-04-12 13:23                       ` Alan Modra
  2012-04-12 15:34                         ` David Edelsohn
  0 siblings, 1 reply; 17+ messages in thread
From: Alan Modra @ 2012-04-12 13:23 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Olivier Hainque, David Edelsohn, GCC Patches

On Wed, Apr 11, 2012 at 11:00:04AM +0200, Richard Guenther wrote:
> On Wed, Apr 11, 2012 at 6:15 AM, Alan Modra <amodra@gmail.com> wrote:
> Well - you are expecting generic code to understand your arch-specific
> UNSPEC.  It of course cannot.

Right.

>  (USE (mem:BLK (reg 1)))
>  (CLOBBER (mem:BLK (reg 1)))
> 
> repeated, for each register (maybe you can avoid the USE if the CLOBBER

I tried that.  It doesn't work without something else in the insn to
stop rtl-dce deleting it, so you may as well use SETs.  But thanks for
the prod in the right direction.  We do get slightly better results
when the regs are not hidden away in an UNSPEC, for instance
non-stack writes/reads are seen by the alias oracle to not conflict
with the epilogue frame deallocation.

Bootstrapped etc. powerpc-linux.  OK to apply, David?

	PR target/52828
	* config/rs6000/rs6000.c (rs6000_emit_stack_tie): Rewrite with
	tie regs on destination of sets.  Delete forward declaration.
	(rs6000_emit_stack_reset): Update rs6000_emit_stack_tie calls.
	(rs6000_emit_prologue): Likewise.
	(rs6000_emit_epilogue): Likewise.  Use in place of gen_frame_tie
	and gen_stack_tie.
	(is_mem_ref): Use tie_operand to recognise stack ties.
	* config/rs6000/predicates.md (tie_operand): New.
	* config/rs6000/rs6000.md (UNSPEC_TIE): Delete.
	(restore_stack_block): Generate new stack tie rtl.
	(restore_stack_nonlocal): Likewise.
	(stack_tie): Update.
	(frame_tie): Delete.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 186373)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -925,7 +925,6 @@ static const char *rs6000_invalid_within_doloop (c
 static bool rs6000_legitimate_address_p (enum machine_mode, rtx, bool);
 static bool rs6000_debug_legitimate_address_p (enum machine_mode, rtx, bool);
 static rtx rs6000_generate_compare (rtx, enum machine_mode);
-static void rs6000_emit_stack_tie (void);
 static bool spe_func_has_64bit_regs_p (void);
 static rtx gen_frame_mem_offset (enum machine_mode, rtx, int);
 static unsigned rs6000_hash_constant (rtx);
@@ -18505,12 +18504,29 @@ rs6000_aix_asm_output_dwarf_table_ref (char * fram
    and the change to the stack pointer.  */
 
 static void
-rs6000_emit_stack_tie (void)
+rs6000_emit_stack_tie (rtx fp, bool hard_frame_needed)
 {
-  rtx mem = gen_frame_mem (BLKmode,
-			   gen_rtx_REG (Pmode, STACK_POINTER_REGNUM));
+  rtvec p;
+  int i;
+  rtx regs[3];
 
-  emit_insn (gen_stack_tie (mem));
+  i = 0;
+  regs[i++] = gen_rtx_REG (Pmode, STACK_POINTER_REGNUM);
+  if (hard_frame_needed)
+    regs[i++] = gen_rtx_REG (Pmode, HARD_FRAME_POINTER_REGNUM);
+  if (!(REGNO (fp) == STACK_POINTER_REGNUM
+	|| (hard_frame_needed
+	    && REGNO (fp) == HARD_FRAME_POINTER_REGNUM)))
+    regs[i++] = fp;
+
+  p = rtvec_alloc (i);
+  while (--i >= 0)
+    {
+      rtx mem = gen_frame_mem (BLKmode, regs[i]);
+      RTVEC_ELT (p, i) = gen_rtx_SET (VOIDmode, mem, const0_rtx);
+    }
+
+  emit_insn (gen_stack_tie (gen_rtx_PARALLEL (VOIDmode, p)));
 }
 
 /* Emit the correct code for allocating stack space, as insns.
@@ -19130,7 +19146,7 @@ rs6000_emit_stack_reset (rs6000_stack_t *info,
       || (TARGET_SPE_ABI
 	  && info->spe_64bit_regs_used != 0
 	  && info->first_gp_reg_save != 32))
-    rs6000_emit_stack_tie ();
+    rs6000_emit_stack_tie (frame_reg_rtx, frame_pointer_needed);
   
   if (frame_reg_rtx != sp_reg_rtx)
     {
@@ -19350,7 +19366,7 @@ rs6000_emit_prologue (void)
 	}
       rs6000_emit_allocate_stack (info->total_size, copy_reg);
       if (frame_reg_rtx != sp_reg_rtx)
-	rs6000_emit_stack_tie ();
+	rs6000_emit_stack_tie (frame_reg_rtx, false);
     }
 
   /* Handle world saves specially here.  */
@@ -19854,7 +19870,7 @@ rs6000_emit_prologue (void)
 	sp_offset = info->total_size;
       rs6000_emit_allocate_stack (info->total_size, copy_reg);
       if (frame_reg_rtx != sp_reg_rtx)
-	rs6000_emit_stack_tie ();
+	rs6000_emit_stack_tie (frame_reg_rtx, false);
     }
 
   /* Set frame pointer, if needed.  */
@@ -20425,13 +20441,7 @@ rs6000_emit_epilogue (int sibcall)
       /* Prevent reordering memory accesses against stack pointer restore.  */
       else if (cfun->calls_alloca
 	       || offset_below_red_zone_p (-info->total_size))
-	{
-	  rtx mem1 = gen_rtx_MEM (BLKmode, hard_frame_pointer_rtx);
-	  rtx mem2 = gen_rtx_MEM (BLKmode, sp_reg_rtx);
-	  MEM_NOTRAP_P (mem1) = 1;
-	  MEM_NOTRAP_P (mem2) = 1;
-	  emit_insn (gen_frame_tie (mem1, mem2));
-	}
+	rs6000_emit_stack_tie (frame_reg_rtx, true);
 
       insn = emit_insn (gen_add3_insn (frame_reg_rtx, hard_frame_pointer_rtx,
 				       GEN_INT (info->total_size)));
@@ -20444,11 +20454,7 @@ rs6000_emit_epilogue (int sibcall)
       /* Prevent reordering memory accesses against stack pointer restore.  */
       if (cfun->calls_alloca
 	  || offset_below_red_zone_p (-info->total_size))
-	{
-	  rtx mem = gen_rtx_MEM (BLKmode, sp_reg_rtx);
-	  MEM_NOTRAP_P (mem) = 1;
-	  emit_insn (gen_stack_tie (mem));
-	}
+	rs6000_emit_stack_tie (frame_reg_rtx, false);
       insn = emit_insn (gen_add3_insn (sp_reg_rtx, sp_reg_rtx,
 				       GEN_INT (info->total_size)));
       sp_offset = 0;
@@ -22835,8 +22841,7 @@ is_mem_ref (rtx pat)
   bool ret = false;
 
   /* stack_tie does not produce any real memory traffic.  */
-  if (GET_CODE (pat) == UNSPEC
-      && XINT (pat, 1) == UNSPEC_TIE)
+  if (tie_operand (pat, VOIDmode))
     return false;
 
   if (GET_CODE (pat) == MEM)
Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md	(revision 186373)
+++ gcc/config/rs6000/predicates.md	(working copy)
@@ -1451,3 +1451,13 @@
 
   return 1;
 })
+
+;; Return 1 if OP is a stack tie operand.
+(define_predicate "tie_operand"
+  (match_code "parallel")
+{
+  return (GET_CODE (XVECEXP (op, 0, 0)) == SET
+	  && GET_CODE (XEXP (XVECEXP (op, 0, 0), 0)) == MEM
+	  && GET_MODE (XEXP (XVECEXP (op, 0, 0), 0)) == BLKmode
+	  && XEXP (XVECEXP (op, 0, 0), 1) == const0_rtx);
+})
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 186373)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -73,7 +73,6 @@
 (define_c_enum "unspec"
   [UNSPEC_FRSP			; frsp for POWER machines
    UNSPEC_PROBE_STACK		; probe stack memory reference
-   UNSPEC_TIE			; tie stack contents and stack pointer
    UNSPEC_TOCPTR		; address of a word pointing to the TOC
    UNSPEC_TOC			; address of the TOC (more-or-less)
    UNSPEC_MOVSI_GOT
@@ -11989,17 +11988,23 @@
 (define_expand "restore_stack_block"
   [(set (match_dup 2) (match_dup 3))
    (set (match_dup 4) (match_dup 2))
-   (set (match_dup 5) (unspec:BLK [(match_dup 5)] UNSPEC_TIE))
+   (match_dup 5)
    (set (match_operand 0 "register_operand" "")
 	(match_operand 1 "register_operand" ""))]
   ""
   "
 {
+  rtvec p;
+
   operands[1] = force_reg (Pmode, operands[1]);
   operands[2] = gen_reg_rtx (Pmode);
   operands[3] = gen_frame_mem (Pmode, operands[0]);
   operands[4] = gen_frame_mem (Pmode, operands[1]);
-  operands[5] = gen_frame_mem (BLKmode, operands[0]);
+  p = rtvec_alloc (1);
+  RTVEC_ELT (p, 0) = gen_rtx_SET (VOIDmode,
+				  gen_frame_mem (BLKmode, operands[0]),
+				  const0_rtx);
+  operands[5] = gen_rtx_PARALLEL (VOIDmode, p);
 }")
 
 (define_expand "save_stack_nonlocal"
@@ -12022,12 +12027,13 @@
   [(set (match_dup 2) (match_operand 1 "memory_operand" ""))
    (set (match_dup 3) (match_dup 4))
    (set (match_dup 5) (match_dup 2))
-   (set (match_dup 6) (unspec:BLK [(match_dup 6)] UNSPEC_TIE))
+   (match_dup 6)
    (set (match_operand 0 "register_operand" "") (match_dup 3))]
   ""
   "
 {
   int units_per_word = (TARGET_32BIT) ? 4 : 8;
+  rtvec p;
 
   /* Restore the backchain from the first word, sp from the second.  */
   operands[2] = gen_reg_rtx (Pmode);
@@ -12035,7 +12041,11 @@
   operands[1] = adjust_address_nv (operands[1], Pmode, 0);
   operands[4] = adjust_address_nv (operands[1], Pmode, units_per_word);
   operands[5] = gen_frame_mem (Pmode, operands[3]);
-  operands[6] = gen_frame_mem (BLKmode, operands[0]);
+  p = rtvec_alloc (1);
+  RTVEC_ELT (p, 0) = gen_rtx_SET (VOIDmode,
+				  gen_frame_mem (BLKmode, operands[0]),
+				  const0_rtx);
+  operands[6] = gen_rtx_PARALLEL (VOIDmode, p);
 }")
 \f
 ;; TOC register handling.
@@ -15899,25 +15909,15 @@
   [(set_attr "type" "branch")
    (set_attr "length" "4")])
 
-; These are to explain that changes to the stack pointer should
-; not be moved over stores to stack memory.
+; This is to explain that changes to the stack pointer should
+; not be moved over loads from or stores to stack memory.
 (define_insn "stack_tie"
-  [(set (match_operand:BLK 0 "memory_operand" "+m")
-        (unspec:BLK [(match_dup 0)] UNSPEC_TIE))]
+  [(match_parallel 0 "tie_operand"
+		   [(set (mem:BLK (reg 1)) (const_int 0))])]
   ""
   ""
   [(set_attr "length" "0")])
 
-; Like stack_tie, but depend on both fp and sp based memory.
-(define_insn "frame_tie"
-  [(set (match_operand:BLK 0 "memory_operand" "+m")
-	(unspec:BLK [(match_dup 0)
-		     (match_operand:BLK 1 "memory_operand" "m")] UNSPEC_TIE))]
-  ""
-  ""
-  [(set_attr "length" "0")])
-
-
 (define_expand "epilogue"
   [(use (const_int 0))]
   ""

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RS6000] Stack tie fix.
  2012-04-12 13:23                       ` Alan Modra
@ 2012-04-12 15:34                         ` David Edelsohn
  2012-04-12 16:03                           ` Richard Guenther
  2012-04-12 16:08                           ` Olivier Hainque
  0 siblings, 2 replies; 17+ messages in thread
From: David Edelsohn @ 2012-04-12 15:34 UTC (permalink / raw)
  To: amodra, Richard Guenther, Olivier Hainque, GCC Patches

On Thu, Apr 12, 2012 at 9:22 AM, Alan Modra <amodra@gmail.com> wrote:

> I tried that.  It doesn't work without something else in the insn to
> stop rtl-dce deleting it, so you may as well use SETs.  But thanks for
> the prod in the right direction.  We do get slightly better results
> when the regs are not hidden away in an UNSPEC, for instance
> non-stack writes/reads are seen by the alias oracle to not conflict
> with the epilogue frame deallocation.
>
> Bootstrapped etc. powerpc-linux.  OK to apply, David?
>
>        PR target/52828
>        * config/rs6000/rs6000.c (rs6000_emit_stack_tie): Rewrite with
>        tie regs on destination of sets.  Delete forward declaration.
>        (rs6000_emit_stack_reset): Update rs6000_emit_stack_tie calls.
>        (rs6000_emit_prologue): Likewise.
>        (rs6000_emit_epilogue): Likewise.  Use in place of gen_frame_tie
>        and gen_stack_tie.
>        (is_mem_ref): Use tie_operand to recognise stack ties.
>        * config/rs6000/predicates.md (tie_operand): New.
>        * config/rs6000/rs6000.md (UNSPEC_TIE): Delete.
>        (restore_stack_block): Generate new stack tie rtl.
>        (restore_stack_nonlocal): Likewise.
>        (stack_tie): Update.
>        (frame_tie): Delete.

This probably is getting close to the best we can do with GCC's RTL
alias analysis infrastructure.

This version is okay, but I also want to give Richi and Olivier an
opportunity to comment if they still have any concerns.

Thanks, David

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RS6000] Stack tie fix.
  2012-04-12 15:34                         ` David Edelsohn
@ 2012-04-12 16:03                           ` Richard Guenther
  2012-04-12 16:08                           ` Olivier Hainque
  1 sibling, 0 replies; 17+ messages in thread
From: Richard Guenther @ 2012-04-12 16:03 UTC (permalink / raw)
  To: David Edelsohn; +Cc: amodra, Olivier Hainque, GCC Patches

On Thu, Apr 12, 2012 at 5:34 PM, David Edelsohn <dje.gcc@gmail.com> wrote:
> On Thu, Apr 12, 2012 at 9:22 AM, Alan Modra <amodra@gmail.com> wrote:
>
>> I tried that.  It doesn't work without something else in the insn to
>> stop rtl-dce deleting it, so you may as well use SETs.  But thanks for
>> the prod in the right direction.  We do get slightly better results
>> when the regs are not hidden away in an UNSPEC, for instance
>> non-stack writes/reads are seen by the alias oracle to not conflict
>> with the epilogue frame deallocation.
>>
>> Bootstrapped etc. powerpc-linux.  OK to apply, David?
>>
>>        PR target/52828
>>        * config/rs6000/rs6000.c (rs6000_emit_stack_tie): Rewrite with
>>        tie regs on destination of sets.  Delete forward declaration.
>>        (rs6000_emit_stack_reset): Update rs6000_emit_stack_tie calls.
>>        (rs6000_emit_prologue): Likewise.
>>        (rs6000_emit_epilogue): Likewise.  Use in place of gen_frame_tie
>>        and gen_stack_tie.
>>        (is_mem_ref): Use tie_operand to recognise stack ties.
>>        * config/rs6000/predicates.md (tie_operand): New.
>>        * config/rs6000/rs6000.md (UNSPEC_TIE): Delete.
>>        (restore_stack_block): Generate new stack tie rtl.
>>        (restore_stack_nonlocal): Likewise.
>>        (stack_tie): Update.
>>        (frame_tie): Delete.
>
> This probably is getting close to the best we can do with GCC's RTL
> alias analysis infrastructure.
>
> This version is okay, but I also want to give Richi and Olivier an
> opportunity to comment if they still have any concerns.

It looks fine to me.

Richard.

> Thanks, David

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RS6000] Stack tie fix.
  2012-04-12 15:34                         ` David Edelsohn
  2012-04-12 16:03                           ` Richard Guenther
@ 2012-04-12 16:08                           ` Olivier Hainque
  2012-04-12 17:18                             ` Olivier Hainque
  1 sibling, 1 reply; 17+ messages in thread
From: Olivier Hainque @ 2012-04-12 16:08 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Olivier Hainque, amodra, Richard Guenther, GCC Patches


On Apr 12, 2012, at 17:34 , David Edelsohn wrote:
> This version is okay, but I also want to give Richi and Olivier an
> opportunity to comment if they still have any concerns.

 Thanks :) I'm rebuilding our old compiler with the patch
 and will double check how things go on our original failing case.

 Olivier

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RS6000] Stack tie fix.
  2012-04-12 16:08                           ` Olivier Hainque
@ 2012-04-12 17:18                             ` Olivier Hainque
  0 siblings, 0 replies; 17+ messages in thread
From: Olivier Hainque @ 2012-04-12 17:18 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Olivier Hainque, amodra, Richard Guenther, GCC Patches


On Apr 12, 2012, at 18:07 , Olivier Hainque wrote:

> I'm rebuilding our old compiler with the patch
> and will double check how things go on our original
> failing case.

 All is well :) Thanks Alan!


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2012-04-12 17:18 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-04-03  8:40 [RS6000] Stack tie fix Alan Modra
2012-04-03 14:35 ` Olivier Hainque
2012-04-03 14:56   ` Olivier Hainque
2012-04-03 17:05     ` David Edelsohn
2012-04-04  1:22       ` Alan Modra
2012-04-05 10:36         ` Olivier Hainque
2012-04-05 14:03           ` Alan Modra
2012-04-05 17:23             ` Olivier Hainque
2012-04-06  0:33               ` Alan Modra
2012-04-06 15:25                 ` Olivier Hainque
2012-04-11  4:16                   ` Alan Modra
2012-04-11  9:00                     ` Richard Guenther
2012-04-12 13:23                       ` Alan Modra
2012-04-12 15:34                         ` David Edelsohn
2012-04-12 16:03                           ` Richard Guenther
2012-04-12 16:08                           ` Olivier Hainque
2012-04-12 17:18                             ` Olivier Hainque

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).