Out-of-order update of new_spill_reg

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* Out-of-order update of new_spill_reg_store[]
@ 2011-10-09  9:28 Richard Sandiford
  2011-10-11 11:00 ` Bernd Schmidt
  2011-10-17 14:32 ` Bernd Schmidt
  0 siblings, 2 replies; 10+ messages in thread
From: Richard Sandiford @ 2011-10-09  9:28 UTC (permalink / raw)
  To: gcc-patches; +Cc: bernds, uweigand

This patch fixes an ordering problem in reload: the output reloads are
emitted in reverse operand order, but new_spill_reg_store[] is updated
in forward reload order.  This causes problems if the same register is
used for two reloads.

I saw this hit on mips64-linux-gnu/-mabi=64 as a failure in
execute/scal-to-vec1.c at -O3.  The reloads were:

Reloads for insn # 580
Reload 0: GR_REGS, RELOAD_FOR_OUTPUT_ADDRESS (opnum = 0), can't combine, secondary_reload_p
        reload_reg_rtx: (reg:SI 5 $5)
Reload 1: reload_out (SI) = (reg:SI 32 $f0 [1655])
        MD1_REG, RELOAD_FOR_OUTPUT (opnum = 0)
        reload_out_reg: (reg:SI 32 $f0 [1655])
        reload_reg_rtx: (reg:SI 65 lo)
        secondary_out_reload = 0

Reload 2: reload_out (SI) = (reg:SI 1656)
        GR_REGS, RELOAD_FOR_OUTPUT (opnum = 3)
        reload_out_reg: (reg:SI 1656)
        reload_reg_rtx: (reg:SI 5 $5)

So $5 is first stored in 1656 (operand 3), then $5 is used a secondary
reload in copying LO to $f0 (operand 0, reg 1655).  The next and final
use of 1655 ends up inheriting this second reload of $5, so we try to
delete the original output copy.  The problem is that we delete the
wrong one: we delete the store of $5 to 1656 rather than the copy of
$5 to 1655/$f0.

The fix I went for is to clear new_spill_reg_store[] for all reloads
as a separate pass (rather than in the main do_{input,output}_reload
loop), then only allow new_spill_store_reg[] to be set if the associated
reload register reaches the end of the reload sequence.

emit_input_reloads has:

      /* Output a special code sequence for this case, and forget about
	 spill reg information.  */
      new_spill_reg_store[REGNO (reloadreg)] = NULL;
      inc_for_reload (reloadreg, oldequiv, rl->out, rl->inc);

I think this store is redundant: emit_reload_insns should already have
cleared it, beth before and after the patch.  (The code was originally:

      /* Output a special code sequence for this case.  */
      new_spill_reg_store[REGNO (reloadreg)]
	= inc_for_reload (reloadreg, oldequiv, rl->out,
			  rl->inc);

but was changed because we can't inherit auto-inc reloads as easily
as that.  So the nullification came from an existing new_spill_reg_store[]
assignment, rather than being added explicitly.)

Also, emit_reload_insns has two blocks to record inheritance information:
one for spill registers and one for non-spill registers.  The spill version
checks that the reload register reaches the end of the sequence, and I think
the non-spill version should too.

Tested on mips64-linux-gnu and x86_64-linux-gnu.  It fixes the testcase
(by deleting the correct instruction -- the inheritance still happens).
Bernd, Uli, does this look OK?

Richard


gcc/
	* reload1.c (reload_regs_reach_end_p): Replace with...
	(reload_reg_rtx_reaches_end_p): ...this function.
	(new_spill_reg_store): Update commentary.
	(emit_input_reload_insns): Don't clear new_spill_reg_store here.
	(emit_output_reload_insns): Check reload_reg_rtx_reaches_end_p
	before setting new_spill_reg_store.
	(emit_reload_insns): Use a separate loop to clear new_spill_reg_store.
	Use reload_reg_rtx_reaches_end_p instead of reload_regs_reach_end_p.
	Also use reload_reg_rtx_reaches_end_p when recording inheritance
	information for non-spill reload registers.

Index: gcc/reload1.c
===================================================================
--- gcc/reload1.c	2011-10-08 16:32:26.000000000 +0100
+++ gcc/reload1.c	2011-10-08 16:32:26.000000000 +0100
@@ -5499,15 +5499,15 @@ reload_reg_reaches_end_p (unsigned int r
 }
 
 /* Like reload_reg_reaches_end_p, but check that the condition holds for
-   every register in the range [REGNO, REGNO + NREGS).  */
+   every register in REG.  */
 
 static bool
-reload_regs_reach_end_p (unsigned int regno, int nregs, int reloadnum)
+reload_reg_rtx_reaches_end_p (rtx reg, int reloadnum)
 {
-  int i;
+  unsigned int i;
 
-  for (i = 0; i < nregs; i++)
-    if (!reload_reg_reaches_end_p (regno + i, reloadnum))
+  for (i = REGNO (reg); i < END_REGNO (reg); i++)
+    if (!reload_reg_reaches_end_p (i, reloadnum))
       return false;
   return true;
 }
@@ -7052,7 +7052,9 @@ static rtx operand_reload_insns = 0;
 static rtx other_operand_reload_insns = 0;
 static rtx other_output_reload_insns[MAX_RECOG_OPERANDS];
 
-/* Values to be put in spill_reg_store are put here first.  */
+/* Values to be put in spill_reg_store are put here first.  Instructions
+   must only be placed here if the associated reload register reaches
+   the end of the instruction's reload sequence.  */
 static rtx new_spill_reg_store[FIRST_PSEUDO_REGISTER];
 static HARD_REG_SET reg_reloaded_died;
 
@@ -7213,9 +7215,7 @@ emit_input_reload_insns (struct insn_cha
 
       /* Prevent normal processing of this reload.  */
       special = 1;
-      /* Output a special code sequence for this case, and forget about
-	 spill reg information.  */
-      new_spill_reg_store[REGNO (reloadreg)] = NULL;
+      /* Output a special code sequence for this case.  */
       inc_for_reload (reloadreg, oldequiv, rl->out, rl->inc);
     }
 
@@ -7736,14 +7736,14 @@ emit_output_reload_insns (struct insn_ch
 		    rld[s].out_reg = rl->out_reg;
 		    set = single_set (next);
 		    if (set && SET_SRC (set) == s_reg
-			&& ! new_spill_reg_store[REGNO (s_reg)])
+			&& reload_reg_rtx_reaches_end_p (s_reg, s))
 		      {
 			SET_HARD_REG_BIT (reg_is_output_reload,
 					  REGNO (s_reg));
 			new_spill_reg_store[REGNO (s_reg)] = next;
 		      }
 		  }
-		else
+		else if (reload_reg_rtx_reaches_end_p (rl_reg_rtx, j))
 		  new_spill_reg_store[REGNO (rl_reg_rtx)] = p;
 	      }
 	  }
@@ -8003,6 +8003,15 @@ emit_reload_insns (struct insn_chain *ch
       debug_reload_to_stream (dump_file);
     }
 
+  for (j = 0; j < n_reloads; j++)
+    if (rld[j].reg_rtx && HARD_REGISTER_P (rld[j].reg_rtx))
+      {
+	unsigned int i;
+
+	for (i = REGNO (rld[j].reg_rtx); i < END_REGNO (rld[j].reg_rtx); i++)
+	  new_spill_reg_store[i] = 0;
+      }
+
   /* Now output the instructions to copy the data into and out of the
      reload registers.  Do these in the order that the reloads were reported,
      since reloads of base and index registers precede reloads of operands
@@ -8010,14 +8019,6 @@ emit_reload_insns (struct insn_chain *ch
 
   for (j = 0; j < n_reloads; j++)
     {
-      if (rld[j].reg_rtx && HARD_REGISTER_P (rld[j].reg_rtx))
-	{
-	  unsigned int i;
-
-	  for (i = REGNO (rld[j].reg_rtx); i < END_REGNO (rld[j].reg_rtx); i++)
-	    new_spill_reg_store[i] = 0;
-	}
-
       do_input_reload (chain, rld + j, j);
       do_output_reload (chain, rld + j, j);
     }
@@ -8143,15 +8144,13 @@ emit_reload_insns (struct insn_chain *ch
 			 && GET_CODE (rld[r].out) != PRE_MODIFY))))
 	    {
 	      rtx reg;
-	      enum machine_mode mode;
-	      int regno, nregs;
 
 	      reg = reload_reg_rtx_for_output[r];
-	      mode = GET_MODE (reg);
-	      regno = REGNO (reg);
-	      nregs = hard_regno_nregs[regno][mode];
-	      if (reload_regs_reach_end_p (regno, nregs, r))
+	      if (reload_reg_rtx_reaches_end_p (reg, r))
 		{
+		  enum machine_mode mode = GET_MODE (reg);
+		  int regno = REGNO (reg);
+		  int nregs = hard_regno_nregs[regno][mode];
 		  rtx out = (REG_P (rld[r].out)
 			     ? rld[r].out
 			     : rld[r].out_reg
@@ -8215,20 +8214,21 @@ emit_reload_insns (struct insn_chain *ch
 		   && !reg_set_p (reload_reg_rtx_for_input[r], PATTERN (insn)))
 	    {
 	      rtx reg;
-	      enum machine_mode mode;
-	      int regno, nregs;
 
 	      reg = reload_reg_rtx_for_input[r];
-	      mode = GET_MODE (reg);
-	      regno = REGNO (reg);
-	      nregs = hard_regno_nregs[regno][mode];
-	      if (reload_regs_reach_end_p (regno, nregs, r))
+	      if (reload_reg_rtx_reaches_end_p (reg, r))
 		{
+		  enum machine_mode mode;
+		  int regno;
+		  int nregs;
 		  int in_regno;
 		  int in_nregs;
 		  rtx in;
 		  bool piecemeal;
 
+		  mode = GET_MODE (reg);
+		  regno = REGNO (reg);
+		  nregs = hard_regno_nregs[regno][mode];
 		  if (REG_P (rld[r].in)
 		      && REGNO (rld[r].in) >= FIRST_PSEUDO_REGISTER)
 		    in = rld[r].in;
@@ -8329,30 +8329,33 @@ emit_reload_insns (struct insn_chain *ch
 		 the storing insn so that we may delete this insn with
 		 delete_output_reload.  */
 	      src_reg = reload_reg_rtx_for_output[r];
-
-	      /* If this is an optional reload, try to find the source reg
-		 from an input reload.  */
-	      if (! src_reg)
+	      if (src_reg
+		  && reload_reg_rtx_reaches_end_p (src_reg, r))
+		store_insn = new_spill_reg_store[REGNO (src_reg)];
+	      else
 		{
+		  /* If this is an optional reload, try to find the
+		     source reg from an input reload.  */
 		  rtx set = single_set (insn);
 		  if (set && SET_DEST (set) == rld[r].out)
 		    {
 		      int k;
+		      rtx cand;
 
 		      src_reg = SET_SRC (set);
 		      store_insn = insn;
 		      for (k = 0; k < n_reloads; k++)
-			{
-			  if (rld[k].in == src_reg)
-			    {
-			      src_reg = reload_reg_rtx_for_input[k];
-			      break;
-			    }
-			}
+			if (rld[k].in == src_reg)
+			  {
+			    cand = reload_reg_rtx_for_input[k];
+			    if (reload_reg_rtx_reaches_end_p (cand, k))
+			      {
+				src_reg = cand;
+				break;
+			      }
+			  }
 		    }
 		}
-	      else
-		store_insn = new_spill_reg_store[REGNO (src_reg)];
 	      if (src_reg && REG_P (src_reg)
 		  && REGNO (src_reg) < FIRST_PSEUDO_REGISTER)
 		{

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Out-of-order update of new_spill_reg_store[]
  2011-10-09  9:28 Out-of-order update of new_spill_reg_store[] Richard Sandiford
@ 2011-10-11 11:00 ` Bernd Schmidt
  2011-10-11 13:04   ` Richard Sandiford
  2011-10-17 14:32 ` Bernd Schmidt
  1 sibling, 1 reply; 10+ messages in thread
From: Bernd Schmidt @ 2011-10-11 11:00 UTC (permalink / raw)
  To: gcc-patches, uweigand, rdsandiford

I'm not completely following this yet, so please bear with me...

On 10/09/11 10:01, Richard Sandiford wrote:
> Reload 0: GR_REGS, RELOAD_FOR_OUTPUT_ADDRESS (opnum = 0), can't combine, secondary_reload_p
>         reload_reg_rtx: (reg:SI 5 $5)
> Reload 1: reload_out (SI) = (reg:SI 32 $f0 [1655])
>         MD1_REG, RELOAD_FOR_OUTPUT (opnum = 0)
>         reload_out_reg: (reg:SI 32 $f0 [1655])
>         reload_reg_rtx: (reg:SI 65 lo)
>         secondary_out_reload = 0
> 
> Reload 2: reload_out (SI) = (reg:SI 1656)
>         GR_REGS, RELOAD_FOR_OUTPUT (opnum = 3)
>         reload_out_reg: (reg:SI 1656)
>         reload_reg_rtx: (reg:SI 5 $5)
> 
> So $5 is first stored in 1656 (operand 3), then $5 is used a secondary
> reload in copying LO to $f0 (operand 0, reg 1655).  The next and final
> use of 1655 ends up inheriting this second reload of $5, so we try to
> delete the original output copy.  The problem is that we delete the
> wrong one: we delete the store of $5 to 1656 rather than the copy of
> $5 to 1655/$f0.

So, reload 1 inherited from somewhere else rather than using reg $5 from
its secondary reload? Where do we try to delete the insn, and what's the
state of the spill_reg_store data at that point?

> The fix I went for is to clear new_spill_reg_store[] for all reloads
> as a separate pass (rather than in the main do_{input,output}_reload
> loop), then only allow new_spill_store_reg[] to be set if the associated
> reload register reaches the end of the reload sequence.

In this case, reload 0 is emitted after reload 2, so it reaches the end.
Correct? What would happen if the 0/1 pair and 2 were swapped?


Bernd

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Out-of-order update of new_spill_reg_store[]
  2011-10-11 11:00 ` Bernd Schmidt
@ 2011-10-11 13:04   ` Richard Sandiford
  2011-10-12 13:20     ` Bernd Schmidt
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Sandiford @ 2011-10-11 13:04 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: gcc-patches, uweigand

Bernd Schmidt <bernds@codesourcery.com> writes:
> On 10/09/11 10:01, Richard Sandiford wrote:
>> Reload 0: GR_REGS, RELOAD_FOR_OUTPUT_ADDRESS (opnum = 0), can't combine, secondary_reload_p
>>         reload_reg_rtx: (reg:SI 5 $5)
>> Reload 1: reload_out (SI) = (reg:SI 32 $f0 [1655])
>>         MD1_REG, RELOAD_FOR_OUTPUT (opnum = 0)
>>         reload_out_reg: (reg:SI 32 $f0 [1655])
>>         reload_reg_rtx: (reg:SI 65 lo)
>>         secondary_out_reload = 0
>> 
>> Reload 2: reload_out (SI) = (reg:SI 1656)
>>         GR_REGS, RELOAD_FOR_OUTPUT (opnum = 3)
>>         reload_out_reg: (reg:SI 1656)
>>         reload_reg_rtx: (reg:SI 5 $5)
>> 
>> So $5 is first stored in 1656 (operand 3), then $5 is used a secondary
>> reload in copying LO to $f0 (operand 0, reg 1655).  The next and final
>> use of 1655 ends up inheriting this second reload of $5, so we try to
>> delete the original output copy.  The problem is that we delete the
>> wrong one: we delete the store of $5 to 1656 rather than the copy of
>> $5 to 1655/$f0.
>
> So, reload 1 inherited from somewhere else rather than using reg $5 from
> its secondary reload? Where do we try to delete the insn, and what's the
> state of the spill_reg_store data at that point?

No, reload 1 is inherited by a later instruction.  And it's inherited
correctly, in terms of the register contents being what we expect.
(Reload 1 is the one that survives to the end of the instruction's
reload sequence.  Reload 2, in contrast, is clobbered by reload 1,
so could not be inherited.  So when we record inheritance information
in emit_reload_insns, reload_reg_reaches_end_p correctly stops us
from recording reload 2 but allows us to record reload 1.)

The problem is that we record the wrong instruction for reload 1.
We say that reload 1 is performed by the instruction that performs
reload 2.  So spill_reg_store[] contains the instruction for reload 2
rather than the instruction for reload 1.  We delete it in
delete_output_reload at the point of inheritance.

>> The fix I went for is to clear new_spill_reg_store[] for all reloads
>> as a separate pass (rather than in the main do_{input,output}_reload
>> loop), then only allow new_spill_store_reg[] to be set if the associated
>> reload register reaches the end of the reload sequence.
>
> In this case, reload 0 is emitted after reload 2, so it reaches the end.
> Correct?

Right.  And this is a defined order: output reloads are emitted in reverse
operand order (operand N-1, ..., operand 0). reload_reg_reaches_end_p
and reload_reg_free_p both rely on this property.  In this case, reload 2
is associated with operand 3, while reload 1 is associated with operand 0,
so reload 2 has to come first.

> What would happen if the 0/1 pair and 2 were swapped?

If only the reload indices changed (so that reload 2 became 0) then the
reloads would still be emitted in the current order (because the order
is tied to the operand number rather than the reload number).  And the
patch would also record the same instruction in spill_reg_store[].
I think current trunk would also behave correctly in this case,
because the reload numbers happen to match the insn order.

If the operand numbers were swapped, then the order of the output
reloads would be too.  This register selection would then be invalid.
We would have a secondary reload that uses $5 being emitted while the $5
output from the main instruction is still live.  But reload_reg_free_p
knows to avoid this situation.

Richard

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Out-of-order update of new_spill_reg_store[]
  2011-10-11 13:04   ` Richard Sandiford
@ 2011-10-12 13:20     ` Bernd Schmidt
  2011-10-12 21:12       ` Richard Sandiford
  0 siblings, 1 reply; 10+ messages in thread
From: Bernd Schmidt @ 2011-10-12 13:20 UTC (permalink / raw)
  To: gcc-patches, uweigand, rdsandiford

On 10/11/11 14:35, Richard Sandiford wrote:
> No, reload 1 is inherited by a later instruction.  And it's inherited
> correctly, in terms of the register contents being what we expect.
> (Reload 1 is the one that survives to the end of the instruction's
> reload sequence.  Reload 2, in contrast, is clobbered by reload 1,
> so could not be inherited.  So when we record inheritance information
> in emit_reload_insns, reload_reg_reaches_end_p correctly stops us
> from recording reload 2 but allows us to record reload 1.)
> 
> The problem is that we record the wrong instruction for reload 1.
> We say that reload 1 is performed by the instruction that performs
> reload 2.  So spill_reg_store[] contains the instruction for reload 2
> rather than the instruction for reload 1.  We delete it in
> delete_output_reload at the point of inheritance.

Ok. So, would the minimal fix of testing !new_spill_reg_store[..] before
writing to it also work? Seems to me this would cope with the
out-of-order writes by only allowing the first.

If so, then I think I'd prefer that, but we could gcc_assert
(reload_reg_reaches_end (..)) as a bit of a verification of that function.


Bernd

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Out-of-order update of new_spill_reg_store[]
  2011-10-12 13:20     ` Bernd Schmidt
@ 2011-10-12 21:12       ` Richard Sandiford
  0 siblings, 0 replies; 10+ messages in thread
From: Richard Sandiford @ 2011-10-12 21:12 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: gcc-patches, uweigand

Bernd Schmidt <bernds@codesourcery.com> writes:
> On 10/11/11 14:35, Richard Sandiford wrote:
>> No, reload 1 is inherited by a later instruction.  And it's inherited
>> correctly, in terms of the register contents being what we expect.
>> (Reload 1 is the one that survives to the end of the instruction's
>> reload sequence.  Reload 2, in contrast, is clobbered by reload 1,
>> so could not be inherited.  So when we record inheritance information
>> in emit_reload_insns, reload_reg_reaches_end_p correctly stops us
>> from recording reload 2 but allows us to record reload 1.)
>> 
>> The problem is that we record the wrong instruction for reload 1.
>> We say that reload 1 is performed by the instruction that performs
>> reload 2.  So spill_reg_store[] contains the instruction for reload 2
>> rather than the instruction for reload 1.  We delete it in
>> delete_output_reload at the point of inheritance.
>
> Ok. So, would the minimal fix of testing !new_spill_reg_store[..] before
> writing to it also work? Seems to me this would cope with the
> out-of-order writes by only allowing the first.
>
> If so, then I think I'd prefer that, but we could gcc_assert
> (reload_reg_reaches_end (..)) as a bit of a verification of that function.

I don't think the assert would be safe.  We could have similar reuse in
cases where the first reload (in rld order) is a double-register value
starting in $4 and the second reload uses just $5.  In that case, the first
reload will have set new_spill_reg_store[4], so new_spill_reg_store[5] will
still be null.  But $5 in the second reload won't survive until the end of
the sequence.  So we'd try to set new_spill_reg_store[5] and trip the assert.

IMO it's a choice between just checking for null and not asserting
(if that really is safe, since we'll be storing instructions that
don't actually reach the end of the reload sequence), or checking
reload_reg_reaches_end.  I prefer the second, since it seems more
direct, and matches the corresponding code in emit_reload_insns.

Richard

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Out-of-order update of new_spill_reg_store[]
  2011-10-09  9:28 Out-of-order update of new_spill_reg_store[] Richard Sandiford
  2011-10-11 11:00 ` Bernd Schmidt
@ 2011-10-17 14:32 ` Bernd Schmidt
  2011-12-04 10:42   ` Richard Sandiford
  1 sibling, 1 reply; 10+ messages in thread
From: Bernd Schmidt @ 2011-10-17 14:32 UTC (permalink / raw)
  To: gcc-patches, uweigand, rdsandiford

> gcc/
> 	* reload1.c (reload_regs_reach_end_p): Replace with...
> 	(reload_reg_rtx_reaches_end_p): ...this function.
> 	(new_spill_reg_store): Update commentary.
> 	(emit_input_reload_insns): Don't clear new_spill_reg_store here.
> 	(emit_output_reload_insns): Check reload_reg_rtx_reaches_end_p
> 	before setting new_spill_reg_store.
> 	(emit_reload_insns): Use a separate loop to clear new_spill_reg_store.
> 	Use reload_reg_rtx_reaches_end_p instead of reload_regs_reach_end_p.
> 	Also use reload_reg_rtx_reaches_end_p when recording inheritance
> 	information for non-spill reload registers.

Just an update to say that based on our discussion I think the general
approach is OK, but I'm still trying to figure out what exactly this
piece of code is doing, and whether the changes to it make sense:

> @@ -8329,30 +8329,33 @@ emit_reload_insns (struct insn_chain *ch
>  		 the storing insn so that we may delete this insn with
>  		 delete_output_reload.  */
>  	      src_reg = reload_reg_rtx_for_output[r];
> -
> -	      /* If this is an optional reload, try to find the source reg
> -		 from an input reload.  */
> -	      if (! src_reg)
> +	      if (src_reg
> +		  && reload_reg_rtx_reaches_end_p (src_reg, r))
> +		store_insn = new_spill_reg_store[REGNO (src_reg)];
> +	      else
>  		{
> +		  /* If this is an optional reload, try to find the
> +		     source reg from an input reload.  */
>  		  rtx set = single_set (insn);
>  		  if (set && SET_DEST (set) == rld[r].out)
>  		    {
>  		      int k;
> +		      rtx cand;
>  
>  		      src_reg = SET_SRC (set);
>  		      store_insn = insn;
>  		      for (k = 0; k < n_reloads; k++)
> -			{
> -			  if (rld[k].in == src_reg)
> -			    {
> -			      src_reg = reload_reg_rtx_for_input[k];
> -			      break;
> -			    }
> -			}
> +			if (rld[k].in == src_reg)
> +			  {
> +			    cand = reload_reg_rtx_for_input[k];
> +			    if (reload_reg_rtx_reaches_end_p (cand, k))
> +			      {
> +				src_reg = cand;
> +				break;
> +			      }
> +			  }
>  		    }
>  		}
> -	      else
> -		store_insn = new_spill_reg_store[REGNO (src_reg)];
>  	      if (src_reg && REG_P (src_reg)
>  		  && REGNO (src_reg) < FIRST_PSEUDO_REGISTER)
>  		{


Bernd

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Out-of-order update of new_spill_reg_store[]
  2011-10-17 14:32 ` Bernd Schmidt
@ 2011-12-04 10:42   ` Richard Sandiford
  2012-02-01 16:09     ` Ulrich Weigand
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Sandiford @ 2011-12-04 10:42 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: gcc-patches, uweigand

Back to this...

Bernd Schmidt <bernds@codesourcery.com> writes:
>> gcc/
>> 	* reload1.c (reload_regs_reach_end_p): Replace with...
>> 	(reload_reg_rtx_reaches_end_p): ...this function.
>> 	(new_spill_reg_store): Update commentary.
>> 	(emit_input_reload_insns): Don't clear new_spill_reg_store here.
>> 	(emit_output_reload_insns): Check reload_reg_rtx_reaches_end_p
>> 	before setting new_spill_reg_store.
>> 	(emit_reload_insns): Use a separate loop to clear new_spill_reg_store.
>> 	Use reload_reg_rtx_reaches_end_p instead of reload_regs_reach_end_p.
>> 	Also use reload_reg_rtx_reaches_end_p when recording inheritance
>> 	information for non-spill reload registers.
>
> Just an update to say that based on our discussion I think the general
> approach is OK, but I'm still trying to figure out what exactly this
> piece of code is doing, and whether the changes to it make sense:
>
>> @@ -8329,30 +8329,33 @@ emit_reload_insns (struct insn_chain *ch
>>  		 the storing insn so that we may delete this insn with
>>  		 delete_output_reload.  */
>>  	      src_reg = reload_reg_rtx_for_output[r];
>> -
>> -	      /* If this is an optional reload, try to find the source reg
>> -		 from an input reload.  */
>> -	      if (! src_reg)
>> +	      if (src_reg
>> +		  && reload_reg_rtx_reaches_end_p (src_reg, r))
>> +		store_insn = new_spill_reg_store[REGNO (src_reg)];
>> +	      else
>>  		{
>> +		  /* If this is an optional reload, try to find the
>> +		     source reg from an input reload.  */
>>  		  rtx set = single_set (insn);
>>  		  if (set && SET_DEST (set) == rld[r].out)
>>  		    {
>>  		      int k;
>> +		      rtx cand;
>>  
>>  		      src_reg = SET_SRC (set);
>>  		      store_insn = insn;
>>  		      for (k = 0; k < n_reloads; k++)
>> -			{
>> -			  if (rld[k].in == src_reg)
>> -			    {
>> -			      src_reg = reload_reg_rtx_for_input[k];
>> -			      break;
>> -			    }
>> -			}
>> +			if (rld[k].in == src_reg)
>> +			  {
>> +			    cand = reload_reg_rtx_for_input[k];
>> +			    if (reload_reg_rtx_reaches_end_p (cand, k))
>> +			      {
>> +				src_reg = cand;
>> +				break;
>> +			      }
>> +			  }
>>  		    }
>>  		}
>> -	      else
>> -		store_insn = new_spill_reg_store[REGNO (src_reg)];
>>  	      if (src_reg && REG_P (src_reg)
>>  		  && REGNO (src_reg) < FIRST_PSEUDO_REGISTER)
>>  		{

Yeah, I was in two minds what to do here.  AIUI, the code:

	  if (!HARD_REGISTER_NUM_P (out_regno))
	    {
	      rtx src_reg, store_insn = NULL_RTX;

	      reg_last_reload_reg[out_regno] = 0;

	      /* If we can find a hard register that is stored, record
		 the storing insn so that we may delete this insn with
		 delete_output_reload.  */
	      src_reg = reload_reg_rtx_for_output[r];

	      /* If this is an optional reload, try to find the source reg
		 from an input reload.  */
	      if (! src_reg)
		{
		  rtx set = single_set (insn);
		  if (set && SET_DEST (set) == rld[r].out)
		    {
		      int k;

[A]		      src_reg = SET_SRC (set);
		      store_insn = insn;
		      for (k = 0; k < n_reloads; k++)
			{
			  if (rld[k].in == src_reg)
			    {
[B]			      src_reg = reload_reg_rtx_for_input[k];
			      break;
			    }
			}
		    }
		}
	      else
[C]		store_insn = new_spill_reg_store[REGNO (src_reg)];
	      if (src_reg && REG_P (src_reg)
		  && REGNO (src_reg) < FIRST_PSEUDO_REGISTER)
		{
                  ...record inheritance for out <- src_reg...

is coping with three cases:

 [C] we have an output reload for a non-spill register.  E.g. the
     pre-reload instruction might be:

        (set (reg P1) (.... (reg H1) ...))
             REG_DEAD: H1

     (H = hard register, P = pseudo register), where the P1 and H1
     operands have matching constraints.  In this case we might use
     H1 as the reload register for (reg P1).

     This is the case that really does need reload_reg_rtx_reaches_end_p.
     E.g. the SET above could be in parallel with another SET that needs
     a secondary output reload.  It's possible in principle for H1 to be
     used as the secondary reload register there too, so that the final
     sequence looks like:

       (parallel [(set (reg H1 [orig P1]) (... (reg H1) ...))
                  (set (reg H2 [orig P2]) (...))]
       (set (reg P1) (reg H1))
       (set (reg H1) (reg H2))
       (set (reg P2) (reg H1))

     (I say "in principle" because I don't know whether there is code
     to take advantage of this for non-spill registers like H1.
     But it would be a valid choice.)

     Either way, the idea is that new_spill_reg_store[R] is only valid
     for reload registers R that reach the end of the reload sequence.
     We really should check that H1 reaches the end before using
     new_spill_reg_store[H1].

 [A] (but not [B]).  I think this is for optional reloads that we
     decided not to make, in cases where the source of the set needs
     no reloads.  E.g. the pre-reload instruction might be:

       (set (reg P1) (reg H1))

     and P1 has an optional output reload that we decided not to make.
     Here we record that H1 holds P1 as a possible inheritance.
     If inheritance causes the P1 <- H1 move to become unnecessary,
     we can delete it as dead.

     There doesn't seem to be any kind of "reaches end" check, but I
     suppose the hope is that instructions like this are going to be so
     simple that there's no point.  Although I'm not sure whether for:

          (parallel [(set (reg P1) (reg H1))
                     (clobber (scratch))])

     we'd ever consider using H1 for the scratch in cases where the
     clobber isn't an earlyclobber.

     Either way, this is unrelated to the original problem of
     new_spill_reg_store being out of whack.  reload_reg_rtx_reaches_end_p
     wouldn't be the right thing to check, since we don't have a reload
     register to test in this case.  Because there's no "off the shelf"
     fix, it's probably best to leave it be until we have a testcase
     that exhibits a problem.

     The patch doesn't change this case.  

 [B] I think this is similar to [A], but for cases where the source
     is reloaded.  E.g. the pre-reload instruction might be:

       (set (reg P1) (reg P2))

     where P1 has an optional reload that we decided not to make and
     P2 is reloaded into H2.  The final sequence would look like:

       (set (reg H2) (reg P2))
       (set (reg P1) (reg H2))

     and the code would record P1 <- H2 for inheritance.

     There does seem to be a real danger here that H2 could be reused
     for clobbers (except again for earlyclobbers), so it seemed best
     to test reload_reg_rtx_reaches_end_p.  However, this change was
     by inspection, and isn't directly related to the new handling
     of new_spill_reg_store[], since the inherited "reload" is
     actually the reloaded instruction itself.

     If H2 doesn't reach the end, the patch falls back to [A].
     This means that if the original source is a hard register
     of the wrong class (instead of a pseudo as in the example above),
     we can continue to record an inheritance for that.

     I suppose it could be argued that this in itself has a risk
     (although I think a smaller one) because of the aforementioned
     lack of a "reaches end" test in [A].  So while the effect of the
     reload_reg_rtx_reaches_end_p check is to stop [B] from trumping [A]
     in cases where [B] is definitely invalid, it doesn't do anything to
     fix any problems that there might be in [A].

     Which is why this is a quandry.  After the patch for [C],
     I think this is the only remaining case in which we record
     inheritance for a reload register without checking whether
     the reload register reaches the end.  (Note that the "i >= 0"
     block above this one already applies the check consistently.)
     So on the one hand, it seems wrong to leave an obvious hole
     (especially one that will lead to silent wrong-code bugs),
     but on the other, it's always dangerous changing reload by
     inspection.

If you prefer changing as little as possible, then the version below
just does [C].  Also tested on mips64-linux-gnu.

Richard


gcc/
	* reload1.c (reload_regs_reach_end_p): Replace with...
	(reload_reg_rtx_reaches_end_p): ...this function.
	(new_spill_reg_store): Update commentary.
	(emit_input_reload_insns): Don't clear new_spill_reg_store here.
	(emit_output_reload_insns): Check reload_reg_rtx_reaches_end_p
	before setting new_spill_reg_store.
	(emit_reload_insns): Use a separate loop to clear new_spill_reg_store.
	Use reload_reg_rtx_reaches_end_p instead of reload_regs_reach_end_p.
	Also use reload_reg_rtx_reaches_end_p when reading new_spill_reg_store
	for non-spill reload registers.

Index: gcc/reload1.c
===================================================================
--- gcc/reload1.c	2011-11-27 10:06:56.000000000 +0000
+++ gcc/reload1.c	2011-12-03 14:29:26.000000000 +0000
@@ -5505,15 +5505,15 @@ reload_reg_reaches_end_p (unsigned int r
 }
 
 /* Like reload_reg_reaches_end_p, but check that the condition holds for
-   every register in the range [REGNO, REGNO + NREGS).  */
+   every register in REG.  */
 
 static bool
-reload_regs_reach_end_p (unsigned int regno, int nregs, int reloadnum)
+reload_reg_rtx_reaches_end_p (rtx reg, int reloadnum)
 {
-  int i;
+  unsigned int i;
 
-  for (i = 0; i < nregs; i++)
-    if (!reload_reg_reaches_end_p (regno + i, reloadnum))
+  for (i = REGNO (reg); i < END_REGNO (reg); i++)
+    if (!reload_reg_reaches_end_p (i, reloadnum))
       return false;
   return true;
 }
@@ -7058,7 +7058,9 @@ static rtx operand_reload_insns = 0;
 static rtx other_operand_reload_insns = 0;
 static rtx other_output_reload_insns[MAX_RECOG_OPERANDS];
 
-/* Values to be put in spill_reg_store are put here first.  */
+/* Values to be put in spill_reg_store are put here first.  Instructions
+   must only be placed here if the associated reload register reaches
+   the end of the instruction's reload sequence.  */
 static rtx new_spill_reg_store[FIRST_PSEUDO_REGISTER];
 static HARD_REG_SET reg_reloaded_died;
 
@@ -7219,9 +7221,7 @@ emit_input_reload_insns (struct insn_cha
 
       /* Prevent normal processing of this reload.  */
       special = 1;
-      /* Output a special code sequence for this case, and forget about
-	 spill reg information.  */
-      new_spill_reg_store[REGNO (reloadreg)] = NULL;
+      /* Output a special code sequence for this case.  */
       inc_for_reload (reloadreg, oldequiv, rl->out, rl->inc);
     }
 
@@ -7742,14 +7742,14 @@ emit_output_reload_insns (struct insn_ch
 		    rld[s].out_reg = rl->out_reg;
 		    set = single_set (next);
 		    if (set && SET_SRC (set) == s_reg
-			&& ! new_spill_reg_store[REGNO (s_reg)])
+			&& reload_reg_rtx_reaches_end_p (s_reg, s))
 		      {
 			SET_HARD_REG_BIT (reg_is_output_reload,
 					  REGNO (s_reg));
 			new_spill_reg_store[REGNO (s_reg)] = next;
 		      }
 		  }
-		else
+		else if (reload_reg_rtx_reaches_end_p (rl_reg_rtx, j))
 		  new_spill_reg_store[REGNO (rl_reg_rtx)] = p;
 	      }
 	  }
@@ -8009,6 +8009,15 @@ emit_reload_insns (struct insn_chain *ch
       debug_reload_to_stream (dump_file);
     }
 
+  for (j = 0; j < n_reloads; j++)
+    if (rld[j].reg_rtx && HARD_REGISTER_P (rld[j].reg_rtx))
+      {
+	unsigned int i;
+
+	for (i = REGNO (rld[j].reg_rtx); i < END_REGNO (rld[j].reg_rtx); i++)
+	  new_spill_reg_store[i] = 0;
+      }
+
   /* Now output the instructions to copy the data into and out of the
      reload registers.  Do these in the order that the reloads were reported,
      since reloads of base and index registers precede reloads of operands
@@ -8016,14 +8025,6 @@ emit_reload_insns (struct insn_chain *ch
 
   for (j = 0; j < n_reloads; j++)
     {
-      if (rld[j].reg_rtx && HARD_REGISTER_P (rld[j].reg_rtx))
-	{
-	  unsigned int i;
-
-	  for (i = REGNO (rld[j].reg_rtx); i < END_REGNO (rld[j].reg_rtx); i++)
-	    new_spill_reg_store[i] = 0;
-	}
-
       do_input_reload (chain, rld + j, j);
       do_output_reload (chain, rld + j, j);
     }
@@ -8149,15 +8150,13 @@ emit_reload_insns (struct insn_chain *ch
 			 && GET_CODE (rld[r].out) != PRE_MODIFY))))
 	    {
 	      rtx reg;
-	      enum machine_mode mode;
-	      int regno, nregs;
 
 	      reg = reload_reg_rtx_for_output[r];
-	      mode = GET_MODE (reg);
-	      regno = REGNO (reg);
-	      nregs = hard_regno_nregs[regno][mode];
-	      if (reload_regs_reach_end_p (regno, nregs, r))
+	      if (reload_reg_rtx_reaches_end_p (reg, r))
 		{
+		  enum machine_mode mode = GET_MODE (reg);
+		  int regno = REGNO (reg);
+		  int nregs = hard_regno_nregs[regno][mode];
 		  rtx out = (REG_P (rld[r].out)
 			     ? rld[r].out
 			     : rld[r].out_reg
@@ -8221,20 +8220,21 @@ emit_reload_insns (struct insn_chain *ch
 		   && !reg_set_p (reload_reg_rtx_for_input[r], PATTERN (insn)))
 	    {
 	      rtx reg;
-	      enum machine_mode mode;
-	      int regno, nregs;
 
 	      reg = reload_reg_rtx_for_input[r];
-	      mode = GET_MODE (reg);
-	      regno = REGNO (reg);
-	      nregs = hard_regno_nregs[regno][mode];
-	      if (reload_regs_reach_end_p (regno, nregs, r))
+	      if (reload_reg_rtx_reaches_end_p (reg, r))
 		{
+		  enum machine_mode mode;
+		  int regno;
+		  int nregs;
 		  int in_regno;
 		  int in_nregs;
 		  rtx in;
 		  bool piecemeal;
 
+		  mode = GET_MODE (reg);
+		  regno = REGNO (reg);
+		  nregs = hard_regno_nregs[regno][mode];
 		  if (REG_P (rld[r].in)
 		      && REGNO (rld[r].in) >= FIRST_PSEUDO_REGISTER)
 		    in = rld[r].in;
@@ -8336,10 +8336,17 @@ emit_reload_insns (struct insn_chain *ch
 		 delete_output_reload.  */
 	      src_reg = reload_reg_rtx_for_output[r];
 
-	      /* If this is an optional reload, try to find the source reg
-		 from an input reload.  */
-	      if (! src_reg)
+	      if (src_reg)
+		{
+		  if (reload_reg_rtx_reaches_end_p (src_reg, r))
+		    store_insn = new_spill_reg_store[REGNO (src_reg)];
+		  else
+		    src_reg = NULL_RTX;
+		}
+	      else
 		{
+		  /* If this is an optional reload, try to find the
+		     source reg from an input reload.  */
 		  rtx set = single_set (insn);
 		  if (set && SET_DEST (set) == rld[r].out)
 		    {
@@ -8357,8 +8364,6 @@ emit_reload_insns (struct insn_chain *ch
 			}
 		    }
 		}
-	      else
-		store_insn = new_spill_reg_store[REGNO (src_reg)];
 	      if (src_reg && REG_P (src_reg)
 		  && REGNO (src_reg) < FIRST_PSEUDO_REGISTER)
 		{

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Out-of-order update of new_spill_reg_store[]
  2011-12-04 10:42   ` Richard Sandiford
@ 2012-02-01 16:09     ` Ulrich Weigand
  2012-02-01 21:00       ` Richard Sandiford
  0 siblings, 1 reply; 10+ messages in thread
From: Ulrich Weigand @ 2012-02-01 16:09 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: Bernd Schmidt, gcc-patches

Richard Sandiford wrote:
> Bernd Schmidt <bernds@codesourcery.com> writes:
> >> gcc/
> >> 	* reload1.c (reload_regs_reach_end_p): Replace with...
> >> 	(reload_reg_rtx_reaches_end_p): ...this function.
> >> 	(new_spill_reg_store): Update commentary.
> >> 	(emit_input_reload_insns): Don't clear new_spill_reg_store here.
> >> 	(emit_output_reload_insns): Check reload_reg_rtx_reaches_end_p
> >> 	before setting new_spill_reg_store.
> >> 	(emit_reload_insns): Use a separate loop to clear new_spill_reg_store.
> >> 	Use reload_reg_rtx_reaches_end_p instead of reload_regs_reach_end_p.
> >> 	Also use reload_reg_rtx_reaches_end_p when recording inheritance
> >> 	information for non-spill reload registers.
> >
> > Just an update to say that based on our discussion I think the general
> > approach is OK, but I'm still trying to figure out what exactly this
> > piece of code is doing, and whether the changes to it make sense:

Not sure whether Bernd was planning to get back to this, but just a couple
of comments from my side ...

>      Either way, the idea is that new_spill_reg_store[R] is only valid
>      for reload registers R that reach the end of the reload sequence.
>      We really should check that H1 reaches the end before using
>      new_spill_reg_store[H1].

I'm a bit confused why it is necessary to check that the register reaches
the end of the reload sequence *both* when *setting* new_spill_reg_store,
and again when *using* the contents of new_spill_reg_store ...  Wouldn't
the latter be redundant if we already guarantee the former?

>  [A] (but not [B]).  I think this is for optional reloads that we
>      decided not to make, in cases where the source of the set needs
>      no reloads.  E.g. the pre-reload instruction might be:
> 
>        (set (reg P1) (reg H1))
> 
>      and P1 has an optional output reload that we decided not to make.
>      Here we record that H1 holds P1 as a possible inheritance.
>      If inheritance causes the P1 <- H1 move to become unnecessary,
>      we can delete it as dead.
> 
>      There doesn't seem to be any kind of "reaches end" check, but I
>      suppose the hope is that instructions like this are going to be so
>      simple that there's no point.  Although I'm not sure whether for:
> 
>           (parallel [(set (reg P1) (reg H1))
>                      (clobber (scratch))])
> 
>      we'd ever consider using H1 for the scratch in cases where the
>      clobber isn't an earlyclobber.

Well, there could still be an output reload on the instruction, I guess
(pre/post modify of an address register in the SET_DEST ?), which might
even involve secondary reloads ... and if H1 is dead in the insn, it
could possibly be chosen as the reload reg for one of these.

In any case, it seems strange to use some random SET_SRC register for
spill_reg_store purposes.  I'd have expected this array to only track
spill registers.  In fact, the comment says:
              /* If this is an optional reload, try to find the source reg
                 from an input reload.  */
So at least the comment seems to indicate that only case [B] was ever
intended to be handled here, not case [A].

>  [B] I think this is similar to [A], but for cases where the source
>      is reloaded.  E.g. the pre-reload instruction might be:
> 
>        (set (reg P1) (reg P2))
> 
>      where P1 has an optional reload that we decided not to make and
>      P2 is reloaded into H2.  The final sequence would look like:
> 
>        (set (reg H2) (reg P2))
>        (set (reg P1) (reg H2))
> 
>      and the code would record P1 <- H2 for inheritance.
> 
>      There does seem to be a real danger here that H2 could be reused
>      for clobbers (except again for earlyclobbers), so it seemed best
>      to test reload_reg_rtx_reaches_end_p.  However, this change was
>      by inspection, and isn't directly related to the new handling
>      of new_spill_reg_store[], since the inherited "reload" is
>      actually the reloaded instruction itself.
> 
>      If H2 doesn't reach the end, the patch falls back to [A].
>      This means that if the original source is a hard register
>      of the wrong class (instead of a pseudo as in the example above),
>      we can continue to record an inheritance for that.

This has again the problem whether we even want/can handle case [A]
correctly.  And even if so, there's the additional problem that if
there is an input reload, the SET_SRC is most likely going to be
replaced by some reload register (note that emit_reload_insns is
called *before* subst_reloads), which makes it really unclear how
valid it is to use the original value of SET_SRC before substitution ...

Maybe the conservative fix would be to not handle case [A], and neither
case [B] when the register does not reach the end.  Not sure if this
would show up as performance regression somewhere, but it seems unlikely
to me.

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  Ulrich.Weigand@de.ibm.com

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Out-of-order update of new_spill_reg_store[]
  2012-02-01 16:09     ` Ulrich Weigand
@ 2012-02-01 21:00       ` Richard Sandiford
  2012-02-02 18:04         ` Ulrich Weigand
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Sandiford @ 2012-02-01 21:00 UTC (permalink / raw)
  To: Ulrich Weigand; +Cc: Bernd Schmidt, gcc-patches

Thanks for looking at this.

"Ulrich Weigand" <uweigand@de.ibm.com> writes:
> Richard Sandiford wrote:
>>      Either way, the idea is that new_spill_reg_store[R] is only valid
>>      for reload registers R that reach the end of the reload sequence.
>>      We really should check that H1 reaches the end before using
>>      new_spill_reg_store[H1].
>
> I'm a bit confused why it is necessary to check that the register reaches
> the end of the reload sequence *both* when *setting* new_spill_reg_store,
> and again when *using* the contents of new_spill_reg_store ...  Wouldn't
> the latter be redundant if we already guarantee the former?

The idea was to handle the case where we use the same register for
two reloads.  One of them reaches the end but the other one doesn't.
(In the testcase, the one that reaches the end is from the second
copy in a secondary output reload, while the first is a normal
copy-out reload.)

We need to check when setting because we set new_spill_reg_store[]
in rld[] order, which isn't necessarily the same as the order in
which the reloads are actually emitted into the insn stream.
So the entry for the wrong reload can end up overwriting the
right one.

We also need to check when using new_spill_reg_store[] because we
consider recording inheritance information for both reloads.  Only one
of them actually reaches the end, and new_spill_reg_store[] is only valid
for that one.  We shouldn't record any inheritance information for the other.

>>  [A] (but not [B]).  I think this is for optional reloads that we
>>      decided not to make, in cases where the source of the set needs
>>      no reloads.  E.g. the pre-reload instruction might be:
>> 
>>        (set (reg P1) (reg H1))
>> 
>>      and P1 has an optional output reload that we decided not to make.
>>      Here we record that H1 holds P1 as a possible inheritance.
>>      If inheritance causes the P1 <- H1 move to become unnecessary,
>>      we can delete it as dead.
>> 
>>      There doesn't seem to be any kind of "reaches end" check, but I
>>      suppose the hope is that instructions like this are going to be so
>>      simple that there's no point.  Although I'm not sure whether for:
>> 
>>           (parallel [(set (reg P1) (reg H1))
>>                      (clobber (scratch))])
>> 
>>      we'd ever consider using H1 for the scratch in cases where the
>>      clobber isn't an earlyclobber.
>
> Well, there could still be an output reload on the instruction, I guess
> (pre/post modify of an address register in the SET_DEST ?), which might
> even involve secondary reloads ... and if H1 is dead in the insn, it
> could possibly be chosen as the reload reg for one of these.

Hopefully that particular case wouldn't be a problem because autoinc
reloads need a register that is free both before and after the instruction.
But I agree this code doesn't look particularly robust.

> In any case, it seems strange to use some random SET_SRC register for
> spill_reg_store purposes.  I'd have expected this array to only track
> spill registers.  In fact, the comment says:
>               /* If this is an optional reload, try to find the source reg
>                  from an input reload.  */
> So at least the comment seems to indicate that only case [B] was ever
> intended to be handled here, not case [A].

In that case though, I don't really see what the src_reg assignment:

		  if (set && SET_DEST (set) == rld[r].out)
		    {
		      int k;

		      src_reg = SET_SRC (set);

is doing, since it is only used if we can't find an input reload.

>>  [B] I think this is similar to [A], but for cases where the source
>>      is reloaded.  E.g. the pre-reload instruction might be:
>> 
>>        (set (reg P1) (reg P2))
>> 
>>      where P1 has an optional reload that we decided not to make and
>>      P2 is reloaded into H2.  The final sequence would look like:
>> 
>>        (set (reg H2) (reg P2))
>>        (set (reg P1) (reg H2))
>> 
>>      and the code would record P1 <- H2 for inheritance.
>> 
>>      There does seem to be a real danger here that H2 could be reused
>>      for clobbers (except again for earlyclobbers), so it seemed best
>>      to test reload_reg_rtx_reaches_end_p.  However, this change was
>>      by inspection, and isn't directly related to the new handling
>>      of new_spill_reg_store[], since the inherited "reload" is
>>      actually the reloaded instruction itself.
>> 
>>      If H2 doesn't reach the end, the patch falls back to [A].
>>      This means that if the original source is a hard register
>>      of the wrong class (instead of a pseudo as in the example above),
>>      we can continue to record an inheritance for that.
>
> This has again the problem whether we even want/can handle case [A]
> correctly.  And even if so, there's the additional problem that if
> there is an input reload, the SET_SRC is most likely going to be
> replaced by some reload register (note that emit_reload_insns is
> called *before* subst_reloads), which makes it really unclear how
> valid it is to use the original value of SET_SRC before substitution ...

But I think the idea is that any input reload like that will be copying
the current SET_SRC to another register.  And the hope is presumably
that a R1 <- R2 reload won't reuse R2, even if it needs secondary reloads.
Again, I agree it doesn't feel robust though.

> Maybe the conservative fix would be to not handle case [A], and neither
> case [B] when the register does not reach the end.  Not sure if this
> would show up as performance regression somewhere, but it seems unlikely
> to me.

I'd hope so too, but since that first src_reg was presumably added for a
reason, I'm a bit reluctant to do something so daring at this stage. :-)

It looks like the main sticking point for both you and Bernd is the:

		{
		  rtx set = single_set (insn);
		  if (set && SET_DEST (set) == rld[r].out)
		    {
		      int k;

		      src_reg = SET_SRC (set);
		      store_insn = insn;
		      for (k = 0; k < n_reloads; k++)
			{
			  if (rld[k].in == src_reg)
			    {
			      src_reg = reload_reg_rtx_for_input[k];
			      break;
			    }
			}
		    }
		}

block.  Since that part wasn't really directly related to the bug
I was trying to fix, more of a misguided attempt to be complete,
my preference would be to go for the second patch, which leaves
the block untouched.

Richard

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Out-of-order update of new_spill_reg_store[]
  2012-02-01 21:00       ` Richard Sandiford
@ 2012-02-02 18:04         ` Ulrich Weigand
  0 siblings, 0 replies; 10+ messages in thread
From: Ulrich Weigand @ 2012-02-02 18:04 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: Bernd Schmidt, gcc-patches

Richard Sandiford wrote:
> "Ulrich Weigand" <uweigand@de.ibm.com> writes:
> > Richard Sandiford wrote:
> >>      Either way, the idea is that new_spill_reg_store[R] is only valid
> >>      for reload registers R that reach the end of the reload sequence.
> >>      We really should check that H1 reaches the end before using
> >>      new_spill_reg_store[H1].
> >
> > I'm a bit confused why it is necessary to check that the register reaches
> > the end of the reload sequence *both* when *setting* new_spill_reg_store,
> > and again when *using* the contents of new_spill_reg_store ...  Wouldn't
> > the latter be redundant if we already guarantee the former?
> 
> The idea was to handle the case where we use the same register for
> two reloads.  One of them reaches the end but the other one doesn't.

Ah, now I get it.  Thanks for the explanation!

> > In any case, it seems strange to use some random SET_SRC register for
> > spill_reg_store purposes.  I'd have expected this array to only track
> > spill registers.  In fact, the comment says:
> >               /* If this is an optional reload, try to find the source reg
> >                  from an input reload.  */
> > So at least the comment seems to indicate that only case [B] was ever
> > intended to be handled here, not case [A].
> 
> In that case though, I don't really see what the src_reg assignment:
> 
> 		  if (set && SET_DEST (set) == rld[r].out)
> 		    {
> 		      int k;
> 
> 		      src_reg = SET_SRC (set);
> 
> is doing, since it is only used if we can't find an input reload.

Oh, I agree -- I was just pointing out that the code doesn't match the
comment here.  I'd assume the intent was to catch even more options for
reload inheritance; I'm just not really sure doing it this way is safe.
(I'm also not sure how much this actually buys us; it would be interesting
to experiment with disabling this code path and see what if any differences
in code generation on various platforms result ...)

> > Maybe the conservative fix would be to not handle case [A], and neither
> > case [B] when the register does not reach the end.  Not sure if this
> > would show up as performance regression somewhere, but it seems unlikely
> > to me.
> 
> I'd hope so too, but since that first src_reg was presumably added for a
> reason, I'm a bit reluctant to do something so daring at this stage. :-)

Agreed, this looks like something for later.

> block.  Since that part wasn't really directly related to the bug
> I was trying to fix, more of a misguided attempt to be complete,
> my preference would be to go for the second patch, which leaves
> the block untouched.

Makes sense to me.

I'd still like to give Bernd a chance to weigh in (since he already looked
at the patch before), but if he doesn't within the next couple of days,
your (second) patch is OK for mainline.

Thanks,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  Ulrich.Weigand@de.ibm.com

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2012-02-02 18:04 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-10-09  9:28 Out-of-order update of new_spill_reg_store[] Richard Sandiford
2011-10-11 11:00 ` Bernd Schmidt
2011-10-11 13:04   ` Richard Sandiford
2011-10-12 13:20     ` Bernd Schmidt
2011-10-12 21:12       ` Richard Sandiford
2011-10-17 14:32 ` Bernd Schmidt
2011-12-04 10:42   ` Richard Sandiford
2012-02-01 16:09     ` Ulrich Weigand
2012-02-01 21:00       ` Richard Sandiford
2012-02-02 18:04         ` Ulrich Weigand

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).