public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH][RFC, Reload]. Reload bug?
@ 2012-06-28 21:46 Tejas Belagod
  2012-06-29 13:16 ` Ulrich Weigand
  0 siblings, 1 reply; 9+ messages in thread
From: Tejas Belagod @ 2012-06-28 21:46 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 3976 bytes --]


Hi,

Attached is a fix for what seems to be a reload bug while handling 
subreg(mem...). I ran into this problem while implementing support for struct 
load/store in AArch64 using the standard patterns 
vec_<loadstore>_lanes<large_int_mode><vec_mode> on the same lines of the ARM 
backend. The test case that caused the issue was:

void SexiALI_Convert(void *vdest, void *vsrc, unsigned int frames, int n)
{
  unsigned int x;
  short *src = vsrc;
  unsigned char *dest = vdest;
  for(x=0;x<256;x++)
  {
   int tmp;
   tmp = *src;
   src++;
   tmp += *src;
   src++;
   *dest++ = tmp;
  }
}

Before reload, this is the RTL dump I see:

.....
(insn 110 114 111 4 (set (reg:V8HI 158 [ vect_var_.21 ])
         (subreg:V8HI (reg:OI 530 [ vect_array.20 ]) 0)) ice.i:9 512 
{*aarch64_simd_movv8hi}
      (nil))

(insn 111 110 115 4 (set (reg:V8HI 159 [ vect_var_.22 ])
         (subreg:V8HI (reg:OI 530 [ vect_array.20 ]) 16)) ice.i:9 512 
{*aarch64_simd_movv8hi}
      (expr_list:REG_DEAD (reg:OI 530 [ vect_array.20 ])
         (nil)))

(insn 115 111 116 4 (set (reg:V8HI 161 [ vect_var_.24 ])
         (subreg:V8HI (reg:OI 529 [ vect_array.23 ]) 0)) ice.i:9 512 
{*aarch64_simd_movv8hi}
      (nil))

(insn 116 115 117 4 (set (reg:V8HI 162 [ vect_var_.25 ])
         (subreg:V8HI (reg:OI 529 [ vect_array.23 ]) 16)) ice.i:9 512 
{*aarch64_simd_movv8hi}
      (expr_list:REG_DEAD (reg:OI 529 [ vect_array.23 ])
         (nil)))

(insn 117 116 118 4 (set (reg:V4SI 544 [ vect_var_.27 ])
         (sign_extend:V4SI (vec_select:V4HI (reg:V8HI 159 [ vect_var_.22 ])
                 (parallel:V8HI [
                         (const_int 0 [0])
                         (const_int 1 [0x1])
                         (const_int 2 [0x2])
                         (const_int 3 [0x3])
                     ])))) ice.i:11 700 {aarch64_simd_vec_unpacks_lo_v8hi}
      (nil))

(insn 118 117 124 4 (set (reg:V4SI 545 [ vect_var_.26 ])
         (sign_extend:V4SI (vec_select:V4HI (reg:V8HI 158 [ vect_var_.21 ])
                 (parallel:V8HI [
                         (const_int 0 [0])
                         (const_int 1 [0x1])
                         (const_int 2 [0x2])
                         (const_int 3 [0x3])
                     ])))) ice.i:9 700 {aarch64_simd_vec_unpacks_lo_v8hi}
      (nil))

.....

In insn 116, reg_equiv_mem () of the psuedoreg 529 is (mem:OI (reg sp)), and the 
subreg is equivalent to:
	subreg:V8HI (mem:OI (reg sp) 16)
which does not get folded into
	mem:V8HI (plus:DI (reg sp) (const_int 16))
because, in reload.c:find_reloads_toplev () where such subregs are narrowed into
narower memrefs, the memref supplied to strict_memory_address_addr_space_P () is
just (mem:OI (reg sp)) and the SUBREG_BYTE is forgotten. Therefore
strict_memory_address_addr_space_P () thinks that (mem:OI (reg sp)) is a
valid target address and lets it pass as a subreg and does not narrow the subreg
into a narrower memref. find_reloads_toplev () should have infact given
strict_memory_address_addr_space_P () (mem:OI (plus:DI (reg sp) (const_int 16)) 
) which will be returned as false as base+offset is invalid for NEON addressing
modes and this will be reloaded into a narrower memref.

Also, I tried writing a secondary reload for this, but at no time is the RTL
	(subreg:V8HI (mem:OI (reg sp)) 16)
available to the target secondary reload for it to fix it up.

Therefore, I've fixed find_reloads_toplev () to pass the full address to 
strict_memory_address_addr_space_P () in the case of subregs.

Does this look like a sane fix?

I've tested this patch on arm-none-eabi and bootstrapped on x86_64-pc-linux and
all is well.

Thanks,
Tejas Belagod.
ARM.

Changelog:

2012-06-28  Tejas Belagod  <tejas.belagod@arm.com>

gcc/
	* reload.c (find_reloads_toplev): Include the subreg byte in the address
	of memrefs when	converting subregs of mems into narrower memrefs.

[-- Attachment #2: reload-1.txt --]
[-- Type: text/plain, Size: 1480 bytes --]

diff --git a/gcc/reload.c b/gcc/reload.c
index e42cc5c..b6d4ce9 100644
--- a/gcc/reload.c
+++ b/gcc/reload.c
@@ -4771,15 +4771,27 @@ find_reloads_toplev (rtx x, int opnum, enum reload_type type,
 #ifdef LOAD_EXTEND_OP
 	  && !paradoxical_subreg_p (x)
 #endif
-	  && (reg_equiv_address (regno) != 0
-	      || (reg_equiv_mem (regno) != 0
-		  && (! strict_memory_address_addr_space_p
-		      (GET_MODE (x), XEXP (reg_equiv_mem (regno), 0),
-		       MEM_ADDR_SPACE (reg_equiv_mem (regno)))
-		      || ! offsettable_memref_p (reg_equiv_mem (regno))
-		      || num_not_at_initial_offset))))
-	x = find_reloads_subreg_address (x, 1, opnum, type, ind_levels,
-					   insn, address_reloaded);
+	 )
+	{
+	  if (reg_equiv_address (regno) != 0)
+	    x = find_reloads_subreg_address (x, 1, opnum, type, ind_levels,
+					     insn, address_reloaded);
+	  else if (reg_equiv_mem (regno) != 0)
+	    {
+	      tem =
+		simplify_gen_subreg (GET_MODE (x), reg_equiv_mem (regno),
+				     GET_MODE (SUBREG_REG (x)),
+				     SUBREG_BYTE (x));
+	      gcc_assert (tem);
+	      if (MEM_P (tem)
+		  && (!strict_memory_address_addr_space_p
+			(GET_MODE (x), tem, MEM_ADDR_SPACE (tem))
+		      || !offsettable_memref_p (tem)
+		      || num_not_at_initial_offset))
+		x = find_reloads_subreg_address (x, 1, opnum, type, ind_levels,
+						 insn, address_reloaded);
+	    }
+	}
     }
 
   for (copied = 0, i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH][RFC, Reload]. Reload bug?
  2012-06-28 21:46 [PATCH][RFC, Reload]. Reload bug? Tejas Belagod
@ 2012-06-29 13:16 ` Ulrich Weigand
  2012-06-29 14:17   ` Tejas Belagod
  0 siblings, 1 reply; 9+ messages in thread
From: Ulrich Weigand @ 2012-06-29 13:16 UTC (permalink / raw)
  To: Tejas Belagod; +Cc: gcc-patches

Tejas Belagod wrote:

> Therefore strict_memory_address_addr_space_P () thinks that
> (mem:OI (reg sp)) is a valid target address and lets it pass as
> a subreg and does not narrow the subreg into a narrower memref.
> find_reloads_toplev () should have infact given 
> strict_memory_address_addr_space_P ()
> (mem:OI (plus:DI (reg sp) (const_int 16)))
> which will be returned as false as base+offset is invalid for NEON
> addressing modes and this will be reloaded into a narrower memref.

Huh.  I would have expected the offsettable_memref_p check

> -	  && (reg_equiv_address (regno) != 0
> -	      || (reg_equiv_mem (regno) != 0
> -		  && (! strict_memory_address_addr_space_p
> -		      (GET_MODE (x), XEXP (reg_equiv_mem (regno), 0),
> -		       MEM_ADDR_SPACE (reg_equiv_mem (regno)))
> -		      || ! offsettable_memref_p (reg_equiv_mem (regno))

^^^ here
> -		      || num_not_at_initial_offset))))

to fail, which should cause find_reloads_subreg_address to get called.

Why is that not happening for you?

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  Ulrich.Weigand@de.ibm.com

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH][RFC, Reload]. Reload bug?
  2012-06-29 13:16 ` Ulrich Weigand
@ 2012-06-29 14:17   ` Tejas Belagod
  2012-07-04 10:41     ` Tejas Belagod
  0 siblings, 1 reply; 9+ messages in thread
From: Tejas Belagod @ 2012-06-29 14:17 UTC (permalink / raw)
  To: Ulrich Weigand; +Cc: gcc-patches

Ulrich Weigand wrote:
> Tejas Belagod wrote:
> 
>> Therefore strict_memory_address_addr_space_P () thinks that
>> (mem:OI (reg sp)) is a valid target address and lets it pass as
>> a subreg and does not narrow the subreg into a narrower memref.
>> find_reloads_toplev () should have infact given 
>> strict_memory_address_addr_space_P ()
>> (mem:OI (plus:DI (reg sp) (const_int 16)))
>> which will be returned as false as base+offset is invalid for NEON
>> addressing modes and this will be reloaded into a narrower memref.
> 
> Huh.  I would have expected the offsettable_memref_p check
> 
>> -	  && (reg_equiv_address (regno) != 0
>> -	      || (reg_equiv_mem (regno) != 0
>> -		  && (! strict_memory_address_addr_space_p
>> -		      (GET_MODE (x), XEXP (reg_equiv_mem (regno), 0),
>> -		       MEM_ADDR_SPACE (reg_equiv_mem (regno)))
>> -		      || ! offsettable_memref_p (reg_equiv_mem (regno))
> 
> ^^^ here
>> -		      || num_not_at_initial_offset))))
> 
> to fail, which should cause find_reloads_subreg_address to get called.
> 
> Why is that not happening for you?

This is because offsettable_address_addr_space_p () gets as far as calling
strict_memory_address_addr_space_p () with a QImode and (mode_sz - 1) which
returns true. The only way I see offsettable_address_addr_space_p () returning
false would be mode_dependent_address_p () to return true for addr expression
(PLUS (reg) (16)) which partly makes sense to me because PLUS is a
mode-dependent address in that it cannot be allowed for NEON addressing modes,
but it seems very generic for mode_dependent_address_p () to return true for
PLUS in general instead of making a special case for vector modes. This
distinction cannot be made in the target's mode_dependent_address_p() as 'mode'
is not supplied to it.

Thanks,
Tejas Belagod.
ARM.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH][RFC, Reload]. Reload bug?
  2012-06-29 14:17   ` Tejas Belagod
@ 2012-07-04 10:41     ` Tejas Belagod
  2012-07-27 18:19       ` [PATCH, RFC] Re-work find_reloads_subreg_address (Re: [PATCH][RFC, Reload]. Reload bug?) Ulrich Weigand
  0 siblings, 1 reply; 9+ messages in thread
From: Tejas Belagod @ 2012-07-04 10:41 UTC (permalink / raw)
  To: Ulrich Weigand; +Cc: gcc-patches

Tejas Belagod wrote:
> Ulrich Weigand wrote:
>> Tejas Belagod wrote:
>>
>>> Therefore strict_memory_address_addr_space_P () thinks that
>>> (mem:OI (reg sp)) is a valid target address and lets it pass as
>>> a subreg and does not narrow the subreg into a narrower memref.
>>> find_reloads_toplev () should have infact given 
>>> strict_memory_address_addr_space_P ()
>>> (mem:OI (plus:DI (reg sp) (const_int 16)))
>>> which will be returned as false as base+offset is invalid for NEON
>>> addressing modes and this will be reloaded into a narrower memref.
>> Huh.  I would have expected the offsettable_memref_p check
>>
>>> -	  && (reg_equiv_address (regno) != 0
>>> -	      || (reg_equiv_mem (regno) != 0
>>> -		  && (! strict_memory_address_addr_space_p
>>> -		      (GET_MODE (x), XEXP (reg_equiv_mem (regno), 0),
>>> -		       MEM_ADDR_SPACE (reg_equiv_mem (regno)))
>>> -		      || ! offsettable_memref_p (reg_equiv_mem (regno))
>> ^^^ here
>>> -		      || num_not_at_initial_offset))))
>> to fail, which should cause find_reloads_subreg_address to get called.
>>
>> Why is that not happening for you?
> 
> This is because offsettable_address_addr_space_p () gets as far as calling
> strict_memory_address_addr_space_p () with a QImode and (mode_sz - 1) which
> returns true. The only way I see offsettable_address_addr_space_p () returning
> false would be mode_dependent_address_p () to return true for addr expression
> (PLUS (reg) (16)) which partly makes sense to me because PLUS is a
> mode-dependent address in that it cannot be allowed for NEON addressing modes,
> but it seems very generic for mode_dependent_address_p () to return true for
> PLUS in general instead of making a special case for vector modes. This
> distinction cannot be made in the target's mode_dependent_address_p() as 'mode'
> is not supplied to it.
> 

I dug a little deeper into offsettable_address_addr_space_p () and realized that
it only gets reg_equiv_mem () which is MEM:OI (reg sp) to work with which does
not include the SUBREG_BYTE, therefore mode_dependent_address_p () does not have
PLUS to check for address tree-mode dependency.

Thanks,
Tejas Belagod.
ARM.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH, RFC] Re-work find_reloads_subreg_address (Re: [PATCH][RFC, Reload]. Reload bug?)
  2012-07-04 10:41     ` Tejas Belagod
@ 2012-07-27 18:19       ` Ulrich Weigand
  2012-08-02 11:11         ` Tejas Belagod
  0 siblings, 1 reply; 9+ messages in thread
From: Ulrich Weigand @ 2012-07-27 18:19 UTC (permalink / raw)
  To: Tejas Belagod; +Cc: gcc-patches, bernds, vmakarov

Tejas Belagod wrote:
> Tejas Belagod wrote:
> > This is because offsettable_address_addr_space_p () gets as far as calling
> > strict_memory_address_addr_space_p () with a QImode and (mode_sz - 1) which
> > returns true. The only way I see offsettable_address_addr_space_p () returning
> > false would be mode_dependent_address_p () to return true for addr expression
> > (PLUS (reg) (16)) which partly makes sense to me because PLUS is a
> > mode-dependent address in that it cannot be allowed for NEON addressing modes,
> > but it seems very generic for mode_dependent_address_p () to return true for
> > PLUS in general instead of making a special case for vector modes. This
> > distinction cannot be made in the target's mode_dependent_address_p() as 'mode'
> > is not supplied to it.
> 
> I dug a little deeper into offsettable_address_addr_space_p () and realized that
> it only gets reg_equiv_mem () which is MEM:OI (reg sp) to work with which does
> not include the SUBREG_BYTE, therefore mode_dependent_address_p () does not have
> PLUS to check for address tree-mode dependency.

Sorry for the late reply, it took me a while to understand what's
really going on here.

I now agree that this is definitely a bug in reload; it's clear that
offsettable_memref_p does not and cannot catch this case.  In fact,
it does not even have enough information to answer the question in
any sensible way (except for just about always returning "no", which
wouldn't really be useful).

I also agree with the general approach in your patch.  The basic idea
is that it makes no sense to ask a generic question like "would it
be valid to add some (unknown) offset and change to some (unknown)
mode?", when instead we can ask quite specifically for the *known*
offset and mode we want to change to.  However, I'd prefer this to
go even further than what your patch does: I think we should not
be querying "offsettable_memref_p" *at all* here.

With your patch, you still call offsettable_memref_p on the address
that has already been offset -- asking for more offsets seems pointless.
Also, there is another call to offsettable_memref_p *within*
find_reloads_subreg_address --which gets used when FORCE_REPLACE is
false-- which suffers from the same problem as the call in
find_reloads_toplev, and likewise ought to be eliminated.


Taking a step back, let's look at the ways a (subreg (reg)) where
reg is a pseudo equivalent to a memory location can be handled:

- The "default" way would be to reload the inner reg in its proper
  mode from its equivalent memory location into a reload register,
  and use a subreg of that register as the operand.  This always
  works correcly, but sometime causes unnecessary reloads, if the
  insn could actually handle a memory operand directly.

- To avoid that unnecessary reload, we can instead attempt to replace
  the whole subreg with a modified (usually narrowed or widended)
  memory reference.  This can then be either used directly in the insn,
  or itself be reloaded.

In the second case (outer reload), there can be a number of issues:

- We may not be allowed at all to change the memory access if:

  * we have a paradoxical subreg that is implictly handled as a
    LOAD_EXTEND or ZERO_EXTEND due to LOAD_EXTEND_OP

  * we have a normal subreg that is implicitly acting on the full
    register due to WORD_REGISTER_OPERATIONS  (the check for this
    seems to be incomplete in current code!)

  * the equivalent memory address is mode-dependent

  * we have a paradoxical subreg, and we cannot prove the widened
    memory is properly aligned (so we may cause either a misaligned
    access, or access to unmapped memory)

- Even if we are in principle allowed to change the memory access,
  the modified address might not be valid (either because the
  original equivalent address is already invalid, or because it
  becomes invalid when adding an offset and/or changing its mode).
  We can still do the outer access in that case, but we will have
  to push address reloads (based on the modified address).

  Current code tries to be clever as to when to perform the
  substitution of the modified memory address: if it thinks no
  address reloads will be required in either case, it leaves the
  address as (subreg (reg)), allowing find_reloads to choose
  between (inner or outer) reloads or doing an (outer) access
  as memory operand directly.  In either case, the actual change
  to a mem happens in cleanup_subreg_operands at the end.

  On the other hand, if address reloads *are* required, it is
  find_reloads_toplev/find_reloads_subreg_address that will
  replace either the subreg or the reg with an explicit (outer
  or inner) memory access, and push the corresponding address
  reloads.  [ Note that find_reloads now no longer has the
  choice of switching between inner and outer access.  In the
  case of an outer access, there still is the choice between
  a direct memory access in the insn and a reload.  ]

- Even if we are allowed to change the memory access and generate
  correct address reloads, there sometimes is a question of whether
  outer access is actually preferable from a performance perspective
  to just doing the inner access.  In particular, if we have a
  paradoxical subreg, the outer access might cause more memory
  traffic than the inner access ...

  If the outer access can be done implicitly in the insn itself,
  that may still be preferable.  If we do a reload in either case,
  however, it may become questionable whether this is really better.

  Current code seems to make somewhat weird decisions on this
  particular point: if no address reloads are required, find_reloads
  will force a reload (i.e. not do a direct memory access) for nearly
  all paradoxical subregs:

   - where required due to LOAD_EXTEND_OP
   - always on big-endian targets
   - always on targets defining WORD_REGISTER_OPERATIONS
   - always unless the inner mode is aligned to BIGGEST_ALIGNMENT

  (Except for the first, I do not understand why any of these are
  actually necessary ...)

  push_reload will then nearly always reload the inner access.  So in
  effect, if no address reloads are required, for paradoxical subregs
  we will just about always do an inner access anyway.

  However, if address reloads *are* required, find_reloads_subreg_address
  goes to a significant amount of trouble to implement them correctly
  whenever possible by replacing the subreg with an outer MEM access.
  [ But if the target *defined* LOAD_EXTEND_OP, we will not do this for
  *any* paradoxical subreg, even those not actually affected.]

  This strikes me as odd: if we don't want to do outer accesses for
  paradoxical subreg when no address reloads are required (which is
  hopefully the vast majority of scenarios), why would we suddenly
  want them when address reloads are required?



Now, getting back to the original problem.  As discussed above, we don't
really want to rely on offsettable_memref_p.  Instead, we want to actually
generate the final memory reference, and simply check it for validity.

However, now the question becomes: as we have already generated the final
address, should we now just throw it away and have it get recomputed by
cleanup_subreg_operands?  The benefit would be to allow find_reloads the
choice between inner and outer access.  But is this actually needed?
For paradoxical subregs, we know that find_reloads would nearly always
choose to reload the inner access.  For normal subregs  --except where
prevented due to WORD_REGISTER_OPERATIONS-- find_reloads would always
choose to perform an outer access (directly or reloaded).  We can make
this distinction just as well already in find_reloads_subreg_address ...

This has the additional benefit that we can replace a bunch of (somewhat
dubious) hand-crafted code in find_reloads_subreg_address by a simple
call to simplify_subreg.

The following patch implements this idea; it passes a basic regression
test on arm-linux-gnueabi.  (Obviously this would need a lot more
testing on various platforms before getting into mainline ...)

Can you have a look whether this fixes the problem you're seeing?

Bernd, Vlad, I'd appreciate your comments on this approach as well.

Thanks,
Ulrich


ChangeLog:

	* reload.c (find_reloads_subreg_address): Remove FORCE_REPLACE
	parameter.  Always replace normal subreg with memory reference
	whenever possible.  Return NULL otherwise.
	(find_reloads_toplev): Always call find_reloads_subreg_address
	for subregs of registers equivalent to a memory location.
	Only recurse further if find_reloads_subreg_address fails.
	(find_reloads_address_1): Only call find_reloads_subreg_address
	for subregs of registers equivalent to a memory location.
	Properly handle failure of find_reloads_subreg_address.


Index: gcc/reload.c
===================================================================
*** gcc/reload.c	(revision 189809)
--- gcc/reload.c	(working copy)
*************** static int find_reloads_address_1 (enum 
*** 282,288 ****
  static void find_reloads_address_part (rtx, rtx *, enum reg_class,
  				       enum machine_mode, int,
  				       enum reload_type, int);
! static rtx find_reloads_subreg_address (rtx, int, int, enum reload_type,
  					int, rtx, int *);
  static void copy_replacements_1 (rtx *, rtx *, int);
  static int find_inc_amount (rtx, rtx);
--- 282,288 ----
  static void find_reloads_address_part (rtx, rtx *, enum reg_class,
  				       enum machine_mode, int,
  				       enum reload_type, int);
! static rtx find_reloads_subreg_address (rtx, int, enum reload_type,
  					int, rtx, int *);
  static void copy_replacements_1 (rtx *, rtx *, int);
  static int find_inc_amount (rtx, rtx);
*************** find_reloads_toplev (rtx x, int opnum, e
*** 4755,4785 ****
  	}
  
        /* If the subreg contains a reg that will be converted to a mem,
! 	 convert the subreg to a narrower memref now.
! 	 Otherwise, we would get (subreg (mem ...) ...),
! 	 which would force reload of the mem.
! 
! 	 We also need to do this if there is an equivalent MEM that is
! 	 not offsettable.  In that case, alter_subreg would produce an
! 	 invalid address on big-endian machines.
! 
! 	 For machines that extend byte loads, we must not reload using
! 	 a wider mode if we have a paradoxical SUBREG.  find_reloads will
! 	 force a reload in that case.  So we should not do anything here.  */
  
        if (regno >= FIRST_PSEUDO_REGISTER
! #ifdef LOAD_EXTEND_OP
! 	  && !paradoxical_subreg_p (x)
! #endif
! 	  && (reg_equiv_address (regno) != 0
! 	      || (reg_equiv_mem (regno) != 0
! 		  && (! strict_memory_address_addr_space_p
! 		      (GET_MODE (x), XEXP (reg_equiv_mem (regno), 0),
! 		       MEM_ADDR_SPACE (reg_equiv_mem (regno)))
! 		      || ! offsettable_memref_p (reg_equiv_mem (regno))
! 		      || num_not_at_initial_offset))))
! 	x = find_reloads_subreg_address (x, 1, opnum, type, ind_levels,
! 					   insn, address_reloaded);
      }
  
    for (copied = 0, i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
--- 4755,4773 ----
  	}
  
        /* If the subreg contains a reg that will be converted to a mem,
! 	 attempt to convert the whole subreg to a (narrower or wider)
! 	 memory reference instead.  If this succeeds, we're done --
! 	 otherwise fall through to check whether the inner reg still
! 	 needs address reloads anyway.  */
  
        if (regno >= FIRST_PSEUDO_REGISTER
! 	  && reg_equiv_memory_loc (regno) != 0)
! 	{
! 	  tem = find_reloads_subreg_address (x, opnum, type, ind_levels,
! 					     insn, address_reloaded);
! 	  if (tem)
! 	    return tem;
! 	}
      }
  
    for (copied = 0, i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
*************** find_reloads_address_1 (enum machine_mod
*** 6018,6029 ****
  	      if (ira_reg_class_max_nregs [rclass][GET_MODE (SUBREG_REG (x))]
  		  > reg_class_size[(int) rclass])
  		{
! 		  x = find_reloads_subreg_address (x, 0, opnum,
! 						   ADDR_TYPE (type),
! 						   ind_levels, insn, NULL);
! 		  push_reload (x, NULL_RTX, loc, (rtx*) 0, rclass,
! 			       GET_MODE (x), VOIDmode, 0, 0, opnum, type);
! 		  return 1;
  		}
  	    }
  	}
--- 6006,6036 ----
  	      if (ira_reg_class_max_nregs [rclass][GET_MODE (SUBREG_REG (x))]
  		  > reg_class_size[(int) rclass])
  		{
! 		  /* If the inner register will be replaced by a memory
! 		     reference, we can do this only if we can replace the
! 		     whole subreg by a (narrower) memory reference.  If
! 		     this is not possible, fall through and reload just
! 		     the inner register (including address reloads).  */
! 		  if (reg_equiv_memory_loc (REGNO (SUBREG_REG (x))) != 0)
! 		    {
! 		      rtx tem = find_reloads_subreg_address (x, opnum,
! 							     ADDR_TYPE (type),
! 							     ind_levels, insn,
! 							     NULL);
! 		      if (tem)
! 			{
! 			  push_reload (tem, NULL_RTX, loc, (rtx*) 0, rclass,
! 				       GET_MODE (tem), VOIDmode, 0, 0,
! 				       opnum, type);
! 			  return 1;
! 			}
! 		    }
! 		  else
! 		    {
! 		      push_reload (x, NULL_RTX, loc, (rtx*) 0, rclass,
! 				   GET_MODE (x), VOIDmode, 0, 0, opnum, type);
! 		      return 1;
! 		    }
  		}
  	    }
  	}
*************** find_reloads_address_part (rtx x, rtx *l
*** 6100,6116 ****
  }
  \f
  /* X, a subreg of a pseudo, is a part of an address that needs to be
!    reloaded.
! 
!    If the pseudo is equivalent to a memory location that cannot be directly
!    addressed, make the necessary address reloads.
  
!    If address reloads have been necessary, or if the address is changed
!    by register elimination, return the rtx of the memory location;
!    otherwise, return X.
! 
!    If FORCE_REPLACE is nonzero, unconditionally replace the subreg with the
!    memory location.
  
     OPNUM and TYPE identify the purpose of the reload.
  
--- 6107,6118 ----
  }
  \f
  /* X, a subreg of a pseudo, is a part of an address that needs to be
!    reloaded, and the pseusdo is equivalent to a memory location.
  
!    Attempt to replace the whole subreg by a (possibly narrower or wider)
!    memory reference.  If this is possible, return this new memory
!    reference, and push all required address reloads.  Otherwise,
!    return NULL.
  
     OPNUM and TYPE identify the purpose of the reload.
  
*************** find_reloads_address_part (rtx x, rtx *l
*** 6122,6252 ****
     stack slots.  */
  
  static rtx
! find_reloads_subreg_address (rtx x, int force_replace, int opnum,
! 			     enum reload_type type, int ind_levels, rtx insn,
! 			     int *address_reloaded)
  {
    int regno = REGNO (SUBREG_REG (x));
    int reloaded = 0;
  
!   if (reg_equiv_memory_loc (regno))
!     {
!       /* If the address is not directly addressable, or if the address is not
! 	 offsettable, then it must be replaced.  */
!       if (! force_replace
! 	  && (reg_equiv_address (regno)
! 	      || ! offsettable_memref_p (reg_equiv_mem (regno))))
! 	force_replace = 1;
! 
!       if (force_replace || num_not_at_initial_offset)
! 	{
! 	  rtx tem = make_memloc (SUBREG_REG (x), regno);
! 
! 	  /* If the address changes because of register elimination, then
! 	     it must be replaced.  */
! 	  if (force_replace
! 	      || ! rtx_equal_p (tem, reg_equiv_mem (regno)))
! 	    {
! 	      unsigned outer_size = GET_MODE_SIZE (GET_MODE (x));
! 	      unsigned inner_size = GET_MODE_SIZE (GET_MODE (SUBREG_REG (x)));
! 	      int offset;
! 	      rtx orig = tem;
! 
! 	      /* For big-endian paradoxical subregs, SUBREG_BYTE does not
! 		 hold the correct (negative) byte offset.  */
! 	      if (BYTES_BIG_ENDIAN && outer_size > inner_size)
! 		offset = inner_size - outer_size;
! 	      else
! 		offset = SUBREG_BYTE (x);
  
! 	      XEXP (tem, 0) = plus_constant (GET_MODE (XEXP (tem, 0)),
! 					     XEXP (tem, 0), offset);
! 	      PUT_MODE (tem, GET_MODE (x));
! 	      if (MEM_OFFSET_KNOWN_P (tem))
! 		set_mem_offset (tem, MEM_OFFSET (tem) + offset);
! 	      if (MEM_SIZE_KNOWN_P (tem)
! 		  && MEM_SIZE (tem) != (HOST_WIDE_INT) outer_size)
! 		set_mem_size (tem, outer_size);
! 
! 	      /* If this was a paradoxical subreg that we replaced, the
! 		 resulting memory must be sufficiently aligned to allow
! 		 us to widen the mode of the memory.  */
! 	      if (outer_size > inner_size)
! 		{
! 		  rtx base;
  
! 		  base = XEXP (tem, 0);
! 		  if (GET_CODE (base) == PLUS)
! 		    {
! 		      if (CONST_INT_P (XEXP (base, 1))
! 			  && INTVAL (XEXP (base, 1)) % outer_size != 0)
! 			return x;
! 		      base = XEXP (base, 0);
! 		    }
! 		  if (!REG_P (base)
! 		      || (REGNO_POINTER_ALIGN (REGNO (base))
! 			  < outer_size * BITS_PER_UNIT))
! 		    return x;
! 		}
  
- 	      reloaded = find_reloads_address (GET_MODE (tem), &tem,
- 					       XEXP (tem, 0), &XEXP (tem, 0),
- 					       opnum, type, ind_levels, insn);
- 	      /* ??? Do we need to handle nonzero offsets somehow?  */
- 	      if (!offset && !rtx_equal_p (tem, orig))
- 		push_reg_equiv_alt_mem (regno, tem);
- 
- 	      /* For some processors an address may be valid in the
- 		 original mode but not in a smaller mode.  For
- 		 example, ARM accepts a scaled index register in
- 		 SImode but not in HImode.  Note that this is only
- 		 a problem if the address in reg_equiv_mem is already
- 		 invalid in the new mode; other cases would be fixed
- 		 by find_reloads_address as usual.
- 
- 		 ??? We attempt to handle such cases here by doing an
- 		 additional reload of the full address after the
- 		 usual processing by find_reloads_address.  Note that
- 		 this may not work in the general case, but it seems
- 		 to cover the cases where this situation currently
- 		 occurs.  A more general fix might be to reload the
- 		 *value* instead of the address, but this would not
- 		 be expected by the callers of this routine as-is.
- 
- 		 If find_reloads_address already completed replaced
- 		 the address, there is nothing further to do.  */
- 	      if (reloaded == 0
- 		  && reg_equiv_mem (regno) != 0
- 		  && !strict_memory_address_addr_space_p
- 			(GET_MODE (x), XEXP (reg_equiv_mem (regno), 0),
- 			 MEM_ADDR_SPACE (reg_equiv_mem (regno))))
- 		{
- 		  push_reload (XEXP (tem, 0), NULL_RTX, &XEXP (tem, 0), (rtx*) 0,
- 			       base_reg_class (GET_MODE (tem),
- 					       MEM_ADDR_SPACE (tem),
- 					       MEM, SCRATCH),
- 			       GET_MODE (XEXP (tem, 0)), VOIDmode, 0, 0,
- 			       opnum, type);
- 		  reloaded = 1;
- 		}
- 	      /* If this is not a toplevel operand, find_reloads doesn't see
- 		 this substitution.  We have to emit a USE of the pseudo so
- 		 that delete_output_reload can see it.  */
- 	      if (replace_reloads && recog_data.operand[opnum] != x)
- 		/* We mark the USE with QImode so that we recognize it
- 		   as one that can be safely deleted at the end of
- 		   reload.  */
- 		PUT_MODE (emit_insn_before (gen_rtx_USE (VOIDmode,
- 							 SUBREG_REG (x)),
- 					    insn), QImode);
- 	      x = tem;
- 	    }
- 	}
-     }
    if (address_reloaded)
      *address_reloaded = reloaded;
  
!   return x;
  }
  \f
  /* Substitute into the current INSN the registers into which we have reloaded
--- 6124,6231 ----
     stack slots.  */
  
  static rtx
! find_reloads_subreg_address (rtx x, int opnum, enum reload_type type,
! 			     int ind_levels, rtx insn, int *address_reloaded)
  {
+   enum machine_mode outer_mode = GET_MODE (x);
+   enum machine_mode inner_mode = GET_MODE (SUBREG_REG (x));
+   unsigned outer_size = GET_MODE_SIZE (outer_mode);
+   unsigned inner_size = GET_MODE_SIZE (inner_mode);
    int regno = REGNO (SUBREG_REG (x));
    int reloaded = 0;
+   rtx tem, orig;
+   int offset;
  
!   gcc_assert (reg_equiv_memory_loc (regno) != 0);
  
!   /* We cannot replace the subreg with a modified memory reference if:
  
!      - we have a paradoxical subreg that implicitly acts as a zero or
!        sign extension operation due to LOAD_EXTEND_OP;
! 
!      - we have a subreg that is implicitly supposed to act on the full
!        register due to WORD_REGISTER_OPERATIONS (see also eliminate_regs);
! 
!      - the address of the equivalent memory location is mode-dependent;  or
! 
!      - we have a paradoxical subreg and the resulting memory is not
!        sufficiently aligned to allow access in the wider mode.
! 
!     In addition, we choose not to perform the replacement for *any*
!     paradoxical subreg, even if it were possible in principle.  This
!     is to avoid generating wider memory references than necessary.
! 
!     This corresponds to how previous versions of reload used to handle
!     paradoxical subregs where no address reload was required.  */
! 
!   if (paradoxical_subreg_p (x))
!     return NULL;
! 
! #ifdef WORD_REGISTER_OPERATIONS
!   if (outer_size < inner_size
!       && ((outer_size - 1) / UNITS_PER_WORD
!           == (inner_size - 1) / UNITS_PER_WORD))
!     return NULL;
! #endif
! 
!   /* Since we don't attempt to handle paradoxical subregs, we can just
!      call into simplify_subreg, which will handle all remaining checks
!      for us.  */
!   orig = make_memloc (SUBREG_REG (x), regno);
!   offset = SUBREG_BYTE (x);
!   tem = simplify_subreg (outer_mode, orig, inner_mode, offset);
!   if (!tem || !MEM_P (tem))
!     return NULL;
! 
!   /* Now push all required address reloads, if any.  */
!   reloaded = find_reloads_address (GET_MODE (tem), &tem,
! 				   XEXP (tem, 0), &XEXP (tem, 0),
! 				   opnum, type, ind_levels, insn);
!   /* ??? Do we need to handle nonzero offsets somehow?  */
!   if (!offset && !rtx_equal_p (tem, orig))
!     push_reg_equiv_alt_mem (regno, tem);
! 
!   /* For some processors an address may be valid in the original mode but
!      not in a smaller mode.  For example, ARM accepts a scaled index register
!      in SImode but not in HImode.  Note that this is only a problem if the
!      address in reg_equiv_mem is already invalid in the new mode; other
!      cases would be fixed by find_reloads_address as usual.
! 
!      ??? We attempt to handle such cases here by doing an additional reload
!      of the full address after the usual processing by find_reloads_address.
!      Note that this may not work in the general case, but it seems to cover
!      the cases where this situation currently occurs.  A more general fix
!      might be to reload the *value* instead of the address, but this would
!      not be expected by the callers of this routine as-is.
! 
!      If find_reloads_address already completed replaced the address, there
!      is nothing further to do.  */
!   if (reloaded == 0
!       && reg_equiv_mem (regno) != 0
!       && !strict_memory_address_addr_space_p
! 		(GET_MODE (x), XEXP (reg_equiv_mem (regno), 0),
! 		 MEM_ADDR_SPACE (reg_equiv_mem (regno))))
!     {
!       push_reload (XEXP (tem, 0), NULL_RTX, &XEXP (tem, 0), (rtx*) 0,
! 		   base_reg_class (GET_MODE (tem), MEM_ADDR_SPACE (tem),
! 				   MEM, SCRATCH),
! 		   GET_MODE (XEXP (tem, 0)), VOIDmode, 0, 0, opnum, type);
!       reloaded = 1;
!     }
! 
!   /* If this is not a toplevel operand, find_reloads doesn't see this
!      substitution.  We have to emit a USE of the pseudo so that
!      delete_output_reload can see it.  */
!   if (replace_reloads && recog_data.operand[opnum] != x)
!     /* We mark the USE with QImode so that we recognize it as one that
!        can be safely deleted at the end of reload.  */
!     PUT_MODE (emit_insn_before (gen_rtx_USE (VOIDmode, SUBREG_REG (x)), insn),
! 	      QImode);
  
    if (address_reloaded)
      *address_reloaded = reloaded;
  
!   return tem;
  }
  \f
  /* Substitute into the current INSN the registers into which we have reloaded

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  Ulrich.Weigand@de.ibm.com

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH, RFC] Re-work find_reloads_subreg_address (Re: [PATCH][RFC, Reload]. Reload bug?)
  2012-07-27 18:19       ` [PATCH, RFC] Re-work find_reloads_subreg_address (Re: [PATCH][RFC, Reload]. Reload bug?) Ulrich Weigand
@ 2012-08-02 11:11         ` Tejas Belagod
  2012-08-20 10:09           ` Tejas Belagod
  0 siblings, 1 reply; 9+ messages in thread
From: Tejas Belagod @ 2012-08-02 11:11 UTC (permalink / raw)
  To: Ulrich Weigand; +Cc: gcc-patches, bernds, vmakarov

Ulrich Weigand wrote:
> 
> The following patch implements this idea; it passes a basic regression
> test on arm-linux-gnueabi.  (Obviously this would need a lot more
> testing on various platforms before getting into mainline ...)
> 
> Can you have a look whether this fixes the problem you're seeing?
> 

Sorry for the delay in replying. Thanks for the patch. I tried this patch -  it 
doesn't seem to reach as far as cleanup_subreg_operands (), but fails an 
assertion in push_reload () in reload.c:1307. I'm currently investigating this 
and will let you know the reason soon.

Thanks,
Tejas Belagod.
ARM.


> Bernd, Vlad, I'd appreciate your comments on this approach as well.
> 
> Thanks,
> Ulrich
> 
> 
> ChangeLog:
> 
>         * reload.c (find_reloads_subreg_address): Remove FORCE_REPLACE
>         parameter.  Always replace normal subreg with memory reference
>         whenever possible.  Return NULL otherwise.
>         (find_reloads_toplev): Always call find_reloads_subreg_address
>         for subregs of registers equivalent to a memory location.
>         Only recurse further if find_reloads_subreg_address fails.
>         (find_reloads_address_1): Only call find_reloads_subreg_address
>         for subregs of registers equivalent to a memory location.
>         Properly handle failure of find_reloads_subreg_address.
> 
> 
> Index: gcc/reload.c
> ===================================================================
> *** gcc/reload.c        (revision 189809)
> --- gcc/reload.c        (working copy)
> *************** static int find_reloads_address_1 (enum
> *** 282,288 ****
>   static void find_reloads_address_part (rtx, rtx *, enum reg_class,
>                                        enum machine_mode, int,
>                                        enum reload_type, int);
> ! static rtx find_reloads_subreg_address (rtx, int, int, enum reload_type,
>                                         int, rtx, int *);
>   static void copy_replacements_1 (rtx *, rtx *, int);
>   static int find_inc_amount (rtx, rtx);
> --- 282,288 ----
>   static void find_reloads_address_part (rtx, rtx *, enum reg_class,
>                                        enum machine_mode, int,
>                                        enum reload_type, int);
> ! static rtx find_reloads_subreg_address (rtx, int, enum reload_type,
>                                         int, rtx, int *);
>   static void copy_replacements_1 (rtx *, rtx *, int);
>   static int find_inc_amount (rtx, rtx);
> *************** find_reloads_toplev (rtx x, int opnum, e
> *** 4755,4785 ****
>         }
> 
>         /* If the subreg contains a reg that will be converted to a mem,
> !        convert the subreg to a narrower memref now.
> !        Otherwise, we would get (subreg (mem ...) ...),
> !        which would force reload of the mem.
> !
> !        We also need to do this if there is an equivalent MEM that is
> !        not offsettable.  In that case, alter_subreg would produce an
> !        invalid address on big-endian machines.
> !
> !        For machines that extend byte loads, we must not reload using
> !        a wider mode if we have a paradoxical SUBREG.  find_reloads will
> !        force a reload in that case.  So we should not do anything here.  */
> 
>         if (regno >= FIRST_PSEUDO_REGISTER
> ! #ifdef LOAD_EXTEND_OP
> !         && !paradoxical_subreg_p (x)
> ! #endif
> !         && (reg_equiv_address (regno) != 0
> !             || (reg_equiv_mem (regno) != 0
> !                 && (! strict_memory_address_addr_space_p
> !                     (GET_MODE (x), XEXP (reg_equiv_mem (regno), 0),
> !                      MEM_ADDR_SPACE (reg_equiv_mem (regno)))
> !                     || ! offsettable_memref_p (reg_equiv_mem (regno))
> !                     || num_not_at_initial_offset))))
> !       x = find_reloads_subreg_address (x, 1, opnum, type, ind_levels,
> !                                          insn, address_reloaded);
>       }
> 
>     for (copied = 0, i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
> --- 4755,4773 ----
>         }
> 
>         /* If the subreg contains a reg that will be converted to a mem,
> !        attempt to convert the whole subreg to a (narrower or wider)
> !        memory reference instead.  If this succeeds, we're done --
> !        otherwise fall through to check whether the inner reg still
> !        needs address reloads anyway.  */
> 
>         if (regno >= FIRST_PSEUDO_REGISTER
> !         && reg_equiv_memory_loc (regno) != 0)
> !       {
> !         tem = find_reloads_subreg_address (x, opnum, type, ind_levels,
> !                                            insn, address_reloaded);
> !         if (tem)
> !           return tem;
> !       }
>       }
> 
>     for (copied = 0, i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
> *************** find_reloads_address_1 (enum machine_mod
> *** 6018,6029 ****
>               if (ira_reg_class_max_nregs [rclass][GET_MODE (SUBREG_REG (x))]
>                   > reg_class_size[(int) rclass])
>                 {
> !                 x = find_reloads_subreg_address (x, 0, opnum,
> !                                                  ADDR_TYPE (type),
> !                                                  ind_levels, insn, NULL);
> !                 push_reload (x, NULL_RTX, loc, (rtx*) 0, rclass,
> !                              GET_MODE (x), VOIDmode, 0, 0, opnum, type);
> !                 return 1;
>                 }
>             }
>         }
> --- 6006,6036 ----
>               if (ira_reg_class_max_nregs [rclass][GET_MODE (SUBREG_REG (x))]
>                   > reg_class_size[(int) rclass])
>                 {
> !                 /* If the inner register will be replaced by a memory
> !                    reference, we can do this only if we can replace the
> !                    whole subreg by a (narrower) memory reference.  If
> !                    this is not possible, fall through and reload just
> !                    the inner register (including address reloads).  */
> !                 if (reg_equiv_memory_loc (REGNO (SUBREG_REG (x))) != 0)
> !                   {
> !                     rtx tem = find_reloads_subreg_address (x, opnum,
> !                                                            ADDR_TYPE (type),
> !                                                            ind_levels, insn,
> !                                                            NULL);
> !                     if (tem)
> !                       {
> !                         push_reload (tem, NULL_RTX, loc, (rtx*) 0, rclass,
> !                                      GET_MODE (tem), VOIDmode, 0, 0,
> !                                      opnum, type);
> !                         return 1;
> !                       }
> !                   }
> !                 else
> !                   {
> !                     push_reload (x, NULL_RTX, loc, (rtx*) 0, rclass,
> !                                  GET_MODE (x), VOIDmode, 0, 0, opnum, type);
> !                     return 1;
> !                   }
>                 }
>             }
>         }
> *************** find_reloads_address_part (rtx x, rtx *l
> *** 6100,6116 ****
>   }
> 
>   /* X, a subreg of a pseudo, is a part of an address that needs to be
> !    reloaded.
> !
> !    If the pseudo is equivalent to a memory location that cannot be directly
> !    addressed, make the necessary address reloads.
> 
> !    If address reloads have been necessary, or if the address is changed
> !    by register elimination, return the rtx of the memory location;
> !    otherwise, return X.
> !
> !    If FORCE_REPLACE is nonzero, unconditionally replace the subreg with the
> !    memory location.
> 
>      OPNUM and TYPE identify the purpose of the reload.
> 
> --- 6107,6118 ----
>   }
> 
>   /* X, a subreg of a pseudo, is a part of an address that needs to be
> !    reloaded, and the pseusdo is equivalent to a memory location.
> 
> !    Attempt to replace the whole subreg by a (possibly narrower or wider)
> !    memory reference.  If this is possible, return this new memory
> !    reference, and push all required address reloads.  Otherwise,
> !    return NULL.
> 
>      OPNUM and TYPE identify the purpose of the reload.
> 
> *************** find_reloads_address_part (rtx x, rtx *l
> *** 6122,6252 ****
>      stack slots.  */
> 
>   static rtx
> ! find_reloads_subreg_address (rtx x, int force_replace, int opnum,
> !                            enum reload_type type, int ind_levels, rtx insn,
> !                            int *address_reloaded)
>   {
>     int regno = REGNO (SUBREG_REG (x));
>     int reloaded = 0;
> 
> !   if (reg_equiv_memory_loc (regno))
> !     {
> !       /* If the address is not directly addressable, or if the address is not
> !        offsettable, then it must be replaced.  */
> !       if (! force_replace
> !         && (reg_equiv_address (regno)
> !             || ! offsettable_memref_p (reg_equiv_mem (regno))))
> !       force_replace = 1;
> !
> !       if (force_replace || num_not_at_initial_offset)
> !       {
> !         rtx tem = make_memloc (SUBREG_REG (x), regno);
> !
> !         /* If the address changes because of register elimination, then
> !            it must be replaced.  */
> !         if (force_replace
> !             || ! rtx_equal_p (tem, reg_equiv_mem (regno)))
> !           {
> !             unsigned outer_size = GET_MODE_SIZE (GET_MODE (x));
> !             unsigned inner_size = GET_MODE_SIZE (GET_MODE (SUBREG_REG (x)));
> !             int offset;
> !             rtx orig = tem;
> !
> !             /* For big-endian paradoxical subregs, SUBREG_BYTE does not
> !                hold the correct (negative) byte offset.  */
> !             if (BYTES_BIG_ENDIAN && outer_size > inner_size)
> !               offset = inner_size - outer_size;
> !             else
> !               offset = SUBREG_BYTE (x);
> 
> !             XEXP (tem, 0) = plus_constant (GET_MODE (XEXP (tem, 0)),
> !                                            XEXP (tem, 0), offset);
> !             PUT_MODE (tem, GET_MODE (x));
> !             if (MEM_OFFSET_KNOWN_P (tem))
> !               set_mem_offset (tem, MEM_OFFSET (tem) + offset);
> !             if (MEM_SIZE_KNOWN_P (tem)
> !                 && MEM_SIZE (tem) != (HOST_WIDE_INT) outer_size)
> !               set_mem_size (tem, outer_size);
> !
> !             /* If this was a paradoxical subreg that we replaced, the
> !                resulting memory must be sufficiently aligned to allow
> !                us to widen the mode of the memory.  */
> !             if (outer_size > inner_size)
> !               {
> !                 rtx base;
> 
> !                 base = XEXP (tem, 0);
> !                 if (GET_CODE (base) == PLUS)
> !                   {
> !                     if (CONST_INT_P (XEXP (base, 1))
> !                         && INTVAL (XEXP (base, 1)) % outer_size != 0)
> !                       return x;
> !                     base = XEXP (base, 0);
> !                   }
> !                 if (!REG_P (base)
> !                     || (REGNO_POINTER_ALIGN (REGNO (base))
> !                         < outer_size * BITS_PER_UNIT))
> !                   return x;
> !               }
> 
> -             reloaded = find_reloads_address (GET_MODE (tem), &tem,
> -                                              XEXP (tem, 0), &XEXP (tem, 0),
> -                                              opnum, type, ind_levels, insn);
> -             /* ??? Do we need to handle nonzero offsets somehow?  */
> -             if (!offset && !rtx_equal_p (tem, orig))
> -               push_reg_equiv_alt_mem (regno, tem);
> -
> -             /* For some processors an address may be valid in the
> -                original mode but not in a smaller mode.  For
> -                example, ARM accepts a scaled index register in
> -                SImode but not in HImode.  Note that this is only
> -                a problem if the address in reg_equiv_mem is already
> -                invalid in the new mode; other cases would be fixed
> -                by find_reloads_address as usual.
> -
> -                ??? We attempt to handle such cases here by doing an
> -                additional reload of the full address after the
> -                usual processing by find_reloads_address.  Note that
> -                this may not work in the general case, but it seems
> -                to cover the cases where this situation currently
> -                occurs.  A more general fix might be to reload the
> -                *value* instead of the address, but this would not
> -                be expected by the callers of this routine as-is.
> -
> -                If find_reloads_address already completed replaced
> -                the address, there is nothing further to do.  */
> -             if (reloaded == 0
> -                 && reg_equiv_mem (regno) != 0
> -                 && !strict_memory_address_addr_space_p
> -                       (GET_MODE (x), XEXP (reg_equiv_mem (regno), 0),
> -                        MEM_ADDR_SPACE (reg_equiv_mem (regno))))
> -               {
> -                 push_reload (XEXP (tem, 0), NULL_RTX, &XEXP (tem, 0), (rtx*) 0,
> -                              base_reg_class (GET_MODE (tem),
> -                                              MEM_ADDR_SPACE (tem),
> -                                              MEM, SCRATCH),
> -                              GET_MODE (XEXP (tem, 0)), VOIDmode, 0, 0,
> -                              opnum, type);
> -                 reloaded = 1;
> -               }
> -             /* If this is not a toplevel operand, find_reloads doesn't see
> -                this substitution.  We have to emit a USE of the pseudo so
> -                that delete_output_reload can see it.  */
> -             if (replace_reloads && recog_data.operand[opnum] != x)
> -               /* We mark the USE with QImode so that we recognize it
> -                  as one that can be safely deleted at the end of
> -                  reload.  */
> -               PUT_MODE (emit_insn_before (gen_rtx_USE (VOIDmode,
> -                                                        SUBREG_REG (x)),
> -                                           insn), QImode);
> -             x = tem;
> -           }
> -       }
> -     }
>     if (address_reloaded)
>       *address_reloaded = reloaded;
> 
> !   return x;
>   }
> 
>   /* Substitute into the current INSN the registers into which we have reloaded
> --- 6124,6231 ----
>      stack slots.  */
> 
>   static rtx
> ! find_reloads_subreg_address (rtx x, int opnum, enum reload_type type,
> !                            int ind_levels, rtx insn, int *address_reloaded)
>   {
> +   enum machine_mode outer_mode = GET_MODE (x);
> +   enum machine_mode inner_mode = GET_MODE (SUBREG_REG (x));
> +   unsigned outer_size = GET_MODE_SIZE (outer_mode);
> +   unsigned inner_size = GET_MODE_SIZE (inner_mode);
>     int regno = REGNO (SUBREG_REG (x));
>     int reloaded = 0;
> +   rtx tem, orig;
> +   int offset;
> 
> !   gcc_assert (reg_equiv_memory_loc (regno) != 0);
> 
> !   /* We cannot replace the subreg with a modified memory reference if:
> 
> !      - we have a paradoxical subreg that implicitly acts as a zero or
> !        sign extension operation due to LOAD_EXTEND_OP;
> !
> !      - we have a subreg that is implicitly supposed to act on the full
> !        register due to WORD_REGISTER_OPERATIONS (see also eliminate_regs);
> !
> !      - the address of the equivalent memory location is mode-dependent;  or
> !
> !      - we have a paradoxical subreg and the resulting memory is not
> !        sufficiently aligned to allow access in the wider mode.
> !
> !     In addition, we choose not to perform the replacement for *any*
> !     paradoxical subreg, even if it were possible in principle.  This
> !     is to avoid generating wider memory references than necessary.
> !
> !     This corresponds to how previous versions of reload used to handle
> !     paradoxical subregs where no address reload was required.  */
> !
> !   if (paradoxical_subreg_p (x))
> !     return NULL;
> !
> ! #ifdef WORD_REGISTER_OPERATIONS
> !   if (outer_size < inner_size
> !       && ((outer_size - 1) / UNITS_PER_WORD
> !           == (inner_size - 1) / UNITS_PER_WORD))
> !     return NULL;
> ! #endif
> !
> !   /* Since we don't attempt to handle paradoxical subregs, we can just
> !      call into simplify_subreg, which will handle all remaining checks
> !      for us.  */
> !   orig = make_memloc (SUBREG_REG (x), regno);
> !   offset = SUBREG_BYTE (x);
> !   tem = simplify_subreg (outer_mode, orig, inner_mode, offset);
> !   if (!tem || !MEM_P (tem))
> !     return NULL;
> !
> !   /* Now push all required address reloads, if any.  */
> !   reloaded = find_reloads_address (GET_MODE (tem), &tem,
> !                                  XEXP (tem, 0), &XEXP (tem, 0),
> !                                  opnum, type, ind_levels, insn);
> !   /* ??? Do we need to handle nonzero offsets somehow?  */
> !   if (!offset && !rtx_equal_p (tem, orig))
> !     push_reg_equiv_alt_mem (regno, tem);
> !
> !   /* For some processors an address may be valid in the original mode but
> !      not in a smaller mode.  For example, ARM accepts a scaled index register
> !      in SImode but not in HImode.  Note that this is only a problem if the
> !      address in reg_equiv_mem is already invalid in the new mode; other
> !      cases would be fixed by find_reloads_address as usual.
> !
> !      ??? We attempt to handle such cases here by doing an additional reload
> !      of the full address after the usual processing by find_reloads_address.
> !      Note that this may not work in the general case, but it seems to cover
> !      the cases where this situation currently occurs.  A more general fix
> !      might be to reload the *value* instead of the address, but this would
> !      not be expected by the callers of this routine as-is.
> !
> !      If find_reloads_address already completed replaced the address, there
> !      is nothing further to do.  */
> !   if (reloaded == 0
> !       && reg_equiv_mem (regno) != 0
> !       && !strict_memory_address_addr_space_p
> !               (GET_MODE (x), XEXP (reg_equiv_mem (regno), 0),
> !                MEM_ADDR_SPACE (reg_equiv_mem (regno))))
> !     {
> !       push_reload (XEXP (tem, 0), NULL_RTX, &XEXP (tem, 0), (rtx*) 0,
> !                  base_reg_class (GET_MODE (tem), MEM_ADDR_SPACE (tem),
> !                                  MEM, SCRATCH),
> !                  GET_MODE (XEXP (tem, 0)), VOIDmode, 0, 0, opnum, type);
> !       reloaded = 1;
> !     }
> !
> !   /* If this is not a toplevel operand, find_reloads doesn't see this
> !      substitution.  We have to emit a USE of the pseudo so that
> !      delete_output_reload can see it.  */
> !   if (replace_reloads && recog_data.operand[opnum] != x)
> !     /* We mark the USE with QImode so that we recognize it as one that
> !        can be safely deleted at the end of reload.  */
> !     PUT_MODE (emit_insn_before (gen_rtx_USE (VOIDmode, SUBREG_REG (x)), insn),
> !             QImode);
> 
>     if (address_reloaded)
>       *address_reloaded = reloaded;
> 
> !   return tem;
>   }
> 
>   /* Substitute into the current INSN the registers into which we have reloaded
> 
> --
>   Dr. Ulrich Weigand
>   GNU Toolchain for Linux on System z and Cell BE
>   Ulrich.Weigand@de.ibm.com
> 
> 


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH, RFC] Re-work find_reloads_subreg_address (Re: [PATCH][RFC, Reload]. Reload bug?)
  2012-08-02 11:11         ` Tejas Belagod
@ 2012-08-20 10:09           ` Tejas Belagod
  2012-10-16 14:02             ` [commit] Re-work find_reloads_subreg_address Ulrich Weigand
  0 siblings, 1 reply; 9+ messages in thread
From: Tejas Belagod @ 2012-08-20 10:09 UTC (permalink / raw)
  To: Ulrich Weigand; +Cc: gcc-patches, bernds, vmakarov

Tejas Belagod wrote:
> Ulrich Weigand wrote:
>> The following patch implements this idea; it passes a basic regression
>> test on arm-linux-gnueabi.  (Obviously this would need a lot more
>> testing on various platforms before getting into mainline ...)
>>
>> Can you have a look whether this fixes the problem you're seeing?
>>
> 
> Sorry for the delay in replying. Thanks for the patch. I tried this patch -  it
> doesn't seem to reach as far as cleanup_subreg_operands (), but fails an
> assertion in push_reload () in reload.c:1307. I'm currently investigating this
> and will let you know the reason soon.
> 

Hi,

Sorry for the delay in replying. These new issues that I was seeing were bugs 
specific to my code that I've fixed. Your patch seems to work fine. Thanks a lot 
for the patch.

Thanks,
Tejas Belagod.
ARM.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [commit] Re-work find_reloads_subreg_address
  2012-08-20 10:09           ` Tejas Belagod
@ 2012-10-16 14:02             ` Ulrich Weigand
  2012-10-16 14:33               ` Tejas Belagod
  0 siblings, 1 reply; 9+ messages in thread
From: Ulrich Weigand @ 2012-10-16 14:02 UTC (permalink / raw)
  To: Tejas Belagod; +Cc: gcc-patches, bernds, vmakarov, ramrad01, ubizjak

Tejas Belagod wrote:
> > Ulrich Weigand wrote:
> >> The following patch implements this idea; it passes a basic regression
> >> test on arm-linux-gnueabi.  (Obviously this would need a lot more
> >> testing on various platforms before getting into mainline ...)
> >>
> >> Can you have a look whether this fixes the problem you're seeing?
[snip]

> Sorry for the delay in replying. These new issues that I was seeing were
> bugs specific to my code that I've fixed. Your patch seems to work fine.
> Thanks a lot for the patch.

I've now checked the patch in to mainline.   In addition to your test on
aarch64, I've completed tests without regression on ppc(64)-linux,
s390(x)-linux, and i386-linux (with Uros' patch).

Note that Uros' patch here:
http://gcc.gnu.org/ml/gcc-patches/2012-10/msg01463.html
is a pre-requisite for the reload patch to avoid regressions on i386.

Current version (nearly unchanged) of the patch appended below.

Bye,
Ulrich

 
ChangeLog:

	* reload.c (find_reloads_subreg_address): Remove FORCE_REPLACE
	parameter.  Always replace normal subreg with memory reference
	whenever possible.  Return NULL otherwise.
	(find_reloads_toplev): Always call find_reloads_subreg_address
	for subregs of registers equivalent to a memory location.
	Only recurse further if find_reloads_subreg_address fails.
	(find_reloads_address_1): Only call find_reloads_subreg_address
	for subregs of registers equivalent to a memory location.
	Properly handle failure of find_reloads_subreg_address.

Index: gcc/reload.c
===================================================================
*** gcc/reload.c	(revision 191665)
--- gcc/reload.c	(working copy)
*************** static int find_reloads_address_1 (enum 
*** 282,288 ****
  static void find_reloads_address_part (rtx, rtx *, enum reg_class,
  				       enum machine_mode, int,
  				       enum reload_type, int);
! static rtx find_reloads_subreg_address (rtx, int, int, enum reload_type,
  					int, rtx, int *);
  static void copy_replacements_1 (rtx *, rtx *, int);
  static int find_inc_amount (rtx, rtx);
--- 282,288 ----
  static void find_reloads_address_part (rtx, rtx *, enum reg_class,
  				       enum machine_mode, int,
  				       enum reload_type, int);
! static rtx find_reloads_subreg_address (rtx, int, enum reload_type,
  					int, rtx, int *);
  static void copy_replacements_1 (rtx *, rtx *, int);
  static int find_inc_amount (rtx, rtx);
*************** find_reloads_toplev (rtx x, int opnum, e
*** 4810,4840 ****
  	}
  
        /* If the subreg contains a reg that will be converted to a mem,
! 	 convert the subreg to a narrower memref now.
! 	 Otherwise, we would get (subreg (mem ...) ...),
! 	 which would force reload of the mem.
! 
! 	 We also need to do this if there is an equivalent MEM that is
! 	 not offsettable.  In that case, alter_subreg would produce an
! 	 invalid address on big-endian machines.
! 
! 	 For machines that extend byte loads, we must not reload using
! 	 a wider mode if we have a paradoxical SUBREG.  find_reloads will
! 	 force a reload in that case.  So we should not do anything here.  */
  
        if (regno >= FIRST_PSEUDO_REGISTER
! #ifdef LOAD_EXTEND_OP
! 	  && !paradoxical_subreg_p (x)
! #endif
! 	  && (reg_equiv_address (regno) != 0
! 	      || (reg_equiv_mem (regno) != 0
! 		  && (! strict_memory_address_addr_space_p
! 		      (GET_MODE (x), XEXP (reg_equiv_mem (regno), 0),
! 		       MEM_ADDR_SPACE (reg_equiv_mem (regno)))
! 		      || ! offsettable_memref_p (reg_equiv_mem (regno))
! 		      || num_not_at_initial_offset))))
! 	x = find_reloads_subreg_address (x, 1, opnum, type, ind_levels,
! 					   insn, address_reloaded);
      }
  
    for (copied = 0, i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
--- 4810,4828 ----
  	}
  
        /* If the subreg contains a reg that will be converted to a mem,
! 	 attempt to convert the whole subreg to a (narrower or wider)
! 	 memory reference instead.  If this succeeds, we're done --
! 	 otherwise fall through to check whether the inner reg still
! 	 needs address reloads anyway.  */
  
        if (regno >= FIRST_PSEUDO_REGISTER
! 	  && reg_equiv_memory_loc (regno) != 0)
! 	{
! 	  tem = find_reloads_subreg_address (x, opnum, type, ind_levels,
! 					     insn, address_reloaded);
! 	  if (tem)
! 	    return tem;
! 	}
      }
  
    for (copied = 0, i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
*************** find_reloads_address_1 (enum machine_mod
*** 6070,6081 ****
  	      if (ira_reg_class_max_nregs [rclass][GET_MODE (SUBREG_REG (x))]
  		  > reg_class_size[(int) rclass])
  		{
! 		  x = find_reloads_subreg_address (x, 0, opnum,
! 						   ADDR_TYPE (type),
! 						   ind_levels, insn, NULL);
! 		  push_reload (x, NULL_RTX, loc, (rtx*) 0, rclass,
! 			       GET_MODE (x), VOIDmode, 0, 0, opnum, type);
! 		  return 1;
  		}
  	    }
  	}
--- 6058,6088 ----
  	      if (ira_reg_class_max_nregs [rclass][GET_MODE (SUBREG_REG (x))]
  		  > reg_class_size[(int) rclass])
  		{
! 		  /* If the inner register will be replaced by a memory
! 		     reference, we can do this only if we can replace the
! 		     whole subreg by a (narrower) memory reference.  If
! 		     this is not possible, fall through and reload just
! 		     the inner register (including address reloads).  */
! 		  if (reg_equiv_memory_loc (REGNO (SUBREG_REG (x))) != 0)
! 		    {
! 		      rtx tem = find_reloads_subreg_address (x, opnum,
! 							     ADDR_TYPE (type),
! 							     ind_levels, insn,
! 							     NULL);
! 		      if (tem)
! 			{
! 			  push_reload (tem, NULL_RTX, loc, (rtx*) 0, rclass,
! 				       GET_MODE (tem), VOIDmode, 0, 0,
! 				       opnum, type);
! 			  return 1;
! 			}
! 		    }
! 		  else
! 		    {
! 		      push_reload (x, NULL_RTX, loc, (rtx*) 0, rclass,
! 				   GET_MODE (x), VOIDmode, 0, 0, opnum, type);
! 		      return 1;
! 		    }
  		}
  	    }
  	}
*************** find_reloads_address_part (rtx x, rtx *l
*** 6152,6168 ****
  }
  \f
  /* X, a subreg of a pseudo, is a part of an address that needs to be
!    reloaded.
! 
!    If the pseudo is equivalent to a memory location that cannot be directly
!    addressed, make the necessary address reloads.
  
!    If address reloads have been necessary, or if the address is changed
!    by register elimination, return the rtx of the memory location;
!    otherwise, return X.
! 
!    If FORCE_REPLACE is nonzero, unconditionally replace the subreg with the
!    memory location.
  
     OPNUM and TYPE identify the purpose of the reload.
  
--- 6159,6170 ----
  }
  \f
  /* X, a subreg of a pseudo, is a part of an address that needs to be
!    reloaded, and the pseusdo is equivalent to a memory location.
  
!    Attempt to replace the whole subreg by a (possibly narrower or wider)
!    memory reference.  If this is possible, return this new memory
!    reference, and push all required address reloads.  Otherwise,
!    return NULL.
  
     OPNUM and TYPE identify the purpose of the reload.
  
*************** find_reloads_address_part (rtx x, rtx *l
*** 6174,6304 ****
     stack slots.  */
  
  static rtx
! find_reloads_subreg_address (rtx x, int force_replace, int opnum,
! 			     enum reload_type type, int ind_levels, rtx insn,
! 			     int *address_reloaded)
  {
    int regno = REGNO (SUBREG_REG (x));
    int reloaded = 0;
  
!   if (reg_equiv_memory_loc (regno))
!     {
!       /* If the address is not directly addressable, or if the address is not
! 	 offsettable, then it must be replaced.  */
!       if (! force_replace
! 	  && (reg_equiv_address (regno)
! 	      || ! offsettable_memref_p (reg_equiv_mem (regno))))
! 	force_replace = 1;
! 
!       if (force_replace || num_not_at_initial_offset)
! 	{
! 	  rtx tem = make_memloc (SUBREG_REG (x), regno);
! 
! 	  /* If the address changes because of register elimination, then
! 	     it must be replaced.  */
! 	  if (force_replace
! 	      || ! rtx_equal_p (tem, reg_equiv_mem (regno)))
! 	    {
! 	      unsigned outer_size = GET_MODE_SIZE (GET_MODE (x));
! 	      unsigned inner_size = GET_MODE_SIZE (GET_MODE (SUBREG_REG (x)));
! 	      int offset;
! 	      rtx orig = tem;
! 
! 	      /* For big-endian paradoxical subregs, SUBREG_BYTE does not
! 		 hold the correct (negative) byte offset.  */
! 	      if (BYTES_BIG_ENDIAN && outer_size > inner_size)
! 		offset = inner_size - outer_size;
! 	      else
! 		offset = SUBREG_BYTE (x);
  
! 	      XEXP (tem, 0) = plus_constant (GET_MODE (XEXP (tem, 0)),
! 					     XEXP (tem, 0), offset);
! 	      PUT_MODE (tem, GET_MODE (x));
! 	      if (MEM_OFFSET_KNOWN_P (tem))
! 		set_mem_offset (tem, MEM_OFFSET (tem) + offset);
! 	      if (MEM_SIZE_KNOWN_P (tem)
! 		  && MEM_SIZE (tem) != (HOST_WIDE_INT) outer_size)
! 		set_mem_size (tem, outer_size);
! 
! 	      /* If this was a paradoxical subreg that we replaced, the
! 		 resulting memory must be sufficiently aligned to allow
! 		 us to widen the mode of the memory.  */
! 	      if (outer_size > inner_size)
! 		{
! 		  rtx base;
  
! 		  base = XEXP (tem, 0);
! 		  if (GET_CODE (base) == PLUS)
! 		    {
! 		      if (CONST_INT_P (XEXP (base, 1))
! 			  && INTVAL (XEXP (base, 1)) % outer_size != 0)
! 			return x;
! 		      base = XEXP (base, 0);
! 		    }
! 		  if (!REG_P (base)
! 		      || (REGNO_POINTER_ALIGN (REGNO (base))
! 			  < outer_size * BITS_PER_UNIT))
! 		    return x;
! 		}
  
- 	      reloaded = find_reloads_address (GET_MODE (tem), &tem,
- 					       XEXP (tem, 0), &XEXP (tem, 0),
- 					       opnum, type, ind_levels, insn);
- 	      /* ??? Do we need to handle nonzero offsets somehow?  */
- 	      if (!offset && !rtx_equal_p (tem, orig))
- 		push_reg_equiv_alt_mem (regno, tem);
- 
- 	      /* For some processors an address may be valid in the
- 		 original mode but not in a smaller mode.  For
- 		 example, ARM accepts a scaled index register in
- 		 SImode but not in HImode.  Note that this is only
- 		 a problem if the address in reg_equiv_mem is already
- 		 invalid in the new mode; other cases would be fixed
- 		 by find_reloads_address as usual.
- 
- 		 ??? We attempt to handle such cases here by doing an
- 		 additional reload of the full address after the
- 		 usual processing by find_reloads_address.  Note that
- 		 this may not work in the general case, but it seems
- 		 to cover the cases where this situation currently
- 		 occurs.  A more general fix might be to reload the
- 		 *value* instead of the address, but this would not
- 		 be expected by the callers of this routine as-is.
- 
- 		 If find_reloads_address already completed replaced
- 		 the address, there is nothing further to do.  */
- 	      if (reloaded == 0
- 		  && reg_equiv_mem (regno) != 0
- 		  && !strict_memory_address_addr_space_p
- 			(GET_MODE (x), XEXP (reg_equiv_mem (regno), 0),
- 			 MEM_ADDR_SPACE (reg_equiv_mem (regno))))
- 		{
- 		  push_reload (XEXP (tem, 0), NULL_RTX, &XEXP (tem, 0), (rtx*) 0,
- 			       base_reg_class (GET_MODE (tem),
- 					       MEM_ADDR_SPACE (tem),
- 					       MEM, SCRATCH),
- 			       GET_MODE (XEXP (tem, 0)), VOIDmode, 0, 0,
- 			       opnum, type);
- 		  reloaded = 1;
- 		}
- 	      /* If this is not a toplevel operand, find_reloads doesn't see
- 		 this substitution.  We have to emit a USE of the pseudo so
- 		 that delete_output_reload can see it.  */
- 	      if (replace_reloads && recog_data.operand[opnum] != x)
- 		/* We mark the USE with QImode so that we recognize it
- 		   as one that can be safely deleted at the end of
- 		   reload.  */
- 		PUT_MODE (emit_insn_before (gen_rtx_USE (VOIDmode,
- 							 SUBREG_REG (x)),
- 					    insn), QImode);
- 	      x = tem;
- 	    }
- 	}
-     }
    if (address_reloaded)
      *address_reloaded = reloaded;
  
!   return x;
  }
  \f
  /* Substitute into the current INSN the registers into which we have reloaded
--- 6176,6281 ----
     stack slots.  */
  
  static rtx
! find_reloads_subreg_address (rtx x, int opnum, enum reload_type type,
! 			     int ind_levels, rtx insn, int *address_reloaded)
  {
+   enum machine_mode outer_mode = GET_MODE (x);
+   enum machine_mode inner_mode = GET_MODE (SUBREG_REG (x));
    int regno = REGNO (SUBREG_REG (x));
    int reloaded = 0;
+   rtx tem, orig;
+   int offset;
  
!   gcc_assert (reg_equiv_memory_loc (regno) != 0);
  
!   /* We cannot replace the subreg with a modified memory reference if:
  
!      - we have a paradoxical subreg that implicitly acts as a zero or
!        sign extension operation due to LOAD_EXTEND_OP;
! 
!      - we have a subreg that is implicitly supposed to act on the full
!        register due to WORD_REGISTER_OPERATIONS (see also eliminate_regs);
! 
!      - the address of the equivalent memory location is mode-dependent;  or
! 
!      - we have a paradoxical subreg and the resulting memory is not
!        sufficiently aligned to allow access in the wider mode.
! 
!     In addition, we choose not to perform the replacement for *any*
!     paradoxical subreg, even if it were possible in principle.  This
!     is to avoid generating wider memory references than necessary.
! 
!     This corresponds to how previous versions of reload used to handle
!     paradoxical subregs where no address reload was required.  */
! 
!   if (paradoxical_subreg_p (x))
!     return NULL;
! 
! #ifdef WORD_REGISTER_OPERATIONS
!   if (GET_MODE_SIZE (outer_mode) < GET_MODE_SIZE (inner_mode)
!       && ((GET_MODE_SIZE (outer_mode) - 1) / UNITS_PER_WORD
!           == (GET_MODE_SIZE (inner_mode) - 1) / UNITS_PER_WORD))
!     return NULL;
! #endif
! 
!   /* Since we don't attempt to handle paradoxical subregs, we can just
!      call into simplify_subreg, which will handle all remaining checks
!      for us.  */
!   orig = make_memloc (SUBREG_REG (x), regno);
!   offset = SUBREG_BYTE (x);
!   tem = simplify_subreg (outer_mode, orig, inner_mode, offset);
!   if (!tem || !MEM_P (tem))
!     return NULL;
! 
!   /* Now push all required address reloads, if any.  */
!   reloaded = find_reloads_address (GET_MODE (tem), &tem,
! 				   XEXP (tem, 0), &XEXP (tem, 0),
! 				   opnum, type, ind_levels, insn);
!   /* ??? Do we need to handle nonzero offsets somehow?  */
!   if (!offset && !rtx_equal_p (tem, orig))
!     push_reg_equiv_alt_mem (regno, tem);
! 
!   /* For some processors an address may be valid in the original mode but
!      not in a smaller mode.  For example, ARM accepts a scaled index register
!      in SImode but not in HImode.  Note that this is only a problem if the
!      address in reg_equiv_mem is already invalid in the new mode; other
!      cases would be fixed by find_reloads_address as usual.
! 
!      ??? We attempt to handle such cases here by doing an additional reload
!      of the full address after the usual processing by find_reloads_address.
!      Note that this may not work in the general case, but it seems to cover
!      the cases where this situation currently occurs.  A more general fix
!      might be to reload the *value* instead of the address, but this would
!      not be expected by the callers of this routine as-is.
! 
!      If find_reloads_address already completed replaced the address, there
!      is nothing further to do.  */
!   if (reloaded == 0
!       && reg_equiv_mem (regno) != 0
!       && !strict_memory_address_addr_space_p
! 		(GET_MODE (x), XEXP (reg_equiv_mem (regno), 0),
! 		 MEM_ADDR_SPACE (reg_equiv_mem (regno))))
!     {
!       push_reload (XEXP (tem, 0), NULL_RTX, &XEXP (tem, 0), (rtx*) 0,
! 		   base_reg_class (GET_MODE (tem), MEM_ADDR_SPACE (tem),
! 				   MEM, SCRATCH),
! 		   GET_MODE (XEXP (tem, 0)), VOIDmode, 0, 0, opnum, type);
!       reloaded = 1;
!     }
! 
!   /* If this is not a toplevel operand, find_reloads doesn't see this
!      substitution.  We have to emit a USE of the pseudo so that
!      delete_output_reload can see it.  */
!   if (replace_reloads && recog_data.operand[opnum] != x)
!     /* We mark the USE with QImode so that we recognize it as one that
!        can be safely deleted at the end of reload.  */
!     PUT_MODE (emit_insn_before (gen_rtx_USE (VOIDmode, SUBREG_REG (x)), insn),
! 	      QImode);
  
    if (address_reloaded)
      *address_reloaded = reloaded;
  
!   return tem;
  }
  \f
  /* Substitute into the current INSN the registers into which we have reloaded


-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  Ulrich.Weigand@de.ibm.com

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [commit] Re-work find_reloads_subreg_address
  2012-10-16 14:02             ` [commit] Re-work find_reloads_subreg_address Ulrich Weigand
@ 2012-10-16 14:33               ` Tejas Belagod
  0 siblings, 0 replies; 9+ messages in thread
From: Tejas Belagod @ 2012-10-16 14:33 UTC (permalink / raw)
  To: Ulrich Weigand
  Cc: gcc-patches, bernds, vmakarov, Ramana Radhakrishnan, ubizjak

Ulrich Weigand wrote:
> Tejas Belagod wrote:
>>> Ulrich Weigand wrote:
>>>> The following patch implements this idea; it passes a basic regression
>>>> test on arm-linux-gnueabi.  (Obviously this would need a lot more
>>>> testing on various platforms before getting into mainline ...)
>>>>
>>>> Can you have a look whether this fixes the problem you're seeing?
> [snip]
> 
>> Sorry for the delay in replying. These new issues that I was seeing were
>> bugs specific to my code that I've fixed. Your patch seems to work fine.
>> Thanks a lot for the patch.
> 
> I've now checked the patch in to mainline.   In addition to your test on
> aarch64, I've completed tests without regression on ppc(64)-linux,
> s390(x)-linux, and i386-linux (with Uros' patch).
> 
> Note that Uros' patch here:
> http://gcc.gnu.org/ml/gcc-patches/2012-10/msg01463.html
> is a pre-requisite for the reload patch to avoid regressions on i386.
> 
> Current version (nearly unchanged) of the patch appended below.
> 

Thank you.

Tejas Belagod,
ARM.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2012-10-16 14:02 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-06-28 21:46 [PATCH][RFC, Reload]. Reload bug? Tejas Belagod
2012-06-29 13:16 ` Ulrich Weigand
2012-06-29 14:17   ` Tejas Belagod
2012-07-04 10:41     ` Tejas Belagod
2012-07-27 18:19       ` [PATCH, RFC] Re-work find_reloads_subreg_address (Re: [PATCH][RFC, Reload]. Reload bug?) Ulrich Weigand
2012-08-02 11:11         ` Tejas Belagod
2012-08-20 10:09           ` Tejas Belagod
2012-10-16 14:02             ` [commit] Re-work find_reloads_subreg_address Ulrich Weigand
2012-10-16 14:33               ` Tejas Belagod

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).