public inbox for gcc-patches@gcc.gnu.org
* [RFC ivopts] ARM - Make ivopts take into account whether pre and post increments are actually supported on targets.
@ 2012-03-27 12:57 Ramana Radhakrishnan
  2012-03-27 13:17 ` Ramana Radhakrishnan
  0 siblings, 1 reply; 6+ messages in thread
From: Ramana Radhakrishnan @ 2012-03-27 12:57 UTC (permalink / raw)
  To: gcc-patches; +Cc: Patch Tracking

Hi,

One of the problems with ivopts is that the auto-increment modelling
only takes into account whether HAVE_PRE_INCREMENT and friends are
defined for the architecture. However, on ARM the VFP addressing modes
don't really support the PRE_INCREMENT and POST_DECREMENT forms, but
ivopts does not know this and is biased towards preferring
pre-increment forms over everything else. The attached patch attempts
to fix this. In general it makes things better on ARM, where we have a
large number of cases of rather embarrassing code generation around
array accesses of floating point values: to honor this choice of
auto-increment form, the compiler is forced to move things back and
forth between floating point and integer registers.

The canonical example for this is

 void foo (float *x , float *y, float *z, float *m, int l)
   {
      int i;
      for (i = 0; i < l ; i++)
      {
        z[i] = x[i] * y[i] + m[i];
      }
   }

 sub r0, r0, #4
 sub r1, r1, #4
 sub r3, r3, #4
 add ip, r2, ip, asl #2
.L3:
 add r3, r3, #4
 add r0, r0, #4
 flds s15, [r3, #0]
 flds s13, [r0, #0]
 add r1, r1, #4
 flds s14, [r1, #0]
 fmacs s15, s13, s14
 mov r4, r3
 fstmias r2!, {s15}
 cmp r2, ip
 bne .L3
.L1:
 ldmfd sp!, {r4}
 bx lr

and after the patch we generate:

foo:
 @ args = 4, pretend = 0, frame = 0
 @ frame_needed = 0, uses_anonymous_args = 0
 @ link register save eliminated.
 ldr ip, [sp, #0]
 cmp ip, #0
 bxle lr
 add ip, r0, ip, asl #2
.L3:
 fldmias r0!, {s13}
 fldmias r1!, {s14}
 fldmias r3!, {s15}
 fmacs s15, s13, s14
 cmp r0, ip
 fstmias r2!, {s15}
 bne .L3
 bx lr
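
For readers comparing the two listings: the second loop walks each
pointer forward by one float per iteration, which is roughly what the
hand-written C below does. This is only an illustrative sketch. The
names foo_indexed and foo_postinc are made up here, and the compiler
does this transformation internally rather than at source level.

```c
/* The loop as written in the example: every access is base + i.  */
void foo_indexed (float *x, float *y, float *z, float *m, int l)
{
  int i;
  for (i = 0; i < l; i++)
    z[i] = x[i] * y[i] + m[i];
}

/* Hypothetical source-level picture of the post-increment form: each
   pointer is dereferenced and then bumped, mirroring the
   "fldmias rN!, {sM}" loads and the "fstmias r2!, {s15}" store.  */
void foo_postinc (float *x, float *y, float *z, float *m, int l)
{
  float *end;

  if (l <= 0)          /* mirrors "cmp ip, #0" / "bxle lr" */
    return;

  end = x + l;         /* mirrors "add ip, r0, ip, asl #2" */
  while (x != end)     /* mirrors "cmp r0, ip" / "bne .L3" */
    {
      float a = *x++;
      float b = *y++;
      float c = *m++;
      *z++ = a * b + c;
    }
}
```

Both functions should produce the same values in z; the second form
simply has no separate index or address arithmetic left in the loop
body.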


In general, ivopts could do with some TLC in this area. Looking at
the code generated for most of SPEC2k, I see a general improvement in
performance on an A9 board, with the large number of transfers back
and forth between VFP and integer registers much reduced (I saw up to
a 6% improvement in mgrid and 3% in facerec) and overall up to a 1%
improvement when this patch was applied to the Linaro 4.6 tree.
Looking at object files with the same patch applied on FSF trunk, I
see similar transformations to those in the 4.6 tree. I see some funny
behaviour with twolf, where there is noise in the results, and I'm not
confident of that particular result.

In the interest of full disclosure, while looking at mgrid I noticed
a few cases where we were moving more values from the integer to the
VFP side, but overall I think this patch benefits more than it harms.
These appeared to be around areas where a floating point array was
being zero initialized. Given that the VFP instruction set doesn't
really have a zero initializer form, we were moving the value 0 into
an integer register and then moving it into a VFP register, rather
than just choosing the integer side register store. I am not yet sure
why that is happening and that's something I'm investigating. Before
that, I wanted some feedback on this patch as it stands today, as I
believe it's reached a stage where it appears to be performing
reasonably well.

I did experiment with costs, and more generally with turning off
these auto-increment forms for the FP modes when we are not in
soft-float mode, but nothing appeared to behave as well as the
attached patch.

Thoughts and comments would be welcome. I don't know of any other
architectures where this will be applicable.

Regards,
Ramana


gcc/

	* tree-ssa-loop-ivopts.c (add_autoinc_candidates, get_address_cost):
	Replace use of HAVE_{POST/PRE}_{INCREMENT/DECREMENT} with
	USE_{LOAD/STORE}_{PRE/POST}_{INCREMENT/DECREMENT} appropriately.
	* config/arm/arm.h (ARM_AUTOINC_VALID_FOR_MODE_P): New.
	(USE_LOAD_POST_INCREMENT): Define.
	(USE_LOAD_PRE_INCREMENT): Define.
	(USE_LOAD_POST_DECREMENT): Define.
	(USE_LOAD_PRE_DECREMENT): Define.
	(USE_STORE_PRE_DECREMENT): Define.
	(USE_STORE_PRE_INCREMENT): Define.
	(USE_STORE_POST_DECREMENT): Define.
	(USE_STORE_POST_INCREMENT): Define.
	(ARM_POST_INC): Define.
	(ARM_PRE_INC): Define.
	(ARM_PRE_DEC): Define.
	(ARM_POST_DEC): Define.
	* config/arm/arm-protos.h (arm_autoinc_modes_ok_p): Declare.
	* config/arm/arm.c (arm_autoinc_modes_ok_p): Define.
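
The ivopts part of the change can be pictured with a small standalone
model. Everything in it is a simplification made up for illustration:
struct autoinc_caps and the two helper functions are hypothetical, the
load and store answers are merged into one flag per form, and the real
code in add_autoinc_candidates works on GCC machine modes and trees
rather than plain integers.

```c
#include <stdbool.h>

/* Hypothetical stand-in for what the USE_{LOAD,STORE}_* macros report
   for one memory mode (load and store answers merged).  */
struct autoinc_caps
{
  bool pre_inc, pre_dec, post_inc, post_dec;
};

/* A pre-modify candidate is worth adding only if the step equals the
   access size (increment) or its negation (decrement) and the target
   says that form is usable for this mode.  */
bool want_pre_candidate (struct autoinc_caps caps, long mode_size, long step)
{
  return (caps.pre_inc && mode_size == step)
         || (caps.pre_dec && mode_size == -step);
}

/* Same test for the post-modify forms.  */
bool want_post_candidate (struct autoinc_caps caps, long mode_size, long step)
{
  return (caps.post_inc && mode_size == step)
         || (caps.post_dec && mode_size == -step);
}
```

With the capabilities set the way the patch describes VFP float
accesses (post-increment and pre-decrement only), a 4-byte access
stepping by +4 gets a post-increment candidate but no pre-increment
one, which is the behaviour the patch is after.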


* Re: [RFC ivopts] ARM - Make ivopts take into account whether pre and post increments are actually supported on targets.
  2012-03-27 12:57 [RFC ivopts] ARM - Make ivopts take into account whether pre and post increments are actually supported on targets Ramana Radhakrishnan
@ 2012-03-27 13:17 ` Ramana Radhakrishnan
  2012-03-28  9:57   ` Richard Guenther
  0 siblings, 1 reply; 6+ messages in thread
From: Ramana Radhakrishnan @ 2012-03-27 13:17 UTC (permalink / raw)
  To: gcc-patches; +Cc: Patch Tracking

[-- Attachment #1: Type: text/plain, Size: 43 bytes --]

And the patch is now attached ....

Ramana

[-- Attachment #2: wa-vfp-addressing.txt --]
[-- Type: text/plain, Size: 5321 bytes --]

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 900d09a..6e82fb0 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -247,5 +247,5 @@ extern int vfp3_const_double_for_fract_bits (rtx);
 
 extern void arm_expand_vec_perm (rtx target, rtx op0, rtx op1, rtx sel);
 extern bool arm_expand_vec_perm_const (rtx target, rtx op0, rtx op1, rtx sel);
-
+extern bool arm_autoinc_modes_ok_p (enum machine_mode, int);
 #endif /* ! GCC_ARM_PROTOS_H */
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 9af66dd..31d6d9f 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -25652,5 +25652,40 @@ arm_vectorize_vec_perm_const_ok (enum machine_mode vmode,
   return ret;
 }
 
+bool
+arm_autoinc_modes_ok_p (enum machine_mode mode, int code)
+{
+  if (TARGET_SOFT_FLOAT)
+    return true;
+
+  switch (code)
+    {
+    case ARM_POST_INC:
+    case ARM_PRE_DEC:
+      if (VECTOR_MODE_P (mode))
+	{
+	  if (code != ARM_PRE_DEC)
+	    return true;
+	  else 
+	    return false;
+	}
+      
+      return true;
+
+    case ARM_POST_DEC:
+    case ARM_PRE_INC:
+      if (FLOAT_MODE_P (mode) || VECTOR_MODE_P (mode))
+	return false;
+      else
+	return true;
+     
+    default:
+      return false;
+      
+    }
+
+  return false;
+}
+
 \f
 #include "gt-arm.h"
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 443d2ed..2e4f3a0 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -1623,6 +1623,27 @@ typedef struct
 #define HAVE_PRE_MODIFY_REG   TARGET_32BIT
 #define HAVE_POST_MODIFY_REG  TARGET_32BIT
 
+#define ARM_POST_INC 0
+#define ARM_PRE_INC  1
+#define ARM_POST_DEC 2
+#define ARM_PRE_DEC  3
+
+#define ARM_AUTOINC_VALID_FOR_MODE_P(mode, code) \
+  (TARGET_32BIT && arm_autoinc_modes_ok_p (mode, code))
+#define USE_LOAD_POST_INCREMENT(mode) \
+  ARM_AUTOINC_VALID_FOR_MODE_P(mode, ARM_POST_INC)
+#define USE_LOAD_PRE_INCREMENT(mode)  \
+  ARM_AUTOINC_VALID_FOR_MODE_P(mode, ARM_PRE_INC)
+#define USE_LOAD_POST_DECREMENT(mode) \
+  ARM_AUTOINC_VALID_FOR_MODE_P(mode, ARM_POST_DEC)
+#define USE_LOAD_PRE_DECREMENT(mode)  \
+  ARM_AUTOINC_VALID_FOR_MODE_P(mode, ARM_PRE_DEC)
+
+#define USE_STORE_PRE_DECREMENT(mode) USE_LOAD_PRE_DECREMENT(mode)
+#define USE_STORE_PRE_INCREMENT(mode) USE_LOAD_PRE_INCREMENT(mode)
+#define USE_STORE_POST_DECREMENT(mode) USE_LOAD_POST_DECREMENT(mode)
+#define USE_STORE_POST_INCREMENT(mode) USE_LOAD_POST_INCREMENT(mode)
+
 /* Macros to check register numbers against specific register classes.  */
 
 /* These assume that REGNO is a hard or pseudo reg number.
diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
index 527c911..ac37608 100644
--- a/gcc/tree-ssa-loop-ivopts.c
+++ b/gcc/tree-ssa-loop-ivopts.c
@@ -2361,8 +2361,12 @@ add_autoinc_candidates (struct ivopts_data *data, tree base, tree step,
   cstepi = int_cst_value (step);
 
   mem_mode = TYPE_MODE (TREE_TYPE (*use->op_p));
-  if ((HAVE_PRE_INCREMENT && GET_MODE_SIZE (mem_mode) == cstepi)
-      || (HAVE_PRE_DECREMENT && GET_MODE_SIZE (mem_mode) == -cstepi))
+  if (((USE_LOAD_PRE_INCREMENT (mem_mode)
+	|| USE_STORE_PRE_INCREMENT (mem_mode))
+       && GET_MODE_SIZE (mem_mode) == cstepi)
+      || ((USE_LOAD_PRE_DECREMENT (mem_mode)
+	   || USE_STORE_PRE_DECREMENT (mem_mode))
+	  && GET_MODE_SIZE (mem_mode) == -cstepi))
     {
       enum tree_code code = MINUS_EXPR;
       tree new_base;
@@ -2379,8 +2383,12 @@ add_autoinc_candidates (struct ivopts_data *data, tree base, tree step,
       add_candidate_1 (data, new_base, step, important, IP_BEFORE_USE, use,
 		       use->stmt);
     }
-  if ((HAVE_POST_INCREMENT && GET_MODE_SIZE (mem_mode) == cstepi)
-      || (HAVE_POST_DECREMENT && GET_MODE_SIZE (mem_mode) == -cstepi))
+  if (((USE_LOAD_POST_INCREMENT (mem_mode)
+	|| USE_STORE_POST_INCREMENT (mem_mode))
+       && GET_MODE_SIZE (mem_mode) == cstepi)
+      || ((USE_LOAD_POST_DECREMENT (mem_mode)
+	   || USE_STORE_POST_DECREMENT (mem_mode))
+	  && GET_MODE_SIZE (mem_mode) == -cstepi))
     {
       add_candidate_1 (data, base, step, important, IP_AFTER_USE, use,
 		       use->stmt);
@@ -3314,25 +3322,29 @@ get_address_cost (bool symbol_present, bool var_present,
       reg0 = gen_raw_REG (address_mode, LAST_VIRTUAL_REGISTER + 1);
       reg1 = gen_raw_REG (address_mode, LAST_VIRTUAL_REGISTER + 2);
 
-      if (HAVE_PRE_DECREMENT)
+      if (USE_LOAD_PRE_DECREMENT (mem_mode) 
+	  || USE_STORE_PRE_DECREMENT (mem_mode))
 	{
 	  addr = gen_rtx_PRE_DEC (address_mode, reg0);
 	  has_predec[mem_mode]
 	    = memory_address_addr_space_p (mem_mode, addr, as);
 	}
-      if (HAVE_POST_DECREMENT)
+      if (USE_LOAD_POST_DECREMENT (mem_mode) 
+	  || USE_STORE_POST_DECREMENT (mem_mode))
 	{
 	  addr = gen_rtx_POST_DEC (address_mode, reg0);
 	  has_postdec[mem_mode]
 	    = memory_address_addr_space_p (mem_mode, addr, as);
 	}
-      if (HAVE_PRE_INCREMENT)
+      if (USE_LOAD_PRE_INCREMENT (mem_mode) 
+	  || USE_STORE_PRE_INCREMENT (mem_mode))
 	{
 	  addr = gen_rtx_PRE_INC (address_mode, reg0);
 	  has_preinc[mem_mode]
 	    = memory_address_addr_space_p (mem_mode, addr, as);
 	}
-      if (HAVE_POST_INCREMENT)
+      if (USE_LOAD_POST_INCREMENT (mem_mode) 
+	  || USE_STORE_POST_INCREMENT (mem_mode))
 	{
 	  addr = gen_rtx_POST_INC (address_mode, reg0);
 	  has_postinc[mem_mode]


* Re: [RFC ivopts] ARM - Make ivopts take into account whether pre and post increments are actually supported on targets.
  2012-03-27 13:17 ` Ramana Radhakrishnan
@ 2012-03-28  9:57   ` Richard Guenther
  2012-03-28 10:13     ` Richard Guenther
  0 siblings, 1 reply; 6+ messages in thread
From: Richard Guenther @ 2012-03-28  9:57 UTC (permalink / raw)
  To: Ramana Radhakrishnan; +Cc: gcc-patches, Patch Tracking

On Tue, Mar 27, 2012 at 3:17 PM, Ramana Radhakrishnan
<ramana.radhakrishnan@linaro.org> wrote:
> And the patch is now attached ....

This does not look like it would compile on any other target.

Richard.

> Ramana


* Re: [RFC ivopts] ARM - Make ivopts take into account whether pre and post increments are actually supported on targets.
  2012-03-28  9:57   ` Richard Guenther
@ 2012-03-28 10:13     ` Richard Guenther
  2012-04-10 12:51       ` Ramana Radhakrishnan
  0 siblings, 1 reply; 6+ messages in thread
From: Richard Guenther @ 2012-03-28 10:13 UTC (permalink / raw)
  To: Ramana Radhakrishnan; +Cc: gcc-patches, Patch Tracking

On Wed, Mar 28, 2012 at 11:57 AM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Tue, Mar 27, 2012 at 3:17 PM, Ramana Radhakrishnan
> <ramana.radhakrishnan@linaro.org> wrote:
>> And the patch is now attached ....
>
> This does not look like it would compile on any other target.

Looks like the macros are pre-existing in rtl.h.  With that the ivopts change
is ok.  I'll let arm folks decide over the arm specific bits.
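
For context, the reason the ivopts change still builds on other
targets is that a target which does not define these macros picks up a
default keyed off the existing HAVE_* macros. A minimal sketch of that
fallback pattern follows. It is written as a standalone file with an
assumed HAVE_POST_INCREMENT value and a hypothetical helper function,
and is not copied from rtl.h.

```c
/* Pretend target: it only provides the global flag and says nothing
   mode-specific.  The value 1 is an assumption for this demo.  */
#define HAVE_POST_INCREMENT 1

/* Fallback in the style of the defaults discussed above: if the
   target did not supply a per-mode answer, ignore the mode and use
   the global flag.  */
#ifndef USE_LOAD_POST_INCREMENT
#define USE_LOAD_POST_INCREMENT(MODE) HAVE_POST_INCREMENT
#endif

/* Hypothetical helper so the macro can be exercised from plain C.  */
int load_post_increment_ok (int mode)
{
  (void) mode;  /* the fallback does not look at the mode */
  return USE_LOAD_POST_INCREMENT (mode);
}
```

A target such as ARM that defines USE_LOAD_POST_INCREMENT itself would
skip the fallback and get a per-mode answer instead.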

Thanks,
Richard.

> Richard.
>
>> Ramana


* Re: [RFC ivopts] ARM - Make ivopts take into account whether pre and post increments are actually supported on targets.
  2012-03-28 10:13     ` Richard Guenther
@ 2012-04-10 12:51       ` Ramana Radhakrishnan
  2012-05-09 12:55         ` Ramana Radhakrishnan
  0 siblings, 1 reply; 6+ messages in thread
From: Ramana Radhakrishnan @ 2012-04-10 12:51 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc-patches, Patch Tracking, Richard Earnshaw

[-- Attachment #1: Type: text/plain, Size: 2004 bytes --]

On 28 March 2012 11:13, Richard Guenther <richard.guenther@gmail.com> wrote:
> On Wed, Mar 28, 2012 at 11:57 AM, Richard Guenther
> <richard.guenther@gmail.com> wrote:
>> On Tue, Mar 27, 2012 at 3:17 PM, Ramana Radhakrishnan
>> <ramana.radhakrishnan@linaro.org> wrote:
>>> And the patch is now attached ....
>>
>> This does not look like it would compile on any other target.
>
> Looks like the macros are pre-existing in rtl.h.  With that the ivopts change
> is ok.  I'll let arm folks decide over the arm specific bits.

Thanks for the approval on the ivopts part, which I haven't changed.
I have revised the ARM backend specific changes to tweak costs, so
that auto-inc-dec does not gratuitously add more moves for the
pre/post_modify_disp variety of instructions, and to disable these
forms for versions of the architecture which don't have support for
LDRD.
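
As a rough picture of the cost tweak, here is a standalone model of
the speed-cost rule. It is a sketch under assumptions: enum addr_form
and mem_cost are hypothetical names, costs are counted in whole
instructions rather than GCC's COSTS_N_INSNS units, and the target
properties are passed in as plain booleans.

```c
#include <stdbool.h>

/* Hypothetical labels for the address forms the cost tweak looks at.  */
enum addr_form
{
  ADDR_PLAIN, ADDR_POST_INC, ADDR_PRE_DEC,
  ADDR_POST_DEC, ADDR_PRE_INC, ADDR_PRE_MODIFY, ADDR_POST_MODIFY
};

/* BASE is the ordinary cost of the memory access.  A floating point
   access through post_dec, pre_inc or pre/post_modify is charged two
   extra instructions when the target is hard float or lacks LDRD, so
   that those forms stop looking free.  */
int mem_cost (int base, bool hard_float, bool have_ldrd,
              bool float_mode, enum addr_form form)
{
  bool penalized_form = form == ADDR_POST_DEC || form == ADDR_PRE_INC
                        || form == ADDR_PRE_MODIFY
                        || form == ADDR_POST_MODIFY;

  if ((hard_float || !have_ldrd) && float_mode && penalized_form)
    return base + 2;
  return base;
}
```

Under this model a hard-float SFmode load through a pre-increment
address costs base + 2, while the same load through a post-increment
address keeps the base cost.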

I would like another set of eyes on the backend specific changes - I
am currently regression testing this final version on FSF trunk.

2012-04-10  Ramana Radhakrishnan  <ramana.radhakrishnan@linaro.org>

	* tree-ssa-loop-ivopts.c (add_autoinc_candidates, get_address_cost):
	Replace use of HAVE_{POST/PRE}_{INCREMENT/DECREMENT} with
	USE_{LOAD/STORE}_{PRE/POST}_{INCREMENT/DECREMENT} appropriately.
	* config/arm/arm.h (ARM_AUTOINC_VALID_FOR_MODE_P): New.
	(USE_LOAD_POST_INCREMENT): Define.
	(USE_LOAD_PRE_INCREMENT): Define.
	(USE_LOAD_POST_DECREMENT): Define.
	(USE_LOAD_PRE_DECREMENT): Define.
	(USE_STORE_PRE_DECREMENT): Define.
	(USE_STORE_PRE_INCREMENT): Define.
	(USE_STORE_POST_DECREMENT): Define.
	(USE_STORE_POST_INCREMENT): Define.
	(arm_auto_incmodes): Add enumeration.
	* config/arm/arm-protos.h (arm_autoinc_modes_ok_p): Declare.
	* config/arm/arm.c (arm_autoinc_modes_ok_p): Define.
	(arm_rtx_costs_1): Adjust costs for
	auto-inc modes and pre / post modify in floating point mode.
	(arm_size_rtx_costs): Likewise.


regards,
Ramana



> Richard.
>
>> Richard.
>>
>>> Ramana

[-- Attachment #2: final-vfp-addressing-modes-patch.txt --]
[-- Type: text/plain, Size: 7582 bytes --]

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 900d09a..f9cb75a 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -248,4 +248,6 @@ extern int vfp3_const_double_for_fract_bits (rtx);
 extern void arm_expand_vec_perm (rtx target, rtx op0, rtx op1, rtx sel);
 extern bool arm_expand_vec_perm_const (rtx target, rtx op0, rtx op1, rtx sel);
 
+extern bool arm_autoinc_modes_ok_p (enum machine_mode, enum arm_auto_incmodes);
+
 #endif /* ! GCC_ARM_PROTOS_H */
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 5522fc1..6bc5aa9 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -7121,6 +7121,19 @@ arm_rtx_costs_1 (rtx x, enum rtx_code outer, int* total, bool speed)
       /* Memory costs quite a lot for the first word, but subsequent words
 	 load at the equivalent of a single insn each.  */
       *total = COSTS_N_INSNS (2 + ARM_NUM_REGS (mode));
+
+      /* If we have hard float or there is no support for ldrd 
+	 and strd there is no point in allowing post_dec, 
+	 pre_inc and pre/post_modify_disp to have the same cost
+	 for memory accesses in floating point modes.  */
+      if ((TARGET_HARD_FLOAT
+	   || !TARGET_LDRD)
+	  && (FLOAT_MODE_P (mode) &&
+	      (GET_CODE (XEXP (x, 0)) == POST_DEC
+	       || GET_CODE (XEXP (x, 0)) == PRE_INC
+	       || GET_CODE (XEXP (x, 0)) == PRE_MODIFY
+	       || GET_CODE (XEXP (x, 0)) == POST_MODIFY)))
+	*total += COSTS_N_INSNS (2);
       return true;
 
     case DIV:
@@ -7831,6 +7844,20 @@ arm_size_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code,
 	*total = COSTS_N_INSNS (2);
       else
 	*total = COSTS_N_INSNS (ARM_NUM_REGS (mode));
+
+      /* If we have hard float or there is no support for ldrd 
+	 and strd there is no point in allowing post_dec, 
+	 pre_inc and pre/post_modify_disp to have the same cost
+	 for memory accesses in floating point modes.  */      
+      if ((TARGET_HARD_FLOAT
+	   || !TARGET_LDRD)
+	  && (FLOAT_MODE_P (mode) &&
+	      (GET_CODE (XEXP (x, 0)) == POST_DEC
+	       || GET_CODE (XEXP (x, 0)) == PRE_INC
+	       || GET_CODE (XEXP (x, 0)) == PRE_MODIFY
+	       || GET_CODE (XEXP (x, 0)) == POST_MODIFY)))
+	*total = COSTS_N_INSNS (2);
+      
       return true;
 
     case DIV:
@@ -25680,5 +25707,51 @@ arm_vectorize_vec_perm_const_ok (enum machine_mode vmode,
   return ret;
 }
 
-\f
+bool
+arm_autoinc_modes_ok_p (enum machine_mode mode, enum arm_auto_incmodes code)
+{
+  /* If we are soft float and either have ldrd or the mode fits
+     in a single word, then all auto increment forms are ok.  */
+  if (TARGET_SOFT_FLOAT && (TARGET_LDRD || GET_MODE_SIZE (mode) <= 4))
+    return true;
+
+  switch (code)
+    {
+      /* Post increment is supported for all instruction forms;
+	 pre decrement for all except the vector forms.  */
+    case ARM_POST_INC:
+    case ARM_PRE_DEC:
+      if (VECTOR_MODE_P (mode))
+	{
+	  if (code != ARM_PRE_DEC)
+	    return true;
+	  else 
+	    return false;
+	}
+      
+      return true;
+
+    case ARM_POST_DEC:
+    case ARM_PRE_INC:
+      /* Without LDRD and mode size greater than 
+	 word size, there is no point in auto-incrementing
+         because ldm and stm will not have these forms.  */
+      if (!TARGET_LDRD && GET_MODE_SIZE (mode) > 4)
+	return false;
+
+      /* Vector and floating point modes do not support
+	 these auto increment forms.  */
+      if (FLOAT_MODE_P (mode) || VECTOR_MODE_P (mode))
+	return false;
+
+      return true;
+     
+    default:
+      return false;
+      
+    }
+
+  return false;
+}
+
 #include "gt-arm.h"
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index c6b4cc0..f4204e4 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -1613,6 +1613,30 @@ typedef struct
 #define HAVE_PRE_MODIFY_REG   TARGET_32BIT
 #define HAVE_POST_MODIFY_REG  TARGET_32BIT
 
+enum arm_auto_incmodes
+  {
+    ARM_POST_INC,
+    ARM_PRE_INC,
+    ARM_POST_DEC,
+    ARM_PRE_DEC
+  };
+
+#define ARM_AUTOINC_VALID_FOR_MODE_P(mode, code) \
+  (TARGET_32BIT && arm_autoinc_modes_ok_p (mode, code))
+#define USE_LOAD_POST_INCREMENT(mode) \
+  ARM_AUTOINC_VALID_FOR_MODE_P(mode, ARM_POST_INC)
+#define USE_LOAD_PRE_INCREMENT(mode)  \
+  ARM_AUTOINC_VALID_FOR_MODE_P(mode, ARM_PRE_INC)
+#define USE_LOAD_POST_DECREMENT(mode) \
+  ARM_AUTOINC_VALID_FOR_MODE_P(mode, ARM_POST_DEC)
+#define USE_LOAD_PRE_DECREMENT(mode)  \
+  ARM_AUTOINC_VALID_FOR_MODE_P(mode, ARM_PRE_DEC)
+
+#define USE_STORE_PRE_DECREMENT(mode) USE_LOAD_PRE_DECREMENT(mode)
+#define USE_STORE_PRE_INCREMENT(mode) USE_LOAD_PRE_INCREMENT(mode)
+#define USE_STORE_POST_DECREMENT(mode) USE_LOAD_POST_DECREMENT(mode)
+#define USE_STORE_POST_INCREMENT(mode) USE_LOAD_POST_INCREMENT(mode)
+
 /* Macros to check register numbers against specific register classes.  */
 
 /* These assume that REGNO is a hard or pseudo reg number.
diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
index 527c911..ac37608 100644
--- a/gcc/tree-ssa-loop-ivopts.c
+++ b/gcc/tree-ssa-loop-ivopts.c
@@ -2361,8 +2361,12 @@ add_autoinc_candidates (struct ivopts_data *data, tree base, tree step,
   cstepi = int_cst_value (step);
 
   mem_mode = TYPE_MODE (TREE_TYPE (*use->op_p));
-  if ((HAVE_PRE_INCREMENT && GET_MODE_SIZE (mem_mode) == cstepi)
-      || (HAVE_PRE_DECREMENT && GET_MODE_SIZE (mem_mode) == -cstepi))
+  if (((USE_LOAD_PRE_INCREMENT (mem_mode)
+	|| USE_STORE_PRE_INCREMENT (mem_mode))
+       && GET_MODE_SIZE (mem_mode) == cstepi)
+      || ((USE_LOAD_PRE_DECREMENT (mem_mode)
+	   || USE_STORE_PRE_DECREMENT (mem_mode))
+	  && GET_MODE_SIZE (mem_mode) == -cstepi))
     {
       enum tree_code code = MINUS_EXPR;
       tree new_base;
@@ -2379,8 +2383,12 @@ add_autoinc_candidates (struct ivopts_data *data, tree base, tree step,
       add_candidate_1 (data, new_base, step, important, IP_BEFORE_USE, use,
 		       use->stmt);
     }
-  if ((HAVE_POST_INCREMENT && GET_MODE_SIZE (mem_mode) == cstepi)
-      || (HAVE_POST_DECREMENT && GET_MODE_SIZE (mem_mode) == -cstepi))
+  if (((USE_LOAD_POST_INCREMENT (mem_mode)
+	|| USE_STORE_POST_INCREMENT (mem_mode))
+       && GET_MODE_SIZE (mem_mode) == cstepi)
+      || ((USE_LOAD_POST_DECREMENT (mem_mode)
+	   || USE_STORE_POST_DECREMENT (mem_mode))
+	  && GET_MODE_SIZE (mem_mode) == -cstepi))
     {
       add_candidate_1 (data, base, step, important, IP_AFTER_USE, use,
 		       use->stmt);
@@ -3314,25 +3322,29 @@ get_address_cost (bool symbol_present, bool var_present,
       reg0 = gen_raw_REG (address_mode, LAST_VIRTUAL_REGISTER + 1);
       reg1 = gen_raw_REG (address_mode, LAST_VIRTUAL_REGISTER + 2);
 
-      if (HAVE_PRE_DECREMENT)
+      if (USE_LOAD_PRE_DECREMENT (mem_mode) 
+	  || USE_STORE_PRE_DECREMENT (mem_mode))
 	{
 	  addr = gen_rtx_PRE_DEC (address_mode, reg0);
 	  has_predec[mem_mode]
 	    = memory_address_addr_space_p (mem_mode, addr, as);
 	}
-      if (HAVE_POST_DECREMENT)
+      if (USE_LOAD_POST_DECREMENT (mem_mode) 
+	  || USE_STORE_POST_DECREMENT (mem_mode))
 	{
 	  addr = gen_rtx_POST_DEC (address_mode, reg0);
 	  has_postdec[mem_mode]
 	    = memory_address_addr_space_p (mem_mode, addr, as);
 	}
-      if (HAVE_PRE_INCREMENT)
+      if (USE_LOAD_PRE_INCREMENT (mem_mode) 
+	  || USE_STORE_PRE_INCREMENT (mem_mode))
 	{
 	  addr = gen_rtx_PRE_INC (address_mode, reg0);
 	  has_preinc[mem_mode]
 	    = memory_address_addr_space_p (mem_mode, addr, as);
 	}
-      if (HAVE_POST_INCREMENT)
+      if (USE_LOAD_POST_INCREMENT (mem_mode) 
+	  || USE_STORE_POST_INCREMENT (mem_mode))
 	{
 	  addr = gen_rtx_POST_INC (address_mode, reg0);
 	  has_postinc[mem_mode]


* Re: [RFC ivopts] ARM - Make ivopts take into account whether pre and post increments are actually supported on targets.
  2012-04-10 12:51       ` Ramana Radhakrishnan
@ 2012-05-09 12:55         ` Ramana Radhakrishnan
  0 siblings, 0 replies; 6+ messages in thread
From: Ramana Radhakrishnan @ 2012-05-09 12:55 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Guenther, Patch Tracking, Richard Earnshaw

[-- Attachment #1: Type: text/plain, Size: 2324 bytes --]

> I would like another set of eyes on the backend specific changes - I
> am currently regression testing this final version on FSF trunk.

After testing and benchmarking and getting some private feedback about
the patch, this is what I ended up committing.  I have a follow up
patch coming to adjust legitimate_address and friends for some of
these modes.
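
To summarise what the committed predicate accepts, here is a
standalone model of it. This is a sketch, not the committed code:
struct query, enum autoinc_form and autoinc_ok are hypothetical names,
and the target and mode properties that the real function reads from
TARGET_SOFT_FLOAT, TARGET_LDRD and the machine mode are passed in as
plain fields.

```c
#include <stdbool.h>

enum autoinc_form { FORM_POST_INC, FORM_PRE_INC, FORM_POST_DEC, FORM_PRE_DEC };

/* Hypothetical flattened description of the target and the mode.  */
struct query
{
  bool soft_float, have_ldrd, float_mode, vector_mode;
  int mode_size;  /* in bytes */
};

bool autoinc_ok (struct query q, enum autoinc_form form)
{
  /* Soft float with ldrd, or with a mode that fits in one word:
     every form is fine.  */
  if (q.soft_float && (q.have_ldrd || q.mode_size <= 4))
    return true;

  switch (form)
    {
    case FORM_POST_INC:
      return true;                  /* always available */
    case FORM_PRE_DEC:
      return !q.vector_mode;        /* not for vector modes */
    case FORM_POST_DEC:
    case FORM_PRE_INC:
      if (!q.have_ldrd && q.mode_size > 4)
        return false;               /* ldm/stm lack these forms */
      return !(q.float_mode || q.vector_mode);
    }
  return false;
}
```

For a hard-float target and a 4-byte float mode this yields
post-increment and pre-decrement only, which matches the forms used in
the improved loop from the first message.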

regards,
Ramana

2012-05-09  Ramana Radhakrishnan  <ramana.radhakrishnan@linaro.org>

	* tree-ssa-loop-ivopts.c (add_autoinc_candidates, get_address_cost):
	Replace use of HAVE_{POST/PRE}_{INCREMENT/DECREMENT} with
	USE_{LOAD/STORE}_{PRE/POST}_{INCREMENT/DECREMENT} appropriately.
	* config/arm/arm.h (ARM_AUTOINC_VALID_FOR_MODE_P): New.
	(USE_LOAD_POST_INCREMENT): Define.
	(USE_LOAD_PRE_INCREMENT): Define.
	(USE_LOAD_POST_DECREMENT): Define.
	(USE_LOAD_PRE_DECREMENT): Define.
	(USE_STORE_PRE_DECREMENT): Define.
	(USE_STORE_PRE_INCREMENT): Define.
	(USE_STORE_POST_DECREMENT): Define.
	(USE_STORE_POST_INCREMENT): Define.
	(arm_auto_incmodes): Add enumeration.
	* config/arm/arm-protos.h (arm_autoinc_modes_ok_p): Declare.
	* config/arm/arm.c (arm_autoinc_modes_ok_p): Define.



>
> 2012-04-10  Ramana Radhakrishnan  <ramana.radhakrishnan@linaro.org>
>
>        * tree-ssa-loop-ivopts.c (add_autoinc_candidates, get_address_cost):
>        Replace use of HAVE_{POST/PRE}_{INCREMENT/DECREMENT} with
>        USE_{LOAD/STORE}_{PRE/POST}_{INCREMENT/DECREMENT} appropriately.
>        * config/arm/arm.h (ARM_AUTOINC_VALID_FOR_MODE_P): New.
>        (USE_LOAD_POST_INCREMENT): Define.
>        (USE_LOAD_PRE_INCREMENT): Define.
>        (USE_LOAD_POST_DECREMENT): Define.
>        (USE_LOAD_PRE_DECREMENT): Define.
>        (USE_STORE_PRE_DECREMENT): Define.
>        (USE_STORE_PRE_INCREMENT): Define.
>        (USE_STORE_POST_DECREMENT): Define.
>        (USE_STORE_POST_INCREMENT): Define.
>        (arm_auto_incmodes): Add enumeration.
>        * config/arm/arm-protos.h (arm_autoinc_modes_ok_p): Declare.
>        * config/arm/arm.c (arm_autoinc_modes_ok_p): Define.
>        (arm_rtx_costs_1): Adjust costs for
>        auto-inc modes and pre / post modify in floating point mode.
>        (arm_size_rtx_costs): Likewise.
>
>
> regards,
> Ramana
>
>
>
>> Richard.
>>
>>> Richard.
>>>
>>>> Ramana

[-- Attachment #2: final-commit-vfp-patch.txt --]
[-- Type: text/plain, Size: 5773 bytes --]

Index: gcc/tree-ssa-loop-ivopts.c
===================================================================
--- gcc/tree-ssa-loop-ivopts.c	(revision 187327)
+++ gcc/tree-ssa-loop-ivopts.c	(working copy)
@@ -2362,8 +2362,12 @@
   cstepi = int_cst_value (step);
 
   mem_mode = TYPE_MODE (TREE_TYPE (*use->op_p));
-  if ((HAVE_PRE_INCREMENT && GET_MODE_SIZE (mem_mode) == cstepi)
-      || (HAVE_PRE_DECREMENT && GET_MODE_SIZE (mem_mode) == -cstepi))
+  if (((USE_LOAD_PRE_INCREMENT (mem_mode)
+	|| USE_STORE_PRE_INCREMENT (mem_mode))
+       && GET_MODE_SIZE (mem_mode) == cstepi)
+      || ((USE_LOAD_PRE_DECREMENT (mem_mode)
+	   || USE_STORE_PRE_DECREMENT (mem_mode))
+	  && GET_MODE_SIZE (mem_mode) == -cstepi))
     {
       enum tree_code code = MINUS_EXPR;
       tree new_base;
@@ -2380,8 +2384,12 @@
       add_candidate_1 (data, new_base, step, important, IP_BEFORE_USE, use,
 		       use->stmt);
     }
-  if ((HAVE_POST_INCREMENT && GET_MODE_SIZE (mem_mode) == cstepi)
-      || (HAVE_POST_DECREMENT && GET_MODE_SIZE (mem_mode) == -cstepi))
+  if (((USE_LOAD_POST_INCREMENT (mem_mode)
+	|| USE_STORE_POST_INCREMENT (mem_mode))
+       && GET_MODE_SIZE (mem_mode) == cstepi)
+      || ((USE_LOAD_POST_DECREMENT (mem_mode)
+	   || USE_STORE_POST_DECREMENT (mem_mode))
+	  && GET_MODE_SIZE (mem_mode) == -cstepi))
     {
       add_candidate_1 (data, base, step, important, IP_AFTER_USE, use,
 		       use->stmt);
@@ -3315,25 +3323,29 @@
       reg0 = gen_raw_REG (address_mode, LAST_VIRTUAL_REGISTER + 1);
       reg1 = gen_raw_REG (address_mode, LAST_VIRTUAL_REGISTER + 2);
 
-      if (HAVE_PRE_DECREMENT)
+      if (USE_LOAD_PRE_DECREMENT (mem_mode) 
+	  || USE_STORE_PRE_DECREMENT (mem_mode))
 	{
 	  addr = gen_rtx_PRE_DEC (address_mode, reg0);
 	  has_predec[mem_mode]
 	    = memory_address_addr_space_p (mem_mode, addr, as);
 	}
-      if (HAVE_POST_DECREMENT)
+      if (USE_LOAD_POST_DECREMENT (mem_mode) 
+	  || USE_STORE_POST_DECREMENT (mem_mode))
 	{
 	  addr = gen_rtx_POST_DEC (address_mode, reg0);
 	  has_postdec[mem_mode]
 	    = memory_address_addr_space_p (mem_mode, addr, as);
 	}
-      if (HAVE_PRE_INCREMENT)
+      if (USE_LOAD_PRE_INCREMENT (mem_mode) 
+	  || USE_STORE_PRE_INCREMENT (mem_mode))
 	{
 	  addr = gen_rtx_PRE_INC (address_mode, reg0);
 	  has_preinc[mem_mode]
 	    = memory_address_addr_space_p (mem_mode, addr, as);
 	}
-      if (HAVE_POST_INCREMENT)
+      if (USE_LOAD_POST_INCREMENT (mem_mode) 
+	  || USE_STORE_POST_INCREMENT (mem_mode))
 	{
 	  addr = gen_rtx_POST_INC (address_mode, reg0);
 	  has_postinc[mem_mode]
Index: gcc/config/arm/arm.c
===================================================================
--- gcc/config/arm/arm.c	(revision 187327)
+++ gcc/config/arm/arm.c	(working copy)
@@ -25886,5 +25886,51 @@
   return ret;
 }
 
-\f
+bool
+arm_autoinc_modes_ok_p (enum machine_mode mode, enum arm_auto_incmodes code)
+{
+  /* If we are soft float and either have ldrd or the mode fits
+     in a single word, then all auto increment forms are ok.  */
+  if (TARGET_SOFT_FLOAT && (TARGET_LDRD || GET_MODE_SIZE (mode) <= 4))
+    return true;
+
+  switch (code)
+    {
+      /* Post increment is supported for all instruction forms;
+	 pre decrement for all except the vector forms.  */
+    case ARM_POST_INC:
+    case ARM_PRE_DEC:
+      if (VECTOR_MODE_P (mode))
+	{
+	  if (code != ARM_PRE_DEC)
+	    return true;
+	  else 
+	    return false;
+	}
+      
+      return true;
+
+    case ARM_POST_DEC:
+    case ARM_PRE_INC:
+      /* Without LDRD and mode size greater than 
+	 word size, there is no point in auto-incrementing
+         because ldm and stm will not have these forms.  */
+      if (!TARGET_LDRD && GET_MODE_SIZE (mode) > 4)
+	return false;
+
+      /* Vector and floating point modes do not support
+	 these auto increment forms.  */
+      if (FLOAT_MODE_P (mode) || VECTOR_MODE_P (mode))
+	return false;
+
+      return true;
+     
+    default:
+      return false;
+      
+    }
+
+  return false;
+}
+
 #include "gt-arm.h"
Index: gcc/config/arm/arm.h
===================================================================
--- gcc/config/arm/arm.h	(revision 187327)
+++ gcc/config/arm/arm.h	(working copy)
@@ -1613,6 +1613,30 @@
 #define HAVE_PRE_MODIFY_REG   TARGET_32BIT
 #define HAVE_POST_MODIFY_REG  TARGET_32BIT
 
+enum arm_auto_incmodes
+  {
+    ARM_POST_INC,
+    ARM_PRE_INC,
+    ARM_POST_DEC,
+    ARM_PRE_DEC
+  };
+
+#define ARM_AUTOINC_VALID_FOR_MODE_P(mode, code) \
+  (TARGET_32BIT && arm_autoinc_modes_ok_p (mode, code))
+#define USE_LOAD_POST_INCREMENT(mode) \
+  ARM_AUTOINC_VALID_FOR_MODE_P(mode, ARM_POST_INC)
+#define USE_LOAD_PRE_INCREMENT(mode)  \
+  ARM_AUTOINC_VALID_FOR_MODE_P(mode, ARM_PRE_INC)
+#define USE_LOAD_POST_DECREMENT(mode) \
+  ARM_AUTOINC_VALID_FOR_MODE_P(mode, ARM_POST_DEC)
+#define USE_LOAD_PRE_DECREMENT(mode)  \
+  ARM_AUTOINC_VALID_FOR_MODE_P(mode, ARM_PRE_DEC)
+
+#define USE_STORE_PRE_DECREMENT(mode) USE_LOAD_PRE_DECREMENT(mode)
+#define USE_STORE_PRE_INCREMENT(mode) USE_LOAD_PRE_INCREMENT(mode)
+#define USE_STORE_POST_DECREMENT(mode) USE_LOAD_POST_DECREMENT(mode)
+#define USE_STORE_POST_INCREMENT(mode) USE_LOAD_POST_INCREMENT(mode)
+
 /* Macros to check register numbers against specific register classes.  */
 
 /* These assume that REGNO is a hard or pseudo reg number.
Index: gcc/config/arm/arm-protos.h
===================================================================
--- gcc/config/arm/arm-protos.h	(revision 187327)
+++ gcc/config/arm/arm-protos.h	(working copy)
@@ -250,4 +250,6 @@
 extern void arm_expand_vec_perm (rtx target, rtx op0, rtx op1, rtx sel);
 extern bool arm_expand_vec_perm_const (rtx target, rtx op0, rtx op1, rtx sel);
 
+extern bool arm_autoinc_modes_ok_p (enum machine_mode, enum arm_auto_incmodes);
+
 #endif /* ! GCC_ARM_PROTOS_H */
