public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend
@ 2014-03-29  1:27 Robert Suchanek
  2014-03-29 11:24 ` Richard Sandiford
  2014-03-29 14:08 ` [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend Jakub Jelinek
  0 siblings, 2 replies; 31+ messages in thread
From: Robert Suchanek @ 2014-03-29  1:27 UTC (permalink / raw)
  To: gcc-patches; +Cc: vmakarov, rdsandiford

Hi All,

This patch enables LRA by default for MIPS. The classic reload is still
available and can be enabled via -mreload switch. 

All regression are fixed, with one exception described below.

There was a necessary change in the LRA core as I believe there was a genuine 
unhandled case in LRA when processing addresses. It is specific to MIPS16
as store/load[unsigned] halfword/byte instructions cannot access the stack pointer
directly. Potentially, it can affect other architectures if they have similar
limitation. One of the problems showed an RTL that contained $frame as the base register
(without any offset, simple move) but LRA temporarily eliminated it to $sp before
calling the target hook to validate the address.
The backend rejected it because of the mode and $sp. Then, LRA tried to emit base+disp 
but ICEd because there never was any displacement. Another testcase, revealed offset 
not being used and unnecessary 'add' instructions were inserted preventing the use of offsets. 
Marking an insn with STACK_POINTER_REGNUM as valid was not an option as LRA would 
generate an insn with $sp and fail during coherency check. The patch attempts to 
reload $sp into a register and re-validate the address with offset (if there is one). 
If this fails it sticks to the original plan inserting base+disp.

The generated code optimized for size is fairly acceptable. CSiBE shows a slight 
advantage of LRA over the reload for MIPS16 with some minor regression for mips32*, 
mips64*, on average less than 0.5%. The code size improvements are being investigated.

The patch has been tested on the following variations:
- cross-tested mips-mti-elf, mips-mti-linux-gnu (languages=c,c++):
  {-mips32,-mips32r2}{-EL,-EB}{-mhard-float,-msoft-float}{-mno-mips16,-mips16}
  -mips64r2 -mabi=n32 {-mhard-float,-msoft-float}
  -mips64r2 -mabi=64 {-mhard-float,-msoft-float}
- bootstrapped and regtested x86_84-unknown-linux-gnu (all languages)

There are two known DejaGNU failures on mips64 with mabi=64, namely, m{add,sub}-8 tests
because of the subtleties in LRA costing model but it's not a correctness issue.
The *mul_{add,sub}_si patterns are tuned explicitly for LRA and all failures 
have been resolved with m{add,sub}-* except the above. By saying failures I mean 
the differences between tests ran with/without -mreload switch. A number of failures 
already existed on ToT at the time of testing.

The patch is intended for Stage 1. As for the legal part, the company-wide 
copyright assignment is in process.

Regards,
Robert

testsuite/ChangeLog:

2014-03-26  Robert Suchanek  <Robert.Suchanek@imgtec.com>

	* lra-constraints.c (base_to_reg): New function.
	(process_address): Use new function.
	* rtlanal.c (get_base_term): Add CONSTANT_P (*inner).

	* config/mips/constraints.md ("d"): BASE_REG_CLASS
	replaced by ADDR_REG_CLASS.
	* config/mips/mips.c (mips_regno_mode_ok_for_base_p):
	Remove use !strict_p for MIPS16.
	(mips_register_priority): New function that implements
	the target hook TARGET_REGISTER_PRIORITY.
	(mips_spill_class): Likewise for TARGET_SPILL_CLASS
	(mips_lra_p): Likewise for TARGET_LRA_P.
	* config/mips/mips.h (reg_class): Add M16F_REGS and SPILL_REGS
	classes.
	(REG_CLASS_NAMES): Likewise.
	(REG_CLASS_CONTENTS): Likewise.
	(BASE_REG_CLASS): Use M16F_REGS.
	(ADDR_REG_CLASS): Define.
	(IRA_HARD_REGNO_ADD_COST_MULTIPLIER): Define.
	* config/mips/mips.md (*mul_acc_si, *mul_sub_si): Add alternative
	tuned for LRA. New set attribute to enable alternatives
	depending on the register allocator used.
	(*and<mode>3_mips16): Remove the load alternatives.
	(*lea64): Disable pattern for MIPS16.
	* config/mips/mips.opt
	(mreload): New option.
---
 gcc/config/mips/constraints.md |    2 +-
 gcc/config/mips/mips.c         |   51 +++++++++++++++++-
 gcc/config/mips/mips.h         |   17 +++++-
 gcc/config/mips/mips.md        |  112 +++++++++++++++++++++++-----------------
 gcc/config/mips/mips.opt       |    4 ++
 gcc/lra-constraints.c          |   44 +++++++++++++++-
 gcc/rtlanal.c                  |    3 +-
 7 files changed, 181 insertions(+), 52 deletions(-)

diff --git gcc/config/mips/constraints.md gcc/config/mips/constraints.md
index 49e4895..3810ac3 100644
--- gcc/config/mips/constraints.md
+++ gcc/config/mips/constraints.md
@@ -19,7 +19,7 @@
 
 ;; Register constraints
 
-(define_register_constraint "d" "BASE_REG_CLASS"
+(define_register_constraint "d" "ADDR_REG_CLASS"
   "An address register.  This is equivalent to @code{r} unless
    generating MIPS16 code.")
 
diff --git gcc/config/mips/mips.c gcc/config/mips/mips.c
index 143169b..f27a801 100644
--- gcc/config/mips/mips.c
+++ gcc/config/mips/mips.c
@@ -2255,7 +2255,7 @@ mips_regno_mode_ok_for_base_p (int regno, enum machine_mode mode,
      All in all, it seems more consistent to only enforce this restriction
      during and after reload.  */
   if (TARGET_MIPS16 && regno == STACK_POINTER_REGNUM)
-    return !strict_p || GET_MODE_SIZE (mode) == 4 || GET_MODE_SIZE (mode) == 8;
+    return GET_MODE_SIZE (mode) == 4 || GET_MODE_SIZE (mode) == 8;
 
   return TARGET_MIPS16 ? M16_REG_P (regno) : GP_REG_P (regno);
 }
@@ -12114,6 +12114,32 @@ mips_register_move_cost (enum machine_mode mode,
   return 0;
 }
 
+/* Return a register priority for hard reg REGNO.  */
+
+static int
+mips_register_priority (int hard_regno)
+{
+  if (TARGET_MIPS16)
+   {
+     /* Treat MIPS16 registers with higher priority than other regs.  */
+     switch (hard_regno)
+       {
+       case 2:
+       case 3:
+       case 4:
+       case 5:
+       case 6:
+       case 7:
+       case 16:
+       case 17:
+         return 1;
+       default:
+         return 0;
+       }
+   }
+  return 0;
+}
+
 /* Implement TARGET_MEMORY_MOVE_COST.  */
 
 static int
@@ -18897,6 +18923,22 @@ mips_atomic_assign_expand_fenv (tree *hold, tree *clear, tree *update)
   *update = build2 (COMPOUND_EXPR, void_type_node, *update,
 		    atomic_feraiseexcept_call);
 }
+
+static reg_class_t
+mips_spill_class (reg_class_t rclass ATTRIBUTE_UNUSED, 
+		  enum machine_mode mode ATTRIBUTE_UNUSED)
+{
+  if (TARGET_MIPS16)
+    return SPILL_REGS;
+  return NO_REGS;
+}
+
+static bool
+mips_lra_p (void)
+{
+  return !TARGET_RELOAD;
+}
+
 

 /* Initialize the GCC target structure.  */
 #undef TARGET_ASM_ALIGNED_HI_OP
@@ -18960,6 +19002,8 @@ mips_atomic_assign_expand_fenv (tree *hold, tree *clear, tree *update)
 #define TARGET_VALID_POINTER_MODE mips_valid_pointer_mode
 #undef TARGET_REGISTER_MOVE_COST
 #define TARGET_REGISTER_MOVE_COST mips_register_move_cost
+#undef TARGET_REGISTER_PRIORITY
+#define TARGET_REGISTER_PRIORITY mips_register_priority
 #undef TARGET_MEMORY_MOVE_COST
 #define TARGET_MEMORY_MOVE_COST mips_memory_move_cost
 #undef TARGET_RTX_COSTS
@@ -19134,6 +19178,11 @@ mips_atomic_assign_expand_fenv (tree *hold, tree *clear, tree *update)
 #undef TARGET_ATOMIC_ASSIGN_EXPAND_FENV
 #define TARGET_ATOMIC_ASSIGN_EXPAND_FENV mips_atomic_assign_expand_fenv
 
+#undef TARGET_SPILL_CLASS
+#define TARGET_SPILL_CLASS mips_spill_class
+#undef TARGET_LRA_P
+#define TARGET_LRA_P mips_lra_p
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 

 #include "gt-mips.h"
diff --git gcc/config/mips/mips.h gcc/config/mips/mips.h
index a786d4c..712008f 100644
--- gcc/config/mips/mips.h
+++ gcc/config/mips/mips.h
@@ -1871,10 +1871,12 @@ enum reg_class
 {
   NO_REGS,			/* no registers in set */
   M16_REGS,			/* mips16 directly accessible registers */
+  M16F_REGS,			/* mips16 + frame */
   T_REG,			/* mips16 T register ($24) */
   M16_T_REGS,			/* mips16 registers plus T register */
   PIC_FN_ADDR_REG,		/* SVR4 PIC function address register */
   V1_REG,			/* Register $v1 ($3) used for TLS access.  */
+  SPILL_REGS,			/* All but $sp and call preserved regs are in here */
   LEA_REGS,			/* Every GPR except $25 */
   GR_REGS,			/* integer registers */
   FP_REGS,			/* floating point registers */
@@ -1908,10 +1910,12 @@ enum reg_class
 {									\
   "NO_REGS",								\
   "M16_REGS",								\
+  "M16F_REGS",								\
   "T_REG",								\
   "M16_T_REGS",								\
   "PIC_FN_ADDR_REG",							\
   "V1_REG",								\
+  "SPILL_REGS",								\
   "LEA_REGS",								\
   "GR_REGS",								\
   "FP_REGS",								\
@@ -1948,10 +1952,12 @@ enum reg_class
 {									                                \
   { 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000 },	/* NO_REGS */		\
   { 0x000300fc, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000 },	/* M16_REGS */		\
+  { 0x200300fc, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000 },	/* M16F_REGS */		\
   { 0x01000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000 },	/* T_REG */		\
   { 0x010300fc, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000 },	/* M16_T_REGS */	\
   { 0x02000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000 },	/* PIC_FN_ADDR_REG */	\
   { 0x00000008, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000 },	/* V1_REG */		\
+  { 0x0003fffc, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000 },	/* SPILL_REGS */      	\
   { 0xfdffffff, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000 },	/* LEA_REGS */		\
   { 0xffffffff, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000 },	/* GR_REGS */		\
   { 0x00000000, 0xffffffff, 0x00000000, 0x00000000, 0x00000000, 0x00000000 },	/* FP_REGS */		\
@@ -1984,7 +1990,9 @@ enum reg_class
    valid base register must belong.  A base register is one used in
    an address which is the register value plus a displacement.  */
 
-#define BASE_REG_CLASS  (TARGET_MIPS16 ? M16_REGS : GR_REGS)
+#define BASE_REG_CLASS  (TARGET_MIPS16 ? M16F_REGS : GR_REGS)
+
+#define ADDR_REG_CLASS  (TARGET_MIPS16 ? M16_REGS : GR_REGS)
 
 /* A macro whose definition is the name of the class to which a
    valid index register must belong.  An index register is one used
@@ -1994,6 +2002,13 @@ enum reg_class
 
 #define INDEX_REG_CLASS NO_REGS
 
+/* Add costs to hard registers based on frequency. This helps to negate
+   some of the reduced cost associated with argument registers which 
+   unfairly promotes their use and increases register pressure */
+#define IRA_HARD_REGNO_ADD_COST_MULTIPLIER(REGNO)                       \
+  (TARGET_MIPS16 && optimize_size                                       \
+   ? ((REGNO) >= 4 && (REGNO) <= 7 ? 2 : 0) : 0)
+
 /* We generally want to put call-clobbered registers ahead of
    call-saved ones.  (IRA expects this.)  */
 
diff --git gcc/config/mips/mips.md gcc/config/mips/mips.md
index 1e3e9e6..ababd5e 100644
--- gcc/config/mips/mips.md
+++ gcc/config/mips/mips.md
@@ -1622,40 +1622,66 @@
 ;; copy instructions.  Reload therefore thinks that the second alternative
 ;; is two reloads more costly than the first.  We add "*?*?" to the first
 ;; alternative as a counterweight.
+;;
+;; LRA simulates reload but the cost of reloading scratches is lower
+;; than of the classic reload. For the time being, removing the counterweight
+;; for LRA is more profitable.
 (define_insn "*mul_acc_si"
-  [(set (match_operand:SI 0 "register_operand" "=l*?*?,d?")
-	(plus:SI (mult:SI (match_operand:SI 1 "register_operand" "d,d")
-			  (match_operand:SI 2 "register_operand" "d,d"))
-		 (match_operand:SI 3 "register_operand" "0,d")))
-   (clobber (match_scratch:SI 4 "=X,l"))
-   (clobber (match_scratch:SI 5 "=X,&d"))]
+  [(set (match_operand:SI 0 "register_operand" "=l*?*?,l,d?")
+	(plus:SI (mult:SI (match_operand:SI 1 "register_operand" "d,d,d")
+			  (match_operand:SI 2 "register_operand" "d,d,d"))
+		 (match_operand:SI 3 "register_operand" "0,0,d")))
+   (clobber (match_scratch:SI 4 "=X,X,l"))
+   (clobber (match_scratch:SI 5 "=X,X,&d"))]
   "GENERATE_MADD_MSUB && !TARGET_MIPS16"
   "@
     madd\t%1,%2
+    madd\t%1,%2
     #"
   [(set_attr "type"	"imadd")
    (set_attr "accum_in"	"3")
    (set_attr "mode"	"SI")
-   (set_attr "insn_count" "1,2")])
+   (set_attr "insn_count" "1,1,2")
+   (set (attr "enabled")
+        (cond [(and (eq_attr "alternative" "0")
+                    (match_test "TARGET_RELOAD"))
+                  (const_string "yes")
+               (and (eq_attr "alternative" "1")
+                    (match_test "!TARGET_RELOAD"))
+                  (const_string "yes")
+               (eq_attr "alternative" "2")
+                  (const_string "yes")]
+              (const_string "no")))])
 
 ;; The same idea applies here.  The middle alternative needs one less
 ;; clobber than the final alternative, so we add "*?" as a counterweight.
 (define_insn "*mul_acc_si_r3900"
-  [(set (match_operand:SI 0 "register_operand" "=l*?*?,d*?,d?")
-	(plus:SI (mult:SI (match_operand:SI 1 "register_operand" "d,d,d")
-			  (match_operand:SI 2 "register_operand" "d,d,d"))
-		 (match_operand:SI 3 "register_operand" "0,l,d")))
-   (clobber (match_scratch:SI 4 "=X,3,l"))
-   (clobber (match_scratch:SI 5 "=X,X,&d"))]
+  [(set (match_operand:SI 0 "register_operand" "=l*?*?,l,d*?,d?")
+	(plus:SI (mult:SI (match_operand:SI 1 "register_operand" "d,d,d,d")
+			  (match_operand:SI 2 "register_operand" "d,d,d,d"))
+		 (match_operand:SI 3 "register_operand" "0,0,l,d")))
+   (clobber (match_scratch:SI 4 "=X,X,3,l"))
+   (clobber (match_scratch:SI 5 "=X,X,X,&d"))]
   "TARGET_MIPS3900 && !TARGET_MIPS16"
   "@
     madd\t%1,%2
+    madd\t%1,%2
     madd\t%0,%1,%2
     #"
   [(set_attr "type"	"imadd")
    (set_attr "accum_in"	"3")
    (set_attr "mode"	"SI")
-   (set_attr "insn_count" "1,1,2")])
+   (set_attr "insn_count" "1,1,1,2")
+   (set (attr "enabled")
+        (cond [(and (eq_attr "alternative" "0")
+                    (match_test "TARGET_RELOAD"))
+                  (const_string "yes")
+               (and (eq_attr "alternative" "1")
+                    (match_test "!TARGET_RELOAD"))
+                  (const_string "yes")
+               (eq_attr "alternative" "2,3")
+                  (const_string "yes")]
+              (const_string "no")))])
 
 ;; Split *mul_acc_si if both the source and destination accumulator
 ;; values are GPRs.
@@ -1859,20 +1885,31 @@
 
 ;; See the comment above *mul_add_si for details.
 (define_insn "*mul_sub_si"
-  [(set (match_operand:SI 0 "register_operand" "=l*?*?,d?")
-        (minus:SI (match_operand:SI 1 "register_operand" "0,d")
-                  (mult:SI (match_operand:SI 2 "register_operand" "d,d")
-                           (match_operand:SI 3 "register_operand" "d,d"))))
-   (clobber (match_scratch:SI 4 "=X,l"))
-   (clobber (match_scratch:SI 5 "=X,&d"))]
+  [(set (match_operand:SI 0 "register_operand" "=l*?*?,l,d?")
+        (minus:SI (match_operand:SI 1 "register_operand" "0,0,d")
+                  (mult:SI (match_operand:SI 2 "register_operand" "d,d,d")
+                           (match_operand:SI 3 "register_operand" "d,d,d"))))
+   (clobber (match_scratch:SI 4 "=X,X,l"))
+   (clobber (match_scratch:SI 5 "=X,X,&d"))]
   "GENERATE_MADD_MSUB"
   "@
    msub\t%2,%3
+   msub\t%2,%3
    #"
   [(set_attr "type"     "imadd")
    (set_attr "accum_in"	"1")
    (set_attr "mode"     "SI")
-   (set_attr "insn_count" "1,2")])
+   (set_attr "insn_count" "1,1,2")
+   (set (attr "enabled")
+        (cond [(and (eq_attr "alternative" "0")
+                    (match_test "TARGET_RELOAD"))
+                  (const_string "yes")
+               (and (eq_attr "alternative" "1")
+                    (match_test "!TARGET_RELOAD"))
+                  (const_string "yes")
+               (eq_attr "alternative" "2")
+                  (const_string "yes")]
+              (const_string "no")))])
 
 ;; Split *mul_sub_si if both the source and destination accumulator
 ;; values are GPRs.
@@ -2986,31 +3023,14 @@
    (set_attr "mode" "<MODE>")])
 
 (define_insn "*and<mode>3_mips16"
-  [(set (match_operand:GPR 0 "register_operand" "=d,d,d,d,d")
-	(and:GPR (match_operand:GPR 1 "nonimmediate_operand" "%W,W,W,d,0")
-		 (match_operand:GPR 2 "and_operand" "Yb,Yh,Yw,Yw,d")))]
+  [(set (match_operand:GPR 0 "register_operand" "=d,d")
+	(and:GPR (match_operand:GPR 1 "register_operand" "%d,0")
+		 (match_operand:GPR 2 "and_operand" "Yw,d")))]
   "TARGET_MIPS16 && and_operands_ok (<MODE>mode, operands[1], operands[2])"
-{
-  switch (which_alternative)
-    {
-    case 0:
-      operands[1] = gen_lowpart (QImode, operands[1]);
-      return "lbu\t%0,%1";
-    case 1:
-      operands[1] = gen_lowpart (HImode, operands[1]);
-      return "lhu\t%0,%1";
-    case 2:
-      operands[1] = gen_lowpart (SImode, operands[1]);
-      return "lwu\t%0,%1";
-    case 3:
-      return "#";
-    case 4:
-      return "and\t%0,%2";
-    default:
-      gcc_unreachable ();
-    }
-}
-  [(set_attr "move_type" "load,load,load,shift_shift,logical")
+  "@
+   #
+   and\t%0,%2"
+  [(set_attr "move_type" "shift_shift,logical")
    (set_attr "mode" "<MODE>")])
 
 (define_expand "ior<mode>3"
@@ -4139,7 +4159,7 @@
   [(set (match_operand:DI 0 "register_operand" "=d")
 	(match_operand:DI 1 "absolute_symbolic_operand" ""))
    (clobber (match_scratch:DI 2 "=&d"))]
-  "TARGET_EXPLICIT_RELOCS && ABI_HAS_64BIT_SYMBOLS && cse_not_expected"
+  "!TARGET_MIPS16 && TARGET_EXPLICIT_RELOCS && ABI_HAS_64BIT_SYMBOLS && cse_not_expected"
   "#"
   "&& reload_completed"
   [(set (match_dup 0) (high:DI (match_dup 3)))
diff --git gcc/config/mips/mips.opt gcc/config/mips/mips.opt
index 6ee5398..a044031 100644
--- gcc/config/mips/mips.opt
+++ gcc/config/mips/mips.opt
@@ -380,6 +380,10 @@ msynci
 Target Report Mask(SYNCI)
 Use synci instruction to invalidate i-cache
 
+mreload
+Target Report Var(TARGET_RELOAD)
+Use reload instead of lra
+
 mtune=
 Target RejectNegative Joined Var(mips_tune_option) ToLower Enum(mips_arch_opt_value)
 -mtune=PROCESSOR	Optimize the output for PROCESSOR
diff --git gcc/lra-constraints.c gcc/lra-constraints.c
index ba4d489..ab85495 100644
--- gcc/lra-constraints.c
+++ gcc/lra-constraints.c
@@ -2615,6 +2615,39 @@ valid_address_p (struct address_info *ad)
   return ok_p;
 }
 
+/* Make reload base reg from address AD.  */
+static rtx
+base_to_reg (struct address_info *ad)
+{
+  enum reg_class cl;
+  int code = -1;
+  rtx new_inner = NULL_RTX;
+  rtx new_reg = NULL_RTX;
+  rtx insn;
+  rtx last_insn = get_last_insn();
+
+  lra_assert (ad->base == ad->base_term && ad->disp == ad->disp_term);
+  cl = base_reg_class (ad->mode, ad->as, ad->base_outer_code,
+                       get_index_code (ad));
+  new_reg = lra_create_new_reg (GET_MODE (*ad->base_term), NULL_RTX,
+                                cl, "base");
+  new_inner = simplify_gen_binary (PLUS, GET_MODE (new_reg), new_reg,
+                                   ad->disp_term == NULL
+                                   ? gen_int_mode (0, ad->mode) 
+                                   : *ad->disp_term);
+  if (!valid_address_p (ad->mode, new_inner, ad->as))
+    return NULL_RTX;
+  insn = emit_insn (gen_rtx_SET (ad->mode, new_reg, *ad->base_term));
+  code = recog_memoized (insn);
+  if (code < 0)
+    {
+      delete_insns_since (last_insn);
+      return NULL_RTX;
+    }
+  
+  return new_inner;
+}
+
 /* Make reload base reg + disp from address AD.  Return the new pseudo.  */
 static rtx
 base_plus_disp_to_reg (struct address_info *ad)
@@ -2818,6 +2851,8 @@ process_address (int nop, rtx *before, rtx *after)
 
      3) the address is a frame address with an invalid offset.
 
+     4) the address is a frame address with an invalid base.
+
      All these cases involve a non-autoinc address, so there is no
      point revalidating other types.  */
   if (ad.autoinc_p || valid_address_p (&ad))
@@ -2899,14 +2934,19 @@ process_address (int nop, rtx *before, rtx *after)
       int regno;
       enum reg_class cl;
       rtx set, insns, last_insn;
+      /* Try to reload base into register only if the base is invalid 
+         for the address but with valid offset, case (4) above.  */
+      start_sequence ();
+      new_reg = base_to_reg (&ad);
+
       /* base + disp => new base, cases (1) and (3) above.  */
       /* Another option would be to reload the displacement into an
 	 index register.  However, postreload has code to optimize
 	 address reloads that have the same base and different
 	 displacements, so reloading into an index register would
 	 not necessarily be a win.  */
-      start_sequence ();
-      new_reg = base_plus_disp_to_reg (&ad);
+      if (new_reg == NULL_RTX)
+        new_reg = base_plus_disp_to_reg (&ad);
       insns = get_insns ();
       last_insn = get_last_insn ();
       /* If we generated at least two insns, try last insn source as
diff --git gcc/rtlanal.c gcc/rtlanal.c
index 7a9efec..f699d17 100644
--- gcc/rtlanal.c
+++ gcc/rtlanal.c
@@ -5577,7 +5577,8 @@ get_base_term (rtx *inner)
     inner = strip_address_mutations (&XEXP (*inner, 0));
   if (REG_P (*inner)
       || MEM_P (*inner)
-      || GET_CODE (*inner) == SUBREG)
+      || GET_CODE (*inner) == SUBREG
+      || CONSTANT_P (*inner))
     return inner;
   return 0;
 }
-- 
1.7.9.5

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend
  2014-03-29  1:27 [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend Robert Suchanek
@ 2014-03-29 11:24 ` Richard Sandiford
  2014-04-09 10:00   ` Robert Suchanek
  2014-03-29 14:08 ` [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend Jakub Jelinek
  1 sibling, 1 reply; 31+ messages in thread
From: Richard Sandiford @ 2014-03-29 11:24 UTC (permalink / raw)
  To: Robert Suchanek; +Cc: gcc-patches, vmakarov

First of all, thanks a lot for doing this.

Robert Suchanek <Robert.Suchanek@imgtec.com> writes:
> diff --git gcc/config/mips/mips.c gcc/config/mips/mips.c
> index 143169b..f27a801 100644
> --- gcc/config/mips/mips.c
> +++ gcc/config/mips/mips.c
> @@ -2255,7 +2255,7 @@ mips_regno_mode_ok_for_base_p (int regno, enum machine_mode mode,
>       All in all, it seems more consistent to only enforce this restriction
>       during and after reload.  */
>    if (TARGET_MIPS16 && regno == STACK_POINTER_REGNUM)
> -    return !strict_p || GET_MODE_SIZE (mode) == 4 || GET_MODE_SIZE (mode) == 8;
> +    return GET_MODE_SIZE (mode) == 4 || GET_MODE_SIZE (mode) == 8;
>  
>    return TARGET_MIPS16 ? M16_REG_P (regno) : GP_REG_P (regno);
>  }

Not sure about this one.  We would need to update the comment that
explains why "!strict_p" is there, but AFAIK reason (1) would still apply.

Was this needed for correctness or because it gave better code?

> +/* Return a register priority for hard reg REGNO.  */
> +
> +static int
> +mips_register_priority (int hard_regno)
> +{
> +  if (TARGET_MIPS16)
> +   {
> +     /* Treat MIPS16 registers with higher priority than other regs.  */
> +     switch (hard_regno)
> +       {
> +       case 2:
> +       case 3:
> +       case 4:
> +       case 5:
> +       case 6:
> +       case 7:
> +       case 16:
> +       case 17:
> +         return 1;
> +       default:
> +         return 0;
> +       }
> +   }
> +  return 0;
> +}
> +

Easier as:

  if (TARGET_MIPS16
      && TEST_HARD_REG_BIT (reg_class_contents[M16_REGS], hard_regno))
    return 1;
  return 0;

> diff --git gcc/config/mips/mips.h gcc/config/mips/mips.h
> index a786d4c..712008f 100644
> --- gcc/config/mips/mips.h
> +++ gcc/config/mips/mips.h
> @@ -1871,10 +1871,12 @@ enum reg_class
>  {
>    NO_REGS,			/* no registers in set */
>    M16_REGS,			/* mips16 directly accessible registers */
> +  M16F_REGS,			/* mips16 + frame */

Constraints are supposed to be operating on real registers, after
elimination, so it seems odd to include a fake register.  What went
wrong with just M16_REGS?

> +  SPILL_REGS,			/* All but $sp and call preserved regs are in here */
...
> +  { 0x0003fffc, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000 },	/* SPILL_REGS */      	\

These two don't seem to match.  I think literally it would be 0x0300fffc,
but maybe you had to make SPILL_REGS a superset of M16_REGs?

> +/* Add costs to hard registers based on frequency. This helps to negate
> +   some of the reduced cost associated with argument registers which 
> +   unfairly promotes their use and increases register pressure */
> +#define IRA_HARD_REGNO_ADD_COST_MULTIPLIER(REGNO)                       \
> +  (TARGET_MIPS16 && optimize_size                                       \
> +   ? ((REGNO) >= 4 && (REGNO) <= 7 ? 2 : 0) : 0)

So we would be trying to use, say, $4 for the first incoming argument
even after it had been spilled?  Hmm.

Since this change is aimed specifically at one heuristic, I wonder
whether we should parameterise that heuristic somehow rather than
try to use a general hook to undo it.  But I don't think there's
anything particularly special about MIPS16 here, so maybe it's too
eager for all targets.

> diff --git gcc/config/mips/mips.md gcc/config/mips/mips.md
> index 1e3e9e6..ababd5e 100644
> --- gcc/config/mips/mips.md
> +++ gcc/config/mips/mips.md
> @@ -1622,40 +1622,66 @@
>  ;; copy instructions.  Reload therefore thinks that the second alternative
>  ;; is two reloads more costly than the first.  We add "*?*?" to the first
>  ;; alternative as a counterweight.
> +;;
> +;; LRA simulates reload but the cost of reloading scratches is lower
> +;; than of the classic reload. For the time being, removing the counterweight
> +;; for LRA is more profitable.
>  (define_insn "*mul_acc_si"
> -  [(set (match_operand:SI 0 "register_operand" "=l*?*?,d?")
> -	(plus:SI (mult:SI (match_operand:SI 1 "register_operand" "d,d")
> -			  (match_operand:SI 2 "register_operand" "d,d"))
> -		 (match_operand:SI 3 "register_operand" "0,d")))
> -   (clobber (match_scratch:SI 4 "=X,l"))
> -   (clobber (match_scratch:SI 5 "=X,&d"))]
> +  [(set (match_operand:SI 0 "register_operand" "=l*?*?,l,d?")
> +	(plus:SI (mult:SI (match_operand:SI 1 "register_operand" "d,d,d")
> +			  (match_operand:SI 2 "register_operand" "d,d,d"))
> +		 (match_operand:SI 3 "register_operand" "0,0,d")))
> +   (clobber (match_scratch:SI 4 "=X,X,l"))
> +   (clobber (match_scratch:SI 5 "=X,X,&d"))]
>    "GENERATE_MADD_MSUB && !TARGET_MIPS16"
>    "@
>      madd\t%1,%2
> +    madd\t%1,%2
>      #"
>    [(set_attr "type"	"imadd")
>     (set_attr "accum_in"	"3")
>     (set_attr "mode"	"SI")
> -   (set_attr "insn_count" "1,2")])
> +   (set_attr "insn_count" "1,1,2")
> +   (set (attr "enabled")
> +        (cond [(and (eq_attr "alternative" "0")
> +                    (match_test "TARGET_RELOAD"))
> +                  (const_string "yes")
> +               (and (eq_attr "alternative" "1")
> +                    (match_test "!TARGET_RELOAD"))
> +                  (const_string "yes")
> +               (eq_attr "alternative" "2")
> +                  (const_string "yes")]
> +              (const_string "no")))])

Looks good.  Glad to see the operand 0 hack going away. :-)

> @@ -2986,31 +3023,14 @@
>     (set_attr "mode" "<MODE>")])
>  
>  (define_insn "*and<mode>3_mips16"
> -  [(set (match_operand:GPR 0 "register_operand" "=d,d,d,d,d")
> -	(and:GPR (match_operand:GPR 1 "nonimmediate_operand" "%W,W,W,d,0")
> -		 (match_operand:GPR 2 "and_operand" "Yb,Yh,Yw,Yw,d")))]
> +  [(set (match_operand:GPR 0 "register_operand" "=d,d")
> +	(and:GPR (match_operand:GPR 1 "register_operand" "%d,0")
> +		 (match_operand:GPR 2 "and_operand" "Yw,d")))]
>    "TARGET_MIPS16 && and_operands_ok (<MODE>mode, operands[1], operands[2])"
> -{
> -  switch (which_alternative)
> -    {
> -    case 0:
> -      operands[1] = gen_lowpart (QImode, operands[1]);
> -      return "lbu\t%0,%1";
> -    case 1:
> -      operands[1] = gen_lowpart (HImode, operands[1]);
> -      return "lhu\t%0,%1";
> -    case 2:
> -      operands[1] = gen_lowpart (SImode, operands[1]);
> -      return "lwu\t%0,%1";
> -    case 3:
> -      return "#";
> -    case 4:
> -      return "and\t%0,%2";
> -    default:
> -      gcc_unreachable ();
> -    }
> -}
> -  [(set_attr "move_type" "load,load,load,shift_shift,logical")
> +  "@
> +   #
> +   and\t%0,%2"
> +  [(set_attr "move_type" "shift_shift,logical")
>     (set_attr "mode" "<MODE>")])

I think we want to keep the LWU case at the very least, since I assume
otherwise we'll load the full 64-bit spill slot and use a pair of shifts
to zero-extend it.  Reloading the stack address into a base register
should always be better than that.

I agree it's less clear for the byte and halfword cases.  All other
things -- including size -- being equal, reloading a stack address into
a base register and using an extending load should be better than
reloading the full spill slot and extending it, since the reloaded stack
address is more likely to be reused in other instructions.

The original MIPS16 didn't have ZEB and ZEH, so on the basis above,
I think using LBU and LHU there was probably a win.  I can imagine
it making sense to disable LBU and LHU for MIPS16e though.

It might be better to do any changes to this pattern as a follow-on
patch, since I think LRA should cope with it as-is.

> @@ -4139,7 +4159,7 @@
>    [(set (match_operand:DI 0 "register_operand" "=d")
>  	(match_operand:DI 1 "absolute_symbolic_operand" ""))
>     (clobber (match_scratch:DI 2 "=&d"))]
> -  "TARGET_EXPLICIT_RELOCS && ABI_HAS_64BIT_SYMBOLS && cse_not_expected"
> +  "!TARGET_MIPS16 && TARGET_EXPLICIT_RELOCS && ABI_HAS_64BIT_SYMBOLS && cse_not_expected"
>    "#"
>    "&& reload_completed"
>    [(set (match_dup 0) (high:DI (match_dup 3)))

Minor nit, but please keep within 80 chars:

  "!TARGET_MIPS16
   && TARGET_EXPLICIT_RELOCS
   && ABI_HAS_64BIT_SYMBOLS
   && cse_not_expected"

> diff --git gcc/lra-constraints.c gcc/lra-constraints.c
> index ba4d489..ab85495 100644
> --- gcc/lra-constraints.c
> +++ gcc/lra-constraints.c
> @@ -2615,6 +2615,39 @@ valid_address_p (struct address_info *ad)
>    return ok_p;
>  }
>  
> +/* Make reload base reg from address AD.  */
> +static rtx
> +base_to_reg (struct address_info *ad)
> +{
> +  enum reg_class cl;
> +  int code = -1;
> +  rtx new_inner = NULL_RTX;
> +  rtx new_reg = NULL_RTX;
> +  rtx insn;
> +  rtx last_insn = get_last_insn();
> +
> +  lra_assert (ad->base == ad->base_term && ad->disp == ad->disp_term);
> +  cl = base_reg_class (ad->mode, ad->as, ad->base_outer_code,
> +                       get_index_code (ad));
> +  new_reg = lra_create_new_reg (GET_MODE (*ad->base_term), NULL_RTX,
> +                                cl, "base");
> +  new_inner = simplify_gen_binary (PLUS, GET_MODE (new_reg), new_reg,
> +                                   ad->disp_term == NULL
> +                                   ? gen_int_mode (0, ad->mode) 
> +                                   : *ad->disp_term);
> +  if (!valid_address_p (ad->mode, new_inner, ad->as))
> +    return NULL_RTX;
> +  insn = emit_insn (gen_rtx_SET (ad->mode, new_reg, *ad->base_term));
> +  code = recog_memoized (insn);
> +  if (code < 0)
> +    {
> +      delete_insns_since (last_insn);
> +      return NULL_RTX;
> +    }
> +  
> +  return new_inner;
> +}
> +
>  /* Make reload base reg + disp from address AD.  Return the new pseudo.  */
>  static rtx
>  base_plus_disp_to_reg (struct address_info *ad)
> @@ -2818,6 +2851,8 @@ process_address (int nop, rtx *before, rtx *after)
>  
>       3) the address is a frame address with an invalid offset.
>  
> +     4) the address is a frame address with an invalid base.
> +
>       All these cases involve a non-autoinc address, so there is no
>       point revalidating other types.  */
>    if (ad.autoinc_p || valid_address_p (&ad))
> @@ -2899,14 +2934,19 @@ process_address (int nop, rtx *before, rtx *after)
>        int regno;
>        enum reg_class cl;
>        rtx set, insns, last_insn;
> +      /* Try to reload base into register only if the base is invalid 
> +         for the address but with valid offset, case (4) above.  */
> +      start_sequence ();
> +      new_reg = base_to_reg (&ad);
> +
>        /* base + disp => new base, cases (1) and (3) above.  */
>        /* Another option would be to reload the displacement into an
>  	 index register.  However, postreload has code to optimize
>  	 address reloads that have the same base and different
>  	 displacements, so reloading into an index register would
>  	 not necessarily be a win.  */
> -      start_sequence ();
> -      new_reg = base_plus_disp_to_reg (&ad);
> +      if (new_reg == NULL_RTX)
> +        new_reg = base_plus_disp_to_reg (&ad);
>        insns = get_insns ();
>        last_insn = get_last_insn ();
>        /* If we generated at least two insns, try last insn source as

Obviously this is Vlad's call, but the idea looks good to me.
Speculatively calling lra_create_new_reg for cases where (4) doesn't
end up applying is probably too wasteful though.

> diff --git gcc/rtlanal.c gcc/rtlanal.c
> index 7a9efec..f699d17 100644
> --- gcc/rtlanal.c
> +++ gcc/rtlanal.c
> @@ -5577,7 +5577,8 @@ get_base_term (rtx *inner)
>      inner = strip_address_mutations (&XEXP (*inner, 0));
>    if (REG_P (*inner)
>        || MEM_P (*inner)
> -      || GET_CODE (*inner) == SUBREG)
> +      || GET_CODE (*inner) == SUBREG
> +      || CONSTANT_P (*inner))
>      return inner;
>    return 0;
>  }

This doesn't look right; the general form is base+index+displacement+segment,
with the constant term being treated as the displacement.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend
  2014-03-29  1:27 [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend Robert Suchanek
  2014-03-29 11:24 ` Richard Sandiford
@ 2014-03-29 14:08 ` Jakub Jelinek
  1 sibling, 0 replies; 31+ messages in thread
From: Jakub Jelinek @ 2014-03-29 14:08 UTC (permalink / raw)
  To: Robert Suchanek; +Cc: gcc-patches, vmakarov, rdsandiford

On Sat, Mar 29, 2014 at 01:07:40AM +0000, Robert Suchanek wrote:
> This patch enables LRA by default for MIPS. The classic reload is still
> available and can be enabled via -mreload switch. 

FYI, all other targets that have LRA optionally selectable or deselectable
use -mno-lra for this (even when -mlra is the default), it would be better
for consistency not to invent new switch names for that.

	Jakub

^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend
  2014-03-29 11:24 ` Richard Sandiford
@ 2014-04-09 10:00   ` Robert Suchanek
  2014-04-09 21:24     ` Richard Sandiford
  0 siblings, 1 reply; 31+ messages in thread
From: Robert Suchanek @ 2014-04-09 10:00 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: gcc-patches, vmakarov

> FYI, all other targets that have LRA optionally selectable or deselectable
> use -mno-lra for this (even when -mlra is the default), it would be better
> for consistency not to invent new switch names for that.

Agreed.

>> -    return !strict_p || GET_MODE_SIZE (mode) == 4 || GET_MODE_SIZE (mode) == 8;
>> +    return GET_MODE_SIZE (mode) == 4 || GET_MODE_SIZE (mode) == 8;
>>  
>>    return TARGET_MIPS16 ? M16_REG_P (regno) : GP_REG_P (regno);
>>  }
> Not sure about this one.  We would need to update the comment that
> explains why "!strict_p" is there, but AFAIK reason (1) would still apply.
>
> Was this needed for correctness or because it gave better code?

"!strict_p" has been removed because of correctness issue. When LRA validates 
memory addresses pseudos are temporarily eliminated to hard registers (if possible)
but the target hook is always called as non-strict. This only affects MIPS16 instructions with
not directly accessible $sp. The strict variant, as I understand, was used in the reload
pass to indicate if a pseudo-register has been allocated a hard register. Unless LRA
should be setting the strict/non-strict depending on whether a temporal elimination
to hard reg was successful or there is something else that I missed? 

> Easier as:
>
>  if (TARGET_MIPS16
>      && TEST_HARD_REG_BIT (reg_class_contents[M16_REGS], hard_regno))
>    return 1;
>  return 0;

Indeed. 

>> +  M16F_REGS,			/* mips16 + frame */
>
> Constraints are supposed to be operating on real registers, after
> elimination, so it seems odd to include a fake register.  What went
> wrong with just M16_REGS?

Only the stack pointer has been added to M16_REGS. A number of patterns need to accept
it otherwise LRA inserts a lot of reloads and the code size goes up by about 10%. 
The change does have also a positive effect on reload but marginally.
"frame" meant to indicate inclusion of both the stack and hard frame pointers in the class
but perhaps I should name it differently to avoid confusion.

>> +  SPILL_REGS,			/* All but $sp and call preserved regs are in here */
>...
>> +  { 0x0003fffc, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000 },	/* SPILL_REGS */      	\
>
> These two don't seem to match.  I think literally it would be 0x0300fffc,
> but maybe you had to make SPILL_REGS a superset of M16_REGs?

I initially used 0x0300fffc but did some experiments and it turned out that 0x0003fffc (with $16, $17 regs)
gives slightly better code. I haven't updated the comment though. There is yet more to do 
and need to return to another thread with MIPS16 at some point as I found some limitations
of IRA/LRA to generate better code. $8-$15 are currently inaccessible as temporary storage
because these registers are marked as fixed (when optimizing for size) but leaving them
as fixed are better for the code size. I don't expect a big gain by using hard registers
for spilling but it more likely to improve the performance.

>> +/* Add costs to hard registers based on frequency. This helps to negate
>> +   some of the reduced cost associated with argument registers which 
>> +   unfairly promotes their use and increases register pressure */
>> +#define IRA_HARD_REGNO_ADD_COST_MULTIPLIER(REGNO)                       \
>> +  (TARGET_MIPS16 && optimize_size                                       \
>> +   ? ((REGNO) >= 4 && (REGNO) <= 7 ? 2 : 0) : 0)
>
> So we would be trying to use, say, $4 for the first incoming argument
> even after it had been spilled?  Hmm.
>
> Since this change is aimed specifically at one heuristic, I wonder
> whether we should parameterise that heuristic somehow rather than
> try to use a general hook to undo it.  But I don't think there's
> anything particularly special about MIPS16 here, so maybe it's too
> eager for all targets.

In a number of cases argument registers appeared to be unfairly promoted
increasing the register pressure and increasing the number of reloads.
Bumping up the cost of using those registers encourages IRA to spill
into memory but this appears to help LRA to do a better allocation. Of course,
not always it is a win but generally the gain outweighs the loss. 

I've seen an codesize improvements for other optimization levels
but I'm unsure whether we should make this change generic.
This part of the patch is not crucial though and can be send separately. 

>>  (define_insn "*and<mode>3_mips16"
>> ...
>
> I think we want to keep the LWU case at the very least, since I assume
> otherwise we'll load the full 64-bit spill slot and use a pair of shifts
> to zero-extend it.  Reloading the stack address into a base register
> should always be better than that.
>
> I agree it's less clear for the byte and halfword cases.  All other
> things -- including size -- being equal, reloading a stack address into
> a base register and using an extending load should be better than
> reloading the full spill slot and extending it, since the reloaded stack
> address is more likely to be reused in other instructions.
>
> The original MIPS16 didn't have ZEB and ZEH, so on the basis above,
> I think using LBU and LHU there was probably a win.  I can imagine
> it making sense to disable LBU and LHU for MIPS16e though.
>
> It might be better to do any changes to this pattern as a follow-on
> patch, since I think LRA should cope with it as-is.

Keeping LWU case should be fine. Alternatives were removed because
LRA was ICEing for the same reason of $sp not accessible for the byte 
and halfword cases. I tried to find a testcase to trigger LWU case 
but couldn't. I presume it must be a rare case? The problem with other
alternatives was that spilled pseudos were changed to memory without
checking if the use of $sp is valid. On the other hand, keeping LWU 
may only delay triggering the problem because the stack pointer 
should not be used. It could be a missing case in LRA too.

> Minor nit, but please keep within 80 chars:
>
>  "!TARGET_MIPS16
>   && TARGET_EXPLICIT_RELOCS
>   && ABI_HAS_64BIT_SYMBOLS
>   && cse_not_expected"

Overlooked it.

>> diff --git gcc/rtlanal.c gcc/rtlanal.c
>> index 7a9efec..f699d17 100644
>> --- gcc/rtlanal.c
>> +++ gcc/rtlanal.c
>> @@ -5577,7 +5577,8 @@ get_base_term (rtx *inner)
>>      inner = strip_address_mutations (&XEXP (*inner, 0));
>>    if (REG_P (*inner)
>>        || MEM_P (*inner)
>> -      || GET_CODE (*inner) == SUBREG)
>> +      || GET_CODE (*inner) == SUBREG
>> +      || CONSTANT_P (*inner))
>>      return inner;
>>    return 0;
>>  }
> 
> This doesn't look right; the general form is base+index+displacement+segment,
> with the constant term being treated as the displacement.

I'm not particularly happy with this either. This was an attempt to fix an ICE for 
the following RTL (gcc.dg/torture/asm-subreg-1.c compiled with -mips32r2 -mips16 -O1):

(insn 9 8 0 2 (asm_operands/v ("") ("") 0 [
            (mem/v/j/c:HI (lo_sum:SI (const:SI (unspec:SI [
                                (const_int 0 [0])
                            ] UNSPEC_GP))
                    (symbol_ref:SI ("_const_32") [flags 0x6]  <var_decl 0x7f50acd17558 _const_32>)) [0 _const_32+0 S2 A32])]
 [(asm_input:HI ("X") (null):0)]
 [] asm-subreg-1.c:13) asm-subreg-1.c:13 -1 (nil))

Any suggestions to handle this case?

Regards,
Robert

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend
  2014-04-09 10:00   ` Robert Suchanek
@ 2014-04-09 21:24     ` Richard Sandiford
  2014-04-10 20:29       ` Richard Sandiford
  2014-04-14 11:13       ` Robert Suchanek
  0 siblings, 2 replies; 31+ messages in thread
From: Richard Sandiford @ 2014-04-09 21:24 UTC (permalink / raw)
  To: Robert Suchanek; +Cc: gcc-patches, vmakarov

Robert Suchanek <Robert.Suchanek@imgtec.com> writes:
>> FYI, all other targets that have LRA optionally selectable or deselectable
>> use -mno-lra for this (even when -mlra is the default), it would be better
>> for consistency not to invent new switch names for that.
>
> Agreed.
>
>>> -    return !strict_p || GET_MODE_SIZE (mode) == 4 || GET_MODE_SIZE (mode) == 8;
>>> +    return GET_MODE_SIZE (mode) == 4 || GET_MODE_SIZE (mode) == 8;
>>>  
>>>    return TARGET_MIPS16 ? M16_REG_P (regno) : GP_REG_P (regno);
>>>  }
>> Not sure about this one.  We would need to update the comment that
>> explains why "!strict_p" is there, but AFAIK reason (1) would still apply.
>>
>> Was this needed for correctness or because it gave better code?
>
> "!strict_p" has been removed because of correctness issue. When LRA
> validates memory addresses pseudos are temporarily eliminated to hard
> registers (if possible) but the target hook is always called as
> non-strict. This only affects MIPS16 instructions with not directly
> accessible $sp. The strict variant, as I understand, was used in the
> reload pass to indicate if a pseudo-register has been allocated a hard
> register. Unless LRA should be setting the strict/non-strict depending
> on whether a temporal elimination to hard reg was successful or there
> is something else that I missed?

Hmm, OK, in that case I agree reason (2) doesn't apply.  That part was
always more of a consistency thing anyway, so I agree it's not worth
keeping around for reload.  I also had a look to see why
instantiate_virtual_regs_in_insn didn't complain about cases like:

  struct s { unsigned char c; };
  void foo (int, int, int, int, struct s);
  void bar (struct s *ptr) { foo (1, 2, 3, 4, *ptr); }

and I think it's because of the later:

2008-02-14  Michael Matz  <matz@suse.de>

	PR target/34930
	* function.c (instantiate_virtual_regs_in_insn): Reload address
	before falling back to reloading the whole operand.

which correctly reloads the address if necessary.

So yeah, I agree this is right after all, sorry.  Let's delete the
comment starting at "There are two problems here:" at the same time.

>>> +  M16F_REGS,			/* mips16 + frame */
>>
>> Constraints are supposed to be operating on real registers, after
>> elimination, so it seems odd to include a fake register.  What went
>> wrong with just M16_REGS?
>
> Only the stack pointer has been added to M16_REGS.

Sorry, I'd read "frame" as meaning "$frame", the soft frame pointer.
I agree M16_REGS + $sp is OK.

mips_regno_to_class should then map $sp to the new class, since it's now
the smallest containing class.  (We really should set that up automatically
one day...)

> A number of patterns need to accept it otherwise LRA inserts a lot of
> reloads and the code size goes up by about 10%.  The change does have
> also a positive effect on reload but marginally.  "frame" meant to
> indicate inclusion of both the stack and hard frame pointers in the
> class but perhaps I should name it differently to avoid confusion.

How about M16_SP_REGS, to match M16_T_REGS?

Also, the BASE_REG_CLASS/ADDR_REG_CLASS distinction isn't all that
obvious from the names.  ADDR_REG_CLASS is only needed for the "d"
constraint so maybe we could just use TARGET_MIPS16 ? M16_REGS : GR_REGS
directly for now.

>>> + SPILL_REGS, /* All but $sp and call preserved regs are in here */
>>...
>>> + { 0x0003fffc, 0x00000000, 0x00000000, 0x00000000, 0x00000000,
>>> 0x00000000 }, /* SPILL_REGS */ \
>>
>> These two don't seem to match.  I think literally it would be 0x0300fffc,
>> but maybe you had to make SPILL_REGS a superset of M16_REGs?
>
> I initially used 0x0300fffc but did some experiments and it turned out
> that 0x0003fffc (with $16, $17 regs) gives slightly better code. I
> haven't updated the comment though.

I can imagine including all M16_REGS makes sense, but it seems odd to
drop the 2 temporaries.  Does 0x0303fffc have the same problem?

> There is yet more to do and need to return to another thread with
> MIPS16 at some point as I found some limitations of IRA/LRA to
> generate better code. $8-$15 are currently inaccessible as temporary
> storage because these registers are marked as fixed (when optimizing
> for size) but leaving them as fixed are better for the code size. I
> don't expect a big gain by using hard registers for spilling but it
> more likely to improve the performance.

Hmm, marking them fixed was supposed to be a temporary reload-only thing,
until the move to LRA.  It should never be worse to spill to these GPRs
over spilling to the stack, if the value isn't live across a call.

But that certainly doesn't need to be part of the initial patch.

>>> +/* Add costs to hard registers based on frequency. This helps to negate
>>> +   some of the reduced cost associated with argument registers which 
>>> +   unfairly promotes their use and increases register pressure */
>>> +#define IRA_HARD_REGNO_ADD_COST_MULTIPLIER(REGNO)                       \
>>> +  (TARGET_MIPS16 && optimize_size                                       \
>>> +   ? ((REGNO) >= 4 && (REGNO) <= 7 ? 2 : 0) : 0)
>>
>> So we would be trying to use, say, $4 for the first incoming argument
>> even after it had been spilled?  Hmm.
>>
>> Since this change is aimed specifically at one heuristic, I wonder
>> whether we should parameterise that heuristic somehow rather than
>> try to use a general hook to undo it.  But I don't think there's
>> anything particularly special about MIPS16 here, so maybe it's too
>> eager for all targets.
>
> In a number of cases argument registers appeared to be unfairly promoted
> increasing the register pressure and increasing the number of reloads.
> Bumping up the cost of using those registers encourages IRA to spill
> into memory but this appears to help LRA to do a better allocation. Of course,
> not always it is a win but generally the gain outweighs the loss. 
>
> I've seen an codesize improvements for other optimization levels
> but I'm unsure whether we should make this change generic.
> This part of the patch is not crucial though and can be send separately. 

OK, thanks, doing it separately sounds better.

>>>  (define_insn "*and<mode>3_mips16"
>>> ...
>>
>> I think we want to keep the LWU case at the very least, since I assume
>> otherwise we'll load the full 64-bit spill slot and use a pair of shifts
>> to zero-extend it.  Reloading the stack address into a base register
>> should always be better than that.
>>
>> I agree it's less clear for the byte and halfword cases.  All other
>> things -- including size -- being equal, reloading a stack address into
>> a base register and using an extending load should be better than
>> reloading the full spill slot and extending it, since the reloaded stack
>> address is more likely to be reused in other instructions.
>>
>> The original MIPS16 didn't have ZEB and ZEH, so on the basis above,
>> I think using LBU and LHU there was probably a win.  I can imagine
>> it making sense to disable LBU and LHU for MIPS16e though.
>>
>> It might be better to do any changes to this pattern as a follow-on
>> patch, since I think LRA should cope with it as-is.
>
> Keeping LWU case should be fine. Alternatives were removed because
> LRA was ICEing for the same reason of $sp not accessible for the byte 
> and halfword cases. I tried to find a testcase to trigger LWU case 
> but couldn't. I presume it must be a rare case? The problem with other
> alternatives was that spilled pseudos were changed to memory without
> checking if the use of $sp is valid. On the other hand, keeping LWU 
> may only delay triggering the problem because the stack pointer 
> should not be used. It could be a missing case in LRA too.

Did you see the failures even after your mips_regno_mode_ok_for_base_p
change?  LRA should know how to reload a "W" address.

>>> diff --git gcc/rtlanal.c gcc/rtlanal.c
>>> index 7a9efec..f699d17 100644
>>> --- gcc/rtlanal.c
>>> +++ gcc/rtlanal.c
>>> @@ -5577,7 +5577,8 @@ get_base_term (rtx *inner)
>>>      inner = strip_address_mutations (&XEXP (*inner, 0));
>>>    if (REG_P (*inner)
>>>        || MEM_P (*inner)
>>> -      || GET_CODE (*inner) == SUBREG)
>>> +      || GET_CODE (*inner) == SUBREG
>>> +      || CONSTANT_P (*inner))
>>>      return inner;
>>>    return 0;
>>>  }
>> 
>> This doesn't look right; the general form is base+index+displacement+segment,
>> with the constant term being treated as the displacement.
>
> I'm not particularly happy with this either. This was an attempt to fix an ICE for 
> the following RTL (gcc.dg/torture/asm-subreg-1.c compiled with -mips32r2 -mips16 -O1):
>
> (insn 9 8 0 2 (asm_operands/v ("") ("") 0 [
>             (mem/v/j/c:HI (lo_sum:SI (const:SI (unspec:SI [
>                                 (const_int 0 [0])
>                             ] UNSPEC_GP))
>                     (symbol_ref:SI ("_const_32") [flags 0x6]  <var_decl 0x7f50acd17558 _const_32>)) [0 _const_32+0 S2 A32])]
>  [(asm_input:HI ("X") (null):0)]
>  [] asm-subreg-1.c:13) asm-subreg-1.c:13 -1 (nil))
>
> Any suggestions to handle this case?

Thanks for the pointer.  I think this shows a more fundamental problem
with the handling of "X" constraints.  With something like:

void
foo (int **x, int y, int z)
{
  int *ptr = *x + y * z / 11;
  __asm__ __volatile__ ("" : : "X" (*ptr));
}

the entire expression gets treated as a MEM address, which neither
reload nor LRA can handle.  And with something like that, it isn't
obvious what class all the registers in the address should have.
With a sufficiently-complicated expression you could run out of registers.

So perhaps we should limit the propagation to things that
decompose_mem_address can handle.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend
  2014-04-09 21:24     ` Richard Sandiford
@ 2014-04-10 20:29       ` Richard Sandiford
  2014-04-14 11:13       ` Robert Suchanek
  1 sibling, 0 replies; 31+ messages in thread
From: Richard Sandiford @ 2014-04-10 20:29 UTC (permalink / raw)
  To: Robert Suchanek; +Cc: gcc-patches, vmakarov

Richard Sandiford <rdsandiford@googlemail.com> writes:
> Robert Suchanek <Robert.Suchanek@imgtec.com> writes:
>> I'm not particularly happy with this either. This was an attempt to fix an ICE for 
>> the following RTL (gcc.dg/torture/asm-subreg-1.c compiled with -mips32r2 -mips16 -O1):
>>
>> (insn 9 8 0 2 (asm_operands/v ("") ("") 0 [
>>             (mem/v/j/c:HI (lo_sum:SI (const:SI (unspec:SI [
>>                                 (const_int 0 [0])
>>                             ] UNSPEC_GP))
>>                     (symbol_ref:SI ("_const_32") [flags 0x6]  <var_decl 0x7f50acd17558 _const_32>)) [0 _const_32+0 S2 A32])]
>>  [(asm_input:HI ("X") (null):0)]
>>  [] asm-subreg-1.c:13) asm-subreg-1.c:13 -1 (nil))
>>
>> Any suggestions to handle this case?
>
> Thanks for the pointer.  I think this shows a more fundamental problem
> with the handling of "X" constraints.  With something like:
>
> void
> foo (int **x, int y, int z)
> {
>   int *ptr = *x + y * z / 11;
>   __asm__ __volatile__ ("" : : "X" (*ptr));
> }
>
> the entire expression gets treated as a MEM address, which neither
> reload nor LRA can handle.  And with something like that, it isn't
> obvious what class all the registers in the address should have.
> With a sufficiently-complicated expression you could run out of registers.
>
> So perhaps we should limit the propagation to things that
> decompose_mem_address can handle.

Even that might be too loose, since invalid scales will need to be reloaded
as a multiplication or shift, and there's no guarantee that the target
can do that without clobbering the flags.  So maybe we should do something
like the patch below.

Alternatively we could stick to the decompose_mem_address-based check
above and teach LRA to keep invalid addresses for 'X'.  The problem then
is that we might ICE while printing the operand.

Thanks,
Richard


gcc/
	* recog.c (asm_operand_ok): Tighten MEM validity for 'X'.

gcc/testsuite/
	* gcc.dg/torture/asm-x-constraint-1.c: New test.

Index: gcc/recog.c
===================================================================
--- gcc/recog.c	2014-04-10 21:18:02.778009424 +0100
+++ gcc/recog.c	2014-04-10 21:18:02.996011570 +0100
@@ -1840,7 +1840,11 @@ asm_operand_ok (rtx op, const char *cons
 	  break;
 
 	case 'X':
-	  result = 1;
+	  /* Still enforce memory requirements for non-constant addresses,
+	     since we can't reload MEMs with completely arbitrary addresses.  */
+	  result = (!MEM_P (op)
+		    || CONSTANT_P (XEXP (op, 0))
+		    || memory_operand (op, VOIDmode));
 	  break;
 
 	case 'g':
Index: gcc/testsuite/gcc.dg/torture/asm-x-constraint-1.c
===================================================================
--- /dev/null	2014-04-10 19:40:00.640011981 +0100
+++ gcc/testsuite/gcc.dg/torture/asm-x-constraint-1.c	2014-04-10 21:19:05.405623027 +0100
@@ -0,0 +1,6 @@
+void
+foo (int **x, int y, int z)
+{
+  int *ptr = *x + y * z / 11;
+  __asm__ __volatile__ ("foo %0" : : "X" (*ptr));
+}

^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend
  2014-04-09 21:24     ` Richard Sandiford
  2014-04-10 20:29       ` Richard Sandiford
@ 2014-04-14 11:13       ` Robert Suchanek
  2014-04-15 21:12         ` Richard Sandiford
  1 sibling, 1 reply; 31+ messages in thread
From: Robert Suchanek @ 2014-04-14 11:13 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: gcc-patches, vmakarov

> So yeah, I agree this is right after all, sorry.  Let's delete the
> comment starting at "There are two problems here:" at the same time.

Ok.

> mips_regno_to_class should then map $sp to the new class, since it's now
> the smallest containing class.  (We really should set that up automatically
> one day...)

Indeed, it is easy to overlook it.

> How about M16_SP_REGS, to match M16_T_REGS?
> 
> Also, the BASE_REG_CLASS/ADDR_REG_CLASS distinction isn't all that
> obvious from the names.  ADDR_REG_CLASS is only needed for the "d"
> constraint so maybe we could just use TARGET_MIPS16 ? M16_REGS : GR_REGS
> directly for now.

Fair enough. I'll remove ADDR_REG_CLASS.

> I can imagine including all M16_REGS makes sense, but it seems odd to
> drop the 2 temporaries.  Does 0x0303fffc have the same problem?

It's a midway between 0x0003fffc and 0x0300fffc. I think 0x0303fffc will be
a good trade-off. The difference between 0x0303fffc and others is about 
~600 bytes in CSiBE which is less than 0.01% of code size change.

> Hmm, marking them fixed was supposed to be a temporary reload-only thing,
> until the move to LRA.  It should never be worse to spill to these GPRs
> over spilling to the stack, if the value isn't live across a call.

I would say this also affects IRA/LRA integration. I found that it is more
profitable to hide registers (MIPS16 only) in IRA to encourage spilling to
memory. Otherwise $8-$15 would be treated like any other registers and LRA
would inserts reloads to move in/out values of these registers. My assumption is
that if we could hide some of the registers in IRA but enable them in LRA
then all registers in SPILL_REGS would be available keeping reasonable code
size. Another way would be to increase the cost of moving values between 
M16_REGS and GR_REGS but it was already mentioned, and is true that there 
should be no difference of costs and it feels like a hack to make things work.

> Did you see the failures even after your mips_regno_mode_ok_for_base_p
> change?  LRA should know how to reload a "W" address.

Yes but I realize there is more. It fails because $sp is now included
in BASE_REG_CLASS and "W" is based on it. However, I suppose that 
it would be too eager to say it is wrong and likely there is something missing 
in LRA if we want to keep all alternatives. Currently there is no check
if a reloaded operand has a valid address, use of $sp in lbu/lhu cases.
Even if we added extra checks we are less likely to benefit as we need
to reload the base into register.

> Even that might be too loose, since invalid scales will need to be reloaded
> as a multiplication or shift, and there's no guarantee that the target
> can do that without clobbering the flags.  So maybe we should do something
> like the patch below.
>
> Alternatively we could stick to the decompose_mem_address-based check
> above and teach LRA to keep invalid addresses for 'X'.  The problem then
> is that we might ICE while printing the operand.

Tightening validity for 'X' appears to be reasonable. Will you commit this patch?

Regards,
Robert

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend
  2014-04-14 11:13       ` Robert Suchanek
@ 2014-04-15 21:12         ` Richard Sandiford
  2014-04-16 21:10           ` Robert Suchanek
  0 siblings, 1 reply; 31+ messages in thread
From: Richard Sandiford @ 2014-04-15 21:12 UTC (permalink / raw)
  To: Robert Suchanek; +Cc: gcc-patches, vmakarov

Robert Suchanek <Robert.Suchanek@imgtec.com> writes:
>> Hmm, marking them fixed was supposed to be a temporary reload-only thing,
>> until the move to LRA.  It should never be worse to spill to these GPRs
>> over spilling to the stack, if the value isn't live across a call.
>
> I would say this also affects IRA/LRA integration. I found that it is more
> profitable to hide registers (MIPS16 only) in IRA to encourage spilling to
> memory. Otherwise $8-$15 would be treated like any other registers and LRA
> would inserts reloads to move in/out values of these registers. My assumption is
> that if we could hide some of the registers in IRA but enable them in LRA
> then all registers in SPILL_REGS would be available keeping reasonable code
> size. Another way would be to increase the cost of moving values between 
> M16_REGS and GR_REGS but it was already mentioned, and is true that there 
> should be no difference of costs and it feels like a hack to make things work.

OK.

This definitely sounds like it ought to be made to work, with some mixture
of target and generic changes.  But if it doesn't work out of the box
then let's leave that for future work.

>> Did you see the failures even after your mips_regno_mode_ok_for_base_p
>> change?  LRA should know how to reload a "W" address.
>
> Yes but I realize there is more. It fails because $sp is now included
> in BASE_REG_CLASS and "W" is based on it. However, I suppose that 
> it would be too eager to say it is wrong and likely there is something missing 
> in LRA if we want to keep all alternatives. Currently there is no check
> if a reloaded operand has a valid address, use of $sp in lbu/lhu cases.
> Even if we added extra checks we are less likely to benefit as we need
> to reload the base into register.

Not sure what you mean, sorry.  "W" exists specifically to exclude
$sp-based and $pc-based addresses.  LRA AFAIK should already be able
to reload addresses that are valid in the TARGET_LEGITIMATE_ADDRESS_P
sense but which do not match the constraints for a particular insn.

Can you remember one of the tests that fails?

>> Even that might be too loose, since invalid scales will need to be reloaded
>> as a multiplication or shift, and there's no guarantee that the target
>> can do that without clobbering the flags.  So maybe we should do something
>> like the patch below.
>>
>> Alternatively we could stick to the decompose_mem_address-based check
>> above and teach LRA to keep invalid addresses for 'X'.  The problem then
>> is that we might ICE while printing the operand.
>
> Tightening validity for 'X' appears to be reasonable. Will you commit
> this patch?

OK, just submitted separately.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend
  2014-04-15 21:12         ` Richard Sandiford
@ 2014-04-16 21:10           ` Robert Suchanek
  2014-04-21 13:01             ` Richard Sandiford
  0 siblings, 1 reply; 31+ messages in thread
From: Robert Suchanek @ 2014-04-16 21:10 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: gcc-patches, vmakarov

> >> Did you see the failures even after your mips_regno_mode_ok_for_base_p
> >> change?  LRA should know how to reload a "W" address.
> >
> > Yes but I realize there is more. It fails because $sp is now included
> > in BASE_REG_CLASS and "W" is based on it. However, I suppose that 
> > it would be too eager to say it is wrong and likely there is something missing 
> > in LRA if we want to keep all alternatives. Currently there is no check
> > if a reloaded operand has a valid address, use of $sp in lbu/lhu cases.
> > Even if we added extra checks we are less likely to benefit as we need
> > to reload the base into register.
>
> Not sure what you mean, sorry.  "W" exists specifically to exclude
> $sp-based and $pc-based addresses.  LRA AFAIK should already be able
> to reload addresses that are valid in the TARGET_LEGITIMATE_ADDRESS_P
> sense but which do not match the constraints for a particular insn.
>
> Can you remember one of the tests that fails?

I couldn't trigger the problem with the original testcase but found another
one that reveals it. The following needs to compiled with -mips32r2 -mips16 -Os:

struct { int addr; } c;
struct command { int args[1]; };
unsigned short a;

fn1 (struct command *p1)
{
    unsigned short d;
    d = fn2 ();
    a = p1->args[0];
    fn3 (a);
    if (c.addr)
    {
        fn4 (p1->args[0]);
        return;
    }
    (&c)->addr = fn5 ();
    fn6 (d);
}

Not sure how the constraint would/should exclude $sp-based address in LRA.
In this particular case, a spilled pseudo is changed to memory giving the following RTL form:

(insn 30 29 31 4 (set (reg:SI 4 $4)
        (and:SI (mem/c:SI (plus:SI (reg/f:SI 78 $frame)
                    (const_int 16 [0x10])) [7 %sfp+16 S4 A32])
            (const_int 65535 [0xffff]))) shell.i:17 161 {*andsi3_mips16}
     (expr_list:REG_DEAD (reg:SI 194 [ D.1469 ])
        (nil)))

The operand 1 during alternative selection is not marked as a bad operand as it is a memory
operand. $frame appears to be fine as it could be eliminated later to hard register. No reloads
are inserted for the instructions concerned. Unless, $frame should be temporarily eliminated
and then a reload would be inserted?

Regards,
Robert
 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend
  2014-04-16 21:10           ` Robert Suchanek
@ 2014-04-21 13:01             ` Richard Sandiford
  2014-04-23 13:34               ` Robert Suchanek
  2014-04-23 15:33               ` Vladimir Makarov
  0 siblings, 2 replies; 31+ messages in thread
From: Richard Sandiford @ 2014-04-21 13:01 UTC (permalink / raw)
  To: Robert Suchanek; +Cc: gcc-patches, vmakarov

Robert Suchanek <Robert.Suchanek@imgtec.com> writes:
>> >> Did you see the failures even after your mips_regno_mode_ok_for_base_p
>> >> change?  LRA should know how to reload a "W" address.
>> >
>> > Yes but I realize there is more. It fails because $sp is now included
>> > in BASE_REG_CLASS and "W" is based on it. However, I suppose that 
>> > it would be too eager to say it is wrong and likely there is
>> > something missing
>> > in LRA if we want to keep all alternatives. Currently there is no check
>> > if a reloaded operand has a valid address, use of $sp in lbu/lhu cases.
>> > Even if we added extra checks we are less likely to benefit as we need
>> > to reload the base into register.
>>
>> Not sure what you mean, sorry.  "W" exists specifically to exclude
>> $sp-based and $pc-based addresses.  LRA AFAIK should already be able
>> to reload addresses that are valid in the TARGET_LEGITIMATE_ADDRESS_P
>> sense but which do not match the constraints for a particular insn.
>>
>> Can you remember one of the tests that fails?
>
> I couldn't trigger the problem with the original testcase but found
> another one that reveals it. The following needs to compiled with
> -mips32r2 -mips16 -Os:
>
> struct { int addr; } c;
> struct command { int args[1]; };
> unsigned short a;
>
> fn1 (struct command *p1)
> {
>     unsigned short d;
>     d = fn2 ();
>     a = p1->args[0];
>     fn3 (a);
>     if (c.addr)
>     {
>         fn4 (p1->args[0]);
>         return;
>     }
>     (&c)->addr = fn5 ();
>     fn6 (d);
> }

Thanks.

> Not sure how the constraint would/should exclude $sp-based address in
> LRA.  In this particular case, a spilled pseudo is changed to memory
> giving the following RTL form:
>
> (insn 30 29 31 4 (set (reg:SI 4 $4)
>         (and:SI (mem/c:SI (plus:SI (reg/f:SI 78 $frame)
>                     (const_int 16 [0x10])) [7 %sfp+16 S4 A32])
>             (const_int 65535 [0xffff]))) shell.i:17 161 {*andsi3_mips16}
>      (expr_list:REG_DEAD (reg:SI 194 [ D.1469 ])
>         (nil)))
>
> The operand 1 during alternative selection is not marked as a bad
> operand as it is a memory operand. $frame appears to be fine as it
> could be eliminated later to hard register. No reloads are inserted
> for the instructions concerned. Unless, $frame should be temporarily
> eliminated and then a reload would be inserted?

Yeah, I think the lack of elimination is the problem.  process_address
eliminates $frame temporarily before checking whether the address
is valid, but the places that check EXTRA_CONSTRAINT_STR pass the
original uneliminated address.  So the legitimate_address_p hook sees
the $sp-based address but the "W" constraint only sees the $frame-based
address (which might or might not be valid, depending on whether $frame
is eliminated to the stack or hard frame pointer).  I think the constraints
should see the eliminated address too.

This patch seems to fix it for me.  Tested on x86_64-linux-gnu.
Vlad, is this OK for trunk?

BTW, we might want to define something like:

#define MODE_BASE_REG_CLASS(MODE) \
  (TARGET_MIPS16 \
   ? ((MODE) == SImode || (MODE) == DImode ? M16_SP_REGS : M16_REGS) \
   : GR_REGS)

instead of BASE_REG_CLASS.  It might lead to slightly better code
(or not -- if it doesn't then don't bother :-)).

If this patch is OK then I think the only thing blocking the switch
to LRA is the asm-subreg-1.c failure.  I think it'd be fine to XFAIL
that test on MIPS for now, until there's a consensus about what "X" means
for asms.

Thanks,
Richard

gcc/
	* lra-constraints.c (valid_address_p): Move earlier in file.
	Add a constraint argument to the address_info version.
	(satisfies_memory_constraint_p): New function.
	(satisfies_address_constraint_p): Likewise.
	(process_alt_operands, curr_insn_transform): Use them.
	(process_address): Pass the constraint to valid_address_p when
	checking address operands.

Index: gcc/lra-constraints.c
===================================================================
--- gcc/lra-constraints.c	2014-04-21 10:36:06.715026374 +0100
+++ gcc/lra-constraints.c	2014-04-21 13:18:46.228298176 +0100
@@ -317,6 +317,94 @@ in_mem_p (int regno)
   return get_reg_class (regno) == NO_REGS;
 }
 
+/* Return 1 if ADDR is a valid memory address for mode MODE in address
+   space AS, and check that each pseudo has the proper kind of hard
+   reg.	 */
+static int
+valid_address_p (enum machine_mode mode ATTRIBUTE_UNUSED,
+		 rtx addr, addr_space_t as)
+{
+#ifdef GO_IF_LEGITIMATE_ADDRESS
+  lra_assert (ADDR_SPACE_GENERIC_P (as));
+  GO_IF_LEGITIMATE_ADDRESS (mode, addr, win);
+  return 0;
+
+ win:
+  return 1;
+#else
+  return targetm.addr_space.legitimate_address_p (mode, addr, 0, as);
+#endif
+}
+
+/* Return whether address AD is valid.  If CONSTRAINT is null,
+   check for general addresses, otherwise check the extra constraint
+   CONSTRAINT.  */
+static bool
+valid_address_p (struct address_info *ad, const char *constraint = 0)
+{
+  /* Some ports do not check displacements for eliminable registers,
+     so we replace them temporarily with the elimination target.  */
+  rtx saved_base_reg = NULL_RTX;
+  rtx saved_index_reg = NULL_RTX;
+  rtx *base_term = strip_subreg (ad->base_term);
+  rtx *index_term = strip_subreg (ad->index_term);
+  if (base_term != NULL)
+    {
+      saved_base_reg = *base_term;
+      lra_eliminate_reg_if_possible (base_term);
+      if (ad->base_term2 != NULL)
+	*ad->base_term2 = *ad->base_term;
+    }
+  if (index_term != NULL)
+    {
+      saved_index_reg = *index_term;
+      lra_eliminate_reg_if_possible (index_term);
+    }
+  bool ok_p = (constraint
+#ifdef EXTRA_CONSTRAINT_STR
+	       ? EXTRA_CONSTRAINT_STR (*ad->outer, constraint[0], constraint)
+#else
+	       ? false
+#endif
+	       : valid_address_p (ad->mode, *ad->outer, ad->as));
+  if (saved_base_reg != NULL_RTX)
+    {
+      *base_term = saved_base_reg;
+      if (ad->base_term2 != NULL)
+	*ad->base_term2 = *ad->base_term;
+    }
+  if (saved_index_reg != NULL_RTX)
+    *index_term = saved_index_reg;
+  return ok_p;
+}
+
+#ifdef EXTRA_CONSTRAINT_STR
+/* Return true if, after elimination, OP satisfies extra memory constraint
+   CONSTRAINT.  */
+static bool
+satisfies_memory_constraint_p (rtx op, const char *constraint)
+{
+  struct address_info ad;
+
+  if (!MEM_P (op))
+    return false;
+
+  decompose_mem_address (&ad, op);
+  return valid_address_p (&ad, constraint);
+}
+
+/* Return true if, after elimination, OP satisfies extra address constraint
+   CONSTRAINT.  */
+static bool
+satisfies_address_constraint_p (rtx op, const char *constraint)
+{
+  struct address_info ad;
+
+  decompose_lea_address (&ad, &op);
+  return valid_address_p (&ad, constraint);
+}
+#endif
+
 /* Initiate equivalences for LRA.  As we keep original equivalences
    before any elimination, we need to make copies otherwise any change
    in insns might change the equivalences.  */
@@ -1941,7 +2029,7 @@ process_alt_operands (int only_alternati
 #ifdef EXTRA_CONSTRAINT_STR
 		      if (EXTRA_MEMORY_CONSTRAINT (c, p))
 			{
-			  if (EXTRA_CONSTRAINT_STR (op, c, p))
+			  if (satisfies_memory_constraint_p (op, p))
 			    win = true;
 			  else if (spilled_pseudo_p (op))
 			    win = true;
@@ -1960,7 +2048,7 @@ process_alt_operands (int only_alternati
 			}
 		      if (EXTRA_ADDRESS_CONSTRAINT (c, p))
 			{
-			  if (EXTRA_CONSTRAINT_STR (op, c, p))
+			  if (satisfies_address_constraint_p (op, p))
 			    win = true;
 
 			  /* If we didn't already win, we can reload
@@ -2576,60 +2664,6 @@ process_alt_operands (int only_alternati
   return ok_p;
 }
 
-/* Return 1 if ADDR is a valid memory address for mode MODE in address
-   space AS, and check that each pseudo has the proper kind of hard
-   reg.	 */
-static int
-valid_address_p (enum machine_mode mode ATTRIBUTE_UNUSED,
-		 rtx addr, addr_space_t as)
-{
-#ifdef GO_IF_LEGITIMATE_ADDRESS
-  lra_assert (ADDR_SPACE_GENERIC_P (as));
-  GO_IF_LEGITIMATE_ADDRESS (mode, addr, win);
-  return 0;
-
- win:
-  return 1;
-#else
-  return targetm.addr_space.legitimate_address_p (mode, addr, 0, as);
-#endif
-}
-
-/* Return whether address AD is valid.  */
-
-static bool
-valid_address_p (struct address_info *ad)
-{
-  /* Some ports do not check displacements for eliminable registers,
-     so we replace them temporarily with the elimination target.  */
-  rtx saved_base_reg = NULL_RTX;
-  rtx saved_index_reg = NULL_RTX;
-  rtx *base_term = strip_subreg (ad->base_term);
-  rtx *index_term = strip_subreg (ad->index_term);
-  if (base_term != NULL)
-    {
-      saved_base_reg = *base_term;
-      lra_eliminate_reg_if_possible (base_term);
-      if (ad->base_term2 != NULL)
-	*ad->base_term2 = *ad->base_term;
-    }
-  if (index_term != NULL)
-    {
-      saved_index_reg = *index_term;
-      lra_eliminate_reg_if_possible (index_term);
-    }
-  bool ok_p = valid_address_p (ad->mode, *ad->outer, ad->as);
-  if (saved_base_reg != NULL_RTX)
-    {
-      *base_term = saved_base_reg;
-      if (ad->base_term2 != NULL)
-	*ad->base_term2 = *ad->base_term;
-    }
-  if (saved_index_reg != NULL_RTX)
-    *index_term = saved_index_reg;
-  return ok_p;
-}
-
 /* Make reload base reg + disp from address AD.  Return the new pseudo.  */
 static rtx
 base_plus_disp_to_reg (struct address_info *ad)
@@ -2832,7 +2866,7 @@ process_address (int nop, rtx *before, r
      EXTRA_CONSTRAINT_STR for the validation.  */
   if (constraint[0] != 'p'
       && EXTRA_ADDRESS_CONSTRAINT (constraint[0], constraint)
-      && EXTRA_CONSTRAINT_STR (op, constraint[0], constraint))
+      && valid_address_p (&ad, constraint))
     return change_p;
 #endif
 
@@ -3539,7 +3573,7 @@ curr_insn_transform (void)
 		  break;
 #ifdef EXTRA_CONSTRAINT_STR
 		if (EXTRA_MEMORY_CONSTRAINT (c, constraint)
-		    && EXTRA_CONSTRAINT_STR (tem, c, constraint))
+		    && satisfies_memory_constraint_p (tem, constraint))
 		  break;
 #endif
 	      }

^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend
  2014-04-21 13:01             ` Richard Sandiford
@ 2014-04-23 13:34               ` Robert Suchanek
  2014-04-23 14:10                 ` Richard Sandiford
  2014-04-23 15:33               ` Vladimir Makarov
  1 sibling, 1 reply; 31+ messages in thread
From: Robert Suchanek @ 2014-04-23 13:34 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: gcc-patches, vmakarov

> Yeah, I think the lack of elimination is the problem.  process_address
> eliminates $frame temporarily before checking whether the address
> is valid, but the places that check EXTRA_CONSTRAINT_STR pass the
> original uneliminated address.  So the legitimate_address_p hook sees
> the $sp-based address but the "W" constraint only sees the $frame-based
> address (which might or might not be valid, depending on whether $frame
> is eliminated to the stack or hard frame pointer).  I think the constraints
> should see the eliminated address too.

That makes sense and explains why it worked when $frame was eliminated
to hard frame pointer but didn't for the stack pointer.

> BTW, we might want to define something like:
> 
> #define MODE_BASE_REG_CLASS(MODE) \
>   (TARGET_MIPS16 \
>    ? ((MODE) == SImode || (MODE) == DImode ? M16_SP_REGS : M16_REGS) \
>    : GR_REGS)
> 
> instead of BASE_REG_CLASS.  It might lead to slightly better code
> (or not -- if it doesn't then don't bother :-)).

I have already tried it and no visible difference was seen.

> If this patch is OK then I think the only thing blocking the switch
> to LRA is the asm-subreg-1.c failure.  I think it'd be fine to XFAIL
> that test on MIPS for now, until there's a consensus about what "X" means
> for asms.

The patch worked for me and passed the regression test. Thanks.

If we were going to XFAIL the test then it would apply specifically for -mips16 -O1.
In any other combination it appears to work. Would that be a stopper?

Below is the revised patch addressing all the comments and changes so far.

Regards,
Robert

2014-03-26  Robert Suchanek  <Robert.Suchanek@imgtec.com>

	* lra-constraints.c (base_to_reg): New function.
	(process_address): Use new function.

	* config/mips/constraints.md ("d"): BASE_REG_CLASS
	replaced by "TARGET_MIPS16 ? M16_REGS : GR_REGS".
	* config/mips/mips.c (mips_regno_mode_ok_for_base_p):
	Remove use !strict_p for MIPS16.
	(mips_register_priority): New function that implements
	the target hook TARGET_REGISTER_PRIORITY.
	(mips_spill_class): Likewise for TARGET_SPILL_CLASS
	(mips_lra_p): Likewise for TARGET_LRA_P.
	* config/mips/mips.h (reg_class): Add M16_SP_REGS and SPILL_REGS
	classes.
	(REG_CLASS_NAMES): Likewise.
	(REG_CLASS_CONTENTS): Likewise.
	(BASE_REG_CLASS): Use M16_SP_REGS.
	* config/mips/mips.md (*mul_acc_si, *mul_sub_si): Add alternative
	tuned for LRA. New set attribute to enable alternatives
	depending on the register allocator used.
	(*lea64): Disable pattern for MIPS16.
	* config/mips/mips.opt
	(mlra): New option

diff --git gcc/config/mips/constraints.md gcc/config/mips/constraints.md
index f6834fd..fa33c30 100644
--- gcc/config/mips/constraints.md
+++ gcc/config/mips/constraints.md
@@ -19,7 +19,7 @@
 
 ;; Register constraints
 
-(define_register_constraint "d" "BASE_REG_CLASS"
+(define_register_constraint "d" "TARGET_MIPS16 ? M16_REGS : GR_REGS"
   "An address register.  This is equivalent to @code{r} unless
    generating MIPS16 code.")
 
diff --git gcc/config/mips/mips.c gcc/config/mips/mips.c
index 45256e9..81b6c26 100644
--- gcc/config/mips/mips.c
+++ gcc/config/mips/mips.c
@@ -655,7 +655,7 @@ const enum reg_class mips_regno_to_class[FIRST_PSEUDO_REGISTER] = {
   M16_REGS,        M16_STORE_REGS,  LEA_REGS,        LEA_REGS,
   LEA_REGS,        LEA_REGS,        LEA_REGS,        LEA_REGS,
   T_REG,           PIC_FN_ADDR_REG, LEA_REGS,        LEA_REGS,
-  LEA_REGS,        LEA_REGS,        LEA_REGS,        LEA_REGS,
+  LEA_REGS,        M16_SP_REGS,     LEA_REGS,        LEA_REGS,
 
   FP_REGS,	FP_REGS,	FP_REGS,	FP_REGS,
   FP_REGS,	FP_REGS,	FP_REGS,	FP_REGS,
@@ -2241,22 +2241,9 @@ mips_regno_mode_ok_for_base_p (int regno, enum machine_mode mode,
     return true;
 
   /* In MIPS16 mode, the stack pointer can only address word and doubleword
-     values, nothing smaller.  There are two problems here:
-
-       (a) Instantiating virtual registers can introduce new uses of the
-	   stack pointer.  If these virtual registers are valid addresses,
-	   the stack pointer should be too.
-
-       (b) Most uses of the stack pointer are not made explicit until
-	   FRAME_POINTER_REGNUM and ARG_POINTER_REGNUM have been eliminated.
-	   We don't know until that stage whether we'll be eliminating to the
-	   stack pointer (which needs the restriction) or the hard frame
-	   pointer (which doesn't).
-
-     All in all, it seems more consistent to only enforce this restriction
-     during and after reload.  */
+     values, nothing smaller.  */
   if (TARGET_MIPS16 && regno == STACK_POINTER_REGNUM)
-    return !strict_p || GET_MODE_SIZE (mode) == 4 || GET_MODE_SIZE (mode) == 8;
+    return GET_MODE_SIZE (mode) == 4 || GET_MODE_SIZE (mode) == 8;
 
   return TARGET_MIPS16 ? M16_REG_P (regno) : GP_REG_P (regno);
 }
@@ -12115,6 +12102,18 @@ mips_register_move_cost (enum machine_mode mode,
   return 0;
 }
 
+/* Return a register priority for hard reg REGNO.  */
+
+static int
+mips_register_priority (int hard_regno)
+{
+  /* Treat MIPS16 registers with higher priority than other regs.  */
+  if (TARGET_MIPS16
+      && TEST_HARD_REG_BIT (reg_class_contents[M16_REGS], hard_regno))
+    return 1;
+  return 0;
+}
+
 /* Implement TARGET_MEMORY_MOVE_COST.  */
 
 static int
@@ -18897,6 +18896,21 @@ mips_atomic_assign_expand_fenv (tree *hold, tree *clear, tree *update)
   *update = build2 (COMPOUND_EXPR, void_type_node, *update,
 		    atomic_feraiseexcept_call);
 }
+
+static reg_class_t
+mips_spill_class (reg_class_t rclass ATTRIBUTE_UNUSED,
+		  enum machine_mode mode ATTRIBUTE_UNUSED)
+{
+  if (TARGET_MIPS16)
+    return SPILL_REGS;
+  return NO_REGS;
+}
+
+static bool
+mips_lra_p (void)
+{
+  return mips_lra_flag;
+}
 

 /* Initialize the GCC target structure.  */
 #undef TARGET_ASM_ALIGNED_HI_OP
@@ -18960,6 +18974,8 @@ mips_atomic_assign_expand_fenv (tree *hold, tree *clear, tree *update)
 #define TARGET_VALID_POINTER_MODE mips_valid_pointer_mode
 #undef TARGET_REGISTER_MOVE_COST
 #define TARGET_REGISTER_MOVE_COST mips_register_move_cost
+#undef TARGET_REGISTER_PRIORITY
+#define TARGET_REGISTER_PRIORITY mips_register_priority
 #undef TARGET_MEMORY_MOVE_COST
 #define TARGET_MEMORY_MOVE_COST mips_memory_move_cost
 #undef TARGET_RTX_COSTS
@@ -19134,6 +19150,11 @@ mips_atomic_assign_expand_fenv (tree *hold, tree *clear, tree *update)
 #undef TARGET_ATOMIC_ASSIGN_EXPAND_FENV
 #define TARGET_ATOMIC_ASSIGN_EXPAND_FENV mips_atomic_assign_expand_fenv
 
+#undef TARGET_SPILL_CLASS
+#define TARGET_SPILL_CLASS mips_spill_class
+#undef TARGET_LRA_P
+#define TARGET_LRA_P mips_lra_p
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 

 #include "gt-mips.h"
diff --git gcc/config/mips/mips.h gcc/config/mips/mips.h
index b25865b..fb37999 100644
--- gcc/config/mips/mips.h
+++ gcc/config/mips/mips.h
@@ -1872,10 +1872,12 @@ enum reg_class
   NO_REGS,			/* no registers in set */
   M16_STORE_REGS,		/* microMIPS store registers  */
   M16_REGS,			/* mips16 directly accessible registers */
+  M16_SP_REGS,			/* mips16 + $sp */
   T_REG,			/* mips16 T register ($24) */
   M16_T_REGS,			/* mips16 registers plus T register */
   PIC_FN_ADDR_REG,		/* SVR4 PIC function address register */
   V1_REG,			/* Register $v1 ($3) used for TLS access.  */
+  SPILL_REGS,			/* All but $sp and call preserved regs are in here */
   LEA_REGS,			/* Every GPR except $25 */
   GR_REGS,			/* integer registers */
   FP_REGS,			/* floating point registers */
@@ -1910,10 +1912,12 @@ enum reg_class
   "NO_REGS",								\
   "M16_STORE_REGS",							\
   "M16_REGS",								\
+  "M16_SP_REGS",								\
   "T_REG",								\
   "M16_T_REGS",								\
   "PIC_FN_ADDR_REG",							\
   "V1_REG",								\
+  "SPILL_REGS",								\
   "LEA_REGS",								\
   "GR_REGS",								\
   "FP_REGS",								\
@@ -1951,10 +1955,12 @@ enum reg_class
   { 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000 },	/* NO_REGS */		\
   { 0x000200fc, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000 },	/* M16_STORE_REGS */	\
   { 0x000300fc, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000 },	/* M16_REGS */		\
+  { 0x200300fc, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000 },	/* M16_SP_REGS */		\
   { 0x01000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000 },	/* T_REG */		\
   { 0x010300fc, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000 },	/* M16_T_REGS */	\
   { 0x02000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000 },	/* PIC_FN_ADDR_REG */	\
   { 0x00000008, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000 },	/* V1_REG */		\
+  { 0x0303fffc, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000 },	/* SPILL_REGS */      	\
   { 0xfdffffff, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000 },	/* LEA_REGS */		\
   { 0xffffffff, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000 },	/* GR_REGS */		\
   { 0x00000000, 0xffffffff, 0x00000000, 0x00000000, 0x00000000, 0x00000000 },	/* FP_REGS */		\
@@ -1987,7 +1993,7 @@ enum reg_class
    valid base register must belong.  A base register is one used in
    an address which is the register value plus a displacement.  */
 
-#define BASE_REG_CLASS  (TARGET_MIPS16 ? M16_REGS : GR_REGS)
+#define BASE_REG_CLASS  (TARGET_MIPS16 ? M16_SP_REGS : GR_REGS)
 
 /* A macro whose definition is the name of the class to which a
    valid index register must belong.  An index register is one used
diff --git gcc/config/mips/mips.md gcc/config/mips/mips.md
index f914ab6..4377a69 100644
--- gcc/config/mips/mips.md
+++ gcc/config/mips/mips.md
@@ -1622,40 +1622,66 @@
 ;; copy instructions.  Reload therefore thinks that the second alternative
 ;; is two reloads more costly than the first.  We add "*?*?" to the first
 ;; alternative as a counterweight.
+;;
+;; LRA simulates reload but the cost of reloading scratches is lower
+;; than of the classic reload. For the time being, removing the counterweight
+;; for LRA is more profitable.
 (define_insn "*mul_acc_si"
-  [(set (match_operand:SI 0 "register_operand" "=l*?*?,d?")
-	(plus:SI (mult:SI (match_operand:SI 1 "register_operand" "d,d")
-			  (match_operand:SI 2 "register_operand" "d,d"))
-		 (match_operand:SI 3 "register_operand" "0,d")))
-   (clobber (match_scratch:SI 4 "=X,l"))
-   (clobber (match_scratch:SI 5 "=X,&d"))]
+  [(set (match_operand:SI 0 "register_operand" "=l*?*?,l,d?")
+	(plus:SI (mult:SI (match_operand:SI 1 "register_operand" "d,d,d")
+			  (match_operand:SI 2 "register_operand" "d,d,d"))
+		 (match_operand:SI 3 "register_operand" "0,0,d")))
+   (clobber (match_scratch:SI 4 "=X,X,l"))
+   (clobber (match_scratch:SI 5 "=X,X,&d"))]
   "GENERATE_MADD_MSUB && !TARGET_MIPS16"
   "@
     madd\t%1,%2
+    madd\t%1,%2
     #"
   [(set_attr "type"	"imadd")
    (set_attr "accum_in"	"3")
    (set_attr "mode"	"SI")
-   (set_attr "insn_count" "1,2")])
+   (set_attr "insn_count" "1,1,2")
+   (set (attr "enabled")
+        (cond [(and (eq_attr "alternative" "0")
+                    (match_test "!mips_lra_flag"))
+                  (const_string "yes")
+               (and (eq_attr "alternative" "1")
+                    (match_test "mips_lra_flag"))
+                  (const_string "yes")
+               (eq_attr "alternative" "2")
+                  (const_string "yes")]
+              (const_string "no")))])
 
 ;; The same idea applies here.  The middle alternative needs one less
 ;; clobber than the final alternative, so we add "*?" as a counterweight.
 (define_insn "*mul_acc_si_r3900"
-  [(set (match_operand:SI 0 "register_operand" "=l*?*?,d*?,d?")
-	(plus:SI (mult:SI (match_operand:SI 1 "register_operand" "d,d,d")
-			  (match_operand:SI 2 "register_operand" "d,d,d"))
-		 (match_operand:SI 3 "register_operand" "0,l,d")))
-   (clobber (match_scratch:SI 4 "=X,3,l"))
-   (clobber (match_scratch:SI 5 "=X,X,&d"))]
+  [(set (match_operand:SI 0 "register_operand" "=l*?*?,l,d*?,d?")
+	(plus:SI (mult:SI (match_operand:SI 1 "register_operand" "d,d,d,d")
+			  (match_operand:SI 2 "register_operand" "d,d,d,d"))
+		 (match_operand:SI 3 "register_operand" "0,0,l,d")))
+   (clobber (match_scratch:SI 4 "=X,X,3,l"))
+   (clobber (match_scratch:SI 5 "=X,X,X,&d"))]
   "TARGET_MIPS3900 && !TARGET_MIPS16"
   "@
     madd\t%1,%2
+    madd\t%1,%2
     madd\t%0,%1,%2
     #"
   [(set_attr "type"	"imadd")
    (set_attr "accum_in"	"3")
    (set_attr "mode"	"SI")
-   (set_attr "insn_count" "1,1,2")])
+   (set_attr "insn_count" "1,1,1,2")
+   (set (attr "enabled")
+        (cond [(and (eq_attr "alternative" "0")
+                    (match_test "!mips_lra_flag"))
+                  (const_string "yes")
+               (and (eq_attr "alternative" "1")
+                    (match_test "mips_lra_flag"))
+                  (const_string "yes")
+               (eq_attr "alternative" "2,3")
+                  (const_string "yes")]
+              (const_string "no")))])
 
 ;; Split *mul_acc_si if both the source and destination accumulator
 ;; values are GPRs.
@@ -1859,20 +1885,31 @@
 
 ;; See the comment above *mul_add_si for details.
 (define_insn "*mul_sub_si"
-  [(set (match_operand:SI 0 "register_operand" "=l*?*?,d?")
-        (minus:SI (match_operand:SI 1 "register_operand" "0,d")
-                  (mult:SI (match_operand:SI 2 "register_operand" "d,d")
-                           (match_operand:SI 3 "register_operand" "d,d"))))
-   (clobber (match_scratch:SI 4 "=X,l"))
-   (clobber (match_scratch:SI 5 "=X,&d"))]
+  [(set (match_operand:SI 0 "register_operand" "=l*?*?,l,d?")
+        (minus:SI (match_operand:SI 1 "register_operand" "0,0,d")
+                  (mult:SI (match_operand:SI 2 "register_operand" "d,d,d")
+                           (match_operand:SI 3 "register_operand" "d,d,d"))))
+   (clobber (match_scratch:SI 4 "=X,X,l"))
+   (clobber (match_scratch:SI 5 "=X,X,&d"))]
   "GENERATE_MADD_MSUB"
   "@
    msub\t%2,%3
+   msub\t%2,%3
    #"
   [(set_attr "type"     "imadd")
    (set_attr "accum_in"	"1")
    (set_attr "mode"     "SI")
-   (set_attr "insn_count" "1,2")])
+   (set_attr "insn_count" "1,1,2")
+   (set (attr "enabled")
+        (cond [(and (eq_attr "alternative" "0")
+                    (match_test "!mips_lra_flag"))
+                  (const_string "yes")
+               (and (eq_attr "alternative" "1")
+                    (match_test "mips_lra_flag"))
+                  (const_string "yes")
+               (eq_attr "alternative" "2")
+                  (const_string "yes")]
+              (const_string "no")))])
 
 ;; Split *mul_sub_si if both the source and destination accumulator
 ;; values are GPRs.
@@ -4139,7 +4176,10 @@
   [(set (match_operand:DI 0 "register_operand" "=d")
 	(match_operand:DI 1 "absolute_symbolic_operand" ""))
    (clobber (match_scratch:DI 2 "=&d"))]
-  "TARGET_EXPLICIT_RELOCS && ABI_HAS_64BIT_SYMBOLS && cse_not_expected"
+  "!TARGET_MIPS16
+   && TARGET_EXPLICIT_RELOCS
+   && ABI_HAS_64BIT_SYMBOLS
+   && cse_not_expected"
   "#"
   "&& reload_completed"
   [(set (match_dup 0) (high:DI (match_dup 3)))
diff --git gcc/config/mips/mips.opt gcc/config/mips/mips.opt
index 6ee5398..c8e840d 100644
--- gcc/config/mips/mips.opt
+++ gcc/config/mips/mips.opt
@@ -380,6 +380,10 @@ msynci
 Target Report Mask(SYNCI)
 Use synci instruction to invalidate i-cache
 
+mlra
+Target Report Var(mips_lra_flag) Init(1) Save
+Use LRA instead of reload
+
 mtune=
 Target RejectNegative Joined Var(mips_tune_option) ToLower Enum(mips_arch_opt_value)
 -mtune=PROCESSOR	Optimize the output for PROCESSOR
diff --git gcc/lra-constraints.c gcc/lra-constraints.c
index 7d5103f..9c72670 100644
--- gcc/lra-constraints.c
+++ gcc/lra-constraints.c
@@ -2630,6 +2630,39 @@ valid_address_p (struct address_info *ad)
   return ok_p;
 }
 
+/* Make reload base reg from address AD.  */
+static rtx
+base_to_reg (struct address_info *ad)
+{
+  enum reg_class cl;
+  int code = -1;
+  rtx new_inner = NULL_RTX;
+  rtx new_reg = NULL_RTX;
+  rtx insn;
+  rtx last_insn = get_last_insn();
+
+  lra_assert (ad->base == ad->base_term && ad->disp == ad->disp_term);
+  cl = base_reg_class (ad->mode, ad->as, ad->base_outer_code,
+                       get_index_code (ad));
+  new_reg = lra_create_new_reg (GET_MODE (*ad->base_term), NULL_RTX,
+                                cl, "base");
+  new_inner = simplify_gen_binary (PLUS, GET_MODE (new_reg), new_reg,
+                                   ad->disp_term == NULL
+                                   ? gen_int_mode (0, ad->mode)
+                                   : *ad->disp_term);
+  if (!valid_address_p (ad->mode, new_inner, ad->as))
+    return NULL_RTX;
+  insn = emit_insn (gen_rtx_SET (ad->mode, new_reg, *ad->base_term));
+  code = recog_memoized (insn);
+  if (code < 0)
+    {
+      delete_insns_since (last_insn);
+      return NULL_RTX;
+    }
+
+  return new_inner;
+}
+
 /* Make reload base reg + disp from address AD.  Return the new pseudo.  */
 static rtx
 base_plus_disp_to_reg (struct address_info *ad)
@@ -2847,6 +2880,8 @@ process_address (int nop, rtx *before, rtx *after)
 
      3) the address is a frame address with an invalid offset.
 
+     4) the address is a frame address with an invalid base.
+
      All these cases involve a non-autoinc address, so there is no
      point revalidating other types.  */
   if (ad.autoinc_p || valid_address_p (&ad))
@@ -2928,14 +2963,19 @@ process_address (int nop, rtx *before, rtx *after)
       int regno;
       enum reg_class cl;
       rtx set, insns, last_insn;
+      /* Try to reload base into register only if the base is invalid
+         for the address but with valid offset, case (4) above.  */
+      start_sequence ();
+      new_reg = base_to_reg (&ad);
+
       /* base + disp => new base, cases (1) and (3) above.  */
       /* Another option would be to reload the displacement into an
 	 index register.  However, postreload has code to optimize
 	 address reloads that have the same base and different
 	 displacements, so reloading into an index register would
 	 not necessarily be a win.  */
-      start_sequence ();
-      new_reg = base_plus_disp_to_reg (&ad);
+      if (new_reg == NULL_RTX)
+        new_reg = base_plus_disp_to_reg (&ad);
       insns = get_insns ();
       last_insn = get_last_insn ();
       /* If we generated at least two insns, try last insn source as
-- 
1.7.9.5

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend
  2014-04-23 13:34               ` Robert Suchanek
@ 2014-04-23 14:10                 ` Richard Sandiford
  2014-04-23 15:34                   ` Robert Suchanek
  0 siblings, 1 reply; 31+ messages in thread
From: Richard Sandiford @ 2014-04-23 14:10 UTC (permalink / raw)
  To: Robert Suchanek; +Cc: gcc-patches, vmakarov

Robert Suchanek <Robert.Suchanek@imgtec.com> writes:
> If we were going to XFAIL the test then it would apply specifically
> for -mips16 -O1.  In any other combination it appears to work. Would
> that be a stopper?

Hmm, in that case maybe we should just leave it failing.  The alternative
would be to skip the test altogther for MIPS, with a PR referencing it,
but that seems a bit over-the-top.

> 2014-03-26  Robert Suchanek  <Robert.Suchanek@imgtec.com>
>
> 	* lra-constraints.c (base_to_reg): New function.
> 	(process_address): Use new function.
>
> 	* config/mips/constraints.md ("d"): BASE_REG_CLASS
> 	replaced by "TARGET_MIPS16 ? M16_REGS : GR_REGS".
> 	* config/mips/mips.c (mips_regno_mode_ok_for_base_p):
> 	Remove use !strict_p for MIPS16.
> 	(mips_register_priority): New function that implements
> 	the target hook TARGET_REGISTER_PRIORITY.
> 	(mips_spill_class): Likewise for TARGET_SPILL_CLASS
> 	(mips_lra_p): Likewise for TARGET_LRA_P.
> 	* config/mips/mips.h (reg_class): Add M16_SP_REGS and SPILL_REGS
> 	classes.
> 	(REG_CLASS_NAMES): Likewise.
> 	(REG_CLASS_CONTENTS): Likewise.
> 	(BASE_REG_CLASS): Use M16_SP_REGS.
> 	* config/mips/mips.md (*mul_acc_si, *mul_sub_si): Add alternative
> 	tuned for LRA. New set attribute to enable alternatives
> 	depending on the register allocator used.
> 	(*lea64): Disable pattern for MIPS16.
> 	* config/mips/mips.opt
> 	(mlra): New option

Looks good.

> @@ -12115,6 +12102,18 @@ mips_register_move_cost (enum machine_mode mode,
>    return 0;
>  }
>  
> +/* Return a register priority for hard reg REGNO.  */
> +
> +static int
> +mips_register_priority (int hard_regno)
> +{
> +  /* Treat MIPS16 registers with higher priority than other regs.  */
> +  if (TARGET_MIPS16
> +      && TEST_HARD_REG_BIT (reg_class_contents[M16_REGS], hard_regno))
> +    return 1;
> +  return 0;
> +}
> +
>  /* Implement TARGET_MEMORY_MOVE_COST.  */
>  
>  static int
> @@ -18897,6 +18896,21 @@ mips_atomic_assign_expand_fenv (tree *hold, tree *clear, tree *update)
>    *update = build2 (COMPOUND_EXPR, void_type_node, *update,
>  		    atomic_feraiseexcept_call);
>  }
> +
> +static reg_class_t
> +mips_spill_class (reg_class_t rclass ATTRIBUTE_UNUSED,
> +		  enum machine_mode mode ATTRIBUTE_UNUSED)
> +{
> +  if (TARGET_MIPS16)
> +    return SPILL_REGS;
> +  return NO_REGS;
> +}
> +
> +static bool
> +mips_lra_p (void)
> +{
> +  return mips_lra_flag;
> +}
>  
>  /* Initialize the GCC target structure.  */
>  #undef TARGET_ASM_ALIGNED_HI_OP

Please use comments of the form:

  /* Implement TARGET_FOO.  */

above all three functions (instead of the current one in the case of
mips_register_priority), just so that it's painfully obvious that
these are target hooks.

OK for the MIPS part with that change, thanks.

Out of interest, do you see any difference if you include $sp in SPILL_REGS?
That obviously doesn't make much conceptual sense, but it would give a
cleaner class hierarchy.

Richard

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend
  2014-04-21 13:01             ` Richard Sandiford
  2014-04-23 13:34               ` Robert Suchanek
@ 2014-04-23 15:33               ` Vladimir Makarov
  2014-05-03 19:21                 ` Richard Sandiford
  1 sibling, 1 reply; 31+ messages in thread
From: Vladimir Makarov @ 2014-04-23 15:33 UTC (permalink / raw)
  To: Robert Suchanek, gcc-patches, rdsandiford

On 2014-04-21, 8:23 AM, Richard Sandiford wrote:
> Robert Suchanek <Robert.Suchanek@imgtec.com> writes:
>>>>> Did you see the failures even after your mips_regno_mode_ok_for_base_p
>>>>> change?  LRA should know how to reload a "W" address.
>>>>
>>>> Yes but I realize there is more. It fails because $sp is now included
>>>> in BASE_REG_CLASS and "W" is based on it. However, I suppose that
>>>> it would be too eager to say it is wrong and likely there is
>>>> something missing
>>>> in LRA if we want to keep all alternatives. Currently there is no check
>>>> if a reloaded operand has a valid address, use of $sp in lbu/lhu cases.
>>>> Even if we added extra checks we are less likely to benefit as we need
>>>> to reload the base into register.
>>>
>>> Not sure what you mean, sorry.  "W" exists specifically to exclude
>>> $sp-based and $pc-based addresses.  LRA AFAIK should already be able
>>> to reload addresses that are valid in the TARGET_LEGITIMATE_ADDRESS_P
>>> sense but which do not match the constraints for a particular insn.
>>>
>>> Can you remember one of the tests that fails?
>>
>> I couldn't trigger the problem with the original testcase but found
>> another one that reveals it. The following needs to compiled with
>> -mips32r2 -mips16 -Os:
>>
>> struct { int addr; } c;
>> struct command { int args[1]; };
>> unsigned short a;
>>
>> fn1 (struct command *p1)
>> {
>>      unsigned short d;
>>      d = fn2 ();
>>      a = p1->args[0];
>>      fn3 (a);
>>      if (c.addr)
>>      {
>>          fn4 (p1->args[0]);
>>          return;
>>      }
>>      (&c)->addr = fn5 ();
>>      fn6 (d);
>> }
>
> Thanks.
>
>> Not sure how the constraint would/should exclude $sp-based address in
>> LRA.  In this particular case, a spilled pseudo is changed to memory
>> giving the following RTL form:
>>
>> (insn 30 29 31 4 (set (reg:SI 4 $4)
>>          (and:SI (mem/c:SI (plus:SI (reg/f:SI 78 $frame)
>>                      (const_int 16 [0x10])) [7 %sfp+16 S4 A32])
>>              (const_int 65535 [0xffff]))) shell.i:17 161 {*andsi3_mips16}
>>       (expr_list:REG_DEAD (reg:SI 194 [ D.1469 ])
>>          (nil)))
>>
>> The operand 1 during alternative selection is not marked as a bad
>> operand as it is a memory operand. $frame appears to be fine as it
>> could be eliminated later to hard register. No reloads are inserted
>> for the instructions concerned. Unless, $frame should be temporarily
>> eliminated and then a reload would be inserted?
>
> Yeah, I think the lack of elimination is the problem.  process_address
> eliminates $frame temporarily before checking whether the address
> is valid, but the places that check EXTRA_CONSTRAINT_STR pass the
> original uneliminated address.  So the legitimate_address_p hook sees
> the $sp-based address but the "W" constraint only sees the $frame-based
> address (which might or might not be valid, depending on whether $frame
> is eliminated to the stack or hard frame pointer).  I think the constraints
> should see the eliminated address too.
>
> This patch seems to fix it for me.  Tested on x86_64-linux-gnu.
> Vlad, is this OK for trunk?
>
> BTW, we might want to define something like:
>
> #define MODE_BASE_REG_CLASS(MODE) \
>    (TARGET_MIPS16 \
>     ? ((MODE) == SImode || (MODE) == DImode ? M16_SP_REGS : M16_REGS) \
>     : GR_REGS)
>
> instead of BASE_REG_CLASS.  It might lead to slightly better code
> (or not -- if it doesn't then don't bother :-)).
>
> If this patch is OK then I think the only thing blocking the switch
> to LRA is the asm-subreg-1.c failure.  I think it'd be fine to XFAIL
> that test on MIPS for now, until there's a consensus about what "X" means
> for asms.
>
>
> gcc/
> 	* lra-constraints.c (valid_address_p): Move earlier in file.
> 	Add a constraint argument to the address_info version.
> 	(satisfies_memory_constraint_p): New function.
> 	(satisfies_address_constraint_p): Likewise.
> 	(process_alt_operands, curr_insn_transform): Use them.
> 	(process_address): Pass the constraint to valid_address_p when
> 	checking address operands.
>
>

Yes, it looks ok for me, Richard.  Thanks on working on this.

I am on vacation till May 4th. If the patch results in problems on other 
targets, I hope you revert it.  But to be honest, I believe it is very 
safe and don't expect any problems at all.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend
  2014-04-23 14:10                 ` Richard Sandiford
@ 2014-04-23 15:34                   ` Robert Suchanek
  0 siblings, 0 replies; 31+ messages in thread
From: Robert Suchanek @ 2014-04-23 15:34 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: gcc-patches, vmakarov

> Hmm, in that case maybe we should just leave it failing.  The alternative
> would be to skip the test altogther for MIPS, with a PR referencing it,
> but that seems a bit over-the-top.

I'd leave it as it is for now until the consensus regarding the 'X' constraint
is reached.
 
> Please use comments of the form:
> 
>   /* Implement TARGET_FOO.  */
> 
> above all three functions (instead of the current one in the case of
> mips_register_priority), just so that it's painfully obvious that
> these are target hooks.

Modified as requested and attached the patch below. I tried to keep 
to the conventions but apparently I seem to overlook certain things.
I'll remember this part now :).

> Out of interest, do you see any difference if you include $sp in SPILL_REGS?
> That obviously doesn't make much conceptual sense, but it would give a
> cleaner class hierarchy.

Including $sp does not make any difference, exactly the same code size.
Although I haven't thoroughly tested it, I limited the check to "-Os".

Regards,
Robert

2014-03-26  Robert Suchanek  <Robert.Suchanek@imgtec.com>

	* lra-constraints.c (base_to_reg): New function.
	(process_address): Use new function.

	* config/mips/constraints.md ("d"): BASE_REG_CLASS
	replaced by "TARGET_MIPS16 ? M16_REGS : GR_REGS".
	* config/mips/mips.c (mips_regno_mode_ok_for_base_p):
	Remove use !strict_p for MIPS16.
	(mips_register_priority): New function that implements
	the target hook TARGET_REGISTER_PRIORITY.
	(mips_spill_class): Likewise for TARGET_SPILL_CLASS
	(mips_lra_p): Likewise for TARGET_LRA_P.
	* config/mips/mips.h (reg_class): Add M16_SP_REGS and SPILL_REGS
	classes.
	(REG_CLASS_NAMES): Likewise.
	(REG_CLASS_CONTENTS): Likewise.
	(BASE_REG_CLASS): Use M16_SP_REGS.
	* config/mips/mips.md (*mul_acc_si, *mul_sub_si): Add alternative
	tuned for LRA. New set attribute to enable alternatives
	depending on the register allocator used.
	(*lea64): Disable pattern for MIPS16.
	* config/mips/mips.opt
	(mlra): New option

diff --git gcc/config/mips/constraints.md gcc/config/mips/constraints.md
index f6834fd..fa33c30 100644
--- gcc/config/mips/constraints.md
+++ gcc/config/mips/constraints.md
@@ -19,7 +19,7 @@
 
 ;; Register constraints
 
-(define_register_constraint "d" "BASE_REG_CLASS"
+(define_register_constraint "d" "TARGET_MIPS16 ? M16_REGS : GR_REGS"
   "An address register.  This is equivalent to @code{r} unless
    generating MIPS16 code.")
 
diff --git gcc/config/mips/mips.c gcc/config/mips/mips.c
index 45256e9..f8d90b2 100644
--- gcc/config/mips/mips.c
+++ gcc/config/mips/mips.c
@@ -655,7 +655,7 @@ const enum reg_class mips_regno_to_class[FIRST_PSEUDO_REGISTER] = {
   M16_REGS,        M16_STORE_REGS,  LEA_REGS,        LEA_REGS,
   LEA_REGS,        LEA_REGS,        LEA_REGS,        LEA_REGS,
   T_REG,           PIC_FN_ADDR_REG, LEA_REGS,        LEA_REGS,
-  LEA_REGS,        LEA_REGS,        LEA_REGS,        LEA_REGS,
+  LEA_REGS,        M16_SP_REGS,     LEA_REGS,        LEA_REGS,
 
   FP_REGS,	FP_REGS,	FP_REGS,	FP_REGS,
   FP_REGS,	FP_REGS,	FP_REGS,	FP_REGS,
@@ -2241,22 +2241,9 @@ mips_regno_mode_ok_for_base_p (int regno, enum machine_mode mode,
     return true;
 
   /* In MIPS16 mode, the stack pointer can only address word and doubleword
-     values, nothing smaller.  There are two problems here:
-
-       (a) Instantiating virtual registers can introduce new uses of the
-	   stack pointer.  If these virtual registers are valid addresses,
-	   the stack pointer should be too.
-
-       (b) Most uses of the stack pointer are not made explicit until
-	   FRAME_POINTER_REGNUM and ARG_POINTER_REGNUM have been eliminated.
-	   We don't know until that stage whether we'll be eliminating to the
-	   stack pointer (which needs the restriction) or the hard frame
-	   pointer (which doesn't).
-
-     All in all, it seems more consistent to only enforce this restriction
-     during and after reload.  */
+     values, nothing smaller.  */
   if (TARGET_MIPS16 && regno == STACK_POINTER_REGNUM)
-    return !strict_p || GET_MODE_SIZE (mode) == 4 || GET_MODE_SIZE (mode) == 8;
+    return GET_MODE_SIZE (mode) == 4 || GET_MODE_SIZE (mode) == 8;
 
   return TARGET_MIPS16 ? M16_REG_P (regno) : GP_REG_P (regno);
 }
@@ -12115,6 +12102,18 @@ mips_register_move_cost (enum machine_mode mode,
   return 0;
 }
 
+/* Implement TARGET_REGISTER_PRIORITY.  */
+
+static int
+mips_register_priority (int hard_regno)
+{
+  /* Treat MIPS16 registers with higher priority than other regs.  */
+  if (TARGET_MIPS16
+      && TEST_HARD_REG_BIT (reg_class_contents[M16_REGS], hard_regno))
+    return 1;
+  return 0;
+}
+
 /* Implement TARGET_MEMORY_MOVE_COST.  */
 
 static int
@@ -18897,6 +18896,25 @@ mips_atomic_assign_expand_fenv (tree *hold, tree *clear, tree *update)
   *update = build2 (COMPOUND_EXPR, void_type_node, *update,
 		    atomic_feraiseexcept_call);
 }
+
+/* Implement TARGET_SPILL_CLASS.  */
+
+static reg_class_t
+mips_spill_class (reg_class_t rclass ATTRIBUTE_UNUSED,
+		  enum machine_mode mode ATTRIBUTE_UNUSED)
+{
+  if (TARGET_MIPS16)
+    return SPILL_REGS;
+  return NO_REGS;
+}
+
+/* Implement TARGET_LRA_P.  */
+
+static bool
+mips_lra_p (void)
+{
+  return mips_lra_flag;
+}
 

 /* Initialize the GCC target structure.  */
 #undef TARGET_ASM_ALIGNED_HI_OP
@@ -18960,6 +18978,8 @@ mips_atomic_assign_expand_fenv (tree *hold, tree *clear, tree *update)
 #define TARGET_VALID_POINTER_MODE mips_valid_pointer_mode
 #undef TARGET_REGISTER_MOVE_COST
 #define TARGET_REGISTER_MOVE_COST mips_register_move_cost
+#undef TARGET_REGISTER_PRIORITY
+#define TARGET_REGISTER_PRIORITY mips_register_priority
 #undef TARGET_MEMORY_MOVE_COST
 #define TARGET_MEMORY_MOVE_COST mips_memory_move_cost
 #undef TARGET_RTX_COSTS
@@ -19134,6 +19154,11 @@ mips_atomic_assign_expand_fenv (tree *hold, tree *clear, tree *update)
 #undef TARGET_ATOMIC_ASSIGN_EXPAND_FENV
 #define TARGET_ATOMIC_ASSIGN_EXPAND_FENV mips_atomic_assign_expand_fenv
 
+#undef TARGET_SPILL_CLASS
+#define TARGET_SPILL_CLASS mips_spill_class
+#undef TARGET_LRA_P
+#define TARGET_LRA_P mips_lra_p
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 

 #include "gt-mips.h"
diff --git gcc/config/mips/mips.h gcc/config/mips/mips.h
index b25865b..fb37999 100644
--- gcc/config/mips/mips.h
+++ gcc/config/mips/mips.h
@@ -1872,10 +1872,12 @@ enum reg_class
   NO_REGS,			/* no registers in set */
   M16_STORE_REGS,		/* microMIPS store registers  */
   M16_REGS,			/* mips16 directly accessible registers */
+  M16_SP_REGS,			/* mips16 + $sp */
   T_REG,			/* mips16 T register ($24) */
   M16_T_REGS,			/* mips16 registers plus T register */
   PIC_FN_ADDR_REG,		/* SVR4 PIC function address register */
   V1_REG,			/* Register $v1 ($3) used for TLS access.  */
+  SPILL_REGS,			/* All but $sp and call preserved regs are in here */
   LEA_REGS,			/* Every GPR except $25 */
   GR_REGS,			/* integer registers */
   FP_REGS,			/* floating point registers */
@@ -1910,10 +1912,12 @@ enum reg_class
   "NO_REGS",								\
   "M16_STORE_REGS",							\
   "M16_REGS",								\
+  "M16_SP_REGS",								\
   "T_REG",								\
   "M16_T_REGS",								\
   "PIC_FN_ADDR_REG",							\
   "V1_REG",								\
+  "SPILL_REGS",								\
   "LEA_REGS",								\
   "GR_REGS",								\
   "FP_REGS",								\
@@ -1951,10 +1955,12 @@ enum reg_class
   { 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000 },	/* NO_REGS */		\
   { 0x000200fc, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000 },	/* M16_STORE_REGS */	\
   { 0x000300fc, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000 },	/* M16_REGS */		\
+  { 0x200300fc, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000 },	/* M16_SP_REGS */		\
   { 0x01000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000 },	/* T_REG */		\
   { 0x010300fc, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000 },	/* M16_T_REGS */	\
   { 0x02000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000 },	/* PIC_FN_ADDR_REG */	\
   { 0x00000008, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000 },	/* V1_REG */		\
+  { 0x0303fffc, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000 },	/* SPILL_REGS */      	\
   { 0xfdffffff, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000 },	/* LEA_REGS */		\
   { 0xffffffff, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000 },	/* GR_REGS */		\
   { 0x00000000, 0xffffffff, 0x00000000, 0x00000000, 0x00000000, 0x00000000 },	/* FP_REGS */		\
@@ -1987,7 +1993,7 @@ enum reg_class
    valid base register must belong.  A base register is one used in
    an address which is the register value plus a displacement.  */
 
-#define BASE_REG_CLASS  (TARGET_MIPS16 ? M16_REGS : GR_REGS)
+#define BASE_REG_CLASS  (TARGET_MIPS16 ? M16_SP_REGS : GR_REGS)
 
 /* A macro whose definition is the name of the class to which a
    valid index register must belong.  An index register is one used
diff --git gcc/config/mips/mips.md gcc/config/mips/mips.md
index f914ab6..4377a69 100644
--- gcc/config/mips/mips.md
+++ gcc/config/mips/mips.md
@@ -1622,40 +1622,66 @@
 ;; copy instructions.  Reload therefore thinks that the second alternative
 ;; is two reloads more costly than the first.  We add "*?*?" to the first
 ;; alternative as a counterweight.
+;;
+;; LRA simulates reload but the cost of reloading scratches is lower
+;; than of the classic reload. For the time being, removing the counterweight
+;; for LRA is more profitable.
 (define_insn "*mul_acc_si"
-  [(set (match_operand:SI 0 "register_operand" "=l*?*?,d?")
-	(plus:SI (mult:SI (match_operand:SI 1 "register_operand" "d,d")
-			  (match_operand:SI 2 "register_operand" "d,d"))
-		 (match_operand:SI 3 "register_operand" "0,d")))
-   (clobber (match_scratch:SI 4 "=X,l"))
-   (clobber (match_scratch:SI 5 "=X,&d"))]
+  [(set (match_operand:SI 0 "register_operand" "=l*?*?,l,d?")
+	(plus:SI (mult:SI (match_operand:SI 1 "register_operand" "d,d,d")
+			  (match_operand:SI 2 "register_operand" "d,d,d"))
+		 (match_operand:SI 3 "register_operand" "0,0,d")))
+   (clobber (match_scratch:SI 4 "=X,X,l"))
+   (clobber (match_scratch:SI 5 "=X,X,&d"))]
   "GENERATE_MADD_MSUB && !TARGET_MIPS16"
   "@
     madd\t%1,%2
+    madd\t%1,%2
     #"
   [(set_attr "type"	"imadd")
    (set_attr "accum_in"	"3")
    (set_attr "mode"	"SI")
-   (set_attr "insn_count" "1,2")])
+   (set_attr "insn_count" "1,1,2")
+   (set (attr "enabled")
+        (cond [(and (eq_attr "alternative" "0")
+                    (match_test "!mips_lra_flag"))
+                  (const_string "yes")
+               (and (eq_attr "alternative" "1")
+                    (match_test "mips_lra_flag"))
+                  (const_string "yes")
+               (eq_attr "alternative" "2")
+                  (const_string "yes")]
+              (const_string "no")))])
 
 ;; The same idea applies here.  The middle alternative needs one less
 ;; clobber than the final alternative, so we add "*?" as a counterweight.
 (define_insn "*mul_acc_si_r3900"
-  [(set (match_operand:SI 0 "register_operand" "=l*?*?,d*?,d?")
-	(plus:SI (mult:SI (match_operand:SI 1 "register_operand" "d,d,d")
-			  (match_operand:SI 2 "register_operand" "d,d,d"))
-		 (match_operand:SI 3 "register_operand" "0,l,d")))
-   (clobber (match_scratch:SI 4 "=X,3,l"))
-   (clobber (match_scratch:SI 5 "=X,X,&d"))]
+  [(set (match_operand:SI 0 "register_operand" "=l*?*?,l,d*?,d?")
+	(plus:SI (mult:SI (match_operand:SI 1 "register_operand" "d,d,d,d")
+			  (match_operand:SI 2 "register_operand" "d,d,d,d"))
+		 (match_operand:SI 3 "register_operand" "0,0,l,d")))
+   (clobber (match_scratch:SI 4 "=X,X,3,l"))
+   (clobber (match_scratch:SI 5 "=X,X,X,&d"))]
   "TARGET_MIPS3900 && !TARGET_MIPS16"
   "@
     madd\t%1,%2
+    madd\t%1,%2
     madd\t%0,%1,%2
     #"
   [(set_attr "type"	"imadd")
    (set_attr "accum_in"	"3")
    (set_attr "mode"	"SI")
-   (set_attr "insn_count" "1,1,2")])
+   (set_attr "insn_count" "1,1,1,2")
+   (set (attr "enabled")
+        (cond [(and (eq_attr "alternative" "0")
+                    (match_test "!mips_lra_flag"))
+                  (const_string "yes")
+               (and (eq_attr "alternative" "1")
+                    (match_test "mips_lra_flag"))
+                  (const_string "yes")
+               (eq_attr "alternative" "2,3")
+                  (const_string "yes")]
+              (const_string "no")))])
 
 ;; Split *mul_acc_si if both the source and destination accumulator
 ;; values are GPRs.
@@ -1859,20 +1885,31 @@
 
 ;; See the comment above *mul_add_si for details.
 (define_insn "*mul_sub_si"
-  [(set (match_operand:SI 0 "register_operand" "=l*?*?,d?")
-        (minus:SI (match_operand:SI 1 "register_operand" "0,d")
-                  (mult:SI (match_operand:SI 2 "register_operand" "d,d")
-                           (match_operand:SI 3 "register_operand" "d,d"))))
-   (clobber (match_scratch:SI 4 "=X,l"))
-   (clobber (match_scratch:SI 5 "=X,&d"))]
+  [(set (match_operand:SI 0 "register_operand" "=l*?*?,l,d?")
+        (minus:SI (match_operand:SI 1 "register_operand" "0,0,d")
+                  (mult:SI (match_operand:SI 2 "register_operand" "d,d,d")
+                           (match_operand:SI 3 "register_operand" "d,d,d"))))
+   (clobber (match_scratch:SI 4 "=X,X,l"))
+   (clobber (match_scratch:SI 5 "=X,X,&d"))]
   "GENERATE_MADD_MSUB"
   "@
    msub\t%2,%3
+   msub\t%2,%3
    #"
   [(set_attr "type"     "imadd")
    (set_attr "accum_in"	"1")
    (set_attr "mode"     "SI")
-   (set_attr "insn_count" "1,2")])
+   (set_attr "insn_count" "1,1,2")
+   (set (attr "enabled")
+        (cond [(and (eq_attr "alternative" "0")
+                    (match_test "!mips_lra_flag"))
+                  (const_string "yes")
+               (and (eq_attr "alternative" "1")
+                    (match_test "mips_lra_flag"))
+                  (const_string "yes")
+               (eq_attr "alternative" "2")
+                  (const_string "yes")]
+              (const_string "no")))])
 
 ;; Split *mul_sub_si if both the source and destination accumulator
 ;; values are GPRs.
@@ -4139,7 +4176,10 @@
   [(set (match_operand:DI 0 "register_operand" "=d")
 	(match_operand:DI 1 "absolute_symbolic_operand" ""))
    (clobber (match_scratch:DI 2 "=&d"))]
-  "TARGET_EXPLICIT_RELOCS && ABI_HAS_64BIT_SYMBOLS && cse_not_expected"
+  "!TARGET_MIPS16
+   && TARGET_EXPLICIT_RELOCS
+   && ABI_HAS_64BIT_SYMBOLS
+   && cse_not_expected"
   "#"
   "&& reload_completed"
   [(set (match_dup 0) (high:DI (match_dup 3)))
diff --git gcc/config/mips/mips.opt gcc/config/mips/mips.opt
index 6ee5398..c8e840d 100644
--- gcc/config/mips/mips.opt
+++ gcc/config/mips/mips.opt
@@ -380,6 +380,10 @@ msynci
 Target Report Mask(SYNCI)
 Use synci instruction to invalidate i-cache
 
+mlra
+Target Report Var(mips_lra_flag) Init(1) Save
+Use LRA instead of reload
+
 mtune=
 Target RejectNegative Joined Var(mips_tune_option) ToLower Enum(mips_arch_opt_value)
 -mtune=PROCESSOR	Optimize the output for PROCESSOR
diff --git gcc/lra-constraints.c gcc/lra-constraints.c
index 7d5103f..9c72670 100644
--- gcc/lra-constraints.c
+++ gcc/lra-constraints.c
@@ -2630,6 +2630,39 @@ valid_address_p (struct address_info *ad)
   return ok_p;
 }
 
+/* Make reload base reg from address AD.  */
+static rtx
+base_to_reg (struct address_info *ad)
+{
+  enum reg_class cl;
+  int code = -1;
+  rtx new_inner = NULL_RTX;
+  rtx new_reg = NULL_RTX;
+  rtx insn;
+  rtx last_insn = get_last_insn();
+
+  lra_assert (ad->base == ad->base_term && ad->disp == ad->disp_term);
+  cl = base_reg_class (ad->mode, ad->as, ad->base_outer_code,
+                       get_index_code (ad));
+  new_reg = lra_create_new_reg (GET_MODE (*ad->base_term), NULL_RTX,
+                                cl, "base");
+  new_inner = simplify_gen_binary (PLUS, GET_MODE (new_reg), new_reg,
+                                   ad->disp_term == NULL
+                                   ? gen_int_mode (0, ad->mode)
+                                   : *ad->disp_term);
+  if (!valid_address_p (ad->mode, new_inner, ad->as))
+    return NULL_RTX;
+  insn = emit_insn (gen_rtx_SET (ad->mode, new_reg, *ad->base_term));
+  code = recog_memoized (insn);
+  if (code < 0)
+    {
+      delete_insns_since (last_insn);
+      return NULL_RTX;
+    }
+
+  return new_inner;
+}
+
 /* Make reload base reg + disp from address AD.  Return the new pseudo.  */
 static rtx
 base_plus_disp_to_reg (struct address_info *ad)
@@ -2847,6 +2880,8 @@ process_address (int nop, rtx *before, rtx *after)
 
      3) the address is a frame address with an invalid offset.
 
+     4) the address is a frame address with an invalid base.
+
      All these cases involve a non-autoinc address, so there is no
      point revalidating other types.  */
   if (ad.autoinc_p || valid_address_p (&ad))
@@ -2928,14 +2963,19 @@ process_address (int nop, rtx *before, rtx *after)
       int regno;
       enum reg_class cl;
       rtx set, insns, last_insn;
+      /* Try to reload base into register only if the base is invalid
+         for the address but with valid offset, case (4) above.  */
+      start_sequence ();
+      new_reg = base_to_reg (&ad);
+
       /* base + disp => new base, cases (1) and (3) above.  */
       /* Another option would be to reload the displacement into an
 	 index register.  However, postreload has code to optimize
 	 address reloads that have the same base and different
 	 displacements, so reloading into an index register would
 	 not necessarily be a win.  */
-      start_sequence ();
-      new_reg = base_plus_disp_to_reg (&ad);
+      if (new_reg == NULL_RTX)
+        new_reg = base_plus_disp_to_reg (&ad);
       insns = get_insns ();
       last_insn = get_last_insn ();
       /* If we generated at least two insns, try last insn source as
-- 
1.7.9.5

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend
  2014-04-23 15:33               ` Vladimir Makarov
@ 2014-05-03 19:21                 ` Richard Sandiford
  2014-05-06 14:06                   ` Kyrill Tkachov
  0 siblings, 1 reply; 31+ messages in thread
From: Richard Sandiford @ 2014-05-03 19:21 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: Robert Suchanek, gcc-patches

Vladimir Makarov <vmakarov@redhat.com> writes:
>>> Not sure how the constraint would/should exclude $sp-based address in
>>> LRA.  In this particular case, a spilled pseudo is changed to memory
>>> giving the following RTL form:
>>>
>>> (insn 30 29 31 4 (set (reg:SI 4 $4)
>>>          (and:SI (mem/c:SI (plus:SI (reg/f:SI 78 $frame)
>>>                      (const_int 16 [0x10])) [7 %sfp+16 S4 A32])
>>>              (const_int 65535 [0xffff]))) shell.i:17 161 {*andsi3_mips16}
>>>       (expr_list:REG_DEAD (reg:SI 194 [ D.1469 ])
>>>          (nil)))
>>>
>>> The operand 1 during alternative selection is not marked as a bad
>>> operand as it is a memory operand. $frame appears to be fine as it
>>> could be eliminated later to hard register. No reloads are inserted
>>> for the instructions concerned. Unless, $frame should be temporarily
>>> eliminated and then a reload would be inserted?
>>
>> Yeah, I think the lack of elimination is the problem.  process_address
>> eliminates $frame temporarily before checking whether the address
>> is valid, but the places that check EXTRA_CONSTRAINT_STR pass the
>> original uneliminated address.  So the legitimate_address_p hook sees
>> the $sp-based address but the "W" constraint only sees the $frame-based
>> address (which might or might not be valid, depending on whether $frame
>> is eliminated to the stack or hard frame pointer).  I think the constraints
>> should see the eliminated address too.
>>
>> This patch seems to fix it for me.  Tested on x86_64-linux-gnu.
>> Vlad, is this OK for trunk?
>>
>> BTW, we might want to define something like:
>>
>> #define MODE_BASE_REG_CLASS(MODE) \
>>    (TARGET_MIPS16 \
>>     ? ((MODE) == SImode || (MODE) == DImode ? M16_SP_REGS : M16_REGS) \
>>     : GR_REGS)
>>
>> instead of BASE_REG_CLASS.  It might lead to slightly better code
>> (or not -- if it doesn't then don't bother :-)).
>>
>> If this patch is OK then I think the only thing blocking the switch
>> to LRA is the asm-subreg-1.c failure.  I think it'd be fine to XFAIL
>> that test on MIPS for now, until there's a consensus about what "X" means
>> for asms.
>>
>>
>> gcc/
>> 	* lra-constraints.c (valid_address_p): Move earlier in file.
>> 	Add a constraint argument to the address_info version.
>> 	(satisfies_memory_constraint_p): New function.
>> 	(satisfies_address_constraint_p): Likewise.
>> 	(process_alt_operands, curr_insn_transform): Use them.
>> 	(process_address): Pass the constraint to valid_address_p when
>> 	checking address operands.
>>
>>
>
> Yes, it looks ok for me, Richard.  Thanks on working on this.
>
> I am on vacation till May 4th. If the patch results in problems on other 
> targets, I hope you revert it.  But to be honest, I believe it is very 
> safe and don't expect any problems at all.

Thanks Vlad, belatedly committed on that basis.  Like you say I'll revert
it at the first sign of trouble (although it ended up being closer to
your return than originally intended. :-))

Richard

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend
  2014-05-03 19:21                 ` Richard Sandiford
@ 2014-05-06 14:06                   ` Kyrill Tkachov
  2014-05-06 19:22                     ` Richard Sandiford
  0 siblings, 1 reply; 31+ messages in thread
From: Kyrill Tkachov @ 2014-05-06 14:06 UTC (permalink / raw)
  To: Vladimir Makarov, Robert Suchanek, gcc-patches, rdsandiford

On 03/05/14 20:21, Richard Sandiford wrote:
> Vladimir Makarov <vmakarov@redhat.com> writes:
>>>> Not sure how the constraint would/should exclude $sp-based address in
>>>> LRA.  In this particular case, a spilled pseudo is changed to memory
>>>> giving the following RTL form:
>>>>
>>>> (insn 30 29 31 4 (set (reg:SI 4 $4)
>>>>           (and:SI (mem/c:SI (plus:SI (reg/f:SI 78 $frame)
>>>>                       (const_int 16 [0x10])) [7 %sfp+16 S4 A32])
>>>>               (const_int 65535 [0xffff]))) shell.i:17 161 {*andsi3_mips16}
>>>>        (expr_list:REG_DEAD (reg:SI 194 [ D.1469 ])
>>>>           (nil)))
>>>>
>>>> The operand 1 during alternative selection is not marked as a bad
>>>> operand as it is a memory operand. $frame appears to be fine as it
>>>> could be eliminated later to hard register. No reloads are inserted
>>>> for the instructions concerned. Unless, $frame should be temporarily
>>>> eliminated and then a reload would be inserted?
>>> Yeah, I think the lack of elimination is the problem.  process_address
>>> eliminates $frame temporarily before checking whether the address
>>> is valid, but the places that check EXTRA_CONSTRAINT_STR pass the
>>> original uneliminated address.  So the legitimate_address_p hook sees
>>> the $sp-based address but the "W" constraint only sees the $frame-based
>>> address (which might or might not be valid, depending on whether $frame
>>> is eliminated to the stack or hard frame pointer).  I think the constraints
>>> should see the eliminated address too.
>>>
>>> This patch seems to fix it for me.  Tested on x86_64-linux-gnu.
>>> Vlad, is this OK for trunk?
>>>
>>> BTW, we might want to define something like:
>>>
>>> #define MODE_BASE_REG_CLASS(MODE) \
>>>     (TARGET_MIPS16 \
>>>      ? ((MODE) == SImode || (MODE) == DImode ? M16_SP_REGS : M16_REGS) \
>>>      : GR_REGS)
>>>
>>> instead of BASE_REG_CLASS.  It might lead to slightly better code
>>> (or not -- if it doesn't then don't bother :-)).
>>>
>>> If this patch is OK then I think the only thing blocking the switch
>>> to LRA is the asm-subreg-1.c failure.  I think it'd be fine to XFAIL
>>> that test on MIPS for now, until there's a consensus about what "X" means
>>> for asms.
>>>
>>>
>>> gcc/
>>> 	* lra-constraints.c (valid_address_p): Move earlier in file.
>>> 	Add a constraint argument to the address_info version.
>>> 	(satisfies_memory_constraint_p): New function.
>>> 	(satisfies_address_constraint_p): Likewise.
>>> 	(process_alt_operands, curr_insn_transform): Use them.
>>> 	(process_address): Pass the constraint to valid_address_p when
>>> 	checking address operands.
>>>
>>>
>> Yes, it looks ok for me, Richard.  Thanks on working on this.
>>
>> I am on vacation till May 4th. If the patch results in problems on other
>> targets, I hope you revert it.  But to be honest, I believe it is very
>> safe and don't expect any problems at all.
> Thanks Vlad, belatedly committed on that basis.  Like you say I'll revert
> it at the first sign of trouble (although it ended up being closer to
> your return than originally intended. :-))

Hi all,
This caused some testsuite failures on arm:
FAIL: gcc.target/arm/vfp-ldmdbd.c scan-assembler fldmdbd
FAIL: gcc.target/arm/vfp-ldmdbs.c scan-assembler fldmdbs
FAIL: gcc.target/arm/vfp-ldmiad.c scan-assembler fldmiad
FAIL: gcc.target/arm/vfp-ldmias.c scan-assembler fldmias

 From the vfp-ldmdbd.c test this patch changed the codegen from:
     fldmdbd    r5!, {d7}

into
     sub    r5, r5, #8
     fldd    d7, [r5]

Which broke the test.

Kyrill


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend
  2014-05-06 14:06                   ` Kyrill Tkachov
@ 2014-05-06 19:22                     ` Richard Sandiford
  2014-05-09 22:58                       ` Matthew Fortune
  0 siblings, 1 reply; 31+ messages in thread
From: Richard Sandiford @ 2014-05-06 19:22 UTC (permalink / raw)
  To: Kyrill Tkachov; +Cc: Vladimir Makarov, Robert Suchanek, gcc-patches

Kyrill Tkachov <kyrylo.tkachov@arm.com> writes:
> On 03/05/14 20:21, Richard Sandiford wrote:
>> Vladimir Makarov <vmakarov@redhat.com> writes:
>>>>> Not sure how the constraint would/should exclude $sp-based address in
>>>>> LRA.  In this particular case, a spilled pseudo is changed to memory
>>>>> giving the following RTL form:
>>>>>
>>>>> (insn 30 29 31 4 (set (reg:SI 4 $4)
>>>>>           (and:SI (mem/c:SI (plus:SI (reg/f:SI 78 $frame)
>>>>>                       (const_int 16 [0x10])) [7 %sfp+16 S4 A32])
>>>>>               (const_int 65535 [0xffff]))) shell.i:17 161 {*andsi3_mips16}
>>>>>        (expr_list:REG_DEAD (reg:SI 194 [ D.1469 ])
>>>>>           (nil)))
>>>>>
>>>>> The operand 1 during alternative selection is not marked as a bad
>>>>> operand as it is a memory operand. $frame appears to be fine as it
>>>>> could be eliminated later to hard register. No reloads are inserted
>>>>> for the instructions concerned. Unless, $frame should be temporarily
>>>>> eliminated and then a reload would be inserted?
>>>> Yeah, I think the lack of elimination is the problem.  process_address
>>>> eliminates $frame temporarily before checking whether the address
>>>> is valid, but the places that check EXTRA_CONSTRAINT_STR pass the
>>>> original uneliminated address.  So the legitimate_address_p hook sees
>>>> the $sp-based address but the "W" constraint only sees the $frame-based
>>>> address (which might or might not be valid, depending on whether $frame
>>>> is eliminated to the stack or hard frame pointer).  I think the constraints
>>>> should see the eliminated address too.
>>>>
>>>> This patch seems to fix it for me.  Tested on x86_64-linux-gnu.
>>>> Vlad, is this OK for trunk?
>>>>
>>>> BTW, we might want to define something like:
>>>>
>>>> #define MODE_BASE_REG_CLASS(MODE) \
>>>>     (TARGET_MIPS16 \
>>>>      ? ((MODE) == SImode || (MODE) == DImode ? M16_SP_REGS : M16_REGS) \
>>>>      : GR_REGS)
>>>>
>>>> instead of BASE_REG_CLASS.  It might lead to slightly better code
>>>> (or not -- if it doesn't then don't bother :-)).
>>>>
>>>> If this patch is OK then I think the only thing blocking the switch
>>>> to LRA is the asm-subreg-1.c failure.  I think it'd be fine to XFAIL
>>>> that test on MIPS for now, until there's a consensus about what "X" means
>>>> for asms.
>>>>
>>>>
>>>> gcc/
>>>> 	* lra-constraints.c (valid_address_p): Move earlier in file.
>>>> 	Add a constraint argument to the address_info version.
>>>> 	(satisfies_memory_constraint_p): New function.
>>>> 	(satisfies_address_constraint_p): Likewise.
>>>> 	(process_alt_operands, curr_insn_transform): Use them.
>>>> 	(process_address): Pass the constraint to valid_address_p when
>>>> 	checking address operands.
>>>>
>>>>
>>> Yes, it looks ok for me, Richard.  Thanks on working on this.
>>>
>>> I am on vacation till May 4th. If the patch results in problems on other
>>> targets, I hope you revert it.  But to be honest, I believe it is very
>>> safe and don't expect any problems at all.
>> Thanks Vlad, belatedly committed on that basis.  Like you say I'll revert
>> it at the first sign of trouble (although it ended up being closer to
>> your return than originally intended. :-))
>
> Hi all,
> This caused some testsuite failures on arm:
> FAIL: gcc.target/arm/vfp-ldmdbd.c scan-assembler fldmdbd
> FAIL: gcc.target/arm/vfp-ldmdbs.c scan-assembler fldmdbs
> FAIL: gcc.target/arm/vfp-ldmiad.c scan-assembler fldmiad
> FAIL: gcc.target/arm/vfp-ldmias.c scan-assembler fldmias
>
>  From the vfp-ldmdbd.c test this patch changed the codegen from:
>      fldmdbd    r5!, {d7}
>
> into
>      sub    r5, r5, #8
>      fldd    d7, [r5]
>
> Which broke the test.

Sorry for the breakage.  I've reverted the patch for now and will send a
fixed version when I have time.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend
  2014-05-06 19:22                     ` Richard Sandiford
@ 2014-05-09 22:58                       ` Matthew Fortune
  2014-05-10 18:44                         ` Richard Sandiford
  0 siblings, 1 reply; 31+ messages in thread
From: Matthew Fortune @ 2014-05-09 22:58 UTC (permalink / raw)
  To: Richard Sandiford, Vladimir Makarov
  Cc: Robert Suchanek, gcc-patches, Kyrill Tkachov

Richard/Vlad,

Richard Sandiford <rdsandiford@googlemail.com> writes:
> Kyrill Tkachov <kyrylo.tkachov@arm.com> writes:
> > On 03/05/14 20:21, Richard Sandiford wrote:

...snip...

> > Hi all,
> > This caused some testsuite failures on arm:
> > FAIL: gcc.target/arm/vfp-ldmdbd.c scan-assembler fldmdbd
> > FAIL: gcc.target/arm/vfp-ldmdbs.c scan-assembler fldmdbs
> > FAIL: gcc.target/arm/vfp-ldmiad.c scan-assembler fldmiad
> > FAIL: gcc.target/arm/vfp-ldmias.c scan-assembler fldmias
> >
> >  From the vfp-ldmdbd.c test this patch changed the codegen from:
> >      fldmdbd    r5!, {d7}
> >
> > into
> >      sub    r5, r5, #8
> >      fldd    d7, [r5]
> >
> > Which broke the test.
> 
> Sorry for the breakage.  I've reverted the patch for now and will send a
> fixed version when I have time.

The problem appears to lie with the new satisfies_memory_constraint_p
which is passing just the address to valid_address_p but the constraint
is a memory constraint (which wants the mem to be retained).

One option is to re-construct the MEM using the address_info provided
to valid_address_p. This has mode, address space and whether it was
actually a MEM to start with so there is enough information. 

Another issue noticed while looking at this:
process_address does not seem to make an attempt to check for
EXTRA_MEMORY_CONSTRAINT even though it does decompose memory addresses.
For the lea address case then the address is checked with valid_address_p.
This seems inconsistent, is it intentional?

The patch below on top of Richard's addresses both problems and for the
fldmdbd test gets the correct output. I haven't done any more testing
than that though. I suspect there may be a better approach to achieve
the same effect but at least this shows what is wrong with the original
change.

Regards,
Matthew

diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c
index f59bf55..22bb273 100644
--- a/gcc/lra-constraints.c
+++ b/gcc/lra-constraints.c
@@ -348,6 +348,9 @@ valid_address_p (struct address_info *ad, const char *constraint = 0)
   rtx saved_index_reg = NULL_RTX;
   rtx *base_term = strip_subreg (ad->base_term);
   rtx *index_term = strip_subreg (ad->index_term);
+#ifdef EXTRA_CONSTRAINT_STR
+  rtx orig_op = NULL_RTX;
+#endif
   if (base_term != NULL)
     {
       saved_base_reg = *base_term;
@@ -360,9 +363,18 @@ valid_address_p (struct address_info *ad, const char *constraint = 0)
       saved_index_reg = *index_term;
       lra_eliminate_reg_if_possible (index_term);
     }
+#ifdef EXTRA_CONSTRAINT_STR
+  if (ad->addr_outer_code == MEM)
+    {
+      orig_op = gen_rtx_MEM (ad->mode, *ad->outer);
+      MEM_ADDR_SPACE (orig_op) = ad->as;
+    }
+  else
+    orig_op = *ad->outer;
+#endif
   bool ok_p = (constraint
 #ifdef EXTRA_CONSTRAINT_STR
-	       ? EXTRA_CONSTRAINT_STR (*ad->outer, constraint[0], constraint)
+	       ? EXTRA_CONSTRAINT_STR (orig_op, constraint[0], constraint)
 #else
 	       ? false
 #endif
@@ -2865,7 +2877,8 @@ process_address (int nop, rtx *before, rtx *after)
   /* Target hooks sometimes reject extra constraint addresses -- use
      EXTRA_CONSTRAINT_STR for the validation.  */
   if (constraint[0] != 'p'
-      && EXTRA_ADDRESS_CONSTRAINT (constraint[0], constraint)
+      && (EXTRA_ADDRESS_CONSTRAINT (constraint[0], constraint)
+	  || EXTRA_MEMORY_CONSTRAINT (constraint[0], constraint))
       && valid_address_p (&ad, constraint))
     return change_p;
 #endif

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend
  2014-05-09 22:58                       ` Matthew Fortune
@ 2014-05-10 18:44                         ` Richard Sandiford
  2014-05-14 13:24                           ` Robert Suchanek
  2014-05-18 19:38                           ` Richard Sandiford
  0 siblings, 2 replies; 31+ messages in thread
From: Richard Sandiford @ 2014-05-10 18:44 UTC (permalink / raw)
  To: Matthew Fortune
  Cc: Vladimir Makarov, Robert Suchanek, gcc-patches, Kyrill Tkachov

Thanks for looking at this.

Matthew Fortune <Matthew.Fortune@imgtec.com> writes:
>> > Hi all,
>> > This caused some testsuite failures on arm:
>> > FAIL: gcc.target/arm/vfp-ldmdbd.c scan-assembler fldmdbd
>> > FAIL: gcc.target/arm/vfp-ldmdbs.c scan-assembler fldmdbs
>> > FAIL: gcc.target/arm/vfp-ldmiad.c scan-assembler fldmiad
>> > FAIL: gcc.target/arm/vfp-ldmias.c scan-assembler fldmias
>> >
>> >  From the vfp-ldmdbd.c test this patch changed the codegen from:
>> >      fldmdbd    r5!, {d7}
>> >
>> > into
>> >      sub    r5, r5, #8
>> >      fldd    d7, [r5]
>> >
>> > Which broke the test.
>> 
>> Sorry for the breakage.  I've reverted the patch for now and will send a
>> fixed version when I have time.
>
> The problem appears to lie with the new satisfies_memory_constraint_p
> which is passing just the address to valid_address_p but the constraint
> is a memory constraint (which wants the mem to be retained).

Yeah.

> One option is to re-construct the MEM using the address_info provided
> to valid_address_p. This has mode, address space and whether it was
> actually a MEM to start with so there is enough information. 

We don't want to create garbage rtl though.  The substitution happens
in-place, so the routine does temporarily turn the original MEM into the
eliminated form.

My first reaction after seeing the bug was to pass the operand in as
another parameter to valid_address_p.  That made the interface a bit
ridiculous though, so I didn't post it and reverted the original patch
instead.

I think a cleaner way of doing it would be to have helper functions
that switch in and out of the eliminated form, storing the old form
in fields of a new structure (either separate from address_info,
or a local inheritance of it).  We probably also want to have arrays
of address_infos, one for each operand, so that we don't analyse the
same address too many times during the same insn.

> Another issue noticed while looking at this:
> process_address does not seem to make an attempt to check for
> EXTRA_MEMORY_CONSTRAINT even though it does decompose memory addresses.
> For the lea address case then the address is checked with valid_address_p.
> This seems inconsistent, is it intentional?

Yeah, I think so.  Memory constraints are different in two main ways:

(a) it's obvious from the MEM what the mode and address space of the
    access actually are.  legitimate_address_p is all about the most
    general address, so in practice no memory constraint should rely
    on accepting more than legitimate_address_p does.

(b) it's useful for one pattern to have several alternatives with
    different memory constraints (e.g. "ZS", "ZT" and "m" in the 32-bit
    MIPS move patterns).  There isn't really anything special about the
    first alternative.

IMO it'd be good to get rid of this special handling for extra address
constraints, but I've no idea how easy that would be.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend
  2014-05-10 18:44                         ` Richard Sandiford
@ 2014-05-14 13:24                           ` Robert Suchanek
  2014-05-15 21:05                             ` Robert Suchanek
  2014-05-15 21:34                             ` Richard Sandiford
  2014-05-18 19:38                           ` Richard Sandiford
  1 sibling, 2 replies; 31+ messages in thread
From: Robert Suchanek @ 2014-05-14 13:24 UTC (permalink / raw)
  To: Richard Sandiford, Matthew Fortune
  Cc: Vladimir Makarov, gcc-patches, Kyrill Tkachov

Hi Richard,

Are you working on the solution to fix the breakage? I'm about
to look into this and wanted to find out how far we got with this.

Regards,
Robert

> -----Original Message-----
> From: Richard Sandiford [mailto:rdsandiford@googlemail.com]
> Sent: 10 May 2014 19:44
> To: Matthew Fortune
> Cc: Vladimir Makarov; Robert Suchanek; gcc-patches@gcc.gnu.org; Kyrill
> Tkachov
> Subject: Re: [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend
> 
> Thanks for looking at this.
> 
> Matthew Fortune <Matthew.Fortune@imgtec.com> writes:
> >> > Hi all,
> >> > This caused some testsuite failures on arm:
> >> > FAIL: gcc.target/arm/vfp-ldmdbd.c scan-assembler fldmdbd
> >> > FAIL: gcc.target/arm/vfp-ldmdbs.c scan-assembler fldmdbs
> >> > FAIL: gcc.target/arm/vfp-ldmiad.c scan-assembler fldmiad
> >> > FAIL: gcc.target/arm/vfp-ldmias.c scan-assembler fldmias
> >> >
> >> >  From the vfp-ldmdbd.c test this patch changed the codegen from:
> >> >      fldmdbd    r5!, {d7}
> >> >
> >> > into
> >> >      sub    r5, r5, #8
> >> >      fldd    d7, [r5]
> >> >
> >> > Which broke the test.
> >>
> >> Sorry for the breakage.  I've reverted the patch for now and will send a
> >> fixed version when I have time.
> >
> > The problem appears to lie with the new satisfies_memory_constraint_p
> > which is passing just the address to valid_address_p but the constraint
> > is a memory constraint (which wants the mem to be retained).
> 
> Yeah.
> 
> > One option is to re-construct the MEM using the address_info provided
> > to valid_address_p. This has mode, address space and whether it was
> > actually a MEM to start with so there is enough information.
> 
> We don't want to create garbage rtl though.  The substitution happens
> in-place, so the routine does temporarily turn the original MEM into the
> eliminated form.
> 
> My first reaction after seeing the bug was to pass the operand in as
> another parameter to valid_address_p.  That made the interface a bit
> ridiculous though, so I didn't post it and reverted the original patch
> instead.
> 
> I think a cleaner way of doing it would be to have helper functions
> that switch in and out of the eliminated form, storing the old form
> in fields of a new structure (either separate from address_info,
> or a local inheritance of it).  We probably also want to have arrays
> of address_infos, one for each operand, so that we don't analyse the
> same address too many times during the same insn.
> 
> > Another issue noticed while looking at this:
> > process_address does not seem to make an attempt to check for
> > EXTRA_MEMORY_CONSTRAINT even though it does decompose memory addresses.
> > For the lea address case then the address is checked with valid_address_p.
> > This seems inconsistent, is it intentional?
> 
> Yeah, I think so.  Memory constraints are different in two main ways:
> 
> (a) it's obvious from the MEM what the mode and address space of the
>     access actually are.  legitimate_address_p is all about the most
>     general address, so in practice no memory constraint should rely
>     on accepting more than legitimate_address_p does.
> 
> (b) it's useful for one pattern to have several alternatives with
>     different memory constraints (e.g. "ZS", "ZT" and "m" in the 32-bit
>     MIPS move patterns).  There isn't really anything special about the
>     first alternative.
> 
> IMO it'd be good to get rid of this special handling for extra address
> constraints, but I've no idea how easy that would be.
> 
> Thanks,
> Richard

^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend
  2014-05-14 13:24                           ` Robert Suchanek
@ 2014-05-15 21:05                             ` Robert Suchanek
  2014-05-15 21:34                             ` Richard Sandiford
  1 sibling, 0 replies; 31+ messages in thread
From: Robert Suchanek @ 2014-05-15 21:05 UTC (permalink / raw)
  To: Richard Sandiford, Matthew Fortune
  Cc: Vladimir Makarov, gcc-patches, Kyrill Tkachov

Ping.
________________________________________
From: Robert Suchanek
Sent: 14 May 2014 14:24
To: Richard Sandiford; Matthew Fortune
Cc: Vladimir Makarov; gcc-patches@gcc.gnu.org; Kyrill Tkachov
Subject: RE: [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend

Hi Richard,

Are you working on the solution to fix the breakage? I'm about
to look into this and wanted to find out how far we got with this.

Regards,
Robert

> -----Original Message-----
> From: Richard Sandiford [mailto:rdsandiford@googlemail.com]
> Sent: 10 May 2014 19:44
> To: Matthew Fortune
> Cc: Vladimir Makarov; Robert Suchanek; gcc-patches@gcc.gnu.org; Kyrill
> Tkachov
> Subject: Re: [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend
>
> Thanks for looking at this.
>
> Matthew Fortune <Matthew.Fortune@imgtec.com> writes:
> >> > Hi all,
> >> > This caused some testsuite failures on arm:
> >> > FAIL: gcc.target/arm/vfp-ldmdbd.c scan-assembler fldmdbd
> >> > FAIL: gcc.target/arm/vfp-ldmdbs.c scan-assembler fldmdbs
> >> > FAIL: gcc.target/arm/vfp-ldmiad.c scan-assembler fldmiad
> >> > FAIL: gcc.target/arm/vfp-ldmias.c scan-assembler fldmias
> >> >
> >> >  From the vfp-ldmdbd.c test this patch changed the codegen from:
> >> >      fldmdbd    r5!, {d7}
> >> >
> >> > into
> >> >      sub    r5, r5, #8
> >> >      fldd    d7, [r5]
> >> >
> >> > Which broke the test.
> >>
> >> Sorry for the breakage.  I've reverted the patch for now and will send a
> >> fixed version when I have time.
> >
> > The problem appears to lie with the new satisfies_memory_constraint_p
> > which is passing just the address to valid_address_p but the constraint
> > is a memory constraint (which wants the mem to be retained).
>
> Yeah.
>
> > One option is to re-construct the MEM using the address_info provided
> > to valid_address_p. This has mode, address space and whether it was
> > actually a MEM to start with so there is enough information.
>
> We don't want to create garbage rtl though.  The substitution happens
> in-place, so the routine does temporarily turn the original MEM into the
> eliminated form.
>
> My first reaction after seeing the bug was to pass the operand in as
> another parameter to valid_address_p.  That made the interface a bit
> ridiculous though, so I didn't post it and reverted the original patch
> instead.
>
> I think a cleaner way of doing it would be to have helper functions
> that switch in and out of the eliminated form, storing the old form
> in fields of a new structure (either separate from address_info,
> or a local inheritance of it).  We probably also want to have arrays
> of address_infos, one for each operand, so that we don't analyse the
> same address too many times during the same insn.
>
> > Another issue noticed while looking at this:
> > process_address does not seem to make an attempt to check for
> > EXTRA_MEMORY_CONSTRAINT even though it does decompose memory addresses.
> > For the lea address case then the address is checked with valid_address_p.
> > This seems inconsistent, is it intentional?
>
> Yeah, I think so.  Memory constraints are different in two main ways:
>
> (a) it's obvious from the MEM what the mode and address space of the
>     access actually are.  legitimate_address_p is all about the most
>     general address, so in practice no memory constraint should rely
>     on accepting more than legitimate_address_p does.
>
> (b) it's useful for one pattern to have several alternatives with
>     different memory constraints (e.g. "ZS", "ZT" and "m" in the 32-bit
>     MIPS move patterns).  There isn't really anything special about the
>     first alternative.
>
> IMO it'd be good to get rid of this special handling for extra address
> constraints, but I've no idea how easy that would be.
>
> Thanks,
> Richard

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend
  2014-05-14 13:24                           ` Robert Suchanek
  2014-05-15 21:05                             ` Robert Suchanek
@ 2014-05-15 21:34                             ` Richard Sandiford
  2014-05-16 21:05                               ` Robert Suchanek
  1 sibling, 1 reply; 31+ messages in thread
From: Richard Sandiford @ 2014-05-15 21:34 UTC (permalink / raw)
  To: Robert Suchanek
  Cc: Matthew Fortune, Vladimir Makarov, gcc-patches, Kyrill Tkachov

Robert Suchanek <Robert.Suchanek@imgtec.com> writes:
> Are you working on the solution to fix the breakage? I'm about
> to look into this and wanted to find out how far we got with this.

You mean the "cleaner way" I suggested, or something else?
If you want to have a go then feel free.  Otherwise I'll try to get
to it over the weekend.

Richard

^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend
  2014-05-15 21:34                             ` Richard Sandiford
@ 2014-05-16 21:05                               ` Robert Suchanek
  0 siblings, 0 replies; 31+ messages in thread
From: Robert Suchanek @ 2014-05-16 21:05 UTC (permalink / raw)
  To: Richard Sandiford
  Cc: Matthew Fortune, Vladimir Makarov, gcc-patches, Kyrill Tkachov

I was thinking of something else but it doesn't appear to be good enough
and most likely will follow the suggested way. 

Regards,
Robert
________________________________________
From: Richard Sandiford [rdsandiford@googlemail.com]
Sent: 15 May 2014 22:34
To: Robert Suchanek
Cc: Matthew Fortune; Vladimir Makarov; gcc-patches@gcc.gnu.org; Kyrill Tkachov
Subject: Re: [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend

Robert Suchanek <Robert.Suchanek@imgtec.com> writes:
> Are you working on the solution to fix the breakage? I'm about
> to look into this and wanted to find out how far we got with this.

You mean the "cleaner way" I suggested, or something else?
If you want to have a go then feel free.  Otherwise I'll try to get
to it over the weekend.

Richard

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend
  2014-05-10 18:44                         ` Richard Sandiford
  2014-05-14 13:24                           ` Robert Suchanek
@ 2014-05-18 19:38                           ` Richard Sandiford
  2014-06-02 19:37                             ` RFA: Make LRA temporarily eliminate addresses before testing constraints Richard Sandiford
  1 sibling, 1 reply; 31+ messages in thread
From: Richard Sandiford @ 2014-05-18 19:38 UTC (permalink / raw)
  To: Matthew Fortune
  Cc: Vladimir Makarov, Robert Suchanek, gcc-patches, Kyrill Tkachov

Richard Sandiford <rdsandiford@googlemail.com> writes:
> I think a cleaner way of doing it would be to have helper functions
> that switch in and out of the eliminated form, storing the old form
> in fields of a new structure (either separate from address_info,
> or a local inheritance of it).  We probably also want to have arrays
> of address_infos, one for each operand, so that we don't analyse the
> same address too many times during the same insn.

In the end maintaining the array of address_infos seemed like too much
work.  It was hard to keep it up-to-date with various other changes
that can be made, including swapping commutative operands, to the point
where it wasn't obvious whether it was really an optimisation or not.

Here's a patch that does the first.  Tested on x86_64-linux-gnu.
This time I also compared the assembly output for gcc.dg, g++.dg
and gcc.c-torture at -O2 on:

  arch64-linux-gnu arm-eabi mipsisa64-sde-elf s390x-linux-gnu
  powerpc64-linux-gnu x86_64-linux-gnu

s390x in particular is very good at exposing problems with this code.
(It caught bugs in the aborted attempt to keep an array of address_infos.)

OK to install?

Thanks,
Richard


gcc/
	* lra-constraints.c (valid_address_p): Move earlier in file.
	(address_eliminator): New structure.
	(satisfies_memory_constraint_p): New function.
	(satisfies_address_constraint_p): Likewise.
	(process_alt_operands, process_address, curr_insn_transform): Use them.

Index: gcc/lra-constraints.c
===================================================================
--- gcc/lra-constraints.c	2014-05-17 17:49:19.071639652 +0100
+++ gcc/lra-constraints.c	2014-05-18 20:36:17.499181467 +0100
@@ -317,6 +317,118 @@ in_mem_p (int regno)
   return get_reg_class (regno) == NO_REGS;
 }
 
+/* Return 1 if ADDR is a valid memory address for mode MODE in address
+   space AS, and check that each pseudo has the proper kind of hard
+   reg.	 */
+static int
+valid_address_p (enum machine_mode mode ATTRIBUTE_UNUSED,
+		 rtx addr, addr_space_t as)
+{
+#ifdef GO_IF_LEGITIMATE_ADDRESS
+  lra_assert (ADDR_SPACE_GENERIC_P (as));
+  GO_IF_LEGITIMATE_ADDRESS (mode, addr, win);
+  return 0;
+
+ win:
+  return 1;
+#else
+  return targetm.addr_space.legitimate_address_p (mode, addr, 0, as);
+#endif
+}
+
+namespace {
+  /* Temporarily eliminates registers in an address (for the lifetime of
+     the object).  */
+  class address_eliminator {
+  public:
+    address_eliminator (struct address_info *ad);
+    ~address_eliminator ();
+
+  private:
+    struct address_info *m_ad;
+    rtx *m_base_loc;
+    rtx m_base_reg;
+    rtx *m_index_loc;
+    rtx m_index_reg;
+  };
+}
+
+address_eliminator::address_eliminator (struct address_info *ad)
+  : m_ad (ad),
+    m_base_loc (strip_subreg (ad->base_term)),
+    m_base_reg (NULL_RTX),
+    m_index_loc (strip_subreg (ad->index_term)),
+    m_index_reg (NULL_RTX)
+{
+  if (m_base_loc != NULL)
+    {
+      m_base_reg = *m_base_loc;
+      lra_eliminate_reg_if_possible (m_base_loc);
+      if (m_ad->base_term2 != NULL)
+	*m_ad->base_term2 = *m_ad->base_term;
+    }
+  if (m_index_loc != NULL)
+    {
+      m_index_reg = *m_index_loc;
+      lra_eliminate_reg_if_possible (m_index_loc);
+    }
+}
+
+address_eliminator::~address_eliminator ()
+{
+  if (m_base_loc && *m_base_loc != m_base_reg)
+    {
+      *m_base_loc = m_base_reg;
+      if (m_ad->base_term2 != NULL)
+	*m_ad->base_term2 = *m_ad->base_term;
+    }
+  if (m_index_loc && *m_index_loc != m_index_reg)
+    *m_index_loc = m_index_reg;
+}
+
+/* Return true if the eliminated form of AD is a legitimate target address.  */
+static bool
+valid_address_p (struct address_info *ad)
+{
+  address_eliminator eliminator (ad);
+  return valid_address_p (ad->mode, *ad->outer, ad->as);
+}
+
+#ifdef EXTRA_CONSTRAINT_STR
+/* Return true if the eliminated form of memory reference OP satisfies
+   extra address constraint CONSTRAINT.  */
+static bool
+satisfies_memory_constraint_p (rtx op, const char *constraint)
+{
+  struct address_info ad;
+
+  decompose_mem_address (&ad, op);
+  address_eliminator eliminator (&ad);
+  return EXTRA_CONSTRAINT_STR (op, *constraint, constraint);
+}
+
+/* Return true if the eliminated form of address AD satisfies extra
+   address constraint CONSTRAINT.  */
+static bool
+satisfies_address_constraint_p (struct address_info *ad,
+				const char *constraint)
+{
+  address_eliminator eliminator (ad);
+  return EXTRA_CONSTRAINT_STR (*ad->outer, *constraint, constraint);
+}
+
+/* Return true if the eliminated form of address OP satisfies extra
+   address constraint CONSTRAINT.  */
+static bool
+satisfies_address_constraint_p (rtx op, const char *constraint)
+{
+  struct address_info ad;
+
+  decompose_lea_address (&ad, &op);
+  return satisfies_address_constraint_p (&ad, constraint);
+}
+#endif
+
 /* Initiate equivalences for LRA.  As we keep original equivalences
    before any elimination, we need to make copies otherwise any change
    in insns might change the equivalences.  */
@@ -1941,7 +2053,8 @@ process_alt_operands (int only_alternati
 #ifdef EXTRA_CONSTRAINT_STR
 		      if (EXTRA_MEMORY_CONSTRAINT (c, p))
 			{
-			  if (EXTRA_CONSTRAINT_STR (op, c, p))
+			  if (MEM_P (op)
+			      && satisfies_memory_constraint_p (op, p))
 			    win = true;
 			  else if (spilled_pseudo_p (op))
 			    win = true;
@@ -1960,7 +2073,7 @@ process_alt_operands (int only_alternati
 			}
 		      if (EXTRA_ADDRESS_CONSTRAINT (c, p))
 			{
-			  if (EXTRA_CONSTRAINT_STR (op, c, p))
+			  if (satisfies_address_constraint_p (op, p))
 			    win = true;
 
 			  /* If we didn't already win, we can reload
@@ -2576,60 +2689,6 @@ process_alt_operands (int only_alternati
   return ok_p;
 }
 
-/* Return 1 if ADDR is a valid memory address for mode MODE in address
-   space AS, and check that each pseudo has the proper kind of hard
-   reg.	 */
-static int
-valid_address_p (enum machine_mode mode ATTRIBUTE_UNUSED,
-		 rtx addr, addr_space_t as)
-{
-#ifdef GO_IF_LEGITIMATE_ADDRESS
-  lra_assert (ADDR_SPACE_GENERIC_P (as));
-  GO_IF_LEGITIMATE_ADDRESS (mode, addr, win);
-  return 0;
-
- win:
-  return 1;
-#else
-  return targetm.addr_space.legitimate_address_p (mode, addr, 0, as);
-#endif
-}
-
-/* Return whether address AD is valid.  */
-
-static bool
-valid_address_p (struct address_info *ad)
-{
-  /* Some ports do not check displacements for eliminable registers,
-     so we replace them temporarily with the elimination target.  */
-  rtx saved_base_reg = NULL_RTX;
-  rtx saved_index_reg = NULL_RTX;
-  rtx *base_term = strip_subreg (ad->base_term);
-  rtx *index_term = strip_subreg (ad->index_term);
-  if (base_term != NULL)
-    {
-      saved_base_reg = *base_term;
-      lra_eliminate_reg_if_possible (base_term);
-      if (ad->base_term2 != NULL)
-	*ad->base_term2 = *ad->base_term;
-    }
-  if (index_term != NULL)
-    {
-      saved_index_reg = *index_term;
-      lra_eliminate_reg_if_possible (index_term);
-    }
-  bool ok_p = valid_address_p (ad->mode, *ad->outer, ad->as);
-  if (saved_base_reg != NULL_RTX)
-    {
-      *base_term = saved_base_reg;
-      if (ad->base_term2 != NULL)
-	*ad->base_term2 = *ad->base_term;
-    }
-  if (saved_index_reg != NULL_RTX)
-    *index_term = saved_index_reg;
-  return ok_p;
-}
-
 /* Make reload base reg + disp from address AD.  Return the new pseudo.  */
 static rtx
 base_plus_disp_to_reg (struct address_info *ad)
@@ -2832,7 +2891,7 @@ process_address (int nop, rtx *before, r
      EXTRA_CONSTRAINT_STR for the validation.  */
   if (constraint[0] != 'p'
       && EXTRA_ADDRESS_CONSTRAINT (constraint[0], constraint)
-      && EXTRA_CONSTRAINT_STR (op, constraint[0], constraint))
+      && satisfies_address_constraint_p (&ad, constraint))
     return change_p;
 #endif
 
@@ -3539,7 +3598,7 @@ curr_insn_transform (void)
 		  break;
 #ifdef EXTRA_CONSTRAINT_STR
 		if (EXTRA_MEMORY_CONSTRAINT (c, constraint)
-		    && EXTRA_CONSTRAINT_STR (tem, c, constraint))
+		    && satisfies_memory_constraint_p (tem, constraint))
 		  break;
 #endif
 	      }

^ permalink raw reply	[flat|nested] 31+ messages in thread

* RFA: Make LRA temporarily eliminate addresses before testing constraints
  2014-05-18 19:38                           ` Richard Sandiford
@ 2014-06-02 19:37                             ` Richard Sandiford
  2014-06-03 20:50                               ` Vladimir Makarov
  0 siblings, 1 reply; 31+ messages in thread
From: Richard Sandiford @ 2014-06-02 19:37 UTC (permalink / raw)
  To: Matthew Fortune
  Cc: Vladimir Makarov, Robert Suchanek, gcc-patches, Kyrill Tkachov

Ping.  Imagination's copyright assignment has now gone through and so
in principle we're ready for the MIPS LRA switch to go in.  We need
this LRA patch as a prequisite though.

Robert: you also had an LRA change, but is it still needed after this one?
If so, could you repost it and explain the case it handles?

Thanks,
Richard

Richard Sandiford <rdsandiford@googlemail.com> writes:
> Richard Sandiford <rdsandiford@googlemail.com> writes:
>> I think a cleaner way of doing it would be to have helper functions
>> that switch in and out of the eliminated form, storing the old form
>> in fields of a new structure (either separate from address_info,
>> or a local inheritance of it).  We probably also want to have arrays
>> of address_infos, one for each operand, so that we don't analyse the
>> same address too many times during the same insn.
>
> In the end maintaining the array of address_infos seemed like too much
> work.  It was hard to keep it up-to-date with various other changes
> that can be made, including swapping commutative operands, to the point
> where it wasn't obvious whether it was really an optimisation or not.
>
> Here's a patch that does the first.  Tested on x86_64-linux-gnu.
> This time I also compared the assembly output for gcc.dg, g++.dg
> and gcc.c-torture at -O2 on:
>
>   arch64-linux-gnu arm-eabi mipsisa64-sde-elf s390x-linux-gnu
>   powerpc64-linux-gnu x86_64-linux-gnu
>
> s390x in particular is very good at exposing problems with this code.
> (It caught bugs in the aborted attempt to keep an array of address_infos.)
>
> OK to install?
>
> Thanks,
> Richard
>
>
> gcc/
> 	* lra-constraints.c (valid_address_p): Move earlier in file.
> 	(address_eliminator): New structure.
> 	(satisfies_memory_constraint_p): New function.
> 	(satisfies_address_constraint_p): Likewise.
> 	(process_alt_operands, process_address, curr_insn_transform): Use them.
>
> Index: gcc/lra-constraints.c
> ===================================================================
> --- gcc/lra-constraints.c	2014-05-17 17:49:19.071639652 +0100
> +++ gcc/lra-constraints.c	2014-05-18 20:36:17.499181467 +0100
> @@ -317,6 +317,118 @@ in_mem_p (int regno)
>    return get_reg_class (regno) == NO_REGS;
>  }
>  
> +/* Return 1 if ADDR is a valid memory address for mode MODE in address
> +   space AS, and check that each pseudo has the proper kind of hard
> +   reg.	 */
> +static int
> +valid_address_p (enum machine_mode mode ATTRIBUTE_UNUSED,
> +		 rtx addr, addr_space_t as)
> +{
> +#ifdef GO_IF_LEGITIMATE_ADDRESS
> +  lra_assert (ADDR_SPACE_GENERIC_P (as));
> +  GO_IF_LEGITIMATE_ADDRESS (mode, addr, win);
> +  return 0;
> +
> + win:
> +  return 1;
> +#else
> +  return targetm.addr_space.legitimate_address_p (mode, addr, 0, as);
> +#endif
> +}
> +
> +namespace {
> +  /* Temporarily eliminates registers in an address (for the lifetime of
> +     the object).  */
> +  class address_eliminator {
> +  public:
> +    address_eliminator (struct address_info *ad);
> +    ~address_eliminator ();
> +
> +  private:
> +    struct address_info *m_ad;
> +    rtx *m_base_loc;
> +    rtx m_base_reg;
> +    rtx *m_index_loc;
> +    rtx m_index_reg;
> +  };
> +}
> +
> +address_eliminator::address_eliminator (struct address_info *ad)
> +  : m_ad (ad),
> +    m_base_loc (strip_subreg (ad->base_term)),
> +    m_base_reg (NULL_RTX),
> +    m_index_loc (strip_subreg (ad->index_term)),
> +    m_index_reg (NULL_RTX)
> +{
> +  if (m_base_loc != NULL)
> +    {
> +      m_base_reg = *m_base_loc;
> +      lra_eliminate_reg_if_possible (m_base_loc);
> +      if (m_ad->base_term2 != NULL)
> +	*m_ad->base_term2 = *m_ad->base_term;
> +    }
> +  if (m_index_loc != NULL)
> +    {
> +      m_index_reg = *m_index_loc;
> +      lra_eliminate_reg_if_possible (m_index_loc);
> +    }
> +}
> +
> +address_eliminator::~address_eliminator ()
> +{
> +  if (m_base_loc && *m_base_loc != m_base_reg)
> +    {
> +      *m_base_loc = m_base_reg;
> +      if (m_ad->base_term2 != NULL)
> +	*m_ad->base_term2 = *m_ad->base_term;
> +    }
> +  if (m_index_loc && *m_index_loc != m_index_reg)
> +    *m_index_loc = m_index_reg;
> +}
> +
> +/* Return true if the eliminated form of AD is a legitimate target address.  */
> +static bool
> +valid_address_p (struct address_info *ad)
> +{
> +  address_eliminator eliminator (ad);
> +  return valid_address_p (ad->mode, *ad->outer, ad->as);
> +}
> +
> +#ifdef EXTRA_CONSTRAINT_STR
> +/* Return true if the eliminated form of memory reference OP satisfies
> +   extra address constraint CONSTRAINT.  */
> +static bool
> +satisfies_memory_constraint_p (rtx op, const char *constraint)
> +{
> +  struct address_info ad;
> +
> +  decompose_mem_address (&ad, op);
> +  address_eliminator eliminator (&ad);
> +  return EXTRA_CONSTRAINT_STR (op, *constraint, constraint);
> +}
> +
> +/* Return true if the eliminated form of address AD satisfies extra
> +   address constraint CONSTRAINT.  */
> +static bool
> +satisfies_address_constraint_p (struct address_info *ad,
> +				const char *constraint)
> +{
> +  address_eliminator eliminator (ad);
> +  return EXTRA_CONSTRAINT_STR (*ad->outer, *constraint, constraint);
> +}
> +
> +/* Return true if the eliminated form of address OP satisfies extra
> +   address constraint CONSTRAINT.  */
> +static bool
> +satisfies_address_constraint_p (rtx op, const char *constraint)
> +{
> +  struct address_info ad;
> +
> +  decompose_lea_address (&ad, &op);
> +  return satisfies_address_constraint_p (&ad, constraint);
> +}
> +#endif
> +
>  /* Initiate equivalences for LRA.  As we keep original equivalences
>     before any elimination, we need to make copies otherwise any change
>     in insns might change the equivalences.  */
> @@ -1941,7 +2053,8 @@ process_alt_operands (int only_alternati
>  #ifdef EXTRA_CONSTRAINT_STR
>  		      if (EXTRA_MEMORY_CONSTRAINT (c, p))
>  			{
> -			  if (EXTRA_CONSTRAINT_STR (op, c, p))
> +			  if (MEM_P (op)
> +			      && satisfies_memory_constraint_p (op, p))
>  			    win = true;
>  			  else if (spilled_pseudo_p (op))
>  			    win = true;
> @@ -1960,7 +2073,7 @@ process_alt_operands (int only_alternati
>  			}
>  		      if (EXTRA_ADDRESS_CONSTRAINT (c, p))
>  			{
> -			  if (EXTRA_CONSTRAINT_STR (op, c, p))
> +			  if (satisfies_address_constraint_p (op, p))
>  			    win = true;
>  
>  			  /* If we didn't already win, we can reload
> @@ -2576,60 +2689,6 @@ process_alt_operands (int only_alternati
>    return ok_p;
>  }
>  
> -/* Return 1 if ADDR is a valid memory address for mode MODE in address
> -   space AS, and check that each pseudo has the proper kind of hard
> -   reg.	 */
> -static int
> -valid_address_p (enum machine_mode mode ATTRIBUTE_UNUSED,
> -		 rtx addr, addr_space_t as)
> -{
> -#ifdef GO_IF_LEGITIMATE_ADDRESS
> -  lra_assert (ADDR_SPACE_GENERIC_P (as));
> -  GO_IF_LEGITIMATE_ADDRESS (mode, addr, win);
> -  return 0;
> -
> - win:
> -  return 1;
> -#else
> -  return targetm.addr_space.legitimate_address_p (mode, addr, 0, as);
> -#endif
> -}
> -
> -/* Return whether address AD is valid.  */
> -
> -static bool
> -valid_address_p (struct address_info *ad)
> -{
> -  /* Some ports do not check displacements for eliminable registers,
> -     so we replace them temporarily with the elimination target.  */
> -  rtx saved_base_reg = NULL_RTX;
> -  rtx saved_index_reg = NULL_RTX;
> -  rtx *base_term = strip_subreg (ad->base_term);
> -  rtx *index_term = strip_subreg (ad->index_term);
> -  if (base_term != NULL)
> -    {
> -      saved_base_reg = *base_term;
> -      lra_eliminate_reg_if_possible (base_term);
> -      if (ad->base_term2 != NULL)
> -	*ad->base_term2 = *ad->base_term;
> -    }
> -  if (index_term != NULL)
> -    {
> -      saved_index_reg = *index_term;
> -      lra_eliminate_reg_if_possible (index_term);
> -    }
> -  bool ok_p = valid_address_p (ad->mode, *ad->outer, ad->as);
> -  if (saved_base_reg != NULL_RTX)
> -    {
> -      *base_term = saved_base_reg;
> -      if (ad->base_term2 != NULL)
> -	*ad->base_term2 = *ad->base_term;
> -    }
> -  if (saved_index_reg != NULL_RTX)
> -    *index_term = saved_index_reg;
> -  return ok_p;
> -}
> -
>  /* Make reload base reg + disp from address AD.  Return the new pseudo.  */
>  static rtx
>  base_plus_disp_to_reg (struct address_info *ad)
> @@ -2832,7 +2891,7 @@ process_address (int nop, rtx *before, r
>       EXTRA_CONSTRAINT_STR for the validation.  */
>    if (constraint[0] != 'p'
>        && EXTRA_ADDRESS_CONSTRAINT (constraint[0], constraint)
> -      && EXTRA_CONSTRAINT_STR (op, constraint[0], constraint))
> +      && satisfies_address_constraint_p (&ad, constraint))
>      return change_p;
>  #endif
>  
> @@ -3539,7 +3598,7 @@ curr_insn_transform (void)
>  		  break;
>  #ifdef EXTRA_CONSTRAINT_STR
>  		if (EXTRA_MEMORY_CONSTRAINT (c, constraint)
> -		    && EXTRA_CONSTRAINT_STR (tem, c, constraint))
> +		    && satisfies_memory_constraint_p (tem, constraint))
>  		  break;
>  #endif
>  	      }

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: RFA: Make LRA temporarily eliminate addresses before testing constraints
  2014-06-02 19:37                             ` RFA: Make LRA temporarily eliminate addresses before testing constraints Richard Sandiford
@ 2014-06-03 20:50                               ` Vladimir Makarov
  2014-06-04 17:45                                 ` Richard Sandiford
  0 siblings, 1 reply; 31+ messages in thread
From: Vladimir Makarov @ 2014-06-03 20:50 UTC (permalink / raw)
  To: Matthew Fortune, Robert Suchanek, gcc-patches, Kyrill Tkachov,
	rdsandiford

On 06/02/2014 03:36 PM, Richard Sandiford wrote:
> Ping.  Imagination's copyright assignment has now gone through and so
> in principle we're ready for the MIPS LRA switch to go in.  We need
> this LRA patch as a prequisite though.
>
> Robert: you also had an LRA change, but is it still needed after this one?
> If so, could you repost it and explain the case it handles?
>
> Thanks,
> Richard
>
> Richard Sandiford <rdsandiford@googlemail.com> writes:
>> Richard Sandiford <rdsandiford@googlemail.com> writes:
>>> I think a cleaner way of doing it would be to have helper functions
>>> that switch in and out of the eliminated form, storing the old form
>>> in fields of a new structure (either separate from address_info,
>>> or a local inheritance of it).  We probably also want to have arrays
>>> of address_infos, one for each operand, so that we don't analyse the
>>> same address too many times during the same insn.
>> In the end maintaining the array of address_infos seemed like too much
>> work.  It was hard to keep it up-to-date with various other changes
>> that can be made, including swapping commutative operands, to the point
>> where it wasn't obvious whether it was really an optimisation or not.
>>
>> Here's a patch that does the first.  Tested on x86_64-linux-gnu.
>> This time I also compared the assembly output for gcc.dg, g++.dg
>> and gcc.c-torture at -O2 on:
>>
>>   arch64-linux-gnu arm-eabi mipsisa64-sde-elf s390x-linux-gnu
>>   powerpc64-linux-gnu x86_64-linux-gnu
>>
>> s390x in particular is very good at exposing problems with this code.
>> (It caught bugs in the aborted attempt to keep an array of address_infos.)
>>
>> OK to install?
>>
Yes, Richard.  Thanks for the wide testing.  It was also a pleasure to
read this C++ code.  Really nice.

 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: RFA: Make LRA temporarily eliminate addresses before testing constraints
  2014-06-03 20:50                               ` Vladimir Makarov
@ 2014-06-04 17:45                                 ` Richard Sandiford
  2014-06-11 11:30                                   ` Robert Suchanek
  2014-06-16 16:12                                   ` Robert Suchanek
  0 siblings, 2 replies; 31+ messages in thread
From: Richard Sandiford @ 2014-06-04 17:45 UTC (permalink / raw)
  To: Vladimir Makarov
  Cc: Matthew Fortune, Robert Suchanek, gcc-patches, Kyrill Tkachov

Vladimir Makarov <vmakarov@redhat.com> writes:
> On 06/02/2014 03:36 PM, Richard Sandiford wrote:
>> Ping.  Imagination's copyright assignment has now gone through and so
>> in principle we're ready for the MIPS LRA switch to go in.  We need
>> this LRA patch as a prequisite though.
>>
>> Robert: you also had an LRA change, but is it still needed after this one?
>> If so, could you repost it and explain the case it handles?
>>
>> Thanks,
>> Richard
>>
>> Richard Sandiford <rdsandiford@googlemail.com> writes:
>>> Richard Sandiford <rdsandiford@googlemail.com> writes:
>>>> I think a cleaner way of doing it would be to have helper functions
>>>> that switch in and out of the eliminated form, storing the old form
>>>> in fields of a new structure (either separate from address_info,
>>>> or a local inheritance of it).  We probably also want to have arrays
>>>> of address_infos, one for each operand, so that we don't analyse the
>>>> same address too many times during the same insn.
>>> In the end maintaining the array of address_infos seemed like too much
>>> work.  It was hard to keep it up-to-date with various other changes
>>> that can be made, including swapping commutative operands, to the point
>>> where it wasn't obvious whether it was really an optimisation or not.
>>>
>>> Here's a patch that does the first.  Tested on x86_64-linux-gnu.
>>> This time I also compared the assembly output for gcc.dg, g++.dg
>>> and gcc.c-torture at -O2 on:
>>>
>>>   arch64-linux-gnu arm-eabi mipsisa64-sde-elf s390x-linux-gnu
>>>   powerpc64-linux-gnu x86_64-linux-gnu
>>>
>>> s390x in particular is very good at exposing problems with this code.
>>> (It caught bugs in the aborted attempt to keep an array of address_infos.)
>>>
>>> OK to install?
>>>
> Yes, Richard.  Thanks for the wide testing.

Thanks Vlad, committed as r211242.  Given the problems with my first version,
I suppose it'd make sense to wait a few days to see whether there's any
fallout before applying the MIPS backend patch.

Richard

^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: RFA: Make LRA temporarily eliminate addresses before testing constraints
  2014-06-04 17:45                                 ` Richard Sandiford
@ 2014-06-11 11:30                                   ` Robert Suchanek
  2014-06-16 16:12                                   ` Robert Suchanek
  1 sibling, 0 replies; 31+ messages in thread
From: Robert Suchanek @ 2014-06-11 11:30 UTC (permalink / raw)
  To: Richard Sandiford, Vladimir Makarov
  Cc: Matthew Fortune, gcc-patches, Kyrill Tkachov

Hi Richard,

>> Robert: you also had an LRA change, but is it still needed after this one?
>> If so, could you repost it and explain the case it handles?

For just turning the LRA for the MIPS backend is not needed but we have issues
with the code size for MIPS16. LRA inserted a lot of reloads and the code size
increased on average by about 10% IIRC. To fix this, a number of patterns 
have to accept the stack pointer and a new class, M16_SP_REGS with 
M16_REGS + $sp was added.

However, this triggered a reloading problem as the stack pointer was rejected
by the back end and LRA tried to insert base+disp with the displacement not
always present. It only affects $sp not directly accessible as in MIPS16 case.

Regards,
Robert
         
gcc/                                                      
	* lra-constraints.c (base_to_reg): New function.       
	(process_address): Use new function.                   

diff --git gcc/lra-constraints.c gcc/lra-constraints.c
index 08716fe..d5ed37f 100644
--- gcc/lra-constraints.c
+++ gcc/lra-constraints.c
@@ -2686,6 +2686,39 @@ process_alt_operands (int only_alternative)
   return ok_p;
 }
 
+/* Make reload base reg from address AD.  */
+static rtx
+base_to_reg (struct address_info *ad)
+{
+  enum reg_class cl;
+  int code = -1;
+  rtx new_inner = NULL_RTX;
+  rtx new_reg = NULL_RTX;
+  rtx insn;
+  rtx last_insn = get_last_insn();
+
+  lra_assert (ad->base == ad->base_term && ad->disp == ad->disp_term);
+  cl = base_reg_class (ad->mode, ad->as, ad->base_outer_code,
+                       get_index_code (ad));
+  new_reg = lra_create_new_reg (GET_MODE (*ad->base_term), NULL_RTX,
+                                cl, "base");
+  new_inner = simplify_gen_binary (PLUS, GET_MODE (new_reg), new_reg,
+                                   ad->disp_term == NULL
+                                   ? gen_int_mode (0, ad->mode)
+                                   : *ad->disp_term);
+  if (!valid_address_p (ad->mode, new_inner, ad->as))
+    return NULL_RTX;
+  insn = emit_insn (gen_rtx_SET (ad->mode, new_reg, *ad->base_term));
+  code = recog_memoized (insn);
+  if (code < 0)
+    {
+      delete_insns_since (last_insn);
+      return NULL_RTX;
+    }
+
+  return new_inner;
+}
+
 /* Make reload base reg + disp from address AD.  Return the new pseudo.  */
 static rtx
 base_plus_disp_to_reg (struct address_info *ad)
@@ -2908,6 +2941,8 @@ process_address_1 (int nop, rtx *before, rtx *after)
 
      3) the address is a frame address with an invalid offset.
 
+     4) the address is a frame address with an invalid base.
+
      All these cases involve a non-autoinc address, so there is no
      point revalidating other types.  */
   if (ad.autoinc_p || valid_address_p (&ad))
@@ -2989,14 +3024,19 @@ process_address_1 (int nop, rtx *before, rtx *after)
       int regno;
       enum reg_class cl;
       rtx set, insns, last_insn;
+      /* Try to reload base into register only if the base is invalid
+         for the address but with valid offset, case (4) above.  */
+      start_sequence ();
+      new_reg = base_to_reg (&ad);
+
       /* base + disp => new base, cases (1) and (3) above.  */
       /* Another option would be to reload the displacement into an
 	 index register.  However, postreload has code to optimize
 	 address reloads that have the same base and different
 	 displacements, so reloading into an index register would
 	 not necessarily be a win.  */
-      start_sequence ();
-      new_reg = base_plus_disp_to_reg (&ad);
+      if (new_reg == NULL_RTX)
+        new_reg = base_plus_disp_to_reg (&ad);
       insns = get_insns ();
       last_insn = get_last_insn ();
       /* If we generated at least two insns, try last insn source as

^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: RFA: Make LRA temporarily eliminate addresses before testing constraints
  2014-06-04 17:45                                 ` Richard Sandiford
  2014-06-11 11:30                                   ` Robert Suchanek
@ 2014-06-16 16:12                                   ` Robert Suchanek
  2014-06-16 18:08                                     ` Vladimir Makarov
  1 sibling, 1 reply; 31+ messages in thread
From: Robert Suchanek @ 2014-06-16 16:12 UTC (permalink / raw)
  To: Richard Sandiford, Vladimir Makarov
  Cc: Matthew Fortune, gcc-patches, Kyrill Tkachov

Pinging for approval.

This part of the patch will be needed for MIPS16. The second part to enable
LRA in MIPS has been already approved.

> Hi Richard, 
> 
> >> Robert: you also had an LRA change, but is it still needed after this one?
> >> If so, could you repost it and explain the case it handles? 
>
> For just turning the LRA for the MIPS backend is not needed but we have issues
> with the code size for MIPS16. LRA inserted a lot of reloads and the code size
> increased on average by about 10% IIRC. To fix this, a number of patterns 
> have to accept the stack pointer and a new class, M16_SP_REGS with 
> M16_REGS + $sp was added.
> 
> However, this triggered a reloading problem as the stack pointer was rejected
> by the back end and LRA tried to insert base+disp with the displacement not
> always present. It only affects $sp not directly accessible as in MIPS16 case.
>
> Regards,
> Robert
>         
> gcc/                                                      
> 	* lra-constraints.c (base_to_reg): New function.       
> 	(process_address): Use new function.                   
> 
> diff --git gcc/lra-constraints.c gcc/lra-constraints.c
> index 08716fe..d5ed37f 100644
> --- gcc/lra-constraints.c
> +++ gcc/lra-constraints.c
> @@ -2686,6 +2686,39 @@ process_alt_operands (int only_alternative)
>    return ok_p;
>  }
>  
> +/* Make reload base reg from address AD.  */
> +static rtx
> +base_to_reg (struct address_info *ad)
> +{
> +  enum reg_class cl;
> +  int code = -1;
> +  rtx new_inner = NULL_RTX;
> +  rtx new_reg = NULL_RTX;
> +  rtx insn;
> +  rtx last_insn = get_last_insn();
> +
> +  lra_assert (ad->base == ad->base_term && ad->disp == ad->disp_term);
> +  cl = base_reg_class (ad->mode, ad->as, ad->base_outer_code,
> +                       get_index_code (ad));
> +  new_reg = lra_create_new_reg (GET_MODE (*ad->base_term), NULL_RTX,
> +                                cl, "base");
> +  new_inner = simplify_gen_binary (PLUS, GET_MODE (new_reg), new_reg,
> +                                   ad->disp_term == NULL
> +                                   ? gen_int_mode (0, ad->mode)
> +                                   : *ad->disp_term);
> +  if (!valid_address_p (ad->mode, new_inner, ad->as))
> +    return NULL_RTX;
> +  insn = emit_insn (gen_rtx_SET (ad->mode, new_reg, *ad->base_term));
> +  code = recog_memoized (insn);
> +  if (code < 0)
> +    {
> +      delete_insns_since (last_insn);
> +      return NULL_RTX;
> +    }
> +
> +  return new_inner;
> +}
> +
>  /* Make reload base reg + disp from address AD.  Return the new pseudo.  */
>  static rtx
>  base_plus_disp_to_reg (struct address_info *ad)
> @@ -2908,6 +2941,8 @@ process_address_1 (int nop, rtx *before, rtx *after)
>  
>       3) the address is a frame address with an invalid offset.
>  
> +     4) the address is a frame address with an invalid base.
> +
>       All these cases involve a non-autoinc address, so there is no
>       point revalidating other types.  */
>    if (ad.autoinc_p || valid_address_p (&ad))
> @@ -2989,14 +3024,19 @@ process_address_1 (int nop, rtx *before, rtx *after)
>        int regno;
>        enum reg_class cl;
>        rtx set, insns, last_insn;
> +      /* Try to reload base into register only if the base is invalid
> +         for the address but with valid offset, case (4) above.  */
> +      start_sequence ();
> +      new_reg = base_to_reg (&ad);
> +
>        /* base + disp => new base, cases (1) and (3) above.  */
>        /* Another option would be to reload the displacement into an
>  	 index register.  However, postreload has code to optimize
>  	 address reloads that have the same base and different
>  	 displacements, so reloading into an index register would
>  	 not necessarily be a win.  */
> -      start_sequence ();
> -      new_reg = base_plus_disp_to_reg (&ad);
> +      if (new_reg == NULL_RTX)
> +        new_reg = base_plus_disp_to_reg (&ad);
>        insns = get_insns ();
>        last_insn = get_last_insn ();
>        /* If we generated at least two insns, try last insn source as

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: RFA: Make LRA temporarily eliminate addresses before testing constraints
  2014-06-16 16:12                                   ` Robert Suchanek
@ 2014-06-16 18:08                                     ` Vladimir Makarov
  2014-06-18 20:52                                       ` Matthew Fortune
  0 siblings, 1 reply; 31+ messages in thread
From: Vladimir Makarov @ 2014-06-16 18:08 UTC (permalink / raw)
  To: Robert Suchanek, Richard Sandiford
  Cc: Matthew Fortune, gcc-patches, Kyrill Tkachov

On 2014-06-16, 12:12 PM, Robert Suchanek wrote:
> Pinging for approval.
>
> This part of the patch will be needed for MIPS16. The second part to enable
> LRA in MIPS has been already approved.
>

   Sorry, Robert.  I thought you are waiting for some Richard's comment 
(actually he knows the code well and wrote address decoding in rtlanal.c).

   The patch is ok for me and makes LRA even more portable as it adds a 
new profitable address transformation and the code can be useful for 
other targets too.

   Thanks.


>> Hi Richard,
>>
>>>> Robert: you also had an LRA change, but is it still needed after this one?
>>>> If so, could you repost it and explain the case it handles?
>>
>> For just turning the LRA for the MIPS backend is not needed but we have issues
>> with the code size for MIPS16. LRA inserted a lot of reloads and the code size
>> increased on average by about 10% IIRC. To fix this, a number of patterns
>> have to accept the stack pointer and a new class, M16_SP_REGS with
>> M16_REGS + $sp was added.
>>
>> However, this triggered a reloading problem as the stack pointer was rejected
>> by the back end and LRA tried to insert base+disp with the displacement not
>> always present. It only affects $sp not directly accessible as in MIPS16 case.
>>
>> Regards,
>> Robert
>>
>> gcc/
>> 	* lra-constraints.c (base_to_reg): New function.
>> 	(process_address): Use new function.
>>
>> diff --git gcc/lra-constraints.c gcc/lra-constraints.c
>> index 08716fe..d5ed37f 100644
>> --- gcc/lra-constraints.c
>> +++ gcc/lra-constraints.c
>> @@ -2686,6 +2686,39 @@ process_alt_operands (int only_alternative)
>>     return ok_p;
>>   }
>>
>> +/* Make reload base reg from address AD.  */
>> +static rtx
>> +base_to_reg (struct address_info *ad)
>> +{
>> +  enum reg_class cl;
>> +  int code = -1;
>> +  rtx new_inner = NULL_RTX;
>> +  rtx new_reg = NULL_RTX;
>> +  rtx insn;
>> +  rtx last_insn = get_last_insn();
>> +
>> +  lra_assert (ad->base == ad->base_term && ad->disp == ad->disp_term);
>> +  cl = base_reg_class (ad->mode, ad->as, ad->base_outer_code,
>> +                       get_index_code (ad));
>> +  new_reg = lra_create_new_reg (GET_MODE (*ad->base_term), NULL_RTX,
>> +                                cl, "base");
>> +  new_inner = simplify_gen_binary (PLUS, GET_MODE (new_reg), new_reg,
>> +                                   ad->disp_term == NULL
>> +                                   ? gen_int_mode (0, ad->mode)
>> +                                   : *ad->disp_term);
>> +  if (!valid_address_p (ad->mode, new_inner, ad->as))
>> +    return NULL_RTX;
>> +  insn = emit_insn (gen_rtx_SET (ad->mode, new_reg, *ad->base_term));
>> +  code = recog_memoized (insn);
>> +  if (code < 0)
>> +    {
>> +      delete_insns_since (last_insn);
>> +      return NULL_RTX;
>> +    }
>> +
>> +  return new_inner;
>> +}
>> +
>>   /* Make reload base reg + disp from address AD.  Return the new pseudo.  */
>>   static rtx
>>   base_plus_disp_to_reg (struct address_info *ad)
>> @@ -2908,6 +2941,8 @@ process_address_1 (int nop, rtx *before, rtx *after)
>>
>>        3) the address is a frame address with an invalid offset.
>>
>> +     4) the address is a frame address with an invalid base.
>> +
>>        All these cases involve a non-autoinc address, so there is no
>>        point revalidating other types.  */
>>     if (ad.autoinc_p || valid_address_p (&ad))
>> @@ -2989,14 +3024,19 @@ process_address_1 (int nop, rtx *before, rtx *after)
>>         int regno;
>>         enum reg_class cl;
>>         rtx set, insns, last_insn;
>> +      /* Try to reload base into register only if the base is invalid
>> +         for the address but with valid offset, case (4) above.  */
>> +      start_sequence ();
>> +      new_reg = base_to_reg (&ad);
>> +
>>         /* base + disp => new base, cases (1) and (3) above.  */
>>         /* Another option would be to reload the displacement into an
>>   	 index register.  However, postreload has code to optimize
>>   	 address reloads that have the same base and different
>>   	 displacements, so reloading into an index register would
>>   	 not necessarily be a win.  */
>> -      start_sequence ();
>> -      new_reg = base_plus_disp_to_reg (&ad);
>> +      if (new_reg == NULL_RTX)
>> +        new_reg = base_plus_disp_to_reg (&ad);
>>         insns = get_insns ();
>>         last_insn = get_last_insn ();
>>         /* If we generated at least two insns, try last insn source as
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: RFA: Make LRA temporarily eliminate addresses before testing constraints
  2014-06-16 18:08                                     ` Vladimir Makarov
@ 2014-06-18 20:52                                       ` Matthew Fortune
  0 siblings, 0 replies; 31+ messages in thread
From: Matthew Fortune @ 2014-06-18 20:52 UTC (permalink / raw)
  To: Vladimir Makarov, Robert Suchanek, Richard Sandiford
  Cc: gcc-patches, Kyrill Tkachov, Rich Fuhler

> On 2014-06-16, 12:12 PM, Robert Suchanek wrote:
> > Pinging for approval.
> >
> > This part of the patch will be needed for MIPS16. The second part to
> enable
> > LRA in MIPS has been already approved.
> >
> 
>    Sorry, Robert.  I thought you are waiting for some Richard's comment
> (actually he knows the code well and wrote address decoding in rtlanal.c).
> 
>    The patch is ok for me and makes LRA even more portable as it adds a
> new profitable address transformation and the code can be useful for
> other targets too.

Thanks.

Core LRA change committed as: r211802
MIPS LRA committed as: r211805

Matthew

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2014-06-18 20:52 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-03-29  1:27 [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend Robert Suchanek
2014-03-29 11:24 ` Richard Sandiford
2014-04-09 10:00   ` Robert Suchanek
2014-04-09 21:24     ` Richard Sandiford
2014-04-10 20:29       ` Richard Sandiford
2014-04-14 11:13       ` Robert Suchanek
2014-04-15 21:12         ` Richard Sandiford
2014-04-16 21:10           ` Robert Suchanek
2014-04-21 13:01             ` Richard Sandiford
2014-04-23 13:34               ` Robert Suchanek
2014-04-23 14:10                 ` Richard Sandiford
2014-04-23 15:34                   ` Robert Suchanek
2014-04-23 15:33               ` Vladimir Makarov
2014-05-03 19:21                 ` Richard Sandiford
2014-05-06 14:06                   ` Kyrill Tkachov
2014-05-06 19:22                     ` Richard Sandiford
2014-05-09 22:58                       ` Matthew Fortune
2014-05-10 18:44                         ` Richard Sandiford
2014-05-14 13:24                           ` Robert Suchanek
2014-05-15 21:05                             ` Robert Suchanek
2014-05-15 21:34                             ` Richard Sandiford
2014-05-16 21:05                               ` Robert Suchanek
2014-05-18 19:38                           ` Richard Sandiford
2014-06-02 19:37                             ` RFA: Make LRA temporarily eliminate addresses before testing constraints Richard Sandiford
2014-06-03 20:50                               ` Vladimir Makarov
2014-06-04 17:45                                 ` Richard Sandiford
2014-06-11 11:30                                   ` Robert Suchanek
2014-06-16 16:12                                   ` Robert Suchanek
2014-06-16 18:08                                     ` Vladimir Makarov
2014-06-18 20:52                                       ` Matthew Fortune
2014-03-29 14:08 ` [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend Jakub Jelinek

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).