public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH v3] Add support for sparc compare-and-branch
@ 2012-10-23  4:55 David Miller
  2012-10-25 18:28 ` David Miller
                   ` (3 more replies)
  0 siblings, 4 replies; 21+ messages in thread
From: David Miller @ 2012-10-23  4:55 UTC (permalink / raw)
  To: gcc-patches; +Cc: ebotcazou, ro


Differences from v2:

1) If another control transfer comes right after a cbcond we take
   an enormous performance penalty, some 20 cycles or more.  The
   documentation specifically warns about this, so emit a nop when
   we encounter this scenerio.

2) Add a heuristic to avoid using cbcond if we know at RTL emit
   time that we're going to compare against a constant that does
   not fit in the tiny 5-bit signed immediate field.

3) Use cbcond for unconditional jumps too.

Regstrapped on sparc-unknown-linux-gnu w/--with-cpu=niagara4.

Eric and Rainer, I think that functionally this patch is fully ready
to go into the tree except for the Solaris aspects which I do not have
the means to work on.  Have either of you made any progress in this
area?

Thanks!

gcc/

2012-10-12  David S. Miller  <davem@davemloft.net>

	* configure.ac: Add check for assembler SPARC4 instruction
	support.
	* configure: Rebuild.
	* config.in: Add HAVE_AS_SPARC4 section.
	* config/sparc/sparc.opt (mcbcond): New option.
	* doc/invoke.texi: Document it.
	* config/sparc/constraints.md: New constraint 'A' for 5-bit signed
	immediates.
	* doc/md.texi: Document it.
	* config/sparc/predicates.md (arith5_operand): New predicate.
	* config/sparc/sparc.c (dump_target_flag_bits): Handle MASK_CBCOND.
	(sparc_option_override): Likewise.
	(emit_cbcond_insn): New function.
	(emit_conditional_branch_insn): Call it.
	(emit_cbcond_nop): New function.
	(output_ubranch): Use cbcond, remove label arg.
	(output_cbcond): New function.
	* config/sparc/sparc-protos.h (output_ubranch): Update.
	(output_cbcond): Declare it.
	(emit_cbcond_nop): Likewise.
	* config/sparc/sparc.md (type attribute): New types 'cbcond'
	and uncond_cbcond.
	(emit_cbcond_nop): New attribute.
	(length attribute): Handle cbcond and uncond_cbcond.
	(in_call_delay attribute): Reject cbcond and uncond_cbcond.
	(in_branch_delay attribute): Likewise.
	(in_uncond_branch_delay attribute): Likewise.
	(in_annul_branch_delay attribute): Likewise.
	(*cbcond_sp32, *cbcond_sp64): New insn patterns.
	(jump): Rewrite into an expander.
	(*jump_ubranch, *jump_cbcond): New patterns.
	* config/sparc/niagara4.md: Match 'cbcond' and 'uncond_cbcond' in
        'n4_cti'.
	* config/sparc/sparc.h (AS_NIAGARA4_FLAG): New macro, use it
	when target default is niagara4.
	(SPARC_SIMM5_P): Define.
	* config/sparc/sol2.h (AS_SPARC64_FLAG): Adjust.
	(AS_SPARC32_FLAG): Define.
	(ASM_CPU32_DEFAULT_SPEC, ASM_CPU64_DEFAULT_SPEC): Use
	AS_NIAGARA4_FLAG as needed.

diff --git a/gcc/config.in b/gcc/config.in
index b13805d..791d14a 100644
--- a/gcc/config.in
+++ b/gcc/config.in
@@ -266,6 +266,12 @@
 #endif
 
 
+/* Define if your assembler supports SPARC4 instructions. */
+#ifndef USED_FOR_TARGET
+#undef HAVE_AS_SPARC4
+#endif
+
+
 /* Define if your assembler supports fprnd. */
 #ifndef USED_FOR_TARGET
 #undef HAVE_AS_FPRND
diff --git a/gcc/config/sparc/constraints.md b/gcc/config/sparc/constraints.md
index 472490f..8862ea1 100644
--- a/gcc/config/sparc/constraints.md
+++ b/gcc/config/sparc/constraints.md
@@ -18,7 +18,7 @@
 ;; <http://www.gnu.org/licenses/>.
 
 ;;; Unused letters:
-;;;    AB                       
+;;;     B
 ;;;    a        jkl    q  tuv xyz
 
 
@@ -62,6 +62,11 @@
 
 ;; Integer constant constraints
 
+(define_constraint "A"
+ "Signed 5-bit integer constant"
+ (and (match_code "const_int")
+      (match_test "SPARC_SIMM5_P (ival)")))
+
 (define_constraint "H"
  "Valid operand of double arithmetic operation"
  (and (match_code "const_double")
diff --git a/gcc/config/sparc/niagara4.md b/gcc/config/sparc/niagara4.md
index 272c8ff..61ca801 100644
--- a/gcc/config/sparc/niagara4.md
+++ b/gcc/config/sparc/niagara4.md
@@ -56,7 +56,7 @@
 
 (define_insn_reservation "n4_cti" 2
   (and (eq_attr "cpu" "niagara4")
-    (eq_attr "type" "branch,call,sibcall,call_no_delay_slot,uncond_branch,return"))
+    (eq_attr "type" "cbcond,uncond_cbcond,branch,call,sibcall,call_no_delay_slot,uncond_branch,return"))
   "n4_slot1, nothing")
 
 (define_insn_reservation "n4_fp" 11
diff --git a/gcc/config/sparc/predicates.md b/gcc/config/sparc/predicates.md
index 326524b..b64e109 100644
--- a/gcc/config/sparc/predicates.md
+++ b/gcc/config/sparc/predicates.md
@@ -391,6 +391,14 @@
   (ior (match_operand 0 "register_operand")
        (match_operand 0 "uns_small_int_operand")))
 
+;; Return true if OP is a register, or is a CONST_INT that can fit in a
+;; signed 5-bit immediate field.  This is an acceptable second operand for
+;; the cbcond instructions.
+(define_predicate "arith5_operand"
+  (ior (match_operand 0 "register_operand")
+       (and (match_code "const_int")
+            (match_test "SPARC_SIMM5_P (INTVAL (op))"))))
+
 
 ;; Predicates for miscellaneous instructions.
 
diff --git a/gcc/config/sparc/sol2.h b/gcc/config/sparc/sol2.h
index ba2ec35..68cc592 100644
--- a/gcc/config/sparc/sol2.h
+++ b/gcc/config/sparc/sol2.h
@@ -58,8 +58,10 @@ along with GCC; see the file COPYING3.  If not see
    other assemblers will accept.  */
 
 #ifndef USE_GAS
-#define AS_SPARC64_FLAG	"-xarch=v9"
+#define AS_SPARC32_FLAG	"-m32 -xarch=v9"
+#define AS_SPARC64_FLAG	"-m64 -xarch=v9"
 #else
+#define AS_SPARC32_FLAG	"-TSO -32 -Av9"
 #define AS_SPARC64_FLAG	"-TSO -64 -Av9"
 #endif
 
@@ -136,9 +138,9 @@ along with GCC; see the file COPYING3.  If not see
 #undef CPP_CPU64_DEFAULT_SPEC
 #define CPP_CPU64_DEFAULT_SPEC ""
 #undef ASM_CPU32_DEFAULT_SPEC
-#define ASM_CPU32_DEFAULT_SPEC "-xarch=v8plusb"
+#define ASM_CPU32_DEFAULT_SPEC AS_SPARC32_FLAG AS_NIAGARA4_FLAG
 #undef ASM_CPU64_DEFAULT_SPEC
-#define ASM_CPU64_DEFAULT_SPEC AS_SPARC64_FLAG "b"
+#define ASM_CPU64_DEFAULT_SPEC AS_SPARC64_FLAG AS_NIAGARA4_FLAG
 #undef ASM_CPU_DEFAULT_SPEC
 #define ASM_CPU_DEFAULT_SPEC ASM_CPU32_DEFAULT_SPEC
 #endif
@@ -241,7 +243,7 @@ extern const char *host_detect_local_cpu (int argc, const char **argv);
 %{mcpu=niagara:" DEF_ARCH32_SPEC("-xarch=v8plusb") DEF_ARCH64_SPEC(AS_SPARC64_FLAG "b") "} \
 %{mcpu=niagara2:" DEF_ARCH32_SPEC("-xarch=v8plusb") DEF_ARCH64_SPEC(AS_SPARC64_FLAG "b") "} \
 %{mcpu=niagara3:" DEF_ARCH32_SPEC("-xarch=v8plus" AS_NIAGARA3_FLAG) DEF_ARCH64_SPEC(AS_SPARC64_FLAG AS_NIAGARA3_FLAG) "} \
-%{mcpu=niagara4:" DEF_ARCH32_SPEC("-xarch=v8plus" AS_NIAGARA3_FLAG) DEF_ARCH64_SPEC(AS_SPARC64_FLAG AS_NIAGARA3_FLAG) "} \
+%{mcpu=niagara4:" DEF_ARCH32_SPEC(AS_SPARC32_FLAG AS_NIAGARA4_FLAG) DEF_ARCH64_SPEC(AS_SPARC64_FLAG AS_NIAGARA4_FLAG) "} \
 %{!mcpu=niagara4:%{!mcpu=niagara3:%{!mcpu=niagara2:%{!mcpu=niagara:%{!mcpu=ultrasparc3:%{!mcpu=ultrasparc:%{!mcpu=v9:%{mcpu*:" DEF_ARCH32_SPEC("-xarch=v8") DEF_ARCH64_SPEC(AS_SPARC64_FLAG) "}}}}}}}} \
 %{!mcpu*:%(asm_cpu_default)} \
 "
diff --git a/gcc/config/sparc/sparc-protos.h b/gcc/config/sparc/sparc-protos.h
index 97f6233..d5b2b1f 100644
--- a/gcc/config/sparc/sparc-protos.h
+++ b/gcc/config/sparc/sparc-protos.h
@@ -71,7 +71,7 @@ extern void sparc_emit_set_symbolic_const64 (rtx, rtx, rtx);
 extern int sparc_splitdi_legitimate (rtx, rtx);
 extern int sparc_split_regreg_legitimate (rtx, rtx);
 extern int sparc_absnegfloat_split_legitimate (rtx, rtx);
-extern const char *output_ubranch (rtx, int, rtx);
+extern const char *output_ubranch (rtx, rtx);
 extern const char *output_cbranch (rtx, rtx, int, int, int, rtx);
 extern const char *output_return (rtx);
 extern const char *output_sibcall (rtx, rtx);
@@ -79,10 +79,12 @@ extern const char *output_v8plus_shift (rtx, rtx *, const char *);
 extern const char *output_v8plus_mult (rtx, rtx *, const char *);
 extern const char *output_v9branch (rtx, rtx, int, int, int, int, rtx);
 extern const char *output_probe_stack_range (rtx, rtx);
+extern const char *output_cbcond (rtx, rtx, rtx);
 extern bool emit_scc_insn (rtx []);
 extern void emit_conditional_branch_insn (rtx []);
 extern int mems_ok_for_ldd_peep (rtx, rtx, rtx);
 extern int empty_delay_slot (rtx);
+extern int emit_cbcond_nop (rtx);
 extern int eligible_for_return_delay (rtx);
 extern int eligible_for_sibcall_delay (rtx);
 extern int tls_call_delay (rtx);
diff --git a/gcc/config/sparc/sparc.c b/gcc/config/sparc/sparc.c
index 8849c03..202f064 100644
--- a/gcc/config/sparc/sparc.c
+++ b/gcc/config/sparc/sparc.c
@@ -840,6 +840,8 @@ dump_target_flag_bits (const int flags)
     fprintf (stderr, "VIS2 ");
   if (flags & MASK_VIS3)
     fprintf (stderr, "VIS3 ");
+  if (flags & MASK_CBCOND)
+    fprintf (stderr, "CBCOND ");
   if (flags & MASK_DEPRECATED_V8_INSNS)
     fprintf (stderr, "DEPRECATED_V8_INSNS ");
   if (flags & MASK_SPARCLET)
@@ -946,7 +948,7 @@ sparc_option_override (void)
       MASK_V9|MASK_POPC|MASK_VIS2|MASK_VIS3|MASK_FMAF },
     /* UltraSPARC T4 */
     { "niagara4",	MASK_ISA,
-      MASK_V9|MASK_POPC|MASK_VIS2|MASK_VIS3|MASK_FMAF },
+      MASK_V9|MASK_POPC|MASK_VIS2|MASK_VIS3|MASK_FMAF|MASK_CBCOND },
   };
   const struct cpu_table *cpu;
   unsigned int i;
@@ -1073,6 +1075,9 @@ sparc_option_override (void)
 #ifndef HAVE_AS_FMAF_HPC_VIS3
 		   & ~(MASK_FMAF | MASK_VIS3)
 #endif
+#ifndef HAVE_AS_SPARC4
+		   & ~MASK_CBCOND
+#endif
 		   );
 
   /* If -mfpu or -mno-fpu was explicitly used, don't override with
@@ -1088,7 +1093,12 @@ sparc_option_override (void)
   if (TARGET_VIS3)
     target_flags |= MASK_VIS2 | MASK_VIS;
 
-  /* Don't allow -mvis, -mvis2, -mvis3, or -mfmaf if FPU is disabled.  */
+  /* -mcbcond implies -mvis3, -mvis2 and -mvis */
+  if (TARGET_CBCOND)
+    target_flags |= MASK_VIS3 | MASK_VIS2 | MASK_VIS;
+
+  /* Don't allow -mvis, -mvis2, -mvis3, or -mfmaf if FPU is
+     disabled.  */
   if (! TARGET_FPU)
     target_flags &= ~(MASK_VIS | MASK_VIS2 | MASK_VIS3 | MASK_FMAF);
 
@@ -2660,6 +2670,24 @@ emit_v9_brxx_insn (enum rtx_code code, rtx op0, rtx label)
 				    pc_rtx)));
 }
 
+/* Emit a conditional jump insn for the UA2011 architecture using
+   comparison code CODE and jump target LABEL.  This function exists
+   to take advantage of the UA2011 Compare and Branch insns.  */
+
+static void
+emit_cbcond_insn (enum rtx_code code, rtx op0, rtx op1, rtx label)
+{
+  rtx if_then_else;
+
+  if_then_else = gen_rtx_IF_THEN_ELSE (VOIDmode,
+				       gen_rtx_fmt_ee(code, GET_MODE(op0),
+						      op0, op1),
+				       gen_rtx_LABEL_REF (VOIDmode, label),
+				       pc_rtx);
+
+  emit_jump_insn (gen_rtx_SET (VOIDmode, pc_rtx, if_then_else));
+}
+
 void
 emit_conditional_branch_insn (rtx operands[])
 {
@@ -2674,6 +2702,20 @@ emit_conditional_branch_insn (rtx operands[])
       operands[2] = XEXP (operands[0], 1);
     }
 
+  /* If we can tell early on that the comparison is against a constant
+     that won't fit in the 5-bit signed immediate field of a cbcond,
+     use one of the other v9 conditional branch sequences.  */
+  if (TARGET_CBCOND
+      && GET_CODE (operands[1]) == REG
+      && (GET_MODE (operands[1]) == SImode
+	  || (TARGET_ARCH64 && GET_MODE (operands[1]) == DImode))
+      && (GET_CODE (operands[2]) != CONST_INT
+	  || SPARC_SIMM5_P (INTVAL (operands[2]))))
+    {
+      emit_cbcond_insn (GET_CODE (operands[0]), operands[1], operands[2], operands[3]);
+      return;
+    }
+
   if (TARGET_ARCH64 && operands[2] == const0_rtx
       && GET_CODE (operands[1]) == REG
       && GET_MODE (operands[1]) == DImode)
@@ -3014,6 +3056,44 @@ empty_delay_slot (rtx insn)
   return 1;
 }
 
+/* Return nonzero if we should emit a nop after a cbcond instruction.
+   The cbcond instruction does not have a delay slot, however there is
+   a severe performance penalty if a control transfer appears right
+   after a cbcond.  Therefore we emit a nop when we detect this
+   situation.  */
+
+int
+emit_cbcond_nop (rtx insn)
+{
+  rtx next = next_real_insn (insn);
+
+  if (!next)
+    return 1;
+
+  if (GET_CODE (next) == INSN
+      && GET_CODE (PATTERN (next)) == SEQUENCE)
+    next = XVECEXP (PATTERN (next), 0, 0);
+  else if (GET_CODE (next) == CALL_INSN
+	   && GET_CODE (PATTERN (next)) == PARALLEL)
+    {
+      rtx delay = XVECEXP (PATTERN (next), 0, 1);
+
+      if (GET_CODE (delay) == RETURN)
+	{
+	  /* It's a sibling call.  Do not emit the nop if we're going
+	     to emit something other than the jump itself as the first
+	     instruction of the sibcall sequence.  */
+	  if (sparc_leaf_function_p || TARGET_FLAT)
+	    return 0;
+	}
+    }
+
+  if (NONJUMP_INSN_P (next))
+    return 0;
+
+  return 1;
+}
+
 /* Return nonzero if TRIAL can go into the call delay slot.  */
 
 int
@@ -7102,19 +7182,49 @@ sparc_preferred_simd_mode (enum machine_mode mode)
    DEST is the destination insn (i.e. the label), INSN is the source.  */
 
 const char *
-output_ubranch (rtx dest, int label, rtx insn)
+output_ubranch (rtx dest, rtx insn)
 {
   static char string[64];
   bool v9_form = false;
+  int delta;
   char *p;
 
-  if (TARGET_V9 && INSN_ADDRESSES_SET_P ())
+  /* Even if we are trying to use cbcond for this, evaluate
+     whether we can use V9 branches as our backup plan.  */
+
+  delta = 5000000;
+  if (INSN_ADDRESSES_SET_P ())
+    delta = (INSN_ADDRESSES (INSN_UID (dest))
+	     - INSN_ADDRESSES (INSN_UID (insn)));
+
+  /* Leave some instructions for "slop".  */
+  if (TARGET_V9 && delta >= -260000 && delta < 260000)
+    v9_form = true;
+
+  if (TARGET_CBCOND)
     {
-      int delta = (INSN_ADDRESSES (INSN_UID (dest))
-		   - INSN_ADDRESSES (INSN_UID (insn)));
-      /* Leave some instructions for "slop".  */
-      if (delta >= -260000 && delta < 260000)
-	v9_form = true;
+      bool emit_nop = emit_cbcond_nop (insn);
+      bool far = false;
+      const char *rval;
+
+      if (delta < -500 || delta > 500)
+	far = true;
+
+      if (far)
+	{
+	  if (v9_form)
+	    rval = "ba,a,pt\t%%xcc, %l0";
+	  else
+	    rval = "b,a\t%l0";
+	}
+      else
+	{
+	  if (emit_nop)
+	    rval = "cwbe\t%%g0, %%g0, %l0\n\tnop";
+	  else
+	    rval = "cwbe\t%%g0, %%g0, %l0";
+	}
+      return rval;
     }
 
   if (v9_form)
@@ -7125,7 +7235,7 @@ output_ubranch (rtx dest, int label, rtx insn)
   p = strchr (string, '\0');
   *p++ = '%';
   *p++ = 'l';
-  *p++ = '0' + label;
+  *p++ = '0';
   *p++ = '%';
   *p++ = '(';
   *p = '\0';
@@ -7604,6 +7714,183 @@ sparc_emit_fixunsdi (rtx *operands, enum machine_mode mode)
   emit_label (donelab);
 }
 
+/* Return the string to output a compare and branch instruction to DEST.
+   DEST is the destination insn (i.e. the label), INSN is the source,
+   and OP is the conditional expression.  */
+
+const char *
+output_cbcond (rtx op, rtx dest, rtx insn)
+{
+  enum machine_mode mode = GET_MODE (XEXP (op, 0));
+  enum rtx_code code = GET_CODE (op);
+  static char string[64];
+  int far, emit_nop, len;
+  char *p;
+
+  /* Compare and Branch is limited to +-2KB.  If it is too far away,
+     change
+
+     cxbne X, Y, .LC30
+
+     to
+
+     cxbe X, Y, .+12
+     ba,pt xcc, .LC30
+      nop  */
+
+  len = get_attr_length (insn);
+
+  far = len == 3;
+  emit_nop = len == 2;
+
+  if (far)
+    code = reverse_condition (code);
+
+  p = string;
+
+  *p++ = 'c';
+  *p++ = mode == SImode ? 'w' : 'x';
+  *p++ = 'b';
+
+  switch (code)
+    {
+    case NE:
+      *p++ = 'n';
+      *p++ = 'e';
+      break;
+
+    case EQ:
+      *p++ = 'e';
+      break;
+
+    case GE:
+      if (mode == CC_NOOVmode || mode == CCX_NOOVmode)
+	{
+	  *p++ = 'p';
+	  *p++ = 'o';
+	  *p++ = 's';
+	}
+      else
+	{
+	  *p++ = 'g';
+	  *p++ = 'e';
+	}
+      break;
+
+    case GT:
+      *p++ = 'g';
+      break;
+
+    case LE:
+      *p++ = 'l';
+      *p++ = 'e';
+      break;
+
+    case LT:
+      if (mode == CC_NOOVmode || mode == CCX_NOOVmode)
+	{
+	  *p++ = 'n';
+	  *p++ = 'e';
+	  *p++ = 'g';
+	}
+      else
+	*p++ = 'l';
+      break;
+
+    case GEU:
+      *p++ = 'c';
+      *p++ = 'c';
+      break;
+
+    case GTU:
+      *p++ = 'g';
+      *p++ = 'u';
+      break;
+
+    case LEU:
+      *p++ = 'l';
+      *p++ = 'e';
+      *p++ = 'u';
+      break;
+
+    case LTU:
+      *p++ = 'c';
+      *p++ = 's';
+      break;
+
+    default:
+      gcc_unreachable ();
+    }
+
+  *p++ = '\t';
+  *p++ = '%';
+  *p++ = '1';
+  *p++ = ',';
+  *p++ = ' ';
+  *p++ = '%';
+  *p++ = '2';
+  *p++ = ',';
+  *p++ = ' ';
+
+  if (far)
+    {
+      int veryfar = 1, delta;
+
+      if (INSN_ADDRESSES_SET_P ())
+	{
+	  delta = (INSN_ADDRESSES (INSN_UID (dest))
+		   - INSN_ADDRESSES (INSN_UID (insn)));
+	  /* Leave some instructions for "slop".  */
+	  if (delta >= -260000 && delta < 260000)
+	    veryfar = 0;
+	}
+      *p++ = '.';
+      *p++ = '+';
+      *p++ = '1';
+      *p++ = '2';
+      *p++ = '\n';
+      *p++ = '\t';
+      if (veryfar)
+	{
+	  *p++ = 'b';
+	  *p++ = '\t';
+	}
+      else
+	{
+	  *p++ = 'b';
+	  *p++ = 'a';
+	  *p++ = ',';
+	  *p++ = 'p';
+	  *p++ = 't';
+	  *p++ = '\t';
+	  *p++ = '%';
+	  *p++ = '%';
+	  *p++ = 'x';
+	  *p++ = 'c';
+	  *p++ = 'c';
+	  *p++ = ',';
+	  *p++ = ' ';
+	}
+    }
+
+  *p++ = '%';
+  *p++ = 'l';
+  *p++ = '3';
+
+  if (far || emit_nop)
+    {
+      *p++ = '\n';
+      *p++ = '\t';
+      *p++ = 'n';
+      *p++ = 'o';
+      *p++ = 'p';
+    }
+
+  *p = '\0';
+
+  return string;
+}
+
 /* Return the string to output a conditional branch to LABEL, testing
    register REG.  LABEL is the operand number of the label; REG is the
    operand number of the reg.  OP is the conditional expression.  The mode
diff --git a/gcc/config/sparc/sparc.h b/gcc/config/sparc/sparc.h
index 8f86100..374919f 100644
--- a/gcc/config/sparc/sparc.h
+++ b/gcc/config/sparc/sparc.h
@@ -195,7 +195,7 @@ extern enum cmodel sparc_cmodel;
 #endif
 #if TARGET_CPU_DEFAULT == TARGET_CPU_niagara4
 #define CPP_CPU64_DEFAULT_SPEC "-D__sparc_v9__"
-#define ASM_CPU64_DEFAULT_SPEC "-Av9" AS_NIAGARA3_FLAG
+#define ASM_CPU64_DEFAULT_SPEC AS_NIAGARA4_FLAG
 #endif
 
 #else
@@ -337,7 +337,7 @@ extern enum cmodel sparc_cmodel;
 %{mcpu=niagara:%{!mv8plus:-Av9b}} \
 %{mcpu=niagara2:%{!mv8plus:-Av9b}} \
 %{mcpu=niagara3:%{!mv8plus:-Av9" AS_NIAGARA3_FLAG "}} \
-%{mcpu=niagara4:%{!mv8plus:-Av9" AS_NIAGARA3_FLAG "}} \
+%{mcpu=niagara4:%{!mv8plus:" AS_NIAGARA4_FLAG "}} \
 %{!mcpu*:%(asm_cpu_default)} \
 "
 
@@ -1006,7 +1006,8 @@ extern char leaf_reg_remap[];
 /* Local macro to handle the two v9 classes of FP regs.  */
 #define FP_REG_CLASS_P(CLASS) ((CLASS) == FP_REGS || (CLASS) == EXTRA_FP_REGS)
 
-/* Predicates for 10-bit, 11-bit and 13-bit signed constants.  */
+/* Predicates for 5-bit, 10-bit, 11-bit and 13-bit signed constants.  */
+#define SPARC_SIMM5_P(X)  ((unsigned HOST_WIDE_INT) (X) + 0x10 < 0x20)
 #define SPARC_SIMM10_P(X) ((unsigned HOST_WIDE_INT) (X) + 0x200 < 0x400)
 #define SPARC_SIMM11_P(X) ((unsigned HOST_WIDE_INT) (X) + 0x400 < 0x800)
 #define SPARC_SIMM13_P(X) ((unsigned HOST_WIDE_INT) (X) + 0x1000 < 0x2000)
@@ -1746,6 +1747,12 @@ extern int sparc_indent_opcode;
 #define AS_NIAGARA3_FLAG "d"
 #endif
 
+#ifndef HAVE_AS_SPARC4
+#define AS_NIAGARA4_FLAG " -xarch=v9b"
+#else
+#define AS_NIAGARA4_FLAG " -xarch=sparc4"
+#endif
+
 /* We use gcc _mcount for profiling.  */
 #define NO_PROFILE_COUNTERS 0
 
diff --git a/gcc/config/sparc/sparc.md b/gcc/config/sparc/sparc.md
index f604f46..bdc8a8d 100644
--- a/gcc/config/sparc/sparc.md
+++ b/gcc/config/sparc/sparc.md
@@ -257,6 +257,7 @@
   "ialu,compare,shift,
    load,sload,store,
    uncond_branch,branch,call,sibcall,call_no_delay_slot,return,
+   cbcond,uncond_cbcond,
    imul,idiv,
    fpload,fpstore,
    fp,fpmove,
@@ -275,6 +276,12 @@
   (symbol_ref "(empty_delay_slot (insn)
 		? EMPTY_DELAY_SLOT_TRUE : EMPTY_DELAY_SLOT_FALSE)"))
 
+;; True if we are making use of compare-and-branch instructions.
+;; True if we should emit a nop after a cbcond instruction
+(define_attr "emit_cbcond_nop" "false,true"
+  (symbol_ref "(emit_cbcond_nop (insn)
+                ? EMIT_CBCOND_NOP_TRUE : EMIT_CBCOND_NOP_FALSE)"))
+
 (define_attr "branch_type" "none,icc,fcc,reg"
   (const_string "none"))
 
@@ -377,6 +384,30 @@
 	       (if_then_else (eq_attr "empty_delay_slot" "true")
 		 (const_int 4)
 		 (const_int 3))))
+         (eq_attr "type" "cbcond")
+	   (if_then_else (lt (pc) (match_dup 3))
+	     (if_then_else (lt (minus (match_dup 3) (pc)) (const_int 500))
+               (if_then_else (eq_attr "emit_cbcond_nop" "true")
+                 (const_int 2)
+                 (const_int 1))
+               (const_int 3))
+	     (if_then_else (lt (minus (pc) (match_dup 3)) (const_int 500))
+               (if_then_else (eq_attr "emit_cbcond_nop" "true")
+                 (const_int 2)
+                 (const_int 1))
+               (const_int 3)))
+         (eq_attr "type" "uncond_cbcond")
+	   (if_then_else (lt (pc) (match_dup 0))
+	     (if_then_else (lt (minus (match_dup 0) (pc)) (const_int 500))
+               (if_then_else (eq_attr "emit_cbcond_nop" "true")
+                 (const_int 2)
+                 (const_int 1))
+               (const_int 1))
+	     (if_then_else (lt (minus (pc) (match_dup 0)) (const_int 500))
+               (if_then_else (eq_attr "emit_cbcond_nop" "true")
+                 (const_int 2)
+                 (const_int 1))
+               (const_int 1)))
 	 ] (const_int 1)))
 
 ;; FP precision.
@@ -397,7 +428,7 @@
 		? TLS_CALL_DELAY_TRUE : TLS_CALL_DELAY_FALSE)"))
 
 (define_attr "in_call_delay" "false,true"
-  (cond [(eq_attr "type" "uncond_branch,branch,call,sibcall,call_no_delay_slot,multi")
+  (cond [(eq_attr "type" "uncond_branch,branch,cbcond,uncond_cbcond,call,sibcall,call_no_delay_slot,multi")
 		(const_string "false")
 	 (eq_attr "type" "load,fpload,store,fpstore")
 		(if_then_else (eq_attr "length" "1")
@@ -431,19 +462,19 @@
 ;; because it prevents us from moving back the final store of inner loops.
 
 (define_attr "in_branch_delay" "false,true"
-  (if_then_else (and (eq_attr "type" "!uncond_branch,branch,call,sibcall,call_no_delay_slot,multi")
+  (if_then_else (and (eq_attr "type" "!uncond_branch,branch,cbcond,uncond_cbcond,call,sibcall,call_no_delay_slot,multi")
 		     (eq_attr "length" "1"))
 		(const_string "true")
 		(const_string "false")))
 
 (define_attr "in_uncond_branch_delay" "false,true"
-  (if_then_else (and (eq_attr "type" "!uncond_branch,branch,call,sibcall,call_no_delay_slot,multi")
+  (if_then_else (and (eq_attr "type" "!uncond_branch,branch,cbcond,uncond_cbcond,call,sibcall,call_no_delay_slot,multi")
 		     (eq_attr "length" "1"))
 		(const_string "true")
 		(const_string "false")))
 
 (define_attr "in_annul_branch_delay" "false,true"
-  (if_then_else (and (eq_attr "type" "!uncond_branch,branch,call,sibcall,call_no_delay_slot,multi")
+  (if_then_else (and (eq_attr "type" "!uncond_branch,branch,cbcond,uncond_cbcond,call,sibcall,call_no_delay_slot,multi")
 		     (eq_attr "length" "1"))
 		(const_string "true")
 		(const_string "false")))
@@ -1313,6 +1344,32 @@
 ;; SPARC V9-specific jump insns.  None of these are guaranteed to be
 ;; in the architecture.
 
+(define_insn "*cbcond_sp32"
+  [(set (pc)
+        (if_then_else (match_operator 0 "noov_compare_operator"
+                       [(match_operand:SI 1 "register_operand" "r")
+                        (match_operand:SI 2 "arith5_operand" "rA")])
+                      (label_ref (match_operand 3 "" ""))
+                      (pc)))]
+  "TARGET_CBCOND"
+{
+  return output_cbcond (operands[0], operands[3], insn);
+}
+  [(set_attr "type" "cbcond")])
+
+(define_insn "*cbcond_sp64"
+  [(set (pc)
+        (if_then_else (match_operator 0 "noov_compare_operator"
+                       [(match_operand:DI 1 "register_operand" "r")
+                        (match_operand:DI 2 "arith5_operand" "rA")])
+                      (label_ref (match_operand 3 "" ""))
+                      (pc)))]
+  "TARGET_ARCH64 && TARGET_CBCOND"
+{
+  return output_cbcond (operands[0], operands[3], insn);
+}
+  [(set_attr "type" "cbcond")])
+
 ;; There are no 32 bit brreg insns.
 
 ;; XXX
@@ -6076,12 +6133,22 @@
 
 ;; Unconditional and other jump instructions.
 
-(define_insn "jump"
+(define_expand "jump"
   [(set (pc) (label_ref (match_operand 0 "" "")))]
-  ""
-  "* return output_ubranch (operands[0], 0, insn);"
+  "")
+
+(define_insn "*jump_ubranch"
+  [(set (pc) (label_ref (match_operand 0 "" "")))]
+  "!TARGET_CBCOND"
+  "* return output_ubranch (operands[0], insn);"
   [(set_attr "type" "uncond_branch")])
 
+(define_insn "*jump_cbcond"
+  [(set (pc) (label_ref (match_operand 0 "" "")))]
+  "TARGET_CBCOND"
+  "* return output_ubranch (operands[0], insn);"
+  [(set_attr "type" "uncond_cbcond")])
+
 (define_expand "tablejump"
   [(parallel [(set (pc) (match_operand 0 "register_operand" "r"))
 	      (use (label_ref (match_operand 1 "" "")))])]
diff --git a/gcc/config/sparc/sparc.opt b/gcc/config/sparc/sparc.opt
index 58ba6b7..241cb07 100644
--- a/gcc/config/sparc/sparc.opt
+++ b/gcc/config/sparc/sparc.opt
@@ -73,6 +73,10 @@ mvis3
 Target Report Mask(VIS3)
 Use UltraSPARC Visual Instruction Set version 3.0 extensions
 
+mcbcond
+Target Report Mask(CBCOND)
+Use UltraSPARC Compare-and-Branch extensions
+
 mfmaf
 Target Report Mask(FMAF)
 Use UltraSPARC Fused Multiply-Add extensions
diff --git a/gcc/configure b/gcc/configure
index a223c60..3fc0088 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -24090,6 +24090,48 @@ if test $gcc_cv_as_sparc_fmaf = yes; then
 $as_echo "#define HAVE_AS_FMAF_HPC_VIS3 1" >>confdefs.h
 
 fi
+
+    { $as_echo "$as_me:${as_lineno-$LINENO}: checking assembler for SPARC4 instructions" >&5
+$as_echo_n "checking assembler for SPARC4 instructions... " >&6; }
+if test "${gcc_cv_as_sparc_sparc4+set}" = set; then :
+  $as_echo_n "(cached) " >&6
+else
+  gcc_cv_as_sparc_sparc4=no
+  if test x$gcc_cv_as != x; then
+    $as_echo '.text
+       .register %g2, #scratch
+       .register %g3, #scratch
+       .align 4
+       cxbe %g2, %g3, 1f
+1:     cwbneg %g2, %g3, 1f
+1:     sha1
+       md5
+       aes_kexpand0 %f4, %f6, %f8
+       des_round %f38, %f40, %f42, %f44
+       camellia_f %f54, %f56, %f58, %f60
+       kasumi_fi_xor %f46, %f48, %f50, %f52' > conftest.s
+    if { ac_try='$gcc_cv_as $gcc_cv_as_flags -xarch=sparc4 -o conftest.o conftest.s >&5'
+  { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5
+  (eval $ac_try) 2>&5
+  ac_status=$?
+  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
+  test $ac_status = 0; }; }
+    then
+	gcc_cv_as_sparc_sparc4=yes
+    else
+      echo "configure: failed program was" >&5
+      cat conftest.s >&5
+    fi
+    rm -f conftest.o conftest.s
+  fi
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $gcc_cv_as_sparc_sparc4" >&5
+$as_echo "$gcc_cv_as_sparc_sparc4" >&6; }
+if test $gcc_cv_as_sparc_sparc4 = yes; then
+
+$as_echo "#define HAVE_AS_SPARC4 1" >>confdefs.h
+
+fi
     ;;
 
   i[34567]86-*-* | x86_64-*-*)
diff --git a/gcc/configure.ac b/gcc/configure.ac
index 17e1d86..95007a2 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -3501,6 +3501,24 @@ foo:
        fnaddd %f10, %f12, %f14],,
       [AC_DEFINE(HAVE_AS_FMAF_HPC_VIS3, 1,
                 [Define if your assembler supports FMAF, HPC, and VIS 3.0 instructions.])])
+
+    gcc_GAS_CHECK_FEATURE([SPARC4 instructions],
+      gcc_cv_as_sparc_sparc4,,
+      [-xarch=sparc4],
+      [.text
+       .register %g2, #scratch
+       .register %g3, #scratch
+       .align 4
+       cxbe %g2, %g3, 1f
+1:     cwbneg %g2, %g3, 1f
+1:     sha1
+       md5
+       aes_kexpand0 %f4, %f6, %f8
+       des_round %f38, %f40, %f42, %f44
+       camellia_f %f54, %f56, %f58, %f60
+       kasumi_fi_xor %f46, %f48, %f50, %f52],,
+      [AC_DEFINE(HAVE_AS_SPARC4, 1,
+                [Define if your assembler supports SPARC4 instructions.])])
     ;;
 
 changequote(,)dnl
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index f8c9230..fc2addc 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -918,6 +918,7 @@ See RS/6000 and PowerPC Options.
 -munaligned-doubles  -mno-unaligned-doubles @gol
 -mv8plus  -mno-v8plus  -mvis  -mno-vis @gol
 -mvis2  -mno-vis2  -mvis3  -mno-vis3 @gol
+-mcbcond -mno-cbcond @gol
 -mfmaf  -mno-fmaf  -mpopc  -mno-popc @gol
 -mfix-at697f}
 
@@ -18878,6 +18879,16 @@ default is @option{-mvis3} when targeting a cpu that supports such
 instructions, such as niagara-3 and later.  Setting @option{-mvis3}
 also sets @option{-mvis2} and @option{-mvis}.
 
+@item -mcbcond
+@itemx -mno-cbcond
+@opindex mcbcond
+@opindex mno-cbcond
+With @option{-mcbcond}, GCC generates code that takes advantage of
+compare-and-branch instructions, as defined in the Sparc Architecture 2011.
+The default is @option{-mcbcond} when targeting a cpu that supports such
+instructions, such as niagara-4 and later.  Setting @option{-mcbcond} also
+sets @option{-mvis3}, @option{-mvis2}, and @option{-mvis}.
+
 @item -mpopc
 @itemx -mno-popc
 @opindex mpopc
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 32866d5..250cb1c 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -3157,6 +3157,9 @@ when the Visual Instruction Set is available.
 @item h
 64-bit global or out register for the SPARC-V8+ architecture.
 
+@item A
+Signed 5-bit constant
+
 @item D
 A vector constant
 

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2012-11-15 14:29 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-10-23  4:55 [PATCH v3] Add support for sparc compare-and-branch David Miller
2012-10-25 18:28 ` David Miller
2012-10-26  9:23   ` Rainer Orth
2012-11-13  2:46     ` David Miller
2012-11-15 14:29       ` Rainer Orth
2012-10-26  9:13 ` Eric Botcazou
2012-10-26  9:31   ` David Miller
2012-11-11 22:30 ` Eric Botcazou
2012-11-11 23:16   ` David Miller
2012-11-12  8:38     ` Eric Botcazou
2012-11-12 14:39       ` Rainer Orth
2012-11-12 15:37         ` Eric Botcazou
2012-11-12 15:46           ` Rainer Orth
2012-11-12 19:35             ` David Miller
2012-11-13 19:34               ` Eric Botcazou
2012-11-13 19:41                 ` David Miller
2012-11-13 21:52                   ` Eric Botcazou
2012-11-13 21:58                     ` David Miller
2012-11-12 19:37       ` David Miller
2012-11-12 17:56 ` Richard Henderson
2012-11-12 19:38   ` David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).