public inbox for gcc-patches@gcc.gnu.org
* RFC: [PATCH] MIPS: Implement built-in atomic memory operations.
@ 2007-08-12 20:23 David Daney
  2007-08-13  6:26 ` Ken Raeburn
  2007-08-13  7:23 ` Richard Sandiford
  0 siblings, 2 replies; 20+ messages in thread
From: David Daney @ 2007-08-12 20:23 UTC (permalink / raw)
  To: GCC Patches; +Cc: Richard Sandiford

[-- Attachment #1: Type: text/plain, Size: 3018 bytes --]

While troubleshooting libgcj problems on n64, I discovered that the 
atomic operations implemented in libjava/sysdep/mips/locks.h are both 
buggy and do not support 64-bit operations.  Instead of hacking them 
up, it seems like a better path is to implement the built-in atomic 
memory operations for MIPS and use them instead.  The result is this 
(very) lightly tested patch.

I thought I would try to get some feedback on the patch before 
proceeding further.  I think it implements enough of the sync_* insns so 
that all the documented __sync_* builtins can be expanded in-line for SI 
and DI mode operations.
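
For example (hypothetical usage, not from the patch's test cases), each
of these should also expand to an inline ll/sc (or lld/scd on n64) loop
rather than a call to an out-of-line helper:
----------------
unsigned int si_val;
unsigned long di_val;

/* SImode compare-and-swap, via sync_compare_and_swapsi.  */
unsigned int f1 (unsigned int old_val, unsigned int new_val)
{
    return __sync_val_compare_and_swap (&si_val, old_val, new_val);
}

/* DImode (64-bit long on n64) fetch-and-and returning the new value,
   via sync_new_anddi.  */
unsigned long f2 (unsigned long mask)
{
    return __sync_and_and_fetch (&di_val, mask);
}
----------------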

Currently with the patch the code generated for this program:
----------------
unsigned long v;
unsigned long f0(unsigned long n)
{
    return __sync_fetch_and_add (&v, n);
}
----------------

has a redundant move in its inner loop:
-------------------
f0:
    .frame    $sp,0,$31        # vars= 0, regs= 0/0, args= 0, gp= 0
    .mask    0x00000000,0
    .fmask    0x00000000,0
    .set    noreorder
    .set    nomacro
   
    lui    $7,%hi(%neg(%gp_rel(f0)))
    daddu    $7,$7,$25
    daddiu    $7,$7,%lo(%neg(%gp_rel(f0)))
    ld    $6,%got_disp(v)($7)
    sync
.L7:
    lld    $5,0($6)
    daddu    $2,$5,$4
    move    $3,$2       <------ redundant move.  daddu could target $3
    scd    $3,0($6)
    beq    $3,$0,.L7
    move    $2,$5

    j    $31
    nop

    .set    macro
    .set    reorder
    .end    f0
------------------------

Perhaps it is a dataflow issue.

The patch also contains a change to the rdhwr insn that quiets a 
warning and makes the pattern more correct, but is unrelated to the 
rest of the patch.  Perhaps I should submit it as a separate patch.

Comments?

2007-08-12  David Daney  <ddaney@avtrex.com>

    * config/mips/mips.md (UNSPEC_LL, UNSPEC_SC, UNSPEC_COMPARE_AND_SWAP,
    UNSPEC_SYNC_ADD, UNSPEC_SYNC_SUB, UNSPEC_SYNC_IOR, UNSPEC_SYNC_XOR,
    UNSPEC_SYNC_AND, UNSPEC_SYNC_NAND, UNSPEC_SYNC_OLD_ADD,
    UNSPEC_SYNC_OLD_SUB, UNSPEC_SYNC_OLD_IOR, UNSPEC_SYNC_OLD_XOR,
    UNSPEC_SYNC_OLD_AND, UNSPEC_SYNC_OLD_NAND, UNSPEC_SYNC_NEW_ADD,
    UNSPEC_SYNC_NEW_SUB, UNSPEC_SYNC_NEW_IOR, UNSPEC_SYNC_NEW_XOR,
    UNSPEC_SYNC_NEW_AND, UNSPEC_SYNC_NEW_NAND): New define_constants.
    (sync): Change condition to ISA_HAS_SYNC.
    (rdhwr): Change predicate for operand 0 to register_operand.
    (load_linked, store_contidional): New insns.
    (memory_barrier, sync_compare_and_swap, sync_old_add, sync_old_sub,
    sync_old_ior, sync_old_xor, sync_old_and, sync_old_nand, sync_new_add,
    sync_new_sub, sync_new_ior, sync_new_xor, sync_new_and, sync_new_nand,
    sync_add, sync_sub, sync_ior, sync_xor, sync_and, sync_nand): New
    expands.
    * config/mips/mips-protos.h (mips_atomic_memory_op): New enum.
    (mips_expand_atomic_memory_op): Declare new function.
    * config/mips/mips.c (mips_emit_unary): New function.
    (mips_expand_atomic_memory_op): New function.
    * config/mips/mips.h (ISA_HAS_SYNC, ISA_HAS_LL_SC): New ISA predicates.
   


[-- Attachment #2: mips-atomic.diff --]
[-- Type: text/x-patch, Size: 15242 bytes --]

Index: config/mips/mips.md
===================================================================
--- config/mips/mips.md	(revision 127356)
+++ config/mips/mips.md	(working copy)
@@ -53,7 +53,31 @@ (define_constants
    (UNSPEC_RDHWR		34)
    (UNSPEC_SYNCI		35)
    (UNSPEC_SYNC			36)
-
+   (UNSPEC_LL			37)
+   (UNSPEC_SC			38)
+   (UNSPEC_COMPARE_AND_SWAP	39)
+
+   (UNSPEC_SYNC_ADD		40)
+   (UNSPEC_SYNC_SUB		41)
+   (UNSPEC_SYNC_IOR		42)
+   (UNSPEC_SYNC_XOR		43)
+   (UNSPEC_SYNC_AND		44)
+   (UNSPEC_SYNC_NAND		45)
+   
+   (UNSPEC_SYNC_OLD_ADD		46)
+   (UNSPEC_SYNC_OLD_SUB		47)
+   (UNSPEC_SYNC_OLD_IOR		48)
+   (UNSPEC_SYNC_OLD_XOR		49)
+   (UNSPEC_SYNC_OLD_AND		50)
+   (UNSPEC_SYNC_OLD_NAND	51)
+   
+   (UNSPEC_SYNC_NEW_ADD		52)
+   (UNSPEC_SYNC_NEW_SUB		53)
+   (UNSPEC_SYNC_NEW_IOR		54)
+   (UNSPEC_SYNC_NEW_XOR		55)
+   (UNSPEC_SYNC_NEW_AND		56)
+   (UNSPEC_SYNC_NEW_NAND	57)
+   
    (UNSPEC_ADDRESS_FIRST	100)
 
    (FAKE_CALL_REGNO		79)
@@ -4251,7 +4275,7 @@ (define_expand "clear_cache"
 
 (define_insn "sync"
   [(unspec_volatile [(const_int 0)] UNSPEC_SYNC)]
-  "ISA_HAS_SYNCI"
+  "ISA_HAS_SYNC"
   "sync")
 
 (define_insn "synci"
@@ -4261,7 +4285,7 @@ (define_insn "synci"
   "synci\t0(%0)")
 
 (define_insn "rdhwr"
-  [(set (match_operand:SI 0 "general_operand" "=d")
+  [(set (match_operand:SI 0 "register_operand" "=d")
         (unspec_volatile [(match_operand:SI 1 "const_int_operand" "n")]
         UNSPEC_RDHWR))]
   "ISA_HAS_SYNCI"
@@ -4283,6 +4307,303 @@ (define_insn "clear_hazard"
          "\t.set\tpop";
 }
   [(set_attr "length" "20")])
+
+(define_expand "memory_barrier"
+  [(unspec_volatile [(const_int 0)] UNSPEC_SYNC)]
+  "ISA_HAS_SYNC"
+  "")
+
+(define_insn "load_linked<mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=d")
+  	(unspec_volatile:GPR [(match_operand:GPR 1 "memory_operand" "o")]
+         UNSPEC_LL))]
+  "ISA_HAS_LL_SC"
+  "ll<d>\t%0,%1"
+  [(set_attr "type" "load")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "store_contidional<mode>"
+  [(set (match_operand:GPR 0 "memory_operand" "=o")
+  	(match_operand:GPR 1 "register_operand" "d"))
+   (set (match_operand:GPR 2 "register_operand" "=1")		   
+  	(unspec_volatile:GPR [(const_int 0)] UNSPEC_SC))]
+  "ISA_HAS_LL_SC"
+  "sc<d>\t%1,%0"
+  [(set_attr "type" "store")
+   (set_attr "mode" "<MODE>")])
+
+(define_expand "sync_compare_and_swap<mode>"
+  [(parallel
+    [(set (match_operand:GPR 0 "register_operand" "=d")
+	  (match_operand:GPR 1 "memory_operand" "+o"))
+     (set (match_dup 1)
+	  (unspec_volatile:GPR [(match_operand:GPR 2 "register_operand" "d")
+				(match_operand:GPR 3 "general_operand")]
+	   UNSPEC_COMPARE_AND_SWAP))])]
+  "ISA_HAS_LL_SC"
+{
+  rtx loop, exit, cmp, cmp_result, tmp, sc_result;
+  emit_insn (gen_memory_barrier ());
+
+  loop = gen_label_rtx ();
+  emit_label (loop);
+
+  /* LL */
+  emit_insn (gen_load_linked<mode> (operands[0], operands[1]));
+
+  /* If not equal exit.  */
+  exit = gen_label_rtx ();
+  cmp = gen_rtx_NE (VOIDmode, operands[0], operands[2]);
+  emit_jump_insn (gen_condjump (cmp, exit));
+
+  tmp = gen_reg_rtx (<MODE>mode);
+  mips_emit_move (tmp, operands[3]);
+
+  /* SC */
+  sc_result = gen_reg_rtx (<MODE>mode);
+  emit_insn (gen_store_contidional<mode> (operands[1], tmp, sc_result));
+
+  /* If SC failed, try again.  */
+  cmp_result = gen_rtx_EQ (VOIDmode, sc_result, const0_rtx);
+  emit_jump_insn (gen_condjump (cmp_result, loop));
+
+  emit_label (exit);
+  DONE;
+})
+
+(define_expand "sync_old_add<mode>"
+  [(parallel
+    [(set (match_operand:GPR 0 "register_operand" "=d")
+	  (match_operand:GPR 1 "memory_operand" "+o"))
+     (set (match_dup 1)
+	  (unspec_volatile:GPR [(match_operand:GPR 2 "register_operand" "d")]
+	   UNSPEC_SYNC_OLD_ADD))])]
+  "ISA_HAS_LL_SC"
+{
+  mips_expand_atomic_memory_op (<MODE>mode, ATOMIC_ADD, false,
+				operands[0], operands[1], operands[2]);
+  DONE;
+})
+
+(define_expand "sync_old_sub<mode>"
+  [(parallel
+    [(set (match_operand:GPR 0 "register_operand" "=d")
+	  (match_operand:GPR 1 "memory_operand" "+o"))
+     (set (match_dup 1)
+	  (unspec_volatile:GPR [(match_operand:GPR 2 "register_operand" "d")]
+	   UNSPEC_SYNC_OLD_SUB))])]
+  "ISA_HAS_LL_SC"
+{
+  mips_expand_atomic_memory_op (<MODE>mode, ATOMIC_SUB, false,
+				operands[0], operands[1], operands[2]);
+  DONE;
+})
+
+(define_expand "sync_old_ior<mode>"
+  [(parallel
+    [(set (match_operand:GPR 0 "register_operand" "=d")
+	  (match_operand:GPR 1 "memory_operand" "+o"))
+     (set (match_dup 1)
+	  (unspec_volatile:GPR [(match_operand:GPR 2 "register_operand" "d")]
+	   UNSPEC_SYNC_OLD_IOR))])]
+  "ISA_HAS_LL_SC"
+{
+  mips_expand_atomic_memory_op (<MODE>mode, ATOMIC_IOR, false,
+				operands[0], operands[1], operands[2]);
+  DONE;
+})
+
+(define_expand "sync_old_xor<mode>"
+  [(parallel
+    [(set (match_operand:GPR 0 "register_operand" "=d")
+	  (match_operand:GPR 1 "memory_operand" "+o"))
+     (set (match_dup 1)
+	  (unspec_volatile:GPR [(match_operand:GPR 2 "register_operand" "d")]
+	   UNSPEC_SYNC_OLD_XOR))])]
+  "ISA_HAS_LL_SC"
+{
+  mips_expand_atomic_memory_op (<MODE>mode, ATOMIC_XOR, false,
+				operands[0], operands[1], operands[2]);
+  DONE;
+})
+
+(define_expand "sync_old_and<mode>"
+  [(parallel
+    [(set (match_operand:GPR 0 "register_operand" "=d")
+	  (match_operand:GPR 1 "memory_operand" "+o"))
+     (set (match_dup 1)
+	  (unspec_volatile:GPR [(match_operand:GPR 2 "register_operand" "d")]
+	   UNSPEC_SYNC_OLD_AND))])]
+  "ISA_HAS_LL_SC"
+{
+  mips_expand_atomic_memory_op (<MODE>mode, ATOMIC_AND, false,
+				operands[0], operands[1], operands[2]);
+  DONE;
+})
+
+(define_expand "sync_old_nand<mode>"
+  [(parallel
+    [(set (match_operand:GPR 0 "register_operand" "=d")
+	  (match_operand:GPR 1 "memory_operand" "+o"))
+     (set (match_dup 1)
+	  (unspec_volatile:GPR [(match_operand:GPR 2 "register_operand" "d")]
+	   UNSPEC_SYNC_OLD_NAND))])]
+  "ISA_HAS_LL_SC"
+{
+  mips_expand_atomic_memory_op (<MODE>mode, ATOMIC_NAND, false,
+				operands[0], operands[1], operands[2]);
+  DONE;
+})
+
+(define_expand "sync_new_add<mode>"
+  [(parallel
+    [(set (match_operand:GPR 0 "register_operand" "=d")
+	  (match_operand:GPR 1 "memory_operand" "+o"))
+     (set (match_dup 1)
+	  (unspec_volatile:GPR [(match_operand:GPR 2 "register_operand" "d")]
+	   UNSPEC_SYNC_NEW_ADD))])]
+  "ISA_HAS_LL_SC"
+{
+  mips_expand_atomic_memory_op (<MODE>mode, ATOMIC_ADD, true,
+				operands[0], operands[1], operands[2]);
+  DONE;
+})
+
+(define_expand "sync_new_sub<mode>"
+  [(parallel
+    [(set (match_operand:GPR 0 "register_operand" "=d")
+	  (match_operand:GPR 1 "memory_operand" "+o"))
+     (set (match_dup 1)
+	  (unspec_volatile:GPR [(match_operand:GPR 2 "register_operand" "d")]
+	   UNSPEC_SYNC_NEW_SUB))])]
+  "ISA_HAS_LL_SC"
+{
+  mips_expand_atomic_memory_op (<MODE>mode, ATOMIC_SUB, true,
+				operands[0], operands[1], operands[2]);
+  DONE;
+})
+
+(define_expand "sync_new_ior<mode>"
+  [(parallel
+    [(set (match_operand:GPR 0 "register_operand" "=d")
+	  (match_operand:GPR 1 "memory_operand" "+o"))
+     (set (match_dup 1)
+	  (unspec_volatile:GPR [(match_operand:GPR 2 "register_operand" "d")]
+	   UNSPEC_SYNC_NEW_IOR))])]
+  "ISA_HAS_LL_SC"
+{
+  mips_expand_atomic_memory_op (<MODE>mode, ATOMIC_IOR, true,
+				operands[0], operands[1], operands[2]);
+  DONE;
+})
+
+(define_expand "sync_new_xor<mode>"
+  [(parallel
+    [(set (match_operand:GPR 0 "register_operand" "=d")
+	  (match_operand:GPR 1 "memory_operand" "+o"))
+     (set (match_dup 1)
+	  (unspec_volatile:GPR [(match_operand:GPR 2 "register_operand" "d")]
+	   UNSPEC_SYNC_NEW_XOR))])]
+  "ISA_HAS_LL_SC"
+{
+  mips_expand_atomic_memory_op (<MODE>mode, ATOMIC_XOR, true,
+				operands[0], operands[1], operands[2]);
+  DONE;
+})
+
+(define_expand "sync_new_and<mode>"
+  [(parallel
+    [(set (match_operand:GPR 0 "register_operand" "=d")
+	  (match_operand:GPR 1 "memory_operand" "+o"))
+     (set (match_dup 1)
+	  (unspec_volatile:GPR [(match_operand:GPR 2 "register_operand" "d")]
+	   UNSPEC_SYNC_NEW_AND))])]
+  "ISA_HAS_LL_SC"
+{
+  mips_expand_atomic_memory_op (<MODE>mode, ATOMIC_AND, true,
+				operands[0], operands[1], operands[2]);
+  DONE;
+})
+
+(define_expand "sync_new_nand<mode>"
+  [(parallel
+    [(set (match_operand:GPR 0 "register_operand" "=d")
+	  (match_operand:GPR 1 "memory_operand" "+o"))
+     (set (match_dup 1)
+	  (unspec_volatile:GPR [(match_operand:GPR 2 "register_operand" "d")]
+	   UNSPEC_SYNC_NEW_NAND))])]
+  "ISA_HAS_LL_SC"
+{
+  mips_expand_atomic_memory_op (<MODE>mode, ATOMIC_NAND, true,
+				operands[0], operands[1], operands[2]);
+  DONE;
+})
+
+(define_expand "sync_add<mode>"
+  [(set (match_operand:GPR 0 "memory_operand" "+o")
+	(unspec_volatile:GPR [(match_operand:GPR 1 "register_operand" "d")]
+	   UNSPEC_SYNC_ADD))]
+  "ISA_HAS_LL_SC"
+{
+  mips_expand_atomic_memory_op (<MODE>mode, ATOMIC_ADD, false,
+				NULL, operands[0], operands[1]);
+  DONE;
+})
+
+(define_expand "sync_sub<mode>"
+  [(set (match_operand:GPR 0 "memory_operand" "+o")
+	(unspec_volatile:GPR [(match_operand:GPR 1 "register_operand" "d")]
+	   UNSPEC_SYNC_SUB))]
+  "ISA_HAS_LL_SC"
+{
+  mips_expand_atomic_memory_op (<MODE>mode, ATOMIC_SUB, false,
+				NULL, operands[0], operands[1]);
+  DONE;
+})
+
+(define_expand "sync_ior<mode>"
+  [(set (match_operand:GPR 0 "memory_operand" "+o")
+	(unspec_volatile:GPR [(match_operand:GPR 1 "register_operand" "d")]
+	   UNSPEC_SYNC_IOR))]
+  "ISA_HAS_LL_SC"
+{
+  mips_expand_atomic_memory_op (<MODE>mode, ATOMIC_IOR, false,
+				NULL, operands[0], operands[1]);
+  DONE;
+})
+
+(define_expand "sync_xor<mode>"
+  [(set (match_operand:GPR 0 "memory_operand" "+o")
+	(unspec_volatile:GPR [(match_operand:GPR 1 "register_operand" "d")]
+	   UNSPEC_SYNC_XOR))]
+  "ISA_HAS_LL_SC"
+{
+  mips_expand_atomic_memory_op (<MODE>mode, ATOMIC_XOR, false,
+				NULL, operands[0], operands[1]);
+  DONE;
+})
+
+(define_expand "sync_and<mode>"
+  [(set (match_operand:GPR 0 "memory_operand" "+o")
+	(unspec_volatile:GPR [(match_operand:GPR 1 "register_operand" "d")]
+	   UNSPEC_SYNC_AND))]
+  "ISA_HAS_LL_SC"
+{
+  mips_expand_atomic_memory_op (<MODE>mode, ATOMIC_AND, false,
+				NULL, operands[0], operands[1]);
+  DONE;
+})
+
+(define_expand "sync_nand<mode>"
+  [(set (match_operand:GPR 0 "memory_operand" "+o")
+	(unspec_volatile:GPR [(match_operand:GPR 1 "register_operand" "d")]
+	   UNSPEC_SYNC_NAND))]
+  "ISA_HAS_LL_SC"
+{
+  mips_expand_atomic_memory_op (<MODE>mode, ATOMIC_NAND, false,
+				NULL, operands[0], operands[1]);
+  DONE;
+})
 \f
 ;; Block moves, see mips.c for more details.
 ;; Argument 0 is the destination
Index: config/mips/mips-protos.h
===================================================================
--- config/mips/mips-protos.h	(revision 127356)
+++ config/mips/mips-protos.h	(working copy)
@@ -162,6 +162,17 @@ enum mips_loadgp_style {
   LOADGP_RTP
 };
 
+/* Operation performed by mips_expand_atomic_memory_op.  */
+enum mips_atomic_memory_op {
+  ATOMIC_ADD,
+  ATOMIC_SUB,
+  ATOMIC_IOR,
+  ATOMIC_XOR,
+  ATOMIC_AND,
+  ATOMIC_NAND
+};
+
+
 struct mips16e_save_restore_info;
 
 extern bool mips_symbolic_constant_p (rtx, enum mips_symbol_context,
@@ -216,7 +227,9 @@ extern void mips_emit_fcc_reload (rtx, r
 extern void mips_set_return_address (rtx, rtx);
 extern bool mips_expand_block_move (rtx, rtx, rtx);
 extern void mips_expand_synci_loop (rtx, rtx);
-
+extern void mips_expand_atomic_memory_op (enum machine_mode,
+                                          enum mips_atomic_memory_op,
+                                          bool, rtx, rtx, rtx);
 extern void init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx);
 extern void function_arg_advance (CUMULATIVE_ARGS *, enum machine_mode,
 				  tree, int);
Index: config/mips/mips.c
===================================================================
--- config/mips/mips.c	(revision 127371)
+++ config/mips/mips.c	(working copy)
@@ -3351,6 +3351,14 @@ mips_emit_binary (enum rtx_code code, rt
 			  gen_rtx_fmt_ee (code, GET_MODE (target), op0, op1)));
 }
 
+/* Emit an instruction of the form (set TARGET (CODE OP0)).  */
+static void
+mips_emit_unary (enum rtx_code code, rtx target, rtx op0)
+{
+  emit_insn (gen_rtx_SET (VOIDmode, target,
+			  gen_rtx_fmt_e (code, GET_MODE (target), op0)));
+}
+
 /* Return true if CMP1 is a suitable second operand for relational
    operator CODE.  See also the *sCC patterns in mips.md.  */
 
@@ -4001,6 +4009,84 @@ mips_expand_synci_loop (rtx begin, rtx e
   emit_jump_insn (gen_condjump (cmp_result, label));
 }
 \f
+
+/* Expand an atomic memory operation: atomically apply OP, with second
+   operand VALUE, to the MODE memory reference LOCATION.  Set RESULT to
+   the old value of LOCATION, or to the new value if RETURN_NEW; RESULT
+   may be null if the value is not needed.  */
+
+void
+mips_expand_atomic_memory_op (enum machine_mode mode,
+                              enum mips_atomic_memory_op op,
+                              bool return_new,
+                              rtx result, rtx location, rtx value)
+{
+  rtx loop, cmp_result, new_value, sc_result;
+
+  if (result == NULL)
+    result = gen_reg_rtx (mode);
+
+  new_value = gen_reg_rtx (mode);
+
+  emit_insn (gen_memory_barrier ());
+
+  loop = gen_label_rtx ();
+  emit_label (loop);
+
+  /* LL */
+  switch (mode)
+    {
+    case SImode:
+      emit_insn (gen_load_linkedsi (result, location));
+      break;
+    case DImode:
+      emit_insn (gen_load_linkeddi (result, location));
+      break;
+    default:
+      gcc_unreachable ();
+    }
+  switch (op)
+    {
+    case ATOMIC_ADD:
+      mips_emit_binary (PLUS, new_value, result, value);
+      break;
+    case ATOMIC_SUB:
+      mips_emit_binary (MINUS, new_value, result, value);
+      break;
+    case ATOMIC_IOR:
+      mips_emit_binary (IOR, new_value, result, value);
+      break;
+    case ATOMIC_XOR:
+      mips_emit_binary (XOR, new_value, result, value);
+      break;
+    case ATOMIC_AND:
+      mips_emit_binary (AND, new_value, result, value);
+      break;
+    case ATOMIC_NAND:
+      mips_emit_unary (NOT, new_value, result);
+      mips_emit_binary (AND, new_value, new_value, value);
+      break;
+    default:
+      gcc_unreachable ();
+    }
+  /* SC */
+  sc_result = gen_reg_rtx (mode);
+  switch (mode)
+    {
+    case SImode:
+      emit_insn (gen_store_contidionalsi (location, new_value, sc_result));
+      break;
+    case DImode:
+      emit_insn (gen_store_contidionaldi (location, new_value, sc_result));
+      break;
+    default:
+      gcc_unreachable ();
+    }
+
+  /* If SC failed, try again.  */
+  cmp_result = gen_rtx_EQ (VOIDmode, sc_result, const0_rtx);
+  emit_jump_insn (gen_condjump (cmp_result, loop));
+
+  if (return_new)
+    mips_emit_move (result, new_value);
+}
+\f
 /* Expand a movmemsi instruction.  */
 
 bool
Index: config/mips/mips.h
===================================================================
--- config/mips/mips.h	(revision 127356)
+++ config/mips/mips.h	(working copy)
@@ -876,6 +876,11 @@ extern enum mips_code_readable_setting m
 /* ISA includes synci, jr.hb and jalr.hb.  */
 #define ISA_HAS_SYNCI (ISA_MIPS32R2 && !TARGET_MIPS16)
 
+/* ISA includes sync.  */
+#define ISA_HAS_SYNC ((mips_isa >= 2 || TARGET_MIPS3900) && !TARGET_MIPS16)
+
+/* ISA includes ll and sc.  */
+#define ISA_HAS_LL_SC (mips_isa >= 2 && !TARGET_MIPS16)
 \f
 /* Add -G xx support.  */
 


* Re: RFC: [PATCH] MIPS: Implement built-in atomic memory operations.
  2007-08-12 20:23 RFC: [PATCH] MIPS: Implement built-in atomic memory operations David Daney
@ 2007-08-13  6:26 ` Ken Raeburn
  2007-08-13  6:38   ` David Daney
  2007-08-13  7:23 ` Richard Sandiford
  1 sibling, 1 reply; 20+ messages in thread
From: Ken Raeburn @ 2007-08-13  6:26 UTC (permalink / raw)
  To: David Daney; +Cc: GCC Patches, Richard Sandiford

On Aug 12, 2007, at 16:20, David Daney wrote:
> While troubleshooting libgcj problems on n64, I discovered that the
> atomic operations implemented in libjava/sysdep/mips/locks.h are both
> buggy and do not support 64-bit operations.  Instead of hacking them
> up, it seems like a better path is to implement the built-in atomic
> memory operations for MIPS and use them instead.  The result is this
> (very) lightly tested patch.

Did you mean to consistently spell "store_contidional" that way  
(instead of "conditional")?

Ken


* Re: RFC: [PATCH] MIPS: Implement built-in atomic memory operations.
  2007-08-13  6:26 ` Ken Raeburn
@ 2007-08-13  6:38   ` David Daney
  0 siblings, 0 replies; 20+ messages in thread
From: David Daney @ 2007-08-13  6:38 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: GCC Patches, Richard Sandiford

Ken Raeburn wrote:
> On Aug 12, 2007, at 16:20, David Daney wrote:
>> While troubleshooting libgcj problems on n64, I discovered that the
>> atomic operations implemented in libjava/sysdep/mips/locks.h are both
>> buggy and do not support 64-bit operations.  Instead of hacking them
>> up, it seems like a better path is to implement the built-in atomic
>> memory operations for MIPS and use them instead.  The result is this
>> (very) lightly tested patch.
>
> Did you mean to consistently spell "store_contidional" that way 
> (instead of "conditional")?
No.  Sometimes things just come out wrong.  Since I do the majority of 
my programming by cut-and-paste operations, the typo became universal.  
I will fix it in the next revision of the patch.

David Daney


* Re: RFC: [PATCH] MIPS: Implement built-in atomic memory operations.
  2007-08-12 20:23 RFC: [PATCH] MIPS: Implement built-in atomic memory operations David Daney
  2007-08-13  6:26 ` Ken Raeburn
@ 2007-08-13  7:23 ` Richard Sandiford
  2007-08-13  7:24   ` Richard Sandiford
                     ` (2 more replies)
  1 sibling, 3 replies; 20+ messages in thread
From: Richard Sandiford @ 2007-08-13  7:23 UTC (permalink / raw)
  To: David Daney; +Cc: GCC Patches

Thanks for doing this.

David Daney <ddaney@avtrex.com> writes:
> +(define_insn "load_linked<mode>"
> +  [(set (match_operand:GPR 0 "register_operand" "=d")
> +  	(unspec_volatile:GPR [(match_operand:GPR 1 "memory_operand" "o")]
> +         UNSPEC_LL))]
> +  "ISA_HAS_LL_SC"
> +  "ll<d>\t%0,%1"
> +  [(set_attr "type" "load")
> +   (set_attr "mode" "<MODE>")])

I think the constraint should be "R" rather than "o".  The operand
doesn't need to be offsettable, but you don't want it to include a GOT
load (for -mabicalls -mno-explicit-relocs).

> +(define_insn "store_contidional<mode>"
> +  [(set (match_operand:GPR 0 "memory_operand" "=o")
> +  	(match_operand:GPR 1 "register_operand" "d"))
> +   (set (match_operand:GPR 2 "register_operand" "=1")		   
> +  	(unspec_volatile:GPR [(const_int 0)] UNSPEC_SC))]
> +  "ISA_HAS_LL_SC"
> +  "sc<d>\t%1,%0"
> +  [(set_attr "type" "store")
> +   (set_attr "mode" "<MODE>")]

Same here.  But... I worry about the safety of splitting the instructions
up like this.  The result of the ll/sc pair is not predictable if we stick
a load between them, and what's to stop the compiler from inserting a
load from read-only memory?

I think it would be safer to use a single insn for the whole loop.
I realise this is going in the opposite direction to the cache flush
patch, where I wanted single-insn rtl patterns, but there were no such
correctness issues then.  ll/sc is a very special case.

Also, as with alpha/sync.md, I'd like to see a code macro used in the
sync_old_<fetch_op><mode> and sync_new_<fetchop><mode> cases,
where fetch_op covers everything except "nand".

So I'd suggest having one pattern each for the loops in:

  - sync_compare_and_swap<mode>
  - sync_old_<fetch_op><mode>
  - sync_old_nand<mode>
  - sync_new_<fetch_op><mode>
  - sync_new_nand<mode>

See alpha.md for the fetch_op stuff.

Richard


* Re: RFC: [PATCH] MIPS: Implement built-in atomic memory operations.
  2007-08-13  7:23 ` Richard Sandiford
@ 2007-08-13  7:24   ` Richard Sandiford
  2007-08-13 16:22   ` David Daney
  2007-08-14  8:25   ` David Daney
  2 siblings, 0 replies; 20+ messages in thread
From: Richard Sandiford @ 2007-08-13  7:24 UTC (permalink / raw)
  To: David Daney; +Cc: GCC Patches

Richard Sandiford <richard@codesourcery.com> writes:
> See alpha.md for the fetch_op stuff.

Sorry, ignore that, left over from a previous version...

Richard


* Re: RFC: [PATCH] MIPS: Implement built-in atomic memory operations.
  2007-08-13  7:23 ` Richard Sandiford
  2007-08-13  7:24   ` Richard Sandiford
@ 2007-08-13 16:22   ` David Daney
  2007-08-13 16:53     ` Richard Sandiford
  2007-08-13 18:08     ` Maciej W. Rozycki
  2007-08-14  8:25   ` David Daney
  2 siblings, 2 replies; 20+ messages in thread
From: David Daney @ 2007-08-13 16:22 UTC (permalink / raw)
  To: richard; +Cc: GCC Patches

Richard Sandiford wrote:
> Thanks for doing this.
>
> David Daney <ddaney@avtrex.com> writes:
>   
>> +(define_insn "load_linked<mode>"
>> +  [(set (match_operand:GPR 0 "register_operand" "=d")
>> +  	(unspec_volatile:GPR [(match_operand:GPR 1 "memory_operand" "o")]
>> +         UNSPEC_LL))]
>> +  "ISA_HAS_LL_SC"
>> +  "ll<d>\t%0,%1"
>> +  [(set_attr "type" "load")
>> +   (set_attr "mode" "<MODE>")])
>>     
>
> I think the constraint should be "R" rather than "o".  The operand
> doesn't need to be offsettable, but you don't want it to include a GOT
> load (for -mabicalls -mno-explicit-relocs).
>
>   
Right.  "R" does seem to be exactly what I was looking for; I should 
have dug a little deeper to find it.

>> +(define_insn "store_contidional<mode>"
>> +  [(set (match_operand:GPR 0 "memory_operand" "=o")
>> +  	(match_operand:GPR 1 "register_operand" "d"))
>> +   (set (match_operand:GPR 2 "register_operand" "=1")		   
>> +  	(unspec_volatile:GPR [(const_int 0)] UNSPEC_SC))]
>> +  "ISA_HAS_LL_SC"
>> +  "sc<d>\t%1,%0"
>> +  [(set_attr "type" "store")
>> +   (set_attr "mode" "<MODE>")]
>>     
>
> Same here.  But... I worry about the safety of splitting the instructions
> up like this.  The result of the ll/sc pair is not predictable if we stick
> a load between them,
Just for the sake of argument: both load_linked and store_conditional 
are unspec_volatile, so the compiler will not put anything between 
them that we don't emit ourselves.  It seems that the expander could 
ensure that all operands are either in registers or immediate constants 
before emitting insns between load_linked and store_conditional.
>  and what's to stop the compiler from inserting a
> load from read-only memory?
>   
That is not a problem unique to ll/sc, is it?  Presumably there would be 
some sort of fault, and something like SIGSEGV or SIGBUS would be generated.

> I think it would be safer to use a single insn for the whole loop.
> I realise this is going in the opposite direction to the cache flush
> patch, where I wanted single-insn rtl patterns, but there were no such
> correctness issues then.  ll/sc is a very special case.
>
>   
I am not opposed to doing it this way, but want to address the questions 
I posed above.

> Also, as with alpha/sync.md, I'd like to see a code macro used in the
> sync_old_<fetch_op><mode> and sync_new_<fetchop><mode> cases,
> where fetch_op covers everything except "nand".
>
> So I'd suggest having one pattern each for the loops in:
>
>   - sync_compare_and_swap<mode>
>   - sync_old_<fetch_op><mode>
>   - sync_old_nand<mode>
>   - sync_new_<fetch_op><mode>
>   - sync_new_nand<mode>
>   
Clearly that would be a cleaner way of implementing it.

David Daney


* Re: RFC: [PATCH] MIPS: Implement built-in atomic memory operations.
  2007-08-13 16:22   ` David Daney
@ 2007-08-13 16:53     ` Richard Sandiford
  2007-08-13 18:08     ` Maciej W. Rozycki
  1 sibling, 0 replies; 20+ messages in thread
From: Richard Sandiford @ 2007-08-13 16:53 UTC (permalink / raw)
  To: David Daney; +Cc: GCC Patches

David Daney <ddaney@avtrex.com> writes:
> Richard Sandiford wrote:
>>> +(define_insn "store_contidional<mode>"
>>> +  [(set (match_operand:GPR 0 "memory_operand" "=o")
>>> +  	(match_operand:GPR 1 "register_operand" "d"))
>>> +   (set (match_operand:GPR 2 "register_operand" "=1")		   
>>> +  	(unspec_volatile:GPR [(const_int 0)] UNSPEC_SC))]
>>> +  "ISA_HAS_LL_SC"
>>> +  "sc<d>\t%1,%0"
>>> +  [(set_attr "type" "store")
>>> +   (set_attr "mode" "<MODE>")]
>>>     
>>
>> Same here.  But... I worry about the safety of splitting the instructions
>> up like this.  The result of the ll/sc pair is not predictable if we stick
>> a load between them,
> Just for the sake of argument: Both load_linked and store_conditional 
> are unspec_volatile, the compiler will not be putting anything between 
> them we don't emit ourselves.  It seems that the expander could assure 
> that all operands were either in registers or immediate constants before 
> emitting insns between load_linked and store_conditional.

Sorry, I'd forgotten that they were unspec_volatile.  But even so,
that isn't a complete guarantee.  Let's say the pseudo register in
the "sc" address is P, and consider the case where P is not
allocated a hard register.  If P is equivalent to a constant C
throughout the function, reload will load C immediately before the "sc".
And that load of C might involve a memory access; it might be a GOT load,
or a load from the constant pool.  And if P is instead spilled to the stack,
reload will insert a load from the stack immediately before the "sc".
When optimisation is enabled, reload will often be able to inherit
the reload for the "ll", but that isn't guaranteed, and doesn't help
when not optimising.
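
To illustrate (a hand-constructed sketch, not actual compiler output;
register numbers are made up, and the reload before the "ll" is
omitted), the spilled case could come out as:

        sync
1:      ll      $8,0($9)
        addu    $10,$8,$4
        lw      $9,16($sp)      # reload of the spilled address P,
                                # inserted between the ll and the sc;
                                # on some implementations this access
                                # breaks the reservation, so the sc
                                # may never succeed
        sc      $10,0($9)
        beq     $10,$0,1b
        nop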

>>  and what's to stop the compiler from inserting a
>> load from read-only memory?
>>   
> That is not a problem unique to ll/sc, is it?  Presumably there would be 
> some sort of fault, and something like SIGSEGV or SIGBUS would be generated.

Well, this is moot given the unspec_volatile thing, but I was talking
about loads that are known not to trap, or (when -fnon-call-exceptions
is off) loads that postdominate this operation.

Richard


* Re: RFC: [PATCH] MIPS: Implement built-in atomic memory operations.
  2007-08-13 16:22   ` David Daney
  2007-08-13 16:53     ` Richard Sandiford
@ 2007-08-13 18:08     ` Maciej W. Rozycki
  2007-08-13 18:33       ` David Daney
  1 sibling, 1 reply; 20+ messages in thread
From: Maciej W. Rozycki @ 2007-08-13 18:08 UTC (permalink / raw)
  To: David Daney; +Cc: richard, GCC Patches

On Mon, 13 Aug 2007, David Daney wrote:

> > and what's to stop the compiler from inserting a
> > load from read-only memory?
> >   
> That is not a problem unique to ll/sc, is it?  Presumably there would be some
> sort of fault, and something like SIGSEGV or SIGBUS would be generated.

 It is implementation-specific how much address space is checked by the 
LL/SC logic for bus activity, so if you have another load (or store) 
between an LL and an SC chances are the SC will never succeed.

  Maciej


* Re: RFC: [PATCH] MIPS: Implement built-in atomic memory operations.
  2007-08-13 18:08     ` Maciej W. Rozycki
@ 2007-08-13 18:33       ` David Daney
  2007-08-15 17:25         ` Maciej W. Rozycki
  0 siblings, 1 reply; 20+ messages in thread
From: David Daney @ 2007-08-13 18:33 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: richard, GCC Patches

Maciej W. Rozycki wrote:
> On Mon, 13 Aug 2007, David Daney wrote:
> 
>>> and what's to stop the compiler from inserting a
>>> load from read-only memory?
>>>   
>> That is not a problem unique to ll/sc, is it?  Presumably there would be some
>> sort of fault, and something like SIGSEGV or SIGBUS would be generated.
> 
>  It is implementation-specific how much address space is checked by the 
> LL/SC logic for bus activity, so if you have another load (or store) 
> between an LL and an SC chances are the SC will never succeed.
> 

Right.  It is clear that although my original patch generates correct 
code for simple test cases, in general it may generate bad code.

My next revision will follow Richard's suggestion and hard code the 
entire ll/sc sequence.

David Daney


* Re: RFC: [PATCH] MIPS: Implement built-in atomic memory operations.
  2007-08-13  7:23 ` Richard Sandiford
  2007-08-13  7:24   ` Richard Sandiford
  2007-08-13 16:22   ` David Daney
@ 2007-08-14  8:25   ` David Daney
  2007-08-14 13:51     ` Thiemo Seufer
  2007-08-14 13:55     ` Richard Sandiford
  2 siblings, 2 replies; 20+ messages in thread
From: David Daney @ 2007-08-14  8:25 UTC (permalink / raw)
  To: David Daney, GCC Patches, richard

[-- Attachment #1: Type: text/plain, Size: 1425 bytes --]

Richard Sandiford wrote:
> So I'd suggest having one pattern each for the loops in:
>
>   - sync_compare_and_swap<mode>
>   - sync_old_<fetch_op><mode>
>   - sync_old_nand<mode>
>   - sync_new_<fetch_op><mode>
>   - sync_new_nand<mode>
>
>   
Like this?

I split the fetch_ops into two groups (arithmetic and bitwise).  This 
was to make it possible to have patterns that match immediate operands, 
which are treated differently in the two cases (sign-extended vs. 
zero-extended).
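
For instance (a hand-written illustration of the difference):
----------------
        addiu   $2,$2,-16       # 16-bit immediate is sign-extended
                                # ("I" constraint, arith_operand)
        andi    $2,$2,0xffff    # 16-bit immediate is zero-extended
                                # ("K" constraint, uns_arith_operand)
----------------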

Still only lightly tested.

Comments?

2007-08-14  David Daney  <ddaney@avtrex.com>

    * config/mips/mips.md (UNSPEC_LL, UNSPEC_SC,
    UNSPEC_COMPARE_AND_SWAP, UNSPEC_SYNC_OLD_OP, UNSPEC_SYNC_NEW_OP,
    UNSPEC_SYNC_EXCHANGE): New define_constants.
    (sync): Change condition to ISA_HAS_SYNC.
    (rdhwr): Change predicate for operand 0 to register_operand.
    (memory_barrier): New expand.
    (sync_compare_and_swap<mode>, sync_old_<fetchop_arith_name><mode>,
    sync_new_<fetchop_arith_name><mode>, sync_old_<fetchop_bit_name><mode>,
    sync_new_<fetchop_bit_name><mode>, sync_old_nand<mode>,
    sync_new_nand<mode>, sync_lock_test_and_set<mode>): New insns.
    (FETCHOP_ARITH, FETCHOP_BIT): New code macros.
    (fetchop_arith_name, fetchop_arith_op, fetchop_arith_imm_op,
    fetchop_bit_name, fetchop_bit_op, fetchop_bit_imm_op): New code attrs.
    * config/mips/mips.h (ISA_HAS_SYNC, ISA_HAS_LL_SC): New ISA predicates.


[-- Attachment #2: mips-atomic.diff --]
[-- Type: text/x-patch, Size: 9281 bytes --]

Index: config/mips/mips.md
===================================================================
--- config/mips/mips.md	(revision 127356)
+++ config/mips/mips.md	(working copy)
@@ -53,7 +53,13 @@ (define_constants
    (UNSPEC_RDHWR		34)
    (UNSPEC_SYNCI		35)
    (UNSPEC_SYNC			36)
-
+   (UNSPEC_LL			37)
+   (UNSPEC_SC			38)
+   (UNSPEC_COMPARE_AND_SWAP	39)
+   (UNSPEC_SYNC_OLD_OP		40)
+   (UNSPEC_SYNC_NEW_OP		41)
+   (UNSPEC_SYNC_EXCHANGE	42)
+   
    (UNSPEC_ADDRESS_FIRST	100)
 
    (FAKE_CALL_REGNO		79)
@@ -4251,7 +4257,7 @@ (define_expand "clear_cache"
 
 (define_insn "sync"
   [(unspec_volatile [(const_int 0)] UNSPEC_SYNC)]
-  "ISA_HAS_SYNCI"
+  "ISA_HAS_SYNC"
   "sync")
 
 (define_insn "synci"
@@ -4261,7 +4267,7 @@ (define_insn "synci"
   "synci\t0(%0)")
 
 (define_insn "rdhwr"
-  [(set (match_operand:SI 0 "general_operand" "=d")
+  [(set (match_operand:SI 0 "register_operand" "=d")
         (unspec_volatile [(match_operand:SI 1 "const_int_operand" "n")]
         UNSPEC_RDHWR))]
   "ISA_HAS_SYNCI"
@@ -4283,6 +4289,257 @@ (define_insn "clear_hazard"
          "\t.set\tpop";
 }
   [(set_attr "length" "20")])
+
+(define_expand "memory_barrier"
+  [(unspec_volatile [(const_int 0)] UNSPEC_SYNC)]
+  "ISA_HAS_SYNC"
+  "")
+
+(define_insn "sync_compare_and_swap<mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=&d,d")
+	(match_operand:GPR 1 "memory_operand" "+R,R"))
+   (set (match_dup 1)
+	(unspec_volatile:GPR [(match_operand:GPR 2 "register_operand" "d,d")
+			      (match_operand:GPR 3 "arith_operand" "I,d")]
+	 UNSPEC_COMPARE_AND_SWAP))
+   (clobber (match_scratch:GPR 4 "=d,d"))]
+  "ISA_HAS_LL_SC"
+{
+  output_asm_insn(".set\tpush\n"
+         "\t.set\tnoreorder\n"
+         "\t.set\tnomacro\n"
+	 "\tsync\n"
+	 "1:\tll<d>\t%0,%1\n"
+	 "\tbne\t%0,%2,2f", operands);
+
+  if (which_alternative == 0)
+    output_asm_insn("li\t%4,%3", operands);
+  else
+    output_asm_insn("move\t%4,%3", operands);
+
+  return "sc<d>\t%4,%1\n"
+	 "\tbeq\t%4,$0,1b\n"
+	 "\tnop\n"
+	 "2:\n"
+         "\t.set\tpop";
+}
+  [(set_attr "length" "28")])
+
+(define_code_macro FETCHOP_ARITH [plus minus])
+(define_code_attr fetchop_arith_name [(plus "add") (minus "sub")])
+(define_code_attr fetchop_arith_op [(plus "addu") (minus "subu")])
+(define_code_attr fetchop_arith_imm_op [(plus "addiu") (minus "subiu")])
+
+
+(define_insn "sync_old_<fetchop_arith_name><mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=&d,d")
+	(match_operand:GPR 1 "memory_operand" "+R,R"))
+   (set (match_dup 1)
+	(unspec_volatile:GPR
+          [(FETCHOP_ARITH:GPR (match_operand:GPR 2 "arith_operand" "I,d")
+			      (match_dup 2))]
+	 UNSPEC_SYNC_OLD_OP))
+   (clobber (match_scratch:GPR 3 "=d,d"))]
+  "ISA_HAS_LL_SC"
+{
+  output_asm_insn(".set\tpush\n"
+         "\t.set\tnoreorder\n"
+         "\t.set\tnomacro\n"
+	 "\tsync\n"
+	 "1:\tll<d>\t%0,%1\n", operands);
+
+  if (which_alternative == 0)
+    output_asm_insn("<d><fetchop_arith_imm_op>\t%3,%0,%2", operands);
+  else
+    output_asm_insn("<d><fetchop_arith_op>\t%3,%0,%2", operands);
+
+  return "sc<d>\t%3,%1\n"
+	 "\tbeq\t%3,$0,1b\n"
+	 "\tnop\n"
+         "\t.set\tpop";
+}
+  [(set_attr "length" "24")])
+
+(define_insn "sync_new_<fetchop_arith_name><mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=&d,d")
+	(match_operand:GPR 1 "memory_operand" "+R,R"))
+   (set (match_dup 1)
+	(unspec_volatile:GPR
+          [(FETCHOP_ARITH:GPR (match_operand:GPR 2 "arith_operand" "I,d")
+			      (match_dup 2))]
+	 UNSPEC_SYNC_NEW_OP))
+   (clobber (match_scratch:GPR 3 "=d,d"))]
+  "ISA_HAS_LL_SC"
+{
+  output_asm_insn(".set\tpush\n"
+         "\t.set\tnoreorder\n"
+         "\t.set\tnomacro\n"
+	 "\tsync\n"
+	 "1:\tll<d>\t%0,%1\n", operands);
+
+  if (which_alternative == 0)
+    output_asm_insn("<d><fetchop_arith_imm_op>\t%3,%0,%2", operands);
+  else
+    output_asm_insn("<d><fetchop_arith_op>\t%3,%0,%2", operands);
+
+  output_asm_insn("sc<d>\t%3,%1\n"
+		  "\tbeq\t%3,$0,1b\n", operands);
+  /* The new value was clobbered.  Recalculate it in the delay slot.  */
+  if (which_alternative == 0)
+    output_asm_insn("<d><fetchop_arith_imm_op>\t%0,%0,%2", operands);
+  else
+    output_asm_insn("<d><fetchop_arith_op>\t%0,%0,%2", operands);
+
+  return "\t.set\tpop";
+}
+  [(set_attr "length" "24")])
+
+(define_code_macro FETCHOP_BIT [ior xor and])
+(define_code_attr fetchop_bit_name [(ior "ior") (xor "xor") (and "and")])
+(define_code_attr fetchop_bit_op [(ior "or") (xor "xor") (and "and")])
+(define_code_attr fetchop_bit_imm_op [(ior "ori") (xor "xori") (and "andi")])
+
+(define_insn "sync_old_<fetchop_bit_name><mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=&d,d")
+	(match_operand:GPR 1 "memory_operand" "+R,R"))
+   (set (match_dup 1)
+	(unspec_volatile:GPR
+          [(FETCHOP_BIT:GPR (match_operand:GPR 2 "uns_arith_operand" "K,d")
+			      (match_dup 2))]
+	 UNSPEC_SYNC_OLD_OP))
+   (clobber (match_scratch:GPR 3 "=d,d"))]
+  "ISA_HAS_LL_SC"
+{
+  output_asm_insn(".set\tpush\n"
+         "\t.set\tnoreorder\n"
+         "\t.set\tnomacro\n"
+	 "\tsync\n"
+	 "1:\tll<d>\t%0,%1\n", operands);
+
+  if (which_alternative == 0)
+    output_asm_insn("<fetchop_bit_imm_op>\t%3,%0,%2", operands);
+  else
+    output_asm_insn("<fetchop_bit_op>\t%3,%0,%2", operands);
+
+  return "sc<d>\t%3,%1\n"
+	 "\tbeq\t%3,$0,1b\n"
+	 "\tnop\n"
+         "\t.set\tpop";
+}
+  [(set_attr "length" "24")])
+
+(define_insn "sync_new_<fetchop_bit_name><mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=&d,d")
+	(match_operand:GPR 1 "memory_operand" "+R,R"))
+   (set (match_dup 1)
+	(unspec_volatile:GPR
+          [(FETCHOP_BIT:GPR (match_operand:GPR 2 "uns_arith_operand" "K,d")
+			      (match_dup 2))]
+	 UNSPEC_SYNC_NEW_OP))
+   (clobber (match_scratch:GPR 3 "=d,d"))]
+  "ISA_HAS_LL_SC"
+{
+  output_asm_insn(".set\tpush\n"
+         "\t.set\tnoreorder\n"
+         "\t.set\tnomacro\n"
+	 "\tsync\n"
+	 "1:\tll<d>\t%0,%1\n", operands);
+
+  if (which_alternative == 0)
+    output_asm_insn("<fetchop_bit_imm_op>\t%3,%0,%2", operands);
+  else
+    output_asm_insn("<fetchop_bit_op>\t%3,%0,%2", operands);
+
+  output_asm_insn("sc<d>\t%3,%1\n"
+		  "\tbeq\t%3,$0,1b\n", operands);
+  /* The new value was clobbered.  Recalculate it in the delay slot.  */
+  if (which_alternative == 0)
+    output_asm_insn("<fetchop_bit_imm_op>\t%0,%0,%2", operands);
+  else
+    output_asm_insn("<fetchop_bit_op>\t%0,%0,%2", operands);
+
+  return "\t.set\tpop";
+}
+  [(set_attr "length" "24")])
+
+(define_insn "sync_old_nand<mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=&d,d")
+	(match_operand:GPR 1 "memory_operand" "+R,R"))
+   (set (match_dup 1)
+	(unspec_volatile:GPR [(match_operand:GPR 2 "uns_arith_operand" "K,d")]
+	 UNSPEC_SYNC_OLD_OP))
+   (clobber (match_scratch:GPR 3 "=d,d"))]
+  "ISA_HAS_LL_SC"
+{
+  output_asm_insn(".set\tpush\n"
+         "\t.set\tnoreorder\n"
+         "\t.set\tnomacro\n"
+	 "\tsync\n"
+	 "1:\tll<d>\t%0,%1\n"
+	 "\tnor\t%0,$0,%0", operands);
+
+  if (which_alternative == 0)
+    output_asm_insn("andi\t%3,%0,%2", operands);
+  else
+    output_asm_insn("and\t%3,%0,%2", operands);
+
+  return "sc<d>\t%3,%1\n"
+	 "\tbeq\t%3,$0,1b\n"
+	 "\tnop\n"
+         "\t.set\tpop";
+}
+  [(set_attr "length" "28")])
+
+(define_insn "sync_new_nand<mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=&d,d")
+	(match_operand:GPR 1 "memory_operand" "+R,R"))
+   (set (match_dup 1)
+	(unspec_volatile:GPR [(match_operand:GPR 2 "uns_arith_operand" "K,d")]
+	 UNSPEC_SYNC_NEW_OP))
+   (clobber (match_scratch:GPR 3 "=d,d"))]
+  "ISA_HAS_LL_SC"
+{
+  output_asm_insn(".set\tpush\n"
+         "\t.set\tnoreorder\n"
+         "\t.set\tnomacro\n"
+	 "\tsync\n"
+	 "1:\tll<d>\t%0,%1\n"
+	 "\tnor\t%0,$0,%0", operands);
+
+  if (which_alternative == 0)
+    output_asm_insn("andi\t%3,%0,%2", operands);
+  else
+    output_asm_insn("and\t%3,%0,%2", operands);
+
+  return "move\t%0,%3\n"
+	 "\tsc<d>\t%3,%1\n"
+	 "\tbeq\t%3,$0,1b\n"
+	 "\tnop\n"
+         "\t.set\tpop";
+}
+  [(set_attr "length" "32")])
+
+(define_insn "sync_lock_test_and_set<mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=&d")
+	(match_operand:GPR 1 "memory_operand" "+R"))
+   (set (match_dup 1)
+	(unspec_volatile:GPR [(match_operand:GPR 2 "register_operand" "d")]
+	 UNSPEC_SYNC_EXCHANGE))
+   (clobber (match_scratch:GPR 3 "=d"))]
+  "ISA_HAS_LL_SC"
+{
+  return ".set\tpush\n"
+         "\t.set\tnoreorder\n"
+         "\t.set\tnomacro\n"
+	 "1:\tll<d>\t%0,%1\n"
+	 "\tmove\t%3,%2\n"
+	 "\tsc<d>\t%3,%1\n"
+	 "\tbeq\t%3,$0,1b\n"
+	 "\tnop\n"
+	 "\tsync\n"
+         "\t.set\tpop";
+}
+  [(set_attr "length" "24")])
 \f
 ;; Block moves, see mips.c for more details.
 ;; Argument 0 is the destination
Index: config/mips/mips.h
===================================================================
--- config/mips/mips.h	(revision 127356)
+++ config/mips/mips.h	(working copy)
@@ -876,6 +876,11 @@ extern enum mips_code_readable_setting m
 /* ISA includes synci, jr.hb and jalr.hb.  */
 #define ISA_HAS_SYNCI (ISA_MIPS32R2 && !TARGET_MIPS16)
 
+/* ISA includes sync.  */
+#define ISA_HAS_SYNC ((mips_isa >= 2 || TARGET_MIPS3900) && !TARGET_MIPS16)
+
+/* ISA includes ll and sc.  */
+#define ISA_HAS_LL_SC (mips_isa >= 2 && !TARGET_MIPS16)
 \f
 /* Add -G xx support.  */
 


* Re: RFC: [PATCH] MIPS: Implement built-in atomic memory operations.
  2007-08-14  8:25   ` David Daney
@ 2007-08-14 13:51     ` Thiemo Seufer
  2007-08-14 15:09       ` David Daney
  2007-08-14 13:55     ` Richard Sandiford
  1 sibling, 1 reply; 20+ messages in thread
From: Thiemo Seufer @ 2007-08-14 13:51 UTC (permalink / raw)
  To: David Daney; +Cc: GCC Patches, richard

David Daney wrote:
> Richard Sandiford wrote:
>> So I'd suggest having one pattern each for the loops in:
>>
>>   - sync_compare_and_swap<mode>
>>   - sync_old_<fetch_op><mode>
>>   - sync_old_nand<mode>
>>   - sync_new_<fetch_op><mode>
>>   - sync_new_nand<mode>
>>
>>   
> Like this?
>
> I split the fetch_ops into two groups (arithmetic and bitwise).  This 
> was to make it possible to have patterns that match immediate operands, 
> which are treated differently in the two cases (sign-extended vs. 
> zero-extended).
>
> Still only lightly tested.
>
> Comments?
[snip]
> +(define_insn "sync_lock_test_and_set<mode>"
> +  [(set (match_operand:GPR 0 "register_operand" "=&d")
> +	(match_operand:GPR 1 "memory_operand" "+R"))
> +   (set (match_dup 1)
> +	(unspec_volatile:GPR [(match_operand:GPR 2 "register_operand" "d")]
> +	 UNSPEC_SYNC_EXCHANGE))
> +   (clobber (match_scratch:GPR 3 "=d"))]
> +  "ISA_HAS_LL_SC"
> +{
> +  return ".set\tpush\n"
> +         "\t.set\tnoreorder\n"
> +         "\t.set\tnomacro\n"
> +	 "1:\tll<d>\t%0,%1\n"
> +	 "\tmove\t%3,%2\n"

Does this work? IIRC move is a macro.

> +	 "\tsc<d>\t%3,%1\n"
> +	 "\tbeq\t%3,$0,1b\n"
> +	 "\tnop\n"
> +	 "\tsync\n"
> +         "\t.set\tpop";
> +}
> +  [(set_attr "length" "24")])
>  \f
>  ;; Block moves, see mips.c for more details.
>  ;; Argument 0 is the destination
> Index: config/mips/mips.h
> ===================================================================
> --- config/mips/mips.h	(revision 127356)
> +++ config/mips/mips.h	(working copy)
> @@ -876,6 +876,11 @@ extern enum mips_code_readable_setting m
>  /* ISA includes synci, jr.hb and jalr.hb.  */
>  #define ISA_HAS_SYNCI (ISA_MIPS32R2 && !TARGET_MIPS16)
>  
> +/* ISA includes sync.  */
> +#define ISA_HAS_SYNC ((mips_isa >= 2 || TARGET_MIPS3900) && !TARGET_MIPS16)
> +
> +/* ISA includes ll and sc.  */
> +#define ISA_HAS_LL_SC (mips_isa >= 2 && !TARGET_MIPS16)

For a Linux configuration with MIPS I this should also be true,
since the kernel emulates ll/sc in those cases.


Thiemo


* Re: RFC: [PATCH] MIPS: Implement built-in atomic memory operations.
  2007-08-14  8:25   ` David Daney
  2007-08-14 13:51     ` Thiemo Seufer
@ 2007-08-14 13:55     ` Richard Sandiford
  2007-08-17 16:37       ` David Daney
  1 sibling, 1 reply; 20+ messages in thread
From: Richard Sandiford @ 2007-08-14 13:55 UTC (permalink / raw)
  To: David Daney; +Cc: GCC Patches

David Daney <ddaney@avtrex.com> writes:
> Richard Sandiford wrote:
>> So I'd suggest having one pattern each for the loops in:
>>
>>   - sync_compare_and_swap<mode>
>>   - sync_old_<fetch_op><mode>
>>   - sync_old_nand<mode>
>>   - sync_new_<fetch_op><mode>
>>   - sync_new_nand<mode>
>>
>>   
> Like this?
>
> I split the fetch_ops into two groups (arithmetic and bitwise).  This 
> was to make it possible to have patterns that match immediate operands, 
> which are treated differently in the two cases (sign-extended vs. 
> zero-extended).

FWIW, that difference could be handled by code attributes.  I think the
real problem is the need for "<d>" in arithmetic ops and not logical ops.

> +(define_insn "sync_compare_and_swap<mode>"
> +  [(set (match_operand:GPR 0 "register_operand" "=&d,d")
> +	(match_operand:GPR 1 "memory_operand" "+R,R"))
> +   (set (match_dup 1)
> +	(unspec_volatile:GPR [(match_operand:GPR 2 "register_operand" "d,d")
> +			      (match_operand:GPR 3 "arith_operand" "I,d")]
> +	 UNSPEC_COMPARE_AND_SWAP))
> +   (clobber (match_scratch:GPR 4 "=d,d"))]

Operand 4 needs to be earlyclobber too, to handle the case where the
loop iterates.  But why not use $at instead?

FWIW, we could handle "IKL" (with an appropriate predicate)
without any change to the template.

> +{
> +  output_asm_insn(".set\tpush\n"
> +         "\t.set\tnoreorder\n"
> +         "\t.set\tnomacro\n"
> +	 "\tsync\n"
> +	 "1:\tll<d>\t%0,%1\n"
> +	 "\tbne\t%0,%2,2f", operands);
> +
> +  if (which_alternative == 0)
> +    output_asm_insn("li\t%4,%3", operands);
> +  else
> +    output_asm_insn("move\t%4,%3", operands);
> +
> +  return "sc<d>\t%4,%1\n"
> +	 "\tbeq\t%4,$0,1b\n"
> +	 "\tnop\n"
> +	 "2:\n"
> +         "\t.set\tpop";
> +}
> +  [(set_attr "length" "28")])

Please use %(%<...%>%) to wrap the set noreorder/set nomacro stuff.
Make that %(%<%[...%]%>%) if we're using $at (nice and readable, eh?)
Also use %@ instead of $at and %. instead of $0.

I think it would be (slightly) cleaner to have:

/* Return an asm string that atomically:

     - Compares memory reference %1 to register %2 and, if they are
       equal, changes %1 to %3.

     - Sets register %0 to the old value of memory reference %1.

   SUFFIX is the suffix that should be added to "ll" and "sc" instructions
   and OP is the instruction that should be used to load %3 into a
   register.  */
#define MIPS_COMPARE_AND_SWAP(SUFFIX, OP)	\
  "%(%<%[sync\n"				\
  "1:\tll" SUFFIX "\t%0,%1\n"			\
  "\tbne\t%0,%2,2f\n"				\
  OP "\t%@,%3\n"				\
  "\tsc" SUFFIX "\t%@,%1\n"			\
  "\tbeq\t%@,%.,1b\n"				\
  "\tnop\n"					\
  "2:%]%>%)\n"
....
  if (which_alternative == 0)
    return MIPS_COMPARE_AND_SWAP ("<d>", "li");
  else
    return MIPS_COMPARE_AND_SWAP ("<d>", "move");

The win isn't that big here, but it's bigger if we do the same thing
for the (repeated) sync_old* and sync_new* patterns; more below.

> +(define_code_macro FETCHOP_ARITH [plus minus])

These are usually lowercase for MIPS.  Please put at the top of the
file, with the others.

> +(define_code_attr fetchop_arith_name [(plus "add") (minus "sub")])

Add this to the "optab" attribute

> +(define_code_attr fetchop_arith_op [(plus "addu") (minus "subu")])

...and this to the "insn" attribute.

> +(define_code_attr fetchop_arith_imm_op [(plus "addiu") (minus "subiu")])

We shouldn't pretend there's a "subiu" insn.  I suppose this is where
the idea of using code macros falls down ;(  We could get around it by
adding code attributes for the predicate and constraint (as Alpha does)
but perhaps it is better to keep them separate.
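
For the record, roughly what that would look like (sketch only;
attribute names are illustrative, and minus is left out because of
the missing "subiu"):

;; Per-code predicate/constraint attributes in the style of
;; alpha/sync.md.
(define_code_attr fetchop_pred
  [(plus "arith_operand") (ior "uns_arith_operand")
   (xor "uns_arith_operand") (and "uns_arith_operand")])
(define_code_attr fetchop_constr
  [(plus "I") (ior "K") (xor "K") (and "K")])

;; A single pattern could then say:
;;   (match_operand:GPR 2 "<fetchop_pred>" "<fetchop_constr>,d")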

> +(define_insn "sync_old_<fetchop_arith_name><mode>"
> +  [(set (match_operand:GPR 0 "register_operand" "=&d,d")
> +	(match_operand:GPR 1 "memory_operand" "+R,R"))
> +   (set (match_dup 1)
> +	(unspec_volatile:GPR
> +          [(FETCHOP_ARITH:GPR (match_operand:GPR 2 "arith_operand" "I,d")
> +			      (match_dup 2))]
> +	 UNSPEC_SYNC_OLD_OP))
> +   (clobber (match_scratch:GPR 3 "=d,d"))]
> +  "ISA_HAS_LL_SC"

Same $at comment here.  Also, the first operand to the fetchop ought to
be 1 rather than 2.  (Not that it makes any difference to the generated
code; it's just a little confusing as-is.)

Again, I think it would be good to add a MIPS_SYNC_OLD macro,
along the lines of MIPS_COMPARE_AND_SWAP above.
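
Perhaps something along these lines (sketch only, untested; it follows
the conventions of MIPS_COMPARE_AND_SWAP above):

/* Return an asm string that atomically sets register %0 to the old
   value of memory reference %1 and %1 to %0 OP %2, using %3 as a
   scratch.  SUFFIX is the suffix that should be added to "ll" and
   "sc" and INSN the instruction that applies the operation.  */
#define MIPS_SYNC_OLD(SUFFIX, INSN)		\
  "%(%<sync\n"					\
  "1:\tll" SUFFIX "\t%0,%1\n"			\
  "\t" INSN "\t%3,%0,%2\n"			\
  "\tsc" SUFFIX "\t%3,%1\n"			\
  "\tbeq\t%3,%.,1b\n"				\
  "\tnop%>%)"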

> +(define_code_attr fetchop_bit_imm_op [(ior "ori") (xor "xori") (and "andi")])

Let's make this "imm_insn", to match "insn".

> +(define_insn "sync_new_<fetchop_arith_name><mode>"
> +  [(set (match_operand:GPR 0 "register_operand" "=&d,d")
> +	(match_operand:GPR 1 "memory_operand" "+R,R"))
> +   (set (match_dup 1)
> +	(unspec_volatile:GPR
> +          [(FETCHOP_ARITH:GPR (match_operand:GPR 2 "arith_operand" "I,d")
> +			      (match_dup 2))]
> +	 UNSPEC_SYNC_NEW_OP))
> +   (clobber (match_scratch:GPR 3 "=d,d"))]
> +  "ISA_HAS_LL_SC"

Again, this pattern is a little confusing.  Operand 0 is set to the
result of the operation, not the original memory reference.

Same $at comment, and it might be better to have a MIPS_SYNC_NEW macro.

> +(define_insn "sync_new_nand<mode>"
> +  [(set (match_operand:GPR 0 "register_operand" "=&d,d")
> +	(match_operand:GPR 1 "memory_operand" "+R,R"))
> +   (set (match_dup 1)
> +	(unspec_volatile:GPR [(match_operand:GPR 2 "uns_arith_operand" "K,d")]
> +	 UNSPEC_SYNC_NEW_OP))
> +   (clobber (match_scratch:GPR 3 "=d,d"))]
> +  "ISA_HAS_LL_SC"
> +{
> +  output_asm_insn(".set\tpush\n"
> +         "\t.set\tnoreorder\n"
> +         "\t.set\tnomacro\n"
> +	 "\tsync\n"
> +	 "1:\tll<d>\t%0,%1\n"
> +	 "\tnor\t%0,$0,%0", operands);
> +
> +  if (which_alternative == 0)
> +    output_asm_insn("andi\t%3,%0,%2", operands);
> +  else
> +    output_asm_insn("and\t%3,%0,%2", operands);
> +
> +  return "move\t%0,%3\n"
> +	 "\tsc<d>\t%3,%1\n"
> +	 "\tbeq\t%3,$0,1b\n"
> +	 "\tnop\n"
> +         "\t.set\tpop";
> +}

Why not avoid the move and use the same delay-slot repeat that you used
in the other sync_new* patterns?

> +(define_insn "sync_lock_test_and_set<mode>"
> +  [(set (match_operand:GPR 0 "register_operand" "=&d")
> +	(match_operand:GPR 1 "memory_operand" "+R"))
> +   (set (match_dup 1)
> +	(unspec_volatile:GPR [(match_operand:GPR 2 "register_operand" "d")]
> +	 UNSPEC_SYNC_EXCHANGE))
> +   (clobber (match_scratch:GPR 3 "=d"))]
> +  "ISA_HAS_LL_SC"
> +{
> +  return ".set\tpush\n"
> +         "\t.set\tnoreorder\n"
> +         "\t.set\tnomacro\n"
> +	 "1:\tll<d>\t%0,%1\n"
> +	 "\tmove\t%3,%2\n"
> +	 "\tsc<d>\t%3,%1\n"
> +	 "\tbeq\t%3,$0,1b\n"
> +	 "\tnop\n"
> +	 "\tsync\n"
> +         "\t.set\tpop";
> +}

No need for a C block.  Just make this a plain asm template.

Sorry for all the comments...

Richard


* Re: RFC: [PATCH] MIPS: Implement built-in atomic memory operations.
  2007-08-14 13:51     ` Thiemo Seufer
@ 2007-08-14 15:09       ` David Daney
  2007-08-14 15:35         ` Maciej W. Rozycki
  2007-08-14 15:39         ` Thiemo Seufer
  0 siblings, 2 replies; 20+ messages in thread
From: David Daney @ 2007-08-14 15:09 UTC (permalink / raw)
  To: Thiemo Seufer; +Cc: GCC Patches, richard

Thiemo Seufer wrote:
>> +	 "\tmove\t%3,%2\n"
>>     
>
> Does this work? IIRC move is a macro.
>
>   

With the binutils 2.17 assembler it does.

>> +
>> +/* ISA includes ll and sc.  */
>> +#define ISA_HAS_LL_SC (mips_isa >= 2 && !TARGET_MIPS16)
>>     
>
> For a Linux configuration with MIPS I this should also be true,
> the kernel emulates ll/sc in those cases.
>
>   
In this case could you "lie" to the compiler by passing -mips2 on the 
command line?  Otherwise we would have to either leave it as it is or 
add a new command line parameter to explicitly enable ll/sc.

David Daney


* Re: RFC: [PATCH] MIPS: Implement built-in atomic memory operations.
  2007-08-14 15:09       ` David Daney
@ 2007-08-14 15:35         ` Maciej W. Rozycki
  2007-08-14 15:54           ` David Daney
  2007-08-14 15:39         ` Thiemo Seufer
  1 sibling, 1 reply; 20+ messages in thread
From: Maciej W. Rozycki @ 2007-08-14 15:35 UTC (permalink / raw)
  To: David Daney; +Cc: Thiemo Seufer, GCC Patches, richard

On Tue, 14 Aug 2007, David Daney wrote:

> > Does this work? IIRC move is a macro.
> 
> With the binutils 2.17 assembler it does.

 Well, "move" is more of an instruction alias than a proper macro in 
the sense that it always expands to a single instruction (and is 
therefore unaffected by ".set nomacro", etc.).

> > For a Linux configuration with MIPS I this should also be true,
> > since the kernel emulates ll/sc in those cases.
> >
> >   
> In this case could you "lie" to the compiler by passing -mips2 on the command
> line?  Otherwise we would have to either leave it as it is or add a new
> command line parameter to explicitly enable ll/sc.

 I would feel a bit uneasy about implying the availability of ll/sc 
based on the target operating system.  Using a switch like 
"-march=mips2" is unacceptable, as it enables some other instructions.  
I would be in favour of an additional option along the lines of 
-mbranch-likely.

  Maciej


* Re: RFC: [PATCH] MIPS: Implement built-in atomic memory operations.
  2007-08-14 15:09       ` David Daney
  2007-08-14 15:35         ` Maciej W. Rozycki
@ 2007-08-14 15:39         ` Thiemo Seufer
  1 sibling, 0 replies; 20+ messages in thread
From: Thiemo Seufer @ 2007-08-14 15:39 UTC (permalink / raw)
  To: David Daney; +Cc: GCC Patches, richard

David Daney wrote:
[snip]
>>> +/* ISA includes ll and sc.  */
>>> +#define ISA_HAS_LL_SC (mips_isa >= 2 && !TARGET_MIPS16)
>>>     
>>
>> For a Linux configuration with MIPS I this should also be true,
>> since the kernel emulates ll/sc in those cases.
>>
>>   
> In this case could you "lie" to the compiler by passing -mips2 on the 
> command line?

No, -mips2 enables a number of things which aren't emulated by the
kernel. The current technique is a .set mips2 ... .set mips0 pair
around handwritten ll/sc code.
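
For illustration, the technique looks roughly like this (a made-up
example, not taken from the kernel sources):

/* Hand-written ll/sc guarded by .set mips2/.set mips0.  */
static inline int atomic_xchg (volatile int *m, int val)
{
  int old, tmp;

  __asm__ __volatile__ (
    ".set\tmips2\n"
    "1:\tll\t%0, %2\n"          /* assembled as MIPS II insns...  */
    "\tmove\t%1, %3\n"
    "\tsc\t%1, %2\n"
    "\tbeqz\t%1, 1b\n"
    ".set\tmips0\n"             /* ...even when compiling with -mips1  */
    : "=&r" (old), "=&r" (tmp), "+m" (*m)
    : "r" (val)
    : "memory");
  return old;
}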

> Otherwise we would have to either leave it as it is or add a 
> new command line parameter to explicitly enable ll/sc.

Couldn't linux.h simply override the default definition in mips.h?


Thiemo


* Re: RFC: [PATCH] MIPS: Implement built-in atomic memory operations.
  2007-08-14 15:35         ` Maciej W. Rozycki
@ 2007-08-14 15:54           ` David Daney
  2007-08-14 16:02             ` Richard Sandiford
  0 siblings, 1 reply; 20+ messages in thread
From: David Daney @ 2007-08-14 15:54 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Thiemo Seufer, GCC Patches, richard

Maciej W. Rozycki wrote:
>>> For a Linux configuration with MIPS I this should also be true,
>>> since the kernel emulates ll/sc in those cases.
>>>
>>>   
>>>       
>> In this case could you "lie" to the compiler by passing -mips2 on the command
>> line?  Otherwise we would have to either leave it as it is or add a new
>> command line parameter to explicitly enable ll/sc.
>>     
>
>  I would feel a bit uneasy about adding the implication of the 
> availability of ll/sc based on the target operating system.  Using a 
> switch like "-march=mips2" is unacceptable as it enables some other 
> instructions.  I would be in favour of an additional option along the 
> lines of -mbranch-likely.
>
>   
Something like -mllsc.  The default value could be set at configure time 
with --with-llsc=[true|false]. Perhaps mips*-*-linux* would imply 
--with-llsc=true unless overridden.  Done in a similar manner to the 
default -march and -msoft-float settings.
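
The config/mips/mips.opt entry might look something like this (just a 
sketch, names subject to change):

------------------------
mllsc
Target Report Mask(LLSC)
Use ll, sc and sync instructions
------------------------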

Perhaps Richard could opine on the matter.

David Daney


* Re: RFC: [PATCH] MIPS: Implement built-in atomic memory operations.
  2007-08-14 15:54           ` David Daney
@ 2007-08-14 16:02             ` Richard Sandiford
  0 siblings, 0 replies; 20+ messages in thread
From: Richard Sandiford @ 2007-08-14 16:02 UTC (permalink / raw)
  To: David Daney; +Cc: Maciej W. Rozycki, Thiemo Seufer, GCC Patches

David Daney <ddaney@avtrex.com> writes:
> Maciej W. Rozycki wrote:
>>>> For a Linux configuration with MIPS I this should also be true,
>>>> the kernel emulates ll/sc in those cases.
>>>>
>>>>   
>>>>       
>>> In this case could you "lie" to the compiler by passing -mips2 on the command
>>> line?  Otherwise we would have to either leave it as it is or add a new
>>> command line parameter to explicitly enable ll/sc.
>>>     
>>
>>  I would feel a bit uneasy about adding the implication of the 
>> availability of ll/sc based on the target operating system.  Using a 
>> switch like "-march=mips2" is unacceptable as it enables some other 
>> instructions.  I would be in favour of an additional option along the 
>> lines of -mbranch-likely.
>>
>>   
> Something like -mllsc.  The default value could be set at configure time 
> with --with-llsc=[true|false]. Perhaps mips*-*-linux* would imply 
> --with-llsc=true unless overridden.  Done in a similar manner to the 
> default -march and -msoft-float settings.
>
> Perhaps Richard could opine on the matter.

Sounds fine to me.  FWIW, if you'd rather not add this option yourself,
I think it's OK for the patch to go in without.

Richard


* Re: RFC: [PATCH] MIPS: Implement built-in atomic memory operations.
  2007-08-13 18:33       ` David Daney
@ 2007-08-15 17:25         ` Maciej W. Rozycki
  0 siblings, 0 replies; 20+ messages in thread
From: Maciej W. Rozycki @ 2007-08-15 17:25 UTC (permalink / raw)
  To: David Daney; +Cc: richard, GCC Patches

On Mon, 13 Aug 2007, David Daney wrote:

> > It is implementation-specific how much address space is checked by the LL/SC
> > logic for bus activity, so if you have another load (or store) between an LL
> > and an SC chances are the SC will never succeed.
> > 
> 
> Right.  It is clear that although with my original patch correct code is
> generated for simple test cases, in general there is the possibility of 
>  generating bad code.

 Yeah, it would be nice to have the ability to mark a block of code as 
one in which data memory accesses are prohibited, but at the moment I do 
not even have an idea of what it might look like.  Such a feature would 
permit (almost) arbitrarily complicated atomic RMW operations, as LL/SC 
is probably as flexible as you can get.

  Maciej


* Re: RFC: [PATCH] MIPS: Implement built-in atomic memory operations.
  2007-08-14 13:55     ` Richard Sandiford
@ 2007-08-17 16:37       ` David Daney
  2007-08-17 17:29         ` Richard Sandiford
  0 siblings, 1 reply; 20+ messages in thread
From: David Daney @ 2007-08-17 16:37 UTC (permalink / raw)
  To: GCC Patches, richard

[-- Attachment #1: Type: text/plain, Size: 9897 bytes --]

Richard Sandiford wrote:
> David Daney <ddaney@avtrex.com> writes:
>   
>> I split the fetch_ops into two groups (arithmetic and bitwise).  This 
>> was to make it possible to have patterns that matched immediate operands 
>> as they are treated differently in the two cases (sign extended vs. zero 
>> extended).
>>     
>
> FWIW, that difference could be handled by code attributes.  I think the
> real problem is the need for "<d>" in arithmetic ops and not logical ops.
>
>   

Right.  My stated reason was what got me thinking about splitting them 
up, but in the end, as you note, the <d> business is what necessitates 
it.  It may be possible to invent some complicated code macro to handle 
it, but this seems clean enough.

>> +(define_insn "sync_compare_and_swap<mode>"
>> +  [(set (match_operand:GPR 0 "register_operand" "=&d,d")
>> +	(match_operand:GPR 1 "memory_operand" "+R,R"))
>> +   (set (match_dup 1)
>> +	(unspec_volatile:GPR [(match_operand:GPR 2 "register_operand" "d,d")
>> +			      (match_operand:GPR 3 "arith_operand" "I,d")]
>> +	 UNSPEC_COMPARE_AND_SWAP))
>> +   (clobber (match_scratch:GPR 4 "=d,d"))]
>>     
>
> Operand 4 needs to be earlyclobber too, to handle the case where the
> loop iterates.  But why not use $at instead?
>
>   

Done.

> FWIW, we could handle "IKL" (with an appropriate predicate)
> without any change to the template.
>
>   

Perhaps as a follow-on.  I am not sure how often these built-ins would 
be used where the "L" constraint would match.

>> +{
>> +  output_asm_insn(".set\tpush\n"
>> +         "\t.set\tnoreorder\n"
>> +         "\t.set\tnomacro\n"
>> +	 "\tsync\n"
>> +	 "1:\tll<d>\t%0,%1\n"
>> +	 "\tbne\t%0,%2,2f", operands);
>> +
>> +  if (which_alternative == 0)
>> +    output_asm_insn("li\t%4,%3", operands);
>> +  else
>> +    output_asm_insn("move\t%4,%3", operands);
>> +
>> +  return "sc<d>\t%4,%1\n"
>> +	 "\tbeq\t%4,$0,1b\n"
>> +	 "\tnop\n"
>> +	 "2:\n"
>> +         "\t.set\tpop";
>> +}
>> +  [(set_attr "length" "28")])
>>     
>
> Please use %(%<...%>%) to wrap the set noreorder/set nomacro stuff.
> Make that %(%<%[...%]%>%) if we're using $at (nice and readable, eh?)
> Also use %@ instead of $at and %. instead of $0.
>
>   

Oh good.  I didn't know about those.


> I think it would be (slightly) cleaner to have:
>
> /* Return an asm string that atomically:
>
>      - Compares memory reference %1 to register %2 and, if they are
>        equal, changes %1 to %3.
>
>      - Sets register %0 to the old value of memory reference %1.
>
>    SUFFIX is the suffix that should be added to "ll" and "sc" instructions
>    and OP is the instruction that should be used to load %3 into a
>    register.  */
> #define MIPS_COMPARE_AND_SWAP(SUFFIX, OP)	\
>   "%(%<%[sync\n"				\
>   "1:\tll" SUFFIX "\t%0,%1\n"			\
>   "\tbne\t%0,%2,2f\n"				\
>   OP "\t%@,%3\n"				\
>   "\tsc" SUFFIX "\t%@,%1\n"			\
>   "\tbeq\t%@,%.,1b\n"				\
>   "\tnop\n"					\
>   "2:%]%>%)\n"
> ....
>   if (which_alternative == 0)
>     return MIPS_COMPARE_AND_SWAP ("<d>", "li");
>   else
>     return MIPS_COMPARE_AND_SWAP ("<d>", "move");
>
> The win isn't that big here, but it's bigger if we do the same thing
> for the (repeated) sync_old* and sync_new* patterns; more below.
>
>   

OK.  I went with that pretty much verbatim.

>> +(define_code_macro FETCHOP_ARITH [plus minus])
>>     
>
> These are usually lowercase for MIPS.  Please put at the top of the
> file, with the others.
>
>   

Changed.

>> +(define_code_attr fetchop_arith_name [(plus "add") (minus "sub")])
>>     
>
> Add this to the "optab" attribute
>
>   

Done

>> +(define_code_attr fetchop_arith_op [(plus "addu") (minus "subu")])
>>     
>
> ...and this to the "insn" attribute.
>
>   

Done.

>> +(define_code_attr fetchop_arith_imm_op [(plus "addiu") (minus "subiu")])
>>     
>
> We shouldn't pretend there's a "subiu" insn.  I suppose this is where
> the idea of using code macros falls down ;(  We could get around it by
> adding code attributes for the predicate and constraint (as Alpha does)
> but perhaps it is better to keep them separate.
>
>   

Right.  I probably would have discovered that if I had tried to assemble 
it.  I handle this case now with the new 'immediate_negative' code macro 
attribute.

>> +(define_insn "sync_old_<fetchop_arith_name><mode>"
>> +  [(set (match_operand:GPR 0 "register_operand" "=&d,d")
>> +	(match_operand:GPR 1 "memory_operand" "+R,R"))
>> +   (set (match_dup 1)
>> +	(unspec_volatile:GPR
>> +          [(FETCHOP_ARITH:GPR (match_operand:GPR 2 "arith_operand" "I,d")
>> +			      (match_dup 2))]
>> +	 UNSPEC_SYNC_OLD_OP))
>> +   (clobber (match_scratch:GPR 3 "=d,d"))]
>> +  "ISA_HAS_LL_SC"
>>     
>
> Same $at comment here.  Also, the first operand to the fetchop ought to
> be 1 rather than 2.  (Not that it makes any difference to the generated
> code; it's just a little confusing as-is.)
>
> Again, I think it would be good to add a MIPS_SYNC_OLD macro,
> along the lines of MIPS_COMPARE_AND_SWAP above.
>
>   

Fixed.

>> +(define_code_attr fetchop_bit_imm_op [(ior "ori") (xor "xori") (and "andi")])
>>     
>
> Let's make this "imm_insn", to match "insn".
>
>   

Uh, I used 'immediate_insn'.  If you think the shorter spelling would be 
better I will change it.

>> +(define_insn "sync_new_<fetchop_arith_name><mode>"
>> +  [(set (match_operand:GPR 0 "register_operand" "=&d,d")
>> +	(match_operand:GPR 1 "memory_operand" "+R,R"))
>> +   (set (match_dup 1)
>> +	(unspec_volatile:GPR
>> +          [(FETCHOP_ARITH:GPR (match_operand:GPR 2 "arith_operand" "I,d")
>> +			      (match_dup 2))]
>> +	 UNSPEC_SYNC_NEW_OP))
>> +   (clobber (match_scratch:GPR 3 "=d,d"))]
>> +  "ISA_HAS_LL_SC"
>>     
>
> Again, this pattern is a little confusing.  Operand 0 is set to the
> result of the operation, not the original memory reference.
>
>   

OK, but who is really to say which part of a parallel happens first?  I 
tried rewriting it as you suggest and it seems to work, but it makes all 
the cases look so different that I thought it was easier to work on if 
they were all the same.

> Same $at comment, and it might be better to have a MIPS_SYNC_NEW macro.
>
>   
>> +(define_insn "sync_new_nand<mode>"
>> +  [(set (match_operand:GPR 0 "register_operand" "=&d,d")
>> +	(match_operand:GPR 1 "memory_operand" "+R,R"))
>> +   (set (match_dup 1)
>> +	(unspec_volatile:GPR [(match_operand:GPR 2 "uns_arith_operand" "K,d")]
>> +	 UNSPEC_SYNC_NEW_OP))
>> +   (clobber (match_scratch:GPR 3 "=d,d"))]
>> +  "ISA_HAS_LL_SC"
>> +{
>> +  output_asm_insn(".set\tpush\n"
>> +         "\t.set\tnoreorder\n"
>> +         "\t.set\tnomacro\n"
>> +	 "\tsync\n"
>> +	 "1:\tll<d>\t%0,%1\n"
>> +	 "\tnor\t%0,$0,%0", operands);
>> +
>> +  if (which_alternative == 0)
>> +    output_asm_insn("andi\t%3,%0,%2", operands);
>> +  else
>> +    output_asm_insn("and\t%3,%0,%2", operands);
>> +
>> +  return "move\t%0,%3\n"
>> +	 "\tsc<d>\t%3,%1\n"
>> +	 "\tbeq\t%3,$0,1b\n"
>> +	 "\tnop\n"
>> +         "\t.set\tpop";
>> +}
>>     
>
> Why not avoid the move and use the same delay-slot repeat that you used
> in the other sync_new* patterns?
>   

Good point.  Fixed now.

>> +(define_insn "sync_lock_test_and_set<mode>"
>> +  [(set (match_operand:GPR 0 "register_operand" "=&d")
>> +	(match_operand:GPR 1 "memory_operand" "+R"))
>> +   (set (match_dup 1)
>> +	(unspec_volatile:GPR [(match_operand:GPR 2 "register_operand" "d")]
>> +	 UNSPEC_SYNC_EXCHANGE))
>> +   (clobber (match_scratch:GPR 3 "=d"))]
>> +  "ISA_HAS_LL_SC"
>> +{
>> +  return ".set\tpush\n"
>> +         "\t.set\tnoreorder\n"
>> +         "\t.set\tnomacro\n"
>> +	 "1:\tll<d>\t%0,%1\n"
>> +	 "\tmove\t%3,%2\n"
>> +	 "\tsc<d>\t%3,%1\n"
>> +	 "\tbeq\t%3,$0,1b\n"
>> +	 "\tnop\n"
>> +	 "\tsync\n"
>> +         "\t.set\tpop";
>> +}
>>     
>
> No need for a C block.  Just make this a plain asm template.
>   

I used the macro, as in all the other cases, because with a constant 
operand we can generate smaller code, but that needs two alternatives.
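
That is, for something like __sync_lock_test_and_set (&v, 1) the 
exchange loop can load the constant directly; a hand-written sketch of 
the intended output (register numbers illustrative, $1 is the scratch):

------------------------
1:	ll	$2,0($4)
	li	$1,1		# "move $1,$5" in the register alternative
	sc	$1,0($4)
	beq	$1,$0,1b
	nop
	sync
------------------------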


Here is the new patch.  I added a new constraint 'Z' which matches 
constants whose sign can be changed while still fitting in a 
sign-extended 16-bit immediate.  This allows subtraction of a constant 
to be rewritten as addition of its negation.  Other than that it mostly 
just addresses the issues raised by Richard.
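
For example, __sync_fetch_and_sub (&v, 4) can then be emitted as an 
immediate add of -4 inside the ll/sc loop; a hand-written sketch 
(register numbers illustrative):

------------------------
	sync
1:	ll	$2,0($4)
	addiu	$1,$2,-4	# subtraction folded into an immediate add
	sc	$1,0($4)
	beq	$1,$0,1b
	nop
------------------------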

I added some tests, and actually am testing this version.

So far this is tested on the mips64-linux n32 ABI only.  If OKed, I 
will also test on o32 and n64, and test libjava using these builtins 
instead of the hand-coded assembly routines.

Comments?  OK to commit?

David Daney

gcc/
2007-08-17  David Daney  <ddaney@avtrex.com>

    * config/mips/mips.md (UNSPEC_COMPARE_AND_SWAP, UNSPEC_SYNC_OLD_OP,
    UNSPEC_SYNC_NEW_OP, UNSPEC_SYNC_EXCHANGE): New define_constants.
    (optab, insn): Add more attributes.
    (fetchop_arith, fetchop_bit): New code macros.
    (immediate_constraint, immediate_negative, immediate_insn): New
    code macro attributes.
    (sync): Change condition to ISA_HAS_SYNC.
    (rdhwr): Change predicate for operand 0 to register_operand.
    (memory_barrier): New expand.
    (sync_compare_and_swap<mode>, sync_<optab><mode>,
    sync_old_<optab><mode>, sync_new_<optab><mode>, sync_nand<mode>,
    sync_old_nand<mode>, sync_new_nand<mode>,
    sync_lock_test_and_set<mode>): New insns.
    * config/mips/mips.h (ISA_HAS_SYNC, ISA_HAS_LL_SC): New ISA predicates.
    (MIPS_COMPARE_AND_SWAP, MIPS_SYNC_OP, MIPS_SYNC_OLD_OP,
    MIPS_SYNC_NEW_OP, MIPS_SYNC_NAND, MIPS_SYNC_OLD_NAND,
    MIPS_SYNC_NEW_NAND, MIPS_SYNC_EXCHANGE): New Macros.
    * config/mips/constraints.md (Z): New mips constraint.
    * doc/md.texi: Document it.

gcc/testsuite/
2007-08-17  David Daney  <ddaney@avtrex.com>

    * gcc.target/mips/gcc-have-sync-compare-and-swap-1.c: New test.
    * gcc.target/mips/gcc-have-sync-compare-and-swap-2.c: Ditto.
    * gcc.target/mips/atomic-memory-1.c: Ditto.



[-- Attachment #2: mips-atomic.diff --]
[-- Type: text/x-patch, Size: 17187 bytes --]

Index: doc/md.texi
===================================================================
--- doc/md.texi	(revision 127356)
+++ doc/md.texi	(working copy)
@@ -2540,6 +2540,9 @@ Floating-point zero.
 
 @item R
 An address that can be used in a non-macro load or store.
+
+@item Z
+A constant in the range -32767 to 32767 (inclusive).
 @end table
 
 @item Motorola 680x0---@file{config/m68k/m68k.h}
Index: testsuite/gcc.target/mips/gcc-have-sync-compare-and-swap-1.c
===================================================================
--- testsuite/gcc.target/mips/gcc-have-sync-compare-and-swap-1.c	(revision 0)
+++ testsuite/gcc.target/mips/gcc-have-sync-compare-and-swap-1.c	(revision 0)
@@ -0,0 +1,22 @@
+/* { dg-do preprocess } */
+/* { dg-options "-mips32 -mabi=32" } */
+
+#ifdef __GCC_HAVE_SYNC_COMPARE_AND_SWAP_1
+#error nonono
+#endif
+
+#ifdef __GCC_HAVE_SYNC_COMPARE_AND_SWAP_2
+#error nonono
+#endif
+
+#ifndef __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4
+#error nonono
+#endif
+
+#ifdef __GCC_HAVE_SYNC_COMPARE_AND_SWAP_8
+#error nonono
+#endif
+
+#ifdef __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16
+#error nonono
+#endif
Index: testsuite/gcc.target/mips/atomic-memory-1.c
===================================================================
--- testsuite/gcc.target/mips/atomic-memory-1.c	(revision 0)
+++ testsuite/gcc.target/mips/atomic-memory-1.c	(revision 0)
@@ -0,0 +1,41 @@
+/* { dg-do run } */
+extern void abort (void);
+extern void exit (int);
+
+int main ()
+{
+#ifdef __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4
+  unsigned v = 0;
+  __sync_synchronize ();
+
+  if (!__sync_bool_compare_and_swap (&v, 0, 30000))
+    abort();
+  if (30000 != __sync_val_compare_and_swap (&v, 30000, 100000))
+    abort();
+  __sync_sub_and_fetch (&v, 0x8000);
+  __sync_sub_and_fetch (&v, 0x7fff);
+  if (v != 34465)
+    abort();
+  if (__sync_nand_and_fetch (&v, 0xff) != 94)
+    abort();
+  if (__sync_fetch_and_add (&v, 6) != 94)
+    abort();
+  if (v != 100)
+    abort();
+  if (__sync_or_and_fetch (&v, 0xf001) != 0xf065)
+    abort();
+  if (__sync_and_and_fetch (&v, 0x1000) != 0x1000)
+    abort();
+  if (__sync_xor_and_fetch (&v, 0xa51040) != 0xa50040)
+    abort();
+  __sync_and_and_fetch (&v, 7);
+  if (__sync_lock_test_and_set(&v, 1) != 0)
+    abort();
+  if (v != 1)
+    abort();
+  __sync_lock_release (&v);
+  if (v != 0)
+    abort();
+#endif
+  exit(0);
+}
Index: testsuite/gcc.target/mips/gcc-have-sync-compare-and-swap-2.c
===================================================================
--- testsuite/gcc.target/mips/gcc-have-sync-compare-and-swap-2.c	(revision 0)
+++ testsuite/gcc.target/mips/gcc-have-sync-compare-and-swap-2.c	(revision 0)
@@ -0,0 +1,22 @@
+/* { dg-do preprocess { target { mips64*-*-* } } } */
+/* { dg-options "-mips64 -mabi=64" } */
+
+#ifdef __GCC_HAVE_SYNC_COMPARE_AND_SWAP_1
+#error nonono
+#endif
+
+#ifdef __GCC_HAVE_SYNC_COMPARE_AND_SWAP_2
+#error nonono
+#endif
+
+#ifndef __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4
+#error nonono
+#endif
+
+#ifndef __GCC_HAVE_SYNC_COMPARE_AND_SWAP_8
+#error nonono
+#endif
+
+#ifdef __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16
+#error nonono
+#endif
Index: config/mips/constraints.md
===================================================================
--- config/mips/constraints.md	(revision 127356)
+++ config/mips/constraints.md	(working copy)
@@ -135,6 +135,11 @@ (define_constraint "P"
   (and (match_code "const_int")
        (match_test "ival > 0 && ival < 0x10000")))
 
+(define_constraint "Z"
+  "A constant in the range -32767 to 32767 (inclusive)."
+  (and (match_code "const_int")
+       (match_test "ival > -0x8000 && ival < 0x8000")))
+
 ;; Floating-point constraints
 
 (define_constraint "G"
Index: config/mips/mips.md
===================================================================
--- config/mips/mips.md	(revision 127356)
+++ config/mips/mips.md	(working copy)
@@ -53,7 +53,11 @@ (define_constants
    (UNSPEC_RDHWR		34)
    (UNSPEC_SYNCI		35)
    (UNSPEC_SYNC			36)
-
+   (UNSPEC_COMPARE_AND_SWAP	37)
+   (UNSPEC_SYNC_OLD_OP		38)
+   (UNSPEC_SYNC_NEW_OP		39)
+   (UNSPEC_SYNC_EXCHANGE	40)
+   
    (UNSPEC_ADDRESS_FIRST	100)
 
    (FAKE_CALL_REGNO		79)
@@ -576,12 +580,22 @@ (define_code_attr su [(sign_extend "s") 
 ;; <optab> expands to the name of the optab for a particular code.
 (define_code_attr optab [(ashift "ashl")
 			 (ashiftrt "ashr")
-			 (lshiftrt "lshr")])
+			 (lshiftrt "lshr")
+			 (plus "add")
+			 (minus "sub")
+			 (ior "ior")
+			 (xor "xor")
+			 (and "and")])
 
 ;; <insn> expands to the name of the insn that implements a particular code.
 (define_code_attr insn [(ashift "sll")
 			(ashiftrt "sra")
-			(lshiftrt "srl")])
+			(lshiftrt "srl")
+			(plus "addu")
+			(minus "subu")
+			(ior "or")
+			(xor "xor")
+			(and "and")])
 
 ;; <fcond> is the c.cond.fmt condition associated with a particular code.
 (define_code_attr fcond [(unordered "un")
@@ -597,6 +611,25 @@ (define_code_attr swapped_fcond [(ge "le
 				 (gt "lt")
 				 (unge "ule")
 				 (ungt "ult")])
+
+;; Atomic fetch arithmetical operations.
+(define_code_macro fetchop_arith [plus minus])
+
+;; Atomic fetch bitwise operations.
+(define_code_macro fetchop_bit [ior xor and])
+
+;; <immediate_constraint> expands to the constraint to be used
+;; for immediate arithmetical operations.
+(define_code_attr immediate_constraint [(plus "I") (minus "Z")])
+
+;; <immediate_negative> expands to 1 for minus because subtraction may be
+;; implemented as addition of the negative immediate value.
+(define_code_attr immediate_negative [(plus "0") (minus "1")])
+
+;; <immediate_insn> expands to the name of the insn that implements
+;; a particular code to operate on immediate values.
+(define_code_attr immediate_insn [(ior "ori") (xor "xori") (and "andi")])
+
 \f
 ;; .........................
 ;;
@@ -4251,7 +4284,7 @@ (define_expand "clear_cache"
 
 (define_insn "sync"
   [(unspec_volatile [(const_int 0)] UNSPEC_SYNC)]
-  "ISA_HAS_SYNCI"
+  "ISA_HAS_SYNC"
   "sync")
 
 (define_insn "synci"
@@ -4261,7 +4294,7 @@ (define_insn "synci"
   "synci\t0(%0)")
 
 (define_insn "rdhwr"
-  [(set (match_operand:SI 0 "general_operand" "=d")
+  [(set (match_operand:SI 0 "register_operand" "=d")
         (unspec_volatile [(match_operand:SI 1 "const_int_operand" "n")]
         UNSPEC_RDHWR))]
   "ISA_HAS_SYNCI"
@@ -4283,6 +4316,209 @@ (define_insn "clear_hazard"
          "\t.set\tpop";
 }
   [(set_attr "length" "20")])
+
+;; Atomic memory operations.
+
+(define_expand "memory_barrier"
+  [(unspec_volatile [(const_int 0)] UNSPEC_SYNC)]
+  "ISA_HAS_SYNC"
+  "")
+
+(define_insn "sync_compare_and_swap<mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=&d,d")
+	(match_operand:GPR 1 "memory_operand" "+R,R"))
+   (set (match_dup 1)
+	(unspec_volatile:GPR [(match_operand:GPR 2 "register_operand" "d,d")
+			      (match_operand:GPR 3 "arith_operand" "I,d")]
+	 UNSPEC_COMPARE_AND_SWAP))]
+  "ISA_HAS_LL_SC"
+{
+  if (which_alternative == 0)
+    return MIPS_COMPARE_AND_SWAP ("<d>", "li");
+  else
+    return MIPS_COMPARE_AND_SWAP ("<d>", "move");
+}
+  [(set_attr "length" "28")])
+
+(define_insn "sync_<optab><mode>"
+  [(set (match_operand:GPR 0 "memory_operand" "+R,R")
+	(unspec_volatile:GPR
+          [(fetchop_arith:GPR (match_operand:GPR 1 "arith_operand"
+						   "<immediate_constraint>,d")
+			      (match_dup 0))]
+	 UNSPEC_SYNC_OLD_OP))]
+  "ISA_HAS_LL_SC"
+{
+  if (which_alternative == 0)
+    {
+      if (<immediate_negative>)
+	{
+	  gcc_assert(GET_CODE (operands[1]) == CONST_INT);
+	  operands[1] = GEN_INT (-INTVAL (operands[1]));
+	}
+      return MIPS_SYNC_OP ("<d>", "<d>addiu");	
+    }
+  else
+    return MIPS_SYNC_OP ("<d>", "<d><insn>");	
+}
+  [(set_attr "length" "24")])
+
+(define_insn "sync_old_<optab><mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=&d,d")
+	(match_operand:GPR 1 "memory_operand" "+R,R"))
+   (set (match_dup 1)
+	(unspec_volatile:GPR
+          [(fetchop_arith:GPR (match_operand:GPR 2 "arith_operand"
+						   "<immediate_constraint>,d")
+			      (match_dup 1))]
+	 UNSPEC_SYNC_OLD_OP))]
+  "ISA_HAS_LL_SC"
+{
+  if (which_alternative == 0)
+    {
+      if (<immediate_negative>)
+	{
+	  gcc_assert(GET_CODE (operands[2]) == CONST_INT);
+	  operands[2] = GEN_INT (-INTVAL (operands[2]));
+	}
+      return MIPS_SYNC_OLD_OP ("<d>", "<d>addiu");	
+    }
+  else
+    return MIPS_SYNC_OLD_OP ("<d>", "<d><insn>");	
+}
+  [(set_attr "length" "24")])
+
+(define_insn "sync_new_<optab><mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=&d,d")
+	(match_operand:GPR 1 "memory_operand" "+R,R"))
+   (set (match_dup 1)
+	(unspec_volatile:GPR
+	  [(fetchop_arith:GPR (match_operand:GPR 2 "arith_operand"
+						   "<immediate_constraint>,d")
+			      (match_dup 1))]
+	 UNSPEC_SYNC_NEW_OP))]
+  "ISA_HAS_LL_SC"
+{
+  if (which_alternative == 0)
+    {
+      if (<immediate_negative>)
+	{
+	  gcc_assert(GET_CODE (operands[2]) == CONST_INT);
+	  operands[2] = GEN_INT (-INTVAL (operands[2]));
+	}
+      return MIPS_SYNC_NEW_OP ("<d>", "<d>addiu");	
+    }
+  else
+    return MIPS_SYNC_NEW_OP ("<d>", "<d><insn>");	
+}
+  [(set_attr "length" "24")])
+
+(define_insn "sync_<optab><mode>"
+  [(set (match_operand:GPR 0 "memory_operand" "+R,R")
+	(unspec_volatile:GPR
+          [(fetchop_bit:GPR (match_operand:GPR 1 "uns_arith_operand" "K,d")
+			      (match_dup 0))]
+	 UNSPEC_SYNC_OLD_OP))]
+  "ISA_HAS_LL_SC"
+{
+  if (which_alternative == 0)
+    return MIPS_SYNC_OP ("<d>", "<immediate_insn>");	
+  else
+    return MIPS_SYNC_OP ("<d>", "<insn>");	
+}
+  [(set_attr "length" "24")])
+
+(define_insn "sync_old_<optab><mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=&d,d")
+	(match_operand:GPR 1 "memory_operand" "+R,R"))
+   (set (match_dup 1)
+	(unspec_volatile:GPR
+          [(fetchop_bit:GPR (match_operand:GPR 2 "uns_arith_operand" "K,d")
+			    (match_dup 1))]
+	 UNSPEC_SYNC_OLD_OP))]
+  "ISA_HAS_LL_SC"
+{
+  if (which_alternative == 0)
+    return MIPS_SYNC_OLD_OP ("<d>", "<immediate_insn>");	
+  else
+    return MIPS_SYNC_OLD_OP ("<d>", "<insn>");	
+}
+  [(set_attr "length" "24")])
+
+(define_insn "sync_new_<optab><mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=&d,d")
+	(match_operand:GPR 1 "memory_operand" "+R,R"))
+   (set (match_dup 1)
+	(unspec_volatile:GPR
+          [(fetchop_bit:GPR (match_operand:GPR 2 "uns_arith_operand" "K,d")
+			    (match_dup 1))]
+	 UNSPEC_SYNC_NEW_OP))]
+  "ISA_HAS_LL_SC"
+{
+  if (which_alternative == 0)
+    return MIPS_SYNC_NEW_OP ("<d>", "<immediate_insn>");	
+  else
+    return MIPS_SYNC_NEW_OP ("<d>", "<insn>");	
+}
+  [(set_attr "length" "24")])
+
+(define_insn "sync_nand<mode>"
+  [(set (match_operand:GPR 0 "memory_operand" "+R,R")
+	(unspec_volatile:GPR [(match_operand:GPR 1 "uns_arith_operand" "K,d")]
+	 UNSPEC_SYNC_OLD_OP))]
+  "ISA_HAS_LL_SC"
+{
+  if (which_alternative == 0)
+    return MIPS_SYNC_NAND ("<d>", "andi");	
+  else
+    return MIPS_SYNC_NAND ("<d>", "and");	
+}
+  [(set_attr "length" "28")])
+
+(define_insn "sync_old_nand<mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=&d,d")
+	(match_operand:GPR 1 "memory_operand" "+R,R"))
+   (set (match_dup 1)
+        (unspec_volatile:GPR [(match_operand:GPR 2 "uns_arith_operand" "K,d")]
+	 UNSPEC_SYNC_OLD_OP))]
+  "ISA_HAS_LL_SC"
+{
+  if (which_alternative == 0)
+    return MIPS_SYNC_OLD_NAND ("<d>", "andi");	
+  else
+    return MIPS_SYNC_OLD_NAND ("<d>", "and");	
+}
+  [(set_attr "length" "28")])
+
+(define_insn "sync_new_nand<mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=&d,d")
+	(match_operand:GPR 1 "memory_operand" "+R,R"))
+   (set (match_dup 1)
+	(unspec_volatile:GPR [(match_operand:GPR 2 "uns_arith_operand" "K,d")]
+	 UNSPEC_SYNC_NEW_OP))]
+  "ISA_HAS_LL_SC"
+{
+  if (which_alternative == 0)
+    return MIPS_SYNC_NEW_NAND ("<d>", "andi");	
+  else
+    return MIPS_SYNC_NEW_NAND ("<d>", "and");	
+}
+  [(set_attr "length" "28")])
+
+(define_insn "sync_lock_test_and_set<mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=&d,d")
+	(match_operand:GPR 1 "memory_operand" "+R,R"))
+   (set (match_dup 1)
+	(unspec_volatile:GPR [(match_operand:GPR 2 "arith_operand" "I,d")]
+	 UNSPEC_SYNC_EXCHANGE))]
+  "ISA_HAS_LL_SC"
+{
+  if (which_alternative == 0)
+    return MIPS_SYNC_EXCHANGE ("<d>", "li");
+  else
+    return MIPS_SYNC_EXCHANGE ("<d>", "move");
+}
+  [(set_attr "length" "24")])
 \f
 ;; Block moves, see mips.c for more details.
 ;; Argument 0 is the destination
Index: config/mips/mips.h
===================================================================
--- config/mips/mips.h	(revision 127356)
+++ config/mips/mips.h	(working copy)
@@ -876,6 +876,11 @@ extern enum mips_code_readable_setting m
 /* ISA includes synci, jr.hb and jalr.hb.  */
 #define ISA_HAS_SYNCI (ISA_MIPS32R2 && !TARGET_MIPS16)
 
+/* ISA includes sync.  */
+#define ISA_HAS_SYNC ((mips_isa >= 2 || TARGET_MIPS3900) && !TARGET_MIPS16)
+
+/* ISA includes ll and sc.  */
+#define ISA_HAS_LL_SC (mips_isa >= 2 && !TARGET_MIPS16)
 \f
 /* Add -G xx support.  */
 
@@ -2815,3 +2820,140 @@ while (0)
 #ifndef HAVE_AS_TLS
 #define HAVE_AS_TLS 0
 #endif
+
+/* Return an asm string that atomically:
+
+     - Compares memory reference %1 to register %2 and, if they are
+       equal, changes %1 to %3.
+
+     - Sets register %0 to the old value of memory reference %1.
+
+   SUFFIX is the suffix that should be added to "ll" and "sc" instructions
+   and OP is the instruction that should be used to load %3 into a
+   register.  */
+#define MIPS_COMPARE_AND_SWAP(SUFFIX, OP)	\
+  "%(%<%[sync\n"				\
+  "1:\tll" SUFFIX "\t%0,%1\n"			\
+  "\tbne\t%0,%2,2f\n"				\
+  "\t" OP "\t%@,%3\n"				\
+  "\tsc" SUFFIX "\t%@,%1\n"			\
+  "\tbeq\t%@,%.,1b\n"				\
+  "\tnop\n"					\
+  "2:%]%>%)"
+
+/* Return an asm string that atomically:
+
+     - Sets memory reference %0 to %0 INSN %1.
+
+   SUFFIX is the suffix that should be added to "ll" and "sc"
+   instructions.  */
+#define MIPS_SYNC_OP(SUFFIX, INSN)		\
+  "%(%<%[sync\n"				\
+  "1:\tll" SUFFIX "\t%@,%0\n"			\
+  "\t" INSN "\t%@,%@,%1\n"			\
+  "\tsc" SUFFIX "\t%@,%0\n"			\
+  "\tbeq\t%@,%.,1b\n"				\
+  "\tnop%]%>%)"
+
+/* Return an asm string that atomically:
+
+     - Sets memory reference %1 to %1 INSN %2.
+
+     - Sets register %0 to the old value of memory reference %1.
+
+   SUFFIX is the suffix that should be added to "ll" and "sc"
+   instructions.  */
+#define MIPS_SYNC_OLD_OP(SUFFIX, INSN)		\
+  "%(%<%[sync\n"				\
+  "1:\tll" SUFFIX "\t%0,%1\n"			\
+  "\t" INSN "\t%@,%0,%2\n"			\
+  "\tsc" SUFFIX "\t%@,%1\n"			\
+  "\tbeq\t%@,%.,1b\n"				\
+  "\tnop%]%>%)"
+
+/* Return an asm string that atomically:
+
+     - Sets memory reference %1 to %1 INSN %2.
+
+     - Sets register %0 to the new value of memory reference %1.
+
+   SUFFIX is the suffix that should be added to "ll" and "sc"
+   instructions.  */
+#define MIPS_SYNC_NEW_OP(SUFFIX, INSN)		\
+  "%(%<%[sync\n"				\
+  "1:\tll" SUFFIX "\t%0,%1\n"			\
+  "\t" INSN "\t%@,%0,%2\n"			\
+  "\tsc" SUFFIX "\t%@,%1\n"			\
+  "\tbeq\t%@,%.,1b\n"				\
+  "\t" INSN "\t%0,%0,%2%]%>%)"
+
+/* Return an asm string that atomically:
+
+     - Sets memory reference %0 to ~%0 AND %1.
+
+   SUFFIX is the suffix that should be added to "ll" and "sc"
+   instructions.  INSN is the and instruction needed to and a register
+   with %2.  */
+#define MIPS_SYNC_NAND(SUFFIX, INSN)		\
+  "%(%<%[sync\n"				\
+  "1:\tll" SUFFIX "\t%@,%0\n"			\
+  "\tnor\t%@,%@,%.\n"				\
+  "\t" INSN "\t%@,%@,%1\n"			\
+  "\tsc" SUFFIX "\t%@,%0\n"			\
+  "\tbeq\t%@,%.,1b\n"				\
+  "\tnop%]%>%)"
+
+/* Return an asm string that atomically:
+
+     - Sets memory reference %1 to ~%1 AND %2.
+
+     - Sets register %0 to the old value of memory reference %1.
+
+   SUFFIX is the suffix that should be added to "ll" and "sc"
+   instructions.  INSN is the and instruction needed to and a register
+   with %2.  */
+#define MIPS_SYNC_OLD_NAND(SUFFIX, INSN)	\
+  "%(%<%[sync\n"				\
+  "1:\tll" SUFFIX "\t%0,%1\n"			\
+  "\tnor\t%@,%0,%.\n"				\
+  "\t" INSN "\t%@,%@,%2\n"			\
+  "\tsc" SUFFIX "\t%@,%1\n"			\
+  "\tbeq\t%@,%.,1b\n"				\
+  "\tnop%]%>%)"
+
+/* Return an asm string that atomically:
+
+     - Sets memory reference %1 to ~%1 AND %2.
+
+     - Sets register %0 to the new value of memory reference %1.
+
+   SUFFIX is the suffix that should be added to "ll" and "sc"
+   instructions.  INSN is the and instruction needed to and a register
+   with %2.  */
+#define MIPS_SYNC_NEW_NAND(SUFFIX, INSN)	\
+  "%(%<%[sync\n"				\
+  "1:\tll" SUFFIX "\t%0,%1\n"			\
+  "\tnor\t%0,%0,%.\n"				\
+  "\t" INSN "\t%@,%0,%2\n"			\
+  "\tsc" SUFFIX "\t%@,%1\n"			\
+  "\tbeq\t%@,%.,1b\n"				\
+  "\t" INSN "\t%0,%0,%2%]%>%)"
+
+/* Return an asm string that atomically:
+
+     - Sets memory reference %1 to %2.
+
+     - Sets register %0 to the old value of memory reference %1.
+
+   SUFFIX is the suffix that should be added to "ll" and "sc"
+   instructions.  OP is the instruction that should be used to
+   load %2 into a register.  */
+#define MIPS_SYNC_EXCHANGE(SUFFIX, OP)		\
+  "%(%<%[\n"					\
+  "1:\tll" SUFFIX "\t%0,%1\n"			\
+  "\t" OP "\t%@,%2\n"				\
+  "\tsc" SUFFIX "\t%@,%1\n"			\
+  "\tbeq\t%@,%.,1b\n"				\
+  "\tnop\n"					\
+  "\tsync%]%>%)"
+


* Re: RFC: [PATCH] MIPS: Implement built-in atomic memory operations.
  2007-08-17 16:37       ` David Daney
@ 2007-08-17 17:29         ` Richard Sandiford
  0 siblings, 0 replies; 20+ messages in thread
From: Richard Sandiford @ 2007-08-17 17:29 UTC (permalink / raw)
  To: David Daney; +Cc: GCC Patches

David Daney <ddaney@avtrex.com> writes:
> Richard Sandiford wrote:
>> FWIW, we could handle "IKL" (with an appropriate predicate)
>> without any change to the template.
>
> Perhaps as a follow-on.  I am not sure how often these built-ins would 
> be used where the "L" constraint would match.

Sure, that's fine.

>>> +(define_code_attr fetchop_arith_imm_op [(plus "addiu") (minus "subiu")])
>>>     
>>
>> We shouldn't pretend there's a "subiu" insn.  I suppose this is where
>> the idea of using code macros falls down ;(  We could get around it by
>> adding code attributes for the predicate and constraint (as Alpha does)
>> but perhaps it is better to keep them separate.
>>
>>   
>
> Right.  I probably would have discovered that if I had tried to assemble 
> it.  I handle this case now with the new 'immediate_negative' code macro 
> attribute.

Unfortunately, adding 'Z' isn't really the right thing.  My point --
not clearly made, sorry -- was that (minus ... (const_int X)) is not
canonical RTL.  You should always have (plus ... (const_int -X)) instead.
Thus the MINUS case shouldn't match (and shouldn't need to match)
any constants at all.
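
That is, for "x - 4" the RTL should always be:

  (plus:SI (reg) (const_int -4))

and never:

  (minus:SI (reg) (const_int 4))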

(I hadn't realised that Alpha allowed constant operands for minus, sorry.
It was a bad example in this case.)

>>> +(define_code_attr fetchop_bit_imm_op [(ior "ori") (xor "xori") (and "andi")])
>>
>> Let's make this "imm_insn", to match "insn".
>
> Uh, I used 'immediate_insn'.  If you think the shorter spelling would be 
> better I will change it.

No, I think your name's better.

>>> +(define_insn "sync_new_<fetchop_arith_name><mode>"
>>> +  [(set (match_operand:GPR 0 "register_operand" "=&d,d")
>>> +	(match_operand:GPR 1 "memory_operand" "+R,R"))
>>> +   (set (match_dup 1)
>>> +	(unspec_volatile:GPR
>>> +          [(FETCHOP_ARITH:GPR (match_operand:GPR 2 "arith_operand" "I,d")
>>> +			      (match_dup 2))]
>>> +	 UNSPEC_SYNC_NEW_OP))
>>> +   (clobber (match_scratch:GPR 3 "=d,d"))]
>>> +  "ISA_HAS_LL_SC"
>>
>> Again, this pattern is a little confusing.  Operand 0 is set to the
>> result of the operation, not the original memory reference.
>
> OK, but who is really to say which part of a parallel happens first?  I 
> tried rewriting it as you suggest and it seems to work, but it makes all 
> the cases look so different that I thought it was easier to work on if 
> they were all the same.

PARALLELs are strict read-modify-write, i.e. all operands are read before
any are written.  So as written, that pattern says that operand 0 is set
to the original value of operand 1.  That's exactly right for the *_old_*
patterns, but not the *_new_* ones.
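
Schematically, in:

  [(set (reg X) (mem M))
   (set (mem M) (op ... (mem M)))]

both reads of (mem M) see the value from before the PARALLEL executes, 
so the first set can only describe the old value of the memory.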

>>> +(define_insn "sync_lock_test_and_set<mode>"
>>> +  [(set (match_operand:GPR 0 "register_operand" "=&d")
>>> +	(match_operand:GPR 1 "memory_operand" "+R"))
>>> +   (set (match_dup 1)
>>> +	(unspec_volatile:GPR [(match_operand:GPR 2 "register_operand" "d")]
>>> +	 UNSPEC_SYNC_EXCHANGE))
>>> +   (clobber (match_scratch:GPR 3 "=d"))]
>>> +  "ISA_HAS_LL_SC"
>>> +{
>>> +  return ".set\tpush\n"
>>> +         "\t.set\tnoreorder\n"
>>> +         "\t.set\tnomacro\n"
>>> +	 "1:\tll<d>\t%0,%1\n"
>>> +	 "\tmove\t%3,%2\n"
>>> +	 "\tsc<d>\t%3,%1\n"
>>> +	 "\tbeq\t%3,$0,1b\n"
>>> +	 "\tnop\n"
>>> +	 "\tsync\n"
>>> +         "\t.set\tpop";
>>> +}
>>>     
>>
>> No need for a C block.  Just make this a plain asm template.
>
> I used the macro, as in all the other cases, because with a constant 
> operand we can generate smaller code, but that needs two alternatives.

Even better, thanks.

> I added some tests, and actually am testing this version.

The tests look good, thanks.  I like the way you've separated the
feature macro tests, so that the run tests can assume that the macros
are accurate.

The rest of the patch looks good too.  Given the comments above, I think:

> +;; <immediate_constraint> expands to the constraint to be used
> +;; for immediate arithmetical operations.
> +(define_code_attr immediate_constraint [(plus "I") (minus "Z")])

...this should be "" for minus,

> +;; <immediate_negative> expands to 1 for minus because subtraction may be
> +;; implemented as addition of the negative immediate value.
> +(define_code_attr immediate_negative [(plus "0") (minus "1")])

...this should not be needed at all and

> +(define_insn "sync_<optab><mode>"
> +  [(set (match_operand:GPR 0 "memory_operand" "+R,R")
> +	(unspec_volatile:GPR
> +          [(fetchop_arith:GPR (match_operand:GPR 1 "arith_operand"
> +						   "<immediate_constraint>,d")
> +			      (match_dup 0))]
> +	 UNSPEC_SYNC_OLD_OP))]
> +  "ISA_HAS_LL_SC"
> +{
> +  if (which_alternative == 0)
> +    {
> +      if (<immediate_negative>)
> +	{
> +	  gcc_assert(GET_CODE (operands[1]) == CONST_INT);
> +	  operands[1] = GEN_INT (-INTVAL (operands[1]));
> +	}
> +      return MIPS_SYNC_OP ("<d>", "<d>addiu");	
> +    }
> +  else
> +    return MIPS_SYNC_OP ("<d>", "<d><insn>");	
> +}
> +  [(set_attr "length" "24")])

...this should simply assert that <CODE> == PLUS:

  if (which_alternative == 0)
    {
      gcc_assert (<CODE> == PLUS);
      return MIPS_SYNC_OP ("<d>", "<d>addiu");	
    }
  else
    return MIPS_SYNC_OP ("<d>", "<d><insn>");	

Likewise for the other add/sub patterns.

> +(define_insn "sync_new_<optab><mode>"
> +  [(set (match_operand:GPR 0 "register_operand" "=&d,d")
> +	(match_operand:GPR 1 "memory_operand" "+R,R"))
> +   (set (match_dup 1)
> +	(unspec_volatile:GPR
> +	  [(fetchop_arith:GPR (match_operand:GPR 2 "arith_operand"
> +						   "<immediate_constraint>,d")
> +			      (match_dup 1))]
> +	 UNSPEC_SYNC_NEW_OP))]
> +  "ISA_HAS_LL_SC"

As per the above, please make this:

  [(set (match_operand:GPR 0 "register_operand" "=&d,d")
	(fetchop_arith:GPR (match_operand:GPR 2 "arith_operand"
						   "<immediate_constraint>,d")
			   (match_dup 1)))
   ...]

> +/* ISA includes ll and sc.  */
> +#define ISA_HAS_LL_SC (mips_isa >= 2 && !TARGET_MIPS16)

Could you add a comment saying that this implies ISA_HAS_SYNC?
The expanders use both ISA_HAS_SYNC and ISA_HAS_LL_SC instructions.

OK with those changes, thanks.

Richard


Thread overview: 20+ messages
2007-08-12 20:23 RFC: [PATCH] MIPS: Implement built-in atomic memory operations David Daney
2007-08-13  6:26 ` Ken Raeburn
2007-08-13  6:38   ` David Daney
2007-08-13  7:23 ` Richard Sandiford
2007-08-13  7:24   ` Richard Sandiford
2007-08-13 16:22   ` David Daney
2007-08-13 16:53     ` Richard Sandiford
2007-08-13 18:08     ` Maciej W. Rozycki
2007-08-13 18:33       ` David Daney
2007-08-15 17:25         ` Maciej W. Rozycki
2007-08-14  8:25   ` David Daney
2007-08-14 13:51     ` Thiemo Seufer
2007-08-14 15:09       ` David Daney
2007-08-14 15:35         ` Maciej W. Rozycki
2007-08-14 15:54           ` David Daney
2007-08-14 16:02             ` Richard Sandiford
2007-08-14 15:39         ` Thiemo Seufer
2007-08-14 13:55     ` Richard Sandiford
2007-08-17 16:37       ` David Daney
2007-08-17 17:29         ` Richard Sandiford
