public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [cxx-mem-model] __sync_mem_load
@ 2011-06-23 23:20 Andrew MacLeod
  2011-06-29 17:36 ` Richard Henderson
  2011-06-29 18:19 ` Richard Henderson
  0 siblings, 2 replies; 3+ messages in thread
From: Andrew MacLeod @ 2011-06-23 23:20 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Henderson, Aldy Hernandez

[-- Attachment #1: Type: text/plain, Size: 598 bytes --]

Here's the patch for __sync_mem_load, complete with tests.

I'll change and correct the actual implementation of the load pattern 
later ( I think I have the x86 fence wrong). It occurs to me that if I 
implement the __sync_mem_thread_fence (model) routine, then the 
appropriate fences can be created  by using that __sync instead of 
sync_synchronize all the time as well.

Im trying to gather a set of optimal, or at least decent, sequences for 
each pattern. I'll go back and make changes when I have collected some 
and finished all the other atomic operations.  It'll be easier then.

Andrew

[-- Attachment #2: load.patch --]
[-- Type: text/plain, Size: 13545 bytes --]


	* doc/extend.texi (__sync_mem_load): Document.
	* c-family/c-common.c (resolve_overloaded_builtin): Add 
	BUILT_IN_SYNC_MEM_LOAD_N.
	* optabs.c (expand_sync_mem_load): New.
	* optabs.h (enum direct_optab_index): Add DOI_sync_mem_load.
	(sync_mem_load_optab): Define.
	* genopinit.c: Add entry for sync_mem_load.
	* builtins.c (expand_builtin_sync_mem_load): New.
	(expand_builtin): Handle BUILT_IN_SYNC_MEM_LOAD_*
	* sync-bultins.def: Add entries for BUILT_IN_SYNC_MEM_LOAD_*.
	* testsuite/gcc.dg/sync-mem-invalid.c: Add invalid load tests.
	* testsuite/gcc.dg/sync-mem.h: Add load executable tests.
	* builtin-types.def (BT_FN_I{1,2,4,8,16}_VPTR_INT): New.
	* expr.h (expand_sync_mem_load): Declare.
	* fortran/types.def (BT_FN_I{1,2,4,8,16}_VPTR_INT): New.
	* config/i386/sync.md (sync_mem_load<mode>): New pattern.

Index: doc/extend.texi
===================================================================
*** doc/extend.texi	(revision 175331)
--- doc/extend.texi	(working copy)
*************** This means that all previous memory stor
*** 6729,6734 ****
--- 6729,6747 ----
  previous memory loads have been satisfied, but following memory reads
  are not prevented from being speculated to before the barrier.
  
+ @item @var{type} __sync_mem_load (@var{type} *ptr, int memmodel, ...)
+ @findex __sync_mem_load
+ This builtin implements an atomic load operation within the constraints of a
+ memory model.  It returns the contents of @code{*@var{ptr}}.
+ 
+ The valid memory model variants for this builtin are
+ __SYNC_MEM_RELAXED, __SYNC_MEM_SEQ_CST, __SYNC_MEM_ACQUIRE, and 
+ __SYNC_MEM_CONSUME. The target pattern is responsible
+ for issuing the different synchronization instructions. It should default to 
+ the more restrictive memory model, the sequentially consistent model.  If 
+ nothing is implemented for the target, the compiler will implement it by
+ issuing a memory barrier, the load, and then another memory barrier.
+ 
  @item @var{type} __sync_mem_exchange (@var{type} *ptr, @var{type} value, int memmodel, ...)
  @findex __sync_mem_exchange
  This builtin implements an atomic exchange operation within the
Index: c-family/c-common.c
===================================================================
*** c-family/c-common.c	(revision 175331)
--- c-family/c-common.c	(working copy)
*************** resolve_overloaded_builtin (location_t l
*** 9061,9066 ****
--- 9061,9067 ----
      case BUILT_IN_SYNC_LOCK_TEST_AND_SET_N:
      case BUILT_IN_SYNC_LOCK_RELEASE_N:
      case BUILT_IN_SYNC_MEM_EXCHANGE_N:
+     case BUILT_IN_SYNC_MEM_LOAD_N:
        {
  	int n = sync_resolve_size (function, params);
  	tree new_function, first_param, result;
Index: optabs.c
===================================================================
*** optabs.c	(revision 175331)
--- optabs.c	(working copy)
*************** expand_sync_lock_test_and_set (rtx mem, 
*** 7057,7062 ****
--- 7057,7104 ----
    return NULL_RTX;
  }
  
+ /* This function expands the atomic load operation:
+    return the atomically loaded value in MEM.
+ 
+    MEMMODEL is the memory model variant to use.
+    TARGET is an option place to stick the return value.  */
+ 
+ rtx
+ expand_sync_mem_load (enum memmodel model, rtx mem, rtx target)
+ {
+   enum machine_mode mode = GET_MODE (mem);
+   enum insn_code icode;
+ 
+   /* If the target supports the load directly, great.  */
+   icode = direct_optab_handler (sync_mem_load_optab, mode);
+   if (icode != CODE_FOR_nothing)
+     {
+       struct expand_operand ops[3];
+ 
+       create_output_operand (&ops[0], target, mode);
+       create_fixed_operand (&ops[1], mem);
+       create_integer_operand (&ops[2], model);
+       if (maybe_expand_insn (icode, 3, ops))
+ 	return ops[0].value;
+     }
+ 
+   /* For any model other than RELAXED, perform a synchronization first.  */
+   if (model != MEMMODEL_RELAXED)
+     expand_builtin_sync_synchronize ();
+ 
+   /* If the result is unused, don't bother loading.  */
+   if (target != const0_rtx)
+     emit_move_insn (target, mem);
+   else
+     target = NULL_RTX;
+ 
+   /* For SEQ_CST, also emit a barrier after the load.  */
+   if (model == MEMMODEL_SEQ_CST)
+     expand_builtin_sync_synchronize ();
+ 
+   return target;
+ }
+ 
  /* This function expands the atomic exchange operation:
     atomically store VAL in MEM and return the previous value in MEM.
  
Index: optabs.h
===================================================================
*** optabs.h	(revision 175331)
--- optabs.h	(working copy)
*************** enum direct_optab_index
*** 679,684 ****
--- 679,685 ----
  
    /* Atomic operations with C++0x memory model parameters. */
    DOI_sync_mem_exchange,
+   DOI_sync_mem_load,
  
    DOI_MAX
  };
*************** typedef struct direct_optab_d *direct_op
*** 730,735 ****
--- 731,738 ----
  
  #define sync_mem_exchange_optab \
    (&direct_optab_table[(int) DOI_sync_mem_exchange])
+ #define sync_mem_load_optab \
+   (&direct_optab_table[(int) DOI_sync_mem_load])
  \f
  /* Target-dependent globals.  */
  struct target_optabs {
Index: genopinit.c
===================================================================
*** genopinit.c	(revision 175331)
--- genopinit.c	(working copy)
*************** static const char * const optabs[] =
*** 242,247 ****
--- 242,248 ----
    "set_direct_optab_handler (sync_lock_test_and_set_optab, $A, CODE_FOR_$(sync_lock_test_and_set$I$a$))",
    "set_direct_optab_handler (sync_lock_release_optab, $A, CODE_FOR_$(sync_lock_release$I$a$))",
    "set_direct_optab_handler (sync_mem_exchange_optab, $A, CODE_FOR_$(sync_mem_exchange$I$a$))",
+   "set_direct_optab_handler (sync_mem_load_optab, $A, CODE_FOR_$(sync_mem_load$I$a$))",
    "set_optab_handler (vec_set_optab, $A, CODE_FOR_$(vec_set$a$))",
    "set_optab_handler (vec_extract_optab, $A, CODE_FOR_$(vec_extract$a$))",
    "set_optab_handler (vec_extract_even_optab, $A, CODE_FOR_$(vec_extract_even$a$))",
Index: builtins.c
===================================================================
*** builtins.c	(revision 175331)
--- builtins.c	(working copy)
*************** get_memmodel (tree exp)
*** 5221,5226 ****
--- 5221,5255 ----
    return (enum memmodel) INTVAL (op);
  }
  
+ /* Expand the __sync_mem_load intrinsic:
+ 
+    	TYPE __sync_mem_load (TYPE *to, TYPE from, enum memmodel)
+ 
+    EXP is the CALL_EXPR.
+    TARGET is an optional place for us to store the results.  */
+ 
+ static rtx
+ expand_builtin_sync_mem_load (enum machine_mode mode, tree exp, rtx target)
+ {
+   rtx mem;
+   enum memmodel model;
+ 
+   model = get_memmodel (CALL_EXPR_ARG (exp, 1));
+   if (model != MEMMODEL_RELAXED
+       && model != MEMMODEL_SEQ_CST
+       && model != MEMMODEL_CONSUME
+       && model != MEMMODEL_ACQUIRE)
+     {
+       error ("invalid memory model for %<__sync_mem_load%>");
+       return NULL_RTX;
+     }
+ 
+   /* Expand the operand.  */
+   mem = get_builtin_sync_mem (CALL_EXPR_ARG (exp, 0), mode);
+ 
+   return expand_sync_mem_load (model, mem, target);
+ }
+ 
  /* Expand the __sync_mem_exchange intrinsic:
  
     	TYPE __sync_mem_exchange (TYPE *to, TYPE from, enum memmodel)
*************** expand_builtin (tree exp, rtx target, rt
*** 6077,6082 ****
--- 6106,6123 ----
  	return target;
        break;
  
+     case BUILT_IN_SYNC_MEM_LOAD_1:
+     case BUILT_IN_SYNC_MEM_LOAD_2:
+     case BUILT_IN_SYNC_MEM_LOAD_4:
+     case BUILT_IN_SYNC_MEM_LOAD_8:
+     case BUILT_IN_SYNC_MEM_LOAD_16:
+       mode = get_builtin_sync_mode (fcode - BUILT_IN_SYNC_MEM_LOAD_1);
+       target = expand_builtin_sync_mem_load (mode, exp, target);
+       if (target)
+ 	return target;
+       break;
+ 
+ 
      case BUILT_IN_SYNC_MEM_EXCHANGE_1:
      case BUILT_IN_SYNC_MEM_EXCHANGE_2:
      case BUILT_IN_SYNC_MEM_EXCHANGE_4:
Index: sync-builtins.def
===================================================================
*** sync-builtins.def	(revision 175331)
--- sync-builtins.def	(working copy)
*************** DEF_SYNC_BUILTIN (BUILT_IN_SYNC_SYNCHRON
*** 259,264 ****
--- 259,283 ----
  
  /* __sync* builtins for the C++ memory model.  */
  
+ DEF_SYNC_BUILTIN (BUILT_IN_SYNC_MEM_LOAD_N,
+ 		  "__sync_mem_load",
+ 		  BT_FN_VOID_VAR, ATTR_NOTHROW_LEAF_LIST)
+ DEF_SYNC_BUILTIN (BUILT_IN_SYNC_MEM_LOAD_1,
+ 		  "__sync_mem_load_1",
+ 		  BT_FN_I1_VPTR_INT, ATTR_NOTHROW_LEAF_LIST)
+ DEF_SYNC_BUILTIN (BUILT_IN_SYNC_MEM_LOAD_2,
+ 		  "__sync_mem_load_2",
+ 		  BT_FN_I2_VPTR_INT, ATTR_NOTHROW_LEAF_LIST)
+ DEF_SYNC_BUILTIN (BUILT_IN_SYNC_MEM_LOAD_4,
+ 		  "__sync_mem_load_4",
+ 		  BT_FN_I4_VPTR_INT, ATTR_NOTHROW_LEAF_LIST)
+ DEF_SYNC_BUILTIN (BUILT_IN_SYNC_MEM_LOAD_8,
+ 		  "__sync_mem_load_8",
+ 		  BT_FN_I8_VPTR_INT, ATTR_NOTHROW_LEAF_LIST)
+ DEF_SYNC_BUILTIN (BUILT_IN_SYNC_MEM_LOAD_16,
+ 		  "__sync_mem_load_16",
+ 		  BT_FN_I16_VPTR_INT, ATTR_NOTHROW_LEAF_LIST)
+ 
  DEF_SYNC_BUILTIN (BUILT_IN_SYNC_MEM_EXCHANGE_N,
  		  "__sync_mem_exchange",
  		  BT_FN_VOID_VAR, ATTR_NOTHROW_LEAF_LIST)
Index: testsuite/gcc.dg/sync-mem-invalid.c
===================================================================
*** testsuite/gcc.dg/sync-mem-invalid.c	(revision 175331)
--- testsuite/gcc.dg/sync-mem-invalid.c	(working copy)
*************** int i;
*** 8,11 ****
--- 8,13 ----
  main ()
  {
    __sync_mem_exchange (&i, 1, __SYNC_MEM_CONSUME); /* { dg-error "invalid memory model" } */
+   __sync_mem_load (&i, __SYNC_MEM_RELEASE); /* { dg-error "invalid memory model" } */
+   __sync_mem_load (&i, __SYNC_MEM_ACQ_REL); /* { dg-error "invalid memory model" } */
  }
Index: testsuite/gcc.dg/sync-mem.h
===================================================================
*** testsuite/gcc.dg/sync-mem.h	(revision 175331)
--- testsuite/gcc.dg/sync-mem.h	(working copy)
*************** void test_exchange()
*** 18,27 ****
    EXCHANGE (v, __SYNC_MEM_SEQ_CST);
  }
  
  
  
  main ()
  {
!   test_exchange();
    return 0;
  }
--- 18,39 ----
    EXCHANGE (v, __SYNC_MEM_SEQ_CST);
  }
  
+ #define LOAD(VAR, MODE) if (__sync_mem_load (&VAR, MODE) != count++) abort(); else VAR++
+ 
+ void test_load()
+ {
+   v = 0;
+   count = 0;
+   LOAD (v, __SYNC_MEM_RELAXED);
+   LOAD (v, __SYNC_MEM_ACQUIRE);
+   LOAD (v, __SYNC_MEM_CONSUME);
+   LOAD (v, __SYNC_MEM_SEQ_CST);
+ }
  
  
  main ()
  {
!   test_exchange ();
!   test_load ();
    return 0;
  }
Index: builtin-types.def
===================================================================
*** builtin-types.def	(revision 175331)
--- builtin-types.def	(working copy)
*************** DEF_FUNCTION_TYPE_2 (BT_FN_BOOL_LONGPTR_
*** 315,320 ****
--- 315,325 ----
  		     BT_BOOL, BT_PTR_LONG, BT_PTR_LONG)
  DEF_FUNCTION_TYPE_2 (BT_FN_BOOL_ULONGLONGPTR_ULONGLONGPTR,
  		     BT_BOOL, BT_PTR_ULONGLONG, BT_PTR_ULONGLONG)
+ DEF_FUNCTION_TYPE_2 (BT_FN_I1_VPTR_INT, BT_I1, BT_VOLATILE_PTR, BT_INT)
+ DEF_FUNCTION_TYPE_2 (BT_FN_I2_VPTR_INT, BT_I2, BT_VOLATILE_PTR, BT_INT)
+ DEF_FUNCTION_TYPE_2 (BT_FN_I4_VPTR_INT, BT_I4, BT_VOLATILE_PTR, BT_INT)
+ DEF_FUNCTION_TYPE_2 (BT_FN_I8_VPTR_INT, BT_I8, BT_VOLATILE_PTR, BT_INT)
+ DEF_FUNCTION_TYPE_2 (BT_FN_I16_VPTR_INT, BT_I16, BT_VOLATILE_PTR, BT_INT)
  
  DEF_POINTER_TYPE (BT_PTR_FN_VOID_PTR_PTR, BT_FN_VOID_PTR_PTR)
  
Index: expr.h
===================================================================
*** expr.h	(revision 175331)
--- expr.h	(working copy)
*************** rtx expand_bool_compare_and_swap (rtx, r
*** 217,222 ****
--- 217,223 ----
  rtx expand_sync_operation (rtx, rtx, enum rtx_code);
  rtx expand_sync_fetch_operation (rtx, rtx, enum rtx_code, bool, rtx);
  rtx expand_sync_lock_test_and_set (rtx, rtx, rtx);
+ rtx expand_sync_mem_load (enum memmodel, rtx, rtx);
  rtx expand_sync_mem_exchange (enum memmodel, rtx, rtx, rtx);
  \f
  /* Functions from expmed.c:  */
Index: fortran/types.def
===================================================================
*** fortran/types.def	(revision 175331)
--- fortran/types.def	(working copy)
*************** DEF_FUNCTION_TYPE_2 (BT_FN_I4_VPTR_I4, B
*** 98,103 ****
--- 98,108 ----
  DEF_FUNCTION_TYPE_2 (BT_FN_I8_VPTR_I8, BT_I8, BT_VOLATILE_PTR, BT_I8)
  DEF_FUNCTION_TYPE_2 (BT_FN_I16_VPTR_I16, BT_I16, BT_VOLATILE_PTR, BT_I16)
  DEF_FUNCTION_TYPE_2 (BT_FN_VOID_PTR_PTR, BT_VOID, BT_PTR, BT_PTR)
+ DEF_FUNCTION_TYPE_2 (BT_FN_I1_VPTR_INT, BT_I1, BT_VOLATILE_PTR, BT_INT)
+ DEF_FUNCTION_TYPE_2 (BT_FN_I2_VPTR_INT, BT_I2, BT_VOLATILE_PTR, BT_INT)
+ DEF_FUNCTION_TYPE_2 (BT_FN_I4_VPTR_INT, BT_I4, BT_VOLATILE_PTR, BT_INT)
+ DEF_FUNCTION_TYPE_2 (BT_FN_I8_VPTR_INT, BT_I8, BT_VOLATILE_PTR, BT_INT)
+ DEF_FUNCTION_TYPE_2 (BT_FN_I16_VPTR_INT, BT_I16, BT_VOLATILE_PTR, BT_INT)
  
  DEF_POINTER_TYPE (BT_PTR_FN_VOID_PTR_PTR, BT_FN_VOID_PTR_PTR)
  
Index: config/i386/sync.md
===================================================================
*** config/i386/sync.md	(revision 175331)
--- config/i386/sync.md	(working copy)
***************
*** 232,237 ****
--- 232,252 ----
    return "lock{%;} add{<imodesuffix>}\t{%1, %0|%0, %1}";
  })
  
+ 
+ (define_expand "sync_mem_load<mode>"
+   [(match_operand:SWI 0 "register_operand" "")          ;; output
+    (match_operand:SWI 1 "memory_operand" "")            ;; memory
+    (match_operand:SI  2 "const_int_operand" "")]        ;; memory model
+    ""
+ {
+   if (INTVAL (operands[2]) == MEMMODEL_ACQUIRE || 
+       INTVAL (operands[2]) == MEMMODEL_SEQ_CST)
+     expand_builtin_sync_synchronize ();
+   ix86_expand_move (<MODE>mode, operands);
+   DONE;
+ })
+ 
+ 
  (define_expand "sync_mem_exchange<mode>"
    [(match_operand:SWI 0 "register_operand" "")		;; output
     (match_operand:SWI 1 "memory_operand" "")		;; memory

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [cxx-mem-model] __sync_mem_load
  2011-06-23 23:20 [cxx-mem-model] __sync_mem_load Andrew MacLeod
@ 2011-06-29 17:36 ` Richard Henderson
  2011-06-29 18:19 ` Richard Henderson
  1 sibling, 0 replies; 3+ messages in thread
From: Richard Henderson @ 2011-06-29 17:36 UTC (permalink / raw)
  To: Andrew MacLeod; +Cc: gcc-patches, Aldy Hernandez

On 06/23/2011 03:50 PM, Andrew MacLeod wrote:
> 	* doc/extend.texi (__sync_mem_load): Document.
> 	* c-family/c-common.c (resolve_overloaded_builtin): Add 
> 	BUILT_IN_SYNC_MEM_LOAD_N.
> 	* optabs.c (expand_sync_mem_load): New.
> 	* optabs.h (enum direct_optab_index): Add DOI_sync_mem_load.
> 	(sync_mem_load_optab): Define.
> 	* genopinit.c: Add entry for sync_mem_load.
> 	* builtins.c (expand_builtin_sync_mem_load): New.
> 	(expand_builtin): Handle BUILT_IN_SYNC_MEM_LOAD_*
> 	* sync-bultins.def: Add entries for BUILT_IN_SYNC_MEM_LOAD_*.
> 	* testsuite/gcc.dg/sync-mem-invalid.c: Add invalid load tests.
> 	* testsuite/gcc.dg/sync-mem.h: Add load executable tests.
> 	* builtin-types.def (BT_FN_I{1,2,4,8,16}_VPTR_INT): New.
> 	* expr.h (expand_sync_mem_load): Declare.
> 	* fortran/types.def (BT_FN_I{1,2,4,8,16}_VPTR_INT): New.
> 	* config/i386/sync.md (sync_mem_load<mode>): New pattern.

Looks good.

> + (define_expand "sync_mem_load<mode>"
> +   [(match_operand:SWI 0 "register_operand" "")          ;; output
> +    (match_operand:SWI 1 "memory_operand" "")            ;; memory
> +    (match_operand:SI  2 "const_int_operand" "")]        ;; memory model
> +    ""
> + {
> +   if (INTVAL (operands[2]) == MEMMODEL_ACQUIRE || 
> +       INTVAL (operands[2]) == MEMMODEL_SEQ_CST)
> +     expand_builtin_sync_synchronize ();
> +   ix86_expand_move (<MODE>mode, operands);
> +   DONE;
> + })

Formatting error (|| operator on next line).

Invoke gen_memory_barrier directly instead of a
call back into expand_builtin_sync_synchronize.

Isn't there a MEMMODEL setting that would call for the
use of LFENCE instead of MFENCE?


r~

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [cxx-mem-model] __sync_mem_load
  2011-06-23 23:20 [cxx-mem-model] __sync_mem_load Andrew MacLeod
  2011-06-29 17:36 ` Richard Henderson
@ 2011-06-29 18:19 ` Richard Henderson
  1 sibling, 0 replies; 3+ messages in thread
From: Richard Henderson @ 2011-06-29 18:19 UTC (permalink / raw)
  To: Andrew MacLeod; +Cc: gcc-patches, Aldy Hernandez

On 06/23/2011 03:50 PM, Andrew MacLeod wrote:
> + (define_expand "sync_mem_load<mode>"
> +   [(match_operand:SWI 0 "register_operand" "")          ;; output
> +    (match_operand:SWI 1 "memory_operand" "")            ;; memory
> +    (match_operand:SI  2 "const_int_operand" "")]        ;; memory model
> +    ""
> + {
> +   if (INTVAL (operands[2]) == MEMMODEL_ACQUIRE || 
> +       INTVAL (operands[2]) == MEMMODEL_SEQ_CST)

Oh, and I suspect all of these will be easier to debug with

  enum memmodel mm = (enum memmodel) INTVAL (operands[2]);
  if (mm == ...)

r~

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2011-06-29 17:41 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-06-23 23:20 [cxx-mem-model] __sync_mem_load Andrew MacLeod
2011-06-29 17:36 ` Richard Henderson
2011-06-29 18:19 ` Richard Henderson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).