public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH 00/10] [RISC-V] Atomics improvements [PR100265/PR100266]
@ 2021-04-26 12:45 Christoph Muellner
  2021-04-26 12:45 ` [PATCH 01/10] RISC-V: Simplify memory model code [PR 100265] Christoph Muellner
                   ` (10 more replies)
  0 siblings, 11 replies; 15+ messages in thread
From: Christoph Muellner @ 2021-04-26 12:45 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jim Wilson, Kito Cheng, Christoph Muellner

This series provides a cleanup of the current atomics implementation
of RISC-V:

* PR100265: Use proper fences for atomic load/store
* PR100266: Provide programmatic implementation of CAS

As both are very related, I merged the patches into one series
(to avoid merge issues if one overtake the other).

The first patch could be squashed into the following patches,
but I found it easier to understand the changes with it in place.

The series has been tested as follows:
* Building and testing a multilib RV32/64 toolchain
  (bootstrapped with riscv-gnu-toolchain repo)
* Manual review of generated sequences for GCC's atomic builtins API

The second part of the series (the re-implementation of CAS) benefits
from a REE improvement (see PR100264):
  https://gcc.gnu.org/pipermail/gcc-patches/2021-April/568680.html
If this patch is not in place, then an additional s.ext instruction
is emitted after the SC.W (in case of RV64 and CAS for uint32_t).

Christoph Muellner (10):
  RISC-V: Simplify memory model code [PR 100265]
  RISC-V: Emit proper memory ordering suffixes for AMOs [PR 100265]
  RISC-V: Eliminate %F specifier from riscv_print_operand() [PR 100265]
  RISC-V: Don't use amoswap for atomic stores [PR 100265]
  RISC-V: Emit fences according to chosen memory model [PR 100265]
  RISC-V: Implement atomic_{load,store} [PR 100265]
  RISC-V: Model INSNs for LR and SC [PR 100266]
  RISC-V: Add s.ext-consuming INSNs for LR and SC [PR 100266]
  RISC-V: Generate helpers for cbranch4 [PR 100266]
  RISC-V: Provide programmatic implementation of CAS [PR 100266]

 gcc/config/riscv/riscv-protos.h |   1 +
 gcc/config/riscv/riscv.c        | 134 +++++++++++++---------
 gcc/config/riscv/riscv.md       |   2 +-
 gcc/config/riscv/sync.md        | 190 ++++++++++++++++++++++----------
 4 files changed, 215 insertions(+), 112 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 01/10] RISC-V: Simplify memory model code [PR 100265]
  2021-04-26 12:45 [PATCH 00/10] [RISC-V] Atomics improvements [PR100265/PR100266] Christoph Muellner
@ 2021-04-26 12:45 ` Christoph Muellner
  2021-04-26 12:45 ` [PATCH 02/10] RISC-V: Emit proper memory ordering suffixes for AMOs " Christoph Muellner
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 15+ messages in thread
From: Christoph Muellner @ 2021-04-26 12:45 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jim Wilson, Kito Cheng, Christoph Muellner

We don't have any special treatment of MEMMODEL_SYNC_* values,
so let's hide them behind the memmodel_base() function.

    gcc/
	PR 100265
        * config/riscv/riscv.c (riscv_memmodel_needs_amo_acquire):
	  Ignore MEMMODEL_SYNC_* values.
        * config/riscv/riscv.c (riscv_memmodel_needs_release_fence):
	  Likewise.
        * config/riscv/riscv.c (riscv_print_operand): Eliminate
	  MEMMODEL_SYNC_* values by calling memmodel_base().
---
 gcc/config/riscv/riscv.c | 15 +++++----------
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
index 17cdf705c328..9b5aedc77131 100644
--- a/gcc/config/riscv/riscv.c
+++ b/gcc/config/riscv/riscv.c
@@ -3345,20 +3345,17 @@ riscv_print_operand_reloc (FILE *file, rtx op, bool hi_reloc)
    acquire portion of memory model MODEL.  */
 
 static bool
-riscv_memmodel_needs_amo_acquire (enum memmodel model)
+riscv_memmodel_needs_amo_acquire (const enum memmodel model)
 {
   switch (model)
     {
       case MEMMODEL_ACQ_REL:
       case MEMMODEL_SEQ_CST:
-      case MEMMODEL_SYNC_SEQ_CST:
       case MEMMODEL_ACQUIRE:
       case MEMMODEL_CONSUME:
-      case MEMMODEL_SYNC_ACQUIRE:
 	return true;
 
       case MEMMODEL_RELEASE:
-      case MEMMODEL_SYNC_RELEASE:
       case MEMMODEL_RELAXED:
 	return false;
 
@@ -3371,20 +3368,17 @@ riscv_memmodel_needs_amo_acquire (enum memmodel model)
    implement the release portion of memory model MODEL.  */
 
 static bool
-riscv_memmodel_needs_release_fence (enum memmodel model)
+riscv_memmodel_needs_release_fence (const enum memmodel model)
 {
   switch (model)
     {
       case MEMMODEL_ACQ_REL:
       case MEMMODEL_SEQ_CST:
-      case MEMMODEL_SYNC_SEQ_CST:
       case MEMMODEL_RELEASE:
-      case MEMMODEL_SYNC_RELEASE:
 	return true;
 
       case MEMMODEL_ACQUIRE:
       case MEMMODEL_CONSUME:
-      case MEMMODEL_SYNC_ACQUIRE:
       case MEMMODEL_RELAXED:
 	return false;
 
@@ -3409,6 +3403,7 @@ riscv_print_operand (FILE *file, rtx op, int letter)
 {
   machine_mode mode = GET_MODE (op);
   enum rtx_code code = GET_CODE (op);
+  const enum memmodel model = memmodel_base (INTVAL (op));
 
   switch (letter)
     {
@@ -3428,12 +3423,12 @@ riscv_print_operand (FILE *file, rtx op, int letter)
       break;
 
     case 'A':
-      if (riscv_memmodel_needs_amo_acquire ((enum memmodel) INTVAL (op)))
+      if (riscv_memmodel_needs_amo_acquire (model))
 	fputs (".aq", file);
       break;
 
     case 'F':
-      if (riscv_memmodel_needs_release_fence ((enum memmodel) INTVAL (op)))
+      if (riscv_memmodel_needs_release_fence (model))
 	fputs ("fence iorw,ow; ", file);
       break;
 
-- 
2.31.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 02/10] RISC-V: Emit proper memory ordering suffixes for AMOs [PR 100265]
  2021-04-26 12:45 [PATCH 00/10] [RISC-V] Atomics improvements [PR100265/PR100266] Christoph Muellner
  2021-04-26 12:45 ` [PATCH 01/10] RISC-V: Simplify memory model code [PR 100265] Christoph Muellner
@ 2021-04-26 12:45 ` Christoph Muellner
  2021-04-26 12:45 ` [PATCH 03/10] RISC-V: Eliminate %F specifier from riscv_print_operand() " Christoph Muellner
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 15+ messages in thread
From: Christoph Muellner @ 2021-04-26 12:45 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jim Wilson, Kito Cheng, Christoph Muellner

The ratified A extension supports '.aq', '.rl' and '.aqrl' as
memory ordering suffixes. Let's emit them in case we get a '%A'
conversion specifier for riscv_print_operand().

As '%A' was already used for a similar, but restricted, purpose
(only '.aq' was emitted so far), this does not require any other
changes.

    gcc/
        PR 100265
        * config/riscv/riscv.c (riscv_memmodel_needs_amo_acquire):
          Remove function.
        * config/riscv/riscv.c (riscv_print_amo_memory_ordering_suffix):
          Add function to emit AMO memory ordering suffixes.
        * config/riscv/riscv.c (riscv_print_operand): Call
          riscv_print_amo_memory_ordering_suffix() instead of
          riscv_memmodel_needs_amo_acquire().
---
 gcc/config/riscv/riscv.c | 29 +++++++++++++++--------------
 1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
index 9b5aedc77131..881eab66a481 100644
--- a/gcc/config/riscv/riscv.c
+++ b/gcc/config/riscv/riscv.c
@@ -3341,24 +3341,26 @@ riscv_print_operand_reloc (FILE *file, rtx op, bool hi_reloc)
   fputc (')', file);
 }
 
-/* Return true if the .AQ suffix should be added to an AMO to implement the
-   acquire portion of memory model MODEL.  */
+/* Print the memory ordering suffix for AMOs.  */
 
-static bool
-riscv_memmodel_needs_amo_acquire (const enum memmodel model)
+static void
+riscv_print_amo_memory_ordering_suffix (FILE *file, const enum memmodel model)
 {
   switch (model)
     {
-      case MEMMODEL_ACQ_REL:
-      case MEMMODEL_SEQ_CST:
-      case MEMMODEL_ACQUIRE:
+      case MEMMODEL_RELAXED:
+	break;
       case MEMMODEL_CONSUME:
-	return true;
-
+      case MEMMODEL_ACQUIRE:
+	fputs (".aq", file);
+	break;
       case MEMMODEL_RELEASE:
-      case MEMMODEL_RELAXED:
-	return false;
-
+	fputs (".rl", file);
+	break;
+      case MEMMODEL_ACQ_REL:
+      case MEMMODEL_SEQ_CST:
+	fputs (".aqrl", file);
+	break;
       default:
 	gcc_unreachable ();
     }
@@ -3423,8 +3425,7 @@ riscv_print_operand (FILE *file, rtx op, int letter)
       break;
 
     case 'A':
-      if (riscv_memmodel_needs_amo_acquire (model))
-	fputs (".aq", file);
+      riscv_print_amo_memory_ordering_suffix (file, model);
       break;
 
     case 'F':
-- 
2.31.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 03/10] RISC-V: Eliminate %F specifier from riscv_print_operand() [PR 100265]
  2021-04-26 12:45 [PATCH 00/10] [RISC-V] Atomics improvements [PR100265/PR100266] Christoph Muellner
  2021-04-26 12:45 ` [PATCH 01/10] RISC-V: Simplify memory model code [PR 100265] Christoph Muellner
  2021-04-26 12:45 ` [PATCH 02/10] RISC-V: Emit proper memory ordering suffixes for AMOs " Christoph Muellner
@ 2021-04-26 12:45 ` Christoph Muellner
  2021-04-26 12:45 ` [PATCH 04/10] RISC-V: Don't use amoswap for atomic stores " Christoph Muellner
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 15+ messages in thread
From: Christoph Muellner @ 2021-04-26 12:45 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jim Wilson, Kito Cheng, Christoph Muellner

A previous patch took care, that the proper memory ordering suffixes
for AMOs are emitted. Therefore there is no reason to keep the fence
generation mechanism for release operations.

    gcc/
        PR 100265
        * config/riscv/riscv.c (riscv_memmodel_needs_release_fence):
          Remove function.
        * config/riscv/riscv.c (riscv_print_operand): Remove
          %F format specifier.
        * config/riscv/sync.md: Remove %F format specifier uses.
---
 gcc/config/riscv/riscv.c | 29 -----------------------------
 gcc/config/riscv/sync.md | 16 ++++++++--------
 2 files changed, 8 insertions(+), 37 deletions(-)

diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
index 881eab66a481..87cdde73ae21 100644
--- a/gcc/config/riscv/riscv.c
+++ b/gcc/config/riscv/riscv.c
@@ -3366,29 +3366,6 @@ riscv_print_amo_memory_ordering_suffix (FILE *file, const enum memmodel model)
     }
 }
 
-/* Return true if a FENCE should be emitted to before a memory access to
-   implement the release portion of memory model MODEL.  */
-
-static bool
-riscv_memmodel_needs_release_fence (const enum memmodel model)
-{
-  switch (model)
-    {
-      case MEMMODEL_ACQ_REL:
-      case MEMMODEL_SEQ_CST:
-      case MEMMODEL_RELEASE:
-	return true;
-
-      case MEMMODEL_ACQUIRE:
-      case MEMMODEL_CONSUME:
-      case MEMMODEL_RELAXED:
-	return false;
-
-      default:
-	gcc_unreachable ();
-    }
-}
-
 /* Implement TARGET_PRINT_OPERAND.  The RISCV-specific operand codes are:
 
    'h'	Print the high-part relocation associated with OP, after stripping
@@ -3396,7 +3373,6 @@ riscv_memmodel_needs_release_fence (const enum memmodel model)
    'R'	Print the low-part relocation associated with OP.
    'C'	Print the integer branch condition for comparison OP.
    'A'	Print the atomic operation suffix for memory model OP.
-   'F'	Print a FENCE if the memory model requires a release.
    'z'	Print x0 if OP is zero, otherwise print OP normally.
    'i'	Print i if the operand is not a register.  */
 
@@ -3428,11 +3404,6 @@ riscv_print_operand (FILE *file, rtx op, int letter)
       riscv_print_amo_memory_ordering_suffix (file, model);
       break;
 
-    case 'F':
-      if (riscv_memmodel_needs_release_fence (model))
-	fputs ("fence iorw,ow; ", file);
-      break;
-
     case 'i':
       if (code != REG)
         fputs ("i", file);
diff --git a/gcc/config/riscv/sync.md b/gcc/config/riscv/sync.md
index 747a799e2377..aeeb2e854b68 100644
--- a/gcc/config/riscv/sync.md
+++ b/gcc/config/riscv/sync.md
@@ -65,7 +65,7 @@
        (match_operand:SI 2 "const_int_operand")]      ;; model
       UNSPEC_ATOMIC_STORE))]
   "TARGET_ATOMIC"
-  "%F2amoswap.<amo>%A2 zero,%z1,%0"
+  "amoswap.<amo>%A2 zero,%z1,%0"
   [(set (attr "length") (const_int 8))])
 
 (define_insn "atomic_<atomic_optab><mode>"
@@ -76,8 +76,8 @@
 	   (match_operand:SI 2 "const_int_operand")] ;; model
 	 UNSPEC_SYNC_OLD_OP))]
   "TARGET_ATOMIC"
-  "%F2amo<insn>.<amo>%A2 zero,%z1,%0"
-  [(set (attr "length") (const_int 8))])
+  "amo<insn>.<amo>%A2 zero,%z1,%0"
+)
 
 (define_insn "atomic_fetch_<atomic_optab><mode>"
   [(set (match_operand:GPR 0 "register_operand" "=&r")
@@ -89,8 +89,8 @@
 	   (match_operand:SI 3 "const_int_operand")] ;; model
 	 UNSPEC_SYNC_OLD_OP))]
   "TARGET_ATOMIC"
-  "%F3amo<insn>.<amo>%A3 %0,%z2,%1"
-  [(set (attr "length") (const_int 8))])
+  "amo<insn>.<amo>%A3 %0,%z2,%1"
+)
 
 (define_insn "atomic_exchange<mode>"
   [(set (match_operand:GPR 0 "register_operand" "=&r")
@@ -101,8 +101,8 @@
    (set (match_dup 1)
 	(match_operand:GPR 2 "register_operand" "0"))]
   "TARGET_ATOMIC"
-  "%F3amoswap.<amo>%A3 %0,%z2,%1"
-  [(set (attr "length") (const_int 8))])
+  "amoswap.<amo>%A3 %0,%z2,%1"
+)
 
 (define_insn "atomic_cas_value_strong<mode>"
   [(set (match_operand:GPR 0 "register_operand" "=&r")
@@ -115,7 +115,7 @@
 	 UNSPEC_COMPARE_AND_SWAP))
    (clobber (match_scratch:GPR 6 "=&r"))]
   "TARGET_ATOMIC"
-  "%F5 1: lr.<amo>%A5 %0,%1; bne %0,%z2,1f; sc.<amo>%A4 %6,%z3,%1; bnez %6,1b; 1:"
+  "1: lr.<amo>%A5 %0,%1; bne %0,%z2,1f; sc.<amo>%A4 %6,%z3,%1; bnez %6,1b; 1:"
   [(set (attr "length") (const_int 20))])
 
 (define_expand "atomic_compare_and_swap<mode>"
-- 
2.31.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 04/10] RISC-V: Don't use amoswap for atomic stores [PR 100265]
  2021-04-26 12:45 [PATCH 00/10] [RISC-V] Atomics improvements [PR100265/PR100266] Christoph Muellner
                   ` (2 preceding siblings ...)
  2021-04-26 12:45 ` [PATCH 03/10] RISC-V: Eliminate %F specifier from riscv_print_operand() " Christoph Muellner
@ 2021-04-26 12:45 ` Christoph Muellner
  2021-04-26 12:45 ` [PATCH 05/10] RISC-V: Emit fences according to chosen memory model " Christoph Muellner
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 15+ messages in thread
From: Christoph Muellner @ 2021-04-26 12:45 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jim Wilson, Kito Cheng, Christoph Muellner

Using amoswap as atomic store is not an expected optimization
and most likely causes a performance penalty.
Neither SW nor HW have a benefit from this optimization,
so let's simply drop it.

    gcc/
        PR 100265
        * config/riscv/sync.md (atomic_store<mode>):
          Remove.
---
 gcc/config/riscv/sync.md | 11 -----------
 1 file changed, 11 deletions(-)

diff --git a/gcc/config/riscv/sync.md b/gcc/config/riscv/sync.md
index aeeb2e854b68..efd49745a8e2 100644
--- a/gcc/config/riscv/sync.md
+++ b/gcc/config/riscv/sync.md
@@ -57,17 +57,6 @@
 
 ;; Atomic memory operations.
 
-;; Implement atomic stores with amoswap.  Fall back to fences for atomic loads.
-(define_insn "atomic_store<mode>"
-  [(set (match_operand:GPR 0 "memory_operand" "=A")
-    (unspec_volatile:GPR
-      [(match_operand:GPR 1 "reg_or_0_operand" "rJ")
-       (match_operand:SI 2 "const_int_operand")]      ;; model
-      UNSPEC_ATOMIC_STORE))]
-  "TARGET_ATOMIC"
-  "amoswap.<amo>%A2 zero,%z1,%0"
-  [(set (attr "length") (const_int 8))])
-
 (define_insn "atomic_<atomic_optab><mode>"
   [(set (match_operand:GPR 0 "memory_operand" "+A")
 	(unspec_volatile:GPR
-- 
2.31.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 05/10] RISC-V: Emit fences according to chosen memory model [PR 100265]
  2021-04-26 12:45 [PATCH 00/10] [RISC-V] Atomics improvements [PR100265/PR100266] Christoph Muellner
                   ` (3 preceding siblings ...)
  2021-04-26 12:45 ` [PATCH 04/10] RISC-V: Don't use amoswap for atomic stores " Christoph Muellner
@ 2021-04-26 12:45 ` Christoph Muellner
  2021-04-26 12:45 ` [PATCH 06/10] RISC-V: Implement atomic_{load,store} " Christoph Muellner
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 15+ messages in thread
From: Christoph Muellner @ 2021-04-26 12:45 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jim Wilson, Kito Cheng, Christoph Muellner

mem_thread_fence gets the desired memory model as operand.
Let's emit fences according to this value (as defined in section
    "Code Porting and Mapping Guidelines" of the unpriv spec).

    gcc/
        PR 100265
        * config/riscv/sync.md (mem_thread_fence):
          Emit fences according to given operand.
        * config/riscv/sync.md (mem_fence):
          Add INSNs for different fence flavours.
        * config/riscv/sync.md (mem_thread_fence_1):
          Remove.
---
 gcc/config/riscv/sync.md | 41 +++++++++++++++++++++++++++-------------
 1 file changed, 28 insertions(+), 13 deletions(-)

diff --git a/gcc/config/riscv/sync.md b/gcc/config/riscv/sync.md
index efd49745a8e2..406db1730b81 100644
--- a/gcc/config/riscv/sync.md
+++ b/gcc/config/riscv/sync.md
@@ -34,26 +34,41 @@
 ;; Memory barriers.
 
 (define_expand "mem_thread_fence"
-  [(match_operand:SI 0 "const_int_operand" "")] ;; model
+  [(match_operand:SI 0 "const_int_operand")] ;; model
   ""
 {
-  if (INTVAL (operands[0]) != MEMMODEL_RELAXED)
-    {
-      rtx mem = gen_rtx_MEM (BLKmode, gen_rtx_SCRATCH (Pmode));
-      MEM_VOLATILE_P (mem) = 1;
-      emit_insn (gen_mem_thread_fence_1 (mem, operands[0]));
-    }
+  enum memmodel model = memmodel_from_int (INTVAL (operands[0]));
+  if (!(is_mm_relaxed (model)))
+      emit_insn (gen_mem_fence (operands[0]));
   DONE;
 })
 
-;; Until the RISC-V memory model (hence its mapping from C++) is finalized,
-;; conservatively emit a full FENCE.
-(define_insn "mem_thread_fence_1"
+(define_expand "mem_fence"
+  [(set (match_dup 1)
+	(unspec:BLK [(match_dup 1) (match_operand:SI 0 "const_int_operand")]
+		    UNSPEC_MEMORY_BARRIER))]
+  ""
+{
+  operands[1] = gen_rtx_MEM (BLKmode, gen_rtx_SCRATCH (Pmode));
+  MEM_VOLATILE_P (operands[1]) = 1;
+})
+
+(define_insn "*mem_fence"
   [(set (match_operand:BLK 0 "" "")
-	(unspec:BLK [(match_dup 0)] UNSPEC_MEMORY_BARRIER))
-   (match_operand:SI 1 "const_int_operand" "")] ;; model
+	(unspec:BLK [(match_dup 0) (match_operand:SI 1 "const_int_operand")]
+		    UNSPEC_MEMORY_BARRIER))]
   ""
-  "fence\tiorw,iorw")
+{
+  enum memmodel model = memmodel_from_int (INTVAL (operands[1]));
+  if (is_mm_consume (model) || is_mm_acquire (model))
+    return "fence\tr, rw";
+  else if (is_mm_release (model))
+    return "fence\trw, w";
+  else if (is_mm_acq_rel (model))
+    return "fence.tso";
+  else
+    return "fence\trw, rw";
+})
 
 ;; Atomic memory operations.
 
-- 
2.31.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 06/10] RISC-V: Implement atomic_{load,store} [PR 100265]
  2021-04-26 12:45 [PATCH 00/10] [RISC-V] Atomics improvements [PR100265/PR100266] Christoph Muellner
                   ` (4 preceding siblings ...)
  2021-04-26 12:45 ` [PATCH 05/10] RISC-V: Emit fences according to chosen memory model " Christoph Muellner
@ 2021-04-26 12:45 ` Christoph Muellner
  2021-04-26 12:45 ` [PATCH 07/10] RISC-V: Model INSNs for LR and SC [PR 100266] Christoph Muellner
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 15+ messages in thread
From: Christoph Muellner @ 2021-04-26 12:45 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jim Wilson, Kito Cheng, Christoph Muellner

A recent commit introduced a mechanism to emit proper fences
for RISC-V. Additionally, we already have emit_move_insn ().
Let's reuse this code and provide atomic_load<mode> and
atomic_store<mode> for RISC-V (as defined in section
"Code Porting and Mapping Guidelines" of the unpriv spec).
Note, that this works also for sub-word atomics.

    gcc/
        PR 100265
        * config/riscv/sync.md (atomic_load<mode>): New.
        * config/riscv/sync.md (atomic_store<mode>): New.
---
 gcc/config/riscv/sync.md | 41 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 41 insertions(+)

diff --git a/gcc/config/riscv/sync.md b/gcc/config/riscv/sync.md
index 406db1730b81..ceec324dfa30 100644
--- a/gcc/config/riscv/sync.md
+++ b/gcc/config/riscv/sync.md
@@ -23,6 +23,7 @@
   UNSPEC_COMPARE_AND_SWAP
   UNSPEC_SYNC_OLD_OP
   UNSPEC_SYNC_EXCHANGE
+  UNSPEC_ATOMIC_LOAD
   UNSPEC_ATOMIC_STORE
   UNSPEC_MEMORY_BARRIER
 ])
@@ -72,6 +73,46 @@
 
 ;; Atomic memory operations.
 
+(define_expand "atomic_load<mode>"
+  [(set (match_operand:ANYI 0 "register_operand" "=r")
+    (unspec_volatile:ANYI
+      [(match_operand:ANYI 1 "memory_operand" "A")
+       (match_operand:SI 2 "const_int_operand")]      ;; model
+      UNSPEC_ATOMIC_LOAD))]
+  ""
+  {
+    rtx target = operands[0];
+    rtx mem = operands[1];
+    enum memmodel model = memmodel_from_int (INTVAL (operands[2]));
+
+    if (is_mm_seq_cst (model))
+      emit_insn (gen_mem_fence (GEN_INT (MEMMODEL_SEQ_CST)));
+    emit_move_insn (target, mem);
+    if (is_mm_acquire (model) || is_mm_seq_cst (model))
+      emit_insn (gen_mem_fence (GEN_INT (MEMMODEL_ACQUIRE)));
+
+    DONE;
+})
+
+(define_expand "atomic_store<mode>"
+  [(set (match_operand:ANYI 0 "memory_operand" "=A")
+    (unspec_volatile:ANYI
+      [(match_operand:ANYI 1 "reg_or_0_operand" "rJ")
+       (match_operand:SI 2 "const_int_operand")]      ;; model
+      UNSPEC_ATOMIC_STORE))]
+  ""
+  {
+    rtx mem = operands[0];
+    rtx val = operands[1];
+    enum memmodel model = memmodel_from_int (INTVAL (operands[2]));
+
+    if (is_mm_release (model) || is_mm_seq_cst (model))
+      emit_insn (gen_mem_fence (GEN_INT (MEMMODEL_RELEASE)));
+    emit_move_insn (mem, val);
+
+    DONE;
+})
+
 (define_insn "atomic_<atomic_optab><mode>"
   [(set (match_operand:GPR 0 "memory_operand" "+A")
 	(unspec_volatile:GPR
-- 
2.31.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 07/10] RISC-V: Model INSNs for LR and SC [PR 100266]
  2021-04-26 12:45 [PATCH 00/10] [RISC-V] Atomics improvements [PR100265/PR100266] Christoph Muellner
                   ` (5 preceding siblings ...)
  2021-04-26 12:45 ` [PATCH 06/10] RISC-V: Implement atomic_{load,store} " Christoph Muellner
@ 2021-04-26 12:45 ` Christoph Muellner
  2021-04-26 12:45 ` [PATCH 08/10] RISC-V: Add s.ext-consuming " Christoph Muellner
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 15+ messages in thread
From: Christoph Muellner @ 2021-04-26 12:45 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jim Wilson, Kito Cheng, Christoph Muellner

In order to emit LR/SC sequences, let's provide INSNs, which
take care of memory ordering constraints.

    gcc/
        PR 100266
        * config/rsicv/sync.md (UNSPEC_LOAD_RESERVED): New.
        * config/rsicv/sync.md (UNSPEC_STORE_CONDITIONAL): New.
        * config/riscv/sync.md (riscv_load_reserved): New.
        * config/riscv/sync.md (riscv_store_conditional): New.
---
 gcc/config/riscv/sync.md | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/gcc/config/riscv/sync.md b/gcc/config/riscv/sync.md
index ceec324dfa30..edff6520b87e 100644
--- a/gcc/config/riscv/sync.md
+++ b/gcc/config/riscv/sync.md
@@ -26,6 +26,8 @@
   UNSPEC_ATOMIC_LOAD
   UNSPEC_ATOMIC_STORE
   UNSPEC_MEMORY_BARRIER
+  UNSPEC_LOAD_RESERVED
+  UNSPEC_STORE_CONDITIONAL
 ])
 
 (define_code_iterator any_atomic [plus ior xor and])
@@ -113,6 +115,28 @@
     DONE;
 })
 
+(define_insn "@riscv_load_reserved<mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+    (unspec_volatile:GPR
+      [(match_operand:GPR 1 "memory_operand" "A")
+       (match_operand:SI 2 "const_int_operand")]      ;; model
+      UNSPEC_LOAD_RESERVED))]
+  "TARGET_ATOMIC"
+  "lr.<amo>%A2 %0, %1"
+)
+
+(define_insn "@riscv_store_conditional<mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=&r")
+    (unspec_volatile:GPR [(const_int 0)] UNSPEC_STORE_CONDITIONAL))
+   (set (match_operand:GPR 1 "memory_operand" "=A")
+    (unspec_volatile:GPR
+      [(match_operand:GPR 2 "reg_or_0_operand" "rJ")
+       (match_operand:SI 3 "const_int_operand")]      ;; model
+      UNSPEC_STORE_CONDITIONAL))]
+  "TARGET_ATOMIC"
+  "sc.<amo>%A3 %0, %z2, %1"
+)
+
 (define_insn "atomic_<atomic_optab><mode>"
   [(set (match_operand:GPR 0 "memory_operand" "+A")
 	(unspec_volatile:GPR
-- 
2.31.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 08/10] RISC-V: Add s.ext-consuming INSNs for LR and SC [PR 100266]
  2021-04-26 12:45 [PATCH 00/10] [RISC-V] Atomics improvements [PR100265/PR100266] Christoph Muellner
                   ` (6 preceding siblings ...)
  2021-04-26 12:45 ` [PATCH 07/10] RISC-V: Model INSNs for LR and SC [PR 100266] Christoph Muellner
@ 2021-04-26 12:45 ` Christoph Muellner
  2021-04-26 12:45 ` [PATCH 09/10] RISC-V: Generate helpers for cbranch4 " Christoph Muellner
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 15+ messages in thread
From: Christoph Muellner @ 2021-04-26 12:45 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jim Wilson, Kito Cheng, Christoph Muellner

The current model of the LR and SC INSNs requires a sign-extension
to use the generated SImode value for conditional branches, which
only operate on XLEN registers.
However, the sign-extension is actually not required in both cases,
therefore this patch introduces additional INSNs that consume
the sign-extension.

Rationale:
The sign-extension of the loaded value of a LR.W is specified as
sign-extended. Therefore, a sign-extension is not required.
The sign-extension of the success value a SC.W is specified as
non-zero. As sign-extended non-zero value remains non-zero,
therefore the sign-extension is not required.

    gcc/
        PR 100266
        * config/riscv/sync.md (riscv_load_reserved): New.
        * config/riscv/sync.md (riscv_store_conditional): New.
---
 gcc/config/riscv/sync.md | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/gcc/config/riscv/sync.md b/gcc/config/riscv/sync.md
index edff6520b87e..49b860da8ef0 100644
--- a/gcc/config/riscv/sync.md
+++ b/gcc/config/riscv/sync.md
@@ -125,6 +125,21 @@
   "lr.<amo>%A2 %0, %1"
 )
 
+;; This pattern allows to consume a sign-extension of the loaded value.
+;; This is legal, because the specification of LR.W defines the loaded
+;; value to be sign-extended.
+
+(define_insn "riscv_load_reserved"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+    (sign_extend:DI
+      (unspec_volatile:SI
+	[(match_operand:SI 1 "memory_operand" "A")
+	 (match_operand:SI 2 "const_int_operand")]      ;; model
+	UNSPEC_LOAD_RESERVED)))]
+  "TARGET_ATOMIC && TARGET_64BIT"
+  "lr.w%A2 %0, %1"
+)
+
 (define_insn "@riscv_store_conditional<mode>"
   [(set (match_operand:GPR 0 "register_operand" "=&r")
     (unspec_volatile:GPR [(const_int 0)] UNSPEC_STORE_CONDITIONAL))
@@ -137,6 +152,25 @@
   "sc.<amo>%A3 %0, %z2, %1"
 )
 
+;; This pattern allows to consume a sign-extension of the success
+;; value of SC.W, which can then be used for instructions which
+;; require values of XLEN-size (e.g. conditional branches).
+;; This is legal, because any non-zero value remains non-zero
+;; after sign-extension.
+
+(define_insn "riscv_store_conditional"
+  [(set (match_operand:DI 0 "register_operand" "=&r")
+    (sign_extend:DI
+      (unspec_volatile:SI [(const_int 0)] UNSPEC_STORE_CONDITIONAL)))
+   (set (match_operand:SI 1 "memory_operand" "=A")
+    (unspec_volatile:SI
+      [(match_operand:SI 2 "reg_or_0_operand" "rJ")
+       (match_operand:SI 3 "const_int_operand")]      ;; model
+      UNSPEC_STORE_CONDITIONAL))]
+  "TARGET_ATOMIC && TARGET_64BIT"
+  "sc.w%A3 %0, %z2, %1"
+)
+
 (define_insn "atomic_<atomic_optab><mode>"
   [(set (match_operand:GPR 0 "memory_operand" "+A")
 	(unspec_volatile:GPR
-- 
2.31.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 09/10] RISC-V: Generate helpers for cbranch4 [PR 100266]
  2021-04-26 12:45 [PATCH 00/10] [RISC-V] Atomics improvements [PR100265/PR100266] Christoph Muellner
                   ` (7 preceding siblings ...)
  2021-04-26 12:45 ` [PATCH 08/10] RISC-V: Add s.ext-consuming " Christoph Muellner
@ 2021-04-26 12:45 ` Christoph Muellner
  2021-04-26 14:39   ` Kito Cheng
  2021-04-26 12:45 ` [PATCH 10/10] RISC-V: Provide programmatic implementation of CAS " Christoph Muellner
  2021-04-28 22:40 ` [PATCH 00/10] [RISC-V] Atomics improvements [PR100265/PR100266] Jim Wilson
  10 siblings, 1 reply; 15+ messages in thread
From: Christoph Muellner @ 2021-04-26 12:45 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jim Wilson, Kito Cheng, Christoph Muellner

On RISC-V we are facing the fact, that our conditional branches
require Pmode conditions. Currently, we generate them explicitly
with a check for Pmode and then calling the proper generator
(i.e. gen_cbranchdi4 on RV64 and gen_cbranchsi4 on RV32).
Let's make simplify this code by using gen_cbranch4 (Pmode).

    gcc/
        PR 100266
        * config/rsicv/riscv.c (riscv_block_move_loop): Simplify.
        * config/rsicv/riscv.md (cbranch<mode>4): Generate helpers.
---
 gcc/config/riscv/riscv.c  | 5 +----
 gcc/config/riscv/riscv.md | 2 +-
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
index 87cdde73ae21..6e97b38db6db 100644
--- a/gcc/config/riscv/riscv.c
+++ b/gcc/config/riscv/riscv.c
@@ -3250,10 +3250,7 @@ riscv_block_move_loop (rtx dest, rtx src, unsigned HOST_WIDE_INT length,
 
   /* Emit the loop condition.  */
   test = gen_rtx_NE (VOIDmode, src_reg, final_src);
-  if (Pmode == DImode)
-    emit_jump_insn (gen_cbranchdi4 (test, src_reg, final_src, label));
-  else
-    emit_jump_insn (gen_cbranchsi4 (test, src_reg, final_src, label));
+  emit_jump_insn (gen_cbranch4 (Pmode, test, src_reg, final_src, label));
 
   /* Mop up any left-over bytes.  */
   if (leftover)
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index c3687d57047b..52f8a321ac23 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -1908,7 +1908,7 @@
 		      (label_ref (match_operand 1))
 		      (pc)))])
 
-(define_expand "cbranch<mode>4"
+(define_expand "@cbranch<mode>4"
   [(set (pc)
 	(if_then_else (match_operator 0 "comparison_operator"
 		      [(match_operand:BR 1 "register_operand")
-- 
2.31.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 10/10] RISC-V: Provide programmatic implementation of CAS [PR 100266]
  2021-04-26 12:45 [PATCH 00/10] [RISC-V] Atomics improvements [PR100265/PR100266] Christoph Muellner
                   ` (8 preceding siblings ...)
  2021-04-26 12:45 ` [PATCH 09/10] RISC-V: Generate helpers for cbranch4 " Christoph Muellner
@ 2021-04-26 12:45 ` Christoph Muellner
  2021-04-27 15:17   ` Jim Wilson
  2021-04-28 22:40 ` [PATCH 00/10] [RISC-V] Atomics improvements [PR100265/PR100266] Jim Wilson
  10 siblings, 1 reply; 15+ messages in thread
From: Christoph Muellner @ 2021-04-26 12:45 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jim Wilson, Kito Cheng, Christoph Muellner

The existing CAS implementation uses an INSN definition, which provides
the core LR/SC sequence. Additionally to that, there is a follow-up code,
that evaluates the results and calculates the return values.
This has two drawbacks: a) an extension to sub-word CAS implementations
is not possible (even if, then it would be unmaintainable), and b) the
implementation is hard to maintain/improve.
This patch provides a programmatic implementation of CAS, similar
like many other architectures are having one.

The implementation supports both, RV32 and RV64.

Additionally, the implementation does not introduce data dependencies
for computation of the return value. Instead, we set the return value
(success state of the CAS operation) based on structural information.
This approach is also shown in the the RISC-V unpriv spec (as part
of the sample code for a compare-and-swap function using LR/SC).
The cost of this implementation is a single LI instruction on top,
which is actually not required in case of success (it will be
overwritten in the success case later).

The resulting sequence requires 9 instructions in the success case.
The previous implementation required 11 instructions in the succcess
case (including a taken branch) and had a "subw;seqz;beqz" sequence,
with direct dependencies.

Below is the generated code of a 32-bit CAS sequence with the old
implementation and the new implementation (ignore the ANDIs below).

Old:
     f00:       419c                    lw      a5,0(a1)
     f02:       1005272f                lr.w    a4,(a0)
     f06:       00f71563                bne     a4,a5,f10
     f0a:       18c526af                sc.w    a3,a2,(a0)
     f0e:       faf5                    bnez    a3,f02
     f10:       40f707bb                subw    a5,a4,a5
     f14:       0017b513                seqz    a0,a5
     f18:       c391                    beqz    a5,f1c
     f1a:       c198                    sw      a4,0(a1)
     f1c:       8905                    andi    a0,a0,1
     f1e:       8082                    ret

New:
     e28:       4194                    lw      a3,0(a1)
     e2a:       4701                    li      a4,0
     e2c:       1005282f                lr.w    a6,(a0)
     e30:       00d81963                bne     a6,a3,e42
     e34:       18c527af                sc.w    a5,a2,(a0)
     e38:       fbf5                    bnez    a5,e2c
     e3a:       4705                    li      a4,1
     e3c:       00177513                andi    a0,a4,1
     e40:       8082                    ret
     e42:       0105a023                sw      a6,0(a1)
     e46:       00177513                andi    a0,a4,1
     e4a:       8082                    ret

    gcc/
        PR 100266
        * config/riscv/riscv-protos.h (riscv_expand_compare_and_swap): New.
        * config/riscv/riscv.c (riscv_emit_unlikely_jump): New.
        * config/rsicv/riscv.c (riscv_expand_compare_and_swap): New.
        * config/rsicv/sync.md (atomic_cas_value_strong): Removed.
        * config/rsicv/sync.md (atomic_compare_and_swap): Call
	  riscv_expand_compare_and_swap.
---
 gcc/config/riscv/riscv-protos.h |  1 +
 gcc/config/riscv/riscv.c        | 68 +++++++++++++++++++++++++++++++++
 gcc/config/riscv/sync.md        | 35 +----------------
 3 files changed, 70 insertions(+), 34 deletions(-)

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 43d7224d6941..eb7e67d3b95a 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -59,6 +59,7 @@ extern void riscv_expand_int_scc (rtx, enum rtx_code, rtx, rtx);
 extern void riscv_expand_float_scc (rtx, enum rtx_code, rtx, rtx);
 extern void riscv_expand_conditional_branch (rtx, enum rtx_code, rtx, rtx);
 extern void riscv_expand_conditional_move (rtx, rtx, rtx, rtx_code, rtx, rtx);
+extern void riscv_expand_compare_and_swap (rtx[]);
 #endif
 extern rtx riscv_legitimize_call_address (rtx);
 extern void riscv_set_return_address (rtx, rtx);
diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
index 6e97b38db6db..c81a9bd6a29e 100644
--- a/gcc/config/riscv/riscv.c
+++ b/gcc/config/riscv/riscv.c
@@ -2488,6 +2488,74 @@ riscv_expand_conditional_move (rtx dest, rtx cons, rtx alt, rtx_code code,
 						      cons, alt)));
 }
 
+/* Mark the previous jump instruction as unlikely.  */
+
+static void
+riscv_emit_unlikely_jump (rtx insn)
+{
+  rtx_insn *jump = emit_jump_insn (insn);
+  add_reg_br_prob_note (jump, profile_probability::very_unlikely ());
+}
+
+/* Expand code to perform a compare-and-swap.  */
+
+extern void riscv_expand_compare_and_swap (rtx operands[])
+{
+  rtx bval, oldval, mem, expval, newval, mod_s, mod_f, scratch, cond1, cond2;
+  machine_mode mode;
+  rtx_code_label *begin_label, *end_label;
+
+  bval = operands[0];
+  oldval = operands[1];
+  mem = operands[2];
+  expval = operands[3];
+  newval = operands[4];
+  mod_s = operands[6];
+  mod_f = operands[7];
+  mode = GET_MODE (mem);
+  scratch = gen_reg_rtx (mode);
+  begin_label = gen_label_rtx ();
+  end_label = gen_label_rtx ();
+
+  /* No support for sub-word CAS.  */
+  if (mode == QImode || mode == HImode)
+    gcc_unreachable ();
+
+  /* We use mod_f for LR and mod_s for SC below, but
+     RV does not have any guarantees for LR.rl and SC.aq.  */
+  if (is_mm_acquire (memmodel_base (INTVAL (mod_s)))
+      && is_mm_relaxed (memmodel_base (INTVAL (mod_f))))
+    {
+      mod_f = GEN_INT (MEMMODEL_ACQUIRE);
+      mod_s = GEN_INT (MEMMODEL_RELAXED);
+    }
+
+  /* Since we want to maintain a branch-free good-case, but also want
+     to not have two branches in the bad-case, we set bval to FALSE
+     on top of the sequence.  In the bad case, we simply jump over
+     the assignment of bval to TRUE at the end of the sequence.  */
+
+  emit_insn (gen_rtx_SET (bval, gen_rtx_CONST_INT (SImode, FALSE)));
+
+  emit_label (begin_label);
+
+  emit_insn (gen_riscv_load_reserved (mode, oldval, mem, mod_f));
+
+  cond1 = gen_rtx_NE (mode, oldval, expval);
+  riscv_emit_unlikely_jump (gen_cbranch4 (Pmode, cond1, oldval, expval,
+			    end_label));
+
+  emit_insn (gen_riscv_store_conditional (mode, scratch, mem, newval, mod_s));
+
+  cond2 = gen_rtx_NE (mode, scratch, const0_rtx);
+  riscv_emit_unlikely_jump (gen_cbranch4 (Pmode, cond2, scratch, const0_rtx,
+			    begin_label));
+
+  emit_insn (gen_rtx_SET (bval, gen_rtx_CONST_INT (SImode, TRUE)));
+
+  emit_label (end_label);
+}
+
 /* Implement TARGET_FUNCTION_ARG_BOUNDARY.  Every parameter gets at
    least PARM_BOUNDARY bits of alignment, but will be given anything up
    to PREFERRED_STACK_BOUNDARY bits if the type requires it.  */
diff --git a/gcc/config/riscv/sync.md b/gcc/config/riscv/sync.md
index 49b860da8ef0..da8dbf698163 100644
--- a/gcc/config/riscv/sync.md
+++ b/gcc/config/riscv/sync.md
@@ -207,20 +207,6 @@
   "amoswap.<amo>%A3 %0,%z2,%1"
 )
 
-(define_insn "atomic_cas_value_strong<mode>"
-  [(set (match_operand:GPR 0 "register_operand" "=&r")
-	(match_operand:GPR 1 "memory_operand" "+A"))
-   (set (match_dup 1)
-	(unspec_volatile:GPR [(match_operand:GPR 2 "reg_or_0_operand" "rJ")
-			      (match_operand:GPR 3 "reg_or_0_operand" "rJ")
-			      (match_operand:SI 4 "const_int_operand")  ;; mod_s
-			      (match_operand:SI 5 "const_int_operand")] ;; mod_f
-	 UNSPEC_COMPARE_AND_SWAP))
-   (clobber (match_scratch:GPR 6 "=&r"))]
-  "TARGET_ATOMIC"
-  "1: lr.<amo>%A5 %0,%1; bne %0,%z2,1f; sc.<amo>%A4 %6,%z3,%1; bnez %6,1b; 1:"
-  [(set (attr "length") (const_int 20))])
-
 (define_expand "atomic_compare_and_swap<mode>"
   [(match_operand:SI 0 "register_operand" "")   ;; bool output
    (match_operand:GPR 1 "register_operand" "")  ;; val output
@@ -232,26 +218,7 @@
    (match_operand:SI 7 "const_int_operand" "")] ;; mod_f
   "TARGET_ATOMIC"
 {
-  emit_insn (gen_atomic_cas_value_strong<mode> (operands[1], operands[2],
-						operands[3], operands[4],
-						operands[6], operands[7]));
-
-  rtx compare = operands[1];
-  if (operands[3] != const0_rtx)
-    {
-      rtx difference = gen_rtx_MINUS (<MODE>mode, operands[1], operands[3]);
-      compare = gen_reg_rtx (<MODE>mode);
-      emit_insn (gen_rtx_SET (compare, difference));
-    }
-
-  if (word_mode != <MODE>mode)
-    {
-      rtx reg = gen_reg_rtx (word_mode);
-      emit_insn (gen_rtx_SET (reg, gen_rtx_SIGN_EXTEND (word_mode, compare)));
-      compare = reg;
-    }
-
-  emit_insn (gen_rtx_SET (operands[0], gen_rtx_EQ (SImode, compare, const0_rtx)));
+  riscv_expand_compare_and_swap (operands);
   DONE;
 })
 
-- 
2.31.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 09/10] RISC-V: Generate helpers for cbranch4 [PR 100266]
  2021-04-26 12:45 ` [PATCH 09/10] RISC-V: Generate helpers for cbranch4 " Christoph Muellner
@ 2021-04-26 14:39   ` Kito Cheng
  2021-05-05 19:26     ` Christoph Müllner
  0 siblings, 1 reply; 15+ messages in thread
From: Kito Cheng @ 2021-04-26 14:39 UTC (permalink / raw)
  To: Christoph Muellner; +Cc: GCC Patches, Jim Wilson

This patch is a good and simple improvement which could be an independent patch.

There is only 1 comment from me for this patch, could you also add @
to cbranch pattern for floating mode, I would prefer make the
gen_cbranch4 could handle floating mode for consistency.

So feel free to commit this patch once you have addressed my comment.



On Mon, Apr 26, 2021 at 8:46 PM Christoph Muellner
<cmuellner@gcc.gnu.org> wrote:
>
> On RISC-V we are facing the fact, that our conditional branches
> require Pmode conditions. Currently, we generate them explicitly
> with a check for Pmode and then calling the proper generator
> (i.e. gen_cbranchdi4 on RV64 and gen_cbranchsi4 on RV32).
> Let's make simplify this code by using gen_cbranch4 (Pmode).
>
>     gcc/
>         PR 100266
>         * config/rsicv/riscv.c (riscv_block_move_loop): Simplify.
>         * config/rsicv/riscv.md (cbranch<mode>4): Generate helpers.
> ---
>  gcc/config/riscv/riscv.c  | 5 +----
>  gcc/config/riscv/riscv.md | 2 +-
>  2 files changed, 2 insertions(+), 5 deletions(-)
>
> diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
> index 87cdde73ae21..6e97b38db6db 100644
> --- a/gcc/config/riscv/riscv.c
> +++ b/gcc/config/riscv/riscv.c
> @@ -3250,10 +3250,7 @@ riscv_block_move_loop (rtx dest, rtx src, unsigned HOST_WIDE_INT length,
>
>    /* Emit the loop condition.  */
>    test = gen_rtx_NE (VOIDmode, src_reg, final_src);
> -  if (Pmode == DImode)
> -    emit_jump_insn (gen_cbranchdi4 (test, src_reg, final_src, label));
> -  else
> -    emit_jump_insn (gen_cbranchsi4 (test, src_reg, final_src, label));
> +  emit_jump_insn (gen_cbranch4 (Pmode, test, src_reg, final_src, label));
>
>    /* Mop up any left-over bytes.  */
>    if (leftover)
> diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> index c3687d57047b..52f8a321ac23 100644
> --- a/gcc/config/riscv/riscv.md
> +++ b/gcc/config/riscv/riscv.md
> @@ -1908,7 +1908,7 @@
>                       (label_ref (match_operand 1))
>                       (pc)))])
>
> -(define_expand "cbranch<mode>4"
> +(define_expand "@cbranch<mode>4"
>    [(set (pc)
>         (if_then_else (match_operator 0 "comparison_operator"
>                       [(match_operand:BR 1 "register_operand")
> --
> 2.31.1
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 10/10] RISC-V: Provide programmatic implementation of CAS [PR 100266]
  2021-04-26 12:45 ` [PATCH 10/10] RISC-V: Provide programmatic implementation of CAS " Christoph Muellner
@ 2021-04-27 15:17   ` Jim Wilson
  0 siblings, 0 replies; 15+ messages in thread
From: Jim Wilson @ 2021-04-27 15:17 UTC (permalink / raw)
  To: Christoph Muellner; +Cc: GCC Patches, Kito Cheng

On Mon, Apr 26, 2021 at 5:46 AM Christoph Muellner <cmuellner@gcc.gnu.org>
wrote:

> The existing CAS implementation uses an INSN definition, which provides
> the core LR/SC sequence. Additionally to that, there is a follow-up code,
> that evaluates the results and calculates the return values.
> This has two drawbacks: a) an extension to sub-word CAS implementations
> is not possible (even if, then it would be unmaintainable), and b) the
> implementation is hard to maintain/improve.
> This patch provides a programmatic implementation of CAS, similar
> like many other architectures are having one.


I noticed that when the address isn't already valid for lr/sc then we end
up with extra instructions to fix the address.  For instance, using
gcc/testsuite/gcc.dg/atomic-compare-exchange-3.c, I get for the lr/sc loop
.L2:
    addi a5,a3,%lo(v)
    lr.w a1, 0(a5)
    bne a1,a2,.L7
    addi a1,a3,%lo(v)
    sc.w a5, a0, 0(a1)
    sext.w a5,a5
    bne a5,zero,.L2
and note that there are two addi %lo instructions.  The current code gives
     addi a4,a4,%lo(v)
1: lr.w a2,0(a4); bne a2,a5,1f; sc.w a6,a0,0(a4); bnez a6,1b; 1:
which is better, as the address is fixed before the lr/sc loop.

The sext is fixed by the REE patch, or by directly generating the
sign-extending sc.w so that isn't an issue here.

Jim

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 00/10] [RISC-V] Atomics improvements [PR100265/PR100266]
  2021-04-26 12:45 [PATCH 00/10] [RISC-V] Atomics improvements [PR100265/PR100266] Christoph Muellner
                   ` (9 preceding siblings ...)
  2021-04-26 12:45 ` [PATCH 10/10] RISC-V: Provide programmatic implementation of CAS " Christoph Muellner
@ 2021-04-28 22:40 ` Jim Wilson
  10 siblings, 0 replies; 15+ messages in thread
From: Jim Wilson @ 2021-04-28 22:40 UTC (permalink / raw)
  To: Christoph Muellner; +Cc: GCC Patches, Kito Cheng

On Mon, Apr 26, 2021 at 5:45 AM Christoph Muellner <cmuellner@gcc.gnu.org>
wrote:

> This series provides a cleanup of the current atomics implementation
> of RISC-V:
>

This looks OK to me, other than the issue with address instructions emitted
inside the lr/sc loop.  That could be fixed with a follow up patch though.

amoswap is probably faster than a fence followed by a store, but they
aren't exactly the same operation, since I think the amoswap only provides
consistency for the target address whereas the fence would provide
consistency for all addresses.  It does look odd that we were using amoswap
here.  The inconsistency with atomic_load might be a problem.  I'm OK with
dropping this.

I'm not sure if we are implementing the __sync_* builtins properly.  Most
of them are defined as full barriers.  With our weak memory model RVWMO,
that implies that we need a fence along with an amo instruction.  However,
this is broken without your patch, and not made any worse by your patch, so
it isn't an issue with your patch.  Just a FYI.  See for instance the
discussion in PR 65697 and the patch to fix this for aarch64.
https://gcc.gnu.org/git/?p=gcc.git;a=blobdiff;f=gcc/config/aarch64/aarch64.c;h=93bea074d6831b2aea2c0a40dacddc9ad20788c7;hp=648a548e0e06d2968fb74e4b588c368db060ad74;hb=f70fb3b635f9618c6d2ee3848ba836914f7951c2;hpb=fc65eccabc6d6e881ff5efcd674aa3791cf8cee6

Jim

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 09/10] RISC-V: Generate helpers for cbranch4 [PR 100266]
  2021-04-26 14:39   ` Kito Cheng
@ 2021-05-05 19:26     ` Christoph Müllner
  0 siblings, 0 replies; 15+ messages in thread
From: Christoph Müllner @ 2021-05-05 19:26 UTC (permalink / raw)
  To: Kito Cheng; +Cc: GCC Patches, Jim Wilson

On Mon, Apr 26, 2021 at 4:40 PM Kito Cheng <kito.cheng@sifive.com> wrote:
>
> This patch is a good and simple improvement which could be an independent patch.
>
> There is only 1 comment from me for this patch, could you also add @
> to cbranch pattern for floating mode, I would prefer make the
> gen_cbranch4 could handle floating mode for consistency.

Did that and I also found one more code location, where which could be
simplified.
Patch can be found here:
  https://gcc.gnu.org/pipermail/gcc-patches/2021-May/569689.html

>
> So feel free to commit this patch once you have addressed my comment.
>
>
>
> On Mon, Apr 26, 2021 at 8:46 PM Christoph Muellner
> <cmuellner@gcc.gnu.org> wrote:
> >
> > On RISC-V we are facing the fact, that our conditional branches
> > require Pmode conditions. Currently, we generate them explicitly
> > with a check for Pmode and then calling the proper generator
> > (i.e. gen_cbranchdi4 on RV64 and gen_cbranchsi4 on RV32).
> > Let's make simplify this code by using gen_cbranch4 (Pmode).
> >
> >     gcc/
> >         PR 100266
> >         * config/rsicv/riscv.c (riscv_block_move_loop): Simplify.
> >         * config/rsicv/riscv.md (cbranch<mode>4): Generate helpers.
> > ---
> >  gcc/config/riscv/riscv.c  | 5 +----
> >  gcc/config/riscv/riscv.md | 2 +-
> >  2 files changed, 2 insertions(+), 5 deletions(-)
> >
> > diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
> > index 87cdde73ae21..6e97b38db6db 100644
> > --- a/gcc/config/riscv/riscv.c
> > +++ b/gcc/config/riscv/riscv.c
> > @@ -3250,10 +3250,7 @@ riscv_block_move_loop (rtx dest, rtx src, unsigned HOST_WIDE_INT length,
> >
> >    /* Emit the loop condition.  */
> >    test = gen_rtx_NE (VOIDmode, src_reg, final_src);
> > -  if (Pmode == DImode)
> > -    emit_jump_insn (gen_cbranchdi4 (test, src_reg, final_src, label));
> > -  else
> > -    emit_jump_insn (gen_cbranchsi4 (test, src_reg, final_src, label));
> > +  emit_jump_insn (gen_cbranch4 (Pmode, test, src_reg, final_src, label));
> >
> >    /* Mop up any left-over bytes.  */
> >    if (leftover)
> > diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> > index c3687d57047b..52f8a321ac23 100644
> > --- a/gcc/config/riscv/riscv.md
> > +++ b/gcc/config/riscv/riscv.md
> > @@ -1908,7 +1908,7 @@
> >                       (label_ref (match_operand 1))
> >                       (pc)))])
> >
> > -(define_expand "cbranch<mode>4"
> > +(define_expand "@cbranch<mode>4"
> >    [(set (pc)
> >         (if_then_else (match_operator 0 "comparison_operator"
> >                       [(match_operand:BR 1 "register_operand")
> > --
> > 2.31.1
> >

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2021-05-05 19:26 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-26 12:45 [PATCH 00/10] [RISC-V] Atomics improvements [PR100265/PR100266] Christoph Muellner
2021-04-26 12:45 ` [PATCH 01/10] RISC-V: Simplify memory model code [PR 100265] Christoph Muellner
2021-04-26 12:45 ` [PATCH 02/10] RISC-V: Emit proper memory ordering suffixes for AMOs " Christoph Muellner
2021-04-26 12:45 ` [PATCH 03/10] RISC-V: Eliminate %F specifier from riscv_print_operand() " Christoph Muellner
2021-04-26 12:45 ` [PATCH 04/10] RISC-V: Don't use amoswap for atomic stores " Christoph Muellner
2021-04-26 12:45 ` [PATCH 05/10] RISC-V: Emit fences according to chosen memory model " Christoph Muellner
2021-04-26 12:45 ` [PATCH 06/10] RISC-V: Implement atomic_{load,store} " Christoph Muellner
2021-04-26 12:45 ` [PATCH 07/10] RISC-V: Model INSNs for LR and SC [PR 100266] Christoph Muellner
2021-04-26 12:45 ` [PATCH 08/10] RISC-V: Add s.ext-consuming " Christoph Muellner
2021-04-26 12:45 ` [PATCH 09/10] RISC-V: Generate helpers for cbranch4 " Christoph Muellner
2021-04-26 14:39   ` Kito Cheng
2021-05-05 19:26     ` Christoph Müllner
2021-04-26 12:45 ` [PATCH 10/10] RISC-V: Provide programmatic implementation of CAS " Christoph Muellner
2021-04-27 15:17   ` Jim Wilson
2021-04-28 22:40 ` [PATCH 00/10] [RISC-V] Atomics improvements [PR100265/PR100266] Jim Wilson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).