public inbox for gcc-patches@gcc.gnu.org
* [PATCH 0/3] RISC-V: Add memset-zero expansion with Zicboz+Zic64b
@ 2024-05-08  5:38 Christoph Müllner
  2024-05-08  5:38 ` [PATCH 1/3] expr: Export clear_by_pieces() Christoph Müllner
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Christoph Müllner @ 2024-05-08  5:38 UTC (permalink / raw)
  To: gcc-patches, Kito Cheng, Jim Wilson, Palmer Dabbelt,
	Andrew Waterman, Philipp Tomsich, Jeff Law, Vineet Gupta
  Cc: Christoph Müllner

I mentioned this patchset a few weeks ago in the RISC-V call.
I am sending it now that the release is out.

Christoph Müllner (3):
  expr: Export clear_by_pieces()
  RISC-V: testsuite: Make cmo tests LTO safe
  RISC-V: Add memset-zero expansion to cbo.zero

 gcc/config/riscv/riscv-protos.h               |  1 +
 gcc/config/riscv/riscv-string.cc              | 59 +++++++++++++++++++
 gcc/config/riscv/riscv.md                     | 24 ++++++++
 gcc/expr.cc                                   |  6 +-
 gcc/expr.h                                    |  5 ++
 gcc/testsuite/gcc.target/riscv/cmo-zicbom-1.c |  6 +-
 gcc/testsuite/gcc.target/riscv/cmo-zicbom-2.c |  6 +-
 gcc/testsuite/gcc.target/riscv/cmo-zicbop-1.c |  6 +-
 gcc/testsuite/gcc.target/riscv/cmo-zicbop-2.c |  6 +-
 gcc/testsuite/gcc.target/riscv/cmo-zicboz-1.c |  2 +-
 gcc/testsuite/gcc.target/riscv/cmo-zicboz-2.c |  2 +-
 .../gcc.target/riscv/cmo-zicboz-zic64-1.c     | 43 ++++++++++++++
 12 files changed, 147 insertions(+), 19 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/cmo-zicboz-zic64-1.c

-- 
2.44.0



* [PATCH 1/3] expr: Export clear_by_pieces()
  2024-05-08  5:38 [PATCH 0/3] RISC-V: Add memset-zero expansion with Zicboz+Zic64b Christoph Müllner
@ 2024-05-08  5:38 ` Christoph Müllner
  2024-05-09 18:23   ` Jeff Law
  2024-05-14 15:23   ` Jeff Law
  2024-05-08  5:38 ` [PATCH 2/3] RISC-V: testsuite: Make cmo tests LTO safe Christoph Müllner
  2024-05-08  5:38 ` [PATCH 3/3] RISC-V: Add memset-zero expansion to cbo.zero Christoph Müllner
  2 siblings, 2 replies; 8+ messages in thread
From: Christoph Müllner @ 2024-05-08  5:38 UTC (permalink / raw)
  To: gcc-patches, Kito Cheng, Jim Wilson, Palmer Dabbelt,
	Andrew Waterman, Philipp Tomsich, Jeff Law, Vineet Gupta
  Cc: Christoph Müllner

Make clear_by_pieces() available to other parts of the compiler,
similar to store_by_pieces().

gcc/ChangeLog:

	* expr.cc (clear_by_pieces): Remove static from clear_by_pieces.
	* expr.h (clear_by_pieces): Add prototype for clear_by_pieces.

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
---
 gcc/expr.cc | 6 +-----
 gcc/expr.h  | 5 +++++
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/gcc/expr.cc b/gcc/expr.cc
index d4414e242cb..eaf86d3d842 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -85,7 +85,6 @@ static void emit_block_move_via_sized_loop (rtx, rtx, rtx, unsigned, unsigned);
 static void emit_block_move_via_oriented_loop (rtx, rtx, rtx, unsigned, unsigned);
 static rtx emit_block_cmp_via_loop (rtx, rtx, rtx, tree, rtx, bool,
 				    unsigned, unsigned);
-static void clear_by_pieces (rtx, unsigned HOST_WIDE_INT, unsigned int);
 static rtx_insn *compress_float_constant (rtx, rtx);
 static rtx get_subtarget (rtx);
 static rtx store_field (rtx, poly_int64, poly_int64, poly_uint64, poly_uint64,
@@ -1832,10 +1831,7 @@ store_by_pieces (rtx to, unsigned HOST_WIDE_INT len,
     return to;
 }
 
-/* Generate several move instructions to clear LEN bytes of block TO.  (A MEM
-   rtx with BLKmode).  ALIGN is maximum alignment we can assume.  */
-
-static void
+void
 clear_by_pieces (rtx to, unsigned HOST_WIDE_INT len, unsigned int align)
 {
   if (len == 0)
diff --git a/gcc/expr.h b/gcc/expr.h
index 64956f63029..75181584108 100644
--- a/gcc/expr.h
+++ b/gcc/expr.h
@@ -245,6 +245,11 @@ extern bool can_store_by_pieces (unsigned HOST_WIDE_INT,
 extern rtx store_by_pieces (rtx, unsigned HOST_WIDE_INT, by_pieces_constfn,
 			    void *, unsigned int, bool, memop_ret);
 
+/* Generate several move instructions to clear LEN bytes of block TO.  (A MEM
+   rtx with BLKmode).  ALIGN is maximum alignment we can assume.  */
+
+extern void clear_by_pieces (rtx, unsigned HOST_WIDE_INT, unsigned int);
+
 /* If can_store_by_pieces passes for worst-case values near MAX_LEN, call
    store_by_pieces within conditionals so as to handle variable LEN efficiently,
    storing VAL, if non-NULL_RTX, or valc instead.  */
-- 
2.44.0



* [PATCH 2/3] RISC-V: testsuite: Make cmo tests LTO safe
  2024-05-08  5:38 [PATCH 0/3] RISC-V: Add memset-zero expansion with Zicboz+Zic64b Christoph Müllner
  2024-05-08  5:38 ` [PATCH 1/3] expr: Export clear_by_pieces() Christoph Müllner
@ 2024-05-08  5:38 ` Christoph Müllner
  2024-05-09 18:23   ` Jeff Law
  2024-05-08  5:38 ` [PATCH 3/3] RISC-V: Add memset-zero expansion to cbo.zero Christoph Müllner
  2 siblings, 1 reply; 8+ messages in thread
From: Christoph Müllner @ 2024-05-08  5:38 UTC (permalink / raw)
  To: gcc-patches, Kito Cheng, Jim Wilson, Palmer Dabbelt,
	Andrew Waterman, Philipp Tomsich, Jeff Law, Vineet Gupta
  Cc: Christoph Müllner

Let's add '\t' to the instruction match pattern to avoid false positive
matches when compiling with -flto.
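
A minimal sketch of why the trailing tab helps. The scan patterns are
regular expressions, so the '.' in "cbo.clean" matches any character, not
only a literal dot. The sample text below is hypothetical: it assumes that
the extra output produced under -flto contains the builtin name as a string,
which this message does not state. The helper count_matches() is
illustrative only and handles nothing beyond the '.' wildcard.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical assembler output: one real instruction plus a string that
   merely mentions the builtin name (assumed LTO-style residue).  */
static const char sample_asm[] =
  "\tcbo.clean\t0(a5)\n"
  "\t.string\t\"__builtin_riscv_zicbom_cbo_clean\"\n";

/* Count non-overlapping matches of PATTERN in TEXT.  A '.' in PATTERN
   matches any single character; every other character matches itself.  */
static unsigned
count_matches (const char *text, const char *pattern)
{
  assert (text != NULL && pattern != NULL);

  size_t tlen = strlen (text);
  size_t plen = strlen (pattern);
  unsigned count = 0;

  if (plen == 0)
    return 0;

  for (size_t i = 0; i + plen <= tlen; )
    {
      size_t j = 0;
      while (j < plen && (pattern[j] == '.' || pattern[j] == text[i + j]))
	j++;

      if (j == plen)
	{
	  count++;
	  i += plen;
	}
      else
	i++;
    }

  return count;
}
```

With this sample, count_matches (sample_asm, "cbo.clean") counts both the
instruction and the "cbo_clean" part of the builtin name, while
count_matches (sample_asm, "cbo.clean\t") counts only the instruction,
because only the mnemonic is followed by a tab.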

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/cmo-zicbom-1.c: Add \t to test pattern.
	* gcc.target/riscv/cmo-zicbom-2.c: Likewise.
	* gcc.target/riscv/cmo-zicbop-1.c: Likewise.
	* gcc.target/riscv/cmo-zicbop-2.c: Likewise.
	* gcc.target/riscv/cmo-zicboz-1.c: Likewise.
	* gcc.target/riscv/cmo-zicboz-2.c: Likewise.

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
---
 gcc/testsuite/gcc.target/riscv/cmo-zicbom-1.c | 6 +++---
 gcc/testsuite/gcc.target/riscv/cmo-zicbom-2.c | 6 +++---
 gcc/testsuite/gcc.target/riscv/cmo-zicbop-1.c | 6 +++---
 gcc/testsuite/gcc.target/riscv/cmo-zicbop-2.c | 6 +++---
 gcc/testsuite/gcc.target/riscv/cmo-zicboz-1.c | 2 +-
 gcc/testsuite/gcc.target/riscv/cmo-zicboz-2.c | 2 +-
 6 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/gcc/testsuite/gcc.target/riscv/cmo-zicbom-1.c b/gcc/testsuite/gcc.target/riscv/cmo-zicbom-1.c
index 6341f7874d3..02c38e201fa 100644
--- a/gcc/testsuite/gcc.target/riscv/cmo-zicbom-1.c
+++ b/gcc/testsuite/gcc.target/riscv/cmo-zicbom-1.c
@@ -24,6 +24,6 @@ void foo3()
     __builtin_riscv_zicbom_cbo_inval((void*)0x111);
 }
 
-/* { dg-final { scan-assembler-times "cbo.clean" 3 } } */
-/* { dg-final { scan-assembler-times "cbo.flush" 3 } } */
-/* { dg-final { scan-assembler-times "cbo.inval" 3 } } */
+/* { dg-final { scan-assembler-times "cbo.clean\t" 3 } } */
+/* { dg-final { scan-assembler-times "cbo.flush\t" 3 } } */
+/* { dg-final { scan-assembler-times "cbo.inval\t" 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/cmo-zicbom-2.c b/gcc/testsuite/gcc.target/riscv/cmo-zicbom-2.c
index a04f106c8b0..040b96952bc 100644
--- a/gcc/testsuite/gcc.target/riscv/cmo-zicbom-2.c
+++ b/gcc/testsuite/gcc.target/riscv/cmo-zicbom-2.c
@@ -24,6 +24,6 @@ void foo3()
     __builtin_riscv_zicbom_cbo_inval((void*)0x111);
 }
 
-/* { dg-final { scan-assembler-times "cbo.clean" 3 } } */
-/* { dg-final { scan-assembler-times "cbo.flush" 3 } } */
-/* { dg-final { scan-assembler-times "cbo.inval" 3 } } */
+/* { dg-final { scan-assembler-times "cbo.clean\t" 3 } } */
+/* { dg-final { scan-assembler-times "cbo.flush\t" 3 } } */
+/* { dg-final { scan-assembler-times "cbo.inval\t" 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/cmo-zicbop-1.c b/gcc/testsuite/gcc.target/riscv/cmo-zicbop-1.c
index c5d78c1763d..97181154d85 100644
--- a/gcc/testsuite/gcc.target/riscv/cmo-zicbop-1.c
+++ b/gcc/testsuite/gcc.target/riscv/cmo-zicbop-1.c
@@ -18,6 +18,6 @@ int foo1()
   return __builtin_riscv_zicbop_cbo_prefetchi(1);
 }
 
-/* { dg-final { scan-assembler-times "prefetch.i" 1 } } */
-/* { dg-final { scan-assembler-times "prefetch.r" 4 } } */
-/* { dg-final { scan-assembler-times "prefetch.w" 4 } } */
+/* { dg-final { scan-assembler-times "prefetch.i\t" 1 } } */
+/* { dg-final { scan-assembler-times "prefetch.r\t" 4 } } */
+/* { dg-final { scan-assembler-times "prefetch.w\t" 4 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/cmo-zicbop-2.c b/gcc/testsuite/gcc.target/riscv/cmo-zicbop-2.c
index 6576365b39c..4871a97b21a 100644
--- a/gcc/testsuite/gcc.target/riscv/cmo-zicbop-2.c
+++ b/gcc/testsuite/gcc.target/riscv/cmo-zicbop-2.c
@@ -18,6 +18,6 @@ int foo1()
   return __builtin_riscv_zicbop_cbo_prefetchi(1);
 }
 
-/* { dg-final { scan-assembler-times "prefetch.i" 1 } } */
-/* { dg-final { scan-assembler-times "prefetch.r" 4 } } */
-/* { dg-final { scan-assembler-times "prefetch.w" 4 } } */ 
+/* { dg-final { scan-assembler-times "prefetch.i\t" 1 } } */
+/* { dg-final { scan-assembler-times "prefetch.r\t" 4 } } */
+/* { dg-final { scan-assembler-times "prefetch.w\t" 4 } } */ 
diff --git a/gcc/testsuite/gcc.target/riscv/cmo-zicboz-1.c b/gcc/testsuite/gcc.target/riscv/cmo-zicboz-1.c
index 5eb78ab94b5..63b8782bf89 100644
--- a/gcc/testsuite/gcc.target/riscv/cmo-zicboz-1.c
+++ b/gcc/testsuite/gcc.target/riscv/cmo-zicboz-1.c
@@ -10,4 +10,4 @@ void foo1()
     __builtin_riscv_zicboz_cbo_zero((void*)0x121);
 }
 
-/* { dg-final { scan-assembler-times "cbo.zero" 3 } } */ 
+/* { dg-final { scan-assembler-times "cbo.zero\t" 3 } } */ 
diff --git a/gcc/testsuite/gcc.target/riscv/cmo-zicboz-2.c b/gcc/testsuite/gcc.target/riscv/cmo-zicboz-2.c
index fdc9c719669..cc3bd505ec0 100644
--- a/gcc/testsuite/gcc.target/riscv/cmo-zicboz-2.c
+++ b/gcc/testsuite/gcc.target/riscv/cmo-zicboz-2.c
@@ -10,4 +10,4 @@ void foo1()
     __builtin_riscv_zicboz_cbo_zero((void*)0x121);
 }
 
-/* { dg-final { scan-assembler-times "cbo.zero" 3 } } */ 
+/* { dg-final { scan-assembler-times "cbo.zero\t" 3 } } */ 
-- 
2.44.0



* [PATCH 3/3] RISC-V: Add memset-zero expansion to cbo.zero
  2024-05-08  5:38 [PATCH 0/3] RISC-V: Add memset-zero expansion with Zicboz+Zic64b Christoph Müllner
  2024-05-08  5:38 ` [PATCH 1/3] expr: Export clear_by_pieces() Christoph Müllner
  2024-05-08  5:38 ` [PATCH 2/3] RISC-V: testsuite: Make cmo tests LTO safe Christoph Müllner
@ 2024-05-08  5:38 ` Christoph Müllner
  2024-05-09 18:40   ` Jeff Law
  2 siblings, 1 reply; 8+ messages in thread
From: Christoph Müllner @ 2024-05-08  5:38 UTC (permalink / raw)
  To: gcc-patches, Kito Cheng, Jim Wilson, Palmer Dabbelt,
	Andrew Waterman, Philipp Tomsich, Jeff Law, Vineet Gupta
  Cc: Christoph Müllner

The Zicboz extension offers the cbo.zero instruction, which zeroes
the memory region corresponding to one cache block.
The Zic64b extension fixes the cache block size at 64 bytes.
If both extensions are available, cbo.zero can be used to clear
memory, provided the alignment and size constraints are met.
This patch implements that expansion.
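
A minimal sketch of the constraints, written as a standalone planning
function rather than as the RTL expander. The names plan_block_clear and
struct clear_plan are made up for illustration, and max_bytes stands in for
the straight-line move-bytes limit; the value 768 used in the note below is
an assumption, not something taken from the target headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CBO_BYTES 64ul	/* Cache block size guaranteed by Zic64b.  */

/* Illustrative result type: how a clear of LENGTH bytes is split.  */
struct clear_plan
{
  unsigned long blocks;		/* Number of cbo.zero instructions.  */
  unsigned long tail_bytes;	/* Remainder left for ordinary stores.  */
};

/* Return false if the expansion would be rejected, i.e. the region is
   shorter than one cache block, not aligned to a cache block, or larger
   than MAX_BYTES (a stand-in for the no-loops size limit).  */
static bool
plan_block_clear (unsigned long length, unsigned long align_bytes,
		  unsigned long max_bytes, struct clear_plan *plan)
{
  assert (plan != NULL);

  if (length < CBO_BYTES || align_bytes < CBO_BYTES || length > max_bytes)
    return false;

  plan->blocks = length / CBO_BYTES;
  plan->tail_bytes = length % CBO_BYTES;
  return true;
}
```

For a 64-byte-aligned clear of 123 bytes with an assumed limit of 768, this
gives one cbo.zero and a 59-byte tail, which corresponds to the seven "sd",
one "sh" and one "sb" expected by the new test. A clear of 128 bytes gives
two cbo.zero instructions and no tail.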

gcc/ChangeLog:

	* config/riscv/riscv-protos.h (riscv_expand_block_clear): New prototype.
	* config/riscv/riscv-string.cc (riscv_expand_block_clear_zicboz_zic64b):
	New function to expand a block-clear with cbo.zero.
	(riscv_expand_block_clear): New RISC-V block-clear expansion function.
	* config/riscv/riscv.md (setmem<mode>): New setmem expansion.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/cmo-zicboz-zic64-1.c: New test.

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
---
 gcc/config/riscv/riscv-protos.h               |  1 +
 gcc/config/riscv/riscv-string.cc              | 59 +++++++++++++++++++
 gcc/config/riscv/riscv.md                     | 24 ++++++++
 .../gcc.target/riscv/cmo-zicboz-zic64-1.c     | 43 ++++++++++++++
 4 files changed, 127 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/cmo-zicboz-zic64-1.c

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index e5aebf3fc3d..255fd6a0de9 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -189,6 +189,7 @@ rtl_opt_pass * make_pass_vsetvl (gcc::context *ctxt);
 
 /* Routines implemented in riscv-string.c.  */
 extern bool riscv_expand_block_move (rtx, rtx, rtx);
+extern bool riscv_expand_block_clear (rtx, rtx);
 
 /* Information about one CPU we know about.  */
 struct riscv_cpu_info {
diff --git a/gcc/config/riscv/riscv-string.cc b/gcc/config/riscv/riscv-string.cc
index b09b51d7526..cf92256bc4e 100644
--- a/gcc/config/riscv/riscv-string.cc
+++ b/gcc/config/riscv/riscv-string.cc
@@ -787,6 +787,65 @@ riscv_expand_block_move (rtx dest, rtx src, rtx length)
   return false;
 }
 
+/* Expand a block-clear instruction via cbo.zero instructions.  */
+
+static bool
+riscv_expand_block_clear_zicboz_zic64b (rtx dest, rtx length)
+{
+  unsigned HOST_WIDE_INT hwi_length;
+  unsigned HOST_WIDE_INT align;
+  const unsigned HOST_WIDE_INT cbo_bytes = 64;
+
+  gcc_assert (TARGET_ZICBOZ && TARGET_ZIC64B);
+
+  if (!CONST_INT_P (length))
+    return false;
+
+  hwi_length = UINTVAL (length);
+  if (hwi_length < cbo_bytes)
+    return false;
+
+  align = MEM_ALIGN (dest) / BITS_PER_UNIT;
+  if (align < cbo_bytes)
+    return false;
+
+  /* We don't emit loops.  Instead apply move-bytes limitation.  */
+  unsigned HOST_WIDE_INT max_bytes = RISCV_MAX_MOVE_BYTES_STRAIGHT /
+	  UNITS_PER_WORD * cbo_bytes;
+  if (hwi_length > max_bytes)
+    return false;
+
+  unsigned HOST_WIDE_INT offset = 0;
+  while (offset + cbo_bytes <= hwi_length)
+    {
+      rtx mem = adjust_address (dest, BLKmode, offset);
+      rtx addr = force_reg (Pmode, XEXP (mem, 0));
+      emit_insn (gen_riscv_zero_di (addr));
+      offset += cbo_bytes;
+    }
+
+  if (offset < hwi_length)
+    {
+      rtx mem = adjust_address (dest, BLKmode, offset);
+      clear_by_pieces (mem, hwi_length - offset, align);
+    }
+
+  return true;
+}
+
+bool
+riscv_expand_block_clear (rtx dest, rtx length)
+{
+  /* Only use setmem-zero expansion for Zicboz + Zic64b.  */
+  if (!TARGET_ZICBOZ || !TARGET_ZIC64B)
+    return false;
+
+  if (optimize_function_for_size_p (cfun))
+    return false;
+
+  return riscv_expand_block_clear_zicboz_zic64b (dest, length);
+}
+
 /* --- Vector expanders --- */
 
 namespace riscv_vector {
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index d4676507b45..729c102812c 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -2598,6 +2598,30 @@ (define_expand "cpymem<mode>"
     FAIL;
 })
 
+;; Fill memory with constant byte.
+;; Argument 0 is the destination
+;; Argument 1 is the length
+;; Argument 2 is the constant byte
+;; Argument 3 is the alignment
+
+(define_expand "setmem<mode>"
+  [(parallel [(set (match_operand:BLK 0 "memory_operand")
+		   (match_operand:QI 2 "const_int_operand"))
+	      (use (match_operand:P 1 ""))
+	      (use (match_operand:SI 3 "const_int_operand"))])]
+ ""
+ {
+  /* If value to set is not zero, use the library routine.  */
+  if (operands[2] != const0_rtx)
+    FAIL;
+
+  if (riscv_expand_block_clear (operands[0], operands[1]))
+    DONE;
+  else
+    FAIL;
+})
+
+
 ;; Expand in-line code to clear the instruction cache between operand[0] and
 ;; operand[1].
 (define_expand "clear_cache"
diff --git a/gcc/testsuite/gcc.target/riscv/cmo-zicboz-zic64-1.c b/gcc/testsuite/gcc.target/riscv/cmo-zicboz-zic64-1.c
new file mode 100644
index 00000000000..c2d79eb7ae6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/cmo-zicboz-zic64-1.c
@@ -0,0 +1,43 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zic64b_zicboz" { target { rv64 } } } */
+/* { dg-options "-march=rv32gc_zic64b_zicboz" { target { rv32 } } } */
+/* { dg-skip-if "" { *-*-* } {"-O0" "-Os" "-Og" "-Oz" "-flto" } } */
+/* { dg-final { check-function-bodies "**" "" } } */
+/* { dg-allow-blank-lines-in-output 1 } */
+
+/*
+**clear_buf_123:
+**    ...
+**    cbo\.zero\t0\(a[0-9]+\)
+**    sd\tzero,64\(a[0-9]+\)
+**    sd\tzero,72\(a[0-9]+\)
+**    sd\tzero,80\(a[0-9]+\)
+**    sd\tzero,88\(a[0-9]+\)
+**    sd\tzero,96\(a[0-9]+\)
+**    sd\tzero,104\(a[0-9]+\)
+**    sd\tzero,112\(a[0-9]+\)
+**    sh\tzero,120\(a[0-9]+\)
+**    sb\tzero,122\(a[0-9]+\)
+**    ...
+*/
+void
+clear_buf_123 (void *p)
+{
+  p = __builtin_assume_aligned(p, 64);
+  __builtin_memset (p, 0, 123);
+}
+
+/*
+**clear_buf_128:
+**    ...
+**    cbo\.zero\t0\(a[0-9]+\)
+**    addi\ta[0-9]+,a[0-9]+,64
+**    cbo\.zero\t0\(a[0-9]+\)
+**    ...
+*/
+void
+clear_buf_128 (void *p)
+{
+  p = __builtin_assume_aligned(p, 64);
+  __builtin_memset (p, 0, 128);
+}
-- 
2.44.0



* Re: [PATCH 1/3] expr: Export clear_by_pieces()
  2024-05-08  5:38 ` [PATCH 1/3] expr: Export clear_by_pieces() Christoph Müllner
@ 2024-05-09 18:23   ` Jeff Law
  2024-05-14 15:23   ` Jeff Law
  1 sibling, 0 replies; 8+ messages in thread
From: Jeff Law @ 2024-05-09 18:23 UTC (permalink / raw)
  To: Christoph Müllner, gcc-patches, Kito Cheng, Jim Wilson,
	Palmer Dabbelt, Andrew Waterman, Philipp Tomsich, Vineet Gupta



On 5/7/24 11:38 PM, Christoph Müllner wrote:
> Make clear_by_pieces() available to other parts of the compiler,
> similar to store_by_pieces().
> 
> gcc/ChangeLog:
> 
> 	* expr.cc (clear_by_pieces): Remove static from clear_by_pieces.
> 	* expr.h (clear_by_pieces): Add prototype for clear_by_pieces.
OK
jeff



* Re: [PATCH 2/3] RISC-V: testsuite: Make cmo tests LTO safe
  2024-05-08  5:38 ` [PATCH 2/3] RISC-V: testsuite: Make cmo tests LTO safe Christoph Müllner
@ 2024-05-09 18:23   ` Jeff Law
  0 siblings, 0 replies; 8+ messages in thread
From: Jeff Law @ 2024-05-09 18:23 UTC (permalink / raw)
  To: Christoph Müllner, gcc-patches, Kito Cheng, Jim Wilson,
	Palmer Dabbelt, Andrew Waterman, Philipp Tomsich, Vineet Gupta



On 5/7/24 11:38 PM, Christoph Müllner wrote:
> Let's add '\t' to the instruction match pattern to avoid false positive
> matches when compiling with -flto.
> 
> gcc/testsuite/ChangeLog:
> 
> 	* gcc.target/riscv/cmo-zicbom-1.c: Add \t to test pattern.
> 	* gcc.target/riscv/cmo-zicbom-2.c: Likewise.
> 	* gcc.target/riscv/cmo-zicbop-1.c: Likewise.
> 	* gcc.target/riscv/cmo-zicbop-2.c: Likewise.
> 	* gcc.target/riscv/cmo-zicboz-1.c: Likewise.
> 	* gcc.target/riscv/cmo-zicboz-2.c: Likewise.
OK
jeff



* Re: [PATCH 3/3] RISC-V: Add memset-zero expansion to cbo.zero
  2024-05-08  5:38 ` [PATCH 3/3] RISC-V: Add memset-zero expansion to cbo.zero Christoph Müllner
@ 2024-05-09 18:40   ` Jeff Law
  0 siblings, 0 replies; 8+ messages in thread
From: Jeff Law @ 2024-05-09 18:40 UTC (permalink / raw)
  To: Christoph Müllner, gcc-patches, Kito Cheng, Jim Wilson,
	Palmer Dabbelt, Andrew Waterman, Philipp Tomsich, Vineet Gupta



On 5/7/24 11:38 PM, Christoph Müllner wrote:
> The Zicboz extension offers the cbo.zero instruction, which zeroes
> the memory region corresponding to one cache block.
> The Zic64b extension fixes the cache block size at 64 bytes.
> If both extensions are available, cbo.zero can be used to clear
> memory, provided the alignment and size constraints are met.
> This patch implements that expansion.
> 
> gcc/ChangeLog:
> 
> 	* config/riscv/riscv-protos.h (riscv_expand_block_clear): New prototype.
> 	* config/riscv/riscv-string.cc (riscv_expand_block_clear_zicboz_zic64b):
> 	New function to expand a block-clear with cbo.zero.
> 	(riscv_expand_block_clear): New RISC-V block-clear expansion function.
> 	* config/riscv/riscv.md (setmem<mode>): New setmem expansion.
> 
> gcc/testsuite/ChangeLog:
> 
> 	* gcc.target/riscv/cmo-zicboz-zic64-1.c: New test.
Depending on the underlying uarch details cbo.zero may not be nearly as 
useful as it might first appear.  There can be multiple uarch details 
that come into play.  We've done a fair amount of measurement internally 
in this space and while cbo.zero is a win, it's not a huge win.  Point 
being we may need to come back and make this part of the tuning structure 
so uarchs can adjust.

--


I know in the cbo memset implementation VRULL provided to Ventana you 
used the trick of allowing overlapping stores to avoid the alignment 
requirements.  I.e., we issue a series of "sd" instructions to ensure we 
cross the alignment barrier, then a series of cbo.zero instructions for 
the cache lines (possibly overlapping the locations stored by those "sd" 
instructions), then handle residuals, which may overlap the last cbo.zero 
instructions.
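
A minimal sketch of that overlapping-store idea in plain C, as a model of
the store sequence and not the implementation referred to above. All names
are made up for illustration. store_word_zero() stands in for an "sd zero"
that tolerates unaligned addresses, and cbo_zero() stands in for cbo.zero
applied to a 64-byte-aligned address; both are modelled with memset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CBO_BYTES 64u	/* Cache block size (Zic64b).  */
#define WORD_BYTES 8u	/* Width of one "sd" store.  */

/* Stand-in for an unaligned-capable "sd zero".  */
static void
store_word_zero (unsigned char *p)
{
  memset (p, 0, WORD_BYTES);
}

/* Stand-in for cbo.zero; only ever called on block-aligned addresses.  */
static void
cbo_zero (unsigned char *p)
{
  memset (p, 0, CBO_BYTES);
}

static void
clear_with_overlap (unsigned char *dst, size_t len)
{
  assert (dst != NULL || len == 0);

  if (len < WORD_BYTES)
    {
      /* Too short for the overlap trick; clear byte-wise.  */
      if (len > 0)
	memset (dst, 0, len);
      return;
    }

  unsigned char *end = dst + len;
  /* Bytes before the first block boundary and after the last one.  */
  size_t head = (CBO_BYTES - (uintptr_t) dst % CBO_BYTES) % CBO_BYTES;
  size_t tail = (uintptr_t) end % CBO_BYTES;
  unsigned char *p = dst;

  if (head + tail + CBO_BYTES <= len)
    {
      unsigned char *first_block = dst + head;
      unsigned char *last_boundary = end - tail;

      /* Head: word stores until the boundary is crossed.  The last store
	 may spill into the first block, which is zeroed again below.  */
      for (; p < first_block; p += WORD_BYTES)
	store_word_zero (p);

      /* Body: one cbo.zero per full, aligned cache block.  */
      for (p = first_block; p < last_boundary; p += CBO_BYTES)
	cbo_zero (p);
    }

  /* Residual: forward word stores, then one final store that ends exactly
     at END and may overlap bytes that are already zero.  */
  for (; p + WORD_BYTES <= end; p += WORD_BYTES)
    store_word_zero (p);
  if (p < end)
    store_word_zero (end - WORD_BYTES);
}
```

No store in this sketch touches a byte outside [dst, dst + len); the overlap
only ever lands on bytes that are being cleared anyway, which is why the
destination no longer needs 64-byte alignment.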

I don't think you necessarily have to do that for this patch, but I 
suspect that a similar approach would make this apply much more often in 
practice.

So, OK for the trunk and consider the unaligned cases as potential 
follow-up enhancements.

Thanks
Jeff


* Re: [PATCH 1/3] expr: Export clear_by_pieces()
  2024-05-08  5:38 ` [PATCH 1/3] expr: Export clear_by_pieces() Christoph Müllner
  2024-05-09 18:23   ` Jeff Law
@ 2024-05-14 15:23   ` Jeff Law
  1 sibling, 0 replies; 8+ messages in thread
From: Jeff Law @ 2024-05-14 15:23 UTC (permalink / raw)
  To: Christoph Müllner, gcc-patches, Kito Cheng, Jim Wilson,
	Palmer Dabbelt, Andrew Waterman, Philipp Tomsich, Vineet Gupta



On 5/7/24 11:38 PM, Christoph Müllner wrote:
> Make clear_by_pieces() available to other parts of the compiler,
> similar to store_by_pieces().
> 
> gcc/ChangeLog:
> 
> 	* expr.cc (clear_by_pieces): Remove static from clear_by_pieces.
> 	* expr.h (clear_by_pieces): Add prototype for clear_by_pieces.
I'm going to push this series.  It's fully ack'd, tested and is going to 
interact with Sergei's work on vector variants of relevant patterns.

Jeff

