public inbox for gcc-patches@gcc.gnu.org
From: Robin Dapp <rdapp.gcc@gmail.com>
To: gcc-patches <gcc-patches@gcc.gnu.org>,
	palmer <palmer@dabbelt.com>, Kito Cheng <kito.cheng@gmail.com>,
	jeffreyalaw <jeffreyalaw@gmail.com>,
	"juzhe.zhong@rivai.ai" <juzhe.zhong@rivai.ai>
Cc: rdapp.gcc@gmail.com
Subject: [PATCH] RISC-V: Vectorized str(n)cmp and strlen.
Date: Thu, 30 Nov 2023 23:22:35 +0100
Message-ID: <2cf2fa3f-541b-4c39-8689-161c7a047f7a@gmail.com>

Hi,

this adds vectorized implementations of strcmp and strncmp as well as
strlen.  The strlen expansion reuses the previously implemented
rawmemchr expander.  The patch also fixes a rawmemchr bug that caused a
SPEC2017 execution failure: we advanced the source address by the
number of elements processed instead of the number of bytes, which is
only correct for byte-sized elements.
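
For illustration, the rawmemchr pointer bump can be modelled in scalar C.
This is only a rough sketch, not the RTL the expander emits: CHUNK,
first_match_u16 and rawmemchr_u16_model are hypothetical names, and the
fixed chunk of four elements merely stands in for the vector length.  The
point it tries to show is that the byte address has to advance by the
element count shifted by log2 of the element size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the number of elements one vector iteration
   inspects.  */
enum { CHUNK = 4 };

/* Return the index of the first element equal to PAT among the CHUNK
   elements starting at P, or -1 if there is none.  This loosely models a
   first-fault load followed by a find-first on the compare mask.  */
static long
first_match_u16 (const uint16_t *p, uint16_t pat)
{
  for (long i = 0; i < CHUNK; i++)
    if (p[i] == pat)
      return i;
  return -1;
}

/* Scalar model of rawmemchr for 16-bit elements.  As with rawmemchr, the
   caller must guarantee that PAT occurs in the buffer.  */
static const uint16_t *
rawmemchr_u16_model (const uint16_t *src, uint16_t pat)
{
  assert (src);

  const unsigned shift = 1; /* log2 (sizeof (uint16_t)) */
  const unsigned char *addr = (const unsigned char *) src;
  size_t cnt = 0; /* Elements consumed by the previous iteration.  */
  long end;

  do
    {
      /* Bump the byte address by elements * element size.  Adding plain
	 CNT here would under-advance for any element wider than a byte.  */
      addr += cnt << shift;
      end = first_match_u16 ((const uint16_t *) addr, pat);
      cnt = CHUNK;
    }
  while (end < 0);

  /* The match sits END elements past the start of the last chunk.  */
  return (const uint16_t *) (addr + ((size_t) end << shift));
}
```

With a buffer of twelve 16-bit values whose tenth element is the pattern,
the model should return a pointer to that tenth element; with the unscaled
bump it would rescan overlapping data and could report a match at a
misaligned address.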

The patch also changes the stringop-strategy handling slightly:
auto is now a bitmask that combines the scalar and vector strategies
(and possibly more in the future), and the expansion functions try all
matching strategies in their preferred order.
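
As a rough sketch of that dispatch, assuming vector is preferred over
scalar as in riscv_expand_strcmp: the enum values below are the ones the
patch introduces, while expansion_kind, target_model and choose_expansion
are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit values as introduced by the patch.  */
enum stringop_strategy_enum
{
  STRINGOP_STRATEGY_LIBCALL = 1,
  STRINGOP_STRATEGY_SCALAR = 2,
  STRINGOP_STRATEGY_VECTOR = 4,
  STRINGOP_STRATEGY_AUTO = STRINGOP_STRATEGY_SCALAR | STRINGOP_STRATEGY_VECTOR
};

/* Hypothetical result type for this sketch.  */
enum expansion_kind
{
  EXPANDED_NONE, /* The caller would emit a library call.  */
  EXPANDED_SCALAR,
  EXPANDED_VECTOR
};

/* Hypothetical model of the target and of the expander outcomes.  */
struct target_model
{
  bool has_vector;     /* Models TARGET_VECTOR.  */
  bool has_scalar_ext; /* Models TARGET_ZBB || TARGET_XTHEADBB.  */
  bool vector_expands; /* Would the vector expander succeed?  */
  bool scalar_expands; /* Would the scalar expander succeed?  */
};

/* Try every strategy whose bit is set in STRATEGY, vector first.  A
   failed vector attempt falls through to the scalar one.  */
static enum expansion_kind
choose_expansion (int strategy, const struct target_model *t)
{
  assert (t);

  if (t->has_vector && (strategy & STRINGOP_STRATEGY_VECTOR)
      && t->vector_expands)
    return EXPANDED_VECTOR;

  if (t->has_scalar_ext && (strategy & STRINGOP_STRATEGY_SCALAR)
      && t->scalar_expands)
    return EXPANDED_SCALAR;

  return EXPANDED_NONE;
}
```

Because auto has both bits set, it reaches the scalar path whenever the
vector attempt is unavailable or fails, whereas libcall has neither bit
set and therefore never expands inline.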

As before, str* expansion is guarded by -minline-str* and not active
by default.  This might change in the future as I would rather have
those on by default.  As of now, though, there is still a latent bug:

With -minline-strlen and -minline-strcmp we have several execution
failures in gcc.c-torture/execute/builtins/.  From my initial analysis
it looks like we don't insert a vsetvl at the right spot (which would
be right after a setjmp in those cases).  This leaves the initial
vle8ff without a proper vtype or vl causing a SIGILL.
Still, I figured I'd rather post the patch as-is so the bug can be
reproduced upstream.

Regards
 Robin

gcc/ChangeLog:

	PR target/112109

	* config/riscv/riscv-opts.h (enum riscv_stringop_strategy_enum):
	Rename.
	(enum stringop_strategy_enum): To this.
	* config/riscv/riscv-protos.h (expand_rawmemchr): Add strlen
	param.
	(expand_strcmp): Define.
	* config/riscv/riscv-string.cc (riscv_expand_strcmp):  Add
	vector version.
	(riscv_expand_strlen): Ditto.
	(riscv_expand_block_move_scalar): Handle existing scalar expansion.
	(riscv_expand_block_move): Expand to either vector or scalar
	version.
	(expand_block_move): Add stringop strategy.
	(expand_rawmemchr): Handle strlen and fix increment bug.
	(expand_strcmp): New expander.
	* config/riscv/riscv.md: Add vector.
	* config/riscv/riscv.opt: Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/builtin/strcmp-run.c: New test.
	* gcc.target/riscv/rvv/autovec/builtin/strcmp.c: New test.
	* gcc.target/riscv/rvv/autovec/builtin/strlen-run.c: New test.
	* gcc.target/riscv/rvv/autovec/builtin/strlen.c: New test.
---
 gcc/config/riscv/riscv-opts.h                 |  20 +-
 gcc/config/riscv/riscv-protos.h               |   4 +-
 gcc/config/riscv/riscv-string.cc              | 287 +++++++++++++++---
 gcc/config/riscv/riscv.md                     |  18 +-
 gcc/config/riscv/riscv.opt                    |  18 +-
 .../riscv/rvv/autovec/builtin/strcmp-run.c    |  32 ++
 .../riscv/rvv/autovec/builtin/strcmp.c        |  13 +
 .../riscv/rvv/autovec/builtin/strlen-run.c    |  37 +++
 .../riscv/rvv/autovec/builtin/strlen.c        |  12 +
 9 files changed, 363 insertions(+), 78 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/builtin/strcmp-run.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/builtin/strcmp.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/builtin/strlen-run.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/builtin/strlen.c

diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index e6e55ad7071..315f6ddb239 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -103,16 +103,16 @@ enum riscv_entity
   MAX_RISCV_ENTITIES
 };
 
-/* RISC-V stringop strategy. */
-enum riscv_stringop_strategy_enum {
-  /* Use scalar or vector instructions. */
-  USE_AUTO,
-  /* Always use a library call. */
-  USE_LIBCALL,
-  /* Only use scalar instructions. */
-  USE_SCALAR,
-  /* Only use vector instructions. */
-  USE_VECTOR
+/* RISC-V builtin strategy. */
+enum stringop_strategy_enum {
+  /* No expansion. */
+  STRINGOP_STRATEGY_LIBCALL = 1,
+  /* Use scalar expansion if possible. */
+  STRINGOP_STRATEGY_SCALAR = 2,
+  /* Use vector expansion if possible. */
+  STRINGOP_STRATEGY_VECTOR = 4,
+  /* Use any. */
+  STRINGOP_STRATEGY_AUTO = STRINGOP_STRATEGY_SCALAR | STRINGOP_STRATEGY_VECTOR
 };
 
 #define TARGET_ZICOND_LIKE (TARGET_ZICOND || (TARGET_XVENTANACONDOPS && TARGET_64BIT))
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 695ee24ad6f..51359154846 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -557,7 +557,9 @@ void expand_cond_unop (unsigned, rtx *);
 void expand_cond_binop (unsigned, rtx *);
 void expand_cond_ternop (unsigned, rtx *);
 void expand_popcount (rtx *);
-void expand_rawmemchr (machine_mode, rtx, rtx, rtx);
+void expand_rawmemchr (machine_mode, rtx, rtx, rtx, bool = false);
+bool expand_strcmp (rtx, rtx, rtx, rtx,
+		    unsigned HOST_WIDE_INT, bool);
 void emit_vec_extract (rtx, rtx, poly_int64);
 
 /* Rounding mode bitfield for fixed point VXRM.  */
diff --git a/gcc/config/riscv/riscv-string.cc b/gcc/config/riscv/riscv-string.cc
index 80e3b5981af..ce259831a5c 100644
--- a/gcc/config/riscv/riscv-string.cc
+++ b/gcc/config/riscv/riscv-string.cc
@@ -511,7 +511,16 @@ riscv_expand_strcmp (rtx result, rtx src1, rtx src2,
     return false;
   alignment = UINTVAL (align_rtx);
 
-  if (TARGET_ZBB || TARGET_XTHEADBB)
+  if (TARGET_VECTOR && stringop_strategy & STRINGOP_STRATEGY_VECTOR)
+    {
+      bool ok = riscv_vector::expand_strcmp (result, src1, src2, bytes_rtx,
+					     alignment, ncompare);
+      if (ok)
+	return true;
+    }
+
+  if ((TARGET_ZBB || TARGET_XTHEADBB)
+      && stringop_strategy & STRINGOP_STRATEGY_SCALAR)
     {
       return riscv_expand_strcmp_scalar (result, src1, src2, nbytes, alignment,
 					 ncompare);
@@ -588,9 +597,17 @@ riscv_expand_strlen_scalar (rtx result, rtx src, rtx align)
 bool
 riscv_expand_strlen (rtx result, rtx src, rtx search_char, rtx align)
 {
+  if (TARGET_VECTOR && (stringop_strategy & STRINGOP_STRATEGY_VECTOR))
+    {
+      riscv_vector::expand_rawmemchr (E_QImode, result, src, search_char,
+				      /* strlen */ true);
+      return true;
+    }
+
   gcc_assert (search_char == const0_rtx);
 
-  if (TARGET_ZBB || TARGET_XTHEADBB)
+  if ((TARGET_ZBB || TARGET_XTHEADBB)
+      && stringop_strategy & STRINGOP_STRATEGY_SCALAR)
     return riscv_expand_strlen_scalar (result, src, align);
 
   return false;
@@ -707,51 +724,68 @@ riscv_block_move_loop (rtx dest, rtx src, unsigned HOST_WIDE_INT length,
 /* Expand a cpymemsi instruction, which copies LENGTH bytes from
    memory reference SRC to memory reference DEST.  */
 
-bool
-riscv_expand_block_move (rtx dest, rtx src, rtx length)
+static bool
+riscv_expand_block_move_scalar (rtx dest, rtx src, rtx length)
 {
-  if (riscv_memcpy_strategy == USE_LIBCALL
-      || riscv_memcpy_strategy == USE_VECTOR)
+  if (!CONST_INT_P (length))
     return false;
 
-  if (CONST_INT_P (length))
-    {
-      unsigned HOST_WIDE_INT hwi_length = UINTVAL (length);
-      unsigned HOST_WIDE_INT factor, align;
+  unsigned HOST_WIDE_INT hwi_length = UINTVAL (length);
+  unsigned HOST_WIDE_INT factor, align;
 
-      align = MIN (MIN (MEM_ALIGN (src), MEM_ALIGN (dest)), BITS_PER_WORD);
-      factor = BITS_PER_WORD / align;
+  align = MIN (MIN (MEM_ALIGN (src), MEM_ALIGN (dest)), BITS_PER_WORD);
+  factor = BITS_PER_WORD / align;
 
-      if (optimize_function_for_size_p (cfun)
-	  && hwi_length * factor * UNITS_PER_WORD > MOVE_RATIO (false))
-	return false;
+  if (optimize_function_for_size_p (cfun)
+      && hwi_length * factor * UNITS_PER_WORD > MOVE_RATIO (false))
+    return false;
 
-      if (hwi_length <= (RISCV_MAX_MOVE_BYTES_STRAIGHT / factor))
+  if (hwi_length <= (RISCV_MAX_MOVE_BYTES_STRAIGHT / factor))
+    {
+      riscv_block_move_straight (dest, src, INTVAL (length));
+      return true;
+    }
+  else if (optimize && align >= BITS_PER_WORD)
+    {
+      unsigned min_iter_words
+	= RISCV_MAX_MOVE_BYTES_PER_LOOP_ITER / UNITS_PER_WORD;
+      unsigned iter_words = min_iter_words;
+      unsigned HOST_WIDE_INT bytes = hwi_length;
+      unsigned HOST_WIDE_INT words = bytes / UNITS_PER_WORD;
+
+      /* Lengthen the loop body if it shortens the tail.  */
+      for (unsigned i = min_iter_words; i < min_iter_words * 2 - 1; i++)
 	{
-	  riscv_block_move_straight (dest, src, INTVAL (length));
-	  return true;
+	  unsigned cur_cost = iter_words + words % iter_words;
+	  unsigned new_cost = i + words % i;
+	  if (new_cost <= cur_cost)
+	    iter_words = i;
 	}
-      else if (optimize && align >= BITS_PER_WORD)
-	{
-	  unsigned min_iter_words
-	    = RISCV_MAX_MOVE_BYTES_PER_LOOP_ITER / UNITS_PER_WORD;
-	  unsigned iter_words = min_iter_words;
-	  unsigned HOST_WIDE_INT bytes = hwi_length;
-	  unsigned HOST_WIDE_INT words = bytes / UNITS_PER_WORD;
-
-	  /* Lengthen the loop body if it shortens the tail.  */
-	  for (unsigned i = min_iter_words; i < min_iter_words * 2 - 1; i++)
-	    {
-	      unsigned cur_cost = iter_words + words % iter_words;
-	      unsigned new_cost = i + words % i;
-	      if (new_cost <= cur_cost)
-		iter_words = i;
-	    }
 
-	  riscv_block_move_loop (dest, src, bytes, iter_words * UNITS_PER_WORD);
-	  return true;
-	}
+      riscv_block_move_loop (dest, src, bytes, iter_words * UNITS_PER_WORD);
+      return true;
+    }
+
+  return false;
+}
+
+/* This function delegates block-move expansion to either the vector
+   implementation or the scalar one.  Return TRUE if successful or FALSE
+   otherwise.  */
+
+bool
+riscv_expand_block_move (rtx dest, rtx src, rtx length)
+{
+  if (TARGET_VECTOR && stringop_strategy & STRINGOP_STRATEGY_VECTOR)
+    {
+      bool ok = riscv_vector::expand_block_move (dest, src, length);
+      if (ok)
+	return true;
     }
+
+  if (stringop_strategy & STRINGOP_STRATEGY_SCALAR)
+    return riscv_expand_block_move_scalar (dest, src, length);
+
   return false;
 }
 
@@ -777,9 +811,6 @@ expand_block_move (rtx dst_in, rtx src_in, rtx length_in)
 	bnez a2, loop                   # Any more?
 	ret                             # Return
   */
-  if (!TARGET_VECTOR || riscv_memcpy_strategy == USE_LIBCALL
-      || riscv_memcpy_strategy == USE_SCALAR)
-    return false;
   HOST_WIDE_INT potential_ew
     = (MIN (MIN (MEM_ALIGN (src_in), MEM_ALIGN (dst_in)), BITS_PER_WORD)
        / BITS_PER_UNIT);
@@ -968,7 +999,8 @@ expand_block_move (rtx dst_in, rtx src_in, rtx length_in)
    behavior is undefined.  */
 
 void
-expand_rawmemchr (machine_mode mode, rtx dst, rtx src, rtx pat)
+expand_rawmemchr (machine_mode mode, rtx dst, rtx src, rtx pat,
+		  bool strlen)
 {
   /*
     rawmemchr:
@@ -1001,6 +1033,8 @@ expand_rawmemchr (machine_mode mode, rtx dst, rtx src, rtx pat)
   machine_mode mask_mode = riscv_vector::get_mask_mode (vmode);
 
   rtx cnt = gen_reg_rtx (Pmode);
+  emit_move_insn (cnt, CONST0_RTX (Pmode));
+
   rtx end = gen_reg_rtx (Pmode);
   rtx vec = gen_reg_rtx (vmode);
   rtx mask = gen_reg_rtx (mask_mode);
@@ -1011,12 +1045,18 @@ expand_rawmemchr (machine_mode mode, rtx dst, rtx src, rtx pat)
   unsigned int shift = exact_log2 (GET_MODE_SIZE (mode).to_constant ());
 
   rtx src_addr = copy_addr_to_reg (XEXP (src, 0));
+  rtx start_addr = copy_addr_to_reg (XEXP (src, 0));
 
   rtx loop = gen_label_rtx ();
   emit_label (loop);
 
   rtx vsrc = change_address (src, vmode, src_addr);
 
+  /* Bump the pointer.  */
+  rtx step = gen_reg_rtx (Pmode);
+  emit_insn (gen_rtx_SET (step, gen_rtx_ASHIFT (Pmode, cnt, GEN_INT (shift))));
+  emit_insn (gen_rtx_SET (src_addr, gen_rtx_PLUS (Pmode, src_addr, step)));
+
   /* Emit a first-fault load.  */
   rtx vlops[] = {vec, vsrc};
   emit_vlmax_insn (code_for_pred_fault_load (vmode),
@@ -1039,19 +1079,166 @@ expand_rawmemchr (machine_mode mode, rtx dst, rtx src, rtx pat)
   emit_nonvlmax_insn (code_for_pred_ffs (mask_mode, Pmode),
 		      riscv_vector::CPOP_OP, vfops, cnt);
 
-  /* Bump the pointer.  */
-  emit_insn (gen_rtx_SET (src_addr, gen_rtx_PLUS (Pmode, src_addr, cnt)));
-
   /* Emit the loop condition.  */
   rtx test = gen_rtx_LT (VOIDmode, end, const0_rtx);
   emit_jump_insn (gen_cbranch4 (Pmode, test, end, const0_rtx, loop));
 
-  /*  We overran by CNT, subtract it.  */
-  emit_insn (gen_rtx_SET (src_addr, gen_rtx_MINUS (Pmode, src_addr, cnt)));
-
-  /*  We found something at SRC + END * [1,2,4,8].  */
-  emit_insn (gen_rtx_SET (end, gen_rtx_ASHIFT (Pmode, end, GEN_INT (shift))));
-  emit_insn (gen_rtx_SET (dst, gen_rtx_PLUS (Pmode, src_addr, end)));
+  if (strlen)
+    {
+      /* For strlen, return the length.  */
+      emit_insn (gen_rtx_SET (dst, gen_rtx_PLUS (Pmode, src_addr, end)));
+      emit_insn (gen_rtx_SET (dst, gen_rtx_MINUS (Pmode, dst, start_addr)));
+    }
+  else
+    {
+      /*  For rawmemchr, return the position at SRC + END * [1,2,4,8].  */
+      emit_insn (gen_rtx_SET (end, gen_rtx_ASHIFT (Pmode, end, GEN_INT (shift))));
+      emit_insn (gen_rtx_SET (dst, gen_rtx_PLUS (Pmode, src_addr, end)));
+    }
 }
 
+/* Implement cmpstr<mode> using vector instructions.  */
+
+bool
+expand_strcmp (rtx result, rtx src1, rtx src2, rtx nbytes,
+	       unsigned HOST_WIDE_INT, bool)
+{
+  gcc_assert (TARGET_VECTOR);
+
+  /* We don't support big endian.  */
+  if (BYTES_BIG_ENDIAN)
+    return false;
+
+  bool with_length = nbytes != NULL_RTX;
+
+  if (with_length
+      && (!REG_P (nbytes) && !SUBREG_P (nbytes) && !CONST_INT_P (nbytes)))
+    return false;
+
+  if (with_length && CONST_INT_P (nbytes))
+    nbytes = force_reg (Pmode, nbytes);
+
+  machine_mode mode = E_QImode;
+  unsigned int isize = GET_MODE_SIZE (mode).to_constant ();
+  int lmul = TARGET_MAX_LMUL;
+  poly_int64 nunits = exact_div (BYTES_PER_RISCV_VECTOR * lmul, isize);
+
+  machine_mode vmode;
+  if (!riscv_vector::get_vector_mode (GET_MODE_INNER (mode),
+				      nunits).exists (&vmode))
+    gcc_unreachable ();
+
+  machine_mode mask_mode = riscv_vector::get_mask_mode (vmode);
+
+  /* Prepare addresses.  */
+  rtx src_addr1 = copy_addr_to_reg (XEXP (src1, 0));
+  rtx vsrc1 = change_address (src1, vmode, src_addr1);
+
+  rtx src_addr2 = copy_addr_to_reg (XEXP (src2, 0));
+  rtx vsrc2 = change_address (src2, vmode, src_addr2);
+
+  /* Set initial pointer bump to 0.  */
+  rtx cnt = gen_reg_rtx (Pmode);
+  emit_move_insn (cnt, CONST0_RTX (Pmode));
+
+  rtx sub = gen_reg_rtx (Pmode);
+  emit_move_insn (sub, CONST0_RTX (Pmode));
+
+  /* Create source vectors.  */
+  rtx vec1 = gen_reg_rtx (vmode);
+  rtx vec2 = gen_reg_rtx (vmode);
+
+  rtx done = gen_label_rtx ();
+  rtx loop = gen_label_rtx ();
+  emit_label (loop);
+
+  /* Bump the pointers.  */
+  emit_insn (gen_rtx_SET (src_addr1, gen_rtx_PLUS (Pmode, src_addr1, cnt)));
+  emit_insn (gen_rtx_SET (src_addr2, gen_rtx_PLUS (Pmode, src_addr2, cnt)));
+
+  rtx vlops1[] = {vec1, vsrc1};
+  rtx vlops2[] = {vec2, vsrc2};
+
+  if (!with_length)
+    {
+      emit_vlmax_insn (code_for_pred_fault_load (vmode),
+		       riscv_vector::UNARY_OP, vlops1);
+
+      emit_vlmax_insn (code_for_pred_fault_load (vmode),
+		       riscv_vector::UNARY_OP, vlops2);
+    }
+  else
+    {
+      nbytes = gen_lowpart (Pmode, nbytes);
+      emit_nonvlmax_insn (code_for_pred_fault_load (vmode),
+			  riscv_vector::UNARY_OP, vlops1, nbytes);
+
+      emit_nonvlmax_insn (code_for_pred_fault_load (vmode),
+			  riscv_vector::UNARY_OP, vlops2, nbytes);
+    }
+
+  /* Read the vl for the next pointer bump.  */
+  if (Pmode == SImode)
+    emit_insn (gen_read_vlsi (cnt));
+  else
+    emit_insn (gen_read_vldi_zero_extend (cnt));
+
+  if (with_length)
+    {
+      rtx test_done = gen_rtx_EQ (VOIDmode, cnt, const0_rtx);
+      emit_jump_insn (gen_cbranch4 (Pmode, test_done, cnt, const0_rtx, done));
+      emit_insn (gen_rtx_SET (nbytes, gen_rtx_MINUS (Pmode, nbytes, cnt)));
+    }
+
+  /* Look for a \0 in the first string.  */
+  rtx mask0 = gen_reg_rtx (mask_mode);
+  rtx eq0 = gen_rtx_EQ (mask_mode,
+			gen_const_vec_duplicate (vmode, CONST0_RTX (mode)),
+			vec1);
+  rtx vmsops1[] = {mask0, eq0, vec1, CONST0_RTX (mode)};
+  emit_nonvlmax_insn (code_for_pred_eqne_scalar (vmode),
+		      riscv_vector::COMPARE_OP, vmsops1, cnt);
+
+  /* Look for vec1 != vec2 (includes vec2[i] == 0).  */
+  rtx maskne = gen_reg_rtx (mask_mode);
+  rtx ne = gen_rtx_NE (mask_mode, vec1, vec2);
+  rtx vmsops[] = {maskne, ne, vec1, vec2};
+  emit_nonvlmax_insn (code_for_pred_cmp (vmode),
+		      riscv_vector::COMPARE_OP, vmsops, cnt);
+
+  /* Combine both masks into one.  */
+  rtx mask = gen_reg_rtx (mask_mode);
+  rtx vmorops[] = {mask, mask0, maskne};
+  emit_nonvlmax_insn (code_for_pred (IOR, mask_mode),
+		      riscv_vector::BINARY_MASK_OP, vmorops, cnt);
+
+  /* Find the first bit in the mask (the first unequal element).  */
+  rtx found_at = gen_reg_rtx (Pmode);
+  rtx vfops[] = {found_at, mask};
+  emit_nonvlmax_insn (code_for_pred_ffs (mask_mode, Pmode),
+		      riscv_vector::CPOP_OP, vfops, cnt);
+
+  /* Emit the loop condition.  */
+  rtx test = gen_rtx_LT (VOIDmode, found_at, const0_rtx);
+  emit_jump_insn (gen_cbranch4 (Pmode, test, found_at, const0_rtx, loop));
+
+  /* Walk up to the difference point.  */
+  emit_insn (gen_rtx_SET (src_addr1, gen_rtx_PLUS (Pmode, src_addr1, found_at)));
+  emit_insn (gen_rtx_SET (src_addr2, gen_rtx_PLUS (Pmode, src_addr2, found_at)));
+
+  /* Load the respective byte and compute the difference.  */
+  rtx c1 = gen_reg_rtx (Pmode);
+  rtx c2 = gen_reg_rtx (Pmode);
+
+  do_load_from_addr (mode, c1, src_addr1, src1);
+  do_load_from_addr (mode, c2, src_addr2, src2);
+
+  do_sub3 (sub, c1, c2);
+
+  if (with_length)
+    emit_label (done);
+
+  emit_insn (gen_movsi (result, gen_lowpart (SImode, sub)));
+  return true;
+}
 }
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 6bf2dfdf9b4..ce092e92465 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -2336,9 +2336,7 @@ (define_expand "cpymem<mode>"
 	      (use (match_operand:SI 3 "const_int_operand"))])]
   ""
 {
-  if (riscv_vector::expand_block_move (operands[0], operands[1], operands[2]))
-    DONE;
-  else if (riscv_expand_block_move (operands[0], operands[1], operands[2]))
+  if (riscv_expand_block_move (operands[0], operands[1], operands[2]))
     DONE;
   else
     FAIL;
@@ -3705,7 +3703,8 @@ (define_expand "cmpstrnsi"
 			  (match_operand:BLK 2)))
 	      (use (match_operand:SI 3))
 	      (use (match_operand:SI 4))])]
-  "riscv_inline_strncmp && !optimize_size && (TARGET_ZBB || TARGET_XTHEADBB)"
+  "riscv_inline_strncmp && !optimize_size
+      && (TARGET_ZBB || TARGET_XTHEADBB || TARGET_VECTOR)"
 {
   if (riscv_expand_strcmp (operands[0], operands[1], operands[2],
                            operands[3], operands[4]))
@@ -3725,7 +3724,8 @@ (define_expand "cmpstrsi"
 	      (compare:SI (match_operand:BLK 1)
 			  (match_operand:BLK 2)))
 	      (use (match_operand:SI 3))])]
-  "riscv_inline_strcmp && !optimize_size && (TARGET_ZBB || TARGET_XTHEADBB)"
+  "riscv_inline_strcmp && !optimize_size
+      && (TARGET_ZBB || TARGET_XTHEADBB || TARGET_VECTOR)"
 {
   if (riscv_expand_strcmp (operands[0], operands[1], operands[2],
                            NULL_RTX, operands[3]))
@@ -3746,14 +3746,16 @@ (define_expand "strlen<mode>"
 		     (match_operand:SI 2 "const_int_operand")
 		     (match_operand:SI 3 "const_int_operand")]
 		  UNSPEC_STRLEN))]
-  "riscv_inline_strlen && !optimize_size && (TARGET_ZBB || TARGET_XTHEADBB)"
+  "riscv_inline_strlen && !optimize_size
+     && (TARGET_ZBB || TARGET_XTHEADBB || TARGET_VECTOR)"
 {
   rtx search_char = operands[2];
 
-  if (search_char != const0_rtx)
+  if (search_char != const0_rtx && !TARGET_VECTOR)
     FAIL;
 
-  if (riscv_expand_strlen (operands[0], operands[1], operands[2], operands[3]))
+  else if (riscv_expand_strlen (operands[0], operands[1], operands[2],
+				operands[3]))
     DONE;
   else
     FAIL;
diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
index 0c6517bdc8b..00b52f5dc77 100644
--- a/gcc/config/riscv/riscv.opt
+++ b/gcc/config/riscv/riscv.opt
@@ -536,21 +536,21 @@ Enable the use of vector registers for function arguments and return value.
 This is an experimental switch and may be subject to change in the future.
 
 Enum
-Name(riscv_stringop_strategy) Type(enum riscv_stringop_strategy_enum)
-Valid arguments to -mmemcpy-strategy=:
+Name(stringop_strategy) Type(enum stringop_strategy_enum)
+Valid arguments to -mbuiltin-strategy=:
 
 EnumValue
-Enum(riscv_stringop_strategy) String(auto) Value(USE_AUTO)
+Enum(stringop_strategy) String(auto) Value(STRINGOP_STRATEGY_AUTO)
 
 EnumValue
-Enum(riscv_stringop_strategy) String(libcall) Value(USE_LIBCALL)
+Enum(stringop_strategy) String(libcall) Value(STRINGOP_STRATEGY_LIBCALL)
 
 EnumValue
-Enum(riscv_stringop_strategy) String(scalar) Value(USE_SCALAR)
+Enum(stringop_strategy) String(scalar) Value(STRINGOP_STRATEGY_SCALAR)
 
 EnumValue
-Enum(riscv_stringop_strategy) String(vector) Value(USE_VECTOR)
+Enum(stringop_strategy) String(vector) Value(STRINGOP_STRATEGY_VECTOR)
 
-mmemcpy-strategy=
-Target RejectNegative Joined Enum(riscv_stringop_strategy) Var(riscv_memcpy_strategy) Init(USE_AUTO)
-Specify memcpy expansion strategy.
+mbuiltin-strategy=
+Target RejectNegative Joined Enum(stringop_strategy) Var(stringop_strategy) Init(STRINGOP_STRATEGY_AUTO)
+Specify builtin expansion strategy.
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/builtin/strcmp-run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/builtin/strcmp-run.c
new file mode 100644
index 00000000000..6dec7da91c1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/builtin/strcmp-run.c
@@ -0,0 +1,32 @@
+/* { dg-do run } */
+/* { dg-additional-options "-O3 -minline-strcmp" } */
+
+#include <string.h>
+
+int
+__attribute__ ((noipa))
+foo (const char *s, const char *t)
+{
+  return __builtin_strcmp (s, t);
+}
+
+int
+__attribute__ ((noipa, optimize ("0")))
+foo2 (const char *s, const char *t)
+{
+  return strcmp (s, t);
+}
+
+#define SZ 10
+
+int main ()
+{
+  const char *s[SZ]
+    = {"",  "asdf", "0", "\0", "!@#$%***m1123fdnmoi43",
+       "a", "z",    "1", "9",  "12345678901234567889012345678901234567890"};
+
+  for (int i = 0; i < SZ; i++)
+    for (int j = 0; j < SZ; j++)
+      if (foo (s[i], s[j]) != foo2 (s[i], s[j]))
+        __builtin_abort ();
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/builtin/strcmp.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/builtin/strcmp.c
new file mode 100644
index 00000000000..f9d33a74fc5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/builtin/strcmp.c
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { riscv_v } } } */
+/* { dg-additional-options "-O3 -minline-strcmp" } */
+
+int
+__attribute__ ((noipa))
+foo (const char *s, const char *t)
+{
+  return __builtin_strcmp (s, t);
+}
+
+/* { dg-final { scan-assembler-times "vle8ff" 2 } } */
+/* { dg-final { scan-assembler-times "vfirst.m" 1 } } */
+/* { dg-final { scan-assembler-times "vmor.m" 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/builtin/strlen-run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/builtin/strlen-run.c
new file mode 100644
index 00000000000..d29297a5f86
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/builtin/strlen-run.c
@@ -0,0 +1,37 @@
+/* { dg-do run } */
+/* { dg-additional-options "-O3 -minline-strlen" } */
+
+int
+__attribute__ ((noipa))
+foo (const char *s)
+{
+  return __builtin_strlen (s);
+}
+
+int
+__attribute__ ((noipa))
+foo2 (const char *s)
+{
+  int n = 0;
+  while (*s++ != '\0')
+    {
+      asm volatile ("");
+      n++;
+    }
+  return n;
+}
+
+#define SZ 10
+
+int main ()
+{
+  const char *s[SZ]
+    = {"",  "asdf", "0", "\0", "!@#$%***m1123fdnmoi43",
+       "a", "z",    "1", "9",  "12345678901234567889012345678901234567890"};
+
+  for (int i = 0; i < SZ; i++)
+    {
+      if (foo (s[i]) != foo2 (s[i]))
+        __builtin_abort ();
+    }
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/builtin/strlen.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/builtin/strlen.c
new file mode 100644
index 00000000000..0c6cca63ebf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/builtin/strlen.c
@@ -0,0 +1,12 @@
+/* { dg-do compile { target { riscv_v } } } */
+/* { dg-additional-options "-O3 -minline-strlen" } */
+
+int
+__attribute__ ((noipa))
+foo (const char *s)
+{
+  return __builtin_strlen (s);
+}
+
+/* { dg-final { scan-assembler-times "vle8ff" 1 } } */
+/* { dg-final { scan-assembler-times "vfirst.m" 1 } } */
-- 
2.43.0
