public inbox for gcc-patches@gcc.gnu.org
* [PATCH] RISC-V: Add conditional sqrt autovec pattern
@ 2023-09-04  4:49 Lehua Ding
  2023-09-06  0:31 ` Jeff Law
  0 siblings, 1 reply; 5+ messages in thread
From: Lehua Ding @ 2023-09-04  4:49 UTC (permalink / raw)
  To: gcc-patches; +Cc: juzhe.zhong, kito.cheng, rdapp.gcc, palmer, jeffreyalaw

This patch adds a combine pattern that merges vfsqrt.v and vcond_mask
into a single masked vfsqrt.v.

gcc/ChangeLog:

	* config/riscv/autovec-opt.md (*cond_<optab><mode>):
	Add sqrt + vcond_mask combine pattern.
	* config/riscv/autovec.md (<optab><mode>2):
	Change define_expand to define_insn_and_split.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_sqrt_run-1.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_sqrt_run-2.c: New test.

---
 gcc/config/riscv/autovec-opt.md               | 20 +++++++++++++
 gcc/config/riscv/autovec.md                   |  7 +++--
 .../riscv/rvv/autovec/cond/cond_sqrt-1.c      | 24 +++++++++++++++
 .../riscv/rvv/autovec/cond/cond_sqrt-2.c      | 24 +++++++++++++++
 .../riscv/rvv/autovec/cond/cond_sqrt_run-1.c  | 29 +++++++++++++++++++
 .../riscv/rvv/autovec/cond/cond_sqrt_run-2.c  | 29 +++++++++++++++++++
 6 files changed, 131 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_sqrt_run-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_sqrt_run-2.c

diff --git a/gcc/config/riscv/autovec-opt.md b/gcc/config/riscv/autovec-opt.md
index 1ca5ce97193..d9863c76654 100644
--- a/gcc/config/riscv/autovec-opt.md
+++ b/gcc/config/riscv/autovec-opt.md
@@ -730,6 +730,26 @@
   DONE;
 })
 
+;; Combine vfsqrt.v and cond_mask
+(define_insn_and_split "*cond_<optab><mode>"
+  [(set (match_operand:VF 0 "register_operand")
+     (if_then_else:VF
+       (match_operand:<VM> 1 "register_operand")
+       (any_float_unop:VF
+         (match_operand:VF 2 "register_operand"))
+       (match_operand:VF 3 "register_operand")))]
+  "TARGET_VECTOR && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+{
+  insn_code icode = code_for_pred (<CODE>, <MODE>mode);
+  rtx ops[] = {operands[0], operands[1], operands[2], operands[3],
+               gen_int_mode (GET_MODE_NUNITS (<MODE>mode), Pmode)};
+  riscv_vector::expand_cond_len_unop (icode, ops);
+  DONE;
+})
+
 ;; Combine vlmax neg and UNSPEC_VCOPYSIGN
 (define_insn_and_split "*copysign<mode>_neg"
   [(set (match_operand:VF 0 "register_operand")
diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index 0f9d1fe2c8e..c220fda312e 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -994,11 +994,14 @@
 ;; Includes:
 ;; - vfsqrt.v
 ;; -------------------------------------------------------------------------------
-(define_expand "<optab><mode>2"
+(define_insn_and_split "<optab><mode>2"
   [(set (match_operand:VF 0 "register_operand")
     (any_float_unop:VF
      (match_operand:VF 1 "register_operand")))]
-  "TARGET_VECTOR"
+  "TARGET_VECTOR && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
 {
   insn_code icode = code_for_pred (<CODE>, <MODE>mode);
   riscv_vector::emit_vlmax_insn (icode, riscv_vector::UNARY_OP_FRM_DYN, operands);
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
new file mode 100644
index 00000000000..21219b43d9d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvfh -mabi=lp64d --param riscv-autovec-preference=scalable -fno-vect-cost-model -ffast-math" } */
+
+#include <stdint.h>
+
+#define DEF_LOOP(TYPE, OP)                                                     \
+  void __attribute__ ((noipa))                                                 \
+  test_##TYPE##_##OP (TYPE *__restrict r, TYPE *__restrict a,                  \
+		      TYPE *__restrict pred, int n)                            \
+  {                                                                            \
+    for (int i = 0; i < n; ++i)                                                \
+      r[i] = pred[i] ? OP (a[i]) : a[i];                                       \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (_Float16, __builtin_sqrtf16)                                              \
+  T (float, __builtin_sqrtf)                                                   \
+  T (double, __builtin_sqrt)
+
+TEST_ALL (DEF_LOOP)
+
+/* { dg-final { scan-assembler-times {\tvfsqrt\.v\tv[0-9]+,v[0-9]+,v0\.t} 3 } } */
+
+/* { dg-final { scan-assembler {\tvsetvli\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
new file mode 100644
index 00000000000..2fcdc339e70
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvfh -mabi=lp64d --param riscv-autovec-preference=scalable -fno-vect-cost-model -ffast-math" } */
+
+#include <stdint.h>
+
+#define DEF_LOOP(TYPE, OP)                                                     \
+  void __attribute__ ((noipa))                                                 \
+  test_##TYPE##_##OP (TYPE *__restrict r, TYPE *__restrict a,                  \
+		      TYPE *__restrict b, TYPE *__restrict pred, int n)        \
+  {                                                                            \
+    for (int i = 0; i < n; ++i)                                                \
+      r[i] = pred[i] ? OP (a[i]) : b[i];                                       \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (_Float16, __builtin_sqrtf16)                                              \
+  T (float, __builtin_sqrtf)                                                   \
+  T (double, __builtin_sqrt)
+
+TEST_ALL (DEF_LOOP)
+
+/* { dg-final { scan-assembler-times {\tvfsqrt\.v\tv[0-9]+,v[0-9]+,v0\.t} 3 } } */
+
+/* { dg-final { scan-assembler {\tvsetvli\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_sqrt_run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_sqrt_run-1.c
new file mode 100644
index 00000000000..c6f9ba85790
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_sqrt_run-1.c
@@ -0,0 +1,29 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "--param riscv-autovec-preference=scalable -fno-vect-cost-model -ffast-math " } */
+
+#include "cond_sqrt-1.c"
+#include <stdio.h>
+
+#define N 99
+
+#define TEST_LOOP(TYPE, OP)                                                    \
+  {                                                                            \
+    TYPE r[N], a[N], pred[N];                                                  \
+    for (int i = 0; i < N; ++i)                                                \
+      {                                                                        \
+	a[i] = (i & 1 ? i : 3 * i) * (i % 3 == 0 ? 1 : 2);                     \
+	pred[i] = (i % 7 < 4);                                                 \
+	asm volatile("" ::: "memory");                                         \
+      }                                                                        \
+    test_##TYPE##_##OP (r, a, pred, N);                                        \
+    for (int i = 0; i < N; ++i)                                                \
+      if (r[i] != (pred[i] ? OP (a[i]) : a[i]))                                \
+	__builtin_abort ();                                                    \
+  }
+
+int
+main ()
+{
+  TEST_ALL (TEST_LOOP)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_sqrt_run-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_sqrt_run-2.c
new file mode 100644
index 00000000000..5cfcfed568a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_sqrt_run-2.c
@@ -0,0 +1,29 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "--param riscv-autovec-preference=scalable -fno-vect-cost-model -ffast-math" } */
+
+#include "cond_sqrt-2.c"
+
+#define N 99
+
+#define TEST_LOOP(TYPE, OP)                                                    \
+  {                                                                            \
+    TYPE r[N], a[N], b[N], pred[N];                                            \
+    for (int i = 0; i < N; ++i)                                                \
+      {                                                                        \
+	a[i] = (i & 1 ? i : 3 * i) * (i % 3 == 0 ? 1 : 2);                     \
+	b[i] = (i % 9) * (i % 7 + 1);                                          \
+	pred[i] = (i % 7 < 4);                                                 \
+	asm volatile("" ::: "memory");                                         \
+      }                                                                        \
+    test_##TYPE##_##OP (r, a, b, pred, N);                                     \
+    for (int i = 0; i < N; ++i)                                                \
+      if (r[i] != (pred[i] ? OP (a[i]) : b[i]))                                \
+	__builtin_abort ();                                                    \
+  }
+
+int
+main ()
+{
+  TEST_ALL (TEST_LOOP)
+  return 0;
+}
-- 
2.36.3



* Re: [PATCH] RISC-V: Add conditional sqrt autovec pattern
  2023-09-04  4:49 [PATCH] RISC-V: Add conditional sqrt autovec pattern Lehua Ding
@ 2023-09-06  0:31 ` Jeff Law
  2023-09-06  4:13   ` Lehua Ding
  0 siblings, 1 reply; 5+ messages in thread
From: Jeff Law @ 2023-09-06  0:31 UTC (permalink / raw)
  To: Lehua Ding, gcc-patches; +Cc: juzhe.zhong, kito.cheng, rdapp.gcc, palmer



On 9/3/23 22:49, Lehua Ding wrote:
> This patch adds a combined pattern for combining vfsqrt.v and vcond_mask.
> 
> gcc/ChangeLog:
> 
> 	* config/riscv/autovec-opt.md (*cond_<optab><mode>):
> 	Add sqrt + vcond_mask combine pattern.
> 	* config/riscv/autovec.md (<optab><mode>2):
> 	Change define_expand to define_insn_and_split.
> 
> gcc/testsuite/ChangeLog:
> 
> 	* gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c: New test.
> 	* gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c: New test.
> 	* gcc.target/riscv/rvv/autovec/cond/cond_sqrt_run-1.c: New test.
> 	* gcc.target/riscv/rvv/autovec/cond/cond_sqrt_run-2.c: New test.
OK.  Thanks.

FWIW, I thought we only had the reciprocal sqrt estimator, but in fact
rvv does define a real vector sqrt.  So the concerns we kicked around
in the meeting this morning turned out not to be warranted.

This raises one of the very interesting questions in this space, 
specifically whether or not we should be using the rsqrt estimator with 
correction steps.   Unless the vfsqrt latency is really bad, it's going 
to be hard to make a vfrsqrt7 based sequence faster -- but the vfrsqrt7 
sequences will be pipelinable while vfsqrt almost certainly isn't.
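
A minimal scalar sketch of what "estimator with correction steps" means,
for illustration only.  The names rsqrt_estimate, rsqrt_refine and
sqrt_via_rsqrt are hypothetical, and the estimate here is the classic
integer bit trick used as a stand-in; it is not the vfrsqrt7 lookup and
is somewhat less accurate than a 7-bit estimate.  Each correction step
is one Newton-Raphson iteration for 1/sqrt(x).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a hardware estimate such as vfrsqrt7:
   the classic integer bit trick approximates 1/sqrt(x) to within a
   few percent.  X must be positive, finite and normal.  */
static float
rsqrt_estimate (float x)
{
  uint32_t bits;
  float y;

  memcpy (&bits, &x, sizeof bits);
  bits = 0x5f3759dfu - (bits >> 1);
  memcpy (&y, &bits, sizeof y);
  return y;
}

/* One Newton-Raphson correction step for y ~= 1/sqrt(x).
   The relative error is roughly squared by each step.  */
static float
rsqrt_refine (float x, float y)
{
  return y * (1.5f - 0.5f * x * y * y);
}

/* sqrt(x) computed as x * rsqrt(x) after STEPS correction steps.  */
float
sqrt_via_rsqrt (float x, int steps)
{
  float y = rsqrt_estimate (x);

  for (int i = 0; i < steps; ++i)
    y = rsqrt_refine (x, y);
  return x * y;
}
```

With this seed, three steps bring sqrt_via_rsqrt (2.0f, 3) to roughly
float precision.  Every step is a handful of multiplies and one
subtraction, so the whole sequence consists of ordinary arithmetic.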

Sadly we don't have a scalar FP rsqrt estimator, though I certainly
ponder using the vector one.  There's a neat trick you can do with the
nab benchmark from SPEC: produce sqrt and rsqrt at the same time with
a Goldschmidt sequence.  It requires a bit of hackery to make new tree
nodes, but it was definitely worth it on other targets I've worked on.
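
As a rough scalar illustration of a Goldschmidt sequence that yields
both values at once (this is a textbook form of the iteration, not the
tree-level transformation mentioned above; struct sqrt_pair, seed_rsqrt
and goldschmidt_sqrt are hypothetical names, and the seed is the same
bit-trick stand-in for a hardware estimate):

```c
#include <stdint.h>
#include <string.h>

struct sqrt_pair
{
  float root;        /* approximates sqrt(x) */
  float recip_root;  /* approximates 1/sqrt(x) */
};

/* Hypothetical stand-in for a hardware rsqrt estimate.
   X must be positive, finite and normal.  */
static float
seed_rsqrt (float x)
{
  uint32_t bits;
  float y;

  memcpy (&bits, &x, sizeof bits);
  bits = 0x5f3759dfu - (bits >> 1);
  memcpy (&y, &bits, sizeof y);
  return y;
}

/* Goldschmidt iteration: G converges to sqrt(x) and H converges to
   1 / (2 * sqrt(x)), both updated from the same residual R.  */
struct sqrt_pair
goldschmidt_sqrt (float x, int steps)
{
  float y = seed_rsqrt (x);
  float g = x * y;
  float h = 0.5f * y;

  for (int i = 0; i < steps; ++i)
    {
      float r = 0.5f - g * h;   /* residual, tends to zero */
      g += g * r;
      h += h * r;
    }

  struct sqrt_pair result = { g, 2.0f * h };
  return result;
}
```

The two updates inside the loop depend only on the shared residual, so
they are independent of each other, and both results come out of the
same loop.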


Jeff



* Re: [PATCH] RISC-V: Add conditional sqrt autovec pattern
  2023-09-06  0:31 ` Jeff Law
@ 2023-09-06  4:13   ` Lehua Ding
  2023-09-06  8:17     ` Kito Cheng
  0 siblings, 1 reply; 5+ messages in thread
From: Lehua Ding @ 2023-09-06  4:13 UTC (permalink / raw)
  To: Jeff Law, gcc-patches; +Cc: juzhe.zhong, kito.cheng, rdapp.gcc, palmer



Committed, thanks Jeff.

-- 
Best,
Lehua



* Re: [PATCH] RISC-V: Add conditional sqrt autovec pattern
  2023-09-06  4:13   ` Lehua Ding
@ 2023-09-06  8:17     ` Kito Cheng
  2023-09-06  8:22       ` Lehua Ding
  0 siblings, 1 reply; 5+ messages in thread
From: Kito Cheng @ 2023-09-06  8:17 UTC (permalink / raw)
  To: Lehua Ding; +Cc: Jeff Law, gcc-patches, juzhe.zhong

This fails on trunk; could you take a look?

                === gcc: Unexpected fails for rv32imafdc ilp32d medlow ===
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
               === gcc: Unexpected fails for rv64imac lp64 medlow ===
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
               === gcc: Unexpected fails for rv64imafdc lp64d medlow ===
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
\\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3

              ========= Summary of gcc testsuite =========
                           | # of unexpected case / # of unique unexpected case
                           |          gcc |          g++ |     gfortran |
  rv32imac/  ilp32/ medlow |    0 /     0 |    0 /     0 |    0 /     0 |
rv32imafdc/ ilp32d/ medlow |   32 /     2 |    0 /     0 |    0 /     0 |
  rv64imac/   lp64/ medlow |   32 /     2 |    0 /     0 |    0 /     0 |
rv64imafdc/  lp64d/ medlow |   32 /     2 |    0 /     0 |    0 /     0 |



* Re: [PATCH] RISC-V: Add conditional sqrt autovec pattern
  2023-09-06  8:17     ` Kito Cheng
@ 2023-09-06  8:22       ` Lehua Ding
  0 siblings, 0 replies; 5+ messages in thread
From: Lehua Ding @ 2023-09-06  8:22 UTC (permalink / raw)
  To: Kito Cheng; +Cc: gcc-patches, juzhe.zhong

Okay, I'll take a look at it right away. Thanks for reporting.

On 2023/9/6 16:17, Kito Cheng via Gcc-patches wrote:
> Got failed on the trunk, could you take a look?
> 
>                  === gcc: Unexpected fails for rv32imafdc ilp32d medlow ===
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
>                 === gcc: Unexpected fails for rv64imac lp64 medlow ===
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
>                 === gcc: Unexpected fails for rv64imafdc lp64d medlow ===
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c scan-assembler
> \\tvsetvli\\t[a-z0-9]+,[a-z0-9]+,e[0-9]+,m[f0-9]+,t[au],mu
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> FAIL: gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c
> scan-assembler-times \\tvfsqrt\\.v\\tv[0-9]+,v[0-9]+,v0\\.t 3
> 
>                ========= Summary of gcc testsuite =========
>                             | # of unexpected case / # of unique unexpected case
>                             |          gcc |          g++ |     gfortran |
>    rv32imac/  ilp32/ medlow |    0 /     0 |    0 /     0 |    0 /     0 |
> rv32imafdc/ ilp32d/ medlow |   32 /     2 |    0 /     0 |    0 /     0 |
>    rv64imac/   lp64/ medlow |   32 /     2 |    0 /     0 |    0 /     0 |
> rv64imafdc/  lp64d/ medlow |   32 /     2 |    0 /     0 |    0 /     0 |
> 
> On Wed, Sep 6, 2023 at 12:14 PM Lehua Ding <lehua.ding@rivai.ai> wrote:
>>
>>
>>
>> On 2023/9/6 8:31, Jeff Law wrote:
>>>
>>>
>>> On 9/3/23 22:49, Lehua Ding wrote:
>>>> This patch adds a combined pattern for combining vfsqrt.v and vcond_mask.
>>>>
>>>> gcc/ChangeLog:
>>>>
>>>>      * config/riscv/autovec-opt.md (*cond_<optab><mode>):
>>>>      Add sqrt + vcond_mask combine pattern.
>>>>      * config/riscv/autovec.md (<optab><mode>2):
>>>>      Change define_expand to define_insn_and_split.
>>>>
>>>> gcc/testsuite/ChangeLog:
>>>>
>>>>      * gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c: New test.
>>>>      * gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c: New test.
>>>>      * gcc.target/riscv/rvv/autovec/cond/cond_sqrt_run-1.c: New test.
>>>>      * gcc.target/riscv/rvv/autovec/cond/cond_sqrt_run-2.c: New test.
>>> OK.  Thanks.
>>>
>>> FWIW, I thought we only had the reciprocal sqrt estimator, but in fact
>>> rvv does define a real vector sqrt.   So the concerns we kicked around
>>> in the meeting this morning turned out not to be warranted.
>>>
>>> This raises one of the very interesting questions in this space,
>>> specifically whether or not we should be using the rsqrt estimator with
>>> correction steps.   Unless the vfsqrt latency is really bad, it's going
>>> to be hard to make a vfrsqrt7 based sequence faster -- but the vfrsqrt7
>>> sequences will be pipelinable while vfsqrt almost certainly isn't.
>>>
>>> Sadly we don't have a scalar FP rsqrt estimator.  Though I certainly
>>> ponder using the vector one -- there's a neat trick you can do with the
>>> nab benchmark from spec and produce sqrt and rsqrt at the same time with
>>> a Goldschmidt sequence.  It requires a bit of hackery to make new tree
>>> nodes, but it was definitely worth it on other targets I've worked on.
>>
>> Committed, thanks Jeff.
>>
>> --
>> Best,
>> Lehua
>>

-- 
Best,
Lehua
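
For readers unfamiliar with the Goldschmidt sequence Jeff mentions in the
quoted reply, here is a minimal scalar sketch in C of how one iteration
loop can yield sqrt and rsqrt together from a low-precision reciprocal
square root estimate. This is only an illustration of the general
technique, not the code GCC emits and not the sequence Jeff used on other
targets. The names sqrt_pair, rsqrt_estimate and goldschmidt_sqrt are
hypothetical, and the bit-trick estimate is merely a stand-in for a
hardware estimator such as vfrsqrt7; it assumes a positive, normal
IEEE-754 single-precision input.

```c
#include <stdint.h>
#include <string.h>

/* Both results come out of the same iteration (names are hypothetical). */
struct sqrt_pair {
    float root;      /* approximates sqrt(s)     */
    float inv_root;  /* approximates 1 / sqrt(s) */
};

/* Stand-in for a low-precision hardware rsqrt estimate.  This is the
   well-known integer bit trick, accurate to a few percent; it is an
   assumption for illustration, not how vfrsqrt7 is specified.  */
static float rsqrt_estimate(float s)
{
    uint32_t bits;
    float y;

    memcpy(&bits, &s, sizeof bits);
    bits = 0x5f3759dfu - (bits >> 1);
    memcpy(&y, &bits, sizeof y);
    return y;
}

/* Goldschmidt iteration: x converges to sqrt(s) and 2*h converges to
   1/sqrt(s).  Each step roughly squares the relative error, and the
   two updates are independent multiply-adds that can overlap.  */
struct sqrt_pair goldschmidt_sqrt(float s, int steps)
{
    float y = rsqrt_estimate(s);  /* y ~ 1/sqrt(s)            */
    float x = s * y;              /* x ~ sqrt(s)              */
    float h = 0.5f * y;           /* h ~ 1/(2*sqrt(s))        */

    for (int i = 0; i < steps; i++) {
        float r = 0.5f - x * h;   /* residual, tends to 0     */
        x = x + x * r;
        h = h + h * r;
    }

    struct sqrt_pair out = { x, 2.0f * h };
    return out;
}
```

Calling goldschmidt_sqrt(2.0f, 3) should give values close to sqrt(2) and
1/sqrt(2). With an estimate that is only a few percent accurate, about
three steps are needed to approach single precision; a better starting
estimate would need fewer.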

Thread overview: 5+ messages
2023-09-04  4:49 [PATCH] RISC-V: Add conditional sqrt autovec pattern Lehua Ding
2023-09-06  0:31 ` Jeff Law
2023-09-06  4:13   ` Lehua Ding
2023-09-06  8:17     ` Kito Cheng
2023-09-06  8:22       ` Lehua Ding
