public inbox for gcc-patches@gcc.gnu.org
* [PATCH v1] RISC-V: Bugfix for RVV integer reduction in ZVE32/64.
@ 2023-06-16  7:28 pan2.li
  2023-06-16  7:47 ` juzhe.zhong
  2023-06-16  8:09 ` [PATCH v2] " pan2.li
  0 siblings, 2 replies; 8+ messages in thread
From: pan2.li @ 2023-06-16  7:28 UTC (permalink / raw)
  To: gcc-patches
  Cc: juzhe.zhong, rdapp.gcc, jeffreyalaw, pan2.li, yanzhang.wang, kito.cheng

From: Pan Li <pan2.li@intel.com>

The RVV integer reduction has 3 different patterns for ZVE128+, ZVE64
and ZVE32. They take the same mode iterator but with different attributes.
However, they are all selected through the same generated function
code_for_reduc (code, mode1, mode2), whose implementation may look like below.

code_for_reduc (code, mode1, mode2)
{
  if (code == max && mode1 == VNx1QI && mode2 == VNx1QI)
    return CODE_FOR_pred_reduc_maxvnx1qivnx16qi; // ZVE128+

  if (code == max && mode1 == VNx1QI && mode2 == VNx1QI)
    return CODE_FOR_pred_reduc_maxvnx1qivnx8qi;  // ZVE64

  if (code == max && mode1 == VNx1QI && mode2 == VNx1QI)
    return CODE_FOR_pred_reduc_maxvnx1qivnx4qi;  // ZVE32
}

Thus there is a problem here. For example, on ZVE32 we will have
code_for_reduc (max, VNx1QI, VNx1QI), which matches the first condition
and returns the ZVE128+ insn code instead of the ZVE32 one.

This patch merges the 3 patterns into one pattern, and passes both the
input vector mode and the return vector mode to code_for_reduc. For example,
ZVE32 will use code_for_reduc (max, VNx1QI, VNx4QI), so the correct ZVE32
insn code is returned as expected.
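
With both modes encoded in the insn name, the conditions in the generated
function no longer collide. Its implementation may then look like below
(the same illustrative sketch as above, not the literal generated code).

code_for_reduc (code, mode1, mode2)
{
  if (code == max && mode1 == VNx1QI && mode2 == VNx16QI)
    return CODE_FOR_pred_reduc_maxvnx1qivnx16qi; // ZVE128+

  if (code == max && mode1 == VNx1QI && mode2 == VNx8QI)
    return CODE_FOR_pred_reduc_maxvnx1qivnx8qi;  // ZVE64

  if (code == max && mode1 == VNx1QI && mode2 == VNx4QI)
    return CODE_FOR_pred_reduc_maxvnx1qivnx4qi;  // ZVE32
}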

Signed-off-by: Pan Li <pan2.li@intel.com>
Co-authored-by: Juzhe-Zhong <juzhe.zhong@rivai.ai>

	PR target/110265

gcc/ChangeLog:
	PR target/110265
	* config/riscv/riscv-vector-builtins-bases.cc: Add ret_mode for
	integer reduction expand.
	* config/riscv/vector-iterators.md: Add VQI, VHI, VSI and VDI,
	and the LMUL1 iterators respectively.
	* config/riscv/vector.md
	(@pred_reduc_<reduc><mode><vlmul1>): Removed.
	(@pred_reduc_<reduc><mode><vlmul1_zve64>): Likewise.
	(@pred_reduc_<reduc><mode><vlmul1_zve32>): Likewise.
	(@pred_reduc_<reduc><VQI:mode><VQI_LMUL1:mode>): New pattern.
	(@pred_reduc_<reduc><VHI:mode><VHI_LMUL1:mode>): Likewise.
	(@pred_reduc_<reduc><VSI:mode><VSI_LMUL1:mode>): Likewise.
	(@pred_reduc_<reduc><VDI:mode><VDI_LMUL1:mode>): Likewise.
	* machmode.h (VECTOR_FLOAT_MODE_P): New macro.

gcc/testsuite/ChangeLog:
	PR target/110265
	* gcc.target/riscv/rvv/base/pr110265-1.c: New test.
	* gcc.target/riscv/rvv/base/pr110265-1.h: New test.
	* gcc.target/riscv/rvv/base/pr110265-2.c: New test.
	* gcc.target/riscv/rvv/base/pr110265-2.h: New test.
	* gcc.target/riscv/rvv/base/pr110265-3.c: New test.
---
 .../riscv/riscv-vector-builtins-bases.cc      |  13 +-
 gcc/config/riscv/vector-iterators.md          |  61 +++++
 gcc/config/riscv/vector.md                    | 208 +++++++++++++-----
 gcc/machmode.h                                |   4 +
 .../gcc.target/riscv/rvv/base/pr110265-1.c    |  13 ++
 .../gcc.target/riscv/rvv/base/pr110265-1.h    |  65 ++++++
 .../gcc.target/riscv/rvv/base/pr110265-2.c    |  14 ++
 .../gcc.target/riscv/rvv/base/pr110265-2.h    |  57 +++++
 .../gcc.target/riscv/rvv/base/pr110265-3.c    |  14 ++
 9 files changed, 389 insertions(+), 60 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.h
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.h
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-3.c

diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index 87a684dd127..a77933d60d5 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -1396,8 +1396,17 @@ public:
 
   rtx expand (function_expander &e) const override
   {
-    return e.use_exact_insn (
-      code_for_pred_reduc (CODE, e.vector_mode (), e.vector_mode ()));
+    machine_mode mode = e.vector_mode ();
+    machine_mode ret_mode = e.ret_mode ();
+
+    /* TODO: we will use ret_mode after all types of PR110265 are addressed.  */
+    if (VECTOR_FLOAT_MODE_P (mode)
+       || GET_MODE_INNER (mode) != GET_MODE_INNER (ret_mode))
+      return e.use_exact_insn (
+	code_for_pred_reduc (CODE, e.vector_mode (), e.vector_mode ()));
+    else
+      return e.use_exact_insn (
+	code_for_pred_reduc (CODE, e.vector_mode (), e.ret_mode ()));
   }
 };
 
diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
index 8c71c9e22cc..e2c8ade98eb 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -929,6 +929,67 @@ (define_mode_iterator V64T [
   (VNx2x64QI "TARGET_MIN_VLEN >= 128")
 ])
 
+(define_mode_iterator VQI [
+  (VNx1QI "TARGET_MIN_VLEN < 128")
+  VNx2QI
+  VNx4QI
+  VNx8QI
+  VNx16QI
+  VNx32QI
+  (VNx64QI "TARGET_MIN_VLEN > 32")
+  (VNx128QI "TARGET_MIN_VLEN >= 128")
+])
+
+(define_mode_iterator VHI [
+  (VNx1HI "TARGET_MIN_VLEN < 128")
+  VNx2HI
+  VNx4HI
+  VNx8HI
+  VNx16HI
+  (VNx32HI "TARGET_MIN_VLEN > 32")
+  (VNx64HI "TARGET_MIN_VLEN >= 128")
+])
+
+(define_mode_iterator VSI [
+  (VNx1SI "TARGET_MIN_VLEN < 128")
+  VNx2SI
+  VNx4SI
+  VNx8SI
+  (VNx16SI "TARGET_MIN_VLEN > 32")
+  (VNx32SI "TARGET_MIN_VLEN >= 128")
+])
+
+(define_mode_iterator VDI [
+  (VNx1DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN < 128")
+  (VNx2DI "TARGET_VECTOR_ELEN_64")
+  (VNx4DI "TARGET_VECTOR_ELEN_64")
+  (VNx8DI "TARGET_VECTOR_ELEN_64")
+  (VNx16DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128")
+])
+
+(define_mode_iterator VQI_LMUL1 [
+  (VNx16QI "TARGET_MIN_VLEN >= 128")
+  (VNx8QI "TARGET_MIN_VLEN == 64")
+  (VNx4QI "TARGET_MIN_VLEN == 32")
+])
+
+(define_mode_iterator VHI_LMUL1 [
+  (VNx8HI "TARGET_MIN_VLEN >= 128")
+  (VNx4HI "TARGET_MIN_VLEN == 64")
+  (VNx2HI "TARGET_MIN_VLEN == 32")
+])
+
+(define_mode_iterator VSI_LMUL1 [
+  (VNx4SI "TARGET_MIN_VLEN >= 128")
+  (VNx2SI "TARGET_MIN_VLEN == 64")
+  (VNx1SI "TARGET_MIN_VLEN == 32")
+])
+
+(define_mode_iterator VDI_LMUL1 [
+  (VNx2DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128")
+  (VNx1DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN == 64")
+])
+
 (define_mode_attr VLMULX2 [
   (VNx1QI "VNx2QI") (VNx2QI "VNx4QI") (VNx4QI "VNx8QI") (VNx8QI "VNx16QI") (VNx16QI "VNx32QI") (VNx32QI "VNx64QI") (VNx64QI "VNx128QI")
   (VNx1HI "VNx2HI") (VNx2HI "VNx4HI") (VNx4HI "VNx8HI") (VNx8HI "VNx16HI") (VNx16HI "VNx32HI") (VNx32HI "VNx64HI")
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index 1d1847bd85a..d396e278503 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -7244,76 +7244,168 @@ (define_insn "@pred_rod_trunc<mode>"
 ;; -------------------------------------------------------------------------------
 
 ;; For reduction operations, we should have seperate patterns for
-;; TARGET_MIN_VLEN == 32 and TARGET_MIN_VLEN > 32.
+;; different types. For each type, we will cover MIN_VLEN == 32, MIN_VLEN == 64
+;; and the MIN_VLEN >= 128 from the well defined iterators.
 ;; Since reduction need LMUL = 1 scalar operand as the input operand
 ;; and they are different.
 ;; For example, The LMUL = 1 corresponding mode of VNx16QImode is VNx4QImode
 ;; for -march=rv*zve32* wheras VNx8QImode for -march=rv*zve64*
-(define_insn "@pred_reduc_<reduc><mode><vlmul1>"
-  [(set (match_operand:<VLMUL1> 0 "register_operand"            "=vr,   vr")
-	(unspec:<VLMUL1>
-	  [(unspec:<VM>
-	     [(match_operand:<VM> 1 "vector_mask_operand"     "vmWc1,vmWc1")
-	      (match_operand 5 "vector_length_operand"        "   rK,   rK")
-	      (match_operand 6 "const_int_operand"            "    i,    i")
-	      (match_operand 7 "const_int_operand"            "    i,    i")
+
+;; Integer Reduction for QI
+(define_insn "@pred_reduc_<reduc><VQI:mode><VQI_LMUL1:mode>"
+  [
+    (set
+      (match_operand:VQI_LMUL1           0 "register_operand"      "=vr,     vr")
+      (unspec:VQI_LMUL1
+	[
+	  (unspec:<VQI:VM>
+	    [
+	      (match_operand:<VQI:VM>    1 "vector_mask_operand"   "vmWc1,vmWc1")
+	      (match_operand             5 "vector_length_operand" "   rK,   rK")
+	      (match_operand             6 "const_int_operand"     "    i,    i")
+	      (match_operand             7 "const_int_operand"     "    i,    i")
 	      (reg:SI VL_REGNUM)
-	      (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
-	   (any_reduc:VI
-	     (vec_duplicate:VI
-	       (vec_select:<VEL>
-	         (match_operand:<VLMUL1> 4 "register_operand" "   vr,   vr")
-	         (parallel [(const_int 0)])))
-	     (match_operand:VI 3 "register_operand"           "   vr,   vr"))
-	   (match_operand:<VLMUL1> 2 "vector_merge_operand"   "   vu,    0")] UNSPEC_REDUC))]
-  "TARGET_VECTOR && TARGET_MIN_VLEN >= 128"
+	      (reg:SI VTYPE_REGNUM)
+	    ] UNSPEC_VPREDICATE
+	  )
+	  (any_reduc:VQI
+	    (vec_duplicate:VQI
+	      (vec_select:<VEL>
+		(match_operand:VQI_LMUL1 4 "register_operand"      "   vr,   vr")
+		(parallel [(const_int 0)])
+	      )
+	    )
+	    (match_operand:VQI           3 "register_operand"      "   vr,   vr")
+	  )
+	  (match_operand:VQI_LMUL1       2 "vector_merge_operand"  "   vu,    0")
+	] UNSPEC_REDUC
+      )
+    )
+  ]
+  "TARGET_VECTOR"
   "vred<reduc>.vs\t%0,%3,%4%p1"
-  [(set_attr "type" "vired")
-   (set_attr "mode" "<MODE>")])
+  [
+    (set_attr "type" "vired")
+    (set_attr "mode" "<VQI:MODE>")
+  ]
+)
 
-(define_insn "@pred_reduc_<reduc><mode><vlmul1_zve64>"
-  [(set (match_operand:<VLMUL1_ZVE64> 0 "register_operand"            "=vr,   vr")
-	(unspec:<VLMUL1_ZVE64>
-	  [(unspec:<VM>
-	     [(match_operand:<VM> 1 "vector_mask_operand"     "vmWc1,vmWc1")
-	      (match_operand 5 "vector_length_operand"        "   rK,   rK")
-	      (match_operand 6 "const_int_operand"            "    i,    i")
-	      (match_operand 7 "const_int_operand"            "    i,    i")
+;; Integer Reduction for HI
+(define_insn "@pred_reduc_<reduc><VHI:mode><VHI_LMUL1:mode>"
+  [
+    (set
+      (match_operand:VHI_LMUL1           0 "register_operand"      "=vr,     vr")
+      (unspec:VHI_LMUL1
+	[
+	  (unspec:<VHI:VM>
+	    [
+	      (match_operand:<VHI:VM>    1 "vector_mask_operand"   "vmWc1,vmWc1")
+	      (match_operand             5 "vector_length_operand" "   rK,   rK")
+	      (match_operand             6 "const_int_operand"     "    i,    i")
+	      (match_operand             7 "const_int_operand"     "    i,    i")
 	      (reg:SI VL_REGNUM)
-	      (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
-	   (any_reduc:VI_ZVE64
-	     (vec_duplicate:VI_ZVE64
-	       (vec_select:<VEL>
-	         (match_operand:<VLMUL1_ZVE64> 4 "register_operand" "   vr,   vr")
-	         (parallel [(const_int 0)])))
-	     (match_operand:VI_ZVE64 3 "register_operand"           "   vr,   vr"))
-	   (match_operand:<VLMUL1_ZVE64> 2 "vector_merge_operand"   "   vu,    0")] UNSPEC_REDUC))]
-  "TARGET_VECTOR && TARGET_MIN_VLEN == 64"
+	      (reg:SI VTYPE_REGNUM)
+	    ] UNSPEC_VPREDICATE
+	  )
+	  (any_reduc:VHI
+	    (vec_duplicate:VHI
+	      (vec_select:<VEL>
+		(match_operand:VHI_LMUL1 4 "register_operand"      "   vr,   vr")
+		(parallel [(const_int 0)])
+	      )
+	    )
+	    (match_operand:VHI           3 "register_operand"      "   vr,   vr")
+	  )
+	  (match_operand:VHI_LMUL1       2 "vector_merge_operand"  "   vu,    0")
+	] UNSPEC_REDUC
+      )
+    )
+  ]
+  "TARGET_VECTOR"
   "vred<reduc>.vs\t%0,%3,%4%p1"
-  [(set_attr "type" "vired")
-   (set_attr "mode" "<MODE>")])
+  [
+    (set_attr "type" "vired")
+    (set_attr "mode" "<VHI:MODE>")
+  ]
+)
 
-(define_insn "@pred_reduc_<reduc><mode><vlmul1_zve32>"
-  [(set (match_operand:<VLMUL1_ZVE32> 0 "register_operand"          "=vd, vd, vr, vr")
-	(unspec:<VLMUL1_ZVE32>
-	  [(unspec:<VM>
-	     [(match_operand:<VM> 1 "vector_mask_operand"           " vm, vm,Wc1,Wc1")
-	      (match_operand 5 "vector_length_operand"              " rK, rK, rK, rK")
-	      (match_operand 6 "const_int_operand"                  "  i,  i,  i,  i")
-	      (match_operand 7 "const_int_operand"                  "  i,  i,  i,  i")
+;; Integer Reduction for SI
+(define_insn "@pred_reduc_<reduc><VSI:mode><VSI_LMUL1:mode>"
+  [
+    (set
+      (match_operand:VSI_LMUL1           0 "register_operand"      "=vr,     vr")
+      (unspec:VSI_LMUL1
+	[
+	  (unspec:<VSI:VM>
+	    [
+	      (match_operand:<VSI:VM>    1 "vector_mask_operand"   "vmWc1,vmWc1")
+	      (match_operand             5 "vector_length_operand" "   rK,   rK")
+	      (match_operand             6 "const_int_operand"     "    i,    i")
+	      (match_operand             7 "const_int_operand"     "    i,    i")
 	      (reg:SI VL_REGNUM)
-	      (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
-	   (any_reduc:VI_ZVE32
-	     (vec_duplicate:VI_ZVE32
-	       (vec_select:<VEL>
-	         (match_operand:<VLMUL1_ZVE32> 4 "register_operand" " vr, vr, vr, vr")
-	         (parallel [(const_int 0)])))
-	     (match_operand:VI_ZVE32 3 "register_operand"           " vr, vr, vr, vr"))
-	   (match_operand:<VLMUL1_ZVE32> 2 "vector_merge_operand"   " vu,  0, vu,  0")] UNSPEC_REDUC))]
-  "TARGET_VECTOR && TARGET_MIN_VLEN == 32"
+	      (reg:SI VTYPE_REGNUM)
+	    ] UNSPEC_VPREDICATE
+	  )
+	  (any_reduc:VSI
+	    (vec_duplicate:VSI
+	      (vec_select:<VEL>
+		(match_operand:VSI_LMUL1 4 "register_operand"      "   vr,   vr")
+		(parallel [(const_int 0)])
+	      )
+	    )
+	    (match_operand:VSI           3 "register_operand"      "   vr,   vr")
+	  )
+	  (match_operand:VSI_LMUL1       2 "vector_merge_operand"  "   vu,    0")
+	] UNSPEC_REDUC
+      )
+    )
+  ]
+  "TARGET_VECTOR"
   "vred<reduc>.vs\t%0,%3,%4%p1"
-  [(set_attr "type" "vired")
-   (set_attr "mode" "<MODE>")])
+  [
+    (set_attr "type" "vired")
+    (set_attr "mode" "<VSI:MODE>")
+  ]
+)
+
+;; Integer Reduction for DI
+(define_insn "@pred_reduc_<reduc><VDI:mode><VDI_LMUL1:mode>"
+  [
+    (set
+      (match_operand:VDI_LMUL1           0 "register_operand"      "=vr,     vr")
+      (unspec:VDI_LMUL1
+	[
+	  (unspec:<VDI:VM>
+	    [
+	      (match_operand:<VDI:VM>    1 "vector_mask_operand"   "vmWc1,vmWc1")
+	      (match_operand             5 "vector_length_operand" "   rK,   rK")
+	      (match_operand             6 "const_int_operand"     "    i,    i")
+	      (match_operand             7 "const_int_operand"     "    i,    i")
+	      (reg:SI VL_REGNUM)
+	      (reg:SI VTYPE_REGNUM)
+	    ] UNSPEC_VPREDICATE
+	  )
+	  (any_reduc:VDI
+	    (vec_duplicate:VDI
+	      (vec_select:<VEL>
+		(match_operand:VDI_LMUL1 4 "register_operand"      "   vr,   vr")
+		(parallel [(const_int 0)])
+	      )
+	    )
+	    (match_operand:VDI           3 "register_operand"      "   vr,   vr")
+	  )
+	  (match_operand:VDI_LMUL1       2 "vector_merge_operand"  "   vu,    0")
+	] UNSPEC_REDUC
+      )
+    )
+  ]
+  "TARGET_VECTOR"
+  "vred<reduc>.vs\t%0,%3,%4%p1"
+  [
+    (set_attr "type" "vired")
+    (set_attr "mode" "<VDI:MODE>")
+  ]
+)
 
 (define_insn "@pred_widen_reduc_plus<v_su><mode><vwlmul1>"
   [(set (match_operand:<VWLMUL1> 0 "register_operand"           "=&vr,  &vr")
diff --git a/gcc/machmode.h b/gcc/machmode.h
index a22df60dc20..8ecfc2a656e 100644
--- a/gcc/machmode.h
+++ b/gcc/machmode.h
@@ -134,6 +134,10 @@ extern const unsigned char mode_class[NUM_MACHINE_MODES];
    || GET_MODE_CLASS (MODE) == MODE_VECTOR_ACCUM	\
    || GET_MODE_CLASS (MODE) == MODE_VECTOR_UACCUM)
 
+/* Nonzero if MODE is a vector float mode.  */
+#define VECTOR_FLOAT_MODE_P(MODE)			\
+  (GET_MODE_CLASS (MODE) == MODE_VECTOR_FLOAT)		\
+
 /* Nonzero if MODE is a scalar integral mode.  */
 #define SCALAR_INT_MODE_P(MODE)			\
   (GET_MODE_CLASS (MODE) == MODE_INT		\
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.c
new file mode 100644
index 00000000000..2e4aeb5b90b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gc_zve32f -mabi=ilp32f -O3 -Wno-psabi" } */
+
+#include "pr110265-1.h"
+
+/* { dg-final { scan-assembler-times {vredand\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredmax\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredmaxu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredmin\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredminu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredsum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredxor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.h b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.h
new file mode 100644
index 00000000000..ade44cc27ea
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.h
@@ -0,0 +1,65 @@
+#include "riscv_vector.h"
+
+vint8m1_t test_vredand_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredand_vs_i8mf4_i8m1(vector, scalar, vl);
+}
+
+vuint32m1_t test_vredand_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) {
+  return __riscv_vredand_vs_u32m8_u32m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredmax_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredmax_vs_i8mf4_i8m1(vector, scalar, vl);
+}
+
+vint32m1_t test_vredmax_vs_i32m8_i32m1(vint32m8_t vector, vint32m1_t scalar, size_t vl) {
+  return __riscv_vredmax_vs_i32m8_i32m1(vector, scalar, vl);
+}
+
+vuint8m1_t test_vredmaxu_vs_u8mf4_u8m1(vuint8mf4_t vector, vuint8m1_t scalar, size_t vl) {
+  return __riscv_vredmaxu_vs_u8mf4_u8m1(vector, scalar, vl);
+}
+
+vuint32m1_t test_vredmaxu_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) {
+  return __riscv_vredmaxu_vs_u32m8_u32m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredmin_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredmin_vs_i8mf4_i8m1(vector, scalar, vl);
+}
+
+vint32m1_t test_vredmin_vs_i32m8_i32m1(vint32m8_t vector, vint32m1_t scalar, size_t vl) {
+  return __riscv_vredmin_vs_i32m8_i32m1(vector, scalar, vl);
+}
+
+vuint8m1_t test_vredminu_vs_u8mf4_u8m1(vuint8mf4_t vector, vuint8m1_t scalar, size_t vl) {
+  return __riscv_vredminu_vs_u8mf4_u8m1(vector, scalar, vl);
+}
+
+vuint32m1_t test_vredminu_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) {
+  return __riscv_vredminu_vs_u32m8_u32m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredor_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredor_vs_i8mf4_i8m1(vector, scalar, vl);
+}
+
+vuint32m1_t test_vredor_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) {
+  return __riscv_vredor_vs_u32m8_u32m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredsum_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredsum_vs_i8mf4_i8m1(vector, scalar, vl);
+}
+
+vuint32m1_t test_vredsum_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) {
+  return __riscv_vredsum_vs_u32m8_u32m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredxor_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredxor_vs_i8mf4_i8m1(vector, scalar, vl);
+}
+
+vuint32m1_t test_vredxor_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) {
+  return __riscv_vredxor_vs_u32m8_u32m1(vector, scalar, vl);
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.c
new file mode 100644
index 00000000000..7454c1cc918
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gc_zve64d -mabi=ilp32d -O3 -Wno-psabi" } */
+
+#include "pr110265-1.h"
+#include "pr110265-2.h"
+
+/* { dg-final { scan-assembler-times {vredand\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredmax\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 3 } } */
+/* { dg-final { scan-assembler-times {vredmaxu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredmin\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 3 } } */
+/* { dg-final { scan-assembler-times {vredminu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredsum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredxor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.h b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.h
new file mode 100644
index 00000000000..6a7e14e51f8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.h
@@ -0,0 +1,57 @@
+#include "riscv_vector.h"
+
+vint8m1_t test_vredand_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredand_vs_i8mf8_i8m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredmax_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredmax_vs_i8mf8_i8m1(vector, scalar, vl);
+}
+
+vuint8m1_t test_vredmaxu_vs_u8mf8_u8m1(vuint8mf8_t vector, vuint8m1_t scalar, size_t vl) {
+  return __riscv_vredmaxu_vs_u8mf8_u8m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredmin_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredmin_vs_i8mf8_i8m1(vector, scalar, vl);
+}
+
+vuint8m1_t test_vredminu_vs_u8mf8_u8m1(vuint8mf8_t vector, vuint8m1_t scalar, size_t vl) {
+  return __riscv_vredminu_vs_u8mf8_u8m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredor_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredor_vs_i8mf8_i8m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredsum_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredsum_vs_i8mf8_i8m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredxor_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredxor_vs_i8mf8_i8m1(vector, scalar, vl);
+}
+
+vuint64m1_t test_vredand_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) {
+  return __riscv_vredand_vs_u64m8_u64m1(vector, scalar, vl);
+}
+
+vuint64m1_t test_vredmaxu_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) {
+  return __riscv_vredmaxu_vs_u64m8_u64m1(vector, scalar, vl);
+}
+
+vuint64m1_t test_vredminu_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) {
+  return __riscv_vredminu_vs_u64m8_u64m1(vector, scalar, vl);
+}
+
+vuint64m1_t test_vredor_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) {
+  return __riscv_vredor_vs_u64m8_u64m1(vector, scalar, vl);
+}
+
+vuint64m1_t test_vredsum_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) {
+  return __riscv_vredsum_vs_u64m8_u64m1(vector, scalar, vl);
+}
+
+vuint64m1_t test_vredxor_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) {
+  return __riscv_vredxor_vs_u64m8_u64m1(vector, scalar, vl);
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-3.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-3.c
new file mode 100644
index 00000000000..0ed1fbae35a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-3.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gc_zve64f -mabi=ilp32f -O3 -Wno-psabi" } */
+
+#include "pr110265-1.h"
+#include "pr110265-2.h"
+
+/* { dg-final { scan-assembler-times {vredand\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredmax\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 3 } } */
+/* { dg-final { scan-assembler-times {vredmaxu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredmin\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 3 } } */
+/* { dg-final { scan-assembler-times {vredminu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredsum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredxor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
-- 
2.34.1



* Re: [PATCH v1] RISC-V: Bugfix for RVV integer reduction in ZVE32/64.
  2023-06-16  7:28 [PATCH v1] RISC-V: Bugfix for RVV integer reduction in ZVE32/64 pan2.li
@ 2023-06-16  7:47 ` juzhe.zhong
  2023-06-16  7:56   ` Li, Pan2
  2023-06-16  8:09 ` [PATCH v2] " pan2.li
  1 sibling, 1 reply; 8+ messages in thread
From: juzhe.zhong @ 2023-06-16  7:47 UTC (permalink / raw)
  To: pan2.li, gcc-patches
  Cc: Robin Dapp, jeffreyalaw, pan2.li, yanzhang.wang, kito.cheng


+/* Nonzero if MODE is a vector float mode.  */
+#define VECTOR_FLOAT_MODE_P(MODE)			\
+  (GET_MODE_CLASS (MODE) == MODE_VECTOR_FLOAT)	
Why did you add this?

Remove it. Otherwise, LGTM.


juzhe.zhong@rivai.ai
 

* RE: [PATCH v1] RISC-V: Bugfix for RVV integer reduction in ZVE32/64.
  2023-06-16  7:47 ` juzhe.zhong
@ 2023-06-16  7:56   ` Li, Pan2
  0 siblings, 0 replies; 8+ messages in thread
From: Li, Pan2 @ 2023-06-16  7:56 UTC (permalink / raw)
  To: juzhe.zhong, gcc-patches
  Cc: Robin Dapp, jeffreyalaw, Wang, Yanzhang, kito.cheng


VECTOR_FLOAT_MODE_P is referenced from the expand function; I will remove it shortly.
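
A minimal sketch of the inline check once the macro is gone (same logic as
the posted expand hook, just without VECTOR_FLOAT_MODE_P):

    /* TODO: use ret_mode directly after all types of PR110265 are addressed.  */
    if (GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT
        || GET_MODE_INNER (mode) != GET_MODE_INNER (ret_mode))
      return e.use_exact_insn (
        code_for_pred_reduc (CODE, e.vector_mode (), e.vector_mode ()));
    else
      return e.use_exact_insn (
        code_for_pred_reduc (CODE, e.vector_mode (), e.ret_mode ()));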

Pan

From: juzhe.zhong@rivai.ai <juzhe.zhong@rivai.ai>
Sent: Friday, June 16, 2023 3:48 PM
To: Li, Pan2 <pan2.li@intel.com>; gcc-patches <gcc-patches@gcc.gnu.org>
Cc: Robin Dapp <rdapp.gcc@gmail.com>; jeffreyalaw <jeffreyalaw@gmail.com>; Li, Pan2 <pan2.li@intel.com>; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng <kito.cheng@gmail.com>
Subject: Re: [PATCH v1] RISC-V: Bugfix for RVV integer reduction in ZVE32/64.


+/* Nonzero if MODE is a vector float mode.  */
+#define VECTOR_FLOAT_MODE_P(MODE)			\
+  (GET_MODE_CLASS (MODE) == MODE_VECTOR_FLOAT)
Why did you add this?

Remove it. Otherwise, LGTM.
________________________________
juzhe.zhong@rivai.ai<mailto:juzhe.zhong@rivai.ai>

From: pan2.li<mailto:pan2.li@intel.com>
Date: 2023-06-16 15:28
To: gcc-patches<mailto:gcc-patches@gcc.gnu.org>
CC: juzhe.zhong<mailto:juzhe.zhong@rivai.ai>; rdapp.gcc<mailto:rdapp.gcc@gmail.com>; jeffreyalaw<mailto:jeffreyalaw@gmail.com>; pan2.li<mailto:pan2.li@intel.com>; yanzhang.wang<mailto:yanzhang.wang@intel.com>; kito.cheng<mailto:kito.cheng@gmail.com>
Subject: [PATCH v1] RISC-V: Bugfix for RVV integer reduction in ZVE32/64.
From: Pan Li <pan2.li@intel.com<mailto:pan2.li@intel.com>>

The rvv integer reduction has 3 different patterns for zve128+, zve64
and zve32. They take the same iterator with different attributions.
However, we need the generated function code_for_reduc (code, mode1, mode2).
The implementation of code_for_reduc may look like below.

code_for_reduc (code, mode1, mode2)
{
  if (code == max && mode1 == VNx1QI && mode2 == VNx1QI)
    return CODE_FOR_pred_reduc_maxvnx1qivnx16qi; // ZVE128+

  if (code == max && mode1 == VNx1QI && mode2 == VNx1QI)
    return CODE_FOR_pred_reduc_maxvnx1qivnx8qi;  // ZVE64

  if (code == max && mode1 == VNx1QI && mode2 == VNx1QI)
    return CODE_FOR_pred_reduc_maxvnx1qivnx4qi;  // ZVE32
}

Thus there will be a problem here. For example zve32, we will have
code_for_reduc (max, VNx1QI, VNx1QI) which will return the code of
the ZVE128+ instead of the ZVE32 logically.

This patch will merge the 3 patterns into one pattern, and pass both the
input_vector and the ret_vector of code_for_reduc. For example, ZVE32 will be
code_for_reduc (max, VNx1Q1, VNx4QI), then the correct code of ZVE32
will be returned as expectation.

Signed-off-by: Pan Li <pan2.li@intel.com<mailto:pan2.li@intel.com>>
Co-Authored by: Juzhe-Zhong <juzhe.zhong@rivai.ai<mailto:juzhe.zhong@rivai.ai>>

PR 110265

gcc/ChangeLog:
PR target/110265
* config/riscv/riscv-vector-builtins-bases.cc: Add ret_mode for
integer reduction expand.
* config/riscv/vector-iterators.md: Add VQI, VHI, VSI and VDI,
and the LMUL1 attr respectively.
* config/riscv/vector.md.
(@pred_reduc_<reduc><mode><vlmul1>): Removed.
(@pred_reduc_<reduc><mode><vlmul1_zve64>): Likewise.
(@pred_reduc_<reduc><mode><vlmul1_zve32>): Likewise.
(@pred_reduc_<reduc><VQI:mode><VQI_LMUL1:mode>): New pattern.
(@pred_reduc_<reduc><VHI:mode><VHI_LMUL1:mode>): Likewise.
(@pred_reduc_<reduc><VSI:mode><VSI_LMUL1:mode>): Likewise.
(@pred_reduc_<reduc><VDI:mode><VDI_LMUL1:mode>): Likewise.
* machmode.h (VECTOR_FLOAT_MODE_P): New macro.

gcc/testsuite/ChangeLog:
PR target/110265
* gcc.target/riscv/rvv/base/pr110265-1.c: New test.
* gcc.target/riscv/rvv/base/pr110265-1.h: New test.
* gcc.target/riscv/rvv/base/pr110265-2.c: New test.
* gcc.target/riscv/rvv/base/pr110265-2.h: New test.
* gcc.target/riscv/rvv/base/pr110265-3.c: New test.
---
.../riscv/riscv-vector-builtins-bases.cc      |  13 +-
gcc/config/riscv/vector-iterators.md          |  61 +++++
gcc/config/riscv/vector.md                    | 208 +++++++++++++-----
gcc/machmode.h                                |   4 +
.../gcc.target/riscv/rvv/base/pr110265-1.c    |  13 ++
.../gcc.target/riscv/rvv/base/pr110265-1.h    |  65 ++++++
.../gcc.target/riscv/rvv/base/pr110265-2.c    |  14 ++
.../gcc.target/riscv/rvv/base/pr110265-2.h    |  57 +++++
.../gcc.target/riscv/rvv/base/pr110265-3.c    |  14 ++
9 files changed, 389 insertions(+), 60 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.h
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.h
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-3.c

diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index 87a684dd127..a77933d60d5 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -1396,8 +1396,17 @@ public:
   rtx expand (function_expander &e) const override
   {
-    return e.use_exact_insn (
-      code_for_pred_reduc (CODE, e.vector_mode (), e.vector_mode ()));
+    machine_mode mode = e.vector_mode ();
+    machine_mode ret_mode = e.ret_mode ();
+
+    /* TODO: we will use ret_mode after all types of PR110265 are addressed.  */
+    if (VECTOR_FLOAT_MODE_P (mode)
+       || GET_MODE_INNER (mode) != GET_MODE_INNER (ret_mode))
+      return e.use_exact_insn (
+ code_for_pred_reduc (CODE, e.vector_mode (), e.vector_mode ()));
+    else
+      return e.use_exact_insn (
+ code_for_pred_reduc (CODE, e.vector_mode (), e.ret_mode ()));
   }
};
diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
index 8c71c9e22cc..e2c8ade98eb 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -929,6 +929,67 @@ (define_mode_iterator V64T [
   (VNx2x64QI "TARGET_MIN_VLEN >= 128")
])
+(define_mode_iterator VQI [
+  (VNx1QI "TARGET_MIN_VLEN < 128")
+  VNx2QI
+  VNx4QI
+  VNx8QI
+  VNx16QI
+  VNx32QI
+  (VNx64QI "TARGET_MIN_VLEN > 32")
+  (VNx128QI "TARGET_MIN_VLEN >= 128")
+])
+
+(define_mode_iterator VHI [
+  (VNx1HI "TARGET_MIN_VLEN < 128")
+  VNx2HI
+  VNx4HI
+  VNx8HI
+  VNx16HI
+  (VNx32HI "TARGET_MIN_VLEN > 32")
+  (VNx64HI "TARGET_MIN_VLEN >= 128")
+])
+
+(define_mode_iterator VSI [
+  (VNx1SI "TARGET_MIN_VLEN < 128")
+  VNx2SI
+  VNx4SI
+  VNx8SI
+  (VNx16SI "TARGET_MIN_VLEN > 32")
+  (VNx32SI "TARGET_MIN_VLEN >= 128")
+])
+
+(define_mode_iterator VDI [
+  (VNx1DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN < 128")
+  (VNx2DI "TARGET_VECTOR_ELEN_64")
+  (VNx4DI "TARGET_VECTOR_ELEN_64")
+  (VNx8DI "TARGET_VECTOR_ELEN_64")
+  (VNx16DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128")
+])
+
+(define_mode_iterator VQI_LMUL1 [
+  (VNx16QI "TARGET_MIN_VLEN >= 128")
+  (VNx8QI "TARGET_MIN_VLEN == 64")
+  (VNx4QI "TARGET_MIN_VLEN == 32")
+])
+
+(define_mode_iterator VHI_LMUL1 [
+  (VNx8HI "TARGET_MIN_VLEN >= 128")
+  (VNx4HI "TARGET_MIN_VLEN == 64")
+  (VNx2HI "TARGET_MIN_VLEN == 32")
+])
+
+(define_mode_iterator VSI_LMUL1 [
+  (VNx4SI "TARGET_MIN_VLEN >= 128")
+  (VNx2SI "TARGET_MIN_VLEN == 64")
+  (VNx1SI "TARGET_MIN_VLEN == 32")
+])
+
+(define_mode_iterator VDI_LMUL1 [
+  (VNx2DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128")
+  (VNx1DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN == 64")
+])
+
 (define_mode_attr VLMULX2 [
   (VNx1QI "VNx2QI") (VNx2QI "VNx4QI") (VNx4QI "VNx8QI") (VNx8QI "VNx16QI") (VNx16QI "VNx32QI") (VNx32QI "VNx64QI") (VNx64QI "VNx128QI")
   (VNx1HI "VNx2HI") (VNx2HI "VNx4HI") (VNx4HI "VNx8HI") (VNx8HI "VNx16HI") (VNx16HI "VNx32HI") (VNx32HI "VNx64HI")
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index 1d1847bd85a..d396e278503 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -7244,76 +7244,168 @@ (define_insn "@pred_rod_trunc<mode>"
 ;; -------------------------------------------------------------------------------
 
 ;; For reduction operations, we should have seperate patterns for
-;; TARGET_MIN_VLEN == 32 and TARGET_MIN_VLEN > 32.
+;; different types. For each type, we will cover MIN_VLEN == 32, MIN_VLEN == 64
+;; and the MIN_VLEN >= 128 from the well defined iterators.
 ;; Since reduction need LMUL = 1 scalar operand as the input operand
 ;; and they are different.
 ;; For example, The LMUL = 1 corresponding mode of VNx16QImode is VNx4QImode
 ;; for -march=rv*zve32* wheras VNx8QImode for -march=rv*zve64*
-(define_insn "@pred_reduc_<reduc><mode><vlmul1>"
-  [(set (match_operand:<VLMUL1> 0 "register_operand"            "=vr,   vr")
-	(unspec:<VLMUL1>
-	  [(unspec:<VM>
-	     [(match_operand:<VM> 1 "vector_mask_operand"     "vmWc1,vmWc1")
-	      (match_operand 5 "vector_length_operand"        "   rK,   rK")
-	      (match_operand 6 "const_int_operand"            "    i,    i")
-	      (match_operand 7 "const_int_operand"            "    i,    i")
+
+;; Integer Reduction for QI
+(define_insn "@pred_reduc_<reduc><VQI:mode><VQI_LMUL1:mode>"
+  [
+    (set
+      (match_operand:VQI_LMUL1           0 "register_operand"      "=vr,     vr")
+      (unspec:VQI_LMUL1
+	[
+	  (unspec:<VQI:VM>
+	    [
+	      (match_operand:<VQI:VM>    1 "vector_mask_operand"   "vmWc1,vmWc1")
+	      (match_operand             5 "vector_length_operand" "   rK,   rK")
+	      (match_operand             6 "const_int_operand"     "    i,    i")
+	      (match_operand             7 "const_int_operand"     "    i,    i")
 	      (reg:SI VL_REGNUM)
-	      (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
-	   (any_reduc:VI
-	     (vec_duplicate:VI
-	       (vec_select:<VEL>
-	         (match_operand:<VLMUL1> 4 "register_operand" "   vr,   vr")
-	         (parallel [(const_int 0)])))
-	     (match_operand:VI 3 "register_operand"           "   vr,   vr"))
-	   (match_operand:<VLMUL1> 2 "vector_merge_operand"   "   vu,    0")] UNSPEC_REDUC))]
-  "TARGET_VECTOR && TARGET_MIN_VLEN >= 128"
+	      (reg:SI VTYPE_REGNUM)
+	    ] UNSPEC_VPREDICATE
+	  )
+	  (any_reduc:VQI
+	    (vec_duplicate:VQI
+	      (vec_select:<VEL>
+		(match_operand:VQI_LMUL1 4 "register_operand"      "   vr,   vr")
+		(parallel [(const_int 0)])
+	      )
+	    )
+	    (match_operand:VQI           3 "register_operand"      "   vr,   vr")
+	  )
+	  (match_operand:VQI_LMUL1       2 "vector_merge_operand"  "   vu,    0")
+	] UNSPEC_REDUC
+      )
+    )
+  ]
+  "TARGET_VECTOR"
   "vred<reduc>.vs\t%0,%3,%4%p1"
-  [(set_attr "type" "vired")
-   (set_attr "mode" "<MODE>")])
+  [
+    (set_attr "type" "vired")
+    (set_attr "mode" "<VQI:MODE>")
+  ]
+)
 
-(define_insn "@pred_reduc_<reduc><mode><vlmul1_zve64>"
-  [(set (match_operand:<VLMUL1_ZVE64> 0 "register_operand"            "=vr,   vr")
-	(unspec:<VLMUL1_ZVE64>
-	  [(unspec:<VM>
-	     [(match_operand:<VM> 1 "vector_mask_operand"     "vmWc1,vmWc1")
-	      (match_operand 5 "vector_length_operand"        "   rK,   rK")
-	      (match_operand 6 "const_int_operand"            "    i,    i")
-	      (match_operand 7 "const_int_operand"            "    i,    i")
+;; Integer Reduction for HI
+(define_insn "@pred_reduc_<reduc><VHI:mode><VHI_LMUL1:mode>"
+  [
+    (set
+      (match_operand:VHI_LMUL1           0 "register_operand"      "=vr,     vr")
+      (unspec:VHI_LMUL1
+	[
+	  (unspec:<VHI:VM>
+	    [
+	      (match_operand:<VHI:VM>    1 "vector_mask_operand"   "vmWc1,vmWc1")
+	      (match_operand             5 "vector_length_operand" "   rK,   rK")
+	      (match_operand             6 "const_int_operand"     "    i,    i")
+	      (match_operand             7 "const_int_operand"     "    i,    i")
 	      (reg:SI VL_REGNUM)
-	      (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
-	   (any_reduc:VI_ZVE64
-	     (vec_duplicate:VI_ZVE64
-	       (vec_select:<VEL>
-	         (match_operand:<VLMUL1_ZVE64> 4 "register_operand" "   vr,   vr")
-	         (parallel [(const_int 0)])))
-	     (match_operand:VI_ZVE64 3 "register_operand"           "   vr,   vr"))
-	   (match_operand:<VLMUL1_ZVE64> 2 "vector_merge_operand"   "   vu,    0")] UNSPEC_REDUC))]
-  "TARGET_VECTOR && TARGET_MIN_VLEN == 64"
+	      (reg:SI VTYPE_REGNUM)
+	    ] UNSPEC_VPREDICATE
+	  )
+	  (any_reduc:VHI
+	    (vec_duplicate:VHI
+	      (vec_select:<VEL>
+		(match_operand:VHI_LMUL1 4 "register_operand"      "   vr,   vr")
+		(parallel [(const_int 0)])
+	      )
+	    )
+	    (match_operand:VHI           3 "register_operand"      "   vr,   vr")
+	  )
+	  (match_operand:VHI_LMUL1       2 "vector_merge_operand"  "   vu,    0")
+	] UNSPEC_REDUC
+      )
+    )
+  ]
+  "TARGET_VECTOR"
   "vred<reduc>.vs\t%0,%3,%4%p1"
-  [(set_attr "type" "vired")
-   (set_attr "mode" "<MODE>")])
+  [
+    (set_attr "type" "vired")
+    (set_attr "mode" "<VHI:MODE>")
+  ]
+)
 
-(define_insn "@pred_reduc_<reduc><mode><vlmul1_zve32>"
-  [(set (match_operand:<VLMUL1_ZVE32> 0 "register_operand"          "=vd, vd, vr, vr")
-	(unspec:<VLMUL1_ZVE32>
-	  [(unspec:<VM>
-	     [(match_operand:<VM> 1 "vector_mask_operand"           " vm, vm,Wc1,Wc1")
-	      (match_operand 5 "vector_length_operand"              " rK, rK, rK, rK")
-	      (match_operand 6 "const_int_operand"                  "  i,  i,  i,  i")
-	      (match_operand 7 "const_int_operand"                  "  i,  i,  i,  i")
+;; Integer Reduction for SI
+(define_insn "@pred_reduc_<reduc><VSI:mode><VSI_LMUL1:mode>"
+  [
+    (set
+      (match_operand:VSI_LMUL1           0 "register_operand"      "=vr,     vr")
+      (unspec:VSI_LMUL1
+	[
+	  (unspec:<VSI:VM>
+	    [
+	      (match_operand:<VSI:VM>    1 "vector_mask_operand"   "vmWc1,vmWc1")
+	      (match_operand             5 "vector_length_operand" "   rK,   rK")
+	      (match_operand             6 "const_int_operand"     "    i,    i")
+	      (match_operand             7 "const_int_operand"     "    i,    i")
 	      (reg:SI VL_REGNUM)
-	      (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
-	   (any_reduc:VI_ZVE32
-	     (vec_duplicate:VI_ZVE32
-	       (vec_select:<VEL>
-	         (match_operand:<VLMUL1_ZVE32> 4 "register_operand" " vr, vr, vr, vr")
-	         (parallel [(const_int 0)])))
-	     (match_operand:VI_ZVE32 3 "register_operand"           " vr, vr, vr, vr"))
-	   (match_operand:<VLMUL1_ZVE32> 2 "vector_merge_operand"   " vu,  0, vu,  0")] UNSPEC_REDUC))]
-  "TARGET_VECTOR && TARGET_MIN_VLEN == 32"
+	      (reg:SI VTYPE_REGNUM)
+	    ] UNSPEC_VPREDICATE
+	  )
+	  (any_reduc:VSI
+	    (vec_duplicate:VSI
+	      (vec_select:<VEL>
+		(match_operand:VSI_LMUL1 4 "register_operand"      "   vr,   vr")
+		(parallel [(const_int 0)])
+	      )
+	    )
+	    (match_operand:VSI           3 "register_operand"      "   vr,   vr")
+	  )
+	  (match_operand:VSI_LMUL1       2 "vector_merge_operand"  "   vu,    0")
+	] UNSPEC_REDUC
+      )
+    )
+  ]
+  "TARGET_VECTOR"
   "vred<reduc>.vs\t%0,%3,%4%p1"
-  [(set_attr "type" "vired")
-   (set_attr "mode" "<MODE>")])
+  [
+    (set_attr "type" "vired")
+    (set_attr "mode" "<VSI:MODE>")
+  ]
+)
+
+;; Integer Reduction for DI
+(define_insn "@pred_reduc_<reduc><VDI:mode><VDI_LMUL1:mode>"
+  [
+    (set
+      (match_operand:VDI_LMUL1           0 "register_operand"      "=vr,     vr")
+      (unspec:VDI_LMUL1
+	[
+	  (unspec:<VDI:VM>
+	    [
+	      (match_operand:<VDI:VM>    1 "vector_mask_operand"   "vmWc1,vmWc1")
+	      (match_operand             5 "vector_length_operand" "   rK,   rK")
+	      (match_operand             6 "const_int_operand"     "    i,    i")
+	      (match_operand             7 "const_int_operand"     "    i,    i")
+	      (reg:SI VL_REGNUM)
+	      (reg:SI VTYPE_REGNUM)
+	    ] UNSPEC_VPREDICATE
+	  )
+	  (any_reduc:VDI
+	    (vec_duplicate:VDI
+	      (vec_select:<VEL>
+		(match_operand:VDI_LMUL1 4 "register_operand"      "   vr,   vr")
+		(parallel [(const_int 0)])
+	      )
+	    )
+	    (match_operand:VDI           3 "register_operand"      "   vr,   vr")
+	  )
+	  (match_operand:VDI_LMUL1       2 "vector_merge_operand"  "   vu,    0")
+	] UNSPEC_REDUC
+      )
+    )
+  ]
+  "TARGET_VECTOR"
+  "vred<reduc>.vs\t%0,%3,%4%p1"
+  [
+    (set_attr "type" "vired")
+    (set_attr "mode" "<VDI:MODE>")
+  ]
+)
 
 (define_insn "@pred_widen_reduc_plus<v_su><mode><vwlmul1>"
   [(set (match_operand:<VWLMUL1> 0 "register_operand"           "=&vr,  &vr")
diff --git a/gcc/machmode.h b/gcc/machmode.h
index a22df60dc20..8ecfc2a656e 100644
--- a/gcc/machmode.h
+++ b/gcc/machmode.h
@@ -134,6 +134,10 @@ extern const unsigned char mode_class[NUM_MACHINE_MODES];
    || GET_MODE_CLASS (MODE) == MODE_VECTOR_ACCUM \
    || GET_MODE_CLASS (MODE) == MODE_VECTOR_UACCUM)
 
+/* Nonzero if MODE is a vector float mode.  */
+#define VECTOR_FLOAT_MODE_P(MODE) \
+  (GET_MODE_CLASS (MODE) == MODE_VECTOR_FLOAT)
+
 /* Nonzero if MODE is a scalar integral mode.  */
 #define SCALAR_INT_MODE_P(MODE) \
   (GET_MODE_CLASS (MODE) == MODE_INT \
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.c
new file mode 100644
index 00000000000..2e4aeb5b90b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gc_zve32f -mabi=ilp32f -O3 -Wno-psabi" } */
+
+#include "pr110265-1.h"
+
+/* { dg-final { scan-assembler-times {vredand\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredmax\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredmaxu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredmin\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredminu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredsum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredxor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.h b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.h
new file mode 100644
index 00000000000..ade44cc27ea
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.h
@@ -0,0 +1,65 @@
+#include "riscv_vector.h"
+
+vint8m1_t test_vredand_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredand_vs_i8mf4_i8m1(vector, scalar, vl);
+}
+
+vuint32m1_t test_vredand_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) {
+  return __riscv_vredand_vs_u32m8_u32m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredmax_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredmax_vs_i8mf4_i8m1(vector, scalar, vl);
+}
+
+vint32m1_t test_vredmax_vs_i32m8_i32m1(vint32m8_t vector, vint32m1_t scalar, size_t vl) {
+  return __riscv_vredmax_vs_i32m8_i32m1(vector, scalar, vl);
+}
+
+vuint8m1_t test_vredmaxu_vs_u8mf4_u8m1(vuint8mf4_t vector, vuint8m1_t scalar, size_t vl) {
+  return __riscv_vredmaxu_vs_u8mf4_u8m1(vector, scalar, vl);
+}
+
+vuint32m1_t test_vredmaxu_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) {
+  return __riscv_vredmaxu_vs_u32m8_u32m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredmin_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredmin_vs_i8mf4_i8m1(vector, scalar, vl);
+}
+
+vint32m1_t test_vredmin_vs_i32m8_i32m1(vint32m8_t vector, vint32m1_t scalar, size_t vl) {
+  return __riscv_vredmin_vs_i32m8_i32m1(vector, scalar, vl);
+}
+
+vuint8m1_t test_vredminu_vs_u8mf4_u8m1(vuint8mf4_t vector, vuint8m1_t scalar, size_t vl) {
+  return __riscv_vredminu_vs_u8mf4_u8m1(vector, scalar, vl);
+}
+
+vuint32m1_t test_vredminu_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) {
+  return __riscv_vredminu_vs_u32m8_u32m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredor_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredor_vs_i8mf4_i8m1(vector, scalar, vl);
+}
+
+vuint32m1_t test_vredor_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) {
+  return __riscv_vredor_vs_u32m8_u32m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredsum_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredsum_vs_i8mf4_i8m1(vector, scalar, vl);
+}
+
+vuint32m1_t test_vredsum_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) {
+  return __riscv_vredsum_vs_u32m8_u32m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredxor_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredxor_vs_i8mf4_i8m1(vector, scalar, vl);
+}
+
+vuint32m1_t test_vredxor_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) {
+  return __riscv_vredxor_vs_u32m8_u32m1(vector, scalar, vl);
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.c
new file mode 100644
index 00000000000..7454c1cc918
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gc_zve64d -mabi=ilp32d -O3 -Wno-psabi" } */
+
+#include "pr110265-1.h"
+#include "pr110265-2.h"
+
+/* { dg-final { scan-assembler-times {vredand\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredmax\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 3 } } */
+/* { dg-final { scan-assembler-times {vredmaxu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredmin\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 3 } } */
+/* { dg-final { scan-assembler-times {vredminu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredsum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredxor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.h b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.h
new file mode 100644
index 00000000000..6a7e14e51f8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.h
@@ -0,0 +1,57 @@
+#include "riscv_vector.h"
+
+vint8m1_t test_vredand_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredand_vs_i8mf8_i8m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredmax_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredmax_vs_i8mf8_i8m1(vector, scalar, vl);
+}
+
+vuint8m1_t test_vredmaxu_vs_u8mf8_u8m1(vuint8mf8_t vector, vuint8m1_t scalar, size_t vl) {
+  return __riscv_vredmaxu_vs_u8mf8_u8m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredmin_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredmin_vs_i8mf8_i8m1(vector, scalar, vl);
+}
+
+vuint8m1_t test_vredminu_vs_u8mf8_u8m1(vuint8mf8_t vector, vuint8m1_t scalar, size_t vl) {
+  return __riscv_vredminu_vs_u8mf8_u8m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredor_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredor_vs_i8mf8_i8m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredsum_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredsum_vs_i8mf8_i8m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredxor_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredxor_vs_i8mf8_i8m1(vector, scalar, vl);
+}
+
+vuint64m1_t test_vredand_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) {
+  return __riscv_vredand_vs_u64m8_u64m1(vector, scalar, vl);
+}
+
+vuint64m1_t test_vredmaxu_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) {
+  return __riscv_vredmaxu_vs_u64m8_u64m1(vector, scalar, vl);
+}
+
+vuint64m1_t test_vredminu_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) {
+  return __riscv_vredminu_vs_u64m8_u64m1(vector, scalar, vl);
+}
+
+vuint64m1_t test_vredor_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) {
+  return __riscv_vredor_vs_u64m8_u64m1(vector, scalar, vl);
+}
+
+vuint64m1_t test_vredsum_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) {
+  return __riscv_vredsum_vs_u64m8_u64m1(vector, scalar, vl);
+}
+
+vuint64m1_t test_vredxor_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) {
+  return __riscv_vredxor_vs_u64m8_u64m1(vector, scalar, vl);
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-3.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-3.c
new file mode 100644
index 00000000000..0ed1fbae35a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-3.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gc_zve64f -mabi=ilp32f -O3 -Wno-psabi" } */
+
+#include "pr110265-1.h"
+#include "pr110265-2.h"
+
+/* { dg-final { scan-assembler-times {vredand\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredmax\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 3 } } */
+/* { dg-final { scan-assembler-times {vredmaxu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredmin\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 3 } } */
+/* { dg-final { scan-assembler-times {vredminu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredsum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredxor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
--
2.34.1



^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2] RISC-V: Bugfix for RVV integer reduction in ZVE32/64.
  2023-06-16  7:28 [PATCH v1] RISC-V: Bugfix for RVV integer reduction in ZVE32/64 pan2.li
  2023-06-16  7:47 ` juzhe.zhong
@ 2023-06-16  8:09 ` pan2.li
  2023-06-16  8:10   ` juzhe.zhong
  1 sibling, 1 reply; 8+ messages in thread
From: pan2.li @ 2023-06-16  8:09 UTC (permalink / raw)
  To: gcc-patches
  Cc: juzhe.zhong, rdapp.gcc, jeffreyalaw, pan2.li, yanzhang.wang, kito.cheng

From: Pan Li <pan2.li@intel.com>

The RVV integer reduction has 3 different patterns for ZVE128+, ZVE64
and ZVE32. They take the same iterator with different attributes.
However, we rely on the generated function code_for_reduc (code, mode1, mode2).
The implementation of code_for_reduc may look like below.

code_for_reduc (code, mode1, mode2)
{
  if (code == max && mode1 == VNx1QI && mode2 == VNx1QI)
    return CODE_FOR_pred_reduc_maxvnx1qivnx16qi; // ZVE128+

  if (code == max && mode1 == VNx1QI && mode2 == VNx1QI)
    return CODE_FOR_pred_reduc_maxvnx1qivnx8qi;  // ZVE64

  if (code == max && mode1 == VNx1QI && mode2 == VNx1QI)
    return CODE_FOR_pred_reduc_maxvnx1qivnx4qi;  // ZVE32
}

Thus there is a problem here. For example, for ZVE32 we will have
code_for_reduc (max, VNx1QI, VNx1QI), which returns the insn code of
ZVE128+ instead of the logically correct ZVE32 one.

This patch merges the 3 patterns into one pattern per element type, and
passes both the input vector mode and the result vector mode to
code_for_reduc. For example, for ZVE32 we will have
code_for_reduc (max, VNx1QI, VNx4QI), and the correct insn code of ZVE32
will be returned as expected.
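
For reference, here is a sketch of the dispatch we expect after the merge.
The generated CODE_FOR_* names below are illustrative only; they follow the
VQI_LMUL1 iterator in this patch, where the LMUL1 QImode modes are VNx16QI
(MIN_VLEN >= 128), VNx8QI (MIN_VLEN == 64) and VNx4QI (MIN_VLEN == 32).

code_for_reduc (code, mode1, mode2)
{
  if (code == max && mode1 == VNx1QI && mode2 == VNx16QI)
    return CODE_FOR_pred_reduc_maxvnx1qivnx16qi; // ZVE128+

  if (code == max && mode1 == VNx1QI && mode2 == VNx8QI)
    return CODE_FOR_pred_reduc_maxvnx1qivnx8qi;  // ZVE64

  if (code == max && mode1 == VNx1QI && mode2 == VNx4QI)
    return CODE_FOR_pred_reduc_maxvnx1qivnx4qi;  // ZVE32
}

Each (mode1, mode2) pair is now distinct, so the lookup can no longer fall
through to the wrong target.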

Please note both GCC 13 and 14 are impacted by this issue.
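
As a minimal illustration (mirroring the shape of the new tests below, the
function name is arbitrary), the following intrinsic call is the kind that
used to be mis-dispatched when compiled with -march=rv32gc_zve32f:

#include "riscv_vector.h"

/* With zve32, the LMUL = 1 mode of QImode is VNx4QI, so this call must
   select the ZVE32 insn rather than the ZVE128+ one.  */
vint8m1_t foo (vint8mf4_t vector, vint8m1_t scalar, size_t vl) {
  return __riscv_vredsum_vs_i8mf4_i8m1 (vector, scalar, vl);
}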

Signed-off-by: Pan Li <pan2.li@intel.com>
Co-authored-by: Juzhe-Zhong <juzhe.zhong@rivai.ai>

	PR 110265

gcc/ChangeLog:
	PR target/110265
	* config/riscv/riscv-vector-builtins-bases.cc: Add ret_mode for
	integer reduction expand.
	* config/riscv/vector-iterators.md: Add VQI, VHI, VSI and VDI,
	and the LMUL1 attr respectively.
	* config/riscv/vector.md
	(@pred_reduc_<reduc><mode><vlmul1>): Removed.
	(@pred_reduc_<reduc><mode><vlmul1_zve64>): Likewise.
	(@pred_reduc_<reduc><mode><vlmul1_zve32>): Likewise.
	(@pred_reduc_<reduc><VQI:mode><VQI_LMUL1:mode>): New pattern.
	(@pred_reduc_<reduc><VHI:mode><VHI_LMUL1:mode>): Likewise.
	(@pred_reduc_<reduc><VSI:mode><VSI_LMUL1:mode>): Likewise.
	(@pred_reduc_<reduc><VDI:mode><VDI_LMUL1:mode>): Likewise.

gcc/testsuite/ChangeLog:
	PR target/110265
	* gcc.target/riscv/rvv/base/pr110265-1.c: New test.
	* gcc.target/riscv/rvv/base/pr110265-1.h: New test.
	* gcc.target/riscv/rvv/base/pr110265-2.c: New test.
	* gcc.target/riscv/rvv/base/pr110265-2.h: New test.
	* gcc.target/riscv/rvv/base/pr110265-3.c: New test.
---
 .../riscv/riscv-vector-builtins-bases.cc      |  13 +-
 gcc/config/riscv/vector-iterators.md          |  61 +++++
 gcc/config/riscv/vector.md                    | 208 +++++++++++++-----
 .../gcc.target/riscv/rvv/base/pr110265-1.c    |  13 ++
 .../gcc.target/riscv/rvv/base/pr110265-1.h    |  65 ++++++
 .../gcc.target/riscv/rvv/base/pr110265-2.c    |  14 ++
 .../gcc.target/riscv/rvv/base/pr110265-2.h    |  57 +++++
 .../gcc.target/riscv/rvv/base/pr110265-3.c    |  14 ++
 8 files changed, 385 insertions(+), 60 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.h
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.h
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-3.c

diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index 87a684dd127..53bd0ed2534 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -1396,8 +1396,17 @@ public:
 
   rtx expand (function_expander &e) const override
   {
-    return e.use_exact_insn (
-      code_for_pred_reduc (CODE, e.vector_mode (), e.vector_mode ()));
+    machine_mode mode = e.vector_mode ();
+    machine_mode ret_mode = e.ret_mode ();
+
+    /* TODO: we will use ret_mode after all types of PR110265 are addressed.  */
+    if ((GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT)
+       || GET_MODE_INNER (mode) != GET_MODE_INNER (ret_mode))
+      return e.use_exact_insn (
+	code_for_pred_reduc (CODE, e.vector_mode (), e.vector_mode ()));
+    else
+      return e.use_exact_insn (
+	code_for_pred_reduc (CODE, e.vector_mode (), e.ret_mode ()));
   }
 };
 
diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
index 8c71c9e22cc..e2c8ade98eb 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -929,6 +929,67 @@ (define_mode_iterator V64T [
   (VNx2x64QI "TARGET_MIN_VLEN >= 128")
 ])
 
+(define_mode_iterator VQI [
+  (VNx1QI "TARGET_MIN_VLEN < 128")
+  VNx2QI
+  VNx4QI
+  VNx8QI
+  VNx16QI
+  VNx32QI
+  (VNx64QI "TARGET_MIN_VLEN > 32")
+  (VNx128QI "TARGET_MIN_VLEN >= 128")
+])
+
+(define_mode_iterator VHI [
+  (VNx1HI "TARGET_MIN_VLEN < 128")
+  VNx2HI
+  VNx4HI
+  VNx8HI
+  VNx16HI
+  (VNx32HI "TARGET_MIN_VLEN > 32")
+  (VNx64HI "TARGET_MIN_VLEN >= 128")
+])
+
+(define_mode_iterator VSI [
+  (VNx1SI "TARGET_MIN_VLEN < 128")
+  VNx2SI
+  VNx4SI
+  VNx8SI
+  (VNx16SI "TARGET_MIN_VLEN > 32")
+  (VNx32SI "TARGET_MIN_VLEN >= 128")
+])
+
+(define_mode_iterator VDI [
+  (VNx1DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN < 128")
+  (VNx2DI "TARGET_VECTOR_ELEN_64")
+  (VNx4DI "TARGET_VECTOR_ELEN_64")
+  (VNx8DI "TARGET_VECTOR_ELEN_64")
+  (VNx16DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128")
+])
+
+(define_mode_iterator VQI_LMUL1 [
+  (VNx16QI "TARGET_MIN_VLEN >= 128")
+  (VNx8QI "TARGET_MIN_VLEN == 64")
+  (VNx4QI "TARGET_MIN_VLEN == 32")
+])
+
+(define_mode_iterator VHI_LMUL1 [
+  (VNx8HI "TARGET_MIN_VLEN >= 128")
+  (VNx4HI "TARGET_MIN_VLEN == 64")
+  (VNx2HI "TARGET_MIN_VLEN == 32")
+])
+
+(define_mode_iterator VSI_LMUL1 [
+  (VNx4SI "TARGET_MIN_VLEN >= 128")
+  (VNx2SI "TARGET_MIN_VLEN == 64")
+  (VNx1SI "TARGET_MIN_VLEN == 32")
+])
+
+(define_mode_iterator VDI_LMUL1 [
+  (VNx2DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128")
+  (VNx1DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN == 64")
+])
+
 (define_mode_attr VLMULX2 [
   (VNx1QI "VNx2QI") (VNx2QI "VNx4QI") (VNx4QI "VNx8QI") (VNx8QI "VNx16QI") (VNx16QI "VNx32QI") (VNx32QI "VNx64QI") (VNx64QI "VNx128QI")
   (VNx1HI "VNx2HI") (VNx2HI "VNx4HI") (VNx4HI "VNx8HI") (VNx8HI "VNx16HI") (VNx16HI "VNx32HI") (VNx32HI "VNx64HI")
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index 1d1847bd85a..d396e278503 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -7244,76 +7244,168 @@ (define_insn "@pred_rod_trunc<mode>"
 ;; -------------------------------------------------------------------------------
 
 ;; For reduction operations, we should have seperate patterns for
-;; TARGET_MIN_VLEN == 32 and TARGET_MIN_VLEN > 32.
+;; different types. For each type, we will cover MIN_VLEN == 32, MIN_VLEN == 64
+;; and the MIN_VLEN >= 128 from the well defined iterators.
 ;; Since reduction need LMUL = 1 scalar operand as the input operand
 ;; and they are different.
 ;; For example, The LMUL = 1 corresponding mode of VNx16QImode is VNx4QImode
 ;; for -march=rv*zve32* wheras VNx8QImode for -march=rv*zve64*
-(define_insn "@pred_reduc_<reduc><mode><vlmul1>"
-  [(set (match_operand:<VLMUL1> 0 "register_operand"            "=vr,   vr")
-	(unspec:<VLMUL1>
-	  [(unspec:<VM>
-	     [(match_operand:<VM> 1 "vector_mask_operand"     "vmWc1,vmWc1")
-	      (match_operand 5 "vector_length_operand"        "   rK,   rK")
-	      (match_operand 6 "const_int_operand"            "    i,    i")
-	      (match_operand 7 "const_int_operand"            "    i,    i")
+
+;; Integer Reduction for QI
+(define_insn "@pred_reduc_<reduc><VQI:mode><VQI_LMUL1:mode>"
+  [
+    (set
+      (match_operand:VQI_LMUL1           0 "register_operand"      "=vr,     vr")
+      (unspec:VQI_LMUL1
+	[
+	  (unspec:<VQI:VM>
+	    [
+	      (match_operand:<VQI:VM>    1 "vector_mask_operand"   "vmWc1,vmWc1")
+	      (match_operand             5 "vector_length_operand" "   rK,   rK")
+	      (match_operand             6 "const_int_operand"     "    i,    i")
+	      (match_operand             7 "const_int_operand"     "    i,    i")
 	      (reg:SI VL_REGNUM)
-	      (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
-	   (any_reduc:VI
-	     (vec_duplicate:VI
-	       (vec_select:<VEL>
-	         (match_operand:<VLMUL1> 4 "register_operand" "   vr,   vr")
-	         (parallel [(const_int 0)])))
-	     (match_operand:VI 3 "register_operand"           "   vr,   vr"))
-	   (match_operand:<VLMUL1> 2 "vector_merge_operand"   "   vu,    0")] UNSPEC_REDUC))]
-  "TARGET_VECTOR && TARGET_MIN_VLEN >= 128"
+	      (reg:SI VTYPE_REGNUM)
+	    ] UNSPEC_VPREDICATE
+	  )
+	  (any_reduc:VQI
+	    (vec_duplicate:VQI
+	      (vec_select:<VEL>
+		(match_operand:VQI_LMUL1 4 "register_operand"      "   vr,   vr")
+		(parallel [(const_int 0)])
+	      )
+	    )
+	    (match_operand:VQI           3 "register_operand"      "   vr,   vr")
+	  )
+	  (match_operand:VQI_LMUL1       2 "vector_merge_operand"  "   vu,    0")
+	] UNSPEC_REDUC
+      )
+    )
+  ]
+  "TARGET_VECTOR"
   "vred<reduc>.vs\t%0,%3,%4%p1"
-  [(set_attr "type" "vired")
-   (set_attr "mode" "<MODE>")])
+  [
+    (set_attr "type" "vired")
+    (set_attr "mode" "<VQI:MODE>")
+  ]
+)
 
-(define_insn "@pred_reduc_<reduc><mode><vlmul1_zve64>"
-  [(set (match_operand:<VLMUL1_ZVE64> 0 "register_operand"            "=vr,   vr")
-	(unspec:<VLMUL1_ZVE64>
-	  [(unspec:<VM>
-	     [(match_operand:<VM> 1 "vector_mask_operand"     "vmWc1,vmWc1")
-	      (match_operand 5 "vector_length_operand"        "   rK,   rK")
-	      (match_operand 6 "const_int_operand"            "    i,    i")
-	      (match_operand 7 "const_int_operand"            "    i,    i")
+;; Integer Reduction for HI
+(define_insn "@pred_reduc_<reduc><VHI:mode><VHI_LMUL1:mode>"
+  [
+    (set
+      (match_operand:VHI_LMUL1           0 "register_operand"      "=vr,     vr")
+      (unspec:VHI_LMUL1
+	[
+	  (unspec:<VHI:VM>
+	    [
+	      (match_operand:<VHI:VM>    1 "vector_mask_operand"   "vmWc1,vmWc1")
+	      (match_operand             5 "vector_length_operand" "   rK,   rK")
+	      (match_operand             6 "const_int_operand"     "    i,    i")
+	      (match_operand             7 "const_int_operand"     "    i,    i")
 	      (reg:SI VL_REGNUM)
-	      (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
-	   (any_reduc:VI_ZVE64
-	     (vec_duplicate:VI_ZVE64
-	       (vec_select:<VEL>
-	         (match_operand:<VLMUL1_ZVE64> 4 "register_operand" "   vr,   vr")
-	         (parallel [(const_int 0)])))
-	     (match_operand:VI_ZVE64 3 "register_operand"           "   vr,   vr"))
-	   (match_operand:<VLMUL1_ZVE64> 2 "vector_merge_operand"   "   vu,    0")] UNSPEC_REDUC))]
-  "TARGET_VECTOR && TARGET_MIN_VLEN == 64"
+	      (reg:SI VTYPE_REGNUM)
+	    ] UNSPEC_VPREDICATE
+	  )
+	  (any_reduc:VHI
+	    (vec_duplicate:VHI
+	      (vec_select:<VEL>
+		(match_operand:VHI_LMUL1 4 "register_operand"      "   vr,   vr")
+		(parallel [(const_int 0)])
+	      )
+	    )
+	    (match_operand:VHI           3 "register_operand"      "   vr,   vr")
+	  )
+	  (match_operand:VHI_LMUL1       2 "vector_merge_operand"  "   vu,    0")
+	] UNSPEC_REDUC
+      )
+    )
+  ]
+  "TARGET_VECTOR"
   "vred<reduc>.vs\t%0,%3,%4%p1"
-  [(set_attr "type" "vired")
-   (set_attr "mode" "<MODE>")])
+  [
+    (set_attr "type" "vired")
+    (set_attr "mode" "<VHI:MODE>")
+  ]
+)
 
-(define_insn "@pred_reduc_<reduc><mode><vlmul1_zve32>"
-  [(set (match_operand:<VLMUL1_ZVE32> 0 "register_operand"          "=vd, vd, vr, vr")
-	(unspec:<VLMUL1_ZVE32>
-	  [(unspec:<VM>
-	     [(match_operand:<VM> 1 "vector_mask_operand"           " vm, vm,Wc1,Wc1")
-	      (match_operand 5 "vector_length_operand"              " rK, rK, rK, rK")
-	      (match_operand 6 "const_int_operand"                  "  i,  i,  i,  i")
-	      (match_operand 7 "const_int_operand"                  "  i,  i,  i,  i")
+;; Integer Reduction for SI
+(define_insn "@pred_reduc_<reduc><VSI:mode><VSI_LMUL1:mode>"
+  [
+    (set
+      (match_operand:VSI_LMUL1           0 "register_operand"      "=vr,     vr")
+      (unspec:VSI_LMUL1
+	[
+	  (unspec:<VSI:VM>
+	    [
+	      (match_operand:<VSI:VM>    1 "vector_mask_operand"   "vmWc1,vmWc1")
+	      (match_operand             5 "vector_length_operand" "   rK,   rK")
+	      (match_operand             6 "const_int_operand"     "    i,    i")
+	      (match_operand             7 "const_int_operand"     "    i,    i")
 	      (reg:SI VL_REGNUM)
-	      (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
-	   (any_reduc:VI_ZVE32
-	     (vec_duplicate:VI_ZVE32
-	       (vec_select:<VEL>
-	         (match_operand:<VLMUL1_ZVE32> 4 "register_operand" " vr, vr, vr, vr")
-	         (parallel [(const_int 0)])))
-	     (match_operand:VI_ZVE32 3 "register_operand"           " vr, vr, vr, vr"))
-	   (match_operand:<VLMUL1_ZVE32> 2 "vector_merge_operand"   " vu,  0, vu,  0")] UNSPEC_REDUC))]
-  "TARGET_VECTOR && TARGET_MIN_VLEN == 32"
+	      (reg:SI VTYPE_REGNUM)
+	    ] UNSPEC_VPREDICATE
+	  )
+	  (any_reduc:VSI
+	    (vec_duplicate:VSI
+	      (vec_select:<VEL>
+		(match_operand:VSI_LMUL1 4 "register_operand"      "   vr,   vr")
+		(parallel [(const_int 0)])
+	      )
+	    )
+	    (match_operand:VSI           3 "register_operand"      "   vr,   vr")
+	  )
+	  (match_operand:VSI_LMUL1       2 "vector_merge_operand"  "   vu,    0")
+	] UNSPEC_REDUC
+      )
+    )
+  ]
+  "TARGET_VECTOR"
   "vred<reduc>.vs\t%0,%3,%4%p1"
-  [(set_attr "type" "vired")
-   (set_attr "mode" "<MODE>")])
+  [
+    (set_attr "type" "vired")
+    (set_attr "mode" "<VSI:MODE>")
+  ]
+)
+
+;; Integer Reduction for DI
+(define_insn "@pred_reduc_<reduc><VDI:mode><VDI_LMUL1:mode>"
+  [
+    (set
+      (match_operand:VDI_LMUL1           0 "register_operand"      "=vr,     vr")
+      (unspec:VDI_LMUL1
+	[
+	  (unspec:<VDI:VM>
+	    [
+	      (match_operand:<VDI:VM>    1 "vector_mask_operand"   "vmWc1,vmWc1")
+	      (match_operand             5 "vector_length_operand" "   rK,   rK")
+	      (match_operand             6 "const_int_operand"     "    i,    i")
+	      (match_operand             7 "const_int_operand"     "    i,    i")
+	      (reg:SI VL_REGNUM)
+	      (reg:SI VTYPE_REGNUM)
+	    ] UNSPEC_VPREDICATE
+	  )
+	  (any_reduc:VDI
+	    (vec_duplicate:VDI
+	      (vec_select:<VEL>
+		(match_operand:VDI_LMUL1 4 "register_operand"      "   vr,   vr")
+		(parallel [(const_int 0)])
+	      )
+	    )
+	    (match_operand:VDI           3 "register_operand"      "   vr,   vr")
+	  )
+	  (match_operand:VDI_LMUL1       2 "vector_merge_operand"  "   vu,    0")
+	] UNSPEC_REDUC
+      )
+    )
+  ]
+  "TARGET_VECTOR"
+  "vred<reduc>.vs\t%0,%3,%4%p1"
+  [
+    (set_attr "type" "vired")
+    (set_attr "mode" "<VDI:MODE>")
+  ]
+)
 
 (define_insn "@pred_widen_reduc_plus<v_su><mode><vwlmul1>"
   [(set (match_operand:<VWLMUL1> 0 "register_operand"           "=&vr,  &vr")
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.c
new file mode 100644
index 00000000000..2e4aeb5b90b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gc_zve32f -mabi=ilp32f -O3 -Wno-psabi" } */
+
+#include "pr110265-1.h"
+
+/* { dg-final { scan-assembler-times {vredand\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredmax\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredmaxu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredmin\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredminu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredsum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredxor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.h b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.h
new file mode 100644
index 00000000000..ade44cc27ea
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.h
@@ -0,0 +1,65 @@
+#include "riscv_vector.h"
+
+vint8m1_t test_vredand_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredand_vs_i8mf4_i8m1(vector, scalar, vl);
+}
+
+vuint32m1_t test_vredand_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) {
+  return __riscv_vredand_vs_u32m8_u32m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredmax_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredmax_vs_i8mf4_i8m1(vector, scalar, vl);
+}
+
+vint32m1_t test_vredmax_vs_i32m8_i32m1(vint32m8_t vector, vint32m1_t scalar, size_t vl) {
+  return __riscv_vredmax_vs_i32m8_i32m1(vector, scalar, vl);
+}
+
+vuint8m1_t test_vredmaxu_vs_u8mf4_u8m1(vuint8mf4_t vector, vuint8m1_t scalar, size_t vl) {
+  return __riscv_vredmaxu_vs_u8mf4_u8m1(vector, scalar, vl);
+}
+
+vuint32m1_t test_vredmaxu_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) {
+  return __riscv_vredmaxu_vs_u32m8_u32m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredmin_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredmin_vs_i8mf4_i8m1(vector, scalar, vl);
+}
+
+vint32m1_t test_vredmin_vs_i32m8_i32m1(vint32m8_t vector, vint32m1_t scalar, size_t vl) {
+  return __riscv_vredmin_vs_i32m8_i32m1(vector, scalar, vl);
+}
+
+vuint8m1_t test_vredminu_vs_u8mf4_u8m1(vuint8mf4_t vector, vuint8m1_t scalar, size_t vl) {
+  return __riscv_vredminu_vs_u8mf4_u8m1(vector, scalar, vl);
+}
+
+vuint32m1_t test_vredminu_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) {
+  return __riscv_vredminu_vs_u32m8_u32m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredor_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredor_vs_i8mf4_i8m1(vector, scalar, vl);
+}
+
+vuint32m1_t test_vredor_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) {
+  return __riscv_vredor_vs_u32m8_u32m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredsum_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredsum_vs_i8mf4_i8m1(vector, scalar, vl);
+}
+
+vuint32m1_t test_vredsum_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) {
+  return __riscv_vredsum_vs_u32m8_u32m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredxor_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredxor_vs_i8mf4_i8m1(vector, scalar, vl);
+}
+
+vuint32m1_t test_vredxor_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) {
+  return __riscv_vredxor_vs_u32m8_u32m1(vector, scalar, vl);
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.c
new file mode 100644
index 00000000000..7454c1cc918
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gc_zve64d -mabi=ilp32d -O3 -Wno-psabi" } */
+
+#include "pr110265-1.h"
+#include "pr110265-2.h"
+
+/* { dg-final { scan-assembler-times {vredand\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredmax\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 3 } } */
+/* { dg-final { scan-assembler-times {vredmaxu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredmin\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 3 } } */
+/* { dg-final { scan-assembler-times {vredminu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredsum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredxor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.h b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.h
new file mode 100644
index 00000000000..6a7e14e51f8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.h
@@ -0,0 +1,57 @@
+#include "riscv_vector.h"
+
+vint8m1_t test_vredand_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredand_vs_i8mf8_i8m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredmax_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredmax_vs_i8mf8_i8m1(vector, scalar, vl);
+}
+
+vuint8m1_t test_vredmaxu_vs_u8mf8_u8m1(vuint8mf8_t vector, vuint8m1_t scalar, size_t vl) {
+  return __riscv_vredmaxu_vs_u8mf8_u8m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredmin_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredmin_vs_i8mf8_i8m1(vector, scalar, vl);
+}
+
+vuint8m1_t test_vredminu_vs_u8mf8_u8m1(vuint8mf8_t vector, vuint8m1_t scalar, size_t vl) {
+  return __riscv_vredminu_vs_u8mf8_u8m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredor_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredor_vs_i8mf8_i8m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredsum_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredsum_vs_i8mf8_i8m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredxor_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredxor_vs_i8mf8_i8m1(vector, scalar, vl);
+}
+
+vuint64m1_t test_vredand_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) {
+  return __riscv_vredand_vs_u64m8_u64m1(vector, scalar, vl);
+}
+
+vuint64m1_t test_vredmaxu_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) {
+  return __riscv_vredmaxu_vs_u64m8_u64m1(vector, scalar, vl);
+}
+
+vuint64m1_t test_vredminu_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) {
+  return __riscv_vredminu_vs_u64m8_u64m1(vector, scalar, vl);
+}
+
+vuint64m1_t test_vredor_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) {
+  return __riscv_vredor_vs_u64m8_u64m1(vector, scalar, vl);
+}
+
+vuint64m1_t test_vredsum_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) {
+  return __riscv_vredsum_vs_u64m8_u64m1(vector, scalar, vl);
+}
+
+vuint64m1_t test_vredxor_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) {
+  return __riscv_vredxor_vs_u64m8_u64m1(vector, scalar, vl);
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-3.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-3.c
new file mode 100644
index 00000000000..0ed1fbae35a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-3.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gc_zve64f -mabi=ilp32f -O3 -Wno-psabi" } */
+
+#include "pr110265-1.h"
+#include "pr110265-2.h"
+
+/* { dg-final { scan-assembler-times {vredand\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredmax\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 3 } } */
+/* { dg-final { scan-assembler-times {vredmaxu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredmin\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 3 } } */
+/* { dg-final { scan-assembler-times {vredminu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredsum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredxor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
-- 
2.34.1


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] RISC-V: Bugfix for RVV integer reduction in ZVE32/64.
  2023-06-16  8:09 ` [PATCH v2] " pan2.li
@ 2023-06-16  8:10   ` juzhe.zhong
  2023-06-16  8:16     ` Li, Pan2
  2023-06-16 15:55     ` Jeff Law
  0 siblings, 2 replies; 8+ messages in thread
From: juzhe.zhong @ 2023-06-16  8:10 UTC (permalink / raw)
  To: pan2.li, gcc-patches
  Cc: Robin Dapp, jeffreyalaw, pan2.li, yanzhang.wang, kito.cheng

[-- Attachment #1: Type: text/plain, Size: 24353 bytes --]

LGTM. Thanks for fixing this bug.
Let's wait for Jeff's final approval.

Thanks.


juzhe.zhong@rivai.ai
 
From: pan2.li
Date: 2023-06-16 16:09
To: gcc-patches
CC: juzhe.zhong; rdapp.gcc; jeffreyalaw; pan2.li; yanzhang.wang; kito.cheng
Subject: [PATCH v2] RISC-V: Bugfix for RVV integer reduction in ZVE32/64.
From: Pan Li <pan2.li@intel.com>
 
The RVV integer reduction has 3 different patterns for ZVE128+, ZVE64
and ZVE32. They take the same iterator with different attributes.
However, we rely on the generated function code_for_reduc (code, mode1, mode2).
The implementation of code_for_reduc may look like below.
 
code_for_reduc (code, mode1, mode2)
{
  if (code == max && mode1 == VNx1QI && mode2 == VNx1QI)
    return CODE_FOR_pred_reduc_maxvnx1qivnx16qi; // ZVE128+
 
  if (code == max && mode1 == VNx1QI && mode2 == VNx1QI)
    return CODE_FOR_pred_reduc_maxvnx1qivnx8qi;  // ZVE64
 
  if (code == max && mode1 == VNx1QI && mode2 == VNx1QI)
    return CODE_FOR_pred_reduc_maxvnx1qivnx4qi;  // ZVE32
}
 
Thus there is a problem here. For example, for ZVE32 we will have
code_for_reduc (max, VNx1QI, VNx1QI), which returns the insn code of
ZVE128+ instead of the logically correct ZVE32 one.
 
This patch merges the 3 patterns into one pattern per element type, and
passes both the input vector mode and the result vector mode to
code_for_reduc. For example, for ZVE32 we will have
code_for_reduc (max, VNx1QI, VNx4QI), and the correct insn code of ZVE32
will be returned as expected.
 
Please note both GCC 13 and 14 are impacted by this issue.
 
Signed-off-by: Pan Li <pan2.li@intel.com>
Co-authored-by: Juzhe-Zhong <juzhe.zhong@rivai.ai>
 
	PR 110265

gcc/ChangeLog:
	PR target/110265
	* config/riscv/riscv-vector-builtins-bases.cc: Add ret_mode for
	integer reduction expand.
	* config/riscv/vector-iterators.md: Add VQI, VHI, VSI and VDI,
	and the LMUL1 attr respectively.
	* config/riscv/vector.md
	(@pred_reduc_<reduc><mode><vlmul1>): Removed.
	(@pred_reduc_<reduc><mode><vlmul1_zve64>): Likewise.
	(@pred_reduc_<reduc><mode><vlmul1_zve32>): Likewise.
	(@pred_reduc_<reduc><VQI:mode><VQI_LMUL1:mode>): New pattern.
	(@pred_reduc_<reduc><VHI:mode><VHI_LMUL1:mode>): Likewise.
	(@pred_reduc_<reduc><VSI:mode><VSI_LMUL1:mode>): Likewise.
	(@pred_reduc_<reduc><VDI:mode><VDI_LMUL1:mode>): Likewise.

gcc/testsuite/ChangeLog:
	PR target/110265
	* gcc.target/riscv/rvv/base/pr110265-1.c: New test.
	* gcc.target/riscv/rvv/base/pr110265-1.h: New test.
	* gcc.target/riscv/rvv/base/pr110265-2.c: New test.
	* gcc.target/riscv/rvv/base/pr110265-2.h: New test.
	* gcc.target/riscv/rvv/base/pr110265-3.c: New test.
---
.../riscv/riscv-vector-builtins-bases.cc      |  13 +-
gcc/config/riscv/vector-iterators.md          |  61 +++++
gcc/config/riscv/vector.md                    | 208 +++++++++++++-----
.../gcc.target/riscv/rvv/base/pr110265-1.c    |  13 ++
.../gcc.target/riscv/rvv/base/pr110265-1.h    |  65 ++++++
.../gcc.target/riscv/rvv/base/pr110265-2.c    |  14 ++
.../gcc.target/riscv/rvv/base/pr110265-2.h    |  57 +++++
.../gcc.target/riscv/rvv/base/pr110265-3.c    |  14 ++
8 files changed, 385 insertions(+), 60 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.h
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.h
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-3.c
 
diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index 87a684dd127..53bd0ed2534 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -1396,8 +1396,17 @@ public:
 
   rtx expand (function_expander &e) const override
   {
-    return e.use_exact_insn (
-      code_for_pred_reduc (CODE, e.vector_mode (), e.vector_mode ()));
+    machine_mode mode = e.vector_mode ();
+    machine_mode ret_mode = e.ret_mode ();
+
+    /* TODO: we will use ret_mode after all types of PR110265 are addressed.  */
+    if ((GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT)
+	|| GET_MODE_INNER (mode) != GET_MODE_INNER (ret_mode))
+      return e.use_exact_insn (
+	code_for_pred_reduc (CODE, e.vector_mode (), e.vector_mode ()));
+    else
+      return e.use_exact_insn (
+	code_for_pred_reduc (CODE, e.vector_mode (), e.ret_mode ()));
   }
 };
diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
index 8c71c9e22cc..e2c8ade98eb 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -929,6 +929,67 @@ (define_mode_iterator V64T [
   (VNx2x64QI "TARGET_MIN_VLEN >= 128")
 ])
 
+(define_mode_iterator VQI [
+  (VNx1QI "TARGET_MIN_VLEN < 128")
+  VNx2QI
+  VNx4QI
+  VNx8QI
+  VNx16QI
+  VNx32QI
+  (VNx64QI "TARGET_MIN_VLEN > 32")
+  (VNx128QI "TARGET_MIN_VLEN >= 128")
+])
+
+(define_mode_iterator VHI [
+  (VNx1HI "TARGET_MIN_VLEN < 128")
+  VNx2HI
+  VNx4HI
+  VNx8HI
+  VNx16HI
+  (VNx32HI "TARGET_MIN_VLEN > 32")
+  (VNx64HI "TARGET_MIN_VLEN >= 128")
+])
+
+(define_mode_iterator VSI [
+  (VNx1SI "TARGET_MIN_VLEN < 128")
+  VNx2SI
+  VNx4SI
+  VNx8SI
+  (VNx16SI "TARGET_MIN_VLEN > 32")
+  (VNx32SI "TARGET_MIN_VLEN >= 128")
+])
+
+(define_mode_iterator VDI [
+  (VNx1DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN < 128")
+  (VNx2DI "TARGET_VECTOR_ELEN_64")
+  (VNx4DI "TARGET_VECTOR_ELEN_64")
+  (VNx8DI "TARGET_VECTOR_ELEN_64")
+  (VNx16DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128")
+])
+
+(define_mode_iterator VQI_LMUL1 [
+  (VNx16QI "TARGET_MIN_VLEN >= 128")
+  (VNx8QI "TARGET_MIN_VLEN == 64")
+  (VNx4QI "TARGET_MIN_VLEN == 32")
+])
+
+(define_mode_iterator VHI_LMUL1 [
+  (VNx8HI "TARGET_MIN_VLEN >= 128")
+  (VNx4HI "TARGET_MIN_VLEN == 64")
+  (VNx2HI "TARGET_MIN_VLEN == 32")
+])
+
+(define_mode_iterator VSI_LMUL1 [
+  (VNx4SI "TARGET_MIN_VLEN >= 128")
+  (VNx2SI "TARGET_MIN_VLEN == 64")
+  (VNx1SI "TARGET_MIN_VLEN == 32")
+])
+
+(define_mode_iterator VDI_LMUL1 [
+  (VNx2DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128")
+  (VNx1DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN == 64")
+])
+
(define_mode_attr VLMULX2 [
   (VNx1QI "VNx2QI") (VNx2QI "VNx4QI") (VNx4QI "VNx8QI") (VNx8QI "VNx16QI") (VNx16QI "VNx32QI") (VNx32QI "VNx64QI") (VNx64QI "VNx128QI")
   (VNx1HI "VNx2HI") (VNx2HI "VNx4HI") (VNx4HI "VNx8HI") (VNx8HI "VNx16HI") (VNx16HI "VNx32HI") (VNx32HI "VNx64HI")
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index 1d1847bd85a..d396e278503 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -7244,76 +7244,168 @@ (define_insn "@pred_rod_trunc<mode>"
;; -------------------------------------------------------------------------------
;; For reduction operations, we should have separate patterns for
-;; TARGET_MIN_VLEN == 32 and TARGET_MIN_VLEN > 32.
+;; different types. For each type, we will cover MIN_VLEN == 32, MIN_VLEN == 64
+;; and MIN_VLEN >= 128 from the well-defined iterators.
;; Since reduction needs an LMUL = 1 scalar operand as the input operand
;; and they are different.
;; For example, the LMUL = 1 corresponding mode of VNx16QImode is VNx4QImode
;; for -march=rv*zve32* whereas VNx8QImode for -march=rv*zve64*
-(define_insn "@pred_reduc_<reduc><mode><vlmul1>"
-  [(set (match_operand:<VLMUL1> 0 "register_operand"            "=vr,   vr")
- (unspec:<VLMUL1>
-   [(unspec:<VM>
-      [(match_operand:<VM> 1 "vector_mask_operand"     "vmWc1,vmWc1")
-       (match_operand 5 "vector_length_operand"        "   rK,   rK")
-       (match_operand 6 "const_int_operand"            "    i,    i")
-       (match_operand 7 "const_int_operand"            "    i,    i")
+
+;; Integer Reduction for QI
+(define_insn "@pred_reduc_<reduc><VQI:mode><VQI_LMUL1:mode>"
+  [
+    (set
+      (match_operand:VQI_LMUL1           0 "register_operand"      "=vr,     vr")
+      (unspec:VQI_LMUL1
+ [
+   (unspec:<VQI:VM>
+     [
+       (match_operand:<VQI:VM>    1 "vector_mask_operand"   "vmWc1,vmWc1")
+       (match_operand             5 "vector_length_operand" "   rK,   rK")
+       (match_operand             6 "const_int_operand"     "    i,    i")
+       (match_operand             7 "const_int_operand"     "    i,    i")
      (reg:SI VL_REGNUM)
-       (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
-    (any_reduc:VI
-      (vec_duplicate:VI
-        (vec_select:<VEL>
-          (match_operand:<VLMUL1> 4 "register_operand" "   vr,   vr")
-          (parallel [(const_int 0)])))
-      (match_operand:VI 3 "register_operand"           "   vr,   vr"))
-    (match_operand:<VLMUL1> 2 "vector_merge_operand"   "   vu,    0")] UNSPEC_REDUC))]
-  "TARGET_VECTOR && TARGET_MIN_VLEN >= 128"
+       (reg:SI VTYPE_REGNUM)
+     ] UNSPEC_VPREDICATE
+   )
+   (any_reduc:VQI
+     (vec_duplicate:VQI
+       (vec_select:<VEL>
+ (match_operand:VQI_LMUL1 4 "register_operand"      "   vr,   vr")
+ (parallel [(const_int 0)])
+       )
+     )
+     (match_operand:VQI           3 "register_operand"      "   vr,   vr")
+   )
+   (match_operand:VQI_LMUL1       2 "vector_merge_operand"  "   vu,    0")
+ ] UNSPEC_REDUC
+      )
+    )
+  ]
+  "TARGET_VECTOR"
   "vred<reduc>.vs\t%0,%3,%4%p1"
-  [(set_attr "type" "vired")
-   (set_attr "mode" "<MODE>")])
+  [
+    (set_attr "type" "vired")
+    (set_attr "mode" "<VQI:MODE>")
+  ]
+)
-(define_insn "@pred_reduc_<reduc><mode><vlmul1_zve64>"
-  [(set (match_operand:<VLMUL1_ZVE64> 0 "register_operand"            "=vr,   vr")
- (unspec:<VLMUL1_ZVE64>
-   [(unspec:<VM>
-      [(match_operand:<VM> 1 "vector_mask_operand"     "vmWc1,vmWc1")
-       (match_operand 5 "vector_length_operand"        "   rK,   rK")
-       (match_operand 6 "const_int_operand"            "    i,    i")
-       (match_operand 7 "const_int_operand"            "    i,    i")
+;; Integer Reduction for HI
+(define_insn "@pred_reduc_<reduc><VHI:mode><VHI_LMUL1:mode>"
+  [
+    (set
+      (match_operand:VHI_LMUL1           0 "register_operand"      "=vr,     vr")
+      (unspec:VHI_LMUL1
+ [
+   (unspec:<VHI:VM>
+     [
+       (match_operand:<VHI:VM>    1 "vector_mask_operand"   "vmWc1,vmWc1")
+       (match_operand             5 "vector_length_operand" "   rK,   rK")
+       (match_operand             6 "const_int_operand"     "    i,    i")
+       (match_operand             7 "const_int_operand"     "    i,    i")
      (reg:SI VL_REGNUM)
-       (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
-    (any_reduc:VI_ZVE64
-      (vec_duplicate:VI_ZVE64
-        (vec_select:<VEL>
-          (match_operand:<VLMUL1_ZVE64> 4 "register_operand" "   vr,   vr")
-          (parallel [(const_int 0)])))
-      (match_operand:VI_ZVE64 3 "register_operand"           "   vr,   vr"))
-    (match_operand:<VLMUL1_ZVE64> 2 "vector_merge_operand"   "   vu,    0")] UNSPEC_REDUC))]
-  "TARGET_VECTOR && TARGET_MIN_VLEN == 64"
+       (reg:SI VTYPE_REGNUM)
+     ] UNSPEC_VPREDICATE
+   )
+   (any_reduc:VHI
+     (vec_duplicate:VHI
+       (vec_select:<VEL>
+ (match_operand:VHI_LMUL1 4 "register_operand"      "   vr,   vr")
+ (parallel [(const_int 0)])
+       )
+     )
+     (match_operand:VHI           3 "register_operand"      "   vr,   vr")
+   )
+   (match_operand:VHI_LMUL1       2 "vector_merge_operand"  "   vu,    0")
+ ] UNSPEC_REDUC
+      )
+    )
+  ]
+  "TARGET_VECTOR"
   "vred<reduc>.vs\t%0,%3,%4%p1"
-  [(set_attr "type" "vired")
-   (set_attr "mode" "<MODE>")])
+  [
+    (set_attr "type" "vired")
+    (set_attr "mode" "<VHI:MODE>")
+  ]
+)
-(define_insn "@pred_reduc_<reduc><mode><vlmul1_zve32>"
-  [(set (match_operand:<VLMUL1_ZVE32> 0 "register_operand"          "=vd, vd, vr, vr")
- (unspec:<VLMUL1_ZVE32>
-   [(unspec:<VM>
-      [(match_operand:<VM> 1 "vector_mask_operand"           " vm, vm,Wc1,Wc1")
-       (match_operand 5 "vector_length_operand"              " rK, rK, rK, rK")
-       (match_operand 6 "const_int_operand"                  "  i,  i,  i,  i")
-       (match_operand 7 "const_int_operand"                  "  i,  i,  i,  i")
+;; Integer Reduction for SI
+(define_insn "@pred_reduc_<reduc><VSI:mode><VSI_LMUL1:mode>"
+  [
+    (set
+      (match_operand:VSI_LMUL1           0 "register_operand"      "=vr,     vr")
+      (unspec:VSI_LMUL1
+ [
+   (unspec:<VSI:VM>
+     [
+       (match_operand:<VSI:VM>    1 "vector_mask_operand"   "vmWc1,vmWc1")
+       (match_operand             5 "vector_length_operand" "   rK,   rK")
+       (match_operand             6 "const_int_operand"     "    i,    i")
+       (match_operand             7 "const_int_operand"     "    i,    i")
      (reg:SI VL_REGNUM)
-       (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
-    (any_reduc:VI_ZVE32
-      (vec_duplicate:VI_ZVE32
-        (vec_select:<VEL>
-          (match_operand:<VLMUL1_ZVE32> 4 "register_operand" " vr, vr, vr, vr")
-          (parallel [(const_int 0)])))
-      (match_operand:VI_ZVE32 3 "register_operand"           " vr, vr, vr, vr"))
-    (match_operand:<VLMUL1_ZVE32> 2 "vector_merge_operand"   " vu,  0, vu,  0")] UNSPEC_REDUC))]
-  "TARGET_VECTOR && TARGET_MIN_VLEN == 32"
+       (reg:SI VTYPE_REGNUM)
+     ] UNSPEC_VPREDICATE
+   )
+   (any_reduc:VSI
+     (vec_duplicate:VSI
+       (vec_select:<VEL>
+ (match_operand:VSI_LMUL1 4 "register_operand"      "   vr,   vr")
+ (parallel [(const_int 0)])
+       )
+     )
+     (match_operand:VSI           3 "register_operand"      "   vr,   vr")
+   )
+   (match_operand:VSI_LMUL1       2 "vector_merge_operand"  "   vu,    0")
+ ] UNSPEC_REDUC
+      )
+    )
+  ]
+  "TARGET_VECTOR"
   "vred<reduc>.vs\t%0,%3,%4%p1"
-  [(set_attr "type" "vired")
-   (set_attr "mode" "<MODE>")])
+  [
+    (set_attr "type" "vired")
+    (set_attr "mode" "<VSI:MODE>")
+  ]
+)
+
+;; Integer Reduction for DI
+(define_insn "@pred_reduc_<reduc><VDI:mode><VDI_LMUL1:mode>"
+  [
+    (set
+      (match_operand:VDI_LMUL1           0 "register_operand"      "=vr,     vr")
+      (unspec:VDI_LMUL1
+ [
+   (unspec:<VDI:VM>
+     [
+       (match_operand:<VDI:VM>    1 "vector_mask_operand"   "vmWc1,vmWc1")
+       (match_operand             5 "vector_length_operand" "   rK,   rK")
+       (match_operand             6 "const_int_operand"     "    i,    i")
+       (match_operand             7 "const_int_operand"     "    i,    i")
+       (reg:SI VL_REGNUM)
+       (reg:SI VTYPE_REGNUM)
+     ] UNSPEC_VPREDICATE
+   )
+   (any_reduc:VDI
+     (vec_duplicate:VDI
+       (vec_select:<VEL>
+ (match_operand:VDI_LMUL1 4 "register_operand"      "   vr,   vr")
+ (parallel [(const_int 0)])
+       )
+     )
+     (match_operand:VDI           3 "register_operand"      "   vr,   vr")
+   )
+   (match_operand:VDI_LMUL1       2 "vector_merge_operand"  "   vu,    0")
+ ] UNSPEC_REDUC
+      )
+    )
+  ]
+  "TARGET_VECTOR"
+  "vred<reduc>.vs\t%0,%3,%4%p1"
+  [
+    (set_attr "type" "vired")
+    (set_attr "mode" "<VDI:MODE>")
+  ]
+)
(define_insn "@pred_widen_reduc_plus<v_su><mode><vwlmul1>"
   [(set (match_operand:<VWLMUL1> 0 "register_operand"           "=&vr,  &vr")
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.c
new file mode 100644
index 00000000000..2e4aeb5b90b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gc_zve32f -mabi=ilp32f -O3 -Wno-psabi" } */
+
+#include "pr110265-1.h"
+
+/* { dg-final { scan-assembler-times {vredand\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredmax\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredmaxu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredmin\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredminu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredsum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredxor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.h b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.h
new file mode 100644
index 00000000000..ade44cc27ea
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.h
@@ -0,0 +1,65 @@
+#include "riscv_vector.h"
+
+vint8m1_t test_vredand_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredand_vs_i8mf4_i8m1(vector, scalar, vl);
+}
+
+vuint32m1_t test_vredand_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) {
+  return __riscv_vredand_vs_u32m8_u32m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredmax_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredmax_vs_i8mf4_i8m1(vector, scalar, vl);
+}
+
+vint32m1_t test_vredmax_vs_i32m8_i32m1(vint32m8_t vector, vint32m1_t scalar, size_t vl) {
+  return __riscv_vredmax_vs_i32m8_i32m1(vector, scalar, vl);
+}
+
+vuint8m1_t test_vredmaxu_vs_u8mf4_u8m1(vuint8mf4_t vector, vuint8m1_t scalar, size_t vl) {
+  return __riscv_vredmaxu_vs_u8mf4_u8m1(vector, scalar, vl);
+}
+
+vuint32m1_t test_vredmaxu_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) {
+  return __riscv_vredmaxu_vs_u32m8_u32m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredmin_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredmin_vs_i8mf4_i8m1(vector, scalar, vl);
+}
+
+vint32m1_t test_vredmin_vs_i32m8_i32m1(vint32m8_t vector, vint32m1_t scalar, size_t vl) {
+  return __riscv_vredmin_vs_i32m8_i32m1(vector, scalar, vl);
+}
+
+vuint8m1_t test_vredminu_vs_u8mf4_u8m1(vuint8mf4_t vector, vuint8m1_t scalar, size_t vl) {
+  return __riscv_vredminu_vs_u8mf4_u8m1(vector, scalar, vl);
+}
+
+vuint32m1_t test_vredminu_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) {
+  return __riscv_vredminu_vs_u32m8_u32m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredor_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredor_vs_i8mf4_i8m1(vector, scalar, vl);
+}
+
+vuint32m1_t test_vredor_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) {
+  return __riscv_vredor_vs_u32m8_u32m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredsum_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredsum_vs_i8mf4_i8m1(vector, scalar, vl);
+}
+
+vuint32m1_t test_vredsum_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) {
+  return __riscv_vredsum_vs_u32m8_u32m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredxor_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredxor_vs_i8mf4_i8m1(vector, scalar, vl);
+}
+
+vuint32m1_t test_vredxor_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) {
+  return __riscv_vredxor_vs_u32m8_u32m1(vector, scalar, vl);
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.c
new file mode 100644
index 00000000000..7454c1cc918
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gc_zve64d -mabi=ilp32d -O3 -Wno-psabi" } */
+
+#include "pr110265-1.h"
+#include "pr110265-2.h"
+
+/* { dg-final { scan-assembler-times {vredand\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredmax\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 3 } } */
+/* { dg-final { scan-assembler-times {vredmaxu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredmin\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 3 } } */
+/* { dg-final { scan-assembler-times {vredminu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredsum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredxor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.h b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.h
new file mode 100644
index 00000000000..6a7e14e51f8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.h
@@ -0,0 +1,57 @@
+#include "riscv_vector.h"
+
+vint8m1_t test_vredand_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredand_vs_i8mf8_i8m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredmax_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredmax_vs_i8mf8_i8m1(vector, scalar, vl);
+}
+
+vuint8m1_t test_vredmaxu_vs_u8mf8_u8m1(vuint8mf8_t vector, vuint8m1_t scalar, size_t vl) {
+  return __riscv_vredmaxu_vs_u8mf8_u8m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredmin_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredmin_vs_i8mf8_i8m1(vector, scalar, vl);
+}
+
+vuint8m1_t test_vredminu_vs_u8mf8_u8m1(vuint8mf8_t vector, vuint8m1_t scalar, size_t vl) {
+  return __riscv_vredminu_vs_u8mf8_u8m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredor_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredor_vs_i8mf8_i8m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredsum_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredsum_vs_i8mf8_i8m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredxor_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredxor_vs_i8mf8_i8m1(vector, scalar, vl);
+}
+
+vuint64m1_t test_vredand_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) {
+  return __riscv_vredand_vs_u64m8_u64m1(vector, scalar, vl);
+}
+
+vuint64m1_t test_vredmaxu_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) {
+  return __riscv_vredmaxu_vs_u64m8_u64m1(vector, scalar, vl);
+}
+
+vuint64m1_t test_vredminu_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) {
+  return __riscv_vredminu_vs_u64m8_u64m1(vector, scalar, vl);
+}
+
+vuint64m1_t test_vredor_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) {
+  return __riscv_vredor_vs_u64m8_u64m1(vector, scalar, vl);
+}
+
+vuint64m1_t test_vredsum_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) {
+  return __riscv_vredsum_vs_u64m8_u64m1(vector, scalar, vl);
+}
+
+vuint64m1_t test_vredxor_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) {
+  return __riscv_vredxor_vs_u64m8_u64m1(vector, scalar, vl);
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-3.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-3.c
new file mode 100644
index 00000000000..0ed1fbae35a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-3.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gc_zve64f -mabi=ilp32f -O3 -Wno-psabi" } */
+
+#include "pr110265-1.h"
+#include "pr110265-2.h"
+
+/* { dg-final { scan-assembler-times {vredand\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredmax\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 3 } } */
+/* { dg-final { scan-assembler-times {vredmaxu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredmin\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 3 } } */
+/* { dg-final { scan-assembler-times {vredminu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredsum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredxor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
-- 
2.34.1
 
 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE: [PATCH v2] RISC-V: Bugfix for RVV integer reduction in ZVE32/64.
  2023-06-16  8:10   ` juzhe.zhong
@ 2023-06-16  8:16     ` Li, Pan2
  2023-06-16 15:55     ` Jeff Law
  1 sibling, 0 replies; 8+ messages in thread
From: Li, Pan2 @ 2023-06-16  8:16 UTC (permalink / raw)
  To: juzhe.zhong, gcc-patches
  Cc: Robin Dapp, jeffreyalaw, Wang, Yanzhang, kito.cheng

[-- Attachment #1: Type: text/plain, Size: 34314 bytes --]

Thanks Juzhe for reviewing; I will take care of the FP and widen parts soon.
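
For reference, the floating-point reductions that the TODO keeps on the
old (vector_mode, vector_mode) path follow the same intrinsic shape. A
minimal sketch of the kind of coverage such a follow-up might add (a
hypothetical test, mirroring the style of pr110265-1.h):

#include "riscv_vector.h"

/* Hypothetical follow-up coverage: FP reductions still take the old
   dispatch path until the TODO in riscv-vector-builtins-bases.cc is
   resolved.  */
vfloat32m1_t test_vfredosum_vs_f32m8_f32m1(vfloat32m8_t vector,
                                           vfloat32m1_t scalar, size_t vl) {
  return __riscv_vfredosum_vs_f32m8_f32m1(vector, scalar, vl);
}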

Pan

From: juzhe.zhong@rivai.ai <juzhe.zhong@rivai.ai>
Sent: Friday, June 16, 2023 4:11 PM
To: Li, Pan2 <pan2.li@intel.com>; gcc-patches <gcc-patches@gcc.gnu.org>
Cc: Robin Dapp <rdapp.gcc@gmail.com>; jeffreyalaw <jeffreyalaw@gmail.com>; Li, Pan2 <pan2.li@intel.com>; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng <kito.cheng@gmail.com>
Subject: Re: [PATCH v2] RISC-V: Bugfix for RVV integer reduction in ZVE32/64.

LGTM. Thanks for fixing this bug.
Let's wait for Jeff's final approval.

Thanks.
________________________________
juzhe.zhong@rivai.ai

From: pan2.li <pan2.li@intel.com>
Date: 2023-06-16 16:09
To: gcc-patches <gcc-patches@gcc.gnu.org>
CC: juzhe.zhong <juzhe.zhong@rivai.ai>; rdapp.gcc <rdapp.gcc@gmail.com>; jeffreyalaw <jeffreyalaw@gmail.com>; pan2.li <pan2.li@intel.com>; yanzhang.wang <yanzhang.wang@intel.com>; kito.cheng <kito.cheng@gmail.com>
Subject: [PATCH v2] RISC-V: Bugfix for RVV integer reduction in ZVE32/64.
From: Pan Li <pan2.li@intel.com>

The rvv integer reduction has 3 different patterns for zve128+, zve64
and zve32. They take the same iterator with different attributions.
However, we need the generated function code_for_reduc (code, mode1, mode2).
The implementation of code_for_reduc may look like below.

code_for_reduc (code, mode1, mode2)
{
  if (code == max && mode1 == VNx1QI && mode2 == VNx1QI)
    return CODE_FOR_pred_reduc_maxvnx1qivnx16qi; // ZVE128+

  if (code == max && mode1 == VNx1QI && mode2 == VNx1QI)
    return CODE_FOR_pred_reduc_maxvnx1qivnx8qi;  // ZVE64

  if (code == max && mode1 == VNx1QI && mode2 == VNx1QI)
    return CODE_FOR_pred_reduc_maxvnx1qivnx4qi;  // ZVE32
}

Thus there will be a problem here. For example zve32, we will have
code_for_reduc (max, VNx1QI, VNx1QI) which will return the code of
the ZVE128+ instead of the ZVE32 logically.

This patch will merge the 3 patterns into one pattern, and pass both the
input vector mode and the LMUL = 1 result vector mode to code_for_reduc. For
example, for ZVE32 we will have code_for_reduc (max, VNx1QI, VNx4QI), and the
correct insn code for ZVE32 will be returned as expected.

Please note both GCC 13 and 14 are impacted by this issue.

Signed-off-by: Pan Li <pan2.li@intel.com>
Co-authored-by: Juzhe-Zhong <juzhe.zhong@rivai.ai>

PR target/110265

gcc/ChangeLog:
PR target/110265
* config/riscv/riscv-vector-builtins-bases.cc: Add ret_mode for
integer reduction expand.
* config/riscv/vector-iterators.md: Add VQI, VHI, VSI and VDI,
and their corresponding LMUL1 mode iterators.
* config/riscv/vector.md
(@pred_reduc_<reduc><mode><vlmul1>): Removed.
(@pred_reduc_<reduc><mode><vlmul1_zve64>): Likewise.
(@pred_reduc_<reduc><mode><vlmul1_zve32>): Likewise.
(@pred_reduc_<reduc><VQI:mode><VQI_LMUL1:mode>): New pattern.
(@pred_reduc_<reduc><VHI:mode><VHI_LMUL1:mode>): Likewise.
(@pred_reduc_<reduc><VSI:mode><VSI_LMUL1:mode>): Likewise.
(@pred_reduc_<reduc><VDI:mode><VDI_LMUL1:mode>): Likewise.

gcc/testsuite/ChangeLog:
PR target/110265
* gcc.target/riscv/rvv/base/pr110265-1.c: New test.
* gcc.target/riscv/rvv/base/pr110265-1.h: New test.
* gcc.target/riscv/rvv/base/pr110265-2.c: New test.
* gcc.target/riscv/rvv/base/pr110265-2.h: New test.
* gcc.target/riscv/rvv/base/pr110265-3.c: New test.
---
.../riscv/riscv-vector-builtins-bases.cc      |  13 +-
gcc/config/riscv/vector-iterators.md          |  61 +++++
gcc/config/riscv/vector.md                    | 208 +++++++++++++-----
.../gcc.target/riscv/rvv/base/pr110265-1.c    |  13 ++
.../gcc.target/riscv/rvv/base/pr110265-1.h    |  65 ++++++
.../gcc.target/riscv/rvv/base/pr110265-2.c    |  14 ++
.../gcc.target/riscv/rvv/base/pr110265-2.h    |  57 +++++
.../gcc.target/riscv/rvv/base/pr110265-3.c    |  14 ++
8 files changed, 385 insertions(+), 60 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.h
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.h
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-3.c

diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index 87a684dd127..53bd0ed2534 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -1396,8 +1396,17 @@ public:
   rtx expand (function_expander &e) const override
   {
-    return e.use_exact_insn (
-      code_for_pred_reduc (CODE, e.vector_mode (), e.vector_mode ()));
+    machine_mode mode = e.vector_mode ();
+    machine_mode ret_mode = e.ret_mode ();
+
+    /* TODO: we will use ret_mode after all types of PR110265 are addressed.  */
+    if (GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT
+        || GET_MODE_INNER (mode) != GET_MODE_INNER (ret_mode))
+      return e.use_exact_insn (
+ code_for_pred_reduc (CODE, e.vector_mode (), e.vector_mode ()));
+    else
+      return e.use_exact_insn (
+ code_for_pred_reduc (CODE, e.vector_mode (), e.ret_mode ()));
   }
};
diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
index 8c71c9e22cc..e2c8ade98eb 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -929,6 +929,67 @@ (define_mode_iterator V64T [
   (VNx2x64QI "TARGET_MIN_VLEN >= 128")
])
+(define_mode_iterator VQI [
+  (VNx1QI "TARGET_MIN_VLEN < 128")
+  VNx2QI
+  VNx4QI
+  VNx8QI
+  VNx16QI
+  VNx32QI
+  (VNx64QI "TARGET_MIN_VLEN > 32")
+  (VNx128QI "TARGET_MIN_VLEN >= 128")
+])
+
+(define_mode_iterator VHI [
+  (VNx1HI "TARGET_MIN_VLEN < 128")
+  VNx2HI
+  VNx4HI
+  VNx8HI
+  VNx16HI
+  (VNx32HI "TARGET_MIN_VLEN > 32")
+  (VNx64HI "TARGET_MIN_VLEN >= 128")
+])
+
+(define_mode_iterator VSI [
+  (VNx1SI "TARGET_MIN_VLEN < 128")
+  VNx2SI
+  VNx4SI
+  VNx8SI
+  (VNx16SI "TARGET_MIN_VLEN > 32")
+  (VNx32SI "TARGET_MIN_VLEN >= 128")
+])
+
+(define_mode_iterator VDI [
+  (VNx1DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN < 128")
+  (VNx2DI "TARGET_VECTOR_ELEN_64")
+  (VNx4DI "TARGET_VECTOR_ELEN_64")
+  (VNx8DI "TARGET_VECTOR_ELEN_64")
+  (VNx16DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128")
+])
+
+(define_mode_iterator VQI_LMUL1 [
+  (VNx16QI "TARGET_MIN_VLEN >= 128")
+  (VNx8QI "TARGET_MIN_VLEN == 64")
+  (VNx4QI "TARGET_MIN_VLEN == 32")
+])
+
+(define_mode_iterator VHI_LMUL1 [
+  (VNx8HI "TARGET_MIN_VLEN >= 128")
+  (VNx4HI "TARGET_MIN_VLEN == 64")
+  (VNx2HI "TARGET_MIN_VLEN == 32")
+])
+
+(define_mode_iterator VSI_LMUL1 [
+  (VNx4SI "TARGET_MIN_VLEN >= 128")
+  (VNx2SI "TARGET_MIN_VLEN == 64")
+  (VNx1SI "TARGET_MIN_VLEN == 32")
+])
+
+(define_mode_iterator VDI_LMUL1 [
+  (VNx2DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128")
+  (VNx1DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN == 64")
+])
+
(define_mode_attr VLMULX2 [
   (VNx1QI "VNx2QI") (VNx2QI "VNx4QI") (VNx4QI "VNx8QI") (VNx8QI "VNx16QI") (VNx16QI "VNx32QI") (VNx32QI "VNx64QI") (VNx64QI "VNx128QI")
   (VNx1HI "VNx2HI") (VNx2HI "VNx4HI") (VNx4HI "VNx8HI") (VNx8HI "VNx16HI") (VNx16HI "VNx32HI") (VNx32HI "VNx64HI")
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index 1d1847bd85a..d396e278503 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -7244,76 +7244,168 @@ (define_insn "@pred_rod_trunc<mode>"
;; -------------------------------------------------------------------------------
;; For reduction operations, we should have separate patterns for
-;; TARGET_MIN_VLEN == 32 and TARGET_MIN_VLEN > 32.
+;; different types. For each type, we will cover MIN_VLEN == 32, MIN_VLEN == 64
+;; and MIN_VLEN >= 128 from the well-defined iterators.
;; Since reduction needs an LMUL = 1 scalar operand as the input operand
;; and they are different.
;; For example, the LMUL = 1 corresponding mode of VNx16QImode is VNx4QImode
;; for -march=rv*zve32* whereas VNx8QImode for -march=rv*zve64*
-(define_insn "@pred_reduc_<reduc><mode><vlmul1>"
-  [(set (match_operand:<VLMUL1> 0 "register_operand"            "=vr,   vr")
- (unspec:<VLMUL1>
-   [(unspec:<VM>
-      [(match_operand:<VM> 1 "vector_mask_operand"     "vmWc1,vmWc1")
-       (match_operand 5 "vector_length_operand"        "   rK,   rK")
-       (match_operand 6 "const_int_operand"            "    i,    i")
-       (match_operand 7 "const_int_operand"            "    i,    i")
+
+;; Integer Reduction for QI
+(define_insn "@pred_reduc_<reduc><VQI:mode><VQI_LMUL1:mode>"
+  [
+    (set
+      (match_operand:VQI_LMUL1           0 "register_operand"      "=vr,     vr")
+      (unspec:VQI_LMUL1
+ [
+   (unspec:<VQI:VM>
+     [
+       (match_operand:<VQI:VM>    1 "vector_mask_operand"   "vmWc1,vmWc1")
+       (match_operand             5 "vector_length_operand" "   rK,   rK")
+       (match_operand             6 "const_int_operand"     "    i,    i")
+       (match_operand             7 "const_int_operand"     "    i,    i")
      (reg:SI VL_REGNUM)
-       (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
-    (any_reduc:VI
-      (vec_duplicate:VI
-        (vec_select:<VEL>
-          (match_operand:<VLMUL1> 4 "register_operand" "   vr,   vr")
-          (parallel [(const_int 0)])))
-      (match_operand:VI 3 "register_operand"           "   vr,   vr"))
-    (match_operand:<VLMUL1> 2 "vector_merge_operand"   "   vu,    0")] UNSPEC_REDUC))]
-  "TARGET_VECTOR && TARGET_MIN_VLEN >= 128"
+       (reg:SI VTYPE_REGNUM)
+     ] UNSPEC_VPREDICATE
+   )
+   (any_reduc:VQI
+     (vec_duplicate:VQI
+       (vec_select:<VEL>
+ (match_operand:VQI_LMUL1 4 "register_operand"      "   vr,   vr")
+ (parallel [(const_int 0)])
+       )
+     )
+     (match_operand:VQI           3 "register_operand"      "   vr,   vr")
+   )
+   (match_operand:VQI_LMUL1       2 "vector_merge_operand"  "   vu,    0")
+ ] UNSPEC_REDUC
+      )
+    )
+  ]
+  "TARGET_VECTOR"
   "vred<reduc>.vs\t%0,%3,%4%p1"
-  [(set_attr "type" "vired")
-   (set_attr "mode" "<MODE>")])
+  [
+    (set_attr "type" "vired")
+    (set_attr "mode" "<VQI:MODE>")
+  ]
+)
-(define_insn "@pred_reduc_<reduc><mode><vlmul1_zve64>"
-  [(set (match_operand:<VLMUL1_ZVE64> 0 "register_operand"            "=vr,   vr")
- (unspec:<VLMUL1_ZVE64>
-   [(unspec:<VM>
-      [(match_operand:<VM> 1 "vector_mask_operand"     "vmWc1,vmWc1")
-       (match_operand 5 "vector_length_operand"        "   rK,   rK")
-       (match_operand 6 "const_int_operand"            "    i,    i")
-       (match_operand 7 "const_int_operand"            "    i,    i")
+;; Integer Reduction for HI
+(define_insn "@pred_reduc_<reduc><VHI:mode><VHI_LMUL1:mode>"
+  [
+    (set
+      (match_operand:VHI_LMUL1           0 "register_operand"      "=vr,     vr")
+      (unspec:VHI_LMUL1
+ [
+   (unspec:<VHI:VM>
+     [
+       (match_operand:<VHI:VM>    1 "vector_mask_operand"   "vmWc1,vmWc1")
+       (match_operand             5 "vector_length_operand" "   rK,   rK")
+       (match_operand             6 "const_int_operand"     "    i,    i")
+       (match_operand             7 "const_int_operand"     "    i,    i")
      (reg:SI VL_REGNUM)
-       (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
-    (any_reduc:VI_ZVE64
-      (vec_duplicate:VI_ZVE64
-        (vec_select:<VEL>
-          (match_operand:<VLMUL1_ZVE64> 4 "register_operand" "   vr,   vr")
-          (parallel [(const_int 0)])))
-      (match_operand:VI_ZVE64 3 "register_operand"           "   vr,   vr"))
-    (match_operand:<VLMUL1_ZVE64> 2 "vector_merge_operand"   "   vu,    0")] UNSPEC_REDUC))]
-  "TARGET_VECTOR && TARGET_MIN_VLEN == 64"
+       (reg:SI VTYPE_REGNUM)
+     ] UNSPEC_VPREDICATE
+   )
+   (any_reduc:VHI
+     (vec_duplicate:VHI
+       (vec_select:<VEL>
+ (match_operand:VHI_LMUL1 4 "register_operand"      "   vr,   vr")
+ (parallel [(const_int 0)])
+       )
+     )
+     (match_operand:VHI           3 "register_operand"      "   vr,   vr")
+   )
+   (match_operand:VHI_LMUL1       2 "vector_merge_operand"  "   vu,    0")
+ ] UNSPEC_REDUC
+      )
+    )
+  ]
+  "TARGET_VECTOR"
   "vred<reduc>.vs\t%0,%3,%4%p1"
-  [(set_attr "type" "vired")
-   (set_attr "mode" "<MODE>")])
+  [
+    (set_attr "type" "vired")
+    (set_attr "mode" "<VHI:MODE>")
+  ]
+)
-(define_insn "@pred_reduc_<reduc><mode><vlmul1_zve32>"
-  [(set (match_operand:<VLMUL1_ZVE32> 0 "register_operand"          "=vd, vd, vr, vr")
- (unspec:<VLMUL1_ZVE32>
-   [(unspec:<VM>
-      [(match_operand:<VM> 1 "vector_mask_operand"           " vm, vm,Wc1,Wc1")
-       (match_operand 5 "vector_length_operand"              " rK, rK, rK, rK")
-       (match_operand 6 "const_int_operand"                  "  i,  i,  i,  i")
-       (match_operand 7 "const_int_operand"                  "  i,  i,  i,  i")
+;; Integer Reduction for SI
+(define_insn "@pred_reduc_<reduc><VSI:mode><VSI_LMUL1:mode>"
+  [
+    (set
+      (match_operand:VSI_LMUL1           0 "register_operand"      "=vr,     vr")
+      (unspec:VSI_LMUL1
+ [
+   (unspec:<VSI:VM>
+     [
+       (match_operand:<VSI:VM>    1 "vector_mask_operand"   "vmWc1,vmWc1")
+       (match_operand             5 "vector_length_operand" "   rK,   rK")
+       (match_operand             6 "const_int_operand"     "    i,    i")
+       (match_operand             7 "const_int_operand"     "    i,    i")
      (reg:SI VL_REGNUM)
-       (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
-    (any_reduc:VI_ZVE32
-      (vec_duplicate:VI_ZVE32
-        (vec_select:<VEL>
-          (match_operand:<VLMUL1_ZVE32> 4 "register_operand" " vr, vr, vr, vr")
-          (parallel [(const_int 0)])))
-      (match_operand:VI_ZVE32 3 "register_operand"           " vr, vr, vr, vr"))
-    (match_operand:<VLMUL1_ZVE32> 2 "vector_merge_operand"   " vu,  0, vu,  0")] UNSPEC_REDUC))]
-  "TARGET_VECTOR && TARGET_MIN_VLEN == 32"
+       (reg:SI VTYPE_REGNUM)
+     ] UNSPEC_VPREDICATE
+   )
+   (any_reduc:VSI
+     (vec_duplicate:VSI
+       (vec_select:<VEL>
+ (match_operand:VSI_LMUL1 4 "register_operand"      "   vr,   vr")
+ (parallel [(const_int 0)])
+       )
+     )
+     (match_operand:VSI           3 "register_operand"      "   vr,   vr")
+   )
+   (match_operand:VSI_LMUL1       2 "vector_merge_operand"  "   vu,    0")
+ ] UNSPEC_REDUC
+      )
+    )
+  ]
+  "TARGET_VECTOR"
   "vred<reduc>.vs\t%0,%3,%4%p1"
-  [(set_attr "type" "vired")
-   (set_attr "mode" "<MODE>")])
+  [
+    (set_attr "type" "vired")
+    (set_attr "mode" "<VSI:MODE>")
+  ]
+)
+
+;; Integer Reduction for DI
+(define_insn "@pred_reduc_<reduc><VDI:mode><VDI_LMUL1:mode>"
+  [
+    (set
+      (match_operand:VDI_LMUL1           0 "register_operand"      "=vr,     vr")
+      (unspec:VDI_LMUL1
+ [
+   (unspec:<VDI:VM>
+     [
+       (match_operand:<VDI:VM>    1 "vector_mask_operand"   "vmWc1,vmWc1")
+       (match_operand             5 "vector_length_operand" "   rK,   rK")
+       (match_operand             6 "const_int_operand"     "    i,    i")
+       (match_operand             7 "const_int_operand"     "    i,    i")
+       (reg:SI VL_REGNUM)
+       (reg:SI VTYPE_REGNUM)
+     ] UNSPEC_VPREDICATE
+   )
+   (any_reduc:VDI
+     (vec_duplicate:VDI
+       (vec_select:<VEL>
+ (match_operand:VDI_LMUL1 4 "register_operand"      "   vr,   vr")
+ (parallel [(const_int 0)])
+       )
+     )
+     (match_operand:VDI           3 "register_operand"      "   vr,   vr")
+   )
+   (match_operand:VDI_LMUL1       2 "vector_merge_operand"  "   vu,    0")
+ ] UNSPEC_REDUC
+      )
+    )
+  ]
+  "TARGET_VECTOR"
+  "vred<reduc>.vs\t%0,%3,%4%p1"
+  [
+    (set_attr "type" "vired")
+    (set_attr "mode" "<VDI:MODE>")
+  ]
+)
(define_insn "@pred_widen_reduc_plus<v_su><mode><vwlmul1>"
   [(set (match_operand:<VWLMUL1> 0 "register_operand"           "=&vr,  &vr")
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.c
new file mode 100644
index 00000000000..2e4aeb5b90b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gc_zve32f -mabi=ilp32f -O3 -Wno-psabi" } */
+
+#include "pr110265-1.h"
+
+/* { dg-final { scan-assembler-times {vredand\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredmax\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredmaxu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredmin\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredminu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredsum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredxor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.h b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.h
new file mode 100644
index 00000000000..ade44cc27ea
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.h
@@ -0,0 +1,65 @@
+#include "riscv_vector.h"
+
+vint8m1_t test_vredand_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredand_vs_i8mf4_i8m1(vector, scalar, vl);
+}
+
+vuint32m1_t test_vredand_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) {
+  return __riscv_vredand_vs_u32m8_u32m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredmax_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredmax_vs_i8mf4_i8m1(vector, scalar, vl);
+}
+
+vint32m1_t test_vredmax_vs_i32m8_i32m1(vint32m8_t vector, vint32m1_t scalar, size_t vl) {
+  return __riscv_vredmax_vs_i32m8_i32m1(vector, scalar, vl);
+}
+
+vuint8m1_t test_vredmaxu_vs_u8mf4_u8m1(vuint8mf4_t vector, vuint8m1_t scalar, size_t vl) {
+  return __riscv_vredmaxu_vs_u8mf4_u8m1(vector, scalar, vl);
+}
+
+vuint32m1_t test_vredmaxu_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) {
+  return __riscv_vredmaxu_vs_u32m8_u32m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredmin_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredmin_vs_i8mf4_i8m1(vector, scalar, vl);
+}
+
+vint32m1_t test_vredmin_vs_i32m8_i32m1(vint32m8_t vector, vint32m1_t scalar, size_t vl) {
+  return __riscv_vredmin_vs_i32m8_i32m1(vector, scalar, vl);
+}
+
+vuint8m1_t test_vredminu_vs_u8mf4_u8m1(vuint8mf4_t vector, vuint8m1_t scalar, size_t vl) {
+  return __riscv_vredminu_vs_u8mf4_u8m1(vector, scalar, vl);
+}
+
+vuint32m1_t test_vredminu_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) {
+  return __riscv_vredminu_vs_u32m8_u32m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredor_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredor_vs_i8mf4_i8m1(vector, scalar, vl);
+}
+
+vuint32m1_t test_vredor_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) {
+  return __riscv_vredor_vs_u32m8_u32m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredsum_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredsum_vs_i8mf4_i8m1(vector, scalar, vl);
+}
+
+vuint32m1_t test_vredsum_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) {
+  return __riscv_vredsum_vs_u32m8_u32m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredxor_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredxor_vs_i8mf4_i8m1(vector, scalar, vl);
+}
+
+vuint32m1_t test_vredxor_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) {
+  return __riscv_vredxor_vs_u32m8_u32m1(vector, scalar, vl);
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.c
new file mode 100644
index 00000000000..7454c1cc918
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gc_zve64d -mabi=ilp32d -O3 -Wno-psabi" } */
+
+#include "pr110265-1.h"
+#include "pr110265-2.h"
+
+/* { dg-final { scan-assembler-times {vredand\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredmax\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 3 } } */
+/* { dg-final { scan-assembler-times {vredmaxu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredmin\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 3 } } */
+/* { dg-final { scan-assembler-times {vredminu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredsum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredxor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.h b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.h
new file mode 100644
index 00000000000..6a7e14e51f8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.h
@@ -0,0 +1,57 @@
+#include "riscv_vector.h"
+
+vint8m1_t test_vredand_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredand_vs_i8mf8_i8m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredmax_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredmax_vs_i8mf8_i8m1(vector, scalar, vl);
+}
+
+vuint8m1_t test_vredmaxu_vs_u8mf8_u8m1(vuint8mf8_t vector, vuint8m1_t scalar, size_t vl) {
+  return __riscv_vredmaxu_vs_u8mf8_u8m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredmin_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredmin_vs_i8mf8_i8m1(vector, scalar, vl);
+}
+
+vuint8m1_t test_vredminu_vs_u8mf8_u8m1(vuint8mf8_t vector, vuint8m1_t scalar, size_t vl) {
+  return __riscv_vredminu_vs_u8mf8_u8m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredor_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredor_vs_i8mf8_i8m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredsum_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredsum_vs_i8mf8_i8m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredxor_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredxor_vs_i8mf8_i8m1(vector, scalar, vl);
+}
+
+vuint64m1_t test_vredand_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) {
+  return __riscv_vredand_vs_u64m8_u64m1(vector, scalar, vl);
+}
+
+vuint64m1_t test_vredmaxu_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) {
+  return __riscv_vredmaxu_vs_u64m8_u64m1(vector, scalar, vl);
+}
+
+vuint64m1_t test_vredminu_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) {
+  return __riscv_vredminu_vs_u64m8_u64m1(vector, scalar, vl);
+}
+
+vuint64m1_t test_vredor_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) {
+  return __riscv_vredor_vs_u64m8_u64m1(vector, scalar, vl);
+}
+
+vuint64m1_t test_vredsum_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) {
+  return __riscv_vredsum_vs_u64m8_u64m1(vector, scalar, vl);
+}
+
+vuint64m1_t test_vredxor_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) {
+  return __riscv_vredxor_vs_u64m8_u64m1(vector, scalar, vl);
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-3.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-3.c
new file mode 100644
index 00000000000..0ed1fbae35a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-3.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gc_zve64f -mabi=ilp32f -O3 -Wno-psabi" } */
+
+#include "pr110265-1.h"
+#include "pr110265-2.h"
+
+/* { dg-final { scan-assembler-times {vredand\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredmax\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 3 } } */
+/* { dg-final { scan-assembler-times {vredmaxu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredmin\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 3 } } */
+/* { dg-final { scan-assembler-times {vredminu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredsum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredxor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
--
2.34.1



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] RISC-V: Bugfix for RVV integer reduction in ZVE32/64.
  2023-06-16  8:10   ` juzhe.zhong
  2023-06-16  8:16     ` Li, Pan2
@ 2023-06-16 15:55     ` Jeff Law
  2023-06-16 23:38       ` Li, Pan2
  1 sibling, 1 reply; 8+ messages in thread
From: Jeff Law @ 2023-06-16 15:55 UTC (permalink / raw)
  To: juzhe.zhong, pan2.li, gcc-patches; +Cc: Robin Dapp, yanzhang.wang, kito.cheng



On 6/16/23 02:10, juzhe.zhong@rivai.ai wrote:
> LGTM. Thanks for fixing this bug.
> Let's wait for Jeff's final approval.
OK.

jeff

^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE: [PATCH v2] RISC-V: Bugfix for RVV integer reduction in ZVE32/64.
  2023-06-16 15:55     ` Jeff Law
@ 2023-06-16 23:38       ` Li, Pan2
  0 siblings, 0 replies; 8+ messages in thread
From: Li, Pan2 @ 2023-06-16 23:38 UTC (permalink / raw)
  To: Jeff Law, juzhe.zhong, gcc-patches; +Cc: Robin Dapp, Wang, Yanzhang, kito.cheng

Committed, thanks Jeff.

Pan

-----Original Message-----
From: Jeff Law <jeffreyalaw@gmail.com> 
Sent: Friday, June 16, 2023 11:56 PM
To: juzhe.zhong@rivai.ai; Li, Pan2 <pan2.li@intel.com>; gcc-patches <gcc-patches@gcc.gnu.org>
Cc: Robin Dapp <rdapp.gcc@gmail.com>; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng <kito.cheng@gmail.com>
Subject: Re: [PATCH v2] RISC-V: Bugfix for RVV integer reduction in ZVE32/64.



On 6/16/23 02:10, juzhe.zhong@rivai.ai wrote:
> LGTM. Thanks for fixing this bug.
> Let's wait for Jeff's final approval.
OK.

jeff

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2023-06-16 23:38 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-06-16  7:28 [PATCH v1] RISC-V: Bugfix for RVV integer reduction in ZVE32/64 pan2.li
2023-06-16  7:47 ` juzhe.zhong
2023-06-16  7:56   ` Li, Pan2
2023-06-16  8:09 ` [PATCH v2] " pan2.li
2023-06-16  8:10   ` juzhe.zhong
2023-06-16  8:16     ` Li, Pan2
2023-06-16 15:55     ` Jeff Law
2023-06-16 23:38       ` Li, Pan2

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).