public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH v4 00/10] RISC-V: Add autovec support
@ 2023-04-17 18:36 Michael Collison
  2023-04-17 18:36 ` [PATCH v4 01/10] RISC-V: Add new predicates and function prototypes Michael Collison
                   ` (11 more replies)
  0 siblings, 12 replies; 36+ messages in thread
From: Michael Collison @ 2023-04-17 18:36 UTC (permalink / raw)
  To: gcc-patches

This series of patches adds foundational support for RISC-V auto-vectorization support. These patches are based on the current upstream rvv vector intrinsic support and is not a new implementation. Most of the implementation consists of adding the new vector cost model, the autovectorization patterns themselves and target hooks. This implementation only provides support for integer addition and subtraction as a proof of concept. This patch set should not be construed to be feature complete. Based on conversations with the community these patches are intended to lay the groundwork for feature completion and collaboration within the RISC-V community.

These patches are largely based off the work of Juzhe Zhong (juzhe.zhong@rivai.ai<mailto:juzhe.zhong@rivai.ai>) of RiVAI. More specifically the rvv-next branch at: https://github.com/riscv-collab/riscv-gcc.git <https://github.com/riscv-collab/riscv-gcc.git>is the foundation of this patch set. 

As discussed on this list, if these patches are approved they will be merged into a "auto-vectorization" branch once gcc-13 branches for release. There are two known issues related to crashes (assert failures) associated with tree vectorization; one of which I have sent a patch for and have received feedback. 

Changes in v4:

- Added support for binary integer operations and test cases
- Fixed bug to support 8-bit integer vectorization
- Fixed several assert errors related to non-multiple of two vector modes

Changes in v3:

- Removed the cost model and cost hooks based on feedback from Richard Biener
- Used RVV_VUNDEF macro to fix failing patterns

Changes in v2 

- Updated ChangeLog entry to include RiVAI contributions 
- Fixed ChangeLog email formatting 
- Fixed gnu formatting issues in the code 

Kevin Lee (2):
  This patch adds a guard for VNx1 vectors that are present in ports
    like riscv.
  This patch supports 8 bit auto-vectorization in riscv.

Michael Collison (8):
  RISC-V: Add new predicates and function prototypes
  RISC-V: autovec: Export policy functions to global scope
  RISC-V:autovec: Add auto-vectorization support functions
  RISC-V:autovec: Add target vectorization hooks
  RISC-V:autovec: Add autovectorization patterns for binary integer
    operations
  RISC-V:autovec: Add autovectorization tests for add & sub
  vect: Verify that GET_MODE_NUNITS is a multiple of 2.
  RISC-V:autovec: Add autovectorization tests for binary integer

 gcc/config/riscv/predicates.md                |  13 ++
 gcc/config/riscv/riscv-opts.h                 |  40 ++++
 gcc/config/riscv/riscv-protos.h               |  14 ++
 gcc/config/riscv/riscv-v.cc                   | 176 ++++++++++++++++++
 gcc/config/riscv/riscv-vector-builtins.cc     |   4 +-
 gcc/config/riscv/riscv-vector-builtins.h      |   3 +
 gcc/config/riscv/riscv.cc                     | 157 ++++++++++++++++
 gcc/config/riscv/riscv.md                     |   1 +
 gcc/config/riscv/riscv.opt                    |  20 ++
 gcc/config/riscv/vector-auto.md               |  79 ++++++++
 gcc/config/riscv/vector-iterators.md          |   2 +
 gcc/config/riscv/vector.md                    |   4 +-
 .../riscv/rvv/autovec/loop-add-rv32.c         |  25 +++
 .../gcc.target/riscv/rvv/autovec/loop-add.c   |  25 +++
 .../riscv/rvv/autovec/loop-and-rv32.c         |  25 +++
 .../gcc.target/riscv/rvv/autovec/loop-and.c   |  25 +++
 .../riscv/rvv/autovec/loop-div-rv32.c         |  27 +++
 .../gcc.target/riscv/rvv/autovec/loop-div.c   |  27 +++
 .../riscv/rvv/autovec/loop-max-rv32.c         |  26 +++
 .../gcc.target/riscv/rvv/autovec/loop-max.c   |  26 +++
 .../riscv/rvv/autovec/loop-min-rv32.c         |  26 +++
 .../gcc.target/riscv/rvv/autovec/loop-min.c   |  26 +++
 .../riscv/rvv/autovec/loop-mod-rv32.c         |  27 +++
 .../gcc.target/riscv/rvv/autovec/loop-mod.c   |  27 +++
 .../riscv/rvv/autovec/loop-mul-rv32.c         |  25 +++
 .../gcc.target/riscv/rvv/autovec/loop-mul.c   |  25 +++
 .../riscv/rvv/autovec/loop-or-rv32.c          |  25 +++
 .../gcc.target/riscv/rvv/autovec/loop-or.c    |  25 +++
 .../riscv/rvv/autovec/loop-sub-rv32.c         |  25 +++
 .../gcc.target/riscv/rvv/autovec/loop-sub.c   |  25 +++
 .../riscv/rvv/autovec/loop-xor-rv32.c         |  25 +++
 .../gcc.target/riscv/rvv/autovec/loop-xor.c   |  25 +++
 gcc/testsuite/gcc.target/riscv/rvv/rvv.exp    |   3 +
 gcc/tree-vect-data-refs.cc                    |   2 +
 gcc/tree-vect-slp.cc                          |   7 +-
 35 files changed, 1031 insertions(+), 6 deletions(-)
 create mode 100644 gcc/config/riscv/vector-auto.md
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add-rv32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-and-rv32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-and.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-div-rv32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-div.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-max-rv32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-max.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-min-rv32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-min.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mod-rv32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mod.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mul-rv32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mul.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-or-rv32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-or.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub-rv32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-xor-rv32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-xor.c

-- 
2.34.1


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v4 01/10] RISC-V: Add new predicates and function prototypes
  2023-04-17 18:36 [PATCH v4 00/10] RISC-V: Add autovec support Michael Collison
@ 2023-04-17 18:36 ` Michael Collison
  2023-04-19  0:54   ` Kito Cheng
  2023-04-17 18:36 ` [PATCH v4 02/10] RISC-V: autovec: Export policy functions to global scope Michael Collison
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 36+ messages in thread
From: Michael Collison @ 2023-04-17 18:36 UTC (permalink / raw)
  To: gcc-patches

2023-03-02  Michael Collison  <collison@rivosinc.com>
	    Juzhe Zhong  <juzhe.zhong@rivai.ai>

	* config/riscv/riscv-protos.h (riscv_classify_vlmul_field):
	New external declaration.
	(riscv_vector_preferred_simd_mode): Ditto.
	(riscv_tuple_mode_p): Ditto.
	(riscv_vector_mask_mode_p): Ditto.
	(riscv_classify_nf): Ditto.
	(riscv_vlmul_regsize): Ditto.
	(riscv_vector_preferred_simd_mode): Ditto.
	(riscv_vector_get_mask_mode): Ditto.
	(emit_vlmax_vsetvl): Ditto.
	(get_mask_policy_no_pred): Ditto.
	(get_tail_policy_no_pred): Ditto.
	* config/riscv/riscv-opts.h (riscv_vector_bits_enum): New enum.
	(riscv_vector_lmul_enum): Ditto.
	(vlmul_field_enum): Ditto.
	* config/riscv/riscv-v.cc (emit_vlmax_vsetvl):
	Remove static scope.
	* config/riscv/riscv.opt (riscv_vector_lmul):
	New option -mriscv_vector_lmul.
	* config/riscv/predicates.md (p_reg_or_const_csr_operand):
	New predicate.
	(vector_reg_or_const_dup_operand): Ditto.
---
 gcc/config/riscv/predicates.md  | 13 +++++++++++
 gcc/config/riscv/riscv-opts.h   | 40 +++++++++++++++++++++++++++++++++
 gcc/config/riscv/riscv-protos.h | 14 ++++++++++++
 gcc/config/riscv/riscv.opt      | 20 +++++++++++++++++
 4 files changed, 87 insertions(+)

diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index 8654dbc5943..b3f2d622c7b 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -264,6 +264,14 @@
 })
 
 ;; Predicates for the V extension.
+(define_special_predicate "p_reg_or_const_csr_operand"
+  (match_code "reg, subreg, const_int")
+{
+  if (CONST_INT_P (op))
+    return satisfies_constraint_K (op);
+  return GET_MODE (op) == Pmode;
+})
+
 (define_special_predicate "vector_length_operand"
   (ior (match_operand 0 "pmode_register_operand")
        (match_operand 0 "const_csr_operand")))
@@ -291,6 +299,11 @@
   (and (match_code "const_vector")
        (match_test "rtx_equal_p (op, riscv_vector::gen_scalar_move_mask (GET_MODE (op)))")))
 
+(define_predicate "vector_reg_or_const_dup_operand"
+  (ior (match_operand 0 "register_operand")
+       (match_test "const_vec_duplicate_p (op)
+       && !CONST_POLY_INT_P (CONST_VECTOR_ELT (op, 0))")))
+
 (define_predicate "vector_mask_operand"
   (ior (match_operand 0 "register_operand")
        (match_operand 0 "vector_all_trues_mask_operand")))
diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index cf0cd669be4..70711310749 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -67,6 +67,46 @@ enum stack_protector_guard {
   SSP_GLOBAL			/* global canary */
 };
 
+/* RVV vector register sizes.  */
+enum riscv_vector_bits_enum
+{
+  RVV_SCALABLE,
+  RVV_NOT_IMPLEMENTED = RVV_SCALABLE,
+  RVV_64 = 64,
+  RVV_128 = 128,
+  RVV_256 = 256,
+  RVV_512 = 512,
+  RVV_1024 = 1024,
+  RVV_2048 = 2048,
+  RVV_4096 = 4096,
+  RVV_8192 = 8192,
+  RVV_16384 = 16384,
+  RVV_32768 = 32768,
+  RVV_65536 = 65536
+};
+
+/* vectorization factor.  */
+enum riscv_vector_lmul_enum
+{
+  RVV_LMUL1 = 1,
+  RVV_LMUL2 = 2,
+  RVV_LMUL4 = 4,
+  RVV_LMUL8 = 8
+};
+
+enum vlmul_field_enum
+{
+  VLMUL_FIELD_000, /* LMUL = 1.  */
+  VLMUL_FIELD_001, /* LMUL = 2.  */
+  VLMUL_FIELD_010, /* LMUL = 4.  */
+  VLMUL_FIELD_011, /* LMUL = 8.  */
+  VLMUL_FIELD_100, /* RESERVED.  */
+  VLMUL_FIELD_101, /* LMUL = 1/8.  */
+  VLMUL_FIELD_110, /* LMUL = 1/4.  */
+  VLMUL_FIELD_111, /* LMUL = 1/2.  */
+  MAX_VLMUL_FIELD
+};
+
 #define MASK_ZICSR    (1 << 0)
 #define MASK_ZIFENCEI (1 << 1)
 
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 5244e8dcbf0..41f60f82a55 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -237,4 +237,18 @@ extern const char*
 th_mempair_output_move (rtx[4], bool, machine_mode, RTX_CODE);
 #endif
 
+/* Routines implemented in riscv-v.cc.  */
+
+namespace riscv_vector {
+extern unsigned int riscv_classify_vlmul_field (enum machine_mode m);
+extern machine_mode riscv_vector_preferred_simd_mode (scalar_mode mode,
+						      unsigned vf);
+extern bool riscv_tuple_mode_p (machine_mode);
+extern bool riscv_vector_mask_mode_p (machine_mode);
+extern int riscv_classify_nf (machine_mode);
+extern int riscv_vlmul_regsize (machine_mode);
+extern opt_machine_mode riscv_vector_get_mask_mode (machine_mode mode);
+extern rtx get_mask_policy_no_pred ();
+extern rtx get_tail_policy_no_pred ();
+}
 #endif /* ! GCC_RISCV_PROTOS_H */
diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
index ff1dd4ddd4f..4db3b2cac55 100644
--- a/gcc/config/riscv/riscv.opt
+++ b/gcc/config/riscv/riscv.opt
@@ -70,6 +70,26 @@ Enum(abi_type) String(lp64f) Value(ABI_LP64F)
 EnumValue
 Enum(abi_type) String(lp64d) Value(ABI_LP64D)
 
+Enum
+Name(riscv_vector_lmul) Type(enum riscv_vector_lmul_enum)
+The possible vectorization factor:
+
+EnumValue
+Enum(riscv_vector_lmul) String(1) Value(RVV_LMUL1)
+
+EnumValue
+Enum(riscv_vector_lmul) String(2) Value(RVV_LMUL2)
+
+EnumValue
+Enum(riscv_vector_lmul) String(4) Value(RVV_LMUL4)
+
+EnumValue
+Enum(riscv_vector_lmul) String(8) Value(RVV_LMUL8)
+
+mriscv-vector-lmul=
+Target RejectNegative Joined Enum(riscv_vector_lmul) Var(riscv_vector_lmul) Init(RVV_LMUL1)
+-mriscv-vector-lmul=<lmul>	Set the vf using lmul in auto-vectorization.
+
 mfdiv
 Target Mask(FDIV)
 Use hardware floating-point divide and square root instructions.
-- 
2.34.1


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v4 02/10]  RISC-V: autovec: Export policy functions to global scope
  2023-04-17 18:36 [PATCH v4 00/10] RISC-V: Add autovec support Michael Collison
  2023-04-17 18:36 ` [PATCH v4 01/10] RISC-V: Add new predicates and function prototypes Michael Collison
@ 2023-04-17 18:36 ` Michael Collison
  2023-04-17 18:36 ` [PATCH v4 03/10] RISC-V:autovec: Add auto-vectorization support functions Michael Collison
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 36+ messages in thread
From: Michael Collison @ 2023-04-17 18:36 UTC (permalink / raw)
  To: gcc-patches

2023-03-02  Michael Collison  <collison@rivosinc.com>
	    Juzhe Zhong  <juzhe.zhong@rivai.ai>

	* config/riscv/riscv-vector-builtins.cc (get_tail_policy_for_pred):
	Remove static declaration to to make externally visible.
	(get_mask_policy_for_pred): Ditto.
	* config/riscv/riscv-vector-builtins.h (get_tail_policy_for_pred):
	New external declaration.
	(get_mask_policy_for_pred): Ditto.
---
 gcc/config/riscv/riscv-vector-builtins.cc | 4 ++--
 gcc/config/riscv/riscv-vector-builtins.h  | 3 +++
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/gcc/config/riscv/riscv-vector-builtins.cc b/gcc/config/riscv/riscv-vector-builtins.cc
index 01cea23d3e6..1ed9e4acc40 100644
--- a/gcc/config/riscv/riscv-vector-builtins.cc
+++ b/gcc/config/riscv/riscv-vector-builtins.cc
@@ -2493,7 +2493,7 @@ use_real_merge_p (enum predication_type_index pred)
 
 /* Get TAIL policy for predication. If predication indicates TU, return the TU.
    Otherwise, return the prefer default configuration.  */
-static rtx
+rtx
 get_tail_policy_for_pred (enum predication_type_index pred)
 {
   if (pred == PRED_TYPE_tu || pred == PRED_TYPE_tum || pred == PRED_TYPE_tumu)
@@ -2503,7 +2503,7 @@ get_tail_policy_for_pred (enum predication_type_index pred)
 
 /* Get MASK policy for predication. If predication indicates MU, return the MU.
    Otherwise, return the prefer default configuration.  */
-static rtx
+rtx
 get_mask_policy_for_pred (enum predication_type_index pred)
 {
   if (pred == PRED_TYPE_tumu || pred == PRED_TYPE_mu)
diff --git a/gcc/config/riscv/riscv-vector-builtins.h b/gcc/config/riscv/riscv-vector-builtins.h
index 8ffb9d33e33..de3fd6ca290 100644
--- a/gcc/config/riscv/riscv-vector-builtins.h
+++ b/gcc/config/riscv/riscv-vector-builtins.h
@@ -483,6 +483,9 @@ extern rvv_builtin_types_t builtin_types[NUM_VECTOR_TYPES + 1];
 extern function_instance get_read_vl_instance (void);
 extern tree get_read_vl_decl (void);
 
+extern rtx get_tail_policy_for_pred (enum predication_type_index pred);
+extern rtx get_mask_policy_for_pred (enum predication_type_index pred);
+
 inline tree
 rvv_arg_type_info::get_scalar_type (vector_type_index type_idx) const
 {
-- 
2.34.1


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v4 03/10] RISC-V:autovec: Add auto-vectorization support functions
  2023-04-17 18:36 [PATCH v4 00/10] RISC-V: Add autovec support Michael Collison
  2023-04-17 18:36 ` [PATCH v4 01/10] RISC-V: Add new predicates and function prototypes Michael Collison
  2023-04-17 18:36 ` [PATCH v4 02/10] RISC-V: autovec: Export policy functions to global scope Michael Collison
@ 2023-04-17 18:36 ` Michael Collison
  2023-04-19  1:15   ` Kito Cheng
  2023-04-20  2:19   ` juzhe.zhong
  2023-04-17 18:36 ` [PATCH v4 04/10] RISC-V:autovec: Add target vectorization hooks Michael Collison
                   ` (8 subsequent siblings)
  11 siblings, 2 replies; 36+ messages in thread
From: Michael Collison @ 2023-04-17 18:36 UTC (permalink / raw)
  To: gcc-patches

2023-03-02  Michael Collison  <collison@rivosinc.com>
	    Juzhe Zhong  <juzhe.zhong@rivai.ai>

	* config/riscv/riscv-v.cc (riscv_classify_vlmul_field):
	New function.
	(riscv_vector_preferred_simd_mode): Ditto.
	(get_mask_policy_no_pred): Ditto.
	(get_tail_policy_no_pred): Ditto.
	(riscv_tuple_mode_p): Ditto.
	(riscv_classify_nf): Ditto.
	(riscv_vlmul_regsize): Ditto.
	(riscv_vector_mask_mode_p): Ditto.
	(riscv_vector_get_mask_mode): Ditto.
---
 gcc/config/riscv/riscv-v.cc | 176 ++++++++++++++++++++++++++++++++++++
 1 file changed, 176 insertions(+)

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 392f5d02e17..9df86419caa 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -39,9 +39,11 @@
 #include "emit-rtl.h"
 #include "tm_p.h"
 #include "target.h"
+#include "targhooks.h"
 #include "expr.h"
 #include "optabs.h"
 #include "tm-constrs.h"
+#include "riscv-vector-builtins.h"
 #include "rtx-vector-builder.h"
 
 using namespace riscv_vector;
@@ -118,6 +120,41 @@ const_vec_all_same_in_range_p (rtx x, HOST_WIDE_INT minval,
 	  && IN_RANGE (INTVAL (elt), minval, maxval));
 }
 
+/* Return the vlmul field for a specific machine mode.  */
+unsigned int
+riscv_classify_vlmul_field (enum machine_mode mode)
+{
+  /* Make the decision based on the mode's enum value rather than its
+     properties, so that we keep the correct classification regardless
+     of -mriscv-vector-bits.  */
+  switch (mode)
+    {
+    case E_VNx8BImode:
+      return VLMUL_FIELD_111;
+
+    case E_VNx4BImode:
+      return VLMUL_FIELD_110;
+
+    case E_VNx2BImode:
+      return VLMUL_FIELD_101;
+
+    case E_VNx16BImode:
+      return VLMUL_FIELD_000;
+
+    case E_VNx32BImode:
+      return VLMUL_FIELD_001;
+
+    case E_VNx64BImode:
+      return VLMUL_FIELD_010;
+
+    default:
+      break;
+    }
+
+  /* we don't care about VLMUL for Mask.  */
+  return VLMUL_FIELD_000;
+}
+
 /* Emit a vlmax vsetvl instruction.  This should only be used when
    optimization is disabled or after vsetvl insertion pass.  */
 void
@@ -176,6 +213,64 @@ calculate_ratio (unsigned int sew, enum vlmul_type vlmul)
   return ratio;
 }
 
+/* Implement TARGET_VECTORIZE_PREFERRED_SIMD_MODE for RVV.  */
+
+machine_mode
+riscv_vector_preferred_simd_mode (scalar_mode mode, unsigned vf)
+{
+  if (!TARGET_VECTOR)
+    return word_mode;
+
+  switch (mode)
+    {
+    case E_QImode:
+      return vf == 1   ? VNx8QImode
+	     : vf == 2 ? VNx16QImode
+	     : vf == 4 ? VNx32QImode
+		       : VNx64QImode;
+      break;
+    case E_HImode:
+      return vf == 1   ? VNx4HImode
+	     : vf == 2 ? VNx8HImode
+	     : vf == 4 ? VNx16HImode
+		       : VNx32HImode;
+      break;
+    case E_SImode:
+      return vf == 1   ? VNx2SImode
+	     : vf == 2 ? VNx4SImode
+	     : vf == 4 ? VNx8SImode
+		       : VNx16SImode;
+      break;
+    case E_DImode:
+      if (riscv_vector_elen_flags != MASK_VECTOR_ELEN_32
+	  && riscv_vector_elen_flags != MASK_VECTOR_ELEN_FP_32)
+	return vf == 1	 ? VNx1DImode
+	       : vf == 2 ? VNx2DImode
+	       : vf == 4 ? VNx4DImode
+			 : VNx8DImode;
+      break;
+    case E_SFmode:
+      if (TARGET_HARD_FLOAT && riscv_vector_elen_flags != MASK_VECTOR_ELEN_32
+	  && riscv_vector_elen_flags != MASK_VECTOR_ELEN_64)
+	return vf == 1	 ? VNx2SFmode
+	       : vf == 2 ? VNx4SFmode
+	       : vf == 4 ? VNx8SFmode
+			 : VNx16SFmode;
+      break;
+    case E_DFmode:
+      if (TARGET_DOUBLE_FLOAT && TARGET_VECTOR_ELEN_FP_64)
+	return vf == 1	 ? VNx1DFmode
+	       : vf == 2 ? VNx2DFmode
+	       : vf == 4 ? VNx4DFmode
+			 : VNx8DFmode;
+      break;
+    default:
+      break;
+    }
+
+  return word_mode;
+}
+
 /* Emit an RVV unmask && vl mov from SRC to DEST.  */
 static void
 emit_pred_op (unsigned icode, rtx mask, rtx dest, rtx src, rtx len,
@@ -421,6 +516,87 @@ get_avl_type_rtx (enum avl_type type)
   return gen_int_mode (type, Pmode);
 }
 
+rtx
+get_mask_policy_no_pred ()
+{
+  return get_mask_policy_for_pred (PRED_TYPE_none);
+}
+
+rtx
+get_tail_policy_no_pred ()
+{
+  return get_mask_policy_for_pred (PRED_TYPE_none);
+}
+
+/* Return true if it is a RVV tuple mode.  */
+bool
+riscv_tuple_mode_p (machine_mode mode ATTRIBUTE_UNUSED)
+{
+  return false;
+}
+
+/* Return nf for a machine mode.  */
+int
+riscv_classify_nf (machine_mode mode)
+{
+  switch (mode)
+    {
+
+    default:
+      break;
+    }
+
+  return 1;
+}
+
+/* Return vlmul register size for a machine mode.  */
+int
+riscv_vlmul_regsize (machine_mode mode)
+{
+  if (GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL)
+    return 1;
+  switch (riscv_classify_vlmul_field (mode))
+    {
+    case VLMUL_FIELD_001:
+      return 2;
+    case VLMUL_FIELD_010:
+      return 4;
+    case VLMUL_FIELD_011:
+      return 8;
+    case VLMUL_FIELD_100:
+      gcc_unreachable ();
+    default:
+      return 1;
+    }
+}
+
+/* Return true if it is a RVV mask mode.  */
+bool
+riscv_vector_mask_mode_p (machine_mode mode)
+{
+  return (mode == VNx1BImode || mode == VNx2BImode || mode == VNx4BImode
+	  || mode == VNx8BImode || mode == VNx16BImode || mode == VNx32BImode
+	  || mode == VNx64BImode);
+}
+
+/* Implement TARGET_VECTORIZE_GET_MASK_MODE for RVV.  */
+
+opt_machine_mode
+riscv_vector_get_mask_mode (machine_mode mode)
+{
+  machine_mode mask_mode;
+  int nf = 1;
+  if (riscv_tuple_mode_p (mode))
+    nf = riscv_classify_nf (mode);
+
+  FOR_EACH_MODE_IN_CLASS (mask_mode, MODE_VECTOR_BOOL)
+  if (GET_MODE_INNER (mask_mode) == BImode
+      && known_eq (GET_MODE_NUNITS (mask_mode) * nf, GET_MODE_NUNITS (mode))
+      && riscv_vector_mask_mode_p (mask_mode))
+    return mask_mode;
+  return default_get_mask_mode (mode);
+}
+
 /* Return the RVV vector mode that has NUNITS elements of mode INNER_MODE.
    This function is not only used by builtins, but also will be used by
    auto-vectorization in the future.  */
-- 
2.34.1


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v4 04/10] RISC-V:autovec: Add target vectorization hooks
  2023-04-17 18:36 [PATCH v4 00/10] RISC-V: Add autovec support Michael Collison
                   ` (2 preceding siblings ...)
  2023-04-17 18:36 ` [PATCH v4 03/10] RISC-V:autovec: Add auto-vectorization support functions Michael Collison
@ 2023-04-17 18:36 ` Michael Collison
  2023-04-19  1:04   ` Kito Cheng
  2023-04-20  2:11   ` juzhe.zhong
  2023-04-17 18:36 ` [PATCH v4 05/10] RISC-V:autovec: Add autovectorization patterns for binary integer operations Michael Collison
                   ` (7 subsequent siblings)
  11 siblings, 2 replies; 36+ messages in thread
From: Michael Collison @ 2023-04-17 18:36 UTC (permalink / raw)
  To: gcc-patches

2023-03-02  Michael Collison  <collison@rivosinc.com>
	    Juzhe Zhong  <juzhe.zhong@rivai.ai>

	* config/riscv/riscv.cc (riscv_option_override):
	Set riscv_vectorization_factor.
	(riscv_estimated_poly_value): Implement
	TARGET_ESTIMATED_POLY_VALUE.
	(riscv_preferred_simd_mode): Implement
	TARGET_VECTORIZE_PREFERRED_SIMD_MODE.
	(riscv_autovectorize_vector_modes): Implement
	TARGET_AUTOVECTORIZE_VECTOR_MODES.
	(riscv_get_mask_mode): Implement TARGET_VECTORIZE_GET_MASK_MODE.
	(riscv_empty_mask_is_expensive): Implement
	TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE.
	(riscv_vectorize_create_costs): Implement
	TARGET_VECTORIZE_CREATE_COSTS.
	(TARGET_ESTIMATED_POLY_VALUE): Register target macro.
	(TARGET_VECTORIZE_PREFERRED_SIMD_MODE): Ditto.
	(TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): Ditto.
	(TARGET_VECTORIZE_GET_MASK_MODE): Ditto.
	(TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE): Ditto.
	(TARGET_VECTORIZE_LOOP_LEN_OVERRIDE_MASK): Ditto.
---
 gcc/config/riscv/riscv.cc | 156 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 156 insertions(+)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index dc47434fac4..9af06d926cf 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -60,6 +60,15 @@ along with GCC; see the file COPYING3.  If not see
 #include "opts.h"
 #include "tm-constrs.h"
 #include "rtl-iter.h"
+#include "gimple.h"
+#include "cfghooks.h"
+#include "cfgloop.h"
+#include "cfgrtl.h"
+#include "sel-sched.h"
+#include "fold-const.h"
+#include "gimple-iterator.h"
+#include "gimple-expr.h"
+#include "tree-vectorizer.h"
 
 /* This file should be included last.  */
 #include "target-def.h"
@@ -275,6 +284,9 @@ poly_uint16 riscv_vector_chunks;
 /* The number of bytes in a vector chunk.  */
 unsigned riscv_bytes_per_vector_chunk;
 
+/* Prefer vf for auto-vectorizer.  */
+unsigned riscv_vectorization_factor;
+
 /* Index R is the smallest register class that contains register R.  */
 const enum reg_class riscv_regno_to_class[FIRST_PSEUDO_REGISTER] = {
   GR_REGS,	GR_REGS,	GR_REGS,	GR_REGS,
@@ -6363,6 +6375,10 @@ riscv_option_override (void)
 
   /* Convert -march to a chunks count.  */
   riscv_vector_chunks = riscv_convert_vector_bits ();
+
+  if (TARGET_VECTOR)
+    riscv_vectorization_factor = riscv_vector_lmul;
+
 }
 
 /* Implement TARGET_CONDITIONAL_REGISTER_USAGE.  */
@@ -7057,6 +7073,128 @@ riscv_dwarf_poly_indeterminate_value (unsigned int i, unsigned int *factor,
   return RISCV_DWARF_VLENB;
 }
 
+/* Implement TARGET_ESTIMATED_POLY_VALUE.
+   Look into the tuning structure for an estimate.
+   KIND specifies the type of requested estimate: min, max or likely.
+   For cores with a known RVV width all three estimates are the same.
+   For generic RVV tuning we want to distinguish the maximum estimate from
+   the minimum and likely ones.
+   The likely estimate is the same as the minimum in that case to give a
+   conservative behavior of auto-vectorizing with RVV when it is a win
+   even for 128-bit RVV.
+   When RVV width information is available VAL.coeffs[1] is multiplied by
+   the number of VQ chunks over the initial Advanced SIMD 128 bits.  */
+
+static HOST_WIDE_INT
+riscv_estimated_poly_value (poly_int64 val,
+			    poly_value_estimate_kind kind = POLY_VALUE_LIKELY)
+{
+  unsigned int width_source = BITS_PER_RISCV_VECTOR.is_constant ()
+    ? (unsigned int) BITS_PER_RISCV_VECTOR.to_constant ()
+    : (unsigned int) RVV_SCALABLE;
+
+  /* If there is no core-specific information then the minimum and likely
+     values are based on 128-bit vectors and the maximum is based on
+     the architectural maximum of 2048 bits.  */
+  if (width_source == RVV_SCALABLE)
+    switch (kind)
+      {
+      case POLY_VALUE_MIN:
+      case POLY_VALUE_LIKELY:
+	return val.coeffs[0];
+
+      case POLY_VALUE_MAX:
+	return val.coeffs[0] + val.coeffs[1] * 15;
+      }
+
+  /* Allow BITS_PER_RISCV_VECTOR to be a bitmask of different VL, treating the
+     lowest as likely.  This could be made more general if future -mtune
+     options need it to be.  */
+  if (kind == POLY_VALUE_MAX)
+    width_source = 1 << floor_log2 (width_source);
+  else
+    width_source = least_bit_hwi (width_source);
+
+  /* If the core provides width information, use that.  */
+  HOST_WIDE_INT over_128 = width_source - 128;
+  return val.coeffs[0] + val.coeffs[1] * over_128 / 128;
+}
+
+/* Implement TARGET_VECTORIZE_PREFERRED_SIMD_MODE.  */
+
+static machine_mode
+riscv_preferred_simd_mode (scalar_mode mode)
+{
+  machine_mode vmode =
+    riscv_vector::riscv_vector_preferred_simd_mode (mode,
+						    riscv_vectorization_factor);
+  if (VECTOR_MODE_P (vmode))
+    return vmode;
+
+  return word_mode;
+}
+
+/* Implement TARGET_AUTOVECTORIZE_VECTOR_MODES for RVV.  */
+static unsigned int
+riscv_autovectorize_vector_modes (vector_modes *modes, bool)
+{
+  if (!TARGET_VECTOR)
+    return 0;
+
+  if (riscv_vectorization_factor == RVV_LMUL1)
+    {
+      modes->safe_push (VNx16QImode);
+      modes->safe_push (VNx8QImode);
+      modes->safe_push (VNx4QImode);
+      modes->safe_push (VNx2QImode);
+    }
+  else if (riscv_vectorization_factor == RVV_LMUL2)
+    {
+      modes->safe_push (VNx32QImode);
+      modes->safe_push (VNx16QImode);
+      modes->safe_push (VNx8QImode);
+      modes->safe_push (VNx4QImode);
+    }
+  else if (riscv_vectorization_factor == RVV_LMUL4)
+    {
+      modes->safe_push (VNx64QImode);
+      modes->safe_push (VNx32QImode);
+      modes->safe_push (VNx16QImode);
+      modes->safe_push (VNx8QImode);
+    }
+  else
+    {
+      modes->safe_push (VNx64QImode);
+      modes->safe_push (VNx32QImode);
+      modes->safe_push (VNx16QImode);
+    }
+
+  return 0;
+}
+
+/* Implement TARGET_VECTORIZE_GET_MASK_MODE.  */
+
+static opt_machine_mode
+riscv_get_mask_mode (machine_mode mode)
+{
+  machine_mode mask_mode = VOIDmode;
+  if (TARGET_VECTOR
+      && riscv_vector::riscv_vector_get_mask_mode (mode).exists (&mask_mode))
+    return mask_mode;
+
+  return default_get_mask_mode (mode);
+}
+
+/* Implement TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE.  Assume for now that
+   it isn't worth branching around empty masked ops (including masked
+   stores).  */
+
+static bool
+riscv_empty_mask_is_expensive (unsigned)
+{
+  return false;
+}
+
 /* Return true if a shift-amount matches the trailing cleared bits on
    a bitmask.  */
 
@@ -7382,6 +7520,24 @@ riscv_zero_call_used_regs (HARD_REG_SET need_zeroed_hardregs)
 #undef TARGET_VERIFY_TYPE_CONTEXT
 #define TARGET_VERIFY_TYPE_CONTEXT riscv_verify_type_context
 
+#undef TARGET_ESTIMATED_POLY_VALUE
+#define TARGET_ESTIMATED_POLY_VALUE riscv_estimated_poly_value
+
+#undef TARGET_VECTORIZE_PREFERRED_SIMD_MODE
+#define TARGET_VECTORIZE_PREFERRED_SIMD_MODE riscv_preferred_simd_mode
+
+#undef TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES
+#define TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES riscv_autovectorize_vector_modes
+
+#undef TARGET_VECTORIZE_GET_MASK_MODE
+#define TARGET_VECTORIZE_GET_MASK_MODE riscv_get_mask_mode
+
+#undef TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE
+#define TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE riscv_empty_mask_is_expensive
+
+#undef TARGET_VECTORIZE_LOOP_LEN_OVERRIDE_MASK
+#define TARGET_VECTORIZE_LOOP_LEN_OVERRIDE_MASK riscv_loop_len_override_mask
+
 #undef TARGET_VECTOR_ALIGNMENT
 #define TARGET_VECTOR_ALIGNMENT riscv_vector_alignment
 
-- 
2.34.1


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v4 05/10] RISC-V:autovec: Add autovectorization patterns for binary integer operations
  2023-04-17 18:36 [PATCH v4 00/10] RISC-V: Add autovec support Michael Collison
                   ` (3 preceding siblings ...)
  2023-04-17 18:36 ` [PATCH v4 04/10] RISC-V:autovec: Add target vectorization hooks Michael Collison
@ 2023-04-17 18:36 ` Michael Collison
  2023-04-18 23:14   ` Jeff Law
                     ` (2 more replies)
  2023-04-17 18:36 ` [PATCH v4 06/10] RISC-V:autovec: Add autovectorization tests for add & sub Michael Collison
                   ` (6 subsequent siblings)
  11 siblings, 3 replies; 36+ messages in thread
From: Michael Collison @ 2023-04-17 18:36 UTC (permalink / raw)
  To: gcc-patches

2023-03-02  Michael Collison  <collison@rivosinc.com>
	    Juzhe Zhong  <juzhe.zhong@rivai.ai>

	* config/riscv/riscv.md (riscv_vector_preferred_simd_mode): Include
	vector-iterators.md.
	* config/riscv/vector-auto.md: New file containing
	autovectorization patterns.
	* config/riscv/vector-iterators.md (UNSPEC_VADD/UNSPEC_VSUB):
	New unspecs for autovectorization patterns.
	* config/riscv/vector.md: Remove include of vector-iterators.md
	and include vector-auto.md.
---
 gcc/config/riscv/riscv.md            |  1 +
 gcc/config/riscv/vector-auto.md      | 79 ++++++++++++++++++++++++++++
 gcc/config/riscv/vector-iterators.md |  2 +
 gcc/config/riscv/vector.md           |  4 +-
 4 files changed, 84 insertions(+), 2 deletions(-)
 create mode 100644 gcc/config/riscv/vector-auto.md

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index bc384d9aedf..7f8f3a6cb18 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -135,6 +135,7 @@
 (include "predicates.md")
 (include "constraints.md")
 (include "iterators.md")
+(include "vector-iterators.md")
 
 ;; ....................
 ;;
diff --git a/gcc/config/riscv/vector-auto.md b/gcc/config/riscv/vector-auto.md
new file mode 100644
index 00000000000..dc62f9af705
--- /dev/null
+++ b/gcc/config/riscv/vector-auto.md
@@ -0,0 +1,79 @@
+;; Machine description for RISC-V 'V' Extension for GNU compiler.
+;; Copyright (C) 2022-2023 Free Software Foundation, Inc.
+;; Contributed by Juzhe Zhong (juzhe.zhong@rivai.ai), RiVAI Technologies Ltd.
+;; Contributed by Michael Collison (collison@rivosinc.com, Rivos Inc.
+
+;; This file is part of GCC.
+
+;; GCC is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+
+;; GCC is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+
+;; -------------------------------------------------------------------------
+;; ---- [INT] Addition
+;; -------------------------------------------------------------------------
+;; Includes:
+;; - vadd.vv
+;; - vadd.vx
+;; - vadd.vi
+;; -------------------------------------------------------------------------
+
+(define_expand "<optab><mode>3"
+  [(set (match_operand:VI 0 "register_operand")
+	(any_int_binop:VI (match_operand:VI 1 "register_operand")
+			  (match_operand:VI 2 "register_operand")))]
+  "TARGET_VECTOR"
+{
+  using namespace riscv_vector;
+
+  rtx merge = RVV_VUNDEF (<MODE>mode);
+  rtx vl = gen_reg_rtx (Pmode);
+  emit_vlmax_vsetvl (<MODE>mode, vl);
+  rtx mask_policy = get_mask_policy_no_pred();
+  rtx tail_policy = get_tail_policy_no_pred();
+  rtx mask = CONSTM1_RTX(<VM>mode);
+  rtx vlmax_avl_p = get_avl_type_rtx(NONVLMAX);
+
+  emit_insn(gen_pred_<optab><mode>(operands[0], mask, merge, operands[1], operands[2],
+				vl, tail_policy, mask_policy, vlmax_avl_p));
+
+  DONE;
+})
+
+(define_expand "cond_<optab><mode>3"
+  [(set (match_operand:VI 0 "register_operand")
+	(if_then_else:VI
+	 (unspec:<VM>
+	  [(match_operand:<VM> 1 "register_operand")] UNSPEC_VPREDICATE)
+	 (any_int_binop:VI
+	  (match_operand:VI 2 "register_operand")
+	  (match_operand:VI 3 "register_operand"))
+	 (match_operand:VI 4 "register_operand")))]
+  "TARGET_VECTOR"
+{
+  using namespace riscv_vector;
+
+  rtx merge = operands[4];
+  rtx vl = gen_reg_rtx (Pmode);
+  emit_vlmax_vsetvl (<MODE>mode, vl);
+  rtx mask_policy = get_mask_policy_no_pred();
+  rtx tail_policy = get_tail_policy_no_pred();
+  rtx mask = operands[1];
+  rtx vlmax_avl_p = get_avl_type_rtx(NONVLMAX);
+
+  emit_insn(gen_pred_<optab><mode>(operands[0], mask, merge, operands[2], operands[3],
+				vl, tail_policy, mask_policy, vlmax_avl_p));
+  DONE;
+})
+
diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
index 70ad85b661b..7fae87968d7 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -34,6 +34,8 @@
   UNSPEC_VMULHU
   UNSPEC_VMULHSU
 
+  UNSPEC_VADD
+  UNSPEC_VSUB
   UNSPEC_VADC
   UNSPEC_VSBC
   UNSPEC_VMADC
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index 0ecca98f20c..2ac5b744503 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -26,8 +26,6 @@
 ;; - Auto-vectorization (TBD)
 ;; - Combine optimization (TBD)
 
-(include "vector-iterators.md")
-
 (define_constants [
    (INVALID_ATTRIBUTE            255)
    (X0_REGNUM                      0)
@@ -351,6 +349,8 @@
 	   (symbol_ref "INTVAL (operands[4])")]
 	(const_int INVALID_ATTRIBUTE)))
 
+(include "vector-auto.md")
+
 ;; -----------------------------------------------------------------
 ;; ---- Miscellaneous Operations
 ;; -----------------------------------------------------------------
-- 
2.34.1


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v4 06/10] RISC-V:autovec: Add autovectorization tests for add & sub
  2023-04-17 18:36 [PATCH v4 00/10] RISC-V: Add autovec support Michael Collison
                   ` (4 preceding siblings ...)
  2023-04-17 18:36 ` [PATCH v4 05/10] RISC-V:autovec: Add autovectorization patterns for binary integer operations Michael Collison
@ 2023-04-17 18:36 ` Michael Collison
  2023-04-17 18:36 ` [PATCH v4 07/10] vect: Verify that GET_MODE_NUNITS is a multiple of 2 Michael Collison
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 36+ messages in thread
From: Michael Collison @ 2023-04-17 18:36 UTC (permalink / raw)
  To: gcc-patches

2023-03-02  Michael Collison  <collison@rivosinc.com>
	    Vineet Gupta <vineetg@rivosinc.com>

	* gcc.target/riscv/rvv/autovec: New directory
	for autovectorization tests.
	* gcc.target/riscv/rvv/autovec/loop-add-rv32.c: New
	test to verify code generation of vector add on rv32.
	* gcc.target/riscv/rvv/autovec/loop-add.c: New
	test to verify code generation of vector add on rv64.
	* gcc.target/riscv/rvv/autovec/loop-sub-rv32.c: New
	test to verify code generation of vector subtract on rv32.
	* gcc.target/riscv/rvv/autovec/loop-sub.c: New
	test to verify code generation of vector subtract on rv64.
---
 .../riscv/rvv/autovec/loop-add-rv32.c         | 24 +++++++++++++++++++
 .../gcc.target/riscv/rvv/autovec/loop-add.c   | 24 +++++++++++++++++++
 .../riscv/rvv/autovec/loop-sub-rv32.c         | 24 +++++++++++++++++++
 .../gcc.target/riscv/rvv/autovec/loop-sub.c   | 24 +++++++++++++++++++
 4 files changed, 96 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add-rv32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub-rv32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add-rv32.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add-rv32.c
new file mode 100644
index 00000000000..bdc3b6892e9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add-rv32.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -ftree-vectorize -march=rv32gcv -mabi=ilp32d" } */
+
+#include <stdint.h>
+
+#define TEST_TYPE(TYPE) 				\
+  void vadd_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)	\
+  {							\
+    for (int i = 0; i < n; i++)				\
+      dst[i] = a[i] + b[i];				\
+  }
+
+/* *int8_t not autovec currently. */
+#define TEST_ALL()	\
+ TEST_TYPE(int16_t)	\
+ TEST_TYPE(uint16_t)	\
+ TEST_TYPE(int32_t)	\
+ TEST_TYPE(uint32_t)	\
+ TEST_TYPE(int64_t)	\
+ TEST_TYPE(uint64_t)
+
+TEST_ALL()
+
+/* { dg-final { scan-assembler-times {\tvadd\.vv} 6 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add.c
new file mode 100644
index 00000000000..d7f992c7d27
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -ftree-vectorize -march=rv64gcv -mabi=lp64d" } */
+
+#include <stdint.h>
+
+#define TEST_TYPE(TYPE) 				\
+  void vadd_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)	\
+  {							\
+    for (int i = 0; i < n; i++)				\
+      dst[i] = a[i] + b[i];				\
+  }
+
+/* *int8_t not autovec currently. */
+#define TEST_ALL()	\
+ TEST_TYPE(int16_t)	\
+ TEST_TYPE(uint16_t)	\
+ TEST_TYPE(int32_t)	\
+ TEST_TYPE(uint32_t)	\
+ TEST_TYPE(int64_t)	\
+ TEST_TYPE(uint64_t)
+
+TEST_ALL()
+
+/* { dg-final { scan-assembler-times {\tvadd\.vv} 6 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub-rv32.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub-rv32.c
new file mode 100644
index 00000000000..7d0a40ec539
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub-rv32.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -ftree-vectorize -march=rv32gcv -mabi=ilp32d" } */
+
+#include <stdint.h>
+
+#define TEST_TYPE(TYPE) 				\
+  void vadd_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)	\
+  {							\
+    for (int i = 0; i < n; i++)				\
+      dst[i] = a[i] - b[i];				\
+  }
+
+/* *int8_t not autovec currently. */
+#define TEST_ALL()	\
+ TEST_TYPE(int16_t)	\
+ TEST_TYPE(uint16_t)	\
+ TEST_TYPE(int32_t)	\
+ TEST_TYPE(uint32_t)	\
+ TEST_TYPE(int64_t)	\
+ TEST_TYPE(uint64_t)
+
+TEST_ALL()
+
+/* { dg-final { scan-assembler-times {\tvsub\.vv} 6 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub.c
new file mode 100644
index 00000000000..c8900884f83
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -ftree-vectorize -march=rv64gcv -mabi=lp64d" } */
+
+#include <stdint.h>
+
+#define TEST_TYPE(TYPE) 				\
+  void vadd_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)	\
+  {							\
+    for (int i = 0; i < n; i++)				\
+      dst[i] = a[i] - b[i];				\
+  }
+
+/* *int8_t not autovec currently. */
+#define TEST_ALL()	\
+ TEST_TYPE(int16_t)	\
+ TEST_TYPE(uint16_t)	\
+ TEST_TYPE(int32_t)	\
+ TEST_TYPE(uint32_t)	\
+ TEST_TYPE(int64_t)	\
+ TEST_TYPE(uint64_t)
+
+TEST_ALL()
+
+/* { dg-final { scan-assembler-times {\tvsub\.vv} 6 } } */
-- 
2.34.1


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v4 07/10] vect: Verify that GET_MODE_NUNITS is a multiple of 2.
  2023-04-17 18:36 [PATCH v4 00/10] RISC-V: Add autovec support Michael Collison
                   ` (5 preceding siblings ...)
  2023-04-17 18:36 ` [PATCH v4 06/10] RISC-V:autovec: Add autovectorization tests for add & sub Michael Collison
@ 2023-04-17 18:36 ` Michael Collison
  2023-04-18  6:11   ` Richard Biener
  2023-04-17 18:36 ` [PATCH v4 08/10] RISC-V:autovec: Add autovectorization tests for binary integer Michael Collison
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 36+ messages in thread
From: Michael Collison @ 2023-04-17 18:36 UTC (permalink / raw)
  To: gcc-patches

While working on autovectorizing for the RISCV port I encountered an issue
where can_duplicate_and_interleave_p assumes that GET_MODE_NUNITS is a
evenly divisible by two. The RISC-V target has vector modes (e.g. VNx1DImode),
where GET_MODE_NUNITS is equal to one.

Tested on RISCV and x86_64-linux-gnu. Okay?

2023-03-09  Michael Collison  <collison@rivosinc.com>

	* tree-vect-slp.cc (can_duplicate_and_interleave_p):
	Check that GET_MODE_NUNITS is a multiple of 2.
---
 gcc/tree-vect-slp.cc | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index d73deaecce0..a64fe454e19 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -423,10 +423,13 @@ can_duplicate_and_interleave_p (vec_info *vinfo, unsigned int count,
 	    (GET_MODE_BITSIZE (int_mode), 1);
 	  tree vector_type
 	    = get_vectype_for_scalar_type (vinfo, int_type, count);
+	  poly_int64 half_nelts;
 	  if (vector_type
 	      && VECTOR_MODE_P (TYPE_MODE (vector_type))
 	      && known_eq (GET_MODE_SIZE (TYPE_MODE (vector_type)),
-			   GET_MODE_SIZE (base_vector_mode)))
+			   GET_MODE_SIZE (base_vector_mode))
+	      && multiple_p (GET_MODE_NUNITS (TYPE_MODE (vector_type)),
+			     2, &half_nelts))
 	    {
 	      /* Try fusing consecutive sequences of COUNT / NVECTORS elements
 		 together into elements of type INT_TYPE and using the result
@@ -434,7 +437,7 @@ can_duplicate_and_interleave_p (vec_info *vinfo, unsigned int count,
 	      poly_uint64 nelts = GET_MODE_NUNITS (TYPE_MODE (vector_type));
 	      vec_perm_builder sel1 (nelts, 2, 3);
 	      vec_perm_builder sel2 (nelts, 2, 3);
-	      poly_int64 half_nelts = exact_div (nelts, 2);
+
 	      for (unsigned int i = 0; i < 3; ++i)
 		{
 		  sel1.quick_push (i);
-- 
2.34.1


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v4 08/10] RISC-V:autovec: Add autovectorization tests for binary integer
  2023-04-17 18:36 [PATCH v4 00/10] RISC-V: Add autovec support Michael Collison
                   ` (6 preceding siblings ...)
  2023-04-17 18:36 ` [PATCH v4 07/10] vect: Verify that GET_MODE_NUNITS is a multiple of 2 Michael Collison
@ 2023-04-17 18:36 ` Michael Collison
  2023-04-17 18:37 ` [PATCH v4 09/10] This patch adds a guard for VNx1 vectors that are present in ports like riscv Michael Collison
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 36+ messages in thread
From: Michael Collison @ 2023-04-17 18:36 UTC (permalink / raw)
  To: gcc-patches

2023-04-05  Michael Collison  <collison@rivosinc.com>

	* gcc.target/riscv/rvv/autovec/loop-and-rv32.c: New
        test to verify code generation of vector "and" on rv32.
        * gcc.target/riscv/rvv/autovec/loop-and.c: New
        test to verify code generation of vector "and" on rv64.
        * gcc.target/riscv/rvv/autovec/loop-div-rv32.c: New
        test to verify code generation of vector divide on rv32.
        * gcc.target/riscv/rvv/autovec/loop-div.c: New
        test to verify code generation of vector divide on rv64.
        * gcc.target/riscv/rvv/autovec/loop-max-rv32.c: New
        test to verify code generation of vector maximum on rv32.
        * gcc.target/riscv/rvv/autovec/loop-max.c: New
        test to verify code generation of vector maximum on rv64.
        * gcc.target/riscv/rvv/autovec/loop-min-rv32.c: New
        test to verify code generation of vector minimum on rv32.
        * gcc.target/riscv/rvv/autovec/loop-min.c: New
        test to verify code generation of vector minimum on rv64.
        * gcc.target/riscv/rvv/autovec/loop-mod-rv32.c: New
        test to verify code generation of vector modulus on rv32.
        * gcc.target/riscv/rvv/autovec/loop-mod.c: New
        test to verify code generation of vector modulus on rv64.
        * gcc.target/riscv/rvv/autovec/loop-mul-rv32.c: New
        test to verify code generation of vector multiply on rv32.
        * gcc.target/riscv/rvv/autovec/loop-mul.c: New
        test to verify code generation of vector multiply on rv64.
        * gcc.target/riscv/rvv/autovec/loop-or-rv32.c: New
        test to verify code generation of vector "or" on rv32.
        * gcc.target/riscv/rvv/autovec/loop-or.c: New
        test to verify code generation of vector "or" on rv64.
        * gcc.target/riscv/rvv/autovec/loop-xor-rv32.c: New
        test to verify code generation of vector xor on rv32.
        * gcc.target/riscv/rvv/autovec/loop-xor.c: New
        test to verify code generation of vector xor on rv64.
---
 .../riscv/rvv/autovec/loop-and-rv32.c         | 24 ++++++++++++++++++
 .../gcc.target/riscv/rvv/autovec/loop-and.c   | 24 ++++++++++++++++++
 .../riscv/rvv/autovec/loop-div-rv32.c         | 25 +++++++++++++++++++
 .../gcc.target/riscv/rvv/autovec/loop-div.c   | 25 +++++++++++++++++++
 .../riscv/rvv/autovec/loop-max-rv32.c         | 25 +++++++++++++++++++
 .../gcc.target/riscv/rvv/autovec/loop-max.c   | 25 +++++++++++++++++++
 .../riscv/rvv/autovec/loop-min-rv32.c         | 25 +++++++++++++++++++
 .../gcc.target/riscv/rvv/autovec/loop-min.c   | 25 +++++++++++++++++++
 .../riscv/rvv/autovec/loop-mod-rv32.c         | 25 +++++++++++++++++++
 .../gcc.target/riscv/rvv/autovec/loop-mod.c   | 25 +++++++++++++++++++
 .../riscv/rvv/autovec/loop-mul-rv32.c         | 24 ++++++++++++++++++
 .../gcc.target/riscv/rvv/autovec/loop-mul.c   | 24 ++++++++++++++++++
 .../riscv/rvv/autovec/loop-or-rv32.c          | 24 ++++++++++++++++++
 .../gcc.target/riscv/rvv/autovec/loop-or.c    | 24 ++++++++++++++++++
 .../riscv/rvv/autovec/loop-xor-rv32.c         | 24 ++++++++++++++++++
 .../gcc.target/riscv/rvv/autovec/loop-xor.c   | 24 ++++++++++++++++++
 gcc/testsuite/gcc.target/riscv/rvv/rvv.exp    |  3 +++
 17 files changed, 395 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-and-rv32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-and.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-div-rv32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-div.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-max-rv32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-max.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-min-rv32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-min.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mod-rv32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mod.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mul-rv32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mul.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-or-rv32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-or.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-xor-rv32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-xor.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-and-rv32.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-and-rv32.c
new file mode 100644
index 00000000000..eb1ac5b44fd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-and-rv32.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -ftree-vectorize -march=rv32gcv -mabi=ilp32d" } */
+
+#include <stdint.h>
+
+#define TEST_TYPE(TYPE) 				\
+  void vand_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)	\
+  {							\
+    for (int i = 0; i < n; i++)				\
+      dst[i] = a[i] & b[i];				\
+  }
+
+/* *int8_t not autovec currently. */
+#define TEST_ALL()	\
+ TEST_TYPE(int16_t)	\
+ TEST_TYPE(uint16_t)	\
+ TEST_TYPE(int32_t)	\
+ TEST_TYPE(uint32_t)	\
+ TEST_TYPE(int64_t)	\
+ TEST_TYPE(uint64_t)
+
+TEST_ALL()
+
+/* { dg-final { scan-assembler-times {\tvand\.vv} 6 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-and.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-and.c
new file mode 100644
index 00000000000..ff0cc2a5df7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-and.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -ftree-vectorize -march=rv64gcv -mabi=lp64d" } */
+
+#include <stdint.h>
+
+#define TEST_TYPE(TYPE) 				\
+  void vand_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)	\
+  {							\
+    for (int i = 0; i < n; i++)				\
+      dst[i] = a[i] & b[i];				\
+  }
+
+/* *int8_t not autovec currently. */
+#define TEST_ALL()	\
+ TEST_TYPE(int16_t)	\
+ TEST_TYPE(uint16_t)	\
+ TEST_TYPE(int32_t)	\
+ TEST_TYPE(uint32_t)	\
+ TEST_TYPE(int64_t)	\
+ TEST_TYPE(uint64_t)
+
+TEST_ALL()
+
+/* { dg-final { scan-assembler-times {\tvand\.vv} 6 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-div-rv32.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-div-rv32.c
new file mode 100644
index 00000000000..21960f265b7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-div-rv32.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -ftree-vectorize -march=rv32gcv -mabi=ilp32d" } */
+
+#include <stdint.h>
+
+#define TEST_TYPE(TYPE) 				\
+  void vdiv_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)	\
+  {							\
+    for (int i = 0; i < n; i++)				\
+      dst[i] = a[i] / b[i];				\
+  }
+
+/* *int8_t not autovec currently. */
+#define TEST_ALL()	\
+ TEST_TYPE(int16_t)	\
+ TEST_TYPE(uint16_t)	\
+ TEST_TYPE(int32_t)	\
+ TEST_TYPE(uint32_t)	\
+ TEST_TYPE(int64_t)	\
+ TEST_TYPE(uint64_t)
+
+TEST_ALL()
+
+/* { dg-final { scan-assembler-times {\tvdiv\.vv} 3 } } */
+/* { dg-final { scan-assembler-times {\tvdivu\.vv} 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-div.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-div.c
new file mode 100644
index 00000000000..bd675b4f6f0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-div.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -ftree-vectorize -march=rv64gcv -mabi=lp64d" } */
+
+#include <stdint.h>
+
+#define TEST_TYPE(TYPE) 				\
+  void vdiv_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)	\
+  {							\
+    for (int i = 0; i < n; i++)				\
+      dst[i] = a[i] / b[i];				\
+  }
+
+/* *int8_t not autovec currently. */
+#define TEST_ALL()	\
+ TEST_TYPE(int16_t)	\
+ TEST_TYPE(uint16_t)	\
+ TEST_TYPE(int32_t)	\
+ TEST_TYPE(uint32_t)	\
+ TEST_TYPE(int64_t)	\
+ TEST_TYPE(uint64_t)
+
+TEST_ALL()
+
+/* { dg-final { scan-assembler-times {\tvdiv\.vv} 3 } } */
+/* { dg-final { scan-assembler-times {\tvdivu\.vv} 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-max-rv32.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-max-rv32.c
new file mode 100644
index 00000000000..751ee9ecaa3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-max-rv32.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -ftree-vectorize -march=rv32gcv -mabi=ilp32d" } */
+
+#include <stdint.h>
+
+#define TEST_TYPE(TYPE) 				\
+  void vmax_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)	\
+  {							\
+    for (int i = 0; i < n; i++)				\
+      dst[i] = a[i] >= b[i] ? a[i] : b[i];			\
+  }
+
+/* *int8_t not autovec currently. */
+#define TEST_ALL()	\
+ TEST_TYPE(int16_t)	\
+ TEST_TYPE(uint16_t)	\
+ TEST_TYPE(int32_t)	\
+ TEST_TYPE(uint32_t)	\
+ TEST_TYPE(int64_t)	\
+ TEST_TYPE(uint64_t)
+
+TEST_ALL()
+
+/* { dg-final { scan-assembler-times {\tvmax\.vv} 3 } } */
+/* { dg-final { scan-assembler-times {\tvmaxu\.vv} 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-max.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-max.c
new file mode 100644
index 00000000000..f4dbf3f04fc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-max.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -ftree-vectorize -march=rv64gcv -mabi=lp64d" } */
+
+#include <stdint.h>
+
+#define TEST_TYPE(TYPE) 				\
+  void vmax_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)	\
+  {							\
+    for (int i = 0; i < n; i++)				\
+      dst[i] = a[i] >= b[i] ? a[i] : b[i];			\
+  }
+
+/* *int8_t not autovec currently. */
+#define TEST_ALL()	\
+ TEST_TYPE(int16_t)	\
+ TEST_TYPE(uint16_t)	\
+ TEST_TYPE(int32_t)	\
+ TEST_TYPE(uint32_t)	\
+ TEST_TYPE(int64_t)	\
+ TEST_TYPE(uint64_t)
+
+TEST_ALL()
+
+/* { dg-final { scan-assembler-times {\tvmax\.vv} 3 } } */
+/* { dg-final { scan-assembler-times {\tvmaxu\.vv} 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-min-rv32.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-min-rv32.c
new file mode 100644
index 00000000000..e51cf590577
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-min-rv32.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -ftree-vectorize -march=rv32gcv -mabi=ilp32d" } */
+
+#include <stdint.h>
+
+#define TEST_TYPE(TYPE) 				\
+  void vmin_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)	\
+  {							\
+    for (int i = 0; i < n; i++)				\
+      dst[i] = a[i] <= b[i] ? a[i] : b[i];			\
+  }
+
+/* *int8_t not autovec currently. */
+#define TEST_ALL()	\
+ TEST_TYPE(int16_t)	\
+ TEST_TYPE(uint16_t)	\
+ TEST_TYPE(int32_t)	\
+ TEST_TYPE(uint32_t)	\
+ TEST_TYPE(int64_t)	\
+ TEST_TYPE(uint64_t)
+
+TEST_ALL()
+
+/* { dg-final { scan-assembler-times {\tvmin\.vv} 3 } } */
+/* { dg-final { scan-assembler-times {\tvminu\.vv} 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-min.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-min.c
new file mode 100644
index 00000000000..304f939f6f9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-min.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -ftree-vectorize -march=rv64gcv -mabi=lp64d" } */
+
+#include <stdint.h>
+
+#define TEST_TYPE(TYPE) 				\
+  void vmin_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)	\
+  {							\
+    for (int i = 0; i < n; i++)				\
+      dst[i] = a[i] <= b[i] ? a[i] : b[i];			\
+  }
+
+/* *int8_t not autovec currently. */
+#define TEST_ALL()	\
+ TEST_TYPE(int16_t)	\
+ TEST_TYPE(uint16_t)	\
+ TEST_TYPE(int32_t)	\
+ TEST_TYPE(uint32_t)	\
+ TEST_TYPE(int64_t)	\
+ TEST_TYPE(uint64_t)
+
+TEST_ALL()
+
+/* { dg-final { scan-assembler-times {\tvmin\.vv} 3 } } */
+/* { dg-final { scan-assembler-times {\tvminu\.vv} 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mod-rv32.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mod-rv32.c
new file mode 100644
index 00000000000..7c497f6e4cc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mod-rv32.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -ftree-vectorize -march=rv32gcv -mabi=ilp32d" } */
+
+#include <stdint.h>
+
+#define TEST_TYPE(TYPE) 				\
+  void vmod_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)	\
+  {							\
+    for (int i = 0; i < n; i++)				\
+      dst[i] = a[i] % b[i];				\
+  }
+
+/* *int8_t not autovec currently. */
+#define TEST_ALL()	\
+ TEST_TYPE(int16_t)	\
+ TEST_TYPE(uint16_t)	\
+ TEST_TYPE(int32_t)	\
+ TEST_TYPE(uint32_t)	\
+ TEST_TYPE(int64_t)	\
+ TEST_TYPE(uint64_t)
+
+TEST_ALL()
+
+/* { dg-final { scan-assembler-times {\tvrem\.vv} 3 } } */
+/* { dg-final { scan-assembler-times {\tvremu\.vv} 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mod.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mod.c
new file mode 100644
index 00000000000..7508f4a50d1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mod.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -ftree-vectorize -march=rv64gcv -mabi=lp64d" } */
+
+#include <stdint.h>
+
+#define TEST_TYPE(TYPE) 				\
+  void vmod_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)	\
+  {							\
+    for (int i = 0; i < n; i++)				\
+      dst[i] = a[i] % b[i];				\
+  }
+
+/* *int8_t not autovec currently. */
+#define TEST_ALL()	\
+ TEST_TYPE(int16_t)	\
+ TEST_TYPE(uint16_t)	\
+ TEST_TYPE(int32_t)	\
+ TEST_TYPE(uint32_t)	\
+ TEST_TYPE(int64_t)	\
+ TEST_TYPE(uint64_t)
+
+TEST_ALL()
+
+/* { dg-final { scan-assembler-times {\tvrem\.vv} 3 } } */
+/* { dg-final { scan-assembler-times {\tvremu\.vv} 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mul-rv32.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mul-rv32.c
new file mode 100644
index 00000000000..fd6dcbf9c53
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mul-rv32.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -ftree-vectorize -march=rv32gcv -mabi=ilp32d" } */
+
+#include <stdint.h>
+
+#define TEST_TYPE(TYPE) 				\
+  void vadd_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)	\
+  {							\
+    for (int i = 0; i < n; i++)				\
+      dst[i] = a[i] * b[i];				\
+  }
+
+/* *int8_t not autovec currently. */
+#define TEST_ALL()	\
+ TEST_TYPE(int16_t)	\
+ TEST_TYPE(uint16_t)	\
+ TEST_TYPE(int32_t)	\
+ TEST_TYPE(uint32_t)	\
+ TEST_TYPE(int64_t)	\
+ TEST_TYPE(uint64_t)
+
+TEST_ALL()
+
+/* { dg-final { scan-assembler-times {\tvmul\.vv} 6 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mul.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mul.c
new file mode 100644
index 00000000000..9fce40890ef
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mul.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -ftree-vectorize -march=rv64gcv -mabi=lp64d" } */
+
+#include <stdint.h>
+
+#define TEST_TYPE(TYPE) 				\
+  void vadd_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)	\
+  {							\
+    for (int i = 0; i < n; i++)				\
+      dst[i] = a[i] * b[i];				\
+  }
+
+/* *int8_t not autovec currently. */
+#define TEST_ALL()	\
+ TEST_TYPE(int16_t)	\
+ TEST_TYPE(uint16_t)	\
+ TEST_TYPE(int32_t)	\
+ TEST_TYPE(uint32_t)	\
+ TEST_TYPE(int64_t)	\
+ TEST_TYPE(uint64_t)
+
+TEST_ALL()
+
+/* { dg-final { scan-assembler-times {\tvmul\.vv} 6 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-or-rv32.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-or-rv32.c
new file mode 100644
index 00000000000..305d106abd9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-or-rv32.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -ftree-vectorize -march=rv32gcv -mabi=ilp32d" } */
+
+#include <stdint.h>
+
+#define TEST_TYPE(TYPE) 				\
+  void vor_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)	\
+  {							\
+    for (int i = 0; i < n; i++)				\
+      dst[i] = a[i] | b[i];				\
+  }
+
+/* *int8_t not autovec currently. */
+#define TEST_ALL()	\
+ TEST_TYPE(int16_t)	\
+ TEST_TYPE(uint16_t)	\
+ TEST_TYPE(int32_t)	\
+ TEST_TYPE(uint32_t)	\
+ TEST_TYPE(int64_t)	\
+ TEST_TYPE(uint64_t)
+
+TEST_ALL()
+
+/* { dg-final { scan-assembler-times {\tvor\.vv} 6 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-or.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-or.c
new file mode 100644
index 00000000000..501017bc790
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-or.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -ftree-vectorize -march=rv64gcv -mabi=lp64d" } */
+
+#include <stdint.h>
+
+#define TEST_TYPE(TYPE) 				\
+  void vor_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)	\
+  {							\
+    for (int i = 0; i < n; i++)				\
+      dst[i] = a[i] | b[i];				\
+  }
+
+/* *int8_t not autovec currently. */
+#define TEST_ALL()	\
+ TEST_TYPE(int16_t)	\
+ TEST_TYPE(uint16_t)	\
+ TEST_TYPE(int32_t)	\
+ TEST_TYPE(uint32_t)	\
+ TEST_TYPE(int64_t)	\
+ TEST_TYPE(uint64_t)
+
+TEST_ALL()
+
+/* { dg-final { scan-assembler-times {\tvor\.vv} 6 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-xor-rv32.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-xor-rv32.c
new file mode 100644
index 00000000000..6a9ffdb11d5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-xor-rv32.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -ftree-vectorize -march=rv32gcv -mabi=ilp32d" } */
+
+#include <stdint.h>
+
+#define TEST_TYPE(TYPE) 				\
+  void vxor_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)	\
+  {							\
+    for (int i = 0; i < n; i++)				\
+      dst[i] = a[i] ^ b[i];				\
+  }
+
+/* *int8_t not autovec currently. */
+#define TEST_ALL()	\
+ TEST_TYPE(int16_t)	\
+ TEST_TYPE(uint16_t)	\
+ TEST_TYPE(int32_t)	\
+ TEST_TYPE(uint32_t)	\
+ TEST_TYPE(int64_t)	\
+ TEST_TYPE(uint64_t)
+
+TEST_ALL()
+
+/* { dg-final { scan-assembler-times {\tvxor\.vv} 6 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-xor.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-xor.c
new file mode 100644
index 00000000000..c9d7d7f8a75
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-xor.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -ftree-vectorize -march=rv64gcv -mabi=lp64d" } */
+
+#include <stdint.h>
+
+#define TEST_TYPE(TYPE) 				\
+  void vxor_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)	\
+  {							\
+    for (int i = 0; i < n; i++)				\
+      dst[i] = a[i] ^ b[i];				\
+  }
+
+/* *int8_t not autovec currently. */
+#define TEST_ALL()	\
+ TEST_TYPE(int16_t)	\
+ TEST_TYPE(uint16_t)	\
+ TEST_TYPE(int32_t)	\
+ TEST_TYPE(uint32_t)	\
+ TEST_TYPE(int64_t)	\
+ TEST_TYPE(uint64_t)
+
+TEST_ALL()
+
+/* { dg-final { scan-assembler-times {\tvxor\.vv} 6 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp b/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp
index 7a9a2b6ac48..081fa9363de 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp
+++ b/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp
@@ -40,10 +40,13 @@ dg-init
 
 # Main loop.
 set CFLAGS "$DEFAULT_CFLAGS -march=$gcc_march -O3"
+set AUTOVECFLAGS "$DEFAULT_CFLAGS -march=$gcc_march -O2 -fno-vect-cost-model -std=c99"
 dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/base/*.\[cS\]]] \
 	"" $CFLAGS
 gcc-dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/vsetvl/*.\[cS\]]] \
 	"" $CFLAGS
+dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/autovec/*.\[cS\]]] \
+	"" $AUTOVECFLAGS
 
 # All done.
 dg-finish
-- 
2.34.1


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v4 09/10] This patch adds a guard for VNx1 vectors that are present in ports like riscv.
  2023-04-17 18:36 [PATCH v4 00/10] RISC-V: Add autovec support Michael Collison
                   ` (7 preceding siblings ...)
  2023-04-17 18:36 ` [PATCH v4 08/10] RISC-V:autovec: Add autovectorization tests for binary integer Michael Collison
@ 2023-04-17 18:37 ` Michael Collison
  2023-04-18 14:26   ` Kito Cheng
  2023-04-17 18:37 ` [PATCH v4 10/10] This patch supports 8 bit auto-vectorization in riscv Michael Collison
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 36+ messages in thread
From: Michael Collison @ 2023-04-17 18:37 UTC (permalink / raw)
  To: gcc-patches

From: Kevin Lee <kevinl@rivosinc.com>

Kevin Lee <kevinl@rivosinc.com>
gcc/ChangeLog:

	* tree-vect-data-refs.cc (vect_grouped_store_supported): Add new
condition
---
 gcc/tree-vect-data-refs.cc | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
index 8daf7bd7dd3..df393ba723d 100644
--- a/gcc/tree-vect-data-refs.cc
+++ b/gcc/tree-vect-data-refs.cc
@@ -5399,6 +5399,8 @@ vect_grouped_store_supported (tree vectype, unsigned HOST_WIDE_INT count)
 	  poly_uint64 nelt = GET_MODE_NUNITS (mode);
 
 	  /* The encoding has 2 interleaved stepped patterns.  */
+    if(!multiple_p (nelt, 2))
+      return false;
 	  vec_perm_builder sel (nelt, 2, 3);
 	  sel.quick_grow (6);
 	  for (i = 0; i < 3; i++)
-- 
2.34.1


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v4 10/10] This patch supports 8 bit auto-vectorization in riscv.
  2023-04-17 18:36 [PATCH v4 00/10] RISC-V: Add autovec support Michael Collison
                   ` (8 preceding siblings ...)
  2023-04-17 18:37 ` [PATCH v4 09/10] This patch adds a guard for VNx1 vectors that are present in ports like riscv Michael Collison
@ 2023-04-17 18:37 ` Michael Collison
  2023-04-17 19:26 ` [PATCH v4 00/10] RISC-V: Add autovec support Palmer Dabbelt
  2023-04-25 15:26 ` Palmer Dabbelt
  11 siblings, 0 replies; 36+ messages in thread
From: Michael Collison @ 2023-04-17 18:37 UTC (permalink / raw)
  To: gcc-patches

From: Kevin Lee <kevinl@rivosinc.com>

2023-04-14 Kevin Lee <kevinl@rivosinc.com>
gcc/testsuite/ChangeLog:

	* config/riscv/riscv.cc (riscv_autovectorize_vector_modes): Add
new vector mode
	* gcc.target/riscv/rvv/autovec/loop-add-rv32.c: Support 8bit
type
	* gcc.target/riscv/rvv/autovec/loop-add.c: Ditto
	* gcc.target/riscv/rvv/autovec/loop-and-rv32.c: Ditto
	* gcc.target/riscv/rvv/autovec/loop-and.c: Ditto
	* gcc.target/riscv/rvv/autovec/loop-div-rv32.c: Ditto
	* gcc.target/riscv/rvv/autovec/loop-div.c: Ditto
	* gcc.target/riscv/rvv/autovec/loop-max-rv32.c: Ditto
	* gcc.target/riscv/rvv/autovec/loop-max.c: Ditto
	* gcc.target/riscv/rvv/autovec/loop-min-rv32.c: Ditto
	* gcc.target/riscv/rvv/autovec/loop-min.c: Ditto
	* gcc.target/riscv/rvv/autovec/loop-mod-rv32.c: Ditto
	* gcc.target/riscv/rvv/autovec/loop-mod.c: Ditto
	* gcc.target/riscv/rvv/autovec/loop-mul-rv32.c: Ditto
	* gcc.target/riscv/rvv/autovec/loop-mul.c: Ditto
	* gcc.target/riscv/rvv/autovec/loop-or-rv32.c: Ditto
	* gcc.target/riscv/rvv/autovec/loop-or.c: Ditto
	* gcc.target/riscv/rvv/autovec/loop-sub-rv32.c: Ditto
	* gcc.target/riscv/rvv/autovec/loop-sub.c: Ditto
	* gcc.target/riscv/rvv/autovec/loop-xor-rv32.c: Ditto
	* gcc.target/riscv/rvv/autovec/loop-xor.c: Ditto
---
 gcc/config/riscv/riscv.cc                                 | 1 +
 .../gcc.target/riscv/rvv/autovec/loop-add-rv32.c          | 5 +++--
 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add.c     | 5 +++--
 .../gcc.target/riscv/rvv/autovec/loop-and-rv32.c          | 5 +++--
 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-and.c     | 5 +++--
 .../gcc.target/riscv/rvv/autovec/loop-div-rv32.c          | 8 +++++---
 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-div.c     | 8 +++++---
 .../gcc.target/riscv/rvv/autovec/loop-max-rv32.c          | 7 ++++---
 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-max.c     | 7 ++++---
 .../gcc.target/riscv/rvv/autovec/loop-min-rv32.c          | 7 ++++---
 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-min.c     | 7 ++++---
 .../gcc.target/riscv/rvv/autovec/loop-mod-rv32.c          | 8 +++++---
 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mod.c     | 8 +++++---
 .../gcc.target/riscv/rvv/autovec/loop-mul-rv32.c          | 5 +++--
 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mul.c     | 5 +++--
 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-or-rv32.c | 5 +++--
 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-or.c      | 5 +++--
 .../gcc.target/riscv/rvv/autovec/loop-sub-rv32.c          | 5 +++--
 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub.c     | 5 +++--
 .../gcc.target/riscv/rvv/autovec/loop-xor-rv32.c          | 5 +++--
 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-xor.c     | 5 +++--
 21 files changed, 73 insertions(+), 48 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 9af06d926cf..a2cb83e1916 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -7147,6 +7147,7 @@ riscv_autovectorize_vector_modes (vector_modes *modes, bool)
       modes->safe_push (VNx8QImode);
       modes->safe_push (VNx4QImode);
       modes->safe_push (VNx2QImode);
+      modes->safe_push (VNx1QImode);
     }
   else if (riscv_vectorization_factor == RVV_LMUL2)
     {
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add-rv32.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add-rv32.c
index bdc3b6892e9..76f5a3a3ff5 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add-rv32.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add-rv32.c
@@ -10,8 +10,9 @@
       dst[i] = a[i] + b[i];				\
   }
 
-/* *int8_t not autovec currently. */
 #define TEST_ALL()	\
+ TEST_TYPE(int8_t)	\
+ TEST_TYPE(uint8_t)	\
  TEST_TYPE(int16_t)	\
  TEST_TYPE(uint16_t)	\
  TEST_TYPE(int32_t)	\
@@ -21,4 +22,4 @@
 
 TEST_ALL()
 
-/* { dg-final { scan-assembler-times {\tvadd\.vv} 6 } } */
+/* { dg-final { scan-assembler-times {\tvadd\.vv} 8 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add.c
index d7f992c7d27..3d1e10bf4e1 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add.c
@@ -10,8 +10,9 @@
       dst[i] = a[i] + b[i];				\
   }
 
-/* *int8_t not autovec currently. */
 #define TEST_ALL()	\
+ TEST_TYPE(int8_t)	\
+ TEST_TYPE(uint8_t)	\
  TEST_TYPE(int16_t)	\
  TEST_TYPE(uint16_t)	\
  TEST_TYPE(int32_t)	\
@@ -21,4 +22,4 @@
 
 TEST_ALL()
 
-/* { dg-final { scan-assembler-times {\tvadd\.vv} 6 } } */
+/* { dg-final { scan-assembler-times {\tvadd\.vv} 8 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-and-rv32.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-and-rv32.c
index eb1ac5b44fd..a4c7abfb0ad 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-and-rv32.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-and-rv32.c
@@ -10,8 +10,9 @@
       dst[i] = a[i] & b[i];				\
   }
 
-/* *int8_t not autovec currently. */
 #define TEST_ALL()	\
+ TEST_TYPE(int8_t)	\
+ TEST_TYPE(uint8_t)	\
  TEST_TYPE(int16_t)	\
  TEST_TYPE(uint16_t)	\
  TEST_TYPE(int32_t)	\
@@ -21,4 +22,4 @@
 
 TEST_ALL()
 
-/* { dg-final { scan-assembler-times {\tvand\.vv} 6 } } */
+/* { dg-final { scan-assembler-times {\tvand\.vv} 8 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-and.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-and.c
index ff0cc2a5df7..a795e0968a9 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-and.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-and.c
@@ -10,8 +10,9 @@
       dst[i] = a[i] & b[i];				\
   }
 
-/* *int8_t not autovec currently. */
 #define TEST_ALL()	\
+ TEST_TYPE(int8_t)	\
+ TEST_TYPE(uint8_t)	\
  TEST_TYPE(int16_t)	\
  TEST_TYPE(uint16_t)	\
  TEST_TYPE(int32_t)	\
@@ -21,4 +22,4 @@
 
 TEST_ALL()
 
-/* { dg-final { scan-assembler-times {\tvand\.vv} 6 } } */
+/* { dg-final { scan-assembler-times {\tvand\.vv} 8 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-div-rv32.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-div-rv32.c
index 21960f265b7..c734bb9c5f0 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-div-rv32.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-div-rv32.c
@@ -10,8 +10,9 @@
       dst[i] = a[i] / b[i];				\
   }
 
-/* *int8_t not autovec currently. */
 #define TEST_ALL()	\
+ TEST_TYPE(int8_t)	\
+ TEST_TYPE(uint8_t)	\
  TEST_TYPE(int16_t)	\
  TEST_TYPE(uint16_t)	\
  TEST_TYPE(int32_t)	\
@@ -21,5 +22,6 @@
 
 TEST_ALL()
 
-/* { dg-final { scan-assembler-times {\tvdiv\.vv} 3 } } */
-/* { dg-final { scan-assembler-times {\tvdivu\.vv} 3 } } */
+/* int8_t and int16_t not autovec currently */
+/* { dg-final { scan-assembler-times {\tvdiv\.vv} 2 } } */
+/* { dg-final { scan-assembler-times {\tvdivu\.vv} 4 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-div.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-div.c
index bd675b4f6f0..9f57cd91054 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-div.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-div.c
@@ -10,8 +10,9 @@
       dst[i] = a[i] / b[i];				\
   }
 
-/* *int8_t not autovec currently. */
 #define TEST_ALL()	\
+ TEST_TYPE(int8_t)	\
+ TEST_TYPE(uint8_t)	\
  TEST_TYPE(int16_t)	\
  TEST_TYPE(uint16_t)	\
  TEST_TYPE(int32_t)	\
@@ -21,5 +22,6 @@
 
 TEST_ALL()
 
-/* { dg-final { scan-assembler-times {\tvdiv\.vv} 3 } } */
-/* { dg-final { scan-assembler-times {\tvdivu\.vv} 3 } } */
+/* int8_t and int16_t not autovec currently */
+/* { dg-final { scan-assembler-times {\tvdiv\.vv} 2 } } */
+/* { dg-final { scan-assembler-times {\tvdivu\.vv} 4 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-max-rv32.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-max-rv32.c
index 751ee9ecaa3..bd825c3dfaa 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-max-rv32.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-max-rv32.c
@@ -10,8 +10,9 @@
       dst[i] = a[i] >= b[i] ? a[i] : b[i];			\
   }
 
-/* *int8_t not autovec currently. */
 #define TEST_ALL()	\
+ TEST_TYPE(int8_t)	\
+ TEST_TYPE(uint8_t)	\
  TEST_TYPE(int16_t)	\
  TEST_TYPE(uint16_t)	\
  TEST_TYPE(int32_t)	\
@@ -21,5 +22,5 @@
 
 TEST_ALL()
 
-/* { dg-final { scan-assembler-times {\tvmax\.vv} 3 } } */
-/* { dg-final { scan-assembler-times {\tvmaxu\.vv} 3 } } */
+/* { dg-final { scan-assembler-times {\tvmax\.vv} 4 } } */
+/* { dg-final { scan-assembler-times {\tvmaxu\.vv} 4 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-max.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-max.c
index f4dbf3f04fc..729fbe0bc76 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-max.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-max.c
@@ -10,8 +10,9 @@
       dst[i] = a[i] >= b[i] ? a[i] : b[i];			\
   }
 
-/* *int8_t not autovec currently. */
 #define TEST_ALL()	\
+ TEST_TYPE(int8_t)	\
+ TEST_TYPE(uint8_t)	\
  TEST_TYPE(int16_t)	\
  TEST_TYPE(uint16_t)	\
  TEST_TYPE(int32_t)	\
@@ -21,5 +22,5 @@
 
 TEST_ALL()
 
-/* { dg-final { scan-assembler-times {\tvmax\.vv} 3 } } */
-/* { dg-final { scan-assembler-times {\tvmaxu\.vv} 3 } } */
+/* { dg-final { scan-assembler-times {\tvmax\.vv} 4 } } */
+/* { dg-final { scan-assembler-times {\tvmaxu\.vv} 4 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-min-rv32.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-min-rv32.c
index e51cf590577..808c2879d86 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-min-rv32.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-min-rv32.c
@@ -10,8 +10,9 @@
       dst[i] = a[i] <= b[i] ? a[i] : b[i];			\
   }
 
-/* *int8_t not autovec currently. */
 #define TEST_ALL()	\
+ TEST_TYPE(int8_t)	\
+ TEST_TYPE(uint8_t)	\
  TEST_TYPE(int16_t)	\
  TEST_TYPE(uint16_t)	\
  TEST_TYPE(int32_t)	\
@@ -21,5 +22,5 @@
 
 TEST_ALL()
 
-/* { dg-final { scan-assembler-times {\tvmin\.vv} 3 } } */
-/* { dg-final { scan-assembler-times {\tvminu\.vv} 3 } } */
+/* { dg-final { scan-assembler-times {\tvmin\.vv} 4 } } */
+/* { dg-final { scan-assembler-times {\tvminu\.vv} 4 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-min.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-min.c
index 304f939f6f9..c81ba64223f 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-min.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-min.c
@@ -10,8 +10,9 @@
       dst[i] = a[i] <= b[i] ? a[i] : b[i];			\
   }
 
-/* *int8_t not autovec currently. */
 #define TEST_ALL()	\
+ TEST_TYPE(int8_t)	\
+ TEST_TYPE(uint8_t)	\
  TEST_TYPE(int16_t)	\
  TEST_TYPE(uint16_t)	\
  TEST_TYPE(int32_t)	\
@@ -21,5 +22,5 @@
 
 TEST_ALL()
 
-/* { dg-final { scan-assembler-times {\tvmin\.vv} 3 } } */
-/* { dg-final { scan-assembler-times {\tvminu\.vv} 3 } } */
+/* { dg-final { scan-assembler-times {\tvmin\.vv} 4 } } */
+/* { dg-final { scan-assembler-times {\tvminu\.vv} 4 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mod-rv32.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mod-rv32.c
index 7c497f6e4cc..9ce4f82b3a8 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mod-rv32.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mod-rv32.c
@@ -10,8 +10,9 @@
       dst[i] = a[i] % b[i];				\
   }
 
-/* *int8_t not autovec currently. */
 #define TEST_ALL()	\
+ TEST_TYPE(int8_t)	\
+ TEST_TYPE(uint8_t)	\
  TEST_TYPE(int16_t)	\
  TEST_TYPE(uint16_t)	\
  TEST_TYPE(int32_t)	\
@@ -21,5 +22,6 @@
 
 TEST_ALL()
 
-/* { dg-final { scan-assembler-times {\tvrem\.vv} 3 } } */
-/* { dg-final { scan-assembler-times {\tvremu\.vv} 3 } } */
+/* int8_t and int16_t not autovec currently */
+/* { dg-final { scan-assembler-times {\tvrem\.vv} 2 } } */
+/* { dg-final { scan-assembler-times {\tvremu\.vv} 4 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mod.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mod.c
index 7508f4a50d1..46fbff22266 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mod.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mod.c
@@ -10,8 +10,9 @@
       dst[i] = a[i] % b[i];				\
   }
 
-/* *int8_t not autovec currently. */
 #define TEST_ALL()	\
+ TEST_TYPE(int8_t)	\
+ TEST_TYPE(uint8_t)	\
  TEST_TYPE(int16_t)	\
  TEST_TYPE(uint16_t)	\
  TEST_TYPE(int32_t)	\
@@ -21,5 +22,6 @@
 
 TEST_ALL()
 
-/* { dg-final { scan-assembler-times {\tvrem\.vv} 3 } } */
-/* { dg-final { scan-assembler-times {\tvremu\.vv} 3 } } */
+/* int8_t and int16_t not autovec currently */
+/* { dg-final { scan-assembler-times {\tvrem\.vv} 2 } } */
+/* { dg-final { scan-assembler-times {\tvremu\.vv} 4 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mul-rv32.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mul-rv32.c
index fd6dcbf9c53..336af62359e 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mul-rv32.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mul-rv32.c
@@ -10,8 +10,9 @@
       dst[i] = a[i] * b[i];				\
   }
 
-/* *int8_t not autovec currently. */
 #define TEST_ALL()	\
+ TEST_TYPE(int8_t)	\
+ TEST_TYPE(uint8_t)	\
  TEST_TYPE(int16_t)	\
  TEST_TYPE(uint16_t)	\
  TEST_TYPE(int32_t)	\
@@ -21,4 +22,4 @@
 
 TEST_ALL()
 
-/* { dg-final { scan-assembler-times {\tvmul\.vv} 6 } } */
+/* { dg-final { scan-assembler-times {\tvmul\.vv} 8 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mul.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mul.c
index 9fce40890ef..12a17d0da00 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mul.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mul.c
@@ -10,8 +10,9 @@
       dst[i] = a[i] * b[i];				\
   }
 
-/* *int8_t not autovec currently. */
 #define TEST_ALL()	\
+ TEST_TYPE(int8_t)	\
+ TEST_TYPE(uint8_t)	\
  TEST_TYPE(int16_t)	\
  TEST_TYPE(uint16_t)	\
  TEST_TYPE(int32_t)	\
@@ -21,4 +22,4 @@
 
 TEST_ALL()
 
-/* { dg-final { scan-assembler-times {\tvmul\.vv} 6 } } */
+/* { dg-final { scan-assembler-times {\tvmul\.vv} 8 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-or-rv32.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-or-rv32.c
index 305d106abd9..b272d893114 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-or-rv32.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-or-rv32.c
@@ -10,8 +10,9 @@
       dst[i] = a[i] | b[i];				\
   }
 
-/* *int8_t not autovec currently. */
 #define TEST_ALL()	\
+ TEST_TYPE(int8_t)	\
+ TEST_TYPE(uint8_t)	\
  TEST_TYPE(int16_t)	\
  TEST_TYPE(uint16_t)	\
  TEST_TYPE(int32_t)	\
@@ -21,4 +22,4 @@
 
 TEST_ALL()
 
-/* { dg-final { scan-assembler-times {\tvor\.vv} 6 } } */
+/* { dg-final { scan-assembler-times {\tvor\.vv} 8 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-or.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-or.c
index 501017bc790..52243be3712 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-or.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-or.c
@@ -10,8 +10,9 @@
       dst[i] = a[i] | b[i];				\
   }
 
-/* *int8_t not autovec currently. */
 #define TEST_ALL()	\
+ TEST_TYPE(int8_t)	\
+ TEST_TYPE(uint8_t)	\
  TEST_TYPE(int16_t)	\
  TEST_TYPE(uint16_t)	\
  TEST_TYPE(int32_t)	\
@@ -21,4 +22,4 @@
 
 TEST_ALL()
 
-/* { dg-final { scan-assembler-times {\tvor\.vv} 6 } } */
+/* { dg-final { scan-assembler-times {\tvor\.vv} 8 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub-rv32.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub-rv32.c
index 7d0a40ec539..6fdce0f7881 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub-rv32.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub-rv32.c
@@ -10,8 +10,9 @@
       dst[i] = a[i] - b[i];				\
   }
 
-/* *int8_t not autovec currently. */
 #define TEST_ALL()	\
+ TEST_TYPE(int8_t)	\
+ TEST_TYPE(uint8_t)	\
  TEST_TYPE(int16_t)	\
  TEST_TYPE(uint16_t)	\
  TEST_TYPE(int32_t)	\
@@ -21,4 +22,4 @@
 
 TEST_ALL()
 
-/* { dg-final { scan-assembler-times {\tvsub\.vv} 6 } } */
+/* { dg-final { scan-assembler-times {\tvsub\.vv} 8 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub.c
index c8900884f83..73369745afc 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub.c
@@ -10,8 +10,9 @@
       dst[i] = a[i] - b[i];				\
   }
 
-/* *int8_t not autovec currently. */
 #define TEST_ALL()	\
+ TEST_TYPE(int8_t)	\
+ TEST_TYPE(uint8_t)	\
  TEST_TYPE(int16_t)	\
  TEST_TYPE(uint16_t)	\
  TEST_TYPE(int32_t)	\
@@ -21,4 +22,4 @@
 
 TEST_ALL()
 
-/* { dg-final { scan-assembler-times {\tvsub\.vv} 6 } } */
+/* { dg-final { scan-assembler-times {\tvsub\.vv} 8 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-xor-rv32.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-xor-rv32.c
index 6a9ffdb11d5..bd43e60cceb 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-xor-rv32.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-xor-rv32.c
@@ -10,8 +10,9 @@
       dst[i] = a[i] ^ b[i];				\
   }
 
-/* *int8_t not autovec currently. */
 #define TEST_ALL()	\
+ TEST_TYPE(int8_t)	\
+ TEST_TYPE(uint8_t)	\
  TEST_TYPE(int16_t)	\
  TEST_TYPE(uint16_t)	\
  TEST_TYPE(int32_t)	\
@@ -21,4 +22,4 @@
 
 TEST_ALL()
 
-/* { dg-final { scan-assembler-times {\tvxor\.vv} 6 } } */
+/* { dg-final { scan-assembler-times {\tvxor\.vv} 8 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-xor.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-xor.c
index c9d7d7f8a75..cb3adde80c9 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-xor.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-xor.c
@@ -10,8 +10,9 @@
       dst[i] = a[i] ^ b[i];				\
   }
 
-/* *int8_t not autovec currently. */
 #define TEST_ALL()	\
+ TEST_TYPE(int8_t)	\
+ TEST_TYPE(uint8_t)	\
  TEST_TYPE(int16_t)	\
  TEST_TYPE(uint16_t)	\
  TEST_TYPE(int32_t)	\
@@ -21,4 +22,4 @@
 
 TEST_ALL()
 
-/* { dg-final { scan-assembler-times {\tvxor\.vv} 6 } } */
+/* { dg-final { scan-assembler-times {\tvxor\.vv} 8 } } */
-- 
2.34.1


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 00/10] RISC-V: Add autovec support
  2023-04-17 18:36 [PATCH v4 00/10] RISC-V: Add autovec support Michael Collison
                   ` (9 preceding siblings ...)
  2023-04-17 18:37 ` [PATCH v4 10/10] This patch supports 8 bit auto-vectorization in riscv Michael Collison
@ 2023-04-17 19:26 ` Palmer Dabbelt
  2023-04-18  6:22   ` Richard Biener
  2023-04-25 15:26 ` Palmer Dabbelt
  11 siblings, 1 reply; 36+ messages in thread
From: Palmer Dabbelt @ 2023-04-17 19:26 UTC (permalink / raw)
  To: collison; +Cc: gcc-patches

On Mon, 17 Apr 2023 11:36:51 PDT (-0700), collison@rivosinc.com wrote:
> This series of patches adds foundational support for RISC-V auto-vectorization support. These patches are based on the current upstream rvv vector intrinsic support and is not a new implementation. Most of the implementation consists of adding the new vector cost model, the autovectorization patterns themselves and target hooks. This implementation only provides support for integer addition and subtraction as a proof of concept. This patch set should not be construed to be feature complete. Based on conversations with the community these patches are intended to lay the groundwork for feature completion and collaboration within the RISC-V community.
>
> These patches are largely based off the work of Juzhe Zhong (juzhe.zhong@rivai.ai<mailto:juzhe.zhong@rivai.ai>) of RiVAI. More specifically the rvv-next branch at: https://github.com/riscv-collab/riscv-gcc.git <https://github.com/riscv-collab/riscv-gcc.git>is the foundation of this patch set.
>
> As discussed on this list, if these patches are approved they will be merged into a "auto-vectorization" branch once gcc-13 branches for release. There are two known issues related to crashes (assert failures) associated with tree vectorization; one of which I have sent a patch for and have received feedback.
>
> Changes in v4:
>
> - Added support for binary integer operations and test cases
> - Fixed bug to support 8-bit integer vectorization
> - Fixed several assert errors related to non-multiple of two vector modes
>
> Changes in v3:
>
> - Removed the cost model and cost hooks based on feedback from Richard Biener
> - Used RVV_VUNDEF macro to fix failing patterns
>
> Changes in v2
>
> - Updated ChangeLog entry to include RiVAI contributions
> - Fixed ChangeLog email formatting
> - Fixed gnu formatting issues in the code
>
> Kevin Lee (2):
>   This patch adds a guard for VNx1 vectors that are present in ports
>     like riscv.
>   This patch supports 8 bit auto-vectorization in riscv.
>
> Michael Collison (8):
>   RISC-V: Add new predicates and function prototypes
>   RISC-V: autovec: Export policy functions to global scope
>   RISC-V:autovec: Add auto-vectorization support functions
>   RISC-V:autovec: Add target vectorization hooks
>   RISC-V:autovec: Add autovectorization patterns for binary integer
>     operations
>   RISC-V:autovec: Add autovectorization tests for add & sub
>   vect: Verify that GET_MODE_NUNITS is a multiple of 2.
>   RISC-V:autovec: Add autovectorization tests for binary integer
>
>  gcc/config/riscv/predicates.md                |  13 ++
>  gcc/config/riscv/riscv-opts.h                 |  40 ++++
>  gcc/config/riscv/riscv-protos.h               |  14 ++
>  gcc/config/riscv/riscv-v.cc                   | 176 ++++++++++++++++++
>  gcc/config/riscv/riscv-vector-builtins.cc     |   4 +-
>  gcc/config/riscv/riscv-vector-builtins.h      |   3 +
>  gcc/config/riscv/riscv.cc                     | 157 ++++++++++++++++
>  gcc/config/riscv/riscv.md                     |   1 +
>  gcc/config/riscv/riscv.opt                    |  20 ++
>  gcc/config/riscv/vector-auto.md               |  79 ++++++++
>  gcc/config/riscv/vector-iterators.md          |   2 +
>  gcc/config/riscv/vector.md                    |   4 +-
>  .../riscv/rvv/autovec/loop-add-rv32.c         |  25 +++
>  .../gcc.target/riscv/rvv/autovec/loop-add.c   |  25 +++
>  .../riscv/rvv/autovec/loop-and-rv32.c         |  25 +++
>  .../gcc.target/riscv/rvv/autovec/loop-and.c   |  25 +++
>  .../riscv/rvv/autovec/loop-div-rv32.c         |  27 +++
>  .../gcc.target/riscv/rvv/autovec/loop-div.c   |  27 +++
>  .../riscv/rvv/autovec/loop-max-rv32.c         |  26 +++
>  .../gcc.target/riscv/rvv/autovec/loop-max.c   |  26 +++
>  .../riscv/rvv/autovec/loop-min-rv32.c         |  26 +++
>  .../gcc.target/riscv/rvv/autovec/loop-min.c   |  26 +++
>  .../riscv/rvv/autovec/loop-mod-rv32.c         |  27 +++
>  .../gcc.target/riscv/rvv/autovec/loop-mod.c   |  27 +++
>  .../riscv/rvv/autovec/loop-mul-rv32.c         |  25 +++
>  .../gcc.target/riscv/rvv/autovec/loop-mul.c   |  25 +++
>  .../riscv/rvv/autovec/loop-or-rv32.c          |  25 +++
>  .../gcc.target/riscv/rvv/autovec/loop-or.c    |  25 +++
>  .../riscv/rvv/autovec/loop-sub-rv32.c         |  25 +++
>  .../gcc.target/riscv/rvv/autovec/loop-sub.c   |  25 +++
>  .../riscv/rvv/autovec/loop-xor-rv32.c         |  25 +++
>  .../gcc.target/riscv/rvv/autovec/loop-xor.c   |  25 +++
>  gcc/testsuite/gcc.target/riscv/rvv/rvv.exp    |   3 +
>  gcc/tree-vect-data-refs.cc                    |   2 +
>  gcc/tree-vect-slp.cc                          |   7 +-
>  35 files changed, 1031 insertions(+), 6 deletions(-)
>  create mode 100644 gcc/config/riscv/vector-auto.md
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add-rv32.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-and-rv32.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-and.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-div-rv32.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-div.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-max-rv32.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-max.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-min-rv32.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-min.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mod-rv32.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mod.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mul-rv32.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mul.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-or-rv32.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-or.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub-rv32.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-xor-rv32.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-xor.c

Thanks for re-spinning these.  I haven't looked at the actual code yet, 
but I think there's still a bigger question as to which way we go here: 
Juzhe has talked about wanting to make some larger changes, but as per 
some IRC discussions at least the type widening (and possible some of 
the other bigger generic changes) aren't suitable for trunk yet as we're 
still waiting for the test failures to calm down.

So I think that leaves us with the option of either taking something 
like this now, or waiting.  I'd prefer to just get things committed to 
trunk so we can all work in the same place, but happy to hear if other 
folks have comments.

I certainly don't intend on committing any of this until it's at least 
reviewed and folks are OK with the approach.  There's still some 
testsuite failures to track down for 13 so no big rush on actually 
committing things, but a bunch of folks are spinning up on autovec now 
so I'd at least like to get agreement as to which direction we're headed 
sooner rather than later.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 07/10] vect: Verify that GET_MODE_NUNITS is a multiple of 2.
  2023-04-17 18:36 ` [PATCH v4 07/10] vect: Verify that GET_MODE_NUNITS is a multiple of 2 Michael Collison
@ 2023-04-18  6:11   ` Richard Biener
  2023-04-18 14:28     ` Kito Cheng
  0 siblings, 1 reply; 36+ messages in thread
From: Richard Biener @ 2023-04-18  6:11 UTC (permalink / raw)
  To: Michael Collison; +Cc: gcc-patches

On Mon, Apr 17, 2023 at 8:42 PM Michael Collison <collison@rivosinc.com> wrote:
>
> While working on autovectorizing for the RISCV port I encountered an issue
> where can_duplicate_and_interleave_p assumes that GET_MODE_NUNITS is a
> evenly divisible by two. The RISC-V target has vector modes (e.g. VNx1DImode),
> where GET_MODE_NUNITS is equal to one.
>
> Tested on RISCV and x86_64-linux-gnu. Okay?

OK.

> 2023-03-09  Michael Collison  <collison@rivosinc.com>
>
>         * tree-vect-slp.cc (can_duplicate_and_interleave_p):
>         Check that GET_MODE_NUNITS is a multiple of 2.
> ---
>  gcc/tree-vect-slp.cc | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
> index d73deaecce0..a64fe454e19 100644
> --- a/gcc/tree-vect-slp.cc
> +++ b/gcc/tree-vect-slp.cc
> @@ -423,10 +423,13 @@ can_duplicate_and_interleave_p (vec_info *vinfo, unsigned int count,
>             (GET_MODE_BITSIZE (int_mode), 1);
>           tree vector_type
>             = get_vectype_for_scalar_type (vinfo, int_type, count);
> +         poly_int64 half_nelts;
>           if (vector_type
>               && VECTOR_MODE_P (TYPE_MODE (vector_type))
>               && known_eq (GET_MODE_SIZE (TYPE_MODE (vector_type)),
> -                          GET_MODE_SIZE (base_vector_mode)))
> +                          GET_MODE_SIZE (base_vector_mode))
> +             && multiple_p (GET_MODE_NUNITS (TYPE_MODE (vector_type)),
> +                            2, &half_nelts))
>             {
>               /* Try fusing consecutive sequences of COUNT / NVECTORS elements
>                  together into elements of type INT_TYPE and using the result
> @@ -434,7 +437,7 @@ can_duplicate_and_interleave_p (vec_info *vinfo, unsigned int count,
>               poly_uint64 nelts = GET_MODE_NUNITS (TYPE_MODE (vector_type));
>               vec_perm_builder sel1 (nelts, 2, 3);
>               vec_perm_builder sel2 (nelts, 2, 3);
> -             poly_int64 half_nelts = exact_div (nelts, 2);
> +
>               for (unsigned int i = 0; i < 3; ++i)
>                 {
>                   sel1.quick_push (i);
> --
> 2.34.1
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 00/10] RISC-V: Add autovec support
  2023-04-17 19:26 ` [PATCH v4 00/10] RISC-V: Add autovec support Palmer Dabbelt
@ 2023-04-18  6:22   ` Richard Biener
  0 siblings, 0 replies; 36+ messages in thread
From: Richard Biener @ 2023-04-18  6:22 UTC (permalink / raw)
  To: Palmer Dabbelt; +Cc: collison, gcc-patches

On Mon, Apr 17, 2023 at 9:26 PM Palmer Dabbelt <palmer@rivosinc.com> wrote:
>
> On Mon, 17 Apr 2023 11:36:51 PDT (-0700), collison@rivosinc.com wrote:
> > This series of patches adds foundational support for RISC-V auto-vectorization support. These patches are based on the current upstream rvv vector intrinsic support and is not a new implementation. Most of the implementation consists of adding the new vector cost model, the autovectorization patterns themselves and target hooks. This implementation only provides support for integer addition and subtraction as a proof of concept. This patch set should not be construed to be feature complete. Based on conversations with the community these patches are intended to lay the groundwork for feature completion and collaboration within the RISC-V community.
> >
> > These patches are largely based off the work of Juzhe Zhong (juzhe.zhong@rivai.ai<mailto:juzhe.zhong@rivai.ai>) of RiVAI. More specifically the rvv-next branch at: https://github.com/riscv-collab/riscv-gcc.git <https://github.com/riscv-collab/riscv-gcc.git>is the foundation of this patch set.
> >
> > As discussed on this list, if these patches are approved they will be merged into a "auto-vectorization" branch once gcc-13 branches for release. There are two known issues related to crashes (assert failures) associated with tree vectorization; one of which I have sent a patch for and have received feedback.
> >
> > Changes in v4:
> >
> > - Added support for binary integer operations and test cases
> > - Fixed bug to support 8-bit integer vectorization
> > - Fixed several assert errors related to non-multiple of two vector modes
> >
> > Changes in v3:
> >
> > - Removed the cost model and cost hooks based on feedback from Richard Biener
> > - Used RVV_VUNDEF macro to fix failing patterns
> >
> > Changes in v2
> >
> > - Updated ChangeLog entry to include RiVAI contributions
> > - Fixed ChangeLog email formatting
> > - Fixed gnu formatting issues in the code
> >
> > Kevin Lee (2):
> >   This patch adds a guard for VNx1 vectors that are present in ports
> >     like riscv.
> >   This patch supports 8 bit auto-vectorization in riscv.
> >
> > Michael Collison (8):
> >   RISC-V: Add new predicates and function prototypes
> >   RISC-V: autovec: Export policy functions to global scope
> >   RISC-V:autovec: Add auto-vectorization support functions
> >   RISC-V:autovec: Add target vectorization hooks
> >   RISC-V:autovec: Add autovectorization patterns for binary integer
> >     operations
> >   RISC-V:autovec: Add autovectorization tests for add & sub
> >   vect: Verify that GET_MODE_NUNITS is a multiple of 2.
> >   RISC-V:autovec: Add autovectorization tests for binary integer
> >
> >  gcc/config/riscv/predicates.md                |  13 ++
> >  gcc/config/riscv/riscv-opts.h                 |  40 ++++
> >  gcc/config/riscv/riscv-protos.h               |  14 ++
> >  gcc/config/riscv/riscv-v.cc                   | 176 ++++++++++++++++++
> >  gcc/config/riscv/riscv-vector-builtins.cc     |   4 +-
> >  gcc/config/riscv/riscv-vector-builtins.h      |   3 +
> >  gcc/config/riscv/riscv.cc                     | 157 ++++++++++++++++
> >  gcc/config/riscv/riscv.md                     |   1 +
> >  gcc/config/riscv/riscv.opt                    |  20 ++
> >  gcc/config/riscv/vector-auto.md               |  79 ++++++++
> >  gcc/config/riscv/vector-iterators.md          |   2 +
> >  gcc/config/riscv/vector.md                    |   4 +-
> >  .../riscv/rvv/autovec/loop-add-rv32.c         |  25 +++
> >  .../gcc.target/riscv/rvv/autovec/loop-add.c   |  25 +++
> >  .../riscv/rvv/autovec/loop-and-rv32.c         |  25 +++
> >  .../gcc.target/riscv/rvv/autovec/loop-and.c   |  25 +++
> >  .../riscv/rvv/autovec/loop-div-rv32.c         |  27 +++
> >  .../gcc.target/riscv/rvv/autovec/loop-div.c   |  27 +++
> >  .../riscv/rvv/autovec/loop-max-rv32.c         |  26 +++
> >  .../gcc.target/riscv/rvv/autovec/loop-max.c   |  26 +++
> >  .../riscv/rvv/autovec/loop-min-rv32.c         |  26 +++
> >  .../gcc.target/riscv/rvv/autovec/loop-min.c   |  26 +++
> >  .../riscv/rvv/autovec/loop-mod-rv32.c         |  27 +++
> >  .../gcc.target/riscv/rvv/autovec/loop-mod.c   |  27 +++
> >  .../riscv/rvv/autovec/loop-mul-rv32.c         |  25 +++
> >  .../gcc.target/riscv/rvv/autovec/loop-mul.c   |  25 +++
> >  .../riscv/rvv/autovec/loop-or-rv32.c          |  25 +++
> >  .../gcc.target/riscv/rvv/autovec/loop-or.c    |  25 +++
> >  .../riscv/rvv/autovec/loop-sub-rv32.c         |  25 +++
> >  .../gcc.target/riscv/rvv/autovec/loop-sub.c   |  25 +++
> >  .../riscv/rvv/autovec/loop-xor-rv32.c         |  25 +++
> >  .../gcc.target/riscv/rvv/autovec/loop-xor.c   |  25 +++
> >  gcc/testsuite/gcc.target/riscv/rvv/rvv.exp    |   3 +
> >  gcc/tree-vect-data-refs.cc                    |   2 +
> >  gcc/tree-vect-slp.cc                          |   7 +-
> >  35 files changed, 1031 insertions(+), 6 deletions(-)
> >  create mode 100644 gcc/config/riscv/vector-auto.md
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add-rv32.c
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add.c
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-and-rv32.c
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-and.c
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-div-rv32.c
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-div.c
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-max-rv32.c
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-max.c
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-min-rv32.c
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-min.c
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mod-rv32.c
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mod.c
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mul-rv32.c
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mul.c
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-or-rv32.c
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-or.c
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub-rv32.c
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub.c
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-xor-rv32.c
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-xor.c
>
> Thanks for re-spinning these.  I haven't looked at the actual code yet,
> but I think there's still a bigger question as to which way we go here:
> Juzhe has talked about wanting to make some larger changes, but as per
> some IRC discussions at least the type widening (and possible some of
> the other bigger generic changes) aren't suitable for trunk yet as we're
> still waiting for the test failures to calm down.
>
> So I think that leaves us with the option of either taking something
> like this now, or waiting.  I'd prefer to just get things committed to
> trunk so we can all work in the same place, but happy to hear if other
> folks have comments.
>
> I certainly don't intend on committing any of this until it's at least
> reviewed and folks are OK with the approach.  There's still some
> testsuite failures to track down for 13 so no big rush on actually
> committing things, but a bunch of folks are spinning up on autovec now
> so I'd at least like to get agreement as to which direction we're headed
> sooner rather than later.

And just to add - the feature that will tell you whether the backend setup
is sustainable is vectorization of loads and stores and that's what I'd
start with, not plus or minus (that itself might also show obvious issues
of course, and without plus or minus but also bit operations most of the
testsuite will fail).

Richard.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 09/10] This patch adds a guard for VNx1 vectors that are present in ports like riscv.
  2023-04-17 18:37 ` [PATCH v4 09/10] This patch adds a guard for VNx1 vectors that are present in ports like riscv Michael Collison
@ 2023-04-18 14:26   ` Kito Cheng
  2023-04-18 18:10     ` Michael Collison
  0 siblings, 1 reply; 36+ messages in thread
From: Kito Cheng @ 2023-04-18 14:26 UTC (permalink / raw)
  To: Michael Collison; +Cc: gcc-patches

I would prefer drop this patch from this patch series since I believe
https://patchwork.ozlabs.org/project/gcc/patch/20230414014518.15458-1-juzhe.zhong@rivai.ai/
is the right fix for this issue.

On Tue, Apr 18, 2023 at 2:40 AM Michael Collison <collison@rivosinc.com> wrote:
>
> From: Kevin Lee <kevinl@rivosinc.com>
>
> Kevin Lee <kevinl@rivosinc.com>
> gcc/ChangeLog:
>
>         * tree-vect-data-refs.cc (vect_grouped_store_supported): Add new
> condition
> ---
>  gcc/tree-vect-data-refs.cc | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
> index 8daf7bd7dd3..df393ba723d 100644
> --- a/gcc/tree-vect-data-refs.cc
> +++ b/gcc/tree-vect-data-refs.cc
> @@ -5399,6 +5399,8 @@ vect_grouped_store_supported (tree vectype, unsigned HOST_WIDE_INT count)
>           poly_uint64 nelt = GET_MODE_NUNITS (mode);
>
>           /* The encoding has 2 interleaved stepped patterns.  */
> +    if(!multiple_p (nelt, 2))
> +      return false;
>           vec_perm_builder sel (nelt, 2, 3);
>           sel.quick_grow (6);
>           for (i = 0; i < 3; i++)
> --
> 2.34.1
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 07/10] vect: Verify that GET_MODE_NUNITS is a multiple of 2.
  2023-04-18  6:11   ` Richard Biener
@ 2023-04-18 14:28     ` Kito Cheng
  2023-04-18 18:21       ` Kito Cheng
  0 siblings, 1 reply; 36+ messages in thread
From: Kito Cheng @ 2023-04-18 14:28 UTC (permalink / raw)
  To: Richard Biener; +Cc: Michael Collison, gcc-patches

Wait, VNx1DImode can be really evaluate to just one element if
-march=rv64g_zve64x,

I thinks this should be just fixed on backend by this patch:

https://patchwork.ozlabs.org/project/gcc/patch/20230414014518.15458-1-juzhe.zhong@rivai.ai/

On Tue, Apr 18, 2023 at 2:12 PM Richard Biener via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> On Mon, Apr 17, 2023 at 8:42 PM Michael Collison <collison@rivosinc.com> wrote:
> >
> > While working on autovectorizing for the RISCV port I encountered an issue
> > where can_duplicate_and_interleave_p assumes that GET_MODE_NUNITS is a
> > evenly divisible by two. The RISC-V target has vector modes (e.g. VNx1DImode),
> > where GET_MODE_NUNITS is equal to one.
> >
> > Tested on RISCV and x86_64-linux-gnu. Okay?
>
> OK.
>
> > 2023-03-09  Michael Collison  <collison@rivosinc.com>
> >
> >         * tree-vect-slp.cc (can_duplicate_and_interleave_p):
> >         Check that GET_MODE_NUNITS is a multiple of 2.
> > ---
> >  gcc/tree-vect-slp.cc | 7 +++++--
> >  1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
> > index d73deaecce0..a64fe454e19 100644
> > --- a/gcc/tree-vect-slp.cc
> > +++ b/gcc/tree-vect-slp.cc
> > @@ -423,10 +423,13 @@ can_duplicate_and_interleave_p (vec_info *vinfo, unsigned int count,
> >             (GET_MODE_BITSIZE (int_mode), 1);
> >           tree vector_type
> >             = get_vectype_for_scalar_type (vinfo, int_type, count);
> > +         poly_int64 half_nelts;
> >           if (vector_type
> >               && VECTOR_MODE_P (TYPE_MODE (vector_type))
> >               && known_eq (GET_MODE_SIZE (TYPE_MODE (vector_type)),
> > -                          GET_MODE_SIZE (base_vector_mode)))
> > +                          GET_MODE_SIZE (base_vector_mode))
> > +             && multiple_p (GET_MODE_NUNITS (TYPE_MODE (vector_type)),
> > +                            2, &half_nelts))
> >             {
> >               /* Try fusing consecutive sequences of COUNT / NVECTORS elements
> >                  together into elements of type INT_TYPE and using the result
> > @@ -434,7 +437,7 @@ can_duplicate_and_interleave_p (vec_info *vinfo, unsigned int count,
> >               poly_uint64 nelts = GET_MODE_NUNITS (TYPE_MODE (vector_type));
> >               vec_perm_builder sel1 (nelts, 2, 3);
> >               vec_perm_builder sel2 (nelts, 2, 3);
> > -             poly_int64 half_nelts = exact_div (nelts, 2);
> > +
> >               for (unsigned int i = 0; i < 3; ++i)
> >                 {
> >                   sel1.quick_push (i);
> > --
> > 2.34.1
> >

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 09/10] This patch adds a guard for VNx1 vectors that are present in ports like riscv.
  2023-04-18 14:26   ` Kito Cheng
@ 2023-04-18 18:10     ` Michael Collison
  0 siblings, 0 replies; 36+ messages in thread
From: Michael Collison @ 2023-04-18 18:10 UTC (permalink / raw)
  To: Kito Cheng; +Cc: gcc-patches

Thanks Kito I will look into this.


On 4/18/23 10:26, Kito Cheng wrote:
> I would prefer drop this patch from this patch series since I believe
> https://patchwork.ozlabs.org/project/gcc/patch/20230414014518.15458-1-juzhe.zhong@rivai.ai/
> is the right fix for this issue.
>
> On Tue, Apr 18, 2023 at 2:40 AM Michael Collison <collison@rivosinc.com> wrote:
>> From: Kevin Lee <kevinl@rivosinc.com>
>>
>> Kevin Lee <kevinl@rivosinc.com>
>> gcc/ChangeLog:
>>
>>          * tree-vect-data-refs.cc (vect_grouped_store_supported): Add new
>> condition
>> ---
>>   gcc/tree-vect-data-refs.cc | 2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
>> index 8daf7bd7dd3..df393ba723d 100644
>> --- a/gcc/tree-vect-data-refs.cc
>> +++ b/gcc/tree-vect-data-refs.cc
>> @@ -5399,6 +5399,8 @@ vect_grouped_store_supported (tree vectype, unsigned HOST_WIDE_INT count)
>>            poly_uint64 nelt = GET_MODE_NUNITS (mode);
>>
>>            /* The encoding has 2 interleaved stepped patterns.  */
>> +    if(!multiple_p (nelt, 2))
>> +      return false;
>>            vec_perm_builder sel (nelt, 2, 3);
>>            sel.quick_grow (6);
>>            for (i = 0; i < 3; i++)
>> --
>> 2.34.1
>>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 07/10] vect: Verify that GET_MODE_NUNITS is a multiple of 2.
  2023-04-18 14:28     ` Kito Cheng
@ 2023-04-18 18:21       ` Kito Cheng
  2023-04-18 22:48         ` juzhe.zhong
  0 siblings, 1 reply; 36+ messages in thread
From: Kito Cheng @ 2023-04-18 18:21 UTC (permalink / raw)
  To: Richard Biener, Jeff Law, Palmer Dabbelt
  Cc: Michael Collison, gcc-patches, 钟居哲

Few more background about RVV:

RISC-V has provide different VLEN configuration by different ISA
extension like `zve32x`, `zve64x` and `v`
zve32x just guarantee the minimal VLEN is 32 bits,
zve64x guarantee the minimal VLEN is 64 bits,
and v guarantee the minimal VLEN is 128 bits,

Current status (without that patch):

Zve32x: Mode for one vector register mode is VNx1SImode and VNx1DImode
is invalid mode
 - one vector register could hold 1 + 1x SImode where x is 0~n, so it
might hold just one SI

Zve64x: Mode for one vector register mode is VNx1DImode or VNx2SImode
 - one vector register could hold 1 + 1x DImode where x is 0~n, so it
might hold just one DI
 - one vector register could hold 2 + 2x SImode where x is 0~n, so it
might hold just two SI

So what I want to say here is VNx1DImode is really NOT safe to assume
to have more than two DI in theory.

However `v` extension guarantees the minimal VLEN is 128 bits.

We are trying to introduce another type/mode mapping for this configure:

v: Mode for one vector register mode is VNx2DImode or VNx4SImode
 - one vector register could hold 2 + 2x DImode where x is 0~n, so it
will hold at least two DI
 - one vector register could hold 4 + 4x SImode where x is 0~n, so it
will hold at least four DI

So GET_MODE_NUNITS for a single vector register with DI mode will
become 2 (VNx2DImode) if it is really possible, which is a more
precise way to model the vector extension for RISC-V .



On Tue, Apr 18, 2023 at 10:28 PM Kito Cheng <kito.cheng@gmail.com> wrote:
>
> Wait, VNx1DImode can be really evaluate to just one element if
> -march=rv64g_zve64x,
>
> I thinks this should be just fixed on backend by this patch:
>
> https://patchwork.ozlabs.org/project/gcc/patch/20230414014518.15458-1-juzhe.zhong@rivai.ai/
>
> On Tue, Apr 18, 2023 at 2:12 PM Richard Biener via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> >
> > On Mon, Apr 17, 2023 at 8:42 PM Michael Collison <collison@rivosinc.com> wrote:
> > >
> > > While working on autovectorizing for the RISCV port I encountered an issue
> > > where can_duplicate_and_interleave_p assumes that GET_MODE_NUNITS is a
> > > evenly divisible by two. The RISC-V target has vector modes (e.g. VNx1DImode),
> > > where GET_MODE_NUNITS is equal to one.
> > >
> > > Tested on RISCV and x86_64-linux-gnu. Okay?
> >
> > OK.
> >
> > > 2023-03-09  Michael Collison  <collison@rivosinc.com>
> > >
> > >         * tree-vect-slp.cc (can_duplicate_and_interleave_p):
> > >         Check that GET_MODE_NUNITS is a multiple of 2.
> > > ---
> > >  gcc/tree-vect-slp.cc | 7 +++++--
> > >  1 file changed, 5 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
> > > index d73deaecce0..a64fe454e19 100644
> > > --- a/gcc/tree-vect-slp.cc
> > > +++ b/gcc/tree-vect-slp.cc
> > > @@ -423,10 +423,13 @@ can_duplicate_and_interleave_p (vec_info *vinfo, unsigned int count,
> > >             (GET_MODE_BITSIZE (int_mode), 1);
> > >           tree vector_type
> > >             = get_vectype_for_scalar_type (vinfo, int_type, count);
> > > +         poly_int64 half_nelts;
> > >           if (vector_type
> > >               && VECTOR_MODE_P (TYPE_MODE (vector_type))
> > >               && known_eq (GET_MODE_SIZE (TYPE_MODE (vector_type)),
> > > -                          GET_MODE_SIZE (base_vector_mode)))
> > > +                          GET_MODE_SIZE (base_vector_mode))
> > > +             && multiple_p (GET_MODE_NUNITS (TYPE_MODE (vector_type)),
> > > +                            2, &half_nelts))
> > >             {
> > >               /* Try fusing consecutive sequences of COUNT / NVECTORS elements
> > >                  together into elements of type INT_TYPE and using the result
> > > @@ -434,7 +437,7 @@ can_duplicate_and_interleave_p (vec_info *vinfo, unsigned int count,
> > >               poly_uint64 nelts = GET_MODE_NUNITS (TYPE_MODE (vector_type));
> > >               vec_perm_builder sel1 (nelts, 2, 3);
> > >               vec_perm_builder sel2 (nelts, 2, 3);
> > > -             poly_int64 half_nelts = exact_div (nelts, 2);
> > > +
> > >               for (unsigned int i = 0; i < 3; ++i)
> > >                 {
> > >                   sel1.quick_push (i);
> > > --
> > > 2.34.1
> > >

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Re: [PATCH v4 07/10] vect: Verify that GET_MODE_NUNITS is a multiple of 2.
  2023-04-18 18:21       ` Kito Cheng
@ 2023-04-18 22:48         ` juzhe.zhong
  2023-04-18 23:19           ` Michael Collison
  2023-04-20 10:01           ` Richard Sandiford
  0 siblings, 2 replies; 36+ messages in thread
From: juzhe.zhong @ 2023-04-18 22:48 UTC (permalink / raw)
  To: kito.cheng, richard.guenther, Jeff Law, palmer
  Cc: Michael Collison, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 4891 bytes --]

Yes, like kito said.
We won't enable VNx1DImode in auto-vectorization so it's meaningless to fix it here.
We dynamic adjust the minimum vector-length for different '-march' according to RVV ISA specification.
So we strongly suggest that we should drop this fix.

Thanks.


juzhe.zhong@rivai.ai
 
From: Kito Cheng
Date: 2023-04-19 02:21
To: Richard Biener; Jeff Law; Palmer Dabbelt
CC: Michael Collison; gcc-patches; 钟居哲
Subject: Re: [PATCH v4 07/10] vect: Verify that GET_MODE_NUNITS is a multiple of 2.
Few more background about RVV:
 
RISC-V has provide different VLEN configuration by different ISA
extension like `zve32x`, `zve64x` and `v`
zve32x just guarantee the minimal VLEN is 32 bits,
zve64x guarantee the minimal VLEN is 64 bits,
and v guarantee the minimal VLEN is 128 bits,
 
Current status (without that patch):
 
Zve32x: Mode for one vector register mode is VNx1SImode and VNx1DImode
is invalid mode
- one vector register could hold 1 + 1x SImode where x is 0~n, so it
might hold just one SI
 
Zve64x: Mode for one vector register mode is VNx1DImode or VNx2SImode
- one vector register could hold 1 + 1x DImode where x is 0~n, so it
might hold just one DI
- one vector register could hold 2 + 2x SImode where x is 0~n, so it
might hold just two SI
 
So what I want to say here is VNx1DImode is really NOT safe to assume
to have more than two DI in theory.
 
However `v` extension guarantees the minimal VLEN is 128 bits.
 
We are trying to introduce another type/mode mapping for this configure:
 
v: Mode for one vector register mode is VNx2DImode or VNx4SImode
- one vector register could hold 2 + 2x DImode where x is 0~n, so it
will hold at least two DI
- one vector register could hold 4 + 4x SImode where x is 0~n, so it
will hold at least four DI
 
So GET_MODE_NUNITS for a single vector register with DI mode will
become 2 (VNx2DImode) if it is really possible, which is a more
precise way to model the vector extension for RISC-V .
 
 
 
On Tue, Apr 18, 2023 at 10:28 PM Kito Cheng <kito.cheng@gmail.com> wrote:
>
> Wait, VNx1DImode can be really evaluate to just one element if
> -march=rv64g_zve64x,
>
> I thinks this should be just fixed on backend by this patch:
>
> https://patchwork.ozlabs.org/project/gcc/patch/20230414014518.15458-1-juzhe.zhong@rivai.ai/
>
> On Tue, Apr 18, 2023 at 2:12 PM Richard Biener via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> >
> > On Mon, Apr 17, 2023 at 8:42 PM Michael Collison <collison@rivosinc.com> wrote:
> > >
> > > While working on autovectorizing for the RISCV port I encountered an issue
> > > where can_duplicate_and_interleave_p assumes that GET_MODE_NUNITS is a
> > > evenly divisible by two. The RISC-V target has vector modes (e.g. VNx1DImode),
> > > where GET_MODE_NUNITS is equal to one.
> > >
> > > Tested on RISCV and x86_64-linux-gnu. Okay?
> >
> > OK.
> >
> > > 2023-03-09  Michael Collison  <collison@rivosinc.com>
> > >
> > >         * tree-vect-slp.cc (can_duplicate_and_interleave_p):
> > >         Check that GET_MODE_NUNITS is a multiple of 2.
> > > ---
> > >  gcc/tree-vect-slp.cc | 7 +++++--
> > >  1 file changed, 5 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
> > > index d73deaecce0..a64fe454e19 100644
> > > --- a/gcc/tree-vect-slp.cc
> > > +++ b/gcc/tree-vect-slp.cc
> > > @@ -423,10 +423,13 @@ can_duplicate_and_interleave_p (vec_info *vinfo, unsigned int count,
> > >             (GET_MODE_BITSIZE (int_mode), 1);
> > >           tree vector_type
> > >             = get_vectype_for_scalar_type (vinfo, int_type, count);
> > > +         poly_int64 half_nelts;
> > >           if (vector_type
> > >               && VECTOR_MODE_P (TYPE_MODE (vector_type))
> > >               && known_eq (GET_MODE_SIZE (TYPE_MODE (vector_type)),
> > > -                          GET_MODE_SIZE (base_vector_mode)))
> > > +                          GET_MODE_SIZE (base_vector_mode))
> > > +             && multiple_p (GET_MODE_NUNITS (TYPE_MODE (vector_type)),
> > > +                            2, &half_nelts))
> > >             {
> > >               /* Try fusing consecutive sequences of COUNT / NVECTORS elements
> > >                  together into elements of type INT_TYPE and using the result
> > > @@ -434,7 +437,7 @@ can_duplicate_and_interleave_p (vec_info *vinfo, unsigned int count,
> > >               poly_uint64 nelts = GET_MODE_NUNITS (TYPE_MODE (vector_type));
> > >               vec_perm_builder sel1 (nelts, 2, 3);
> > >               vec_perm_builder sel2 (nelts, 2, 3);
> > > -             poly_int64 half_nelts = exact_div (nelts, 2);
> > > +
> > >               for (unsigned int i = 0; i < 3; ++i)
> > >                 {
> > >                   sel1.quick_push (i);
> > > --
> > > 2.34.1
> > >
 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 05/10] RISC-V:autovec: Add autovectorization patterns for binary integer operations
  2023-04-17 18:36 ` [PATCH v4 05/10] RISC-V:autovec: Add autovectorization patterns for binary integer operations Michael Collison
@ 2023-04-18 23:14   ` Jeff Law
  2023-04-19  1:19   ` Kito Cheng
  2023-04-20  2:24   ` juzhe.zhong
  2 siblings, 0 replies; 36+ messages in thread
From: Jeff Law @ 2023-04-18 23:14 UTC (permalink / raw)
  To: Michael Collison, gcc-patches



On 4/17/23 12:36, Michael Collison wrote:
> 2023-03-02  Michael Collison  <collison@rivosinc.com>
> 	    Juzhe Zhong  <juzhe.zhong@rivai.ai>
> 
> 	* config/riscv/riscv.md (riscv_vector_preferred_simd_mode): Include
> 	vector-iterators.md.
> 	* config/riscv/vector-auto.md: New file containing
> 	autovectorization patterns.
> 	* config/riscv/vector-iterators.md (UNSPEC_VADD/UNSPEC_VSUB):
> 	New unspecs for autovectorization patterns.
> 	* config/riscv/vector.md: Remove include of vector-iterators.md
> 	and include vector-auto.md.
So the basic idea here appears to be to have a define_expand with the 
well known names (for the optab interface) generate RTL that is 
subsequently matched by the intrinsics that Juzhe has already defined 
and integrated.

That seems like a reasonable model to start with and get the basic 
functionality in place.  I'm all for focusing on that basic 
functionality first.


> diff --git a/gcc/config/riscv/vector-auto.md b/gcc/config/riscv/vector-auto.md
> new file mode 100644
> index 00000000000..dc62f9af705
> --- /dev/null
> +++ b/gcc/config/riscv/vector-auto.md
So basically vector-auto.md provides the interface to utilize the 
builtins found in vector.md.  Given the size of vector.md I can 
certainly see the desire to separate that out.


> +
> +
> +;; -------------------------------------------------------------------------
> +;; ---- [INT] Addition
Just a note.  This patch actually wires up plus, minus, and, ior, xor, 
ashift, ashiftrt and lshiftrt.  So it's quite a bit more than just 
addition.  So updating the comments is probably warranted.


> +;; -------------------------------------------------------------------------
> +;; Includes:
> +;; - vadd.vv
> +;; - vadd.vx
> +;; - vadd.vi
> +;; -------------------------------------------------------------------------
> +
> +(define_expand "<optab><mode>3"
> +  [(set (match_operand:VI 0 "register_operand")
> +	(any_int_binop:VI (match_operand:VI 1 "register_operand")
> +			  (match_operand:VI 2 "register_operand")))]
> +  "TARGET_VECTOR"
> +{
> +  using namespace riscv_vector;
> +
> +  rtx merge = RVV_VUNDEF (<MODE>mode);
> +  rtx vl = gen_reg_rtx (Pmode);
> +  emit_vlmax_vsetvl (<MODE>mode, vl);
> +  rtx mask_policy = get_mask_policy_no_pred();
> +  rtx tail_policy = get_tail_policy_no_pred();
> +  rtx mask = CONSTM1_RTX(<VM>mode);
> +  rtx vlmax_avl_p = get_avl_type_rtx(NONVLMAX);
> +
> +  emit_insn(gen_pred_<optab><mode>(operands[0], mask, merge, operands[1], operands[2],
> +				vl, tail_policy, mask_policy, vlmax_avl_p));
Just nits.  Make sure to put a space before the open paren of an 
argument list, even when the argument list is empty.  Similarly for the 
other expander in here.  And update the comment.  You may not want to 
list every instruction handled by the expander.  Your call, though 
clearly if you're going to include them, the list ought to be reasonably 
complete.

No objections to this code.  It obviously depends on some bits earlier 
in the patchset which I still need to look at, but I wanted to look at 
this one first as it shows the basic formula for how to wire up the 
basic vector patterns.

Please wait for the prereqs to get reviewed before installing on the trunk.

jeff

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 07/10] vect: Verify that GET_MODE_NUNITS is a multiple of 2.
  2023-04-18 22:48         ` juzhe.zhong
@ 2023-04-18 23:19           ` Michael Collison
  2023-04-20 10:01           ` Richard Sandiford
  1 sibling, 0 replies; 36+ messages in thread
From: Michael Collison @ 2023-04-18 23:19 UTC (permalink / raw)
  To: juzhe.zhong, kito.cheng, richard.guenther, Jeff Law, palmer; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 6185 bytes --]

Juzhe and Kito,

Thank you for the clarification.

On 4/18/23 18:48, juzhe.zhong@rivai.ai wrote:
> Yes, like kito said.
> We won't enable VNx1DImode in auto-vectorization so it's meaningless 
> to fix it here.
> We dynamic adjust the minimum vector-length for different '-march' 
> according to RVV ISA specification.
> So we strongly suggest that we should drop this fix.
>
> Thanks.
> ------------------------------------------------------------------------
> juzhe.zhong@rivai.ai
>
>     *From:* Kito Cheng <mailto:kito.cheng@gmail.com>
>     *Date:* 2023-04-19 02:21
>     *To:* Richard Biener <mailto:richard.guenther@gmail.com>; Jeff Law
>     <mailto:jeffreyalaw@gmail.com>; Palmer Dabbelt
>     <mailto:palmer@dabbelt.com>
>     *CC:* Michael Collison <mailto:collison@rivosinc.com>; gcc-patches
>     <mailto:gcc-patches@gcc.gnu.org>; 钟居哲 <mailto:juzhe.zhong@rivai.ai>
>     *Subject:* Re: [PATCH v4 07/10] vect: Verify that GET_MODE_NUNITS
>     is a multiple of 2.
>     Few more background about RVV:
>     RISC-V has provide different VLEN configuration by different ISA
>     extension like `zve32x`, `zve64x` and `v`
>     zve32x just guarantee the minimal VLEN is 32 bits,
>     zve64x guarantee the minimal VLEN is 64 bits,
>     and v guarantee the minimal VLEN is 128 bits,
>     Current status (without that patch):
>     Zve32x: Mode for one vector register mode is VNx1SImode and VNx1DImode
>     is invalid mode
>     - one vector register could hold 1 + 1x SImode where x is 0~n, so it
>     might hold just one SI
>     Zve64x: Mode for one vector register mode is VNx1DImode or VNx2SImode
>     - one vector register could hold 1 + 1x DImode where x is 0~n, so it
>     might hold just one DI
>     - one vector register could hold 2 + 2x SImode where x is 0~n, so it
>     might hold just two SI
>     So what I want to say here is VNx1DImode is really NOT safe to assume
>     to have more than two DI in theory.
>     However `v` extension guarantees the minimal VLEN is 128 bits.
>     We are trying to introduce another type/mode mapping for this
>     configure:
>     v: Mode for one vector register mode is VNx2DImode or VNx4SImode
>     - one vector register could hold 2 + 2x DImode where x is 0~n, so it
>     will hold at least two DI
>     - one vector register could hold 4 + 4x SImode where x is 0~n, so it
>     will hold at least four DI
>     So GET_MODE_NUNITS for a single vector register with DI mode will
>     become 2 (VNx2DImode) if it is really possible, which is a more
>     precise way to model the vector extension for RISC-V .
>     On Tue, Apr 18, 2023 at 10:28 PM Kito Cheng <kito.cheng@gmail.com>
>     wrote:
>     >
>     > Wait, VNx1DImode can be really evaluate to just one element if
>     > -march=rv64g_zve64x,
>     >
>     > I thinks this should be just fixed on backend by this patch:
>     >
>     >
>     https://patchwork.ozlabs.org/project/gcc/patch/20230414014518.15458-1-juzhe.zhong@rivai.ai/
>     >
>     > On Tue, Apr 18, 2023 at 2:12 PM Richard Biener via Gcc-patches
>     > <gcc-patches@gcc.gnu.org> wrote:
>     > >
>     > > On Mon, Apr 17, 2023 at 8:42 PM Michael Collison
>     <collison@rivosinc.com> wrote:
>     > > >
>     > > > While working on autovectorizing for the RISCV port I
>     encountered an issue
>     > > > where can_duplicate_and_interleave_p assumes that
>     GET_MODE_NUNITS is a
>     > > > evenly divisible by two. The RISC-V target has vector modes
>     (e.g. VNx1DImode),
>     > > > where GET_MODE_NUNITS is equal to one.
>     > > >
>     > > > Tested on RISCV and x86_64-linux-gnu. Okay?
>     > >
>     > > OK.
>     > >
>     > > > 2023-03-09  Michael Collison <collison@rivosinc.com>
>     > > >
>     > > >         * tree-vect-slp.cc (can_duplicate_and_interleave_p):
>     > > >         Check that GET_MODE_NUNITS is a multiple of 2.
>     > > > ---
>     > > >  gcc/tree-vect-slp.cc | 7 +++++--
>     > > >  1 file changed, 5 insertions(+), 2 deletions(-)
>     > > >
>     > > > diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
>     > > > index d73deaecce0..a64fe454e19 100644
>     > > > --- a/gcc/tree-vect-slp.cc
>     > > > +++ b/gcc/tree-vect-slp.cc
>     > > > @@ -423,10 +423,13 @@ can_duplicate_and_interleave_p
>     (vec_info *vinfo, unsigned int count,
>     > > >             (GET_MODE_BITSIZE (int_mode), 1);
>     > > >           tree vector_type
>     > > >             = get_vectype_for_scalar_type (vinfo, int_type,
>     count);
>     > > > +         poly_int64 half_nelts;
>     > > >           if (vector_type
>     > > >               && VECTOR_MODE_P (TYPE_MODE (vector_type))
>     > > >               && known_eq (GET_MODE_SIZE (TYPE_MODE
>     (vector_type)),
>     > > > -                          GET_MODE_SIZE (base_vector_mode)))
>     > > > +                          GET_MODE_SIZE (base_vector_mode))
>     > > > +             && multiple_p (GET_MODE_NUNITS (TYPE_MODE
>     (vector_type)),
>     > > > +                            2, &half_nelts))
>     > > >             {
>     > > >               /* Try fusing consecutive sequences of COUNT /
>     NVECTORS elements
>     > > >                  together into elements of type INT_TYPE and
>     using the result
>     > > > @@ -434,7 +437,7 @@ can_duplicate_and_interleave_p (vec_info
>     *vinfo, unsigned int count,
>     > > >               poly_uint64 nelts = GET_MODE_NUNITS (TYPE_MODE
>     (vector_type));
>     > > >               vec_perm_builder sel1 (nelts, 2, 3);
>     > > >               vec_perm_builder sel2 (nelts, 2, 3);
>     > > > -             poly_int64 half_nelts = exact_div (nelts, 2);
>     > > > +
>     > > >               for (unsigned int i = 0; i < 3; ++i)
>     > > >                 {
>     > > >                   sel1.quick_push (i);
>     > > > --
>     > > > 2.34.1
>     > > >
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 01/10] RISC-V: Add new predicates and function prototypes
  2023-04-17 18:36 ` [PATCH v4 01/10] RISC-V: Add new predicates and function prototypes Michael Collison
@ 2023-04-19  0:54   ` Kito Cheng
  2023-04-26  2:50     ` Jeff Law
  0 siblings, 1 reply; 36+ messages in thread
From: Kito Cheng @ 2023-04-19  0:54 UTC (permalink / raw)
  To: Michael Collison; +Cc: gcc-patches

Could you please move the new function declarations and new code to
the patch where they are being used?

> +/* RVV vector register sizes.  */
> +enum riscv_vector_bits_enum
> +{
> +  RVV_SCALABLE,
> +  RVV_NOT_IMPLEMENTED = RVV_SCALABLE,
> +  RVV_64 = 64,
> +  RVV_128 = 128,
> +  RVV_256 = 256,
> +  RVV_512 = 512,
> +  RVV_1024 = 1024,
> +  RVV_2048 = 2048,
> +  RVV_4096 = 4096,
> +  RVV_8192 = 8192,
> +  RVV_16384 = 16384,
> +  RVV_32768 = 32768,
> +  RVV_65536 = 65536
> +};

I think this is not necessary for the VLA vectorizer?

> +Enum
> +Name(riscv_vector_lmul) Type(enum riscv_vector_lmul_enum)
> +The possible vectorization factor:
> +
> +EnumValue
> +Enum(riscv_vector_lmul) String(1) Value(RVV_LMUL1)
> +
> +EnumValue
> +Enum(riscv_vector_lmul) String(2) Value(RVV_LMUL2)
> +
> +EnumValue
> +Enum(riscv_vector_lmul) String(4) Value(RVV_LMUL4)
> +
> +EnumValue
> +Enum(riscv_vector_lmul) String(8) Value(RVV_LMUL8)

I would like to introduce this option later, it's used for fine tuning,
VLA vectorizer should be able to work without this tuning option.

> +mriscv-vector-lmul=
> +Target RejectNegative Joined Enum(riscv_vector_lmul) Var(riscv_vector_lmul) Init(RVV_LMUL1)
> +-mriscv-vector-lmul=<lmul>     Set the vf using lmul in auto-vectorization.
> +

Same question for this

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 04/10] RISC-V:autovec: Add target vectorization hooks
  2023-04-17 18:36 ` [PATCH v4 04/10] RISC-V:autovec: Add target vectorization hooks Michael Collison
@ 2023-04-19  1:04   ` Kito Cheng
  2023-04-20  2:11   ` juzhe.zhong
  1 sibling, 0 replies; 36+ messages in thread
From: Kito Cheng @ 2023-04-19  1:04 UTC (permalink / raw)
  To: Michael Collison; +Cc: gcc-patches

> +/* Implement TARGET_ESTIMATED_POLY_VALUE.
> +   Look into the tuning structure for an estimate.
> +   KIND specifies the type of requested estimate: min, max or likely.
> +   For cores with a known RVV width all three estimates are the same.
> +   For generic RVV tuning we want to distinguish the maximum estimate from
> +   the minimum and likely ones.
> +   The likely estimate is the same as the minimum in that case to give a
> +   conservative behavior of auto-vectorizing with RVV when it is a win
> +   even for 128-bit RVV.
> +   When RVV width information is available VAL.coeffs[1] is multiplied by
> +   the number of VQ chunks over the initial Advanced SIMD 128 bits.  */
> +
> +static HOST_WIDE_INT
> +riscv_estimated_poly_value (poly_int64 val,
> +                           poly_value_estimate_kind kind = POLY_VALUE_LIKELY)
> +{
> +  unsigned int width_source = BITS_PER_RISCV_VECTOR.is_constant ()
> +    ? (unsigned int) BITS_PER_RISCV_VECTOR.to_constant ()
> +    : (unsigned int) RVV_SCALABLE;

It could be RVV_SCALABLE only for now, so I would prefer to just
keep that switch only for now.

And adding assert (!BITS_PER_RISCV_VECTOR.is_constant ());

> +
> +  /* If there is no core-specific information then the minimum and likely
> +     values are based on 128-bit vectors and the maximum is based on
> +     the architectural maximum of 2048 bits.  */

Maximum is 65,536 bit per vector spec.

> +  if (width_source == RVV_SCALABLE)
> +    switch (kind)
> +      {
> +      case POLY_VALUE_MIN:
> +      case POLY_VALUE_LIKELY:
> +       return val.coeffs[0];
> +
> +      case POLY_VALUE_MAX:
> +       return val.coeffs[0] + val.coeffs[1] * 15;
> +      }
> +
> +  /* Allow BITS_PER_RISCV_VECTOR to be a bitmask of different VL, treating the
> +     lowest as likely.  This could be made more general if future -mtune
> +     options need it to be.  */
> +  if (kind == POLY_VALUE_MAX)
> +    width_source = 1 << floor_log2 (width_source);
> +  else
> +    width_source = least_bit_hwi (width_source);
> +
> +  /* If the core provides width information, use that.  */
> +  HOST_WIDE_INT over_128 = width_source - 128;
> +  return val.coeffs[0] + val.coeffs[1] * over_128 / 128;
> +}
> +
> +/* Implement TARGET_VECTORIZE_PREFERRED_SIMD_MODE.  */
> +
> +static machine_mode
> +riscv_preferred_simd_mode (scalar_mode mode)
> +{
> +  machine_mode vmode =
> +    riscv_vector::riscv_vector_preferred_simd_mode (mode,
> +                                                   riscv_vectorization_factor);
> +  if (VECTOR_MODE_P (vmode))
> +    return vmode;
> +
> +  return word_mode;
> +}
> +
> +/* Implement TARGET_AUTOVECTORIZE_VECTOR_MODES for RVV.  */
> +static unsigned int
> +riscv_autovectorize_vector_modes (vector_modes *modes, bool)
> +{
> +  if (!TARGET_VECTOR)
> +    return 0;
> +
> +  if (riscv_vectorization_factor == RVV_LMUL1)
> +    {
> +      modes->safe_push (VNx16QImode);
> +      modes->safe_push (VNx8QImode);
> +      modes->safe_push (VNx4QImode);
> +      modes->safe_push (VNx2QImode);
> +    }

Keep LMUL1 case only for this moment.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 03/10] RISC-V:autovec: Add auto-vectorization support functions
  2023-04-17 18:36 ` [PATCH v4 03/10] RISC-V:autovec: Add auto-vectorization support functions Michael Collison
@ 2023-04-19  1:15   ` Kito Cheng
  2023-04-20  2:19   ` juzhe.zhong
  1 sibling, 0 replies; 36+ messages in thread
From: Kito Cheng @ 2023-04-19  1:15 UTC (permalink / raw)
  To: Michael Collison; +Cc: gcc-patches

> @@ -118,6 +120,41 @@ const_vec_all_same_in_range_p (rtx x, HOST_WIDE_INT minval,
>           && IN_RANGE (INTVAL (elt), minval, maxval));
>  }
>
> +/* Return the vlmul field for a specific machine mode.  */
> +unsigned int
> +riscv_classify_vlmul_field (enum machine_mode mode)

This is not implemented right for the current type system.

> @@ -176,6 +213,64 @@ calculate_ratio (unsigned int sew, enum vlmul_type vlmul)
>    return ratio;
>  }
>
> +/* Implement TARGET_VECTORIZE_PREFERRED_SIMD_MODE for RVV.  */
> +
> +machine_mode
> +riscv_vector_preferred_simd_mode (scalar_mode mode, unsigned vf)

`vf` is kind of misleading, it should be `LMUL` or something like that.

> +{
> +  if (!TARGET_VECTOR)
> +    return word_mode;
> +
> +  switch (mode)
> +    {
> +    case E_QImode:
> +      return vf == 1   ? VNx8QImode
> +            : vf == 2 ? VNx16QImode
> +            : vf == 4 ? VNx32QImode
> +                      : VNx64QImode;

I would prefer only to keep LMUL=1/ vf=1 case for this patch set,
so maybe drop the vf parameter for this moment and add back when
we implement later.

> +/* Return true if it is a RVV tuple mode.  */
> +bool
> +riscv_tuple_mode_p (machine_mode mode ATTRIBUTE_UNUSED)

just drop this for now.

> +/* Return nf for a machine mode.  */
> +int
> +riscv_classify_nf (machine_mode mode)

Drop this, add that when we implement tuple type.

> +
> +/* Return vlmul register size for a machine mode.  */
> +int
> +riscv_vlmul_regsize (machine_mode mode)

UNITS_PER_V_REGget mode size and calculate with UNITS_PER_V_REG
like exact_div (GET_MODE_SIZE (mode), UNITS_PER_V_REG).to_constant ()

> +{
> +  if (GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL)
> +    return 1;
> +  switch (riscv_classify_vlmul_field (mode))
> +    {
> +    case VLMUL_FIELD_001:
> +      return 2;
> +    case VLMUL_FIELD_010:
> +      return 4;
> +    case VLMUL_FIELD_011:
> +      return 8;
> +    case VLMUL_FIELD_100:
> +      gcc_unreachable ();
> +    default:
> +      return 1;
> +    }
> +}
> +
> +/* Return true if it is a RVV mask mode.  */
> +bool
> +riscv_vector_mask_mode_p (machine_mode mode)
> +{
> +  return (mode == VNx1BImode || mode == VNx2BImode || mode == VNx4BImode
> +         || mode == VNx8BImode || mode == VNx16BImode || mode == VNx32BImode
> +         || mode == VNx64BImode);
> +}
> +
> +/* Implement TARGET_VECTORIZE_GET_MASK_MODE for RVV.  */
> +
> +opt_machine_mode
> +riscv_vector_get_mask_mode (machine_mode mode)
> +{
> +  machine_mode mask_mode;
> +  int nf = 1;
> +  if (riscv_tuple_mode_p (mode))
> +    nf = riscv_classify_nf (mode);

drop nf stuffs

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 05/10] RISC-V:autovec: Add autovectorization patterns for binary integer operations
  2023-04-17 18:36 ` [PATCH v4 05/10] RISC-V:autovec: Add autovectorization patterns for binary integer operations Michael Collison
  2023-04-18 23:14   ` Jeff Law
@ 2023-04-19  1:19   ` Kito Cheng
  2023-04-20 20:21     ` Michael Collison
  2023-04-20  2:24   ` juzhe.zhong
  2 siblings, 1 reply; 36+ messages in thread
From: Kito Cheng @ 2023-04-19  1:19 UTC (permalink / raw)
  To: Michael Collison; +Cc: gcc-patches

> diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
> index 70ad85b661b..7fae87968d7 100644
> --- a/gcc/config/riscv/vector-iterators.md
> +++ b/gcc/config/riscv/vector-iterators.md
> @@ -34,6 +34,8 @@
>    UNSPEC_VMULHU
>    UNSPEC_VMULHSU
>
> +  UNSPEC_VADD
> +  UNSPEC_VSUB

Defined but unused?

>    UNSPEC_VADC
>    UNSPEC_VSBC
>    UNSPEC_VMADC
> diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
> index 0ecca98f20c..2ac5b744503 100644
> --- a/gcc/config/riscv/vector.md
> +++ b/gcc/config/riscv/vector.md
> @@ -26,8 +26,6 @@
>  ;; - Auto-vectorization (TBD)
>  ;; - Combine optimization (TBD)
>
> -(include "vector-iterators.md")
> -

Why remove this?

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 04/10] RISC-V:autovec: Add target vectorization hooks
  2023-04-17 18:36 ` [PATCH v4 04/10] RISC-V:autovec: Add target vectorization hooks Michael Collison
  2023-04-19  1:04   ` Kito Cheng
@ 2023-04-20  2:11   ` juzhe.zhong
  1 sibling, 0 replies; 36+ messages in thread
From: juzhe.zhong @ 2023-04-20  2:11 UTC (permalink / raw)
  To: collison, gcc-patches
  Cc: jeffreyalaw, Kito.cheng, kito.cheng, palmer, palmer, Richard Biener

[-- Attachment #1: Type: text/plain, Size: 9021 bytes --]

Hi, Michael. Thanks for extracting patches from "rvv-next". I have several comments here:

1. I think it's not appropriate and useless to support such many target hook in the first auto-vec support patch.

   You should only support TARGET_VECTORIZE_PREFERRED_SIMD_MODE is enough, supporting too many
   useless target hook will make patch too messy and not easy to trace.

2. TARGET_ESTIMATED_POLY_VALUE since it's currently not used. 
3. TARGET_AUTOVECTORIZE_VECTOR_MODES it's not used in the first patch.
4.  TARGET_VECTORIZE_GET_MASK_MODE && TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE is used to
    specify the mask mode for WHILE_ULT and comparison result.
    These 2 target hook are not used when you don't implement WHILE_ULT/VCOND/VEC_CMP/.... pattern.
5. TARGET_VECTORIZE_LOOP_LEN_OVERRIDE_MASK is the target hook I added in rvv-next, it's not existed in the upstream GCC.
    You should not add it when I didn't support it yet in upstream GCC.

....etc.

So, the basic idea is that you should only TARGET_VECTORIZE_PREFERRED_SIMD_MODE in the first enabling basic auto-vectorization patch.
It should be enough when we only implement simple len_load/len_store.

I have sent the patch: 
https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616223.html to initial basic auto-vectorization.



juzhe.zhong@rivai.ai
 
From: Michael Collison
Date: 2023-04-18 02:36
To: gcc-patches
Subject: [PATCH v4 04/10] RISC-V:autovec: Add target vectorization hooks
2023-03-02  Michael Collison  <collison@rivosinc.com>
    Juzhe Zhong  <juzhe.zhong@rivai.ai>
 
* config/riscv/riscv.cc (riscv_option_override):
Set riscv_vectorization_factor.
(riscv_estimated_poly_value): Implement
TARGET_ESTIMATED_POLY_VALUE.
(riscv_preferred_simd_mode): Implement
TARGET_VECTORIZE_PREFERRED_SIMD_MODE.
(riscv_autovectorize_vector_modes): Implement
TARGET_AUTOVECTORIZE_VECTOR_MODES.
(riscv_get_mask_mode): Implement TARGET_VECTORIZE_GET_MASK_MODE.
(riscv_empty_mask_is_expensive): Implement
TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE.
(riscv_vectorize_create_costs): Implement
TARGET_VECTORIZE_CREATE_COSTS.
(TARGET_ESTIMATED_POLY_VALUE): Register target macro.
(TARGET_VECTORIZE_PREFERRED_SIMD_MODE): Ditto.
(TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): Ditto.
(TARGET_VECTORIZE_GET_MASK_MODE): Ditto.
(TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE): Ditto.
(TARGET_VECTORIZE_LOOP_LEN_OVERRIDE_MASK): Ditto.
---
gcc/config/riscv/riscv.cc | 156 ++++++++++++++++++++++++++++++++++++++
1 file changed, 156 insertions(+)
 
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index dc47434fac4..9af06d926cf 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -60,6 +60,15 @@ along with GCC; see the file COPYING3.  If not see
#include "opts.h"
#include "tm-constrs.h"
#include "rtl-iter.h"
+#include "gimple.h"
+#include "cfghooks.h"
+#include "cfgloop.h"
+#include "cfgrtl.h"
+#include "sel-sched.h"
+#include "fold-const.h"
+#include "gimple-iterator.h"
+#include "gimple-expr.h"
+#include "tree-vectorizer.h"
/* This file should be included last.  */
#include "target-def.h"
@@ -275,6 +284,9 @@ poly_uint16 riscv_vector_chunks;
/* The number of bytes in a vector chunk.  */
unsigned riscv_bytes_per_vector_chunk;
+/* Prefer vf for auto-vectorizer.  */
+unsigned riscv_vectorization_factor;
+
/* Index R is the smallest register class that contains register R.  */
const enum reg_class riscv_regno_to_class[FIRST_PSEUDO_REGISTER] = {
   GR_REGS, GR_REGS, GR_REGS, GR_REGS,
@@ -6363,6 +6375,10 @@ riscv_option_override (void)
   /* Convert -march to a chunks count.  */
   riscv_vector_chunks = riscv_convert_vector_bits ();
+
+  if (TARGET_VECTOR)
+    riscv_vectorization_factor = riscv_vector_lmul;
+
}
/* Implement TARGET_CONDITIONAL_REGISTER_USAGE.  */
@@ -7057,6 +7073,128 @@ riscv_dwarf_poly_indeterminate_value (unsigned int i, unsigned int *factor,
   return RISCV_DWARF_VLENB;
}
+/* Implement TARGET_ESTIMATED_POLY_VALUE.
+   Look into the tuning structure for an estimate.
+   KIND specifies the type of requested estimate: min, max or likely.
+   For cores with a known RVV width all three estimates are the same.
+   For generic RVV tuning we want to distinguish the maximum estimate from
+   the minimum and likely ones.
+   The likely estimate is the same as the minimum in that case to give a
+   conservative behavior of auto-vectorizing with RVV when it is a win
+   even for 128-bit RVV.
+   When RVV width information is available VAL.coeffs[1] is multiplied by
+   the number of VQ chunks over the initial Advanced SIMD 128 bits.  */
+
+static HOST_WIDE_INT
+riscv_estimated_poly_value (poly_int64 val,
+     poly_value_estimate_kind kind = POLY_VALUE_LIKELY)
+{
+  unsigned int width_source = BITS_PER_RISCV_VECTOR.is_constant ()
+    ? (unsigned int) BITS_PER_RISCV_VECTOR.to_constant ()
+    : (unsigned int) RVV_SCALABLE;
+
+  /* If there is no core-specific information then the minimum and likely
+     values are based on 128-bit vectors and the maximum is based on
+     the architectural maximum of 2048 bits.  */
+  if (width_source == RVV_SCALABLE)
+    switch (kind)
+      {
+      case POLY_VALUE_MIN:
+      case POLY_VALUE_LIKELY:
+ return val.coeffs[0];
+
+      case POLY_VALUE_MAX:
+ return val.coeffs[0] + val.coeffs[1] * 15;
+      }
+
+  /* Allow BITS_PER_RISCV_VECTOR to be a bitmask of different VL, treating the
+     lowest as likely.  This could be made more general if future -mtune
+     options need it to be.  */
+  if (kind == POLY_VALUE_MAX)
+    width_source = 1 << floor_log2 (width_source);
+  else
+    width_source = least_bit_hwi (width_source);
+
+  /* If the core provides width information, use that.  */
+  HOST_WIDE_INT over_128 = width_source - 128;
+  return val.coeffs[0] + val.coeffs[1] * over_128 / 128;
+}
+
+/* Implement TARGET_VECTORIZE_PREFERRED_SIMD_MODE.  */
+
+static machine_mode
+riscv_preferred_simd_mode (scalar_mode mode)
+{
+  machine_mode vmode =
+    riscv_vector::riscv_vector_preferred_simd_mode (mode,
+     riscv_vectorization_factor);
+  if (VECTOR_MODE_P (vmode))
+    return vmode;
+
+  return word_mode;
+}
+
+/* Implement TARGET_AUTOVECTORIZE_VECTOR_MODES for RVV.  */
+static unsigned int
+riscv_autovectorize_vector_modes (vector_modes *modes, bool)
+{
+  if (!TARGET_VECTOR)
+    return 0;
+
+  if (riscv_vectorization_factor == RVV_LMUL1)
+    {
+      modes->safe_push (VNx16QImode);
+      modes->safe_push (VNx8QImode);
+      modes->safe_push (VNx4QImode);
+      modes->safe_push (VNx2QImode);
+    }
+  else if (riscv_vectorization_factor == RVV_LMUL2)
+    {
+      modes->safe_push (VNx32QImode);
+      modes->safe_push (VNx16QImode);
+      modes->safe_push (VNx8QImode);
+      modes->safe_push (VNx4QImode);
+    }
+  else if (riscv_vectorization_factor == RVV_LMUL4)
+    {
+      modes->safe_push (VNx64QImode);
+      modes->safe_push (VNx32QImode);
+      modes->safe_push (VNx16QImode);
+      modes->safe_push (VNx8QImode);
+    }
+  else
+    {
+      modes->safe_push (VNx64QImode);
+      modes->safe_push (VNx32QImode);
+      modes->safe_push (VNx16QImode);
+    }
+
+  return 0;
+}
+
+/* Implement TARGET_VECTORIZE_GET_MASK_MODE.  */
+
+static opt_machine_mode
+riscv_get_mask_mode (machine_mode mode)
+{
+  machine_mode mask_mode = VOIDmode;
+  if (TARGET_VECTOR
+      && riscv_vector::riscv_vector_get_mask_mode (mode).exists (&mask_mode))
+    return mask_mode;
+
+  return default_get_mask_mode (mode);
+}
+
+/* Implement TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE.  Assume for now that
+   it isn't worth branching around empty masked ops (including masked
+   stores).  */
+
+static bool
+riscv_empty_mask_is_expensive (unsigned)
+{
+  return false;
+}
+
/* Return true if a shift-amount matches the trailing cleared bits on
    a bitmask.  */
@@ -7382,6 +7520,24 @@ riscv_zero_call_used_regs (HARD_REG_SET need_zeroed_hardregs)
#undef TARGET_VERIFY_TYPE_CONTEXT
#define TARGET_VERIFY_TYPE_CONTEXT riscv_verify_type_context
+#undef TARGET_ESTIMATED_POLY_VALUE
+#define TARGET_ESTIMATED_POLY_VALUE riscv_estimated_poly_value
+
+#undef TARGET_VECTORIZE_PREFERRED_SIMD_MODE
+#define TARGET_VECTORIZE_PREFERRED_SIMD_MODE riscv_preferred_simd_mode
+
+#undef TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES
+#define TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES riscv_autovectorize_vector_modes
+
+#undef TARGET_VECTORIZE_GET_MASK_MODE
+#define TARGET_VECTORIZE_GET_MASK_MODE riscv_get_mask_mode
+
+#undef TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE
+#define TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE riscv_empty_mask_is_expensive
+
+#undef TARGET_VECTORIZE_LOOP_LEN_OVERRIDE_MASK
+#define TARGET_VECTORIZE_LOOP_LEN_OVERRIDE_MASK riscv_loop_len_override_mask
+
#undef TARGET_VECTOR_ALIGNMENT
#define TARGET_VECTOR_ALIGNMENT riscv_vector_alignment
-- 
2.34.1

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 03/10] RISC-V:autovec: Add auto-vectorization support functions
  2023-04-17 18:36 ` [PATCH v4 03/10] RISC-V:autovec: Add auto-vectorization support functions Michael Collison
  2023-04-19  1:15   ` Kito Cheng
@ 2023-04-20  2:19   ` juzhe.zhong
  1 sibling, 0 replies; 36+ messages in thread
From: juzhe.zhong @ 2023-04-20  2:19 UTC (permalink / raw)
  To: collison, gcc-patches
  Cc: jeffreyalaw, Kito.cheng, kito.cheng, palmer, palmer, Richard Biener

[-- Attachment #1: Type: text/plain, Size: 7034 bytes --]

All functions should be dropped except "riscv_vector_preferred_simd_mode".
I known you take from "rvv-next". However, the implementation of "prefer_simd_mode" in "rvv-next" is incorrect.

The most important part of implementing this function is that we should gurantee compiler will not generate
unexpected auto-vectorization codes according to "-march", for example, when -march=rv64gc_zve32x, we should
not have SEW = 64 RVV instructions.

I have implemented:
 https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616223.html 
You can take a look at "preferred_simd_mode" function.
And also, I have a bunch of -march combinations of testcase, make sure compiler will not auto-vectorize the codes
if we don't want it:
https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616224.html 



juzhe.zhong@rivai.ai
 
From: Michael Collison
Date: 2023-04-18 02:36
To: gcc-patches
Subject: [PATCH v4 03/10] RISC-V:autovec: Add auto-vectorization support functions
2023-03-02  Michael Collison  <collison@rivosinc.com>
    Juzhe Zhong  <juzhe.zhong@rivai.ai>
 
* config/riscv/riscv-v.cc (riscv_classify_vlmul_field):
New function.
(riscv_vector_preferred_simd_mode): Ditto.
(get_mask_policy_no_pred): Ditto.
(get_tail_policy_no_pred): Ditto.
(riscv_tuple_mode_p): Ditto.
(riscv_classify_nf): Ditto.
(riscv_vlmul_regsize): Ditto.
(riscv_vector_mask_mode_p): Ditto.
(riscv_vector_get_mask_mode): Ditto.
---
gcc/config/riscv/riscv-v.cc | 176 ++++++++++++++++++++++++++++++++++++
1 file changed, 176 insertions(+)
 
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 392f5d02e17..9df86419caa 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -39,9 +39,11 @@
#include "emit-rtl.h"
#include "tm_p.h"
#include "target.h"
+#include "targhooks.h"
#include "expr.h"
#include "optabs.h"
#include "tm-constrs.h"
+#include "riscv-vector-builtins.h"
#include "rtx-vector-builder.h"
using namespace riscv_vector;
@@ -118,6 +120,41 @@ const_vec_all_same_in_range_p (rtx x, HOST_WIDE_INT minval,
  && IN_RANGE (INTVAL (elt), minval, maxval));
}
+/* Return the vlmul field for a specific machine mode.  */
+unsigned int
+riscv_classify_vlmul_field (enum machine_mode mode)
+{
+  /* Make the decision based on the mode's enum value rather than its
+     properties, so that we keep the correct classification regardless
+     of -mriscv-vector-bits.  */
+  switch (mode)
+    {
+    case E_VNx8BImode:
+      return VLMUL_FIELD_111;
+
+    case E_VNx4BImode:
+      return VLMUL_FIELD_110;
+
+    case E_VNx2BImode:
+      return VLMUL_FIELD_101;
+
+    case E_VNx16BImode:
+      return VLMUL_FIELD_000;
+
+    case E_VNx32BImode:
+      return VLMUL_FIELD_001;
+
+    case E_VNx64BImode:
+      return VLMUL_FIELD_010;
+
+    default:
+      break;
+    }
+
+  /* we don't care about VLMUL for Mask.  */
+  return VLMUL_FIELD_000;
+}
+
/* Emit a vlmax vsetvl instruction.  This should only be used when
    optimization is disabled or after vsetvl insertion pass.  */
void
@@ -176,6 +213,64 @@ calculate_ratio (unsigned int sew, enum vlmul_type vlmul)
   return ratio;
}
+/* Implement TARGET_VECTORIZE_PREFERRED_SIMD_MODE for RVV.  */
+
+machine_mode
+riscv_vector_preferred_simd_mode (scalar_mode mode, unsigned vf)
+{
+  if (!TARGET_VECTOR)
+    return word_mode;
+
+  switch (mode)
+    {
+    case E_QImode:
+      return vf == 1   ? VNx8QImode
+      : vf == 2 ? VNx16QImode
+      : vf == 4 ? VNx32QImode
+        : VNx64QImode;
+      break;
+    case E_HImode:
+      return vf == 1   ? VNx4HImode
+      : vf == 2 ? VNx8HImode
+      : vf == 4 ? VNx16HImode
+        : VNx32HImode;
+      break;
+    case E_SImode:
+      return vf == 1   ? VNx2SImode
+      : vf == 2 ? VNx4SImode
+      : vf == 4 ? VNx8SImode
+        : VNx16SImode;
+      break;
+    case E_DImode:
+      if (riscv_vector_elen_flags != MASK_VECTOR_ELEN_32
+   && riscv_vector_elen_flags != MASK_VECTOR_ELEN_FP_32)
+ return vf == 1 ? VNx1DImode
+        : vf == 2 ? VNx2DImode
+        : vf == 4 ? VNx4DImode
+ : VNx8DImode;
+      break;
+    case E_SFmode:
+      if (TARGET_HARD_FLOAT && riscv_vector_elen_flags != MASK_VECTOR_ELEN_32
+   && riscv_vector_elen_flags != MASK_VECTOR_ELEN_64)
+ return vf == 1 ? VNx2SFmode
+        : vf == 2 ? VNx4SFmode
+        : vf == 4 ? VNx8SFmode
+ : VNx16SFmode;
+      break;
+    case E_DFmode:
+      if (TARGET_DOUBLE_FLOAT && TARGET_VECTOR_ELEN_FP_64)
+ return vf == 1 ? VNx1DFmode
+        : vf == 2 ? VNx2DFmode
+        : vf == 4 ? VNx4DFmode
+ : VNx8DFmode;
+      break;
+    default:
+      break;
+    }
+
+  return word_mode;
+}
+
/* Emit an RVV unmask && vl mov from SRC to DEST.  */
static void
emit_pred_op (unsigned icode, rtx mask, rtx dest, rtx src, rtx len,
@@ -421,6 +516,87 @@ get_avl_type_rtx (enum avl_type type)
   return gen_int_mode (type, Pmode);
}
+rtx
+get_mask_policy_no_pred ()
+{
+  return get_mask_policy_for_pred (PRED_TYPE_none);
+}
+
+rtx
+get_tail_policy_no_pred ()
+{
+  return get_mask_policy_for_pred (PRED_TYPE_none);
+}
+
+/* Return true if it is a RVV tuple mode.  */
+bool
+riscv_tuple_mode_p (machine_mode mode ATTRIBUTE_UNUSED)
+{
+  return false;
+}
+
+/* Return nf for a machine mode.  */
+int
+riscv_classify_nf (machine_mode mode)
+{
+  switch (mode)
+    {
+
+    default:
+      break;
+    }
+
+  return 1;
+}
+
+/* Return vlmul register size for a machine mode.  */
+int
+riscv_vlmul_regsize (machine_mode mode)
+{
+  if (GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL)
+    return 1;
+  switch (riscv_classify_vlmul_field (mode))
+    {
+    case VLMUL_FIELD_001:
+      return 2;
+    case VLMUL_FIELD_010:
+      return 4;
+    case VLMUL_FIELD_011:
+      return 8;
+    case VLMUL_FIELD_100:
+      gcc_unreachable ();
+    default:
+      return 1;
+    }
+}
+
+/* Return true if it is a RVV mask mode.  */
+bool
+riscv_vector_mask_mode_p (machine_mode mode)
+{
+  return (mode == VNx1BImode || mode == VNx2BImode || mode == VNx4BImode
+   || mode == VNx8BImode || mode == VNx16BImode || mode == VNx32BImode
+   || mode == VNx64BImode);
+}
+
+/* Implement TARGET_VECTORIZE_GET_MASK_MODE for RVV.  */
+
+opt_machine_mode
+riscv_vector_get_mask_mode (machine_mode mode)
+{
+  machine_mode mask_mode;
+  int nf = 1;
+  if (riscv_tuple_mode_p (mode))
+    nf = riscv_classify_nf (mode);
+
+  FOR_EACH_MODE_IN_CLASS (mask_mode, MODE_VECTOR_BOOL)
+  if (GET_MODE_INNER (mask_mode) == BImode
+      && known_eq (GET_MODE_NUNITS (mask_mode) * nf, GET_MODE_NUNITS (mode))
+      && riscv_vector_mask_mode_p (mask_mode))
+    return mask_mode;
+  return default_get_mask_mode (mode);
+}
+
/* Return the RVV vector mode that has NUNITS elements of mode INNER_MODE.
    This function is not only used by builtins, but also will be used by
    auto-vectorization in the future.  */
-- 
2.34.1

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 05/10] RISC-V:autovec: Add autovectorization patterns for binary integer operations
  2023-04-17 18:36 ` [PATCH v4 05/10] RISC-V:autovec: Add autovectorization patterns for binary integer operations Michael Collison
  2023-04-18 23:14   ` Jeff Law
  2023-04-19  1:19   ` Kito Cheng
@ 2023-04-20  2:24   ` juzhe.zhong
  2023-04-26 18:15     ` Robin Dapp
       [not found]     ` <3DF5ADD87A33EE11+BA2E4625-72A4-421A-B9D3-6DCA48E402BD@rivai.ai>
  2 siblings, 2 replies; 36+ messages in thread
From: juzhe.zhong @ 2023-04-20  2:24 UTC (permalink / raw)
  To: collison, gcc-patches
  Cc: jeffreyalaw, Kito.cheng, kito.cheng, palmer, palmer, Richard Biener

[-- Attachment #1: Type: text/plain, Size: 6375 bytes --]

1. We should only support len_load/len_store in the first patch before any other auto-vectorization operation.
    I have sent the patch:
    https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616223.html 

2. cond_<optab><mode> is the conditional auto-vectorization pattern used by reduction operation and comparison selecting.
    If we don't have reduce_* pattern and VCOND/VEC_CMP/... patterns, we should not have them now.

3.+  rtx merge = RVV_VUNDEF (<MODE>mode);
+  rtx vl = gen_reg_rtx (Pmode);
+  emit_vlmax_vsetvl (<MODE>mode, vl);
+  rtx mask_policy = get_mask_policy_no_pred();
+  rtx tail_policy = get_tail_policy_no_pred();
+  rtx mask = CONSTM1_RTX(<VM>mode);
+  rtx vlmax_avl_p = get_avl_type_rtx(NONVLMAX);

These operands preparation codes should be added into a wrapper.
How to add a wrapper, you can reference "emit_nonvlmax_op" , "emit_pred_op"... functions.

Thanks.


juzhe.zhong@rivai.ai
 
From: Michael Collison
Date: 2023-04-18 02:36
To: gcc-patches
Subject: [PATCH v4 05/10] RISC-V:autovec: Add autovectorization patterns for binary integer operations
2023-03-02  Michael Collison  <collison@rivosinc.com>
    Juzhe Zhong  <juzhe.zhong@rivai.ai>
 
* config/riscv/riscv.md (riscv_vector_preferred_simd_mode): Include
vector-iterators.md.
* config/riscv/vector-auto.md: New file containing
autovectorization patterns.
* config/riscv/vector-iterators.md (UNSPEC_VADD/UNSPEC_VSUB):
New unspecs for autovectorization patterns.
* config/riscv/vector.md: Remove include of vector-iterators.md
and include vector-auto.md.
---
gcc/config/riscv/riscv.md            |  1 +
gcc/config/riscv/vector-auto.md      | 79 ++++++++++++++++++++++++++++
gcc/config/riscv/vector-iterators.md |  2 +
gcc/config/riscv/vector.md           |  4 +-
4 files changed, 84 insertions(+), 2 deletions(-)
create mode 100644 gcc/config/riscv/vector-auto.md
 
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index bc384d9aedf..7f8f3a6cb18 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -135,6 +135,7 @@
(include "predicates.md")
(include "constraints.md")
(include "iterators.md")
+(include "vector-iterators.md")
;; ....................
;;
diff --git a/gcc/config/riscv/vector-auto.md b/gcc/config/riscv/vector-auto.md
new file mode 100644
index 00000000000..dc62f9af705
--- /dev/null
+++ b/gcc/config/riscv/vector-auto.md
@@ -0,0 +1,79 @@
+;; Machine description for RISC-V 'V' Extension for GNU compiler.
+;; Copyright (C) 2022-2023 Free Software Foundation, Inc.
+;; Contributed by Juzhe Zhong (juzhe.zhong@rivai.ai), RiVAI Technologies Ltd.
+;; Contributed by Michael Collison (collison@rivosinc.com, Rivos Inc.
+
+;; This file is part of GCC.
+
+;; GCC is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+
+;; GCC is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+
+;; -------------------------------------------------------------------------
+;; ---- [INT] Addition
+;; -------------------------------------------------------------------------
+;; Includes:
+;; - vadd.vv
+;; - vadd.vx
+;; - vadd.vi
+;; -------------------------------------------------------------------------
+
+(define_expand "<optab><mode>3"
+  [(set (match_operand:VI 0 "register_operand")
+ (any_int_binop:VI (match_operand:VI 1 "register_operand")
+   (match_operand:VI 2 "register_operand")))]
+  "TARGET_VECTOR"
+{
+  using namespace riscv_vector;
+
+  rtx merge = RVV_VUNDEF (<MODE>mode);
+  rtx vl = gen_reg_rtx (Pmode);
+  emit_vlmax_vsetvl (<MODE>mode, vl);
+  rtx mask_policy = get_mask_policy_no_pred();
+  rtx tail_policy = get_tail_policy_no_pred();
+  rtx mask = CONSTM1_RTX(<VM>mode);
+  rtx vlmax_avl_p = get_avl_type_rtx(NONVLMAX);
+
+  emit_insn(gen_pred_<optab><mode>(operands[0], mask, merge, operands[1], operands[2],
+ vl, tail_policy, mask_policy, vlmax_avl_p));
+
+  DONE;
+})
+
+(define_expand "cond_<optab><mode>3"
+  [(set (match_operand:VI 0 "register_operand")
+ (if_then_else:VI
+ (unspec:<VM>
+   [(match_operand:<VM> 1 "register_operand")] UNSPEC_VPREDICATE)
+ (any_int_binop:VI
+   (match_operand:VI 2 "register_operand")
+   (match_operand:VI 3 "register_operand"))
+ (match_operand:VI 4 "register_operand")))]
+  "TARGET_VECTOR"
+{
+  using namespace riscv_vector;
+
+  rtx merge = operands[4];
+  rtx vl = gen_reg_rtx (Pmode);
+  emit_vlmax_vsetvl (<MODE>mode, vl);
+  rtx mask_policy = get_mask_policy_no_pred();
+  rtx tail_policy = get_tail_policy_no_pred();
+  rtx mask = operands[1];
+  rtx vlmax_avl_p = get_avl_type_rtx(NONVLMAX);
+
+  emit_insn(gen_pred_<optab><mode>(operands[0], mask, merge, operands[2], operands[3],
+ vl, tail_policy, mask_policy, vlmax_avl_p));
+  DONE;
+})
+
diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
index 70ad85b661b..7fae87968d7 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -34,6 +34,8 @@
   UNSPEC_VMULHU
   UNSPEC_VMULHSU
+  UNSPEC_VADD
+  UNSPEC_VSUB
   UNSPEC_VADC
   UNSPEC_VSBC
   UNSPEC_VMADC
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index 0ecca98f20c..2ac5b744503 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -26,8 +26,6 @@
;; - Auto-vectorization (TBD)
;; - Combine optimization (TBD)
-(include "vector-iterators.md")
-
(define_constants [
    (INVALID_ATTRIBUTE            255)
    (X0_REGNUM                      0)
@@ -351,6 +349,8 @@
   (symbol_ref "INTVAL (operands[4])")]
(const_int INVALID_ATTRIBUTE)))
+(include "vector-auto.md")
+
;; -----------------------------------------------------------------
;; ---- Miscellaneous Operations
;; -----------------------------------------------------------------
-- 
2.34.1

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 07/10] vect: Verify that GET_MODE_NUNITS is a multiple of 2.
  2023-04-18 22:48         ` juzhe.zhong
  2023-04-18 23:19           ` Michael Collison
@ 2023-04-20 10:01           ` Richard Sandiford
  1 sibling, 0 replies; 36+ messages in thread
From: Richard Sandiford @ 2023-04-20 10:01 UTC (permalink / raw)
  To: juzhe.zhong
  Cc: kito.cheng, richard.guenther, Jeff Law, palmer, Michael Collison,
	gcc-patches

<juzhe.zhong@rivai.ai> writes:
> Yes, like kito said.
> We won't enable VNx1DImode in auto-vectorization so it's meaningless to fix it here.
> We dynamic adjust the minimum vector-length for different '-march' according to RVV ISA specification.
> So we strongly suggest that we should drop this fix.

I think the patch should go in regardless.  If we have a port with
a VNx1 mode then the exact_div is at best dubious and at worst wrong.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 05/10] RISC-V:autovec: Add autovectorization patterns for binary integer operations
  2023-04-19  1:19   ` Kito Cheng
@ 2023-04-20 20:21     ` Michael Collison
  0 siblings, 0 replies; 36+ messages in thread
From: Michael Collison @ 2023-04-20 20:21 UTC (permalink / raw)
  To: Kito Cheng; +Cc: gcc-patches

Hi Kito,

I will remove the unused UNSPECs, thank you for finding them.

I removed the include of "vector-iterators.md" because "riscv.md" 
already includes it and I was receiving multiple definition errors.

On 4/18/23 21:19, Kito Cheng wrote:
>> diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
>> index 70ad85b661b..7fae87968d7 100644
>> --- a/gcc/config/riscv/vector-iterators.md
>> +++ b/gcc/config/riscv/vector-iterators.md
>> @@ -34,6 +34,8 @@
>>     UNSPEC_VMULHU
>>     UNSPEC_VMULHSU
>>
>> +  UNSPEC_VADD
>> +  UNSPEC_VSUB
> Defined but unused?
>
>>     UNSPEC_VADC
>>     UNSPEC_VSBC
>>     UNSPEC_VMADC
>> diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
>> index 0ecca98f20c..2ac5b744503 100644
>> --- a/gcc/config/riscv/vector.md
>> +++ b/gcc/config/riscv/vector.md
>> @@ -26,8 +26,6 @@
>>   ;; - Auto-vectorization (TBD)
>>   ;; - Combine optimization (TBD)
>>
>> -(include "vector-iterators.md")
>> -
> Why remove this?

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 00/10] RISC-V: Add autovec support
  2023-04-17 18:36 [PATCH v4 00/10] RISC-V: Add autovec support Michael Collison
                   ` (10 preceding siblings ...)
  2023-04-17 19:26 ` [PATCH v4 00/10] RISC-V: Add autovec support Palmer Dabbelt
@ 2023-04-25 15:26 ` Palmer Dabbelt
  2023-04-26  2:52   ` Jeff Law
  11 siblings, 1 reply; 36+ messages in thread
From: Palmer Dabbelt @ 2023-04-25 15:26 UTC (permalink / raw)
  To: collison, Jeff Law; +Cc: gcc-patches

On Mon, 17 Apr 2023 11:36:51 PDT (-0700), collison@rivosinc.com wrote:
> This series of patches adds foundational support for RISC-V auto-vectorization support. These patches are based on the current upstream rvv vector intrinsic support and is not a new implementation. Most of the implementation consists of adding the new vector cost model, the autovectorization patterns themselves and target hooks. This implementation only provides support for integer addition and subtraction as a proof of concept. This patch set should not be construed to be feature complete. Based on conversations with the community these patches are intended to lay the groundwork for feature completion and collaboration within the RISC-V community.
>
> These patches are largely based off the work of Juzhe Zhong (juzhe.zhong@rivai.ai<mailto:juzhe.zhong@rivai.ai>) of RiVAI. More specifically the rvv-next branch at: https://github.com/riscv-collab/riscv-gcc.git <https://github.com/riscv-collab/riscv-gcc.git>is the foundation of this patch set.
>
> As discussed on this list, if these patches are approved they will be merged into a "auto-vectorization" branch once gcc-13 branches for release. There are two known issues related to crashes (assert failures) associated with tree vectorization; one of which I have sent a patch for and have received feedback.
>
> Changes in v4:
>
> - Added support for binary integer operations and test cases
> - Fixed bug to support 8-bit integer vectorization
> - Fixed several assert errors related to non-multiple of two vector modes
>
> Changes in v3:
>
> - Removed the cost model and cost hooks based on feedback from Richard Biener
> - Used RVV_VUNDEF macro to fix failing patterns
>
> Changes in v2
>
> - Updated ChangeLog entry to include RiVAI contributions
> - Fixed ChangeLog email formatting
> - Fixed gnu formatting issues in the code
>
> Kevin Lee (2):
>   This patch adds a guard for VNx1 vectors that are present in ports
>     like riscv.
>   This patch supports 8 bit auto-vectorization in riscv.
>
> Michael Collison (8):
>   RISC-V: Add new predicates and function prototypes
>   RISC-V: autovec: Export policy functions to global scope
>   RISC-V:autovec: Add auto-vectorization support functions
>   RISC-V:autovec: Add target vectorization hooks
>   RISC-V:autovec: Add autovectorization patterns for binary integer
>     operations
>   RISC-V:autovec: Add autovectorization tests for add & sub
>   vect: Verify that GET_MODE_NUNITS is a multiple of 2.
>   RISC-V:autovec: Add autovectorization tests for binary integer
>
>  gcc/config/riscv/predicates.md                |  13 ++
>  gcc/config/riscv/riscv-opts.h                 |  40 ++++
>  gcc/config/riscv/riscv-protos.h               |  14 ++
>  gcc/config/riscv/riscv-v.cc                   | 176 ++++++++++++++++++
>  gcc/config/riscv/riscv-vector-builtins.cc     |   4 +-
>  gcc/config/riscv/riscv-vector-builtins.h      |   3 +
>  gcc/config/riscv/riscv.cc                     | 157 ++++++++++++++++
>  gcc/config/riscv/riscv.md                     |   1 +
>  gcc/config/riscv/riscv.opt                    |  20 ++
>  gcc/config/riscv/vector-auto.md               |  79 ++++++++
>  gcc/config/riscv/vector-iterators.md          |   2 +
>  gcc/config/riscv/vector.md                    |   4 +-
>  .../riscv/rvv/autovec/loop-add-rv32.c         |  25 +++
>  .../gcc.target/riscv/rvv/autovec/loop-add.c   |  25 +++
>  .../riscv/rvv/autovec/loop-and-rv32.c         |  25 +++
>  .../gcc.target/riscv/rvv/autovec/loop-and.c   |  25 +++
>  .../riscv/rvv/autovec/loop-div-rv32.c         |  27 +++
>  .../gcc.target/riscv/rvv/autovec/loop-div.c   |  27 +++
>  .../riscv/rvv/autovec/loop-max-rv32.c         |  26 +++
>  .../gcc.target/riscv/rvv/autovec/loop-max.c   |  26 +++
>  .../riscv/rvv/autovec/loop-min-rv32.c         |  26 +++
>  .../gcc.target/riscv/rvv/autovec/loop-min.c   |  26 +++
>  .../riscv/rvv/autovec/loop-mod-rv32.c         |  27 +++
>  .../gcc.target/riscv/rvv/autovec/loop-mod.c   |  27 +++
>  .../riscv/rvv/autovec/loop-mul-rv32.c         |  25 +++
>  .../gcc.target/riscv/rvv/autovec/loop-mul.c   |  25 +++
>  .../riscv/rvv/autovec/loop-or-rv32.c          |  25 +++
>  .../gcc.target/riscv/rvv/autovec/loop-or.c    |  25 +++
>  .../riscv/rvv/autovec/loop-sub-rv32.c         |  25 +++
>  .../gcc.target/riscv/rvv/autovec/loop-sub.c   |  25 +++
>  .../riscv/rvv/autovec/loop-xor-rv32.c         |  25 +++
>  .../gcc.target/riscv/rvv/autovec/loop-xor.c   |  25 +++
>  gcc/testsuite/gcc.target/riscv/rvv/rvv.exp    |   3 +
>  gcc/tree-vect-data-refs.cc                    |   2 +
>  gcc/tree-vect-slp.cc                          |   7 +-
>  35 files changed, 1031 insertions(+), 6 deletions(-)
>  create mode 100644 gcc/config/riscv/vector-auto.md
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add-rv32.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-and-rv32.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-and.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-div-rv32.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-div.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-max-rv32.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-max.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-min-rv32.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-min.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mod-rv32.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mod.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mul-rv32.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mul.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-or-rv32.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-or.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub-rv32.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-xor-rv32.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-xor.c

A few of us were just talking in the patchwork sync, it looks like these 
are going to conflict with the WHILE_LEN work.  Jeff is going to merge 
those soon, Michael: do you mind rebasing these on trunk when those 
land?

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 01/10] RISC-V: Add new predicates and function prototypes
  2023-04-19  0:54   ` Kito Cheng
@ 2023-04-26  2:50     ` Jeff Law
  0 siblings, 0 replies; 36+ messages in thread
From: Jeff Law @ 2023-04-26  2:50 UTC (permalink / raw)
  To: Kito Cheng, Michael Collison; +Cc: gcc-patches



On 4/18/23 18:54, Kito Cheng via Gcc-patches wrote:
> Could you please move the new function declarations and new code to
> 
>> +Enum
>> +Name(riscv_vector_lmul) Type(enum riscv_vector_lmul_enum)
>> +The possible vectorization factor:
>> +
>> +EnumValue
>> +Enum(riscv_vector_lmul) String(1) Value(RVV_LMUL1)
>> +
>> +EnumValue
>> +Enum(riscv_vector_lmul) String(2) Value(RVV_LMUL2)
>> +
>> +EnumValue
>> +Enum(riscv_vector_lmul) String(4) Value(RVV_LMUL4)
>> +
>> +EnumValue
>> +Enum(riscv_vector_lmul) String(8) Value(RVV_LMUL8)
> 
> I would like to introduce this option later, it's used for fine tuning,
> VLA vectorizer should be able to work without this tuning option.
So I think this was in a patch I already ACK'd from Juzhe.

> 
>> +mriscv-vector-lmul=
>> +Target RejectNegative Joined Enum(riscv_vector_lmul) Var(riscv_vector_lmul) Init(RVV_LMUL1)
>> +-mriscv-vector-lmul=<lmul>     Set the vf using lmul in auto-vectorization.
>> +
> 
> Same question for this
Similarly.

jeff

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 00/10] RISC-V: Add autovec support
  2023-04-25 15:26 ` Palmer Dabbelt
@ 2023-04-26  2:52   ` Jeff Law
  0 siblings, 0 replies; 36+ messages in thread
From: Jeff Law @ 2023-04-26  2:52 UTC (permalink / raw)
  To: Palmer Dabbelt, collison, Jeff Law; +Cc: gcc-patches



On 4/25/23 09:26, Palmer Dabbelt wrote:
> On Mon, 17 Apr 2023 11:36:51 PDT (-0700), collison@rivosinc.com wrote:
>> This series of patches adds foundational support for RISC-V auto-vectorization support. These patches are based on the current upstream rvv vector intrinsic support and is not a new implementation. Most of the implementation consists of adding the new vector cost model, the autovectorization patterns themselves and target hooks. This implementation only provides support for integer addition and subtraction as a proof of concept. This patch set should not be construed to be feature complete. Based on conversations with the community these patches are intended to lay the groundwork for feature completion and collaboration within the RISC-V community.
>>
>> These patches are largely based off the work of Juzhe Zhong (juzhe.zhong@rivai.ai<mailto:juzhe.zhong@rivai.ai>) of RiVAI. More specifically the rvv-next branch at: https://github.com/riscv-collab/riscv-gcc.git <https://github.com/riscv-collab/riscv-gcc.git>is the foundation of this patch set.
>>
>> As discussed on this list, if these patches are approved they will be merged into a "auto-vectorization" branch once gcc-13 branches for release. There are two known issues related to crashes (assert failures) associated with tree vectorization; one of which I have sent a patch for and have received feedback.
>>
>> Changes in v4:
>>
>> - Added support for binary integer operations and test cases
>> - Fixed bug to support 8-bit integer vectorization
>> - Fixed several assert errors related to non-multiple of two vector modes
>>
>> Changes in v3:
>>
>> - Removed the cost model and cost hooks based on feedback from Richard Biener
>> - Used RVV_VUNDEF macro to fix failing patterns
>>
>> Changes in v2
>>
>> - Updated ChangeLog entry to include RiVAI contributions
>> - Fixed ChangeLog email formatting
>> - Fixed gnu formatting issues in the code
>>
>> Kevin Lee (2):
>>    This patch adds a guard for VNx1 vectors that are present in ports
>>      like riscv.
>>    This patch supports 8 bit auto-vectorization in riscv.
>>
>> Michael Collison (8):
>>    RISC-V: Add new predicates and function prototypes
>>    RISC-V: autovec: Export policy functions to global scope
>>    RISC-V:autovec: Add auto-vectorization support functions
>>    RISC-V:autovec: Add target vectorization hooks
>>    RISC-V:autovec: Add autovectorization patterns for binary integer
>>      operations
>>    RISC-V:autovec: Add autovectorization tests for add & sub
>>    vect: Verify that GET_MODE_NUNITS is a multiple of 2.
>>    RISC-V:autovec: Add autovectorization tests for binary integer
>>
>>   gcc/config/riscv/predicates.md                |  13 ++
>>   gcc/config/riscv/riscv-opts.h                 |  40 ++++
>>   gcc/config/riscv/riscv-protos.h               |  14 ++
>>   gcc/config/riscv/riscv-v.cc                   | 176 ++++++++++++++++++
>>   gcc/config/riscv/riscv-vector-builtins.cc     |   4 +-
>>   gcc/config/riscv/riscv-vector-builtins.h      |   3 +
>>   gcc/config/riscv/riscv.cc                     | 157 ++++++++++++++++
>>   gcc/config/riscv/riscv.md                     |   1 +
>>   gcc/config/riscv/riscv.opt                    |  20 ++
>>   gcc/config/riscv/vector-auto.md               |  79 ++++++++
>>   gcc/config/riscv/vector-iterators.md          |   2 +
>>   gcc/config/riscv/vector.md                    |   4 +-
>>   .../riscv/rvv/autovec/loop-add-rv32.c         |  25 +++
>>   .../gcc.target/riscv/rvv/autovec/loop-add.c   |  25 +++
>>   .../riscv/rvv/autovec/loop-and-rv32.c         |  25 +++
>>   .../gcc.target/riscv/rvv/autovec/loop-and.c   |  25 +++
>>   .../riscv/rvv/autovec/loop-div-rv32.c         |  27 +++
>>   .../gcc.target/riscv/rvv/autovec/loop-div.c   |  27 +++
>>   .../riscv/rvv/autovec/loop-max-rv32.c         |  26 +++
>>   .../gcc.target/riscv/rvv/autovec/loop-max.c   |  26 +++
>>   .../riscv/rvv/autovec/loop-min-rv32.c         |  26 +++
>>   .../gcc.target/riscv/rvv/autovec/loop-min.c   |  26 +++
>>   .../riscv/rvv/autovec/loop-mod-rv32.c         |  27 +++
>>   .../gcc.target/riscv/rvv/autovec/loop-mod.c   |  27 +++
>>   .../riscv/rvv/autovec/loop-mul-rv32.c         |  25 +++
>>   .../gcc.target/riscv/rvv/autovec/loop-mul.c   |  25 +++
>>   .../riscv/rvv/autovec/loop-or-rv32.c          |  25 +++
>>   .../gcc.target/riscv/rvv/autovec/loop-or.c    |  25 +++
>>   .../riscv/rvv/autovec/loop-sub-rv32.c         |  25 +++
>>   .../gcc.target/riscv/rvv/autovec/loop-sub.c   |  25 +++
>>   .../riscv/rvv/autovec/loop-xor-rv32.c         |  25 +++
>>   .../gcc.target/riscv/rvv/autovec/loop-xor.c   |  25 +++
>>   gcc/testsuite/gcc.target/riscv/rvv/rvv.exp    |   3 +
>>   gcc/tree-vect-data-refs.cc                    |   2 +
>>   gcc/tree-vect-slp.cc                          |   7 +-
>>   35 files changed, 1031 insertions(+), 6 deletions(-)
>>   create mode 100644 gcc/config/riscv/vector-auto.md
>>   create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add-rv32.c
>>   create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add.c
>>   create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-and-rv32.c
>>   create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-and.c
>>   create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-div-rv32.c
>>   create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-div.c
>>   create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-max-rv32.c
>>   create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-max.c
>>   create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-min-rv32.c
>>   create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-min.c
>>   create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mod-rv32.c
>>   create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mod.c
>>   create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mul-rv32.c
>>   create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-mul.c
>>   create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-or-rv32.c
>>   create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-or.c
>>   create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub-rv32.c
>>   create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub.c
>>   create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-xor-rv32.c
>>   create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-xor.c
> 
> A few of us were just talking in the patchwork sync, it looks like these
> are going to conflict with the WHILE_LEN work.  Jeff is going to merge
> those soon, Michael: do you mind rebasing these on trunk when those
> land?
Right.  There were three preliminary patches from Juzhe that I've acked. 
  It sounds like Juzhe isn't going to be committting patches directly 
until May, so I'll merge up the first three myself.   Perhaps as early 
as tonight.

jeff

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 05/10] RISC-V:autovec: Add autovectorization patterns for binary integer operations
  2023-04-20  2:24   ` juzhe.zhong
@ 2023-04-26 18:15     ` Robin Dapp
       [not found]     ` <3DF5ADD87A33EE11+BA2E4625-72A4-421A-B9D3-6DCA48E402BD@rivai.ai>
  1 sibling, 0 replies; 36+ messages in thread
From: Robin Dapp @ 2023-04-26 18:15 UTC (permalink / raw)
  To: juzhe.zhong, collison, gcc-patches
  Cc: jeffreyalaw, Kito.cheng, kito.cheng, palmer, palmer

Hi Michael,

I have the diff below for the binops in my tree locally.
Maybe something like this works for you? Untested but compiles and
the expander helpers would need to be fortified obviously.

Regards
 Robin

--

gcc/ChangeLog:

        * config/riscv/autovec.md (<optab><mode>3): New binops expander.
        * config/riscv/riscv-protos.h (emit_nonvlmax_binop): Define.
        * config/riscv/riscv-v.cc (emit_pred_binop): New function.
        (emit_nonvlmax_binop): New function.
        * config/riscv/vector-iterators.md: New iterator.
---
 gcc/config/riscv/autovec.md          | 12 ++++
 gcc/config/riscv/riscv-protos.h      |  1 +
 gcc/config/riscv/riscv-v.cc          | 89 ++++++++++++++++++++--------
 gcc/config/riscv/vector-iterators.md | 20 +++++++
 4 files changed, 97 insertions(+), 25 deletions(-)

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index b5d46ff57ab..c21d241f426 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -47,3 +47,15 @@ (define_expand "len_store_<mode>"
 				  operands[1], operands[2], <VM>mode);
   DONE;
 })
+
+(define_expand "<optab><mode>3"
+  [(set (match_operand:VI 0 "register_operand")
+	(any_int_binop:VI (match_operand:VI 1 "register_operand")
+			  (match_operand:VI 2 "register_operand")))]
+  "TARGET_VECTOR"
+{
+  riscv_vector::emit_nonvlmax_binop (code_for_pred (<ANY_INT_BINOP>, <MODE>mode),
+				     operands[0], operands[1], operands[2],
+				     gen_reg_rtx (Pmode), <VM>mode);
+  DONE;
+})
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index f6ea6846736..5cca543c773 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -163,6 +163,7 @@ void emit_hard_vlmax_vsetvl (machine_mode, rtx);
 void emit_vlmax_op (unsigned, rtx, rtx, machine_mode);
 void emit_vlmax_op (unsigned, rtx, rtx, rtx, machine_mode);
 void emit_nonvlmax_op (unsigned, rtx, rtx, rtx, machine_mode);
+void emit_nonvlmax_binop (unsigned, rtx, rtx, rtx, rtx, machine_mode);
 enum vlmul_type get_vlmul (machine_mode);
 unsigned int get_ratio (machine_mode);
 int get_ta (rtx);
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 5e69427ac54..98ebc052340 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -52,7 +52,7 @@ namespace riscv_vector {
 template <int MAX_OPERANDS> class insn_expander
 {
 public:
-  insn_expander () : m_opno (0) {}
+  insn_expander () : m_opno (0), has_dest(false) {}
   void add_output_operand (rtx x, machine_mode mode)
   {
     create_output_operand (&m_ops[m_opno++], x, mode);
@@ -83,6 +83,44 @@ public:
     add_input_operand (gen_int_mode (type, Pmode), Pmode);
   }
 
+  void set_dest_and_mask (rtx mask, rtx dest, machine_mode mask_mode)
+  {
+    dest_mode = GET_MODE (dest);
+    has_dest = true;
+
+    add_output_operand (dest, dest_mode);
+
+    if (mask)
+      add_input_operand (mask, GET_MODE (mask));
+    else
+      add_all_one_mask_operand (mask_mode);
+
+    add_vundef_operand (dest_mode);
+  }
+
+  void set_len_and_policy (rtx len, bool vlmax_p)
+    {
+      gcc_assert (has_dest);
+      gcc_assert (len || vlmax_p);
+
+      if (len)
+	add_input_operand (len, Pmode);
+      else
+	{
+	  rtx vlmax = gen_reg_rtx (Pmode);
+	  emit_vlmax_vsetvl (dest_mode, vlmax);
+	  add_input_operand (vlmax, Pmode);
+	}
+
+      if (GET_MODE_CLASS (dest_mode) != MODE_VECTOR_BOOL)
+	add_policy_operand (get_prefer_tail_policy (), get_prefer_mask_policy ());
+
+      if (vlmax_p)
+	add_avl_type_operand (avl_type::VLMAX);
+      else
+	add_avl_type_operand (avl_type::NONVLMAX);
+    }
+
   void expand (enum insn_code icode, bool temporary_volatile_p = false)
   {
     if (temporary_volatile_p)
@@ -96,6 +134,8 @@ public:
 
 private:
   int m_opno;
+  bool has_dest;
+  machine_mode dest_mode;
   expand_operand m_ops[MAX_OPERANDS];
 };
 
@@ -183,37 +223,29 @@ emit_pred_op (unsigned icode, rtx mask, rtx dest, rtx src, rtx len,
 	      machine_mode mask_mode, bool vlmax_p)
 {
   insn_expander<8> e;
-  machine_mode mode = GET_MODE (dest);
+  e.set_dest_and_mask (mask, dest, mask_mode);
 
-  e.add_output_operand (dest, mode);
-
-  if (mask)
-    e.add_input_operand (mask, GET_MODE (mask));
-  else
-    e.add_all_one_mask_operand (mask_mode);
+  e.add_input_operand (src, GET_MODE (src));
 
-  e.add_vundef_operand (mode);
+  e.set_len_and_policy (len, vlmax_p);
 
-  e.add_input_operand (src, GET_MODE (src));
+  e.expand ((enum insn_code) icode, MEM_P (dest) || MEM_P (src));
+}
 
-  if (len)
-    e.add_input_operand (len, Pmode);
-  else
-    {
-      rtx vlmax = gen_reg_rtx (Pmode);
-      emit_vlmax_vsetvl (mode, vlmax);
-      e.add_input_operand (vlmax, Pmode);
-    }
+/* Emit an RVV unmask && vl mov from SRC to DEST.  */
+static void
+emit_pred_binop (unsigned icode, rtx mask, rtx dest, rtx src1, rtx src2,
+		 rtx len, machine_mode mask_mode, bool vlmax_p)
+{
+  insn_expander<9> e;
+  e.set_dest_and_mask (mask, dest, mask_mode);
 
-  if (GET_MODE_CLASS (mode) != MODE_VECTOR_BOOL)
-    e.add_policy_operand (get_prefer_tail_policy (), get_prefer_mask_policy ());
+  e.add_input_operand (src1, GET_MODE (src1));
+  e.add_input_operand (src2, GET_MODE (src2));
 
-  if (vlmax_p)
-    e.add_avl_type_operand (avl_type::VLMAX);
-  else
-    e.add_avl_type_operand (avl_type::NONVLMAX);
+  e.set_len_and_policy (len, vlmax_p);
 
-  e.expand ((enum insn_code) icode, MEM_P (dest) || MEM_P (src));
+  e.expand ((enum insn_code) icode, MEM_P (dest) || MEM_P (src1) || MEM_P (src2));
 }
 
 void
@@ -236,6 +268,13 @@ emit_nonvlmax_op (unsigned icode, rtx dest, rtx src, rtx len,
   emit_pred_op (icode, NULL_RTX, dest, src, len, mask_mode, false);
 }
 
+void
+emit_nonvlmax_binop (unsigned icode, rtx dest, rtx src1, rtx src2, rtx len,
+		     machine_mode mask_mode)
+{
+  emit_pred_binop (icode, NULL_RTX, dest, src1, src2, len, mask_mode, false);
+}
+
 static void
 expand_const_vector (rtx target, rtx src, machine_mode mask_mode)
 {
diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
index a8e856161d3..7cf21751d2f 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -934,6 +934,26 @@ (define_code_iterator any_int_binop [plus minus and ior xor ashift ashiftrt lshi
   smax umax smin umin mult div udiv mod umod
 ])
 
+(define_code_attr ANY_INT_BINOP [
+    (plus "PLUS")
+    (minus "MINUS")
+    (and "AND")
+    (ior "IOR")
+    (xor "XOR")
+    (ashift "ASHIFT")
+    (ashiftrt "ASHIFTRT")
+    (lshiftrt "LSHIFTRT")
+    (smax "SMAX")
+    (umax "UMAX")
+    (smin "SMIN")
+    (umin "UMIN")
+    (mult "MULT")
+    (div "DIV")
+    (udiv "UDIV")
+    (mod "MOD")
+    (umod "UMOD")
+])
+
 (define_code_iterator any_int_unop [neg not])
 
 (define_code_iterator any_commutative_binop [plus and ior xor
-- 
2.40.0

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 05/10] RISC-V: autovec: Add autovectorization patterns for binary integer operations
       [not found]     ` <3DF5ADD87A33EE11+BA2E4625-72A4-421A-B9D3-6DCA48E402BD@rivai.ai>
@ 2023-04-27  0:04       ` Michael Collison
  2023-04-27 16:20         ` Palmer Dabbelt
  0 siblings, 1 reply; 36+ messages in thread
From: Michael Collison @ 2023-04-27  0:04 UTC (permalink / raw)
  To: juzhe.zhong, Robin Dapp
  Cc: gcc-patches, jeffreyalaw, Kito.cheng, kito.cheng, palmer, palmer

[-- Attachment #1: Type: text/plain, Size: 8704 bytes --]

Hi Robin and Juzhe,

Just took a look and I like the approach.

On 4/26/23 19:43, juzhe.zhong wrote:
> Yeah,Robin stuff is what I want and is making perfect sense for me.
> ---- Replied Message ----
> From 	Robin Dapp<rdapp.gcc@gmail.com> <mailto:rdapp.gcc@gmail.com>
> Date 	04/27/2023 02:15
> To 	juzhe.zhong@rivai.ai<juzhe.zhong@rivai.ai> 
> <mailto:juzhe.zhong@rivai.ai>,
> collison<collison@rivosinc.com> <mailto:collison@rivosinc.com>,
> gcc-patches<gcc-patches@gcc.gnu.org> <mailto:gcc-patches@gcc.gnu.org>
> Cc 	jeffreyalaw<jeffreyalaw@gmail.com> <mailto:jeffreyalaw@gmail.com>,
> Kito.cheng<kito.cheng@sifive.com> <mailto:kito.cheng@sifive.com>,
> kito.cheng<kito.cheng@gmail.com> <mailto:kito.cheng@gmail.com>,
> palmer<palmer@dabbelt.com> <mailto:palmer@dabbelt.com>,
> palmer<palmer@rivosinc.com> <mailto:palmer@rivosinc.com>
> Subject 	Re: [PATCH v4 05/10] RISC-V:autovec: Add autovectorization 
> patterns for binary integer operations
>
> Hi Michael,
>
> I have the diff below for the binops in my tree locally.
> Maybe something like this works for you? Untested but compiles and
> the expander helpers would need to be fortified obviously.
>
> Regards
> Robin
>
> -- 
>
> gcc/ChangeLog:
>
>        * config/riscv/autovec.md (<optab><mode>3): New binops expander.
>        * config/riscv/riscv-protos.h (emit_nonvlmax_binop): Define.
>        * config/riscv/riscv-v.cc (emit_pred_binop): New function.
>        (emit_nonvlmax_binop): New function.
>        * config/riscv/vector-iterators.md: New iterator.
> ---
> gcc/config/riscv/autovec.md          | 12 ++++
> gcc/config/riscv/riscv-protos.h      |  1 +
> gcc/config/riscv/riscv-v.cc          | 89 ++++++++++++++++++++--------
> gcc/config/riscv/vector-iterators.md | 20 +++++++
> 4 files changed, 97 insertions(+), 25 deletions(-)
>
> diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
> index b5d46ff57ab..c21d241f426 100644
> --- a/gcc/config/riscv/autovec.md
> +++ b/gcc/config/riscv/autovec.md
> @@ -47,3 +47,15 @@ (define_expand "len_store_<mode>"
>                  operands[1], operands[2], <VM>mode);
>   DONE;
> })
> +
> +(define_expand "<optab><mode>3"
> +  [(set (match_operand:VI 0 "register_operand")
> +    (any_int_binop:VI (match_operand:VI 1 "register_operand")
> +              (match_operand:VI 2 "register_operand")))]
> +  "TARGET_VECTOR"
> +{
> +  riscv_vector::emit_nonvlmax_binop (code_for_pred (<ANY_INT_BINOP>, 
> <MODE>mode),
> +                     operands[0], operands[1], operands[2],
> +                     gen_reg_rtx (Pmode), <VM>mode);
> +  DONE;
> +})
> diff --git a/gcc/config/riscv/riscv-protos.h 
> b/gcc/config/riscv/riscv-protos.h
> index f6ea6846736..5cca543c773 100644
> --- a/gcc/config/riscv/riscv-protos.h
> +++ b/gcc/config/riscv/riscv-protos.h
> @@ -163,6 +163,7 @@ void emit_hard_vlmax_vsetvl (machine_mode, rtx);
> void emit_vlmax_op (unsigned, rtx, rtx, machine_mode);
> void emit_vlmax_op (unsigned, rtx, rtx, rtx, machine_mode);
> void emit_nonvlmax_op (unsigned, rtx, rtx, rtx, machine_mode);
> +void emit_nonvlmax_binop (unsigned, rtx, rtx, rtx, rtx, machine_mode);
> enum vlmul_type get_vlmul (machine_mode);
> unsigned int get_ratio (machine_mode);
> int get_ta (rtx);
> diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
> index 5e69427ac54..98ebc052340 100644
> --- a/gcc/config/riscv/riscv-v.cc
> +++ b/gcc/config/riscv/riscv-v.cc
> @@ -52,7 +52,7 @@ namespace riscv_vector {
> template <int MAX_OPERANDS> class insn_expander
> {
> public:
> -  insn_expander () : m_opno (0) {}
> +  insn_expander () : m_opno (0), has_dest(false) {}
>   void add_output_operand (rtx x, machine_mode mode)
>   {
>     create_output_operand (&m_ops[m_opno++], x, mode);
> @@ -83,6 +83,44 @@ public:
>     add_input_operand (gen_int_mode (type, Pmode), Pmode);
>   }
>
> +  void set_dest_and_mask (rtx mask, rtx dest, machine_mode mask_mode)
> +  {
> +    dest_mode = GET_MODE (dest);
> +    has_dest = true;
> +
> +    add_output_operand (dest, dest_mode);
> +
> +    if (mask)
> +      add_input_operand (mask, GET_MODE (mask));
> +    else
> +      add_all_one_mask_operand (mask_mode);
> +
> +    add_vundef_operand (dest_mode);
> +  }
> +
> +  void set_len_and_policy (rtx len, bool vlmax_p)
> +    {
> +      gcc_assert (has_dest);
> +      gcc_assert (len || vlmax_p);
> +
> +      if (len)
> +    add_input_operand (len, Pmode);
> +      else
> +    {
> +      rtx vlmax = gen_reg_rtx (Pmode);
> +      emit_vlmax_vsetvl (dest_mode, vlmax);
> +      add_input_operand (vlmax, Pmode);
> +    }
> +
> +      if (GET_MODE_CLASS (dest_mode) != MODE_VECTOR_BOOL)
> +    add_policy_operand (get_prefer_tail_policy (), 
> get_prefer_mask_policy ());
> +
> +      if (vlmax_p)
> +    add_avl_type_operand (avl_type::VLMAX);
> +      else
> +    add_avl_type_operand (avl_type::NONVLMAX);
> +    }
> +
>   void expand (enum insn_code icode, bool temporary_volatile_p = false)
>   {
>     if (temporary_volatile_p)
> @@ -96,6 +134,8 @@ public:
>
> private:
>   int m_opno;
> +  bool has_dest;
> +  machine_mode dest_mode;
>   expand_operand m_ops[MAX_OPERANDS];
> };
>
> @@ -183,37 +223,29 @@ emit_pred_op (unsigned icode, rtx mask, rtx 
> dest, rtx src, rtx len,
>          machine_mode mask_mode, bool vlmax_p)
> {
>   insn_expander<8> e;
> -  machine_mode mode = GET_MODE (dest);
> +  e.set_dest_and_mask (mask, dest, mask_mode);
>
> -  e.add_output_operand (dest, mode);
> -
> -  if (mask)
> -    e.add_input_operand (mask, GET_MODE (mask));
> -  else
> -    e.add_all_one_mask_operand (mask_mode);
> +  e.add_input_operand (src, GET_MODE (src));
>
> -  e.add_vundef_operand (mode);
> +  e.set_len_and_policy (len, vlmax_p);
>
> -  e.add_input_operand (src, GET_MODE (src));
> +  e.expand ((enum insn_code) icode, MEM_P (dest) || MEM_P (src));
> +}
>
> -  if (len)
> -    e.add_input_operand (len, Pmode);
> -  else
> -    {
> -      rtx vlmax = gen_reg_rtx (Pmode);
> -      emit_vlmax_vsetvl (mode, vlmax);
> -      e.add_input_operand (vlmax, Pmode);
> -    }
> +/* Emit an RVV unmask && vl mov from SRC to DEST.  */
> +static void
> +emit_pred_binop (unsigned icode, rtx mask, rtx dest, rtx src1, rtx src2,
> +         rtx len, machine_mode mask_mode, bool vlmax_p)
> +{
> +  insn_expander<9> e;
> +  e.set_dest_and_mask (mask, dest, mask_mode);
>
> -  if (GET_MODE_CLASS (mode) != MODE_VECTOR_BOOL)
> -    e.add_policy_operand (get_prefer_tail_policy (), 
> get_prefer_mask_policy ());
> +  e.add_input_operand (src1, GET_MODE (src1));
> +  e.add_input_operand (src2, GET_MODE (src2));
>
> -  if (vlmax_p)
> -    e.add_avl_type_operand (avl_type::VLMAX);
> -  else
> -    e.add_avl_type_operand (avl_type::NONVLMAX);
> +  e.set_len_and_policy (len, vlmax_p);
>
> -  e.expand ((enum insn_code) icode, MEM_P (dest) || MEM_P (src));
> +  e.expand ((enum insn_code) icode, MEM_P (dest) || MEM_P (src1) || 
> MEM_P (src2));
> }
>
> void
> @@ -236,6 +268,13 @@ emit_nonvlmax_op (unsigned icode, rtx dest, rtx 
> src, rtx len,
>   emit_pred_op (icode, NULL_RTX, dest, src, len, mask_mode, false);
> }
>
> +void
> +emit_nonvlmax_binop (unsigned icode, rtx dest, rtx src1, rtx src2, 
> rtx len,
> +             machine_mode mask_mode)
> +{
> +  emit_pred_binop (icode, NULL_RTX, dest, src1, src2, len, mask_mode, 
> false);
> +}
> +
> static void
> expand_const_vector (rtx target, rtx src, machine_mode mask_mode)
> {
> diff --git a/gcc/config/riscv/vector-iterators.md 
> b/gcc/config/riscv/vector-iterators.md
> index a8e856161d3..7cf21751d2f 100644
> --- a/gcc/config/riscv/vector-iterators.md
> +++ b/gcc/config/riscv/vector-iterators.md
> @@ -934,6 +934,26 @@ (define_code_iterator any_int_binop [plus minus 
> and ior xor ashift ashiftrt lshi
>   smax umax smin umin mult div udiv mod umod
> ])
>
> +(define_code_attr ANY_INT_BINOP [
> +    (plus "PLUS")
> +    (minus "MINUS")
> +    (and "AND")
> +    (ior "IOR")
> +    (xor "XOR")
> +    (ashift "ASHIFT")
> +    (ashiftrt "ASHIFTRT")
> +    (lshiftrt "LSHIFTRT")
> +    (smax "SMAX")
> +    (umax "UMAX")
> +    (smin "SMIN")
> +    (umin "UMIN")
> +    (mult "MULT")
> +    (div "DIV")
> +    (udiv "UDIV")
> +    (mod "MOD")
> +    (umod "UMOD")
> +])
> +
> (define_code_iterator any_int_unop [neg not])
>
> (define_code_iterator any_commutative_binop [plus and ior xor
> -- 
> 2.40.0
> \x11

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v4 05/10] RISC-V: autovec: Add autovectorization patterns for binary integer operations
  2023-04-27  0:04       ` [PATCH v4 05/10] RISC-V: autovec: " Michael Collison
@ 2023-04-27 16:20         ` Palmer Dabbelt
  0 siblings, 0 replies; 36+ messages in thread
From: Palmer Dabbelt @ 2023-04-27 16:20 UTC (permalink / raw)
  To: collison, kevinl
  Cc: juzhe.zhong, rdapp.gcc, gcc-patches, jeffreyalaw, kito.cheng, Kito Cheng

On Wed, 26 Apr 2023 17:04:17 PDT (-0700), collison@rivosinc.com wrote:
> Hi Robin and Juzhe,
>
> Just took a look and I like the approach.

I assume it's best to just squash these into the series?  That seems 
reasonable to me, the only issue is that Micheal's PTO for a few days 
(this week and the first half on next week), so it might take a bit 
longer that expected.  There's a v5 on the lists, but we didn't have 
time to pick this all up and figured it'd be better to just get out 
whatever was ready.

Kevin: do you have time to squash these in and re-spin the tests?  The 
changes are big enough to warrant a v6 already, so might as well get 
started now.

> On 4/26/23 19:43, juzhe.zhong wrote:
>> Yeah,Robin stuff is what I want and is making perfect sense for me.
>> ---- Replied Message ----
>> From 	Robin Dapp<rdapp.gcc@gmail.com> <mailto:rdapp.gcc@gmail.com>
>> Date 	04/27/2023 02:15
>> To 	juzhe.zhong@rivai.ai<juzhe.zhong@rivai.ai>
>> <mailto:juzhe.zhong@rivai.ai>,
>> collison<collison@rivosinc.com> <mailto:collison@rivosinc.com>,
>> gcc-patches<gcc-patches@gcc.gnu.org> <mailto:gcc-patches@gcc.gnu.org>
>> Cc 	jeffreyalaw<jeffreyalaw@gmail.com> <mailto:jeffreyalaw@gmail.com>,
>> Kito.cheng<kito.cheng@sifive.com> <mailto:kito.cheng@sifive.com>,
>> kito.cheng<kito.cheng@gmail.com> <mailto:kito.cheng@gmail.com>,
>> palmer<palmer@dabbelt.com> <mailto:palmer@dabbelt.com>,
>> palmer<palmer@rivosinc.com> <mailto:palmer@rivosinc.com>
>> Subject 	Re: [PATCH v4 05/10] RISC-V:autovec: Add autovectorization
>> patterns for binary integer operations
>>
>> Hi Michael,
>>
>> I have the diff below for the binops in my tree locally.
>> Maybe something like this works for you? Untested but compiles and
>> the expander helpers would need to be fortified obviously.
>>
>> Regards
>> Robin
>>
>> --
>>
>> gcc/ChangeLog:
>>
>>        * config/riscv/autovec.md (<optab><mode>3): New binops expander.
>>        * config/riscv/riscv-protos.h (emit_nonvlmax_binop): Define.
>>        * config/riscv/riscv-v.cc (emit_pred_binop): New function.
>>        (emit_nonvlmax_binop): New function.
>>        * config/riscv/vector-iterators.md: New iterator.
>> ---
>> gcc/config/riscv/autovec.md          | 12 ++++
>> gcc/config/riscv/riscv-protos.h      |  1 +
>> gcc/config/riscv/riscv-v.cc          | 89 ++++++++++++++++++++--------
>> gcc/config/riscv/vector-iterators.md | 20 +++++++
>> 4 files changed, 97 insertions(+), 25 deletions(-)
>>
>> diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
>> index b5d46ff57ab..c21d241f426 100644
>> --- a/gcc/config/riscv/autovec.md
>> +++ b/gcc/config/riscv/autovec.md
>> @@ -47,3 +47,15 @@ (define_expand "len_store_<mode>"
>>                  operands[1], operands[2], <VM>mode);
>>   DONE;
>> })
>> +
>> +(define_expand "<optab><mode>3"
>> +  [(set (match_operand:VI 0 "register_operand")
>> +    (any_int_binop:VI (match_operand:VI 1 "register_operand")
>> +              (match_operand:VI 2 "register_operand")))]
>> +  "TARGET_VECTOR"
>> +{
>> +  riscv_vector::emit_nonvlmax_binop (code_for_pred (<ANY_INT_BINOP>,
>> <MODE>mode),
>> +                     operands[0], operands[1], operands[2],
>> +                     gen_reg_rtx (Pmode), <VM>mode);
>> +  DONE;
>> +})
>> diff --git a/gcc/config/riscv/riscv-protos.h
>> b/gcc/config/riscv/riscv-protos.h
>> index f6ea6846736..5cca543c773 100644
>> --- a/gcc/config/riscv/riscv-protos.h
>> +++ b/gcc/config/riscv/riscv-protos.h
>> @@ -163,6 +163,7 @@ void emit_hard_vlmax_vsetvl (machine_mode, rtx);
>> void emit_vlmax_op (unsigned, rtx, rtx, machine_mode);
>> void emit_vlmax_op (unsigned, rtx, rtx, rtx, machine_mode);
>> void emit_nonvlmax_op (unsigned, rtx, rtx, rtx, machine_mode);
>> +void emit_nonvlmax_binop (unsigned, rtx, rtx, rtx, rtx, machine_mode);
>> enum vlmul_type get_vlmul (machine_mode);
>> unsigned int get_ratio (machine_mode);
>> int get_ta (rtx);
>> diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
>> index 5e69427ac54..98ebc052340 100644
>> --- a/gcc/config/riscv/riscv-v.cc
>> +++ b/gcc/config/riscv/riscv-v.cc
>> @@ -52,7 +52,7 @@ namespace riscv_vector {
>> template <int MAX_OPERANDS> class insn_expander
>> {
>> public:
>> -  insn_expander () : m_opno (0) {}
>> +  insn_expander () : m_opno (0), has_dest(false) {}
>>   void add_output_operand (rtx x, machine_mode mode)
>>   {
>>     create_output_operand (&m_ops[m_opno++], x, mode);
>> @@ -83,6 +83,44 @@ public:
>>     add_input_operand (gen_int_mode (type, Pmode), Pmode);
>>   }
>>
>> +  void set_dest_and_mask (rtx mask, rtx dest, machine_mode mask_mode)
>> +  {
>> +    dest_mode = GET_MODE (dest);
>> +    has_dest = true;
>> +
>> +    add_output_operand (dest, dest_mode);
>> +
>> +    if (mask)
>> +      add_input_operand (mask, GET_MODE (mask));
>> +    else
>> +      add_all_one_mask_operand (mask_mode);
>> +
>> +    add_vundef_operand (dest_mode);
>> +  }
>> +
>> +  void set_len_and_policy (rtx len, bool vlmax_p)
>> +    {
>> +      gcc_assert (has_dest);
>> +      gcc_assert (len || vlmax_p);
>> +
>> +      if (len)
>> +    add_input_operand (len, Pmode);
>> +      else
>> +    {
>> +      rtx vlmax = gen_reg_rtx (Pmode);
>> +      emit_vlmax_vsetvl (dest_mode, vlmax);
>> +      add_input_operand (vlmax, Pmode);
>> +    }
>> +
>> +      if (GET_MODE_CLASS (dest_mode) != MODE_VECTOR_BOOL)
>> +    add_policy_operand (get_prefer_tail_policy (),
>> get_prefer_mask_policy ());
>> +
>> +      if (vlmax_p)
>> +    add_avl_type_operand (avl_type::VLMAX);
>> +      else
>> +    add_avl_type_operand (avl_type::NONVLMAX);
>> +    }
>> +
>>   void expand (enum insn_code icode, bool temporary_volatile_p = false)
>>   {
>>     if (temporary_volatile_p)
>> @@ -96,6 +134,8 @@ public:
>>
>> private:
>>   int m_opno;
>> +  bool has_dest;
>> +  machine_mode dest_mode;
>>   expand_operand m_ops[MAX_OPERANDS];
>> };
>>
>> @@ -183,37 +223,29 @@ emit_pred_op (unsigned icode, rtx mask, rtx
>> dest, rtx src, rtx len,
>>          machine_mode mask_mode, bool vlmax_p)
>> {
>>   insn_expander<8> e;
>> -  machine_mode mode = GET_MODE (dest);
>> +  e.set_dest_and_mask (mask, dest, mask_mode);
>>
>> -  e.add_output_operand (dest, mode);
>> -
>> -  if (mask)
>> -    e.add_input_operand (mask, GET_MODE (mask));
>> -  else
>> -    e.add_all_one_mask_operand (mask_mode);
>> +  e.add_input_operand (src, GET_MODE (src));
>>
>> -  e.add_vundef_operand (mode);
>> +  e.set_len_and_policy (len, vlmax_p);
>>
>> -  e.add_input_operand (src, GET_MODE (src));
>> +  e.expand ((enum insn_code) icode, MEM_P (dest) || MEM_P (src));
>> +}
>>
>> -  if (len)
>> -    e.add_input_operand (len, Pmode);
>> -  else
>> -    {
>> -      rtx vlmax = gen_reg_rtx (Pmode);
>> -      emit_vlmax_vsetvl (mode, vlmax);
>> -      e.add_input_operand (vlmax, Pmode);
>> -    }
>> +/* Emit an RVV unmask && vl mov from SRC to DEST.  */
>> +static void
>> +emit_pred_binop (unsigned icode, rtx mask, rtx dest, rtx src1, rtx src2,
>> +         rtx len, machine_mode mask_mode, bool vlmax_p)
>> +{
>> +  insn_expander<9> e;
>> +  e.set_dest_and_mask (mask, dest, mask_mode);
>>
>> -  if (GET_MODE_CLASS (mode) != MODE_VECTOR_BOOL)
>> -    e.add_policy_operand (get_prefer_tail_policy (),
>> get_prefer_mask_policy ());
>> +  e.add_input_operand (src1, GET_MODE (src1));
>> +  e.add_input_operand (src2, GET_MODE (src2));
>>
>> -  if (vlmax_p)
>> -    e.add_avl_type_operand (avl_type::VLMAX);
>> -  else
>> -    e.add_avl_type_operand (avl_type::NONVLMAX);
>> +  e.set_len_and_policy (len, vlmax_p);
>>
>> -  e.expand ((enum insn_code) icode, MEM_P (dest) || MEM_P (src));
>> +  e.expand ((enum insn_code) icode, MEM_P (dest) || MEM_P (src1) ||
>> MEM_P (src2));
>> }
>>
>> void
>> @@ -236,6 +268,13 @@ emit_nonvlmax_op (unsigned icode, rtx dest, rtx
>> src, rtx len,
>>   emit_pred_op (icode, NULL_RTX, dest, src, len, mask_mode, false);
>> }
>>
>> +void
>> +emit_nonvlmax_binop (unsigned icode, rtx dest, rtx src1, rtx src2,
>> rtx len,
>> +             machine_mode mask_mode)
>> +{
>> +  emit_pred_binop (icode, NULL_RTX, dest, src1, src2, len, mask_mode,
>> false);
>> +}
>> +
>> static void
>> expand_const_vector (rtx target, rtx src, machine_mode mask_mode)
>> {
>> diff --git a/gcc/config/riscv/vector-iterators.md
>> b/gcc/config/riscv/vector-iterators.md
>> index a8e856161d3..7cf21751d2f 100644
>> --- a/gcc/config/riscv/vector-iterators.md
>> +++ b/gcc/config/riscv/vector-iterators.md
>> @@ -934,6 +934,26 @@ (define_code_iterator any_int_binop [plus minus
>> and ior xor ashift ashiftrt lshi
>>   smax umax smin umin mult div udiv mod umod
>> ])
>>
>> +(define_code_attr ANY_INT_BINOP [
>> +    (plus "PLUS")
>> +    (minus "MINUS")
>> +    (and "AND")
>> +    (ior "IOR")
>> +    (xor "XOR")
>> +    (ashift "ASHIFT")
>> +    (ashiftrt "ASHIFTRT")
>> +    (lshiftrt "LSHIFTRT")
>> +    (smax "SMAX")
>> +    (umax "UMAX")
>> +    (smin "SMIN")
>> +    (umin "UMIN")
>> +    (mult "MULT")
>> +    (div "DIV")
>> +    (udiv "UDIV")
>> +    (mod "MOD")
>> +    (umod "UMOD")
>> +])
>> +
>> (define_code_iterator any_int_unop [neg not])
>>
>> (define_code_iterator any_commutative_binop [plus and ior xor
>> --
>> 2.40.0
>> \x11

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2023-04-27 16:20 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-04-17 18:36 [PATCH v4 00/10] RISC-V: Add autovec support Michael Collison
2023-04-17 18:36 ` [PATCH v4 01/10] RISC-V: Add new predicates and function prototypes Michael Collison
2023-04-19  0:54   ` Kito Cheng
2023-04-26  2:50     ` Jeff Law
2023-04-17 18:36 ` [PATCH v4 02/10] RISC-V: autovec: Export policy functions to global scope Michael Collison
2023-04-17 18:36 ` [PATCH v4 03/10] RISC-V:autovec: Add auto-vectorization support functions Michael Collison
2023-04-19  1:15   ` Kito Cheng
2023-04-20  2:19   ` juzhe.zhong
2023-04-17 18:36 ` [PATCH v4 04/10] RISC-V:autovec: Add target vectorization hooks Michael Collison
2023-04-19  1:04   ` Kito Cheng
2023-04-20  2:11   ` juzhe.zhong
2023-04-17 18:36 ` [PATCH v4 05/10] RISC-V:autovec: Add autovectorization patterns for binary integer operations Michael Collison
2023-04-18 23:14   ` Jeff Law
2023-04-19  1:19   ` Kito Cheng
2023-04-20 20:21     ` Michael Collison
2023-04-20  2:24   ` juzhe.zhong
2023-04-26 18:15     ` Robin Dapp
     [not found]     ` <3DF5ADD87A33EE11+BA2E4625-72A4-421A-B9D3-6DCA48E402BD@rivai.ai>
2023-04-27  0:04       ` [PATCH v4 05/10] RISC-V: autovec: " Michael Collison
2023-04-27 16:20         ` Palmer Dabbelt
2023-04-17 18:36 ` [PATCH v4 06/10] RISC-V:autovec: Add autovectorization tests for add & sub Michael Collison
2023-04-17 18:36 ` [PATCH v4 07/10] vect: Verify that GET_MODE_NUNITS is a multiple of 2 Michael Collison
2023-04-18  6:11   ` Richard Biener
2023-04-18 14:28     ` Kito Cheng
2023-04-18 18:21       ` Kito Cheng
2023-04-18 22:48         ` juzhe.zhong
2023-04-18 23:19           ` Michael Collison
2023-04-20 10:01           ` Richard Sandiford
2023-04-17 18:36 ` [PATCH v4 08/10] RISC-V:autovec: Add autovectorization tests for binary integer Michael Collison
2023-04-17 18:37 ` [PATCH v4 09/10] This patch adds a guard for VNx1 vectors that are present in ports like riscv Michael Collison
2023-04-18 14:26   ` Kito Cheng
2023-04-18 18:10     ` Michael Collison
2023-04-17 18:37 ` [PATCH v4 10/10] This patch supports 8 bit auto-vectorization in riscv Michael Collison
2023-04-17 19:26 ` [PATCH v4 00/10] RISC-V: Add autovec support Palmer Dabbelt
2023-04-18  6:22   ` Richard Biener
2023-04-25 15:26 ` Palmer Dabbelt
2023-04-26  2:52   ` Jeff Law

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).