public inbox for gcc-patches@gcc.gnu.org
* [PATCH V2] Middle-end: Change order of LEN_MASK_LOAD/LEN_MASK_STORE arguments
@ 2023-07-03  9:10 juzhe.zhong
  2023-07-03  9:17 ` Robin Dapp
  2023-07-03  9:26 ` Richard Sandiford
  0 siblings, 2 replies; 7+ messages in thread
From: juzhe.zhong @ 2023-07-03  9:10 UTC (permalink / raw)
  To: gcc-patches; +Cc: richard.sandiford, rguenther, Ju-Zhe Zhong

From: Ju-Zhe Zhong <juzhe.zhong@rivai.ai>

Hi Richard, I have fixed the order as you suggested.

Before this patch, the order is {len,mask,bias}.

Now, after this patch, the order becomes {len,bias,mask}.

Since you said we should not need 'internal_fn_bias_index', the bias index should always be the len index + 1.
I noticed that the LEN_STORE order is {len,vector,bias}; to make them consistent, I reordered it into LEN_STORE {len,bias,vector},
just like MASK_STORE {mask,vector}.
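
In GIMPLE terms (argument names below are only placeholders), the calls change like this:

Before: LEN_MASK_LOAD (ptr, align, len, mask, bias)
After:  LEN_MASK_LOAD (ptr, align, len, bias, mask)

Before: LEN_MASK_STORE (ptr, align, len, mask, v, bias)
After:  LEN_MASK_STORE (ptr, align, len, bias, mask, v)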

Ok for trunk?

gcc/ChangeLog:

        * config/riscv/autovec.md: Change order of LEN_MASK_LOAD/LEN_MASK_STORE/LEN_LOAD/LEN_STORE arguments.
        * config/riscv/riscv-v.cc (expand_load_store): Ditto.
        * doc/md.texi: Ditto.
        * gimple-fold.cc (gimple_fold_partial_load_store_mem_ref): Ditto.
        * internal-fn.cc (len_maskload_direct): Ditto.
        (len_maskstore_direct): Ditto.
        (add_len_and_mask_args): New function.
        (expand_partial_load_optab_fn): Change order of LEN_MASK_LOAD/LEN_MASK_STORE/LEN_LOAD/LEN_STORE arguments.
        (expand_partial_store_optab_fn): Ditto.
        (internal_fn_len_index): New function.
        (internal_fn_mask_index): Change order of LEN_MASK_LOAD/LEN_MASK_STORE/LEN_LOAD/LEN_STORE arguments.
        (internal_fn_stored_value_index): Ditto.
        (internal_len_load_store_bias): Ditto.
        * internal-fn.h (internal_fn_len_index): New function.
        * tree-ssa-dse.cc (initialize_ao_ref_for_dse): Change order of LEN_MASK_LOAD/LEN_MASK_STORE/LEN_LOAD/LEN_STORE arguments.
        * tree-vect-stmts.cc (vectorizable_store): Ditto.
        (vectorizable_load): Ditto.

---
 gcc/config/riscv/autovec.md |   8 +-
 gcc/config/riscv/riscv-v.cc |   2 +-
 gcc/doc/md.texi             |  16 ++--
 gcc/gimple-fold.cc          |   8 +-
 gcc/internal-fn.cc          | 156 ++++++++++++++++++------------------
 gcc/internal-fn.h           |   1 +
 gcc/tree-ssa-dse.cc         |  11 +--
 gcc/tree-vect-stmts.cc      |  11 +--
 8 files changed, 107 insertions(+), 106 deletions(-)

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index 1488f2be1be..4ab0e9f99eb 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -26,8 +26,8 @@
   [(match_operand:V 0 "register_operand")
    (match_operand:V 1 "memory_operand")
    (match_operand 2 "autovec_length_operand")
-   (match_operand:<VM> 3 "vector_mask_operand")
-   (match_operand 4 "const_0_operand")]
+   (match_operand 3 "const_0_operand")
+   (match_operand:<VM> 4 "vector_mask_operand")]
   "TARGET_VECTOR"
 {
   riscv_vector::expand_load_store (operands, true);
@@ -38,8 +38,8 @@
   [(match_operand:V 0 "memory_operand")
    (match_operand:V 1 "register_operand")
    (match_operand 2 "autovec_length_operand")
-   (match_operand:<VM> 3 "vector_mask_operand")
-   (match_operand 4 "const_0_operand")]
+   (match_operand 3 "const_0_operand")
+   (match_operand:<VM> 4 "vector_mask_operand")]
   "TARGET_VECTOR"
 {
   riscv_vector::expand_load_store (operands, false);
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index adb8d7d36a5..8d5bed7ebe4 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -2777,7 +2777,7 @@ expand_load_store (rtx *ops, bool is_load)
 {
   poly_int64 value;
   rtx len = ops[2];
-  rtx mask = ops[3];
+  rtx mask = ops[4];
   machine_mode mode = GET_MODE (ops[0]);
 
   if (poly_int_rtx_p (len, &value) && known_eq (value, GET_MODE_NUNITS (mode)))
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index cefdee84821..5e5482265cd 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -5302,15 +5302,15 @@ This pattern is not allowed to @code{FAIL}.
 @cindex @code{len_maskload@var{m}@var{n}} instruction pattern
 @item @samp{len_maskload@var{m}@var{n}}
 Perform a masked load from the memory location pointed to by operand 1
-into register operand 0.  (operand 2 + operand 4) elements are loaded from
+into register operand 0.  (operand 2 + operand 3) elements are loaded from
 memory and other elements in operand 0 are set to undefined values.
 This is a combination of len_load and maskload.
 Operands 0 and 1 have mode @var{m}, which must be a vector mode.  Operand 2
 has whichever integer mode the target prefers.  A mask is specified in
-operand 3 which must be of type @var{n}.  The mask has lower precedence than
+operand 4 which must be of type @var{n}.  The mask has lower precedence than
 the length and is itself subject to length masking,
-i.e. only mask indices < (operand 2 + operand 4) are used.
-Operand 4 conceptually has mode @code{QI}.
+i.e. only mask indices < (operand 2 + operand 3) are used.
+Operand 3 conceptually has mode @code{QI}.
 
 Operand 2 can be a variable or a constant amount.  Operand 4 specifies a
 constant bias: it is either a constant 0 or a constant -1.  The predicate on
@@ -5329,14 +5329,14 @@ This pattern is not allowed to @code{FAIL}.
 @cindex @code{len_maskstore@var{m}@var{n}} instruction pattern
 @item @samp{len_maskstore@var{m}@var{n}}
 Perform a masked store from vector register operand 1 into memory operand 0.
-(operand 2 + operand 4) elements are stored to memory
+(operand 2 + operand 3) elements are stored to memory
 and leave the other elements of operand 0 unchanged.
 This is a combination of len_store and maskstore.
 Operands 0 and 1 have mode @var{m}, which must be a vector mode.  Operand 2 has whichever
-integer mode the target prefers.  A mask is specified in operand 3 which must be
+integer mode the target prefers.  A mask is specified in operand 4 which must be
 of type @var{n}.  The mask has lower precedence than the length and is itself subject to
-length masking, i.e. only mask indices < (operand 2 + operand 4) are used.
-Operand 4 conceptually has mode @code{QI}.
+length masking, i.e. only mask indices < (operand 2 + operand 3) are used.
+Operand 3 conceptually has mode @code{QI}.
 
 Operand 2 can be a variable or a constant amount.  Operand 3 specifies a
 constant bias: it is either a constant 0 or a constant -1.  The predicate on
diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc
index 8434274f69d..4027ff71e10 100644
--- a/gcc/gimple-fold.cc
+++ b/gcc/gimple-fold.cc
@@ -5391,11 +5391,12 @@ gimple_fold_partial_load_store_mem_ref (gcall *call, tree vectype, bool mask_p)
     }
   else
     {
-      tree basic_len = gimple_call_arg (call, 2);
+      internal_fn ifn = gimple_call_internal_fn (call);
+      int len_index = internal_fn_len_index (ifn);
+      tree basic_len = gimple_call_arg (call, len_index);
       if (!poly_int_tree_p (basic_len))
 	return NULL_TREE;
-      unsigned int nargs = gimple_call_num_args (call);
-      tree bias = gimple_call_arg (call, nargs - 1);
+      tree bias = gimple_call_arg (call, len_index + 1);
       gcc_assert (TREE_CODE (bias) == INTEGER_CST);
       /* For LEN_LOAD/LEN_STORE/LEN_MASK_LOAD/LEN_MASK_STORE,
 	 we don't fold when (bias + len) != VF.  */
@@ -5405,7 +5406,6 @@ gimple_fold_partial_load_store_mem_ref (gcall *call, tree vectype, bool mask_p)
 
       /* For LEN_MASK_{LOAD,STORE}, we should also check whether
 	  the mask is all ones mask.  */
-      internal_fn ifn = gimple_call_internal_fn (call);
       if (ifn == IFN_LEN_MASK_LOAD || ifn == IFN_LEN_MASK_STORE)
 	{
 	  tree mask = gimple_call_arg (call, internal_fn_mask_index (ifn));
diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 9017176dc7a..c1fcb38b17b 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -165,7 +165,7 @@ init_internal_fns ()
 #define mask_load_lanes_direct { -1, -1, false }
 #define gather_load_direct { 3, 1, false }
 #define len_load_direct { -1, -1, false }
-#define len_maskload_direct { -1, 3, false }
+#define len_maskload_direct { -1, 4, false }
 #define mask_store_direct { 3, 2, false }
 #define store_lanes_direct { 0, 0, false }
 #define mask_store_lanes_direct { 0, 0, false }
@@ -173,7 +173,7 @@ init_internal_fns ()
 #define vec_cond_direct { 2, 0, false }
 #define scatter_store_direct { 3, 1, false }
 #define len_store_direct { 3, 3, false }
-#define len_maskstore_direct { 4, 3, false }
+#define len_maskstore_direct { 4, 5, false }
 #define vec_set_direct { 3, 3, false }
 #define unary_direct { 0, 0, true }
 #define unary_convert_direct { -1, 0, true }
@@ -293,6 +293,38 @@ get_multi_vector_move (tree array_type, convert_optab optab)
   return convert_optab_handler (optab, imode, vmode);
 }
 
+/* Add len and mask arguments according to the STMT.  */
+
+static unsigned int
+add_len_and_mask_args (expand_operand *ops, unsigned int opno, gcall *stmt)
+{
+  internal_fn ifn = gimple_call_internal_fn (stmt);
+  int len_index = internal_fn_len_index (ifn);
+  /* BIAS is always consecutive next of LEN.  */
+  int bias_index = len_index + 1;
+  int mask_index = internal_fn_mask_index (ifn);
+  /* The order of arguments are always {len,bias,mask}.  */
+  if (len_index >= 0)
+    {
+      tree len = gimple_call_arg (stmt, len_index);
+      rtx len_rtx = expand_normal (len);
+      create_convert_operand_from (&ops[opno++], len_rtx,
+				   TYPE_MODE (TREE_TYPE (len)),
+				   TYPE_UNSIGNED (TREE_TYPE (len)));
+      tree biast = gimple_call_arg (stmt, bias_index);
+      rtx bias = expand_normal (biast);
+      create_input_operand (&ops[opno++], bias, QImode);
+    }
+  if (mask_index >= 0)
+    {
+      tree mask = gimple_call_arg (stmt, mask_index);
+      rtx mask_rtx = expand_normal (mask);
+      create_input_operand (&ops[opno++], mask_rtx,
+			    TYPE_MODE (TREE_TYPE (mask)));
+    }
+  return opno;
+}
+
 /* Expand LOAD_LANES call STMT using optab OPTAB.  */
 
 static void
@@ -2879,14 +2911,15 @@ expand_call_mem_ref (tree type, gcall *stmt, int index)
  * OPTAB.  */
 
 static void
-expand_partial_load_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
+expand_partial_load_optab_fn (internal_fn ifn, gcall *stmt, convert_optab optab)
 {
+  int i = 0;
   class expand_operand ops[5];
-  tree type, lhs, rhs, maskt, biast;
-  rtx mem, target, mask, bias;
+  tree type, lhs, rhs, maskt;
+  rtx mem, target;
   insn_code icode;
 
-  maskt = gimple_call_arg (stmt, 2);
+  maskt = gimple_call_arg (stmt, internal_fn_mask_index (ifn));
   lhs = gimple_call_lhs (stmt);
   if (lhs == NULL_TREE)
     return;
@@ -2903,38 +2936,11 @@ expand_partial_load_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
 
   mem = expand_expr (rhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
   gcc_assert (MEM_P (mem));
-  mask = expand_normal (maskt);
   target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
-  create_output_operand (&ops[0], target, TYPE_MODE (type));
-  create_fixed_operand (&ops[1], mem);
-  if (optab == len_load_optab)
-    {
-      create_convert_operand_from (&ops[2], mask, TYPE_MODE (TREE_TYPE (maskt)),
-				   TYPE_UNSIGNED (TREE_TYPE (maskt)));
-      biast = gimple_call_arg (stmt, 3);
-      bias = expand_normal (biast);
-      create_input_operand (&ops[3], bias, QImode);
-      expand_insn (icode, 4, ops);
-    }
-  else if (optab == len_maskload_optab)
-    {
-      create_convert_operand_from (&ops[2], mask, TYPE_MODE (TREE_TYPE (maskt)),
-				   TYPE_UNSIGNED (TREE_TYPE (maskt)));
-      maskt = gimple_call_arg (stmt, 3);
-      mask = expand_normal (maskt);
-      create_input_operand (&ops[3], mask, TYPE_MODE (TREE_TYPE (maskt)));
-      icode = convert_optab_handler (optab, TYPE_MODE (type),
-				     TYPE_MODE (TREE_TYPE (maskt)));
-      biast = gimple_call_arg (stmt, 4);
-      bias = expand_normal (biast);
-      create_input_operand (&ops[4], bias, QImode);
-      expand_insn (icode, 5, ops);
-    }
-  else
-    {
-      create_input_operand (&ops[2], mask, TYPE_MODE (TREE_TYPE (maskt)));
-      expand_insn (icode, 3, ops);
-    }
+  create_output_operand (&ops[i++], target, TYPE_MODE (type));
+  create_fixed_operand (&ops[i++], mem);
+  i = add_len_and_mask_args (ops, i, stmt);
+  expand_insn (icode, i, ops);
 
   if (!rtx_equal_p (target, ops[0].value))
     emit_move_insn (target, ops[0].value);
@@ -2951,12 +2957,13 @@ expand_partial_load_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
 static void
 expand_partial_store_optab_fn (internal_fn ifn, gcall *stmt, convert_optab optab)
 {
+  int i = 0;
   class expand_operand ops[5];
-  tree type, lhs, rhs, maskt, biast;
-  rtx mem, reg, mask, bias;
+  tree type, lhs, rhs, maskt;
+  rtx mem, reg;
   insn_code icode;
 
-  maskt = gimple_call_arg (stmt, 2);
+  maskt = gimple_call_arg (stmt, internal_fn_mask_index (ifn));
   rhs = gimple_call_arg (stmt, internal_fn_stored_value_index (ifn));
   type = TREE_TYPE (rhs);
   lhs = expand_call_mem_ref (type, stmt, 0);
@@ -2971,37 +2978,11 @@ expand_partial_store_optab_fn (internal_fn ifn, gcall *stmt, convert_optab optab
 
   mem = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
   gcc_assert (MEM_P (mem));
-  mask = expand_normal (maskt);
   reg = expand_normal (rhs);
-  create_fixed_operand (&ops[0], mem);
-  create_input_operand (&ops[1], reg, TYPE_MODE (type));
-  if (optab == len_store_optab)
-    {
-      create_convert_operand_from (&ops[2], mask, TYPE_MODE (TREE_TYPE (maskt)),
-				   TYPE_UNSIGNED (TREE_TYPE (maskt)));
-      biast = gimple_call_arg (stmt, 4);
-      bias = expand_normal (biast);
-      create_input_operand (&ops[3], bias, QImode);
-      expand_insn (icode, 4, ops);
-    }
-  else if (optab == len_maskstore_optab)
-    {
-      create_convert_operand_from (&ops[2], mask, TYPE_MODE (TREE_TYPE (maskt)),
-				   TYPE_UNSIGNED (TREE_TYPE (maskt)));
-      maskt = gimple_call_arg (stmt, 3);
-      mask = expand_normal (maskt);
-      create_input_operand (&ops[3], mask, TYPE_MODE (TREE_TYPE (maskt)));
-      biast = gimple_call_arg (stmt, 5);
-      bias = expand_normal (biast);
-      create_input_operand (&ops[4], bias, QImode);
-      icode = convert_optab_handler (optab, TYPE_MODE (type), GET_MODE (mask));
-      expand_insn (icode, 5, ops);
-    }
-  else
-    {
-      create_input_operand (&ops[2], mask, TYPE_MODE (TREE_TYPE (maskt)));
-      expand_insn (icode, 3, ops);
-    }
+  create_fixed_operand (&ops[i++], mem);
+  create_input_operand (&ops[i++], reg, TYPE_MODE (type));
+  i = add_len_and_mask_args (ops, i, stmt);
+  expand_insn (icode, i, ops);
 }
 
 #define expand_mask_store_optab_fn expand_partial_store_optab_fn
@@ -4482,6 +4463,25 @@ internal_gather_scatter_fn_p (internal_fn fn)
     }
 }
 
+/* If FN takes a vector len argument, return the index of that argument,
+   otherwise return -1.  */
+
+int
+internal_fn_len_index (internal_fn fn)
+{
+  switch (fn)
+    {
+    case IFN_LEN_LOAD:
+    case IFN_LEN_STORE:
+    case IFN_LEN_MASK_LOAD:
+    case IFN_LEN_MASK_STORE:
+      return 2;
+
+    default:
+      return -1;
+    }
+}
+
 /* If FN takes a vector mask argument, return the index of that argument,
    otherwise return -1.  */
 
@@ -4498,11 +4498,9 @@ internal_fn_mask_index (internal_fn fn)
 
     case IFN_MASK_GATHER_LOAD:
     case IFN_MASK_SCATTER_STORE:
-      return 4;
-
     case IFN_LEN_MASK_LOAD:
     case IFN_LEN_MASK_STORE:
-      return 3;
+      return 4;
 
     default:
       return (conditional_internal_fn_code (fn) != ERROR_MARK
@@ -4522,12 +4520,14 @@ internal_fn_stored_value_index (internal_fn fn)
     case IFN_MASK_STORE_LANES:
     case IFN_SCATTER_STORE:
     case IFN_MASK_SCATTER_STORE:
-    case IFN_LEN_STORE:
       return 3;
 
-    case IFN_LEN_MASK_STORE:
+    case IFN_LEN_STORE:
       return 4;
 
+    case IFN_LEN_MASK_STORE:
+      return 5;
+
     default:
       return -1;
     }
@@ -4592,7 +4592,6 @@ internal_len_load_store_bias (internal_fn ifn, machine_mode mode)
 {
   optab optab = direct_internal_fn_optab (ifn);
   insn_code icode = direct_optab_handler (optab, mode);
-  int bias_opno = 3;
 
   if (icode == CODE_FOR_nothing)
     {
@@ -4610,15 +4609,14 @@ internal_len_load_store_bias (internal_fn ifn, machine_mode mode)
 	  optab = direct_internal_fn_optab (IFN_LEN_MASK_STORE);
 	}
       icode = convert_optab_handler (optab, mode, mask_mode);
-      bias_opno = 4;
     }
 
   if (icode != CODE_FOR_nothing)
     {
       /* For now we only support biases of 0 or -1.  Try both of them.  */
-      if (insn_operand_matches (icode, bias_opno, GEN_INT (0)))
+      if (insn_operand_matches (icode, 3, GEN_INT (0)))
 	return 0;
-      if (insn_operand_matches (icode, bias_opno, GEN_INT (-1)))
+      if (insn_operand_matches (icode, 3, GEN_INT (-1)))
 	return -1;
     }
 
diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h
index 8f21068e300..4234bbfed87 100644
--- a/gcc/internal-fn.h
+++ b/gcc/internal-fn.h
@@ -234,6 +234,7 @@ extern bool internal_load_fn_p (internal_fn);
 extern bool internal_store_fn_p (internal_fn);
 extern bool internal_gather_scatter_fn_p (internal_fn);
 extern int internal_fn_mask_index (internal_fn);
+extern int internal_fn_len_index (internal_fn);
 extern int internal_fn_stored_value_index (internal_fn);
 extern bool internal_gather_scatter_fn_supported_p (internal_fn, tree,
 						    tree, tree, int);
diff --git a/gcc/tree-ssa-dse.cc b/gcc/tree-ssa-dse.cc
index f8338037a61..9c6004cdce8 100644
--- a/gcc/tree-ssa-dse.cc
+++ b/gcc/tree-ssa-dse.cc
@@ -161,12 +161,13 @@ initialize_ao_ref_for_dse (gimple *stmt, ao_ref *write, bool may_def_ok = false)
 	case IFN_MASK_STORE:
 	case IFN_LEN_MASK_STORE:
 	  {
-	    int stored_value_index
-	      = internal_fn_stored_value_index (gimple_call_internal_fn (stmt));
-	    if (gimple_call_internal_fn (stmt) == IFN_LEN_STORE)
+	    internal_fn ifn = gimple_call_internal_fn (stmt);
+	    int stored_value_index = internal_fn_stored_value_index (ifn);
+	    int len_index = internal_fn_len_index (ifn);
+	    if (ifn == IFN_LEN_STORE)
 	      {
-		tree len = gimple_call_arg (stmt, 2);
-		tree bias = gimple_call_arg (stmt, 4);
+		tree len = gimple_call_arg (stmt, len_index);
+		tree bias = gimple_call_arg (stmt, len_index + 1);
 		if (tree_fits_uhwi_p (len))
 		  {
 		    ao_ref_init_from_ptr_and_size (write,
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 68faa8ead39..a0c39268bf0 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -9122,13 +9122,14 @@ vectorizable_store (vec_info *vinfo,
 		  if (partial_ifn == IFN_LEN_MASK_STORE)
 		    call = gimple_build_call_internal (IFN_LEN_MASK_STORE, 6,
 						       dataref_ptr, ptr,
-						       final_len, final_mask,
-						       vec_oprnd, bias);
+						       final_len, bias,
+						       final_mask, vec_oprnd);
 		  else
 		    call
 		      = gimple_build_call_internal (IFN_LEN_STORE, 5,
-						    dataref_ptr, ptr, final_len,
-						    vec_oprnd, bias);
+						    dataref_ptr, ptr,
+						    final_len, bias,
+						    vec_oprnd);
 		  gimple_call_set_nothrow (call, true);
 		  vect_finish_stmt_generation (vinfo, stmt_info, call, gsi);
 		  new_stmt = call;
@@ -10523,7 +10524,7 @@ vectorizable_load (vec_info *vinfo,
 			  call = gimple_build_call_internal (IFN_LEN_MASK_LOAD,
 							     5, dataref_ptr,
 							     ptr, final_len,
-							     final_mask, bias);
+							     bias, final_mask);
 			else
 			  call = gimple_build_call_internal (IFN_LEN_LOAD, 4,
 							     dataref_ptr, ptr,
-- 
2.36.1



* Re: [PATCH V2] Middle-end: Change order of LEN_MASK_LOAD/LEN_MASK_STORE arguments
  2023-07-03  9:10 [PATCH V2] Middle-end: Change order of LEN_MASK_LOAD/LEN_MASK_STORE arguments juzhe.zhong
@ 2023-07-03  9:17 ` Robin Dapp
  2023-07-03  9:20   ` juzhe.zhong
  2023-07-03  9:24   ` juzhe.zhong
  2023-07-03  9:26 ` Richard Sandiford
  1 sibling, 2 replies; 7+ messages in thread
From: Robin Dapp @ 2023-07-03  9:17 UTC (permalink / raw)
  To: juzhe.zhong, gcc-patches
  Cc: rdapp.gcc, richard.sandiford, rguenther, linkw, krebbel

Hi Juzhe,

when changing the argument order for LEN_LOAD/LEN_STORE, you will also
need to adjust rs6000's and s390's expanders. 

Regards
 Robin


* Re: Re: [PATCH V2] Middle-end: Change order of LEN_MASK_LOAD/LEN_MASK_STORE arguments
  2023-07-03  9:17 ` Robin Dapp
@ 2023-07-03  9:20   ` juzhe.zhong
  2023-07-03  9:24   ` juzhe.zhong
  1 sibling, 0 replies; 7+ messages in thread
From: juzhe.zhong @ 2023-07-03  9:20 UTC (permalink / raw)
  To: Robin Dapp, gcc-patches
  Cc: Robin Dapp, richard.sandiford, rguenther, linkw, krebbel


No, we don't need to.

The len_load/len_store optabs in the s390 backend:

(define_expand "len_load_v16qi"
  [(match_operand:V16QI 0 "register_operand")
   (match_operand:V16QI 1 "memory_operand")
   (match_operand:QI 2 "register_operand")
   (match_operand:QI 3 "vll_bias_operand")
  ]
  "TARGET_VX && TARGET_64BIT"
{
  rtx mem = adjust_address (operands[1], BLKmode, 0);

  rtx len = gen_reg_rtx (SImode);
  emit_move_insn (len, gen_rtx_ZERO_EXTEND (SImode, operands[2]));
  emit_insn (gen_vllv16qi (operands[0], len, mem));
  DONE;
})

(define_expand "len_store_v16qi"
  [(match_operand:V16QI 0 "memory_operand")
   (match_operand:V16QI 1 "register_operand")
   (match_operand:QI 2 "register_operand")
   (match_operand:QI 3 "vll_bias_operand")
  ]
  "TARGET_VX && TARGET_64BIT"
{
  rtx mem = adjust_address (operands[0], BLKmode, 0);

  rtx len = gen_reg_rtx (SImode);
  emit_move_insn (len, gen_rtx_ZERO_EXTEND (SImode, operands[2]));
  emit_insn (gen_vstlv16qi (operands[1], len, mem));
  DONE;
});;

These are already in the correct order {len,bias}.  Only the GIMPLE IR needs to be adjusted.
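
For example, only the GIMPLE call changes (placeholder argument names):

Before: LEN_STORE (ptr, align, len, v, bias)
After:  LEN_STORE (ptr, align, len, bias, v)

The len_store RTL operands (mem, value, len, bias) stay as they are.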

I have already tested the len_load/len_store optabs.

Thanks.


juzhe.zhong@rivai.ai
 
From: Robin Dapp
Date: 2023-07-03 17:17
To: juzhe.zhong; gcc-patches
CC: rdapp.gcc; richard.sandiford; rguenther; linkw; krebbel
Subject: Re: [PATCH V2] Middle-end: Change order of LEN_MASK_LOAD/LEN_MASK_STORE arguments
Hi Juzhe,
 
when changing the argument order for LEN_LOAD/LEN_STORE, you will also
need to adjust rs6000's and s390's expanders. 
 
Regards
Robin
 


* Re: Re: [PATCH V2] Middle-end: Change order of LEN_MASK_LOAD/LEN_MASK_STORE arguments
  2023-07-03  9:17 ` Robin Dapp
  2023-07-03  9:20   ` juzhe.zhong
@ 2023-07-03  9:24   ` juzhe.zhong
  2023-07-03  9:32     ` Robin Dapp
  1 sibling, 1 reply; 7+ messages in thread
From: juzhe.zhong @ 2023-07-03  9:24 UTC (permalink / raw)
  To: Robin Dapp, gcc-patches
  Cc: Robin Dapp, richard.sandiford, rguenther, linkw, krebbel


Take mask_store/MASK_STORE as an example.

In gimple IR: MASK_STORE (ptr, align, mask, v)
In maskstore RTL IR: maskstore (ptr, v, mask)

For LEN_STORE/len_store, after the adjustment:

In gimple IR: LEN_STORE (ptr, align, len, bias, v)
In len_store RTL IR len_store (ptr, v, len, bias)

Similarly for LEN_MASK_LOAD/LEN_MASK_STORE: the orders are now consistent after this patch.
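
After this patch (operand names are just placeholders):

In gimple IR: LEN_MASK_LOAD (ptr, align, len, bias, mask)
In len_maskload RTL IR: len_maskload (dest, ptr, len, bias, mask)

In gimple IR: LEN_MASK_STORE (ptr, align, len, bias, mask, v)
In len_maskstore RTL IR: len_maskstore (ptr, v, len, bias, mask)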

Thanks.


juzhe.zhong@rivai.ai
 
From: Robin Dapp
Date: 2023-07-03 17:17
To: juzhe.zhong; gcc-patches
CC: rdapp.gcc; richard.sandiford; rguenther; linkw; krebbel
Subject: Re: [PATCH V2] Middle-end: Change order of LEN_MASK_LOAD/LEN_MASK_STORE arguments
Hi Juzhe,
 
when changing the argument order for LEN_LOAD/LEN_STORE, you will also
need to adjust rs6000's and s390's expanders. 
 
Regards
Robin
 


* Re: [PATCH V2] Middle-end: Change order of LEN_MASK_LOAD/LEN_MASK_STORE arguments
  2023-07-03  9:10 [PATCH V2] Middle-end: Change order of LEN_MASK_LOAD/LEN_MASK_STORE arguments juzhe.zhong
  2023-07-03  9:17 ` Robin Dapp
@ 2023-07-03  9:26 ` Richard Sandiford
  2023-07-03 14:28   ` Li, Pan2
  1 sibling, 1 reply; 7+ messages in thread
From: Richard Sandiford @ 2023-07-03  9:26 UTC (permalink / raw)
  To: juzhe.zhong; +Cc: gcc-patches, rguenther

juzhe.zhong@rivai.ai writes:
> From: Ju-Zhe Zhong <juzhe.zhong@rivai.ai>
>
> Hi Richard, I have fixed the order as you suggested.
>
> Before this patch, the order is {len,mask,bias}.
>
> Now, after this patch, the order becomes {len,bias,mask}.
>
> Since you said we should not need 'internal_fn_bias_index', the bias index should always be the len index + 1.
> I noticed that the LEN_STORE order is {len,vector,bias}; to make them consistent, I reordered it into LEN_STORE {len,bias,vector},
> just like MASK_STORE {mask,vector}.
>
> Ok for trunk?
>
> gcc/ChangeLog:
>
>         * config/riscv/autovec.md: Change order of LEN_MASK_LOAD/LEN_MASK_STORE/LEN_LOAD/LEN_STORE arguments.
>         * config/riscv/riscv-v.cc (expand_load_store): Ditto.
>         * doc/md.texi: Ditto.
>         * gimple-fold.cc (gimple_fold_partial_load_store_mem_ref): Ditto.
>         * internal-fn.cc (len_maskload_direct): Ditto.
>         (len_maskstore_direct): Ditto.
>         (add_len_and_mask_args): New function.
>         (expand_partial_load_optab_fn): Change order of LEN_MASK_LOAD/LEN_MASK_STORE/LEN_LOAD/LEN_STORE arguments.
>         (expand_partial_store_optab_fn): Ditto.
>         (internal_fn_len_index): New function.
>         (internal_fn_mask_index): Change order of LEN_MASK_LOAD/LEN_MASK_STORE/LEN_LOAD/LEN_STORE arguments.
>         (internal_fn_stored_value_index): Ditto.
>         (internal_len_load_store_bias): Ditto.
>         * internal-fn.h (internal_fn_len_index): New function.
>         * tree-ssa-dse.cc (initialize_ao_ref_for_dse): Change order of LEN_MASK_LOAD/LEN_MASK_STORE/LEN_LOAD/LEN_STORE arguments.
>         * tree-vect-stmts.cc (vectorizable_store): Ditto.
>         (vectorizable_load): Ditto.

OK, thanks.

Richard


* Re: [PATCH V2] Middle-end: Change order of LEN_MASK_LOAD/LEN_MASK_STORE arguments
  2023-07-03  9:24   ` juzhe.zhong
@ 2023-07-03  9:32     ` Robin Dapp
  0 siblings, 0 replies; 7+ messages in thread
From: Robin Dapp @ 2023-07-03  9:32 UTC (permalink / raw)
  To: juzhe.zhong, gcc-patches
  Cc: rdapp.gcc, richard.sandiford, rguenther, linkw, krebbel


> Similar to LEN_MASK_LOAD/STORE, their orders are consistent now after
> this patch.
Ah right, apologies.

Regards
 Robin


* RE: [PATCH V2] Middle-end: Change order of LEN_MASK_LOAD/LEN_MASK_STORE arguments
  2023-07-03  9:26 ` Richard Sandiford
@ 2023-07-03 14:28   ` Li, Pan2
  0 siblings, 0 replies; 7+ messages in thread
From: Li, Pan2 @ 2023-07-03 14:28 UTC (permalink / raw)
  To: Richard Sandiford, juzhe.zhong; +Cc: gcc-patches, rguenther

Committed, as it passed both the bootstrap and regression tests.  Thanks Richard.

Pan

-----Original Message-----
From: Gcc-patches <gcc-patches-bounces+pan2.li=intel.com@gcc.gnu.org> On Behalf Of Richard Sandiford via Gcc-patches
Sent: Monday, July 3, 2023 5:27 PM
To: juzhe.zhong@rivai.ai
Cc: gcc-patches@gcc.gnu.org; rguenther@suse.de
Subject: Re: [PATCH V2] Middle-end: Change order of LEN_MASK_LOAD/LEN_MASK_STORE arguments

juzhe.zhong@rivai.ai writes:
> From: Ju-Zhe Zhong <juzhe.zhong@rivai.ai>
>
> Hi Richard, I have fixed the order as you suggested.
>
> Before this patch, the order is {len,mask,bias}.
>
> Now, after this patch, the order becomes {len,bias,mask}.
>
> Since you said we should not need 'internal_fn_bias_index', the bias index should always be the len index + 1.
> I noticed that the LEN_STORE order is {len,vector,bias}; to make them consistent, I reordered it into LEN_STORE {len,bias,vector},
> just like MASK_STORE {mask,vector}.
>
> Ok for trunk?
>
> gcc/ChangeLog:
>
>         * config/riscv/autovec.md: Change order of LEN_MASK_LOAD/LEN_MASK_STORE/LEN_LOAD/LEN_STORE arguments.
>         * config/riscv/riscv-v.cc (expand_load_store): Ditto.
>         * doc/md.texi: Ditto.
>         * gimple-fold.cc (gimple_fold_partial_load_store_mem_ref): Ditto.
>         * internal-fn.cc (len_maskload_direct): Ditto.
>         (len_maskstore_direct): Ditto.
>         (add_len_and_mask_args): New function.
>         (expand_partial_load_optab_fn): Change order of LEN_MASK_LOAD/LEN_MASK_STORE/LEN_LOAD/LEN_STORE arguments.
>         (expand_partial_store_optab_fn): Ditto.
>         (internal_fn_len_index): New function.
>         (internal_fn_mask_index): Change order of LEN_MASK_LOAD/LEN_MASK_STORE/LEN_LOAD/LEN_STORE arguments.
>         (internal_fn_stored_value_index): Ditto.
>         (internal_len_load_store_bias): Ditto.
>         * internal-fn.h (internal_fn_len_index): New function.
>         * tree-ssa-dse.cc (initialize_ao_ref_for_dse): Change order of LEN_MASK_LOAD/LEN_MASK_STORE/LEN_LOAD/LEN_STORE arguments.
>         * tree-vect-stmts.cc (vectorizable_store): Ditto.
>         (vectorizable_load): Ditto.

OK, thanks.

Richard

> +  expand_insn (icode, i, ops);
>  }
>  
>  #define expand_mask_store_optab_fn expand_partial_store_optab_fn
> @@ -4482,6 +4463,25 @@ internal_gather_scatter_fn_p (internal_fn fn)
>      }
>  }
>  
> +/* If FN takes a vector len argument, return the index of that argument,
> +   otherwise return -1.  */
> +
> +int
> +internal_fn_len_index (internal_fn fn)
> +{
> +  switch (fn)
> +    {
> +    case IFN_LEN_LOAD:
> +    case IFN_LEN_STORE:
> +    case IFN_LEN_MASK_LOAD:
> +    case IFN_LEN_MASK_STORE:
> +      return 2;
> +
> +    default:
> +      return -1;
> +    }
> +}
> +
>  /* If FN takes a vector mask argument, return the index of that argument,
>     otherwise return -1.  */
>  
> @@ -4498,11 +4498,9 @@ internal_fn_mask_index (internal_fn fn)
>  
>      case IFN_MASK_GATHER_LOAD:
>      case IFN_MASK_SCATTER_STORE:
> -      return 4;
> -
>      case IFN_LEN_MASK_LOAD:
>      case IFN_LEN_MASK_STORE:
> -      return 3;
> +      return 4;
>  
>      default:
>        return (conditional_internal_fn_code (fn) != ERROR_MARK
> @@ -4522,12 +4520,14 @@ internal_fn_stored_value_index (internal_fn fn)
>      case IFN_MASK_STORE_LANES:
>      case IFN_SCATTER_STORE:
>      case IFN_MASK_SCATTER_STORE:
> -    case IFN_LEN_STORE:
>        return 3;
>  
> -    case IFN_LEN_MASK_STORE:
> +    case IFN_LEN_STORE:
>        return 4;
>  
> +    case IFN_LEN_MASK_STORE:
> +      return 5;
> +
>      default:
>        return -1;
>      }
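
To make the new convention easy to check, here is a standalone sketch (not
GCC code: the enum and function names are stand-ins for the internal-fn
helpers above) encoding the argument indices this patch establishes -- len at
2, bias at 3, mask at 4, and the stored value at 4 (LEN_STORE) or 5
(LEN_MASK_STORE):

/* index-sketch.cc -- standalone illustration, not GCC internals.  */
#include <cassert>

enum ifn { LEN_LOAD, LEN_STORE, LEN_MASK_LOAD, LEN_MASK_STORE };

static int
len_index (ifn fn)
{
  switch (fn)
    {
    case LEN_LOAD:
    case LEN_STORE:
    case LEN_MASK_LOAD:
    case LEN_MASK_STORE:
      return 2;
    }
  return -1;
}

/* The bias always immediately follows the len; only meaningful for the
   LEN_* functions above.  */
static int
bias_index (ifn fn)
{
  return len_index (fn) + 1;
}

static int
mask_index (ifn fn)
{
  switch (fn)
    {
    case LEN_MASK_LOAD:
    case LEN_MASK_STORE:
      return 4;
    default:
      return -1;
    }
}

static int
stored_value_index (ifn fn)
{
  switch (fn)
    {
    case LEN_STORE:
      return 4;
    case LEN_MASK_STORE:
      return 5;
    default:
      return -1;
    }
}

int
main ()
{
  assert (len_index (LEN_MASK_STORE) == 2);
  assert (bias_index (LEN_MASK_STORE) == 3);
  assert (mask_index (LEN_MASK_STORE) == 4);
  assert (stored_value_index (LEN_MASK_STORE) == 5);
  assert (stored_value_index (LEN_STORE) == 4);
  return 0;
}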
> @@ -4592,7 +4592,6 @@ internal_len_load_store_bias (internal_fn ifn, machine_mode mode)
>  {
>    optab optab = direct_internal_fn_optab (ifn);
>    insn_code icode = direct_optab_handler (optab, mode);
> -  int bias_opno = 3;
>  
>    if (icode == CODE_FOR_nothing)
>      {
> @@ -4610,15 +4609,14 @@ internal_len_load_store_bias (internal_fn ifn, machine_mode mode)
>  	  optab = direct_internal_fn_optab (IFN_LEN_MASK_STORE);
>  	}
>        icode = convert_optab_handler (optab, mode, mask_mode);
> -      bias_opno = 4;
>      }
>  
>    if (icode != CODE_FOR_nothing)
>      {
>        /* For now we only support biases of 0 or -1.  Try both of them.  */
> -      if (insn_operand_matches (icode, bias_opno, GEN_INT (0)))
> +      if (insn_operand_matches (icode, 3, GEN_INT (0)))
>  	return 0;
> -      if (insn_operand_matches (icode, bias_opno, GEN_INT (-1)))
> +      if (insn_operand_matches (icode, 3, GEN_INT (-1)))
>  	return -1;
>      }
>  
> diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h
> index 8f21068e300..4234bbfed87 100644
> --- a/gcc/internal-fn.h
> +++ b/gcc/internal-fn.h
> @@ -234,6 +234,7 @@ extern bool internal_load_fn_p (internal_fn);
>  extern bool internal_store_fn_p (internal_fn);
>  extern bool internal_gather_scatter_fn_p (internal_fn);
>  extern int internal_fn_mask_index (internal_fn);
> +extern int internal_fn_len_index (internal_fn);
>  extern int internal_fn_stored_value_index (internal_fn);
>  extern bool internal_gather_scatter_fn_supported_p (internal_fn, tree,
>  						    tree, tree, int);
> diff --git a/gcc/tree-ssa-dse.cc b/gcc/tree-ssa-dse.cc
> index f8338037a61..9c6004cdce8 100644
> --- a/gcc/tree-ssa-dse.cc
> +++ b/gcc/tree-ssa-dse.cc
> @@ -161,12 +161,13 @@ initialize_ao_ref_for_dse (gimple *stmt, ao_ref *write, bool may_def_ok = false)
>  	case IFN_MASK_STORE:
>  	case IFN_LEN_MASK_STORE:
>  	  {
> -	    int stored_value_index
> -	      = internal_fn_stored_value_index (gimple_call_internal_fn (stmt));
> -	    if (gimple_call_internal_fn (stmt) == IFN_LEN_STORE)
> +	    internal_fn ifn = gimple_call_internal_fn (stmt);
> +	    int stored_value_index = internal_fn_stored_value_index (ifn);
> +	    int len_index = internal_fn_len_index (ifn);
> +	    if (ifn == IFN_LEN_STORE)
>  	      {
> -		tree len = gimple_call_arg (stmt, 2);
> -		tree bias = gimple_call_arg (stmt, 4);
> +		tree len = gimple_call_arg (stmt, len_index);
> +		tree bias = gimple_call_arg (stmt, len_index + 1);
>  		if (tree_fits_uhwi_p (len))
>  		  {
>  		    ao_ref_init_from_ptr_and_size (write,
> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> index 68faa8ead39..a0c39268bf0 100644
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -9122,13 +9122,14 @@ vectorizable_store (vec_info *vinfo,
>  		  if (partial_ifn == IFN_LEN_MASK_STORE)
>  		    call = gimple_build_call_internal (IFN_LEN_MASK_STORE, 6,
>  						       dataref_ptr, ptr,
> -						       final_len, final_mask,
> -						       vec_oprnd, bias);
> +						       final_len, bias,
> +						       final_mask, vec_oprnd);
>  		  else
>  		    call
>  		      = gimple_build_call_internal (IFN_LEN_STORE, 5,
> -						    dataref_ptr, ptr, final_len,
> -						    vec_oprnd, bias);
> +						    dataref_ptr, ptr,
> +						    final_len, bias,
> +						    vec_oprnd);
>  		  gimple_call_set_nothrow (call, true);
>  		  vect_finish_stmt_generation (vinfo, stmt_info, call, gsi);
>  		  new_stmt = call;
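
In other words, using the argument names from the hunk above, the store calls
built by the vectorizer change as follows:

  before: LEN_MASK_STORE (dataref_ptr, ptr, final_len, final_mask, vec_oprnd, bias)
          LEN_STORE      (dataref_ptr, ptr, final_len, vec_oprnd, bias)
  after:  LEN_MASK_STORE (dataref_ptr, ptr, final_len, bias, final_mask, vec_oprnd)
          LEN_STORE      (dataref_ptr, ptr, final_len, bias, vec_oprnd)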
> @@ -10523,7 +10524,7 @@ vectorizable_load (vec_info *vinfo,
>  			  call = gimple_build_call_internal (IFN_LEN_MASK_LOAD,
>  							     5, dataref_ptr,
>  							     ptr, final_len,
> -							     final_mask, bias);
> +							     bias, final_mask);
>  			else
>  			  call = gimple_build_call_internal (IFN_LEN_LOAD, 4,
>  							     dataref_ptr, ptr,
