public inbox for gcc-patches@gcc.gnu.org
From: "yangyang (ET)" <yangyang305@huawei.com>
To: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Cc: Richard Sandiford <richard.sandiford@arm.com>
Subject: [PATCH 1/5] [PR target/96342] Change field "simdlen" into poly_uint64
Date: Fri, 30 Oct 2020 02:29:17 +0000
Message-ID: <e8a9583ef1ca4b8d9b85167f6ad7b400@huawei.com>

[-- Attachment #1: Type: text/plain, Size: 1185 bytes --]

Hi,

    This is the first patch for PR96342.

    To support generating SVE functions for "omp declare simd", this patch changes the type of the field "simdlen" of struct cgraph_simd_clone from unsigned int to poly_uint64. The fields "vecsize_int" and "vecsize_float" are changed in the same way.

    Although Richard mentioned in the PR that poly_uint64 will naturally decay to a uint64_t in the i386 target files, operator /= does not seem to be supported yet, so I changed "clonei->simdlen /= GET_MODE_BITSIZE (TYPE_MODE (base_type));" into "clonei->simdlen = clonei->simdlen / GET_MODE_BITSIZE (TYPE_MODE (base_type));". I also added calls to to_constant () in the printf-style warning arguments so that bootstrap passes. However, I am not sure whether these are the best ways to do this. Any suggestions?
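
    For readers less familiar with the predicates used in this patch (known_eq, maybe_ne, is_constant), the following is a minimal sketch of the idea behind them. It is a toy model and not GCC's poly_int: the names toy_poly, toy_known_eq, toy_maybe_ne and toy_can_div_p are hypothetical and exist only for this illustration. The assumption is that a length has the form c0 + c1 * x, where x is a non-negative value known only at run time.

```cpp
// Toy model only: NOT GCC's poly_int.  A value of the form c0 + c1 * x,
// where x >= 0 is a quantity that is unknown at compile time.
#include <cassert>
#include <cstdint>

struct toy_poly
{
  uint64_t c0;  // constant part
  uint64_t c1;  // multiplier of the runtime quantity x

  // Return true, and store the value in *out, only when the value has
  // no runtime part.
  bool is_constant (uint64_t *out) const
  {
    if (c1 != 0)
      return false;
    *out = c0;
    return true;
  }
};

// True when a and b are equal for every possible x.
inline bool toy_known_eq (toy_poly a, toy_poly b)
{
  return a.c0 == b.c0 && a.c1 == b.c1;
}

// True when a and b differ for at least one possible x.
inline bool toy_maybe_ne (toy_poly a, toy_poly b)
{
  return !toy_known_eq (a, b);
}

// Divide by a constant only when every coefficient divides exactly;
// otherwise report failure and leave *out untouched.
inline bool toy_can_div_p (toy_poly a, uint64_t d, toy_poly *out)
{
  if (d == 0 || a.c0 % d != 0 || a.c1 % d != 0)
    return false;
  *out = { a.c0 / d, a.c1 / d };
  return true;
}
```

    In this model a fixed-width length such as {128, 0} is constant, while an SVE-like length such as {128, 128} is not, which is why the aarch64 hunks below only run the range checks after is_constant () succeeds.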

    Richard also suggested defining a new macro to calculate a vector multiple instead of using constant_multiple_p in the part2 patch. I found similar situations in the part1 patch as well, so I introduce the macro in part1. I couldn't think of a better name, so I used "vector_unroll_factor".
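
    As an illustration of how the macro is used in omp-simd-clone.c, here is a small sketch with plain unsigned integers. The helpers unroll_factor, count_old and count_new are hypothetical stand-ins written for this example; they only show that counting whole vectors ("j = 1; j < k; j++") visits the same number of iterations as the old element-offset loop ("j = veclen; j < simdlen; j += veclen"), assuming simdlen is an exact multiple of veclen.

```cpp
#include <cassert>

// Hypothetical stand-in for the patch's vector_unroll_factor, restricted
// to plain integers: how many vectors of nelts2 elements cover nelts1
// elements.  The division is required to be exact.
inline unsigned unroll_factor (unsigned nelts1, unsigned nelts2)
{
  assert (nelts2 != 0 && nelts1 % nelts2 == 0);
  return nelts1 / nelts2;
}

// Old loop shape: step through element offsets.
inline unsigned count_old (unsigned simdlen, unsigned veclen)
{
  unsigned n = 0;
  for (unsigned j = veclen; j < simdlen; j += veclen)
    n++;
  return n;
}

// New loop shape: count whole vectors, so the loop bound is a plain
// constant even if simdlen and veclen themselves were not.
inline unsigned count_new (unsigned simdlen, unsigned veclen)
{
  unsigned n = 0;
  unsigned k = unroll_factor (simdlen, veclen);
  for (unsigned j = 1; j < k; j++)
    n++;
  return n;
}
```

    The design point is that the ratio of two lengths can be a compile-time constant even when neither length is, so loops are driven by the ratio rather than by the lengths.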

    Bootstrapped and tested on both aarch64 and x86 Linux platforms; no new regressions observed.

        Ok for trunk?

Thanks,
Yang Yang

[-- Attachment #2: PR96342-part1-v1.patch --]
[-- Type: application/octet-stream, Size: 23047 bytes --]

From d01d655e3ebcaa76bec21f2c62b304c4ff9d8f56 Mon Sep 17 00:00:00 2001
From: Yang Yang <yangyang305@huawei.com>
Date: Thu, 29 Oct 2020 02:50:40 +0800
Subject: [PATCH] PR target/96342 Change field "simdlen" into poly_uint64

This is the first patch for PR96342. To add support for SVE
functions in "omp declare simd", change the type of the field
"simdlen" of struct cgraph_simd_clone from unsigned int to
poly_uint64 and adapt its users, since the length might be variable
in the SVE case.

2020-10-30  Yang Yang  <yangyang305@huawei.com>

gcc/ChangeLog:

	* cgraph.h (struct cgraph_simd_clone): Change fields "simdlen",
	"vecsize_int" and "vecsize_float" from unsigned int to poly_uint64.
	* config/aarch64/aarch64.c
	(aarch64_simd_clone_compute_vecsize_and_simdlen): Adapt
	operations on "simdlen".
	* config/i386/i386.c (ix86_simd_clone_compute_vecsize_and_simdlen):
	Update printf formats.
	* gengtype.c (main): Handle poly_uint64.
	* omp-simd-clone.c (simd_clone_mangle): Likewise.
	(simd_clone_adjust_return_type): Likewise.
	(create_tmp_simd_array): Likewise.
	(simd_clone_adjust_argument_types): Likewise.
	(simd_clone_init_simd_arrays): Likewise.
	(ipa_simd_modify_function_body): Likewise.
	(simd_clone_adjust): Likewise.
	(expand_simd_clones): Likewise.
	* poly-int-types.h (vector_unroll_factor): New macro.
	* tree-vect-stmts.c (vectorizable_simd_clone_call): Handle
	poly_uint64.
---
 gcc/cgraph.h                 |  6 +--
 gcc/config/aarch64/aarch64.c | 30 +++++++++------
 gcc/config/i386/i386.c       |  8 ++--
 gcc/gengtype.c               |  1 +
 gcc/omp-simd-clone.c         | 72 ++++++++++++++++++++++--------------
 gcc/poly-int-types.h         |  8 ++++
 gcc/tree-vect-stmts.c        | 59 +++++++++++++++--------------
 7 files changed, 112 insertions(+), 72 deletions(-)

diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index 96d6cf609fe..9dc886cc58a 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -826,17 +826,17 @@ struct GTY(()) cgraph_simd_clone_arg {
 
 struct GTY(()) cgraph_simd_clone {
   /* Number of words in the SIMD lane associated with this clone.  */
-  unsigned int simdlen;
+  poly_uint64 simdlen;
 
   /* Number of annotated function arguments in `args'.  This is
      usually the number of named arguments in FNDECL.  */
   unsigned int nargs;
 
   /* Max hardware vector size in bits for integral vectors.  */
-  unsigned int vecsize_int;
+  poly_uint64 vecsize_int;
 
   /* Max hardware vector size in bits for floating point vectors.  */
-  unsigned int vecsize_float;
+  poly_uint64 vecsize_float;
 
   /* Machine mode of the mask argument(s), if they are to be passed
      as bitmasks in integer argument(s).  VOIDmode if masks are passed
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index a8cc545c370..c630c0c7f81 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -23044,18 +23044,23 @@ aarch64_simd_clone_compute_vecsize_and_simdlen (struct cgraph_node *node,
 					tree base_type, int num)
 {
   tree t, ret_type, arg_type;
-  unsigned int elt_bits, vec_bits, count;
+  unsigned int elt_bits, count;
+  unsigned HOST_WIDE_INT const_simdlen;
+  poly_uint64 vec_bits;
 
   if (!TARGET_SIMD)
     return 0;
 
-  if (clonei->simdlen
-      && (clonei->simdlen < 2
-	  || clonei->simdlen > 1024
-	  || (clonei->simdlen & (clonei->simdlen - 1)) != 0))
+  /* For now, SVE simdclones won't produce illegal simdlen, so only check
+     const simdlens here.  */
+  if (maybe_ne (clonei->simdlen, 0U)
+      && (clonei->simdlen.is_constant (&const_simdlen))
+      && (const_simdlen < 2
+	  || const_simdlen > 1024
+	  || (const_simdlen & (const_simdlen - 1)) != 0))
     {
       warning_at (DECL_SOURCE_LOCATION (node->decl), 0,
-		  "unsupported simdlen %d", clonei->simdlen);
+		  "unsupported simdlen %wd", const_simdlen);
       return 0;
     }
 
@@ -23099,21 +23104,24 @@ aarch64_simd_clone_compute_vecsize_and_simdlen (struct cgraph_node *node,
   clonei->vecsize_mangle = 'n';
   clonei->mask_mode = VOIDmode;
   elt_bits = GET_MODE_BITSIZE (SCALAR_TYPE_MODE (base_type));
-  if (clonei->simdlen == 0)
+  if (known_eq (clonei->simdlen, 0U))
     {
       count = 2;
       vec_bits = (num == 0 ? 64 : 128);
-      clonei->simdlen = vec_bits / elt_bits;
+      clonei->simdlen = exact_div (vec_bits, elt_bits);
     }
   else
     {
       count = 1;
       vec_bits = clonei->simdlen * elt_bits;
-      if (vec_bits != 64 && vec_bits != 128)
+      /* For now, SVE simdclones won't produce illegal simdlen, so only check
+	 const simdlens here.  */
+      if (clonei->simdlen.is_constant (&const_simdlen)
+	  && known_ne (vec_bits, 64U) && known_ne (vec_bits, 128U))
 	{
 	  warning_at (DECL_SOURCE_LOCATION (node->decl), 0,
-		      "GCC does not currently support simdlen %d for type %qT",
-		      clonei->simdlen, base_type);
+		      "GCC does not currently support simdlen %wd for type %qT",
+		      const_simdlen, base_type);
 	  return 0;
 	}
     }
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 54c2cdaf060..0ef037e5e55 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -22140,7 +22140,7 @@ ix86_simd_clone_compute_vecsize_and_simdlen (struct cgraph_node *node,
 	  || (clonei->simdlen & (clonei->simdlen - 1)) != 0))
     {
       warning_at (DECL_SOURCE_LOCATION (node->decl), 0,
-		  "unsupported simdlen %d", clonei->simdlen);
+		  "unsupported simdlen %ld", clonei->simdlen.to_constant ());
       return 0;
     }
 
@@ -22245,7 +22245,8 @@ ix86_simd_clone_compute_vecsize_and_simdlen (struct cgraph_node *node,
 	clonei->simdlen = clonei->vecsize_int;
       else
 	clonei->simdlen = clonei->vecsize_float;
-      clonei->simdlen /= GET_MODE_BITSIZE (TYPE_MODE (base_type));
+      clonei->simdlen = clonei->simdlen
+			/ GET_MODE_BITSIZE (TYPE_MODE (base_type));
     }
   else if (clonei->simdlen > 16)
     {
@@ -22267,7 +22268,8 @@ ix86_simd_clone_compute_vecsize_and_simdlen (struct cgraph_node *node,
       if (cnt > (TARGET_64BIT ? 16 : 8))
 	{
 	  warning_at (DECL_SOURCE_LOCATION (node->decl), 0,
-		      "unsupported simdlen %d", clonei->simdlen);
+		      "unsupported simdlen %ld",
+		      clonei->simdlen.to_constant ());
 	  return 0;
 	}
       }
diff --git a/gcc/gengtype.c b/gcc/gengtype.c
index a59a8823f82..919e3a00bf2 100644
--- a/gcc/gengtype.c
+++ b/gcc/gengtype.c
@@ -5198,6 +5198,7 @@ main (int argc, char **argv)
       POS_HERE (do_scalar_typedef ("widest_int", &pos));
       POS_HERE (do_scalar_typedef ("int64_t", &pos));
       POS_HERE (do_scalar_typedef ("poly_int64", &pos));
+      POS_HERE (do_scalar_typedef ("poly_uint64", &pos));
       POS_HERE (do_scalar_typedef ("uint64_t", &pos));
       POS_HERE (do_scalar_typedef ("uint8", &pos));
       POS_HERE (do_scalar_typedef ("uintptr_t", &pos));
diff --git a/gcc/omp-simd-clone.c b/gcc/omp-simd-clone.c
index 942fb971cb7..0671f93635c 100644
--- a/gcc/omp-simd-clone.c
+++ b/gcc/omp-simd-clone.c
@@ -338,16 +338,18 @@ simd_clone_mangle (struct cgraph_node *node,
 {
   char vecsize_mangle = clone_info->vecsize_mangle;
   char mask = clone_info->inbranch ? 'M' : 'N';
-  unsigned int simdlen = clone_info->simdlen;
+  poly_uint64 simdlen = clone_info->simdlen;
   unsigned int n;
   pretty_printer pp;
 
-  gcc_assert (vecsize_mangle && simdlen);
+  gcc_assert (vecsize_mangle && maybe_ne (simdlen, 0U));
 
   pp_string (&pp, "_ZGV");
   pp_character (&pp, vecsize_mangle);
   pp_character (&pp, mask);
-  pp_decimal_int (&pp, simdlen);
+  /* For now, simdlen is always constant, while variable simdlen pp 'n'.  */
+  unsigned int len = simdlen.to_constant ();
+  pp_decimal_int (&pp, (len));
 
   for (n = 0; n < clone_info->nargs; ++n)
     {
@@ -491,7 +493,7 @@ simd_clone_adjust_return_type (struct cgraph_node *node)
 {
   tree fndecl = node->decl;
   tree orig_rettype = TREE_TYPE (TREE_TYPE (fndecl));
-  unsigned int veclen;
+  poly_uint64 veclen;
   tree t;
 
   /* Adjust the function return type.  */
@@ -502,17 +504,18 @@ simd_clone_adjust_return_type (struct cgraph_node *node)
     veclen = node->simdclone->vecsize_int;
   else
     veclen = node->simdclone->vecsize_float;
-  veclen /= GET_MODE_BITSIZE (SCALAR_TYPE_MODE (t));
-  if (veclen > node->simdclone->simdlen)
+  veclen = exact_div (veclen, GET_MODE_BITSIZE (SCALAR_TYPE_MODE (t)));
+  if (known_gt (veclen, node->simdclone->simdlen))
     veclen = node->simdclone->simdlen;
   if (POINTER_TYPE_P (t))
     t = pointer_sized_int_node;
-  if (veclen == node->simdclone->simdlen)
+  if (known_eq (veclen, node->simdclone->simdlen))
     t = build_vector_type (t, node->simdclone->simdlen);
   else
     {
       t = build_vector_type (t, veclen);
-      t = build_array_type_nelts (t, node->simdclone->simdlen / veclen);
+      t = build_array_type_nelts (t, exact_div (node->simdclone->simdlen,
+				  veclen));
     }
   TREE_TYPE (TREE_TYPE (fndecl)) = t;
   if (!node->definition)
@@ -526,7 +529,7 @@ simd_clone_adjust_return_type (struct cgraph_node *node)
 
   tree atype = build_array_type_nelts (orig_rettype,
 				       node->simdclone->simdlen);
-  if (veclen != node->simdclone->simdlen)
+  if (maybe_ne (veclen, node->simdclone->simdlen))
     return build1 (VIEW_CONVERT_EXPR, atype, t);
 
   /* Set up a SIMD array to use as the return value.  */
@@ -546,7 +549,7 @@ simd_clone_adjust_return_type (struct cgraph_node *node)
    SIMDLEN is the number of elements.  */
 
 static tree
-create_tmp_simd_array (const char *prefix, tree type, int simdlen)
+create_tmp_simd_array (const char *prefix, tree type, poly_uint64 simdlen)
 {
   tree atype = build_array_type_nelts (type, simdlen);
   tree avar = create_tmp_var_raw (atype, prefix);
@@ -578,7 +581,8 @@ simd_clone_adjust_argument_types (struct cgraph_node *node)
   struct cgraph_simd_clone *sc = node->simdclone;
   vec<ipa_adjusted_param, va_gc> *new_params = NULL;
   vec_safe_reserve (new_params, sc->nargs);
-  unsigned i, j, veclen;
+  unsigned i, j, k;
+  poly_uint64 veclen;
 
   for (i = 0; i < sc->nargs; ++i)
     {
@@ -614,8 +618,9 @@ simd_clone_adjust_argument_types (struct cgraph_node *node)
 	    veclen = sc->vecsize_int;
 	  else
 	    veclen = sc->vecsize_float;
-	  veclen /= GET_MODE_BITSIZE (SCALAR_TYPE_MODE (parm_type));
-	  if (veclen > sc->simdlen)
+	  veclen = exact_div (veclen,
+			      GET_MODE_BITSIZE (SCALAR_TYPE_MODE (parm_type)));
+	  if (known_gt (veclen, sc->simdlen))
 	    veclen = sc->simdlen;
 	  adj.op = IPA_PARAM_OP_NEW;
 	  adj.param_prefix_index = IPA_PARAM_PREFIX_SIMD;
@@ -624,10 +629,11 @@ simd_clone_adjust_argument_types (struct cgraph_node *node)
 	  else
 	    adj.type = build_vector_type (parm_type, veclen);
 	  sc->args[i].vector_type = adj.type;
-	  for (j = veclen; j < sc->simdlen; j += veclen)
+	  k = vector_unroll_factor (sc->simdlen, veclen);
+	  for (j = 1; j < k; j++)
 	    {
 	      vec_safe_push (new_params, adj);
-	      if (j == veclen)
+	      if (j == 1)
 		{
 		  memset (&adj, 0, sizeof (adj));
 		  adj.op = IPA_PARAM_OP_NEW;
@@ -663,8 +669,9 @@ simd_clone_adjust_argument_types (struct cgraph_node *node)
 	veclen = sc->vecsize_int;
       else
 	veclen = sc->vecsize_float;
-      veclen /= GET_MODE_BITSIZE (SCALAR_TYPE_MODE (base_type));
-      if (veclen > sc->simdlen)
+      veclen = exact_div (veclen,
+			  GET_MODE_BITSIZE (SCALAR_TYPE_MODE (base_type)));
+      if (known_gt (veclen, sc->simdlen))
 	veclen = sc->simdlen;
       if (sc->mask_mode != VOIDmode)
 	adj.type
@@ -675,7 +682,8 @@ simd_clone_adjust_argument_types (struct cgraph_node *node)
 	adj.type = build_vector_type (base_type, veclen);
       vec_safe_push (new_params, adj);
 
-      for (j = veclen; j < sc->simdlen; j += veclen)
+      k = vector_unroll_factor (sc->simdlen, veclen);
+      for (j = 1; j < k; j++)
 	vec_safe_push (new_params, adj);
 
       /* We have previously allocated one extra entry for the mask.  Use
@@ -690,9 +698,9 @@ simd_clone_adjust_argument_types (struct cgraph_node *node)
 	  if (sc->mask_mode == VOIDmode)
 	    sc->args[i].simd_array
 	      = create_tmp_simd_array ("mask", base_type, sc->simdlen);
-	  else if (veclen < sc->simdlen)
+	  else if (known_lt (veclen, sc->simdlen))
 	    sc->args[i].simd_array
-	      = create_tmp_simd_array ("mask", adj.type, sc->simdlen / veclen);
+	      = create_tmp_simd_array ("mask", adj.type, k);
 	  else
 	    sc->args[i].simd_array = NULL_TREE;
 	}
@@ -783,7 +791,8 @@ simd_clone_init_simd_arrays (struct cgraph_node *node,
 	    }
 	  continue;
 	}
-      if (simd_clone_subparts (TREE_TYPE (arg)) == node->simdclone->simdlen)
+      if (known_eq (simd_clone_subparts (TREE_TYPE (arg)),
+		    node->simdclone->simdlen))
 	{
 	  tree ptype = build_pointer_type (TREE_TYPE (TREE_TYPE (array)));
 	  tree ptr = build_fold_addr_expr (array);
@@ -795,8 +804,10 @@ simd_clone_init_simd_arrays (struct cgraph_node *node,
       else
 	{
 	  unsigned int simdlen = simd_clone_subparts (TREE_TYPE (arg));
+	  unsigned int times = vector_unroll_factor (node->simdclone->simdlen,
+						     simdlen);
 	  tree ptype = build_pointer_type (TREE_TYPE (TREE_TYPE (array)));
-	  for (k = 0; k < node->simdclone->simdlen; k += simdlen)
+	  for (k = 0; k < times; k++)
 	    {
 	      tree ptr = build_fold_addr_expr (array);
 	      int elemsize;
@@ -808,7 +819,7 @@ simd_clone_init_simd_arrays (struct cgraph_node *node,
 	      tree elemtype = TREE_TYPE (TREE_TYPE (arg));
 	      elemsize = GET_MODE_SIZE (SCALAR_TYPE_MODE (elemtype));
 	      tree t = build2 (MEM_REF, TREE_TYPE (arg), ptr,
-			       build_int_cst (ptype, k * elemsize));
+			       build_int_cst (ptype, k * elemsize * simdlen));
 	      t = build2 (MODIFY_EXPR, TREE_TYPE (t), t, arg);
 	      gimplify_and_add (t, &seq);
 	    }
@@ -981,8 +992,11 @@ ipa_simd_modify_function_body (struct cgraph_node *node,
 		  iter, NULL_TREE, NULL_TREE);
       adjustments->register_replacement (&(*adjustments->m_adj_params)[j], r);
 
-      if (simd_clone_subparts (vectype) < node->simdclone->simdlen)
-	j += node->simdclone->simdlen / simd_clone_subparts (vectype) - 1;
+      if (known_lt (simd_clone_subparts (vectype), node->simdclone->simdlen))
+	{
+	  j += vector_unroll_factor (node->simdclone->simdlen,
+				     simd_clone_subparts (vectype)) - 1;
+	}
     }
 
   tree name;
@@ -1249,7 +1263,8 @@ simd_clone_adjust (struct cgraph_node *node)
 	 below).  */
       loop = alloc_loop ();
       cfun->has_force_vectorize_loops = true;
-      loop->safelen = node->simdclone->simdlen;
+      /* For now, simdlen is always constant.  */
+      loop->safelen = node->simdclone->simdlen.to_constant ();
       loop->force_vectorize = true;
       loop->header = body_bb;
     }
@@ -1275,7 +1290,8 @@ simd_clone_adjust (struct cgraph_node *node)
 	    {
 	      tree maskt = TREE_TYPE (mask_array);
 	      int c = tree_to_uhwi (TYPE_MAX_VALUE (TYPE_DOMAIN (maskt)));
-	      c = node->simdclone->simdlen / (c + 1);
+	      /* For now, c must be constant here.  */
+	      c = exact_div (node->simdclone->simdlen, c + 1).to_constant ();
 	      int s = exact_log2 (c);
 	      gcc_assert (s > 0);
 	      c--;
@@ -1683,7 +1699,7 @@ expand_simd_clones (struct cgraph_node *node)
       if (clone_info == NULL)
 	continue;
 
-      int orig_simdlen = clone_info->simdlen;
+      poly_uint64 orig_simdlen = clone_info->simdlen;
       tree base_type = simd_clone_compute_base_data_type (node, clone_info);
       /* The target can return 0 (no simd clones should be created),
 	 1 (just one ISA of simd clones should be created) or higher
diff --git a/gcc/poly-int-types.h b/gcc/poly-int-types.h
index 5e04e63ebf2..78083098baa 100644
--- a/gcc/poly-int-types.h
+++ b/gcc/poly-int-types.h
@@ -81,6 +81,14 @@ typedef poly_int<NUM_POLY_INT_COEFFS, widest_int> poly_widest_int;
 #define vector_element_size(SIZE, NELTS) \
   (exact_div (SIZE, NELTS).to_constant ())
 
+/* Return the number of unroll times when a vector that has NELTS1 elements
+   is unrolled to vectors that have NELTS2 elements.
+
+   to_constant () is safe in this situation because the multiples of the
+   NELTS of two vectors are always constant-size scalars.  */
+#define vector_unroll_factor(NELTS1, NELTS2) \
+  (exact_div (NELTS1, NELTS2).to_constant ())
+
 /* Wrapper for poly_int arguments to target macros, so that if a target
    doesn't need polynomial-sized modes, its header file can continue to
    treat the argument as a normal constant.  This should go away once
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 3575f25241f..c0e979cd8ee 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -3731,7 +3731,7 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info,
   tree op, type;
   tree vec_oprnd0 = NULL_TREE;
   tree vectype;
-  unsigned int nunits;
+  poly_uint64 nunits;
   loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo);
   bb_vec_info bb_vinfo = dyn_cast <bb_vec_info> (vinfo);
   class loop *loop = loop_vinfo ? LOOP_VINFO_LOOP (loop_vinfo) : NULL;
@@ -3883,8 +3883,8 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info,
       arginfo.quick_push (thisarginfo);
     }
 
-  unsigned HOST_WIDE_INT vf;
-  if (!LOOP_VINFO_VECT_FACTOR (loop_vinfo).is_constant (&vf))
+  poly_uint64 vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
+  if (!vf.is_constant ())
     {
       if (dump_enabled_p ())
 	dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
@@ -3902,12 +3902,12 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info,
 	 n = n->simdclone->next_clone)
       {
 	unsigned int this_badness = 0;
-	if (n->simdclone->simdlen > vf
+	if (known_gt (n->simdclone->simdlen, vf)
 	    || n->simdclone->nargs != nargs)
 	  continue;
-	if (n->simdclone->simdlen < vf)
-	  this_badness += (exact_log2 (vf)
-			   - exact_log2 (n->simdclone->simdlen)) * 1024;
+	if (known_lt (n->simdclone->simdlen, vf))
+	  this_badness += exact_log2
+	    (vector_unroll_factor (vf, n->simdclone->simdlen)) * 1024;
 	if (n->simdclone->inbranch)
 	  this_badness += 2048;
 	int target_badness = targetm.simd_clone.usable (n);
@@ -3988,19 +3988,19 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info,
 	arginfo[i].vectype = get_vectype_for_scalar_type (vinfo, arg_type,
 							  slp_node);
 	if (arginfo[i].vectype == NULL
-	    || (simd_clone_subparts (arginfo[i].vectype)
-		> bestn->simdclone->simdlen))
+	    || (known_gt (simd_clone_subparts (arginfo[i].vectype),
+			  bestn->simdclone->simdlen)))
 	  return false;
       }
 
   fndecl = bestn->decl;
   nunits = bestn->simdclone->simdlen;
-  ncopies = vf / nunits;
+  ncopies = vector_unroll_factor (vf, nunits);
 
   /* If the function isn't const, only allow it in simd loops where user
      has asserted that at least nunits consecutive iterations can be
      performed using SIMD instructions.  */
-  if ((loop == NULL || (unsigned) loop->safelen < nunits)
+  if ((loop == NULL || known_lt ((unsigned) loop->safelen, nunits))
       && gimple_vuse (stmt))
     return false;
 
@@ -4078,15 +4078,16 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info,
 	    {
 	    case SIMD_CLONE_ARG_TYPE_VECTOR:
 	      atype = bestn->simdclone->args[i].vector_type;
-	      o = nunits / simd_clone_subparts (atype);
+	      o = vector_unroll_factor (nunits,
+					simd_clone_subparts (atype));
 	      for (m = j * o; m < (j + 1) * o; m++)
 		{
-		  if (simd_clone_subparts (atype)
-		      < simd_clone_subparts (arginfo[i].vectype))
+		  if (known_lt (simd_clone_subparts (atype),
+				simd_clone_subparts (arginfo[i].vectype)))
 		    {
 		      poly_uint64 prec = GET_MODE_BITSIZE (TYPE_MODE (atype));
-		      k = (simd_clone_subparts (arginfo[i].vectype)
-			   / simd_clone_subparts (atype));
+		      k = simd_clone_subparts (arginfo[i].vectype)
+			  / simd_clone_subparts (atype);
 		      gcc_assert ((k & (k - 1)) == 0);
 		      if (m == 0)
 			{
@@ -4116,8 +4117,8 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info,
 		    }
 		  else
 		    {
-		      k = (simd_clone_subparts (atype)
-			   / simd_clone_subparts (arginfo[i].vectype));
+		      k = simd_clone_subparts (atype)
+			  / simd_clone_subparts (arginfo[i].vectype);
 		      gcc_assert ((k & (k - 1)) == 0);
 		      vec<constructor_elt, va_gc> *ctor_elts;
 		      if (k != 1)
@@ -4203,7 +4204,7 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info,
 		      ? POINTER_PLUS_EXPR : PLUS_EXPR;
 		  tree type = POINTER_TYPE_P (TREE_TYPE (op))
 			      ? sizetype : TREE_TYPE (op);
-		  widest_int cst
+		  poly_widest_int cst
 		    = wi::mul (bestn->simdclone->args[i].linear_step,
 			       ncopies * nunits);
 		  tree tcst = wide_int_to_tree (type, cst);
@@ -4224,7 +4225,7 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info,
 		      ? POINTER_PLUS_EXPR : PLUS_EXPR;
 		  tree type = POINTER_TYPE_P (TREE_TYPE (op))
 			      ? sizetype : TREE_TYPE (op);
-		  widest_int cst
+		  poly_widest_int cst
 		    = wi::mul (bestn->simdclone->args[i].linear_step,
 			       j * nunits);
 		  tree tcst = wide_int_to_tree (type, cst);
@@ -4250,7 +4251,8 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info,
       gcall *new_call = gimple_build_call_vec (fndecl, vargs);
       if (vec_dest)
 	{
-	  gcc_assert (ratype || simd_clone_subparts (rtype) == nunits);
+	  gcc_assert (ratype || known_eq (simd_clone_subparts (rtype),
+					  nunits));
 	  if (ratype)
 	    new_temp = create_tmp_var (ratype);
 	  else if (useless_type_conversion_p (vectype, rtype))
@@ -4264,12 +4266,13 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info,
 
       if (vec_dest)
 	{
-	  if (simd_clone_subparts (vectype) < nunits)
+	  if (known_lt (simd_clone_subparts (vectype), nunits))
 	    {
 	      unsigned int k, l;
 	      poly_uint64 prec = GET_MODE_BITSIZE (TYPE_MODE (vectype));
 	      poly_uint64 bytes = GET_MODE_SIZE (TYPE_MODE (vectype));
-	      k = nunits / simd_clone_subparts (vectype);
+	      k = vector_unroll_factor (nunits,
+					simd_clone_subparts (vectype));
 	      gcc_assert ((k & (k - 1)) == 0);
 	      for (l = 0; l < k; l++)
 		{
@@ -4295,16 +4298,18 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info,
 		vect_clobber_variable (vinfo, stmt_info, gsi, new_temp);
 	      continue;
 	    }
-	  else if (simd_clone_subparts (vectype) > nunits)
+	  else if (known_gt (simd_clone_subparts (vectype), nunits))
 	    {
-	      unsigned int k = (simd_clone_subparts (vectype)
-				/ simd_clone_subparts (rtype));
+	      unsigned int k = simd_clone_subparts (vectype)
+			       / simd_clone_subparts (rtype);
 	      gcc_assert ((k & (k - 1)) == 0);
 	      if ((j & (k - 1)) == 0)
 		vec_alloc (ret_ctor_elts, k);
 	      if (ratype)
 		{
-		  unsigned int m, o = nunits / simd_clone_subparts (rtype);
+		  unsigned int m, o;
+		  o = vector_unroll_factor (nunits,
+					    simd_clone_subparts (rtype));
 		  for (m = 0; m < o; m++)
 		    {
 		      tree tem = build4 (ARRAY_REF, rtype, new_temp,
-- 
2.19.1



Thread overview: 9+ messages
2020-10-30  2:29 yangyang (ET) [this message]
2020-10-30 13:32 ` Richard Sandiford
2020-11-02 13:17   ` yangyang (ET)
2020-11-02 14:15     ` Richard Sandiford
2020-11-03 11:07       ` yangyang (ET)
2020-11-03 16:14         ` Richard Sandiford
2020-11-04  6:01           ` yangyang (ET)
2020-11-04 10:26             ` Richard Sandiford
2020-11-05  1:46               ` yangyang (ET)
