public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [RFC, vectorizer] Allow single element vector types for vector reduction operations
@ 2017-08-28  8:22 Jon Beniston
  2017-08-28  8:29 ` Richard Biener
  2017-09-07 11:08 ` Bernd Schmidt
  0 siblings, 2 replies; 33+ messages in thread
From: Jon Beniston @ 2017-08-28  8:22 UTC (permalink / raw)
  To: gcc-patches; +Cc: rguenther

[-- Attachment #1: Type: text/plain, Size: 2398 bytes --]

Hi,

I have an out-of-tree GCC port and it is struggling supporting
auto-vectorization on some dot product instructions.  For example, I have an
instruction that takes three operands which are all 32-bit general
registers. The second and third operands will be treated as V2HI then do dot
product, and then generate an SI result which is then added to the first
operand which is SI as well.

I do see there is dot product recognizer in tree-vect-patters.c, however, I
found the following testcase still can't be auto-vectorized on my port which
has implemented all necessary dot product standard patterns.  This testcase
can't be auto-vectorized on other targets that have similar V2HI dot product
instructions as well, for example ARC.

=== test.c ===
#define K 4
#define M 4
#define N 256
int in[N*K][M];
int out[K];
int coeff[N][M];
void
foo (void)
{
  int i, j, k;
  int sum;
  for (k = 0; k < K; k++)
    {
      sum = 0;
      for (j = 0; j < M; j++)
        for (i = 0; i < N; i++)
          sum += in[i+k][j] * coeff[i][j];
      out[k] = sum;
    }
}
===
  The reason that auto-vectorizer doesn't work seems to be that GCC doesn't
support single-element vector types in get_vectype_for_scalar_type_and_size.
tree-vect-stmts.c: get_vectype_for_scalar_type_and_size
  ...
  if (nunits <= 1)
    return NULL_TREE;

So, I am thinking this actually should be relaxed to support more cases.  At
least on vector reduction operations which normally will have scalar result
with wider types than the element type of input operands.

I have tried to make the auto-vectorizer work for my V2HI dot product case,
with the patch attached. Is this the correct approach?

Cheers,
Jon

gcc/
2017-08-27  Jon Beniston <jon@beniston.com>

        * tree-vectorizer.h (get_vectype_for_scalar_type): New optional
        parameter declaration.
        * tree-vect-stmts.c (get_vectype_for_scalar_type_and_size): Add new
        optional parameter "reduct_p".  Support single element vector types
        if it is true.
        (get_vectype_for_scalar_type): Add new parameter "reduct_p".
        * tree-vect-patterns.c (vect_pattern_recog_1): Pass new parameter
        "reduct_p".
        * tree-vect-loop.c (vect_determine_vectorization_factor): Likewise.
        (vect_model_reduction_cost): Likewise.
        (get_initial_def_for_induction): Likewise.
        (vect_create_epilog_for_reduction): Likewise.


[-- Attachment #2: fix-vec-reduct.patch --]
[-- Type: application/octet-stream, Size: 4690 bytes --]

Index: gcc/tree-vect-loop.c
===================================================================
--- gcc/tree-vect-loop.c
+++ gcc/tree-vect-loop.c
@@ -232,7 +232,7 @@
                   dump_printf (MSG_NOTE, "\n");
 		}
 
-	      vectype = get_vectype_for_scalar_type (scalar_type);
+	      vectype = get_vectype_for_scalar_type (scalar_type, true);
 	      if (!vectype)
 		{
 		  if (dump_enabled_p ())
@@ -465,7 +465,7 @@
 		  dump_generic_expr (MSG_NOTE, TDF_SLIM, scalar_type);
                   dump_printf (MSG_NOTE, "\n");
 		}
-	      vectype = get_vectype_for_scalar_type (scalar_type);
+	      vectype = get_vectype_for_scalar_type (scalar_type, true);
 	      if (!vectype)
 		{
 		  if (dump_enabled_p ())
@@ -510,7 +510,7 @@
 		  dump_generic_expr (MSG_NOTE, TDF_SLIM, scalar_type);
 		  dump_printf (MSG_NOTE, "\n");
 		}
-	      vf_vectype = get_vectype_for_scalar_type (scalar_type);
+	      vf_vectype = get_vectype_for_scalar_type (scalar_type, true);
 	    }
 	  if (!vf_vectype)
 	    {
@@ -3673,7 +3673,7 @@
 
   reduction_op = get_reduction_op (stmt, reduc_index);
 
-  vectype = get_vectype_for_scalar_type (TREE_TYPE (reduction_op));
+  vectype = get_vectype_for_scalar_type (TREE_TYPE (reduction_op), true);
   if (!vectype)
     {
       if (dump_enabled_p ())
@@ -4202,7 +4202,7 @@
   loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_vinfo);
   struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
   tree scalar_type = TREE_TYPE (init_val);
-  tree vectype = get_vectype_for_scalar_type (scalar_type);
+  tree vectype = get_vectype_for_scalar_type (scalar_type, true);
   int nunits;
   enum tree_code code = gimple_assign_rhs_code (stmt);
   tree def_for_init;
@@ -4455,7 +4455,7 @@
 
   reduction_op = get_reduction_op (stmt, reduc_index);
 
-  vectype = get_vectype_for_scalar_type (TREE_TYPE (reduction_op));
+  vectype = get_vectype_for_scalar_type (TREE_TYPE (reduction_op), true);
   gcc_assert (vectype);
   mode = TYPE_MODE (vectype);
 
Index: gcc/tree-vect-patterns.c
===================================================================
--- gcc/tree-vect-patterns.c
+++ gcc/tree-vect-patterns.c
@@ -4176,12 +4176,22 @@
       type_in = get_vectype_for_scalar_type (type_in);
       if (!type_in)
 	return false;
+
+      tree type_out_backup = type_out;
       if (type_out)
 	type_out = get_vectype_for_scalar_type (type_out);
       else
 	type_out = type_in;
       if (!type_out)
-	return false;
+	{
+	  /* dot_prod is vector reduction operation that we want allow
+	     single element vector types.  */
+	  if (!strcmp (recog_func->name, "dot_prod"))
+	    type_out = get_vectype_for_scalar_type (type_out_backup, true);
+
+	  if (!type_out)
+	    return false;
+	}
       pattern_vectype = type_out;
 
       if (is_gimple_assign (pattern_stmt))
Index: gcc/tree-vect-stmts.c
===================================================================
--- gcc/tree-vect-stmts.c
+++ gcc/tree-vect-stmts.c
@@ -8985,7 +8985,8 @@
    by the target.  */
 
 static tree
-get_vectype_for_scalar_type_and_size (tree scalar_type, unsigned size)
+get_vectype_for_scalar_type_and_size (tree scalar_type, unsigned size,
+				      bool reduct_p = false)
 {
   tree orig_scalar_type = scalar_type;
   machine_mode inner_mode = TYPE_MODE (scalar_type);
@@ -9039,7 +9040,7 @@
   else
     simd_mode = mode_for_vector (inner_mode, size / nbytes);
   nunits = GET_MODE_SIZE (simd_mode) / nbytes;
-  if (nunits < 1) /* Support V1SI.  */
+  if (nunits < 1 || (nunits == 1 && !reduct_p))
     return NULL_TREE;
 
   vectype = build_vector_type (scalar_type, nunits);
@@ -9065,11 +9066,12 @@
    by the target.  */
 
 tree
-get_vectype_for_scalar_type (tree scalar_type)
+get_vectype_for_scalar_type (tree scalar_type, bool reduct_p)
 {
   tree vectype;
   vectype = get_vectype_for_scalar_type_and_size (scalar_type,
-						  current_vector_size);
+						  current_vector_size,
+						  reduct_p);
   if (vectype
       && current_vector_size == 0)
     current_vector_size = GET_MODE_SIZE (TYPE_MODE (vectype));
Index: gcc/tree-vectorizer.h
===================================================================
--- gcc/tree-vectorizer.h
+++ gcc/tree-vectorizer.h
@@ -1062,7 +1062,7 @@
 
 /* In tree-vect-stmts.c.  */
 extern unsigned int current_vector_size;
-extern tree get_vectype_for_scalar_type (tree);
+extern tree get_vectype_for_scalar_type (tree, bool = false);
 extern tree get_mask_type_for_scalar_type (tree);
 extern tree get_same_sized_vectype (tree, tree);
 extern bool vect_is_simple_use (tree, vec_info *, gimple **,

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC, vectorizer] Allow single element vector types for vector reduction operations
  2017-08-28  8:22 [RFC, vectorizer] Allow single element vector types for vector reduction operations Jon Beniston
@ 2017-08-28  8:29 ` Richard Biener
  2017-08-28  9:04   ` Jon Beniston
  2017-09-07 11:08 ` Bernd Schmidt
  1 sibling, 1 reply; 33+ messages in thread
From: Richard Biener @ 2017-08-28  8:29 UTC (permalink / raw)
  To: Jon Beniston; +Cc: gcc-patches

On Sun, 27 Aug 2017, Jon Beniston wrote:

> Hi,
> 
> I have an out-of-tree GCC port and it is struggling supporting
> auto-vectorization on some dot product instructions.  For example, I have an
> instruction that takes three operands which are all 32-bit general
> registers. The second and third operands will be treated as V2HI then do dot
> product, and then generate an SI result which is then added to the first
> operand which is SI as well.
> 
> I do see there is dot product recognizer in tree-vect-patters.c, however, I
> found the following testcase still can't be auto-vectorized on my port which
> has implemented all necessary dot product standard patterns.  This testcase
> can't be auto-vectorized on other targets that have similar V2HI dot product
> instructions as well, for example ARC.
> 
> === test.c ===
> #define K 4
> #define M 4
> #define N 256
> int in[N*K][M];
> int out[K];
> int coeff[N][M];
> void
> foo (void)
> {
>   int i, j, k;
>   int sum;
>   for (k = 0; k < K; k++)
>     {
>       sum = 0;
>       for (j = 0; j < M; j++)
>         for (i = 0; i < N; i++)
>           sum += in[i+k][j] * coeff[i][j];
>       out[k] = sum;
>     }
> }
> ===
>   The reason that auto-vectorizer doesn't work seems to be that GCC doesn't
> support single-element vector types in get_vectype_for_scalar_type_and_size.
> tree-vect-stmts.c: get_vectype_for_scalar_type_and_size
>   ...
>   if (nunits <= 1)
>     return NULL_TREE;
> 
> So, I am thinking this actually should be relaxed to support more cases.  At
> least on vector reduction operations which normally will have scalar result
> with wider types than the element type of input operands.
> 
> I have tried to make the auto-vectorizer work for my V2HI dot product case,
> with the patch attached. Is this the correct approach?

Hum,

@@ -9039,7 +9040,7 @@
   else
     simd_mode = mode_for_vector (inner_mode, size / nbytes);
   nunits = GET_MODE_SIZE (simd_mode) / nbytes;
-  if (nunits < 1) /* Support V1SI.  */
+  if (nunits < 1 || (nunits == 1 && !reduct_p))
     return NULL_TREE;

   vectype = build_vector_type (scalar_type, nunits);

doesn't seem to be against trunk which has

  if (nunits <= 1)
    return NULL_TREE;

so what happens if you just change that to

  if (nunits < 1)
    return NULL_TREE;

?

Richard.

> Cheers,
> Jon
> 
> gcc/
> 2017-08-27  Jon Beniston <jon@beniston.com>
> 
>         * tree-vectorizer.h (get_vectype_for_scalar_type): New optional
>         parameter declaration.
>         * tree-vect-stmts.c (get_vectype_for_scalar_type_and_size): Add new
>         optional parameter "reduct_p".  Support single element vector types
>         if it is true.
>         (get_vectype_for_scalar_type): Add new parameter "reduct_p".
>         * tree-vect-patterns.c (vect_pattern_recog_1): Pass new parameter
>         "reduct_p".
>         * tree-vect-loop.c (vect_determine_vectorization_factor): Likewise.
>         (vect_model_reduction_cost): Likewise.
>         (get_initial_def_for_induction): Likewise.
>         (vect_create_epilog_for_reduction): Likewise.
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: [RFC, vectorizer] Allow single element vector types for vector reduction operations
  2017-08-28  8:29 ` Richard Biener
@ 2017-08-28  9:04   ` Jon Beniston
  2017-08-28 11:41     ` Richard Biener
  0 siblings, 1 reply; 33+ messages in thread
From: Jon Beniston @ 2017-08-28  9:04 UTC (permalink / raw)
  To: 'Richard Biener'; +Cc: gcc-patches

Hi Richard,

>-  if (nunits < 1) /* Support V1SI.  */
>+  if (nunits < 1 || (nunits == 1 && !reduct_p))
>     return NULL_TREE;
>
>doesn't seem to be against trunk which has
>
>  if (nunits <= 1)
>    return NULL_TREE;
>
>so what happens if you just change that to
>
>  if (nunits < 1)
>    return NULL_TREE;

Ah, that's what I first tried, and mistakenly sent the patch against that.

That did help the initial problem, but then I needed to add a lot of support
for the V1SI type in the backend (which just duplicated SI patterns) and
there are a few places where GCC seems to have sanity checks against vectors
that are only one element wide. A dot product producing a scalar result
seems natural. 

Also, as well as ARC benefitting from this, I think the TI c6x port would
too. That currently has a sdot_prodv2hi pattern that uses SI and V2HI types.

Cheers,
Jon


^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: [RFC, vectorizer] Allow single element vector types for vector reduction operations
  2017-08-28  9:04   ` Jon Beniston
@ 2017-08-28 11:41     ` Richard Biener
  2017-08-30 11:13       ` Jon Beniston
  0 siblings, 1 reply; 33+ messages in thread
From: Richard Biener @ 2017-08-28 11:41 UTC (permalink / raw)
  To: Jon Beniston; +Cc: gcc-patches

On Mon, 28 Aug 2017, Jon Beniston wrote:

> Hi Richard,
> 
> >-  if (nunits < 1) /* Support V1SI.  */
> >+  if (nunits < 1 || (nunits == 1 && !reduct_p))
> >     return NULL_TREE;
> >
> >doesn't seem to be against trunk which has
> >
> >  if (nunits <= 1)
> >    return NULL_TREE;
> >
> >so what happens if you just change that to
> >
> >  if (nunits < 1)
> >    return NULL_TREE;
> 
> Ah, that's what I first tried, and mistakenly sent the patch against that.
> 
> That did help the initial problem, but then I needed to add a lot of support
> for the V1SI type in the backend (which just duplicated SI patterns) and
> there are a few places where GCC seems to have sanity checks against vectors
> that are only one element wide. A dot product producing a scalar result
> seems natural.

Where do you need V1SI types?  The vectorizer should perform a SI extract
from V1SI here.  You need to elaborate a bit here.

> 
> Also, as well as ARC benefitting from this, I think the TI c6x port would
> too. That currently has a sdot_prodv2hi pattern that uses SI and V2HI types.

The vectorizer doesn't really support vector->scalar ops so V1SI feels
more natural here.  That is, your patch looks a bit ugly -- there's
nothing wrong with V1SI vector types.  So maybe instead of adjusting
that function the respective caller when facing a lane-reducing op
such as DOT_PROD needs to consider scalar result types.

Richard.

> Cheers,
> Jon
> 
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: [RFC, vectorizer] Allow single element vector types for vector reduction operations
  2017-08-28 11:41     ` Richard Biener
@ 2017-08-30 11:13       ` Jon Beniston
  2017-08-30 12:28         ` Richard Biener
  0 siblings, 1 reply; 33+ messages in thread
From: Jon Beniston @ 2017-08-30 11:13 UTC (permalink / raw)
  To: 'Richard Biener'; +Cc: gcc-patches

Hi Richard,

>> Ah, that's what I first tried, and mistakenly sent the patch against
that.
>> 
>> That did help the initial problem, but then I needed to add a lot of 
>> support for the V1SI type in the backend (which just duplicated SI 
>> patterns) and there are a few places where GCC seems to have sanity 
>> checks against vectors that are only one element wide. A dot product 
>> producing a scalar result seems natural.
>
>Where do you need V1SI types?  The vectorizer should perform a SI extract 
>from V1SI here.  You need to elaborate a bit here.

After adding single element vector type support in the middle-end, at the
tree level, V1SI vector types would be the result type for the dot product
if the input operands were V2HI.

During RTL expansion, if V1SImode is accepted in the
TARGET_VECTOR_MODE_SUPPORTED_P hook, then V1SImode will be used after RTL
expansion and V1SImode patterns are needed, although they are actually
duplicates of SImode patterns.  I have now removed V1SImode from
TARGET_VECTOR_MODE_SUPPORTED_P, and they are turned into SI mode so there is
no need for the V1SI patterns now.

>> The vectorizer doesn't really support vector->scalar ops so V1SI 
>> feels more natural here.  That is, your patch looks a bit ugly -- 
>> there's nothing wrong with V1SI vector types.

I agree "there's nothing wrong with V1SI vector types" and the original
patch was trying to let vector reduction operation accept V1SI vector types.

However, if the only change was "<= 1" into "< 1" in
get_vectype_for_scalar_type_and_size, then it will cause a GCC assertion
failure in optab_for_tree_node for some shift operations. It seems all such
failures come from:

vect_pattern_recog_1:4185
      optab = optab_for_tree_code (code, type_in, optab_default);

The SUB_TYPE passed to optab_for_tree_code is "optab_default", however it is
possible that vect_pattern_recog_1 will generate gimple containing shift
operations, for example vect_recog_divmod_pattern will generate RSHIFT_EXPR,
while shift operation requires sub_type to be not optab_default?

  gcc/optabs-tree.h:

  /* An extra flag to control optab_for_tree_code's behavior.  This is
needed to
     distinguish between machines with a vector shift that takes a scalar
for the
     shift amount vs. machines that take a vector for the shift amount.  */

  enum optab_subtype
  {
    optab_default,
    optab_scalar,
    optab_vector
  };

It looks to me like the other call sites of optab_for_tree_code which are
passing optab_default are mostly places where GCC is sure it is not a shift
operation.
Given TYPE_IN is returned from get_vectype_for_scalar_type, is it safe to
simply pass optab_vector in vect_pattern_recog_1?

I have verified the following middle-end fix also works for my port, it
passed also x86-64 bootstrap and there is no regression in gcc/g++
regression.

How is this new patch?

gcc/
2017-08-30  Jon Beniston  <jon@beniston.com>

        * tree-vect-patterns.c (vect_pattern_recog_1): Pass optab_vector
when
        calling optab_for_tree_code.
        * tree-vect-stmts.c (get_vectype_for_scalar_type_and_size): Allow
single
        element vector types.

diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
index cfdb72c..4b39cb6 100644
--- a/gcc/tree-vect-patterns.c
+++ b/gcc/tree-vect-patterns.c
@@ -4182,7 +4182,7 @@ vect_pattern_recog_1 (vect_recog_func *recog_func,
          code = CALL_EXPR;
        }

-      optab = optab_for_tree_code (code, type_in, optab_default);
+      optab = optab_for_tree_code (code, type_in, optab_vector);
       vec_mode = TYPE_MODE (type_in);
       if (!optab
           || (icode = optab_handler (optab, vec_mode)) == CODE_FOR_nothing
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 013fb1f..fc62efb 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -9098,7 +9098,8 @@ get_vectype_for_scalar_type_and_size (tree
scalar_type, unsigned size)
   else
     simd_mode = mode_for_vector (inner_mode, size / nbytes);
   nunits = GET_MODE_SIZE (simd_mode) / nbytes;
-  if (nunits <= 1)
+  /* NOTE: nunits == 1 is allowed to support single element vector types.
*/
+  if (nunits < 1)
     return NULL_TREE;

   vectype = build_vector_type (scalar_type, nunits);


^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: [RFC, vectorizer] Allow single element vector types for vector reduction operations
  2017-08-30 11:13       ` Jon Beniston
@ 2017-08-30 12:28         ` Richard Biener
  2017-08-30 15:52           ` Jon Beniston
  0 siblings, 1 reply; 33+ messages in thread
From: Richard Biener @ 2017-08-30 12:28 UTC (permalink / raw)
  To: Jon Beniston; +Cc: gcc-patches

On Wed, 30 Aug 2017, Jon Beniston wrote:

> Hi Richard,
> 
> >> Ah, that's what I first tried, and mistakenly sent the patch against
> that.
> >> 
> >> That did help the initial problem, but then I needed to add a lot of 
> >> support for the V1SI type in the backend (which just duplicated SI 
> >> patterns) and there are a few places where GCC seems to have sanity 
> >> checks against vectors that are only one element wide. A dot product 
> >> producing a scalar result seems natural.
> >
> >Where do you need V1SI types?  The vectorizer should perform a SI extract 
> >from V1SI here.  You need to elaborate a bit here.
> 
> After adding single element vector type support in the middle-end, at the
> tree level, V1SI vector types would be the result type for the dot product
> if the input operands were V2HI.
> 
> During RTL expansion, if V1SImode is accepted in the
> TARGET_VECTOR_MODE_SUPPORTED_P hook, then V1SImode will be used after RTL
> expansion and V1SImode patterns are needed, although they are actually
> duplicates of SImode patterns.  I have now removed V1SImode from
> TARGET_VECTOR_MODE_SUPPORTED_P, and they are turned into SI mode so there is
> no need for the V1SI patterns now.
> 
> >> The vectorizer doesn't really support vector->scalar ops so V1SI 
> >> feels more natural here.  That is, your patch looks a bit ugly -- 
> >> there's nothing wrong with V1SI vector types.
> 
> I agree "there's nothing wrong with V1SI vector types" and the original
> patch was trying to let vector reduction operation accept V1SI vector types.
> 
> However, if the only change was "<= 1" into "< 1" in
> get_vectype_for_scalar_type_and_size, then it will cause a GCC assertion
> failure in optab_for_tree_node for some shift operations. It seems all such
> failures come from:
> 
> vect_pattern_recog_1:4185
>       optab = optab_for_tree_code (code, type_in, optab_default);
> 
> The SUB_TYPE passed to optab_for_tree_code is "optab_default", however it is
> possible that vect_pattern_recog_1 will generate gimple containing shift
> operations, for example vect_recog_divmod_pattern will generate RSHIFT_EXPR,
> while shift operation requires sub_type to be not optab_default?

I think the issue is that with TARGET_VECTOR_MODE_SUPPORTED_P false
for V1SI you'll get a SImode vector type and the

  if (VECTOR_BOOLEAN_TYPE_P (type_in)
      || VECTOR_MODE_P (TYPE_MODE (type_in)))

check fails.  Try changing that to || VECTOR_TYPE_P (type_in).
The else path should be hopefully dead ...

With your TARGET_VECTOR_MODE_SUPPORTED_P setting you are essentially
making the vectorizer mix what was desired as "generic" vectorization
and regular vectorization...  if it works, fine ;)

>   gcc/optabs-tree.h:
> 
>   /* An extra flag to control optab_for_tree_code's behavior.  This is
> needed to
>      distinguish between machines with a vector shift that takes a scalar
> for the
>      shift amount vs. machines that take a vector for the shift amount.  */
> 
>   enum optab_subtype
>   {
>     optab_default,
>     optab_scalar,
>     optab_vector
>   };
> 
> It looks to me like the other call sites of optab_for_tree_code which are
> passing optab_default are mostly places where GCC is sure it is not a shift
> operation.
> Given TYPE_IN is returned from get_vectype_for_scalar_type, is it safe to
> simply pass optab_vector in vect_pattern_recog_1?

I guess so.  Can you also try the above?

> I have verified the following middle-end fix also works for my port, it
> passed also x86-64 bootstrap and there is no regression in gcc/g++
> regression.
> 
> How is this new patch?
> 
> gcc/
> 2017-08-30  Jon Beniston  <jon@beniston.com>
> 
>         * tree-vect-patterns.c (vect_pattern_recog_1): Pass optab_vector
> when
>         calling optab_for_tree_code.
>         * tree-vect-stmts.c (get_vectype_for_scalar_type_and_size): Allow
> single
>         element vector types.
> 
> diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
> index cfdb72c..4b39cb6 100644
> --- a/gcc/tree-vect-patterns.c
> +++ b/gcc/tree-vect-patterns.c
> @@ -4182,7 +4182,7 @@ vect_pattern_recog_1 (vect_recog_func *recog_func,
>           code = CALL_EXPR;
>         }
> 
> -      optab = optab_for_tree_code (code, type_in, optab_default);
> +      optab = optab_for_tree_code (code, type_in, optab_vector);
>        vec_mode = TYPE_MODE (type_in);
>        if (!optab
>            || (icode = optab_handler (optab, vec_mode)) == CODE_FOR_nothing
> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> index 013fb1f..fc62efb 100644
> --- a/gcc/tree-vect-stmts.c
> +++ b/gcc/tree-vect-stmts.c
> @@ -9098,7 +9098,8 @@ get_vectype_for_scalar_type_and_size (tree
> scalar_type, unsigned size)
>    else
>      simd_mode = mode_for_vector (inner_mode, size / nbytes);
>    nunits = GET_MODE_SIZE (simd_mode) / nbytes;
> -  if (nunits <= 1)
> +  /* NOTE: nunits == 1 is allowed to support single element vector types.
> */
> +  if (nunits < 1)
>      return NULL_TREE;
> 
>    vectype = build_vector_type (scalar_type, nunits);

^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: [RFC, vectorizer] Allow single element vector types for vector reduction operations
  2017-08-30 12:28         ` Richard Biener
@ 2017-08-30 15:52           ` Jon Beniston
  2017-08-30 16:32             ` Richard Biener
                               ` (3 more replies)
  0 siblings, 4 replies; 33+ messages in thread
From: Jon Beniston @ 2017-08-30 15:52 UTC (permalink / raw)
  To: 'Richard Biener'; +Cc: gcc-patches

Hi Richard,

> I think the issue is that with TARGET_VECTOR_MODE_SUPPORTED_P false 
> for V1SI you'll get a SImode vector type and the
>
>  if (VECTOR_BOOLEAN_TYPE_P (type_in)
>      || VECTOR_MODE_P (TYPE_MODE (type_in)))
>
>check fails.  Try changing that to || VECTOR_TYPE_P (type_in).
>The else path should be hopefully dead ...
>
>> It looks to me like the other call sites of optab_for_tree_code which 
>> are passing optab_default are mostly places where GCC is sure it is 
>> not a shift operation.
>> Given TYPE_IN is returned from get_vectype_for_scalar_type, is it 
>> safe to simply pass optab_vector in vect_pattern_recog_1?
>
>I guess so.  Can you also try the above?

Changing VECTOR_MODE_P check into VECTOR_TYPE_P check also works for me, I
verified the following patch could vectorize my dot product case and there
is no regression on gcc tests on my port.

Meanwhile x86-64 bootstraps OK and no regression on gcc/g++ test.

 Does this look OK?


gcc/
2017-08-30  Jon Beniston  <jon@beniston.com>
            Richard Biener  <rguenther@suse.de>

diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
index cfdb72c..5ebeac2 100644
--- a/gcc/tree-vect-patterns.c
+++ b/gcc/tree-vect-patterns.c
@@ -4150,7 +4150,7 @@ vect_pattern_recog_1 (vect_recog_func *recog_func,
   loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
  
   if (VECTOR_BOOLEAN_TYPE_P (type_in)
-      || VECTOR_MODE_P (TYPE_MODE (type_in)))
+      || VECTOR_TYPE_P (type_in))
     {
       /* No need to check target support (already checked by the pattern
          recognition function).  */
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 013fb1f..fc62efb 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -9098,7 +9098,8 @@ get_vectype_for_scalar_type_and_size (tree
scalar_type, unsigned size)
   else
     simd_mode = mode_for_vector (inner_mode, size / nbytes);
   nunits = GET_MODE_SIZE (simd_mode) / nbytes;
-  if (nunits <= 1)
+  /* NOTE: nunits == 1 is allowed to support single element vector types.
*/
+  if (nunits < 1)
     return NULL_TREE;
 
   vectype = build_vector_type (scalar_type, nunits);


^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: [RFC, vectorizer] Allow single element vector types for vector reduction operations
  2017-08-30 15:52           ` Jon Beniston
@ 2017-08-30 16:32             ` Richard Biener
  2017-08-30 20:42             ` Richard Sandiford
                               ` (2 subsequent siblings)
  3 siblings, 0 replies; 33+ messages in thread
From: Richard Biener @ 2017-08-30 16:32 UTC (permalink / raw)
  To: Jon Beniston; +Cc: gcc-patches

On August 30, 2017 5:11:24 PM GMT+02:00, Jon Beniston <jon@beniston.com> wrote:
>Hi Richard,
>
>> I think the issue is that with TARGET_VECTOR_MODE_SUPPORTED_P false 
>> for V1SI you'll get a SImode vector type and the
>>
>>  if (VECTOR_BOOLEAN_TYPE_P (type_in)
>>      || VECTOR_MODE_P (TYPE_MODE (type_in)))
>>
>>check fails.  Try changing that to || VECTOR_TYPE_P (type_in).
>>The else path should be hopefully dead ...
>>
>>> It looks to me like the other call sites of optab_for_tree_code
>which 
>>> are passing optab_default are mostly places where GCC is sure it is 
>>> not a shift operation.
>>> Given TYPE_IN is returned from get_vectype_for_scalar_type, is it 
>>> safe to simply pass optab_vector in vect_pattern_recog_1?
>>
>>I guess so.  Can you also try the above?
>
>Changing VECTOR_MODE_P check into VECTOR_TYPE_P check also works for
>me, I
>verified the following patch could vectorize my dot product case and
>there
>is no regression on gcc tests on my port.
>
>Meanwhile x86-64 bootstraps OK and no regression on gcc/g++ test.
>
> Does this look OK?

OK. 

Thanks, 
Richard. 

>
>gcc/
>2017-08-30  Jon Beniston  <jon@beniston.com>
>            Richard Biener  <rguenther@suse.de>
>
>diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
>index cfdb72c..5ebeac2 100644
>--- a/gcc/tree-vect-patterns.c
>+++ b/gcc/tree-vect-patterns.c
>@@ -4150,7 +4150,7 @@ vect_pattern_recog_1 (vect_recog_func
>*recog_func,
>   loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
>  
>   if (VECTOR_BOOLEAN_TYPE_P (type_in)
>-      || VECTOR_MODE_P (TYPE_MODE (type_in)))
>+      || VECTOR_TYPE_P (type_in))
>     {
>     /* No need to check target support (already checked by the pattern
>          recognition function).  */
>diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
>index 013fb1f..fc62efb 100644
>--- a/gcc/tree-vect-stmts.c
>+++ b/gcc/tree-vect-stmts.c
>@@ -9098,7 +9098,8 @@ get_vectype_for_scalar_type_and_size (tree
>scalar_type, unsigned size)
>   else
>     simd_mode = mode_for_vector (inner_mode, size / nbytes);
>   nunits = GET_MODE_SIZE (simd_mode) / nbytes;
>-  if (nunits <= 1)
>+  /* NOTE: nunits == 1 is allowed to support single element vector
>types.
>*/
>+  if (nunits < 1)
>     return NULL_TREE;
> 
>   vectype = build_vector_type (scalar_type, nunits);

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC, vectorizer] Allow single element vector types for vector reduction operations
  2017-08-30 15:52           ` Jon Beniston
  2017-08-30 16:32             ` Richard Biener
@ 2017-08-30 20:42             ` Richard Sandiford
  2017-09-02 21:58             ` Andreas Schwab
  2017-09-04 15:04             ` Christophe Lyon
  3 siblings, 0 replies; 33+ messages in thread
From: Richard Sandiford @ 2017-08-30 20:42 UTC (permalink / raw)
  To: Jon Beniston; +Cc: 'Richard Biener', gcc-patches

"Jon Beniston" <jon@beniston.com> writes:
> Hi Richard,
>
>> I think the issue is that with TARGET_VECTOR_MODE_SUPPORTED_P false 
>> for V1SI you'll get a SImode vector type and the
>>
>>  if (VECTOR_BOOLEAN_TYPE_P (type_in)
>>      || VECTOR_MODE_P (TYPE_MODE (type_in)))
>>
>>check fails.  Try changing that to || VECTOR_TYPE_P (type_in).
>>The else path should be hopefully dead ...
>>
>>> It looks to me like the other call sites of optab_for_tree_code which 
>>> are passing optab_default are mostly places where GCC is sure it is 
>>> not a shift operation.
>>> Given TYPE_IN is returned from get_vectype_for_scalar_type, is it 
>>> safe to simply pass optab_vector in vect_pattern_recog_1?
>>
>>I guess so.  Can you also try the above?
>
> Changing VECTOR_MODE_P check into VECTOR_TYPE_P check also works for me, I
> verified the following patch could vectorize my dot product case and there
> is no regression on gcc tests on my port.
>
> Meanwhile x86-64 bootstraps OK and no regression on gcc/g++ test.
>
>  Does this look OK?
>
>
> gcc/
> 2017-08-30  Jon Beniston  <jon@beniston.com>
>             Richard Biener  <rguenther@suse.de>
>
> diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
> index cfdb72c..5ebeac2 100644
> --- a/gcc/tree-vect-patterns.c
> +++ b/gcc/tree-vect-patterns.c
> @@ -4150,7 +4150,7 @@ vect_pattern_recog_1 (vect_recog_func *recog_func,
>    loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
>   
>    if (VECTOR_BOOLEAN_TYPE_P (type_in)
> -      || VECTOR_MODE_P (TYPE_MODE (type_in)))
> +      || VECTOR_TYPE_P (type_in))
>      {
>        /* No need to check target support (already checked by the pattern
>           recognition function).  */

It looks like this makes the VECTOR_BOOLEAN_TYPE_P (type_in) check redundant.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC, vectorizer] Allow single element vector types for vector reduction operations
  2017-08-30 15:52           ` Jon Beniston
  2017-08-30 16:32             ` Richard Biener
  2017-08-30 20:42             ` Richard Sandiford
@ 2017-09-02 21:58             ` Andreas Schwab
  2017-09-04  8:58               ` Tamar Christina
  2017-09-04 15:04             ` Christophe Lyon
  3 siblings, 1 reply; 33+ messages in thread
From: Andreas Schwab @ 2017-09-02 21:58 UTC (permalink / raw)
  To: Jon Beniston; +Cc: 'Richard Biener', gcc-patches

On Aug 30 2017, "Jon Beniston" <jon@beniston.com> wrote:

> gcc/
> 2017-08-30  Jon Beniston  <jon@beniston.com>
>             Richard Biener  <rguenther@suse.de>
>
> diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
> index cfdb72c..5ebeac2 100644
> --- a/gcc/tree-vect-patterns.c
> +++ b/gcc/tree-vect-patterns.c
> @@ -4150,7 +4150,7 @@ vect_pattern_recog_1 (vect_recog_func *recog_func,
>    loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
>   
>    if (VECTOR_BOOLEAN_TYPE_P (type_in)
> -      || VECTOR_MODE_P (TYPE_MODE (type_in)))
> +      || VECTOR_TYPE_P (type_in))
>      {
>        /* No need to check target support (already checked by the pattern
>           recognition function).  */
> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> index 013fb1f..fc62efb 100644
> --- a/gcc/tree-vect-stmts.c
> +++ b/gcc/tree-vect-stmts.c
> @@ -9098,7 +9098,8 @@ get_vectype_for_scalar_type_and_size (tree
> scalar_type, unsigned size)
>    else
>      simd_mode = mode_for_vector (inner_mode, size / nbytes);
>    nunits = GET_MODE_SIZE (simd_mode) / nbytes;
> -  if (nunits <= 1)
> +  /* NOTE: nunits == 1 is allowed to support single element vector types.
> */
> +  if (nunits < 1)
>      return NULL_TREE;
>  
>    vectype = build_vector_type (scalar_type, nunits);
>
>
>

That breaks vect/pr68577.c on aarch64.

during RTL pass: expand
/opt/gcc/gcc-20170902/gcc/testsuite/gcc.dg/vect/pr68577.c: In function 'slp_test':
/opt/gcc/gcc-20170902/gcc/testsuite/gcc.dg/vect/pr68577.c:20:12: internal compiler error: in simplify_subreg, at simplify-rtx.c:6050
0xb55983 simplify_subreg(machine_mode, rtx_def*, machine_mode, unsigned int)
        ../../gcc/simplify-rtx.c:6049
0xb595f7 simplify_gen_subreg(machine_mode, rtx_def*, machine_mode, unsigned int)
        ../../gcc/simplify-rtx.c:6278
0x81d277 store_bit_field_1
        ../../gcc/expmed.c:798
0x81d55f store_bit_field(rtx_def*, unsigned long, unsigned long, unsigned long, unsigned long, machine_mode, rtx_def*, bool)
        ../../gcc/expmed.c:1133
0x840bf7 store_field
        ../../gcc/expr.c:6950
0x84792f store_constructor_field
        ../../gcc/expr.c:6142
0x83edbf store_constructor
        ../../gcc/expr.c:6726
0x840443 expand_constructor
        ../../gcc/expr.c:8027
0x82d5bf expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool)
        ../../gcc/expr.c:10133
0x82eaeb expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool)
        ../../gcc/expr.c:9819
0x82dadf expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool)
        ../../gcc/expr.c:10942
0x82eaeb expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool)
        ../../gcc/expr.c:9819
0x83d197 expand_expr
        ../../gcc/expr.h:276
0x83d197 expand_assignment(tree_node*, tree_node*, bool)
        ../../gcc/expr.c:4971
0x71e2f3 expand_gimple_stmt_1
        ../../gcc/cfgexpand.c:3653
0x71e2f3 expand_gimple_stmt
        ../../gcc/cfgexpand.c:3751
0x721cdb expand_gimple_basic_block
        ../../gcc/cfgexpand.c:5750
0x726b07 execute
        ../../gcc/cfgexpand.c:6357

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC, vectorizer] Allow single element vector types for vector reduction operations
  2017-09-02 21:58             ` Andreas Schwab
@ 2017-09-04  8:58               ` Tamar Christina
  2017-09-04  9:19                 ` Richard Biener
  0 siblings, 1 reply; 33+ messages in thread
From: Tamar Christina @ 2017-09-04  8:58 UTC (permalink / raw)
  To: Andreas Schwab, Jon Beniston; +Cc: 'Richard Biener', gcc-patches, nd

Indeed it does, but even worse if you look at the gimple,
you see lots of

  vect__5.25_58 = VIEW_CONVERT_EXPR<vector(1) long unsigned intD.11>(vect__4.21_65);
  vect__5.25_57 = VIEW_CONVERT_EXPR<vector(1) long unsigned intD.11>(vect__4.22_63);
  vect__5.25_56 = VIEW_CONVERT_EXPR<vector(1) long unsigned intD.11>(vect__4.23_61);
  vect__5.25_55 = VIEW_CONVERT_EXPR<vector(1) long unsigned intD.11>(vect__4.24_59);

I suspect this patch will be quite bad for us performance wise as it thinks it's as cheap to do all our integer
operations on the vector side with vectors of 1 element. But I'm still waiting for the perf numbers to confirm.
________________________________________
From: gcc-patches-owner@gcc.gnu.org <gcc-patches-owner@gcc.gnu.org> on behalf of Andreas Schwab <schwab@linux-m68k.org>
Sent: Saturday, September 2, 2017 10:58 PM
To: Jon Beniston
Cc: 'Richard Biener'; gcc-patches@gcc.gnu.org
Subject: Re: [RFC, vectorizer] Allow single element vector types for vector reduction operations

On Aug 30 2017, "Jon Beniston" <jon@beniston.com> wrote:

> gcc/
> 2017-08-30  Jon Beniston  <jon@beniston.com>
>             Richard Biener  <rguenther@suse.de>
>
> diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
> index cfdb72c..5ebeac2 100644
> --- a/gcc/tree-vect-patterns.c
> +++ b/gcc/tree-vect-patterns.c
> @@ -4150,7 +4150,7 @@ vect_pattern_recog_1 (vect_recog_func *recog_func,
>    loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
>
>    if (VECTOR_BOOLEAN_TYPE_P (type_in)
> -      || VECTOR_MODE_P (TYPE_MODE (type_in)))
> +      || VECTOR_TYPE_P (type_in))
>      {
>        /* No need to check target support (already checked by the pattern
>           recognition function).  */
> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> index 013fb1f..fc62efb 100644
> --- a/gcc/tree-vect-stmts.c
> +++ b/gcc/tree-vect-stmts.c
> @@ -9098,7 +9098,8 @@ get_vectype_for_scalar_type_and_size (tree
> scalar_type, unsigned size)
>    else
>      simd_mode = mode_for_vector (inner_mode, size / nbytes);
>    nunits = GET_MODE_SIZE (simd_mode) / nbytes;
> -  if (nunits <= 1)
> +  /* NOTE: nunits == 1 is allowed to support single element vector types.
> */
> +  if (nunits < 1)
>      return NULL_TREE;
>
>    vectype = build_vector_type (scalar_type, nunits);
>
>
>

That breaks vect/pr68577.c on aarch64.

during RTL pass: expand
/opt/gcc/gcc-20170902/gcc/testsuite/gcc.dg/vect/pr68577.c: In function 'slp_test':
/opt/gcc/gcc-20170902/gcc/testsuite/gcc.dg/vect/pr68577.c:20:12: internal compiler error: in simplify_subreg, at simplify-rtx.c:6050
0xb55983 simplify_subreg(machine_mode, rtx_def*, machine_mode, unsigned int)
        ../../gcc/simplify-rtx.c:6049
0xb595f7 simplify_gen_subreg(machine_mode, rtx_def*, machine_mode, unsigned int)
        ../../gcc/simplify-rtx.c:6278
0x81d277 store_bit_field_1
        ../../gcc/expmed.c:798
0x81d55f store_bit_field(rtx_def*, unsigned long, unsigned long, unsigned long, unsigned long, machine_mode, rtx_def*, bool)
        ../../gcc/expmed.c:1133
0x840bf7 store_field
        ../../gcc/expr.c:6950
0x84792f store_constructor_field
        ../../gcc/expr.c:6142
0x83edbf store_constructor
        ../../gcc/expr.c:6726
0x840443 expand_constructor
        ../../gcc/expr.c:8027
0x82d5bf expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool)
        ../../gcc/expr.c:10133
0x82eaeb expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool)
        ../../gcc/expr.c:9819
0x82dadf expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool)
        ../../gcc/expr.c:10942
0x82eaeb expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool)
        ../../gcc/expr.c:9819
0x83d197 expand_expr
        ../../gcc/expr.h:276
0x83d197 expand_assignment(tree_node*, tree_node*, bool)
        ../../gcc/expr.c:4971
0x71e2f3 expand_gimple_stmt_1
        ../../gcc/cfgexpand.c:3653
0x71e2f3 expand_gimple_stmt
        ../../gcc/cfgexpand.c:3751
0x721cdb expand_gimple_basic_block
        ../../gcc/cfgexpand.c:5750
0x726b07 execute
        ../../gcc/cfgexpand.c:6357

Andreas.

--
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC, vectorizer] Allow single element vector types for vector reduction operations
  2017-09-04  8:58               ` Tamar Christina
@ 2017-09-04  9:19                 ` Richard Biener
  2017-09-04 14:28                   ` Tamar Christina
  0 siblings, 1 reply; 33+ messages in thread
From: Richard Biener @ 2017-09-04  9:19 UTC (permalink / raw)
  To: Tamar Christina; +Cc: Andreas Schwab, Jon Beniston, gcc-patches, nd

On Mon, 4 Sep 2017, Tamar Christina wrote:

> Indeed it does, but even worse if you look at the gimple,
> you see lots of
> 
>   vect__5.25_58 = VIEW_CONVERT_EXPR<vector(1) long unsigned intD.11>(vect__4.21_65);
>   vect__5.25_57 = VIEW_CONVERT_EXPR<vector(1) long unsigned intD.11>(vect__4.22_63);
>   vect__5.25_56 = VIEW_CONVERT_EXPR<vector(1) long unsigned intD.11>(vect__4.23_61);
>   vect__5.25_55 = VIEW_CONVERT_EXPR<vector(1) long unsigned intD.11>(vect__4.24_59);
> 
> I suspect this patch will be quite bad for us performance wise as it thinks it's as cheap to do all our integer
> operations on the vector side with vectors of 1 element. But I'm still waiting for the perf numbers to confirm.

Looks like the backend advertises that it can do POPCOUNT on V1DI.  So
SLP vectorization decides it can vectorize this without unrolling.

Vectorization with V2DI is rejected:

/tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: function is not 
vectorizable.
/tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: not vectorized: 
relevant stmt not supported: _8 = __builtin_popcountl (_5);
...
/tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: ***** Re-trying 
analysis with vector size 8

and that now succeeds (it probably didn't succeed before the patch).

Richard.

> ________________________________________
> From: gcc-patches-owner@gcc.gnu.org <gcc-patches-owner@gcc.gnu.org> on behalf of Andreas Schwab <schwab@linux-m68k.org>
> Sent: Saturday, September 2, 2017 10:58 PM
> To: Jon Beniston
> Cc: 'Richard Biener'; gcc-patches@gcc.gnu.org
> Subject: Re: [RFC, vectorizer] Allow single element vector types for vector reduction operations
> 
> On Aug 30 2017, "Jon Beniston" <jon@beniston.com> wrote:
> 
> > gcc/
> > 2017-08-30  Jon Beniston  <jon@beniston.com>
> >             Richard Biener  <rguenther@suse.de>
> >
> > diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
> > index cfdb72c..5ebeac2 100644
> > --- a/gcc/tree-vect-patterns.c
> > +++ b/gcc/tree-vect-patterns.c
> > @@ -4150,7 +4150,7 @@ vect_pattern_recog_1 (vect_recog_func *recog_func,
> >    loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
> >
> >    if (VECTOR_BOOLEAN_TYPE_P (type_in)
> > -      || VECTOR_MODE_P (TYPE_MODE (type_in)))
> > +      || VECTOR_TYPE_P (type_in))
> >      {
> >        /* No need to check target support (already checked by the pattern
> >           recognition function).  */
> > diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> > index 013fb1f..fc62efb 100644
> > --- a/gcc/tree-vect-stmts.c
> > +++ b/gcc/tree-vect-stmts.c
> > @@ -9098,7 +9098,8 @@ get_vectype_for_scalar_type_and_size (tree
> > scalar_type, unsigned size)
> >    else
> >      simd_mode = mode_for_vector (inner_mode, size / nbytes);
> >    nunits = GET_MODE_SIZE (simd_mode) / nbytes;
> > -  if (nunits <= 1)
> > +  /* NOTE: nunits == 1 is allowed to support single element vector types.
> > */
> > +  if (nunits < 1)
> >      return NULL_TREE;
> >
> >    vectype = build_vector_type (scalar_type, nunits);
> >
> >
> >
> 
> That breaks vect/pr68577.c on aarch64.
> 
> during RTL pass: expand
> /opt/gcc/gcc-20170902/gcc/testsuite/gcc.dg/vect/pr68577.c: In function 'slp_test':
> /opt/gcc/gcc-20170902/gcc/testsuite/gcc.dg/vect/pr68577.c:20:12: internal compiler error: in simplify_subreg, at simplify-rtx.c:6050
> 0xb55983 simplify_subreg(machine_mode, rtx_def*, machine_mode, unsigned int)
>         ../../gcc/simplify-rtx.c:6049
> 0xb595f7 simplify_gen_subreg(machine_mode, rtx_def*, machine_mode, unsigned int)
>         ../../gcc/simplify-rtx.c:6278
> 0x81d277 store_bit_field_1
>         ../../gcc/expmed.c:798
> 0x81d55f store_bit_field(rtx_def*, unsigned long, unsigned long, unsigned long, unsigned long, machine_mode, rtx_def*, bool)
>         ../../gcc/expmed.c:1133
> 0x840bf7 store_field
>         ../../gcc/expr.c:6950
> 0x84792f store_constructor_field
>         ../../gcc/expr.c:6142
> 0x83edbf store_constructor
>         ../../gcc/expr.c:6726
> 0x840443 expand_constructor
>         ../../gcc/expr.c:8027
> 0x82d5bf expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool)
>         ../../gcc/expr.c:10133
> 0x82eaeb expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool)
>         ../../gcc/expr.c:9819
> 0x82dadf expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool)
>         ../../gcc/expr.c:10942
> 0x82eaeb expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool)
>         ../../gcc/expr.c:9819
> 0x83d197 expand_expr
>         ../../gcc/expr.h:276
> 0x83d197 expand_assignment(tree_node*, tree_node*, bool)
>         ../../gcc/expr.c:4971
> 0x71e2f3 expand_gimple_stmt_1
>         ../../gcc/cfgexpand.c:3653
> 0x71e2f3 expand_gimple_stmt
>         ../../gcc/cfgexpand.c:3751
> 0x721cdb expand_gimple_basic_block
>         ../../gcc/cfgexpand.c:5750
> 0x726b07 execute
>         ../../gcc/cfgexpand.c:6357
> 
> Andreas.
> 
> --
> Andreas Schwab, schwab@linux-m68k.org
> GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
> "And now for something completely different."
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: [RFC, vectorizer] Allow single element vector types for vector reduction operations
  2017-09-04  9:19                 ` Richard Biener
@ 2017-09-04 14:28                   ` Tamar Christina
  2017-09-04 16:48                     ` Andrew Pinski
  0 siblings, 1 reply; 33+ messages in thread
From: Tamar Christina @ 2017-09-04 14:28 UTC (permalink / raw)
  To: Richard Biener; +Cc: Andreas Schwab, Jon Beniston, gcc-patches, nd

> >   vect__5.25_58 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> intD.11>(vect__4.21_65);
> >   vect__5.25_57 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> intD.11>(vect__4.22_63);
> >   vect__5.25_56 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> intD.11>(vect__4.23_61);
> >   vect__5.25_55 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> > intD.11>(vect__4.24_59);
> >
> > I suspect this patch will be quite bad for us performance wise as it
> > thinks it's as cheap to do all our integer operations on the vector side with
> vectors of 1 element. But I'm still waiting for the perf numbers to confirm.
> 
> Looks like the backend advertises that it can do POPCOUNT on V1DI.  So SLP
> vectorization decides it can vectorize this without unrolling.

We don't, POPCOUNT is only defined for vector modes V8QI and V16QI, we also don't define support
For V1DI anywhere in the backend, we do however say we support V1DF, but removing
That doesn't cause the ICE to go away. 

> Vectorization with V2DI is rejected:
> 
> /tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: function is not
> vectorizable.
> /tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: not vectorized:
> relevant stmt not supported: _8 = __builtin_popcountl (_5); ...
> /tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: ***** Re-trying
> analysis with vector size 8
> 
> and that now succeeds (it probably didn't succeed before the patch).

In the .optimized file, I see it vectorised it to 

  vect__5.25_58 = VIEW_CONVERT_EXPR<vector(1) long unsigned intD.11>(vect__4.21_65);
  vect__5.25_57 = VIEW_CONVERT_EXPR<vector(1) long unsigned intD.11>(vect__4.22_63);
  vect__5.25_56 = VIEW_CONVERT_EXPR<vector(1) long unsigned intD.11>(vect__4.23_61);
  vect__5.25_55 = VIEW_CONVERT_EXPR<vector(1) long unsigned intD.11>(vect__4.24_59);
  _54 = POPCOUNT (vect__5.25_58);
  _53 = POPCOUNT (vect__5.25_57);

Which is something we just don't have a pattern for. Before this patch, it was rejecting "long unsigned int"
With this patch is somehow thinks we support an integer vector of 1 element, even though 1) we don't have an optab
Defined for this operation for POPCOUNT (or at all in aarch64 as far as I can tell), and 2) we don't have it in our supported list of vector modes.

So I don't quite understand how we end up with this expression.

Regards,
Tamar

> 
> Richard.
> 
> > ________________________________________
> > From: gcc-patches-owner@gcc.gnu.org <gcc-patches-owner@gcc.gnu.org>
> on
> > behalf of Andreas Schwab <schwab@linux-m68k.org>
> > Sent: Saturday, September 2, 2017 10:58 PM
> > To: Jon Beniston
> > Cc: 'Richard Biener'; gcc-patches@gcc.gnu.org
> > Subject: Re: [RFC, vectorizer] Allow single element vector types for
> > vector reduction operations
> >
> > On Aug 30 2017, "Jon Beniston" <jon@beniston.com> wrote:
> >
> > > gcc/
> > > 2017-08-30  Jon Beniston  <jon@beniston.com>
> > >             Richard Biener  <rguenther@suse.de>
> > >
> > > diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
> > > index cfdb72c..5ebeac2 100644
> > > --- a/gcc/tree-vect-patterns.c
> > > +++ b/gcc/tree-vect-patterns.c
> > > @@ -4150,7 +4150,7 @@ vect_pattern_recog_1 (vect_recog_func
> *recog_func,
> > >    loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
> > >
> > >    if (VECTOR_BOOLEAN_TYPE_P (type_in)
> > > -      || VECTOR_MODE_P (TYPE_MODE (type_in)))
> > > +      || VECTOR_TYPE_P (type_in))
> > >      {
> > >        /* No need to check target support (already checked by the pattern
> > >           recognition function).  */ diff --git
> > > a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c index
> > > 013fb1f..fc62efb 100644
> > > --- a/gcc/tree-vect-stmts.c
> > > +++ b/gcc/tree-vect-stmts.c
> > > @@ -9098,7 +9098,8 @@ get_vectype_for_scalar_type_and_size (tree
> > > scalar_type, unsigned size)
> > >    else
> > >      simd_mode = mode_for_vector (inner_mode, size / nbytes);
> > >    nunits = GET_MODE_SIZE (simd_mode) / nbytes;
> > > -  if (nunits <= 1)
> > > +  /* NOTE: nunits == 1 is allowed to support single element vector types.
> > > */
> > > +  if (nunits < 1)
> > >      return NULL_TREE;
> > >
> > >    vectype = build_vector_type (scalar_type, nunits);
> > >
> > >
> > >
> >
> > That breaks vect/pr68577.c on aarch64.
> >
> > during RTL pass: expand
> > /opt/gcc/gcc-20170902/gcc/testsuite/gcc.dg/vect/pr68577.c: In function
> 'slp_test':
> > /opt/gcc/gcc-20170902/gcc/testsuite/gcc.dg/vect/pr68577.c:20:12:
> > internal compiler error: in simplify_subreg, at simplify-rtx.c:6050
> > 0xb55983 simplify_subreg(machine_mode, rtx_def*, machine_mode,
> unsigned int)
> >         ../../gcc/simplify-rtx.c:6049
> > 0xb595f7 simplify_gen_subreg(machine_mode, rtx_def*, machine_mode,
> unsigned int)
> >         ../../gcc/simplify-rtx.c:6278
> > 0x81d277 store_bit_field_1
> >         ../../gcc/expmed.c:798
> > 0x81d55f store_bit_field(rtx_def*, unsigned long, unsigned long, unsigned
> long, unsigned long, machine_mode, rtx_def*, bool)
> >         ../../gcc/expmed.c:1133
> > 0x840bf7 store_field
> >         ../../gcc/expr.c:6950
> > 0x84792f store_constructor_field
> >         ../../gcc/expr.c:6142
> > 0x83edbf store_constructor
> >         ../../gcc/expr.c:6726
> > 0x840443 expand_constructor
> >         ../../gcc/expr.c:8027
> > 0x82d5bf expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
> expand_modifier, rtx_def**, bool)
> >         ../../gcc/expr.c:10133
> > 0x82eaeb expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
> expand_modifier, rtx_def**, bool)
> >         ../../gcc/expr.c:9819
> > 0x82dadf expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
> expand_modifier, rtx_def**, bool)
> >         ../../gcc/expr.c:10942
> > 0x82eaeb expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
> expand_modifier, rtx_def**, bool)
> >         ../../gcc/expr.c:9819
> > 0x83d197 expand_expr
> >         ../../gcc/expr.h:276
> > 0x83d197 expand_assignment(tree_node*, tree_node*, bool)
> >         ../../gcc/expr.c:4971
> > 0x71e2f3 expand_gimple_stmt_1
> >         ../../gcc/cfgexpand.c:3653
> > 0x71e2f3 expand_gimple_stmt
> >         ../../gcc/cfgexpand.c:3751
> > 0x721cdb expand_gimple_basic_block
> >         ../../gcc/cfgexpand.c:5750
> > 0x726b07 execute
> >         ../../gcc/cfgexpand.c:6357
> >
> > Andreas.
> >
> > --
> > Andreas Schwab, schwab@linux-m68k.org
> > GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276
> > 4ED5 "And now for something completely different."
> >
> >
> 
> --
> Richard Biener <rguenther@suse.de>
> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton,
> HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC, vectorizer] Allow single element vector types for vector reduction operations
  2017-08-30 15:52           ` Jon Beniston
                               ` (2 preceding siblings ...)
  2017-09-02 21:58             ` Andreas Schwab
@ 2017-09-04 15:04             ` Christophe Lyon
  3 siblings, 0 replies; 33+ messages in thread
From: Christophe Lyon @ 2017-09-04 15:04 UTC (permalink / raw)
  To: Jon Beniston; +Cc: Richard Biener, gcc-patches

On 30 August 2017 at 17:11, Jon Beniston <jon@beniston.com> wrote:
> Hi Richard,
>
>> I think the issue is that with TARGET_VECTOR_MODE_SUPPORTED_P false
>> for V1SI you'll get a SImode vector type and the
>>
>>  if (VECTOR_BOOLEAN_TYPE_P (type_in)
>>      || VECTOR_MODE_P (TYPE_MODE (type_in)))
>>
>>check fails.  Try changing that to || VECTOR_TYPE_P (type_in).
>>The else path should be hopefully dead ...
>>
>>> It looks to me like the other call sites of optab_for_tree_code which
>>> are passing optab_default are mostly places where GCC is sure it is
>>> not a shift operation.
>>> Given TYPE_IN is returned from get_vectype_for_scalar_type, is it
>>> safe to simply pass optab_vector in vect_pattern_recog_1?
>>
>>I guess so.  Can you also try the above?
>
> Changing VECTOR_MODE_P check into VECTOR_TYPE_P check also works for me, I
> verified the following patch could vectorize my dot product case and there
> is no regression on gcc tests on my port.
>
> Meanwhile x86-64 bootstraps OK and no regression on gcc/g++ test.
>
>  Does this look OK?
>
>
> gcc/
> 2017-08-30  Jon Beniston  <jon@beniston.com>
>             Richard Biener  <rguenther@suse.de>
>
> diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
> index cfdb72c..5ebeac2 100644
> --- a/gcc/tree-vect-patterns.c
> +++ b/gcc/tree-vect-patterns.c
> @@ -4150,7 +4150,7 @@ vect_pattern_recog_1 (vect_recog_func *recog_func,
>    loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
>
>    if (VECTOR_BOOLEAN_TYPE_P (type_in)
> -      || VECTOR_MODE_P (TYPE_MODE (type_in)))
> +      || VECTOR_TYPE_P (type_in))
>      {
>        /* No need to check target support (already checked by the pattern
>           recognition function).  */
> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> index 013fb1f..fc62efb 100644
> --- a/gcc/tree-vect-stmts.c
> +++ b/gcc/tree-vect-stmts.c
> @@ -9098,7 +9098,8 @@ get_vectype_for_scalar_type_and_size (tree
> scalar_type, unsigned size)
>    else
>      simd_mode = mode_for_vector (inner_mode, size / nbytes);
>    nunits = GET_MODE_SIZE (simd_mode) / nbytes;
> -  if (nunits <= 1)
> +  /* NOTE: nunits == 1 is allowed to support single element vector types.
> */
> +  if (nunits < 1)
>      return NULL_TREE;
>
>    vectype = build_vector_type (scalar_type, nunits);
>
>

Hi,

This patch makes this test fail:
gcc.dg/tree-ssa/ssa-dom-cse-2.c scan-tree-dump optimized "return 28;"
on arm-none-linux-gnueabi when using the default fpu, or on
arm-none-linux-gnueabihf --with-fpu=vfp (as opposed to
--with-fpu=neon*)

Christophe

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC, vectorizer] Allow single element vector types for vector reduction operations
  2017-09-04 14:28                   ` Tamar Christina
@ 2017-09-04 16:48                     ` Andrew Pinski
  2017-09-05  9:37                       ` Richard Biener
  0 siblings, 1 reply; 33+ messages in thread
From: Andrew Pinski @ 2017-09-04 16:48 UTC (permalink / raw)
  To: Tamar Christina
  Cc: Richard Biener, Andreas Schwab, Jon Beniston, gcc-patches, nd

On Mon, Sep 4, 2017 at 7:28 AM, Tamar Christina <Tamar.Christina@arm.com> wrote:
>> >   vect__5.25_58 = VIEW_CONVERT_EXPR<vector(1) long unsigned
>> intD.11>(vect__4.21_65);
>> >   vect__5.25_57 = VIEW_CONVERT_EXPR<vector(1) long unsigned
>> intD.11>(vect__4.22_63);
>> >   vect__5.25_56 = VIEW_CONVERT_EXPR<vector(1) long unsigned
>> intD.11>(vect__4.23_61);
>> >   vect__5.25_55 = VIEW_CONVERT_EXPR<vector(1) long unsigned
>> > intD.11>(vect__4.24_59);
>> >
>> > I suspect this patch will be quite bad for us performance wise as it
>> > thinks it's as cheap to do all our integer operations on the vector side with
>> vectors of 1 element. But I'm still waiting for the perf numbers to confirm.
>>
>> Looks like the backend advertises that it can do POPCOUNT on V1DI.  So SLP
>> vectorization decides it can vectorize this without unrolling.
>
> We don't, POPCOUNT is only defined for vector modes V8QI and V16QI, we also don't define support
> For V1DI anywhere in the backend, we do however say we support V1DF, but removing
> That doesn't cause the ICE to go away.
>
>> Vectorization with V2DI is rejected:
>>
>> /tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: function is not
>> vectorizable.
>> /tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: not vectorized:
>> relevant stmt not supported: _8 = __builtin_popcountl (_5); ...
>> /tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: ***** Re-trying
>> analysis with vector size 8
>>
>> and that now succeeds (it probably didn't succeed before the patch).
>
> In the .optimized file, I see it vectorised it to
>
>   vect__5.25_58 = VIEW_CONVERT_EXPR<vector(1) long unsigned intD.11>(vect__4.21_65);
>   vect__5.25_57 = VIEW_CONVERT_EXPR<vector(1) long unsigned intD.11>(vect__4.22_63);
>   vect__5.25_56 = VIEW_CONVERT_EXPR<vector(1) long unsigned intD.11>(vect__4.23_61);
>   vect__5.25_55 = VIEW_CONVERT_EXPR<vector(1) long unsigned intD.11>(vect__4.24_59);
>   _54 = POPCOUNT (vect__5.25_58);
>   _53 = POPCOUNT (vect__5.25_57);
>
> Which is something we just don't have a pattern for. Before this patch, it was rejecting "long unsigned int"
> With this patch is somehow thinks we support an integer vector of 1 element, even though 1) we don't have an optab
> Defined for this operation for POPCOUNT (or at all in aarch64 as far as I can tell), and 2) we don't have it in our supported list of vector modes.

Here are the two popcount optab aarch64 has:
(define_mode_iterator GPI [SI DI])
(define_expand "popcount<mode>2"
  [(match_operand:GPI 0 "register_operand")
   (match_operand:GPI 1 "register_operand")]


(define_insn "popcount<mode>2"
(define_mode_iterator VB [V8QI V16QI])
  [(set (match_operand:VB 0 "register_operand" "=w")
        (popcount:VB (match_operand:VB 1 "register_operand" "w")))]

As you can see we only define popcount optab for SI, DI, V8QI and
V16QI.  (note SI and DI uses the V8QI and V16QI during the expansion
but that is a different story).

Maybe somehow the vectorizer is thinking V1DI and DI are
interchangeable in some places.

Thanks,
Andrew


>
> So I don't quite understand how we end up with this expression.
>
> Regards,
> Tamar
>
>>
>> Richard.
>>
>> > ________________________________________
>> > From: gcc-patches-owner@gcc.gnu.org <gcc-patches-owner@gcc.gnu.org>
>> on
>> > behalf of Andreas Schwab <schwab@linux-m68k.org>
>> > Sent: Saturday, September 2, 2017 10:58 PM
>> > To: Jon Beniston
>> > Cc: 'Richard Biener'; gcc-patches@gcc.gnu.org
>> > Subject: Re: [RFC, vectorizer] Allow single element vector types for
>> > vector reduction operations
>> >
>> > On Aug 30 2017, "Jon Beniston" <jon@beniston.com> wrote:
>> >
>> > > gcc/
>> > > 2017-08-30  Jon Beniston  <jon@beniston.com>
>> > >             Richard Biener  <rguenther@suse.de>
>> > >
>> > > diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
>> > > index cfdb72c..5ebeac2 100644
>> > > --- a/gcc/tree-vect-patterns.c
>> > > +++ b/gcc/tree-vect-patterns.c
>> > > @@ -4150,7 +4150,7 @@ vect_pattern_recog_1 (vect_recog_func
>> *recog_func,
>> > >    loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
>> > >
>> > >    if (VECTOR_BOOLEAN_TYPE_P (type_in)
>> > > -      || VECTOR_MODE_P (TYPE_MODE (type_in)))
>> > > +      || VECTOR_TYPE_P (type_in))
>> > >      {
>> > >        /* No need to check target support (already checked by the pattern
>> > >           recognition function).  */ diff --git
>> > > a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c index
>> > > 013fb1f..fc62efb 100644
>> > > --- a/gcc/tree-vect-stmts.c
>> > > +++ b/gcc/tree-vect-stmts.c
>> > > @@ -9098,7 +9098,8 @@ get_vectype_for_scalar_type_and_size (tree
>> > > scalar_type, unsigned size)
>> > >    else
>> > >      simd_mode = mode_for_vector (inner_mode, size / nbytes);
>> > >    nunits = GET_MODE_SIZE (simd_mode) / nbytes;
>> > > -  if (nunits <= 1)
>> > > +  /* NOTE: nunits == 1 is allowed to support single element vector types.
>> > > */
>> > > +  if (nunits < 1)
>> > >      return NULL_TREE;
>> > >
>> > >    vectype = build_vector_type (scalar_type, nunits);
>> > >
>> > >
>> > >
>> >
>> > That breaks vect/pr68577.c on aarch64.
>> >
>> > during RTL pass: expand
>> > /opt/gcc/gcc-20170902/gcc/testsuite/gcc.dg/vect/pr68577.c: In function
>> 'slp_test':
>> > /opt/gcc/gcc-20170902/gcc/testsuite/gcc.dg/vect/pr68577.c:20:12:
>> > internal compiler error: in simplify_subreg, at simplify-rtx.c:6050
>> > 0xb55983 simplify_subreg(machine_mode, rtx_def*, machine_mode,
>> unsigned int)
>> >         ../../gcc/simplify-rtx.c:6049
>> > 0xb595f7 simplify_gen_subreg(machine_mode, rtx_def*, machine_mode,
>> unsigned int)
>> >         ../../gcc/simplify-rtx.c:6278
>> > 0x81d277 store_bit_field_1
>> >         ../../gcc/expmed.c:798
>> > 0x81d55f store_bit_field(rtx_def*, unsigned long, unsigned long, unsigned
>> long, unsigned long, machine_mode, rtx_def*, bool)
>> >         ../../gcc/expmed.c:1133
>> > 0x840bf7 store_field
>> >         ../../gcc/expr.c:6950
>> > 0x84792f store_constructor_field
>> >         ../../gcc/expr.c:6142
>> > 0x83edbf store_constructor
>> >         ../../gcc/expr.c:6726
>> > 0x840443 expand_constructor
>> >         ../../gcc/expr.c:8027
>> > 0x82d5bf expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
>> expand_modifier, rtx_def**, bool)
>> >         ../../gcc/expr.c:10133
>> > 0x82eaeb expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
>> expand_modifier, rtx_def**, bool)
>> >         ../../gcc/expr.c:9819
>> > 0x82dadf expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
>> expand_modifier, rtx_def**, bool)
>> >         ../../gcc/expr.c:10942
>> > 0x82eaeb expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
>> expand_modifier, rtx_def**, bool)
>> >         ../../gcc/expr.c:9819
>> > 0x83d197 expand_expr
>> >         ../../gcc/expr.h:276
>> > 0x83d197 expand_assignment(tree_node*, tree_node*, bool)
>> >         ../../gcc/expr.c:4971
>> > 0x71e2f3 expand_gimple_stmt_1
>> >         ../../gcc/cfgexpand.c:3653
>> > 0x71e2f3 expand_gimple_stmt
>> >         ../../gcc/cfgexpand.c:3751
>> > 0x721cdb expand_gimple_basic_block
>> >         ../../gcc/cfgexpand.c:5750
>> > 0x726b07 execute
>> >         ../../gcc/cfgexpand.c:6357
>> >
>> > Andreas.
>> >
>> > --
>> > Andreas Schwab, schwab@linux-m68k.org
>> > GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276
>> > 4ED5 "And now for something completely different."
>> >
>> >
>>
>> --
>> Richard Biener <rguenther@suse.de>
>> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton,
>> HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC, vectorizer] Allow single element vector types for vector reduction operations
  2017-09-04 16:48                     ` Andrew Pinski
@ 2017-09-05  9:37                       ` Richard Biener
  2017-09-05 10:24                         ` Tamar Christina
  0 siblings, 1 reply; 33+ messages in thread
From: Richard Biener @ 2017-09-05  9:37 UTC (permalink / raw)
  To: Andrew Pinski
  Cc: Tamar Christina, Andreas Schwab, Jon Beniston, gcc-patches, nd

On Mon, 4 Sep 2017, Andrew Pinski wrote:

> On Mon, Sep 4, 2017 at 7:28 AM, Tamar Christina <Tamar.Christina@arm.com> wrote:
> >> >   vect__5.25_58 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> >> intD.11>(vect__4.21_65);
> >> >   vect__5.25_57 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> >> intD.11>(vect__4.22_63);
> >> >   vect__5.25_56 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> >> intD.11>(vect__4.23_61);
> >> >   vect__5.25_55 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> >> > intD.11>(vect__4.24_59);
> >> >
> >> > I suspect this patch will be quite bad for us performance wise as it
> >> > thinks it's as cheap to do all our integer operations on the vector side with
> >> vectors of 1 element. But I'm still waiting for the perf numbers to confirm.
> >>
> >> Looks like the backend advertises that it can do POPCOUNT on V1DI.  So SLP
> >> vectorization decides it can vectorize this without unrolling.
> >
> > We don't, POPCOUNT is only defined for vector modes V8QI and V16QI, we also don't define support
> > For V1DI anywhere in the backend, we do however say we support V1DF, but removing
> > That doesn't cause the ICE to go away.
> >
> >> Vectorization with V2DI is rejected:
> >>
> >> /tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: function is not
> >> vectorizable.
> >> /tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: not vectorized:
> >> relevant stmt not supported: _8 = __builtin_popcountl (_5); ...
> >> /tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: ***** Re-trying
> >> analysis with vector size 8
> >>
> >> and that now succeeds (it probably didn't succeed before the patch).
> >
> > In the .optimized file, I see it vectorised it to
> >
> >   vect__5.25_58 = VIEW_CONVERT_EXPR<vector(1) long unsigned intD.11>(vect__4.21_65);
> >   vect__5.25_57 = VIEW_CONVERT_EXPR<vector(1) long unsigned intD.11>(vect__4.22_63);
> >   vect__5.25_56 = VIEW_CONVERT_EXPR<vector(1) long unsigned intD.11>(vect__4.23_61);
> >   vect__5.25_55 = VIEW_CONVERT_EXPR<vector(1) long unsigned intD.11>(vect__4.24_59);
> >   _54 = POPCOUNT (vect__5.25_58);
> >   _53 = POPCOUNT (vect__5.25_57);
> >
> > Which is something we just don't have a pattern for. Before this patch, it was rejecting "long unsigned int"
> > With this patch is somehow thinks we support an integer vector of 1 element, even though 1) we don't have an optab
> > Defined for this operation for POPCOUNT (or at all in aarch64 as far as I can tell), and 2) we don't have it in our supported list of vector modes.
> 
> Here are the two popcount optab aarch64 has:
> (define_mode_iterator GPI [SI DI])
> (define_expand "popcount<mode>2"
>   [(match_operand:GPI 0 "register_operand")
>    (match_operand:GPI 1 "register_operand")]
> 
> 
> (define_insn "popcount<mode>2"
> (define_mode_iterator VB [V8QI V16QI])
>   [(set (match_operand:VB 0 "register_operand" "=w")
>         (popcount:VB (match_operand:VB 1 "register_operand" "w")))]
> 
> As you can see we only define popcount optab for SI, DI, V8QI and
> V16QI.  (note SI and DI uses the V8QI and V16QI during the expansion
> but that is a different story).
> 
> Maybe somehow the vectorizer is thinking V1DI and DI are
> interchangeable in some places.

We ask

Breakpoint 5, vectorizable_internal_function (cfn=CFN_BUILT_IN_POPCOUNTL, 
    fndecl=<function_decl 0x7ffff694b500 __builtin_popcountl>, 
    vectype_out=<vector_type 0x7ffff69789d8>, 
    vectype_in=<vector_type 0x7ffff697a690>)
    at /tmp/trunk/gcc/tree-vect-stmts.c:1666
1666      if (internal_fn_p (cfn))
(gdb) p debug_generic_expr (vectype_out)
vector(2) int
$10 = void
(gdb) p debug_generic_expr (vectype_in)
vector(1) long unsigned int
$11 = void
(gdb) fin
Run till exit from #0  vectorizable_internal_function (
    cfn=CFN_BUILT_IN_POPCOUNTL, 
    fndecl=<function_decl 0x7ffff694b500 __builtin_popcountl>, 
    vectype_out=<vector_type 0x7ffff69789d8>, 
    vectype_in=<vector_type 0x7ffff697a690>)
    at /tmp/trunk/gcc/tree-vect-stmts.c:1666
0x0000000001206afc in vectorizable_call (gs=<gimple_call 0x7ffff7fee7e0>, 
    gsi=0x0, vec_stmt=0x0, slp_node=0x2451290)
    at /tmp/trunk/gcc/tree-vect-stmts.c:2762
2762        ifn = vectorizable_internal_function (cfn, callee, 
vectype_out,
Value returned is $12 = IFN_POPCOUNT

so somehow direct_internal_fn_supported_p says true for a POPCOUNTL
V1DI -> V2SI.  I'd argue the question is odd already but the
ultimative answer is cleary wrong ;)

We have

(gdb) p info
$22 = (const direct_internal_fn_info &) @0x19b8e0c: {type0 = 0, type1 = 0, 
  vectorizable = 1}

so require vectype_in as vectype_out which means we ask for V1DI -> V1DI
(and the vectorizer compensates for the demotion).  But the
mode of V1DI is actually DI (because the target doesn't support V1DI).
Thus we end up with asking for popcount with DImode which is
available.

Note that this V1DI vector type having DImode is desirable for Jons
case as far as I understand.  DImode gets used because aarch64
advertises generic vectorization support with word_mode.

Things may get tricky later when we look for VEC_PACK_TRUNC
of V1DI, V1DI -> V2SI but the vec_pack_trunc_optab advertises
DImode support via the VDN iterator (maybe expecting this only
in case no V2SI support is available).

So in the end the vectorizer works as expected ;)

What we don't seem to handle is a single-element vector typed, DImode
constructor with a single DImode element.

We call store_bit_field with fieldmode DI but a value with V2SI,
I guess things go downhill from there.  The SSA exp we store
was DImode.  Ah.

 <ssa_name 0x7ffff66c0438
    type <integer_type 0x7ffff68a07e0 long unsigned int public unsigned DI
        size <integer_cst 0x7ffff6891a98 constant 64>
        unit-size <integer_cst 0x7ffff6891ab0 constant 8>
        align:64 warn_if_not_align:0 symtab:0 alias-set 3 canonical-type 
0x7ffff68a07e0 precision:64 min <integer_cst 0x7ffff6891d68 0> max 
<integer_cst 0x7ffff6893540 18446744073709551615>
        pointer_to_this <pointer_type 0x7ffff68b1c78>>
    visited
    def_stmt _62 = VEC_PACK_TRUNC_EXPR <_67, _64>;

I suppose _that_'s already somewhat bogus.  vector lowering creates it
for some reason:

-  vect__8.26_52 = VEC_PACK_TRUNC_EXPR <_54, _53>;
+  _67 = BIT_FIELD_REF <_54, 64, 0>;
+  _64 = BIT_FIELD_REF <_53, 64, 0>;
+  _62 = VEC_PACK_TRUNC_EXPR <_67, _64>;
+  _60 = {_62};
+  _48 = VIEW_CONVERT_EXPR<vector(2) int>(_60);
+  vect__8.26_52 = _48;

which happens because

static tree
get_compute_type (enum tree_code code, optab op, tree type)
{
  /* For very wide vectors, try using a smaller vector mode.  */
  tree compute_type = type;
  if (op
      && (!VECTOR_MODE_P (TYPE_MODE (type))
          || optab_handler (op, TYPE_MODE (type)) == CODE_FOR_nothing))

and/or

  if (compute_type == NULL_TREE)
    compute_type = get_compute_type (code, op, type);
  if (compute_type == type)
    return; 

note that lowering sth like VEC_PACK_TRUNC_EXPR into scalars is
pointless - at least in the way the lowering code does.

Index: gcc/tree-vect-generic.c
===================================================================
--- gcc/tree-vect-generic.c	(revision 251642)
+++ gcc/tree-vect-generic.c	(working copy)
@@ -1439,7 +1439,7 @@ get_compute_type (enum tree_code code, o
   /* For very wide vectors, try using a smaller vector mode.  */
   tree compute_type = type;
   if (op
-      && (!VECTOR_MODE_P (TYPE_MODE (type))
+      && (TYPE_MODE (type) == BLKmode
 	  || optab_handler (op, TYPE_MODE (type)) == CODE_FOR_nothing))
     {
       tree vector_compute_type
@@ -1459,7 +1459,7 @@ get_compute_type (enum tree_code code, o
   if (compute_type == type)
     {
       machine_mode compute_mode = TYPE_MODE (compute_type);
-      if (VECTOR_MODE_P (compute_mode))
+      if (compute_mode != BLKmode)
 	{
 	  if (op && optab_handler (op, compute_mode) != CODE_FOR_nothing)
 	    return compute_type;

fixes this and we pass the testcase again.  Note we vectorize
that way even with the cost model enabled so it might show us
that the cost model fails to account for some costs:

/tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: Cost model 
analysis:
  Vector inside of loop cost: 8
  Vector prologue cost: 2
  Vector epilogue cost: 0
  Scalar iteration cost: 16
  Scalar outside cost: 0
  Vector outside cost: 2
  prologue iterations: 0
  epilogue iterations: 0
  Calculated minimum iters for profitability: 1

SLP costing seems to fail to account for promotion/demotion costs
it seems accounting for that still doesn't push it over the edge.

Richard.

> Thanks,
> Andrew
> 
> 
> >
> > So I don't quite understand how we end up with this expression.
> >
> > Regards,
> > Tamar
> >
> >>
> >> Richard.
> >>
> >> > ________________________________________
> >> > From: gcc-patches-owner@gcc.gnu.org <gcc-patches-owner@gcc.gnu.org>
> >> on
> >> > behalf of Andreas Schwab <schwab@linux-m68k.org>
> >> > Sent: Saturday, September 2, 2017 10:58 PM
> >> > To: Jon Beniston
> >> > Cc: 'Richard Biener'; gcc-patches@gcc.gnu.org
> >> > Subject: Re: [RFC, vectorizer] Allow single element vector types for
> >> > vector reduction operations
> >> >
> >> > On Aug 30 2017, "Jon Beniston" <jon@beniston.com> wrote:
> >> >
> >> > > gcc/
> >> > > 2017-08-30  Jon Beniston  <jon@beniston.com>
> >> > >             Richard Biener  <rguenther@suse.de>
> >> > >
> >> > > diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
> >> > > index cfdb72c..5ebeac2 100644
> >> > > --- a/gcc/tree-vect-patterns.c
> >> > > +++ b/gcc/tree-vect-patterns.c
> >> > > @@ -4150,7 +4150,7 @@ vect_pattern_recog_1 (vect_recog_func
> >> *recog_func,
> >> > >    loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
> >> > >
> >> > >    if (VECTOR_BOOLEAN_TYPE_P (type_in)
> >> > > -      || VECTOR_MODE_P (TYPE_MODE (type_in)))
> >> > > +      || VECTOR_TYPE_P (type_in))
> >> > >      {
> >> > >        /* No need to check target support (already checked by the pattern
> >> > >           recognition function).  */ diff --git
> >> > > a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c index
> >> > > 013fb1f..fc62efb 100644
> >> > > --- a/gcc/tree-vect-stmts.c
> >> > > +++ b/gcc/tree-vect-stmts.c
> >> > > @@ -9098,7 +9098,8 @@ get_vectype_for_scalar_type_and_size (tree
> >> > > scalar_type, unsigned size)
> >> > >    else
> >> > >      simd_mode = mode_for_vector (inner_mode, size / nbytes);
> >> > >    nunits = GET_MODE_SIZE (simd_mode) / nbytes;
> >> > > -  if (nunits <= 1)
> >> > > +  /* NOTE: nunits == 1 is allowed to support single element vector types.
> >> > > */
> >> > > +  if (nunits < 1)
> >> > >      return NULL_TREE;
> >> > >
> >> > >    vectype = build_vector_type (scalar_type, nunits);
> >> > >
> >> > >
> >> > >
> >> >
> >> > That breaks vect/pr68577.c on aarch64.
> >> >
> >> > during RTL pass: expand
> >> > /opt/gcc/gcc-20170902/gcc/testsuite/gcc.dg/vect/pr68577.c: In function
> >> 'slp_test':
> >> > /opt/gcc/gcc-20170902/gcc/testsuite/gcc.dg/vect/pr68577.c:20:12:
> >> > internal compiler error: in simplify_subreg, at simplify-rtx.c:6050
> >> > 0xb55983 simplify_subreg(machine_mode, rtx_def*, machine_mode,
> >> unsigned int)
> >> >         ../../gcc/simplify-rtx.c:6049
> >> > 0xb595f7 simplify_gen_subreg(machine_mode, rtx_def*, machine_mode,
> >> unsigned int)
> >> >         ../../gcc/simplify-rtx.c:6278
> >> > 0x81d277 store_bit_field_1
> >> >         ../../gcc/expmed.c:798
> >> > 0x81d55f store_bit_field(rtx_def*, unsigned long, unsigned long, unsigned
> >> long, unsigned long, machine_mode, rtx_def*, bool)
> >> >         ../../gcc/expmed.c:1133
> >> > 0x840bf7 store_field
> >> >         ../../gcc/expr.c:6950
> >> > 0x84792f store_constructor_field
> >> >         ../../gcc/expr.c:6142
> >> > 0x83edbf store_constructor
> >> >         ../../gcc/expr.c:6726
> >> > 0x840443 expand_constructor
> >> >         ../../gcc/expr.c:8027
> >> > 0x82d5bf expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
> >> expand_modifier, rtx_def**, bool)
> >> >         ../../gcc/expr.c:10133
> >> > 0x82eaeb expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
> >> expand_modifier, rtx_def**, bool)
> >> >         ../../gcc/expr.c:9819
> >> > 0x82dadf expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
> >> expand_modifier, rtx_def**, bool)
> >> >         ../../gcc/expr.c:10942
> >> > 0x82eaeb expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
> >> expand_modifier, rtx_def**, bool)
> >> >         ../../gcc/expr.c:9819
> >> > 0x83d197 expand_expr
> >> >         ../../gcc/expr.h:276
> >> > 0x83d197 expand_assignment(tree_node*, tree_node*, bool)
> >> >         ../../gcc/expr.c:4971
> >> > 0x71e2f3 expand_gimple_stmt_1
> >> >         ../../gcc/cfgexpand.c:3653
> >> > 0x71e2f3 expand_gimple_stmt
> >> >         ../../gcc/cfgexpand.c:3751
> >> > 0x721cdb expand_gimple_basic_block
> >> >         ../../gcc/cfgexpand.c:5750
> >> > 0x726b07 execute
> >> >         ../../gcc/cfgexpand.c:6357
> >> >
> >> > Andreas.
> >> >
> >> > --
> >> > Andreas Schwab, schwab@linux-m68k.org
> >> > GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276
> >> > 4ED5 "And now for something completely different."
> >> >
> >> >
> >>
> >> --
> >> Richard Biener <rguenther@suse.de>
> >> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton,
> >> HRB 21284 (AG Nuernberg)
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: [RFC, vectorizer] Allow single element vector types for vector reduction operations
  2017-09-05  9:37                       ` Richard Biener
@ 2017-09-05 10:24                         ` Tamar Christina
  2017-09-05 10:41                           ` Richard Biener
  0 siblings, 1 reply; 33+ messages in thread
From: Tamar Christina @ 2017-09-05 10:24 UTC (permalink / raw)
  To: Richard Biener, Andrew Pinski
  Cc: Andreas Schwab, Jon Beniston, gcc-patches, nd

Hi Richard,

That was an really interesting analysis, thanks for the details!

Would you be submitting the patch you proposed at the end as a fix?

Thanks,
Tamar

> -----Original Message-----
> From: Richard Biener [mailto:rguenther@suse.de]
> Sent: 05 September 2017 10:38
> To: Andrew Pinski
> Cc: Tamar Christina; Andreas Schwab; Jon Beniston; gcc-
> patches@gcc.gnu.org; nd
> Subject: Re: [RFC, vectorizer] Allow single element vector types for vector
> reduction operations
> 
> On Mon, 4 Sep 2017, Andrew Pinski wrote:
> 
> > On Mon, Sep 4, 2017 at 7:28 AM, Tamar Christina
> <Tamar.Christina@arm.com> wrote:
> > >> >   vect__5.25_58 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> > >> intD.11>(vect__4.21_65);
> > >> >   vect__5.25_57 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> > >> intD.11>(vect__4.22_63);
> > >> >   vect__5.25_56 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> > >> intD.11>(vect__4.23_61);
> > >> >   vect__5.25_55 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> > >> > intD.11>(vect__4.24_59);
> > >> >
> > >> > I suspect this patch will be quite bad for us performance wise as
> > >> > it thinks it's as cheap to do all our integer operations on the
> > >> > vector side with
> > >> vectors of 1 element. But I'm still waiting for the perf numbers to
> confirm.
> > >>
> > >> Looks like the backend advertises that it can do POPCOUNT on V1DI.
> > >> So SLP vectorization decides it can vectorize this without unrolling.
> > >
> > > We don't, POPCOUNT is only defined for vector modes V8QI and V16QI,
> > > we also don't define support For V1DI anywhere in the backend, we do
> > > however say we support V1DF, but removing That doesn't cause the ICE
> to go away.
> > >
> > >> Vectorization with V2DI is rejected:
> > >>
> > >> /tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: function
> > >> is not vectorizable.
> > >> /tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: not
> vectorized:
> > >> relevant stmt not supported: _8 = __builtin_popcountl (_5); ...
> > >> /tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: *****
> > >> Re-trying analysis with vector size 8
> > >>
> > >> and that now succeeds (it probably didn't succeed before the patch).
> > >
> > > In the .optimized file, I see it vectorised it to
> > >
> > >   vect__5.25_58 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> intD.11>(vect__4.21_65);
> > >   vect__5.25_57 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> intD.11>(vect__4.22_63);
> > >   vect__5.25_56 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> intD.11>(vect__4.23_61);
> > >   vect__5.25_55 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> intD.11>(vect__4.24_59);
> > >   _54 = POPCOUNT (vect__5.25_58);
> > >   _53 = POPCOUNT (vect__5.25_57);
> > >
> > > Which is something we just don't have a pattern for. Before this patch, it
> was rejecting "long unsigned int"
> > > With this patch is somehow thinks we support an integer vector of 1
> > > element, even though 1) we don't have an optab Defined for this
> operation for POPCOUNT (or at all in aarch64 as far as I can tell), and 2) we
> don't have it in our supported list of vector modes.
> >
> > Here are the two popcount optab aarch64 has:
> > (define_mode_iterator GPI [SI DI])
> > (define_expand "popcount<mode>2"
> >   [(match_operand:GPI 0 "register_operand")
> >    (match_operand:GPI 1 "register_operand")]
> >
> >
> > (define_insn "popcount<mode>2"
> > (define_mode_iterator VB [V8QI V16QI])
> >   [(set (match_operand:VB 0 "register_operand" "=w")
> >         (popcount:VB (match_operand:VB 1 "register_operand" "w")))]
> >
> > As you can see we only define popcount optab for SI, DI, V8QI and
> > V16QI.  (note SI and DI uses the V8QI and V16QI during the expansion
> > but that is a different story).
> >
> > Maybe somehow the vectorizer is thinking V1DI and DI are
> > interchangeable in some places.
> 
> We ask
> 
> Breakpoint 5, vectorizable_internal_function
> (cfn=CFN_BUILT_IN_POPCOUNTL,
>     fndecl=<function_decl 0x7ffff694b500 __builtin_popcountl>,
>     vectype_out=<vector_type 0x7ffff69789d8>,
>     vectype_in=<vector_type 0x7ffff697a690>)
>     at /tmp/trunk/gcc/tree-vect-stmts.c:1666
> 1666      if (internal_fn_p (cfn))
> (gdb) p debug_generic_expr (vectype_out)
> vector(2) int
> $10 = void
> (gdb) p debug_generic_expr (vectype_in)
> vector(1) long unsigned int
> $11 = void
> (gdb) fin
> Run till exit from #0  vectorizable_internal_function (
>     cfn=CFN_BUILT_IN_POPCOUNTL,
>     fndecl=<function_decl 0x7ffff694b500 __builtin_popcountl>,
>     vectype_out=<vector_type 0x7ffff69789d8>,
>     vectype_in=<vector_type 0x7ffff697a690>)
>     at /tmp/trunk/gcc/tree-vect-stmts.c:1666
> 0x0000000001206afc in vectorizable_call (gs=<gimple_call 0x7ffff7fee7e0>,
>     gsi=0x0, vec_stmt=0x0, slp_node=0x2451290)
>     at /tmp/trunk/gcc/tree-vect-stmts.c:2762
> 2762        ifn = vectorizable_internal_function (cfn, callee,
> vectype_out,
> Value returned is $12 = IFN_POPCOUNT
> 
> so somehow direct_internal_fn_supported_p says true for a POPCOUNTL
> V1DI -> V2SI.  I'd argue the question is odd already but the ultimative answer
> is cleary wrong ;)
> 
> We have
> 
> (gdb) p info
> $22 = (const direct_internal_fn_info &) @0x19b8e0c: {type0 = 0, type1 = 0,
>   vectorizable = 1}
> 
> so require vectype_in as vectype_out which means we ask for V1DI -> V1DI
> (and the vectorizer compensates for the demotion).  But the mode of V1DI is
> actually DI (because the target doesn't support V1DI).
> Thus we end up with asking for popcount with DImode which is available.
> 
> Note that this V1DI vector type having DImode is desirable for Jons case as
> far as I understand.  DImode gets used because aarch64 advertises generic
> vectorization support with word_mode.
> 
> Things may get tricky later when we look for VEC_PACK_TRUNC of V1DI,
> V1DI -> V2SI but the vec_pack_trunc_optab advertises DImode support via
> the VDN iterator (maybe expecting this only in case no V2SI support is
> available).
> 
> So in the end the vectorizer works as expected ;)
> 
> What we don't seem to handle is a single-element vector typed, DImode
> constructor with a single DImode element.
> 
> We call store_bit_field with fieldmode DI but a value with V2SI, I guess things
> go downhill from there.  The SSA exp we store was DImode.  Ah.
> 
>  <ssa_name 0x7ffff66c0438
>     type <integer_type 0x7ffff68a07e0 long unsigned int public unsigned DI
>         size <integer_cst 0x7ffff6891a98 constant 64>
>         unit-size <integer_cst 0x7ffff6891ab0 constant 8>
>         align:64 warn_if_not_align:0 symtab:0 alias-set 3 canonical-type
> 0x7ffff68a07e0 precision:64 min <integer_cst 0x7ffff6891d68 0> max
> <integer_cst 0x7ffff6893540 18446744073709551615>
>         pointer_to_this <pointer_type 0x7ffff68b1c78>>
>     visited
>     def_stmt _62 = VEC_PACK_TRUNC_EXPR <_67, _64>;
> 
> I suppose _that_'s already somewhat bogus.  vector lowering creates it for
> some reason:
> 
> -  vect__8.26_52 = VEC_PACK_TRUNC_EXPR <_54, _53>;
> +  _67 = BIT_FIELD_REF <_54, 64, 0>;
> +  _64 = BIT_FIELD_REF <_53, 64, 0>;
> +  _62 = VEC_PACK_TRUNC_EXPR <_67, _64>;
> +  _60 = {_62};
> +  _48 = VIEW_CONVERT_EXPR<vector(2) int>(_60);
> +  vect__8.26_52 = _48;
> 
> which happens because
> 
> static tree
> get_compute_type (enum tree_code code, optab op, tree type) {
>   /* For very wide vectors, try using a smaller vector mode.  */
>   tree compute_type = type;
>   if (op
>       && (!VECTOR_MODE_P (TYPE_MODE (type))
>           || optab_handler (op, TYPE_MODE (type)) == CODE_FOR_nothing))
> 
> and/or
> 
>   if (compute_type == NULL_TREE)
>     compute_type = get_compute_type (code, op, type);
>   if (compute_type == type)
>     return;
> 
> note that lowering sth like VEC_PACK_TRUNC_EXPR into scalars is pointless -
> at least in the way the lowering code does.
> 
> Index: gcc/tree-vect-generic.c
> ==========================================================
> =========
> --- gcc/tree-vect-generic.c	(revision 251642)
> +++ gcc/tree-vect-generic.c	(working copy)
> @@ -1439,7 +1439,7 @@ get_compute_type (enum tree_code code, o
>    /* For very wide vectors, try using a smaller vector mode.  */
>    tree compute_type = type;
>    if (op
> -      && (!VECTOR_MODE_P (TYPE_MODE (type))
> +      && (TYPE_MODE (type) == BLKmode
>  	  || optab_handler (op, TYPE_MODE (type)) == CODE_FOR_nothing))
>      {
>        tree vector_compute_type
> @@ -1459,7 +1459,7 @@ get_compute_type (enum tree_code code, o
>    if (compute_type == type)
>      {
>        machine_mode compute_mode = TYPE_MODE (compute_type);
> -      if (VECTOR_MODE_P (compute_mode))
> +      if (compute_mode != BLKmode)
>  	{
>  	  if (op && optab_handler (op, compute_mode) !=
> CODE_FOR_nothing)
>  	    return compute_type;
> 
> fixes this and we pass the testcase again.  Note we vectorize that way even
> with the cost model enabled so it might show us that the cost model fails to
> account for some costs:
> 
> /tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: Cost model
> analysis:
>   Vector inside of loop cost: 8
>   Vector prologue cost: 2
>   Vector epilogue cost: 0
>   Scalar iteration cost: 16
>   Scalar outside cost: 0
>   Vector outside cost: 2
>   prologue iterations: 0
>   epilogue iterations: 0
>   Calculated minimum iters for profitability: 1
> 
> SLP costing seems to fail to account for promotion/demotion costs it seems
> accounting for that still doesn't push it over the edge.
> 
> Richard.
> 
> > Thanks,
> > Andrew
> >
> >
> > >
> > > So I don't quite understand how we end up with this expression.
> > >
> > > Regards,
> > > Tamar
> > >
> > >>
> > >> Richard.
> > >>
> > >> > ________________________________________
> > >> > From: gcc-patches-owner@gcc.gnu.org
> > >> > <gcc-patches-owner@gcc.gnu.org>
> > >> on
> > >> > behalf of Andreas Schwab <schwab@linux-m68k.org>
> > >> > Sent: Saturday, September 2, 2017 10:58 PM
> > >> > To: Jon Beniston
> > >> > Cc: 'Richard Biener'; gcc-patches@gcc.gnu.org
> > >> > Subject: Re: [RFC, vectorizer] Allow single element vector types
> > >> > for vector reduction operations
> > >> >
> > >> > On Aug 30 2017, "Jon Beniston" <jon@beniston.com> wrote:
> > >> >
> > >> > > gcc/
> > >> > > 2017-08-30  Jon Beniston  <jon@beniston.com>
> > >> > >             Richard Biener  <rguenther@suse.de>
> > >> > >
> > >> > > diff --git a/gcc/tree-vect-patterns.c
> > >> > > b/gcc/tree-vect-patterns.c index cfdb72c..5ebeac2 100644
> > >> > > --- a/gcc/tree-vect-patterns.c
> > >> > > +++ b/gcc/tree-vect-patterns.c
> > >> > > @@ -4150,7 +4150,7 @@ vect_pattern_recog_1 (vect_recog_func
> > >> *recog_func,
> > >> > >    loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
> > >> > >
> > >> > >    if (VECTOR_BOOLEAN_TYPE_P (type_in)
> > >> > > -      || VECTOR_MODE_P (TYPE_MODE (type_in)))
> > >> > > +      || VECTOR_TYPE_P (type_in))
> > >> > >      {
> > >> > >        /* No need to check target support (already checked by the
> pattern
> > >> > >           recognition function).  */ diff --git
> > >> > > a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c index
> > >> > > 013fb1f..fc62efb 100644
> > >> > > --- a/gcc/tree-vect-stmts.c
> > >> > > +++ b/gcc/tree-vect-stmts.c
> > >> > > @@ -9098,7 +9098,8 @@ get_vectype_for_scalar_type_and_size
> > >> > > (tree scalar_type, unsigned size)
> > >> > >    else
> > >> > >      simd_mode = mode_for_vector (inner_mode, size / nbytes);
> > >> > >    nunits = GET_MODE_SIZE (simd_mode) / nbytes;
> > >> > > -  if (nunits <= 1)
> > >> > > +  /* NOTE: nunits == 1 is allowed to support single element vector
> types.
> > >> > > */
> > >> > > +  if (nunits < 1)
> > >> > >      return NULL_TREE;
> > >> > >
> > >> > >    vectype = build_vector_type (scalar_type, nunits);
> > >> > >
> > >> > >
> > >> > >
> > >> >
> > >> > That breaks vect/pr68577.c on aarch64.
> > >> >
> > >> > during RTL pass: expand
> > >> > /opt/gcc/gcc-20170902/gcc/testsuite/gcc.dg/vect/pr68577.c: In
> > >> > function
> > >> 'slp_test':
> > >> > /opt/gcc/gcc-20170902/gcc/testsuite/gcc.dg/vect/pr68577.c:20:12:
> > >> > internal compiler error: in simplify_subreg, at
> > >> > simplify-rtx.c:6050
> > >> > 0xb55983 simplify_subreg(machine_mode, rtx_def*, machine_mode,
> > >> unsigned int)
> > >> >         ../../gcc/simplify-rtx.c:6049
> > >> > 0xb595f7 simplify_gen_subreg(machine_mode, rtx_def*,
> > >> > machine_mode,
> > >> unsigned int)
> > >> >         ../../gcc/simplify-rtx.c:6278
> > >> > 0x81d277 store_bit_field_1
> > >> >         ../../gcc/expmed.c:798
> > >> > 0x81d55f store_bit_field(rtx_def*, unsigned long, unsigned long,
> > >> > unsigned
> > >> long, unsigned long, machine_mode, rtx_def*, bool)
> > >> >         ../../gcc/expmed.c:1133
> > >> > 0x840bf7 store_field
> > >> >         ../../gcc/expr.c:6950
> > >> > 0x84792f store_constructor_field
> > >> >         ../../gcc/expr.c:6142
> > >> > 0x83edbf store_constructor
> > >> >         ../../gcc/expr.c:6726
> > >> > 0x840443 expand_constructor
> > >> >         ../../gcc/expr.c:8027
> > >> > 0x82d5bf expand_expr_real_1(tree_node*, rtx_def*,
> machine_mode,
> > >> expand_modifier, rtx_def**, bool)
> > >> >         ../../gcc/expr.c:10133
> > >> > 0x82eaeb expand_expr_real_1(tree_node*, rtx_def*,
> machine_mode,
> > >> expand_modifier, rtx_def**, bool)
> > >> >         ../../gcc/expr.c:9819
> > >> > 0x82dadf expand_expr_real_1(tree_node*, rtx_def*,
> machine_mode,
> > >> expand_modifier, rtx_def**, bool)
> > >> >         ../../gcc/expr.c:10942
> > >> > 0x82eaeb expand_expr_real_1(tree_node*, rtx_def*,
> machine_mode,
> > >> expand_modifier, rtx_def**, bool)
> > >> >         ../../gcc/expr.c:9819
> > >> > 0x83d197 expand_expr
> > >> >         ../../gcc/expr.h:276
> > >> > 0x83d197 expand_assignment(tree_node*, tree_node*, bool)
> > >> >         ../../gcc/expr.c:4971
> > >> > 0x71e2f3 expand_gimple_stmt_1
> > >> >         ../../gcc/cfgexpand.c:3653
> > >> > 0x71e2f3 expand_gimple_stmt
> > >> >         ../../gcc/cfgexpand.c:3751 0x721cdb
> > >> > expand_gimple_basic_block
> > >> >         ../../gcc/cfgexpand.c:5750
> > >> > 0x726b07 execute
> > >> >         ../../gcc/cfgexpand.c:6357
> > >> >
> > >> > Andreas.
> > >> >
> > >> > --
> > >> > Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA
> > >> > 54C7 6D53 942B 1756  01D3 44D5 214B 8276
> > >> > 4ED5 "And now for something completely different."
> > >> >
> > >> >
> > >>
> > >> --
> > >> Richard Biener <rguenther@suse.de>
> > >> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham
> > >> Norton, HRB 21284 (AG Nuernberg)
> >
> >
> 
> --
> Richard Biener <rguenther@suse.de>
> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton,
> HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: [RFC, vectorizer] Allow single element vector types for vector reduction operations
  2017-09-05 10:24                         ` Tamar Christina
@ 2017-09-05 10:41                           ` Richard Biener
  2017-09-05 12:50                             ` Richard Biener
  0 siblings, 1 reply; 33+ messages in thread
From: Richard Biener @ 2017-09-05 10:41 UTC (permalink / raw)
  To: Tamar Christina
  Cc: Andrew Pinski, Andreas Schwab, Jon Beniston, gcc-patches, nd

On Tue, 5 Sep 2017, Tamar Christina wrote:

> Hi Richard,
> 
> That was an really interesting analysis, thanks for the details!
> 
> Would you be submitting the patch you proposed at the end as a fix?

I'm testing it currently.

Richard.

> Thanks,
> Tamar
> 
> > -----Original Message-----
> > From: Richard Biener [mailto:rguenther@suse.de]
> > Sent: 05 September 2017 10:38
> > To: Andrew Pinski
> > Cc: Tamar Christina; Andreas Schwab; Jon Beniston; gcc-
> > patches@gcc.gnu.org; nd
> > Subject: Re: [RFC, vectorizer] Allow single element vector types for vector
> > reduction operations
> > 
> > On Mon, 4 Sep 2017, Andrew Pinski wrote:
> > 
> > > On Mon, Sep 4, 2017 at 7:28 AM, Tamar Christina
> > <Tamar.Christina@arm.com> wrote:
> > > >> >   vect__5.25_58 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> > > >> intD.11>(vect__4.21_65);
> > > >> >   vect__5.25_57 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> > > >> intD.11>(vect__4.22_63);
> > > >> >   vect__5.25_56 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> > > >> intD.11>(vect__4.23_61);
> > > >> >   vect__5.25_55 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> > > >> > intD.11>(vect__4.24_59);
> > > >> >
> > > >> > I suspect this patch will be quite bad for us performance wise as
> > > >> > it thinks it's as cheap to do all our integer operations on the
> > > >> > vector side with
> > > >> vectors of 1 element. But I'm still waiting for the perf numbers to
> > confirm.
> > > >>
> > > >> Looks like the backend advertises that it can do POPCOUNT on V1DI.
> > > >> So SLP vectorization decides it can vectorize this without unrolling.
> > > >
> > > > We don't, POPCOUNT is only defined for vector modes V8QI and V16QI,
> > > > we also don't define support For V1DI anywhere in the backend, we do
> > > > however say we support V1DF, but removing That doesn't cause the ICE
> > to go away.
> > > >
> > > >> Vectorization with V2DI is rejected:
> > > >>
> > > >> /tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: function
> > > >> is not vectorizable.
> > > >> /tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: not
> > vectorized:
> > > >> relevant stmt not supported: _8 = __builtin_popcountl (_5); ...
> > > >> /tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: *****
> > > >> Re-trying analysis with vector size 8
> > > >>
> > > >> and that now succeeds (it probably didn't succeed before the patch).
> > > >
> > > > In the .optimized file, I see it vectorised it to
> > > >
> > > >   vect__5.25_58 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> > intD.11>(vect__4.21_65);
> > > >   vect__5.25_57 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> > intD.11>(vect__4.22_63);
> > > >   vect__5.25_56 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> > intD.11>(vect__4.23_61);
> > > >   vect__5.25_55 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> > intD.11>(vect__4.24_59);
> > > >   _54 = POPCOUNT (vect__5.25_58);
> > > >   _53 = POPCOUNT (vect__5.25_57);
> > > >
> > > > Which is something we just don't have a pattern for. Before this patch, it
> > was rejecting "long unsigned int"
> > > > With this patch is somehow thinks we support an integer vector of 1
> > > > element, even though 1) we don't have an optab Defined for this
> > operation for POPCOUNT (or at all in aarch64 as far as I can tell), and 2) we
> > don't have it in our supported list of vector modes.
> > >
> > > Here are the two popcount optab aarch64 has:
> > > (define_mode_iterator GPI [SI DI])
> > > (define_expand "popcount<mode>2"
> > >   [(match_operand:GPI 0 "register_operand")
> > >    (match_operand:GPI 1 "register_operand")]
> > >
> > >
> > > (define_insn "popcount<mode>2"
> > > (define_mode_iterator VB [V8QI V16QI])
> > >   [(set (match_operand:VB 0 "register_operand" "=w")
> > >         (popcount:VB (match_operand:VB 1 "register_operand" "w")))]
> > >
> > > As you can see we only define popcount optab for SI, DI, V8QI and
> > > V16QI.  (note SI and DI uses the V8QI and V16QI during the expansion
> > > but that is a different story).
> > >
> > > Maybe somehow the vectorizer is thinking V1DI and DI are
> > > interchangeable in some places.
> > 
> > We ask
> > 
> > Breakpoint 5, vectorizable_internal_function
> > (cfn=CFN_BUILT_IN_POPCOUNTL,
> >     fndecl=<function_decl 0x7ffff694b500 __builtin_popcountl>,
> >     vectype_out=<vector_type 0x7ffff69789d8>,
> >     vectype_in=<vector_type 0x7ffff697a690>)
> >     at /tmp/trunk/gcc/tree-vect-stmts.c:1666
> > 1666      if (internal_fn_p (cfn))
> > (gdb) p debug_generic_expr (vectype_out)
> > vector(2) int
> > $10 = void
> > (gdb) p debug_generic_expr (vectype_in)
> > vector(1) long unsigned int
> > $11 = void
> > (gdb) fin
> > Run till exit from #0  vectorizable_internal_function (
> >     cfn=CFN_BUILT_IN_POPCOUNTL,
> >     fndecl=<function_decl 0x7ffff694b500 __builtin_popcountl>,
> >     vectype_out=<vector_type 0x7ffff69789d8>,
> >     vectype_in=<vector_type 0x7ffff697a690>)
> >     at /tmp/trunk/gcc/tree-vect-stmts.c:1666
> > 0x0000000001206afc in vectorizable_call (gs=<gimple_call 0x7ffff7fee7e0>,
> >     gsi=0x0, vec_stmt=0x0, slp_node=0x2451290)
> >     at /tmp/trunk/gcc/tree-vect-stmts.c:2762
> > 2762        ifn = vectorizable_internal_function (cfn, callee,
> > vectype_out,
> > Value returned is $12 = IFN_POPCOUNT
> > 
> > so somehow direct_internal_fn_supported_p says true for a POPCOUNTL
> > V1DI -> V2SI.  I'd argue the question is odd already but the ultimative answer
> > is cleary wrong ;)
> > 
> > We have
> > 
> > (gdb) p info
> > $22 = (const direct_internal_fn_info &) @0x19b8e0c: {type0 = 0, type1 = 0,
> >   vectorizable = 1}
> > 
> > so require vectype_in as vectype_out which means we ask for V1DI -> V1DI
> > (and the vectorizer compensates for the demotion).  But the mode of V1DI is
> > actually DI (because the target doesn't support V1DI).
> > Thus we end up with asking for popcount with DImode which is available.
> > 
> > Note that this V1DI vector type having DImode is desirable for Jons case as
> > far as I understand.  DImode gets used because aarch64 advertises generic
> > vectorization support with word_mode.
> > 
> > Things may get tricky later when we look for VEC_PACK_TRUNC of V1DI,
> > V1DI -> V2SI but the vec_pack_trunc_optab advertises DImode support via
> > the VDN iterator (maybe expecting this only in case no V2SI support is
> > available).
> > 
> > So in the end the vectorizer works as expected ;)
> > 
> > What we don't seem to handle is a single-element vector typed, DImode
> > constructor with a single DImode element.
> > 
> > We call store_bit_field with fieldmode DI but a value with V2SI, I guess things
> > go downhill from there.  The SSA exp we store was DImode.  Ah.
> > 
> >  <ssa_name 0x7ffff66c0438
> >     type <integer_type 0x7ffff68a07e0 long unsigned int public unsigned DI
> >         size <integer_cst 0x7ffff6891a98 constant 64>
> >         unit-size <integer_cst 0x7ffff6891ab0 constant 8>
> >         align:64 warn_if_not_align:0 symtab:0 alias-set 3 canonical-type
> > 0x7ffff68a07e0 precision:64 min <integer_cst 0x7ffff6891d68 0> max
> > <integer_cst 0x7ffff6893540 18446744073709551615>
> >         pointer_to_this <pointer_type 0x7ffff68b1c78>>
> >     visited
> >     def_stmt _62 = VEC_PACK_TRUNC_EXPR <_67, _64>;
> > 
> > I suppose _that_'s already somewhat bogus.  vector lowering creates it for
> > some reason:
> > 
> > -  vect__8.26_52 = VEC_PACK_TRUNC_EXPR <_54, _53>;
> > +  _67 = BIT_FIELD_REF <_54, 64, 0>;
> > +  _64 = BIT_FIELD_REF <_53, 64, 0>;
> > +  _62 = VEC_PACK_TRUNC_EXPR <_67, _64>;
> > +  _60 = {_62};
> > +  _48 = VIEW_CONVERT_EXPR<vector(2) int>(_60);
> > +  vect__8.26_52 = _48;
> > 
> > which happens because
> > 
> > static tree
> > get_compute_type (enum tree_code code, optab op, tree type) {
> >   /* For very wide vectors, try using a smaller vector mode.  */
> >   tree compute_type = type;
> >   if (op
> >       && (!VECTOR_MODE_P (TYPE_MODE (type))
> >           || optab_handler (op, TYPE_MODE (type)) == CODE_FOR_nothing))
> > 
> > and/or
> > 
> >   if (compute_type == NULL_TREE)
> >     compute_type = get_compute_type (code, op, type);
> >   if (compute_type == type)
> >     return;
> > 
> > note that lowering sth like VEC_PACK_TRUNC_EXPR into scalars is pointless -
> > at least in the way the lowering code does.
> > 
> > Index: gcc/tree-vect-generic.c
> > ==========================================================
> > =========
> > --- gcc/tree-vect-generic.c	(revision 251642)
> > +++ gcc/tree-vect-generic.c	(working copy)
> > @@ -1439,7 +1439,7 @@ get_compute_type (enum tree_code code, o
> >    /* For very wide vectors, try using a smaller vector mode.  */
> >    tree compute_type = type;
> >    if (op
> > -      && (!VECTOR_MODE_P (TYPE_MODE (type))
> > +      && (TYPE_MODE (type) == BLKmode
> >  	  || optab_handler (op, TYPE_MODE (type)) == CODE_FOR_nothing))
> >      {
> >        tree vector_compute_type
> > @@ -1459,7 +1459,7 @@ get_compute_type (enum tree_code code, o
> >    if (compute_type == type)
> >      {
> >        machine_mode compute_mode = TYPE_MODE (compute_type);
> > -      if (VECTOR_MODE_P (compute_mode))
> > +      if (compute_mode != BLKmode)
> >  	{
> >  	  if (op && optab_handler (op, compute_mode) !=
> > CODE_FOR_nothing)
> >  	    return compute_type;
> > 
> > fixes this and we pass the testcase again.  Note we vectorize that way even
> > with the cost model enabled so it might show us that the cost model fails to
> > account for some costs:
> > 
> > /tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: Cost model
> > analysis:
> >   Vector inside of loop cost: 8
> >   Vector prologue cost: 2
> >   Vector epilogue cost: 0
> >   Scalar iteration cost: 16
> >   Scalar outside cost: 0
> >   Vector outside cost: 2
> >   prologue iterations: 0
> >   epilogue iterations: 0
> >   Calculated minimum iters for profitability: 1
> > 
> > SLP costing seems to fail to account for promotion/demotion costs it seems
> > accounting for that still doesn't push it over the edge.
> > 
> > Richard.
> > 
> > > Thanks,
> > > Andrew
> > >
> > >
> > > >
> > > > So I don't quite understand how we end up with this expression.
> > > >
> > > > Regards,
> > > > Tamar
> > > >
> > > >>
> > > >> Richard.
> > > >>
> > > >> > ________________________________________
> > > >> > From: gcc-patches-owner@gcc.gnu.org
> > > >> > <gcc-patches-owner@gcc.gnu.org>
> > > >> on
> > > >> > behalf of Andreas Schwab <schwab@linux-m68k.org>
> > > >> > Sent: Saturday, September 2, 2017 10:58 PM
> > > >> > To: Jon Beniston
> > > >> > Cc: 'Richard Biener'; gcc-patches@gcc.gnu.org
> > > >> > Subject: Re: [RFC, vectorizer] Allow single element vector types
> > > >> > for vector reduction operations
> > > >> >
> > > >> > On Aug 30 2017, "Jon Beniston" <jon@beniston.com> wrote:
> > > >> >
> > > >> > > gcc/
> > > >> > > 2017-08-30  Jon Beniston  <jon@beniston.com>
> > > >> > >             Richard Biener  <rguenther@suse.de>
> > > >> > >
> > > >> > > diff --git a/gcc/tree-vect-patterns.c
> > > >> > > b/gcc/tree-vect-patterns.c index cfdb72c..5ebeac2 100644
> > > >> > > --- a/gcc/tree-vect-patterns.c
> > > >> > > +++ b/gcc/tree-vect-patterns.c
> > > >> > > @@ -4150,7 +4150,7 @@ vect_pattern_recog_1 (vect_recog_func
> > > >> *recog_func,
> > > >> > >    loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
> > > >> > >
> > > >> > >    if (VECTOR_BOOLEAN_TYPE_P (type_in)
> > > >> > > -      || VECTOR_MODE_P (TYPE_MODE (type_in)))
> > > >> > > +      || VECTOR_TYPE_P (type_in))
> > > >> > >      {
> > > >> > >        /* No need to check target support (already checked by the
> > pattern
> > > >> > >           recognition function).  */ diff --git
> > > >> > > a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c index
> > > >> > > 013fb1f..fc62efb 100644
> > > >> > > --- a/gcc/tree-vect-stmts.c
> > > >> > > +++ b/gcc/tree-vect-stmts.c
> > > >> > > @@ -9098,7 +9098,8 @@ get_vectype_for_scalar_type_and_size
> > > >> > > (tree scalar_type, unsigned size)
> > > >> > >    else
> > > >> > >      simd_mode = mode_for_vector (inner_mode, size / nbytes);
> > > >> > >    nunits = GET_MODE_SIZE (simd_mode) / nbytes;
> > > >> > > -  if (nunits <= 1)
> > > >> > > +  /* NOTE: nunits == 1 is allowed to support single element vector
> > types.
> > > >> > > */
> > > >> > > +  if (nunits < 1)
> > > >> > >      return NULL_TREE;
> > > >> > >
> > > >> > >    vectype = build_vector_type (scalar_type, nunits);
> > > >> > >
> > > >> > >
> > > >> > >
> > > >> >
> > > >> > That breaks vect/pr68577.c on aarch64.
> > > >> >
> > > >> > during RTL pass: expand
> > > >> > /opt/gcc/gcc-20170902/gcc/testsuite/gcc.dg/vect/pr68577.c: In
> > > >> > function
> > > >> 'slp_test':
> > > >> > /opt/gcc/gcc-20170902/gcc/testsuite/gcc.dg/vect/pr68577.c:20:12:
> > > >> > internal compiler error: in simplify_subreg, at
> > > >> > simplify-rtx.c:6050
> > > >> > 0xb55983 simplify_subreg(machine_mode, rtx_def*, machine_mode,
> > > >> unsigned int)
> > > >> >         ../../gcc/simplify-rtx.c:6049
> > > >> > 0xb595f7 simplify_gen_subreg(machine_mode, rtx_def*,
> > > >> > machine_mode,
> > > >> unsigned int)
> > > >> >         ../../gcc/simplify-rtx.c:6278
> > > >> > 0x81d277 store_bit_field_1
> > > >> >         ../../gcc/expmed.c:798
> > > >> > 0x81d55f store_bit_field(rtx_def*, unsigned long, unsigned long,
> > > >> > unsigned
> > > >> long, unsigned long, machine_mode, rtx_def*, bool)
> > > >> >         ../../gcc/expmed.c:1133
> > > >> > 0x840bf7 store_field
> > > >> >         ../../gcc/expr.c:6950
> > > >> > 0x84792f store_constructor_field
> > > >> >         ../../gcc/expr.c:6142
> > > >> > 0x83edbf store_constructor
> > > >> >         ../../gcc/expr.c:6726
> > > >> > 0x840443 expand_constructor
> > > >> >         ../../gcc/expr.c:8027
> > > >> > 0x82d5bf expand_expr_real_1(tree_node*, rtx_def*,
> > machine_mode,
> > > >> expand_modifier, rtx_def**, bool)
> > > >> >         ../../gcc/expr.c:10133
> > > >> > 0x82eaeb expand_expr_real_1(tree_node*, rtx_def*,
> > machine_mode,
> > > >> expand_modifier, rtx_def**, bool)
> > > >> >         ../../gcc/expr.c:9819
> > > >> > 0x82dadf expand_expr_real_1(tree_node*, rtx_def*,
> > machine_mode,
> > > >> expand_modifier, rtx_def**, bool)
> > > >> >         ../../gcc/expr.c:10942
> > > >> > 0x82eaeb expand_expr_real_1(tree_node*, rtx_def*,
> > machine_mode,
> > > >> expand_modifier, rtx_def**, bool)
> > > >> >         ../../gcc/expr.c:9819
> > > >> > 0x83d197 expand_expr
> > > >> >         ../../gcc/expr.h:276
> > > >> > 0x83d197 expand_assignment(tree_node*, tree_node*, bool)
> > > >> >         ../../gcc/expr.c:4971
> > > >> > 0x71e2f3 expand_gimple_stmt_1
> > > >> >         ../../gcc/cfgexpand.c:3653
> > > >> > 0x71e2f3 expand_gimple_stmt
> > > >> >         ../../gcc/cfgexpand.c:3751 0x721cdb
> > > >> > expand_gimple_basic_block
> > > >> >         ../../gcc/cfgexpand.c:5750
> > > >> > 0x726b07 execute
> > > >> >         ../../gcc/cfgexpand.c:6357
> > > >> >
> > > >> > Andreas.
> > > >> >
> > > >> > --
> > > >> > Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA
> > > >> > 54C7 6D53 942B 1756  01D3 44D5 214B 8276
> > > >> > 4ED5 "And now for something completely different."
> > > >> >
> > > >> >
> > > >>
> > > >> --
> > > >> Richard Biener <rguenther@suse.de>
> > > >> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham
> > > >> Norton, HRB 21284 (AG Nuernberg)
> > >
> > >
> > 
> > --
> > Richard Biener <rguenther@suse.de>
> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton,
> > HRB 21284 (AG Nuernberg)
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: [RFC, vectorizer] Allow single element vector types for vector reduction operations
  2017-09-05 10:41                           ` Richard Biener
@ 2017-09-05 12:50                             ` Richard Biener
  2017-09-05 13:06                               ` Tamar Christina
  0 siblings, 1 reply; 33+ messages in thread
From: Richard Biener @ 2017-09-05 12:50 UTC (permalink / raw)
  To: Tamar Christina
  Cc: Andrew Pinski, Andreas Schwab, Jon Beniston, gcc-patches, nd

On Tue, 5 Sep 2017, Richard Biener wrote:

> On Tue, 5 Sep 2017, Tamar Christina wrote:
> 
> > Hi Richard,
> > 
> > That was an really interesting analysis, thanks for the details!
> > 
> > Would you be submitting the patch you proposed at the end as a fix?
> 
> I'm testing it currently.

Unfortunately it breaks some required lowering.  I'll have to more
closely look at this.

Richard.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: [RFC, vectorizer] Allow single element vector types for vector reduction operations
  2017-09-05 12:50                             ` Richard Biener
@ 2017-09-05 13:06                               ` Tamar Christina
  2017-09-05 13:12                                 ` Richard Biener
  0 siblings, 1 reply; 33+ messages in thread
From: Tamar Christina @ 2017-09-05 13:06 UTC (permalink / raw)
  To: Richard Biener
  Cc: Andrew Pinski, Andreas Schwab, Jon Beniston, gcc-patches, nd



> -----Original Message-----
> From: Richard Biener [mailto:rguenther@suse.de]
> Sent: 05 September 2017 13:51
> To: Tamar Christina
> Cc: Andrew Pinski; Andreas Schwab; Jon Beniston; gcc-patches@gcc.gnu.org;
> nd
> Subject: RE: [RFC, vectorizer] Allow single element vector types for vector
> reduction operations
> 
> On Tue, 5 Sep 2017, Richard Biener wrote:
> 
> > On Tue, 5 Sep 2017, Tamar Christina wrote:
> >
> > > Hi Richard,
> > >
> > > That was an really interesting analysis, thanks for the details!
> > >
> > > Would you be submitting the patch you proposed at the end as a fix?
> >
> > I'm testing it currently.
> 
> Unfortunately it breaks some required lowering.  I'll have to more closely
> look at this.

Ah, ok. In the meantime, can this patch be reverted? It's currently breaking spec for us so we're
Not able to get any benchmarking numbers.

> 
> Richard.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: [RFC, vectorizer] Allow single element vector types for vector reduction operations
  2017-09-05 13:06                               ` Tamar Christina
@ 2017-09-05 13:12                                 ` Richard Biener
  2017-09-05 14:00                                   ` Tamar Christina
  2017-09-11 15:39                                   ` Vidya Praveen
  0 siblings, 2 replies; 33+ messages in thread
From: Richard Biener @ 2017-09-05 13:12 UTC (permalink / raw)
  To: Tamar Christina
  Cc: Andrew Pinski, Andreas Schwab, Jon Beniston, gcc-patches, nd

On Tue, 5 Sep 2017, Tamar Christina wrote:

> 
> 
> > -----Original Message-----
> > From: Richard Biener [mailto:rguenther@suse.de]
> > Sent: 05 September 2017 13:51
> > To: Tamar Christina
> > Cc: Andrew Pinski; Andreas Schwab; Jon Beniston; gcc-patches@gcc.gnu.org;
> > nd
> > Subject: RE: [RFC, vectorizer] Allow single element vector types for vector
> > reduction operations
> > 
> > On Tue, 5 Sep 2017, Richard Biener wrote:
> > 
> > > On Tue, 5 Sep 2017, Tamar Christina wrote:
> > >
> > > > Hi Richard,
> > > >
> > > > That was an really interesting analysis, thanks for the details!
> > > >
> > > > Would you be submitting the patch you proposed at the end as a fix?
> > >
> > > I'm testing it currently.
> > 
> > Unfortunately it breaks some required lowering.  I'll have to more closely
> > look at this.
> 
> Ah, ok. In the meantime, can this patch be reverted? It's currently breaking spec for us so we're
> Not able to get any benchmarking numbers.

Testing the following instead:

Index: gcc/tree-vect-generic.c
===================================================================
--- gcc/tree-vect-generic.c     (revision 251642)
+++ gcc/tree-vect-generic.c     (working copy)
@@ -1640,7 +1640,7 @@ expand_vector_operations_1 (gimple_stmt_
       || code == VEC_UNPACK_FLOAT_LO_EXPR)
     type = TREE_TYPE (rhs1);
 
-  /* For widening/narrowing vector operations, the relevant type is of 
the
+  /* For widening vector operations, the relevant type is of the
      arguments, not the widened result.  VEC_UNPACK_FLOAT_*_EXPR is
      calculated in the same way above.  */
   if (code == WIDEN_SUM_EXPR
@@ -1650,9 +1650,6 @@ expand_vector_operations_1 (gimple_stmt_
       || code == VEC_WIDEN_MULT_ODD_EXPR
       || code == VEC_UNPACK_HI_EXPR
       || code == VEC_UNPACK_LO_EXPR
-      || code == VEC_PACK_TRUNC_EXPR
-      || code == VEC_PACK_SAT_EXPR
-      || code == VEC_PACK_FIX_TRUNC_EXPR
       || code == VEC_WIDEN_LSHIFT_HI_EXPR
       || code == VEC_WIDEN_LSHIFT_LO_EXPR)
     type = TREE_TYPE (rhs1);


also fix for a bug uncovered by the previous one:

Index: gcc/gimple-ssa-strength-reduction.c
===================================================================
--- gcc/gimple-ssa-strength-reduction.c (revision 251710)
+++ gcc/gimple-ssa-strength-reduction.c (working copy)
@@ -1742,8 +1742,7 @@ find_candidates_dom_walker::before_dom_c
        slsr_process_ref (gs);
 
       else if (is_gimple_assign (gs)
-              && SCALAR_INT_MODE_P
-                   (TYPE_MODE (TREE_TYPE (gimple_assign_lhs (gs)))))
+              && INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_lhs (gs))))
        {
          tree rhs1 = NULL_TREE, rhs2 = NULL_TREE;
 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: [RFC, vectorizer] Allow single element vector types for vector reduction operations
  2017-09-05 13:12                                 ` Richard Biener
@ 2017-09-05 14:00                                   ` Tamar Christina
  2017-09-11 15:39                                   ` Vidya Praveen
  1 sibling, 0 replies; 33+ messages in thread
From: Tamar Christina @ 2017-09-05 14:00 UTC (permalink / raw)
  To: Richard Biener
  Cc: Andrew Pinski, Andreas Schwab, Jon Beniston, gcc-patches, nd



> -----Original Message-----
> From: Richard Biener [mailto:rguenther@suse.de]
> Sent: 05 September 2017 14:13
> To: Tamar Christina
> Cc: Andrew Pinski; Andreas Schwab; Jon Beniston; gcc-patches@gcc.gnu.org;
> nd
> Subject: RE: [RFC, vectorizer] Allow single element vector types for vector
> reduction operations
> 
> On Tue, 5 Sep 2017, Tamar Christina wrote:
> 
> >
> >
> > > -----Original Message-----
> > > From: Richard Biener [mailto:rguenther@suse.de]
> > > Sent: 05 September 2017 13:51
> > > To: Tamar Christina
> > > Cc: Andrew Pinski; Andreas Schwab; Jon Beniston;
> > > gcc-patches@gcc.gnu.org; nd
> > > Subject: RE: [RFC, vectorizer] Allow single element vector types for
> > > vector reduction operations
> > >
> > > On Tue, 5 Sep 2017, Richard Biener wrote:
> > >
> > > > On Tue, 5 Sep 2017, Tamar Christina wrote:
> > > >
> > > > > Hi Richard,
> > > > >
> > > > > That was an really interesting analysis, thanks for the details!
> > > > >
> > > > > Would you be submitting the patch you proposed at the end as a fix?
> > > >
> > > > I'm testing it currently.
> > >
> > > Unfortunately it breaks some required lowering.  I'll have to more
> > > closely look at this.
> >
> > Ah, ok. In the meantime, can this patch be reverted? It's currently
> > breaking spec for us so we're Not able to get any benchmarking numbers.
> 
> Testing the following instead:

That does seem to build spec again, haven't tested the testsuite yet. 

> Index: gcc/tree-vect-generic.c
> ==========================================================
> =========
> --- gcc/tree-vect-generic.c     (revision 251642)
> +++ gcc/tree-vect-generic.c     (working copy)
> @@ -1640,7 +1640,7 @@ expand_vector_operations_1 (gimple_stmt_
>        || code == VEC_UNPACK_FLOAT_LO_EXPR)
>      type = TREE_TYPE (rhs1);
> 
> -  /* For widening/narrowing vector operations, the relevant type is of the
> +  /* For widening vector operations, the relevant type is of the
>       arguments, not the widened result.  VEC_UNPACK_FLOAT_*_EXPR is
>       calculated in the same way above.  */
>    if (code == WIDEN_SUM_EXPR
> @@ -1650,9 +1650,6 @@ expand_vector_operations_1 (gimple_stmt_
>        || code == VEC_WIDEN_MULT_ODD_EXPR
>        || code == VEC_UNPACK_HI_EXPR
>        || code == VEC_UNPACK_LO_EXPR
> -      || code == VEC_PACK_TRUNC_EXPR
> -      || code == VEC_PACK_SAT_EXPR
> -      || code == VEC_PACK_FIX_TRUNC_EXPR
>        || code == VEC_WIDEN_LSHIFT_HI_EXPR
>        || code == VEC_WIDEN_LSHIFT_LO_EXPR)
>      type = TREE_TYPE (rhs1);
> 
> 
> also fix for a bug uncovered by the previous one:
> 
> Index: gcc/gimple-ssa-strength-reduction.c
> ==========================================================
> =========
> --- gcc/gimple-ssa-strength-reduction.c (revision 251710)
> +++ gcc/gimple-ssa-strength-reduction.c (working copy)
> @@ -1742,8 +1742,7 @@ find_candidates_dom_walker::before_dom_c
>         slsr_process_ref (gs);
> 
>        else if (is_gimple_assign (gs)
> -              && SCALAR_INT_MODE_P
> -                   (TYPE_MODE (TREE_TYPE (gimple_assign_lhs (gs)))))
> +              && INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_lhs (gs))))
>         {
>           tree rhs1 = NULL_TREE, rhs2 = NULL_TREE;
> 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC, vectorizer] Allow single element vector types for vector reduction operations
  2017-08-28  8:22 [RFC, vectorizer] Allow single element vector types for vector reduction operations Jon Beniston
  2017-08-28  8:29 ` Richard Biener
@ 2017-09-07 11:08 ` Bernd Schmidt
  2017-09-07 11:20   ` Jon Beniston
  1 sibling, 1 reply; 33+ messages in thread
From: Bernd Schmidt @ 2017-09-07 11:08 UTC (permalink / raw)
  To: Jon Beniston, gcc-patches; +Cc: rguenther

On 08/27/2017 09:36 PM, Jon Beniston wrote:
> I have an out-of-tree GCC port and it is struggling supporting
> auto-vectorization on some dot product instructions.  For example, I have an
> instruction that takes three operands which are all 32-bit general
> registers. The second and third operands will be treated as V2HI then do dot
> product, and then generate an SI result which is then added to the first
> operand which is SI as well.

This seems to ring a bell. I think I submitted a patch for this here -
is this the same problem?
https://gcc.gnu.org/ml/gcc-patches/2010-12/msg01724.html


Bernd

^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: [RFC, vectorizer] Allow single element vector types for vector reduction operations
  2017-09-07 11:08 ` Bernd Schmidt
@ 2017-09-07 11:20   ` Jon Beniston
  2017-09-07 14:42     ` Richard Biener
  0 siblings, 1 reply; 33+ messages in thread
From: Jon Beniston @ 2017-09-07 11:20 UTC (permalink / raw)
  To: 'Bernd Schmidt', gcc-patches; +Cc: rguenther

Hi Bernd,

>This seems to ring a bell. I think I submitted a patch for this here 
>- is this the same problem?
>https://gcc.gnu.org/ml/gcc-patches/2010-12/msg01724.html

Looks like it.

Cheers,
Jon


^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: [RFC, vectorizer] Allow single element vector types for vector reduction operations
  2017-09-07 11:20   ` Jon Beniston
@ 2017-09-07 14:42     ` Richard Biener
  0 siblings, 0 replies; 33+ messages in thread
From: Richard Biener @ 2017-09-07 14:42 UTC (permalink / raw)
  To: Jon Beniston, 'Bernd Schmidt', gcc-patches

On September 7, 2017 1:20:13 PM GMT+02:00, Jon Beniston <jon@beniston.com> wrote:
>Hi Bernd,
>
>>This seems to ring a bell. I think I submitted a patch for this here 
>>- is this the same problem?
>>https://gcc.gnu.org/ml/gcc-patches/2010-12/msg01724.html
>
>Looks like it.

Looks my suggestion was exactly what we have now. Note I didn't yet manage to do minimal surgery to vector lowering to fix the ARM fallout... 

Richard. 

>Cheers,
>Jon

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC, vectorizer] Allow single element vector types for vector reduction operations
  2017-09-05 13:12                                 ` Richard Biener
  2017-09-05 14:00                                   ` Tamar Christina
@ 2017-09-11 15:39                                   ` Vidya Praveen
  2017-09-12  7:13                                     ` Richard Biener
  1 sibling, 1 reply; 33+ messages in thread
From: Vidya Praveen @ 2017-09-11 15:39 UTC (permalink / raw)
  To: Richard Biener
  Cc: Tamar Christina, Andrew Pinski, Andreas Schwab, Jon Beniston,
	gcc-patches, nd

On Tue, Sep 05, 2017 at 03:12:47PM +0200, Richard Biener wrote:
> On Tue, 5 Sep 2017, Tamar Christina wrote:
> 
> > 
> > 
> > > -----Original Message-----
> > > From: Richard Biener [mailto:rguenther@suse.de]
> > > Sent: 05 September 2017 13:51
> > > To: Tamar Christina
> > > Cc: Andrew Pinski; Andreas Schwab; Jon Beniston; gcc-patches@gcc.gnu.org;
> > > nd
> > > Subject: RE: [RFC, vectorizer] Allow single element vector types for vector
> > > reduction operations
> > > 
> > > On Tue, 5 Sep 2017, Richard Biener wrote:
> > > 
> > > > On Tue, 5 Sep 2017, Tamar Christina wrote:
> > > >
> > > > > Hi Richard,
> > > > >
> > > > > That was an really interesting analysis, thanks for the details!
> > > > >
> > > > > Would you be submitting the patch you proposed at the end as a fix?
> > > >
> > > > I'm testing it currently.
> > > 
> > > Unfortunately it breaks some required lowering.  I'll have to more closely
> > > look at this.
> > 
> > Ah, ok. In the meantime, can this patch be reverted? It's currently breaking spec for us so we're
> > Not able to get any benchmarking numbers.
> 
> Testing the following instead:

Any news on this?

VP.


> 
> Index: gcc/tree-vect-generic.c
> ===================================================================
> --- gcc/tree-vect-generic.c     (revision 251642)
> +++ gcc/tree-vect-generic.c     (working copy)
> @@ -1640,7 +1640,7 @@ expand_vector_operations_1 (gimple_stmt_
>        || code == VEC_UNPACK_FLOAT_LO_EXPR)
>      type = TREE_TYPE (rhs1);
>  
> -  /* For widening/narrowing vector operations, the relevant type is of 
> the
> +  /* For widening vector operations, the relevant type is of the
>       arguments, not the widened result.  VEC_UNPACK_FLOAT_*_EXPR is
>       calculated in the same way above.  */
>    if (code == WIDEN_SUM_EXPR
> @@ -1650,9 +1650,6 @@ expand_vector_operations_1 (gimple_stmt_
>        || code == VEC_WIDEN_MULT_ODD_EXPR
>        || code == VEC_UNPACK_HI_EXPR
>        || code == VEC_UNPACK_LO_EXPR
> -      || code == VEC_PACK_TRUNC_EXPR
> -      || code == VEC_PACK_SAT_EXPR
> -      || code == VEC_PACK_FIX_TRUNC_EXPR
>        || code == VEC_WIDEN_LSHIFT_HI_EXPR
>        || code == VEC_WIDEN_LSHIFT_LO_EXPR)
>      type = TREE_TYPE (rhs1);
> 
> 
> also fix for a bug uncovered by the previous one:
> 
> Index: gcc/gimple-ssa-strength-reduction.c
> ===================================================================
> --- gcc/gimple-ssa-strength-reduction.c (revision 251710)
> +++ gcc/gimple-ssa-strength-reduction.c (working copy)
> @@ -1742,8 +1742,7 @@ find_candidates_dom_walker::before_dom_c
>         slsr_process_ref (gs);
>  
>        else if (is_gimple_assign (gs)
> -              && SCALAR_INT_MODE_P
> -                   (TYPE_MODE (TREE_TYPE (gimple_assign_lhs (gs)))))
> +              && INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_lhs (gs))))
>         {
>           tree rhs1 = NULL_TREE, rhs2 = NULL_TREE;
>  
> 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC, vectorizer] Allow single element vector types for vector reduction operations
  2017-09-11 15:39                                   ` Vidya Praveen
@ 2017-09-12  7:13                                     ` Richard Biener
  2017-09-12  8:45                                       ` Tamar Christina
  0 siblings, 1 reply; 33+ messages in thread
From: Richard Biener @ 2017-09-12  7:13 UTC (permalink / raw)
  To: Vidya Praveen
  Cc: Tamar Christina, Andrew Pinski, Andreas Schwab, Jon Beniston,
	gcc-patches, nd

On Mon, 11 Sep 2017, Vidya Praveen wrote:

> On Tue, Sep 05, 2017 at 03:12:47PM +0200, Richard Biener wrote:
> > On Tue, 5 Sep 2017, Tamar Christina wrote:
> > 
> > > 
> > > 
> > > > -----Original Message-----
> > > > From: Richard Biener [mailto:rguenther@suse.de]
> > > > Sent: 05 September 2017 13:51
> > > > To: Tamar Christina
> > > > Cc: Andrew Pinski; Andreas Schwab; Jon Beniston; gcc-patches@gcc.gnu.org;
> > > > nd
> > > > Subject: RE: [RFC, vectorizer] Allow single element vector types for vector
> > > > reduction operations
> > > > 
> > > > On Tue, 5 Sep 2017, Richard Biener wrote:
> > > > 
> > > > > On Tue, 5 Sep 2017, Tamar Christina wrote:
> > > > >
> > > > > > Hi Richard,
> > > > > >
> > > > > > That was an really interesting analysis, thanks for the details!
> > > > > >
> > > > > > Would you be submitting the patch you proposed at the end as a fix?
> > > > >
> > > > > I'm testing it currently.
> > > > 
> > > > Unfortunately it breaks some required lowering.  I'll have to more closely
> > > > look at this.
> > > 
> > > Ah, ok. In the meantime, can this patch be reverted? It's currently breaking spec for us so we're
> > > Not able to get any benchmarking numbers.
> > 
> > Testing the following instead:
> 
> Any news on this?

Didn't work out as expected.  I think the logic in tree-vect-generic
is the one to be fixed but I have to carve out some time to look
into it so stay tuned.

Richard.

> VP.
> 
> 
> > 
> > Index: gcc/tree-vect-generic.c
> > ===================================================================
> > --- gcc/tree-vect-generic.c     (revision 251642)
> > +++ gcc/tree-vect-generic.c     (working copy)
> > @@ -1640,7 +1640,7 @@ expand_vector_operations_1 (gimple_stmt_
> >        || code == VEC_UNPACK_FLOAT_LO_EXPR)
> >      type = TREE_TYPE (rhs1);
> >  
> > -  /* For widening/narrowing vector operations, the relevant type is of 
> > the
> > +  /* For widening vector operations, the relevant type is of the
> >       arguments, not the widened result.  VEC_UNPACK_FLOAT_*_EXPR is
> >       calculated in the same way above.  */
> >    if (code == WIDEN_SUM_EXPR
> > @@ -1650,9 +1650,6 @@ expand_vector_operations_1 (gimple_stmt_
> >        || code == VEC_WIDEN_MULT_ODD_EXPR
> >        || code == VEC_UNPACK_HI_EXPR
> >        || code == VEC_UNPACK_LO_EXPR
> > -      || code == VEC_PACK_TRUNC_EXPR
> > -      || code == VEC_PACK_SAT_EXPR
> > -      || code == VEC_PACK_FIX_TRUNC_EXPR
> >        || code == VEC_WIDEN_LSHIFT_HI_EXPR
> >        || code == VEC_WIDEN_LSHIFT_LO_EXPR)
> >      type = TREE_TYPE (rhs1);
> > 
> > 
> > also fix for a bug uncovered by the previous one:
> > 
> > Index: gcc/gimple-ssa-strength-reduction.c
> > ===================================================================
> > --- gcc/gimple-ssa-strength-reduction.c (revision 251710)
> > +++ gcc/gimple-ssa-strength-reduction.c (working copy)
> > @@ -1742,8 +1742,7 @@ find_candidates_dom_walker::before_dom_c
> >         slsr_process_ref (gs);
> >  
> >        else if (is_gimple_assign (gs)
> > -              && SCALAR_INT_MODE_P
> > -                   (TYPE_MODE (TREE_TYPE (gimple_assign_lhs (gs)))))
> > +              && INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_lhs (gs))))
> >         {
> >           tree rhs1 = NULL_TREE, rhs2 = NULL_TREE;
> >  
> > 
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: [RFC, vectorizer] Allow single element vector types for vector reduction operations
  2017-09-12  7:13                                     ` Richard Biener
@ 2017-09-12  8:45                                       ` Tamar Christina
  2017-09-12  9:03                                         ` Richard Biener
  0 siblings, 1 reply; 33+ messages in thread
From: Tamar Christina @ 2017-09-12  8:45 UTC (permalink / raw)
  To: Richard Biener, Vidya Praveen
  Cc: Andrew Pinski, Andreas Schwab, Jon Beniston, gcc-patches, nd

Hi Jon, Richard,

> > >
> > > Testing the following instead:
> >
> > Any news on this?
> 
> Didn't work out as expected.  I think the logic in tree-vect-generic is the one
> to be fixed but I have to carve out some time to look into it so stay tuned.

In the meantime can this patch be reverted? It is causing ICEs on AArch64 and ARM
Preventing us from building benchmarks and tracking the toolchain performance.

Thanks,
Tamar

> 
> Richard.
> 
> > VP.
> >
> >
> > >
> > > Index: gcc/tree-vect-generic.c
> > >
> ==========================================================
> =========
> > > --- gcc/tree-vect-generic.c     (revision 251642)
> > > +++ gcc/tree-vect-generic.c     (working copy)
> > > @@ -1640,7 +1640,7 @@ expand_vector_operations_1 (gimple_stmt_
> > >        || code == VEC_UNPACK_FLOAT_LO_EXPR)
> > >      type = TREE_TYPE (rhs1);
> > >
> > > -  /* For widening/narrowing vector operations, the relevant type is
> > > of the
> > > +  /* For widening vector operations, the relevant type is of the
> > >       arguments, not the widened result.  VEC_UNPACK_FLOAT_*_EXPR is
> > >       calculated in the same way above.  */
> > >    if (code == WIDEN_SUM_EXPR
> > > @@ -1650,9 +1650,6 @@ expand_vector_operations_1 (gimple_stmt_
> > >        || code == VEC_WIDEN_MULT_ODD_EXPR
> > >        || code == VEC_UNPACK_HI_EXPR
> > >        || code == VEC_UNPACK_LO_EXPR
> > > -      || code == VEC_PACK_TRUNC_EXPR
> > > -      || code == VEC_PACK_SAT_EXPR
> > > -      || code == VEC_PACK_FIX_TRUNC_EXPR
> > >        || code == VEC_WIDEN_LSHIFT_HI_EXPR
> > >        || code == VEC_WIDEN_LSHIFT_LO_EXPR)
> > >      type = TREE_TYPE (rhs1);
> > >
> > >
> > > also fix for a bug uncovered by the previous one:
> > >
> > > Index: gcc/gimple-ssa-strength-reduction.c
> > >
> ==========================================================
> =========
> > > --- gcc/gimple-ssa-strength-reduction.c (revision 251710)
> > > +++ gcc/gimple-ssa-strength-reduction.c (working copy)
> > > @@ -1742,8 +1742,7 @@ find_candidates_dom_walker::before_dom_c
> > >         slsr_process_ref (gs);
> > >
> > >        else if (is_gimple_assign (gs)
> > > -              && SCALAR_INT_MODE_P
> > > -                   (TYPE_MODE (TREE_TYPE (gimple_assign_lhs (gs)))))
> > > +              && INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_lhs
> > > + (gs))))
> > >         {
> > >           tree rhs1 = NULL_TREE, rhs2 = NULL_TREE;
> > >
> > >
> >
> >
> 
> --
> Richard Biener <rguenther@suse.de>
> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton,
> HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: [RFC, vectorizer] Allow single element vector types for vector reduction operations
  2017-09-12  8:45                                       ` Tamar Christina
@ 2017-09-12  9:03                                         ` Richard Biener
  2017-09-12 10:50                                           ` Richard Biener
  0 siblings, 1 reply; 33+ messages in thread
From: Richard Biener @ 2017-09-12  9:03 UTC (permalink / raw)
  To: Tamar Christina
  Cc: Vidya Praveen, Andrew Pinski, Andreas Schwab, Jon Beniston,
	gcc-patches, nd

On Tue, 12 Sep 2017, Tamar Christina wrote:

> Hi Jon, Richard,
> 
> > > >
> > > > Testing the following instead:
> > >
> > > Any news on this?
> > 
> > Didn't work out as expected.  I think the logic in tree-vect-generic is the one
> > to be fixed but I have to carve out some time to look into it so stay tuned.
> 
> In the meantime can this patch be reverted? It is causing ICEs on 
> AArch64 and ARM Preventing us from building benchmarks and tracking the 
> toolchain performance.

You can always locally patch things.  I have yet another patch in
testing right now.

Richard.

> Thanks,
> Tamar
> 
> > 
> > Richard.
> > 
> > > VP.
> > >
> > >
> > > >
> > > > Index: gcc/tree-vect-generic.c
> > > >
> > ==========================================================
> > =========
> > > > --- gcc/tree-vect-generic.c     (revision 251642)
> > > > +++ gcc/tree-vect-generic.c     (working copy)
> > > > @@ -1640,7 +1640,7 @@ expand_vector_operations_1 (gimple_stmt_
> > > >        || code == VEC_UNPACK_FLOAT_LO_EXPR)
> > > >      type = TREE_TYPE (rhs1);
> > > >
> > > > -  /* For widening/narrowing vector operations, the relevant type is
> > > > of the
> > > > +  /* For widening vector operations, the relevant type is of the
> > > >       arguments, not the widened result.  VEC_UNPACK_FLOAT_*_EXPR is
> > > >       calculated in the same way above.  */
> > > >    if (code == WIDEN_SUM_EXPR
> > > > @@ -1650,9 +1650,6 @@ expand_vector_operations_1 (gimple_stmt_
> > > >        || code == VEC_WIDEN_MULT_ODD_EXPR
> > > >        || code == VEC_UNPACK_HI_EXPR
> > > >        || code == VEC_UNPACK_LO_EXPR
> > > > -      || code == VEC_PACK_TRUNC_EXPR
> > > > -      || code == VEC_PACK_SAT_EXPR
> > > > -      || code == VEC_PACK_FIX_TRUNC_EXPR
> > > >        || code == VEC_WIDEN_LSHIFT_HI_EXPR
> > > >        || code == VEC_WIDEN_LSHIFT_LO_EXPR)
> > > >      type = TREE_TYPE (rhs1);
> > > >
> > > >
> > > > also fix for a bug uncovered by the previous one:
> > > >
> > > > Index: gcc/gimple-ssa-strength-reduction.c
> > > >
> > ==========================================================
> > =========
> > > > --- gcc/gimple-ssa-strength-reduction.c (revision 251710)
> > > > +++ gcc/gimple-ssa-strength-reduction.c (working copy)
> > > > @@ -1742,8 +1742,7 @@ find_candidates_dom_walker::before_dom_c
> > > >         slsr_process_ref (gs);
> > > >
> > > >        else if (is_gimple_assign (gs)
> > > > -              && SCALAR_INT_MODE_P
> > > > -                   (TYPE_MODE (TREE_TYPE (gimple_assign_lhs (gs)))))
> > > > +              && INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_lhs
> > > > + (gs))))
> > > >         {
> > > >           tree rhs1 = NULL_TREE, rhs2 = NULL_TREE;
> > > >
> > > >
> > >
> > >
> > 
> > --
> > Richard Biener <rguenther@suse.de>
> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton,
> > HRB 21284 (AG Nuernberg)
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: [RFC, vectorizer] Allow single element vector types for vector reduction operations
  2017-09-12  9:03                                         ` Richard Biener
@ 2017-09-12 10:50                                           ` Richard Biener
  2017-09-12 16:26                                             ` Andreas Schwab
  0 siblings, 1 reply; 33+ messages in thread
From: Richard Biener @ 2017-09-12 10:50 UTC (permalink / raw)
  To: Tamar Christina
  Cc: Vidya Praveen, Andrew Pinski, Andreas Schwab, Jon Beniston,
	gcc-patches, nd

On Tue, 12 Sep 2017, Richard Biener wrote:

> On Tue, 12 Sep 2017, Tamar Christina wrote:
> 
> > Hi Jon, Richard,
> > 
> > > > >
> > > > > Testing the following instead:
> > > >
> > > > Any news on this?
> > > 
> > > Didn't work out as expected.  I think the logic in tree-vect-generic is the one
> > > to be fixed but I have to carve out some time to look into it so stay tuned.
> > 
> > In the meantime can this patch be reverted? It is causing ICEs on 
> > AArch64 and ARM Preventing us from building benchmarks and tracking the 
> > toolchain performance.
> 
> You can always locally patch things.  I have yet another patch in
> testing right now.

I've installed the following.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

Richard.

2017-09-12  Richard Biener  <rguenther@suse.de>

	* tree-vect-generic.c (expand_vector_operations_1): Do nothing
	for operations we cannot scalarize.

Index: gcc/tree-vect-generic.c
===================================================================
--- gcc/tree-vect-generic.c	(revision 251997)
+++ gcc/tree-vect-generic.c	(working copy)
@@ -1638,7 +1638,11 @@ expand_vector_operations_1 (gimple_stmt_
   /* The signedness is determined from input argument.  */
   if (code == VEC_UNPACK_FLOAT_HI_EXPR
       || code == VEC_UNPACK_FLOAT_LO_EXPR)
-    type = TREE_TYPE (rhs1);
+    {
+      type = TREE_TYPE (rhs1);
+      /* We do not know how to scalarize those.  */
+      return;
+    }
 
   /* For widening/narrowing vector operations, the relevant type is of the
      arguments, not the widened result.  VEC_UNPACK_FLOAT_*_EXPR is
@@ -1655,7 +1659,11 @@ expand_vector_operations_1 (gimple_stmt_
       || code == VEC_PACK_FIX_TRUNC_EXPR
       || code == VEC_WIDEN_LSHIFT_HI_EXPR
       || code == VEC_WIDEN_LSHIFT_LO_EXPR)
-    type = TREE_TYPE (rhs1);
+    {
+      type = TREE_TYPE (rhs1);
+      /* We do not know how to scalarize those.  */
+      return;
+    }
 
   /* Choose between vector shift/rotate by vector and vector shift/rotate by
      scalar */

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC, vectorizer] Allow single element vector types for vector reduction operations
  2017-09-12 10:50                                           ` Richard Biener
@ 2017-09-12 16:26                                             ` Andreas Schwab
  2017-09-12 16:36                                               ` Richard Biener
  0 siblings, 1 reply; 33+ messages in thread
From: Andreas Schwab @ 2017-09-12 16:26 UTC (permalink / raw)
  To: Richard Biener
  Cc: Tamar Christina, Vidya Praveen, Andrew Pinski, Jon Beniston,
	gcc-patches, nd

On Sep 12 2017, Richard Biener <rguenther@suse.de> wrote:

> Index: gcc/tree-vect-generic.c
> ===================================================================
> --- gcc/tree-vect-generic.c	(revision 251997)
> +++ gcc/tree-vect-generic.c	(working copy)
> @@ -1638,7 +1638,11 @@ expand_vector_operations_1 (gimple_stmt_
>    /* The signedness is determined from input argument.  */
>    if (code == VEC_UNPACK_FLOAT_HI_EXPR
>        || code == VEC_UNPACK_FLOAT_LO_EXPR)
> -    type = TREE_TYPE (rhs1);
> +    {
> +      type = TREE_TYPE (rhs1);
> +      /* We do not know how to scalarize those.  */
> +      return;

The assignment is surely useless.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC, vectorizer] Allow single element vector types for vector reduction operations
  2017-09-12 16:26                                             ` Andreas Schwab
@ 2017-09-12 16:36                                               ` Richard Biener
  2017-09-12 17:13                                                 ` Tamar Christina
  0 siblings, 1 reply; 33+ messages in thread
From: Richard Biener @ 2017-09-12 16:36 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Tamar Christina, Vidya Praveen, Andrew Pinski, Jon Beniston,
	gcc-patches, nd

On September 12, 2017 6:25:32 PM GMT+02:00, Andreas Schwab <schwab@linux-m68k.org> wrote:
>On Sep 12 2017, Richard Biener <rguenther@suse.de> wrote:
>
>> Index: gcc/tree-vect-generic.c
>> ===================================================================
>> --- gcc/tree-vect-generic.c	(revision 251997)
>> +++ gcc/tree-vect-generic.c	(working copy)
>> @@ -1638,7 +1638,11 @@ expand_vector_operations_1 (gimple_stmt_
>>    /* The signedness is determined from input argument.  */
>>    if (code == VEC_UNPACK_FLOAT_HI_EXPR
>>        || code == VEC_UNPACK_FLOAT_LO_EXPR)
>> -    type = TREE_TYPE (rhs1);
>> +    {
>> +      type = TREE_TYPE (rhs1);
>> +      /* We do not know how to scalarize those.  */
>> +      return;
>
>The assignment is surely useless.

Sure. I left it because all this needs further cleanup. 

Richard. 

>Andreas.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC, vectorizer] Allow single element vector types for vector reduction operations
  2017-09-12 16:36                                               ` Richard Biener
@ 2017-09-12 17:13                                                 ` Tamar Christina
  0 siblings, 0 replies; 33+ messages in thread
From: Tamar Christina @ 2017-09-12 17:13 UTC (permalink / raw)
  To: Richard Biener, Andreas Schwab
  Cc: Vidya Praveen, Andrew Pinski, Jon Beniston, gcc-patches, nd

I've been able to build spec again with this patch.

Thanks Richard!

Tamar.
________________________________________
From: Richard Biener <rguenther@suse.de>
Sent: Tuesday, September 12, 2017 5:36:10 PM
To: Andreas Schwab
Cc: Tamar Christina; Vidya Praveen; Andrew Pinski; Jon Beniston; gcc-patches@gcc.gnu.org; nd
Subject: Re: [RFC, vectorizer] Allow single element vector types for vector reduction operations

On September 12, 2017 6:25:32 PM GMT+02:00, Andreas Schwab <schwab@linux-m68k.org> wrote:
>On Sep 12 2017, Richard Biener <rguenther@suse.de> wrote:
>
>> Index: gcc/tree-vect-generic.c
>> ===================================================================
>> --- gcc/tree-vect-generic.c  (revision 251997)
>> +++ gcc/tree-vect-generic.c  (working copy)
>> @@ -1638,7 +1638,11 @@ expand_vector_operations_1 (gimple_stmt_
>>    /* The signedness is determined from input argument.  */
>>    if (code == VEC_UNPACK_FLOAT_HI_EXPR
>>        || code == VEC_UNPACK_FLOAT_LO_EXPR)
>> -    type = TREE_TYPE (rhs1);
>> +    {
>> +      type = TREE_TYPE (rhs1);
>> +      /* We do not know how to scalarize those.  */
>> +      return;
>
>The assignment is surely useless.

Sure. I left it because all this needs further cleanup.

Richard.

>Andreas.

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2017-09-12 17:13 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-28  8:22 [RFC, vectorizer] Allow single element vector types for vector reduction operations Jon Beniston
2017-08-28  8:29 ` Richard Biener
2017-08-28  9:04   ` Jon Beniston
2017-08-28 11:41     ` Richard Biener
2017-08-30 11:13       ` Jon Beniston
2017-08-30 12:28         ` Richard Biener
2017-08-30 15:52           ` Jon Beniston
2017-08-30 16:32             ` Richard Biener
2017-08-30 20:42             ` Richard Sandiford
2017-09-02 21:58             ` Andreas Schwab
2017-09-04  8:58               ` Tamar Christina
2017-09-04  9:19                 ` Richard Biener
2017-09-04 14:28                   ` Tamar Christina
2017-09-04 16:48                     ` Andrew Pinski
2017-09-05  9:37                       ` Richard Biener
2017-09-05 10:24                         ` Tamar Christina
2017-09-05 10:41                           ` Richard Biener
2017-09-05 12:50                             ` Richard Biener
2017-09-05 13:06                               ` Tamar Christina
2017-09-05 13:12                                 ` Richard Biener
2017-09-05 14:00                                   ` Tamar Christina
2017-09-11 15:39                                   ` Vidya Praveen
2017-09-12  7:13                                     ` Richard Biener
2017-09-12  8:45                                       ` Tamar Christina
2017-09-12  9:03                                         ` Richard Biener
2017-09-12 10:50                                           ` Richard Biener
2017-09-12 16:26                                             ` Andreas Schwab
2017-09-12 16:36                                               ` Richard Biener
2017-09-12 17:13                                                 ` Tamar Christina
2017-09-04 15:04             ` Christophe Lyon
2017-09-07 11:08 ` Bernd Schmidt
2017-09-07 11:20   ` Jon Beniston
2017-09-07 14:42     ` Richard Biener

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).