public inbox for gcc-patches@gcc.gnu.org
* [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns
@ 2022-05-25  9:11 Joel Hutton
  2022-05-27 13:23 ` Richard Biener
  0 siblings, 1 reply; 53+ messages in thread
From: Joel Hutton @ 2022-05-25  9:11 UTC
  To: Richard Sandiford; +Cc: Richard Biener, gcc-patches

Ping!

Just checking there is still interest in this. I'm assuming you've been busy with release.

Joel

> -----Original Message-----
> From: Joel Hutton
> Sent: 13 April 2022 16:53
> To: Richard Sandiford <richard.sandiford@arm.com>
> Cc: Richard Biener <rguenther@suse.de>; gcc-patches@gcc.gnu.org
> Subject: [vect-patterns] Refactor widen_plus/widen_minus as internal_fns
> 
> Hi all,
> 
> These patches refactor the widening patterns in vect-patterns to use
> internal_fn instead of tree_codes.
> 
> Sorry about the delay, some changes to master made it a bit messier.
> 
> Bootstrapped and regression tested on aarch64.
> 
> Joel
> 
> > > diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
> > > index 854cbcff390..4a8ea67e62f 100644
> > > --- a/gcc/tree-vect-patterns.c
> > > +++ b/gcc/tree-vect-patterns.c
> > > @@ -1245,7 +1245,7 @@ vect_recog_sad_pattern (vec_info *vinfo,
> > >  static gimple *
> > >  vect_recog_widen_op_pattern (vec_info *vinfo,
> > >  			     stmt_vec_info last_stmt_info, tree *type_out,
> > > -			     tree_code orig_code, tree_code wide_code,
> > > +			     tree_code orig_code, code_helper wide_code_or_ifn,
> >
> > I think it'd be better to keep the original “wide_code” name and try
> > to remove as many places as possible in which switching based on
> > tree_code or internal_fn is necessary.  The recent gimple-match.h
> > patches should help with that, but more routines might be needed.
> 
> Done.
> 
> > > @@ -1309,8 +1310,16 @@ vect_recog_widen_op_pattern (vec_info *vinfo,
> > >  		       2, oprnd, half_type, unprom, vectype);
> > >
> > >    tree var = vect_recog_temp_ssa_var (itype, NULL);
> > > -  gimple *pattern_stmt = gimple_build_assign (var, wide_code,
> > > -					      oprnd[0], oprnd[1]);
> > > +  gimple *pattern_stmt;
> > > +  if (wide_code_or_ifn.is_tree_code ())
> > > +    pattern_stmt = gimple_build_assign (var, wide_code_or_ifn,
> > > +						oprnd[0], oprnd[1]);
> > > +  else
> > > +    {
> > > +      internal_fn fn = as_internal_fn ((combined_fn) wide_code_or_ifn);
> > > +      pattern_stmt = gimple_build_call_internal (fn, 2, oprnd[0], oprnd[1]);
> > > +      gimple_call_set_lhs (pattern_stmt, var);
> > > +    }
> >
> > For example, I think we should hide this inside a new:
> >
> >   gimple_build (var, wide_code, oprnd[0], oprnd[1]);
> >
> > that works directly on code_helper, similarly to the new code_helper
> > gimple_build interfaces.
> 
> Done.
> 
> > > @@ -4513,14 +4513,16 @@ vect_gen_widened_results_half (vec_info *vinfo, enum tree_code code,
> > >    tree new_temp;
> > >
> > >    /* Generate half of the widened result:  */
> > > -  gcc_assert (op_type == TREE_CODE_LENGTH (code));
> > >    if (op_type != binary_op)
> > >      vec_oprnd1 = NULL;
> > > -  new_stmt = gimple_build_assign (vec_dest, code, vec_oprnd0, vec_oprnd1);
> > > +  if (ch.is_tree_code ())
> > > +    new_stmt = gimple_build_assign (vec_dest, ch, vec_oprnd0, vec_oprnd1);
> > > +  else
> > > +    new_stmt = gimple_build_call_internal (as_internal_fn ((combined_fn) ch),
> > > +					   2, vec_oprnd0, vec_oprnd1);
> >
> > > Similarly here.  I guess the combined_fn/internal_fn path will also
> > > need to cope with null trailing operands, for consistency with the
> > > tree_code one.
> >
> 
> Done.
> 
> > > @@ -4744,31 +4747,28 @@ vectorizable_conversion (vec_info *vinfo,
> > >        && ! vec_stmt)
> > >      return false;
> > >
> > > -  gassign *stmt = dyn_cast <gassign *> (stmt_info->stmt);
> > > -  if (!stmt)
> > > +  gimple* stmt = stmt_info->stmt;
> > > +  if (!(is_gimple_assign (stmt) || is_gimple_call (stmt)))
> > >      return false;
> > >
> > > -  if (TREE_CODE (gimple_assign_lhs (stmt)) != SSA_NAME)
> > > -    return false;
> > > +  if (is_gimple_assign (stmt))
> > > +  {
> > > +    code_or_ifn = gimple_assign_rhs_code (stmt);
> > > +  }
> > > +  else
> > > +    code_or_ifn = gimple_call_combined_fn (stmt);
> >
> > It might be possible to use gimple_extract_op here (only recently added).
> > This would also provide the number of operands directly, instead of
> > needing “op_type”.  It would also provide an array of operands.
> >
> 
> Done.
> 
> > > -  code = gimple_assign_rhs_code (stmt);
> > > -  if (!CONVERT_EXPR_CODE_P (code)
> > > -      && code != FIX_TRUNC_EXPR
> > > -      && code != FLOAT_EXPR
> > > -      && code != WIDEN_PLUS_EXPR
> > > -      && code != WIDEN_MINUS_EXPR
> > > -      && code != WIDEN_MULT_EXPR
> > > -      && code != WIDEN_LSHIFT_EXPR)
> >
> > Is it safe to drop this check independently of parts 2 and 3?
> > (Genuine question, haven't checked in detail.)
> 
> It requires parts 2 and 3, so I've moved that change into this first patch.
> 
> > > @@ -4784,7 +4784,8 @@ vectorizable_conversion (vec_info *vinfo,
> > >      }
> > >
> > >    rhs_type = TREE_TYPE (op0);
> > > -  if ((code != FIX_TRUNC_EXPR && code != FLOAT_EXPR)
> > > +  if ((code_or_ifn.is_tree_code () && code_or_ifn != FIX_TRUNC_EXPR
> > > +       && code_or_ifn != FLOAT_EXPR)
> >
> > I don't think we want the is_tree_code condition here.  The existing
> > != should work.
> >
> 
> Done.
> 
> > > @@ -11856,13 +11888,13 @@ supportable_widening_operation (vec_info *vinfo,
> > >    if (BYTES_BIG_ENDIAN && c1 != VEC_WIDEN_MULT_EVEN_EXPR)
> > >      std::swap (c1, c2);
> > >
> > > -  if (code == FIX_TRUNC_EXPR)
> > > +  if (code_or_ifn == FIX_TRUNC_EXPR)
> > >      {
> > >        /* The signedness is determined from output operand.  */
> > >        optab1 = optab_for_tree_code (c1, vectype_out, optab_default);
> > >        optab2 = optab_for_tree_code (c2, vectype_out, optab_default);
> > >      }
> > > -  else if (CONVERT_EXPR_CODE_P (code)
> > > +  else if (CONVERT_EXPR_CODE_P ((tree_code) code_or_ifn)
> >
> > I think this should be as_tree_code (), so that it's safe for internal
> > > functions if (tree_code) ever becomes a checked conversion in future.
> > Same for other instances.
> >
> 
> Done.
> 
> > >  	   && VECTOR_BOOLEAN_TYPE_P (wide_vectype)
> > >  	   && VECTOR_BOOLEAN_TYPE_P (vectype)
> > >  	   && TYPE_MODE (wide_vectype) == TYPE_MODE (vectype)
> > > […]
> > > @@ -12000,7 +12031,7 @@ supportable_widening_operation (vec_info *vinfo,
> > >  bool
> > >  supportable_narrowing_operation (enum tree_code code,
> > >  				 tree vectype_out, tree vectype_in,
> > > -				 enum tree_code *code1, int *multi_step_cvt,
> > > +				 void* _code1, int *multi_step_cvt,
> >
> > This might be rehashing an old conversation, sorry, but why does this
> > need to be void?
> >
> 
> Reworked to avoid using void*.
> 
> > >                                   vec<tree> *interm_types)  {
> > >    machine_mode vec_mode;
> > > @@ -12013,6 +12044,7 @@ supportable_narrowing_operation (enum tree_code code,
> > >    machine_mode intermediate_mode, prev_mode;
> > >    int i;
> > >    bool uns;
> > > +  tree_code * code1 = (tree_code*) _code1;
> > >
> > >    *multi_step_cvt = 0;
> > >    switch (code)
> > > diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
> > > index bd6f334d15f..70c06264c11 100644
> > > --- a/gcc/tree-vectorizer.h
> > > +++ b/gcc/tree-vectorizer.h
> > > @@ -2030,13 +2030,16 @@ extern bool vect_is_simple_use (vec_info *, stmt_vec_info, slp_tree,
> > >  				enum vect_def_type *,
> > >  				tree *, stmt_vec_info * = NULL);
> > >  extern bool vect_maybe_update_slp_op_vectype (slp_tree, tree);
> > > -extern bool supportable_widening_operation (vec_info *,
> > > -					    enum tree_code, stmt_vec_info,
> > > -					    tree, tree, enum tree_code *,
> > > -					    enum tree_code *, int *,
> > > -					    vec<tree> *);
> > > +extern bool supportable_widening_operation (vec_info *vinfo,
> > > +				code_helper code_or_ifn,
> > > +				stmt_vec_info stmt_info,
> > > +				tree vectype_out, tree vectype_in,
> > > +				code_helper *code_or_ifn1,
> > > +				code_helper *code_or_ifn2,
> > > +				int *multi_step_cvt,
> > > +				vec<tree> *interm_types);
> >
> > Normal style is to keep the variable names out of the header.
> > The documentation lives in the .c file, so in practice, anyone who
> > wants to add a new caller will need to look there anyway.
> >
> > Thanks,
> > Richard
> >
> > >  extern bool supportable_narrowing_operation (enum tree_code, tree, tree,
> > > -					     enum tree_code *, int *,
> > > +					     void *, int *,
> > >  					     vec<tree> *);
> > >
> > >  extern unsigned record_stmt_cost (stmt_vector_for_cost *, int,
> > > diff --git a/gcc/tree.h b/gcc/tree.h
> > > index f62c00bc870..346565f84ce 100644
> > > --- a/gcc/tree.h
> > > +++ b/gcc/tree.h
> > > @@ -6546,5 +6546,31 @@ extern unsigned fndecl_dealloc_argno (tree);
> > >     if nonnull, set the second argument to the referenced enclosing
> > >     object or pointer.  Otherwise return null.  */
> > >  extern tree get_attr_nonstring_decl (tree, tree * = NULL);
> > > +/* Helper to transparently allow tree codes and builtin function codes
> > > +   exist in one storage entity.  */
> > > +class code_helper {
> > > +public:
> > > +  code_helper () {}
> > > +  code_helper (tree_code code) : rep ((int) code) {}
> > > +  code_helper (combined_fn fn) : rep (-(int) fn) {}
> > > +  operator tree_code () const { return is_tree_code () ?
> > > +						       (tree_code) rep :
> > > +						       ERROR_MARK; }
> > > +  operator combined_fn () const { return is_fn_code () ?
> > > +						       (combined_fn) -rep:
> > > +						       CFN_LAST; }
> > > +  bool is_tree_code () const { return rep > 0; }
> > > +  bool is_fn_code () const { return rep < 0; }
> > > +  int get_rep () const { return rep; }
> > > +
> > > +  enum tree_code as_tree_code () const { return is_tree_code () ?
> > > +    (tree_code)* this : MAX_TREE_CODES; }
> > > +  combined_fn as_fn_code () const { return is_fn_code () ? (combined_fn) *this
> > > +    : CFN_LAST;}
> > > +
> > > +private:
> > > +  int rep;
> > > +};
> > >
> > >  #endif  /* GCC_TREE_H  */
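
For readers following the code_helper discussion quoted above, here is a minimal, self-contained sketch of the sign-based encoding: tree codes are stored as positive integers and function codes as negated integers, so a single int can hold either kind. The enums and the `sketch` namespace are hypothetical stand-ins with made-up members, not GCC's real definitions, and the class illustrates the idea rather than reproducing the committed implementation.

```cpp
#include <cassert>

namespace sketch {

// Hypothetical stand-ins for GCC's enums; the real ones are far larger.
enum tree_code { ERROR_MARK = 0, PLUS_EXPR, WIDEN_PLUS_EXPR, MAX_TREE_CODES };
enum combined_fn { CFN_NONE = 0, CFN_WIDEN_PLUS, CFN_LAST };

// One int holds either kind: positive for tree codes, negative for functions.
class code_helper
{
public:
  code_helper () : rep (0) {}
  code_helper (tree_code code) : rep ((int) code) {}
  code_helper (combined_fn fn) : rep (-(int) fn) {}

  bool is_tree_code () const { return rep > 0; }
  bool is_fn_code () const { return rep < 0; }

  // Checked accessors: return a sentinel instead of casting blindly.
  tree_code as_tree_code () const
  { return is_tree_code () ? (tree_code) rep : MAX_TREE_CODES; }
  combined_fn as_fn_code () const
  { return is_fn_code () ? (combined_fn) -rep : CFN_LAST; }

private:
  int rep;
};

} // namespace sketch
```

With `rep > 0` as the tree-code test, a stored value of zero is classified as neither kind, which is one reason the checked `as_tree_code ()` accessor is safer than a plain cast when the helper may hold an internal function.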


* Re: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns
  2022-05-25  9:11 [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns Joel Hutton
@ 2022-05-27 13:23 ` Richard Biener
  2022-05-31 10:07   ` Joel Hutton
  0 siblings, 1 reply; 53+ messages in thread
From: Richard Biener @ 2022-05-27 13:23 UTC
  To: Joel Hutton; +Cc: Richard Sandiford, gcc-patches

On Wed, 25 May 2022, Joel Hutton wrote:

> Ping!
> 
> Just checking there is still interest in this. I'm assuming you've been 
> busy with release.

Can you post an updated patch (after the .cc renaming, and with code_helper
now already moved to tree.h)?

Thanks,
Richard.


-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)


* RE: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns
  2022-05-27 13:23 ` Richard Biener
@ 2022-05-31 10:07   ` Joel Hutton
  2022-05-31 16:46     ` Tamar Christina
  2022-06-01 10:11     ` Richard Biener
  0 siblings, 2 replies; 53+ messages in thread
From: Joel Hutton @ 2022-05-31 10:07 UTC
  To: Richard Biener; +Cc: Richard Sandiford, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 250 bytes --]

> Can you post an updated patch (after the .cc renaming, and with code_helper
> now already moved to tree.h)?
> 
> Thanks,
> Richard.

Patches attached. They already incorporated the .cc rename and are now rebased on top of the change to tree.h.

Joel
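
As an illustration of the dispatch that the review asked to hide behind a single gimple_build entry point (an assignment for a tree code, a call for an internal function, with a null trailing operand tolerated on both paths), here is a minimal sketch. Every name in it (`code_kind`, `built_stmt`, `build`) is a hypothetical stand-in; it does not use GCC's GIMPLE types and only mirrors the shape of the control flow.

```cpp
#include <cassert>
#include <string>

namespace sketch {

// Hypothetical stand-in for code_helper: only records which kind it holds.
struct code_kind { bool is_fn; };

// Hypothetical stand-in for a built statement.
struct built_stmt
{
  bool is_call;     // true when the internal-function path was taken
  int num_ops;      // 1 when the trailing operand was null, otherwise 2
  std::string lhs;  // result variable, set on both paths
};

// Returns false when there is no first operand, mirroring an early
// "return NULL"; otherwise fills OUT with what would have been built.
inline bool
build (const std::string &lhs, code_kind ch,
       const char *op0, const char *op1, built_stmt &out)
{
  if (op0 == nullptr)
    return false;
  out.is_call = ch.is_fn;
  out.num_ops = (op1 == nullptr) ? 1 : 2;
  out.lhs = lhs;
  return true;
}

} // namespace sketch
```

The point of the design is that callers such as the pattern recogniser no longer branch on the kind of code themselves; the one helper decides which statement form to emit and how many operands to pass.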

[-- Attachment #2: 0001-Refactor-to-allow-internal_fn-s.patch --]
[-- Type: application/octet-stream, Size: 26740 bytes --]

From 3467bf531402d83c6427716954b9fab933f858ef Mon Sep 17 00:00:00 2001
From: Joel Hutton <joel.hutton@arm.com>
Date: Wed, 25 Aug 2021 14:31:15 +0100
Subject: [PATCH 1/3] Refactor to allow internal_fn's

Hi all,

This refactor allows widening patterns (such as widen_plus/widen_minus) to be represented as
either internal_fns or tree_codes.

[vect-patterns] Refactor as internal_fn's

Refactor vect-patterns to allow patterns to be internal_fns, starting
with the widen_plus/widen_minus patterns.

gcc/ChangeLog:

	* gimple-match.h (class code_helper): Add as_internal_fn, as_tree_code
    helper functions.
	* gimple.cc (gimple_build): Function to build a GIMPLE_CALL or
    GIMPLE_ASSIGN as appropriate, given a code_helper.
	* gimple.h (gimple_build): Function prototype.
	* tree-core.h (ECF_WIDEN): Flag to mark internal_fn as widening.
	* tree-vect-patterns.cc (vect_recog_widen_op_pattern): Refactor to
    use code_helper.
	* tree-vect-stmts.cc (vect_gen_widened_results_half): Refactor to
    use code_helper.
	(vect_create_vectorized_promotion_stmts): Refactor to use
    code_helper.
	(vectorizable_conversion): Refactor to use code_helper and accept
    either a gimple_call or a gimple_assign.
	(supportable_widening_operation): Refactor to use code_helper.
	(supportable_narrowing_operation): Refactor to use code_helper.
	* tree-vectorizer.h (supportable_widening_operation): Change
    prototype to use code_helper.
	(supportable_narrowing_operation): Change prototype to use
    code_helper.
---
 gcc/gimple.cc             |  24 +++++
 gcc/gimple.h              |   1 +
 gcc/tree-core.h           |   3 +
 gcc/tree-vect-patterns.cc |   7 +-
 gcc/tree-vect-stmts.cc    | 216 +++++++++++++++++++++++---------------
 gcc/tree-vectorizer.h     |  11 +-
 gcc/tree.h                |  52 +++++++++
 7 files changed, 221 insertions(+), 93 deletions(-)

diff --git a/gcc/gimple.cc b/gcc/gimple.cc
index b70ab4d25230374f0c90f93d77f9caf8d57587ee..bd3210cd35298daf7c74276e38658b5151d5cc1f 100644
--- a/gcc/gimple.cc
+++ b/gcc/gimple.cc
@@ -502,6 +502,30 @@ gimple_build_assign (tree lhs, enum tree_code subcode, tree op1 MEM_STAT_DECL)
 				PASS_MEM_STAT);
 }
 
+/* Build a GIMPLE_ASSIGN or GIMPLE_CALL with the tree_code,
+   or internal_fn contained in ch, respectively.  */
+gimple *
+gimple_build (tree lhs, code_helper ch, tree op0, tree op1)
+{
+  if (op0 == NULL_TREE)
+    return NULL;
+  if (ch.is_tree_code ())
+    return op1 == NULL_TREE ? gimple_build_assign (lhs, ch.as_tree_code (),
+						   op0) :
+			      gimple_build_assign (lhs, ch.as_tree_code (), op0,
+						   op1);
+  else
+  {
+    internal_fn fn = as_internal_fn (ch.as_fn_code ());
+    gimple* stmt;
+    if (op1 == NULL_TREE)
+      stmt = gimple_build_call_internal (fn, 1, op0);
+    else
+      stmt = gimple_build_call_internal (fn, 2, op0, op1);
+    gimple_call_set_lhs (stmt, lhs);
+    return stmt;
+  }
+}
 
 /* Build a GIMPLE_COND statement.
 
diff --git a/gcc/gimple.h b/gcc/gimple.h
index 6b1e89ad74e6b22dd534ff48e48fef688032f844..5350f14f6f29bd8e011b79c2aa79c2bfaef8c58f 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -1523,6 +1523,7 @@ gcall *gimple_build_call_valist (tree, unsigned, va_list);
 gcall *gimple_build_call_internal (enum internal_fn, unsigned, ...);
 gcall *gimple_build_call_internal_vec (enum internal_fn, const vec<tree> &);
 gcall *gimple_build_call_from_tree (tree, tree);
+gimple* gimple_build (tree, code_helper, tree, tree);
 gassign *gimple_build_assign (tree, tree CXX_MEM_STAT_INFO);
 gassign *gimple_build_assign (tree, enum tree_code,
 			      tree, tree, tree CXX_MEM_STAT_INFO);
diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index ab5fa01e5cb5fb56c1964b93b014ed55a4aa704a..cff6211080bced0bffb39e98039a6550897acf77 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -96,6 +96,9 @@ struct die_struct;
 /* Nonzero if this is a cold function.  */
 #define ECF_COLD		  (1 << 15)
 
+/* Nonzero if this is a widening function.  */
+#define ECF_WIDEN		  (1 << 16)
+
 /* Call argument flags.  */
 
 /* Nonzero if the argument is not used by the function.  */
diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index 0fad4dbd0945c6c176f3457b751e812f17fcd148..36e362e1daf3f946c6074600a6a322b3bda67755 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -1348,7 +1348,7 @@ vect_recog_sad_pattern (vec_info *vinfo,
 static gimple *
 vect_recog_widen_op_pattern (vec_info *vinfo,
 			     stmt_vec_info last_stmt_info, tree *type_out,
-			     tree_code orig_code, tree_code wide_code,
+			     tree_code orig_code, code_helper wide_code,
 			     bool shift_p, const char *name)
 {
   gimple *last_stmt = last_stmt_info->stmt;
@@ -1391,7 +1391,7 @@ vect_recog_widen_op_pattern (vec_info *vinfo,
       vecctype = get_vectype_for_scalar_type (vinfo, ctype);
     }
 
-  enum tree_code dummy_code;
+  code_helper dummy_code;
   int dummy_int;
   auto_vec<tree> dummy_vec;
   if (!vectype
@@ -1412,8 +1412,7 @@ vect_recog_widen_op_pattern (vec_info *vinfo,
 		       2, oprnd, half_type, unprom, vectype);
 
   tree var = vect_recog_temp_ssa_var (itype, NULL);
-  gimple *pattern_stmt = gimple_build_assign (var, wide_code,
-					      oprnd[0], oprnd[1]);
+  gimple *pattern_stmt = gimple_build (var, wide_code, oprnd[0], oprnd[1]);
 
   if (vecctype != vecitype)
     pattern_stmt = vect_convert_output (vinfo, last_stmt_info, ctype,
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 346d8ce280437e00bfeb19a4b4adc59eb96207f9..61b51a29f99bcdf0ff6b4ead4a69163ebf8ed383 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -4636,7 +4636,7 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info,
    STMT_INFO is the original scalar stmt that we are vectorizing.  */
 
 static gimple *
-vect_gen_widened_results_half (vec_info *vinfo, enum tree_code code,
+vect_gen_widened_results_half (vec_info *vinfo, code_helper ch,
                                tree vec_oprnd0, tree vec_oprnd1, int op_type,
 			       tree vec_dest, gimple_stmt_iterator *gsi,
 			       stmt_vec_info stmt_info)
@@ -4645,14 +4645,12 @@ vect_gen_widened_results_half (vec_info *vinfo, enum tree_code code,
   tree new_temp;
 
   /* Generate half of the widened result:  */
-  gcc_assert (op_type == TREE_CODE_LENGTH (code));
   if (op_type != binary_op)
     vec_oprnd1 = NULL;
-  new_stmt = gimple_build_assign (vec_dest, code, vec_oprnd0, vec_oprnd1);
+  new_stmt = gimple_build (vec_dest, ch, vec_oprnd0, vec_oprnd1);
   new_temp = make_ssa_name (vec_dest, new_stmt);
-  gimple_assign_set_lhs (new_stmt, new_temp);
+  gimple_set_lhs (new_stmt, new_temp);
   vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
-
   return new_stmt;
 }
 
@@ -4729,8 +4727,8 @@ vect_create_vectorized_promotion_stmts (vec_info *vinfo,
 					vec<tree> *vec_oprnds1,
 					stmt_vec_info stmt_info, tree vec_dest,
 					gimple_stmt_iterator *gsi,
-					enum tree_code code1,
-					enum tree_code code2, int op_type)
+					code_helper ch1,
+					code_helper ch2, int op_type)
 {
   int i;
   tree vop0, vop1, new_tmp1, new_tmp2;
@@ -4746,10 +4744,10 @@ vect_create_vectorized_promotion_stmts (vec_info *vinfo,
 	vop1 = NULL_TREE;
 
       /* Generate the two halves of promotion operation.  */
-      new_stmt1 = vect_gen_widened_results_half (vinfo, code1, vop0, vop1,
+      new_stmt1 = vect_gen_widened_results_half (vinfo, ch1, vop0, vop1,
 						 op_type, vec_dest, gsi,
 						 stmt_info);
-      new_stmt2 = vect_gen_widened_results_half (vinfo, code2, vop0, vop1,
+      new_stmt2 = vect_gen_widened_results_half (vinfo, ch2, vop0, vop1,
 						 op_type, vec_dest, gsi,
 						 stmt_info);
       if (is_gimple_call (new_stmt1))
@@ -4846,8 +4844,9 @@ vectorizable_conversion (vec_info *vinfo,
   tree scalar_dest;
   tree op0, op1 = NULL_TREE;
   loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo);
-  enum tree_code code, code1 = ERROR_MARK, code2 = ERROR_MARK;
-  enum tree_code codecvt1 = ERROR_MARK, codecvt2 = ERROR_MARK;
+  tree_code tc1;
+  code_helper code, code1, code2;
+  code_helper codecvt1 = ERROR_MARK, codecvt2 = ERROR_MARK;
   tree new_temp;
   enum vect_def_type dt[2] = {vect_unknown_def_type, vect_unknown_def_type};
   int ndts = 2;
@@ -4876,31 +4875,42 @@ vectorizable_conversion (vec_info *vinfo,
       && ! vec_stmt)
     return false;
 
-  gassign *stmt = dyn_cast <gassign *> (stmt_info->stmt);
-  if (!stmt)
+  gimple* stmt = stmt_info->stmt;
+  if (!(is_gimple_assign (stmt) || is_gimple_call (stmt)))
     return false;
 
-  if (TREE_CODE (gimple_assign_lhs (stmt)) != SSA_NAME)
+  if (gimple_get_lhs (stmt) == NULL_TREE ||
+      TREE_CODE(gimple_get_lhs (stmt)) != SSA_NAME)
     return false;
 
-  code = gimple_assign_rhs_code (stmt);
-  if (!CONVERT_EXPR_CODE_P (code)
-      && code != FIX_TRUNC_EXPR
-      && code != FLOAT_EXPR
-      && code != WIDEN_PLUS_EXPR
-      && code != WIDEN_MINUS_EXPR
-      && code != WIDEN_MULT_EXPR
-      && code != WIDEN_LSHIFT_EXPR)
+  if (TREE_CODE (gimple_get_lhs (stmt)) != SSA_NAME)
     return false;
 
-  bool widen_arith = (code == WIDEN_PLUS_EXPR
-		      || code == WIDEN_MINUS_EXPR
-		      || code == WIDEN_MULT_EXPR
-		      || code == WIDEN_LSHIFT_EXPR);
-  op_type = TREE_CODE_LENGTH (code);
+  bool widen_arith = false;
+  gimple_match_op res_op;
+  if (!gimple_extract_op (stmt, &res_op))
+    return false;
+  code = res_op.code;
+  op_type = res_op.num_ops;
+
+  if (is_gimple_assign (stmt))
+  {
+      widen_arith = (code == WIDEN_PLUS_EXPR
+		     || code == WIDEN_MINUS_EXPR
+		     || code == WIDEN_MULT_EXPR
+		     || code == WIDEN_LSHIFT_EXPR);
+ }
+  else
+      widen_arith = gimple_call_flags (stmt) & ECF_WIDEN;
+
+  if (!widen_arith
+      && !CONVERT_EXPR_CODE_P (code)
+      && code != FIX_TRUNC_EXPR
+      && code != FLOAT_EXPR)
+    return false;
 
   /* Check types of lhs and rhs.  */
-  scalar_dest = gimple_assign_lhs (stmt);
+  scalar_dest = gimple_get_lhs (stmt);
   lhs_type = TREE_TYPE (scalar_dest);
   vectype_out = STMT_VINFO_VECTYPE (stmt_info);
 
@@ -4938,10 +4948,15 @@ vectorizable_conversion (vec_info *vinfo,
 
   if (op_type == binary_op)
     {
-      gcc_assert (code == WIDEN_MULT_EXPR || code == WIDEN_LSHIFT_EXPR
-		  || code == WIDEN_PLUS_EXPR || code == WIDEN_MINUS_EXPR);
+      gcc_assert (code == WIDEN_MULT_EXPR
+		  || code == WIDEN_LSHIFT_EXPR
+		  || code == WIDEN_PLUS_EXPR
+		  || code == WIDEN_MINUS_EXPR
+		  || widen_arith);
+
 
-      op1 = gimple_assign_rhs2 (stmt);
+      op1 = is_gimple_assign (stmt) ? gimple_assign_rhs2 (stmt) :
+				     gimple_call_arg (stmt, 0);
       tree vectype1_in;
       if (!vect_is_simple_use (vinfo, stmt_info, slp_node, 1,
 			       &op1, &slp_op1, &dt[1], &vectype1_in))
@@ -5025,8 +5040,12 @@ vectorizable_conversion (vec_info *vinfo,
 	  && code != FLOAT_EXPR
 	  && !CONVERT_EXPR_CODE_P (code))
 	return false;
-      if (supportable_convert_operation (code, vectype_out, vectype_in, &code1))
+      if (supportable_convert_operation (code.as_tree_code (), vectype_out,
+					 vectype_in, &tc1))
+      {
+	code1 = tc1;
 	break;
+      }
       /* FALLTHRU */
     unsupported:
       if (dump_enabled_p ())
@@ -5037,9 +5056,11 @@ vectorizable_conversion (vec_info *vinfo,
     case WIDEN:
       if (known_eq (nunits_in, nunits_out))
 	{
-	  if (!supportable_half_widening_operation (code, vectype_out,
-						   vectype_in, &code1))
+	  if (!supportable_half_widening_operation (code.as_tree_code (),
+						    vectype_out, vectype_in,
+						    &tc1))
 	    goto unsupported;
+	  code1 = tc1;
 	  gcc_assert (!(multi_step_cvt && op_type == binary_op));
 	  break;
 	}
@@ -5073,14 +5094,17 @@ vectorizable_conversion (vec_info *vinfo,
 
 	  if (GET_MODE_SIZE (rhs_mode) == fltsz)
 	    {
-	      if (!supportable_convert_operation (code, vectype_out,
-						  cvt_type, &codecvt1))
+	      tc1 = ERROR_MARK;
+	      if (!supportable_convert_operation (code.as_tree_code (),
+						  vectype_out,
+						  cvt_type, &tc1))
 		goto unsupported;
+	      codecvt1 = tc1;
 	    }
-	  else if (!supportable_widening_operation (vinfo, code, stmt_info,
-						    vectype_out, cvt_type,
-						    &codecvt1, &codecvt2,
-						    &multi_step_cvt,
+	  else if (!supportable_widening_operation (vinfo, code,
+						    stmt_info, vectype_out,
+						    cvt_type, &codecvt1,
+						    &codecvt2, &multi_step_cvt,
 						    &interm_types))
 	    continue;
 	  else
@@ -5088,8 +5112,9 @@ vectorizable_conversion (vec_info *vinfo,
 
 	  if (supportable_widening_operation (vinfo, NOP_EXPR, stmt_info,
 					      cvt_type,
-					      vectype_in, &code1, &code2,
-					      &multi_step_cvt, &interm_types))
+					      vectype_in, &code1,
+					      &code2, &multi_step_cvt,
+					      &interm_types))
 	    {
 	      found_mode = true;
 	      break;
@@ -5111,10 +5136,14 @@ vectorizable_conversion (vec_info *vinfo,
 
     case NARROW:
       gcc_assert (op_type == unary_op);
-      if (supportable_narrowing_operation (code, vectype_out, vectype_in,
-					   &code1, &multi_step_cvt,
+      if (supportable_narrowing_operation (code.as_tree_code (), vectype_out,
+					   vectype_in,
+					   &tc1, &multi_step_cvt,
 					   &interm_types))
-	break;
+	{
+	  code1 = tc1;
+	  break;
+	}
 
       if (code != FIX_TRUNC_EXPR
 	  || GET_MODE_SIZE (lhs_mode) >= GET_MODE_SIZE (rhs_mode))
@@ -5125,13 +5154,18 @@ vectorizable_conversion (vec_info *vinfo,
       cvt_type = get_same_sized_vectype (cvt_type, vectype_in);
       if (cvt_type == NULL_TREE)
 	goto unsupported;
-      if (!supportable_convert_operation (code, cvt_type, vectype_in,
-					  &codecvt1))
+      if (!supportable_convert_operation (code.as_tree_code (), cvt_type,
+					  vectype_in,
+					  &tc1))
 	goto unsupported;
+      codecvt1 = tc1;
       if (supportable_narrowing_operation (NOP_EXPR, vectype_out, cvt_type,
-					   &code1, &multi_step_cvt,
+					   &tc1, &multi_step_cvt,
 					   &interm_types))
-	break;
+	{
+	  code1 = tc1;
+	  break;
+	}
       goto unsupported;
 
     default:
@@ -5245,8 +5279,9 @@ vectorizable_conversion (vec_info *vinfo,
       FOR_EACH_VEC_ELT (vec_oprnds0, i, vop0)
 	{
 	  /* Arguments are ready, create the new vector stmt.  */
-	  gcc_assert (TREE_CODE_LENGTH (code1) == unary_op);
-	  gassign *new_stmt = gimple_build_assign (vec_dest, code1, vop0);
+	  gcc_assert (TREE_CODE_LENGTH ((tree_code) code1) == unary_op);
+	  gassign *new_stmt = gimple_build_assign (vec_dest,
+						   code1.as_tree_code (), vop0);
 	  new_temp = make_ssa_name (vec_dest, new_stmt);
 	  gimple_assign_set_lhs (new_stmt, new_temp);
 	  vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
@@ -5278,7 +5313,7 @@ vectorizable_conversion (vec_info *vinfo,
       for (i = multi_step_cvt; i >= 0; i--)
 	{
 	  tree this_dest = vec_dsts[i];
-	  enum tree_code c1 = code1, c2 = code2;
+	  code_helper c1 = code1, c2 = code2;
 	  if (i == 0 && codecvt2 != ERROR_MARK)
 	    {
 	      c1 = codecvt1;
@@ -5288,7 +5323,8 @@ vectorizable_conversion (vec_info *vinfo,
 	    vect_create_half_widening_stmts (vinfo, &vec_oprnds0,
 						    &vec_oprnds1, stmt_info,
 						    this_dest, gsi,
-						    c1, op_type);
+						    c1.as_tree_code (),
+						    op_type);
 	  else
 	    vect_create_vectorized_promotion_stmts (vinfo, &vec_oprnds0,
 						    &vec_oprnds1, stmt_info,
@@ -5301,9 +5337,11 @@ vectorizable_conversion (vec_info *vinfo,
 	  gimple *new_stmt;
 	  if (cvt_type)
 	    {
-	      gcc_assert (TREE_CODE_LENGTH (codecvt1) == unary_op);
+	      gcc_assert (TREE_CODE_LENGTH ((tree_code) codecvt1) == unary_op);
 	      new_temp = make_ssa_name (vec_dest);
-	      new_stmt = gimple_build_assign (new_temp, codecvt1, vop0);
+	      new_stmt = gimple_build_assign (new_temp,
+					      codecvt1.as_tree_code (),
+					      vop0);
 	      vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
 	    }
 	  else
@@ -5327,10 +5365,10 @@ vectorizable_conversion (vec_info *vinfo,
       if (cvt_type)
 	FOR_EACH_VEC_ELT (vec_oprnds0, i, vop0)
 	  {
-	    gcc_assert (TREE_CODE_LENGTH (codecvt1) == unary_op);
+	    gcc_assert (TREE_CODE_LENGTH (((tree_code) codecvt1)) == unary_op);
 	    new_temp = make_ssa_name (vec_dest);
 	    gassign *new_stmt
-	      = gimple_build_assign (new_temp, codecvt1, vop0);
+	      = gimple_build_assign (new_temp, codecvt1.as_tree_code (), vop0);
 	    vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
 	    vec_oprnds0[i] = new_temp;
 	  }
@@ -5338,7 +5376,7 @@ vectorizable_conversion (vec_info *vinfo,
       vect_create_vectorized_demotion_stmts (vinfo, &vec_oprnds0,
 					     multi_step_cvt,
 					     stmt_info, vec_dsts, gsi,
-					     slp_node, code1);
+					     slp_node, code1.as_tree_code ());
       break;
     }
   if (!slp_node)
@@ -11926,9 +11964,11 @@ vect_maybe_update_slp_op_vectype (slp_tree op, tree vectype)
 
 bool
 supportable_widening_operation (vec_info *vinfo,
-				enum tree_code code, stmt_vec_info stmt_info,
+				code_helper code,
+				stmt_vec_info stmt_info,
 				tree vectype_out, tree vectype_in,
-                                enum tree_code *code1, enum tree_code *code2,
+				code_helper *code1,
+				code_helper *code2,
                                 int *multi_step_cvt,
                                 vec<tree> *interm_types)
 {
@@ -11939,7 +11979,7 @@ supportable_widening_operation (vec_info *vinfo,
   optab optab1, optab2;
   tree vectype = vectype_in;
   tree wide_vectype = vectype_out;
-  enum tree_code c1, c2;
+  code_helper c1=MAX_TREE_CODES, c2=MAX_TREE_CODES;
   int i;
   tree prev_type, intermediate_type;
   machine_mode intermediate_mode, prev_mode;
@@ -11949,7 +11989,7 @@ supportable_widening_operation (vec_info *vinfo,
   if (loop_info)
     vect_loop = LOOP_VINFO_LOOP (loop_info);
 
-  switch (code)
+  switch (code.as_tree_code ())
     {
     case WIDEN_MULT_EXPR:
       /* The result of a vectorized widening operation usually requires
@@ -11990,8 +12030,9 @@ supportable_widening_operation (vec_info *vinfo,
 	  && !nested_in_vect_loop_p (vect_loop, stmt_info)
 	  && supportable_widening_operation (vinfo, VEC_WIDEN_MULT_EVEN_EXPR,
 					     stmt_info, vectype_out,
-					     vectype_in, code1, code2,
-					     multi_step_cvt, interm_types))
+					     vectype_in, code1,
+					     code2, multi_step_cvt,
+					     interm_types))
         {
           /* Elements in a vector with vect_used_by_reduction property cannot
              be reordered if the use chain with this property does not have the
@@ -12054,6 +12095,9 @@ supportable_widening_operation (vec_info *vinfo,
       c2 = VEC_UNPACK_FIX_TRUNC_HI_EXPR;
       break;
 
+    case MAX_TREE_CODES:
+      break;
+
     default:
       gcc_unreachable ();
     }
@@ -12061,13 +12105,16 @@ supportable_widening_operation (vec_info *vinfo,
   if (BYTES_BIG_ENDIAN && c1 != VEC_WIDEN_MULT_EVEN_EXPR)
     std::swap (c1, c2);
 
+
   if (code == FIX_TRUNC_EXPR)
     {
       /* The signedness is determined from output operand.  */
-      optab1 = optab_for_tree_code (c1, vectype_out, optab_default);
-      optab2 = optab_for_tree_code (c2, vectype_out, optab_default);
+      optab1 = optab_for_tree_code (c1.as_tree_code (), vectype_out,
+				    optab_default);
+      optab2 = optab_for_tree_code (c2.as_tree_code (), vectype_out,
+				    optab_default);
     }
-  else if (CONVERT_EXPR_CODE_P (code)
+  else if (CONVERT_EXPR_CODE_P (code.as_tree_code ())
 	   && VECTOR_BOOLEAN_TYPE_P (wide_vectype)
 	   && VECTOR_BOOLEAN_TYPE_P (vectype)
 	   && TYPE_MODE (wide_vectype) == TYPE_MODE (vectype)
@@ -12080,8 +12127,8 @@ supportable_widening_operation (vec_info *vinfo,
     }
   else
     {
-      optab1 = optab_for_tree_code (c1, vectype, optab_default);
-      optab2 = optab_for_tree_code (c2, vectype, optab_default);
+      optab1 = optab_for_tree_code (c1.as_tree_code (), vectype, optab_default);
+      optab2 = optab_for_tree_code (c2.as_tree_code (), vectype, optab_default);
     }
 
   if (!optab1 || !optab2)
@@ -12092,8 +12139,12 @@ supportable_widening_operation (vec_info *vinfo,
        || (icode2 = optab_handler (optab2, vec_mode)) == CODE_FOR_nothing)
     return false;
 
-  *code1 = c1;
-  *code2 = c2;
+  if (code.is_tree_code ())
+  {
+    *code1 = c1;
+    *code2 = c2;
+  }
+
 
   if (insn_data[icode1].operand[0].mode == TYPE_MODE (wide_vectype)
       && insn_data[icode2].operand[0].mode == TYPE_MODE (wide_vectype))
@@ -12114,7 +12165,7 @@ supportable_widening_operation (vec_info *vinfo,
   prev_type = vectype;
   prev_mode = vec_mode;
 
-  if (!CONVERT_EXPR_CODE_P (code))
+  if (!CONVERT_EXPR_CODE_P ((tree_code) code))
     return false;
 
   /* We assume here that there will not be more than MAX_INTERM_CVT_STEPS
@@ -12145,8 +12196,10 @@ supportable_widening_operation (vec_info *vinfo,
 	}
       else
 	{
-	  optab3 = optab_for_tree_code (c1, intermediate_type, optab_default);
-	  optab4 = optab_for_tree_code (c2, intermediate_type, optab_default);
+	  optab3 = optab_for_tree_code (c1.as_tree_code (), intermediate_type,
+					optab_default);
+	  optab4 = optab_for_tree_code (c2.as_tree_code (), intermediate_type,
+					optab_default);
 	}
 
       if (!optab3 || !optab4
@@ -12181,7 +12234,6 @@ supportable_widening_operation (vec_info *vinfo,
   return false;
 }
 
-
 /* Function supportable_narrowing_operation
 
    Check whether an operation represented by the code CODE is a
@@ -12205,7 +12257,7 @@ supportable_widening_operation (vec_info *vinfo,
 bool
 supportable_narrowing_operation (enum tree_code code,
 				 tree vectype_out, tree vectype_in,
-				 enum tree_code *code1, int *multi_step_cvt,
+				 tree_code* _code1, int *multi_step_cvt,
                                  vec<tree> *interm_types)
 {
   machine_mode vec_mode;
@@ -12217,8 +12269,8 @@ supportable_narrowing_operation (enum tree_code code,
   tree intermediate_type, prev_type;
   machine_mode intermediate_mode, prev_mode;
   int i;
-  unsigned HOST_WIDE_INT n_elts;
   bool uns;
+  tree_code * code1 = (tree_code*) _code1;
 
   *multi_step_cvt = 0;
   switch (code)
@@ -12227,9 +12279,8 @@ supportable_narrowing_operation (enum tree_code code,
       c1 = VEC_PACK_TRUNC_EXPR;
       if (VECTOR_BOOLEAN_TYPE_P (narrow_vectype)
 	  && VECTOR_BOOLEAN_TYPE_P (vectype)
-	  && SCALAR_INT_MODE_P (TYPE_MODE (vectype))
-	  && TYPE_VECTOR_SUBPARTS (vectype).is_constant (&n_elts)
-	  && n_elts < BITS_PER_UNIT)
+	  && TYPE_MODE (narrow_vectype) == TYPE_MODE (vectype)
+	  && SCALAR_INT_MODE_P (TYPE_MODE (vectype)))
 	optab1 = vec_pack_sbool_trunc_optab;
       else
 	optab1 = optab_for_tree_code (c1, vectype, optab_default);
@@ -12320,9 +12371,8 @@ supportable_narrowing_operation (enum tree_code code,
 	  = lang_hooks.types.type_for_mode (intermediate_mode, uns);
       if (VECTOR_BOOLEAN_TYPE_P (intermediate_type)
 	  && VECTOR_BOOLEAN_TYPE_P (prev_type)
-	  && SCALAR_INT_MODE_P (prev_mode)
-	  && TYPE_VECTOR_SUBPARTS (intermediate_type).is_constant (&n_elts)
-	  && n_elts < BITS_PER_UNIT)
+	  && intermediate_mode == prev_mode
+	  && SCALAR_INT_MODE_P (prev_mode))
 	interm_optab = vec_pack_sbool_trunc_optab;
       else
 	interm_optab
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index 642eb0aeb21264cd736a479b1ec25357abef29cd..50ff8eeac1e6b9859302bd784f10ee3d8ff4b4dc 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -2120,13 +2120,12 @@ extern bool vect_is_simple_use (vec_info *, stmt_vec_info, slp_tree,
 				enum vect_def_type *,
 				tree *, stmt_vec_info * = NULL);
 extern bool vect_maybe_update_slp_op_vectype (slp_tree, tree);
-extern bool supportable_widening_operation (vec_info *,
-					    enum tree_code, stmt_vec_info,
-					    tree, tree, enum tree_code *,
-					    enum tree_code *, int *,
-					    vec<tree> *);
+extern bool supportable_widening_operation (vec_info*, code_helper,
+					    stmt_vec_info, tree, tree,
+					    code_helper*, code_helper*,
+					    int*, vec<tree> *);
 extern bool supportable_narrowing_operation (enum tree_code, tree, tree,
-					     enum tree_code *, int *,
+					     tree_code *, int *,
 					     vec<tree> *);
 
 extern unsigned record_stmt_cost (stmt_vector_for_cost *, int,
diff --git a/gcc/tree.h b/gcc/tree.h
index f84958933d51144bb6ce7cc41eca5f7f06814550..e51e34c051d9b91d1c02a4b2fefdb2b15606a36f 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -92,6 +92,10 @@ public:
   bool is_fn_code () const { return rep < 0; }
   bool is_internal_fn () const;
   bool is_builtin_fn () const;
+  enum tree_code as_tree_code () const { return is_tree_code () ?
+    (tree_code)* this : MAX_TREE_CODES; }
+  combined_fn as_fn_code () const { return is_fn_code () ? (combined_fn) *this
+    : CFN_LAST;}
   int get_rep () const { return rep; }
   bool operator== (const code_helper &other) { return rep == other.rep; }
   bool operator!= (const code_helper &other) { return rep != other.rep; }
@@ -6657,6 +6661,54 @@ extern unsigned fndecl_dealloc_argno (tree);
    if nonnull, set the second argument to the referenced enclosing
    object or pointer.  Otherwise return null.  */
 extern tree get_attr_nonstring_decl (tree, tree * = NULL);
+/* Helper to transparently allow tree codes and builtin function codes
+   exist in one storage entity.  */
+class code_helper
+{
+public:
+  code_helper () {}
+  code_helper (tree_code code) : rep ((int) code) {}
+  code_helper (combined_fn fn) : rep (-(int) fn) {}
+  code_helper (internal_fn fn) : rep (-(int) as_combined_fn (fn)) {}
+  explicit operator tree_code () const { return (tree_code) rep; }
+  explicit operator combined_fn () const { return (combined_fn) -rep; }
+  explicit operator internal_fn () const;
+  explicit operator built_in_function () const;
+  bool is_tree_code () const { return rep > 0; }
+  bool is_fn_code () const { return rep < 0; }
+  bool is_internal_fn () const;
+  bool is_builtin_fn () const;
+  int get_rep () const { return rep; }
+  bool operator== (const code_helper &other) { return rep == other.rep; }
+  bool operator!= (const code_helper &other) { return rep != other.rep; }
+  bool operator== (tree_code c) { return rep == code_helper (c).rep; }
+  bool operator!= (tree_code c) { return rep != code_helper (c).rep; }
+
+private:
+  int rep;
+};
+
+inline code_helper::operator internal_fn () const
+{
+  return as_internal_fn (combined_fn (*this));
+}
+
+inline code_helper::operator built_in_function () const
+{
+  return as_builtin_fn (combined_fn (*this));
+}
+
+inline bool
+code_helper::is_internal_fn () const
+{
+  return is_fn_code () && internal_fn_p (combined_fn (*this));
+}
+
+inline bool
+code_helper::is_builtin_fn () const
+{
+  return is_fn_code () && builtin_fn_p (combined_fn (*this));
+}
 
 extern int get_target_clone_attr_len (tree);
 
-- 
2.17.1


[-- Attachment #3: 0002-Refactor-widen_plus-as-internal_fn.patch --]
[-- Type: application/octet-stream, Size: 22720 bytes --]

From 70fa5fc3e8282a73b973bdd79fccd3450d5b312c Mon Sep 17 00:00:00 2001
From: Joel Hutton <joel.hutton@arm.com>
Date: Wed, 26 Jan 2022 14:00:17 +0000
Subject: [PATCH 2/3] Refactor widen_plus as internal_fn

This patch replaces the existing tree_code widen_plus and widen_minus
patterns with internal_fn versions.

DEF_INTERNAL_OPTAB_MULTI_FN is like DEF_INTERNAL_OPTAB_FN, except that it
provides convenience wrappers for defining conversions that require a hi/lo
split, such as widening and narrowing operations.  Each definition for <NAME>
requires an optab named <OPTAB> and two further optabs that you specify for
the signed and unsigned variants.  The hi/lo pair is necessary because a
widening operation takes n narrow elements as input and returns n/2 wide
elements as output: the 'lo' operation works on the first n/2 input elements
and the 'hi' operation on the second n/2.  Defining an internal_fn together
with its hi/lo variants lets a vect_recog function return a single internal
function that is later expanded to the hi/lo pair.
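
To make the hi/lo split concrete, here is a minimal scalar sketch of a
widening add, assuming 8-bit inputs and 16-bit results.  The names
widen_plus_lo and widen_plus_hi are hypothetical and are not part of the
patch; the sketch also ignores the lane swap that big-endian targets need.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model, not GCC code: N narrow lanes go in and
// each half produces N/2 wide lanes.

// 'lo' half: widen and add the first N/2 input lanes.
template <std::size_t N>
std::array<std::uint16_t, N / 2>
widen_plus_lo (const std::array<std::uint8_t, N> &a,
	       const std::array<std::uint8_t, N> &b)
{
  std::array<std::uint16_t, N / 2> r{};
  for (std::size_t i = 0; i < N / 2; ++i)
    r[i] = static_cast<std::uint16_t> (a[i] + b[i]);
  return r;
}

// 'hi' half: widen and add the second N/2 input lanes.
template <std::size_t N>
std::array<std::uint16_t, N / 2>
widen_plus_hi (const std::array<std::uint8_t, N> &a,
	       const std::array<std::uint8_t, N> &b)
{
  std::array<std::uint16_t, N / 2> r{};
  for (std::size_t i = 0; i < N / 2; ++i)
    r[i] = static_cast<std::uint16_t> (a[N / 2 + i] + b[N / 2 + i]);
  return r;
}
```

Calling both helpers on the same pair of 4-lane inputs yields two 2-lane
results that together cover every input lane, which is why a single
widening internal_fn has to decay into a lo/hi pair.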

DEF_INTERNAL_OPTAB_MULTI_FN is used in internal-fn.def to register a widening
internal_fn.  The macro is defined differently at each place that includes
internal-fn.def, so the parameters given in one entry can be reused:
  internal-fn.cc: first defined to expand to the hi/lo signed/unsigned
    optabs, later redefined to generate the 'expand_' functions for the
    hi/lo versions of the fn.
  internal-fn.def: defined to invoke DEF_INTERNAL_OPTAB_FN for the original
    and hi/lo variants of the internal_fn.
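
The redefine-and-reinclude technique described above can be sketched in a
self-contained form.  This is an illustration under assumptions, not the
patch's code: a list macro (MULTI_FN_LIST) stands in for re-including
internal-fn.def, the optabs are modelled as strings, and every sketch_*
name is hypothetical.

```cpp
#include <cassert>
#include <cstring>

// Stand-in for internal-fn.def: one entry per multi-part function,
// giving the name plus the signed and unsigned optab stems.
#define MULTI_FN_LIST(DEF) \
  DEF (VEC_WIDEN_PLUS, "vec_widen_saddl", "vec_widen_uaddl") \
  DEF (VEC_WIDEN_MINUS, "vec_widen_ssubl", "vec_widen_usubl")

// Expansion 1: each entry yields the base function and its LO/HI
// variants, in that order.
#define DEF_ENUM(NAME, S, U) IFN_##NAME, IFN_##NAME##_LO, IFN_##NAME##_HI,
enum sketch_fn { MULTI_FN_LIST (DEF_ENUM) IFN_LAST };
#undef DEF_ENUM

// Expansion 2: the same entries yield the signed/unsigned optab names
// for the lo and hi halves.
struct sketch_optabs { const char *s_lo, *u_lo, *s_hi, *u_hi; };
#define DEF_OPTABS(NAME, S, U) { S "_lo", U "_lo", S "_hi", U "_hi" },
const sketch_optabs sketch_optab_table[] = { MULTI_FN_LIST (DEF_OPTABS) };
#undef DEF_OPTABS

// Because the enum is laid out as base, LO, HI, the two halves can be
// found by offset from the base function.
inline void
sketch_lookup_hilo (sketch_fn fn, sketch_fn *lo, sketch_fn *hi)
{
  *lo = static_cast<sketch_fn> (fn + 1);
  *hi = static_cast<sketch_fn> (fn + 2);
}
```

The offset-based lookup only works while the LO and HI entries stay
directly after their base entry, so any reordering of the list has to
keep each triple together.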

For example:
  IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO
  for aarch64:
    IFN_VEC_WIDEN_PLUS_HI -> vec_widen_<su>addl_hi_<mode> -> (u/s)addl2
    IFN_VEC_WIDEN_PLUS_LO -> vec_widen_<su>addl_lo_<mode> -> (u/s)addl

This gives the same functionality as the previous WIDEN_PLUS/WIDEN_MINUS
tree codes, which are expanded into VEC_WIDEN_PLUS_LO and VEC_WIDEN_PLUS_HI.

gcc/ChangeLog:

2022-04-13  Joel Hutton  <joel.hutton@arm.com>
2022-04-13  Tamar Christina  <tamar.christina@arm.com>

	* internal-fn.cc (INCLUDE_MAP): Include maps for use in optab
    lookup.
	(DEF_INTERNAL_OPTAB_MULTI_FN): Macro to define an internal_fn that
    expands into multiple internal_fns (for widening).
	(ifn_cmp): Function to compare ifn's for sorting/searching.
	(lookup_multi_ifn_optab): Add lookup function.
	(lookup_multi_internal_fn): Add lookup function.
	(commutative_binary_fn_p): Add widen_plus fn's.
	* internal-fn.def (DEF_INTERNAL_OPTAB_MULTI_FN): Define widening
    plus,minus functions.
	(VEC_WIDEN_PLUS): Replacement for VEC_WIDEN_PLUS tree code.
	(VEC_WIDEN_MINUS): Replacement for VEC_WIDEN_MINUS tree code.
	* internal-fn.h (GCC_INTERNAL_FN_H): Add headers.
	(lookup_multi_ifn_optab): Add prototype.
	(lookup_multi_internal_fn): Add prototype.
	* optabs.cc (commutative_optab_p): Add widening plus, minus optabs.
	* optabs.def (OPTAB_CD): Add widen add, sub optabs.
	* tree-core.h (ECF_MULTI): Flag to indicate if a function decays
    into hi/lo parts.
	* tree-vect-patterns.cc (vect_recog_widen_op_pattern): Support
    patterns with a hi/lo split.
	(vect_recog_widen_plus_pattern): Refactor to return
    IFN_VEC_WIDEN_PLUS.
	(vect_recog_widen_minus_pattern): Refactor to return new
    IFN_VEC_WIDEN_MINUS.
	* tree-vect-stmts.cc (vectorizable_conversion): Add widen plus/minus
    ifn support.
	(supportable_widening_operation): Add widen plus/minus ifn support.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/vect-widen-add.c: Test that new
    IFN_VEC_WIDEN_PLUS is being used.
	* gcc.target/aarch64/vect-widen-sub.c: Test that new
    IFN_VEC_WIDEN_MINUS is being used.
---
 gcc/internal-fn.cc                            | 107 ++++++++++++++++++
 gcc/internal-fn.def                           |  23 ++++
 gcc/internal-fn.h                             |   7 ++
 gcc/optabs.cc                                 |  12 +-
 gcc/optabs.def                                |   2 +
 .../gcc.target/aarch64/vect-widen-add.c       |   4 +-
 .../gcc.target/aarch64/vect-widen-sub.c       |   4 +-
 gcc/tree-core.h                               |   4 +
 gcc/tree-vect-patterns.cc                     |  37 ++++--
 gcc/tree-vect-stmts.cc                        |  60 +++++++++-
 10 files changed, 244 insertions(+), 16 deletions(-)

diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 8b1733e20c4455e4e8c383c92fe859f4256cae69..e95b13af884f67990ad43c286990a351e2bd641b 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -17,6 +17,7 @@ You should have received a copy of the GNU General Public License
 along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
+#define INCLUDE_MAP
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
@@ -70,6 +71,26 @@ const int internal_fn_flags_array[] = {
   0
 };
 
+const enum internal_fn internal_fn_hilo_keys_array[] = {
+#undef DEF_INTERNAL_OPTAB_MULTI_FN
+#define DEF_INTERNAL_OPTAB_MULTI_FN(NAME, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \
+  IFN_##NAME##_LO, \
+  IFN_##NAME##_HI,
+#include "internal-fn.def"
+  IFN_LAST
+#undef DEF_INTERNAL_OPTAB_MULTI_FN
+};
+
+const optab internal_fn_hilo_values_array[] = {
+#undef DEF_INTERNAL_OPTAB_MULTI_FN
+#define DEF_INTERNAL_OPTAB_MULTI_FN(NAME, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \
+  SOPTAB##_lo_optab, UOPTAB##_lo_optab, \
+  SOPTAB##_hi_optab, UOPTAB##_hi_optab,
+#include "internal-fn.def"
+  unknown_optab, unknown_optab
+#undef DEF_INTERNAL_OPTAB_MULTI_FN
+};
+
 /* Return the internal function called NAME, or IFN_LAST if there's
    no such function.  */
 
@@ -90,6 +111,62 @@ lookup_internal_fn (const char *name)
   return entry ? *entry : IFN_LAST;
 }
 
+static int
+ifn_cmp (const void *a_, const void *b_)
+{
+  typedef std::pair<enum internal_fn, unsigned> ifn_pair;
+  auto *a = (const std::pair<ifn_pair, optab> *)a_;
+  auto *b = (const std::pair<ifn_pair, optab> *)b_;
+  return (int) (a->first.first) - (b->first.first);
+}
+
+/* Return the optab belonging to the given internal function NAME for the given
+   SIGN or unknown_optab.  */
+
+optab
+lookup_multi_ifn_optab (enum internal_fn fn, unsigned sign)
+{
+  typedef std::pair<enum internal_fn, unsigned> ifn_pair;
+  typedef auto_vec <std::pair<ifn_pair, optab>>fn_to_optab_map_type;
+  static fn_to_optab_map_type *fn_to_optab_map;
+
+  if (!fn_to_optab_map)
+    {
+      unsigned num
+	= sizeof (internal_fn_hilo_keys_array) / sizeof (enum internal_fn);
+      fn_to_optab_map = new fn_to_optab_map_type ();
+      for (unsigned int i = 0; i < num - 1; ++i)
+	{
+	  enum internal_fn fn = internal_fn_hilo_keys_array[i];
+	  optab v1 = internal_fn_hilo_values_array[2*i];
+	  optab v2 = internal_fn_hilo_values_array[2*i + 1];
+	  ifn_pair key1 (fn, 0);
+	  fn_to_optab_map->safe_push ({key1, v1});
+	  ifn_pair key2 (fn, 1);
+	  fn_to_optab_map->safe_push ({key2, v2});
+	}
+	fn_to_optab_map->qsort(ifn_cmp);
+    }
+
+  ifn_pair new_pair (fn, sign ? 1 : 0);
+  optab tmp;
+  std::pair<ifn_pair,optab> pair_wrap (new_pair, tmp);
+  auto entry = fn_to_optab_map->bsearch (&pair_wrap, ifn_cmp);
+  return entry != fn_to_optab_map->end () ? entry->second : unknown_optab;
+}
+
+extern void
+lookup_multi_internal_fn (enum internal_fn ifn, enum internal_fn *lo,
+			  enum internal_fn *hi)
+{
+  int ecf_flags = internal_fn_flags (ifn);
+  gcc_assert (ecf_flags & ECF_MULTI);
+
+  *lo = internal_fn (ifn + 1);
+  *hi = internal_fn (ifn + 2);
+}
+
+
 /* Fnspec of each internal function, indexed by function number.  */
 const_tree internal_fn_fnspec_array[IFN_LAST + 1];
 
@@ -3906,6 +3983,9 @@ commutative_binary_fn_p (internal_fn fn)
     case IFN_UBSAN_CHECK_MUL:
     case IFN_ADD_OVERFLOW:
     case IFN_MUL_OVERFLOW:
+    case IFN_VEC_WIDEN_PLUS:
+    case IFN_VEC_WIDEN_PLUS_LO:
+    case IFN_VEC_WIDEN_PLUS_HI:
       return true;
 
     default:
@@ -3991,6 +4071,32 @@ set_edom_supported_p (void)
 #endif
 }
 
+#undef DEF_INTERNAL_OPTAB_MULTI_FN
+#define DEF_INTERNAL_OPTAB_MULTI_FN(CODE, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \
+  static void						        \
+  expand_##CODE (internal_fn, gcall *)		                \
+  {							        \
+    gcc_unreachable ();	                                        \
+  }                                                             \
+  static void						        \
+  expand_##CODE##_LO (internal_fn fn, gcall *stmt)	        \
+  {							        \
+    tree ty = TREE_TYPE (gimple_get_lhs (stmt));                \
+    if (!TYPE_UNSIGNED (ty))                                    \
+      expand_##TYPE##_optab_fn (fn, stmt, SOPTAB##_lo##_optab);	\
+    else                                                        \
+      expand_##TYPE##_optab_fn (fn, stmt, UOPTAB##_lo##_optab);	\
+  }                                                             \
+  static void						        \
+  expand_##CODE##_HI (internal_fn fn, gcall *stmt)	        \
+  {							        \
+    tree ty = TREE_TYPE (gimple_get_lhs (stmt));                \
+    if (!TYPE_UNSIGNED (ty))                                    \
+      expand_##TYPE##_optab_fn (fn, stmt, SOPTAB##_hi##_optab);	\
+    else                                                        \
+      expand_##TYPE##_optab_fn (fn, stmt, UOPTAB##_hi##_optab);	\
+  }
+
 #define DEF_INTERNAL_OPTAB_FN(CODE, FLAGS, OPTAB, TYPE) \
   static void						\
   expand_##CODE (internal_fn fn, gcall *stmt)		\
@@ -4007,6 +4113,7 @@ set_edom_supported_p (void)
     expand_##TYPE##_optab_fn (fn, stmt, which_optab);			\
   }
 #include "internal-fn.def"
+#undef DEF_INTERNAL_OPTAB_MULTI_FN
 
 /* Routines to expand each internal function, indexed by function number.
    Each routine has the prototype:
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index d2d550d358606022b1cb44fa842f06e0be507bc3..4635a9c8af9ad27bb05d7510388d0fe2270428e5 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -82,6 +82,13 @@ along with GCC; see the file COPYING3.  If not see
    says that the function extends the C-level BUILT_IN_<NAME>{,L,LL,IMAX}
    group of functions to any integral mode (including vector modes).
 
+   DEF_INTERNAL_OPTAB_MULTI_FN is like DEF_INTERNAL_OPTAB_FN except it
+   provides convenience wrappers for defining conversions that require a
+   hi/lo split, like widening and narrowing operations.  Each definition
+   for <NAME> will require an optab named <OPTAB> and two other optabs that
+   you specify for signed and unsigned.
+
+
    Each entry must have a corresponding expander of the form:
 
      void expand_NAME (gimple_call stmt)
@@ -120,6 +127,14 @@ along with GCC; see the file COPYING3.  If not see
   DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE)
 #endif
 
+#ifndef DEF_INTERNAL_OPTAB_MULTI_FN
+#define DEF_INTERNAL_OPTAB_MULTI_FN(NAME, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \
+  DEF_INTERNAL_OPTAB_FN (NAME, FLAGS | ECF_MULTI, OPTAB, TYPE) \
+  DEF_INTERNAL_OPTAB_FN (NAME ## _LO, FLAGS, unknown, TYPE) \
+  DEF_INTERNAL_OPTAB_FN (NAME ## _HI, FLAGS, unknown, TYPE)
+#endif
+
+
 DEF_INTERNAL_OPTAB_FN (MASK_LOAD, ECF_PURE, maskload, mask_load)
 DEF_INTERNAL_OPTAB_FN (LOAD_LANES, ECF_CONST, vec_load_lanes, load_lanes)
 DEF_INTERNAL_OPTAB_FN (MASK_LOAD_LANES, ECF_PURE,
@@ -292,6 +307,14 @@ DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT270, ECF_CONST, cadd270, binary)
 DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL, ECF_CONST, cmul, binary)
 DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL_CONJ, ECF_CONST, cmul_conj, binary)
 DEF_INTERNAL_OPTAB_FN (VEC_ADDSUB, ECF_CONST, vec_addsub, binary)
+DEF_INTERNAL_OPTAB_MULTI_FN (VEC_WIDEN_PLUS,
+			     ECF_CONST | ECF_WIDEN | ECF_NOTHROW,
+			     vec_widen_add, vec_widen_saddl, vec_widen_uaddl,
+			     binary)
+DEF_INTERNAL_OPTAB_MULTI_FN (VEC_WIDEN_MINUS,
+			     ECF_CONST | ECF_WIDEN | ECF_NOTHROW,
+			     vec_widen_sub, vec_widen_ssubl, vec_widen_usubl,
+			     binary)
 DEF_INTERNAL_OPTAB_FN (VEC_FMADDSUB, ECF_CONST, vec_fmaddsub, ternary)
 DEF_INTERNAL_OPTAB_FN (VEC_FMSUBADD, ECF_CONST, vec_fmsubadd, ternary)
 
diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h
index 23c014a963c4d72da92c763db87ee486a2adb485..b35de19747d251d19dc13de1e0323368bd2ebdf2 100644
--- a/gcc/internal-fn.h
+++ b/gcc/internal-fn.h
@@ -20,6 +20,10 @@ along with GCC; see the file COPYING3.  If not see
 #ifndef GCC_INTERNAL_FN_H
 #define GCC_INTERNAL_FN_H
 
+#include "insn-codes.h"
+#include "insn-opinit.h"
+
+
 /* INTEGER_CST values for IFN_UNIQUE function arg-0.
 
    UNSPEC: Undifferentiated UNIQUE.
@@ -112,6 +116,9 @@ internal_fn_name (enum internal_fn fn)
 }
 
 extern internal_fn lookup_internal_fn (const char *);
+extern optab lookup_multi_ifn_optab (enum internal_fn, unsigned);
+extern void lookup_multi_internal_fn (enum internal_fn, enum internal_fn *,
+				      enum internal_fn *);
 
 /* Return the ECF_* flags for function FN.  */
 
diff --git a/gcc/optabs.cc b/gcc/optabs.cc
index c0a68471d2ddf08bc0e6a3fd592ebb9f05e516c1..7e904b3e154d018779bb1a36de74e6997f70e193 100644
--- a/gcc/optabs.cc
+++ b/gcc/optabs.cc
@@ -1314,7 +1314,17 @@ commutative_optab_p (optab binoptab)
 	  || binoptab == smul_widen_optab
 	  || binoptab == umul_widen_optab
 	  || binoptab == smul_highpart_optab
-	  || binoptab == umul_highpart_optab);
+	  || binoptab == umul_highpart_optab
+	  || binoptab == vec_widen_add_optab
+	  || binoptab == vec_widen_sub_optab
+	  || binoptab == vec_widen_saddl_hi_optab
+	  || binoptab == vec_widen_saddl_lo_optab
+	  || binoptab == vec_widen_ssubl_hi_optab
+	  || binoptab == vec_widen_ssubl_lo_optab
+	  || binoptab == vec_widen_uaddl_hi_optab
+	  || binoptab == vec_widen_uaddl_lo_optab
+	  || binoptab == vec_widen_usubl_hi_optab
+	  || binoptab == vec_widen_usubl_lo_optab);
 }
 
 /* X is to be used in mode MODE as operand OPN to BINOPTAB.  If we're
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 801310ebaa7d469520809bb7efed6820f8eb866b..a7881dcb49e4ef07d8f07aa31214eb3a7a944e2e 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -78,6 +78,8 @@ OPTAB_CD(smsub_widen_optab, "msub$b$a4")
 OPTAB_CD(umsub_widen_optab, "umsub$b$a4")
 OPTAB_CD(ssmsub_widen_optab, "ssmsub$b$a4")
 OPTAB_CD(usmsub_widen_optab, "usmsub$a$b4")
+OPTAB_CD(vec_widen_add_optab, "add$a$b3")
+OPTAB_CD(vec_widen_sub_optab, "sub$a$b3")
 OPTAB_CD(vec_load_lanes_optab, "vec_load_lanes$a$b")
 OPTAB_CD(vec_store_lanes_optab, "vec_store_lanes$a$b")
 OPTAB_CD(vec_mask_load_lanes_optab, "vec_mask_load_lanes$a$b")
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
index 220bd9352a4c7acd2e3713e441d74898d3e92b30..7037673d32bd780e1c9b58a51e58e2bac3b30b7e 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
+++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O3 -save-temps" } */
+/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */
 #include <stdint.h>
 #include <string.h>
 
@@ -86,6 +86,8 @@ main()
     return 0;
 }
 
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_LO" "vect"   } } */
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_HI" "vect"   } } */
 /* { dg-final { scan-assembler-times {\tuaddl\t} 1} } */
 /* { dg-final { scan-assembler-times {\tuaddl2\t} 1} } */
 /* { dg-final { scan-assembler-times {\tsaddl\t} 1} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c
index a2bed63affbd091977df95a126da1f5b8c1d41d2..83bc1edb6105f47114b665e24a13e6194b2179a2 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c
+++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O3 -save-temps" } */
+/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */
 #include <stdint.h>
 #include <string.h>
 
@@ -86,6 +86,8 @@ main()
     return 0;
 }
 
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_LO" "vect"   } } */
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_HI" "vect"   } } */
 /* { dg-final { scan-assembler-times {\tusubl\t} 1} } */
 /* { dg-final { scan-assembler-times {\tusubl2\t} 1} } */
 /* { dg-final { scan-assembler-times {\tssubl\t} 1} } */
diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index cff6211080bced0bffb39e98039a6550897acf77..d0c8b812cfb9c3ac83bf25fff0431b08cb7d823d 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -99,6 +99,10 @@ struct die_struct;
 /* Nonzero if this is a widening function.  */
 #define ECF_WIDEN		  (1 << 16)
 
+/* Nonzero if this is a function that decomposes into a lo/hi operation.  */
+#define ECF_MULTI		  (1 << 17)
+
+
 /* Call argument flags.  */
 
 /* Nonzero if the argument is not used by the function.  */
diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index 36e362e1daf3f946c6074600a6a322b3bda67755..0fd587da12b4a17b238327ae60f5a2a7a0efc514 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -1349,14 +1349,16 @@ static gimple *
 vect_recog_widen_op_pattern (vec_info *vinfo,
 			     stmt_vec_info last_stmt_info, tree *type_out,
 			     tree_code orig_code, code_helper wide_code,
-			     bool shift_p, const char *name)
+			     bool shift_p, const char *name,
+			     enum optab_subtype *subtype = NULL)
 {
   gimple *last_stmt = last_stmt_info->stmt;
 
   vect_unpromoted_value unprom[2];
   tree half_type;
   if (!vect_widened_op_tree (vinfo, last_stmt_info, orig_code, orig_code,
-			     shift_p, 2, unprom, &half_type))
+			     shift_p, 2, unprom, &half_type, subtype))
+
     return NULL;
 
   /* Pattern detected.  */
@@ -1422,6 +1424,20 @@ vect_recog_widen_op_pattern (vec_info *vinfo,
 			      type, pattern_stmt, vecctype);
 }
 
+static gimple *
+vect_recog_widen_op_pattern (vec_info *vinfo,
+			     stmt_vec_info last_stmt_info, tree *type_out,
+			     tree_code orig_code, internal_fn wide_ifn,
+			     bool shift_p, const char *name,
+			     enum optab_subtype *subtype = NULL)
+{
+  combined_fn ifn = as_combined_fn (wide_ifn);
+  return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
+				      orig_code, ifn, shift_p, name,
+				      subtype);
+}
+
+
 /* Try to detect multiplication on widened inputs, converting MULT_EXPR
    to WIDEN_MULT_EXPR.  See vect_recog_widen_op_pattern for details.  */
 
@@ -1435,26 +1451,30 @@ vect_recog_widen_mult_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
 }
 
 /* Try to detect addition on widened inputs, converting PLUS_EXPR
-   to WIDEN_PLUS_EXPR.  See vect_recog_widen_op_pattern for details.  */
+   to IFN_VEC_WIDEN_PLUS.  See vect_recog_widen_op_pattern for details.  */
 
 static gimple *
 vect_recog_widen_plus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
 			       tree *type_out)
 {
+  enum optab_subtype subtype;
   return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
-				      PLUS_EXPR, WIDEN_PLUS_EXPR, false,
-				      "vect_recog_widen_plus_pattern");
+				      PLUS_EXPR, IFN_VEC_WIDEN_PLUS,
+				      false, "vect_recog_widen_plus_pattern",
+				      &subtype);
 }
 
 /* Try to detect subtraction on widened inputs, converting MINUS_EXPR
-   to WIDEN_MINUS_EXPR.  See vect_recog_widen_op_pattern for details.  */
+   to IFN_VEC_WIDEN_MINUS.  See vect_recog_widen_op_pattern for details.  */
 static gimple *
 vect_recog_widen_minus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
 			       tree *type_out)
 {
+  enum optab_subtype subtype;
   return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
-				      MINUS_EXPR, WIDEN_MINUS_EXPR, false,
-				      "vect_recog_widen_minus_pattern");
+				      MINUS_EXPR, IFN_VEC_WIDEN_MINUS,
+				      false, "vect_recog_widen_minus_pattern",
+				      &subtype);
 }
 
 /* Function vect_recog_popcount_pattern
@@ -5618,6 +5638,7 @@ static vect_recog_func vect_vect_recog_func_ptrs[] = {
   { vect_recog_mask_conversion_pattern, "mask_conversion" },
   { vect_recog_widen_plus_pattern, "widen_plus" },
   { vect_recog_widen_minus_pattern, "widen_minus" },
+  /* These must come after the double widening ones.  */
 };
 
 const unsigned int NUM_PATTERNS = ARRAY_SIZE (vect_vect_recog_func_ptrs);
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 61b51a29f99bcdf0ff6b4ead4a69163ebf8ed383..c31831df723eeae8ea4fca2790a18b562106c889 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -4880,7 +4880,7 @@ vectorizable_conversion (vec_info *vinfo,
     return false;
 
   if (gimple_get_lhs (stmt) == NULL_TREE ||
-      TREE_CODE(gimple_get_lhs (stmt)) != SSA_NAME)
+      TREE_CODE (gimple_get_lhs (stmt)) != SSA_NAME)
     return false;
 
   if (TREE_CODE (gimple_get_lhs (stmt)) != SSA_NAME)
@@ -12125,12 +12125,62 @@ supportable_widening_operation (vec_info *vinfo,
       optab1 = vec_unpacks_sbool_lo_optab;
       optab2 = vec_unpacks_sbool_hi_optab;
     }
-  else
-    {
-      optab1 = optab_for_tree_code (c1.as_tree_code (), vectype, optab_default);
-      optab2 = optab_for_tree_code (c2.as_tree_code (), vectype, optab_default);
+
+  if (code.is_fn_code ())
+     {
+      internal_fn ifn = as_internal_fn (code.as_fn_code ());
+      int ecf_flags = internal_fn_flags (ifn);
+      gcc_assert (ecf_flags & ECF_MULTI);
+
+      switch (code.as_fn_code ())
+	{
+	case CFN_VEC_WIDEN_PLUS:
+	  break;
+	case CFN_VEC_WIDEN_MINUS:
+	  break;
+	case CFN_LAST:
+	default:
+	  return false;
+	}
+
+      internal_fn lo, hi;
+      lookup_multi_internal_fn (ifn, &lo, &hi);
+      *code1 = as_combined_fn (lo);
+      *code2 = as_combined_fn (hi);
+      optab1 = lookup_multi_ifn_optab (lo, !TYPE_UNSIGNED (vectype));
+      optab2 = lookup_multi_ifn_optab (hi, !TYPE_UNSIGNED (vectype));
     }
 
+  if (code.is_tree_code ())
+  {
+    if (code == FIX_TRUNC_EXPR)
+      {
+	/* The signedness is determined from output operand.  */
+	optab1 = optab_for_tree_code (c1.as_tree_code (), vectype_out,
+				      optab_default);
+	optab2 = optab_for_tree_code (c2.as_tree_code (), vectype_out,
+				      optab_default);
+      }
+    else if (CONVERT_EXPR_CODE_P (code.as_tree_code ())
+	     && VECTOR_BOOLEAN_TYPE_P (wide_vectype)
+	     && VECTOR_BOOLEAN_TYPE_P (vectype)
+	     && TYPE_MODE (wide_vectype) == TYPE_MODE (vectype)
+	     && SCALAR_INT_MODE_P (TYPE_MODE (vectype)))
+      {
+	/* If the input and result modes are the same, a different optab
+	   is needed where we pass in the number of units in vectype.  */
+	optab1 = vec_unpacks_sbool_lo_optab;
+	optab2 = vec_unpacks_sbool_hi_optab;
+      }
+    else
+      {
+	optab1 = optab_for_tree_code (c1.as_tree_code (), vectype,
+				      optab_default);
+	optab2 = optab_for_tree_code (c2.as_tree_code (), vectype,
+				      optab_default);
+      }
+  }
+
   if (!optab1 || !optab2)
     return false;
 
-- 
2.17.1


[-- Attachment #4: 0003-Remove-widen_plus-minus_expr-tree-codes.patch --]
[-- Type: application/octet-stream, Size: 19457 bytes --]

From 40224aa09ccd4f44aa6bd843f6c7ce0dbb3b6970 Mon Sep 17 00:00:00 2001
From: Joel Hutton <joel.hutton@arm.com>
Date: Fri, 28 Jan 2022 12:04:44 +0000
Subject: [PATCH 3/3] Remove widen_plus/minus_expr tree codes

This patch removes the old widen plus/minus tree codes which have been
replaced by internal functions.

gcc/ChangeLog:

	* doc/generic.texi: Remove old tree codes.
	* expr.cc (expand_expr_real_2): Remove old tree code cases.
	* gimple-pretty-print.cc (dump_binary_rhs): Remove old tree code
    cases.
	* optabs-tree.cc (optab_for_tree_code): Remove old tree code cases.
	(supportable_half_widening_operation): Remove old tree code cases.
	* tree-cfg.cc (verify_gimple_assign_binary): Remove old tree code
    cases.
	* tree-inline.cc (estimate_operator_cost): Remove old tree code
    cases.
	* tree-pretty-print.cc (dump_generic_node): Remove tree code definition.
	(op_symbol_code): Remove old tree code
    cases.
	* tree-vect-data-refs.cc (vect_get_smallest_scalar_type): Remove old tree code
    cases.
	(vect_analyze_data_ref_accesses): Remove old tree code
    cases.
	* tree-vect-generic.cc (expand_vector_operations_1): Remove old tree code
    cases.
	* tree-vect-patterns.cc (vect_widened_op_tree): Refactor to replace
    usage in vect_recog_sad_pattern.
	(vect_recog_sad_pattern): Replace tree code widening pattern with
    internal function.
	(vect_recog_average_pattern): Replace tree code widening pattern
    with internal function.
	* tree-vect-stmts.cc (vectorizable_conversion): Remove old tree code
    cases.
	(supportable_widening_operation): Remove old tree code
    cases.
	* tree.def (WIDEN_PLUS_EXPR): Remove tree code definition.
	(WIDEN_MINUS_EXPR): Remove tree code definition.
	(VEC_WIDEN_PLUS_HI_EXPR): Remove tree code definition.
	(VEC_WIDEN_PLUS_LO_EXPR): Remove tree code definition.
	(VEC_WIDEN_MINUS_HI_EXPR): Remove tree code definition.
	(VEC_WIDEN_MINUS_LO_EXPR): Remove tree code definition.
---
 gcc/doc/generic.texi       | 31 -------------------------------
 gcc/expr.cc                |  6 ------
 gcc/gimple-pretty-print.cc |  4 ----
 gcc/optabs-tree.cc         | 24 ------------------------
 gcc/tree-cfg.cc            |  6 ------
 gcc/tree-inline.cc         |  6 ------
 gcc/tree-pretty-print.cc   | 12 ------------
 gcc/tree-vect-data-refs.cc |  8 +++-----
 gcc/tree-vect-generic.cc   |  4 ----
 gcc/tree-vect-patterns.cc  | 36 +++++++++++++++++++++++++-----------
 gcc/tree-vect-stmts.cc     | 18 ++----------------
 gcc/tree.def               |  6 ------
 12 files changed, 30 insertions(+), 131 deletions(-)

diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi
index e5f9d1be8ea81f3da002ec3bb925590d331a2551..344045efd419b0cc3a11771acf70d2fd279c48ac 100644
--- a/gcc/doc/generic.texi
+++ b/gcc/doc/generic.texi
@@ -1811,10 +1811,6 @@ a value from @code{enum annot_expr_kind}, the third is an @code{INTEGER_CST}.
 @tindex VEC_RSHIFT_EXPR
 @tindex VEC_WIDEN_MULT_HI_EXPR
 @tindex VEC_WIDEN_MULT_LO_EXPR
-@tindex VEC_WIDEN_PLUS_HI_EXPR
-@tindex VEC_WIDEN_PLUS_LO_EXPR
-@tindex VEC_WIDEN_MINUS_HI_EXPR
-@tindex VEC_WIDEN_MINUS_LO_EXPR
 @tindex VEC_UNPACK_HI_EXPR
 @tindex VEC_UNPACK_LO_EXPR
 @tindex VEC_UNPACK_FLOAT_HI_EXPR
@@ -1861,33 +1857,6 @@ vector of @code{N/2} products. In the case of @code{VEC_WIDEN_MULT_LO_EXPR} the
 low @code{N/2} elements of the two vector are multiplied to produce the
 vector of @code{N/2} products.
 
-@item VEC_WIDEN_PLUS_HI_EXPR
-@itemx VEC_WIDEN_PLUS_LO_EXPR
-These nodes represent widening vector addition of the high and low parts of
-the two input vectors, respectively.  Their operands are vectors that contain
-the same number of elements (@code{N}) of the same integral type. The result
-is a vector that contains half as many elements, of an integral type whose size
-is twice as wide.  In the case of @code{VEC_WIDEN_PLUS_HI_EXPR} the high
-@code{N/2} elements of the two vectors are added to produce the vector of
-@code{N/2} products.  In the case of @code{VEC_WIDEN_PLUS_LO_EXPR} the low
-@code{N/2} elements of the two vectors are added to produce the vector of
-@code{N/2} products.
-
-@item VEC_WIDEN_MINUS_HI_EXPR
-@itemx VEC_WIDEN_MINUS_LO_EXPR
-These nodes represent widening vector subtraction of the high and low parts of
-the two input vectors, respectively.  Their operands are vectors that contain
-the same number of elements (@code{N}) of the same integral type. The high/low
-elements of the second vector are subtracted from the high/low elements of the
-first. The result is a vector that contains half as many elements, of an
-integral type whose size is twice as wide.  In the case of
-@code{VEC_WIDEN_MINUS_HI_EXPR} the high @code{N/2} elements of the second
-vector are subtracted from the high @code{N/2} of the first to produce the
-vector of @code{N/2} products.  In the case of
-@code{VEC_WIDEN_MINUS_LO_EXPR} the low @code{N/2} elements of the second
-vector are subtracted from the low @code{N/2} of the first to produce the
-vector of @code{N/2} products.
-
 @item VEC_UNPACK_HI_EXPR
 @itemx VEC_UNPACK_LO_EXPR
 These nodes represent unpacking of the high and low parts of the input vector,
diff --git a/gcc/expr.cc b/gcc/expr.cc
index 7197996cec7d24dd43d60928d5618b32b77677a1..1f941efc9e995c7f6a35ff93aaa6bd3c35faaa1f 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -9337,8 +9337,6 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
 					  target, unsignedp);
       return target;
 
-    case WIDEN_PLUS_EXPR:
-    case WIDEN_MINUS_EXPR:
     case WIDEN_MULT_EXPR:
       /* If first operand is constant, swap them.
 	 Thus the following special case checks need only
@@ -10116,10 +10114,6 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
 	return temp;
       }
 
-    case VEC_WIDEN_PLUS_HI_EXPR:
-    case VEC_WIDEN_PLUS_LO_EXPR:
-    case VEC_WIDEN_MINUS_HI_EXPR:
-    case VEC_WIDEN_MINUS_LO_EXPR:
     case VEC_WIDEN_MULT_HI_EXPR:
     case VEC_WIDEN_MULT_LO_EXPR:
     case VEC_WIDEN_MULT_EVEN_EXPR:
diff --git a/gcc/gimple-pretty-print.cc b/gcc/gimple-pretty-print.cc
index ebd87b20a0adc080c4a8f9429e75f49b96e72f9a..2a1a5b7f811ca341e8ee7e85a9701d3a37ff80bf 100644
--- a/gcc/gimple-pretty-print.cc
+++ b/gcc/gimple-pretty-print.cc
@@ -459,10 +459,6 @@ dump_binary_rhs (pretty_printer *buffer, const gassign *gs, int spc,
     case VEC_PACK_FLOAT_EXPR:
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
-    case VEC_WIDEN_PLUS_HI_EXPR:
-    case VEC_WIDEN_PLUS_LO_EXPR:
-    case VEC_WIDEN_MINUS_HI_EXPR:
-    case VEC_WIDEN_MINUS_LO_EXPR:
     case VEC_SERIES_EXPR:
       for (p = get_tree_code_name (code); *p; p++)
 	pp_character (buffer, TOUPPER (*p));
diff --git a/gcc/optabs-tree.cc b/gcc/optabs-tree.cc
index 8383fe820b080f6d66f83dcf3b77d3c9f869f4bc..2f5f93dc6624f86f6b5618cf6e7aa2b508053a64 100644
--- a/gcc/optabs-tree.cc
+++ b/gcc/optabs-tree.cc
@@ -190,22 +190,6 @@ optab_for_tree_code (enum tree_code code, const_tree type,
       return (TYPE_UNSIGNED (type)
 	      ? vec_widen_ushiftl_lo_optab : vec_widen_sshiftl_lo_optab);
 
-    case VEC_WIDEN_PLUS_LO_EXPR:
-      return (TYPE_UNSIGNED (type)
-	      ? vec_widen_uaddl_lo_optab : vec_widen_saddl_lo_optab);
-
-    case VEC_WIDEN_PLUS_HI_EXPR:
-      return (TYPE_UNSIGNED (type)
-	      ? vec_widen_uaddl_hi_optab : vec_widen_saddl_hi_optab);
-
-    case VEC_WIDEN_MINUS_LO_EXPR:
-      return (TYPE_UNSIGNED (type)
-	      ? vec_widen_usubl_lo_optab : vec_widen_ssubl_lo_optab);
-
-    case VEC_WIDEN_MINUS_HI_EXPR:
-      return (TYPE_UNSIGNED (type)
-	      ? vec_widen_usubl_hi_optab : vec_widen_ssubl_hi_optab);
-
     case VEC_UNPACK_HI_EXPR:
       return (TYPE_UNSIGNED (type)
 	      ? vec_unpacku_hi_optab : vec_unpacks_hi_optab);
@@ -312,8 +296,6 @@ optab_for_tree_code (enum tree_code code, const_tree type,
    'hi'/'lo' pair using codes such as VEC_WIDEN_MINUS_HI/LO.
 
    Supported widening operations:
-    WIDEN_MINUS_EXPR
-    WIDEN_PLUS_EXPR
     WIDEN_MULT_EXPR
     WIDEN_LSHIFT_EXPR
 
@@ -345,12 +327,6 @@ supportable_half_widening_operation (enum tree_code code, tree vectype_out,
     case WIDEN_LSHIFT_EXPR:
       *code1 = LSHIFT_EXPR;
       break;
-    case WIDEN_MINUS_EXPR:
-      *code1 = MINUS_EXPR;
-      break;
-    case WIDEN_PLUS_EXPR:
-      *code1 = PLUS_EXPR;
-      break;
     case WIDEN_MULT_EXPR:
       *code1 = MULT_EXPR;
       break;
diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
index 8de1b144a426776bf464765477c71ee8f2e52b81..46eed1e1f22052fc077f2fc25e5be627bce541b6 100644
--- a/gcc/tree-cfg.cc
+++ b/gcc/tree-cfg.cc
@@ -3948,8 +3948,6 @@ verify_gimple_assign_binary (gassign *stmt)
         return false;
       }
 
-    case WIDEN_PLUS_EXPR:
-    case WIDEN_MINUS_EXPR:
     case PLUS_EXPR:
     case MINUS_EXPR:
       {
@@ -4070,10 +4068,6 @@ verify_gimple_assign_binary (gassign *stmt)
         return false;
       }
 
-    case VEC_WIDEN_MINUS_HI_EXPR:
-    case VEC_WIDEN_MINUS_LO_EXPR:
-    case VEC_WIDEN_PLUS_HI_EXPR:
-    case VEC_WIDEN_PLUS_LO_EXPR:
     case VEC_WIDEN_MULT_HI_EXPR:
     case VEC_WIDEN_MULT_LO_EXPR:
     case VEC_WIDEN_MULT_EVEN_EXPR:
diff --git a/gcc/tree-inline.cc b/gcc/tree-inline.cc
index 043e1d5987a4c4b0159109dafb85a805ca828c1e..c0bebb7f4de36838341ed62389ad0e2b79f03034 100644
--- a/gcc/tree-inline.cc
+++ b/gcc/tree-inline.cc
@@ -4288,8 +4288,6 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
 
     case REALIGN_LOAD_EXPR:
 
-    case WIDEN_PLUS_EXPR:
-    case WIDEN_MINUS_EXPR:
     case WIDEN_SUM_EXPR:
     case WIDEN_MULT_EXPR:
     case DOT_PROD_EXPR:
@@ -4298,10 +4296,6 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
     case WIDEN_MULT_MINUS_EXPR:
     case WIDEN_LSHIFT_EXPR:
 
-    case VEC_WIDEN_PLUS_HI_EXPR:
-    case VEC_WIDEN_PLUS_LO_EXPR:
-    case VEC_WIDEN_MINUS_HI_EXPR:
-    case VEC_WIDEN_MINUS_LO_EXPR:
     case VEC_WIDEN_MULT_HI_EXPR:
     case VEC_WIDEN_MULT_LO_EXPR:
     case VEC_WIDEN_MULT_EVEN_EXPR:
diff --git a/gcc/tree-pretty-print.cc b/gcc/tree-pretty-print.cc
index 6acd394a0790ad2ad989f195a3288f0f0a8cc489..53ca62dc1a6873ae9365f199061bde9edd486196 100644
--- a/gcc/tree-pretty-print.cc
+++ b/gcc/tree-pretty-print.cc
@@ -2825,8 +2825,6 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags,
       break;
 
       /* Binary arithmetic and logic expressions.  */
-    case WIDEN_PLUS_EXPR:
-    case WIDEN_MINUS_EXPR:
     case WIDEN_SUM_EXPR:
     case WIDEN_MULT_EXPR:
     case MULT_EXPR:
@@ -3790,10 +3788,6 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags,
     case VEC_SERIES_EXPR:
     case VEC_WIDEN_MULT_HI_EXPR:
     case VEC_WIDEN_MULT_LO_EXPR:
-    case VEC_WIDEN_PLUS_HI_EXPR:
-    case VEC_WIDEN_PLUS_LO_EXPR:
-    case VEC_WIDEN_MINUS_HI_EXPR:
-    case VEC_WIDEN_MINUS_LO_EXPR:
     case VEC_WIDEN_MULT_EVEN_EXPR:
     case VEC_WIDEN_MULT_ODD_EXPR:
     case VEC_WIDEN_LSHIFT_HI_EXPR:
@@ -4311,12 +4305,6 @@ op_symbol_code (enum tree_code code)
     case WIDEN_LSHIFT_EXPR:
       return "w<<";
 
-    case WIDEN_PLUS_EXPR:
-      return "w+";
-
-    case WIDEN_MINUS_EXPR:
-      return "w-";
-
     case POINTER_PLUS_EXPR:
       return "+";
 
diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
index d20a10a1524164eef788ab4b88ba57c7a09c3387..98dd56ff022233ccead36a1f5a5e896e352f9f5b 100644
--- a/gcc/tree-vect-data-refs.cc
+++ b/gcc/tree-vect-data-refs.cc
@@ -136,8 +136,6 @@ vect_get_smallest_scalar_type (stmt_vec_info stmt_info, tree scalar_type)
 	  || gimple_assign_rhs_code (assign) == WIDEN_SUM_EXPR
 	  || gimple_assign_rhs_code (assign) == WIDEN_MULT_EXPR
 	  || gimple_assign_rhs_code (assign) == WIDEN_LSHIFT_EXPR
-	  || gimple_assign_rhs_code (assign) == WIDEN_PLUS_EXPR
-	  || gimple_assign_rhs_code (assign) == WIDEN_MINUS_EXPR
 	  || gimple_assign_rhs_code (assign) == FLOAT_EXPR)
 	{
 	  tree rhs_type = TREE_TYPE (gimple_assign_rhs1 (assign));
@@ -3172,8 +3170,8 @@ vect_analyze_data_ref_accesses (vec_info *vinfo,
 	    break;
 
 	  /* Check that the DR_INITs are compile-time constants.  */
-	  if (!tree_fits_shwi_p (DR_INIT (dra))
-	      || !tree_fits_shwi_p (DR_INIT (drb)))
+	  if (TREE_CODE (DR_INIT (dra)) != INTEGER_CST
+	      || TREE_CODE (DR_INIT (drb)) != INTEGER_CST)
 	    break;
 
 	  /* Different .GOMP_SIMD_LANE calls still give the same lane,
@@ -3225,7 +3223,7 @@ vect_analyze_data_ref_accesses (vec_info *vinfo,
 		  unsigned HOST_WIDE_INT step
 		    = absu_hwi (tree_to_shwi (DR_STEP (dra)));
 		  if (step != 0
-		      && step <= ((unsigned HOST_WIDE_INT)init_b - init_a))
+		      && step <= (unsigned HOST_WIDE_INT)(init_b - init_a))
 		    break;
 		}
 	    }
diff --git a/gcc/tree-vect-generic.cc b/gcc/tree-vect-generic.cc
index 92aba5d4af61dd478ec3f1b94854e4ad84166774..5823b08baf70b89b22ecc148b0702a84671ad084 100644
--- a/gcc/tree-vect-generic.cc
+++ b/gcc/tree-vect-generic.cc
@@ -2209,10 +2209,6 @@ expand_vector_operations_1 (gimple_stmt_iterator *gsi,
      arguments, not the widened result.  VEC_UNPACK_FLOAT_*_EXPR is
      calculated in the same way above.  */
   if (code == WIDEN_SUM_EXPR
-      || code == VEC_WIDEN_PLUS_HI_EXPR
-      || code == VEC_WIDEN_PLUS_LO_EXPR
-      || code == VEC_WIDEN_MINUS_HI_EXPR
-      || code == VEC_WIDEN_MINUS_LO_EXPR
       || code == VEC_WIDEN_MULT_HI_EXPR
       || code == VEC_WIDEN_MULT_LO_EXPR
       || code == VEC_WIDEN_MULT_EVEN_EXPR
diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index 0fd587da12b4a17b238327ae60f5a2a7a0efc514..0d821413b971d983c6562c3e8fbe60e2c3d0cb94 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -557,21 +557,29 @@ vect_joust_widened_type (tree type, tree new_type, tree *common_type)
 
 static unsigned int
 vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code,
-		      tree_code widened_code, bool shift_p,
+		      code_helper widened_code, bool shift_p,
 		      unsigned int max_nops,
 		      vect_unpromoted_value *unprom, tree *common_type,
 		      enum optab_subtype *subtype = NULL)
 {
   /* Check for an integer operation with the right code.  */
-  gassign *assign = dyn_cast <gassign *> (stmt_info->stmt);
-  if (!assign)
+  gimple* stmt = stmt_info->stmt;
+  if (!(is_gimple_assign (stmt) || is_gimple_call (stmt)))
     return 0;
 
-  tree_code rhs_code = gimple_assign_rhs_code (assign);
-  if (rhs_code != code && rhs_code != widened_code)
+  code_helper rhs_code;
+  if (is_gimple_assign (stmt))
+    rhs_code = gimple_assign_rhs_code (stmt);
+  else
+    rhs_code = gimple_call_combined_fn (stmt);
+
+  if (rhs_code.as_tree_code () != code
+      && rhs_code.get_rep () != widened_code.get_rep ())
     return 0;
 
-  tree type = TREE_TYPE (gimple_assign_lhs (assign));
+  tree lhs = is_gimple_assign (stmt) ? gimple_assign_lhs (stmt):
+				      gimple_call_lhs (stmt);
+  tree type = TREE_TYPE (lhs);
   if (!INTEGRAL_TYPE_P (type))
     return 0;
 
@@ -584,7 +592,11 @@ vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code,
     {
       vect_unpromoted_value *this_unprom = &unprom[next_op];
       unsigned int nops = 1;
-      tree op = gimple_op (assign, i + 1);
+      tree op;
+      if (is_gimple_assign (stmt))
+	op = gimple_op (stmt, i + 1);
+      else
+	op = gimple_call_arg (stmt, i);
       if (i == 1 && TREE_CODE (op) == INTEGER_CST)
 	{
 	  /* We already have a common type from earlier operands.
@@ -1297,8 +1309,9 @@ vect_recog_sad_pattern (vec_info *vinfo,
   /* FORNOW.  Can continue analyzing the def-use chain when this stmt in a phi
      inside the loop (in case we are analyzing an outer-loop).  */
   vect_unpromoted_value unprom[2];
-  if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR, WIDEN_MINUS_EXPR,
-			     false, 2, unprom, &half_type))
+  if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR,
+			     CFN_VEC_WIDEN_MINUS, false, 2, unprom,
+			     &half_type))
     return NULL;
 
   vect_pattern_detected ("vect_recog_sad_pattern", last_stmt);
@@ -2335,9 +2348,10 @@ vect_recog_average_pattern (vec_info *vinfo,
   internal_fn ifn = IFN_AVG_FLOOR;
   vect_unpromoted_value unprom[3];
   tree new_type;
+  enum optab_subtype subtype;
   unsigned int nops = vect_widened_op_tree (vinfo, plus_stmt_info, PLUS_EXPR,
-					    WIDEN_PLUS_EXPR, false, 3,
-					    unprom, &new_type);
+					    CFN_VEC_WIDEN_PLUS, false, 3,
+					    unprom, &new_type, &subtype);
   if (nops == 0)
     return NULL;
   if (nops == 3)
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index c31831df723eeae8ea4fca2790a18b562106c889..9adbb9fbf116ef316d5bed2c84a7074722055717 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -4895,9 +4895,7 @@ vectorizable_conversion (vec_info *vinfo,
 
   if (is_gimple_assign (stmt))
   {
-      widen_arith = (code == WIDEN_PLUS_EXPR
-		     || code == WIDEN_MINUS_EXPR
-		     || code == WIDEN_MULT_EXPR
+      widen_arith = (code == WIDEN_MULT_EXPR
 		     || code == WIDEN_LSHIFT_EXPR);
  }
   else
@@ -4950,8 +4948,6 @@ vectorizable_conversion (vec_info *vinfo,
     {
       gcc_assert (code == WIDEN_MULT_EXPR
 		  || code == WIDEN_LSHIFT_EXPR
-		  || code == WIDEN_PLUS_EXPR
-		  || code == WIDEN_MINUS_EXPR
 		  || widen_arith);
 
 
@@ -11976,7 +11972,7 @@ supportable_widening_operation (vec_info *vinfo,
   class loop *vect_loop = NULL;
   machine_mode vec_mode;
   enum insn_code icode1, icode2;
-  optab optab1, optab2;
+  optab optab1 = unknown_optab, optab2 = unknown_optab;
   tree vectype = vectype_in;
   tree wide_vectype = vectype_out;
   code_helper c1=MAX_TREE_CODES, c2=MAX_TREE_CODES;
@@ -12070,16 +12066,6 @@ supportable_widening_operation (vec_info *vinfo,
       c2 = VEC_WIDEN_LSHIFT_HI_EXPR;
       break;
 
-    case WIDEN_PLUS_EXPR:
-      c1 = VEC_WIDEN_PLUS_LO_EXPR;
-      c2 = VEC_WIDEN_PLUS_HI_EXPR;
-      break;
-
-    case WIDEN_MINUS_EXPR:
-      c1 = VEC_WIDEN_MINUS_LO_EXPR;
-      c2 = VEC_WIDEN_MINUS_HI_EXPR;
-      break;
-
     CASE_CONVERT:
       c1 = VEC_UNPACK_LO_EXPR;
       c2 = VEC_UNPACK_HI_EXPR;
diff --git a/gcc/tree.def b/gcc/tree.def
index 62650b6934b337c5d56e5393dc114173d72c9aa9..9b2dce3576440c445d3240b9ed937fe67c9a5992 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -1383,8 +1383,6 @@ DEFTREECODE (WIDEN_MULT_MINUS_EXPR, "widen_mult_minus_expr", tcc_expression, 3)
    the first argument from type t1 to type t2, and then shifting it
    by the second argument.  */
 DEFTREECODE (WIDEN_LSHIFT_EXPR, "widen_lshift_expr", tcc_binary, 2)
-DEFTREECODE (WIDEN_PLUS_EXPR, "widen_plus_expr", tcc_binary, 2)
-DEFTREECODE (WIDEN_MINUS_EXPR, "widen_minus_expr", tcc_binary, 2)
 
 /* Widening vector multiplication.
    The two operands are vectors with N elements of size S. Multiplying the
@@ -1449,10 +1447,6 @@ DEFTREECODE (VEC_PACK_FLOAT_EXPR, "vec_pack_float_expr", tcc_binary, 2)
  */
 DEFTREECODE (VEC_WIDEN_LSHIFT_HI_EXPR, "widen_lshift_hi_expr", tcc_binary, 2)
 DEFTREECODE (VEC_WIDEN_LSHIFT_LO_EXPR, "widen_lshift_lo_expr", tcc_binary, 2)
-DEFTREECODE (VEC_WIDEN_PLUS_HI_EXPR, "widen_plus_hi_expr", tcc_binary, 2)
-DEFTREECODE (VEC_WIDEN_PLUS_LO_EXPR, "widen_plus_lo_expr", tcc_binary, 2)
-DEFTREECODE (VEC_WIDEN_MINUS_HI_EXPR, "widen_minus_hi_expr", tcc_binary, 2)
-DEFTREECODE (VEC_WIDEN_MINUS_LO_EXPR, "widen_minus_lo_expr", tcc_binary, 2)
 
 /* PREDICT_EXPR.  Specify hint for branch prediction.  The
    PREDICT_EXPR_PREDICTOR specify predictor and PREDICT_EXPR_OUTCOME the
-- 
2.17.1



* RE: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns
  2022-05-31 10:07   ` Joel Hutton
@ 2022-05-31 16:46     ` Tamar Christina
  2022-06-01 10:11     ` Richard Biener
  1 sibling, 0 replies; 53+ messages in thread
From: Tamar Christina @ 2022-05-31 16:46 UTC (permalink / raw)
  To: Joel Hutton, Richard Biener; +Cc: Richard Sandiford, gcc-patches

> Just checking there is still interest in this

Definitely,  I am waiting for this to be able to send a new patch upstream 😊

Cheers,
Tamar.

> -----Original Message-----
> From: Gcc-patches <gcc-patches-
> bounces+tamar.christina=arm.com@gcc.gnu.org> On Behalf Of Joel Hutton
> via Gcc-patches
> Sent: Tuesday, May 31, 2022 11:08 AM
> To: Richard Biener <rguenther@suse.de>
> Cc: Richard Sandiford <Richard.Sandiford@arm.com>; gcc-
> patches@gcc.gnu.org
> Subject: RE: [ping][vect-patterns] Refactor widen_plus/widen_minus as
> internal_fns
> 
> > Can you post an updated patch (after the .cc renaming, and code_helper
> > now already moved to tree.h).
> >
> > Thanks,
> > Richard.
> 
> Patches attached. They already incorporated the .cc rename, now rebased to
> be after the change to tree.h
> 
> Joel


* RE: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns
  2022-05-31 10:07   ` Joel Hutton
  2022-05-31 16:46     ` Tamar Christina
@ 2022-06-01 10:11     ` Richard Biener
  2022-06-06 17:20       ` Joel Hutton
  1 sibling, 1 reply; 53+ messages in thread
From: Richard Biener @ 2022-06-01 10:11 UTC (permalink / raw)
  To: Joel Hutton; +Cc: Richard Sandiford, gcc-patches

On Tue, 31 May 2022, Joel Hutton wrote:

> > Can you post an updated patch (after the .cc renaming, and code_helper
> > now already moved to tree.h).
> > 
> > Thanks,
> > Richard.
> 
> Patches attached. They already incorporated the .cc rename, now rebased to be after the change to tree.h

@@ -1412,8 +1412,7 @@ vect_recog_widen_op_pattern (vec_info *vinfo,
                       2, oprnd, half_type, unprom, vectype);

   tree var = vect_recog_temp_ssa_var (itype, NULL);
-  gimple *pattern_stmt = gimple_build_assign (var, wide_code,
-                                             oprnd[0], oprnd[1]);
+  gimple *pattern_stmt = gimple_build (var, wide_code, oprnd[0], oprnd[1]);


you should be able to do without the new gimple_build overload
by using

   gimple_seq stmts = NULL;
   gimple_build (&stmts, wide_code, itype, oprnd[0], oprnd[1]);
   gimple *pattern_stmt = gimple_seq_last_stmt (stmts);

because 'gimple_build' is an existing API.


-  if (TREE_CODE (gimple_assign_lhs (stmt)) != SSA_NAME)
+  if (gimple_get_lhs (stmt) == NULL_TREE ||
+      TREE_CODE(gimple_get_lhs (stmt)) != SSA_NAME)
     return false;

The || should go on the next line, and there should be a space after
TREE_CODE.

+  bool widen_arith = false;
+  gimple_match_op res_op;
+  if (!gimple_extract_op (stmt, &res_op))
+    return false;
+  code = res_op.code;
+  op_type = res_op.num_ops;
+
+  if (is_gimple_assign (stmt))
+  {
+      widen_arith = (code == WIDEN_PLUS_EXPR
+                    || code == WIDEN_MINUS_EXPR
+                    || code == WIDEN_MULT_EXPR
+                    || code == WIDEN_LSHIFT_EXPR);
+ }
+  else
+      widen_arith = gimple_call_flags (stmt) & ECF_WIDEN;

There seem to be formatting issues.  Also, shouldn't you check
if (res_op.code.is_tree_code ()) instead of is_gimple_assign?
I also don't like the ECF_WIDEN "trick", just do as with the
tree codes and explicitly enumerate the widening ifns here.

gimple_extract_op is a bit heavy-weight as well, so maybe
instead simply do

  if (is_gimple_assign (stmt))
    {
      code = gimple_assign_rhs_code (stmt);
...
    }
  else if (gimple_call_internal_p (stmt))
    {
      code = gimple_call_internal_fn (stmt);
...
    }
  else
    return false;
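
As a standalone illustration of that shape (a sketch only: toy_stmt,
toy_tree_code, toy_internal_fn and toy_widen_arith_p are made-up
stand-ins, not GCC types or functions, and it assumes C++17 for
std::optional), the dispatch plus an explicit list of widening ifns
could look roughly like:

```cpp
#include <optional>

// Toy stand-ins for illustration only; none of these are GCC types.
enum toy_tree_code { TOY_PLUS, TOY_WIDEN_MULT, TOY_WIDEN_LSHIFT };
enum toy_internal_fn
{
  TOY_IFN_SQRT,
  TOY_IFN_VEC_WIDEN_PLUS,
  TOY_IFN_VEC_WIDEN_MINUS
};
enum toy_stmt_kind { TOY_ASSIGN, TOY_INTERNAL_CALL, TOY_OTHER };

struct toy_stmt
{
  toy_stmt_kind kind;
  toy_tree_code code;   // meaningful only for TOY_ASSIGN
  toy_internal_fn ifn;  // meaningful only for TOY_INTERNAL_CALL
};

// Return whether STMT is widening arithmetic, or an empty optional if
// STMT is neither an assignment nor an internal-function call.
static std::optional<bool>
toy_widen_arith_p (const toy_stmt &stmt)
{
  if (stmt.kind == TOY_ASSIGN)
    return (stmt.code == TOY_WIDEN_MULT
            || stmt.code == TOY_WIDEN_LSHIFT);
  else if (stmt.kind == TOY_INTERNAL_CALL)
    // The widening ifns are listed explicitly rather than found
    // through a flag on the function.
    return (stmt.ifn == TOY_IFN_VEC_WIDEN_PLUS
            || stmt.ifn == TOY_IFN_VEC_WIDEN_MINUS);
  else
    return std::nullopt;
}
```

The empty optional plays the role of the final "return false" above:
the caller bails out before looking at any code at all.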

+  code_helper c1=MAX_TREE_CODES, c2=MAX_TREE_CODES;

spaces before/after '='

@@ -12061,13 +12105,16 @@ supportable_widening_operation (vec_info *vinfo,
   if (BYTES_BIG_ENDIAN && c1 != VEC_WIDEN_MULT_EVEN_EXPR)
     std::swap (c1, c2);

+
   if (code == FIX_TRUNC_EXPR)
     {

unnecessary whitespace change.

diff --git a/gcc/tree.h b/gcc/tree.h
index f84958933d51144bb6ce7cc41eca5f7f06814550..e51e34c051d9b91d1c02a4b2fefdb2b15606a36f 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -92,6 +92,10 @@ public:
   bool is_fn_code () const { return rep < 0; }
   bool is_internal_fn () const;
   bool is_builtin_fn () const;
+  enum tree_code as_tree_code () const { return is_tree_code () ?
+    (tree_code)* this : MAX_TREE_CODES; }
+  combined_fn as_fn_code () const { return is_fn_code () ? (combined_fn) *this
+    : CFN_LAST;}

hmm, the other as_* functions we have are not member functions.
Also this semantically differs from the tree_code () conversion
operator (that one was supposed to be "cheap").  The existing
as_internal_fn for example is documented as being valid only if
the code is actually an internal fn.  I see you are introducing
the new function as convenience to get a "safe" not-a-X value,
so maybe they should be called safe_as_tree_code () instead?
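
To make the distinction between a "cheap" conversion and a "safe" one
concrete, here is a small self-contained sketch.  mock_code_helper is a
hypothetical, simplified stand-in and not the real class: the enums, the
encoding of function codes as negative values, and the unchecked_*
accessor name are all assumptions made for illustration.

```cpp
#include <cassert>

// Hypothetical miniature enums; the *_LAST / MAX_* enumerators act as
// the "not a valid code" sentinels.
enum mock_tree_code { MOCK_ERROR_MARK = 0, MOCK_PLUS_EXPR = 1,
                      MOCK_MAX_TREE_CODES = 2 };
enum mock_combined_fn { MOCK_CFN_SQRT = 1, MOCK_CFN_WIDEN_PLUS = 2,
                        MOCK_CFN_LAST = 3 };

class mock_code_helper
{
public:
  // Assumption of this sketch: tree codes are stored as non-negative
  // values and function codes as negative values in one int.
  mock_code_helper (mock_tree_code code) : rep ((int) code) {}
  mock_code_helper (mock_combined_fn fn) : rep (-(int) fn) {}

  bool is_tree_code () const { return rep >= 0; }
  bool is_fn_code () const { return rep < 0; }

  // "Cheap" conversion: only valid if is_tree_code () holds.
  mock_tree_code unchecked_as_tree_code () const
  { return (mock_tree_code) rep; }

  // "Safe" conversions: always valid, yield a sentinel on mismatch.
  mock_tree_code safe_as_tree_code () const
  { return is_tree_code () ? (mock_tree_code) rep : MOCK_MAX_TREE_CODES; }
  mock_combined_fn safe_as_fn_code () const
  { return is_fn_code () ? (mock_combined_fn) -rep : MOCK_CFN_LAST; }

private:
  int rep;
};
```

With a safe_* name, a caller that passes the result straight into a
switch or a tree-code-only helper gets a sentinel rather than a
reinterpreted function code.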


   int get_rep () const { return rep; }
   bool operator== (const code_helper &other) { return rep == other.rep; }
   bool operator!= (const code_helper &other) { return rep != other.rep; }
@@ -6657,6 +6661,54 @@ extern unsigned fndecl_dealloc_argno (tree);
    if nonnull, set the second argument to the referenced enclosing
    object or pointer.  Otherwise return null.  */
 extern tree get_attr_nonstring_decl (tree, tree * = NULL);
+/* Helper to transparently allow tree codes and builtin function codes
+   exist in one storage entity.  */
+class code_helper
+{

duplicate add of code_helper.

Sorry to raise these issues so late.

Richard.


* RE: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns
  2022-06-01 10:11     ` Richard Biener
@ 2022-06-06 17:20       ` Joel Hutton
  2022-06-07  8:18         ` Richard Sandiford
  0 siblings, 1 reply; 53+ messages in thread
From: Joel Hutton @ 2022-06-06 17:20 UTC (permalink / raw)
  To: Richard Biener; +Cc: Richard Sandiford, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 5524 bytes --]

> > Patches attached. They already incorporated the .cc rename, now
> > rebased to be after the change to tree.h
> 
> @@ -1412,8 +1412,7 @@ vect_recog_widen_op_pattern (vec_info *vinfo,
>                        2, oprnd, half_type, unprom, vectype);
> 
>    tree var = vect_recog_temp_ssa_var (itype, NULL);
> -  gimple *pattern_stmt = gimple_build_assign (var, wide_code,
> -                                             oprnd[0], oprnd[1]);
> +  gimple *pattern_stmt = gimple_build (var, wide_code, oprnd[0], oprnd[1]);
> 
> 
> you should be able to do without the new gimple_build overload
> by using
> 
>    gimple_seq stmts = NULL;
>    gimple_build (&stmts, wide_code, itype, oprnd[0], oprnd[1]);
>    gimple *pattern_stmt = gimple_seq_last_stmt (stmts);
> 
> because 'gimple_build' is an existing API.

Done.

The gimple_build overload was added at the request of Richard Sandiford; I assume removing it is OK with you, Richard S?
From Richard Sandiford:
> For example, I think we should hide this inside a new:
> 
>   gimple_build (var, wide_code, oprnd[0], oprnd[1]);
> 
> that works directly on code_helper, similarly to the new code_helper 
> gimple_build interfaces.




> 
> 
> -  if (TREE_CODE (gimple_assign_lhs (stmt)) != SSA_NAME)
> +  if (gimple_get_lhs (stmt) == NULL_TREE ||
> +      TREE_CODE(gimple_get_lhs (stmt)) != SSA_NAME)
>      return false;
> 
> The '||' should go on the next line, and there should be a space after TREE_CODE.
> 

Done.

> +  bool widen_arith = false;
> +  gimple_match_op res_op;
> +  if (!gimple_extract_op (stmt, &res_op))
> +    return false;
> +  code = res_op.code;
> +  op_type = res_op.num_ops;
> +
> +  if (is_gimple_assign (stmt))
> +  {
> +      widen_arith = (code == WIDEN_PLUS_EXPR
> +                    || code == WIDEN_MINUS_EXPR
> +                    || code == WIDEN_MULT_EXPR
> +                    || code == WIDEN_LSHIFT_EXPR);
> + }
> +  else
> +      widen_arith = gimple_call_flags (stmt) & ECF_WIDEN;
> 
> there seem to be formatting issues.  Also shouldn't you check
> if (res_op.code.is_tree_code ()) instead of is_gimple_assign?
> I also don't like the ECF_WIDEN "trick", just do as with the
> tree codes and explicitly enumerate widening ifns here.
> 

Done. I've set widen_arith to false for the non-tree-code case in the first patch, as the widening ifns are only introduced in the second patch.

> gimple_extract_op is a bit heavy-weight as well, so maybe
> instead simply do
> 
>   if (is_gimple_assign (stmt))
>     {
>       code = gimple_assign_rhs_code (stmt);
> ...
>     }
>   else if (gimple_call_internal_p (stmt))
>     {
>       code = gimple_call_internal_fn (stmt);
> ...
>     }
>   else
>     return false;

The patch was originally written as above; it was changed to use gimple_extract_op at the request of Richard Sandiford. I prefer gimple_extract_op as it's more compact, but I don't have strong feelings. If the Richards can agree on either version, I'm happy.


From Richard Sandiford:
> > +  if (is_gimple_assign (stmt))
> > +  {
> > +    code_or_ifn = gimple_assign_rhs_code (stmt);  }  else
> > +    code_or_ifn = gimple_call_combined_fn (stmt);
> 
> It might be possible to use gimple_extract_op here (only recently added).
> This would also provide the number of operands directly, instead of 
> needing "op_type".  It would also provide an array of operands.


> 
> +  code_helper c1=MAX_TREE_CODES, c2=MAX_TREE_CODES;
> 
> spaces before/after '='
> 

Done.

> @@ -12061,13 +12105,16 @@ supportable_widening_operation (vec_info *vinfo,
>    if (BYTES_BIG_ENDIAN && c1 != VEC_WIDEN_MULT_EVEN_EXPR)
>      std::swap (c1, c2);
> 
> +
>    if (code == FIX_TRUNC_EXPR)
>      {
> 
> unnecessary whitespace change.
> 
Fixed.

> diff --git a/gcc/tree.h b/gcc/tree.h
> index f84958933d51144bb6ce7cc41eca5f7f06814550..e51e34c051d9b91d1c02a4b2fefdb2b15606a36f 100644
> --- a/gcc/tree.h
> +++ b/gcc/tree.h
> @@ -92,6 +92,10 @@ public:
>    bool is_fn_code () const { return rep < 0; }
>    bool is_internal_fn () const;
>    bool is_builtin_fn () const;
> +  enum tree_code as_tree_code () const { return is_tree_code () ?
> +    (tree_code)* this : MAX_TREE_CODES; }
> +  combined_fn as_fn_code () const { return is_fn_code () ? (combined_fn) *this
> +    : CFN_LAST;}
> 
> hmm, the other as_* functions we have are not member functions.
> Also this semantically differs from the tree_code () conversion
> operator (that one was supposed to be "cheap").  The existing
> as_internal_fn for example is documented as being valid only if
> the code is actually an internal fn.  I see you are introducing
> the new function as convenience to get a "safe" not-a-X value,
> so maybe they should be called safe_as_tree_code () instead?
> 
SGTM. Done

> 
>    int get_rep () const { return rep; }
>    bool operator== (const code_helper &other) { return rep == other.rep; }
>    bool operator!= (const code_helper &other) { return rep != other.rep; }
> @@ -6657,6 +6661,54 @@ extern unsigned fndecl_dealloc_argno (tree);
>     if nonnull, set the second argument to the referenced enclosing
>     object or pointer.  Otherwise return null.  */
>  extern tree get_attr_nonstring_decl (tree, tree * = NULL);
> +/* Helper to transparently allow tree codes and builtin function codes
> +   exist in one storage entity.  */
> +class code_helper
> +{
> 
> duplicate add of code_helper.
Fixed.


Tests are being re-run.

OK with these changes?

[-- Attachment #2: 0001-Refactor-to-allow-internal_fn-s.patch --]
[-- Type: application/octet-stream, Size: 23246 bytes --]

From 58d1f19224bd6501b5238916871cf2c0f3ba8bd0 Mon Sep 17 00:00:00 2001
From: Joel Hutton <joel.hutton@arm.com>
Date: Wed, 25 Aug 2021 14:31:15 +0100
Subject: [PATCH 1/3] Refactor to allow internal_fn's

Hi all,

This refactor allows widening patterns (such as widen_plus/widen_minus) to be represented as
either internal_fns or tree_codes.

[vect-patterns] Refactor as internal_fn's

Refactor vect-patterns to allow patterns to be internal_fns, starting
with the widening plus/minus patterns.

gcc/ChangeLog:

	* tree.h (class code_helper): Add safe_as_tree_code, safe_as_fn_code
	helper functions.
	* tree-core.h (ECF_WIDEN): Flag to mark internal_fn as widening.
	* tree-vect-patterns.cc (vect_recog_widen_op_pattern): Refactor to
	use code_helper.
	* tree-vect-stmts.cc (vect_gen_widened_results_half): Refactor to
	use code_helper.
	(vect_create_vectorized_promotion_stmts): Refactor to use
	code_helper.
	(vectorizable_conversion): Refactor to use code_helper and to accept
	either a gimple_call or a gimple_assign.
	(supportable_widening_operation): Refactor to use code_helper.
	(supportable_narrowing_operation): Refactor to use code_helper.
	* tree-vectorizer.h (supportable_widening_operation): Change
	prototype to use code_helper.
	(supportable_narrowing_operation): Change prototype to use
	code_helper.
---
 gcc/tree-core.h           |   3 +
 gcc/tree-vect-patterns.cc |  11 +-
 gcc/tree-vect-stmts.cc    | 217 +++++++++++++++++++++++---------------
 gcc/tree-vectorizer.h     |  11 +-
 gcc/tree.h                |   4 +
 5 files changed, 153 insertions(+), 93 deletions(-)

diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index ab5fa01e5cb5fb56c1964b93b014ed55a4aa704a..cff6211080bced0bffb39e98039a6550897acf77 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -96,6 +96,9 @@ struct die_struct;
 /* Nonzero if this is a cold function.  */
 #define ECF_COLD		  (1 << 15)
 
+/* Nonzero if this is a widening function.  */
+#define ECF_WIDEN		  (1 << 16)
+
 /* Call argument flags.  */
 
 /* Nonzero if the argument is not used by the function.  */
diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index 0fad4dbd0945c6c176f3457b751e812f17fcd148..c011b8ede3c266b59f731e316efbec7d98e91068 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -25,6 +25,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "rtl.h"
 #include "tree.h"
 #include "gimple.h"
+#include "gimple-iterator.h"
+#include "gimple-fold.h"
 #include "ssa.h"
 #include "expmed.h"
 #include "optabs-tree.h"
@@ -1348,7 +1350,7 @@ vect_recog_sad_pattern (vec_info *vinfo,
 static gimple *
 vect_recog_widen_op_pattern (vec_info *vinfo,
 			     stmt_vec_info last_stmt_info, tree *type_out,
-			     tree_code orig_code, tree_code wide_code,
+			     tree_code orig_code, code_helper wide_code,
 			     bool shift_p, const char *name)
 {
   gimple *last_stmt = last_stmt_info->stmt;
@@ -1391,7 +1393,7 @@ vect_recog_widen_op_pattern (vec_info *vinfo,
       vecctype = get_vectype_for_scalar_type (vinfo, ctype);
     }
 
-  enum tree_code dummy_code;
+  code_helper dummy_code;
   int dummy_int;
   auto_vec<tree> dummy_vec;
   if (!vectype
@@ -1412,8 +1414,9 @@ vect_recog_widen_op_pattern (vec_info *vinfo,
 		       2, oprnd, half_type, unprom, vectype);
 
   tree var = vect_recog_temp_ssa_var (itype, NULL);
-  gimple *pattern_stmt = gimple_build_assign (var, wide_code,
-					      oprnd[0], oprnd[1]);
+  gimple_seq stmts = NULL;
+  gimple_build (&stmts, wide_code, itype, oprnd[0], oprnd[1]);
+  gimple *pattern_stmt = gimple_seq_last_stmt (stmts);
 
   if (vecctype != vecitype)
     pattern_stmt = vect_convert_output (vinfo, last_stmt_info, ctype,
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 346d8ce280437e00bfeb19a4b4adc59eb96207f9..9b31425352689d409b8c0aa0c1d5c69e72db869a 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -4636,7 +4636,7 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info,
    STMT_INFO is the original scalar stmt that we are vectorizing.  */
 
 static gimple *
-vect_gen_widened_results_half (vec_info *vinfo, enum tree_code code,
+vect_gen_widened_results_half (vec_info *vinfo, code_helper ch,
                                tree vec_oprnd0, tree vec_oprnd1, int op_type,
 			       tree vec_dest, gimple_stmt_iterator *gsi,
 			       stmt_vec_info stmt_info)
@@ -4645,14 +4645,15 @@ vect_gen_widened_results_half (vec_info *vinfo, enum tree_code code,
   tree new_temp;
 
   /* Generate half of the widened result:  */
-  gcc_assert (op_type == TREE_CODE_LENGTH (code));
   if (op_type != binary_op)
     vec_oprnd1 = NULL;
-  new_stmt = gimple_build_assign (vec_dest, code, vec_oprnd0, vec_oprnd1);
+
+  gimple_seq stmts = NULL;
+  gimple_build (&stmts, ch, vec_oprnd0, vec_oprnd1);
+  new_stmt = gimple_seq_last_stmt (stmts);
   new_temp = make_ssa_name (vec_dest, new_stmt);
-  gimple_assign_set_lhs (new_stmt, new_temp);
+  gimple_set_lhs (new_stmt, new_temp);
   vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
-
   return new_stmt;
 }
 
@@ -4729,8 +4730,8 @@ vect_create_vectorized_promotion_stmts (vec_info *vinfo,
 					vec<tree> *vec_oprnds1,
 					stmt_vec_info stmt_info, tree vec_dest,
 					gimple_stmt_iterator *gsi,
-					enum tree_code code1,
-					enum tree_code code2, int op_type)
+					code_helper ch1,
+					code_helper ch2, int op_type)
 {
   int i;
   tree vop0, vop1, new_tmp1, new_tmp2;
@@ -4746,10 +4747,10 @@ vect_create_vectorized_promotion_stmts (vec_info *vinfo,
 	vop1 = NULL_TREE;
 
       /* Generate the two halves of promotion operation.  */
-      new_stmt1 = vect_gen_widened_results_half (vinfo, code1, vop0, vop1,
+      new_stmt1 = vect_gen_widened_results_half (vinfo, ch1, vop0, vop1,
 						 op_type, vec_dest, gsi,
 						 stmt_info);
-      new_stmt2 = vect_gen_widened_results_half (vinfo, code2, vop0, vop1,
+      new_stmt2 = vect_gen_widened_results_half (vinfo, ch2, vop0, vop1,
 						 op_type, vec_dest, gsi,
 						 stmt_info);
       if (is_gimple_call (new_stmt1))
@@ -4846,8 +4847,9 @@ vectorizable_conversion (vec_info *vinfo,
   tree scalar_dest;
   tree op0, op1 = NULL_TREE;
   loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo);
-  enum tree_code code, code1 = ERROR_MARK, code2 = ERROR_MARK;
-  enum tree_code codecvt1 = ERROR_MARK, codecvt2 = ERROR_MARK;
+  tree_code tc1;
+  code_helper code, code1, code2;
+  code_helper codecvt1 = ERROR_MARK, codecvt2 = ERROR_MARK;
   tree new_temp;
   enum vect_def_type dt[2] = {vect_unknown_def_type, vect_unknown_def_type};
   int ndts = 2;
@@ -4876,31 +4878,42 @@ vectorizable_conversion (vec_info *vinfo,
       && ! vec_stmt)
     return false;
 
-  gassign *stmt = dyn_cast <gassign *> (stmt_info->stmt);
-  if (!stmt)
+  gimple* stmt = stmt_info->stmt;
+  if (!(is_gimple_assign (stmt) || is_gimple_call (stmt)))
     return false;
 
-  if (TREE_CODE (gimple_assign_lhs (stmt)) != SSA_NAME)
+  if (gimple_get_lhs (stmt) == NULL_TREE 
+      || TREE_CODE (gimple_get_lhs (stmt)) != SSA_NAME)
     return false;
 
-  code = gimple_assign_rhs_code (stmt);
-  if (!CONVERT_EXPR_CODE_P (code)
-      && code != FIX_TRUNC_EXPR
-      && code != FLOAT_EXPR
-      && code != WIDEN_PLUS_EXPR
-      && code != WIDEN_MINUS_EXPR
-      && code != WIDEN_MULT_EXPR
-      && code != WIDEN_LSHIFT_EXPR)
+  if (TREE_CODE (gimple_get_lhs (stmt)) != SSA_NAME)
     return false;
 
-  bool widen_arith = (code == WIDEN_PLUS_EXPR
-		      || code == WIDEN_MINUS_EXPR
-		      || code == WIDEN_MULT_EXPR
-		      || code == WIDEN_LSHIFT_EXPR);
-  op_type = TREE_CODE_LENGTH (code);
+  bool widen_arith = false;
+  gimple_match_op res_op;
+  if (!gimple_extract_op (stmt, &res_op))
+    return false;
+  code = res_op.code;
+  op_type = res_op.num_ops;
+
+  if (res_op.code.is_tree_code ())
+  {
+      widen_arith = (code == WIDEN_PLUS_EXPR
+		     || code == WIDEN_MINUS_EXPR
+		     || code == WIDEN_MULT_EXPR
+		     || code == WIDEN_LSHIFT_EXPR);
+ }
+  else
+      widen_arith = false;
+
+  if (!widen_arith
+      && !CONVERT_EXPR_CODE_P (code)
+      && code != FIX_TRUNC_EXPR
+      && code != FLOAT_EXPR)
+    return false;
 
   /* Check types of lhs and rhs.  */
-  scalar_dest = gimple_assign_lhs (stmt);
+  scalar_dest = gimple_get_lhs (stmt);
   lhs_type = TREE_TYPE (scalar_dest);
   vectype_out = STMT_VINFO_VECTYPE (stmt_info);
 
@@ -4938,10 +4951,14 @@ vectorizable_conversion (vec_info *vinfo,
 
   if (op_type == binary_op)
     {
-      gcc_assert (code == WIDEN_MULT_EXPR || code == WIDEN_LSHIFT_EXPR
-		  || code == WIDEN_PLUS_EXPR || code == WIDEN_MINUS_EXPR);
+      gcc_assert (code == WIDEN_MULT_EXPR
+		  || code == WIDEN_LSHIFT_EXPR
+		  || code == WIDEN_PLUS_EXPR
+		  || code == WIDEN_MINUS_EXPR);
 
-      op1 = gimple_assign_rhs2 (stmt);
+
+      op1 = is_gimple_assign (stmt) ? gimple_assign_rhs2 (stmt) :
+				     gimple_call_arg (stmt, 0);
       tree vectype1_in;
       if (!vect_is_simple_use (vinfo, stmt_info, slp_node, 1,
 			       &op1, &slp_op1, &dt[1], &vectype1_in))
@@ -5025,8 +5042,12 @@ vectorizable_conversion (vec_info *vinfo,
 	  && code != FLOAT_EXPR
 	  && !CONVERT_EXPR_CODE_P (code))
 	return false;
-      if (supportable_convert_operation (code, vectype_out, vectype_in, &code1))
+      if (supportable_convert_operation (code.safe_as_tree_code (), vectype_out,
+					 vectype_in, &tc1))
+      {
+	code1 = tc1;
 	break;
+      }
       /* FALLTHRU */
     unsupported:
       if (dump_enabled_p ())
@@ -5037,9 +5058,11 @@ vectorizable_conversion (vec_info *vinfo,
     case WIDEN:
       if (known_eq (nunits_in, nunits_out))
 	{
-	  if (!supportable_half_widening_operation (code, vectype_out,
-						   vectype_in, &code1))
+	  if (!supportable_half_widening_operation (code.safe_as_tree_code (),
+						    vectype_out, vectype_in,
+						    &tc1))
 	    goto unsupported;
+	  code1 = tc1;
 	  gcc_assert (!(multi_step_cvt && op_type == binary_op));
 	  break;
 	}
@@ -5073,14 +5096,17 @@ vectorizable_conversion (vec_info *vinfo,
 
 	  if (GET_MODE_SIZE (rhs_mode) == fltsz)
 	    {
-	      if (!supportable_convert_operation (code, vectype_out,
-						  cvt_type, &codecvt1))
+	      tc1 = ERROR_MARK;
+	      if (!supportable_convert_operation (code.safe_as_tree_code (),
+						  vectype_out,
+						  cvt_type, &tc1))
 		goto unsupported;
+	      codecvt1 = tc1;
 	    }
-	  else if (!supportable_widening_operation (vinfo, code, stmt_info,
-						    vectype_out, cvt_type,
-						    &codecvt1, &codecvt2,
-						    &multi_step_cvt,
+	  else if (!supportable_widening_operation (vinfo, code,
+						    stmt_info, vectype_out,
+						    cvt_type, &codecvt1,
+						    &codecvt2, &multi_step_cvt,
 						    &interm_types))
 	    continue;
 	  else
@@ -5088,8 +5114,9 @@ vectorizable_conversion (vec_info *vinfo,
 
 	  if (supportable_widening_operation (vinfo, NOP_EXPR, stmt_info,
 					      cvt_type,
-					      vectype_in, &code1, &code2,
-					      &multi_step_cvt, &interm_types))
+					      vectype_in, &code1,
+					      &code2, &multi_step_cvt,
+					      &interm_types))
 	    {
 	      found_mode = true;
 	      break;
@@ -5111,10 +5138,14 @@ vectorizable_conversion (vec_info *vinfo,
 
     case NARROW:
       gcc_assert (op_type == unary_op);
-      if (supportable_narrowing_operation (code, vectype_out, vectype_in,
-					   &code1, &multi_step_cvt,
+      if (supportable_narrowing_operation (code.safe_as_tree_code (), vectype_out,
+					   vectype_in,
+					   &tc1, &multi_step_cvt,
 					   &interm_types))
-	break;
+	{
+	  code1 = tc1;
+	  break;
+	}
 
       if (code != FIX_TRUNC_EXPR
 	  || GET_MODE_SIZE (lhs_mode) >= GET_MODE_SIZE (rhs_mode))
@@ -5125,13 +5156,18 @@ vectorizable_conversion (vec_info *vinfo,
       cvt_type = get_same_sized_vectype (cvt_type, vectype_in);
       if (cvt_type == NULL_TREE)
 	goto unsupported;
-      if (!supportable_convert_operation (code, cvt_type, vectype_in,
-					  &codecvt1))
+      if (!supportable_convert_operation (code.safe_as_tree_code (), cvt_type,
+					  vectype_in,
+					  &tc1))
 	goto unsupported;
+      codecvt1 = tc1;
       if (supportable_narrowing_operation (NOP_EXPR, vectype_out, cvt_type,
-					   &code1, &multi_step_cvt,
+					   &tc1, &multi_step_cvt,
 					   &interm_types))
-	break;
+	{
+	  code1 = tc1;
+	  break;
+	}
       goto unsupported;
 
     default:
@@ -5245,8 +5281,9 @@ vectorizable_conversion (vec_info *vinfo,
       FOR_EACH_VEC_ELT (vec_oprnds0, i, vop0)
 	{
 	  /* Arguments are ready, create the new vector stmt.  */
-	  gcc_assert (TREE_CODE_LENGTH (code1) == unary_op);
-	  gassign *new_stmt = gimple_build_assign (vec_dest, code1, vop0);
+	  gcc_assert (TREE_CODE_LENGTH ((tree_code) code1) == unary_op);
+	  gassign *new_stmt = gimple_build_assign (vec_dest,
+						   code1.safe_as_tree_code (), vop0);
 	  new_temp = make_ssa_name (vec_dest, new_stmt);
 	  gimple_assign_set_lhs (new_stmt, new_temp);
 	  vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
@@ -5278,7 +5315,7 @@ vectorizable_conversion (vec_info *vinfo,
       for (i = multi_step_cvt; i >= 0; i--)
 	{
 	  tree this_dest = vec_dsts[i];
-	  enum tree_code c1 = code1, c2 = code2;
+	  code_helper c1 = code1, c2 = code2;
 	  if (i == 0 && codecvt2 != ERROR_MARK)
 	    {
 	      c1 = codecvt1;
@@ -5288,7 +5325,8 @@ vectorizable_conversion (vec_info *vinfo,
 	    vect_create_half_widening_stmts (vinfo, &vec_oprnds0,
 						    &vec_oprnds1, stmt_info,
 						    this_dest, gsi,
-						    c1, op_type);
+						    c1.safe_as_tree_code (),
+						    op_type);
 	  else
 	    vect_create_vectorized_promotion_stmts (vinfo, &vec_oprnds0,
 						    &vec_oprnds1, stmt_info,
@@ -5301,9 +5339,11 @@ vectorizable_conversion (vec_info *vinfo,
 	  gimple *new_stmt;
 	  if (cvt_type)
 	    {
-	      gcc_assert (TREE_CODE_LENGTH (codecvt1) == unary_op);
+	      gcc_assert (TREE_CODE_LENGTH ((tree_code) codecvt1) == unary_op);
 	      new_temp = make_ssa_name (vec_dest);
-	      new_stmt = gimple_build_assign (new_temp, codecvt1, vop0);
+	      new_stmt = gimple_build_assign (new_temp,
+					      codecvt1.safe_as_tree_code (),
+					      vop0);
 	      vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
 	    }
 	  else
@@ -5327,10 +5367,10 @@ vectorizable_conversion (vec_info *vinfo,
       if (cvt_type)
 	FOR_EACH_VEC_ELT (vec_oprnds0, i, vop0)
 	  {
-	    gcc_assert (TREE_CODE_LENGTH (codecvt1) == unary_op);
+	    gcc_assert (TREE_CODE_LENGTH (((tree_code) codecvt1)) == unary_op);
 	    new_temp = make_ssa_name (vec_dest);
 	    gassign *new_stmt
-	      = gimple_build_assign (new_temp, codecvt1, vop0);
+	      = gimple_build_assign (new_temp, codecvt1.safe_as_tree_code (), vop0);
 	    vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
 	    vec_oprnds0[i] = new_temp;
 	  }
@@ -5338,7 +5378,7 @@ vectorizable_conversion (vec_info *vinfo,
       vect_create_vectorized_demotion_stmts (vinfo, &vec_oprnds0,
 					     multi_step_cvt,
 					     stmt_info, vec_dsts, gsi,
-					     slp_node, code1);
+					     slp_node, code1.safe_as_tree_code ());
       break;
     }
   if (!slp_node)
@@ -11926,9 +11966,11 @@ vect_maybe_update_slp_op_vectype (slp_tree op, tree vectype)
 
 bool
 supportable_widening_operation (vec_info *vinfo,
-				enum tree_code code, stmt_vec_info stmt_info,
+				code_helper code,
+				stmt_vec_info stmt_info,
 				tree vectype_out, tree vectype_in,
-                                enum tree_code *code1, enum tree_code *code2,
+				code_helper *code1,
+				code_helper *code2,
                                 int *multi_step_cvt,
                                 vec<tree> *interm_types)
 {
@@ -11939,7 +11981,7 @@ supportable_widening_operation (vec_info *vinfo,
   optab optab1, optab2;
   tree vectype = vectype_in;
   tree wide_vectype = vectype_out;
-  enum tree_code c1, c2;
+  code_helper c1 = MAX_TREE_CODES, c2 = MAX_TREE_CODES;
   int i;
   tree prev_type, intermediate_type;
   machine_mode intermediate_mode, prev_mode;
@@ -11949,7 +11991,7 @@ supportable_widening_operation (vec_info *vinfo,
   if (loop_info)
     vect_loop = LOOP_VINFO_LOOP (loop_info);
 
-  switch (code)
+  switch (code.safe_as_tree_code ())
     {
     case WIDEN_MULT_EXPR:
       /* The result of a vectorized widening operation usually requires
@@ -11990,8 +12032,9 @@ supportable_widening_operation (vec_info *vinfo,
 	  && !nested_in_vect_loop_p (vect_loop, stmt_info)
 	  && supportable_widening_operation (vinfo, VEC_WIDEN_MULT_EVEN_EXPR,
 					     stmt_info, vectype_out,
-					     vectype_in, code1, code2,
-					     multi_step_cvt, interm_types))
+					     vectype_in, code1,
+					     code2, multi_step_cvt,
+					     interm_types))
         {
           /* Elements in a vector with vect_used_by_reduction property cannot
              be reordered if the use chain with this property does not have the
@@ -12054,6 +12097,9 @@ supportable_widening_operation (vec_info *vinfo,
       c2 = VEC_UNPACK_FIX_TRUNC_HI_EXPR;
       break;
 
+    case MAX_TREE_CODES:
+      break;
+
     default:
       gcc_unreachable ();
     }
@@ -12064,10 +12110,12 @@ supportable_widening_operation (vec_info *vinfo,
   if (code == FIX_TRUNC_EXPR)
     {
       /* The signedness is determined from output operand.  */
-      optab1 = optab_for_tree_code (c1, vectype_out, optab_default);
-      optab2 = optab_for_tree_code (c2, vectype_out, optab_default);
+      optab1 = optab_for_tree_code (c1.safe_as_tree_code (), vectype_out,
+				    optab_default);
+      optab2 = optab_for_tree_code (c2.safe_as_tree_code (), vectype_out,
+				    optab_default);
     }
-  else if (CONVERT_EXPR_CODE_P (code)
+  else if (CONVERT_EXPR_CODE_P (code.safe_as_tree_code ())
 	   && VECTOR_BOOLEAN_TYPE_P (wide_vectype)
 	   && VECTOR_BOOLEAN_TYPE_P (vectype)
 	   && TYPE_MODE (wide_vectype) == TYPE_MODE (vectype)
@@ -12080,8 +12128,8 @@ supportable_widening_operation (vec_info *vinfo,
     }
   else
     {
-      optab1 = optab_for_tree_code (c1, vectype, optab_default);
-      optab2 = optab_for_tree_code (c2, vectype, optab_default);
+      optab1 = optab_for_tree_code (c1.safe_as_tree_code (), vectype, optab_default);
+      optab2 = optab_for_tree_code (c2.safe_as_tree_code (), vectype, optab_default);
     }
 
   if (!optab1 || !optab2)
@@ -12092,8 +12140,12 @@ supportable_widening_operation (vec_info *vinfo,
        || (icode2 = optab_handler (optab2, vec_mode)) == CODE_FOR_nothing)
     return false;
 
-  *code1 = c1;
-  *code2 = c2;
+  if (code.is_tree_code ())
+  {
+    *code1 = c1;
+    *code2 = c2;
+  }
+
 
   if (insn_data[icode1].operand[0].mode == TYPE_MODE (wide_vectype)
       && insn_data[icode2].operand[0].mode == TYPE_MODE (wide_vectype))
@@ -12114,7 +12166,7 @@ supportable_widening_operation (vec_info *vinfo,
   prev_type = vectype;
   prev_mode = vec_mode;
 
-  if (!CONVERT_EXPR_CODE_P (code))
+  if (!CONVERT_EXPR_CODE_P ((tree_code) code))
     return false;
 
   /* We assume here that there will not be more than MAX_INTERM_CVT_STEPS
@@ -12145,8 +12197,10 @@ supportable_widening_operation (vec_info *vinfo,
 	}
       else
 	{
-	  optab3 = optab_for_tree_code (c1, intermediate_type, optab_default);
-	  optab4 = optab_for_tree_code (c2, intermediate_type, optab_default);
+	  optab3 = optab_for_tree_code (c1.safe_as_tree_code (), intermediate_type,
+					optab_default);
+	  optab4 = optab_for_tree_code (c2.safe_as_tree_code (), intermediate_type,
+					optab_default);
 	}
 
       if (!optab3 || !optab4
@@ -12181,7 +12235,6 @@ supportable_widening_operation (vec_info *vinfo,
   return false;
 }
 
-
 /* Function supportable_narrowing_operation
 
    Check whether an operation represented by the code CODE is a
@@ -12205,7 +12258,7 @@ supportable_widening_operation (vec_info *vinfo,
 bool
 supportable_narrowing_operation (enum tree_code code,
 				 tree vectype_out, tree vectype_in,
-				 enum tree_code *code1, int *multi_step_cvt,
+				 tree_code* _code1, int *multi_step_cvt,
                                  vec<tree> *interm_types)
 {
   machine_mode vec_mode;
@@ -12217,8 +12270,8 @@ supportable_narrowing_operation (enum tree_code code,
   tree intermediate_type, prev_type;
   machine_mode intermediate_mode, prev_mode;
   int i;
-  unsigned HOST_WIDE_INT n_elts;
   bool uns;
+  tree_code * code1 = (tree_code*) _code1;
 
   *multi_step_cvt = 0;
   switch (code)
@@ -12227,9 +12280,8 @@ supportable_narrowing_operation (enum tree_code code,
       c1 = VEC_PACK_TRUNC_EXPR;
       if (VECTOR_BOOLEAN_TYPE_P (narrow_vectype)
 	  && VECTOR_BOOLEAN_TYPE_P (vectype)
-	  && SCALAR_INT_MODE_P (TYPE_MODE (vectype))
-	  && TYPE_VECTOR_SUBPARTS (vectype).is_constant (&n_elts)
-	  && n_elts < BITS_PER_UNIT)
+	  && TYPE_MODE (narrow_vectype) == TYPE_MODE (vectype)
+	  && SCALAR_INT_MODE_P (TYPE_MODE (vectype)))
 	optab1 = vec_pack_sbool_trunc_optab;
       else
 	optab1 = optab_for_tree_code (c1, vectype, optab_default);
@@ -12320,9 +12372,8 @@ supportable_narrowing_operation (enum tree_code code,
 	  = lang_hooks.types.type_for_mode (intermediate_mode, uns);
       if (VECTOR_BOOLEAN_TYPE_P (intermediate_type)
 	  && VECTOR_BOOLEAN_TYPE_P (prev_type)
-	  && SCALAR_INT_MODE_P (prev_mode)
-	  && TYPE_VECTOR_SUBPARTS (intermediate_type).is_constant (&n_elts)
-	  && n_elts < BITS_PER_UNIT)
+	  && intermediate_mode == prev_mode
+	  && SCALAR_INT_MODE_P (prev_mode))
 	interm_optab = vec_pack_sbool_trunc_optab;
       else
 	interm_optab
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index 642eb0aeb21264cd736a479b1ec25357abef29cd..50ff8eeac1e6b9859302bd784f10ee3d8ff4b4dc 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -2120,13 +2120,12 @@ extern bool vect_is_simple_use (vec_info *, stmt_vec_info, slp_tree,
 				enum vect_def_type *,
 				tree *, stmt_vec_info * = NULL);
 extern bool vect_maybe_update_slp_op_vectype (slp_tree, tree);
-extern bool supportable_widening_operation (vec_info *,
-					    enum tree_code, stmt_vec_info,
-					    tree, tree, enum tree_code *,
-					    enum tree_code *, int *,
-					    vec<tree> *);
+extern bool supportable_widening_operation (vec_info*, code_helper,
+					    stmt_vec_info, tree, tree,
+					    code_helper*, code_helper*,
+					    int*, vec<tree> *);
 extern bool supportable_narrowing_operation (enum tree_code, tree, tree,
-					     enum tree_code *, int *,
+					     tree_code *, int *,
 					     vec<tree> *);
 
 extern unsigned record_stmt_cost (stmt_vector_for_cost *, int,
diff --git a/gcc/tree.h b/gcc/tree.h
index f84958933d51144bb6ce7cc41eca5f7f06814550..00b0e4d1c696633fe38baad5295b1f90398d53fc 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -92,6 +92,10 @@ public:
   bool is_fn_code () const { return rep < 0; }
   bool is_internal_fn () const;
   bool is_builtin_fn () const;
+  enum tree_code safe_as_tree_code () const { return is_tree_code () ?
+    (tree_code)* this : MAX_TREE_CODES; }
+  combined_fn safe_as_fn_code () const { return is_fn_code () ? (combined_fn) *this
+    : CFN_LAST;}
   int get_rep () const { return rep; }
   bool operator== (const code_helper &other) { return rep == other.rep; }
   bool operator!= (const code_helper &other) { return rep != other.rep; }
-- 
2.17.1


[-- Attachment #3: 0002-Refactor-widen_plus-as-internal_fn.patch --]
[-- Type: application/octet-stream, Size: 23132 bytes --]

From 233a24f2a4eeced2fd4e99578e6ea81ec8622192 Mon Sep 17 00:00:00 2001
From: Joel Hutton <joel.hutton@arm.com>
Date: Wed, 26 Jan 2022 14:00:17 +0000
Subject: [PATCH 2/3] Refactor widen_plus as internal_fn

This patch replaces the existing tree_code widen_plus and widen_minus
patterns with internal_fn versions.

DEF_INTERNAL_OPTAB_MULTI_FN is like DEF_INTERNAL_OPTAB_FN except that it
provides convenience wrappers for defining conversions that require a
hi/lo split, such as widening and narrowing operations.  Each definition
for <NAME> requires an optab named <OPTAB> and two other optabs that you
specify for signed and unsigned.  The hi/lo pair is necessary because the
widening operations take n narrow elements as inputs and return n/2 wide
elements as outputs.  The 'lo' operation operates on the first n/2
elements of the input; the 'hi' operation operates on the second n/2
elements.  Defining an internal_fn along with hi/lo variations allows a
single internal function to be returned from a vect_recog function and
later expanded to hi/lo.
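
The hi/lo split can be pictured with plain arrays.  The following is only
an illustrative scalar model, not vectorizer code: mock_widen_plus_lo and
mock_widen_plus_hi are hypothetical helpers, the element types are chosen
arbitrarily, and target endianness (which can swap the roles of the two
halves) is ignored.  Both inputs are assumed to have the same, even
length n.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// 'lo' half: widen and add the first n/2 narrow elements.
static std::vector<int16_t>
mock_widen_plus_lo (const std::vector<int8_t> &a,
                    const std::vector<int8_t> &b)
{
  std::size_t half = a.size () / 2;
  std::vector<int16_t> out (half);
  for (std::size_t i = 0; i < half; ++i)
    // Operands promote to int, so the sum cannot overflow the narrow type.
    out[i] = static_cast<int16_t> (a[i] + b[i]);
  return out;
}

// 'hi' half: widen and add the second n/2 narrow elements.
static std::vector<int16_t>
mock_widen_plus_hi (const std::vector<int8_t> &a,
                    const std::vector<int8_t> &b)
{
  std::size_t half = a.size () / 2;
  std::vector<int16_t> out (half);
  for (std::size_t i = 0; i < half; ++i)
    out[i] = static_cast<int16_t> (a[half + i] + b[half + i]);
  return out;
}
```

Each helper consumes n narrow inputs and produces n/2 wide outputs, so
the pair together covers every input element exactly once.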

DEF_INTERNAL_OPTAB_MULTI_FN is used in internal-fn.def to register a
widening internal_fn.  It is defined differently in different places, and
internal-fn.def is included from each of those places so that the
parameters given can be reused:
  internal-fn.cc: first defined to expand to the hi/lo signed/unsigned
    optabs, later redefined to generate the 'expand_' functions for the
    hi/lo versions of the fn.
  internal-fn.def: defined to invoke DEF_INTERNAL_OPTAB_FN for the
    original and hi/lo variants of the internal_fn.

For example:
  IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO
  for aarch64:
    IFN_VEC_WIDEN_PLUS_HI -> vec_widen_<su>addl_hi_<mode> -> (u/s)addl2
    IFN_VEC_WIDEN_PLUS_LO -> vec_widen_<su>addl_lo_<mode> -> (u/s)addl

This gives the same functionality as the previous WIDEN_PLUS/WIDEN_MINUS
tree codes, which are expanded into VEC_WIDEN_PLUS_LO and
VEC_WIDEN_PLUS_HI (and the corresponding VEC_WIDEN_MINUS codes).
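
The "defined differently in different places" scheme described above is
the usual X-macro technique.  The following is a minimal self-contained
sketch of that idea, not the patch's code: it swaps the repeated
inclusion of internal-fn.def for a list macro so that it fits in one
file, and WIDEN_FN_LIST, sketch_fn, sketch_fn_names and sketch_split_fn
are all hypothetical names.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for the entries internal-fn.def would provide.
#define WIDEN_FN_LIST(X) \
  X (VEC_WIDEN_PLUS)     \
  X (VEC_WIDEN_MINUS)

// First expansion: every entry yields the base function immediately
// followed by its LO and HI variants, so their values are consecutive.
enum sketch_fn
{
#define X(NAME) SK_##NAME, SK_##NAME##_LO, SK_##NAME##_HI,
  WIDEN_FN_LIST (X)
#undef X
  SK_LAST
};

// Second expansion of the same list: a name table in the same order.
static const char *const sketch_fn_names[] = {
#define X(NAME) #NAME, #NAME "_LO", #NAME "_HI",
  WIDEN_FN_LIST (X)
#undef X
  "<invalid>"
};

// Because of the ordering established by the enum expansion, the lo/hi
// variants of a base function sit at offsets +1 and +2.
inline void
sketch_split_fn (sketch_fn fn, sketch_fn *lo, sketch_fn *hi)
{
  *lo = sketch_fn (fn + 1);
  *hi = sketch_fn (fn + 2);
}
```

The offset arithmetic only holds while every expansion emits the three
variants in the same order, which is the reason for keeping all of them
in a single macro definition.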

gcc/ChangeLog:

2022-04-13  Joel Hutton  <joel.hutton@arm.com>
2022-04-13  Tamar Christina  <tamar.christina@arm.com>

	* internal-fn.cc (INCLUDE_MAP): Include maps for use in optab
    lookup.
	(DEF_INTERNAL_OPTAB_MULTI_FN): Macro to define an internal_fn that
    expands into multiple internal_fns (for widening).
	(ifn_cmp): Function to compare ifn's for sorting/searching.
	(lookup_multi_ifn_optab): Add lookup function.
	(lookup_multi_internal_fn): Add lookup function.
	(commutative_binary_fn_p): Add widen_plus fn's.
	* internal-fn.def (DEF_INTERNAL_OPTAB_MULTI_FN): Define widening
    plus,minus functions.
	(VEC_WIDEN_PLUS): Replacement for VEC_WIDEN_PLUS tree code.
	(VEC_WIDEN_MINUS): Replacement for VEC_WIDEN_MINUS tree code.
	* internal-fn.h (GCC_INTERNAL_FN_H): Add headers.
	(lookup_multi_ifn_optab): Add prototype.
	(lookup_multi_internal_fn): Add prototype.
	* optabs.cc (commutative_optab_p): Add widening plus, minus optabs.
	* optabs.def (OPTAB_CD): Add widening add, sub optabs.
	* tree-core.h (ECF_MULTI): Flag to indicate if a function decays
    into hi/lo parts.
	* tree-vect-patterns.cc (vect_recog_widen_op_pattern): Support
    patterns with a hi/lo split.
	(vect_recog_widen_plus_pattern): Refactor to return
    IFN_VEC_WIDEN_PLUS.
	(vect_recog_widen_minus_pattern): Refactor to return new
    IFN_VEC_WIDEN_MINUS.
	* tree-vect-stmts.cc (vectorizable_conversion): Add widen plus/minus
    ifn support.
	(supportable_widening_operation): Add widen plus/minus ifn support.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/vect-widen-add.c: Test that new
    IFN_VEC_WIDEN_PLUS is being used.
	* gcc.target/aarch64/vect-widen-sub.c: Test that new
    IFN_VEC_WIDEN_MINUS is being used.
---
 gcc/internal-fn.cc                            | 107 ++++++++++++++++++
 gcc/internal-fn.def                           |  23 ++++
 gcc/internal-fn.h                             |   7 ++
 gcc/optabs.cc                                 |  12 +-
 gcc/optabs.def                                |   2 +
 .../gcc.target/aarch64/vect-widen-add.c       |   4 +-
 .../gcc.target/aarch64/vect-widen-sub.c       |   4 +-
 gcc/tree-core.h                               |   4 +
 gcc/tree-vect-patterns.cc                     |  37 ++++--
 gcc/tree-vect-stmts.cc                        |  65 ++++++++++-
 10 files changed, 248 insertions(+), 17 deletions(-)

diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 8b1733e20c4455e4e8c383c92fe859f4256cae69..e95b13af884f67990ad43c286990a351e2bd641b 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -17,6 +17,7 @@ You should have received a copy of the GNU General Public License
 along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
+#define INCLUDE_MAP
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
@@ -70,6 +71,26 @@ const int internal_fn_flags_array[] = {
   0
 };
 
+const enum internal_fn internal_fn_hilo_keys_array[] = {
+#undef DEF_INTERNAL_OPTAB_MULTI_FN
+#define DEF_INTERNAL_OPTAB_MULTI_FN(NAME, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \
+  IFN_##NAME##_LO, \
+  IFN_##NAME##_HI,
+#include "internal-fn.def"
+  IFN_LAST
+#undef DEF_INTERNAL_OPTAB_MULTI_FN
+};
+
+const optab internal_fn_hilo_values_array[] = {
+#undef DEF_INTERNAL_OPTAB_MULTI_FN
+#define DEF_INTERNAL_OPTAB_MULTI_FN(NAME, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \
+  SOPTAB##_lo_optab, UOPTAB##_lo_optab, \
+  SOPTAB##_hi_optab, UOPTAB##_hi_optab,
+#include "internal-fn.def"
+  unknown_optab, unknown_optab
+#undef DEF_INTERNAL_OPTAB_MULTI_FN
+};
+
 /* Return the internal function called NAME, or IFN_LAST if there's
    no such function.  */
 
@@ -90,6 +111,62 @@ lookup_internal_fn (const char *name)
   return entry ? *entry : IFN_LAST;
 }
 
+static int
+ifn_cmp (const void *a_, const void *b_)
+{
+  typedef std::pair<enum internal_fn, unsigned> ifn_pair;
+  auto *a = (const std::pair<ifn_pair, optab> *)a_;
+  auto *b = (const std::pair<ifn_pair, optab> *)b_;
+  return (int) (a->first.first) - (b->first.first);
+}
+
+/* Return the optab belonging to the given internal function FN for the given
+   SIGN, or unknown_optab if there is none.  */
+
+optab
+lookup_multi_ifn_optab (enum internal_fn fn, unsigned sign)
+{
+  typedef std::pair<enum internal_fn, unsigned> ifn_pair;
+  typedef auto_vec <std::pair<ifn_pair, optab>>fn_to_optab_map_type;
+  static fn_to_optab_map_type *fn_to_optab_map;
+
+  if (!fn_to_optab_map)
+    {
+      unsigned num
+	= sizeof (internal_fn_hilo_keys_array) / sizeof (enum internal_fn);
+      fn_to_optab_map = new fn_to_optab_map_type ();
+      for (unsigned int i = 0; i < num - 1; ++i)
+	{
+	  enum internal_fn fn = internal_fn_hilo_keys_array[i];
+	  optab v1 = internal_fn_hilo_values_array[2*i];
+	  optab v2 = internal_fn_hilo_values_array[2*i + 1];
+	  ifn_pair key1 (fn, 0);
+	  fn_to_optab_map->safe_push ({key1, v1});
+	  ifn_pair key2 (fn, 1);
+	  fn_to_optab_map->safe_push ({key2, v2});
+	}
+	fn_to_optab_map->qsort(ifn_cmp);
+    }
+
+  ifn_pair new_pair (fn, sign ? 1 : 0);
+  optab tmp;
+  std::pair<ifn_pair,optab> pair_wrap (new_pair, tmp);
+  auto entry = fn_to_optab_map->bsearch (&pair_wrap, ifn_cmp);
+  return entry != fn_to_optab_map->end () ? entry->second : unknown_optab;
+}
+
+extern void
+lookup_multi_internal_fn (enum internal_fn ifn, enum internal_fn *lo,
+			  enum internal_fn *hi)
+{
+  int ecf_flags = internal_fn_flags (ifn);
+  gcc_assert (ecf_flags & ECF_MULTI);
+
+  *lo = internal_fn (ifn + 1);
+  *hi = internal_fn (ifn + 2);
+}
+
+
 /* Fnspec of each internal function, indexed by function number.  */
 const_tree internal_fn_fnspec_array[IFN_LAST + 1];
 
@@ -3906,6 +3983,9 @@ commutative_binary_fn_p (internal_fn fn)
     case IFN_UBSAN_CHECK_MUL:
     case IFN_ADD_OVERFLOW:
     case IFN_MUL_OVERFLOW:
+    case IFN_VEC_WIDEN_PLUS:
+    case IFN_VEC_WIDEN_PLUS_LO:
+    case IFN_VEC_WIDEN_PLUS_HI:
       return true;
 
     default:
@@ -3991,6 +4071,32 @@ set_edom_supported_p (void)
 #endif
 }
 
+#undef DEF_INTERNAL_OPTAB_MULTI_FN
+#define DEF_INTERNAL_OPTAB_MULTI_FN(CODE, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \
+  static void						        \
+  expand_##CODE (internal_fn, gcall *)		                \
+  {							        \
+    gcc_unreachable ();	                                        \
+  }                                                             \
+  static void						        \
+  expand_##CODE##_LO (internal_fn fn, gcall *stmt)	        \
+  {							        \
+    tree ty = TREE_TYPE (gimple_get_lhs (stmt));                \
+    if (!TYPE_UNSIGNED (ty))                                    \
+      expand_##TYPE##_optab_fn (fn, stmt, SOPTAB##_lo##_optab);	\
+    else                                                        \
+      expand_##TYPE##_optab_fn (fn, stmt, UOPTAB##_lo##_optab);	\
+  }                                                             \
+  static void						        \
+  expand_##CODE##_HI (internal_fn fn, gcall *stmt)	        \
+  {							        \
+    tree ty = TREE_TYPE (gimple_get_lhs (stmt));                \
+    if (!TYPE_UNSIGNED (ty))                                    \
+      expand_##TYPE##_optab_fn (fn, stmt, SOPTAB##_hi##_optab);	\
+    else                                                        \
+      expand_##TYPE##_optab_fn (fn, stmt, UOPTAB##_hi##_optab);	\
+  }
+
 #define DEF_INTERNAL_OPTAB_FN(CODE, FLAGS, OPTAB, TYPE) \
   static void						\
   expand_##CODE (internal_fn fn, gcall *stmt)		\
@@ -4007,6 +4113,7 @@ set_edom_supported_p (void)
     expand_##TYPE##_optab_fn (fn, stmt, which_optab);			\
   }
 #include "internal-fn.def"
+#undef DEF_INTERNAL_OPTAB_MULTI_FN
 
 /* Routines to expand each internal function, indexed by function number.
    Each routine has the prototype:
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index d2d550d358606022b1cb44fa842f06e0be507bc3..4635a9c8af9ad27bb05d7510388d0fe2270428e5 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -82,6 +82,13 @@ along with GCC; see the file COPYING3.  If not see
    says that the function extends the C-level BUILT_IN_<NAME>{,L,LL,IMAX}
    group of functions to any integral mode (including vector modes).
 
+   DEF_INTERNAL_OPTAB_MULTI_FN is like DEF_INTERNAL_OPTAB_FN except it
+   provides convenience wrappers for defining conversions that require a
+   hi/lo split, like widening and narrowing operations.  Each definition
+   for <NAME> will require an optab named <OPTAB> and two other optabs that
+   you specify for signed and unsigned.
+
+
    Each entry must have a corresponding expander of the form:
 
      void expand_NAME (gimple_call stmt)
@@ -120,6 +127,14 @@ along with GCC; see the file COPYING3.  If not see
   DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE)
 #endif
 
+#ifndef DEF_INTERNAL_OPTAB_MULTI_FN
+#define DEF_INTERNAL_OPTAB_MULTI_FN(NAME, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \
+  DEF_INTERNAL_OPTAB_FN (NAME, FLAGS | ECF_MULTI, OPTAB, TYPE) \
+  DEF_INTERNAL_OPTAB_FN (NAME ## _LO, FLAGS, unknown, TYPE) \
+  DEF_INTERNAL_OPTAB_FN (NAME ## _HI, FLAGS, unknown, TYPE)
+#endif
+
+
 DEF_INTERNAL_OPTAB_FN (MASK_LOAD, ECF_PURE, maskload, mask_load)
 DEF_INTERNAL_OPTAB_FN (LOAD_LANES, ECF_CONST, vec_load_lanes, load_lanes)
 DEF_INTERNAL_OPTAB_FN (MASK_LOAD_LANES, ECF_PURE,
@@ -292,6 +307,14 @@ DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT270, ECF_CONST, cadd270, binary)
 DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL, ECF_CONST, cmul, binary)
 DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL_CONJ, ECF_CONST, cmul_conj, binary)
 DEF_INTERNAL_OPTAB_FN (VEC_ADDSUB, ECF_CONST, vec_addsub, binary)
+DEF_INTERNAL_OPTAB_MULTI_FN (VEC_WIDEN_PLUS,
+			     ECF_CONST | ECF_WIDEN | ECF_NOTHROW,
+			     vec_widen_add, vec_widen_saddl, vec_widen_uaddl,
+			     binary)
+DEF_INTERNAL_OPTAB_MULTI_FN (VEC_WIDEN_MINUS,
+			     ECF_CONST | ECF_WIDEN | ECF_NOTHROW,
+			     vec_widen_sub, vec_widen_ssubl, vec_widen_usubl,
+			     binary)
 DEF_INTERNAL_OPTAB_FN (VEC_FMADDSUB, ECF_CONST, vec_fmaddsub, ternary)
 DEF_INTERNAL_OPTAB_FN (VEC_FMSUBADD, ECF_CONST, vec_fmsubadd, ternary)
 
diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h
index 23c014a963c4d72da92c763db87ee486a2adb485..b35de19747d251d19dc13de1e0323368bd2ebdf2 100644
--- a/gcc/internal-fn.h
+++ b/gcc/internal-fn.h
@@ -20,6 +20,10 @@ along with GCC; see the file COPYING3.  If not see
 #ifndef GCC_INTERNAL_FN_H
 #define GCC_INTERNAL_FN_H
 
+#include "insn-codes.h"
+#include "insn-opinit.h"
+
+
 /* INTEGER_CST values for IFN_UNIQUE function arg-0.
 
    UNSPEC: Undifferentiated UNIQUE.
@@ -112,6 +116,9 @@ internal_fn_name (enum internal_fn fn)
 }
 
 extern internal_fn lookup_internal_fn (const char *);
+extern optab lookup_multi_ifn_optab (enum internal_fn, unsigned);
+extern void lookup_multi_internal_fn (enum internal_fn, enum internal_fn *,
+				      enum internal_fn *);
 
 /* Return the ECF_* flags for function FN.  */
 
diff --git a/gcc/optabs.cc b/gcc/optabs.cc
index c0a68471d2ddf08bc0e6a3fd592ebb9f05e516c1..7e904b3e154d018779bb1a36de74e6997f70e193 100644
--- a/gcc/optabs.cc
+++ b/gcc/optabs.cc
@@ -1314,7 +1314,17 @@ commutative_optab_p (optab binoptab)
 	  || binoptab == smul_widen_optab
 	  || binoptab == umul_widen_optab
 	  || binoptab == smul_highpart_optab
-	  || binoptab == umul_highpart_optab);
+	  || binoptab == umul_highpart_optab
+	  || binoptab == vec_widen_add_optab
+	  || binoptab == vec_widen_sub_optab
+	  || binoptab == vec_widen_saddl_hi_optab
+	  || binoptab == vec_widen_saddl_lo_optab
+	  || binoptab == vec_widen_ssubl_hi_optab
+	  || binoptab == vec_widen_ssubl_lo_optab
+	  || binoptab == vec_widen_uaddl_hi_optab
+	  || binoptab == vec_widen_uaddl_lo_optab
+	  || binoptab == vec_widen_usubl_hi_optab
+	  || binoptab == vec_widen_usubl_lo_optab);
 }
 
 /* X is to be used in mode MODE as operand OPN to BINOPTAB.  If we're
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 801310ebaa7d469520809bb7efed6820f8eb866b..a7881dcb49e4ef07d8f07aa31214eb3a7a944e2e 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -78,6 +78,8 @@ OPTAB_CD(smsub_widen_optab, "msub$b$a4")
 OPTAB_CD(umsub_widen_optab, "umsub$b$a4")
 OPTAB_CD(ssmsub_widen_optab, "ssmsub$b$a4")
 OPTAB_CD(usmsub_widen_optab, "usmsub$a$b4")
+OPTAB_CD(vec_widen_add_optab, "add$a$b3")
+OPTAB_CD(vec_widen_sub_optab, "sub$a$b3")
 OPTAB_CD(vec_load_lanes_optab, "vec_load_lanes$a$b")
 OPTAB_CD(vec_store_lanes_optab, "vec_store_lanes$a$b")
 OPTAB_CD(vec_mask_load_lanes_optab, "vec_mask_load_lanes$a$b")
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
index 220bd9352a4c7acd2e3713e441d74898d3e92b30..7037673d32bd780e1c9b58a51e58e2bac3b30b7e 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
+++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O3 -save-temps" } */
+/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */
 #include <stdint.h>
 #include <string.h>
 
@@ -86,6 +86,8 @@ main()
     return 0;
 }
 
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_LO" "vect"   } } */
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_HI" "vect"   } } */
 /* { dg-final { scan-assembler-times {\tuaddl\t} 1} } */
 /* { dg-final { scan-assembler-times {\tuaddl2\t} 1} } */
 /* { dg-final { scan-assembler-times {\tsaddl\t} 1} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c
index a2bed63affbd091977df95a126da1f5b8c1d41d2..83bc1edb6105f47114b665e24a13e6194b2179a2 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c
+++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O3 -save-temps" } */
+/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */
 #include <stdint.h>
 #include <string.h>
 
@@ -86,6 +86,8 @@ main()
     return 0;
 }
 
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_LO" "vect"   } } */
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_HI" "vect"   } } */
 /* { dg-final { scan-assembler-times {\tusubl\t} 1} } */
 /* { dg-final { scan-assembler-times {\tusubl2\t} 1} } */
 /* { dg-final { scan-assembler-times {\tssubl\t} 1} } */
diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index cff6211080bced0bffb39e98039a6550897acf77..d0c8b812cfb9c3ac83bf25fff0431b08cb7d823d 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -99,6 +99,10 @@ struct die_struct;
 /* Nonzero if this is a widening function.  */
 #define ECF_WIDEN		  (1 << 16)
 
+/* Nonzero if this is a function that decomposes into a lo/hi operation.  */
+#define ECF_MULTI		  (1 << 17)
+
+
 /* Call argument flags.  */
 
 /* Nonzero if the argument is not used by the function.  */
diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index c011b8ede3c266b59f731e316efbec7d98e91068..268f5402fcdd5ec5bfb806db8c410e701c771275 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -1351,14 +1351,16 @@ static gimple *
 vect_recog_widen_op_pattern (vec_info *vinfo,
 			     stmt_vec_info last_stmt_info, tree *type_out,
 			     tree_code orig_code, code_helper wide_code,
-			     bool shift_p, const char *name)
+			     bool shift_p, const char *name,
+			     enum optab_subtype *subtype = NULL)
 {
   gimple *last_stmt = last_stmt_info->stmt;
 
   vect_unpromoted_value unprom[2];
   tree half_type;
   if (!vect_widened_op_tree (vinfo, last_stmt_info, orig_code, orig_code,
-			     shift_p, 2, unprom, &half_type))
+			     shift_p, 2, unprom, &half_type, subtype))
+
     return NULL;
 
   /* Pattern detected.  */
@@ -1426,6 +1428,20 @@ vect_recog_widen_op_pattern (vec_info *vinfo,
 			      type, pattern_stmt, vecctype);
 }
 
+static gimple *
+vect_recog_widen_op_pattern (vec_info *vinfo,
+			     stmt_vec_info last_stmt_info, tree *type_out,
+			     tree_code orig_code, internal_fn wide_ifn,
+			     bool shift_p, const char *name,
+			     enum optab_subtype *subtype = NULL)
+{
+  combined_fn ifn = as_combined_fn (wide_ifn);
+  return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
+				      orig_code, ifn, shift_p, name,
+				      subtype);
+}
+
+
 /* Try to detect multiplication on widened inputs, converting MULT_EXPR
    to WIDEN_MULT_EXPR.  See vect_recog_widen_op_pattern for details.  */
 
@@ -1439,26 +1455,30 @@ vect_recog_widen_mult_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
 }
 
 /* Try to detect addition on widened inputs, converting PLUS_EXPR
-   to WIDEN_PLUS_EXPR.  See vect_recog_widen_op_pattern for details.  */
+   to IFN_VEC_WIDEN_PLUS.  See vect_recog_widen_op_pattern for details.  */
 
 static gimple *
 vect_recog_widen_plus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
 			       tree *type_out)
 {
+  enum optab_subtype subtype;
   return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
-				      PLUS_EXPR, WIDEN_PLUS_EXPR, false,
-				      "vect_recog_widen_plus_pattern");
+				      PLUS_EXPR, IFN_VEC_WIDEN_PLUS,
+				      false, "vect_recog_widen_plus_pattern",
+				      &subtype);
 }
 
 /* Try to detect subtraction on widened inputs, converting MINUS_EXPR
-   to WIDEN_MINUS_EXPR.  See vect_recog_widen_op_pattern for details.  */
+   to IFN_VEC_WIDEN_MINUS.  See vect_recog_widen_op_pattern for details.  */
 static gimple *
 vect_recog_widen_minus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
 			       tree *type_out)
 {
+  enum optab_subtype subtype;
   return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
-				      MINUS_EXPR, WIDEN_MINUS_EXPR, false,
-				      "vect_recog_widen_minus_pattern");
+				      MINUS_EXPR, IFN_VEC_WIDEN_MINUS,
+				      false, "vect_recog_widen_minus_pattern",
+				      &subtype);
 }
 
 /* Function vect_recog_popcount_pattern
@@ -5622,6 +5642,7 @@ static vect_recog_func vect_vect_recog_func_ptrs[] = {
   { vect_recog_mask_conversion_pattern, "mask_conversion" },
   { vect_recog_widen_plus_pattern, "widen_plus" },
   { vect_recog_widen_minus_pattern, "widen_minus" },
+  /* These must come after the double widening ones.  */
 };
 
 const unsigned int NUM_PATTERNS = ARRAY_SIZE (vect_vect_recog_func_ptrs);
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 9b31425352689d409b8c0aa0c1d5c69e72db869a..9af0d107fdafb959db10d87e4e0ba5fda4e47bd7 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -4904,7 +4904,8 @@ vectorizable_conversion (vec_info *vinfo,
 		     || code == WIDEN_LSHIFT_EXPR);
  }
   else
-      widen_arith = false;
+      widen_arith = (code == IFN_VEC_WIDEN_PLUS
+		     || code == IFN_VEC_WIDEN_MINUS);
 
   if (!widen_arith
       && !CONVERT_EXPR_CODE_P (code)
@@ -4954,7 +4955,9 @@ vectorizable_conversion (vec_info *vinfo,
       gcc_assert (code == WIDEN_MULT_EXPR
 		  || code == WIDEN_LSHIFT_EXPR
 		  || code == WIDEN_PLUS_EXPR
-		  || code == WIDEN_MINUS_EXPR);
+		  || code == WIDEN_MINUS_EXPR
+		  || code == IFN_VEC_WIDEN_PLUS
+		  || code == IFN_VEC_WIDEN_MINUS);
 
 
       op1 = is_gimple_assign (stmt) ? gimple_assign_rhs2 (stmt) :
@@ -12126,12 +12129,62 @@ supportable_widening_operation (vec_info *vinfo,
       optab1 = vec_unpacks_sbool_lo_optab;
       optab2 = vec_unpacks_sbool_hi_optab;
     }
-  else
-    {
-      optab1 = optab_for_tree_code (c1.safe_as_tree_code (), vectype, optab_default);
-      optab2 = optab_for_tree_code (c2.safe_as_tree_code (), vectype, optab_default);
+
+  if (code.is_fn_code ())
+     {
+      internal_fn ifn = as_internal_fn (code.safe_as_fn_code ());
+      int ecf_flags = internal_fn_flags (ifn);
+      gcc_assert (ecf_flags & ECF_MULTI);
+
+      switch (code.safe_as_fn_code ())
+	{
+	case CFN_VEC_WIDEN_PLUS:
+	  break;
+	case CFN_VEC_WIDEN_MINUS:
+	  break;
+	case CFN_LAST:
+	default:
+	  return false;
+	}
+
+      internal_fn lo, hi;
+      lookup_multi_internal_fn (ifn, &lo, &hi);
+      *code1 = as_combined_fn (lo);
+      *code2 = as_combined_fn (hi);
+      optab1 = lookup_multi_ifn_optab (lo, !TYPE_UNSIGNED (vectype));
+      optab2 = lookup_multi_ifn_optab (hi, !TYPE_UNSIGNED (vectype));
     }
 
+  if (code.is_tree_code ())
+  {
+    if (code == FIX_TRUNC_EXPR)
+      {
+	/* The signedness is determined from output operand.  */
+	optab1 = optab_for_tree_code (c1.safe_as_tree_code (), vectype_out,
+				      optab_default);
+	optab2 = optab_for_tree_code (c2.safe_as_tree_code (), vectype_out,
+				      optab_default);
+      }
+    else if (CONVERT_EXPR_CODE_P (code.safe_as_tree_code ())
+	     && VECTOR_BOOLEAN_TYPE_P (wide_vectype)
+	     && VECTOR_BOOLEAN_TYPE_P (vectype)
+	     && TYPE_MODE (wide_vectype) == TYPE_MODE (vectype)
+	     && SCALAR_INT_MODE_P (TYPE_MODE (vectype)))
+      {
+	/* If the input and result modes are the same, a different optab
+	   is needed where we pass in the number of units in vectype.  */
+	optab1 = vec_unpacks_sbool_lo_optab;
+	optab2 = vec_unpacks_sbool_hi_optab;
+      }
+    else
+      {
+	optab1 = optab_for_tree_code (c1.safe_as_tree_code (), vectype,
+				      optab_default);
+	optab2 = optab_for_tree_code (c2.safe_as_tree_code (), vectype,
+				      optab_default);
+      }
+  }
+
   if (!optab1 || !optab2)
     return false;
 
-- 
2.17.1


[-- Attachment #4: 0003-Remove-widen_plus-minus_expr-tree-codes.patch --]
[-- Type: application/octet-stream, Size: 19519 bytes --]

From 061e11fa3c76c42640ab6467858e057e3067a6d3 Mon Sep 17 00:00:00 2001
From: Joel Hutton <joel.hutton@arm.com>
Date: Fri, 28 Jan 2022 12:04:44 +0000
Subject: [PATCH 3/3] Remove widen_plus/minus_expr tree codes

This patch removes the old widen plus/minus tree codes which have been
replaced by internal functions.

gcc/ChangeLog:

	* doc/generic.texi: Remove old tree codes.
	* expr.cc (expand_expr_real_2): Remove old tree code cases.
	* gimple-pretty-print.cc (dump_binary_rhs): Remove old tree code
    cases.
	* optabs-tree.cc (optab_for_tree_code): Remove old tree code cases.
	(supportable_half_widening_operation): Remove old tree code cases.
	* tree-cfg.cc (verify_gimple_assign_binary): Remove old tree code
    cases.
	* tree-inline.cc (estimate_operator_cost): Remove old tree code
    cases.
	* tree-pretty-print.cc (dump_generic_node): Remove tree code definition.
	(op_symbol_code): Remove old tree code
    cases.
	* tree-vect-data-refs.cc (vect_get_smallest_scalar_type): Remove old tree code
    cases.
	(vect_analyze_data_ref_accesses): Remove old tree code
    cases.
	* tree-vect-generic.cc (expand_vector_operations_1): Remove old tree code
    cases.
	* tree-vect-patterns.cc (vect_widened_op_tree): Refactor to replace
    usage in vect_recog_sad_pattern.
	(vect_recog_sad_pattern): Replace tree code widening pattern with
    internal function.
	(vect_recog_average_pattern): Replace tree code widening pattern
    with internal function.
	* tree-vect-stmts.cc (vectorizable_conversion): Remove old tree code
    cases.
	(supportable_widening_operation): Remove old tree code
    cases.
	* tree.def (WIDEN_PLUS_EXPR): Remove tree code definition.
	(WIDEN_MINUS_EXPR): Remove tree code definition.
	(VEC_WIDEN_PLUS_HI_EXPR): Remove tree code definition.
	(VEC_WIDEN_PLUS_LO_EXPR): Remove tree code definition.
	(VEC_WIDEN_MINUS_HI_EXPR): Remove tree code definition.
	(VEC_WIDEN_MINUS_LO_EXPR): Remove tree code definition.
---
 gcc/doc/generic.texi       | 31 -------------------------------
 gcc/expr.cc                |  6 ------
 gcc/gimple-pretty-print.cc |  4 ----
 gcc/optabs-tree.cc         | 24 ------------------------
 gcc/tree-cfg.cc            |  6 ------
 gcc/tree-inline.cc         |  6 ------
 gcc/tree-pretty-print.cc   | 12 ------------
 gcc/tree-vect-data-refs.cc |  8 +++-----
 gcc/tree-vect-generic.cc   |  4 ----
 gcc/tree-vect-patterns.cc  | 36 +++++++++++++++++++++++++-----------
 gcc/tree-vect-stmts.cc     | 18 ++----------------
 gcc/tree.def               |  6 ------
 12 files changed, 30 insertions(+), 131 deletions(-)

diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi
index e5f9d1be8ea81f3da002ec3bb925590d331a2551..344045efd419b0cc3a11771acf70d2fd279c48ac 100644
--- a/gcc/doc/generic.texi
+++ b/gcc/doc/generic.texi
@@ -1811,10 +1811,6 @@ a value from @code{enum annot_expr_kind}, the third is an @code{INTEGER_CST}.
 @tindex VEC_RSHIFT_EXPR
 @tindex VEC_WIDEN_MULT_HI_EXPR
 @tindex VEC_WIDEN_MULT_LO_EXPR
-@tindex VEC_WIDEN_PLUS_HI_EXPR
-@tindex VEC_WIDEN_PLUS_LO_EXPR
-@tindex VEC_WIDEN_MINUS_HI_EXPR
-@tindex VEC_WIDEN_MINUS_LO_EXPR
 @tindex VEC_UNPACK_HI_EXPR
 @tindex VEC_UNPACK_LO_EXPR
 @tindex VEC_UNPACK_FLOAT_HI_EXPR
@@ -1861,33 +1857,6 @@ vector of @code{N/2} products. In the case of @code{VEC_WIDEN_MULT_LO_EXPR} the
 low @code{N/2} elements of the two vector are multiplied to produce the
 vector of @code{N/2} products.
 
-@item VEC_WIDEN_PLUS_HI_EXPR
-@itemx VEC_WIDEN_PLUS_LO_EXPR
-These nodes represent widening vector addition of the high and low parts of
-the two input vectors, respectively.  Their operands are vectors that contain
-the same number of elements (@code{N}) of the same integral type. The result
-is a vector that contains half as many elements, of an integral type whose size
-is twice as wide.  In the case of @code{VEC_WIDEN_PLUS_HI_EXPR} the high
-@code{N/2} elements of the two vectors are added to produce the vector of
-@code{N/2} products.  In the case of @code{VEC_WIDEN_PLUS_LO_EXPR} the low
-@code{N/2} elements of the two vectors are added to produce the vector of
-@code{N/2} products.
-
-@item VEC_WIDEN_MINUS_HI_EXPR
-@itemx VEC_WIDEN_MINUS_LO_EXPR
-These nodes represent widening vector subtraction of the high and low parts of
-the two input vectors, respectively.  Their operands are vectors that contain
-the same number of elements (@code{N}) of the same integral type. The high/low
-elements of the second vector are subtracted from the high/low elements of the
-first. The result is a vector that contains half as many elements, of an
-integral type whose size is twice as wide.  In the case of
-@code{VEC_WIDEN_MINUS_HI_EXPR} the high @code{N/2} elements of the second
-vector are subtracted from the high @code{N/2} of the first to produce the
-vector of @code{N/2} products.  In the case of
-@code{VEC_WIDEN_MINUS_LO_EXPR} the low @code{N/2} elements of the second
-vector are subtracted from the low @code{N/2} of the first to produce the
-vector of @code{N/2} products.
-
 @item VEC_UNPACK_HI_EXPR
 @itemx VEC_UNPACK_LO_EXPR
 These nodes represent unpacking of the high and low parts of the input vector,
diff --git a/gcc/expr.cc b/gcc/expr.cc
index fb062dc847577ec9dc2c951330f4cfadcc869325..4e3655070400cee086c2fdc6ac5bbe08d303de5e 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -9371,8 +9371,6 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
 					  target, unsignedp);
       return target;
 
-    case WIDEN_PLUS_EXPR:
-    case WIDEN_MINUS_EXPR:
     case WIDEN_MULT_EXPR:
       /* If first operand is constant, swap them.
 	 Thus the following special case checks need only
@@ -10150,10 +10148,6 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
 	return temp;
       }
 
-    case VEC_WIDEN_PLUS_HI_EXPR:
-    case VEC_WIDEN_PLUS_LO_EXPR:
-    case VEC_WIDEN_MINUS_HI_EXPR:
-    case VEC_WIDEN_MINUS_LO_EXPR:
     case VEC_WIDEN_MULT_HI_EXPR:
     case VEC_WIDEN_MULT_LO_EXPR:
     case VEC_WIDEN_MULT_EVEN_EXPR:
diff --git a/gcc/gimple-pretty-print.cc b/gcc/gimple-pretty-print.cc
index ebd87b20a0adc080c4a8f9429e75f49b96e72f9a..2a1a5b7f811ca341e8ee7e85a9701d3a37ff80bf 100644
--- a/gcc/gimple-pretty-print.cc
+++ b/gcc/gimple-pretty-print.cc
@@ -459,10 +459,6 @@ dump_binary_rhs (pretty_printer *buffer, const gassign *gs, int spc,
     case VEC_PACK_FLOAT_EXPR:
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
-    case VEC_WIDEN_PLUS_HI_EXPR:
-    case VEC_WIDEN_PLUS_LO_EXPR:
-    case VEC_WIDEN_MINUS_HI_EXPR:
-    case VEC_WIDEN_MINUS_LO_EXPR:
     case VEC_SERIES_EXPR:
       for (p = get_tree_code_name (code); *p; p++)
 	pp_character (buffer, TOUPPER (*p));
diff --git a/gcc/optabs-tree.cc b/gcc/optabs-tree.cc
index 8383fe820b080f6d66f83dcf3b77d3c9f869f4bc..2f5f93dc6624f86f6b5618cf6e7aa2b508053a64 100644
--- a/gcc/optabs-tree.cc
+++ b/gcc/optabs-tree.cc
@@ -190,22 +190,6 @@ optab_for_tree_code (enum tree_code code, const_tree type,
       return (TYPE_UNSIGNED (type)
 	      ? vec_widen_ushiftl_lo_optab : vec_widen_sshiftl_lo_optab);
 
-    case VEC_WIDEN_PLUS_LO_EXPR:
-      return (TYPE_UNSIGNED (type)
-	      ? vec_widen_uaddl_lo_optab : vec_widen_saddl_lo_optab);
-
-    case VEC_WIDEN_PLUS_HI_EXPR:
-      return (TYPE_UNSIGNED (type)
-	      ? vec_widen_uaddl_hi_optab : vec_widen_saddl_hi_optab);
-
-    case VEC_WIDEN_MINUS_LO_EXPR:
-      return (TYPE_UNSIGNED (type)
-	      ? vec_widen_usubl_lo_optab : vec_widen_ssubl_lo_optab);
-
-    case VEC_WIDEN_MINUS_HI_EXPR:
-      return (TYPE_UNSIGNED (type)
-	      ? vec_widen_usubl_hi_optab : vec_widen_ssubl_hi_optab);
-
     case VEC_UNPACK_HI_EXPR:
       return (TYPE_UNSIGNED (type)
 	      ? vec_unpacku_hi_optab : vec_unpacks_hi_optab);
@@ -312,8 +296,6 @@ optab_for_tree_code (enum tree_code code, const_tree type,
    'hi'/'lo' pair using codes such as VEC_WIDEN_MINUS_HI/LO.
 
    Supported widening operations:
-    WIDEN_MINUS_EXPR
-    WIDEN_PLUS_EXPR
     WIDEN_MULT_EXPR
     WIDEN_LSHIFT_EXPR
 
@@ -345,12 +327,6 @@ supportable_half_widening_operation (enum tree_code code, tree vectype_out,
     case WIDEN_LSHIFT_EXPR:
       *code1 = LSHIFT_EXPR;
       break;
-    case WIDEN_MINUS_EXPR:
-      *code1 = MINUS_EXPR;
-      break;
-    case WIDEN_PLUS_EXPR:
-      *code1 = PLUS_EXPR;
-      break;
     case WIDEN_MULT_EXPR:
       *code1 = MULT_EXPR;
       break;
diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
index 8de1b144a426776bf464765477c71ee8f2e52b81..46eed1e1f22052fc077f2fc25e5be627bce541b6 100644
--- a/gcc/tree-cfg.cc
+++ b/gcc/tree-cfg.cc
@@ -3948,8 +3948,6 @@ verify_gimple_assign_binary (gassign *stmt)
         return false;
       }
 
-    case WIDEN_PLUS_EXPR:
-    case WIDEN_MINUS_EXPR:
     case PLUS_EXPR:
     case MINUS_EXPR:
       {
@@ -4070,10 +4068,6 @@ verify_gimple_assign_binary (gassign *stmt)
         return false;
       }
 
-    case VEC_WIDEN_MINUS_HI_EXPR:
-    case VEC_WIDEN_MINUS_LO_EXPR:
-    case VEC_WIDEN_PLUS_HI_EXPR:
-    case VEC_WIDEN_PLUS_LO_EXPR:
     case VEC_WIDEN_MULT_HI_EXPR:
     case VEC_WIDEN_MULT_LO_EXPR:
     case VEC_WIDEN_MULT_EVEN_EXPR:
diff --git a/gcc/tree-inline.cc b/gcc/tree-inline.cc
index 043e1d5987a4c4b0159109dafb85a805ca828c1e..c0bebb7f4de36838341ed62389ad0e2b79f03034 100644
--- a/gcc/tree-inline.cc
+++ b/gcc/tree-inline.cc
@@ -4288,8 +4288,6 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
 
     case REALIGN_LOAD_EXPR:
 
-    case WIDEN_PLUS_EXPR:
-    case WIDEN_MINUS_EXPR:
     case WIDEN_SUM_EXPR:
     case WIDEN_MULT_EXPR:
     case DOT_PROD_EXPR:
@@ -4298,10 +4296,6 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
     case WIDEN_MULT_MINUS_EXPR:
     case WIDEN_LSHIFT_EXPR:
 
-    case VEC_WIDEN_PLUS_HI_EXPR:
-    case VEC_WIDEN_PLUS_LO_EXPR:
-    case VEC_WIDEN_MINUS_HI_EXPR:
-    case VEC_WIDEN_MINUS_LO_EXPR:
     case VEC_WIDEN_MULT_HI_EXPR:
     case VEC_WIDEN_MULT_LO_EXPR:
     case VEC_WIDEN_MULT_EVEN_EXPR:
diff --git a/gcc/tree-pretty-print.cc b/gcc/tree-pretty-print.cc
index 6acd394a0790ad2ad989f195a3288f0f0a8cc489..53ca62dc1a6873ae9365f199061bde9edd486196 100644
--- a/gcc/tree-pretty-print.cc
+++ b/gcc/tree-pretty-print.cc
@@ -2825,8 +2825,6 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags,
       break;
 
       /* Binary arithmetic and logic expressions.  */
-    case WIDEN_PLUS_EXPR:
-    case WIDEN_MINUS_EXPR:
     case WIDEN_SUM_EXPR:
     case WIDEN_MULT_EXPR:
     case MULT_EXPR:
@@ -3790,10 +3788,6 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags,
     case VEC_SERIES_EXPR:
     case VEC_WIDEN_MULT_HI_EXPR:
     case VEC_WIDEN_MULT_LO_EXPR:
-    case VEC_WIDEN_PLUS_HI_EXPR:
-    case VEC_WIDEN_PLUS_LO_EXPR:
-    case VEC_WIDEN_MINUS_HI_EXPR:
-    case VEC_WIDEN_MINUS_LO_EXPR:
     case VEC_WIDEN_MULT_EVEN_EXPR:
     case VEC_WIDEN_MULT_ODD_EXPR:
     case VEC_WIDEN_LSHIFT_HI_EXPR:
@@ -4311,12 +4305,6 @@ op_symbol_code (enum tree_code code)
     case WIDEN_LSHIFT_EXPR:
       return "w<<";
 
-    case WIDEN_PLUS_EXPR:
-      return "w+";
-
-    case WIDEN_MINUS_EXPR:
-      return "w-";
-
     case POINTER_PLUS_EXPR:
       return "+";
 
diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
index d20a10a1524164eef788ab4b88ba57c7a09c3387..98dd56ff022233ccead36a1f5a5e896e352f9f5b 100644
--- a/gcc/tree-vect-data-refs.cc
+++ b/gcc/tree-vect-data-refs.cc
@@ -136,8 +136,6 @@ vect_get_smallest_scalar_type (stmt_vec_info stmt_info, tree scalar_type)
 	  || gimple_assign_rhs_code (assign) == WIDEN_SUM_EXPR
 	  || gimple_assign_rhs_code (assign) == WIDEN_MULT_EXPR
 	  || gimple_assign_rhs_code (assign) == WIDEN_LSHIFT_EXPR
-	  || gimple_assign_rhs_code (assign) == WIDEN_PLUS_EXPR
-	  || gimple_assign_rhs_code (assign) == WIDEN_MINUS_EXPR
 	  || gimple_assign_rhs_code (assign) == FLOAT_EXPR)
 	{
 	  tree rhs_type = TREE_TYPE (gimple_assign_rhs1 (assign));
@@ -3172,8 +3170,8 @@ vect_analyze_data_ref_accesses (vec_info *vinfo,
 	    break;
 
 	  /* Check that the DR_INITs are compile-time constants.  */
-	  if (!tree_fits_shwi_p (DR_INIT (dra))
-	      || !tree_fits_shwi_p (DR_INIT (drb)))
+	  if (TREE_CODE (DR_INIT (dra)) != INTEGER_CST
+	      || TREE_CODE (DR_INIT (drb)) != INTEGER_CST)
 	    break;
 
 	  /* Different .GOMP_SIMD_LANE calls still give the same lane,
@@ -3225,7 +3223,7 @@ vect_analyze_data_ref_accesses (vec_info *vinfo,
 		  unsigned HOST_WIDE_INT step
 		    = absu_hwi (tree_to_shwi (DR_STEP (dra)));
 		  if (step != 0
-		      && step <= ((unsigned HOST_WIDE_INT)init_b - init_a))
+		      && step <= (unsigned HOST_WIDE_INT)(init_b - init_a))
 		    break;
 		}
 	    }
diff --git a/gcc/tree-vect-generic.cc b/gcc/tree-vect-generic.cc
index 92aba5d4af61dd478ec3f1b94854e4ad84166774..5823b08baf70b89b22ecc148b0702a84671ad084 100644
--- a/gcc/tree-vect-generic.cc
+++ b/gcc/tree-vect-generic.cc
@@ -2209,10 +2209,6 @@ expand_vector_operations_1 (gimple_stmt_iterator *gsi,
      arguments, not the widened result.  VEC_UNPACK_FLOAT_*_EXPR is
      calculated in the same way above.  */
   if (code == WIDEN_SUM_EXPR
-      || code == VEC_WIDEN_PLUS_HI_EXPR
-      || code == VEC_WIDEN_PLUS_LO_EXPR
-      || code == VEC_WIDEN_MINUS_HI_EXPR
-      || code == VEC_WIDEN_MINUS_LO_EXPR
       || code == VEC_WIDEN_MULT_HI_EXPR
       || code == VEC_WIDEN_MULT_LO_EXPR
       || code == VEC_WIDEN_MULT_EVEN_EXPR
diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index 268f5402fcdd5ec5bfb806db8c410e701c771275..64a0bde05bfcf62d2dc4bd18a9b6f1cb5f8698b5 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -559,21 +559,29 @@ vect_joust_widened_type (tree type, tree new_type, tree *common_type)
 
 static unsigned int
 vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code,
-		      tree_code widened_code, bool shift_p,
+		      code_helper widened_code, bool shift_p,
 		      unsigned int max_nops,
 		      vect_unpromoted_value *unprom, tree *common_type,
 		      enum optab_subtype *subtype = NULL)
 {
   /* Check for an integer operation with the right code.  */
-  gassign *assign = dyn_cast <gassign *> (stmt_info->stmt);
-  if (!assign)
+  gimple* stmt = stmt_info->stmt;
+  if (!(is_gimple_assign (stmt) || is_gimple_call (stmt)))
     return 0;
 
-  tree_code rhs_code = gimple_assign_rhs_code (assign);
-  if (rhs_code != code && rhs_code != widened_code)
+  code_helper rhs_code;
+  if (is_gimple_assign (stmt))
+    rhs_code = gimple_assign_rhs_code (stmt);
+  else
+    rhs_code = gimple_call_combined_fn (stmt);
+
+  if (rhs_code.safe_as_tree_code () != code
+      && rhs_code.get_rep () != widened_code.get_rep ())
     return 0;
 
-  tree type = TREE_TYPE (gimple_assign_lhs (assign));
+  tree lhs = is_gimple_assign (stmt) ? gimple_assign_lhs (stmt):
+				      gimple_call_lhs (stmt);
+  tree type = TREE_TYPE (lhs);
   if (!INTEGRAL_TYPE_P (type))
     return 0;
 
@@ -586,7 +594,11 @@ vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code,
     {
       vect_unpromoted_value *this_unprom = &unprom[next_op];
       unsigned int nops = 1;
-      tree op = gimple_op (assign, i + 1);
+      tree op;
+      if (is_gimple_assign (stmt))
+	op = gimple_op (stmt, i + 1);
+      else
+	op = gimple_call_arg (stmt, i);
       if (i == 1 && TREE_CODE (op) == INTEGER_CST)
 	{
 	  /* We already have a common type from earlier operands.
@@ -1299,8 +1311,9 @@ vect_recog_sad_pattern (vec_info *vinfo,
   /* FORNOW.  Can continue analyzing the def-use chain when this stmt in a phi
      inside the loop (in case we are analyzing an outer-loop).  */
   vect_unpromoted_value unprom[2];
-  if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR, WIDEN_MINUS_EXPR,
-			     false, 2, unprom, &half_type))
+  if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR,
+			     CFN_VEC_WIDEN_MINUS, false, 2, unprom,
+			     &half_type))
     return NULL;
 
   vect_pattern_detected ("vect_recog_sad_pattern", last_stmt);
@@ -2339,9 +2352,10 @@ vect_recog_average_pattern (vec_info *vinfo,
   internal_fn ifn = IFN_AVG_FLOOR;
   vect_unpromoted_value unprom[3];
   tree new_type;
+  enum optab_subtype subtype;
   unsigned int nops = vect_widened_op_tree (vinfo, plus_stmt_info, PLUS_EXPR,
-					    WIDEN_PLUS_EXPR, false, 3,
-					    unprom, &new_type);
+					    CFN_VEC_WIDEN_PLUS, false, 3,
+					    unprom, &new_type, &subtype);
   if (nops == 0)
     return NULL;
   if (nops == 3)
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 9af0d107fdafb959db10d87e4e0ba5fda4e47bd7..a2c0947864a64aea7c19a9c2bc53961821838e00 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -4898,9 +4898,7 @@ vectorizable_conversion (vec_info *vinfo,
 
   if (res_op.code.is_tree_code ())
   {
-      widen_arith = (code == WIDEN_PLUS_EXPR
-		     || code == WIDEN_MINUS_EXPR
-		     || code == WIDEN_MULT_EXPR
+      widen_arith = (code == WIDEN_MULT_EXPR
 		     || code == WIDEN_LSHIFT_EXPR);
  }
   else
@@ -4954,8 +4952,6 @@ vectorizable_conversion (vec_info *vinfo,
     {
       gcc_assert (code == WIDEN_MULT_EXPR
 		  || code == WIDEN_LSHIFT_EXPR
-		  || code == WIDEN_PLUS_EXPR
-		  || code == WIDEN_MINUS_EXPR
 		  || code == IFN_VEC_WIDEN_PLUS
 		  || code == IFN_VEC_WIDEN_MINUS);
 
@@ -11981,7 +11977,7 @@ supportable_widening_operation (vec_info *vinfo,
   class loop *vect_loop = NULL;
   machine_mode vec_mode;
   enum insn_code icode1, icode2;
-  optab optab1, optab2;
+  optab optab1 = unknown_optab, optab2 = unknown_optab;
   tree vectype = vectype_in;
   tree wide_vectype = vectype_out;
   code_helper c1 = MAX_TREE_CODES, c2 = MAX_TREE_CODES;
@@ -12075,16 +12071,6 @@ supportable_widening_operation (vec_info *vinfo,
       c2 = VEC_WIDEN_LSHIFT_HI_EXPR;
       break;
 
-    case WIDEN_PLUS_EXPR:
-      c1 = VEC_WIDEN_PLUS_LO_EXPR;
-      c2 = VEC_WIDEN_PLUS_HI_EXPR;
-      break;
-
-    case WIDEN_MINUS_EXPR:
-      c1 = VEC_WIDEN_MINUS_LO_EXPR;
-      c2 = VEC_WIDEN_MINUS_HI_EXPR;
-      break;
-
     CASE_CONVERT:
       c1 = VEC_UNPACK_LO_EXPR;
       c2 = VEC_UNPACK_HI_EXPR;
diff --git a/gcc/tree.def b/gcc/tree.def
index 62650b6934b337c5d56e5393dc114173d72c9aa9..9b2dce3576440c445d3240b9ed937fe67c9a5992 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -1383,8 +1383,6 @@ DEFTREECODE (WIDEN_MULT_MINUS_EXPR, "widen_mult_minus_expr", tcc_expression, 3)
    the first argument from type t1 to type t2, and then shifting it
    by the second argument.  */
 DEFTREECODE (WIDEN_LSHIFT_EXPR, "widen_lshift_expr", tcc_binary, 2)
-DEFTREECODE (WIDEN_PLUS_EXPR, "widen_plus_expr", tcc_binary, 2)
-DEFTREECODE (WIDEN_MINUS_EXPR, "widen_minus_expr", tcc_binary, 2)
 
 /* Widening vector multiplication.
    The two operands are vectors with N elements of size S. Multiplying the
@@ -1449,10 +1447,6 @@ DEFTREECODE (VEC_PACK_FLOAT_EXPR, "vec_pack_float_expr", tcc_binary, 2)
  */
 DEFTREECODE (VEC_WIDEN_LSHIFT_HI_EXPR, "widen_lshift_hi_expr", tcc_binary, 2)
 DEFTREECODE (VEC_WIDEN_LSHIFT_LO_EXPR, "widen_lshift_lo_expr", tcc_binary, 2)
-DEFTREECODE (VEC_WIDEN_PLUS_HI_EXPR, "widen_plus_hi_expr", tcc_binary, 2)
-DEFTREECODE (VEC_WIDEN_PLUS_LO_EXPR, "widen_plus_lo_expr", tcc_binary, 2)
-DEFTREECODE (VEC_WIDEN_MINUS_HI_EXPR, "widen_minus_hi_expr", tcc_binary, 2)
-DEFTREECODE (VEC_WIDEN_MINUS_LO_EXPR, "widen_minus_lo_expr", tcc_binary, 2)
 
 /* PREDICT_EXPR.  Specify hint for branch prediction.  The
    PREDICT_EXPR_PREDICTOR specify predictor and PREDICT_EXPR_OUTCOME the
-- 
2.17.1



* Re: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns
  2022-06-06 17:20       ` Joel Hutton
@ 2022-06-07  8:18         ` Richard Sandiford
  2022-06-07  9:01           ` Joel Hutton
  2022-06-13  9:18           ` [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns Richard Biener
  0 siblings, 2 replies; 53+ messages in thread
From: Richard Sandiford @ 2022-06-07  8:18 UTC (permalink / raw)
  To: Joel Hutton; +Cc: Richard Biener, gcc-patches

Joel Hutton <Joel.Hutton@arm.com> writes:
>> > Patches attached. They already incorporated the .cc rename, now
>> > rebased to be after the change to tree.h
>>
>> @@ -1412,8 +1412,7 @@ vect_recog_widen_op_pattern (vec_info *vinfo,
>>                        2, oprnd, half_type, unprom, vectype);
>>
>>    tree var = vect_recog_temp_ssa_var (itype, NULL);
>> -  gimple *pattern_stmt = gimple_build_assign (var, wide_code,
>> -                                             oprnd[0], oprnd[1]);
>> +  gimple *pattern_stmt = gimple_build (var, wide_code, oprnd[0],
>> oprnd[1]);
>>
>>
>> you should be able to do without the new gimple_build overload
>> by using
>>
>>    gimple_seq stmts = NULL;
>>    gimple_build (&stmts, wide_code, itype, oprnd[0], oprnd[1]);
>>    gimple *pattern_stmt = gimple_seq_last_stmt (stmts);
>>
>> because 'gimple_build' is an existing API.
>
> Done.
>
> The gimple_build overload was at the request of Richard Sandiford, I assume removing it is ok with you Richard S?
> From Richard Sandiford:
>> For example, I think we should hide this inside a new:
>>
>>   gimple_build (var, wide_code, oprnd[0], oprnd[1]);
>>
>> that works directly on code_helper, similarly to the new code_helper
>> gimple_build interfaces.

I thought the potential problem with the above is that gimple_build
is a folding interface, so in principle it's allowed to return an
existing SSA_NAME set by an existing statement (or even a constant).
I think in this context we do need to force a new statement to be
created.

Of course, the hope is that there wouldn't still be such folding
opportunities at this stage, but I don't think it's guaranteed
(especially with options fuzzing).
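
To make the concern concrete, here is a toy sketch in C++.  It does not
use GCC's real gimple API: toy_stmt, toy_seq, toy_build_folding and
toy_build_stmt are hypothetical names invented for illustration.  It only
models the difference between a builder that is allowed to fold and one
that always emits a statement.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only: these types are hypothetical and are not GCC's gimple API.
struct toy_stmt
{
  std::string lhs;   // name defined by the statement
  char op;           // '+' or '-'
  std::string a, b;  // operand names
};

using toy_seq = std::vector<toy_stmt>;

// Folding builder: it may simplify instead of emitting.  Here "x + 0" and
// "x - 0" fold to the existing name x, so nothing is appended to SEQ.
std::string
toy_build_folding (toy_seq &seq, char op, const std::string &a,
		   const std::string &b)
{
  if (b == "0")
    return a;
  std::string lhs = "tmp" + std::to_string (seq.size ());
  seq.push_back ({lhs, op, a, b});
  return lhs;
}

// Non-folding builder: always appends a new statement that defines LHS.
const toy_stmt &
toy_build_stmt (toy_seq &seq, const std::string &lhs, char op,
		const std::string &a, const std::string &b)
{
  seq.push_back ({lhs, op, a, b});
  return seq.back ();
}
```

In this model, asking the folding builder for x + 0 returns the existing
name x and leaves the sequence empty, so taking "the last statement of
the sequence" afterwards would not give a new pattern statement.  The
non-folding builder always appends one.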

Since I was mentioned :-) ...

Could you run the patch through contrib/check_GNU_style.py?
There seem to be a few long lines.

> +  if (res_op.code.is_tree_code ())

Do you need this is_tree_code ()?  These comparisons…

> +  {
> +      widen_arith = (code == WIDEN_PLUS_EXPR
> +		     || code == WIDEN_MINUS_EXPR
> +		     || code == WIDEN_MULT_EXPR
> +		     || code == WIDEN_LSHIFT_EXPR);

…ought to be safe unconditionally.

> + }
> +  else
> +      widen_arith = false;
> +
> +  if (!widen_arith
> +      && !CONVERT_EXPR_CODE_P (code)
> +      && code != FIX_TRUNC_EXPR
> +      && code != FLOAT_EXPR)
> +    return false;
>  
>    /* Check types of lhs and rhs.  */
> -  scalar_dest = gimple_assign_lhs (stmt);
> +  scalar_dest = gimple_get_lhs (stmt);
>    lhs_type = TREE_TYPE (scalar_dest);
>    vectype_out = STMT_VINFO_VECTYPE (stmt_info);
>  
> @@ -4938,10 +4951,14 @@ vectorizable_conversion (vec_info *vinfo,
>  
>    if (op_type == binary_op)
>      {
> -      gcc_assert (code == WIDEN_MULT_EXPR || code == WIDEN_LSHIFT_EXPR
> -		  || code == WIDEN_PLUS_EXPR || code == WIDEN_MINUS_EXPR);
> +      gcc_assert (code == WIDEN_MULT_EXPR
> +		  || code == WIDEN_LSHIFT_EXPR
> +		  || code == WIDEN_PLUS_EXPR
> +		  || code == WIDEN_MINUS_EXPR);
>  
> -      op1 = gimple_assign_rhs2 (stmt);
> +
> +      op1 = is_gimple_assign (stmt) ? gimple_assign_rhs2 (stmt) :
> +				     gimple_call_arg (stmt, 0);
>        tree vectype1_in;
>        if (!vect_is_simple_use (vinfo, stmt_info, slp_node, 1,
>  			       &op1, &slp_op1, &dt[1], &vectype1_in))
> […]
> @@ -12181,7 +12235,6 @@ supportable_widening_operation (vec_info *vinfo,
>    return false;
>  }
>  
> -
>  /* Function supportable_narrowing_operation
>  
>     Check whether an operation represented by the code CODE is a

Seems like a spurious change.

> @@ -12205,7 +12258,7 @@ supportable_widening_operation (vec_info *vinfo,
>  bool
>  supportable_narrowing_operation (enum tree_code code,
>  				 tree vectype_out, tree vectype_in,
> -				 enum tree_code *code1, int *multi_step_cvt,
> +				 tree_code* _code1, int *multi_step_cvt,

The original formatting (space before the “*”) was correct.
Names beginning with _ are reserved, so I think we need a different
name here.  Also, the name in the comment should stay in sync with
the name in the code.

That said though, I'm not sure…

>                                   vec<tree> *interm_types)
>  {
>    machine_mode vec_mode;
> @@ -12217,8 +12270,8 @@ supportable_narrowing_operation (enum tree_code code,
>    tree intermediate_type, prev_type;
>    machine_mode intermediate_mode, prev_mode;
>    int i;
> -  unsigned HOST_WIDE_INT n_elts;
>    bool uns;
> +  tree_code * code1 = (tree_code*) _code1;

…the combination of these two changes makes sense on its own.

>  
>    *multi_step_cvt = 0;
>    switch (code)
> @@ -12227,9 +12280,8 @@ supportable_narrowing_operation (enum tree_code code,
>        c1 = VEC_PACK_TRUNC_EXPR;
>        if (VECTOR_BOOLEAN_TYPE_P (narrow_vectype)
>  	  && VECTOR_BOOLEAN_TYPE_P (vectype)
> -	  && SCALAR_INT_MODE_P (TYPE_MODE (vectype))
> -	  && TYPE_VECTOR_SUBPARTS (vectype).is_constant (&n_elts)
> -	  && n_elts < BITS_PER_UNIT)
> +	  && TYPE_MODE (narrow_vectype) == TYPE_MODE (vectype)
> +	  && SCALAR_INT_MODE_P (TYPE_MODE (vectype)))
>  	optab1 = vec_pack_sbool_trunc_optab;
>        else
>  	optab1 = optab_for_tree_code (c1, vectype, optab_default);
> @@ -12320,9 +12372,8 @@ supportable_narrowing_operation (enum tree_code code,
>  	  = lang_hooks.types.type_for_mode (intermediate_mode, uns);
>        if (VECTOR_BOOLEAN_TYPE_P (intermediate_type)
>  	  && VECTOR_BOOLEAN_TYPE_P (prev_type)
> -	  && SCALAR_INT_MODE_P (prev_mode)
> -	  && TYPE_VECTOR_SUBPARTS (intermediate_type).is_constant (&n_elts)
> -	  && n_elts < BITS_PER_UNIT)
> +	  && intermediate_mode == prev_mode
> +	  && SCALAR_INT_MODE_P (prev_mode))
>  	interm_optab = vec_pack_sbool_trunc_optab;
>        else
>  	interm_optab

This part looks like a behavioural change, so I think it should be part
of a separate patch.

> diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
> index 642eb0aeb21264cd736a479b1ec25357abef29cd..50ff8eeac1e6b9859302bd784f10ee3d8ff4b4dc 100644
> --- a/gcc/tree-vectorizer.h
> +++ b/gcc/tree-vectorizer.h
> @@ -2120,13 +2120,12 @@ extern bool vect_is_simple_use (vec_info *, stmt_vec_info, slp_tree,
>  				enum vect_def_type *,
>  				tree *, stmt_vec_info * = NULL);
>  extern bool vect_maybe_update_slp_op_vectype (slp_tree, tree);
> -extern bool supportable_widening_operation (vec_info *,
> -					    enum tree_code, stmt_vec_info,
> -					    tree, tree, enum tree_code *,
> -					    enum tree_code *, int *,
> -					    vec<tree> *);
> +extern bool supportable_widening_operation (vec_info*, code_helper,
> +					    stmt_vec_info, tree, tree,
> +					    code_helper*, code_helper*,
> +					    int*, vec<tree> *);
>  extern bool supportable_narrowing_operation (enum tree_code, tree, tree,
> -					     enum tree_code *, int *,
> +					     tree_code *, int *,
>  					     vec<tree> *);
>  
>  extern unsigned record_stmt_cost (stmt_vector_for_cost *, int,
> diff --git a/gcc/tree.h b/gcc/tree.h
> index f84958933d51144bb6ce7cc41eca5f7f06814550..00b0e4d1c696633fe38baad5295b1f90398d53fc 100644
> --- a/gcc/tree.h
> +++ b/gcc/tree.h
> @@ -92,6 +92,10 @@ public:
>    bool is_fn_code () const { return rep < 0; }
>    bool is_internal_fn () const;
>    bool is_builtin_fn () const;
> +  enum tree_code safe_as_tree_code () const { return is_tree_code () ?
> +    (tree_code)* this : MAX_TREE_CODES; }
> +  combined_fn safe_as_fn_code () const { return is_fn_code () ? (combined_fn) *this
> +    : CFN_LAST;}

Since these don't fit on a line, the coding convention says that they
should be defined outside of the class.
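
As an illustration of that layout, here is a minimal sketch using a
stand-in class.  toy_code_helper and the toy_* enums are hypothetical and
their values are made up; the assumption that tree codes are stored as
positive values is mine (only the "rep < 0" test for function codes
appears in the hunk above).  The sketch also shows why, in this model,
comparing against a tree code needs no is_tree_code () guard.

```cpp
#include <cassert>

// Hypothetical stand-ins for GCC's enums; the values are illustrative only.
enum toy_tree_code { TOY_PLUS_EXPR = 1, TOY_MAX_TREE_CODES = 100 };
enum toy_combined_fn { TOY_CFN_WIDEN_PLUS = 1, TOY_CFN_LAST = 50 };

// Assumption: tree codes are stored as positive values and function codes
// as negative ones, so the two ranges can never compare equal.
class toy_code_helper
{
public:
  toy_code_helper (toy_tree_code code) : rep ((int) code) {}
  toy_code_helper (toy_combined_fn fn) : rep (-(int) fn) {}
  bool is_tree_code () const { return rep > 0; }
  bool is_fn_code () const { return rep < 0; }
  // Declared here, defined below because the bodies do not fit on one line.
  toy_tree_code safe_as_tree_code () const;
  toy_combined_fn safe_as_fn_code () const;
  bool operator== (toy_tree_code code) const { return rep == (int) code; }

private:
  int rep;
};

// Return the tree code, or TOY_MAX_TREE_CODES if this is a function code.
inline toy_tree_code
toy_code_helper::safe_as_tree_code () const
{
  return is_tree_code () ? (toy_tree_code) rep : TOY_MAX_TREE_CODES;
}

// Return the function code, or TOY_CFN_LAST if this is a tree code.
inline toy_combined_fn
toy_code_helper::safe_as_fn_code () const
{
  return is_fn_code () ? (toy_combined_fn) -rep : TOY_CFN_LAST;
}
```

A helper holding a function code simply compares unequal to every tree
code here, which is the sense in which the unguarded comparisons are safe.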

Thanks,
Richard

>    int get_rep () const { return rep; }
>    bool operator== (const code_helper &other) { return rep == other.rep; }
>    bool operator!= (const code_helper &other) { return rep != other.rep; }


* RE: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns
  2022-06-07  8:18         ` Richard Sandiford
@ 2022-06-07  9:01           ` Joel Hutton
  2022-06-09 14:03             ` Joel Hutton
  2022-06-13  9:02             ` Richard Biener
  2022-06-13  9:18           ` [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns Richard Biener
  1 sibling, 2 replies; 53+ messages in thread
From: Joel Hutton @ 2022-06-07  9:01 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: Richard Biener, gcc-patches

Thanks Richard,

> I thought the potential problem with the above is that gimple_build is a
> folding interface, so in principle it's allowed to return an existing SSA_NAME
> set by an existing statement (or even a constant).
> I think in this context we do need to force a new statement to be created.

Before I make any changes, I'd like to check we're all on the same page.

richi, are you OK with the gimple_build function, perhaps under a different name if you are concerned about overloading? We could use gimple_ch_build or gimple_code_helper_build.

Similarly, are you OK with the use of gimple_extract_op? I would lean towards using it as it is cleaner, but I don't have strong feelings.

Joel

> -----Original Message-----
> From: Richard Sandiford <richard.sandiford@arm.com>
> Sent: 07 June 2022 09:18
> To: Joel Hutton <Joel.Hutton@arm.com>
> Cc: Richard Biener <rguenther@suse.de>; gcc-patches@gcc.gnu.org
> Subject: Re: [ping][vect-patterns] Refactor widen_plus/widen_minus as
> internal_fns
> 
> Joel Hutton <Joel.Hutton@arm.com> writes:
> >> > Patches attached. They already incorporated the .cc rename, now
> >> > rebased to be after the change to tree.h
> >>
> >> @@ -1412,8 +1412,7 @@ vect_recog_widen_op_pattern (vec_info *vinfo,
> >>                        2, oprnd, half_type, unprom, vectype);
> >>
> >>    tree var = vect_recog_temp_ssa_var (itype, NULL);
> >> -  gimple *pattern_stmt = gimple_build_assign (var, wide_code,
> >> -                                             oprnd[0], oprnd[1]);
> >> +  gimple *pattern_stmt = gimple_build (var, wide_code, oprnd[0],
> >> oprnd[1]);
> >>
> >>
> >> you should be able to do without the new gimple_build overload by
> >> using
> >>
> >>    gimple_seq stmts = NULL;
> >>    gimple_build (&stmts, wide_code, itype, oprnd[0], oprnd[1]);
> >>    gimple *pattern_stmt = gimple_seq_last_stmt (stmts);
> >>
> >> because 'gimple_build' is an existing API.
> >
> > Done.
> >
> > The gimple_build overload was at the request of Richard Sandiford, I
> assume removing it is ok with you Richard S?
> > From Richard Sandiford:
> >> For example, I think we should hide this inside a new:
> >>
> >>   gimple_build (var, wide_code, oprnd[0], oprnd[1]);
> >>
> >> that works directly on code_helper, similarly to the new code_helper
> >> gimple_build interfaces.
> 
> I thought the potential problem with the above is that gimple_build is a
> folding interface, so in principle it's allowed to return an existing SSA_NAME
> set by an existing statement (or even a constant).
> I think in this context we do need to force a new statement to be created.
> 
> Of course, the hope is that there wouldn't still be such folding opportunities
> at this stage, but I don't think it's guaranteed (especially with options
> fuzzing).
> 
> Since I was mentioned :-) ...
> 
> Could you run the patch through contrib/check_GNU_style.py?
> There seem to be a few long lines.
> 
> > +  if (res_op.code.is_tree_code ())
> 
> Do you need this is_tree_code ()?  These comparisons…
> 
> > +  {
> > +      widen_arith = (code == WIDEN_PLUS_EXPR
> > +		     || code == WIDEN_MINUS_EXPR
> > +		     || code == WIDEN_MULT_EXPR
> > +		     || code == WIDEN_LSHIFT_EXPR);
> 
> …ought to be safe unconditionally.
> 
> > + }
> > +  else
> > +      widen_arith = false;
> > +
> > +  if (!widen_arith
> > +      && !CONVERT_EXPR_CODE_P (code)
> > +      && code != FIX_TRUNC_EXPR
> > +      && code != FLOAT_EXPR)
> > +    return false;
> >
> >    /* Check types of lhs and rhs.  */
> > -  scalar_dest = gimple_assign_lhs (stmt);
> > +  scalar_dest = gimple_get_lhs (stmt);
> >    lhs_type = TREE_TYPE (scalar_dest);
> >    vectype_out = STMT_VINFO_VECTYPE (stmt_info);
> >
> > @@ -4938,10 +4951,14 @@ vectorizable_conversion (vec_info *vinfo,
> >
> >    if (op_type == binary_op)
> >      {
> > -      gcc_assert (code == WIDEN_MULT_EXPR || code == WIDEN_LSHIFT_EXPR
> > -		  || code == WIDEN_PLUS_EXPR || code == WIDEN_MINUS_EXPR);
> > +      gcc_assert (code == WIDEN_MULT_EXPR
> > +		  || code == WIDEN_LSHIFT_EXPR
> > +		  || code == WIDEN_PLUS_EXPR
> > +		  || code == WIDEN_MINUS_EXPR);
> >
> > -      op1 = gimple_assign_rhs2 (stmt);
> > +
> > +      op1 = is_gimple_assign (stmt) ? gimple_assign_rhs2 (stmt) :
> > +				     gimple_call_arg (stmt, 0);
> >        tree vectype1_in;
> >        if (!vect_is_simple_use (vinfo, stmt_info, slp_node, 1,
> >  			       &op1, &slp_op1, &dt[1], &vectype1_in))
> > […]
> > @@ -12181,7 +12235,6 @@ supportable_widening_operation (vec_info *vinfo,
> >    return false;
> >  }
> >
> > -
> >  /* Function supportable_narrowing_operation
> >
> >     Check whether an operation represented by the code CODE is a
> 
> Seems like a spurious change.
> 
> > @@ -12205,7 +12258,7 @@ supportable_widening_operation (vec_info *vinfo,
> >  bool
> >  supportable_narrowing_operation (enum tree_code code,
> >  				 tree vectype_out, tree vectype_in,
> > -				 enum tree_code *code1, int *multi_step_cvt,
> > +				 tree_code* _code1, int *multi_step_cvt,
> 
> The original formatting (space before the “*”) was correct.
> Names beginning with _ are reserved, so I think we need a different
> name here.  Also, the name in the comment should stay in sync with
> the name in the code.
> 
> That said though, I'm not sure…
> 
> >                                   vec<tree> *interm_types)
> >  {
> >    machine_mode vec_mode;
> > @@ -12217,8 +12270,8 @@ supportable_narrowing_operation (enum
> tree_code code,
> >    tree intermediate_type, prev_type;
> >    machine_mode intermediate_mode, prev_mode;
> >    int i;
> > -  unsigned HOST_WIDE_INT n_elts;
> >    bool uns;
> > +  tree_code * code1 = (tree_code*) _code1;
> 
> …the combination of these two changes makes sense on their own.
> 
> >
> >    *multi_step_cvt = 0;
> >    switch (code)
> > @@ -12227,9 +12280,8 @@ supportable_narrowing_operation (enum
> tree_code code,
> >        c1 = VEC_PACK_TRUNC_EXPR;
> >        if (VECTOR_BOOLEAN_TYPE_P (narrow_vectype)
> >  	  && VECTOR_BOOLEAN_TYPE_P (vectype)
> > -	  && SCALAR_INT_MODE_P (TYPE_MODE (vectype))
> > -	  && TYPE_VECTOR_SUBPARTS (vectype).is_constant (&n_elts)
> > -	  && n_elts < BITS_PER_UNIT)
> > +	  && TYPE_MODE (narrow_vectype) == TYPE_MODE (vectype)
> > +	  && SCALAR_INT_MODE_P (TYPE_MODE (vectype)))
> >  	optab1 = vec_pack_sbool_trunc_optab;
> >        else
> >  	optab1 = optab_for_tree_code (c1, vectype, optab_default);
> > @@ -12320,9 +12372,8 @@ supportable_narrowing_operation (enum
> tree_code code,
> >  	  = lang_hooks.types.type_for_mode (intermediate_mode, uns);
> >        if (VECTOR_BOOLEAN_TYPE_P (intermediate_type)
> >  	  && VECTOR_BOOLEAN_TYPE_P (prev_type)
> > -	  && SCALAR_INT_MODE_P (prev_mode)
> > -	  && TYPE_VECTOR_SUBPARTS (intermediate_type).is_constant
> (&n_elts)
> > -	  && n_elts < BITS_PER_UNIT)
> > +	  && intermediate_mode == prev_mode
> > +	  && SCALAR_INT_MODE_P (prev_mode))
> >  	interm_optab = vec_pack_sbool_trunc_optab;
> >        else
> >  	interm_optab
> 
> This part looks like a behavioural change, so I think it should be part
> of a separate patch.
> 
> > diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
> > index
> 642eb0aeb21264cd736a479b1ec25357abef29cd..50ff8eeac1e6b9859302bd78
> 4f10ee3d8ff4b4dc 100644
> > --- a/gcc/tree-vectorizer.h
> > +++ b/gcc/tree-vectorizer.h
> > @@ -2120,13 +2120,12 @@ extern bool vect_is_simple_use (vec_info *,
> stmt_vec_info, slp_tree,
> >  				enum vect_def_type *,
> >  				tree *, stmt_vec_info * = NULL);
> >  extern bool vect_maybe_update_slp_op_vectype (slp_tree, tree);
> > -extern bool supportable_widening_operation (vec_info *,
> > -					    enum tree_code, stmt_vec_info,
> > -					    tree, tree, enum tree_code *,
> > -					    enum tree_code *, int *,
> > -					    vec<tree> *);
> > +extern bool supportable_widening_operation (vec_info*, code_helper,
> > +					    stmt_vec_info, tree, tree,
> > +					    code_helper*, code_helper*,
> > +					    int*, vec<tree> *);
> >  extern bool supportable_narrowing_operation (enum tree_code, tree,
> tree,
> > -					     enum tree_code *, int *,
> > +					     tree_code *, int *,
> >  					     vec<tree> *);
> >
> >  extern unsigned record_stmt_cost (stmt_vector_for_cost *, int,
> > diff --git a/gcc/tree.h b/gcc/tree.h
> > index
> f84958933d51144bb6ce7cc41eca5f7f06814550..00b0e4d1c696633fe38baad5
> 295b1f90398d53fc 100644
> > --- a/gcc/tree.h
> > +++ b/gcc/tree.h
> > @@ -92,6 +92,10 @@ public:
> >    bool is_fn_code () const { return rep < 0; }
> >    bool is_internal_fn () const;
> >    bool is_builtin_fn () const;
> > +  enum tree_code safe_as_tree_code () const { return is_tree_code () ?
> > +    (tree_code)* this : MAX_TREE_CODES; }
> > +  combined_fn safe_as_fn_code () const { return is_fn_code () ?
> (combined_fn) *this
> > +    : CFN_LAST;}
> 
> Since these don't fit on a line, the coding convention says that they
> should be defined outside of the class.
> 
> Thanks,
> Richard
> 
> >    int get_rep () const { return rep; }
> >    bool operator== (const code_helper &other) { return rep == other.rep; }
> >    bool operator!= (const code_helper &other) { return rep != other.rep; }


* [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns
  2022-06-07  9:01           ` Joel Hutton
@ 2022-06-09 14:03             ` Joel Hutton
  2022-06-13  9:02             ` Richard Biener
  1 sibling, 0 replies; 53+ messages in thread
From: Joel Hutton @ 2022-06-09 14:03 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: Richard Biener, gcc-patches

> Before I make any changes, I'd like to check we're all on the same page.
> 
> richi, are you ok with the gimple_build function, perhaps with a different
> name if you are concerned with overloading? we could use gimple_ch_build
> or gimple_code_helper_build?
> 
> Similarly are you ok with the use of gimple_extract_op? I would lean towards
> using it as it is cleaner, but I don't have strong feelings.
> 
> Joel

Ping. Just looking for some confirmation before I rework this patch. It would be good to get some agreement on this as Tamar is blocked on this patch.

Joel



> -----Original Message-----
> From: Joel Hutton
> Sent: 07 June 2022 10:02
> To: Richard Sandiford <richard.sandiford@arm.com>
> Cc: Richard Biener <rguenther@suse.de>; gcc-patches@gcc.gnu.org
> Subject: RE: [ping][vect-patterns] Refactor widen_plus/widen_minus as
> internal_fns
> 
> Thanks Richard,
> 
> > I thought the potential problem with the above is that gimple_build is
> > a folding interface, so in principle it's allowed to return an
> > existing SSA_NAME set by an existing statement (or even a constant).
> > I think in this context we do need to force a new statement to be created.
> 
> Before I make any changes, I'd like to check we're all on the same page.
> 
> richi, are you ok with the gimple_build function, perhaps with a different
> name if you are concerned with overloading? we could use gimple_ch_build
> or gimple_code_helper_build?
> 
> Similarly are you ok with the use of gimple_extract_op? I would lean towards
> using it as it is cleaner, but I don't have strong feelings.
> 
> Joel
> 
> > -----Original Message-----
> > From: Richard Sandiford <richard.sandiford@arm.com>
> > Sent: 07 June 2022 09:18
> > To: Joel Hutton <Joel.Hutton@arm.com>
> > Cc: Richard Biener <rguenther@suse.de>; gcc-patches@gcc.gnu.org
> > Subject: Re: [ping][vect-patterns] Refactor widen_plus/widen_minus as
> > internal_fns
> >
> > Joel Hutton <Joel.Hutton@arm.com> writes:
> > >> > Patches attached. They already incorporated the .cc rename, now
> > >> > rebased to be after the change to tree.h
> > >>
> > >> @@ -1412,8 +1412,7 @@ vect_recog_widen_op_pattern (vec_info
> *vinfo,
> > >>                        2, oprnd, half_type, unprom, vectype);
> > >>
> > >>    tree var = vect_recog_temp_ssa_var (itype, NULL);
> > >> -  gimple *pattern_stmt = gimple_build_assign (var, wide_code,
> > >> -                                             oprnd[0], oprnd[1]);
> > >> +  gimple *pattern_stmt = gimple_build (var, wide_code, oprnd[0],
> > >> oprnd[1]);
> > >>
> > >>
> > >> you should be able to do without the new gimple_build overload by
> > >> using
> > >>
> > >>    gimple_seq stmts = NULL;
> > >>    gimple_build (&stmts, wide_code, itype, oprnd[0], oprnd[1]);
> > >>    gimple *pattern_stmt = gimple_seq_last_stmt (stmts);
> > >>
> > >> because 'gimple_build' is an existing API.
> > >
> > > Done.
> > >
> > > The gimple_build overload was at the request of Richard Sandiford, I
> > assume removing it is ok with you Richard S?
> > > From Richard Sandiford:
> > >> For example, I think we should hide this inside a new:
> > >>
> > >>   gimple_build (var, wide_code, oprnd[0], oprnd[1]);
> > >>
> > >> that works directly on code_helper, similarly to the new
> > >> code_helper gimple_build interfaces.
> >
> > I thought the potential problem with the above is that gimple_build is
> > a folding interface, so in principle it's allowed to return an
> > existing SSA_NAME set by an existing statement (or even a constant).
> > I think in this context we do need to force a new statement to be created.
> >
> > Of course, the hope is that there wouldn't still be such folding
> > opportunities at this stage, but I don't think it's guaranteed
> > (especially with options fuzzing).
> >
> > Since I was mentioned :-) ...
> >
> > Could you run the patch through contrib/check_GNU_style.py?
> > There seem to be a few long lines.
> >
> > > +  if (res_op.code.is_tree_code ())
> >
> > Do you need this is_tree_code ()?  These comparisons…
> >
> > > +  {
> > > +      widen_arith = (code == WIDEN_PLUS_EXPR
> > > +		     || code == WIDEN_MINUS_EXPR
> > > +		     || code == WIDEN_MULT_EXPR
> > > +		     || code == WIDEN_LSHIFT_EXPR);
> >
> > …ought to be safe unconditionally.
> >
> > > + }
> > > +  else
> > > +      widen_arith = false;
> > > +
> > > +  if (!widen_arith
> > > +      && !CONVERT_EXPR_CODE_P (code)
> > > +      && code != FIX_TRUNC_EXPR
> > > +      && code != FLOAT_EXPR)
> > > +    return false;
> > >
> > >    /* Check types of lhs and rhs.  */
> > > -  scalar_dest = gimple_assign_lhs (stmt);
> > > +  scalar_dest = gimple_get_lhs (stmt);
> > >    lhs_type = TREE_TYPE (scalar_dest);
> > >    vectype_out = STMT_VINFO_VECTYPE (stmt_info);
> > >
> > > @@ -4938,10 +4951,14 @@ vectorizable_conversion (vec_info *vinfo,
> > >
> > >    if (op_type == binary_op)
> > >      {
> > > > -      gcc_assert (code == WIDEN_MULT_EXPR || code == WIDEN_LSHIFT_EXPR
> > > > -		  || code == WIDEN_PLUS_EXPR || code == WIDEN_MINUS_EXPR);
> > > +      gcc_assert (code == WIDEN_MULT_EXPR
> > > +		  || code == WIDEN_LSHIFT_EXPR
> > > +		  || code == WIDEN_PLUS_EXPR
> > > +		  || code == WIDEN_MINUS_EXPR);
> > >
> > > -      op1 = gimple_assign_rhs2 (stmt);
> > > +
> > > +      op1 = is_gimple_assign (stmt) ? gimple_assign_rhs2 (stmt) :
> > > +				     gimple_call_arg (stmt, 0);
> > >        tree vectype1_in;
> > >        if (!vect_is_simple_use (vinfo, stmt_info, slp_node, 1,
> > >  			       &op1, &slp_op1, &dt[1], &vectype1_in)) […] @@
> > -12181,7
> > > +12235,6 @@ supportable_widening_operation (vec_info *vinfo,
> > >    return false;
> > >  }
> > >
> > > -
> > >  /* Function supportable_narrowing_operation
> > >
> > >     Check whether an operation represented by the code CODE is a
> >
> > Seems like a spurious change.
> >
> > > @@ -12205,7 +12258,7 @@ supportable_widening_operation (vec_info
> > > *vinfo,  bool  supportable_narrowing_operation (enum tree_code code,
> > >  				 tree vectype_out, tree vectype_in,
> > > -				 enum tree_code *code1, int *multi_step_cvt,
> > > +				 tree_code* _code1, int *multi_step_cvt,
> >
> > The original formatting (space before the “*”) was correct.
> > Names beginning with _ are reserved, so I think we need a different
> > name here.  Also, the name in the comment should stay in sync with the
> > name in the code.
> >
> > That said though, I'm not sure…
> >
> > >                                   vec<tree> *interm_types)  {
> > >    machine_mode vec_mode;
> > > @@ -12217,8 +12270,8 @@ supportable_narrowing_operation (enum
> > tree_code code,
> > >    tree intermediate_type, prev_type;
> > >    machine_mode intermediate_mode, prev_mode;
> > >    int i;
> > > -  unsigned HOST_WIDE_INT n_elts;
> > >    bool uns;
> > > +  tree_code * code1 = (tree_code*) _code1;
> >
> > …the combination of these two changes makes sense on their own.
> >
> > >
> > >    *multi_step_cvt = 0;
> > >    switch (code)
> > > @@ -12227,9 +12280,8 @@ supportable_narrowing_operation (enum
> > tree_code code,
> > >        c1 = VEC_PACK_TRUNC_EXPR;
> > >        if (VECTOR_BOOLEAN_TYPE_P (narrow_vectype)
> > >  	  && VECTOR_BOOLEAN_TYPE_P (vectype)
> > > -	  && SCALAR_INT_MODE_P (TYPE_MODE (vectype))
> > > -	  && TYPE_VECTOR_SUBPARTS (vectype).is_constant (&n_elts)
> > > -	  && n_elts < BITS_PER_UNIT)
> > > +	  && TYPE_MODE (narrow_vectype) == TYPE_MODE (vectype)
> > > +	  && SCALAR_INT_MODE_P (TYPE_MODE (vectype)))
> > >  	optab1 = vec_pack_sbool_trunc_optab;
> > >        else
> > >  	optab1 = optab_for_tree_code (c1, vectype, optab_default); @@
> > > -12320,9 +12372,8 @@ supportable_narrowing_operation (enum
> > tree_code code,
> > >  	  = lang_hooks.types.type_for_mode (intermediate_mode, uns);
> > >        if (VECTOR_BOOLEAN_TYPE_P (intermediate_type)
> > >  	  && VECTOR_BOOLEAN_TYPE_P (prev_type)
> > > -	  && SCALAR_INT_MODE_P (prev_mode)
> > > -	  && TYPE_VECTOR_SUBPARTS (intermediate_type).is_constant
> > (&n_elts)
> > > -	  && n_elts < BITS_PER_UNIT)
> > > +	  && intermediate_mode == prev_mode
> > > +	  && SCALAR_INT_MODE_P (prev_mode))
> > >  	interm_optab = vec_pack_sbool_trunc_optab;
> > >        else
> > >  	interm_optab
> >
> > This part looks like a behavioural change, so I think it should be
> > part of a separate patch.
> >
> > > > diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
> > > > index 642eb0aeb21264cd736a479b1ec25357abef29cd..50ff8eeac1e6b9859302bd784f10ee3d8ff4b4dc 100644
> > > --- a/gcc/tree-vectorizer.h
> > > +++ b/gcc/tree-vectorizer.h
> > > @@ -2120,13 +2120,12 @@ extern bool vect_is_simple_use (vec_info *,
> > stmt_vec_info, slp_tree,
> > >  				enum vect_def_type *,
> > >  				tree *, stmt_vec_info * = NULL);  extern bool
> > > vect_maybe_update_slp_op_vectype (slp_tree, tree); -extern bool
> > > supportable_widening_operation (vec_info *,
> > > -					    enum tree_code, stmt_vec_info,
> > > -					    tree, tree, enum tree_code *,
> > > -					    enum tree_code *, int *,
> > > -					    vec<tree> *);
> > > +extern bool supportable_widening_operation (vec_info*, code_helper,
> > > +					    stmt_vec_info, tree, tree,
> > > +					    code_helper*, code_helper*,
> > > +					    int*, vec<tree> *);
> > >  extern bool supportable_narrowing_operation (enum tree_code, tree,
> > tree,
> > > -					     enum tree_code *, int *,
> > > +					     tree_code *, int *,
> > >  					     vec<tree> *);
> > >
> > > >  extern unsigned record_stmt_cost (stmt_vector_for_cost *, int,
> > > > diff --git a/gcc/tree.h b/gcc/tree.h
> > > > index f84958933d51144bb6ce7cc41eca5f7f06814550..00b0e4d1c696633fe38baad5295b1f90398d53fc 100644
> > > --- a/gcc/tree.h
> > > +++ b/gcc/tree.h
> > > @@ -92,6 +92,10 @@ public:
> > >    bool is_fn_code () const { return rep < 0; }
> > >    bool is_internal_fn () const;
> > >    bool is_builtin_fn () const;
> > > > +  enum tree_code safe_as_tree_code () const { return is_tree_code () ?
> > > > +    (tree_code)* this : MAX_TREE_CODES; }
> > > > +  combined_fn safe_as_fn_code () const { return is_fn_code () ? (combined_fn) *this
> > > > +    : CFN_LAST;}
> >
> > Since these don't fit on a line, the coding convention says that they
> > should be defined outside of the class.
> >
> > Thanks,
> > Richard
> >
> > >    int get_rep () const { return rep; }
> > >    bool operator== (const code_helper &other) { return rep == other.rep; }
> > >    bool operator!= (const code_helper &other) { return rep !=
> > > other.rep; }

^ permalink raw reply	[flat|nested] 53+ messages in thread

* RE: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns
  2022-06-07  9:01           ` Joel Hutton
  2022-06-09 14:03             ` Joel Hutton
@ 2022-06-13  9:02             ` Richard Biener
  2022-06-30 13:20               ` Joel Hutton
  1 sibling, 1 reply; 53+ messages in thread
From: Richard Biener @ 2022-06-13  9:02 UTC (permalink / raw)
  To: Joel Hutton; +Cc: Richard Sandiford, gcc-patches

On Tue, 7 Jun 2022, Joel Hutton wrote:

> Thanks Richard,
> 
> > I thought the potential problem with the above is that gimple_build is a
> > folding interface, so in principle it's allowed to return an existing SSA_NAME
> > set by an existing statement (or even a constant).
> > I think in this context we do need to force a new statement to be created.
> 
> Before I make any changes, I'd like to check we're all on the same page.
> 
> richi, are you ok with the gimple_build function, perhaps with a
> different name if you are concerned about overloading? We could use
> gimple_ch_build or gimple_code_helper_build?

We can go with a private vect_gimple_build function until we sort out
the API issue to unblock Tamar (I'll reply to Richard's reply with further
thoughts on this).
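
For readers following along, the dispatch such a private helper has to do can be sketched in isolation. This is a toy model only, assuming a code_helper-like class that stores tree codes as positive values and function codes as negative ones; toy_code_helper, toy_tree_code, toy_fn, toy_stmt and toy_vect_build are hypothetical names and none of this is GCC code:

```cpp
#include <cassert>

// Toy enums standing in for tree_code and internal_fn (hypothetical values).
enum toy_tree_code { TOY_ERROR_MARK = 0, TOY_PLUS = 1, TOY_WIDEN_PLUS = 2, TOY_MAX_CODES = 3 };
enum toy_fn { TOY_FN_NONE = 0, TOY_FN_WIDEN_PLUS = 1, TOY_FN_LAST = 2 };

// Toy code_helper: positive rep is a tree code, negative rep is a function.
class toy_code_helper
{
public:
  toy_code_helper (toy_tree_code c) : rep (static_cast<int> (c)) {}
  toy_code_helper (toy_fn f) : rep (-static_cast<int> (f)) {}
  bool is_tree_code () const { return rep > 0; }
  bool is_fn_code () const { return rep < 0; }
  // Comparing against a tree code is safe whatever the helper holds.
  bool operator== (toy_tree_code c) const { return rep == static_cast<int> (c); }
  toy_tree_code safe_as_tree_code () const;
  toy_fn safe_as_fn_code () const;

private:
  int rep;
};

// Defined outside the class so that no in-class definition overflows a line.
inline toy_tree_code
toy_code_helper::safe_as_tree_code () const
{
  return is_tree_code () ? static_cast<toy_tree_code> (rep) : TOY_MAX_CODES;
}

inline toy_fn
toy_code_helper::safe_as_fn_code () const
{
  return is_fn_code () ? static_cast<toy_fn> (-rep) : TOY_FN_LAST;
}

// Toy statement: records whether an "assign" or a "call" was built.
struct toy_stmt
{
  bool is_call;
  int code_or_fn;
  int num_args;
};

// Always creates a statement (no folding): an assign for tree codes,
// a call for function codes.
toy_stmt
toy_vect_build (toy_code_helper ch, bool has_op1)
{
  int nargs = has_op1 ? 2 : 1;
  if (ch.is_tree_code ())
    return toy_stmt{false, static_cast<int> (ch.safe_as_tree_code ()), nargs};
  return toy_stmt{true, static_cast<int> (ch.safe_as_fn_code ()), nargs};
}
```

The point of the sketch is only that the caller passes one code_helper-style value and never switches on the kind itself.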

> Similarly are you ok with the use of gimple_extract_op? I would lean towards using it as it is cleaner, but I don't have strong feelings.

I don't like using gimple_extract_op here, I think I outlined a variant
that is even shorter.

Richard.

> Joel
> 
> > -----Original Message-----
> > From: Richard Sandiford <richard.sandiford@arm.com>
> > Sent: 07 June 2022 09:18
> > To: Joel Hutton <Joel.Hutton@arm.com>
> > Cc: Richard Biener <rguenther@suse.de>; gcc-patches@gcc.gnu.org
> > Subject: Re: [ping][vect-patterns] Refactor widen_plus/widen_minus as
> > internal_fns
> > 
> > Joel Hutton <Joel.Hutton@arm.com> writes:
> > >> > Patches attached. They already incorporated the .cc rename, now
> > >> > rebased to be after the change to tree.h
> > >>
> > >> @@ -1412,8 +1412,7 @@ vect_recog_widen_op_pattern (vec_info *vinfo,
> > >>                        2, oprnd, half_type, unprom, vectype);
> > >>
> > >>    tree var = vect_recog_temp_ssa_var (itype, NULL);
> > >> -  gimple *pattern_stmt = gimple_build_assign (var, wide_code,
> > >> -                                             oprnd[0], oprnd[1]);
> > >> +  gimple *pattern_stmt = gimple_build (var, wide_code, oprnd[0],
> > >> oprnd[1]);
> > >>
> > >>
> > >> you should be able to do without the new gimple_build overload by
> > >> using
> > >>
> > >>    gimple_seq stmts = NULL;
> > >>    gimple_build (&stmts, wide_code, itype, oprnd[0], oprnd[1]);
> > >>    gimple *pattern_stmt = gimple_seq_last_stmt (stmts);
> > >>
> > >> because 'gimple_build' is an existing API.
> > >
> > > Done.
> > >
> > > > The gimple_build overload was at the request of Richard Sandiford, I
> > > > assume removing it is ok with you Richard S?
> > > From Richard Sandiford:
> > >> For example, I think we should hide this inside a new:
> > >>
> > >>   gimple_build (var, wide_code, oprnd[0], oprnd[1]);
> > >>
> > >> that works directly on code_helper, similarly to the new code_helper
> > >> gimple_build interfaces.
> > 
> > I thought the potential problem with the above is that gimple_build is a
> > folding interface, so in principle it's allowed to return an existing SSA_NAME
> > set by an existing statement (or even a constant).
> > I think in this context we do need to force a new statement to be created.
> > 
> > Of course, the hope is that there wouldn't still be such folding opportunities
> > at this stage, but I don't think it's guaranteed (especially with options
> > fuzzing).
> > 
> > > Since I was mentioned :-) ...
> > 
> > Could you run the patch through contrib/check_GNU_style.py?
> > There seem to be a few long lines.
> > 
> > > +  if (res_op.code.is_tree_code ())
> > 
> > Do you need this is_tree_code ()?  These comparisons…
> > 
> > > +  {
> > > +      widen_arith = (code == WIDEN_PLUS_EXPR
> > > +		     || code == WIDEN_MINUS_EXPR
> > > +		     || code == WIDEN_MULT_EXPR
> > > +		     || code == WIDEN_LSHIFT_EXPR);
> > 
> > …ought to be safe unconditionally.
> > 
> > > + }
> > > +  else
> > > +      widen_arith = false;
> > > +
> > > +  if (!widen_arith
> > > +      && !CONVERT_EXPR_CODE_P (code)
> > > +      && code != FIX_TRUNC_EXPR
> > > +      && code != FLOAT_EXPR)
> > > +    return false;
> > >
> > >    /* Check types of lhs and rhs.  */
> > > -  scalar_dest = gimple_assign_lhs (stmt);
> > > +  scalar_dest = gimple_get_lhs (stmt);
> > >    lhs_type = TREE_TYPE (scalar_dest);
> > >    vectype_out = STMT_VINFO_VECTYPE (stmt_info);
> > >
> > > @@ -4938,10 +4951,14 @@ vectorizable_conversion (vec_info *vinfo,
> > >
> > >    if (op_type == binary_op)
> > >      {
> > > > -      gcc_assert (code == WIDEN_MULT_EXPR || code == WIDEN_LSHIFT_EXPR
> > > > -		  || code == WIDEN_PLUS_EXPR || code == WIDEN_MINUS_EXPR);
> > > +      gcc_assert (code == WIDEN_MULT_EXPR
> > > +		  || code == WIDEN_LSHIFT_EXPR
> > > +		  || code == WIDEN_PLUS_EXPR
> > > +		  || code == WIDEN_MINUS_EXPR);
> > >
> > > -      op1 = gimple_assign_rhs2 (stmt);
> > > +
> > > +      op1 = is_gimple_assign (stmt) ? gimple_assign_rhs2 (stmt) :
> > > +				     gimple_call_arg (stmt, 0);
> > >        tree vectype1_in;
> > >        if (!vect_is_simple_use (vinfo, stmt_info, slp_node, 1,
> > >  			       &op1, &slp_op1, &dt[1], &vectype1_in)) […] @@
> > -12181,7
> > > +12235,6 @@ supportable_widening_operation (vec_info *vinfo,
> > >    return false;
> > >  }
> > >
> > > -
> > >  /* Function supportable_narrowing_operation
> > >
> > >     Check whether an operation represented by the code CODE is a
> > 
> > Seems like a spurious change.
> > 
> > > @@ -12205,7 +12258,7 @@ supportable_widening_operation (vec_info
> > > *vinfo,  bool  supportable_narrowing_operation (enum tree_code code,
> > >  				 tree vectype_out, tree vectype_in,
> > > -				 enum tree_code *code1, int *multi_step_cvt,
> > > +				 tree_code* _code1, int *multi_step_cvt,
> > 
> > The original formatting (space before the “*”) was correct.
> > Names beginning with _ are reserved, so I think we need a different
> > name here.  Also, the name in the comment should stay in sync with
> > the name in the code.
> > 
> > That said though, I'm not sure…
> > 
> > >                                   vec<tree> *interm_types)
> > >  {
> > >    machine_mode vec_mode;
> > > @@ -12217,8 +12270,8 @@ supportable_narrowing_operation (enum
> > tree_code code,
> > >    tree intermediate_type, prev_type;
> > >    machine_mode intermediate_mode, prev_mode;
> > >    int i;
> > > -  unsigned HOST_WIDE_INT n_elts;
> > >    bool uns;
> > > +  tree_code * code1 = (tree_code*) _code1;
> > 
> > …the combination of these two changes makes sense on their own.
> > 
> > >
> > >    *multi_step_cvt = 0;
> > >    switch (code)
> > > @@ -12227,9 +12280,8 @@ supportable_narrowing_operation (enum
> > tree_code code,
> > >        c1 = VEC_PACK_TRUNC_EXPR;
> > >        if (VECTOR_BOOLEAN_TYPE_P (narrow_vectype)
> > >  	  && VECTOR_BOOLEAN_TYPE_P (vectype)
> > > -	  && SCALAR_INT_MODE_P (TYPE_MODE (vectype))
> > > -	  && TYPE_VECTOR_SUBPARTS (vectype).is_constant (&n_elts)
> > > -	  && n_elts < BITS_PER_UNIT)
> > > +	  && TYPE_MODE (narrow_vectype) == TYPE_MODE (vectype)
> > > +	  && SCALAR_INT_MODE_P (TYPE_MODE (vectype)))
> > >  	optab1 = vec_pack_sbool_trunc_optab;
> > >        else
> > >  	optab1 = optab_for_tree_code (c1, vectype, optab_default);
> > > @@ -12320,9 +12372,8 @@ supportable_narrowing_operation (enum
> > tree_code code,
> > >  	  = lang_hooks.types.type_for_mode (intermediate_mode, uns);
> > >        if (VECTOR_BOOLEAN_TYPE_P (intermediate_type)
> > >  	  && VECTOR_BOOLEAN_TYPE_P (prev_type)
> > > -	  && SCALAR_INT_MODE_P (prev_mode)
> > > -	  && TYPE_VECTOR_SUBPARTS (intermediate_type).is_constant
> > (&n_elts)
> > > -	  && n_elts < BITS_PER_UNIT)
> > > +	  && intermediate_mode == prev_mode
> > > +	  && SCALAR_INT_MODE_P (prev_mode))
> > >  	interm_optab = vec_pack_sbool_trunc_optab;
> > >        else
> > >  	interm_optab
> > 
> > This part looks like a behavioural change, so I think it should be part
> > of a separate patch.
> > 
> > > diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
> > > > index 642eb0aeb21264cd736a479b1ec25357abef29cd..50ff8eeac1e6b9859302bd784f10ee3d8ff4b4dc 100644
> > > --- a/gcc/tree-vectorizer.h
> > > +++ b/gcc/tree-vectorizer.h
> > > @@ -2120,13 +2120,12 @@ extern bool vect_is_simple_use (vec_info *,
> > stmt_vec_info, slp_tree,
> > >  				enum vect_def_type *,
> > >  				tree *, stmt_vec_info * = NULL);
> > >  extern bool vect_maybe_update_slp_op_vectype (slp_tree, tree);
> > > -extern bool supportable_widening_operation (vec_info *,
> > > -					    enum tree_code, stmt_vec_info,
> > > -					    tree, tree, enum tree_code *,
> > > -					    enum tree_code *, int *,
> > > -					    vec<tree> *);
> > > +extern bool supportable_widening_operation (vec_info*, code_helper,
> > > +					    stmt_vec_info, tree, tree,
> > > +					    code_helper*, code_helper*,
> > > +					    int*, vec<tree> *);
> > >  extern bool supportable_narrowing_operation (enum tree_code, tree,
> > tree,
> > > -					     enum tree_code *, int *,
> > > +					     tree_code *, int *,
> > >  					     vec<tree> *);
> > >
> > >  extern unsigned record_stmt_cost (stmt_vector_for_cost *, int,
> > > diff --git a/gcc/tree.h b/gcc/tree.h
> > > > index f84958933d51144bb6ce7cc41eca5f7f06814550..00b0e4d1c696633fe38baad5295b1f90398d53fc 100644
> > > --- a/gcc/tree.h
> > > +++ b/gcc/tree.h
> > > @@ -92,6 +92,10 @@ public:
> > >    bool is_fn_code () const { return rep < 0; }
> > >    bool is_internal_fn () const;
> > >    bool is_builtin_fn () const;
> > > +  enum tree_code safe_as_tree_code () const { return is_tree_code () ?
> > > +    (tree_code)* this : MAX_TREE_CODES; }
> > > > +  combined_fn safe_as_fn_code () const { return is_fn_code () ? (combined_fn) *this
> > > > +    : CFN_LAST;}
> > 
> > Since these don't fit on a line, the coding convention says that they
> > should be defined outside of the class.
> > 
> > Thanks,
> > Richard
> > 
> > >    int get_rep () const { return rep; }
> > >    bool operator== (const code_helper &other) { return rep == other.rep; }
> > >    bool operator!= (const code_helper &other) { return rep != other.rep; }
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstraße 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns
  2022-06-07  8:18         ` Richard Sandiford
  2022-06-07  9:01           ` Joel Hutton
@ 2022-06-13  9:18           ` Richard Biener
  1 sibling, 0 replies; 53+ messages in thread
From: Richard Biener @ 2022-06-13  9:18 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: Joel Hutton, gcc-patches

On Tue, 7 Jun 2022, Richard Sandiford wrote:

> Joel Hutton <Joel.Hutton@arm.com> writes:
> >> > Patches attached. They already incorporated the .cc rename, now
> >> > rebased to be after the change to tree.h
> >>
> >> @@ -1412,8 +1412,7 @@ vect_recog_widen_op_pattern (vec_info *vinfo,
> >>                        2, oprnd, half_type, unprom, vectype);
> >>
> >>    tree var = vect_recog_temp_ssa_var (itype, NULL);
> >> -  gimple *pattern_stmt = gimple_build_assign (var, wide_code,
> >> -                                             oprnd[0], oprnd[1]);
> >> +  gimple *pattern_stmt = gimple_build (var, wide_code, oprnd[0],
> >> oprnd[1]);
> >>
> >>
> >> you should be able to do without the new gimple_build overload
> >> by using
> >>
> >>    gimple_seq stmts = NULL;
> >>    gimple_build (&stmts, wide_code, itype, oprnd[0], oprnd[1]);
> >>    gimple *pattern_stmt = gimple_seq_last_stmt (stmts);
> >>
> >> because 'gimple_build' is an existing API.
> >
> > Done.
> >
> > The gimple_build overload was at the request of Richard Sandiford, I assume removing it is ok with you Richard S?
> > From Richard Sandiford:
> >> For example, I think we should hide this inside a new:
> >>
> >>   gimple_build (var, wide_code, oprnd[0], oprnd[1]);
> >>
> >> that works directly on code_helper, similarly to the new code_helper
> >> gimple_build interfaces.
> 
> I thought the potential problem with the above is that gimple_build
> is a folding interface, so in principle it's allowed to return an
> existing SSA_NAME set by an existing statement (or even a constant).
> I think in this context we do need to force a new statement to be
> created.

Yes, that's due to how we use vect_finish_stmt_generation (only?).
It might be useful to add an overload that takes a gimple_seq
instead of a single gimple * for the vectorized stmt and leave
all the magic to that.  Now - we have the additional issue
that we have STMT_VINFO_VEC_STMTS instead of STMT_VINFO_VEC_DEFS
(in the end we'll only ever need the defs, never the stmts I think).

I do think that we eventually want to 'enhance' the gimple.h
non-folding stmt building API, unfortunately I took the 'gimple_build'
name for the folding one, so alternatively we can unify assign/call
with gimple_build_assign_or_call (...).  I don't really like the
idea of having folding and non-folding APIs being overloads :/
Maybe the non-folding API should be CTORs (guess GTY won't like
that) or static member functions:

gimple *gimple::build (tree, code_helper, tree, tree);

and in the long run the gimple_build API should be (for some uses?)
off a class as well, like instead of

  gimple_seq seq = NULL;
  op = gimple_build (&seq, ...);

do

  gimple_builder b (location); // location defaulted to UNKNOWN
  op = b.build (...);


So - writing the above I somewhat like the idea of static
member functions in 'gimple' (yes, at the root of the class
hierarchy, definitely not at gimple_statement_with_memory_ops_base,
not sure if we want gassign::build for assigns
and only the code_helper 'overloads' at the class root).
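
To make the folding versus non-folding distinction concrete, here is a minimal self-contained sketch. It is a toy model under stated assumptions, not the gimple API: toy_value, toy_stmt, toy_builder, build_stmt and build_folded are hypothetical names, and the only "folds" modelled are constant arithmetic and x + 0 / x - 0:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-ins for illustration only; none of these are GCC types.
struct toy_value
{
  bool is_const;  // true for a constant, false for an SSA-like name
  long cst;       // constant value when is_const
  int id;         // name id when !is_const, -1 for constants
};

struct toy_stmt
{
  char op;
  toy_value lhs, op0, op1;
};

class toy_builder
{
public:
  // Create a fresh name without emitting a statement.
  toy_value fresh () { return toy_value{false, 0, next_id++}; }

  static toy_value constant (long c) { return toy_value{true, c, -1}; }

  // Non-folding: always appends a new statement and returns its lhs.
  toy_value build_stmt (char op, toy_value op0, toy_value op1)
  {
    toy_value lhs = fresh ();
    seq.push_back (toy_stmt{op, lhs, op0, op1});
    return lhs;
  }

  // Folding: may return a constant or an existing operand, in which
  // case no statement is emitted at all.
  toy_value build_folded (char op, toy_value op0, toy_value op1)
  {
    if (op0.is_const && op1.is_const)
      return constant (op == '+' ? op0.cst + op1.cst : op0.cst - op1.cst);
    if (op1.is_const && op1.cst == 0)
      return op0;  // x + 0 and x - 0 simplify to x
    return build_stmt (op, op0, op1);
  }

  std::size_t num_stmts () const { return seq.size (); }

private:
  std::vector<toy_stmt> seq;
  int next_id = 0;
};
```

A caller that takes "the last statement in the sequence" after build_folded would find nothing there when the expression folded, which is the hazard for code that needs a freshly created pattern statement.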

Richard.


^ permalink raw reply	[flat|nested] 53+ messages in thread

* RE: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns
  2022-06-13  9:02             ` Richard Biener
@ 2022-06-30 13:20               ` Joel Hutton
  2022-07-12 12:32                 ` Richard Biener
  0 siblings, 1 reply; 53+ messages in thread
From: Joel Hutton @ 2022-06-30 13:20 UTC (permalink / raw)
  To: Richard Biener; +Cc: Richard Sandiford, gcc-patches, Andre Simoes Dias Vieira

[-- Attachment #1: Type: text/plain, Size: 647 bytes --]

> We can go with a private vect_gimple_build function until we sort out the API
> issue to unblock Tamar (I'll reply to Richard's reply with further thoughts on
> this)
> 

Done.

> > Similarly are you ok with the use of gimple_extract_op? I would lean
> towards using it as it is cleaner, but I don't have strong feelings.
> 
> I don't like using gimple_extract_op here, I think I outlined a variant that is
> even shorter.
> 

Done.

Updated patches attached, bootstrapped and regression tested on aarch64.

Tomorrow is my last working day at Arm, so it will likely be Andre that commits this/addresses any further comments.


[-- Attachment #2: 0001-Refactor-to-allow-internal_fn-s.patch --]
[-- Type: application/octet-stream, Size: 23390 bytes --]

From f1321c617838e94044cbae357a63db002fbd3edb Mon Sep 17 00:00:00 2001
From: Joel Hutton <joel.hutton@arm.com>
Date: Wed, 25 Aug 2021 14:31:15 +0100
Subject: [PATCH 1/3] Refactor to allow internal_fn's

Hi all,

This refactor allows widening patterns (such as widen_plus/widen_minus) to be represented as
either internal_fns or tree_codes.

[vect-patterns] Refactor as internal_fn's

Refactor vect-patterns to allow patterns to be internal_fns starting
with widening_plus/minus patterns

gcc/ChangeLog:

	* tree-core.h (ECF_WIDEN): New flag to mark internal_fn as widening.
	* gimple-match.h (class code_helper):
	* tree-vect-patterns.cc (vect_recog_widen_op_pattern): Refactor to
    use code_helper.
	(vect_gimple_build): New function.
	* tree-vect-stmts.cc (vect_gen_widened_results_half): Refactor to
    use code_helper.
	(vect_create_vectorized_promotion_stmts): Refactor to use
    code_helper.
	(vectorizable_conversion): Refactor to use code_helper.
    gimple_call or gimple_assign.
	(supportable_widening_operation): Refactor to use code_helper.
	(supportable_narrowing_operation): Refactor to use code_helper.
	* tree-vectorizer.h (supportable_widening_operation): Change
    prototype to use code_helper.
	(supportable_narrowing_operation): Change prototype to use
    code_helper.
	(vect_gimple_build): New function prototype.
	* tree.h (code_helper::safe_as_tree_code): New function.
    helper functions.
	(code_helper::safe_as_fn_code): New function.
---
 gcc/tree-core.h           |   3 +
 gcc/tree-vect-patterns.cc |  34 ++++++-
 gcc/tree-vect-stmts.cc    | 208 +++++++++++++++++++++++++-------------
 gcc/tree-vectorizer.h     |  14 +--
 gcc/tree.h                |  13 +++
 5 files changed, 189 insertions(+), 83 deletions(-)

diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index ab5fa01e5cb5fb56c1964b93b014ed55a4aa704a..cff6211080bced0bffb39e98039a6550897acf77 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -96,6 +96,9 @@ struct die_struct;
 /* Nonzero if this is a cold function.  */
 #define ECF_COLD		  (1 << 15)
 
+/* Nonzero if this is a widening function.  */
+#define ECF_WIDEN		  (1 << 16)
+
 /* Call argument flags.  */
 
 /* Nonzero if the argument is not used by the function.  */
diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index 8f624863971392c891fde7278949c8818f646576..d892158f024fc045b897aebe76f2e2b66211cf83 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -25,6 +25,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "rtl.h"
 #include "tree.h"
 #include "gimple.h"
+#include "gimple-iterator.h"
+#include "gimple-fold.h"
 #include "ssa.h"
 #include "expmed.h"
 #include "optabs-tree.h"
@@ -1348,7 +1350,7 @@ vect_recog_sad_pattern (vec_info *vinfo,
 static gimple *
 vect_recog_widen_op_pattern (vec_info *vinfo,
 			     stmt_vec_info last_stmt_info, tree *type_out,
-			     tree_code orig_code, tree_code wide_code,
+			     tree_code orig_code, code_helper wide_code,
 			     bool shift_p, const char *name)
 {
   gimple *last_stmt = last_stmt_info->stmt;
@@ -1391,7 +1393,7 @@ vect_recog_widen_op_pattern (vec_info *vinfo,
       vecctype = get_vectype_for_scalar_type (vinfo, ctype);
     }
 
-  enum tree_code dummy_code;
+  code_helper dummy_code;
   int dummy_int;
   auto_vec<tree> dummy_vec;
   if (!vectype
@@ -1412,8 +1414,7 @@ vect_recog_widen_op_pattern (vec_info *vinfo,
 		       2, oprnd, half_type, unprom, vectype);
 
   tree var = vect_recog_temp_ssa_var (itype, NULL);
-  gimple *pattern_stmt = gimple_build_assign (var, wide_code,
-					      oprnd[0], oprnd[1]);
+  gimple *pattern_stmt = vect_gimple_build (var, wide_code, oprnd[0], oprnd[1]);
 
   if (vecctype != vecitype)
     pattern_stmt = vect_convert_output (vinfo, last_stmt_info, ctype,
@@ -5968,3 +5969,28 @@ vect_pattern_recog (vec_info *vinfo)
   /* After this no more add_stmt calls are allowed.  */
   vinfo->stmt_vec_info_ro = true;
 }
+
+/* Build a GIMPLE_ASSIGN or GIMPLE_CALL with the tree_code,
+   or internal_fn contained in ch, respectively.  */
+gimple *
+vect_gimple_build (tree lhs, code_helper ch, tree op0, tree op1)
+{
+  if (op0 == NULL_TREE)
+    return NULL;
+  if (ch.is_tree_code ())
+    return op1 == NULL_TREE ? gimple_build_assign (lhs, ch.safe_as_tree_code (),
+						   op0) :
+			      gimple_build_assign (lhs, ch.safe_as_tree_code (),
+						   op0, op1);
+  else
+  {
+    internal_fn fn = as_internal_fn (ch.safe_as_fn_code ());
+    gimple* stmt;
+    if (op1 == NULL_TREE)
+      stmt = gimple_build_call_internal (fn, 1, op0);
+    else
+      stmt = gimple_build_call_internal (fn, 2, op0, op1);
+    gimple_call_set_lhs (stmt, lhs);
+    return stmt;
+  }
+}
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 346d8ce280437e00bfeb19a4b4adc59eb96207f9..d6aabb873c86ab8ff0bae41c7f6c3bad34d583c5 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -4636,7 +4636,7 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info,
    STMT_INFO is the original scalar stmt that we are vectorizing.  */
 
 static gimple *
-vect_gen_widened_results_half (vec_info *vinfo, enum tree_code code,
+vect_gen_widened_results_half (vec_info *vinfo, code_helper ch,
                                tree vec_oprnd0, tree vec_oprnd1, int op_type,
 			       tree vec_dest, gimple_stmt_iterator *gsi,
 			       stmt_vec_info stmt_info)
@@ -4645,12 +4645,11 @@ vect_gen_widened_results_half (vec_info *vinfo, enum tree_code code,
   tree new_temp;
 
   /* Generate half of the widened result:  */
-  gcc_assert (op_type == TREE_CODE_LENGTH (code));
   if (op_type != binary_op)
     vec_oprnd1 = NULL;
-  new_stmt = gimple_build_assign (vec_dest, code, vec_oprnd0, vec_oprnd1);
+  new_stmt = vect_gimple_build (vec_dest, ch, vec_oprnd0, vec_oprnd1);
   new_temp = make_ssa_name (vec_dest, new_stmt);
-  gimple_assign_set_lhs (new_stmt, new_temp);
+  gimple_set_lhs (new_stmt, new_temp);
   vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
 
   return new_stmt;
@@ -4729,8 +4728,8 @@ vect_create_vectorized_promotion_stmts (vec_info *vinfo,
 					vec<tree> *vec_oprnds1,
 					stmt_vec_info stmt_info, tree vec_dest,
 					gimple_stmt_iterator *gsi,
-					enum tree_code code1,
-					enum tree_code code2, int op_type)
+					code_helper ch1,
+					code_helper ch2, int op_type)
 {
   int i;
   tree vop0, vop1, new_tmp1, new_tmp2;
@@ -4746,10 +4745,10 @@ vect_create_vectorized_promotion_stmts (vec_info *vinfo,
 	vop1 = NULL_TREE;
 
       /* Generate the two halves of promotion operation.  */
-      new_stmt1 = vect_gen_widened_results_half (vinfo, code1, vop0, vop1,
+      new_stmt1 = vect_gen_widened_results_half (vinfo, ch1, vop0, vop1,
 						 op_type, vec_dest, gsi,
 						 stmt_info);
-      new_stmt2 = vect_gen_widened_results_half (vinfo, code2, vop0, vop1,
+      new_stmt2 = vect_gen_widened_results_half (vinfo, ch2, vop0, vop1,
 						 op_type, vec_dest, gsi,
 						 stmt_info);
       if (is_gimple_call (new_stmt1))
@@ -4846,8 +4845,9 @@ vectorizable_conversion (vec_info *vinfo,
   tree scalar_dest;
   tree op0, op1 = NULL_TREE;
   loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo);
-  enum tree_code code, code1 = ERROR_MARK, code2 = ERROR_MARK;
-  enum tree_code codecvt1 = ERROR_MARK, codecvt2 = ERROR_MARK;
+  tree_code tc1;
+  code_helper code, code1, code2;
+  code_helper codecvt1 = ERROR_MARK, codecvt2 = ERROR_MARK;
   tree new_temp;
   enum vect_def_type dt[2] = {vect_unknown_def_type, vect_unknown_def_type};
   int ndts = 2;
@@ -4876,31 +4876,43 @@ vectorizable_conversion (vec_info *vinfo,
       && ! vec_stmt)
     return false;
 
-  gassign *stmt = dyn_cast <gassign *> (stmt_info->stmt);
-  if (!stmt)
+  gimple* stmt = stmt_info->stmt;
+  if (!(is_gimple_assign (stmt) || is_gimple_call (stmt)))
     return false;
 
-  if (TREE_CODE (gimple_assign_lhs (stmt)) != SSA_NAME)
+  if (gimple_get_lhs (stmt) == NULL_TREE
+      || TREE_CODE (gimple_get_lhs (stmt)) != SSA_NAME)
     return false;
 
-  code = gimple_assign_rhs_code (stmt);
-  if (!CONVERT_EXPR_CODE_P (code)
-      && code != FIX_TRUNC_EXPR
-      && code != FLOAT_EXPR
-      && code != WIDEN_PLUS_EXPR
-      && code != WIDEN_MINUS_EXPR
-      && code != WIDEN_MULT_EXPR
-      && code != WIDEN_LSHIFT_EXPR)
+  if (TREE_CODE (gimple_get_lhs (stmt)) != SSA_NAME)
+    return false;
+
+  if (is_gimple_assign (stmt))
+    {
+      code = gimple_assign_rhs_code (stmt);
+      op_type = TREE_CODE_LENGTH (code.safe_as_tree_code ());
+    }
+  else if (gimple_call_internal_p (stmt))
+    {
+      code = gimple_call_internal_fn (stmt);
+      op_type = gimple_call_num_args (stmt);
+    }
+  else
     return false;
 
   bool widen_arith = (code == WIDEN_PLUS_EXPR
-		      || code == WIDEN_MINUS_EXPR
-		      || code == WIDEN_MULT_EXPR
-		      || code == WIDEN_LSHIFT_EXPR);
-  op_type = TREE_CODE_LENGTH (code);
+		 || code == WIDEN_MINUS_EXPR
+		 || code == WIDEN_MULT_EXPR
+		 || code == WIDEN_LSHIFT_EXPR);
+
+  if (!widen_arith
+      && !CONVERT_EXPR_CODE_P (code)
+      && code != FIX_TRUNC_EXPR
+      && code != FLOAT_EXPR)
+    return false;
 
   /* Check types of lhs and rhs.  */
-  scalar_dest = gimple_assign_lhs (stmt);
+  scalar_dest = gimple_get_lhs (stmt);
   lhs_type = TREE_TYPE (scalar_dest);
   vectype_out = STMT_VINFO_VECTYPE (stmt_info);
 
@@ -4938,10 +4950,14 @@ vectorizable_conversion (vec_info *vinfo,
 
   if (op_type == binary_op)
     {
-      gcc_assert (code == WIDEN_MULT_EXPR || code == WIDEN_LSHIFT_EXPR
-		  || code == WIDEN_PLUS_EXPR || code == WIDEN_MINUS_EXPR);
+      gcc_assert (code == WIDEN_MULT_EXPR
+		  || code == WIDEN_LSHIFT_EXPR
+		  || code == WIDEN_PLUS_EXPR
+		  || code == WIDEN_MINUS_EXPR);
 
-      op1 = gimple_assign_rhs2 (stmt);
+
+      op1 = is_gimple_assign (stmt) ? gimple_assign_rhs2 (stmt) :
+				     gimple_call_arg (stmt, 0);
       tree vectype1_in;
       if (!vect_is_simple_use (vinfo, stmt_info, slp_node, 1,
 			       &op1, &slp_op1, &dt[1], &vectype1_in))
@@ -5025,8 +5041,12 @@ vectorizable_conversion (vec_info *vinfo,
 	  && code != FLOAT_EXPR
 	  && !CONVERT_EXPR_CODE_P (code))
 	return false;
-      if (supportable_convert_operation (code, vectype_out, vectype_in, &code1))
+      if (supportable_convert_operation (code.safe_as_tree_code (), vectype_out,
+					 vectype_in, &tc1))
+      {
+	code1 = tc1;
 	break;
+      }
       /* FALLTHRU */
     unsupported:
       if (dump_enabled_p ())
@@ -5037,9 +5057,11 @@ vectorizable_conversion (vec_info *vinfo,
     case WIDEN:
       if (known_eq (nunits_in, nunits_out))
 	{
-	  if (!supportable_half_widening_operation (code, vectype_out,
-						   vectype_in, &code1))
+	  if (!supportable_half_widening_operation (code.safe_as_tree_code (),
+						    vectype_out, vectype_in,
+						    &tc1))
 	    goto unsupported;
+	  code1 = tc1;
 	  gcc_assert (!(multi_step_cvt && op_type == binary_op));
 	  break;
 	}
@@ -5073,14 +5095,17 @@ vectorizable_conversion (vec_info *vinfo,
 
 	  if (GET_MODE_SIZE (rhs_mode) == fltsz)
 	    {
-	      if (!supportable_convert_operation (code, vectype_out,
-						  cvt_type, &codecvt1))
+	      tc1 = ERROR_MARK;
+	      if (!supportable_convert_operation (code.safe_as_tree_code (),
+						  vectype_out,
+						  cvt_type, &tc1))
 		goto unsupported;
+	      codecvt1 = tc1;
 	    }
-	  else if (!supportable_widening_operation (vinfo, code, stmt_info,
-						    vectype_out, cvt_type,
-						    &codecvt1, &codecvt2,
-						    &multi_step_cvt,
+	  else if (!supportable_widening_operation (vinfo, code,
+						    stmt_info, vectype_out,
+						    cvt_type, &codecvt1,
+						    &codecvt2, &multi_step_cvt,
 						    &interm_types))
 	    continue;
 	  else
@@ -5088,8 +5113,9 @@ vectorizable_conversion (vec_info *vinfo,
 
 	  if (supportable_widening_operation (vinfo, NOP_EXPR, stmt_info,
 					      cvt_type,
-					      vectype_in, &code1, &code2,
-					      &multi_step_cvt, &interm_types))
+					      vectype_in, &code1,
+					      &code2, &multi_step_cvt,
+					      &interm_types))
 	    {
 	      found_mode = true;
 	      break;
@@ -5111,10 +5137,15 @@ vectorizable_conversion (vec_info *vinfo,
 
     case NARROW:
       gcc_assert (op_type == unary_op);
-      if (supportable_narrowing_operation (code, vectype_out, vectype_in,
-					   &code1, &multi_step_cvt,
+      if (supportable_narrowing_operation (code.safe_as_tree_code (),
+					   vectype_out,
+					   vectype_in,
+					   &tc1, &multi_step_cvt,
 					   &interm_types))
-	break;
+	{
+	  code1 = tc1;
+	  break;
+	}
 
       if (code != FIX_TRUNC_EXPR
 	  || GET_MODE_SIZE (lhs_mode) >= GET_MODE_SIZE (rhs_mode))
@@ -5125,13 +5156,18 @@ vectorizable_conversion (vec_info *vinfo,
       cvt_type = get_same_sized_vectype (cvt_type, vectype_in);
       if (cvt_type == NULL_TREE)
 	goto unsupported;
-      if (!supportable_convert_operation (code, cvt_type, vectype_in,
-					  &codecvt1))
+      if (!supportable_convert_operation (code.safe_as_tree_code (), cvt_type,
+					  vectype_in,
+					  &tc1))
 	goto unsupported;
+      codecvt1 = tc1;
       if (supportable_narrowing_operation (NOP_EXPR, vectype_out, cvt_type,
-					   &code1, &multi_step_cvt,
+					   &tc1, &multi_step_cvt,
 					   &interm_types))
-	break;
+	{
+	  code1 = tc1;
+	  break;
+	}
       goto unsupported;
 
     default:
@@ -5245,8 +5281,10 @@ vectorizable_conversion (vec_info *vinfo,
       FOR_EACH_VEC_ELT (vec_oprnds0, i, vop0)
 	{
 	  /* Arguments are ready, create the new vector stmt.  */
-	  gcc_assert (TREE_CODE_LENGTH (code1) == unary_op);
-	  gassign *new_stmt = gimple_build_assign (vec_dest, code1, vop0);
+	  gcc_assert (TREE_CODE_LENGTH ((tree_code) code1) == unary_op);
+	  gassign *new_stmt = gimple_build_assign (vec_dest,
+						   code1.safe_as_tree_code (),
+						   vop0);
 	  new_temp = make_ssa_name (vec_dest, new_stmt);
 	  gimple_assign_set_lhs (new_stmt, new_temp);
 	  vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
@@ -5278,7 +5316,7 @@ vectorizable_conversion (vec_info *vinfo,
       for (i = multi_step_cvt; i >= 0; i--)
 	{
 	  tree this_dest = vec_dsts[i];
-	  enum tree_code c1 = code1, c2 = code2;
+	  code_helper c1 = code1, c2 = code2;
 	  if (i == 0 && codecvt2 != ERROR_MARK)
 	    {
 	      c1 = codecvt1;
@@ -5288,7 +5326,8 @@ vectorizable_conversion (vec_info *vinfo,
 	    vect_create_half_widening_stmts (vinfo, &vec_oprnds0,
 						    &vec_oprnds1, stmt_info,
 						    this_dest, gsi,
-						    c1, op_type);
+						    c1.safe_as_tree_code (),
+						    op_type);
 	  else
 	    vect_create_vectorized_promotion_stmts (vinfo, &vec_oprnds0,
 						    &vec_oprnds1, stmt_info,
@@ -5301,9 +5340,11 @@ vectorizable_conversion (vec_info *vinfo,
 	  gimple *new_stmt;
 	  if (cvt_type)
 	    {
-	      gcc_assert (TREE_CODE_LENGTH (codecvt1) == unary_op);
+	      gcc_assert (TREE_CODE_LENGTH ((tree_code) codecvt1) == unary_op);
 	      new_temp = make_ssa_name (vec_dest);
-	      new_stmt = gimple_build_assign (new_temp, codecvt1, vop0);
+	      new_stmt = gimple_build_assign (new_temp,
+					      codecvt1.safe_as_tree_code (),
+					      vop0);
 	      vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
 	    }
 	  else
@@ -5327,10 +5368,12 @@ vectorizable_conversion (vec_info *vinfo,
       if (cvt_type)
 	FOR_EACH_VEC_ELT (vec_oprnds0, i, vop0)
 	  {
-	    gcc_assert (TREE_CODE_LENGTH (codecvt1) == unary_op);
+	    gcc_assert (TREE_CODE_LENGTH (((tree_code) codecvt1)) == unary_op);
 	    new_temp = make_ssa_name (vec_dest);
 	    gassign *new_stmt
-	      = gimple_build_assign (new_temp, codecvt1, vop0);
+	      = gimple_build_assign (new_temp,
+				     codecvt1.safe_as_tree_code (),
+				     vop0);
 	    vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
 	    vec_oprnds0[i] = new_temp;
 	  }
@@ -5338,7 +5381,8 @@ vectorizable_conversion (vec_info *vinfo,
       vect_create_vectorized_demotion_stmts (vinfo, &vec_oprnds0,
 					     multi_step_cvt,
 					     stmt_info, vec_dsts, gsi,
-					     slp_node, code1);
+					     slp_node,
+					     code1.safe_as_tree_code ());
       break;
     }
   if (!slp_node)
@@ -11926,9 +11970,11 @@ vect_maybe_update_slp_op_vectype (slp_tree op, tree vectype)
 
 bool
 supportable_widening_operation (vec_info *vinfo,
-				enum tree_code code, stmt_vec_info stmt_info,
+				code_helper code,
+				stmt_vec_info stmt_info,
 				tree vectype_out, tree vectype_in,
-                                enum tree_code *code1, enum tree_code *code2,
+				code_helper *code1,
+				code_helper *code2,
                                 int *multi_step_cvt,
                                 vec<tree> *interm_types)
 {
@@ -11939,7 +11985,7 @@ supportable_widening_operation (vec_info *vinfo,
   optab optab1, optab2;
   tree vectype = vectype_in;
   tree wide_vectype = vectype_out;
-  enum tree_code c1, c2;
+  code_helper c1 = MAX_TREE_CODES, c2 = MAX_TREE_CODES;
   int i;
   tree prev_type, intermediate_type;
   machine_mode intermediate_mode, prev_mode;
@@ -11949,7 +11995,7 @@ supportable_widening_operation (vec_info *vinfo,
   if (loop_info)
     vect_loop = LOOP_VINFO_LOOP (loop_info);
 
-  switch (code)
+  switch (code.safe_as_tree_code ())
     {
     case WIDEN_MULT_EXPR:
       /* The result of a vectorized widening operation usually requires
@@ -11990,8 +12036,9 @@ supportable_widening_operation (vec_info *vinfo,
 	  && !nested_in_vect_loop_p (vect_loop, stmt_info)
 	  && supportable_widening_operation (vinfo, VEC_WIDEN_MULT_EVEN_EXPR,
 					     stmt_info, vectype_out,
-					     vectype_in, code1, code2,
-					     multi_step_cvt, interm_types))
+					     vectype_in, code1,
+					     code2, multi_step_cvt,
+					     interm_types))
         {
           /* Elements in a vector with vect_used_by_reduction property cannot
              be reordered if the use chain with this property does not have the
@@ -12054,6 +12101,9 @@ supportable_widening_operation (vec_info *vinfo,
       c2 = VEC_UNPACK_FIX_TRUNC_HI_EXPR;
       break;
 
+    case MAX_TREE_CODES:
+      break;
+
     default:
       gcc_unreachable ();
     }
@@ -12064,10 +12114,12 @@ supportable_widening_operation (vec_info *vinfo,
   if (code == FIX_TRUNC_EXPR)
     {
       /* The signedness is determined from output operand.  */
-      optab1 = optab_for_tree_code (c1, vectype_out, optab_default);
-      optab2 = optab_for_tree_code (c2, vectype_out, optab_default);
+      optab1 = optab_for_tree_code (c1.safe_as_tree_code (), vectype_out,
+				    optab_default);
+      optab2 = optab_for_tree_code (c2.safe_as_tree_code (), vectype_out,
+				    optab_default);
     }
-  else if (CONVERT_EXPR_CODE_P (code)
+  else if (CONVERT_EXPR_CODE_P (code.safe_as_tree_code ())
 	   && VECTOR_BOOLEAN_TYPE_P (wide_vectype)
 	   && VECTOR_BOOLEAN_TYPE_P (vectype)
 	   && TYPE_MODE (wide_vectype) == TYPE_MODE (vectype)
@@ -12080,8 +12132,10 @@ supportable_widening_operation (vec_info *vinfo,
     }
   else
     {
-      optab1 = optab_for_tree_code (c1, vectype, optab_default);
-      optab2 = optab_for_tree_code (c2, vectype, optab_default);
+      optab1 = optab_for_tree_code (c1.safe_as_tree_code (), vectype,
+				    optab_default);
+      optab2 = optab_for_tree_code (c2.safe_as_tree_code (), vectype,
+				    optab_default);
     }
 
   if (!optab1 || !optab2)
@@ -12092,8 +12146,12 @@ supportable_widening_operation (vec_info *vinfo,
        || (icode2 = optab_handler (optab2, vec_mode)) == CODE_FOR_nothing)
     return false;
 
-  *code1 = c1;
-  *code2 = c2;
+  if (code.is_tree_code ())
+  {
+    *code1 = c1;
+    *code2 = c2;
+  }
+
 
   if (insn_data[icode1].operand[0].mode == TYPE_MODE (wide_vectype)
       && insn_data[icode2].operand[0].mode == TYPE_MODE (wide_vectype))
@@ -12114,7 +12172,7 @@ supportable_widening_operation (vec_info *vinfo,
   prev_type = vectype;
   prev_mode = vec_mode;
 
-  if (!CONVERT_EXPR_CODE_P (code))
+  if (!CONVERT_EXPR_CODE_P ((tree_code) code))
     return false;
 
   /* We assume here that there will not be more than MAX_INTERM_CVT_STEPS
@@ -12145,8 +12203,12 @@ supportable_widening_operation (vec_info *vinfo,
 	}
       else
 	{
-	  optab3 = optab_for_tree_code (c1, intermediate_type, optab_default);
-	  optab4 = optab_for_tree_code (c2, intermediate_type, optab_default);
+	  optab3 = optab_for_tree_code (c1.safe_as_tree_code (),
+					intermediate_type,
+					optab_default);
+	  optab4 = optab_for_tree_code (c2.safe_as_tree_code (),
+					intermediate_type,
+					optab_default);
 	}
 
       if (!optab3 || !optab4
@@ -12205,7 +12267,7 @@ supportable_widening_operation (vec_info *vinfo,
 bool
 supportable_narrowing_operation (enum tree_code code,
 				 tree vectype_out, tree vectype_in,
-				 enum tree_code *code1, int *multi_step_cvt,
+				 tree_code *code1, int *multi_step_cvt,
                                  vec<tree> *interm_types)
 {
   machine_mode vec_mode;
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index 642eb0aeb21264cd736a479b1ec25357abef29cd..6f70bd622c4a4dea8c432cd26c96d24af399ef3e 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -2120,13 +2120,12 @@ extern bool vect_is_simple_use (vec_info *, stmt_vec_info, slp_tree,
 				enum vect_def_type *,
 				tree *, stmt_vec_info * = NULL);
 extern bool vect_maybe_update_slp_op_vectype (slp_tree, tree);
-extern bool supportable_widening_operation (vec_info *,
-					    enum tree_code, stmt_vec_info,
-					    tree, tree, enum tree_code *,
-					    enum tree_code *, int *,
-					    vec<tree> *);
+extern bool supportable_widening_operation (vec_info*, code_helper,
+					    stmt_vec_info, tree, tree,
+					    code_helper*, code_helper*,
+					    int*, vec<tree> *);
 extern bool supportable_narrowing_operation (enum tree_code, tree, tree,
-					     enum tree_code *, int *,
+					     tree_code *, int *,
 					     vec<tree> *);
 
 extern unsigned record_stmt_cost (stmt_vector_for_cost *, int,
@@ -2558,4 +2557,7 @@ vect_is_integer_truncation (stmt_vec_info stmt_info)
 	  && TYPE_PRECISION (lhs_type) < TYPE_PRECISION (rhs_type));
 }
 
+/* Build a GIMPLE_ASSIGN or GIMPLE_CALL with the tree_code,
+   or internal_fn contained in ch, respectively.  */
+gimple * vect_gimple_build (tree, code_helper, tree, tree);
 #endif  /* GCC_TREE_VECTORIZER_H  */
diff --git a/gcc/tree.h b/gcc/tree.h
index 6f6ad5a3a5f4dd4173482dfe259acf539ba24000..24b5184122550fe21ab0a5387867b6c65c20bb03 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -93,6 +93,8 @@ public:
   bool is_internal_fn () const;
   bool is_builtin_fn () const;
   int get_rep () const { return rep; }
+  enum tree_code safe_as_tree_code () const;
+  combined_fn safe_as_fn_code () const;
   bool operator== (const code_helper &other) { return rep == other.rep; }
   bool operator!= (const code_helper &other) { return rep != other.rep; }
   bool operator== (tree_code c) { return rep == code_helper (c).rep; }
@@ -102,6 +104,17 @@ private:
   int rep;
 };
 
+inline enum tree_code
+code_helper::safe_as_tree_code () const
+{
+  return is_tree_code () ? (tree_code)* this : MAX_TREE_CODES;
+}
+
+inline combined_fn
+code_helper::safe_as_fn_code () const {
+  return is_fn_code () ? (combined_fn) *this : CFN_LAST;
+}
+
 inline code_helper::operator internal_fn () const
 {
   return as_internal_fn (combined_fn (*this));
-- 
2.17.1


[-- Attachment #3: 0002-Refactor-widen_plus-as-internal_fn.patch --]
[-- Type: application/octet-stream, Size: 22832 bytes --]

From 1e8afa697157c3cb520a36304326e14891444226 Mon Sep 17 00:00:00 2001
From: Joel Hutton <joel.hutton@arm.com>
Date: Wed, 26 Jan 2022 14:00:17 +0000
Subject: [PATCH 2/3] Refactor widen_plus as internal_fn

This patch replaces the existing tree_code widen_plus and widen_minus
patterns with internal_fn versions.

DEF_INTERNAL_OPTAB_MULTI_FN is like DEF_INTERNAL_OPTAB_FN, except that it
provides convenience wrappers for defining conversions that require a
hi/lo split, such as widening and narrowing operations.  Each definition
for <NAME> requires an optab named <OPTAB> and two further optabs, one
for the signed and one for the unsigned variant.  The hi/lo pair is
necessary because a widening operation takes n narrow elements as input
and returns n/2 wide elements as output.  The 'lo' operation works on the
first n/2 input elements and the 'hi' operation on the second n/2.
Defining an internal_fn together with its hi/lo variants allows a
vect_recog function to return a single internal function that is later
expanded to the hi/lo pair.
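
As an illustration of the hi/lo split described above, here is a minimal
scalar sketch.  It is not GCC code: widen_add_half, widen_add_lo and
widen_add_hi are hypothetical names, and the element types (uint8_t
widened to uint16_t) are chosen only for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar model of one half of a widening add on n narrow elements.
// The 'lo' half reads input elements [0, n/2) and the 'hi' half reads
// [n/2, n); each produces n/2 wide elements.  Hypothetical helper, for
// illustration only.
static std::vector<uint16_t>
widen_add_half (const std::vector<uint8_t> &a,
                const std::vector<uint8_t> &b, bool hi)
{
  assert (a.size () == b.size () && a.size () % 2 == 0);
  const std::size_t half = a.size () / 2;
  const std::size_t base = hi ? half : 0;
  std::vector<uint16_t> out (half);
  for (std::size_t i = 0; i < half; ++i)
    // Widen before adding so the sum cannot wrap in the narrow type.
    out[i] = static_cast<uint16_t> (a[base + i] + b[base + i]);
  return out;
}

std::vector<uint16_t>
widen_add_lo (const std::vector<uint8_t> &a, const std::vector<uint8_t> &b)
{
  return widen_add_half (a, b, false);
}

std::vector<uint16_t>
widen_add_hi (const std::vector<uint8_t> &a, const std::vector<uint8_t> &b)
{
  return widen_add_half (a, b, true);
}
```

Taken together, the two halves cover all n input elements, which is why
a single widening internal_fn has to expand to a lo/hi pair.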

DEF_INTERNAL_OPTAB_MULTI_FN is used in internal-fn.def to register a
widening internal_fn.  The macro is defined differently in each place
that includes internal-fn.def, so the parameters given in one entry are
reused by every expansion:
  internal-fn.cc: first defined to expand to the hi/lo signed/unsigned
    optabs, later redefined to generate the 'expand_' functions for the
    hi/lo versions of the fn.
  internal-fn.def: defined to invoke DEF_INTERNAL_OPTAB_FN for the
    original and the hi/lo variants of the internal_fn.

For example:
  IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO
  for aarch64:
    IFN_VEC_WIDEN_PLUS_HI -> vec_widen_<su>addl_hi_<mode> -> (u/s)addl2
    IFN_VEC_WIDEN_PLUS_LO -> vec_widen_<su>addl_lo_<mode> -> (u/s)addl
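
The "one list, several macro definitions" technique described above can
be sketched in a self-contained form.  This is an assumption-laden toy,
not the patch itself: DEMO_MULTI_FN_LIST stands in for including
internal-fn.def, and every DEMO_/demo_ name is hypothetical.  The
lo = fn + 1, hi = fn + 2 layout mirrors what lookup_multi_internal_fn
relies on.

```cpp
#include <cassert>
#include <cstring>

// Stand-in for internal-fn.def: a single list of entries that is
// expanded several times with different macro definitions.
#define DEMO_MULTI_FN_LIST(X) \
  X (WIDEN_PLUS)              \
  X (WIDEN_MINUS)

// First expansion: each entry declares the base function followed
// immediately by its _LO and _HI variants.
enum demo_fn
{
#define DEMO_ENUM(NAME) DEMO_##NAME, DEMO_##NAME##_LO, DEMO_##NAME##_HI,
  DEMO_MULTI_FN_LIST (DEMO_ENUM)
#undef DEMO_ENUM
  DEMO_LAST
};

// Second expansion of the same list: a name table indexed by demo_fn.
const char *const demo_fn_names[] = {
#define DEMO_NAME(NAME) #NAME, #NAME "_LO", #NAME "_HI",
  DEMO_MULTI_FN_LIST (DEMO_NAME)
#undef DEMO_NAME
  "<invalid>"
};

// Because _LO and _HI directly follow the base entry, they can be
// found by offset from the base function.
inline void
demo_lookup_hilo (demo_fn fn, demo_fn *lo, demo_fn *hi)
{
  assert (fn + 2 < DEMO_LAST);
  *lo = static_cast<demo_fn> (fn + 1);
  *hi = static_cast<demo_fn> (fn + 2);
}
```

The offset lookup only holds while each base entry is immediately
followed by its two variants, so the ordering inside the list macro is
part of the contract.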

This gives the same functionality as the previous WIDEN_PLUS/WIDEN_MINUS
tree codes, which are expanded into VEC_WIDEN_PLUS_LO and
VEC_WIDEN_PLUS_HI.

gcc/ChangeLog:

2022-04-13  Joel Hutton  <joel.hutton@arm.com>
2022-04-13  Tamar Christina  <tamar.christina@arm.com>

	* internal-fn.cc (INCLUDE_MAP): Include maps for use in optab
	lookup.
	(DEF_INTERNAL_OPTAB_MULTI_FN): Macro to define an internal_fn that
	expands into multiple internal_fns (for widening).
	(ifn_cmp): Function to compare ifns for sorting/searching.
	(lookup_multi_ifn_optab): Add lookup function.
	(lookup_multi_internal_fn): Add lookup function.
	(commutative_binary_fn_p): Add widen_plus fns.
	* internal-fn.def (DEF_INTERNAL_OPTAB_MULTI_FN): Define widening
	plus, minus functions.
	(VEC_WIDEN_PLUS): Replacement for VEC_WIDEN_PLUS tree code.
	(VEC_WIDEN_MINUS): Replacement for VEC_WIDEN_MINUS tree code.
	* internal-fn.h (GCC_INTERNAL_FN_H): Add headers.
	(lookup_multi_ifn_optab): Add prototype.
	(lookup_multi_internal_fn): Add prototype.
	* optabs.cc (commutative_optab_p): Add widening plus, minus optabs.
	* optabs.def (OPTAB_CD): Add widen add, sub optabs.
	* tree-core.h (ECF_MULTI): Flag to indicate if a function decays
	into hi/lo parts.
	* tree-vect-patterns.cc (vect_recog_widen_op_pattern): Support
	patterns with a hi/lo split.
	(vect_recog_widen_plus_pattern): Refactor to return
	IFN_VEC_WIDEN_PLUS.
	(vect_recog_widen_minus_pattern): Refactor to return new
	IFN_VEC_WIDEN_MINUS.
	* tree-vect-stmts.cc (vectorizable_conversion): Add widen plus/minus
	ifn support.
	(supportable_widening_operation): Add widen plus/minus ifn support.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/vect-widen-add.c: Test that new
	IFN_VEC_WIDEN_PLUS is being used.
	* gcc.target/aarch64/vect-widen-sub.c: Test that new
	IFN_VEC_WIDEN_MINUS is being used.
---
 gcc/internal-fn.cc                            | 107 ++++++++++++++++++
 gcc/internal-fn.def                           |  23 ++++
 gcc/internal-fn.h                             |   7 ++
 gcc/optabs.cc                                 |  12 +-
 gcc/optabs.def                                |   2 +
 .../gcc.target/aarch64/vect-widen-add.c       |   4 +-
 .../gcc.target/aarch64/vect-widen-sub.c       |   4 +-
 gcc/tree-core.h                               |   4 +
 gcc/tree-vect-patterns.cc                     |  37 ++++--
 gcc/tree-vect-stmts.cc                        |  68 +++++++++--
 10 files changed, 249 insertions(+), 19 deletions(-)

diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 91588f8bc9f7c3fe2bac17f3c4e6078cddb7b4d2..b2cb3e508027a84e4456d676d78b27b6c04b7b61 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -17,6 +17,7 @@ You should have received a copy of the GNU General Public License
 along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
+#define INCLUDE_MAP
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
@@ -70,6 +71,26 @@ const int internal_fn_flags_array[] = {
   0
 };
 
+const enum internal_fn internal_fn_hilo_keys_array[] = {
+#undef DEF_INTERNAL_OPTAB_MULTI_FN
+#define DEF_INTERNAL_OPTAB_MULTI_FN(NAME, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \
+  IFN_##NAME##_LO, \
+  IFN_##NAME##_HI,
+#include "internal-fn.def"
+  IFN_LAST
+#undef DEF_INTERNAL_OPTAB_MULTI_FN
+};
+
+const optab internal_fn_hilo_values_array[] = {
+#undef DEF_INTERNAL_OPTAB_MULTI_FN
+#define DEF_INTERNAL_OPTAB_MULTI_FN(NAME, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \
+  SOPTAB##_lo_optab, UOPTAB##_lo_optab, \
+  SOPTAB##_hi_optab, UOPTAB##_hi_optab,
+#include "internal-fn.def"
+  unknown_optab, unknown_optab
+#undef DEF_INTERNAL_OPTAB_MULTI_FN
+};
+
 /* Return the internal function called NAME, or IFN_LAST if there's
    no such function.  */
 
@@ -90,6 +111,62 @@ lookup_internal_fn (const char *name)
   return entry ? *entry : IFN_LAST;
 }
 
+static int
+ifn_cmp (const void *a_, const void *b_)
+{
+  typedef std::pair<enum internal_fn, unsigned> ifn_pair;
+  auto *a = (const std::pair<ifn_pair, optab> *)a_;
+  auto *b = (const std::pair<ifn_pair, optab> *)b_;
+  return (int) (a->first.first) - (b->first.first);
+}
+
+/* Return the optab belonging to the given internal function NAME for the given
+   SIGN or unknown_optab.  */
+
+optab
+lookup_multi_ifn_optab (enum internal_fn fn, unsigned sign)
+{
+  typedef std::pair<enum internal_fn, unsigned> ifn_pair;
+  typedef auto_vec <std::pair<ifn_pair, optab>>fn_to_optab_map_type;
+  static fn_to_optab_map_type *fn_to_optab_map;
+
+  if (!fn_to_optab_map)
+    {
+      unsigned num
+	= sizeof (internal_fn_hilo_keys_array) / sizeof (enum internal_fn);
+      fn_to_optab_map = new fn_to_optab_map_type ();
+      for (unsigned int i = 0; i < num - 1; ++i)
+	{
+	  enum internal_fn fn = internal_fn_hilo_keys_array[i];
+	  optab v1 = internal_fn_hilo_values_array[2*i];
+	  optab v2 = internal_fn_hilo_values_array[2*i + 1];
+	  ifn_pair key1 (fn, 0);
+	  fn_to_optab_map->safe_push ({key1, v1});
+	  ifn_pair key2 (fn, 1);
+	  fn_to_optab_map->safe_push ({key2, v2});
+	}
+	fn_to_optab_map->qsort (ifn_cmp);
+    }
+
+  ifn_pair new_pair (fn, sign ? 1 : 0);
+  optab tmp;
+  std::pair<ifn_pair,optab> pair_wrap (new_pair, tmp);
+  auto entry = fn_to_optab_map->bsearch (&pair_wrap, ifn_cmp);
+  return entry != fn_to_optab_map->end () ? entry->second : unknown_optab;
+}
+
+extern void
+lookup_multi_internal_fn (enum internal_fn ifn, enum internal_fn *lo,
+			  enum internal_fn *hi)
+{
+  int ecf_flags = internal_fn_flags (ifn);
+  gcc_assert (ecf_flags & ECF_MULTI);
+
+  *lo = internal_fn (ifn + 1);
+  *hi = internal_fn (ifn + 2);
+}
+
+
 /* Fnspec of each internal function, indexed by function number.  */
 const_tree internal_fn_fnspec_array[IFN_LAST + 1];
 
@@ -3928,6 +4005,9 @@ commutative_binary_fn_p (internal_fn fn)
     case IFN_UBSAN_CHECK_MUL:
     case IFN_ADD_OVERFLOW:
     case IFN_MUL_OVERFLOW:
+    case IFN_VEC_WIDEN_PLUS:
+    case IFN_VEC_WIDEN_PLUS_LO:
+    case IFN_VEC_WIDEN_PLUS_HI:
       return true;
 
     default:
@@ -4013,6 +4093,32 @@ set_edom_supported_p (void)
 #endif
 }
 
+#undef DEF_INTERNAL_OPTAB_MULTI_FN
+#define DEF_INTERNAL_OPTAB_MULTI_FN(CODE, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \
+  static void							\
+  expand_##CODE (internal_fn, gcall *)				\
+  {								\
+    gcc_unreachable ();						\
+  }								\
+  static void							\
+  expand_##CODE##_LO (internal_fn fn, gcall *stmt)		\
+  {								\
+    tree ty = TREE_TYPE (gimple_get_lhs (stmt));		\
+    if (!TYPE_UNSIGNED (ty))					\
+      expand_##TYPE##_optab_fn (fn, stmt, SOPTAB##_lo##_optab);	\
+    else							\
+      expand_##TYPE##_optab_fn (fn, stmt, UOPTAB##_lo##_optab);	\
+  }								\
+  static void							\
+  expand_##CODE##_HI (internal_fn fn, gcall *stmt)		\
+  {								\
+    tree ty = TREE_TYPE (gimple_get_lhs (stmt));		\
+    if (!TYPE_UNSIGNED (ty))					\
+      expand_##TYPE##_optab_fn (fn, stmt, SOPTAB##_hi##_optab);	\
+    else							\
+      expand_##TYPE##_optab_fn (fn, stmt, UOPTAB##_hi##_optab);	\
+  }
+
 #define DEF_INTERNAL_OPTAB_FN(CODE, FLAGS, OPTAB, TYPE) \
   static void						\
   expand_##CODE (internal_fn fn, gcall *stmt)		\
@@ -4029,6 +4135,7 @@ set_edom_supported_p (void)
     expand_##TYPE##_optab_fn (fn, stmt, which_optab);			\
   }
 #include "internal-fn.def"
+#undef DEF_INTERNAL_OPTAB_MULTI_FN
 
 /* Routines to expand each internal function, indexed by function number.
    Each routine has the prototype:
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index d2d550d358606022b1cb44fa842f06e0be507bc3..4635a9c8af9ad27bb05d7510388d0fe2270428e5 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -82,6 +82,13 @@ along with GCC; see the file COPYING3.  If not see
    says that the function extends the C-level BUILT_IN_<NAME>{,L,LL,IMAX}
    group of functions to any integral mode (including vector modes).
 
+   DEF_INTERNAL_OPTAB_MULTI_FN is like DEF_INTERNAL_OPTAB_FN except it
+   provides convenience wrappers for defining conversions that require a
+   hi/lo split, like widening and narrowing operations.  Each definition
+   for <NAME> will require an optab named <OPTAB> and two other optabs that
+   you specify for signed and unsigned.
+
+
    Each entry must have a corresponding expander of the form:
 
      void expand_NAME (gimple_call stmt)
@@ -120,6 +127,14 @@ along with GCC; see the file COPYING3.  If not see
   DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE)
 #endif
 
+#ifndef DEF_INTERNAL_OPTAB_MULTI_FN
+#define DEF_INTERNAL_OPTAB_MULTI_FN(NAME, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \
+  DEF_INTERNAL_OPTAB_FN (NAME, FLAGS | ECF_MULTI, OPTAB, TYPE) \
+  DEF_INTERNAL_OPTAB_FN (NAME ## _LO, FLAGS, unknown, TYPE) \
+  DEF_INTERNAL_OPTAB_FN (NAME ## _HI, FLAGS, unknown, TYPE)
+#endif
+
+
 DEF_INTERNAL_OPTAB_FN (MASK_LOAD, ECF_PURE, maskload, mask_load)
 DEF_INTERNAL_OPTAB_FN (LOAD_LANES, ECF_CONST, vec_load_lanes, load_lanes)
 DEF_INTERNAL_OPTAB_FN (MASK_LOAD_LANES, ECF_PURE,
@@ -292,6 +307,14 @@ DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT270, ECF_CONST, cadd270, binary)
 DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL, ECF_CONST, cmul, binary)
 DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL_CONJ, ECF_CONST, cmul_conj, binary)
 DEF_INTERNAL_OPTAB_FN (VEC_ADDSUB, ECF_CONST, vec_addsub, binary)
+DEF_INTERNAL_OPTAB_MULTI_FN (VEC_WIDEN_PLUS,
+			     ECF_CONST | ECF_WIDEN | ECF_NOTHROW,
+			     vec_widen_add, vec_widen_saddl, vec_widen_uaddl,
+			     binary)
+DEF_INTERNAL_OPTAB_MULTI_FN (VEC_WIDEN_MINUS,
+			     ECF_CONST | ECF_WIDEN | ECF_NOTHROW,
+			     vec_widen_sub, vec_widen_ssubl, vec_widen_usubl,
+			     binary)
 DEF_INTERNAL_OPTAB_FN (VEC_FMADDSUB, ECF_CONST, vec_fmaddsub, ternary)
 DEF_INTERNAL_OPTAB_FN (VEC_FMSUBADD, ECF_CONST, vec_fmsubadd, ternary)
 
diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h
index 23c014a963c4d72da92c763db87ee486a2adb485..b35de19747d251d19dc13de1e0323368bd2ebdf2 100644
--- a/gcc/internal-fn.h
+++ b/gcc/internal-fn.h
@@ -20,6 +20,10 @@ along with GCC; see the file COPYING3.  If not see
 #ifndef GCC_INTERNAL_FN_H
 #define GCC_INTERNAL_FN_H
 
+#include "insn-codes.h"
+#include "insn-opinit.h"
+
+
 /* INTEGER_CST values for IFN_UNIQUE function arg-0.
 
    UNSPEC: Undifferentiated UNIQUE.
@@ -112,6 +116,9 @@ internal_fn_name (enum internal_fn fn)
 }
 
 extern internal_fn lookup_internal_fn (const char *);
+extern optab lookup_multi_ifn_optab (enum internal_fn, unsigned);
+extern void lookup_multi_internal_fn (enum internal_fn, enum internal_fn *,
+				      enum internal_fn *);
 
 /* Return the ECF_* flags for function FN.  */
 
diff --git a/gcc/optabs.cc b/gcc/optabs.cc
index a50dd798f2a454ac54e247f3e6cbab17577ea304..f9be369a6c5b99de5bbad664a11364d1c2cc4b95 100644
--- a/gcc/optabs.cc
+++ b/gcc/optabs.cc
@@ -1312,7 +1312,17 @@ commutative_optab_p (optab binoptab)
 	  || binoptab == smul_widen_optab
 	  || binoptab == umul_widen_optab
 	  || binoptab == smul_highpart_optab
-	  || binoptab == umul_highpart_optab);
+	  || binoptab == umul_highpart_optab
+	  || binoptab == vec_widen_add_optab
+	  || binoptab == vec_widen_sub_optab
+	  || binoptab == vec_widen_saddl_hi_optab
+	  || binoptab == vec_widen_saddl_lo_optab
+	  || binoptab == vec_widen_ssubl_hi_optab
+	  || binoptab == vec_widen_ssubl_lo_optab
+	  || binoptab == vec_widen_uaddl_hi_optab
+	  || binoptab == vec_widen_uaddl_lo_optab
+	  || binoptab == vec_widen_usubl_hi_optab
+	  || binoptab == vec_widen_usubl_lo_optab);
 }
 
 /* X is to be used in mode MODE as operand OPN to BINOPTAB.  If we're
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 801310ebaa7d469520809bb7efed6820f8eb866b..a7881dcb49e4ef07d8f07aa31214eb3a7a944e2e 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -78,6 +78,8 @@ OPTAB_CD(smsub_widen_optab, "msub$b$a4")
 OPTAB_CD(umsub_widen_optab, "umsub$b$a4")
 OPTAB_CD(ssmsub_widen_optab, "ssmsub$b$a4")
 OPTAB_CD(usmsub_widen_optab, "usmsub$a$b4")
+OPTAB_CD(vec_widen_add_optab, "add$a$b3")
+OPTAB_CD(vec_widen_sub_optab, "sub$a$b3")
 OPTAB_CD(vec_load_lanes_optab, "vec_load_lanes$a$b")
 OPTAB_CD(vec_store_lanes_optab, "vec_store_lanes$a$b")
 OPTAB_CD(vec_mask_load_lanes_optab, "vec_mask_load_lanes$a$b")
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
index 220bd9352a4c7acd2e3713e441d74898d3e92b30..7037673d32bd780e1c9b58a51e58e2bac3b30b7e 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
+++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O3 -save-temps" } */
+/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */
 #include <stdint.h>
 #include <string.h>
 
@@ -86,6 +86,8 @@ main()
     return 0;
 }
 
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_LO" "vect"   } } */
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_HI" "vect"   } } */
 /* { dg-final { scan-assembler-times {\tuaddl\t} 1} } */
 /* { dg-final { scan-assembler-times {\tuaddl2\t} 1} } */
 /* { dg-final { scan-assembler-times {\tsaddl\t} 1} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c
index a2bed63affbd091977df95a126da1f5b8c1d41d2..83bc1edb6105f47114b665e24a13e6194b2179a2 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c
+++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O3 -save-temps" } */
+/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */
 #include <stdint.h>
 #include <string.h>
 
@@ -86,6 +86,8 @@ main()
     return 0;
 }
 
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_LO" "vect"   } } */
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_HI" "vect"   } } */
 /* { dg-final { scan-assembler-times {\tusubl\t} 1} } */
 /* { dg-final { scan-assembler-times {\tusubl2\t} 1} } */
 /* { dg-final { scan-assembler-times {\tssubl\t} 1} } */
diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index cff6211080bced0bffb39e98039a6550897acf77..d0c8b812cfb9c3ac83bf25fff0431b08cb7d823d 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -99,6 +99,10 @@ struct die_struct;
 /* Nonzero if this is a widening function.  */
 #define ECF_WIDEN		  (1 << 16)
 
+/* Nonzero if this is a function that decomposes into a lo/hi operation.  */
+#define ECF_MULTI		  (1 << 17)
+
+
 /* Call argument flags.  */
 
 /* Nonzero if the argument is not used by the function.  */
diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index d892158f024fc045b897aebe76f2e2b66211cf83..62ca28d725ed4ac8d7e4d493119e40772a0fbac6 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -1351,14 +1351,16 @@ static gimple *
 vect_recog_widen_op_pattern (vec_info *vinfo,
 			     stmt_vec_info last_stmt_info, tree *type_out,
 			     tree_code orig_code, code_helper wide_code,
-			     bool shift_p, const char *name)
+			     bool shift_p, const char *name,
+			     enum optab_subtype *subtype = NULL)
 {
   gimple *last_stmt = last_stmt_info->stmt;
 
   vect_unpromoted_value unprom[2];
   tree half_type;
   if (!vect_widened_op_tree (vinfo, last_stmt_info, orig_code, orig_code,
-			     shift_p, 2, unprom, &half_type))
+			     shift_p, 2, unprom, &half_type, subtype))
+
     return NULL;
 
   /* Pattern detected.  */
@@ -1424,6 +1426,20 @@ vect_recog_widen_op_pattern (vec_info *vinfo,
 			      type, pattern_stmt, vecctype);
 }
 
+static gimple *
+vect_recog_widen_op_pattern (vec_info *vinfo,
+			     stmt_vec_info last_stmt_info, tree *type_out,
+			     tree_code orig_code, internal_fn wide_ifn,
+			     bool shift_p, const char *name,
+			     enum optab_subtype *subtype = NULL)
+{
+  combined_fn ifn = as_combined_fn (wide_ifn);
+  return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
+				      orig_code, ifn, shift_p, name,
+				      subtype);
+}
+
+
 /* Try to detect multiplication on widened inputs, converting MULT_EXPR
    to WIDEN_MULT_EXPR.  See vect_recog_widen_op_pattern for details.  */
 
@@ -1437,26 +1453,30 @@ vect_recog_widen_mult_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
 }
 
 /* Try to detect addition on widened inputs, converting PLUS_EXPR
-   to WIDEN_PLUS_EXPR.  See vect_recog_widen_op_pattern for details.  */
+   to IFN_VEC_WIDEN_PLUS.  See vect_recog_widen_op_pattern for details.  */
 
 static gimple *
 vect_recog_widen_plus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
 			       tree *type_out)
 {
+  enum optab_subtype subtype;
   return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
-				      PLUS_EXPR, WIDEN_PLUS_EXPR, false,
-				      "vect_recog_widen_plus_pattern");
+				      PLUS_EXPR, IFN_VEC_WIDEN_PLUS,
+				      false, "vect_recog_widen_plus_pattern",
+				      &subtype);
 }
 
 /* Try to detect subtraction on widened inputs, converting MINUS_EXPR
-   to WIDEN_MINUS_EXPR.  See vect_recog_widen_op_pattern for details.  */
+   to IFN_VEC_WIDEN_MINUS.  See vect_recog_widen_op_pattern for details.  */
 static gimple *
 vect_recog_widen_minus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
 			       tree *type_out)
 {
+  enum optab_subtype subtype;
   return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
-				      MINUS_EXPR, WIDEN_MINUS_EXPR, false,
-				      "vect_recog_widen_minus_pattern");
+				      MINUS_EXPR, IFN_VEC_WIDEN_MINUS,
+				      false, "vect_recog_widen_minus_pattern",
+				      &subtype);
 }
 
 /* Function vect_recog_popcount_pattern
@@ -5629,6 +5649,7 @@ static vect_recog_func vect_vect_recog_func_ptrs[] = {
   { vect_recog_mask_conversion_pattern, "mask_conversion" },
   { vect_recog_widen_plus_pattern, "widen_plus" },
   { vect_recog_widen_minus_pattern, "widen_minus" },
+  /* These must come after the double widening ones.  */
 };
 
 const unsigned int NUM_PATTERNS = ARRAY_SIZE (vect_vect_recog_func_ptrs);
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index d6aabb873c86ab8ff0bae41c7f6c3bad34d583c5..6fa0669fdfc8630842b3f9f32f4b4a253e79bb92 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -4903,7 +4903,9 @@ vectorizable_conversion (vec_info *vinfo,
   bool widen_arith = (code == WIDEN_PLUS_EXPR
 		 || code == WIDEN_MINUS_EXPR
 		 || code == WIDEN_MULT_EXPR
-		 || code == WIDEN_LSHIFT_EXPR);
+		 || code == WIDEN_LSHIFT_EXPR
+		 || code == IFN_VEC_WIDEN_PLUS
+		 || code == IFN_VEC_WIDEN_MINUS);
 
   if (!widen_arith
       && !CONVERT_EXPR_CODE_P (code)
@@ -4953,7 +4955,9 @@ vectorizable_conversion (vec_info *vinfo,
       gcc_assert (code == WIDEN_MULT_EXPR
 		  || code == WIDEN_LSHIFT_EXPR
 		  || code == WIDEN_PLUS_EXPR
-		  || code == WIDEN_MINUS_EXPR);
+		  || code == WIDEN_MINUS_EXPR
+		  || code == IFN_VEC_WIDEN_PLUS
+		  || code == IFN_VEC_WIDEN_MINUS);
 
 
       op1 = is_gimple_assign (stmt) ? gimple_assign_rhs2 (stmt) :
@@ -12130,14 +12134,62 @@ supportable_widening_operation (vec_info *vinfo,
       optab1 = vec_unpacks_sbool_lo_optab;
       optab2 = vec_unpacks_sbool_hi_optab;
     }
-  else
-    {
-      optab1 = optab_for_tree_code (c1.safe_as_tree_code (), vectype,
-				    optab_default);
-      optab2 = optab_for_tree_code (c2.safe_as_tree_code (), vectype,
-				    optab_default);
+
+  if (code.is_fn_code ())
+     {
+      internal_fn ifn = as_internal_fn (code.safe_as_fn_code ());
+      int ecf_flags = internal_fn_flags (ifn);
+      gcc_assert (ecf_flags & ECF_MULTI);
+
+      switch (code.safe_as_fn_code ())
+	{
+	case CFN_VEC_WIDEN_PLUS:
+	  break;
+	case CFN_VEC_WIDEN_MINUS:
+	  break;
+	case CFN_LAST:
+	default:
+	  return false;
+	}
+
+      internal_fn lo, hi;
+      lookup_multi_internal_fn (ifn, &lo, &hi);
+      *code1 = as_combined_fn (lo);
+      *code2 = as_combined_fn (hi);
+      optab1 = lookup_multi_ifn_optab (lo, !TYPE_UNSIGNED (vectype));
+      optab2 = lookup_multi_ifn_optab (hi, !TYPE_UNSIGNED (vectype));
     }
 
+  if (code.is_tree_code ())
+  {
+    if (code == FIX_TRUNC_EXPR)
+      {
+	/* The signedness is determined from output operand.  */
+	optab1 = optab_for_tree_code (c1.safe_as_tree_code (), vectype_out,
+				      optab_default);
+	optab2 = optab_for_tree_code (c2.safe_as_tree_code (), vectype_out,
+				      optab_default);
+      }
+    else if (CONVERT_EXPR_CODE_P (code.safe_as_tree_code ())
+	     && VECTOR_BOOLEAN_TYPE_P (wide_vectype)
+	     && VECTOR_BOOLEAN_TYPE_P (vectype)
+	     && TYPE_MODE (wide_vectype) == TYPE_MODE (vectype)
+	     && SCALAR_INT_MODE_P (TYPE_MODE (vectype)))
+      {
+	/* If the input and result modes are the same, a different optab
+	   is needed where we pass in the number of units in vectype.  */
+	optab1 = vec_unpacks_sbool_lo_optab;
+	optab2 = vec_unpacks_sbool_hi_optab;
+      }
+    else
+      {
+	optab1 = optab_for_tree_code (c1.safe_as_tree_code (), vectype,
+				      optab_default);
+	optab2 = optab_for_tree_code (c2.safe_as_tree_code (), vectype,
+				      optab_default);
+      }
+  }
+
   if (!optab1 || !optab2)
     return false;
 
-- 
2.17.1


[-- Attachment #4: 0003-Remove-widen_plus-minus_expr-tree-codes.patch --]
[-- Type: application/octet-stream, Size: 19552 bytes --]

From 60664218e6e59510f02fb64b49a236e9e5b26c9f Mon Sep 17 00:00:00 2001
From: Joel Hutton <joel.hutton@arm.com>
Date: Fri, 28 Jan 2022 12:04:44 +0000
Subject: [PATCH 3/3] Remove widen_plus/minus_expr tree codes

This patch removes the old widen plus/minus tree codes which have been
replaced by internal functions.

gcc/ChangeLog:

	* doc/generic.texi: Remove old tree codes.
	* expr.cc (expand_expr_real_2): Remove old tree code cases.
	* gimple-pretty-print.cc (dump_binary_rhs): Remove old tree code
    cases.
	* optabs-tree.cc (optab_for_tree_code): Remove old tree code cases.
	(supportable_half_widening_operation): Remove old tree code cases.
	* tree-cfg.cc (verify_gimple_assign_binary): Remove old tree code
    cases.
	* tree-inline.cc (estimate_operator_cost): Remove old tree code
    cases.
	* tree-pretty-print.cc (dump_generic_node): Remove tree code definition.
	(op_symbol_code): Remove old tree code
    cases.
	* tree-vect-data-refs.cc (vect_get_smallest_scalar_type): Remove old tree code
    cases.
	(vect_analyze_data_ref_accesses): Remove old tree code
    cases.
	* tree-vect-generic.cc (expand_vector_operations_1): Remove old tree code
    cases.
	* tree-vect-patterns.cc (vect_widened_op_tree): Refactor to replace
    usage in vect_recog_sad_pattern.
	(vect_recog_sad_pattern): Replace tree code widening pattern with
    internal function.
	(vect_recog_average_pattern): Replace tree code widening pattern
    with internal function.
	* tree-vect-stmts.cc (vectorizable_conversion): Remove old tree code
    cases.
	(supportable_widening_operation): Remove old tree code
    cases.
	* tree.def (WIDEN_PLUS_EXPR): Remove tree code definition.
	(WIDEN_MINUS_EXPR): Remove tree code definition.
	(VEC_WIDEN_PLUS_HI_EXPR): Remove tree code definition.
	(VEC_WIDEN_PLUS_LO_EXPR): Remove tree code definition.
	(VEC_WIDEN_MINUS_HI_EXPR): Remove tree code definition.
	(VEC_WIDEN_MINUS_LO_EXPR): Remove tree code definition.
---
 gcc/doc/generic.texi       | 31 -------------------------------
 gcc/expr.cc                |  6 ------
 gcc/gimple-pretty-print.cc |  4 ----
 gcc/optabs-tree.cc         | 24 ------------------------
 gcc/tree-cfg.cc            |  6 ------
 gcc/tree-inline.cc         |  6 ------
 gcc/tree-pretty-print.cc   | 12 ------------
 gcc/tree-vect-data-refs.cc |  8 +++-----
 gcc/tree-vect-generic.cc   |  4 ----
 gcc/tree-vect-patterns.cc  | 36 +++++++++++++++++++++++++-----------
 gcc/tree-vect-stmts.cc     | 18 ++----------------
 gcc/tree.def               |  6 ------
 12 files changed, 30 insertions(+), 131 deletions(-)

diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi
index e5f9d1be8ea81f3da002ec3bb925590d331a2551..344045efd419b0cc3a11771acf70d2fd279c48ac 100644
--- a/gcc/doc/generic.texi
+++ b/gcc/doc/generic.texi
@@ -1811,10 +1811,6 @@ a value from @code{enum annot_expr_kind}, the third is an @code{INTEGER_CST}.
 @tindex VEC_RSHIFT_EXPR
 @tindex VEC_WIDEN_MULT_HI_EXPR
 @tindex VEC_WIDEN_MULT_LO_EXPR
-@tindex VEC_WIDEN_PLUS_HI_EXPR
-@tindex VEC_WIDEN_PLUS_LO_EXPR
-@tindex VEC_WIDEN_MINUS_HI_EXPR
-@tindex VEC_WIDEN_MINUS_LO_EXPR
 @tindex VEC_UNPACK_HI_EXPR
 @tindex VEC_UNPACK_LO_EXPR
 @tindex VEC_UNPACK_FLOAT_HI_EXPR
@@ -1861,33 +1857,6 @@ vector of @code{N/2} products. In the case of @code{VEC_WIDEN_MULT_LO_EXPR} the
 low @code{N/2} elements of the two vector are multiplied to produce the
 vector of @code{N/2} products.
 
-@item VEC_WIDEN_PLUS_HI_EXPR
-@itemx VEC_WIDEN_PLUS_LO_EXPR
-These nodes represent widening vector addition of the high and low parts of
-the two input vectors, respectively.  Their operands are vectors that contain
-the same number of elements (@code{N}) of the same integral type. The result
-is a vector that contains half as many elements, of an integral type whose size
-is twice as wide.  In the case of @code{VEC_WIDEN_PLUS_HI_EXPR} the high
-@code{N/2} elements of the two vectors are added to produce the vector of
-@code{N/2} products.  In the case of @code{VEC_WIDEN_PLUS_LO_EXPR} the low
-@code{N/2} elements of the two vectors are added to produce the vector of
-@code{N/2} products.
-
-@item VEC_WIDEN_MINUS_HI_EXPR
-@itemx VEC_WIDEN_MINUS_LO_EXPR
-These nodes represent widening vector subtraction of the high and low parts of
-the two input vectors, respectively.  Their operands are vectors that contain
-the same number of elements (@code{N}) of the same integral type. The high/low
-elements of the second vector are subtracted from the high/low elements of the
-first. The result is a vector that contains half as many elements, of an
-integral type whose size is twice as wide.  In the case of
-@code{VEC_WIDEN_MINUS_HI_EXPR} the high @code{N/2} elements of the second
-vector are subtracted from the high @code{N/2} of the first to produce the
-vector of @code{N/2} products.  In the case of
-@code{VEC_WIDEN_MINUS_LO_EXPR} the low @code{N/2} elements of the second
-vector are subtracted from the low @code{N/2} of the first to produce the
-vector of @code{N/2} products.
-
 @item VEC_UNPACK_HI_EXPR
 @itemx VEC_UNPACK_LO_EXPR
 These nodes represent unpacking of the high and low parts of the input vector,
diff --git a/gcc/expr.cc b/gcc/expr.cc
index 5d66c9f21f0ccd2eafb322eb9001f0dc873e35b4..b80385d51ba22172750d94535e04c82f75661255 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -9386,8 +9386,6 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
 					  target, unsignedp);
       return target;
 
-    case WIDEN_PLUS_EXPR:
-    case WIDEN_MINUS_EXPR:
     case WIDEN_MULT_EXPR:
       /* If first operand is constant, swap them.
 	 Thus the following special case checks need only
@@ -10165,10 +10163,6 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
 	return temp;
       }
 
-    case VEC_WIDEN_PLUS_HI_EXPR:
-    case VEC_WIDEN_PLUS_LO_EXPR:
-    case VEC_WIDEN_MINUS_HI_EXPR:
-    case VEC_WIDEN_MINUS_LO_EXPR:
     case VEC_WIDEN_MULT_HI_EXPR:
     case VEC_WIDEN_MULT_LO_EXPR:
     case VEC_WIDEN_MULT_EVEN_EXPR:
diff --git a/gcc/gimple-pretty-print.cc b/gcc/gimple-pretty-print.cc
index ebd87b20a0adc080c4a8f9429e75f49b96e72f9a..2a1a5b7f811ca341e8ee7e85a9701d3a37ff80bf 100644
--- a/gcc/gimple-pretty-print.cc
+++ b/gcc/gimple-pretty-print.cc
@@ -459,10 +459,6 @@ dump_binary_rhs (pretty_printer *buffer, const gassign *gs, int spc,
     case VEC_PACK_FLOAT_EXPR:
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
-    case VEC_WIDEN_PLUS_HI_EXPR:
-    case VEC_WIDEN_PLUS_LO_EXPR:
-    case VEC_WIDEN_MINUS_HI_EXPR:
-    case VEC_WIDEN_MINUS_LO_EXPR:
     case VEC_SERIES_EXPR:
       for (p = get_tree_code_name (code); *p; p++)
 	pp_character (buffer, TOUPPER (*p));
diff --git a/gcc/optabs-tree.cc b/gcc/optabs-tree.cc
index 8383fe820b080f6d66f83dcf3b77d3c9f869f4bc..2f5f93dc6624f86f6b5618cf6e7aa2b508053a64 100644
--- a/gcc/optabs-tree.cc
+++ b/gcc/optabs-tree.cc
@@ -190,22 +190,6 @@ optab_for_tree_code (enum tree_code code, const_tree type,
       return (TYPE_UNSIGNED (type)
 	      ? vec_widen_ushiftl_lo_optab : vec_widen_sshiftl_lo_optab);
 
-    case VEC_WIDEN_PLUS_LO_EXPR:
-      return (TYPE_UNSIGNED (type)
-	      ? vec_widen_uaddl_lo_optab : vec_widen_saddl_lo_optab);
-
-    case VEC_WIDEN_PLUS_HI_EXPR:
-      return (TYPE_UNSIGNED (type)
-	      ? vec_widen_uaddl_hi_optab : vec_widen_saddl_hi_optab);
-
-    case VEC_WIDEN_MINUS_LO_EXPR:
-      return (TYPE_UNSIGNED (type)
-	      ? vec_widen_usubl_lo_optab : vec_widen_ssubl_lo_optab);
-
-    case VEC_WIDEN_MINUS_HI_EXPR:
-      return (TYPE_UNSIGNED (type)
-	      ? vec_widen_usubl_hi_optab : vec_widen_ssubl_hi_optab);
-
     case VEC_UNPACK_HI_EXPR:
       return (TYPE_UNSIGNED (type)
 	      ? vec_unpacku_hi_optab : vec_unpacks_hi_optab);
@@ -312,8 +296,6 @@ optab_for_tree_code (enum tree_code code, const_tree type,
    'hi'/'lo' pair using codes such as VEC_WIDEN_MINUS_HI/LO.
 
    Supported widening operations:
-    WIDEN_MINUS_EXPR
-    WIDEN_PLUS_EXPR
     WIDEN_MULT_EXPR
     WIDEN_LSHIFT_EXPR
 
@@ -345,12 +327,6 @@ supportable_half_widening_operation (enum tree_code code, tree vectype_out,
     case WIDEN_LSHIFT_EXPR:
       *code1 = LSHIFT_EXPR;
       break;
-    case WIDEN_MINUS_EXPR:
-      *code1 = MINUS_EXPR;
-      break;
-    case WIDEN_PLUS_EXPR:
-      *code1 = PLUS_EXPR;
-      break;
     case WIDEN_MULT_EXPR:
       *code1 = MULT_EXPR;
       break;
diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
index bfcb1425f7e2e46e3d525808adda11560041dd68..757c6c73e351c13bc6695699d9f449530546f70f 100644
--- a/gcc/tree-cfg.cc
+++ b/gcc/tree-cfg.cc
@@ -3951,8 +3951,6 @@ verify_gimple_assign_binary (gassign *stmt)
         return false;
       }
 
-    case WIDEN_PLUS_EXPR:
-    case WIDEN_MINUS_EXPR:
     case PLUS_EXPR:
     case MINUS_EXPR:
       {
@@ -4073,10 +4071,6 @@ verify_gimple_assign_binary (gassign *stmt)
         return false;
       }
 
-    case VEC_WIDEN_MINUS_HI_EXPR:
-    case VEC_WIDEN_MINUS_LO_EXPR:
-    case VEC_WIDEN_PLUS_HI_EXPR:
-    case VEC_WIDEN_PLUS_LO_EXPR:
     case VEC_WIDEN_MULT_HI_EXPR:
     case VEC_WIDEN_MULT_LO_EXPR:
     case VEC_WIDEN_MULT_EVEN_EXPR:
diff --git a/gcc/tree-inline.cc b/gcc/tree-inline.cc
index 043e1d5987a4c4b0159109dafb85a805ca828c1e..c0bebb7f4de36838341ed62389ad0e2b79f03034 100644
--- a/gcc/tree-inline.cc
+++ b/gcc/tree-inline.cc
@@ -4288,8 +4288,6 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
 
     case REALIGN_LOAD_EXPR:
 
-    case WIDEN_PLUS_EXPR:
-    case WIDEN_MINUS_EXPR:
     case WIDEN_SUM_EXPR:
     case WIDEN_MULT_EXPR:
     case DOT_PROD_EXPR:
@@ -4298,10 +4296,6 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
     case WIDEN_MULT_MINUS_EXPR:
     case WIDEN_LSHIFT_EXPR:
 
-    case VEC_WIDEN_PLUS_HI_EXPR:
-    case VEC_WIDEN_PLUS_LO_EXPR:
-    case VEC_WIDEN_MINUS_HI_EXPR:
-    case VEC_WIDEN_MINUS_LO_EXPR:
     case VEC_WIDEN_MULT_HI_EXPR:
     case VEC_WIDEN_MULT_LO_EXPR:
     case VEC_WIDEN_MULT_EVEN_EXPR:
diff --git a/gcc/tree-pretty-print.cc b/gcc/tree-pretty-print.cc
index bfabe9e76279d7c3383b684ed61cc92228de4500..0ca8802576656f098e60cb77fa4312d1375ff3f0 100644
--- a/gcc/tree-pretty-print.cc
+++ b/gcc/tree-pretty-print.cc
@@ -2846,8 +2846,6 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags,
       break;
 
       /* Binary arithmetic and logic expressions.  */
-    case WIDEN_PLUS_EXPR:
-    case WIDEN_MINUS_EXPR:
     case WIDEN_SUM_EXPR:
     case WIDEN_MULT_EXPR:
     case MULT_EXPR:
@@ -3811,10 +3809,6 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags,
     case VEC_SERIES_EXPR:
     case VEC_WIDEN_MULT_HI_EXPR:
     case VEC_WIDEN_MULT_LO_EXPR:
-    case VEC_WIDEN_PLUS_HI_EXPR:
-    case VEC_WIDEN_PLUS_LO_EXPR:
-    case VEC_WIDEN_MINUS_HI_EXPR:
-    case VEC_WIDEN_MINUS_LO_EXPR:
     case VEC_WIDEN_MULT_EVEN_EXPR:
     case VEC_WIDEN_MULT_ODD_EXPR:
     case VEC_WIDEN_LSHIFT_HI_EXPR:
@@ -4332,12 +4326,6 @@ op_symbol_code (enum tree_code code)
     case WIDEN_LSHIFT_EXPR:
       return "w<<";
 
-    case WIDEN_PLUS_EXPR:
-      return "w+";
-
-    case WIDEN_MINUS_EXPR:
-      return "w-";
-
     case POINTER_PLUS_EXPR:
       return "+";
 
diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
index d20a10a1524164eef788ab4b88ba57c7a09c3387..98dd56ff022233ccead36a1f5a5e896e352f9f5b 100644
--- a/gcc/tree-vect-data-refs.cc
+++ b/gcc/tree-vect-data-refs.cc
@@ -136,8 +136,6 @@ vect_get_smallest_scalar_type (stmt_vec_info stmt_info, tree scalar_type)
 	  || gimple_assign_rhs_code (assign) == WIDEN_SUM_EXPR
 	  || gimple_assign_rhs_code (assign) == WIDEN_MULT_EXPR
 	  || gimple_assign_rhs_code (assign) == WIDEN_LSHIFT_EXPR
-	  || gimple_assign_rhs_code (assign) == WIDEN_PLUS_EXPR
-	  || gimple_assign_rhs_code (assign) == WIDEN_MINUS_EXPR
 	  || gimple_assign_rhs_code (assign) == FLOAT_EXPR)
 	{
 	  tree rhs_type = TREE_TYPE (gimple_assign_rhs1 (assign));
@@ -3172,8 +3170,8 @@ vect_analyze_data_ref_accesses (vec_info *vinfo,
 	    break;
 
 	  /* Check that the DR_INITs are compile-time constants.  */
-	  if (!tree_fits_shwi_p (DR_INIT (dra))
-	      || !tree_fits_shwi_p (DR_INIT (drb)))
+	  if (TREE_CODE (DR_INIT (dra)) != INTEGER_CST
+	      || TREE_CODE (DR_INIT (drb)) != INTEGER_CST)
 	    break;
 
 	  /* Different .GOMP_SIMD_LANE calls still give the same lane,
@@ -3225,7 +3223,7 @@ vect_analyze_data_ref_accesses (vec_info *vinfo,
 		  unsigned HOST_WIDE_INT step
 		    = absu_hwi (tree_to_shwi (DR_STEP (dra)));
 		  if (step != 0
-		      && step <= ((unsigned HOST_WIDE_INT)init_b - init_a))
+		      && step <= (unsigned HOST_WIDE_INT)(init_b - init_a))
 		    break;
 		}
 	    }
diff --git a/gcc/tree-vect-generic.cc b/gcc/tree-vect-generic.cc
index 350129555a0c71c0896c4f1003163f3b3557c11b..066f05873118c2288c90604e6287c91ef9aed72b 100644
--- a/gcc/tree-vect-generic.cc
+++ b/gcc/tree-vect-generic.cc
@@ -2209,10 +2209,6 @@ expand_vector_operations_1 (gimple_stmt_iterator *gsi,
      arguments, not the widened result.  VEC_UNPACK_FLOAT_*_EXPR is
      calculated in the same way above.  */
   if (code == WIDEN_SUM_EXPR
-      || code == VEC_WIDEN_PLUS_HI_EXPR
-      || code == VEC_WIDEN_PLUS_LO_EXPR
-      || code == VEC_WIDEN_MINUS_HI_EXPR
-      || code == VEC_WIDEN_MINUS_LO_EXPR
       || code == VEC_WIDEN_MULT_HI_EXPR
       || code == VEC_WIDEN_MULT_LO_EXPR
       || code == VEC_WIDEN_MULT_EVEN_EXPR
diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index 62ca28d725ed4ac8d7e4d493119e40772a0fbac6..9cd3989656c024b1d0394b2fcde6f6d774dff74e 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -559,21 +559,29 @@ vect_joust_widened_type (tree type, tree new_type, tree *common_type)
 
 static unsigned int
 vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code,
-		      tree_code widened_code, bool shift_p,
+		      code_helper widened_code, bool shift_p,
 		      unsigned int max_nops,
 		      vect_unpromoted_value *unprom, tree *common_type,
 		      enum optab_subtype *subtype = NULL)
 {
   /* Check for an integer operation with the right code.  */
-  gassign *assign = dyn_cast <gassign *> (stmt_info->stmt);
-  if (!assign)
+  gimple* stmt = stmt_info->stmt;
+  if (!(is_gimple_assign (stmt) || is_gimple_call (stmt)))
     return 0;
 
-  tree_code rhs_code = gimple_assign_rhs_code (assign);
-  if (rhs_code != code && rhs_code != widened_code)
+  code_helper rhs_code;
+  if (is_gimple_assign (stmt))
+    rhs_code = gimple_assign_rhs_code (stmt);
+  else
+    rhs_code = gimple_call_combined_fn (stmt);
+
+  if (rhs_code.safe_as_tree_code () != code
+      && rhs_code.get_rep () != widened_code.get_rep ())
     return 0;
 
-  tree type = TREE_TYPE (gimple_assign_lhs (assign));
+  tree lhs = is_gimple_assign (stmt) ? gimple_assign_lhs (stmt):
+				      gimple_call_lhs (stmt);
+  tree type = TREE_TYPE (lhs);
   if (!INTEGRAL_TYPE_P (type))
     return 0;
 
@@ -586,7 +594,11 @@ vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code,
     {
       vect_unpromoted_value *this_unprom = &unprom[next_op];
       unsigned int nops = 1;
-      tree op = gimple_op (assign, i + 1);
+      tree op;
+      if (is_gimple_assign (stmt))
+	op = gimple_op (stmt, i + 1);
+      else
+	op = gimple_call_arg (stmt, i);
       if (i == 1 && TREE_CODE (op) == INTEGER_CST)
 	{
 	  /* We already have a common type from earlier operands.
@@ -1299,8 +1311,9 @@ vect_recog_sad_pattern (vec_info *vinfo,
   /* FORNOW.  Can continue analyzing the def-use chain when this stmt in a phi
      inside the loop (in case we are analyzing an outer-loop).  */
   vect_unpromoted_value unprom[2];
-  if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR, WIDEN_MINUS_EXPR,
-			     false, 2, unprom, &half_type))
+  if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR,
+			     CFN_VEC_WIDEN_MINUS, false, 2, unprom,
+			     &half_type))
     return NULL;
 
   vect_pattern_detected ("vect_recog_sad_pattern", last_stmt);
@@ -2337,9 +2350,10 @@ vect_recog_average_pattern (vec_info *vinfo,
   internal_fn ifn = IFN_AVG_FLOOR;
   vect_unpromoted_value unprom[3];
   tree new_type;
+  enum optab_subtype subtype;
   unsigned int nops = vect_widened_op_tree (vinfo, plus_stmt_info, PLUS_EXPR,
-					    WIDEN_PLUS_EXPR, false, 3,
-					    unprom, &new_type);
+					    CFN_VEC_WIDEN_PLUS, false, 3,
+					    unprom, &new_type, &subtype);
   if (nops == 0)
     return NULL;
   if (nops == 3)
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 6fa0669fdfc8630842b3f9f32f4b4a253e79bb92..92b17e6d0ec18ce3d90290dba9efec5d1968264c 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -4900,9 +4900,7 @@ vectorizable_conversion (vec_info *vinfo,
   else
     return false;
 
-  bool widen_arith = (code == WIDEN_PLUS_EXPR
-		 || code == WIDEN_MINUS_EXPR
-		 || code == WIDEN_MULT_EXPR
+  bool widen_arith = (code == WIDEN_MULT_EXPR
 		 || code == WIDEN_LSHIFT_EXPR
 		 || code == IFN_VEC_WIDEN_PLUS
 		 || code == IFN_VEC_WIDEN_MINUS);
@@ -4954,8 +4952,6 @@ vectorizable_conversion (vec_info *vinfo,
     {
       gcc_assert (code == WIDEN_MULT_EXPR
 		  || code == WIDEN_LSHIFT_EXPR
-		  || code == WIDEN_PLUS_EXPR
-		  || code == WIDEN_MINUS_EXPR
 		  || code == IFN_VEC_WIDEN_PLUS
 		  || code == IFN_VEC_WIDEN_MINUS);
 
@@ -11986,7 +11982,7 @@ supportable_widening_operation (vec_info *vinfo,
   class loop *vect_loop = NULL;
   machine_mode vec_mode;
   enum insn_code icode1, icode2;
-  optab optab1, optab2;
+  optab optab1 = unknown_optab, optab2 = unknown_optab;
   tree vectype = vectype_in;
   tree wide_vectype = vectype_out;
   code_helper c1 = MAX_TREE_CODES, c2 = MAX_TREE_CODES;
@@ -12080,16 +12076,6 @@ supportable_widening_operation (vec_info *vinfo,
       c2 = VEC_WIDEN_LSHIFT_HI_EXPR;
       break;
 
-    case WIDEN_PLUS_EXPR:
-      c1 = VEC_WIDEN_PLUS_LO_EXPR;
-      c2 = VEC_WIDEN_PLUS_HI_EXPR;
-      break;
-
-    case WIDEN_MINUS_EXPR:
-      c1 = VEC_WIDEN_MINUS_LO_EXPR;
-      c2 = VEC_WIDEN_MINUS_HI_EXPR;
-      break;
-
     CASE_CONVERT:
       c1 = VEC_UNPACK_LO_EXPR;
       c2 = VEC_UNPACK_HI_EXPR;
diff --git a/gcc/tree.def b/gcc/tree.def
index 62650b6934b337c5d56e5393dc114173d72c9aa9..9b2dce3576440c445d3240b9ed937fe67c9a5992 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -1383,8 +1383,6 @@ DEFTREECODE (WIDEN_MULT_MINUS_EXPR, "widen_mult_minus_expr", tcc_expression, 3)
    the first argument from type t1 to type t2, and then shifting it
    by the second argument.  */
 DEFTREECODE (WIDEN_LSHIFT_EXPR, "widen_lshift_expr", tcc_binary, 2)
-DEFTREECODE (WIDEN_PLUS_EXPR, "widen_plus_expr", tcc_binary, 2)
-DEFTREECODE (WIDEN_MINUS_EXPR, "widen_minus_expr", tcc_binary, 2)
 
 /* Widening vector multiplication.
    The two operands are vectors with N elements of size S. Multiplying the
@@ -1449,10 +1447,6 @@ DEFTREECODE (VEC_PACK_FLOAT_EXPR, "vec_pack_float_expr", tcc_binary, 2)
  */
 DEFTREECODE (VEC_WIDEN_LSHIFT_HI_EXPR, "widen_lshift_hi_expr", tcc_binary, 2)
 DEFTREECODE (VEC_WIDEN_LSHIFT_LO_EXPR, "widen_lshift_lo_expr", tcc_binary, 2)
-DEFTREECODE (VEC_WIDEN_PLUS_HI_EXPR, "widen_plus_hi_expr", tcc_binary, 2)
-DEFTREECODE (VEC_WIDEN_PLUS_LO_EXPR, "widen_plus_lo_expr", tcc_binary, 2)
-DEFTREECODE (VEC_WIDEN_MINUS_HI_EXPR, "widen_minus_hi_expr", tcc_binary, 2)
-DEFTREECODE (VEC_WIDEN_MINUS_LO_EXPR, "widen_minus_lo_expr", tcc_binary, 2)
 
 /* PREDICT_EXPR.  Specify hint for branch prediction.  The
    PREDICT_EXPR_PREDICTOR specify predictor and PREDICT_EXPR_OUTCOME the
-- 
2.17.1



* RE: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns
  2022-06-30 13:20               ` Joel Hutton
@ 2022-07-12 12:32                 ` Richard Biener
  2023-03-17 10:14                   ` Andre Vieira (lists)
  0 siblings, 1 reply; 53+ messages in thread
From: Richard Biener @ 2022-07-12 12:32 UTC (permalink / raw)
  To: Joel Hutton; +Cc: Richard Sandiford, gcc-patches, Andre Simoes Dias Vieira

On Thu, 30 Jun 2022, Joel Hutton wrote:

> > We can go with a private vect_gimple_build function until we sort out the API
> > issue to unblock Tamar (I'll reply to Richards reply with further thoughts on
> > this)
> > 
> 
> Done.
> 
> > > Similarly are you ok with the use of gimple_extract_op? I would lean
> > towards using it as it is cleaner, but I don't have strong feelings.
> > 
> > I don't like using gimple_extract_op here, I think I outlined a variant that is
> > even shorter.
> > 
> 
> Done.
> 
> Updated patches attached, bootstrapped and regression tested on aarch64.
> 
> Tomorrow is my last working day at Arm, so it will likely be Andre that commits this/addresses any further comments.

First, sorry for the (repeated) delays.

In the first patch I still see ECF_WIDEN.  I don't like that: we
use things like associative_binary_fn_p, so similar predicates should
be used for widening internal functions.
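
A minimal sketch of the predicate style suggested above, in the spirit of
associative_binary_fn_p.  The enum ifn_sketch and the name
widening_ifn_sketch_p are hypothetical stand-ins for illustration only;
they are not GCC's internal_fn definitions or an existing GCC predicate.

```cpp
// Hypothetical stand-in for an internal-function enum; only the shape
// of the predicate matters here.
enum ifn_sketch
{
  IFN_SKETCH_AVG_FLOOR,
  IFN_SKETCH_VEC_WIDEN_PLUS,
  IFN_SKETCH_VEC_WIDEN_MINUS,
  IFN_SKETCH_LAST
};

// Return true if FN is one of the widening functions.  The property is
// stated once, next to the functions that have it, instead of being
// encoded as an ECF_* flag bit.
inline bool
widening_ifn_sketch_p (ifn_sketch fn)
{
  switch (fn)
    {
    case IFN_SKETCH_VEC_WIDEN_PLUS:
    case IFN_SKETCH_VEC_WIDEN_MINUS:
      return true;
    default:
      return false;
    }
}
```

A caller would then test widening_ifn_sketch_p (fn) where the patch
currently asserts on the flag.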

In the second patch you add vec_widen_{add,sub} optabs

+OPTAB_CD(vec_widen_add_optab, "add$a$b3")
+OPTAB_CD(vec_widen_sub_optab, "sub$a$b3")

but a) the names are those of regular adds, which is at least confusing
(if not wrong), b) there's no documentation for them in md.texi, and
c) nothing explains why they are necessary when we have
vec_widen_[su]{add,sub}l_optab

+      internal_fn ifn = as_internal_fn (code.safe_as_fn_code ());

asks for safe_as_internal_fn () (just complete the API, also with
safe_as_builtin_fn)

+      internal_fn lo, hi;
+      lookup_multi_internal_fn (ifn, &lo, &hi);
+      *code1 = as_combined_fn (lo);
+      *code2 = as_combined_fn (hi);

in fact this probably shows that the guarding condition should
be if (code.is_internal_fn ()) instead of if (code.is_fn_code ()).

+      optab1 = lookup_multi_ifn_optab (lo, !TYPE_UNSIGNED (vectype));
+      optab2 = lookup_multi_ifn_optab (hi, !TYPE_UNSIGNED (vectype));

this shows the two lookup_ APIs are inconsistent in having two vs. one
output; please make them consistent.  I'd say give
lookup_multi_internal_fn an enum { LO, HI } argument, returning the
result.  Given VEC_WIDEN_MULT has HI, LO, EVEN and ODD variants,
that sounds more future-proof.
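
One possible shape for the single-result lookup described above, as a
self-contained sketch.  The enums multi_ifn_sketch and multi_fn_part and
the function lookup_multi_fn_sketch are hypothetical; the real
lookup_multi_internal_fn signature in the patch differs (it takes two
output pointers).

```cpp
// Hypothetical stand-ins for the internal functions and their lo/hi
// halves.
enum multi_ifn_sketch
{
  MIFN_VEC_WIDEN_PLUS,
  MIFN_VEC_WIDEN_PLUS_LO,
  MIFN_VEC_WIDEN_PLUS_HI,
  MIFN_VEC_WIDEN_MINUS,
  MIFN_VEC_WIDEN_MINUS_LO,
  MIFN_VEC_WIDEN_MINUS_HI,
  MIFN_LAST
};

// Which part of a multi-part function is wanted.  EVEN and ODD could be
// appended later without changing the function signature.
enum multi_fn_part { PART_LO, PART_HI };

// Return the PART variant of IFN, or MIFN_LAST if IFN has no such
// variant.  One result per call, mirroring the optab lookup.
inline multi_ifn_sketch
lookup_multi_fn_sketch (multi_ifn_sketch ifn, multi_fn_part part)
{
  switch (ifn)
    {
    case MIFN_VEC_WIDEN_PLUS:
      return part == PART_LO ? MIFN_VEC_WIDEN_PLUS_LO
			     : MIFN_VEC_WIDEN_PLUS_HI;
    case MIFN_VEC_WIDEN_MINUS:
      return part == PART_LO ? MIFN_VEC_WIDEN_MINUS_LO
			     : MIFN_VEC_WIDEN_MINUS_HI;
    default:
      return MIFN_LAST;
    }
}
```

With this shape the caller asks for each half separately, in the same
way it already asks for optab1 and optab2.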

The internal_fn stuff could probably get a 2nd eye from Richard.

In the third patch I see unrelated and wrong changes like

          /* Check that the DR_INITs are compile-time constants.  */
-         if (!tree_fits_shwi_p (DR_INIT (dra))
-             || !tree_fits_shwi_p (DR_INIT (drb)))
+         if (TREE_CODE (DR_INIT (dra)) != INTEGER_CST
+             || TREE_CODE (DR_INIT (drb)) != INTEGER_CST)
            break;

please strip the patch down to relevant changes.

-      tree op = gimple_op (assign, i + 1);
+      tree op;
+      if (is_gimple_assign (stmt))
+       op = gimple_op (stmt, i + 1);
+      else
+       op = gimple_call_arg (stmt, i);

somebody added gimple_arg, which can be used here:

       op = gimple_arg (stmt, i);

+  tree lhs = is_gimple_assign (stmt) ? gimple_assign_lhs (stmt):
+                                     gimple_call_lhs (stmt);

  tree lhs = gimple_get_lhs (stmt);
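
To illustrate why a kind-agnostic accessor removes the branch at each
use, here is a toy model.  The types stmt_sketch and stmt_kind_sketch and
the helpers arg_sketch and lhs_sketch are hypothetical, and the operand
layout is deliberately simplified; it is not the real GIMPLE layout.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified statement model.
enum stmt_kind_sketch { SKETCH_ASSIGN, SKETCH_CALL };

struct stmt_sketch
{
  stmt_kind_sketch kind;
  // Assignment: { lhs, input0, input1, ... }
  // Call:       { lhs, callee, input0, input1, ... }
  std::vector<std::string> ops;
};

// Return input number I of STMT, hiding the different operand offsets
// of assignments and calls from the caller.
inline const std::string &
arg_sketch (const stmt_sketch &stmt, std::size_t i)
{
  std::size_t first_input = stmt.kind == SKETCH_ASSIGN ? 1 : 2;
  return stmt.ops.at (first_input + i);
}

// Return the left-hand side of STMT for either statement kind.
inline const std::string &
lhs_sketch (const stmt_sketch &stmt)
{
  return stmt.ops.at (0);
}
```

The loop over operands can then use one call per operand, with no
is_gimple_assign test at the use site.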

   /* Check for an integer operation with the right code.  */
-  gassign *assign = dyn_cast <gassign *> (stmt_info->stmt);
-  if (!assign)
+  gimple* stmt = stmt_info->stmt;
+  if (!(is_gimple_assign (stmt) || is_gimple_call (stmt)))
     return 0;

-  tree_code rhs_code = gimple_assign_rhs_code (assign);
-  if (rhs_code != code && rhs_code != widened_code)
+  code_helper rhs_code;
+  if (is_gimple_assign (stmt))
+    rhs_code = gimple_assign_rhs_code (stmt);
+  else
+    rhs_code = gimple_call_combined_fn (stmt);
+
+  if (rhs_code.safe_as_tree_code () != code
+      && rhs_code.get_rep () != widened_code.get_rep ())
     return 0;

that's probably better refactored as

 if (is_gimple_assign (stmt))
   {
     if (code check)
       return 0;
   }
 else if (is_gimple_call (..))
  {
  ..
  }
 else
   return 0;

otherwise the last patch looks reasonable.
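
The control flow of the refactoring outlined above can be sketched as
follows.  All names here (code_stmt_sketch, has_expected_code_sketch_p
and the enums) are hypothetical, and the sketch assumes the widened code
is always an internal function, which is not true of every caller in the
patch.

```cpp
// Hypothetical, simplified stand-ins for statement kinds and codes.
enum code_stmt_kind { CODE_STMT_ASSIGN, CODE_STMT_CALL, CODE_STMT_OTHER };
enum tree_code_sketch { TC_PLUS_EXPR, TC_MINUS_EXPR, TC_MULT_EXPR };
enum cfn_sketch { CFNS_VEC_WIDEN_PLUS, CFNS_VEC_WIDEN_MINUS, CFNS_AVG_FLOOR };

struct code_stmt_sketch
{
  code_stmt_kind kind;
  tree_code_sketch rhs_code;	// Meaningful only for CODE_STMT_ASSIGN.
  cfn_sketch fn;		// Meaningful only for CODE_STMT_CALL.
};

// Return true if STMT is an assignment using CODE or a call to
// WIDENED_FN.  Each statement kind is checked only against the code
// space that applies to it; anything else is rejected.
inline bool
has_expected_code_sketch_p (const code_stmt_sketch &stmt,
			    tree_code_sketch code, cfn_sketch widened_fn)
{
  if (stmt.kind == CODE_STMT_ASSIGN)
    {
      if (stmt.rhs_code != code)
	return false;
    }
  else if (stmt.kind == CODE_STMT_CALL)
    {
      if (stmt.fn != widened_fn)
	return false;
    }
  else
    return false;

  return true;
}
```

Compared with building a code_helper first and comparing afterwards,
this avoids comparing a tree code against a function code.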

Richard.


* Re: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns
  2022-07-12 12:32                 ` Richard Biener
@ 2023-03-17 10:14                   ` Andre Vieira (lists)
  2023-03-17 11:52                     ` Richard Biener
  0 siblings, 1 reply; 53+ messages in thread
From: Andre Vieira (lists) @ 2023-03-17 10:14 UTC (permalink / raw)
  To: Richard Biener; +Cc: Richard Sandiford, gcc-patches

Hi Richard,

I'm only picking this up now. Just going through your earlier comments 
and stuff and I noticed we didn't address the situation with the 
gimple::build. Do you want me to add overloaded static member functions 
to cover all gimple_build_* functions, or just create one to replace 
vect_gimple_build and we create them as needed? It's more work but I 
think adding them all would be better. I'd even argue that it would be 
nice to replace the old ones with the new ones, but I can imagine you 
might not want that as it makes backporting and the likes a bit annoying...

Let me know what you prefer, I'll go work on your latest comments too.

Cheers,
Andre


* Re: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns
  2023-03-17 10:14                   ` Andre Vieira (lists)
@ 2023-03-17 11:52                     ` Richard Biener
  2023-04-20 13:23                       ` Andre Vieira (lists)
  0 siblings, 1 reply; 53+ messages in thread
From: Richard Biener @ 2023-03-17 11:52 UTC (permalink / raw)
  To: Andre Vieira (lists); +Cc: Richard Sandiford, gcc-patches

On Fri, 17 Mar 2023, Andre Vieira (lists) wrote:

> Hi Richard,
> 
> I'm only picking this up now. Just going through your earlier comments and
> stuff and I noticed we didn't address the situation with the gimple::build. Do
> you want me to add overloaded static member functions to cover all
> gimple_build_* functions, or just create one to replace vect_gimple_build and
> we create them as needed? It's more work but I think adding them all would be
> better. I'd even argue that it would be nice to replace the old ones with the
> new ones, but I can imagine you might not want that as it makes backporting
> and the likes a bit annoying...
> 
> Let me know what you prefer, I'll go work on your latest comments too.

I think the series was resolved and I approved it.  As for
vect_gimple_build the better way forward would be to use
gimple_build () as existing but add a vect_finish_stmt_* handling
a gimple_seq.

Richard.


* Re: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns
  2023-03-17 11:52                     ` Richard Biener
@ 2023-04-20 13:23                       ` Andre Vieira (lists)
  2023-04-24 11:57                         ` Richard Biener
  0 siblings, 1 reply; 53+ messages in thread
From: Andre Vieira (lists) @ 2023-04-20 13:23 UTC (permalink / raw)
  To: Richard Biener; +Cc: Richard Sandiford, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1105 bytes --]

Rebased all three patches and made some small changes to the second one:
- removed sub and abd optabs from commutative_optab_p, I suspect this 
was a copy paste mistake,
- removed what I believe to be a superfluous switch case in vectorizable 
conversion, the one that was here:
+  if (code.is_fn_code ())
+     {
+      internal_fn ifn = as_internal_fn (code.as_fn_code ());
+      int ecf_flags = internal_fn_flags (ifn);
+      gcc_assert (ecf_flags & ECF_MULTI);
+
+      switch (code.as_fn_code ())
+	{
+	case CFN_VEC_WIDEN_PLUS:
+	  break;
+	case CFN_VEC_WIDEN_MINUS:
+	  break;
+	case CFN_LAST:
+	default:
+	  return false;
+	}
+
+      internal_fn lo, hi;
+      lookup_multi_internal_fn (ifn, &lo, &hi);
+      *code1 = as_combined_fn (lo);
+      *code2 = as_combined_fn (hi);
+      optab1 = lookup_multi_ifn_optab (lo, !TYPE_UNSIGNED (vectype));
+      optab2 = lookup_multi_ifn_optab (hi, !TYPE_UNSIGNED (vectype));
      }

I don't think we need to check that they are a specific fn code: we look
up the optabs, and if those lookups succeed then surely we can vectorize?
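
To make the argument concrete, here is a minimal self-contained model of
"support is decided by the lookup, not by a switch on the function code".
Everything in it is an assumption made for illustration: toy_ifn, toy_optab,
toy_lookup_optab and toy_supported_p are hypothetical names, not the
lookup_hilo_ifn_optab interface from the patch. The patch itself additionally
checks optab_handler for the vector mode, which this sketch leaves out.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Toy stand-ins (assumptions for illustration, not GCC enums).
enum toy_ifn
{
  TOY_WIDEN_PLUS_LO,
  TOY_WIDEN_PLUS_HI,
  TOY_WIDEN_MINUS_LO,
  TOY_WIDEN_MINUS_HI,
  TOY_OTHER
};

enum toy_optab
{
  toy_unknown_optab,
  toy_saddl_lo, toy_uaddl_lo, toy_saddl_hi, toy_uaddl_hi,
  toy_ssubl_lo, toy_usubl_lo, toy_ssubl_hi, toy_usubl_hi
};

// Look up the optab for (FN, IS_SIGNED); anything not in the table
// yields toy_unknown_optab.
toy_optab
toy_lookup_optab (toy_ifn fn, bool is_signed)
{
  static const std::map<std::pair<toy_ifn, bool>, toy_optab> table = {
    { { TOY_WIDEN_PLUS_LO, true }, toy_saddl_lo },
    { { TOY_WIDEN_PLUS_LO, false }, toy_uaddl_lo },
    { { TOY_WIDEN_PLUS_HI, true }, toy_saddl_hi },
    { { TOY_WIDEN_PLUS_HI, false }, toy_uaddl_hi },
    { { TOY_WIDEN_MINUS_LO, true }, toy_ssubl_lo },
    { { TOY_WIDEN_MINUS_LO, false }, toy_usubl_lo },
    { { TOY_WIDEN_MINUS_HI, true }, toy_ssubl_hi },
    { { TOY_WIDEN_MINUS_HI, false }, toy_usubl_hi },
  };
  auto it = table.find (std::make_pair (fn, is_signed));
  return it == table.end () ? toy_unknown_optab : it->second;
}

// A lo/hi pair counts as supported exactly when both lookups succeed;
// no explicit list of accepted function codes is needed.
bool
toy_supported_p (toy_ifn lo, toy_ifn hi, bool is_signed)
{
  return toy_lookup_optab (lo, is_signed) != toy_unknown_optab
	 && toy_lookup_optab (hi, is_signed) != toy_unknown_optab;
}
```

With this shape, adding a new hi/lo function only means adding table entries;
a function with no entries is rejected by the failed lookup.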

OK for trunk?

Kind regards,
Andre

[-- Attachment #2: ifn0.patch --]
[-- Type: text/plain, Size: 21121 bytes --]

diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index 8802141cd6edb298866025b8a55843eae1f0eb17..68dfba266d679c9738a3d5d70551a91cbdafcf66 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -25,6 +25,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "rtl.h"
 #include "tree.h"
 #include "gimple.h"
+#include "gimple-iterator.h"
+#include "gimple-fold.h"
 #include "ssa.h"
 #include "expmed.h"
 #include "optabs-tree.h"
@@ -1391,7 +1393,7 @@ vect_recog_sad_pattern (vec_info *vinfo,
 static gimple *
 vect_recog_widen_op_pattern (vec_info *vinfo,
 			     stmt_vec_info last_stmt_info, tree *type_out,
-			     tree_code orig_code, tree_code wide_code,
+			     tree_code orig_code, code_helper wide_code,
 			     bool shift_p, const char *name)
 {
   gimple *last_stmt = last_stmt_info->stmt;
@@ -1434,7 +1436,7 @@ vect_recog_widen_op_pattern (vec_info *vinfo,
       vecctype = get_vectype_for_scalar_type (vinfo, ctype);
     }
 
-  enum tree_code dummy_code;
+  code_helper dummy_code;
   int dummy_int;
   auto_vec<tree> dummy_vec;
   if (!vectype
@@ -1455,8 +1457,7 @@ vect_recog_widen_op_pattern (vec_info *vinfo,
 		       2, oprnd, half_type, unprom, vectype);
 
   tree var = vect_recog_temp_ssa_var (itype, NULL);
-  gimple *pattern_stmt = gimple_build_assign (var, wide_code,
-					      oprnd[0], oprnd[1]);
+  gimple *pattern_stmt = vect_gimple_build (var, wide_code, oprnd[0], oprnd[1]);
 
   if (vecctype != vecitype)
     pattern_stmt = vect_convert_output (vinfo, last_stmt_info, ctype,
@@ -6406,3 +6407,28 @@ vect_pattern_recog (vec_info *vinfo)
   /* After this no more add_stmt calls are allowed.  */
   vinfo->stmt_vec_info_ro = true;
 }
+
+/* Build a GIMPLE_ASSIGN or GIMPLE_CALL with the tree_code,
+   or internal_fn contained in ch, respectively.  */
+gimple *
+vect_gimple_build (tree lhs, code_helper ch, tree op0, tree op1)
+{
+  if (op0 == NULL_TREE)
+    return NULL;
+  if (ch.is_tree_code ())
+    return op1 == NULL_TREE ? gimple_build_assign (lhs, ch.safe_as_tree_code (),
+						   op0) :
+			      gimple_build_assign (lhs, ch.safe_as_tree_code (),
+						   op0, op1);
+  else
+  {
+    internal_fn fn = as_internal_fn (ch.safe_as_fn_code ());
+    gimple* stmt;
+    if (op1 == NULL_TREE)
+      stmt = gimple_build_call_internal (fn, 1, op0);
+    else
+      stmt = gimple_build_call_internal (fn, 2, op0, op1);
+    gimple_call_set_lhs (stmt, lhs);
+    return stmt;
+  }
+}
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 6b7dbfd4a231baec24e740ffe0ce0b0bf7a1de6b..715ec2e30a4de620b8a5076c0e7f2f7fd1b0654e 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -4768,7 +4768,7 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info,
    STMT_INFO is the original scalar stmt that we are vectorizing.  */
 
 static gimple *
-vect_gen_widened_results_half (vec_info *vinfo, enum tree_code code,
+vect_gen_widened_results_half (vec_info *vinfo, code_helper ch,
                                tree vec_oprnd0, tree vec_oprnd1, int op_type,
 			       tree vec_dest, gimple_stmt_iterator *gsi,
 			       stmt_vec_info stmt_info)
@@ -4777,12 +4777,11 @@ vect_gen_widened_results_half (vec_info *vinfo, enum tree_code code,
   tree new_temp;
 
   /* Generate half of the widened result:  */
-  gcc_assert (op_type == TREE_CODE_LENGTH (code));
   if (op_type != binary_op)
     vec_oprnd1 = NULL;
-  new_stmt = gimple_build_assign (vec_dest, code, vec_oprnd0, vec_oprnd1);
+  new_stmt = vect_gimple_build (vec_dest, ch, vec_oprnd0, vec_oprnd1);
   new_temp = make_ssa_name (vec_dest, new_stmt);
-  gimple_assign_set_lhs (new_stmt, new_temp);
+  gimple_set_lhs (new_stmt, new_temp);
   vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
 
   return new_stmt;
@@ -4861,8 +4860,8 @@ vect_create_vectorized_promotion_stmts (vec_info *vinfo,
 					vec<tree> *vec_oprnds1,
 					stmt_vec_info stmt_info, tree vec_dest,
 					gimple_stmt_iterator *gsi,
-					enum tree_code code1,
-					enum tree_code code2, int op_type)
+					code_helper ch1,
+					code_helper ch2, int op_type)
 {
   int i;
   tree vop0, vop1, new_tmp1, new_tmp2;
@@ -4878,10 +4877,10 @@ vect_create_vectorized_promotion_stmts (vec_info *vinfo,
 	vop1 = NULL_TREE;
 
       /* Generate the two halves of promotion operation.  */
-      new_stmt1 = vect_gen_widened_results_half (vinfo, code1, vop0, vop1,
+      new_stmt1 = vect_gen_widened_results_half (vinfo, ch1, vop0, vop1,
 						 op_type, vec_dest, gsi,
 						 stmt_info);
-      new_stmt2 = vect_gen_widened_results_half (vinfo, code2, vop0, vop1,
+      new_stmt2 = vect_gen_widened_results_half (vinfo, ch2, vop0, vop1,
 						 op_type, vec_dest, gsi,
 						 stmt_info);
       if (is_gimple_call (new_stmt1))
@@ -4978,8 +4977,9 @@ vectorizable_conversion (vec_info *vinfo,
   tree scalar_dest;
   tree op0, op1 = NULL_TREE;
   loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo);
-  enum tree_code code, code1 = ERROR_MARK, code2 = ERROR_MARK;
-  enum tree_code codecvt1 = ERROR_MARK, codecvt2 = ERROR_MARK;
+  tree_code tc1;
+  code_helper code, code1, code2;
+  code_helper codecvt1 = ERROR_MARK, codecvt2 = ERROR_MARK;
   tree new_temp;
   enum vect_def_type dt[2] = {vect_unknown_def_type, vect_unknown_def_type};
   int ndts = 2;
@@ -5008,31 +5008,43 @@ vectorizable_conversion (vec_info *vinfo,
       && ! vec_stmt)
     return false;
 
-  gassign *stmt = dyn_cast <gassign *> (stmt_info->stmt);
-  if (!stmt)
+  gimple* stmt = stmt_info->stmt;
+  if (!(is_gimple_assign (stmt) || is_gimple_call (stmt)))
     return false;
 
-  if (TREE_CODE (gimple_assign_lhs (stmt)) != SSA_NAME)
+  if (gimple_get_lhs (stmt) == NULL_TREE
+      || TREE_CODE (gimple_get_lhs (stmt)) != SSA_NAME)
     return false;
 
-  code = gimple_assign_rhs_code (stmt);
-  if (!CONVERT_EXPR_CODE_P (code)
-      && code != FIX_TRUNC_EXPR
-      && code != FLOAT_EXPR
-      && code != WIDEN_PLUS_EXPR
-      && code != WIDEN_MINUS_EXPR
-      && code != WIDEN_MULT_EXPR
-      && code != WIDEN_LSHIFT_EXPR)
+  if (TREE_CODE (gimple_get_lhs (stmt)) != SSA_NAME)
+    return false;
+
+  if (is_gimple_assign (stmt))
+    {
+      code = gimple_assign_rhs_code (stmt);
+      op_type = TREE_CODE_LENGTH (code.safe_as_tree_code ());
+    }
+  else if (gimple_call_internal_p (stmt))
+    {
+      code = gimple_call_internal_fn (stmt);
+      op_type = gimple_call_num_args (stmt);
+    }
+  else
     return false;
 
   bool widen_arith = (code == WIDEN_PLUS_EXPR
-		      || code == WIDEN_MINUS_EXPR
-		      || code == WIDEN_MULT_EXPR
-		      || code == WIDEN_LSHIFT_EXPR);
-  op_type = TREE_CODE_LENGTH (code);
+		 || code == WIDEN_MINUS_EXPR
+		 || code == WIDEN_MULT_EXPR
+		 || code == WIDEN_LSHIFT_EXPR);
+
+  if (!widen_arith
+      && !CONVERT_EXPR_CODE_P (code)
+      && code != FIX_TRUNC_EXPR
+      && code != FLOAT_EXPR)
+    return false;
 
   /* Check types of lhs and rhs.  */
-  scalar_dest = gimple_assign_lhs (stmt);
+  scalar_dest = gimple_get_lhs (stmt);
   lhs_type = TREE_TYPE (scalar_dest);
   vectype_out = STMT_VINFO_VECTYPE (stmt_info);
 
@@ -5070,10 +5082,14 @@ vectorizable_conversion (vec_info *vinfo,
 
   if (op_type == binary_op)
     {
-      gcc_assert (code == WIDEN_MULT_EXPR || code == WIDEN_LSHIFT_EXPR
-		  || code == WIDEN_PLUS_EXPR || code == WIDEN_MINUS_EXPR);
+      gcc_assert (code == WIDEN_MULT_EXPR
+		  || code == WIDEN_LSHIFT_EXPR
+		  || code == WIDEN_PLUS_EXPR
+		  || code == WIDEN_MINUS_EXPR);
+
 
-      op1 = gimple_assign_rhs2 (stmt);
+      op1 = is_gimple_assign (stmt) ? gimple_assign_rhs2 (stmt) :
+				     gimple_call_arg (stmt, 0);
       tree vectype1_in;
       if (!vect_is_simple_use (vinfo, stmt_info, slp_node, 1,
 			       &op1, &slp_op1, &dt[1], &vectype1_in))
@@ -5157,8 +5173,12 @@ vectorizable_conversion (vec_info *vinfo,
 	  && code != FLOAT_EXPR
 	  && !CONVERT_EXPR_CODE_P (code))
 	return false;
-      if (supportable_convert_operation (code, vectype_out, vectype_in, &code1))
+      if (supportable_convert_operation (code.safe_as_tree_code (), vectype_out,
+					 vectype_in, &tc1))
+      {
+	code1 = tc1;
 	break;
+      }
       /* FALLTHRU */
     unsupported:
       if (dump_enabled_p ())
@@ -5169,9 +5189,11 @@ vectorizable_conversion (vec_info *vinfo,
     case WIDEN:
       if (known_eq (nunits_in, nunits_out))
 	{
-	  if (!supportable_half_widening_operation (code, vectype_out,
-						   vectype_in, &code1))
+	  if (!supportable_half_widening_operation (code.safe_as_tree_code (),
+						    vectype_out, vectype_in,
+						    &tc1))
 	    goto unsupported;
+	  code1 = tc1;
 	  gcc_assert (!(multi_step_cvt && op_type == binary_op));
 	  break;
 	}
@@ -5205,14 +5227,17 @@ vectorizable_conversion (vec_info *vinfo,
 
 	  if (GET_MODE_SIZE (rhs_mode) == fltsz)
 	    {
-	      if (!supportable_convert_operation (code, vectype_out,
-						  cvt_type, &codecvt1))
+	      tc1 = ERROR_MARK;
+	      if (!supportable_convert_operation (code.safe_as_tree_code (),
+						  vectype_out,
+						  cvt_type, &tc1))
 		goto unsupported;
+	      codecvt1 = tc1;
 	    }
-	  else if (!supportable_widening_operation (vinfo, code, stmt_info,
-						    vectype_out, cvt_type,
-						    &codecvt1, &codecvt2,
-						    &multi_step_cvt,
+	  else if (!supportable_widening_operation (vinfo, code,
+						    stmt_info, vectype_out,
+						    cvt_type, &codecvt1,
+						    &codecvt2, &multi_step_cvt,
 						    &interm_types))
 	    continue;
 	  else
@@ -5220,8 +5245,9 @@ vectorizable_conversion (vec_info *vinfo,
 
 	  if (supportable_widening_operation (vinfo, NOP_EXPR, stmt_info,
 					      cvt_type,
-					      vectype_in, &code1, &code2,
-					      &multi_step_cvt, &interm_types))
+					      vectype_in, &code1,
+					      &code2, &multi_step_cvt,
+					      &interm_types))
 	    {
 	      found_mode = true;
 	      break;
@@ -5243,10 +5269,15 @@ vectorizable_conversion (vec_info *vinfo,
 
     case NARROW:
       gcc_assert (op_type == unary_op);
-      if (supportable_narrowing_operation (code, vectype_out, vectype_in,
-					   &code1, &multi_step_cvt,
+      if (supportable_narrowing_operation (code.safe_as_tree_code (),
+					   vectype_out,
+					   vectype_in,
+					   &tc1, &multi_step_cvt,
 					   &interm_types))
-	break;
+	{
+	  code1 = tc1;
+	  break;
+	}
 
       if (code != FIX_TRUNC_EXPR
 	  || GET_MODE_SIZE (lhs_mode) >= GET_MODE_SIZE (rhs_mode))
@@ -5257,13 +5288,18 @@ vectorizable_conversion (vec_info *vinfo,
       cvt_type = get_same_sized_vectype (cvt_type, vectype_in);
       if (cvt_type == NULL_TREE)
 	goto unsupported;
-      if (!supportable_convert_operation (code, cvt_type, vectype_in,
-					  &codecvt1))
+      if (!supportable_convert_operation (code.safe_as_tree_code (), cvt_type,
+					  vectype_in,
+					  &tc1))
 	goto unsupported;
+      codecvt1 = tc1;
       if (supportable_narrowing_operation (NOP_EXPR, vectype_out, cvt_type,
-					   &code1, &multi_step_cvt,
+					   &tc1, &multi_step_cvt,
 					   &interm_types))
-	break;
+	{
+	  code1 = tc1;
+	  break;
+	}
       goto unsupported;
 
     default:
@@ -5377,8 +5413,10 @@ vectorizable_conversion (vec_info *vinfo,
       FOR_EACH_VEC_ELT (vec_oprnds0, i, vop0)
 	{
 	  /* Arguments are ready, create the new vector stmt.  */
-	  gcc_assert (TREE_CODE_LENGTH (code1) == unary_op);
-	  gassign *new_stmt = gimple_build_assign (vec_dest, code1, vop0);
+	  gcc_assert (TREE_CODE_LENGTH ((tree_code) code1) == unary_op);
+	  gassign *new_stmt = gimple_build_assign (vec_dest,
+						   code1.safe_as_tree_code (),
+						   vop0);
 	  new_temp = make_ssa_name (vec_dest, new_stmt);
 	  gimple_assign_set_lhs (new_stmt, new_temp);
 	  vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
@@ -5410,7 +5448,7 @@ vectorizable_conversion (vec_info *vinfo,
       for (i = multi_step_cvt; i >= 0; i--)
 	{
 	  tree this_dest = vec_dsts[i];
-	  enum tree_code c1 = code1, c2 = code2;
+	  code_helper c1 = code1, c2 = code2;
 	  if (i == 0 && codecvt2 != ERROR_MARK)
 	    {
 	      c1 = codecvt1;
@@ -5420,7 +5458,8 @@ vectorizable_conversion (vec_info *vinfo,
 	    vect_create_half_widening_stmts (vinfo, &vec_oprnds0,
 						    &vec_oprnds1, stmt_info,
 						    this_dest, gsi,
-						    c1, op_type);
+						    c1.safe_as_tree_code (),
+						    op_type);
 	  else
 	    vect_create_vectorized_promotion_stmts (vinfo, &vec_oprnds0,
 						    &vec_oprnds1, stmt_info,
@@ -5433,9 +5472,11 @@ vectorizable_conversion (vec_info *vinfo,
 	  gimple *new_stmt;
 	  if (cvt_type)
 	    {
-	      gcc_assert (TREE_CODE_LENGTH (codecvt1) == unary_op);
+	      gcc_assert (TREE_CODE_LENGTH ((tree_code) codecvt1) == unary_op);
 	      new_temp = make_ssa_name (vec_dest);
-	      new_stmt = gimple_build_assign (new_temp, codecvt1, vop0);
+	      new_stmt = gimple_build_assign (new_temp,
+					      codecvt1.safe_as_tree_code (),
+					      vop0);
 	      vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
 	    }
 	  else
@@ -5459,10 +5500,12 @@ vectorizable_conversion (vec_info *vinfo,
       if (cvt_type)
 	FOR_EACH_VEC_ELT (vec_oprnds0, i, vop0)
 	  {
-	    gcc_assert (TREE_CODE_LENGTH (codecvt1) == unary_op);
+	    gcc_assert (TREE_CODE_LENGTH (((tree_code) codecvt1)) == unary_op);
 	    new_temp = make_ssa_name (vec_dest);
 	    gassign *new_stmt
-	      = gimple_build_assign (new_temp, codecvt1, vop0);
+	      = gimple_build_assign (new_temp,
+				     codecvt1.safe_as_tree_code (),
+				     vop0);
 	    vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
 	    vec_oprnds0[i] = new_temp;
 	  }
@@ -5470,7 +5513,8 @@ vectorizable_conversion (vec_info *vinfo,
       vect_create_vectorized_demotion_stmts (vinfo, &vec_oprnds0,
 					     multi_step_cvt,
 					     stmt_info, vec_dsts, gsi,
-					     slp_node, code1);
+					     slp_node,
+					     code1.safe_as_tree_code ());
       break;
     }
   if (!slp_node)
@@ -12151,9 +12195,11 @@ vect_maybe_update_slp_op_vectype (slp_tree op, tree vectype)
 
 bool
 supportable_widening_operation (vec_info *vinfo,
-				enum tree_code code, stmt_vec_info stmt_info,
+				code_helper code,
+				stmt_vec_info stmt_info,
 				tree vectype_out, tree vectype_in,
-                                enum tree_code *code1, enum tree_code *code2,
+				code_helper *code1,
+				code_helper *code2,
                                 int *multi_step_cvt,
                                 vec<tree> *interm_types)
 {
@@ -12164,7 +12210,7 @@ supportable_widening_operation (vec_info *vinfo,
   optab optab1, optab2;
   tree vectype = vectype_in;
   tree wide_vectype = vectype_out;
-  enum tree_code c1, c2;
+  code_helper c1 = MAX_TREE_CODES, c2 = MAX_TREE_CODES;
   int i;
   tree prev_type, intermediate_type;
   machine_mode intermediate_mode, prev_mode;
@@ -12174,7 +12220,7 @@ supportable_widening_operation (vec_info *vinfo,
   if (loop_info)
     vect_loop = LOOP_VINFO_LOOP (loop_info);
 
-  switch (code)
+  switch (code.safe_as_tree_code ())
     {
     case WIDEN_MULT_EXPR:
       /* The result of a vectorized widening operation usually requires
@@ -12215,8 +12261,9 @@ supportable_widening_operation (vec_info *vinfo,
 	  && !nested_in_vect_loop_p (vect_loop, stmt_info)
 	  && supportable_widening_operation (vinfo, VEC_WIDEN_MULT_EVEN_EXPR,
 					     stmt_info, vectype_out,
-					     vectype_in, code1, code2,
-					     multi_step_cvt, interm_types))
+					     vectype_in, code1,
+					     code2, multi_step_cvt,
+					     interm_types))
         {
           /* Elements in a vector with vect_used_by_reduction property cannot
              be reordered if the use chain with this property does not have the
@@ -12279,6 +12326,9 @@ supportable_widening_operation (vec_info *vinfo,
       c2 = VEC_UNPACK_FIX_TRUNC_HI_EXPR;
       break;
 
+    case MAX_TREE_CODES:
+      break;
+
     default:
       gcc_unreachable ();
     }
@@ -12289,10 +12339,12 @@ supportable_widening_operation (vec_info *vinfo,
   if (code == FIX_TRUNC_EXPR)
     {
       /* The signedness is determined from output operand.  */
-      optab1 = optab_for_tree_code (c1, vectype_out, optab_default);
-      optab2 = optab_for_tree_code (c2, vectype_out, optab_default);
+      optab1 = optab_for_tree_code (c1.safe_as_tree_code (), vectype_out,
+				    optab_default);
+      optab2 = optab_for_tree_code (c2.safe_as_tree_code (), vectype_out,
+				    optab_default);
     }
-  else if (CONVERT_EXPR_CODE_P (code)
+  else if (CONVERT_EXPR_CODE_P (code.safe_as_tree_code ())
 	   && VECTOR_BOOLEAN_TYPE_P (wide_vectype)
 	   && VECTOR_BOOLEAN_TYPE_P (vectype)
 	   && TYPE_MODE (wide_vectype) == TYPE_MODE (vectype)
@@ -12305,8 +12357,10 @@ supportable_widening_operation (vec_info *vinfo,
     }
   else
     {
-      optab1 = optab_for_tree_code (c1, vectype, optab_default);
-      optab2 = optab_for_tree_code (c2, vectype, optab_default);
+      optab1 = optab_for_tree_code (c1.safe_as_tree_code (), vectype,
+				    optab_default);
+      optab2 = optab_for_tree_code (c2.safe_as_tree_code (), vectype,
+				    optab_default);
     }
 
   if (!optab1 || !optab2)
@@ -12317,8 +12371,12 @@ supportable_widening_operation (vec_info *vinfo,
        || (icode2 = optab_handler (optab2, vec_mode)) == CODE_FOR_nothing)
     return false;
 
-  *code1 = c1;
-  *code2 = c2;
+  if (code.is_tree_code ())
+  {
+    *code1 = c1;
+    *code2 = c2;
+  }
+
 
   if (insn_data[icode1].operand[0].mode == TYPE_MODE (wide_vectype)
       && insn_data[icode2].operand[0].mode == TYPE_MODE (wide_vectype))
@@ -12339,7 +12397,7 @@ supportable_widening_operation (vec_info *vinfo,
   prev_type = vectype;
   prev_mode = vec_mode;
 
-  if (!CONVERT_EXPR_CODE_P (code))
+  if (!CONVERT_EXPR_CODE_P ((tree_code) code))
     return false;
 
   /* We assume here that there will not be more than MAX_INTERM_CVT_STEPS
@@ -12379,8 +12437,12 @@ supportable_widening_operation (vec_info *vinfo,
 	}
       else
 	{
-	  optab3 = optab_for_tree_code (c1, intermediate_type, optab_default);
-	  optab4 = optab_for_tree_code (c2, intermediate_type, optab_default);
+	  optab3 = optab_for_tree_code (c1.safe_as_tree_code (),
+					intermediate_type,
+					optab_default);
+	  optab4 = optab_for_tree_code (c2.safe_as_tree_code (),
+					intermediate_type,
+					optab_default);
 	}
 
       if (!optab3 || !optab4
@@ -12439,7 +12501,7 @@ supportable_widening_operation (vec_info *vinfo,
 bool
 supportable_narrowing_operation (enum tree_code code,
 				 tree vectype_out, tree vectype_in,
-				 enum tree_code *code1, int *multi_step_cvt,
+				 tree_code *code1, int *multi_step_cvt,
                                  vec<tree> *interm_types)
 {
   machine_mode vec_mode;
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index 9cf2fb23fe397b467d89aa7cc5ebeaa293ed4cce..d241eba6ef3302225bbe37b374baa11e6472c280 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -2139,13 +2139,12 @@ extern bool vect_is_simple_use (vec_info *, stmt_vec_info, slp_tree,
 				enum vect_def_type *,
 				tree *, stmt_vec_info * = NULL);
 extern bool vect_maybe_update_slp_op_vectype (slp_tree, tree);
-extern bool supportable_widening_operation (vec_info *,
-					    enum tree_code, stmt_vec_info,
-					    tree, tree, enum tree_code *,
-					    enum tree_code *, int *,
-					    vec<tree> *);
+extern bool supportable_widening_operation (vec_info*, code_helper,
+					    stmt_vec_info, tree, tree,
+					    code_helper*, code_helper*,
+					    int*, vec<tree> *);
 extern bool supportable_narrowing_operation (enum tree_code, tree, tree,
-					     enum tree_code *, int *,
+					     tree_code *, int *,
 					     vec<tree> *);
 
 extern unsigned record_stmt_cost (stmt_vector_for_cost *, int,
@@ -2583,4 +2582,7 @@ vect_is_integer_truncation (stmt_vec_info stmt_info)
 	  && TYPE_PRECISION (lhs_type) < TYPE_PRECISION (rhs_type));
 }
 
+/* Build a GIMPLE_ASSIGN or GIMPLE_CALL with the tree_code,
+   or internal_fn contained in ch, respectively.  */
+gimple * vect_gimple_build (tree, code_helper, tree, tree);
 #endif  /* GCC_TREE_VECTORIZER_H  */
diff --git a/gcc/tree.h b/gcc/tree.h
index abcdb5638d49aea4ccc46efa8e540b1fa78aa27a..a250a80e0321241e1158086acb2dd837d5827e10 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -93,6 +93,8 @@ public:
   bool is_internal_fn () const;
   bool is_builtin_fn () const;
   int get_rep () const { return rep; }
+  enum tree_code safe_as_tree_code () const;
+  combined_fn safe_as_fn_code () const;
   bool operator== (const code_helper &other) { return rep == other.rep; }
   bool operator!= (const code_helper &other) { return rep != other.rep; }
   bool operator== (tree_code c) { return rep == code_helper (c).rep; }
@@ -102,6 +104,17 @@ private:
   int rep;
 };
 
+inline enum tree_code
+code_helper::safe_as_tree_code () const
+{
+  return is_tree_code () ? (tree_code)* this : MAX_TREE_CODES;
+}
+
+inline combined_fn
+code_helper::safe_as_fn_code () const {
+  return is_fn_code () ? (combined_fn) *this : CFN_LAST;
+}
+
 inline code_helper::operator internal_fn () const
 {
   return as_internal_fn (combined_fn (*this));

[-- Attachment #3: ifn1.patch --]
[-- Type: text/plain, Size: 18605 bytes --]

diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 6e81dc05e0e0714256759b0594816df451415a2d..e4d815cd577d266d2bccf6fb68d62aac91a8b4cf 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -17,6 +17,7 @@ You should have received a copy of the GNU General Public License
 along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
+#define INCLUDE_MAP
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
@@ -70,6 +71,26 @@ const int internal_fn_flags_array[] = {
   0
 };
 
+const enum internal_fn internal_fn_hilo_keys_array[] = {
+#undef DEF_INTERNAL_OPTAB_HILO_FN
+#define DEF_INTERNAL_OPTAB_HILO_FN(NAME, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \
+  IFN_##NAME##_LO, \
+  IFN_##NAME##_HI,
+#include "internal-fn.def"
+  IFN_LAST
+#undef DEF_INTERNAL_OPTAB_HILO_FN
+};
+
+const optab internal_fn_hilo_values_array[] = {
+#undef DEF_INTERNAL_OPTAB_HILO_FN
+#define DEF_INTERNAL_OPTAB_HILO_FN(NAME, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \
+  SOPTAB##_lo_optab, UOPTAB##_lo_optab, \
+  SOPTAB##_hi_optab, UOPTAB##_hi_optab,
+#include "internal-fn.def"
+  unknown_optab, unknown_optab
+#undef DEF_INTERNAL_OPTAB_HILO_FN
+};
+
 /* Return the internal function called NAME, or IFN_LAST if there's
    no such function.  */
 
@@ -90,6 +111,61 @@ lookup_internal_fn (const char *name)
   return entry ? *entry : IFN_LAST;
 }
 
+static int
+ifn_cmp (const void *a_, const void *b_)
+{
+  typedef std::pair<enum internal_fn, unsigned> ifn_pair;
+  auto *a = (const std::pair<ifn_pair, optab> *)a_;
+  auto *b = (const std::pair<ifn_pair, optab> *)b_;
+  return (int) (a->first.first) - (b->first.first);
+}
+
+/* Return the optab belonging to the given internal function NAME for the given
+   SIGN or unknown_optab.  */
+
+optab
+lookup_hilo_ifn_optab (enum internal_fn fn, unsigned sign)
+{
+  typedef std::pair<enum internal_fn, unsigned> ifn_pair;
+  typedef auto_vec <std::pair<ifn_pair, optab>>fn_to_optab_map_type;
+  static fn_to_optab_map_type *fn_to_optab_map;
+
+  if (!fn_to_optab_map)
+    {
+      unsigned num
+	= sizeof (internal_fn_hilo_keys_array) / sizeof (enum internal_fn);
+      fn_to_optab_map = new fn_to_optab_map_type ();
+      for (unsigned int i = 0; i < num - 1; ++i)
+	{
+	  enum internal_fn fn = internal_fn_hilo_keys_array[i];
+	  optab v1 = internal_fn_hilo_values_array[2*i];
+	  optab v2 = internal_fn_hilo_values_array[2*i + 1];
+	  ifn_pair key1 (fn, 0);
+	  fn_to_optab_map->safe_push ({key1, v1});
+	  ifn_pair key2 (fn, 1);
+	  fn_to_optab_map->safe_push ({key2, v2});
+	}
+	fn_to_optab_map->qsort (ifn_cmp);
+    }
+
+  ifn_pair new_pair (fn, sign ? 1 : 0);
+  optab tmp;
+  std::pair<ifn_pair,optab> pair_wrap (new_pair, tmp);
+  auto entry = fn_to_optab_map->bsearch (&pair_wrap, ifn_cmp);
+  return entry != fn_to_optab_map->end () ? entry->second : unknown_optab;
+}
+
+extern void
+lookup_hilo_internal_fn (enum internal_fn ifn, enum internal_fn *lo,
+			  enum internal_fn *hi)
+{
+  gcc_assert (decomposes_to_hilo_fn_p (ifn));
+
+  *lo = internal_fn (ifn + 1);
+  *hi = internal_fn (ifn + 2);
+}
+
+
 /* Fnspec of each internal function, indexed by function number.  */
 const_tree internal_fn_fnspec_array[IFN_LAST + 1];
 
@@ -3970,6 +4046,9 @@ commutative_binary_fn_p (internal_fn fn)
     case IFN_UBSAN_CHECK_MUL:
     case IFN_ADD_OVERFLOW:
     case IFN_MUL_OVERFLOW:
+    case IFN_VEC_WIDEN_PLUS:
+    case IFN_VEC_WIDEN_PLUS_LO:
+    case IFN_VEC_WIDEN_PLUS_HI:
       return true;
 
     default:
@@ -4043,6 +4122,42 @@ first_commutative_argument (internal_fn fn)
     }
 }
 
+/* Return true if FN has a wider output type than its argument types.  */
+
+bool
+widening_fn_p (internal_fn fn)
+{
+  switch (fn)
+    {
+    case IFN_VEC_WIDEN_PLUS:
+    case IFN_VEC_WIDEN_MINUS:
+      return true;
+
+    default:
+      return false;
+    }
+}
+
+/* Return true if FN decomposes to _hi and _lo IFN.  If true this should also
+   be a widening function.  */
+
+bool
+decomposes_to_hilo_fn_p (internal_fn fn)
+{
+  if (!widening_fn_p (fn))
+    return false;
+
+  switch (fn)
+    {
+    case IFN_VEC_WIDEN_PLUS:
+    case IFN_VEC_WIDEN_MINUS:
+      return true;
+
+    default:
+      return false;
+    }
+}
+
 /* Return true if IFN_SET_EDOM is supported.  */
 
 bool
@@ -4055,6 +4170,32 @@ set_edom_supported_p (void)
 #endif
 }
 
+#undef DEF_INTERNAL_OPTAB_HILO_FN
+#define DEF_INTERNAL_OPTAB_HILO_FN(CODE, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \
+  static void							\
+  expand_##CODE (internal_fn, gcall *)				\
+  {								\
+    gcc_unreachable ();						\
+  }								\
+  static void							\
+  expand_##CODE##_LO (internal_fn fn, gcall *stmt)		\
+  {								\
+    tree ty = TREE_TYPE (gimple_get_lhs (stmt));		\
+    if (!TYPE_UNSIGNED (ty))					\
+      expand_##TYPE##_optab_fn (fn, stmt, SOPTAB##_lo##_optab);	\
+    else							\
+      expand_##TYPE##_optab_fn (fn, stmt, UOPTAB##_lo##_optab);	\
+  }								\
+  static void							\
+  expand_##CODE##_HI (internal_fn fn, gcall *stmt)		\
+  {								\
+    tree ty = TREE_TYPE (gimple_get_lhs (stmt));		\
+    if (!TYPE_UNSIGNED (ty))					\
+      expand_##TYPE##_optab_fn (fn, stmt, SOPTAB##_hi##_optab);	\
+    else							\
+      expand_##TYPE##_optab_fn (fn, stmt, UOPTAB##_hi##_optab);	\
+  }
+
 #define DEF_INTERNAL_OPTAB_FN(CODE, FLAGS, OPTAB, TYPE) \
   static void						\
   expand_##CODE (internal_fn fn, gcall *stmt)		\
@@ -4071,6 +4212,7 @@ set_edom_supported_p (void)
     expand_##TYPE##_optab_fn (fn, stmt, which_optab);			\
   }
 #include "internal-fn.def"
+#undef DEF_INTERNAL_OPTAB_HILO_FN
 
 /* Routines to expand each internal function, indexed by function number.
    Each routine has the prototype:
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 7fe742c2ae713e7152ab05cfdfba86e4e0aa3456..347ed667d92620e0ee3ea15c58ecac6c242ebe73 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -85,6 +85,13 @@ along with GCC; see the file COPYING3.  If not see
    says that the function extends the C-level BUILT_IN_<NAME>{,L,LL,IMAX}
    group of functions to any integral mode (including vector modes).
 
+   DEF_INTERNAL_OPTAB_HILO_FN is like DEF_INTERNAL_OPTAB_FN except it
+   provides convenience wrappers for defining conversions that require a
+   hi/lo split, like widening and narrowing operations.  Each definition
+   for <NAME> will require an optab named <OPTAB> and two other optabs that
+   you specify for signed and unsigned.
+
+
    Each entry must have a corresponding expander of the form:
 
      void expand_NAME (gimple_call stmt)
@@ -123,6 +130,14 @@ along with GCC; see the file COPYING3.  If not see
   DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE)
 #endif
 
+#ifndef DEF_INTERNAL_OPTAB_HILO_FN
+#define DEF_INTERNAL_OPTAB_HILO_FN(NAME, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \
+  DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE) \
+  DEF_INTERNAL_OPTAB_FN (NAME ## _LO, FLAGS, unknown, TYPE) \
+  DEF_INTERNAL_OPTAB_FN (NAME ## _HI, FLAGS, unknown, TYPE)
+#endif
+
+
 DEF_INTERNAL_OPTAB_FN (MASK_LOAD, ECF_PURE, maskload, mask_load)
 DEF_INTERNAL_OPTAB_FN (LOAD_LANES, ECF_CONST, vec_load_lanes, load_lanes)
 DEF_INTERNAL_OPTAB_FN (MASK_LOAD_LANES, ECF_PURE,
@@ -315,6 +330,14 @@ DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT270, ECF_CONST, cadd270, binary)
 DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL, ECF_CONST, cmul, binary)
 DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL_CONJ, ECF_CONST, cmul_conj, binary)
 DEF_INTERNAL_OPTAB_FN (VEC_ADDSUB, ECF_CONST, vec_addsub, binary)
+DEF_INTERNAL_OPTAB_HILO_FN (VEC_WIDEN_PLUS,
+			    ECF_CONST | ECF_NOTHROW,
+			    vec_widen_add, vec_widen_saddl, vec_widen_uaddl,
+			     binary)
+DEF_INTERNAL_OPTAB_HILO_FN (VEC_WIDEN_MINUS,
+			    ECF_CONST | ECF_NOTHROW,
+			    vec_widen_sub, vec_widen_ssubl, vec_widen_usubl,
+			     binary)
 DEF_INTERNAL_OPTAB_FN (VEC_FMADDSUB, ECF_CONST, vec_fmaddsub, ternary)
 DEF_INTERNAL_OPTAB_FN (VEC_FMSUBADD, ECF_CONST, vec_fmsubadd, ternary)
 
diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h
index 08922ed4254898f5fffca3f33973e96ed9ce772f..6a5f8762e872ad2ef64ce2986a678e3b40622d81 100644
--- a/gcc/internal-fn.h
+++ b/gcc/internal-fn.h
@@ -20,6 +20,10 @@ along with GCC; see the file COPYING3.  If not see
 #ifndef GCC_INTERNAL_FN_H
 #define GCC_INTERNAL_FN_H
 
+#include "insn-codes.h"
+#include "insn-opinit.h"
+
+
 /* INTEGER_CST values for IFN_UNIQUE function arg-0.
 
    UNSPEC: Undifferentiated UNIQUE.
@@ -112,6 +116,9 @@ internal_fn_name (enum internal_fn fn)
 }
 
 extern internal_fn lookup_internal_fn (const char *);
+extern optab lookup_hilo_ifn_optab (enum internal_fn, unsigned);
+extern void lookup_hilo_internal_fn (enum internal_fn, enum internal_fn *,
+				      enum internal_fn *);
 
 /* Return the ECF_* flags for function FN.  */
 
@@ -210,6 +217,8 @@ extern bool commutative_binary_fn_p (internal_fn);
 extern bool commutative_ternary_fn_p (internal_fn);
 extern int first_commutative_argument (internal_fn);
 extern bool associative_binary_fn_p (internal_fn);
+extern bool widening_fn_p (internal_fn);
+extern bool decomposes_to_hilo_fn_p (internal_fn);
 
 extern bool set_edom_supported_p (void);
 
diff --git a/gcc/optabs.cc b/gcc/optabs.cc
index c8e39c82d57a7d726e7da33d247b80f32ec9236c..d4dd7ee3d34d01c32ab432ae4e4ce9e4b522b2f7 100644
--- a/gcc/optabs.cc
+++ b/gcc/optabs.cc
@@ -1314,7 +1314,12 @@ commutative_optab_p (optab binoptab)
 	  || binoptab == smul_widen_optab
 	  || binoptab == umul_widen_optab
 	  || binoptab == smul_highpart_optab
-	  || binoptab == umul_highpart_optab);
+	  || binoptab == umul_highpart_optab
+	  || binoptab == vec_widen_add_optab
+	  || binoptab == vec_widen_saddl_hi_optab
+	  || binoptab == vec_widen_saddl_lo_optab
+	  || binoptab == vec_widen_uaddl_hi_optab
+	  || binoptab == vec_widen_uaddl_lo_optab);
 }
 
 /* X is to be used in mode MODE as operand OPN to BINOPTAB.  If we're
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 695f5911b300c9ca5737de9be809fa01aabe5e01..e064189103b3be70644468d11f3c91ac45ffe0d0 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -78,6 +78,8 @@ OPTAB_CD(smsub_widen_optab, "msub$b$a4")
 OPTAB_CD(umsub_widen_optab, "umsub$b$a4")
 OPTAB_CD(ssmsub_widen_optab, "ssmsub$b$a4")
 OPTAB_CD(usmsub_widen_optab, "usmsub$a$b4")
+OPTAB_CD(vec_widen_add_optab, "add$a$b3")
+OPTAB_CD(vec_widen_sub_optab, "sub$a$b3")
 OPTAB_CD(vec_load_lanes_optab, "vec_load_lanes$a$b")
 OPTAB_CD(vec_store_lanes_optab, "vec_store_lanes$a$b")
 OPTAB_CD(vec_mask_load_lanes_optab, "vec_mask_load_lanes$a$b")
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
index 220bd9352a4c7acd2e3713e441d74898d3e92b30..7037673d32bd780e1c9b58a51e58e2bac3b30b7e 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
+++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O3 -save-temps" } */
+/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */
 #include <stdint.h>
 #include <string.h>
 
@@ -86,6 +86,8 @@ main()
     return 0;
 }
 
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_LO" "vect"   } } */
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_HI" "vect"   } } */
 /* { dg-final { scan-assembler-times {\tuaddl\t} 1} } */
 /* { dg-final { scan-assembler-times {\tuaddl2\t} 1} } */
 /* { dg-final { scan-assembler-times {\tsaddl\t} 1} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c
index a2bed63affbd091977df95a126da1f5b8c1d41d2..83bc1edb6105f47114b665e24a13e6194b2179a2 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c
+++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O3 -save-temps" } */
+/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */
 #include <stdint.h>
 #include <string.h>
 
@@ -86,6 +86,8 @@ main()
     return 0;
 }
 
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_LO" "vect"   } } */
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_HI" "vect"   } } */
 /* { dg-final { scan-assembler-times {\tusubl\t} 1} } */
 /* { dg-final { scan-assembler-times {\tusubl2\t} 1} } */
 /* { dg-final { scan-assembler-times {\tssubl\t} 1} } */
diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index 68dfba266d679c9738a3d5d70551a91cbdafcf66..1a514461b2ca416f45a5fa9abe417980d33ef4df 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -1394,14 +1394,16 @@ static gimple *
 vect_recog_widen_op_pattern (vec_info *vinfo,
 			     stmt_vec_info last_stmt_info, tree *type_out,
 			     tree_code orig_code, code_helper wide_code,
-			     bool shift_p, const char *name)
+			     bool shift_p, const char *name,
+			     enum optab_subtype *subtype = NULL)
 {
   gimple *last_stmt = last_stmt_info->stmt;
 
   vect_unpromoted_value unprom[2];
   tree half_type;
   if (!vect_widened_op_tree (vinfo, last_stmt_info, orig_code, orig_code,
-			     shift_p, 2, unprom, &half_type))
+			     shift_p, 2, unprom, &half_type, subtype))
+
     return NULL;
 
   /* Pattern detected.  */
@@ -1467,6 +1469,20 @@ vect_recog_widen_op_pattern (vec_info *vinfo,
 			      type, pattern_stmt, vecctype);
 }
 
+static gimple *
+vect_recog_widen_op_pattern (vec_info *vinfo,
+			     stmt_vec_info last_stmt_info, tree *type_out,
+			     tree_code orig_code, internal_fn wide_ifn,
+			     bool shift_p, const char *name,
+			     enum optab_subtype *subtype = NULL)
+{
+  combined_fn ifn = as_combined_fn (wide_ifn);
+  return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
+				      orig_code, ifn, shift_p, name,
+				      subtype);
+}
+
+
 /* Try to detect multiplication on widened inputs, converting MULT_EXPR
    to WIDEN_MULT_EXPR.  See vect_recog_widen_op_pattern for details.  */
 
@@ -1480,26 +1496,30 @@ vect_recog_widen_mult_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
 }
 
 /* Try to detect addition on widened inputs, converting PLUS_EXPR
-   to WIDEN_PLUS_EXPR.  See vect_recog_widen_op_pattern for details.  */
+   to IFN_VEC_WIDEN_PLUS.  See vect_recog_widen_op_pattern for details.  */
 
 static gimple *
 vect_recog_widen_plus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
 			       tree *type_out)
 {
+  enum optab_subtype subtype;
   return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
-				      PLUS_EXPR, WIDEN_PLUS_EXPR, false,
-				      "vect_recog_widen_plus_pattern");
+				      PLUS_EXPR, IFN_VEC_WIDEN_PLUS,
+				      false, "vect_recog_widen_plus_pattern",
+				      &subtype);
 }
 
 /* Try to detect subtraction on widened inputs, converting MINUS_EXPR
-   to WIDEN_MINUS_EXPR.  See vect_recog_widen_op_pattern for details.  */
+   to IFN_VEC_WIDEN_MINUS.  See vect_recog_widen_op_pattern for details.  */
 static gimple *
 vect_recog_widen_minus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
 			       tree *type_out)
 {
+  enum optab_subtype subtype;
   return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
-				      MINUS_EXPR, WIDEN_MINUS_EXPR, false,
-				      "vect_recog_widen_minus_pattern");
+				      MINUS_EXPR, IFN_VEC_WIDEN_MINUS,
+				      false, "vect_recog_widen_minus_pattern",
+				      &subtype);
 }
 
 /* Function vect_recog_popcount_pattern
@@ -6067,6 +6087,7 @@ static vect_recog_func vect_vect_recog_func_ptrs[] = {
   { vect_recog_mask_conversion_pattern, "mask_conversion" },
   { vect_recog_widen_plus_pattern, "widen_plus" },
   { vect_recog_widen_minus_pattern, "widen_minus" },
+  /* These must come after the double widening ones.  */
 };
 
 const unsigned int NUM_PATTERNS = ARRAY_SIZE (vect_vect_recog_func_ptrs);
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 715ec2e30a4de620b8a5076c0e7f2f7fd1b0654e..f4806073f48d4dedea3ac9bd855792b152d78919 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -5035,7 +5035,9 @@ vectorizable_conversion (vec_info *vinfo,
   bool widen_arith = (code == WIDEN_PLUS_EXPR
 		 || code == WIDEN_MINUS_EXPR
 		 || code == WIDEN_MULT_EXPR
-		 || code == WIDEN_LSHIFT_EXPR);
+		 || code == WIDEN_LSHIFT_EXPR
+		 || code == IFN_VEC_WIDEN_PLUS
+		 || code == IFN_VEC_WIDEN_MINUS);
 
   if (!widen_arith
       && !CONVERT_EXPR_CODE_P (code)
@@ -5085,7 +5087,9 @@ vectorizable_conversion (vec_info *vinfo,
       gcc_assert (code == WIDEN_MULT_EXPR
 		  || code == WIDEN_LSHIFT_EXPR
 		  || code == WIDEN_PLUS_EXPR
-		  || code == WIDEN_MINUS_EXPR);
+		  || code == WIDEN_MINUS_EXPR
+		  || code == IFN_VEC_WIDEN_PLUS
+		  || code == IFN_VEC_WIDEN_MINUS);
 
 
       op1 = is_gimple_assign (stmt) ? gimple_assign_rhs2 (stmt) :
@@ -12355,14 +12359,50 @@ supportable_widening_operation (vec_info *vinfo,
       optab1 = vec_unpacks_sbool_lo_optab;
       optab2 = vec_unpacks_sbool_hi_optab;
     }
-  else
-    {
-      optab1 = optab_for_tree_code (c1.safe_as_tree_code (), vectype,
-				    optab_default);
-      optab2 = optab_for_tree_code (c2.safe_as_tree_code (), vectype,
-				    optab_default);
+
+  if (code.is_fn_code ())
+     {
+      internal_fn ifn = as_internal_fn (code.safe_as_fn_code ());
+      gcc_assert (decomposes_to_hilo_fn_p (ifn));
+
+      internal_fn lo, hi;
+      lookup_hilo_internal_fn (ifn, &lo, &hi);
+      *code1 = as_combined_fn (lo);
+      *code2 = as_combined_fn (hi);
+      optab1 = lookup_hilo_ifn_optab (lo, !TYPE_UNSIGNED (vectype));
+      optab2 = lookup_hilo_ifn_optab (hi, !TYPE_UNSIGNED (vectype));
     }
 
+  if (code.is_tree_code ())
+  {
+    if (code == FIX_TRUNC_EXPR)
+      {
+	/* The signedness is determined from output operand.  */
+	optab1 = optab_for_tree_code (c1.safe_as_tree_code (), vectype_out,
+				      optab_default);
+	optab2 = optab_for_tree_code (c2.safe_as_tree_code (), vectype_out,
+				      optab_default);
+      }
+    else if (CONVERT_EXPR_CODE_P (code.safe_as_tree_code ())
+	     && VECTOR_BOOLEAN_TYPE_P (wide_vectype)
+	     && VECTOR_BOOLEAN_TYPE_P (vectype)
+	     && TYPE_MODE (wide_vectype) == TYPE_MODE (vectype)
+	     && SCALAR_INT_MODE_P (TYPE_MODE (vectype)))
+      {
+	/* If the input and result modes are the same, a different optab
+	   is needed where we pass in the number of units in vectype.  */
+	optab1 = vec_unpacks_sbool_lo_optab;
+	optab2 = vec_unpacks_sbool_hi_optab;
+      }
+    else
+      {
+	optab1 = optab_for_tree_code (c1.safe_as_tree_code (), vectype,
+				      optab_default);
+	optab2 = optab_for_tree_code (c2.safe_as_tree_code (), vectype,
+				      optab_default);
+      }
+  }
+
   if (!optab1 || !optab2)
     return false;
 

[-- Attachment #4: ifn2.patch --]
[-- Type: text/plain, Size: 19234 bytes --]

diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
index 1a1b26b1c6c23ce273bcd08dc9a973f777174007..25b1558dcb941ea491a19aeeb2cd8f4d2dbdf7c6 100644
--- a/gcc/cfgexpand.cc
+++ b/gcc/cfgexpand.cc
@@ -5365,10 +5365,6 @@ expand_debug_expr (tree exp)
     case VEC_WIDEN_MULT_ODD_EXPR:
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
-    case VEC_WIDEN_PLUS_HI_EXPR:
-    case VEC_WIDEN_PLUS_LO_EXPR:
-    case VEC_WIDEN_MINUS_HI_EXPR:
-    case VEC_WIDEN_MINUS_LO_EXPR:
     case VEC_PERM_EXPR:
     case VEC_DUPLICATE_EXPR:
     case VEC_SERIES_EXPR:
@@ -5405,8 +5401,6 @@ expand_debug_expr (tree exp)
     case WIDEN_MULT_EXPR:
     case WIDEN_MULT_PLUS_EXPR:
     case WIDEN_MULT_MINUS_EXPR:
-    case WIDEN_PLUS_EXPR:
-    case WIDEN_MINUS_EXPR:
       if (SCALAR_INT_MODE_P (GET_MODE (op0))
 	  && SCALAR_INT_MODE_P (mode))
 	{
@@ -5419,10 +5413,6 @@ expand_debug_expr (tree exp)
 	    op1 = simplify_gen_unary (ZERO_EXTEND, mode, op1, inner_mode);
 	  else
 	    op1 = simplify_gen_unary (SIGN_EXTEND, mode, op1, inner_mode);
-	  if (TREE_CODE (exp) == WIDEN_PLUS_EXPR)
-	    return simplify_gen_binary (PLUS, mode, op0, op1);
-	  else if (TREE_CODE (exp) == WIDEN_MINUS_EXPR)
-	    return simplify_gen_binary (MINUS, mode, op0, op1);
 	  op0 = simplify_gen_binary (MULT, mode, op0, op1);
 	  if (TREE_CODE (exp) == WIDEN_MULT_EXPR)
 	    return op0;
diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi
index 2c14b7abce2db0a3da0a21e916907947cb56a265..3816abaaf4d364d604a44942317f96f3f303e5b6 100644
--- a/gcc/doc/generic.texi
+++ b/gcc/doc/generic.texi
@@ -1811,10 +1811,6 @@ a value from @code{enum annot_expr_kind}, the third is an @code{INTEGER_CST}.
 @tindex VEC_RSHIFT_EXPR
 @tindex VEC_WIDEN_MULT_HI_EXPR
 @tindex VEC_WIDEN_MULT_LO_EXPR
-@tindex VEC_WIDEN_PLUS_HI_EXPR
-@tindex VEC_WIDEN_PLUS_LO_EXPR
-@tindex VEC_WIDEN_MINUS_HI_EXPR
-@tindex VEC_WIDEN_MINUS_LO_EXPR
 @tindex VEC_UNPACK_HI_EXPR
 @tindex VEC_UNPACK_LO_EXPR
 @tindex VEC_UNPACK_FLOAT_HI_EXPR
@@ -1861,33 +1857,6 @@ vector of @code{N/2} products. In the case of @code{VEC_WIDEN_MULT_LO_EXPR} the
 low @code{N/2} elements of the two vector are multiplied to produce the
 vector of @code{N/2} products.
 
-@item VEC_WIDEN_PLUS_HI_EXPR
-@itemx VEC_WIDEN_PLUS_LO_EXPR
-These nodes represent widening vector addition of the high and low parts of
-the two input vectors, respectively.  Their operands are vectors that contain
-the same number of elements (@code{N}) of the same integral type. The result
-is a vector that contains half as many elements, of an integral type whose size
-is twice as wide.  In the case of @code{VEC_WIDEN_PLUS_HI_EXPR} the high
-@code{N/2} elements of the two vectors are added to produce the vector of
-@code{N/2} products.  In the case of @code{VEC_WIDEN_PLUS_LO_EXPR} the low
-@code{N/2} elements of the two vectors are added to produce the vector of
-@code{N/2} products.
-
-@item VEC_WIDEN_MINUS_HI_EXPR
-@itemx VEC_WIDEN_MINUS_LO_EXPR
-These nodes represent widening vector subtraction of the high and low parts of
-the two input vectors, respectively.  Their operands are vectors that contain
-the same number of elements (@code{N}) of the same integral type. The high/low
-elements of the second vector are subtracted from the high/low elements of the
-first. The result is a vector that contains half as many elements, of an
-integral type whose size is twice as wide.  In the case of
-@code{VEC_WIDEN_MINUS_HI_EXPR} the high @code{N/2} elements of the second
-vector are subtracted from the high @code{N/2} of the first to produce the
-vector of @code{N/2} products.  In the case of
-@code{VEC_WIDEN_MINUS_LO_EXPR} the low @code{N/2} elements of the second
-vector are subtracted from the low @code{N/2} of the first to produce the
-vector of @code{N/2} products.
-
 @item VEC_UNPACK_HI_EXPR
 @itemx VEC_UNPACK_LO_EXPR
 These nodes represent unpacking of the high and low parts of the input vector,
diff --git a/gcc/expr.cc b/gcc/expr.cc
index f8f5cc5a6ca67f291b3c8b7246d593c0be80272f..454d1391b19a7d2aa53f0a88876d1eaf0494de51 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -9601,8 +9601,6 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
 					  target, unsignedp);
       return target;
 
-    case WIDEN_PLUS_EXPR:
-    case WIDEN_MINUS_EXPR:
     case WIDEN_MULT_EXPR:
       /* If first operand is constant, swap them.
 	 Thus the following special case checks need only
@@ -10380,10 +10378,6 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
 	return temp;
       }
 
-    case VEC_WIDEN_PLUS_HI_EXPR:
-    case VEC_WIDEN_PLUS_LO_EXPR:
-    case VEC_WIDEN_MINUS_HI_EXPR:
-    case VEC_WIDEN_MINUS_LO_EXPR:
     case VEC_WIDEN_MULT_HI_EXPR:
     case VEC_WIDEN_MULT_LO_EXPR:
     case VEC_WIDEN_MULT_EVEN_EXPR:
diff --git a/gcc/gimple-pretty-print.cc b/gcc/gimple-pretty-print.cc
index 300e9d7ed1e7be73f30875e08c461a8880c3134e..d903826894e7f0dfd34dc0caad92eea3caa45e05 100644
--- a/gcc/gimple-pretty-print.cc
+++ b/gcc/gimple-pretty-print.cc
@@ -459,10 +459,6 @@ dump_binary_rhs (pretty_printer *buffer, const gassign *gs, int spc,
     case VEC_PACK_FLOAT_EXPR:
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
-    case VEC_WIDEN_PLUS_HI_EXPR:
-    case VEC_WIDEN_PLUS_LO_EXPR:
-    case VEC_WIDEN_MINUS_HI_EXPR:
-    case VEC_WIDEN_MINUS_LO_EXPR:
     case VEC_SERIES_EXPR:
       for (p = get_tree_code_name (code); *p; p++)
 	pp_character (buffer, TOUPPER (*p));
diff --git a/gcc/gimple-range-op.cc b/gcc/gimple-range-op.cc
index 4ca32a7b5d52f8426b09d1446a336650e143b41f..5ae7f7596c6fc6f901e4e47ae44f00185f4602b2 100644
--- a/gcc/gimple-range-op.cc
+++ b/gcc/gimple-range-op.cc
@@ -797,12 +797,6 @@ gimple_range_op_handler::maybe_non_standard ()
   if (gimple_code (m_stmt) == GIMPLE_ASSIGN)
     switch (gimple_assign_rhs_code (m_stmt))
       {
-	case WIDEN_PLUS_EXPR:
-	{
-	  signed_op = ptr_op_widen_plus_signed;
-	  unsigned_op = ptr_op_widen_plus_unsigned;
-	}
-	gcc_fallthrough ();
 	case WIDEN_MULT_EXPR:
 	{
 	  m_valid = false;
diff --git a/gcc/optabs-tree.cc b/gcc/optabs-tree.cc
index 8010046c6a8b3e809c989ddef7a06ddaa68ae32a..ee1aa8c9676ee9c67edbf403e6295da391826a62 100644
--- a/gcc/optabs-tree.cc
+++ b/gcc/optabs-tree.cc
@@ -190,22 +190,6 @@ optab_for_tree_code (enum tree_code code, const_tree type,
       return (TYPE_UNSIGNED (type)
 	      ? vec_widen_ushiftl_lo_optab : vec_widen_sshiftl_lo_optab);
 
-    case VEC_WIDEN_PLUS_LO_EXPR:
-      return (TYPE_UNSIGNED (type)
-	      ? vec_widen_uaddl_lo_optab : vec_widen_saddl_lo_optab);
-
-    case VEC_WIDEN_PLUS_HI_EXPR:
-      return (TYPE_UNSIGNED (type)
-	      ? vec_widen_uaddl_hi_optab : vec_widen_saddl_hi_optab);
-
-    case VEC_WIDEN_MINUS_LO_EXPR:
-      return (TYPE_UNSIGNED (type)
-	      ? vec_widen_usubl_lo_optab : vec_widen_ssubl_lo_optab);
-
-    case VEC_WIDEN_MINUS_HI_EXPR:
-      return (TYPE_UNSIGNED (type)
-	      ? vec_widen_usubl_hi_optab : vec_widen_ssubl_hi_optab);
-
     case VEC_UNPACK_HI_EXPR:
       return (TYPE_UNSIGNED (type)
 	      ? vec_unpacku_hi_optab : vec_unpacks_hi_optab);
@@ -312,8 +296,6 @@ optab_for_tree_code (enum tree_code code, const_tree type,
    'hi'/'lo' pair using codes such as VEC_WIDEN_MINUS_HI/LO.
 
    Supported widening operations:
-    WIDEN_MINUS_EXPR
-    WIDEN_PLUS_EXPR
     WIDEN_MULT_EXPR
     WIDEN_LSHIFT_EXPR
 
@@ -345,12 +327,6 @@ supportable_half_widening_operation (enum tree_code code, tree vectype_out,
     case WIDEN_LSHIFT_EXPR:
       *code1 = LSHIFT_EXPR;
       break;
-    case WIDEN_MINUS_EXPR:
-      *code1 = MINUS_EXPR;
-      break;
-    case WIDEN_PLUS_EXPR:
-      *code1 = PLUS_EXPR;
-      break;
     case WIDEN_MULT_EXPR:
       *code1 = MULT_EXPR;
       break;
diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
index a9fcc7fd050f871437ef336ecfb8d6cc81280ee0..f80cd1465df83b5540492e619e56b9af249e9f31 100644
--- a/gcc/tree-cfg.cc
+++ b/gcc/tree-cfg.cc
@@ -4017,8 +4017,6 @@ verify_gimple_assign_binary (gassign *stmt)
         return false;
       }
 
-    case WIDEN_PLUS_EXPR:
-    case WIDEN_MINUS_EXPR:
     case PLUS_EXPR:
     case MINUS_EXPR:
       {
@@ -4139,10 +4137,6 @@ verify_gimple_assign_binary (gassign *stmt)
         return false;
       }
 
-    case VEC_WIDEN_MINUS_HI_EXPR:
-    case VEC_WIDEN_MINUS_LO_EXPR:
-    case VEC_WIDEN_PLUS_HI_EXPR:
-    case VEC_WIDEN_PLUS_LO_EXPR:
     case VEC_WIDEN_MULT_HI_EXPR:
     case VEC_WIDEN_MULT_LO_EXPR:
     case VEC_WIDEN_MULT_EVEN_EXPR:
diff --git a/gcc/tree-inline.cc b/gcc/tree-inline.cc
index c702f0032a19203a7c536a01c1e7f47fc7b77add..6e5fd45a0c2435109dd3d50e8fc8e1d4969a1fd0 100644
--- a/gcc/tree-inline.cc
+++ b/gcc/tree-inline.cc
@@ -4273,8 +4273,6 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
 
     case REALIGN_LOAD_EXPR:
 
-    case WIDEN_PLUS_EXPR:
-    case WIDEN_MINUS_EXPR:
     case WIDEN_SUM_EXPR:
     case WIDEN_MULT_EXPR:
     case DOT_PROD_EXPR:
@@ -4283,10 +4281,6 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
     case WIDEN_MULT_MINUS_EXPR:
     case WIDEN_LSHIFT_EXPR:
 
-    case VEC_WIDEN_PLUS_HI_EXPR:
-    case VEC_WIDEN_PLUS_LO_EXPR:
-    case VEC_WIDEN_MINUS_HI_EXPR:
-    case VEC_WIDEN_MINUS_LO_EXPR:
     case VEC_WIDEN_MULT_HI_EXPR:
     case VEC_WIDEN_MULT_LO_EXPR:
     case VEC_WIDEN_MULT_EVEN_EXPR:
diff --git a/gcc/tree-pretty-print.cc b/gcc/tree-pretty-print.cc
index 7947f9647a15110b52d195643ad7d28ee32d4236..9941d8bf80535a98e647b8928619a6bf08bc434c 100644
--- a/gcc/tree-pretty-print.cc
+++ b/gcc/tree-pretty-print.cc
@@ -2874,8 +2874,6 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags,
       break;
 
       /* Binary arithmetic and logic expressions.  */
-    case WIDEN_PLUS_EXPR:
-    case WIDEN_MINUS_EXPR:
     case WIDEN_SUM_EXPR:
     case WIDEN_MULT_EXPR:
     case MULT_EXPR:
@@ -3831,10 +3829,6 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags,
     case VEC_SERIES_EXPR:
     case VEC_WIDEN_MULT_HI_EXPR:
     case VEC_WIDEN_MULT_LO_EXPR:
-    case VEC_WIDEN_PLUS_HI_EXPR:
-    case VEC_WIDEN_PLUS_LO_EXPR:
-    case VEC_WIDEN_MINUS_HI_EXPR:
-    case VEC_WIDEN_MINUS_LO_EXPR:
     case VEC_WIDEN_MULT_EVEN_EXPR:
     case VEC_WIDEN_MULT_ODD_EXPR:
     case VEC_WIDEN_LSHIFT_HI_EXPR:
@@ -4352,12 +4346,6 @@ op_symbol_code (enum tree_code code)
     case WIDEN_LSHIFT_EXPR:
       return "w<<";
 
-    case WIDEN_PLUS_EXPR:
-      return "w+";
-
-    case WIDEN_MINUS_EXPR:
-      return "w-";
-
     case POINTER_PLUS_EXPR:
       return "+";
 
diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
index 8daf7bd7dd34d043b1d7b4cba1779f0ecf9f520a..213a3899a6c145bb057cd118bec1df7a05728aef 100644
--- a/gcc/tree-vect-data-refs.cc
+++ b/gcc/tree-vect-data-refs.cc
@@ -136,8 +136,6 @@ vect_get_smallest_scalar_type (stmt_vec_info stmt_info, tree scalar_type)
 	  || gimple_assign_rhs_code (assign) == WIDEN_SUM_EXPR
 	  || gimple_assign_rhs_code (assign) == WIDEN_MULT_EXPR
 	  || gimple_assign_rhs_code (assign) == WIDEN_LSHIFT_EXPR
-	  || gimple_assign_rhs_code (assign) == WIDEN_PLUS_EXPR
-	  || gimple_assign_rhs_code (assign) == WIDEN_MINUS_EXPR
 	  || gimple_assign_rhs_code (assign) == FLOAT_EXPR)
 	{
 	  tree rhs_type = TREE_TYPE (gimple_assign_rhs1 (assign));
diff --git a/gcc/tree-vect-generic.cc b/gcc/tree-vect-generic.cc
index 445da53292e9d1d2db62ca962fc017bb0e6c9bbe..342ffc5fa7f3b8f37e6bd4658d2f1fccf1d2c7fa 100644
--- a/gcc/tree-vect-generic.cc
+++ b/gcc/tree-vect-generic.cc
@@ -2227,10 +2227,6 @@ expand_vector_operations_1 (gimple_stmt_iterator *gsi,
      arguments, not the widened result.  VEC_UNPACK_FLOAT_*_EXPR is
      calculated in the same way above.  */
   if (code == WIDEN_SUM_EXPR
-      || code == VEC_WIDEN_PLUS_HI_EXPR
-      || code == VEC_WIDEN_PLUS_LO_EXPR
-      || code == VEC_WIDEN_MINUS_HI_EXPR
-      || code == VEC_WIDEN_MINUS_LO_EXPR
       || code == VEC_WIDEN_MULT_HI_EXPR
       || code == VEC_WIDEN_MULT_LO_EXPR
       || code == VEC_WIDEN_MULT_EVEN_EXPR
diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index 1a514461b2ca416f45a5fa9abe417980d33ef4df..13c69133d7ae565cf0334390cb0c303c89f98ac8 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -561,21 +561,35 @@ vect_joust_widened_type (tree type, tree new_type, tree *common_type)
 
 static unsigned int
 vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code,
-		      tree_code widened_code, bool shift_p,
+		      code_helper widened_code, bool shift_p,
 		      unsigned int max_nops,
 		      vect_unpromoted_value *unprom, tree *common_type,
 		      enum optab_subtype *subtype = NULL)
 {
   /* Check for an integer operation with the right code.  */
-  gassign *assign = dyn_cast <gassign *> (stmt_info->stmt);
-  if (!assign)
+  gimple* stmt = stmt_info->stmt;
+  if (!(is_gimple_assign (stmt) || is_gimple_call (stmt)))
     return 0;
 
-  tree_code rhs_code = gimple_assign_rhs_code (assign);
-  if (rhs_code != code && rhs_code != widened_code)
+  code_helper rhs_code;
+  if (is_gimple_assign (stmt))
+    {
+      rhs_code = gimple_assign_rhs_code (stmt);
+      if (rhs_code.safe_as_tree_code () != code
+	  && rhs_code.get_rep () != widened_code.get_rep ())
+	return 0;
+    }
+  else if (is_gimple_call (stmt))
+    {
+      rhs_code = gimple_call_combined_fn (stmt);
+      if (rhs_code.get_rep () != widened_code.get_rep ())
+	return 0;
+    }
+  else
     return 0;
 
-  tree type = TREE_TYPE (gimple_assign_lhs (assign));
+  tree lhs = gimple_get_lhs (stmt);
+  tree type = TREE_TYPE (lhs);
   if (!INTEGRAL_TYPE_P (type))
     return 0;
 
@@ -588,7 +602,7 @@ vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code,
     {
       vect_unpromoted_value *this_unprom = &unprom[next_op];
       unsigned int nops = 1;
-      tree op = gimple_op (assign, i + 1);
+      tree op = gimple_arg (stmt, i);
       if (i == 1 && TREE_CODE (op) == INTEGER_CST)
 	{
 	  /* We already have a common type from earlier operands.
@@ -1342,8 +1356,9 @@ vect_recog_sad_pattern (vec_info *vinfo,
   /* FORNOW.  Can continue analyzing the def-use chain when this stmt in a phi
      inside the loop (in case we are analyzing an outer-loop).  */
   vect_unpromoted_value unprom[2];
-  if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR, WIDEN_MINUS_EXPR,
-			     false, 2, unprom, &half_type))
+  if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR,
+			     CFN_VEC_WIDEN_MINUS, false, 2, unprom,
+			     &half_type))
     return NULL;
 
   vect_pattern_detected ("vect_recog_sad_pattern", last_stmt);
@@ -2696,9 +2711,10 @@ vect_recog_average_pattern (vec_info *vinfo,
   internal_fn ifn = IFN_AVG_FLOOR;
   vect_unpromoted_value unprom[3];
   tree new_type;
+  enum optab_subtype subtype;
   unsigned int nops = vect_widened_op_tree (vinfo, plus_stmt_info, PLUS_EXPR,
-					    WIDEN_PLUS_EXPR, false, 3,
-					    unprom, &new_type);
+					    CFN_VEC_WIDEN_PLUS, false, 3,
+					    unprom, &new_type, &subtype);
   if (nops == 0)
     return NULL;
   if (nops == 3)
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index f4806073f48d4dedea3ac9bd855792b152d78919..38f4680d45ab80e8f86327327c13667d96bc5bea 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -5032,9 +5032,7 @@ vectorizable_conversion (vec_info *vinfo,
   else
     return false;
 
-  bool widen_arith = (code == WIDEN_PLUS_EXPR
-		 || code == WIDEN_MINUS_EXPR
-		 || code == WIDEN_MULT_EXPR
+  bool widen_arith = (code == WIDEN_MULT_EXPR
 		 || code == WIDEN_LSHIFT_EXPR
 		 || code == IFN_VEC_WIDEN_PLUS
 		 || code == IFN_VEC_WIDEN_MINUS);
@@ -5086,8 +5084,6 @@ vectorizable_conversion (vec_info *vinfo,
     {
       gcc_assert (code == WIDEN_MULT_EXPR
 		  || code == WIDEN_LSHIFT_EXPR
-		  || code == WIDEN_PLUS_EXPR
-		  || code == WIDEN_MINUS_EXPR
 		  || code == IFN_VEC_WIDEN_PLUS
 		  || code == IFN_VEC_WIDEN_MINUS);
 
@@ -12211,7 +12207,7 @@ supportable_widening_operation (vec_info *vinfo,
   class loop *vect_loop = NULL;
   machine_mode vec_mode;
   enum insn_code icode1, icode2;
-  optab optab1, optab2;
+  optab optab1 = unknown_optab, optab2 = unknown_optab;
   tree vectype = vectype_in;
   tree wide_vectype = vectype_out;
   code_helper c1 = MAX_TREE_CODES, c2 = MAX_TREE_CODES;
@@ -12305,16 +12301,6 @@ supportable_widening_operation (vec_info *vinfo,
       c2 = VEC_WIDEN_LSHIFT_HI_EXPR;
       break;
 
-    case WIDEN_PLUS_EXPR:
-      c1 = VEC_WIDEN_PLUS_LO_EXPR;
-      c2 = VEC_WIDEN_PLUS_HI_EXPR;
-      break;
-
-    case WIDEN_MINUS_EXPR:
-      c1 = VEC_WIDEN_MINUS_LO_EXPR;
-      c2 = VEC_WIDEN_MINUS_HI_EXPR;
-      break;
-
     CASE_CONVERT:
       c1 = VEC_UNPACK_LO_EXPR;
       c2 = VEC_UNPACK_HI_EXPR;
diff --git a/gcc/tree.def b/gcc/tree.def
index ee02754354f015a16737c7e879d89c3e3be0d5aa..a58e608a90078818a7ade9d1173ac7ec84c48c7a 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -1374,15 +1374,16 @@ DEFTREECODE (DOT_PROD_EXPR, "dot_prod_expr", tcc_expression, 3)
 DEFTREECODE (WIDEN_SUM_EXPR, "widen_sum_expr", tcc_binary, 2)
 
 /* Widening sad (sum of absolute differences).
-   The first two arguments are of type t1 which should be integer.
-   The third argument and the result are of type t2, such that t2 is at least
-   twice the size of t1.  Like DOT_PROD_EXPR, SAD_EXPR (arg1,arg2,arg3) is
+   The first two arguments are of type t1 which should be a vector of integers.
+   The third argument and the result are of type t2, such that the size of
+   the elements of t2 is at least twice the size of the elements of t1.
+   Like DOT_PROD_EXPR, SAD_EXPR (arg1,arg2,arg3) is
    equivalent to:
-       tmp = WIDEN_MINUS_EXPR (arg1, arg2)
+       tmp = IFN_VEC_WIDEN_MINUS_EXPR (arg1, arg2)
        tmp2 = ABS_EXPR (tmp)
        arg3 = PLUS_EXPR (tmp2, arg3)
   or:
-       tmp = WIDEN_MINUS_EXPR (arg1, arg2)
+       tmp = IFN_VEC_WIDEN_MINUS_EXPR (arg1, arg2)
        tmp2 = ABS_EXPR (tmp)
        arg3 = WIDEN_SUM_EXPR (tmp2, arg3)
  */
@@ -1421,8 +1422,6 @@ DEFTREECODE (WIDEN_MULT_MINUS_EXPR, "widen_mult_minus_expr", tcc_expression, 3)
    the first argument from type t1 to type t2, and then shifting it
    by the second argument.  */
 DEFTREECODE (WIDEN_LSHIFT_EXPR, "widen_lshift_expr", tcc_binary, 2)
-DEFTREECODE (WIDEN_PLUS_EXPR, "widen_plus_expr", tcc_binary, 2)
-DEFTREECODE (WIDEN_MINUS_EXPR, "widen_minus_expr", tcc_binary, 2)
 
 /* Widening vector multiplication.
    The two operands are vectors with N elements of size S. Multiplying the
@@ -1487,10 +1486,6 @@ DEFTREECODE (VEC_PACK_FLOAT_EXPR, "vec_pack_float_expr", tcc_binary, 2)
  */
 DEFTREECODE (VEC_WIDEN_LSHIFT_HI_EXPR, "widen_lshift_hi_expr", tcc_binary, 2)
 DEFTREECODE (VEC_WIDEN_LSHIFT_LO_EXPR, "widen_lshift_lo_expr", tcc_binary, 2)
-DEFTREECODE (VEC_WIDEN_PLUS_HI_EXPR, "widen_plus_hi_expr", tcc_binary, 2)
-DEFTREECODE (VEC_WIDEN_PLUS_LO_EXPR, "widen_plus_lo_expr", tcc_binary, 2)
-DEFTREECODE (VEC_WIDEN_MINUS_HI_EXPR, "widen_minus_hi_expr", tcc_binary, 2)
-DEFTREECODE (VEC_WIDEN_MINUS_LO_EXPR, "widen_minus_lo_expr", tcc_binary, 2)
 
 /* PREDICT_EXPR.  Specify hint for branch prediction.  The
    PREDICT_EXPR_PREDICTOR specify predictor and PREDICT_EXPR_OUTCOME the


* Re: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns
  2023-04-20 13:23                       ` Andre Vieira (lists)
@ 2023-04-24 11:57                         ` Richard Biener
  2023-04-24 13:01                           ` Richard Sandiford
  2023-04-25  9:55                           ` Andre Vieira (lists)
  0 siblings, 2 replies; 53+ messages in thread
From: Richard Biener @ 2023-04-24 11:57 UTC (permalink / raw)
  To: Andre Vieira (lists); +Cc: Richard Biener, Richard Sandiford, gcc-patches

On Thu, Apr 20, 2023 at 3:24 PM Andre Vieira (lists) via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Rebased all three patches and made some small changes to the second one:
> - removed sub and abd optabs from commutative_optab_p, I suspect this
> was a copy paste mistake,
> - removed what I believe to be a superfluous switch case in vectorizable
> conversion, the one that was here:
> +  if (code.is_fn_code ())
> +     {
> +      internal_fn ifn = as_internal_fn (code.as_fn_code ());
> +      int ecf_flags = internal_fn_flags (ifn);
> +      gcc_assert (ecf_flags & ECF_MULTI);
> +
> +      switch (code.as_fn_code ())
> +       {
> +       case CFN_VEC_WIDEN_PLUS:
> +         break;
> +       case CFN_VEC_WIDEN_MINUS:
> +         break;
> +       case CFN_LAST:
> +       default:
> +         return false;
> +       }
> +
> +      internal_fn lo, hi;
> +      lookup_multi_internal_fn (ifn, &lo, &hi);
> +      *code1 = as_combined_fn (lo);
> +      *code2 = as_combined_fn (hi);
> +      optab1 = lookup_multi_ifn_optab (lo, !TYPE_UNSIGNED (vectype));
> +      optab2 = lookup_multi_ifn_optab (hi, !TYPE_UNSIGNED (vectype));
>       }
>
> I don't think we need to check they are a specific fn code, as we look-up
> optabs and if they succeed then surely we can vectorize?
>
> OK for trunk?

In the first patch I see some uses of safe_as_tree_code like

+  if (ch.is_tree_code ())
+    return op1 == NULL_TREE ? gimple_build_assign (lhs, ch.safe_as_tree_code (),
+                                                  op0) :
+                             gimple_build_assign (lhs, ch.safe_as_tree_code (),
+                                                  op0, op1);
+  else
+  {
+    internal_fn fn = as_internal_fn (ch.safe_as_fn_code ());
+    gimple* stmt;

where the context actually requires a valid tree code.  Please change those
to force to tree code / ifn code.  Just use explicit casts here and the other
places that are similar.  Before the as_internal_fn just put a
gcc_assert (ch.is_internal_fn ()).

Maybe the need for the (ugly) safe_as_tree_code/fn_code goes away then.
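
As a rough illustration of the request above (explicit casts plus an
assert, instead of the "safe" conversions), here is a minimal
standalone sketch.  It is not GCC code: the enums and the code_helper
class are simplified stand-ins, describe_stmt is a hypothetical helper,
and the "positive rep is a tree code, negative rep is a function"
convention is an assumption made for the example.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for GCC's enums; the real ones are far larger.
enum tree_code { ERROR_MARK = 0, PLUS_EXPR, MINUS_EXPR, MAX_TREE_CODES };
enum internal_fn { IFN_VEC_WIDEN_PLUS = 1, IFN_VEC_WIDEN_MINUS, IFN_LAST };

// Simplified model of a code_helper-like wrapper (assumption: a positive
// rep encodes a tree code, a negative rep encodes an internal function).
class code_helper
{
public:
  code_helper (tree_code code) : rep ((int) code) {}
  code_helper (internal_fn fn) : rep (-(int) fn) {}
  bool is_tree_code () const { return rep > 0; }
  bool is_internal_fn () const { return rep < 0; }
  explicit operator tree_code () const { return (tree_code) rep; }
  explicit operator internal_fn () const { return (internal_fn) -rep; }
private:
  int rep;
};

// Hypothetical helper: report which kind of statement would be built.
// Each branch forces the code kind with an explicit cast; the function
// branch asserts the kind first rather than silently falling back.
std::string
describe_stmt (code_helper ch)
{
  if (ch.is_tree_code ())
    return "assign:" + std::to_string ((int) static_cast<tree_code> (ch));
  assert (ch.is_internal_fn ());
  return "call:" + std::to_string ((int) static_cast<internal_fn> (ch));
}
```

With this shape a wrong-kind code trips the assert instead of being
quietly mapped to a sentinel value.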

Otherwise patch1 looks OK.

Unfortunately there are no ChangeLog / patch descriptions on the changes.
patch2 has

-  tree_code rhs_code = gimple_assign_rhs_code (assign);
-  if (rhs_code != code && rhs_code != widened_code)
+  code_helper rhs_code;
+  if (is_gimple_assign (stmt))
+    {
+      rhs_code = gimple_assign_rhs_code (stmt);
+      if (rhs_code.safe_as_tree_code () != code
+         && rhs_code.get_rep () != widened_code.get_rep ())
+       return 0;
+    }
+  else if (is_gimple_call (stmt))
+    {
+      rhs_code = gimple_call_combined_fn (stmt);
+      if (rhs_code.get_rep () != widened_code.get_rep ())
+       return 0;
+    }

that looks mighty complicated - esp. the use of get_rep ()
looks dangerous?  What's the intent of this?  Not that I
understand the existing code much.  A comment would
clearly help (also indicating test coverage).
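
One possible reading of the intent, sketched standalone: the statement
should match if its operation is either the plain code or the widened
code, whichever kind each happens to be.  The sketch below is not GCC
code.  The enums, the code_helper model and matches_widened_op are
hypothetical, and it assumes that tree codes and function codes never
share a representation (positive versus negative rep), so a single
equality test covers both the assignment and the call case without
get_rep ().

```cpp
#include <cassert>

// Hypothetical stand-ins for GCC's enums.
enum tree_code { ERROR_MARK = 0, PLUS_EXPR, MINUS_EXPR, MAX_TREE_CODES };
enum combined_fn { CFN_VEC_WIDEN_PLUS = 1, CFN_VEC_WIDEN_MINUS, CFN_LAST };

// Simplified model (assumption): positive rep = tree code, negative
// rep = function code, so the two kinds can never compare equal.
class code_helper
{
public:
  code_helper (tree_code code) : rep ((int) code) {}
  code_helper (combined_fn fn) : rep (-(int) fn) {}
  bool operator== (const code_helper &other) const { return rep == other.rep; }
  bool operator!= (const code_helper &other) const { return rep != other.rep; }
private:
  int rep;
};

// Hypothetical helper: true if RHS_CODE is either the plain operation
// CODE or its widened form WIDENED_CODE.
bool
matches_widened_op (code_helper rhs_code, tree_code code,
		    code_helper widened_code)
{
  return rhs_code == code || rhs_code == widened_code;
}
```

Note that PLUS_EXPR and CFN_VEC_WIDEN_PLUS share the numeric value 1 in
this model yet still compare unequal, which is the property the
comparison relies on.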



> Kind regards,
> Andre


* Re: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns
  2023-04-24 11:57                         ` Richard Biener
@ 2023-04-24 13:01                           ` Richard Sandiford
  2023-04-25 12:30                             ` Richard Biener
  2023-04-25  9:55                           ` Andre Vieira (lists)
  1 sibling, 1 reply; 53+ messages in thread
From: Richard Sandiford @ 2023-04-24 13:01 UTC (permalink / raw)
  To: Richard Biener; +Cc: Andre Vieira (lists), Richard Biener, gcc-patches

Richard Biener <richard.guenther@gmail.com> writes:
> On Thu, Apr 20, 2023 at 3:24 PM Andre Vieira (lists) via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>>
>> Rebased all three patches and made some small changes to the second one:
>> - removed sub and abd optabs from commutative_optab_p, I suspect this
>> was a copy paste mistake,
>> - removed what I believe to be a superfluous switch case in vectorizable
>> conversion, the one that was here:
>> +  if (code.is_fn_code ())
>> +     {
>> +      internal_fn ifn = as_internal_fn (code.as_fn_code ());
>> +      int ecf_flags = internal_fn_flags (ifn);
>> +      gcc_assert (ecf_flags & ECF_MULTI);
>> +
>> +      switch (code.as_fn_code ())
>> +       {
>> +       case CFN_VEC_WIDEN_PLUS:
>> +         break;
>> +       case CFN_VEC_WIDEN_MINUS:
>> +         break;
>> +       case CFN_LAST:
>> +       default:
>> +         return false;
>> +       }
>> +
>> +      internal_fn lo, hi;
>> +      lookup_multi_internal_fn (ifn, &lo, &hi);
>> +      *code1 = as_combined_fn (lo);
>> +      *code2 = as_combined_fn (hi);
>> +      optab1 = lookup_multi_ifn_optab (lo, !TYPE_UNSIGNED (vectype));
>> +      optab2 = lookup_multi_ifn_optab (hi, !TYPE_UNSIGNED (vectype));
>>       }
>>
>> I don't think we need to check they are a specific fn code, as we look up
>> optabs and if they succeed then surely we can vectorize?
>>
>> OK for trunk?
>
> In the first patch I see some uses of safe_as_tree_code like
>
> +  if (ch.is_tree_code ())
> +    return op1 == NULL_TREE ? gimple_build_assign (lhs,
> ch.safe_as_tree_code (),
> +                                                  op0) :
> +                             gimple_build_assign (lhs, ch.safe_as_tree_code (),
> +                                                  op0, op1);
> +  else
> +  {
> +    internal_fn fn = as_internal_fn (ch.safe_as_fn_code ());
> +    gimple* stmt;
>
> where the context actually requires a valid tree code.  Please change those
> to force to tree code / ifn code.  Just use explicit casts here and the other
> places that are similar.  Before the as_internal_fn just put a
> gcc_assert (ch.is_internal_fn ()).

Also, doesn't the above ?: simplify to the "else" arm?  Null trailing
arguments would be ignored for unary operators.

I wasn't sure what to make of the op0 handling:

> +/* Build a GIMPLE_ASSIGN or GIMPLE_CALL with the tree_code,
> +   or internal_fn contained in ch, respectively.  */
> +gimple *
> +vect_gimple_build (tree lhs, code_helper ch, tree op0, tree op1)
> +{
> +  if (op0 == NULL_TREE)
> +    return NULL;

Can that happen, and if so, does returning null make sense?
Maybe an assert would be safer.
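
A toy illustration of the two points above, written against stand-in
types rather than real GCC ones (toy_operand, toy_stmt and toy_build are
hypothetical names, not GCC API): a single code path can serve both
unary and binary operations if a null trailing operand is simply
ignored, and a missing first operand is better caught by an assertion
than reported through a null return.

```cpp
#include <cassert>

// Toy stand-ins for GCC types; none of this is the real GCC API.
struct toy_operand { int value; };
struct toy_stmt { int num_ops; };

// Hypothetical builder: op0 is mandatory, a null op1 means "unary".
toy_stmt
toy_build (const toy_operand *op0, const toy_operand *op1)
{
  // Fail loudly on a missing first operand instead of returning a null
  // statement that every caller would then have to check for.
  assert (op0 != nullptr);
  // One code path for both arities: a null trailing operand is ignored.
  return toy_stmt { op1 == nullptr ? 1 : 2 };
}
```

Whether returning null is ever legitimate depends on the callers of the
real function; the sketch only shows the shape of the assert-based
alternative.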

Thanks,
Richard

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns
  2023-04-24 11:57                         ` Richard Biener
  2023-04-24 13:01                           ` Richard Sandiford
@ 2023-04-25  9:55                           ` Andre Vieira (lists)
  2023-04-28 12:36                             ` [PATCH 1/3] Refactor to allow internal_fn's Andre Vieira (lists)
                                               ` (2 more replies)
  1 sibling, 3 replies; 53+ messages in thread
From: Andre Vieira (lists) @ 2023-04-25  9:55 UTC (permalink / raw)
  To: Richard Biener; +Cc: Richard Biener, Richard Sandiford, gcc-patches



On 24/04/2023 12:57, Richard Biener wrote:
> On Thu, Apr 20, 2023 at 3:24 PM Andre Vieira (lists) via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>>
>> Rebased all three patches and made some small changes to the second one:
>> - removed sub and abd optabs from commutative_optab_p, I suspect this
>> was a copy paste mistake,
>> - removed what I believe to be a superfluous switch case in vectorizable
>> conversion, the one that was here:
>> +  if (code.is_fn_code ())
>> +     {
>> +      internal_fn ifn = as_internal_fn (code.as_fn_code ());
>> +      int ecf_flags = internal_fn_flags (ifn);
>> +      gcc_assert (ecf_flags & ECF_MULTI);
>> +
>> +      switch (code.as_fn_code ())
>> +       {
>> +       case CFN_VEC_WIDEN_PLUS:
>> +         break;
>> +       case CFN_VEC_WIDEN_MINUS:
>> +         break;
>> +       case CFN_LAST:
>> +       default:
>> +         return false;
>> +       }
>> +
>> +      internal_fn lo, hi;
>> +      lookup_multi_internal_fn (ifn, &lo, &hi);
>> +      *code1 = as_combined_fn (lo);
>> +      *code2 = as_combined_fn (hi);
>> +      optab1 = lookup_multi_ifn_optab (lo, !TYPE_UNSIGNED (vectype));
>> +      optab2 = lookup_multi_ifn_optab (hi, !TYPE_UNSIGNED (vectype));
>>        }
>>
>> I don't think we need to check they are a specific fn code, as we look up
>> optabs and if they succeed then surely we can vectorize?
>>
>> OK for trunk?
> 
> In the first patch I see some uses of safe_as_tree_code like
> 
> +  if (ch.is_tree_code ())
> +    return op1 == NULL_TREE ? gimple_build_assign (lhs,
> ch.safe_as_tree_code (),
> +                                                  op0) :
> +                             gimple_build_assign (lhs, ch.safe_as_tree_code (),
> +                                                  op0, op1);
> +  else
> +  {
> +    internal_fn fn = as_internal_fn (ch.safe_as_fn_code ());
> +    gimple* stmt;
> 
> where the context actually requires a valid tree code.  Please change those
> to force to tree code / ifn code.  Just use explicit casts here and the other
> places that are similar.  Before the as_internal_fn just put a
> gcc_assert (ch.is_internal_fn ()).
> 
> Maybe the need for the (ugly) safe_as_tree_code/fn_code goes away then.
> 
> Otherwise patch1 looks OK.
> 
> Unfortunately there are no ChangeLog / patch descriptions on the changes.
> patch2 has
> 
> -  tree_code rhs_code = gimple_assign_rhs_code (assign);
> -  if (rhs_code != code && rhs_code != widened_code)
> +  code_helper rhs_code;
> +  if (is_gimple_assign (stmt))
> +    {
> +      rhs_code = gimple_assign_rhs_code (stmt);
> +      if (rhs_code.safe_as_tree_code () != code
> +         && rhs_code.get_rep () != widened_code.get_rep ())
> +       return 0;
> +    }
> +  else if (is_gimple_call (stmt))
> +    {
> +      rhs_code = gimple_call_combined_fn (stmt);
> +      if (rhs_code.get_rep () != widened_code.get_rep ())
> +       return 0;
> +    }
> 
> that looks mighty complicated - esp. the use of get_rep ()
> looks dangerous?  What's the intent of this?  Not that I
> understand the existing code much.  A comment would
> clearly help (also indicating test coverage).

I don't think the use of get_rep here is dangerous; it's meant to avoid
having to check whether widened_code is the same 'kind' as rhs_code.
With get_rep we don't have to do that check first, because tree_codes
have positive reps and combined_fns negative reps. Having said that,
this can all be simplified and we don't need get_rep at all: the ==
operator is overloaded to compare reps and will even apply the
constructor to its rhs. So I suggest moving the check to after the
assignment of rhs_code and just doing:
if (rhs_code != code
     && rhs_code != widened_code)
   return 0;
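
A minimal, self-contained sketch of the representation described above,
assuming a simplified code_helper (the enum values and the matches
function are made up for illustration; the real class is declared in
GCC's headers and differs in detail). It shows why a tree code and a
combined function never compare equal even when their enum values
coincide, and why the simplified check needs no explicit get_rep calls.

```cpp
#include <cassert>

// Toy enums; the values are arbitrary and chosen so that PLUS_EXPR and
// CFN_VEC_WIDEN_PLUS share the same underlying number.
enum tree_code { ERROR_MARK, PLUS_EXPR, MINUS_EXPR, MAX_TREE_CODES };
enum combined_fn { CFN_VEC_WIDEN_PLUS = 1, CFN_VEC_WIDEN_MINUS, CFN_LAST };

// Simplified model: tree codes are stored as positive reps, combined
// functions as negative reps.
class code_helper
{
public:
  code_helper (tree_code code) : rep (static_cast<int> (code)) {}
  code_helper (combined_fn fn) : rep (-static_cast<int> (fn)) {}
  bool is_tree_code () const { return rep > 0; }
  bool is_fn_code () const { return rep < 0; }
  int get_rep () const { return rep; }
  // Comparisons work on the rep, so the 'kind' is compared implicitly.
  bool operator== (const code_helper &other) const { return rep == other.rep; }
  bool operator!= (const code_helper &other) const { return rep != other.rep; }
private:
  int rep;
};

// The simplified check: accept a statement whose code is either the
// plain code or the widened code.  The tree_code argument is converted
// to a code_helper by the implicit constructor.
bool
matches (code_helper rhs_code, tree_code code, code_helper widened_code)
{
  if (rhs_code != code && rhs_code != widened_code)
    return false;
  return true;
}
```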

> 
> 
>> Kind regards,
>> Andre

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns
  2023-04-24 13:01                           ` Richard Sandiford
@ 2023-04-25 12:30                             ` Richard Biener
  2023-04-28 16:06                               ` Andre Vieira (lists)
  0 siblings, 1 reply; 53+ messages in thread
From: Richard Biener @ 2023-04-25 12:30 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: Richard Biener, Andre Vieira (lists), gcc-patches

On Mon, 24 Apr 2023, Richard Sandiford wrote:

> Richard Biener <richard.guenther@gmail.com> writes:
> > On Thu, Apr 20, 2023 at 3:24 PM Andre Vieira (lists) via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> >>
> >> Rebased all three patches and made some small changes to the second one:
> >> - removed sub and abd optabs from commutative_optab_p, I suspect this
> >> was a copy paste mistake,
> >> - removed what I believe to be a superfluous switch case in vectorizable
> >> conversion, the one that was here:
> >> +  if (code.is_fn_code ())
> >> +     {
> >> +      internal_fn ifn = as_internal_fn (code.as_fn_code ());
> >> +      int ecf_flags = internal_fn_flags (ifn);
> >> +      gcc_assert (ecf_flags & ECF_MULTI);
> >> +
> >> +      switch (code.as_fn_code ())
> >> +       {
> >> +       case CFN_VEC_WIDEN_PLUS:
> >> +         break;
> >> +       case CFN_VEC_WIDEN_MINUS:
> >> +         break;
> >> +       case CFN_LAST:
> >> +       default:
> >> +         return false;
> >> +       }
> >> +
> >> +      internal_fn lo, hi;
> >> +      lookup_multi_internal_fn (ifn, &lo, &hi);
> >> +      *code1 = as_combined_fn (lo);
> >> +      *code2 = as_combined_fn (hi);
> >> +      optab1 = lookup_multi_ifn_optab (lo, !TYPE_UNSIGNED (vectype));
> >> +      optab2 = lookup_multi_ifn_optab (hi, !TYPE_UNSIGNED (vectype));
> >>       }
> >>
> >> I don't think we need to check they are a specific fn code, as we look up
> >> optabs and if they succeed then surely we can vectorize?
> >>
> >> OK for trunk?
> >
> > In the first patch I see some uses of safe_as_tree_code like
> >
> > +  if (ch.is_tree_code ())
> > +    return op1 == NULL_TREE ? gimple_build_assign (lhs,
> > ch.safe_as_tree_code (),
> > +                                                  op0) :
> > +                             gimple_build_assign (lhs, ch.safe_as_tree_code (),
> > +                                                  op0, op1);
> > +  else
> > +  {
> > +    internal_fn fn = as_internal_fn (ch.safe_as_fn_code ());
> > +    gimple* stmt;
> >
> > where the context actually requires a valid tree code.  Please change those
> > to force to tree code / ifn code.  Just use explicit casts here and the other
> > places that are similar.  Before the as_internal_fn just put a
> > gcc_assert (ch.is_internal_fn ()).
> 
> Also, doesn't the above ?: simplify to the "else" arm?  Null trailing
> arguments would be ignored for unary operators.
> 
> I wasn't sure what to make of the op0 handling:
> 
> > +/* Build a GIMPLE_ASSIGN or GIMPLE_CALL with the tree_code,
> > +   or internal_fn contained in ch, respectively.  */
> > +gimple *
> > +vect_gimple_build (tree lhs, code_helper ch, tree op0, tree op1)
> > +{
> > +  if (op0 == NULL_TREE)
> > +    return NULL;
> 
> Can that happen, and if so, does returning null make sense?
> Maybe an assert would be safer.

Yeah, I was hoping to look at whether the new gimple_build overloads
could be used to make this all better (but hoped we can finally get
this series in, in some way).

Richard.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [PATCH 1/3] Refactor to allow internal_fn's
  2023-04-25  9:55                           ` Andre Vieira (lists)
@ 2023-04-28 12:36                             ` Andre Vieira (lists)
  2023-05-03 11:55                               ` Richard Biener
  2023-04-28 12:37                             ` [PATCH 2/3] Refactor widen_plus as internal_fn Andre Vieira (lists)
  2023-04-28 12:37                             ` [PATCH 3/3] Remove widen_plus/minus_expr tree codes Andre Vieira (lists)
  2 siblings, 1 reply; 53+ messages in thread
From: Andre Vieira (lists) @ 2023-04-28 12:36 UTC (permalink / raw)
  To: Richard Biener; +Cc: Richard Biener, Richard Sandiford, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1769 bytes --]

Hi,

I'm posting the patches separately now with ChangeLogs.

I made the suggested changes and tried to simplify the code a bit
further. Within tree-vect-stmts I changed most internal functions to
take a code_helper, to avoid checks in places that don't need them. I
also tried to simplify things further by modifying
supportable_half_widening_operation and supportable_convert_operation,
but that just moved the casts to tree code inside them rather than
leaving them at the call sites, and it didn't look any simpler, so I
left those alone. If we did make those changes, though, we would no
longer need to keep the tc1 variable around in
vectorizable_conversion... Let me know what you think.
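
For readers who want the tc1 trade-off in concrete form, here is a toy
model (simplified stand-ins only; toy_supportable_convert and
toy_pick_conversion are hypothetical names, not functions from the
patch). A helper that still reports its result through a tree_code
out-parameter forces a code_helper-based caller to cast on the way in
and to go through a temporary tree_code on the way out.

```cpp
#include <cassert>

// Toy stand-ins; the real enums and code_helper live in GCC's headers.
enum tree_code { ERROR_MARK, NOP_EXPR, FLOAT_EXPR, PLUS_EXPR };
enum combined_fn { CFN_VEC_WIDEN_PLUS = 1 };

class code_helper
{
public:
  code_helper () : rep (0) {}
  code_helper (tree_code code) : rep (static_cast<int> (code)) {}
  code_helper (combined_fn fn) : rep (-static_cast<int> (fn)) {}
  bool is_tree_code () const { return rep > 0; }
  explicit operator tree_code () const { return static_cast<tree_code> (rep); }
  bool operator== (const code_helper &other) const { return rep == other.rep; }
private:
  int rep;
};

// Hypothetical helper that keeps a tree_code-only interface.
bool
toy_supportable_convert (tree_code code, tree_code *code1)
{
  if (code != NOP_EXPR && code != FLOAT_EXPR)
    return false;
  *code1 = code;
  return true;
}

// Caller written in terms of code_helper: reject non-tree codes, cast
// explicitly for the call, then copy the temporary into the result.
bool
toy_pick_conversion (code_helper code, code_helper *code1)
{
  if (!code.is_tree_code ())
    return false;
  tree_code tc1 = ERROR_MARK;
  if (!toy_supportable_convert ((tree_code) code, &tc1))
    return false;
  *code1 = tc1;
  return true;
}
```

Moving the cast and the temporary into the helper would remove tc1 from
the caller but not from the code base, which is the trade-off described
above.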

gcc/ChangeLog:

2023-04-28  Andre Vieira  <andre.simoesdiasvieira@arm.com>
             Joel Hutton  <joel.hutton@arm.com>

         * tree-vect-patterns.cc (vect_gimple_build): New function.
         (vect_recog_widen_op_pattern): Refactor to use code_helper.
         * tree-vect-stmts.cc (vect_gen_widened_results_half): Likewise.
         (vect_create_vectorized_demotion_stmts): Likewise.
         (vect_create_vectorized_promotion_stmts): Likewise.
         (vect_create_half_widening_stmts): Likewise.
         (vectorizable_conversion): Likewise.
         (vectorizable_call): Likewise.
         (supportable_widening_operation): Likewise.
         (supportable_narrowing_operation): Likewise.
         (simple_integer_narrowing): Likewise.
         * tree-vectorizer.h (supportable_widening_operation): Likewise.
         (supportable_narrowing_operation): Likewise.
         (vect_gimple_build): New function prototype.
         * tree.h (code_helper::safe_as_tree_code): New function.
         (code_helper::safe_as_fn_code): New function.

[-- Attachment #2: ifn0_v2.patch --]
[-- Type: text/plain, Size: 22695 bytes --]

diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index 8802141cd6edb298866025b8a55843eae1f0eb17..b35023adade94c1996cd076c4b7419560e819c6b 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -25,6 +25,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "rtl.h"
 #include "tree.h"
 #include "gimple.h"
+#include "gimple-iterator.h"
+#include "gimple-fold.h"
 #include "ssa.h"
 #include "expmed.h"
 #include "optabs-tree.h"
@@ -1391,7 +1393,7 @@ vect_recog_sad_pattern (vec_info *vinfo,
 static gimple *
 vect_recog_widen_op_pattern (vec_info *vinfo,
 			     stmt_vec_info last_stmt_info, tree *type_out,
-			     tree_code orig_code, tree_code wide_code,
+			     tree_code orig_code, code_helper wide_code,
 			     bool shift_p, const char *name)
 {
   gimple *last_stmt = last_stmt_info->stmt;
@@ -1434,7 +1436,7 @@ vect_recog_widen_op_pattern (vec_info *vinfo,
       vecctype = get_vectype_for_scalar_type (vinfo, ctype);
     }
 
-  enum tree_code dummy_code;
+  code_helper dummy_code;
   int dummy_int;
   auto_vec<tree> dummy_vec;
   if (!vectype
@@ -1455,8 +1457,7 @@ vect_recog_widen_op_pattern (vec_info *vinfo,
 		       2, oprnd, half_type, unprom, vectype);
 
   tree var = vect_recog_temp_ssa_var (itype, NULL);
-  gimple *pattern_stmt = gimple_build_assign (var, wide_code,
-					      oprnd[0], oprnd[1]);
+  gimple *pattern_stmt = vect_gimple_build (var, wide_code, oprnd[0], oprnd[1]);
 
   if (vecctype != vecitype)
     pattern_stmt = vect_convert_output (vinfo, last_stmt_info, ctype,
@@ -6406,3 +6407,20 @@ vect_pattern_recog (vec_info *vinfo)
   /* After this no more add_stmt calls are allowed.  */
   vinfo->stmt_vec_info_ro = true;
 }
+
+/* Build a GIMPLE_ASSIGN or GIMPLE_CALL with the tree_code,
+   or internal_fn contained in ch, respectively.  */
+gimple *
+vect_gimple_build (tree lhs, code_helper ch, tree op0, tree op1)
+{
+  gcc_assert (op0 != NULL_TREE);
+  if (ch.is_tree_code ())
+    return gimple_build_assign (lhs, (tree_code) ch, op0, op1);
+
+  gcc_assert (ch.is_internal_fn ());
+  gimple* stmt = gimple_build_call_internal (as_internal_fn ((combined_fn) ch),
+					     op1 == NULL_TREE ? 1 : 2,
+					     op0, op1);
+  gimple_call_set_lhs (stmt, lhs);
+  return stmt;
+}
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 6b7dbfd4a231baec24e740ffe0ce0b0bf7a1de6b..ce47f4940fa9a1baca4ba1162065cfc3b4072eba 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -3258,13 +3258,13 @@ vectorizable_bswap (vec_info *vinfo,
 
 static bool
 simple_integer_narrowing (tree vectype_out, tree vectype_in,
-			  tree_code *convert_code)
+			  code_helper *convert_code)
 {
   if (!INTEGRAL_TYPE_P (TREE_TYPE (vectype_out))
       || !INTEGRAL_TYPE_P (TREE_TYPE (vectype_in)))
     return false;
 
-  tree_code code;
+  code_helper code;
   int multi_step_cvt = 0;
   auto_vec <tree, 8> interm_types;
   if (!supportable_narrowing_operation (NOP_EXPR, vectype_out, vectype_in,
@@ -3478,7 +3478,7 @@ vectorizable_call (vec_info *vinfo,
   tree callee = gimple_call_fndecl (stmt);
 
   /* First try using an internal function.  */
-  tree_code convert_code = ERROR_MARK;
+  code_helper convert_code = MAX_TREE_CODES;
   if (cfn != CFN_LAST
       && (modifier == NONE
 	  || (modifier == NARROW
@@ -3664,8 +3664,8 @@ vectorizable_call (vec_info *vinfo,
 			  continue;
 			}
 		      new_temp = make_ssa_name (vec_dest);
-		      new_stmt = gimple_build_assign (new_temp, convert_code,
-						      prev_res, half_res);
+		      new_stmt = vect_gimple_build (new_temp, convert_code,
+						    prev_res, half_res);
 		      vect_finish_stmt_generation (vinfo, stmt_info,
 						   new_stmt, gsi);
 		    }
@@ -3755,8 +3755,8 @@ vectorizable_call (vec_info *vinfo,
 		  continue;
 		}
 	      new_temp = make_ssa_name (vec_dest);
-	      new_stmt = gimple_build_assign (new_temp, convert_code,
-					      prev_res, half_res);
+	      new_stmt = vect_gimple_build (new_temp, convert_code, prev_res,
+					    half_res);
 	      vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
 	    }
 	  else
@@ -4768,7 +4768,7 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info,
    STMT_INFO is the original scalar stmt that we are vectorizing.  */
 
 static gimple *
-vect_gen_widened_results_half (vec_info *vinfo, enum tree_code code,
+vect_gen_widened_results_half (vec_info *vinfo, code_helper ch,
                                tree vec_oprnd0, tree vec_oprnd1, int op_type,
 			       tree vec_dest, gimple_stmt_iterator *gsi,
 			       stmt_vec_info stmt_info)
@@ -4777,12 +4777,11 @@ vect_gen_widened_results_half (vec_info *vinfo, enum tree_code code,
   tree new_temp;
 
   /* Generate half of the widened result:  */
-  gcc_assert (op_type == TREE_CODE_LENGTH (code));
   if (op_type != binary_op)
     vec_oprnd1 = NULL;
-  new_stmt = gimple_build_assign (vec_dest, code, vec_oprnd0, vec_oprnd1);
+  new_stmt = vect_gimple_build (vec_dest, ch, vec_oprnd0, vec_oprnd1);
   new_temp = make_ssa_name (vec_dest, new_stmt);
-  gimple_assign_set_lhs (new_stmt, new_temp);
+  gimple_set_lhs (new_stmt, new_temp);
   vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
 
   return new_stmt;
@@ -4799,7 +4798,7 @@ vect_create_vectorized_demotion_stmts (vec_info *vinfo, vec<tree> *vec_oprnds,
 				       stmt_vec_info stmt_info,
 				       vec<tree> &vec_dsts,
 				       gimple_stmt_iterator *gsi,
-				       slp_tree slp_node, enum tree_code code)
+				       slp_tree slp_node, code_helper code)
 {
   unsigned int i;
   tree vop0, vop1, new_tmp, vec_dest;
@@ -4811,9 +4810,9 @@ vect_create_vectorized_demotion_stmts (vec_info *vinfo, vec<tree> *vec_oprnds,
       /* Create demotion operation.  */
       vop0 = (*vec_oprnds)[i];
       vop1 = (*vec_oprnds)[i + 1];
-      gassign *new_stmt = gimple_build_assign (vec_dest, code, vop0, vop1);
+      gimple *new_stmt = vect_gimple_build (vec_dest, code, vop0, vop1);
       new_tmp = make_ssa_name (vec_dest, new_stmt);
-      gimple_assign_set_lhs (new_stmt, new_tmp);
+      gimple_set_lhs (new_stmt, new_tmp);
       vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
 
       if (multi_step_cvt)
@@ -4861,8 +4860,8 @@ vect_create_vectorized_promotion_stmts (vec_info *vinfo,
 					vec<tree> *vec_oprnds1,
 					stmt_vec_info stmt_info, tree vec_dest,
 					gimple_stmt_iterator *gsi,
-					enum tree_code code1,
-					enum tree_code code2, int op_type)
+					code_helper ch1,
+					code_helper ch2, int op_type)
 {
   int i;
   tree vop0, vop1, new_tmp1, new_tmp2;
@@ -4878,10 +4877,10 @@ vect_create_vectorized_promotion_stmts (vec_info *vinfo,
 	vop1 = NULL_TREE;
 
       /* Generate the two halves of promotion operation.  */
-      new_stmt1 = vect_gen_widened_results_half (vinfo, code1, vop0, vop1,
+      new_stmt1 = vect_gen_widened_results_half (vinfo, ch1, vop0, vop1,
 						 op_type, vec_dest, gsi,
 						 stmt_info);
-      new_stmt2 = vect_gen_widened_results_half (vinfo, code2, vop0, vop1,
+      new_stmt2 = vect_gen_widened_results_half (vinfo, ch2, vop0, vop1,
 						 op_type, vec_dest, gsi,
 						 stmt_info);
       if (is_gimple_call (new_stmt1))
@@ -4912,7 +4911,7 @@ vect_create_half_widening_stmts (vec_info *vinfo,
 					vec<tree> *vec_oprnds1,
 					stmt_vec_info stmt_info, tree vec_dest,
 					gimple_stmt_iterator *gsi,
-					enum tree_code code1,
+					code_helper code1,
 					int op_type)
 {
   int i;
@@ -4942,13 +4941,13 @@ vect_create_half_widening_stmts (vec_info *vinfo,
 	  new_stmt2 = gimple_build_assign (new_tmp2, NOP_EXPR, vop1);
 	  vect_finish_stmt_generation (vinfo, stmt_info, new_stmt2, gsi);
 	  /* Perform the operation.  With both vector inputs widened.  */
-	  new_stmt3 = gimple_build_assign (vec_dest, code1, new_tmp1, new_tmp2);
+	  new_stmt3 = vect_gimple_build (vec_dest, code1, new_tmp1, new_tmp2);
 	}
       else
 	{
 	  /* Perform the operation.  With the single vector input widened.  */
-	  new_stmt3 = gimple_build_assign (vec_dest, code1, new_tmp1, vop1);
-      }
+	  new_stmt3 = vect_gimple_build (vec_dest, code1, new_tmp1, vop1);
+	}
 
       new_tmp3 = make_ssa_name (vec_dest, new_stmt3);
       gimple_assign_set_lhs (new_stmt3, new_tmp3);
@@ -4978,8 +4977,9 @@ vectorizable_conversion (vec_info *vinfo,
   tree scalar_dest;
   tree op0, op1 = NULL_TREE;
   loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo);
-  enum tree_code code, code1 = ERROR_MARK, code2 = ERROR_MARK;
-  enum tree_code codecvt1 = ERROR_MARK, codecvt2 = ERROR_MARK;
+  tree_code tc1;
+  code_helper code, code1, code2;
+  code_helper codecvt1 = ERROR_MARK, codecvt2 = ERROR_MARK;
   tree new_temp;
   enum vect_def_type dt[2] = {vect_unknown_def_type, vect_unknown_def_type};
   int ndts = 2;
@@ -5008,31 +5008,43 @@ vectorizable_conversion (vec_info *vinfo,
       && ! vec_stmt)
     return false;
 
-  gassign *stmt = dyn_cast <gassign *> (stmt_info->stmt);
-  if (!stmt)
+  gimple* stmt = stmt_info->stmt;
+  if (!(is_gimple_assign (stmt) || is_gimple_call (stmt)))
     return false;
 
-  if (TREE_CODE (gimple_assign_lhs (stmt)) != SSA_NAME)
+  if (gimple_get_lhs (stmt) == NULL_TREE
+      || TREE_CODE (gimple_get_lhs (stmt)) != SSA_NAME)
     return false;
 
-  code = gimple_assign_rhs_code (stmt);
-  if (!CONVERT_EXPR_CODE_P (code)
-      && code != FIX_TRUNC_EXPR
-      && code != FLOAT_EXPR
-      && code != WIDEN_PLUS_EXPR
-      && code != WIDEN_MINUS_EXPR
-      && code != WIDEN_MULT_EXPR
-      && code != WIDEN_LSHIFT_EXPR)
+  if (TREE_CODE (gimple_get_lhs (stmt)) != SSA_NAME)
+    return false;
+
+  if (is_gimple_assign (stmt))
+    {
+      code = gimple_assign_rhs_code (stmt);
+      op_type = TREE_CODE_LENGTH ((tree_code) code);
+    }
+  else if (gimple_call_internal_p (stmt))
+    {
+      code = gimple_call_internal_fn (stmt);
+      op_type = gimple_call_num_args (stmt);
+    }
+  else
     return false;
 
   bool widen_arith = (code == WIDEN_PLUS_EXPR
-		      || code == WIDEN_MINUS_EXPR
-		      || code == WIDEN_MULT_EXPR
-		      || code == WIDEN_LSHIFT_EXPR);
-  op_type = TREE_CODE_LENGTH (code);
+		 || code == WIDEN_MINUS_EXPR
+		 || code == WIDEN_MULT_EXPR
+		 || code == WIDEN_LSHIFT_EXPR);
+
+  if (!widen_arith
+      && !CONVERT_EXPR_CODE_P (code)
+      && code != FIX_TRUNC_EXPR
+      && code != FLOAT_EXPR)
+    return false;
 
   /* Check types of lhs and rhs.  */
-  scalar_dest = gimple_assign_lhs (stmt);
+  scalar_dest = gimple_get_lhs (stmt);
   lhs_type = TREE_TYPE (scalar_dest);
   vectype_out = STMT_VINFO_VECTYPE (stmt_info);
 
@@ -5070,10 +5082,14 @@ vectorizable_conversion (vec_info *vinfo,
 
   if (op_type == binary_op)
     {
-      gcc_assert (code == WIDEN_MULT_EXPR || code == WIDEN_LSHIFT_EXPR
-		  || code == WIDEN_PLUS_EXPR || code == WIDEN_MINUS_EXPR);
+      gcc_assert (code == WIDEN_MULT_EXPR
+		  || code == WIDEN_LSHIFT_EXPR
+		  || code == WIDEN_PLUS_EXPR
+		  || code == WIDEN_MINUS_EXPR);
+
 
-      op1 = gimple_assign_rhs2 (stmt);
+      op1 = is_gimple_assign (stmt) ? gimple_assign_rhs2 (stmt) :
+				     gimple_call_arg (stmt, 0);
       tree vectype1_in;
       if (!vect_is_simple_use (vinfo, stmt_info, slp_node, 1,
 			       &op1, &slp_op1, &dt[1], &vectype1_in))
@@ -5157,8 +5173,13 @@ vectorizable_conversion (vec_info *vinfo,
 	  && code != FLOAT_EXPR
 	  && !CONVERT_EXPR_CODE_P (code))
 	return false;
-      if (supportable_convert_operation (code, vectype_out, vectype_in, &code1))
+      gcc_assert (code.is_tree_code ());
+      if (supportable_convert_operation ((tree_code) code, vectype_out,
+					 vectype_in, &tc1))
+      {
+	code1 = tc1;
 	break;
+      }
       /* FALLTHRU */
     unsupported:
       if (dump_enabled_p ())
@@ -5169,9 +5190,12 @@ vectorizable_conversion (vec_info *vinfo,
     case WIDEN:
       if (known_eq (nunits_in, nunits_out))
 	{
-	  if (!supportable_half_widening_operation (code, vectype_out,
-						   vectype_in, &code1))
+	  if (!(code.is_tree_code ()
+		&& supportable_half_widening_operation ((tree_code) code,
+							vectype_out, vectype_in,
+							&tc1)))
 	    goto unsupported;
+	  code1 = tc1;
 	  gcc_assert (!(multi_step_cvt && op_type == binary_op));
 	  break;
 	}
@@ -5205,14 +5229,17 @@ vectorizable_conversion (vec_info *vinfo,
 
 	  if (GET_MODE_SIZE (rhs_mode) == fltsz)
 	    {
-	      if (!supportable_convert_operation (code, vectype_out,
-						  cvt_type, &codecvt1))
+	      tc1 = ERROR_MARK;
+	      gcc_assert (code.is_tree_code ());
+	      if (!supportable_convert_operation ((tree_code) code, vectype_out,
+						  cvt_type, &tc1))
 		goto unsupported;
+	      codecvt1 = tc1;
 	    }
-	  else if (!supportable_widening_operation (vinfo, code, stmt_info,
-						    vectype_out, cvt_type,
-						    &codecvt1, &codecvt2,
-						    &multi_step_cvt,
+	  else if (!supportable_widening_operation (vinfo, code,
+						    stmt_info, vectype_out,
+						    cvt_type, &codecvt1,
+						    &codecvt2, &multi_step_cvt,
 						    &interm_types))
 	    continue;
 	  else
@@ -5220,8 +5247,9 @@ vectorizable_conversion (vec_info *vinfo,
 
 	  if (supportable_widening_operation (vinfo, NOP_EXPR, stmt_info,
 					      cvt_type,
-					      vectype_in, &code1, &code2,
-					      &multi_step_cvt, &interm_types))
+					      vectype_in, &code1,
+					      &code2, &multi_step_cvt,
+					      &interm_types))
 	    {
 	      found_mode = true;
 	      break;
@@ -5257,9 +5285,11 @@ vectorizable_conversion (vec_info *vinfo,
       cvt_type = get_same_sized_vectype (cvt_type, vectype_in);
       if (cvt_type == NULL_TREE)
 	goto unsupported;
-      if (!supportable_convert_operation (code, cvt_type, vectype_in,
-					  &codecvt1))
+      if (!code.is_tree_code ()
+	  || !supportable_convert_operation ((tree_code) code, cvt_type,
+					     vectype_in, &tc1))
 	goto unsupported;
+      codecvt1 = tc1;
       if (supportable_narrowing_operation (NOP_EXPR, vectype_out, cvt_type,
 					   &code1, &multi_step_cvt,
 					   &interm_types))
@@ -5377,10 +5407,9 @@ vectorizable_conversion (vec_info *vinfo,
       FOR_EACH_VEC_ELT (vec_oprnds0, i, vop0)
 	{
 	  /* Arguments are ready, create the new vector stmt.  */
-	  gcc_assert (TREE_CODE_LENGTH (code1) == unary_op);
-	  gassign *new_stmt = gimple_build_assign (vec_dest, code1, vop0);
+	  gimple *new_stmt = vect_gimple_build (vec_dest, code1, vop0);
 	  new_temp = make_ssa_name (vec_dest, new_stmt);
-	  gimple_assign_set_lhs (new_stmt, new_temp);
+	  gimple_set_lhs (new_stmt, new_temp);
 	  vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
 
 	  if (slp_node)
@@ -5410,17 +5439,16 @@ vectorizable_conversion (vec_info *vinfo,
       for (i = multi_step_cvt; i >= 0; i--)
 	{
 	  tree this_dest = vec_dsts[i];
-	  enum tree_code c1 = code1, c2 = code2;
+	  code_helper c1 = code1, c2 = code2;
 	  if (i == 0 && codecvt2 != ERROR_MARK)
 	    {
 	      c1 = codecvt1;
 	      c2 = codecvt2;
 	    }
 	  if (known_eq (nunits_out, nunits_in))
-	    vect_create_half_widening_stmts (vinfo, &vec_oprnds0,
-						    &vec_oprnds1, stmt_info,
-						    this_dest, gsi,
-						    c1, op_type);
+	    vect_create_half_widening_stmts (vinfo, &vec_oprnds0, &vec_oprnds1,
+					     stmt_info, this_dest, gsi, c1,
+					     op_type);
 	  else
 	    vect_create_vectorized_promotion_stmts (vinfo, &vec_oprnds0,
 						    &vec_oprnds1, stmt_info,
@@ -5433,9 +5461,8 @@ vectorizable_conversion (vec_info *vinfo,
 	  gimple *new_stmt;
 	  if (cvt_type)
 	    {
-	      gcc_assert (TREE_CODE_LENGTH (codecvt1) == unary_op);
 	      new_temp = make_ssa_name (vec_dest);
-	      new_stmt = gimple_build_assign (new_temp, codecvt1, vop0);
+	      new_stmt = vect_gimple_build (new_temp, codecvt1, vop0);
 	      vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
 	    }
 	  else
@@ -5459,10 +5486,8 @@ vectorizable_conversion (vec_info *vinfo,
       if (cvt_type)
 	FOR_EACH_VEC_ELT (vec_oprnds0, i, vop0)
 	  {
-	    gcc_assert (TREE_CODE_LENGTH (codecvt1) == unary_op);
 	    new_temp = make_ssa_name (vec_dest);
-	    gassign *new_stmt
-	      = gimple_build_assign (new_temp, codecvt1, vop0);
+	    gimple *new_stmt = vect_gimple_build (new_temp, codecvt1, vop0);
 	    vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
 	    vec_oprnds0[i] = new_temp;
 	  }
@@ -12151,9 +12176,11 @@ vect_maybe_update_slp_op_vectype (slp_tree op, tree vectype)
 
 bool
 supportable_widening_operation (vec_info *vinfo,
-				enum tree_code code, stmt_vec_info stmt_info,
+				code_helper code,
+				stmt_vec_info stmt_info,
 				tree vectype_out, tree vectype_in,
-                                enum tree_code *code1, enum tree_code *code2,
+				code_helper *code1,
+				code_helper *code2,
                                 int *multi_step_cvt,
                                 vec<tree> *interm_types)
 {
@@ -12164,7 +12191,7 @@ supportable_widening_operation (vec_info *vinfo,
   optab optab1, optab2;
   tree vectype = vectype_in;
   tree wide_vectype = vectype_out;
-  enum tree_code c1, c2;
+  tree_code c1 = MAX_TREE_CODES, c2 = MAX_TREE_CODES;
   int i;
   tree prev_type, intermediate_type;
   machine_mode intermediate_mode, prev_mode;
@@ -12174,8 +12201,12 @@ supportable_widening_operation (vec_info *vinfo,
   if (loop_info)
     vect_loop = LOOP_VINFO_LOOP (loop_info);
 
-  switch (code)
+  switch (code.safe_as_tree_code ())
     {
+    case MAX_TREE_CODES:
+      /* Don't set c1 and c2 if code is not a tree_code.  */
+      break;
+
     case WIDEN_MULT_EXPR:
       /* The result of a vectorized widening operation usually requires
 	 two vectors (because the widened results do not fit into one vector).
@@ -12215,8 +12246,9 @@ supportable_widening_operation (vec_info *vinfo,
 	  && !nested_in_vect_loop_p (vect_loop, stmt_info)
 	  && supportable_widening_operation (vinfo, VEC_WIDEN_MULT_EVEN_EXPR,
 					     stmt_info, vectype_out,
-					     vectype_in, code1, code2,
-					     multi_step_cvt, interm_types))
+					     vectype_in, code1,
+					     code2, multi_step_cvt,
+					     interm_types))
         {
           /* Elements in a vector with vect_used_by_reduction property cannot
              be reordered if the use chain with this property does not have the
@@ -12292,7 +12324,7 @@ supportable_widening_operation (vec_info *vinfo,
       optab1 = optab_for_tree_code (c1, vectype_out, optab_default);
       optab2 = optab_for_tree_code (c2, vectype_out, optab_default);
     }
-  else if (CONVERT_EXPR_CODE_P (code)
+  else if (CONVERT_EXPR_CODE_P (code.safe_as_tree_code ())
 	   && VECTOR_BOOLEAN_TYPE_P (wide_vectype)
 	   && VECTOR_BOOLEAN_TYPE_P (vectype)
 	   && TYPE_MODE (wide_vectype) == TYPE_MODE (vectype)
@@ -12317,8 +12349,12 @@ supportable_widening_operation (vec_info *vinfo,
        || (icode2 = optab_handler (optab2, vec_mode)) == CODE_FOR_nothing)
     return false;
 
-  *code1 = c1;
-  *code2 = c2;
+  if (code.is_tree_code ())
+  {
+    *code1 = c1;
+    *code2 = c2;
+  }
+
 
   if (insn_data[icode1].operand[0].mode == TYPE_MODE (wide_vectype)
       && insn_data[icode2].operand[0].mode == TYPE_MODE (wide_vectype))
@@ -12339,7 +12375,7 @@ supportable_widening_operation (vec_info *vinfo,
   prev_type = vectype;
   prev_mode = vec_mode;
 
-  if (!CONVERT_EXPR_CODE_P (code))
+  if (!CONVERT_EXPR_CODE_P ((tree_code) code))
     return false;
 
   /* We assume here that there will not be more than MAX_INTERM_CVT_STEPS
@@ -12437,9 +12473,9 @@ supportable_widening_operation (vec_info *vinfo,
    narrowing operation (short in the above example).   */
 
 bool
-supportable_narrowing_operation (enum tree_code code,
+supportable_narrowing_operation (code_helper code,
 				 tree vectype_out, tree vectype_in,
-				 enum tree_code *code1, int *multi_step_cvt,
+				 code_helper *code1, int *multi_step_cvt,
                                  vec<tree> *interm_types)
 {
   machine_mode vec_mode;
@@ -12454,8 +12490,11 @@ supportable_narrowing_operation (enum tree_code code,
   unsigned HOST_WIDE_INT n_elts;
   bool uns;
 
+  if (!code.is_tree_code ())
+    return false;
+
   *multi_step_cvt = 0;
-  switch (code)
+  switch ((tree_code) code)
     {
     CASE_CONVERT:
       c1 = VEC_PACK_TRUNC_EXPR;
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index 9cf2fb23fe397b467d89aa7cc5ebeaa293ed4cce..f215cd0639bcf803c9d0554cfdc57823431991d5 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -2139,13 +2139,12 @@ extern bool vect_is_simple_use (vec_info *, stmt_vec_info, slp_tree,
 				enum vect_def_type *,
 				tree *, stmt_vec_info * = NULL);
 extern bool vect_maybe_update_slp_op_vectype (slp_tree, tree);
-extern bool supportable_widening_operation (vec_info *,
-					    enum tree_code, stmt_vec_info,
-					    tree, tree, enum tree_code *,
-					    enum tree_code *, int *,
-					    vec<tree> *);
-extern bool supportable_narrowing_operation (enum tree_code, tree, tree,
-					     enum tree_code *, int *,
+extern bool supportable_widening_operation (vec_info*, code_helper,
+					    stmt_vec_info, tree, tree,
+					    code_helper*, code_helper*,
+					    int*, vec<tree> *);
+extern bool supportable_narrowing_operation (code_helper, tree, tree,
+					     code_helper *, int *,
 					     vec<tree> *);
 
 extern unsigned record_stmt_cost (stmt_vector_for_cost *, int,
@@ -2583,4 +2582,7 @@ vect_is_integer_truncation (stmt_vec_info stmt_info)
 	  && TYPE_PRECISION (lhs_type) < TYPE_PRECISION (rhs_type));
 }
 
+/* Build a GIMPLE_ASSIGN or GIMPLE_CALL with the tree_code,
+   or internal_fn contained in ch, respectively.  */
+gimple * vect_gimple_build (tree, code_helper, tree, tree = NULL_TREE);
 #endif  /* GCC_TREE_VECTORIZER_H  */
diff --git a/gcc/tree.h b/gcc/tree.h
index abcdb5638d49aea4ccc46efa8e540b1fa78aa27a..f6cd528e7d789c3f81fb2da3c1e1a29fa11f6e0f 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -93,6 +93,8 @@ public:
   bool is_internal_fn () const;
   bool is_builtin_fn () const;
   int get_rep () const { return rep; }
+  enum tree_code safe_as_tree_code () const;
+  combined_fn safe_as_fn_code () const;
   bool operator== (const code_helper &other) { return rep == other.rep; }
   bool operator!= (const code_helper &other) { return rep != other.rep; }
   bool operator== (tree_code c) { return rep == code_helper (c).rep; }
@@ -102,6 +104,17 @@ private:
   int rep;
 };
 
+inline enum tree_code
+code_helper::safe_as_tree_code () const
+{
+  return is_tree_code () ? (tree_code) *this : MAX_TREE_CODES;
+}
+
+inline combined_fn
+code_helper::safe_as_fn_code () const {
+  return is_fn_code () ? (combined_fn) *this : CFN_LAST;
+}
+
 inline code_helper::operator internal_fn () const
 {
   return as_internal_fn (combined_fn (*this));

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [PATCH 2/3] Refactor widen_plus as internal_fn
  2023-04-25  9:55                           ` Andre Vieira (lists)
  2023-04-28 12:36                             ` [PATCH 1/3] Refactor to allow internal_fn's Andre Vieira (lists)
@ 2023-04-28 12:37                             ` Andre Vieira (lists)
  2023-05-03 12:11                               ` Richard Biener
  2023-04-28 12:37                             ` [PATCH 3/3] Remove widen_plus/minus_expr tree codes Andre Vieira (lists)
  2 siblings, 1 reply; 53+ messages in thread
From: Andre Vieira (lists) @ 2023-04-28 12:37 UTC (permalink / raw)
  To: Richard Biener; +Cc: Richard Biener, Richard Sandiford, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 3598 bytes --]

This patch replaces the existing tree_code widen_plus and widen_minus
patterns with internal_fn versions.

DEF_INTERNAL_OPTAB_HILO_FN is like DEF_INTERNAL_OPTAB_FN, except that it
provides convenience wrappers for defining operations that require a
hi/lo split, such as widening and narrowing operations.  Each definition
for <NAME> requires an optab named <OPTAB> plus two further optabs, one
for the signed and one for the unsigned variant.  The hi/lo pair is
necessary because a widening operation takes n narrow elements as input
and returns n/2 wide elements as output: the 'lo' operation operates on
the first n/2 input elements and the 'hi' operation on the second n/2.
Defining an internal_fn along with its hi/lo variants allows a
vect_recog function to return a single internal function that is later
expanded to the hi/lo pair.
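
As a rough illustration of the hi/lo split described above, here is a
minimal scalar model in C++.  It is only a sketch: widen_plus_half and its
parameters are hypothetical names for this example, not GCC APIs, and it
assumes unsigned 8-bit inputs widened to 16 bits, with 'lo' covering the
first half of the input as stated above.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Scalar model of one half of a widening add (illustrative only, not GCC code).
// Inputs hold N narrow elements; each half produces N/2 wide elements.
// hi == false selects input elements [0, N/2); hi == true selects [N/2, N).
template <std::size_t N>
std::array<std::uint16_t, N / 2>
widen_plus_half (const std::array<std::uint8_t, N> &a,
                 const std::array<std::uint8_t, N> &b, bool hi)
{
  static_assert (N % 2 == 0, "element count must be even");
  std::array<std::uint16_t, N / 2> out{};
  const std::size_t base = hi ? N / 2 : 0;
  for (std::size_t i = 0; i < N / 2; ++i)
    // Widen both operands first so the sum cannot wrap in the narrow type.
    out[i] = static_cast<std::uint16_t> (static_cast<std::uint16_t> (a[base + i])
                                         + static_cast<std::uint16_t> (b[base + i]));
  return out;
}
```

Calling it once with hi == false and once with hi == true yields the two
wide vectors that together hold all n results, which is why a single
widening operation needs a pair of instructions.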

DEF_INTERNAL_OPTAB_HILO_FN is used in internal-fn.def to register a
widening internal_fn.  It is defined differently at each place that
includes internal-fn.def, so the parameters given there can be reused:
   internal-fn.cc: first defined to expand to the hi/lo signed/unsigned
     optabs, later redefined to generate the 'expand_' functions for the
     hi/lo versions of the fn.
   internal-fn.def: defined to invoke DEF_INTERNAL_OPTAB_FN for the
     original and hi/lo variants of the internal_fn.

For example:
  IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO
  for aarch64:
    IFN_VEC_WIDEN_PLUS_HI -> vec_widen_<su>addl_hi_<mode> -> (u/s)addl2
    IFN_VEC_WIDEN_PLUS_LO -> vec_widen_<su>addl_lo_<mode> -> (u/s)addl

This gives the same functionality as the previous WIDEN_PLUS/WIDEN_MINUS 
tree codes which are expanded into VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI.

gcc/ChangeLog:

2023-04-28  Andre Vieira  <andre.simoesdiasvieira@arm.com>
             Joel Hutton  <joel.hutton@arm.com>
             Tamar Christina  <tamar.christina@arm.com>

	* internal-fn.cc (INCLUDE_MAP): Include maps for use in optab
	lookup.
	(DEF_INTERNAL_OPTAB_HILO_FN): Macro to define an internal_fn that
	expands into multiple internal_fns (for widening).
	(ifn_cmp): Function to compare ifns for sorting/searching.
	(lookup_hilo_ifn_optab): Add lookup function.
	(lookup_hilo_internal_fn): Add lookup function.
	(commutative_binary_fn_p): Add widen_plus fns.
	(widening_fn_p): New function.
	(decomposes_to_hilo_fn_p): New function.
	* internal-fn.def (DEF_INTERNAL_OPTAB_HILO_FN): Define widening
	plus, minus functions.
	(VEC_WIDEN_PLUS): Replacement for VEC_WIDEN_PLUS tree code.
	(VEC_WIDEN_MINUS): Replacement for VEC_WIDEN_MINUS tree code.
	* internal-fn.h (GCC_INTERNAL_FN_H): Add headers.
	(lookup_hilo_ifn_optab): Add prototype.
	(lookup_hilo_internal_fn): Likewise.
	(widening_fn_p): Likewise.
	(decomposes_to_hilo_fn_p): Likewise.
	* optabs.cc (commutative_optab_p): Add widening plus, minus optabs.
	* optabs.def (OPTAB_CD): Add widen add, sub optabs.
	* tree-vect-patterns.cc (vect_recog_widen_op_pattern): Support
	patterns with a hi/lo split.
	(vect_recog_widen_plus_pattern): Refactor to return
	IFN_VEC_WIDEN_PLUS.
	(vect_recog_widen_minus_pattern): Refactor to return new
	IFN_VEC_WIDEN_MINUS.
	* tree-vect-stmts.cc (vectorizable_conversion): Add widen plus/minus
	ifn support.
	(supportable_widening_operation): Add widen plus/minus ifn support.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/vect-widen-add.c: Test that new
	IFN_VEC_WIDEN_PLUS is being used.
	* gcc.target/aarch64/vect-widen-sub.c: Test that new
	IFN_VEC_WIDEN_MINUS is being used.

[-- Attachment #2: ifn1_v2.patch --]
[-- Type: text/plain, Size: 18412 bytes --]

diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 6e81dc05e0e0714256759b0594816df451415a2d..e4d815cd577d266d2bccf6fb68d62aac91a8b4cf 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -17,6 +17,7 @@ You should have received a copy of the GNU General Public License
 along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
+#define INCLUDE_MAP
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
@@ -70,6 +71,26 @@ const int internal_fn_flags_array[] = {
   0
 };
 
+const enum internal_fn internal_fn_hilo_keys_array[] = {
+#undef DEF_INTERNAL_OPTAB_HILO_FN
+#define DEF_INTERNAL_OPTAB_HILO_FN(NAME, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \
+  IFN_##NAME##_LO, \
+  IFN_##NAME##_HI,
+#include "internal-fn.def"
+  IFN_LAST
+#undef DEF_INTERNAL_OPTAB_HILO_FN
+};
+
+const optab internal_fn_hilo_values_array[] = {
+#undef DEF_INTERNAL_OPTAB_HILO_FN
+#define DEF_INTERNAL_OPTAB_HILO_FN(NAME, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \
+  SOPTAB##_lo_optab, UOPTAB##_lo_optab, \
+  SOPTAB##_hi_optab, UOPTAB##_hi_optab,
+#include "internal-fn.def"
+  unknown_optab, unknown_optab
+#undef DEF_INTERNAL_OPTAB_HILO_FN
+};
+
 /* Return the internal function called NAME, or IFN_LAST if there's
    no such function.  */
 
@@ -90,6 +111,61 @@ lookup_internal_fn (const char *name)
   return entry ? *entry : IFN_LAST;
 }
 
+static int
+ifn_cmp (const void *a_, const void *b_)
+{
+  typedef std::pair<enum internal_fn, unsigned> ifn_pair;
+  auto *a = (const std::pair<ifn_pair, optab> *)a_;
+  auto *b = (const std::pair<ifn_pair, optab> *)b_;
+  return (int) (a->first.first) - (b->first.first);
+}
+
+/* Return the optab belonging to the given internal function NAME for the given
+   SIGN or unknown_optab.  */
+
+optab
+lookup_hilo_ifn_optab (enum internal_fn fn, unsigned sign)
+{
+  typedef std::pair<enum internal_fn, unsigned> ifn_pair;
+  typedef auto_vec <std::pair<ifn_pair, optab>>fn_to_optab_map_type;
+  static fn_to_optab_map_type *fn_to_optab_map;
+
+  if (!fn_to_optab_map)
+    {
+      unsigned num
+	= sizeof (internal_fn_hilo_keys_array) / sizeof (enum internal_fn);
+      fn_to_optab_map = new fn_to_optab_map_type ();
+      for (unsigned int i = 0; i < num - 1; ++i)
+	{
+	  enum internal_fn fn = internal_fn_hilo_keys_array[i];
+	  optab v1 = internal_fn_hilo_values_array[2*i];
+	  optab v2 = internal_fn_hilo_values_array[2*i + 1];
+	  ifn_pair key1 (fn, 0);
+	  fn_to_optab_map->safe_push ({key1, v1});
+	  ifn_pair key2 (fn, 1);
+	  fn_to_optab_map->safe_push ({key2, v2});
+	}
+	fn_to_optab_map->qsort (ifn_cmp);
+    }
+
+  ifn_pair new_pair (fn, sign ? 1 : 0);
+  optab tmp;
+  std::pair<ifn_pair,optab> pair_wrap (new_pair, tmp);
+  auto entry = fn_to_optab_map->bsearch (&pair_wrap, ifn_cmp);
+  return entry != fn_to_optab_map->end () ? entry->second : unknown_optab;
+}
+
+extern void
+lookup_hilo_internal_fn (enum internal_fn ifn, enum internal_fn *lo,
+			  enum internal_fn *hi)
+{
+  gcc_assert (decomposes_to_hilo_fn_p (ifn));
+
+  *lo = internal_fn (ifn + 1);
+  *hi = internal_fn (ifn + 2);
+}
+
+
 /* Fnspec of each internal function, indexed by function number.  */
 const_tree internal_fn_fnspec_array[IFN_LAST + 1];
 
@@ -3970,6 +4046,9 @@ commutative_binary_fn_p (internal_fn fn)
     case IFN_UBSAN_CHECK_MUL:
     case IFN_ADD_OVERFLOW:
     case IFN_MUL_OVERFLOW:
+    case IFN_VEC_WIDEN_PLUS:
+    case IFN_VEC_WIDEN_PLUS_LO:
+    case IFN_VEC_WIDEN_PLUS_HI:
       return true;
 
     default:
@@ -4043,6 +4122,42 @@ first_commutative_argument (internal_fn fn)
     }
 }
 
+/* Return true if FN has a wider output type than its argument types.  */
+
+bool
+widening_fn_p (internal_fn fn)
+{
+  switch (fn)
+    {
+    case IFN_VEC_WIDEN_PLUS:
+    case IFN_VEC_WIDEN_MINUS:
+      return true;
+
+    default:
+      return false;
+    }
+}
+
+/* Return true if FN decomposes to _hi and _lo IFN.  If true this should also
+   be a widening function.  */
+
+bool
+decomposes_to_hilo_fn_p (internal_fn fn)
+{
+  if (!widening_fn_p (fn))
+    return false;
+
+  switch (fn)
+    {
+    case IFN_VEC_WIDEN_PLUS:
+    case IFN_VEC_WIDEN_MINUS:
+      return true;
+
+    default:
+      return false;
+    }
+}
+
 /* Return true if IFN_SET_EDOM is supported.  */
 
 bool
@@ -4055,6 +4170,32 @@ set_edom_supported_p (void)
 #endif
 }
 
+#undef DEF_INTERNAL_OPTAB_HILO_FN
+#define DEF_INTERNAL_OPTAB_HILO_FN(CODE, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \
+  static void							\
+  expand_##CODE (internal_fn, gcall *)				\
+  {								\
+    gcc_unreachable ();						\
+  }								\
+  static void							\
+  expand_##CODE##_LO (internal_fn fn, gcall *stmt)		\
+  {								\
+    tree ty = TREE_TYPE (gimple_get_lhs (stmt));		\
+    if (!TYPE_UNSIGNED (ty))					\
+      expand_##TYPE##_optab_fn (fn, stmt, SOPTAB##_lo##_optab);	\
+    else							\
+      expand_##TYPE##_optab_fn (fn, stmt, UOPTAB##_lo##_optab);	\
+  }								\
+  static void							\
+  expand_##CODE##_HI (internal_fn fn, gcall *stmt)		\
+  {								\
+    tree ty = TREE_TYPE (gimple_get_lhs (stmt));		\
+    if (!TYPE_UNSIGNED (ty))					\
+      expand_##TYPE##_optab_fn (fn, stmt, SOPTAB##_hi##_optab);	\
+    else							\
+      expand_##TYPE##_optab_fn (fn, stmt, UOPTAB##_hi##_optab);	\
+  }
+
 #define DEF_INTERNAL_OPTAB_FN(CODE, FLAGS, OPTAB, TYPE) \
   static void						\
   expand_##CODE (internal_fn fn, gcall *stmt)		\
@@ -4071,6 +4212,7 @@ set_edom_supported_p (void)
     expand_##TYPE##_optab_fn (fn, stmt, which_optab);			\
   }
 #include "internal-fn.def"
+#undef DEF_INTERNAL_OPTAB_HILO_FN
 
 /* Routines to expand each internal function, indexed by function number.
    Each routine has the prototype:
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 7fe742c2ae713e7152ab05cfdfba86e4e0aa3456..347ed667d92620e0ee3ea15c58ecac6c242ebe73 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -85,6 +85,13 @@ along with GCC; see the file COPYING3.  If not see
    says that the function extends the C-level BUILT_IN_<NAME>{,L,LL,IMAX}
    group of functions to any integral mode (including vector modes).
 
+   DEF_INTERNAL_OPTAB_HILO_FN is like DEF_INTERNAL_OPTAB_FN except it
+   provides convenience wrappers for defining conversions that require a
+   hi/lo split, like widening and narrowing operations.  Each definition
+   for <NAME> will require an optab named <OPTAB> and two other optabs that
+   you specify for signed and unsigned.
+
+
    Each entry must have a corresponding expander of the form:
 
      void expand_NAME (gimple_call stmt)
@@ -123,6 +130,14 @@ along with GCC; see the file COPYING3.  If not see
   DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE)
 #endif
 
+#ifndef DEF_INTERNAL_OPTAB_HILO_FN
+#define DEF_INTERNAL_OPTAB_HILO_FN(NAME, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \
+  DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE) \
+  DEF_INTERNAL_OPTAB_FN (NAME ## _LO, FLAGS, unknown, TYPE) \
+  DEF_INTERNAL_OPTAB_FN (NAME ## _HI, FLAGS, unknown, TYPE)
+#endif
+
+
 DEF_INTERNAL_OPTAB_FN (MASK_LOAD, ECF_PURE, maskload, mask_load)
 DEF_INTERNAL_OPTAB_FN (LOAD_LANES, ECF_CONST, vec_load_lanes, load_lanes)
 DEF_INTERNAL_OPTAB_FN (MASK_LOAD_LANES, ECF_PURE,
@@ -315,6 +330,14 @@ DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT270, ECF_CONST, cadd270, binary)
 DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL, ECF_CONST, cmul, binary)
 DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL_CONJ, ECF_CONST, cmul_conj, binary)
 DEF_INTERNAL_OPTAB_FN (VEC_ADDSUB, ECF_CONST, vec_addsub, binary)
+DEF_INTERNAL_OPTAB_HILO_FN (VEC_WIDEN_PLUS,
+			    ECF_CONST | ECF_NOTHROW,
+			    vec_widen_add, vec_widen_saddl, vec_widen_uaddl,
+			     binary)
+DEF_INTERNAL_OPTAB_HILO_FN (VEC_WIDEN_MINUS,
+			    ECF_CONST | ECF_NOTHROW,
+			    vec_widen_sub, vec_widen_ssubl, vec_widen_usubl,
+			     binary)
 DEF_INTERNAL_OPTAB_FN (VEC_FMADDSUB, ECF_CONST, vec_fmaddsub, ternary)
 DEF_INTERNAL_OPTAB_FN (VEC_FMSUBADD, ECF_CONST, vec_fmsubadd, ternary)
 
diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h
index 08922ed4254898f5fffca3f33973e96ed9ce772f..6a5f8762e872ad2ef64ce2986a678e3b40622d81 100644
--- a/gcc/internal-fn.h
+++ b/gcc/internal-fn.h
@@ -20,6 +20,10 @@ along with GCC; see the file COPYING3.  If not see
 #ifndef GCC_INTERNAL_FN_H
 #define GCC_INTERNAL_FN_H
 
+#include "insn-codes.h"
+#include "insn-opinit.h"
+
+
 /* INTEGER_CST values for IFN_UNIQUE function arg-0.
 
    UNSPEC: Undifferentiated UNIQUE.
@@ -112,6 +116,9 @@ internal_fn_name (enum internal_fn fn)
 }
 
 extern internal_fn lookup_internal_fn (const char *);
+extern optab lookup_hilo_ifn_optab (enum internal_fn, unsigned);
+extern void lookup_hilo_internal_fn (enum internal_fn, enum internal_fn *,
+				      enum internal_fn *);
 
 /* Return the ECF_* flags for function FN.  */
 
@@ -210,6 +217,8 @@ extern bool commutative_binary_fn_p (internal_fn);
 extern bool commutative_ternary_fn_p (internal_fn);
 extern int first_commutative_argument (internal_fn);
 extern bool associative_binary_fn_p (internal_fn);
+extern bool widening_fn_p (internal_fn);
+extern bool decomposes_to_hilo_fn_p (internal_fn);
 
 extern bool set_edom_supported_p (void);
 
diff --git a/gcc/optabs.cc b/gcc/optabs.cc
index c8e39c82d57a7d726e7da33d247b80f32ec9236c..d4dd7ee3d34d01c32ab432ae4e4ce9e4b522b2f7 100644
--- a/gcc/optabs.cc
+++ b/gcc/optabs.cc
@@ -1314,7 +1314,12 @@ commutative_optab_p (optab binoptab)
 	  || binoptab == smul_widen_optab
 	  || binoptab == umul_widen_optab
 	  || binoptab == smul_highpart_optab
-	  || binoptab == umul_highpart_optab);
+	  || binoptab == umul_highpart_optab
+	  || binoptab == vec_widen_add_optab
+	  || binoptab == vec_widen_saddl_hi_optab
+	  || binoptab == vec_widen_saddl_lo_optab
+	  || binoptab == vec_widen_uaddl_hi_optab
+	  || binoptab == vec_widen_uaddl_lo_optab);
 }
 
 /* X is to be used in mode MODE as operand OPN to BINOPTAB.  If we're
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 695f5911b300c9ca5737de9be809fa01aabe5e01..e064189103b3be70644468d11f3c91ac45ffe0d0 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -78,6 +78,8 @@ OPTAB_CD(smsub_widen_optab, "msub$b$a4")
 OPTAB_CD(umsub_widen_optab, "umsub$b$a4")
 OPTAB_CD(ssmsub_widen_optab, "ssmsub$b$a4")
 OPTAB_CD(usmsub_widen_optab, "usmsub$a$b4")
+OPTAB_CD(vec_widen_add_optab, "add$a$b3")
+OPTAB_CD(vec_widen_sub_optab, "sub$a$b3")
 OPTAB_CD(vec_load_lanes_optab, "vec_load_lanes$a$b")
 OPTAB_CD(vec_store_lanes_optab, "vec_store_lanes$a$b")
 OPTAB_CD(vec_mask_load_lanes_optab, "vec_mask_load_lanes$a$b")
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
index 220bd9352a4c7acd2e3713e441d74898d3e92b30..7037673d32bd780e1c9b58a51e58e2bac3b30b7e 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
+++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O3 -save-temps" } */
+/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */
 #include <stdint.h>
 #include <string.h>
 
@@ -86,6 +86,8 @@ main()
     return 0;
 }
 
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_LO" "vect"   } } */
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_HI" "vect"   } } */
 /* { dg-final { scan-assembler-times {\tuaddl\t} 1} } */
 /* { dg-final { scan-assembler-times {\tuaddl2\t} 1} } */
 /* { dg-final { scan-assembler-times {\tsaddl\t} 1} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c
index a2bed63affbd091977df95a126da1f5b8c1d41d2..83bc1edb6105f47114b665e24a13e6194b2179a2 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c
+++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O3 -save-temps" } */
+/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */
 #include <stdint.h>
 #include <string.h>
 
@@ -86,6 +86,8 @@ main()
     return 0;
 }
 
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_LO" "vect"   } } */
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_HI" "vect"   } } */
 /* { dg-final { scan-assembler-times {\tusubl\t} 1} } */
 /* { dg-final { scan-assembler-times {\tusubl2\t} 1} } */
 /* { dg-final { scan-assembler-times {\tssubl\t} 1} } */
diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index b35023adade94c1996cd076c4b7419560e819c6b..3175dd92187c0935f78ebbf2eb476bdcf8b4ccd1 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -1394,14 +1394,16 @@ static gimple *
 vect_recog_widen_op_pattern (vec_info *vinfo,
 			     stmt_vec_info last_stmt_info, tree *type_out,
 			     tree_code orig_code, code_helper wide_code,
-			     bool shift_p, const char *name)
+			     bool shift_p, const char *name,
+			     enum optab_subtype *subtype = NULL)
 {
   gimple *last_stmt = last_stmt_info->stmt;
 
   vect_unpromoted_value unprom[2];
   tree half_type;
   if (!vect_widened_op_tree (vinfo, last_stmt_info, orig_code, orig_code,
-			     shift_p, 2, unprom, &half_type))
+			     shift_p, 2, unprom, &half_type, subtype))
+
     return NULL;
 
   /* Pattern detected.  */
@@ -1467,6 +1469,20 @@ vect_recog_widen_op_pattern (vec_info *vinfo,
 			      type, pattern_stmt, vecctype);
 }
 
+static gimple *
+vect_recog_widen_op_pattern (vec_info *vinfo,
+			     stmt_vec_info last_stmt_info, tree *type_out,
+			     tree_code orig_code, internal_fn wide_ifn,
+			     bool shift_p, const char *name,
+			     enum optab_subtype *subtype = NULL)
+{
+  combined_fn ifn = as_combined_fn (wide_ifn);
+  return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
+				      orig_code, ifn, shift_p, name,
+				      subtype);
+}
+
+
 /* Try to detect multiplication on widened inputs, converting MULT_EXPR
    to WIDEN_MULT_EXPR.  See vect_recog_widen_op_pattern for details.  */
 
@@ -1480,26 +1496,30 @@ vect_recog_widen_mult_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
 }
 
 /* Try to detect addition on widened inputs, converting PLUS_EXPR
-   to WIDEN_PLUS_EXPR.  See vect_recog_widen_op_pattern for details.  */
+   to IFN_VEC_WIDEN_PLUS.  See vect_recog_widen_op_pattern for details.  */
 
 static gimple *
 vect_recog_widen_plus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
 			       tree *type_out)
 {
+  enum optab_subtype subtype;
   return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
-				      PLUS_EXPR, WIDEN_PLUS_EXPR, false,
-				      "vect_recog_widen_plus_pattern");
+				      PLUS_EXPR, IFN_VEC_WIDEN_PLUS,
+				      false, "vect_recog_widen_plus_pattern",
+				      &subtype);
 }
 
 /* Try to detect subtraction on widened inputs, converting MINUS_EXPR
-   to WIDEN_MINUS_EXPR.  See vect_recog_widen_op_pattern for details.  */
+   to IFN_VEC_WIDEN_MINUS.  See vect_recog_widen_op_pattern for details.  */
 static gimple *
 vect_recog_widen_minus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
 			       tree *type_out)
 {
+  enum optab_subtype subtype;
   return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
-				      MINUS_EXPR, WIDEN_MINUS_EXPR, false,
-				      "vect_recog_widen_minus_pattern");
+				      MINUS_EXPR, IFN_VEC_WIDEN_MINUS,
+				      false, "vect_recog_widen_minus_pattern",
+				      &subtype);
 }
 
 /* Function vect_recog_popcount_pattern
@@ -6067,6 +6087,7 @@ static vect_recog_func vect_vect_recog_func_ptrs[] = {
   { vect_recog_mask_conversion_pattern, "mask_conversion" },
   { vect_recog_widen_plus_pattern, "widen_plus" },
   { vect_recog_widen_minus_pattern, "widen_minus" },
+  /* These must come after the double widening ones.  */
 };
 
 const unsigned int NUM_PATTERNS = ARRAY_SIZE (vect_vect_recog_func_ptrs);
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index ce47f4940fa9a1baca4ba1162065cfc3b4072eba..2a7ef2439e12d1966e8884433963a3d387a856b7 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -5035,7 +5035,9 @@ vectorizable_conversion (vec_info *vinfo,
   bool widen_arith = (code == WIDEN_PLUS_EXPR
 		 || code == WIDEN_MINUS_EXPR
 		 || code == WIDEN_MULT_EXPR
-		 || code == WIDEN_LSHIFT_EXPR);
+		 || code == WIDEN_LSHIFT_EXPR
+		 || code == IFN_VEC_WIDEN_PLUS
+		 || code == IFN_VEC_WIDEN_MINUS);
 
   if (!widen_arith
       && !CONVERT_EXPR_CODE_P (code)
@@ -5085,7 +5087,9 @@ vectorizable_conversion (vec_info *vinfo,
       gcc_assert (code == WIDEN_MULT_EXPR
 		  || code == WIDEN_LSHIFT_EXPR
 		  || code == WIDEN_PLUS_EXPR
-		  || code == WIDEN_MINUS_EXPR);
+		  || code == WIDEN_MINUS_EXPR
+		  || code == IFN_VEC_WIDEN_PLUS
+		  || code == IFN_VEC_WIDEN_MINUS);
 
 
       op1 = is_gimple_assign (stmt) ? gimple_assign_rhs2 (stmt) :
@@ -12335,12 +12339,46 @@ supportable_widening_operation (vec_info *vinfo,
       optab1 = vec_unpacks_sbool_lo_optab;
       optab2 = vec_unpacks_sbool_hi_optab;
     }
-  else
-    {
-      optab1 = optab_for_tree_code (c1, vectype, optab_default);
-      optab2 = optab_for_tree_code (c2, vectype, optab_default);
+
+  if (code.is_fn_code ())
+     {
+      internal_fn ifn = as_internal_fn ((combined_fn) code);
+      gcc_assert (decomposes_to_hilo_fn_p (ifn));
+
+      internal_fn lo, hi;
+      lookup_hilo_internal_fn (ifn, &lo, &hi);
+      *code1 = as_combined_fn (lo);
+      *code2 = as_combined_fn (hi);
+      optab1 = lookup_hilo_ifn_optab (lo, !TYPE_UNSIGNED (vectype));
+      optab2 = lookup_hilo_ifn_optab (hi, !TYPE_UNSIGNED (vectype));
     }
 
+  if (code.is_tree_code ())
+  {
+    if (code == FIX_TRUNC_EXPR)
+      {
+	/* The signedness is determined from output operand.  */
+	optab1 = optab_for_tree_code (c1, vectype_out, optab_default);
+	optab2 = optab_for_tree_code (c2, vectype_out, optab_default);
+      }
+    else if (CONVERT_EXPR_CODE_P (code.safe_as_tree_code ())
+	     && VECTOR_BOOLEAN_TYPE_P (wide_vectype)
+	     && VECTOR_BOOLEAN_TYPE_P (vectype)
+	     && TYPE_MODE (wide_vectype) == TYPE_MODE (vectype)
+	     && SCALAR_INT_MODE_P (TYPE_MODE (vectype)))
+      {
+	/* If the input and result modes are the same, a different optab
+	   is needed where we pass in the number of units in vectype.  */
+	optab1 = vec_unpacks_sbool_lo_optab;
+	optab2 = vec_unpacks_sbool_hi_optab;
+      }
+    else
+      {
+	optab1 = optab_for_tree_code (c1, vectype, optab_default);
+	optab2 = optab_for_tree_code (c2, vectype, optab_default);
+      }
+  }
+
   if (!optab1 || !optab2)
     return false;
 

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [PATCH 3/3] Remove widen_plus/minus_expr tree codes
  2023-04-25  9:55                           ` Andre Vieira (lists)
  2023-04-28 12:36                             ` [PATCH 1/3] Refactor to allow internal_fn's Andre Vieira (lists)
  2023-04-28 12:37                             ` [PATCH 2/3] Refactor widen_plus as internal_fn Andre Vieira (lists)
@ 2023-04-28 12:37                             ` Andre Vieira (lists)
  2023-05-03 12:29                               ` Richard Biener
  2 siblings, 1 reply; 53+ messages in thread
From: Andre Vieira (lists) @ 2023-04-28 12:37 UTC (permalink / raw)
  To: Richard Biener; +Cc: Richard Biener, Richard Sandiford, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1543 bytes --]

This is a rebase of Joel's previous patch.

This patch removes the old widen plus/minus tree codes which have been
replaced by internal functions.

gcc/ChangeLog:

2023-04-28  Andre Vieira  <andre.simoesdiasvieira@arm.com>
             Joel Hutton  <joel.hutton@arm.com>

	* doc/generic.texi: Remove old tree codes.
	* expr.cc (expand_expr_real_2): Remove old tree code cases.
	* gimple-pretty-print.cc (dump_binary_rhs): Likewise.
	* optabs-tree.cc (optab_for_tree_code): Likewise.
	(supportable_half_widening_operation): Likewise.
	* tree-cfg.cc (verify_gimple_assign_binary): Likewise.
	* tree-inline.cc (estimate_operator_cost): Likewise.
	(op_symbol_code): Likewise.
	* tree-vect-data-refs.cc (vect_get_smallest_scalar_type): Likewise.
	(vect_analyze_data_ref_accesses): Likewise.
	* tree-vect-generic.cc (expand_vector_operations_1): Likewise.
	* cfgexpand.cc (expand_debug_expr): Likewise.
	* tree-vect-stmts.cc (vectorizable_conversion): Likewise.
	(supportable_widening_operation): Likewise.
	* gimple-range-op.cc (gimple_range_op_handler::maybe_non_standard):
	Likewise.
	* tree-vect-patterns.cc (vect_widened_op_tree): Refactor to replace
	usage in vect_recog_sad_pattern.
	(vect_recog_sad_pattern): Replace tree code widening pattern with
	internal function.
	(vect_recog_average_pattern): Likewise.
	* tree-pretty-print.cc (dump_generic_node): Remove tree code definition.
	* tree.def (WIDEN_PLUS_EXPR, WIDEN_MINUS_EXPR, VEC_WIDEN_PLUS_HI_EXPR,
	VEC_WIDEN_PLUS_LO_EXPR, VEC_WIDEN_MINUS_HI_EXPR,
	VEC_WIDEN_MINUS_LO_EXPR): Likewise.

[-- Attachment #2: ifn2_v2.patch --]
[-- Type: text/plain, Size: 19090 bytes --]

diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
index 1a1b26b1c6c23ce273bcd08dc9a973f777174007..25b1558dcb941ea491a19aeeb2cd8f4d2dbdf7c6 100644
--- a/gcc/cfgexpand.cc
+++ b/gcc/cfgexpand.cc
@@ -5365,10 +5365,6 @@ expand_debug_expr (tree exp)
     case VEC_WIDEN_MULT_ODD_EXPR:
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
-    case VEC_WIDEN_PLUS_HI_EXPR:
-    case VEC_WIDEN_PLUS_LO_EXPR:
-    case VEC_WIDEN_MINUS_HI_EXPR:
-    case VEC_WIDEN_MINUS_LO_EXPR:
     case VEC_PERM_EXPR:
     case VEC_DUPLICATE_EXPR:
     case VEC_SERIES_EXPR:
@@ -5405,8 +5401,6 @@ expand_debug_expr (tree exp)
     case WIDEN_MULT_EXPR:
     case WIDEN_MULT_PLUS_EXPR:
     case WIDEN_MULT_MINUS_EXPR:
-    case WIDEN_PLUS_EXPR:
-    case WIDEN_MINUS_EXPR:
       if (SCALAR_INT_MODE_P (GET_MODE (op0))
 	  && SCALAR_INT_MODE_P (mode))
 	{
@@ -5419,10 +5413,6 @@ expand_debug_expr (tree exp)
 	    op1 = simplify_gen_unary (ZERO_EXTEND, mode, op1, inner_mode);
 	  else
 	    op1 = simplify_gen_unary (SIGN_EXTEND, mode, op1, inner_mode);
-	  if (TREE_CODE (exp) == WIDEN_PLUS_EXPR)
-	    return simplify_gen_binary (PLUS, mode, op0, op1);
-	  else if (TREE_CODE (exp) == WIDEN_MINUS_EXPR)
-	    return simplify_gen_binary (MINUS, mode, op0, op1);
 	  op0 = simplify_gen_binary (MULT, mode, op0, op1);
 	  if (TREE_CODE (exp) == WIDEN_MULT_EXPR)
 	    return op0;
diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi
index 2c14b7abce2db0a3da0a21e916907947cb56a265..3816abaaf4d364d604a44942317f96f3f303e5b6 100644
--- a/gcc/doc/generic.texi
+++ b/gcc/doc/generic.texi
@@ -1811,10 +1811,6 @@ a value from @code{enum annot_expr_kind}, the third is an @code{INTEGER_CST}.
 @tindex VEC_RSHIFT_EXPR
 @tindex VEC_WIDEN_MULT_HI_EXPR
 @tindex VEC_WIDEN_MULT_LO_EXPR
-@tindex VEC_WIDEN_PLUS_HI_EXPR
-@tindex VEC_WIDEN_PLUS_LO_EXPR
-@tindex VEC_WIDEN_MINUS_HI_EXPR
-@tindex VEC_WIDEN_MINUS_LO_EXPR
 @tindex VEC_UNPACK_HI_EXPR
 @tindex VEC_UNPACK_LO_EXPR
 @tindex VEC_UNPACK_FLOAT_HI_EXPR
@@ -1861,33 +1857,6 @@ vector of @code{N/2} products. In the case of @code{VEC_WIDEN_MULT_LO_EXPR} the
 low @code{N/2} elements of the two vector are multiplied to produce the
 vector of @code{N/2} products.
 
-@item VEC_WIDEN_PLUS_HI_EXPR
-@itemx VEC_WIDEN_PLUS_LO_EXPR
-These nodes represent widening vector addition of the high and low parts of
-the two input vectors, respectively.  Their operands are vectors that contain
-the same number of elements (@code{N}) of the same integral type. The result
-is a vector that contains half as many elements, of an integral type whose size
-is twice as wide.  In the case of @code{VEC_WIDEN_PLUS_HI_EXPR} the high
-@code{N/2} elements of the two vectors are added to produce the vector of
-@code{N/2} products.  In the case of @code{VEC_WIDEN_PLUS_LO_EXPR} the low
-@code{N/2} elements of the two vectors are added to produce the vector of
-@code{N/2} products.
-
-@item VEC_WIDEN_MINUS_HI_EXPR
-@itemx VEC_WIDEN_MINUS_LO_EXPR
-These nodes represent widening vector subtraction of the high and low parts of
-the two input vectors, respectively.  Their operands are vectors that contain
-the same number of elements (@code{N}) of the same integral type. The high/low
-elements of the second vector are subtracted from the high/low elements of the
-first. The result is a vector that contains half as many elements, of an
-integral type whose size is twice as wide.  In the case of
-@code{VEC_WIDEN_MINUS_HI_EXPR} the high @code{N/2} elements of the second
-vector are subtracted from the high @code{N/2} of the first to produce the
-vector of @code{N/2} products.  In the case of
-@code{VEC_WIDEN_MINUS_LO_EXPR} the low @code{N/2} elements of the second
-vector are subtracted from the low @code{N/2} of the first to produce the
-vector of @code{N/2} products.
-
 @item VEC_UNPACK_HI_EXPR
 @itemx VEC_UNPACK_LO_EXPR
 These nodes represent unpacking of the high and low parts of the input vector,
diff --git a/gcc/expr.cc b/gcc/expr.cc
index f8f5cc5a6ca67f291b3c8b7246d593c0be80272f..454d1391b19a7d2aa53f0a88876d1eaf0494de51 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -9601,8 +9601,6 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
 					  target, unsignedp);
       return target;
 
-    case WIDEN_PLUS_EXPR:
-    case WIDEN_MINUS_EXPR:
     case WIDEN_MULT_EXPR:
       /* If first operand is constant, swap them.
 	 Thus the following special case checks need only
@@ -10380,10 +10378,6 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
 	return temp;
       }
 
-    case VEC_WIDEN_PLUS_HI_EXPR:
-    case VEC_WIDEN_PLUS_LO_EXPR:
-    case VEC_WIDEN_MINUS_HI_EXPR:
-    case VEC_WIDEN_MINUS_LO_EXPR:
     case VEC_WIDEN_MULT_HI_EXPR:
     case VEC_WIDEN_MULT_LO_EXPR:
     case VEC_WIDEN_MULT_EVEN_EXPR:
diff --git a/gcc/gimple-pretty-print.cc b/gcc/gimple-pretty-print.cc
index 300e9d7ed1e7be73f30875e08c461a8880c3134e..d903826894e7f0dfd34dc0caad92eea3caa45e05 100644
--- a/gcc/gimple-pretty-print.cc
+++ b/gcc/gimple-pretty-print.cc
@@ -459,10 +459,6 @@ dump_binary_rhs (pretty_printer *buffer, const gassign *gs, int spc,
     case VEC_PACK_FLOAT_EXPR:
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
-    case VEC_WIDEN_PLUS_HI_EXPR:
-    case VEC_WIDEN_PLUS_LO_EXPR:
-    case VEC_WIDEN_MINUS_HI_EXPR:
-    case VEC_WIDEN_MINUS_LO_EXPR:
     case VEC_SERIES_EXPR:
       for (p = get_tree_code_name (code); *p; p++)
 	pp_character (buffer, TOUPPER (*p));
diff --git a/gcc/gimple-range-op.cc b/gcc/gimple-range-op.cc
index 4ca32a7b5d52f8426b09d1446a336650e143b41f..5ae7f7596c6fc6f901e4e47ae44f00185f4602b2 100644
--- a/gcc/gimple-range-op.cc
+++ b/gcc/gimple-range-op.cc
@@ -797,12 +797,6 @@ gimple_range_op_handler::maybe_non_standard ()
   if (gimple_code (m_stmt) == GIMPLE_ASSIGN)
     switch (gimple_assign_rhs_code (m_stmt))
       {
-	case WIDEN_PLUS_EXPR:
-	{
-	  signed_op = ptr_op_widen_plus_signed;
-	  unsigned_op = ptr_op_widen_plus_unsigned;
-	}
-	gcc_fallthrough ();
 	case WIDEN_MULT_EXPR:
 	{
 	  m_valid = false;
diff --git a/gcc/optabs-tree.cc b/gcc/optabs-tree.cc
index 8010046c6a8b3e809c989ddef7a06ddaa68ae32a..ee1aa8c9676ee9c67edbf403e6295da391826a62 100644
--- a/gcc/optabs-tree.cc
+++ b/gcc/optabs-tree.cc
@@ -190,22 +190,6 @@ optab_for_tree_code (enum tree_code code, const_tree type,
       return (TYPE_UNSIGNED (type)
 	      ? vec_widen_ushiftl_lo_optab : vec_widen_sshiftl_lo_optab);
 
-    case VEC_WIDEN_PLUS_LO_EXPR:
-      return (TYPE_UNSIGNED (type)
-	      ? vec_widen_uaddl_lo_optab : vec_widen_saddl_lo_optab);
-
-    case VEC_WIDEN_PLUS_HI_EXPR:
-      return (TYPE_UNSIGNED (type)
-	      ? vec_widen_uaddl_hi_optab : vec_widen_saddl_hi_optab);
-
-    case VEC_WIDEN_MINUS_LO_EXPR:
-      return (TYPE_UNSIGNED (type)
-	      ? vec_widen_usubl_lo_optab : vec_widen_ssubl_lo_optab);
-
-    case VEC_WIDEN_MINUS_HI_EXPR:
-      return (TYPE_UNSIGNED (type)
-	      ? vec_widen_usubl_hi_optab : vec_widen_ssubl_hi_optab);
-
     case VEC_UNPACK_HI_EXPR:
       return (TYPE_UNSIGNED (type)
 	      ? vec_unpacku_hi_optab : vec_unpacks_hi_optab);
@@ -312,8 +296,6 @@ optab_for_tree_code (enum tree_code code, const_tree type,
    'hi'/'lo' pair using codes such as VEC_WIDEN_MINUS_HI/LO.
 
    Supported widening operations:
-    WIDEN_MINUS_EXPR
-    WIDEN_PLUS_EXPR
     WIDEN_MULT_EXPR
     WIDEN_LSHIFT_EXPR
 
@@ -345,12 +327,6 @@ supportable_half_widening_operation (enum tree_code code, tree vectype_out,
     case WIDEN_LSHIFT_EXPR:
       *code1 = LSHIFT_EXPR;
       break;
-    case WIDEN_MINUS_EXPR:
-      *code1 = MINUS_EXPR;
-      break;
-    case WIDEN_PLUS_EXPR:
-      *code1 = PLUS_EXPR;
-      break;
     case WIDEN_MULT_EXPR:
       *code1 = MULT_EXPR;
       break;
diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
index a9fcc7fd050f871437ef336ecfb8d6cc81280ee0..f80cd1465df83b5540492e619e56b9af249e9f31 100644
--- a/gcc/tree-cfg.cc
+++ b/gcc/tree-cfg.cc
@@ -4017,8 +4017,6 @@ verify_gimple_assign_binary (gassign *stmt)
         return false;
       }
 
-    case WIDEN_PLUS_EXPR:
-    case WIDEN_MINUS_EXPR:
     case PLUS_EXPR:
     case MINUS_EXPR:
       {
@@ -4139,10 +4137,6 @@ verify_gimple_assign_binary (gassign *stmt)
         return false;
       }
 
-    case VEC_WIDEN_MINUS_HI_EXPR:
-    case VEC_WIDEN_MINUS_LO_EXPR:
-    case VEC_WIDEN_PLUS_HI_EXPR:
-    case VEC_WIDEN_PLUS_LO_EXPR:
     case VEC_WIDEN_MULT_HI_EXPR:
     case VEC_WIDEN_MULT_LO_EXPR:
     case VEC_WIDEN_MULT_EVEN_EXPR:
diff --git a/gcc/tree-inline.cc b/gcc/tree-inline.cc
index c702f0032a19203a7c536a01c1e7f47fc7b77add..6e5fd45a0c2435109dd3d50e8fc8e1d4969a1fd0 100644
--- a/gcc/tree-inline.cc
+++ b/gcc/tree-inline.cc
@@ -4273,8 +4273,6 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
 
     case REALIGN_LOAD_EXPR:
 
-    case WIDEN_PLUS_EXPR:
-    case WIDEN_MINUS_EXPR:
     case WIDEN_SUM_EXPR:
     case WIDEN_MULT_EXPR:
     case DOT_PROD_EXPR:
@@ -4283,10 +4281,6 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
     case WIDEN_MULT_MINUS_EXPR:
     case WIDEN_LSHIFT_EXPR:
 
-    case VEC_WIDEN_PLUS_HI_EXPR:
-    case VEC_WIDEN_PLUS_LO_EXPR:
-    case VEC_WIDEN_MINUS_HI_EXPR:
-    case VEC_WIDEN_MINUS_LO_EXPR:
     case VEC_WIDEN_MULT_HI_EXPR:
     case VEC_WIDEN_MULT_LO_EXPR:
     case VEC_WIDEN_MULT_EVEN_EXPR:
diff --git a/gcc/tree-pretty-print.cc b/gcc/tree-pretty-print.cc
index 7947f9647a15110b52d195643ad7d28ee32d4236..9941d8bf80535a98e647b8928619a6bf08bc434c 100644
--- a/gcc/tree-pretty-print.cc
+++ b/gcc/tree-pretty-print.cc
@@ -2874,8 +2874,6 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags,
       break;
 
       /* Binary arithmetic and logic expressions.  */
-    case WIDEN_PLUS_EXPR:
-    case WIDEN_MINUS_EXPR:
     case WIDEN_SUM_EXPR:
     case WIDEN_MULT_EXPR:
     case MULT_EXPR:
@@ -3831,10 +3829,6 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags,
     case VEC_SERIES_EXPR:
     case VEC_WIDEN_MULT_HI_EXPR:
     case VEC_WIDEN_MULT_LO_EXPR:
-    case VEC_WIDEN_PLUS_HI_EXPR:
-    case VEC_WIDEN_PLUS_LO_EXPR:
-    case VEC_WIDEN_MINUS_HI_EXPR:
-    case VEC_WIDEN_MINUS_LO_EXPR:
     case VEC_WIDEN_MULT_EVEN_EXPR:
     case VEC_WIDEN_MULT_ODD_EXPR:
     case VEC_WIDEN_LSHIFT_HI_EXPR:
@@ -4352,12 +4346,6 @@ op_symbol_code (enum tree_code code)
     case WIDEN_LSHIFT_EXPR:
       return "w<<";
 
-    case WIDEN_PLUS_EXPR:
-      return "w+";
-
-    case WIDEN_MINUS_EXPR:
-      return "w-";
-
     case POINTER_PLUS_EXPR:
       return "+";
 
diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
index 8daf7bd7dd34d043b1d7b4cba1779f0ecf9f520a..213a3899a6c145bb057cd118bec1df7a05728aef 100644
--- a/gcc/tree-vect-data-refs.cc
+++ b/gcc/tree-vect-data-refs.cc
@@ -136,8 +136,6 @@ vect_get_smallest_scalar_type (stmt_vec_info stmt_info, tree scalar_type)
 	  || gimple_assign_rhs_code (assign) == WIDEN_SUM_EXPR
 	  || gimple_assign_rhs_code (assign) == WIDEN_MULT_EXPR
 	  || gimple_assign_rhs_code (assign) == WIDEN_LSHIFT_EXPR
-	  || gimple_assign_rhs_code (assign) == WIDEN_PLUS_EXPR
-	  || gimple_assign_rhs_code (assign) == WIDEN_MINUS_EXPR
 	  || gimple_assign_rhs_code (assign) == FLOAT_EXPR)
 	{
 	  tree rhs_type = TREE_TYPE (gimple_assign_rhs1 (assign));
diff --git a/gcc/tree-vect-generic.cc b/gcc/tree-vect-generic.cc
index 445da53292e9d1d2db62ca962fc017bb0e6c9bbe..342ffc5fa7f3b8f37e6bd4658d2f1fccf1d2c7fa 100644
--- a/gcc/tree-vect-generic.cc
+++ b/gcc/tree-vect-generic.cc
@@ -2227,10 +2227,6 @@ expand_vector_operations_1 (gimple_stmt_iterator *gsi,
      arguments, not the widened result.  VEC_UNPACK_FLOAT_*_EXPR is
      calculated in the same way above.  */
   if (code == WIDEN_SUM_EXPR
-      || code == VEC_WIDEN_PLUS_HI_EXPR
-      || code == VEC_WIDEN_PLUS_LO_EXPR
-      || code == VEC_WIDEN_MINUS_HI_EXPR
-      || code == VEC_WIDEN_MINUS_LO_EXPR
       || code == VEC_WIDEN_MULT_HI_EXPR
       || code == VEC_WIDEN_MULT_LO_EXPR
       || code == VEC_WIDEN_MULT_EVEN_EXPR
diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index 3175dd92187c0935f78ebbf2eb476bdcf8b4ccd1..ab3162b5ac66ea8a96c0ea7c45138ca5ee13423f 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -561,21 +561,30 @@ vect_joust_widened_type (tree type, tree new_type, tree *common_type)
 
 static unsigned int
 vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code,
-		      tree_code widened_code, bool shift_p,
+		      code_helper widened_code, bool shift_p,
 		      unsigned int max_nops,
 		      vect_unpromoted_value *unprom, tree *common_type,
 		      enum optab_subtype *subtype = NULL)
 {
   /* Check for an integer operation with the right code.  */
-  gassign *assign = dyn_cast <gassign *> (stmt_info->stmt);
-  if (!assign)
+  gimple* stmt = stmt_info->stmt;
+  if (!(is_gimple_assign (stmt) || is_gimple_call (stmt)))
+    return 0;
+
+  code_helper rhs_code;
+  if (is_gimple_assign (stmt))
+    rhs_code = gimple_assign_rhs_code (stmt);
+  else if (is_gimple_call (stmt))
+    rhs_code = gimple_call_combined_fn (stmt);
+  else
     return 0;
 
-  tree_code rhs_code = gimple_assign_rhs_code (assign);
-  if (rhs_code != code && rhs_code != widened_code)
+  if (rhs_code != code
+      && rhs_code != widened_code)
     return 0;
 
-  tree type = TREE_TYPE (gimple_assign_lhs (assign));
+  tree lhs = gimple_get_lhs (stmt);
+  tree type = TREE_TYPE (lhs);
   if (!INTEGRAL_TYPE_P (type))
     return 0;
 
@@ -588,7 +597,7 @@ vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code,
     {
       vect_unpromoted_value *this_unprom = &unprom[next_op];
       unsigned int nops = 1;
-      tree op = gimple_op (assign, i + 1);
+      tree op = gimple_arg (stmt, i);
       if (i == 1 && TREE_CODE (op) == INTEGER_CST)
 	{
 	  /* We already have a common type from earlier operands.
@@ -1342,8 +1351,9 @@ vect_recog_sad_pattern (vec_info *vinfo,
   /* FORNOW.  Can continue analyzing the def-use chain when this stmt in a phi
      inside the loop (in case we are analyzing an outer-loop).  */
   vect_unpromoted_value unprom[2];
-  if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR, WIDEN_MINUS_EXPR,
-			     false, 2, unprom, &half_type))
+  if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR,
+			     CFN_VEC_WIDEN_MINUS, false, 2, unprom,
+			     &half_type))
     return NULL;
 
   vect_pattern_detected ("vect_recog_sad_pattern", last_stmt);
@@ -2696,9 +2706,10 @@ vect_recog_average_pattern (vec_info *vinfo,
   internal_fn ifn = IFN_AVG_FLOOR;
   vect_unpromoted_value unprom[3];
   tree new_type;
+  enum optab_subtype subtype;
   unsigned int nops = vect_widened_op_tree (vinfo, plus_stmt_info, PLUS_EXPR,
-					    WIDEN_PLUS_EXPR, false, 3,
-					    unprom, &new_type);
+					    CFN_VEC_WIDEN_PLUS, false, 3,
+					    unprom, &new_type, &subtype);
   if (nops == 0)
     return NULL;
   if (nops == 3)
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 2a7ef2439e12d1966e8884433963a3d387a856b7..ef3ac551f7fe247893b021d98e43c581e2078dbb 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -5032,9 +5032,7 @@ vectorizable_conversion (vec_info *vinfo,
   else
     return false;
 
-  bool widen_arith = (code == WIDEN_PLUS_EXPR
-		 || code == WIDEN_MINUS_EXPR
-		 || code == WIDEN_MULT_EXPR
+  bool widen_arith = (code == WIDEN_MULT_EXPR
 		 || code == WIDEN_LSHIFT_EXPR
 		 || code == IFN_VEC_WIDEN_PLUS
 		 || code == IFN_VEC_WIDEN_MINUS);
@@ -5086,8 +5084,6 @@ vectorizable_conversion (vec_info *vinfo,
     {
       gcc_assert (code == WIDEN_MULT_EXPR
 		  || code == WIDEN_LSHIFT_EXPR
-		  || code == WIDEN_PLUS_EXPR
-		  || code == WIDEN_MINUS_EXPR
 		  || code == IFN_VEC_WIDEN_PLUS
 		  || code == IFN_VEC_WIDEN_MINUS);
 
@@ -12192,7 +12188,7 @@ supportable_widening_operation (vec_info *vinfo,
   class loop *vect_loop = NULL;
   machine_mode vec_mode;
   enum insn_code icode1, icode2;
-  optab optab1, optab2;
+  optab optab1 = unknown_optab, optab2 = unknown_optab;
   tree vectype = vectype_in;
   tree wide_vectype = vectype_out;
   tree_code c1 = MAX_TREE_CODES, c2 = MAX_TREE_CODES;
@@ -12290,16 +12286,6 @@ supportable_widening_operation (vec_info *vinfo,
       c2 = VEC_WIDEN_LSHIFT_HI_EXPR;
       break;
 
-    case WIDEN_PLUS_EXPR:
-      c1 = VEC_WIDEN_PLUS_LO_EXPR;
-      c2 = VEC_WIDEN_PLUS_HI_EXPR;
-      break;
-
-    case WIDEN_MINUS_EXPR:
-      c1 = VEC_WIDEN_MINUS_LO_EXPR;
-      c2 = VEC_WIDEN_MINUS_HI_EXPR;
-      break;
-
     CASE_CONVERT:
       c1 = VEC_UNPACK_LO_EXPR;
       c2 = VEC_UNPACK_HI_EXPR;
diff --git a/gcc/tree.def b/gcc/tree.def
index ee02754354f015a16737c7e879d89c3e3be0d5aa..a58e608a90078818a7ade9d1173ac7ec84c48c7a 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -1374,15 +1374,16 @@ DEFTREECODE (DOT_PROD_EXPR, "dot_prod_expr", tcc_expression, 3)
 DEFTREECODE (WIDEN_SUM_EXPR, "widen_sum_expr", tcc_binary, 2)
 
 /* Widening sad (sum of absolute differences).
-   The first two arguments are of type t1 which should be integer.
-   The third argument and the result are of type t2, such that t2 is at least
-   twice the size of t1.  Like DOT_PROD_EXPR, SAD_EXPR (arg1,arg2,arg3) is
+   The first two arguments are of type t1 which should be a vector of integers.
+   The third argument and the result are of type t2, such that the size of
+   the elements of t2 is at least twice the size of the elements of t1.
+   Like DOT_PROD_EXPR, SAD_EXPR (arg1,arg2,arg3) is
    equivalent to:
-       tmp = WIDEN_MINUS_EXPR (arg1, arg2)
+       tmp = IFN_VEC_WIDEN_MINUS_EXPR (arg1, arg2)
        tmp2 = ABS_EXPR (tmp)
        arg3 = PLUS_EXPR (tmp2, arg3)
   or:
-       tmp = WIDEN_MINUS_EXPR (arg1, arg2)
+       tmp = IFN_VEC_WIDEN_MINUS_EXPR (arg1, arg2)
        tmp2 = ABS_EXPR (tmp)
        arg3 = WIDEN_SUM_EXPR (tmp2, arg3)
  */
@@ -1421,8 +1422,6 @@ DEFTREECODE (WIDEN_MULT_MINUS_EXPR, "widen_mult_minus_expr", tcc_expression, 3)
    the first argument from type t1 to type t2, and then shifting it
    by the second argument.  */
 DEFTREECODE (WIDEN_LSHIFT_EXPR, "widen_lshift_expr", tcc_binary, 2)
-DEFTREECODE (WIDEN_PLUS_EXPR, "widen_plus_expr", tcc_binary, 2)
-DEFTREECODE (WIDEN_MINUS_EXPR, "widen_minus_expr", tcc_binary, 2)
 
 /* Widening vector multiplication.
    The two operands are vectors with N elements of size S. Multiplying the
@@ -1487,10 +1486,6 @@ DEFTREECODE (VEC_PACK_FLOAT_EXPR, "vec_pack_float_expr", tcc_binary, 2)
  */
 DEFTREECODE (VEC_WIDEN_LSHIFT_HI_EXPR, "widen_lshift_hi_expr", tcc_binary, 2)
 DEFTREECODE (VEC_WIDEN_LSHIFT_LO_EXPR, "widen_lshift_lo_expr", tcc_binary, 2)
-DEFTREECODE (VEC_WIDEN_PLUS_HI_EXPR, "widen_plus_hi_expr", tcc_binary, 2)
-DEFTREECODE (VEC_WIDEN_PLUS_LO_EXPR, "widen_plus_lo_expr", tcc_binary, 2)
-DEFTREECODE (VEC_WIDEN_MINUS_HI_EXPR, "widen_minus_hi_expr", tcc_binary, 2)
-DEFTREECODE (VEC_WIDEN_MINUS_LO_EXPR, "widen_minus_lo_expr", tcc_binary, 2)
 
 /* PREDICT_EXPR.  Specify hint for branch prediction.  The
    PREDICT_EXPR_PREDICTOR specify predictor and PREDICT_EXPR_OUTCOME the
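
The SAD_EXPR comment updated in the tree.def hunk above describes a three-step sequence: widening subtract, absolute value, accumulate. A minimal standalone sketch of that arithmetic follows. It models the scalar loop the pattern is recognised from, not GCC's vector layout; `sad_accumulate` and the choice of 8-bit inputs with a 32-bit accumulator are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical helper, not a GCC function: accumulate the sum of absolute
// differences of two equally sized arrays of narrow elements into a wider
// accumulator.
std::uint32_t
sad_accumulate (const std::vector<std::uint8_t> &arg1,
                const std::vector<std::uint8_t> &arg2,
                std::uint32_t arg3)
{
  assert (arg1.size () == arg2.size ());
  for (std::size_t i = 0; i < arg1.size (); ++i)
    {
      // tmp = widening minus: both operands are widened before subtracting,
      // so the difference cannot wrap in the narrow type.
      int tmp = static_cast<int> (arg1[i]) - static_cast<int> (arg2[i]);
      // tmp2 = ABS_EXPR (tmp)
      int tmp2 = std::abs (tmp);
      // arg3 = PLUS_EXPR (tmp2, arg3)
      arg3 += static_cast<std::uint32_t> (tmp2);
    }
  return arg3;
}
```

The widening step is what makes the subtraction safe: computing `arg1[i] - arg2[i]` in the 8-bit type would wrap whenever the second operand is larger.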


* Re: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns
  2023-04-25 12:30                             ` Richard Biener
@ 2023-04-28 16:06                               ` Andre Vieira (lists)
  0 siblings, 0 replies; 53+ messages in thread
From: Andre Vieira (lists) @ 2023-04-28 16:06 UTC (permalink / raw)
  To: Richard Biener, Richard Sandiford; +Cc: Richard Biener, gcc-patches



On 25/04/2023 13:30, Richard Biener wrote:
> On Mon, 24 Apr 2023, Richard Sandiford wrote:
> 
>> Richard Biener <richard.guenther@gmail.com> writes:
>>> On Thu, Apr 20, 2023 at 3:24 PM Andre Vieira (lists) via Gcc-patches
>>> <gcc-patches@gcc.gnu.org> wrote:
>>>>
>>>> Rebased all three patches and made some small changes to the second one:
>>>> - removed sub and abd optabs from commutative_optab_p, I suspect this
>>>> was a copy paste mistake,
>>>> - removed what I believe to be a superfluous switch case in vectorizable
>>>> conversion, the one that was here:
>>>> +  if (code.is_fn_code ())
>>>> +     {
>>>> +      internal_fn ifn = as_internal_fn (code.as_fn_code ());
>>>> +      int ecf_flags = internal_fn_flags (ifn);
>>>> +      gcc_assert (ecf_flags & ECF_MULTI);
>>>> +
>>>> +      switch (code.as_fn_code ())
>>>> +       {
>>>> +       case CFN_VEC_WIDEN_PLUS:
>>>> +         break;
>>>> +       case CFN_VEC_WIDEN_MINUS:
>>>> +         break;
>>>> +       case CFN_LAST:
>>>> +       default:
>>>> +         return false;
>>>> +       }
>>>> +
>>>> +      internal_fn lo, hi;
>>>> +      lookup_multi_internal_fn (ifn, &lo, &hi);
>>>> +      *code1 = as_combined_fn (lo);
>>>> +      *code2 = as_combined_fn (hi);
>>>> +      optab1 = lookup_multi_ifn_optab (lo, !TYPE_UNSIGNED (vectype));
>>>> +      optab2 = lookup_multi_ifn_optab (hi, !TYPE_UNSIGNED (vectype));
>>>>        }
>>>>
>>>> I don't think we need to check they are a specific fn code, as we look up
>>>> optabs and if they succeed then surely we can vectorize?
>>>>
>>>> OK for trunk?
>>>
>>> In the first patch I see some uses of safe_as_tree_code like
>>>
>>> +  if (ch.is_tree_code ())
>>> +    return op1 == NULL_TREE ? gimple_build_assign (lhs,
>>> ch.safe_as_tree_code (),
>>> +                                                  op0) :
>>> +                             gimple_build_assign (lhs, ch.safe_as_tree_code (),
>>> +                                                  op0, op1);
>>> +  else
>>> +  {
>>> +    internal_fn fn = as_internal_fn (ch.safe_as_fn_code ());
>>> +    gimple* stmt;
>>>
>>> where the context actually requires a valid tree code.  Please change those
>>> to force to tree code / ifn code.  Just use explicit casts here and the other
>>> places that are similar.  Before the as_internal_fn just put a
>>> gcc_assert (ch.is_internal_fn ()).
>>
>> Also, doesn't the above ?: simplify to the "else" arm?  Null trailing
>> arguments would be ignored for unary operators.
>>
>> I wasn't sure what to make of the op0 handling:
>>
>>> +/* Build a GIMPLE_ASSIGN or GIMPLE_CALL with the tree_code,
>>> +   or internal_fn contained in ch, respectively.  */
>>> +gimple *
>>> +vect_gimple_build (tree lhs, code_helper ch, tree op0, tree op1)
>>> +{
>>> +  if (op0 == NULL_TREE)
>>> +    return NULL;
>>
>> Can that happen, and if so, does returning null make sense?
>> Maybe an assert would be safer.
> 
> Yeah, I was hoping to have a look whether the new gimple_build
> overloads could be used to make this all better (but hoped we can
> finally get this series in in some way).
> 
> Richard.

Yeah, in the newest version of the first patch of the series I found 
that most of the time I can get away with only really needing to 
distinguish between tree_code and internal_fn when building gimple, for 
which it currently uses vect_gimple_build, but it does feel like that 
could easily be a gimple function.

Having said that, as I partially mention in the patch, I didn't rewrite 
the optabs-tree supportable_half_widening and supportable_conversion (or 
whatever they are called) because those also at some point need to 
access the stmt and there is a massive difference in how we handle 
gassigns and gcall's from that perspective, but maybe we can generalize 
that too somehow...

Anyway, have a look at the new versions (posted just a few minutes after
the email I'm replying to, haha! timing :P)


* Re: [PATCH 1/3] Refactor to allow internal_fn's
  2023-04-28 12:36                             ` [PATCH 1/3] Refactor to allow internal_fn's Andre Vieira (lists)
@ 2023-05-03 11:55                               ` Richard Biener
  2023-05-04 15:20                                 ` Andre Vieira (lists)
  0 siblings, 1 reply; 53+ messages in thread
From: Richard Biener @ 2023-05-03 11:55 UTC (permalink / raw)
  To: Andre Vieira (lists); +Cc: Richard Biener, Richard Sandiford, gcc-patches

On Fri, 28 Apr 2023, Andre Vieira (lists) wrote:

> Hi,
> 
> I'm posting the patches separately now with ChangeLogs.
> 
> I made the suggested changes and tried to simplify the code a bit further.
> Where internal to tree-vect-stmts I changed most functions to use code_helper
> to avoid having to check at places we didn't need to. I was trying to simplify
> things further by also modifying supportable_half_widening_operation and
> supportable_convert_operation but the result of that was that I ended up
> moving the code to cast to tree code inside them rather than at the call site
> and it didn't look simpler, so I left those. Though if we did make those
> changes we'd no longer need to keep around the tc1 variable in
> vectorizable_conversion... Let me know what you think.

I see that

-  else if (CONVERT_EXPR_CODE_P (code)
+  else if (CONVERT_EXPR_CODE_P (code.safe_as_tree_code ())

is convenient (as much as I dislike safe_as_tree_code).  Isn't
the following

-  if (!CONVERT_EXPR_CODE_P (code))
+  if (!CONVERT_EXPR_CODE_P ((tree_code) code))
     return false;

then wrong?  In other places you added an assert - I assume
that we might want a checking assert in the cast operators?
(those were added mainly for convenience, maybe we want
as_a <>, etc. here - not sure if those will play well with
enums though).  Just suggestions for eventual followups in
this area.

+inline enum tree_code
+code_helper::safe_as_tree_code () const
+{
+  return is_tree_code () ? (tree_code) *this : MAX_TREE_CODES;
+}
+
+inline combined_fn
+code_helper::safe_as_fn_code () const {
+  return is_fn_code () ? (combined_fn) *this : CFN_LAST;
+}
+

newline after the last 'const'.  Can you place a comment before
these to explain their intended use?  I.e., that they give a safe
value in case the code isn't of the desired kind?
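
To make the distinction between the "safe" accessors and a checked cast concrete, here is a small standalone model. It is only a sketch: `code_helper_model`, the shortened enums, and the bool tag are hypothetical stand-ins, and the real code_helper in tree.h is implemented differently. What it mirrors from the patch is the behaviour under discussion: the safe accessors hand back MAX_TREE_CODES or CFN_LAST when the helper holds the other kind, while the strict ones assert.

```cpp
#include <cassert>

// Hypothetical, heavily shortened stand-ins for GCC's enums.
enum tree_code { PLUS_EXPR, MINUS_EXPR, MAX_TREE_CODES };
enum combined_fn { CFN_VEC_WIDEN_PLUS, CFN_VEC_WIDEN_MINUS, CFN_LAST };

// Toy model of a helper that holds either a tree_code or a combined_fn.
class code_helper_model
{
public:
  code_helper_model (tree_code code) : m_is_fn (false), m_rep (code) {}
  code_helper_model (combined_fn fn) : m_is_fn (true), m_rep (fn) {}

  bool is_tree_code () const { return !m_is_fn; }
  bool is_fn_code () const { return m_is_fn; }

  // Strict accessors: the caller must already know the kind, so a
  // mismatch is a bug and trips the assert.
  tree_code as_tree_code () const
  {
    assert (is_tree_code ());
    return static_cast<tree_code> (m_rep);
  }

  combined_fn as_fn_code () const
  {
    assert (is_fn_code ());
    return static_cast<combined_fn> (m_rep);
  }

  // Safe accessors: return a sentinel that matches no real code when the
  // helper holds the other kind, so comparisons simply fail.
  tree_code safe_as_tree_code () const
  {
    return is_tree_code () ? static_cast<tree_code> (m_rep) : MAX_TREE_CODES;
  }

  combined_fn safe_as_fn_code () const
  {
    return is_fn_code () ? static_cast<combined_fn> (m_rep) : CFN_LAST;
  }

private:
  bool m_is_fn;
  int m_rep;
};
```

With this shape, a predicate such as `code.safe_as_tree_code () == PLUS_EXPR` is simply false for an internal function, whereas `code.as_tree_code ()` would abort in a checking build.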

The patch is OK with just the last bit fixed.

Thanks,
Richard.

> gcc/ChangeLog:
> 
> 2023-04-28  Andre Vieira  <andre.simoesdiasvieira@arm.com>
>             Joel Hutton  <joel.hutton@arm.com>
> 
>         * tree-vect-patterns.cc (vect_gimple_build): New Function.
>         (vect_recog_widen_op_pattern): Refactor to use code_helper.
>         * tree-vect-stmts.cc (vect_gen_widened_results_half): Likewise.
>         (vect_create_vectorized_demotion_stmts): Likewise.
>         (vect_create_vectorized_promotion_stmts): Likewise.
>         (vect_create_half_widening_stmts): Likewise.
>         (vectorizable_conversion): Likewise.
>         (vectorizable_call): Likewise.
>         (supportable_widening_operation): Likewise.
>         (supportable_narrowing_operation): Likewise.
>         (simple_integer_narrowing): Likewise.
>         * tree-vectorizer.h (supportable_widening_operation): Likewise.
>         (supportable_narrowing_operation): Likewise.
>         (vect_gimple_build): New function prototype.
>         * tree.h (code_helper::safe_as_tree_code): New function.
>         (code_helper::safe_as_fn_code): New function.
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)


* Re: [PATCH 2/3] Refactor widen_plus as internal_fn
  2023-04-28 12:37                             ` [PATCH 2/3] Refactor widen_plus as internal_fn Andre Vieira (lists)
@ 2023-05-03 12:11                               ` Richard Biener
  2023-05-03 19:07                                 ` Richard Sandiford
  0 siblings, 1 reply; 53+ messages in thread
From: Richard Biener @ 2023-05-03 12:11 UTC (permalink / raw)
  To: Andre Vieira (lists); +Cc: Richard Biener, Richard Sandiford, gcc-patches

On Fri, 28 Apr 2023, Andre Vieira (lists) wrote:

> This patch replaces the existing tree_code widen_plus and widen_minus
> patterns with internal_fn versions.
> 
> DEF_INTERNAL_OPTAB_HILO_FN is like DEF_INTERNAL_OPTAB_FN except it provides
> convenience wrappers for defining conversions that require a hi/lo split, like
> widening and narrowing operations.  Each definition for <NAME> will require an
> optab named <OPTAB> and two other optabs that you specify for signed and
> unsigned. The hi/lo pair is necessary because the widening operations take n
> narrow elements as inputs and return n/2 wide elements as outputs. The 'lo'
> operation operates on the first n/2 elements of input. The 'hi' operation
> operates on the second n/2 elements of input. Defining an internal_fn along
> with hi/lo variations allows a single internal function to be returned from a
> vect_recog function that will later be expanded to hi/lo.
> 
> DEF_INTERNAL_OPTAB_HILO_FN is used in internal-fn.def to register a widening
> internal_fn. It is defined differently in different places and internal-fn.def
> is sourced from those places so the parameters given can be reused.
>   internal-fn.c: defined to expand to hi/lo signed/unsigned optabs, later
> defined to generate the  'expand_' functions for the hi/lo versions of the fn.
>   internal-fn.def: defined to invoke DEF_INTERNAL_OPTAB_FN for the original
> and hi/lo variants of the internal_fn
> 
>  For example:
>  IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO
> for aarch64: IFN_VEC_WIDEN_PLUS_HI   -> vec_widen_<su>addl_hi_<mode> ->
> (u/s)addl2
>                        IFN_VEC_WIDEN_PLUS_LO  -> vec_widen_<su>addl_lo_<mode>
> -> (u/s)addl
> 
> This gives the same functionality as the previous WIDEN_PLUS/WIDEN_MINUS tree
> codes which are expanded into VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI.
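
The hi/lo split described in the quoted text can be illustrated with a standalone sketch. It assumes 8-bit unsigned inputs widened to 16 bits and follows the description above in treating 'lo' as the first n/2 input elements and 'hi' as the second n/2; the function names are hypothetical and this is not how GCC expands the internal functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: widen and add n/2 lanes of A and B starting at START.
static std::vector<std::uint16_t>
widen_plus_half (const std::vector<std::uint8_t> &a,
                 const std::vector<std::uint8_t> &b, std::size_t start)
{
  assert (a.size () == b.size () && a.size () % 2 == 0);
  std::size_t half = a.size () / 2;
  std::vector<std::uint16_t> out (half);
  for (std::size_t i = 0; i < half; ++i)
    // Widen each operand first so the sum cannot wrap in 8 bits.
    out[i] = static_cast<std::uint16_t> (static_cast<unsigned> (a[start + i])
                                         + static_cast<unsigned> (b[start + i]));
  return out;
}

// 'lo' variant: operates on the first n/2 input elements.
std::vector<std::uint16_t>
widen_plus_lo (const std::vector<std::uint8_t> &a,
               const std::vector<std::uint8_t> &b)
{
  return widen_plus_half (a, b, 0);
}

// 'hi' variant: operates on the second n/2 input elements.
std::vector<std::uint16_t>
widen_plus_hi (const std::vector<std::uint8_t> &a,
               const std::vector<std::uint8_t> &b)
{
  return widen_plus_half (a, b, a.size () / 2);
}
```

Each call consumes n narrow elements per operand and produces n/2 wide results, which is why one unsplit operation has to become a lo/hi pair to cover the whole input.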

I'll note that it's interesting we have widen multiplication as
the only existing example where we have both HI/LO and EVEN/ODD cases.
I think we want to share as much of the infrastructure to eventually
support targets doing even/odd (I guess all VLA vector targets will
be even/odd?).

DEF_INTERNAL_OPTAB_HILO_FN also looks to be implicitly directed to
widening operations (otherwise no signed/unsigned variants would be
necessary).  What I don't understand is why we need an optab
without _hi/_lo but in that case no signed/unsigned variant?

Looks like all plus, plus_lo and plus_hi are commutative but
only plus is widening?!  So is the setup that the vectorizer
doesn't know about the split and uses 'plus' but then the
expander performs the split?  It does look a bit awkward here
(the plain 'plus' is just used for the scalar case during
pattern recog it seems).

I'd rather have DEF_INTERNAL_OPTAB_HILO_FN split up, declaring
the hi/lo pairs and the scalar variant separately using
DEF_INTERNAL_FN without expander for that, and having
DEF_INTERNAL_HILO_WIDEN_OPTAB_FN and DEF_INTERNAL_EVENODD_WIDEN_OPTAB_FN
for the signed/unsigned pairs?  (if we need that helper at all)

Targets shouldn't need to implement the plain optab (it shouldn't
exist) and the vectorizer should query the hi/lo or even/odd
optabs for support instead.

The vectorizer parts look OK to me, I'd like Richard to chime
in on the optab parts as well.

Thanks,
Richard.

> gcc/ChangeLog:
> 
> 2023-04-28  Andre Vieira  <andre.simoesdiasvieira@arm.com>
>             Joel Hutton  <joel.hutton@arm.com>
>             Tamar Christina  <tamar.christina@arm.com>
> 
>     	* internal-fn.cc (INCLUDE_MAP): Include maps for use in optab
>     lookup.
>     	(DEF_INTERNAL_OPTAB_HILO_FN): Macro to define an internal_fn that
>     expands into multiple internal_fns (for widening).
> 	(ifn_cmp): Function to compare ifn's for sorting/searching.
> 	(lookup_hilo_ifn_optab): Add lookup function.
> 	(lookup_hilo_internal_fn): Add lookup function.
> 	(commutative_binary_fn_p): Add widen_plus fn's.
> 	(widening_fn_p): New function.
> 	(decomposes_to_hilo_fn_p): New function.
> 	* internal-fn.def (DEF_INTERNAL_OPTAB_HILO_FN): Define widening
>     plus,minus functions.
> 	(VEC_WIDEN_PLUS): Replacement for VEC_WIDEN_PLUS tree code.
> 	(VEC_WIDEN_MINUS): Replacement for VEC_WIDEN_MINUS tree code.
> 	* internal-fn.h (GCC_INTERNAL_FN_H): Add headers.
> 	(lookup_hilo_ifn_optab): Add prototype.
> 	(lookup_hilo_internal_fn): Likewise.
> 	(widening_fn_p): Likewise.
> 	(decomposes_to_hilo_fn_p): Likewise.
> 	* optabs.cc (commutative_optab_p): Add widening plus, minus optabs.
> 	* optabs.def (OPTAB_CD): widen add, sub optabs
> 	* tree-vect-patterns.cc (vect_recog_widen_op_pattern): Support
>     patterns with a hi/lo split.
>     	(vect_recog_widen_plus_pattern): Refactor to return
>     IFN_VEC_WIDEN_PLUS.
>     	(vect_recog_widen_minus_pattern): Refactor to return new
>     IFN_VEC_WIDEN_MINUS.
>     	* tree-vect-stmts.cc (vectorizable_conversion): Add widen plus/minus
>     ifn
>     support.
> 	(supportable_widening_operation): Add widen plus/minus ifn support.
> 
> gcc/testsuite/ChangeLog:
> 
>     	* gcc.target/aarch64/vect-widen-add.c: Test that new
>     IFN_VEC_WIDEN_PLUS is being used.
>     	* gcc.target/aarch64/vect-widen-sub.c: Test that new
>     IFN_VEC_WIDEN_MINUS is being used.
> 



* Re: [PATCH 3/3] Remove widen_plus/minus_expr tree codes
  2023-04-28 12:37                             ` [PATCH 3/3] Remove widen_plus/minus_expr tree codes Andre Vieira (lists)
@ 2023-05-03 12:29                               ` Richard Biener
  2023-05-10  9:15                                 ` Andre Vieira (lists)
  0 siblings, 1 reply; 53+ messages in thread
From: Richard Biener @ 2023-05-03 12:29 UTC (permalink / raw)
  To: Andre Vieira (lists); +Cc: Richard Biener, Richard Sandiford, gcc-patches

On Fri, 28 Apr 2023, Andre Vieira (lists) wrote:

> This is a rebase of Joel's previous patch.
> 
> This patch removes the old widen plus/minus tree codes which have been
> replaced by internal functions.

I guess that's obvious then.  I wonder what we do to internal
fns in debug stmts?  Looks like we throw those away and do not
generate debug stmts from calls.

Given you remove handling of the scalar WIDEN_PLUS/MINUS_EXPR
codes everywhere do we want to add checking code the scalar
IFNs do not appear in the IL?  For at least some cases there
are corresponding functions handling internal functions that
you could have amended otherwise.

Richard.

> gcc/ChangeLog:
> 
> 2023-04-28  Andre Vieira  <andre.simoesdiasvieira@arm.com>
>             Joel Hutton  <joel.hutton@arm.com>
> 
> 	* doc/generic.texi: Remove old tree codes.
> 	* expr.cc (expand_expr_real_2): Remove old tree code cases.
> 	* gimple-pretty-print.cc (dump_binary_rhs): Likewise.
> 	* optabs-tree.cc (optab_for_tree_code): Likewise.
> 	(supportable_half_widening_operation): Likewise.
> 	* tree-cfg.cc (verify_gimple_assign_binary): Likewise.
> 	* tree-inline.cc (estimate_operator_cost): Likewise.
> 	(op_symbol_code): Likewise.
> 	* tree-vect-data-refs.cc (vect_get_smallest_scalar_type): Likewise.
> 	(vect_analyze_data_ref_accesses): Likewise.
> 	* tree-vect-generic.cc (expand_vector_operations_1): Likewise.
> 	* cfgexpand.cc (expand_debug_expr): Likewise.
> 	* tree-vect-stmts.cc (vectorizable_conversion): Likewise.
> 	(supportable_widening_operation): Likewise.
> 	* gimple-range-op.cc (gimple_range_op_handler::maybe_non_standard):
> 	Likewise.
> 	* tree-vect-patterns.cc (vect_widened_op_tree): Refactor to replace
> 	usage in vect_recog_sad_pattern.
> 	(vect_recog_sad_pattern): Replace tree code widening pattern with
> 	internal function.
> 	(vect_recog_average_pattern): Likewise.
> 	* tree-pretty-print.cc (dump_generic_node): Remove tree code
> 	definition.
> 	* tree.def (WIDEN_PLUS_EXPR, WIDEN_MINUS_EXPR, VEC_WIDEN_PLUS_HI_EXPR,
> 	VEC_WIDEN_PLUS_LO_EXPR, VEC_WIDEN_MINUS_HI_EXPR,
> 	VEC_WIDEN_MINUS_LO_EXPR): Likewise
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)


* Re: [PATCH 2/3] Refactor widen_plus as internal_fn
  2023-05-03 12:11                               ` Richard Biener
@ 2023-05-03 19:07                                 ` Richard Sandiford
  2023-05-12 12:16                                   ` Andre Vieira (lists)
  0 siblings, 1 reply; 53+ messages in thread
From: Richard Sandiford @ 2023-05-03 19:07 UTC (permalink / raw)
  To: Richard Biener; +Cc: Andre Vieira (lists), Richard Biener, gcc-patches

Richard Biener <rguenther@suse.de> writes:
> On Fri, 28 Apr 2023, Andre Vieira (lists) wrote:
>
>> This patch replaces the existing tree_code widen_plus and widen_minus
>> patterns with internal_fn versions.
>> 
>> DEF_INTERNAL_OPTAB_HILO_FN is like DEF_INTERNAL_OPTAB_FN except it provides
>> convenience wrappers for defining conversions that require a hi/lo split, like
>> widening and narrowing operations.  Each definition for <NAME> will require an
>> optab named <OPTAB> and two other optabs that you specify for signed and
>> unsigned. The hi/lo pair is necessary because the widening operations take n
>> narrow elements as inputs and return n/2 wide elements as outputs. The 'lo'
>> operation operates on the first n/2 elements of input. The 'hi' operation
>> operates on the second n/2 elements of input. Defining an internal_fn along
>> with hi/lo variations allows a single internal function to be returned from a
>> vect_recog function that will later be expanded to hi/lo.
>> 
>> DEF_INTERNAL_OPTAB_HILO_FN is used in internal-fn.def to register a widening
>> internal_fn. It is defined differently in different places and internal-fn.def
>> is sourced from those places so the parameters given can be reused.
>>   internal-fn.c: defined to expand to hi/lo signed/unsigned optabs, later
>> defined to generate the  'expand_' functions for the hi/lo versions of the fn.
>>   internal-fn.def: defined to invoke DEF_INTERNAL_OPTAB_FN for the original
>> and hi/lo variants of the internal_fn
>> 
>>  For example:
>>  IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO
>> for aarch64: IFN_VEC_WIDEN_PLUS_HI   -> vec_widen_<su>addl_hi_<mode> ->
>> (u/s)addl2
>>                        IFN_VEC_WIDEN_PLUS_LO  -> vec_widen_<su>addl_lo_<mode>
>> -> (u/s)addl
>> 
>> This gives the same functionality as the previous WIDEN_PLUS/WIDEN_MINUS tree
>> codes which are expanded into VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI.
>
> I'll note that it's interesting we have widen multiplication as
> the only existing example where we have both HI/LO and EVEN/ODD cases.
> I think we want to share as much of the infrastructure to eventually
> support targets doing even/odd (I guess all VLA vector targets will
> be even/odd?).

Can't speak for all, but SVE2 certainly is.
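
For readers following the hi/lo versus even/odd distinction, here is a
minimal scalar sketch of which input lanes each half consumes.  It
follows the description quoted above (lo = first n/2 lanes, hi = second
n/2 lanes) and ignores endianness.  All names in it are hypothetical and
none of them exist in GCC.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model: N narrow input lanes, N/2 wide output lanes.
template <std::size_t N>
using narrow_vec = std::array<std::int8_t, N>;
template <std::size_t N>
using wide_vec = std::array<std::int16_t, N>;

// Widen and add the N/2 lanes FIRST, FIRST + STEP, FIRST + 2 * STEP, ...
template <std::size_t N>
wide_vec<N / 2>
widen_plus_lanes (const narrow_vec<N> &a, const narrow_vec<N> &b,
		  std::size_t first, std::size_t step)
{
  static_assert (N % 2 == 0, "lane count must be even");
  wide_vec<N / 2> r{};
  for (std::size_t i = 0; i < N / 2; ++i)
    {
      std::size_t lane = first + i * step;
      // The operands promote to int, so the sum cannot wrap in 8 bits.
      r[i] = static_cast<std::int16_t> (a[lane] + b[lane]);
    }
  return r;
}

// hi/lo split: contiguous halves of the input.
template <std::size_t N>
wide_vec<N / 2>
widen_plus_lo (const narrow_vec<N> &a, const narrow_vec<N> &b)
{
  return widen_plus_lanes<N> (a, b, 0, 1);
}

template <std::size_t N>
wide_vec<N / 2>
widen_plus_hi (const narrow_vec<N> &a, const narrow_vec<N> &b)
{
  return widen_plus_lanes<N> (a, b, N / 2, 1);
}

// even/odd split: interleaved lanes of the input.
template <std::size_t N>
wide_vec<N / 2>
widen_plus_even (const narrow_vec<N> &a, const narrow_vec<N> &b)
{
  return widen_plus_lanes<N> (a, b, 0, 2);
}

template <std::size_t N>
wide_vec<N / 2>
widen_plus_odd (const narrow_vec<N> &a, const narrow_vec<N> &b)
{
  return widen_plus_lanes<N> (a, b, 1, 2);
}
```

Both splits produce the same set of wide results, only grouped
differently, which is why sharing infrastructure between them looks
plausible.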

> DEF_INTERNAL_OPTAB_HILO_FN also looks to be implicitly directed to
> widening operations (otherwise no signed/unsigned variants would be
> necessary).  What I don't understand is why we need an optab
> without _hi/_lo but in that case no signed/unsigned variant?
>
> Looks like all plus, plus_lo and plus_hi are commutative but
> only plus is widening?!  So is the setup that the vectorizer
> doesn't know about the split and uses 'plus' but then the
> expander performs the split?  It does look a bit awkward here
> (the plain 'plus' is just used for the scalar case during
> pattern recog it seems).
>
> I'd rather have DEF_INTERNAL_OPTAB_HILO_FN split up, declaring
> the hi/lo pairs and the scalar variant separately using
> DEF_INTERNAL_FN without expander for that, and having
> DEF_INTERNAL_HILO_WIDEN_OPTAB_FN and DEF_INTERNAL_EVENODD_WIDEN_OPTAB_FN
> for the signed/unsigned pairs?  (if we need that helper at all)
>
> Targets shouldn't need to implement the plain optab (it shouldn't
> exist) and the vectorizer should query the hi/lo or even/odd
> optabs for support instead.

I dread these kinds of review because I think I'm almost certain to
flatly contradict something I said last time round, but +1 FWIW.
It seems OK to define an ifn to represent the combined effect, for the
scalar case, but that shouldn't leak into optabs unless we actually want
to use the ifn for "real" scalar ops (as opposed to a temporary
placeholder during pattern recognition).

On the optabs/ifn bits:

> +static int
> +ifn_cmp (const void *a_, const void *b_)
> +{
> +  typedef std::pair<enum internal_fn, unsigned> ifn_pair;
> +  auto *a = (const std::pair<ifn_pair, optab> *)a_;
> +  auto *b = (const std::pair<ifn_pair, optab> *)b_;
> +  return (int) (a->first.first) - (b->first.first);
> +}
> +
> +/* Return the optab belonging to the given internal function NAME for the given
> +   SIGN or unknown_optab.  */
> +
> +optab
> +lookup_hilo_ifn_optab (enum internal_fn fn, unsigned sign)

There is no NAME parameter.  It also isn't clear what SIGN means:
does 1 mean unsigned or signed?  Would be better to use signop and
TYPE_SIGN IMO.

> +{
> +  typedef std::pair<enum internal_fn, unsigned> ifn_pair;
> +  typedef auto_vec <std::pair<ifn_pair, optab>>fn_to_optab_map_type;
> +  static fn_to_optab_map_type *fn_to_optab_map;
> +
> +  if (!fn_to_optab_map)
> +    {
> +      unsigned num
> +	= sizeof (internal_fn_hilo_keys_array) / sizeof (enum internal_fn);
> +      fn_to_optab_map = new fn_to_optab_map_type ();
> +      for (unsigned int i = 0; i < num - 1; ++i)
> +	{
> +	  enum internal_fn fn = internal_fn_hilo_keys_array[i];
> +	  optab v1 = internal_fn_hilo_values_array[2*i];
> +	  optab v2 = internal_fn_hilo_values_array[2*i + 1];
> +	  ifn_pair key1 (fn, 0);
> +	  fn_to_optab_map->safe_push ({key1, v1});
> +	  ifn_pair key2 (fn, 1);
> +	  fn_to_optab_map->safe_push ({key2, v2});
> +	}
> +	fn_to_optab_map->qsort (ifn_cmp);
> +    }
> +
> +  ifn_pair new_pair (fn, sign ? 1 : 0);
> +  optab tmp;
> +  std::pair<ifn_pair,optab> pair_wrap (new_pair, tmp);
> +  auto entry = fn_to_optab_map->bsearch (&pair_wrap, ifn_cmp);
> +  return entry != fn_to_optab_map->end () ? entry->second : unknown_optab;
> +}
> +

Do we need to use a map for this?  It seems like it follows mechanically
from the macro definition and could be handled using a switch statement
and preprocessor logic.

Also, it would be good to make direct_internal_fn_optab DTRT for this
case, rather than needing a separate function.
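
A rough sketch of the switch-plus-preprocessor idea, under stated
assumptions: the enums and the list macro below are hypothetical
stand-ins for internal_fn, optab, signop and internal-fn.def, and only
illustrate how the lookup could fall out of a single definition list
with no sorted table.

```cpp
#include <cassert>

// Hypothetical stand-in for signop; not the GCC definition.
enum sketch_signop { SKETCH_SIGNED, SKETCH_UNSIGNED };

// One entry per function: name, signed optab stem, unsigned optab stem.
// This plays the role the .def file would play in the real patch.
#define SKETCH_HILO_FNS(X) \
  X (WIDEN_PLUS_HI, widen_sadd_hi, widen_uadd_hi) \
  X (WIDEN_PLUS_LO, widen_sadd_lo, widen_uadd_lo) \
  X (WIDEN_MINUS_HI, widen_ssub_hi, widen_usub_hi) \
  X (WIDEN_MINUS_LO, widen_ssub_lo, widen_usub_lo)

// Function enum generated from the list, plus one unlisted function.
enum sketch_ifn
{
#define X(NAME, S, U) SKETCH_IFN_##NAME,
  SKETCH_HILO_FNS (X)
#undef X
  SKETCH_IFN_OTHER
};

// Optab enum generated from the same list.
enum sketch_optab
{
  sketch_unknown_optab,
#define X(NAME, S, U) sketch_##S##_optab, sketch_##U##_optab,
  SKETCH_HILO_FNS (X)
#undef X
  sketch_last_optab
};

// Return the optab for FN with signedness SIGN, or sketch_unknown_optab
// if FN has no signed/unsigned optab pair.
sketch_optab
sketch_lookup_optab (sketch_ifn fn, sketch_signop sign)
{
  switch (fn)
    {
#define X(NAME, S, U) \
    case SKETCH_IFN_##NAME: \
      return (sign == SKETCH_UNSIGNED \
	      ? sketch_##U##_optab : sketch_##S##_optab);
      SKETCH_HILO_FNS (X)
#undef X
    default:
      return sketch_unknown_optab;
    }
}
```

Because every case is generated from the same list, adding a function
means adding one line to the list, and an explicit signop-style argument
removes the question of what 0 and 1 mean.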

> +extern void
> +lookup_hilo_internal_fn (enum internal_fn ifn, enum internal_fn *lo,
> +			  enum internal_fn *hi)
> +{
> +  gcc_assert (decomposes_to_hilo_fn_p (ifn));
> +
> +  *lo = internal_fn (ifn + 1);
> +  *hi = internal_fn (ifn + 2);
> +}

Nit: spurious extern.  Function needs a comment.  There have been
requests to drop redundant "enum" keywords from new code.

> +/* Return true if FN decomposes to _hi and _lo IFN.  If true this should also
> +   be a widening function.  */
> +
> +bool
> +decomposes_to_hilo_fn_p (internal_fn fn)
> +{
> +  if (!widening_fn_p (fn))
> +    return false;
> +
> +  switch (fn)
> +    {
> +    case IFN_VEC_WIDEN_PLUS:
> +    case IFN_VEC_WIDEN_MINUS:
> +      return true;
> +
> +    default:
> +      return false;
> +    }
> +}
> +

Similarly here I think we should use the preprocessor.  It isn't clear
why this returns false for !widening_fn_p.  Narrowing hi/lo functions
would decompose in a similar way.
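
As an illustration only, the predicate and the hi/lo lookup could both
be generated from one list, with no widening check.  The sketch assumes
the layout used in the quoted patch (lo is fn + 1, hi is fn + 2).  The
list macro, the enum and the narrowing entry are all hypothetical.

```cpp
#include <cassert>

// Hypothetical function list.  PLAIN entries define one function; HILO
// entries define the combined function followed by its _LO and _HI forms.
#define SKETCH_FNS(PLAIN, HILO) \
  PLAIN (ADD) \
  HILO (VEC_WIDEN_PLUS) \
  HILO (VEC_WIDEN_MINUS) \
  HILO (VEC_NARROW_PACK)

#define SK_PLAIN(N) SK_##N,
#define SK_HILO(N) SK_##N, SK_##N##_LO, SK_##N##_HI,
enum sk_fn
{
  SKETCH_FNS (SK_PLAIN, SK_HILO)
  SK_LAST
};
#undef SK_PLAIN
#undef SK_HILO

// Return true if FN decomposes into _LO and _HI functions.  The answer
// comes from the list alone, so narrowing entries qualify as well.
bool
sk_decomposes_to_hilo_p (sk_fn fn)
{
  switch (fn)
    {
#define SK_PLAIN(N)
#define SK_HILO(N) case SK_##N:
      SKETCH_FNS (SK_PLAIN, SK_HILO)
#undef SK_PLAIN
#undef SK_HILO
      return true;

    default:
      return false;
    }
}

// Store the _LO and _HI forms of FN in *LO and *HI.  This relies on the
// enum layout generated above.
void
sk_lookup_hilo (sk_fn fn, sk_fn *lo, sk_fn *hi)
{
  assert (sk_decomposes_to_hilo_p (fn));
  *lo = sk_fn (fn + 1);
  *hi = sk_fn (fn + 2);
}
```

Generating the enum and the predicate from the same list keeps the
fn + 1 / fn + 2 layout assumption and the predicate from drifting apart.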

As a general comment, how about naming the new macro:

  DEF_INTERNAL_SIGNED_HILO_OPTAB_FN

and make it invoke DEF_INTERNAL_SIGNED_OPTAB_FN twice, once for
the hi and once for the lo?

The new optabs need to be documented in md.texi.  I think it'd be
better to drop the "l" suffix in "addl" and "subl", since that's an
Arm convention and is redundant with the earlier "widen".

Sorry for the nitpicks and thanks for picking up this work.

Richard


* Re: [PATCH 1/3] Refactor to allow internal_fn's
  2023-05-03 11:55                               ` Richard Biener
@ 2023-05-04 15:20                                 ` Andre Vieira (lists)
  2023-05-05  6:09                                   ` Richard Biener
  0 siblings, 1 reply; 53+ messages in thread
From: Andre Vieira (lists) @ 2023-05-04 15:20 UTC (permalink / raw)
  To: Richard Biener; +Cc: Richard Biener, Richard Sandiford, gcc-patches



On 03/05/2023 12:55, Richard Biener wrote:
> On Fri, 28 Apr 2023, Andre Vieira (lists) wrote:
> 
>> Hi,
>>
>> I'm posting the patches separately now with ChangeLogs.
>>
>> I made the suggested changes and tried to simplify the code a bit further.
>> Where internal to tree-vect-stmts I changed most functions to use code_helper
>> to avoid having to check at places we didn't need to. I was trying to simplify
>> things further by also modifying supportable_half_widening_operation and
>> supportable_convert_operation but the result of that was that I ended up
>> moving the code to cast to tree code inside them rather than at the call site
>> and it didn't look simpler, so I left those. Though if we did make those
>> changes we'd no longer need to keep around the tc1 variable in
>> vectorizable_conversion... Let me know what you think.
> 
> I see that
> 
> -  else if (CONVERT_EXPR_CODE_P (code)
> +  else if (CONVERT_EXPR_CODE_P (code.safe_as_tree_code ())
> 
> is convenient (as much as I dislike safe_as_tree_code).  Isn't
> the following
> 
> -  if (!CONVERT_EXPR_CODE_P (code))
> +  if (!CONVERT_EXPR_CODE_P ((tree_code) code))
>       return false;
For some reason I thought the code could only reach here if code was a
tree code, but I guess if we have an ifn and the modes aren't the same
as the wide_vectype it would fall through to this check, which would
fail for an ifn. I am wondering whether it needs to, though: the
multi-step widening should also work for ifns, no? We'd need to adapt
it to use hi, lo rather than c1, c2 in the ifn case, I guess, and then
use a different optab lookup too?

Though I'm thinking, maybe this should be a follow-up and just not have 
that 'feature' for now. The feature being, supporting multi-step 
conversion for new widening IFN's.
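
To make the difference between the plain cast and safe_as_tree_code
concrete, here is a small self-contained model.  It is a sketch under
assumptions, not GCC's code_helper: the enums, the encoding of the stored
value and the assertion inside the cast are all inventions of this
example, added to make the hazard visible.

```cpp
#include <cassert>

// Hypothetical miniature enums; the real tree_code and internal_fn are
// defined by GCC and are much larger.
enum mini_tree_code { MINI_NOP_EXPR, MINI_PLUS_EXPR, MINI_MAX_TREE_CODES };
enum mini_internal_fn { MINI_IFN_VEC_WIDEN_PLUS, MINI_IFN_LAST };

class mini_code_helper
{
public:
  // Tree codes are stored as non-negative values, internal functions as
  // negative ones.  This encoding is the sketch's own choice.
  mini_code_helper (mini_tree_code code) : rep (int (code)) {}
  mini_code_helper (mini_internal_fn fn) : rep (-int (fn) - 1) {}

  bool is_tree_code () const { return rep >= 0; }
  bool is_internal_fn () const { return rep < 0; }

  // Cast: only meaningful when a tree code is stored.  The assertion is
  // an addition of this sketch.
  explicit operator mini_tree_code () const
  {
    assert (is_tree_code ());
    return mini_tree_code (rep);
  }

  // Total conversion: anything that is not a tree code becomes the
  // sentinel, which no tree-code predicate should accept.
  mini_tree_code safe_as_tree_code () const
  {
    return is_tree_code () ? mini_tree_code (rep) : MINI_MAX_TREE_CODES;
  }

private:
  int rep;
};

// Stand-in for a tree-code predicate such as CONVERT_EXPR_CODE_P.
inline bool
mini_convert_code_p (mini_tree_code code)
{
  return code == MINI_NOP_EXPR;
}
```

With this model, a predicate applied to safe_as_tree_code () simply
answers "no" for an internal function, whereas a cast is only
trustworthy on paths that already know a tree code is stored.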


* Re: [PATCH 1/3] Refactor to allow internal_fn's
  2023-05-04 15:20                                 ` Andre Vieira (lists)
@ 2023-05-05  6:09                                   ` Richard Biener
  2023-05-12 12:14                                     ` Andre Vieira (lists)
  0 siblings, 1 reply; 53+ messages in thread
From: Richard Biener @ 2023-05-05  6:09 UTC (permalink / raw)
  To: Andre Vieira (lists); +Cc: Richard Biener, Richard Sandiford, gcc-patches

On Thu, 4 May 2023, Andre Vieira (lists) wrote:

> 
> 
> On 03/05/2023 12:55, Richard Biener wrote:
> > On Fri, 28 Apr 2023, Andre Vieira (lists) wrote:
> > 
> >> Hi,
> >>
> >> I'm posting the patches separately now with ChangeLogs.
> >>
> >> I made the suggested changes and tried to simplify the code a bit further.
> >> Where internal to tree-vect-stmts I changed most functions to use
> >> code_helper
> >> to avoid having to check at places we didn't need to. I was trying to
> >> simplify
> >> things further by also modifying supportable_half_widening_operation and
> >> supportable_convert_operation but the result of that was that I ended up
> >> moving the code to cast to tree code inside them rather than at the call
> >> site
> >> and it didn't look simpler, so I left those. Though if we did make those
> >> changes we'd no longer need to keep around the tc1 variable in
> >> vectorizable_conversion... Let me know what you think.
> > 
> > I see that
> > 
> > -  else if (CONVERT_EXPR_CODE_P (code)
> > +  else if (CONVERT_EXPR_CODE_P (code.safe_as_tree_code ())
> > 
> > is convenient (as much as I dislike safe_as_tree_code).  Isn't
> > the following
> > 
> > -  if (!CONVERT_EXPR_CODE_P (code))
> > +  if (!CONVERT_EXPR_CODE_P ((tree_code) code))
> >       return false;
> For some reason I thought the code could only reach here if code was a tree
> code, but I guess if we have an ifn and the modes aren't the same as the
> wide_vectype it would fall to this, which for an ifn this would fail. I am
> wondering whether it needs to though, the multi-step widening should also work
> for ifn's no? We'd need to adapt it, to not use c1, c2 but hi, lo in case of
> ifn I guess.. and then use a different optab look up too?
> 
> Though I'm thinking, maybe this should be a follow-up and just not have that
> 'feature' for now. The feature being, supporting multi-step conversion for new
> widening IFN's.

Yes, I think we should address this in a followup if needed.

Richard.


* Re: [PATCH 3/3] Remove widen_plus/minus_expr tree codes
  2023-05-03 12:29                               ` Richard Biener
@ 2023-05-10  9:15                                 ` Andre Vieira (lists)
  2023-05-12 12:18                                   ` Andre Vieira (lists)
  0 siblings, 1 reply; 53+ messages in thread
From: Andre Vieira (lists) @ 2023-05-10  9:15 UTC (permalink / raw)
  To: Richard Biener; +Cc: Richard Biener, Richard Sandiford, gcc-patches



On 03/05/2023 13:29, Richard Biener wrote:
> On Fri, 28 Apr 2023, Andre Vieira (lists) wrote:
> 
>> This is a rebase of Joel's previous patch.
>>
>> This patch removes the old widen plus/minus tree codes which have been
>> replaced by internal functions.
> 
> I guess that's obvious then.  I wonder what we do to internal
> fns in debug stmts?  Looks like we throw those away and do not
> generate debug stmts from calls.
See the comment above the removed lines in expand_debug_expr:
  /* Vector stuff.  For most of the codes we don't have rtl codes.  */

And it then just returns NULL for those expr's. So the behaviour there 
remains unchanged, not saying we couldn't do anything but I don
> 

> Given you remove handling of the scalar WIDEN_PLUS/MINUS_EXPR
> codes everywhere do we want to add checking code that the scalar
> IFNs do not appear in the IL?  For at least some cases there
> are corresponding functions handling internal functions that
> you could have amended otherwise.

I am making some changes to PATCH 2 of this series, in the new version I 
am adding some extra code to the gimple checks, one of which is to error 
if it comes a cross an IFN that decomposes to HILO as that should only 
occur as an intermediary representation of the vect pass.
> 
> Richard.
> 
>> gcc/ChangeLog:
>>
>> 2023-04-28  Andre Vieira  <andre.simoesdiasvieira@arm.com>
>>              Joel Hutton  <joel.hutton@arm.com>
>>
>> 	* doc/generic.texi: Remove old tree codes.
>> 	* expr.cc (expand_expr_real_2): Remove old tree code cases.
>> 	* gimple-pretty-print.cc (dump_binary_rhs): Likewise.
>> 	* optabs-tree.cc (optab_for_tree_code): Likewise.
>> 	(supportable_half_widening_operation): Likewise.
>> 	* tree-cfg.cc (verify_gimple_assign_binary): Likewise.
>> 	* tree-inline.cc (estimate_operator_cost): Likewise.
>> 	(op_symbol_code): Likewise.
>> 	* tree-vect-data-refs.cc (vect_get_smallest_scalar_type): Likewise.
>> 	(vect_analyze_data_ref_accesses): Likewise.
>> 	* tree-vect-generic.cc (expand_vector_operations_1): Likewise.
>> 	* cfgexpand.cc (expand_debug_expr): Likewise.
>> 	* tree-vect-stmts.cc (vectorizable_conversion): Likewise.
>> 	(supportable_widening_operation): Likewise.
>> 	* gimple-range-op.cc (gimple_range_op_handler::maybe_non_standard):
>> 	Likewise.
>> 	* tree-vect-patterns.cc (vect_widened_op_tree): Refactor to replace
>> 	usage in vect_recog_sad_pattern.
>> 	(vect_recog_sad_pattern): Replace tree code widening pattern with
>> 	internal function.
>> 	(vect_recog_average_pattern): Likewise.
>> 	* tree-pretty-print.cc (dump_generic_node): Remove tree code
>> 	definition.
>> 	* tree.def (WIDEN_PLUS_EXPR, WIDEN_MINUS_EXPR, VEC_WIDEN_PLUS_HI_EXPR,
>> 	VEC_WIDEN_PLUS_LO_EXPR, VEC_WIDEN_MINUS_HI_EXPR,
>> 	VEC_WIDEN_MINUS_LO_EXPR): Likewise
>>
> 


* Re: [PATCH 1/3] Refactor to allow internal_fn's
  2023-05-05  6:09                                   ` Richard Biener
@ 2023-05-12 12:14                                     ` Andre Vieira (lists)
  2023-05-12 13:18                                       ` Richard Biener
  0 siblings, 1 reply; 53+ messages in thread
From: Andre Vieira (lists) @ 2023-05-12 12:14 UTC (permalink / raw)
  To: Richard Biener; +Cc: Richard Biener, Richard Sandiford, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1132 bytes --]

Hi,

I think I tackled all of your comments, let me know if I missed something.


gcc/ChangeLog:

2023-05-12  Andre Vieira  <andre.simoesdiasvieira@arm.com>
             Joel Hutton  <joel.hutton@arm.com>

         * tree-vect-patterns.cc (vect_gimple_build): New function.
         (vect_recog_widen_op_pattern): Refactor to use code_helper.
         * tree-vect-stmts.cc (vect_gen_widened_results_half): Likewise.
         (vect_create_vectorized_demotion_stmts): Likewise.
         (vect_create_vectorized_promotion_stmts): Likewise.
         (vect_create_half_widening_stmts): Likewise.
         (vectorizable_conversion): Likewise.
         (vectorizable_call): Likewise.
         (supportable_widening_operation): Likewise.
         (supportable_narrowing_operation): Likewise.
         (simple_integer_narrowing): Likewise.
         * tree-vectorizer.h (supportable_widening_operation): Likewise.
         (supportable_narrowing_operation): Likewise.
         (vect_gimple_build): New function prototype.
         * tree.h (code_helper::safe_as_tree_code): New function.
         (code_helper::safe_as_fn_code): New function.

[-- Attachment #2: ifn0v3.patch --]
[-- Type: text/plain, Size: 23124 bytes --]

diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index 33a8b2bb60601dc1a67de62a56bbf3c355e12dbd..1778af0242898e3dc73d94d22a5b8505628a53b5 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -25,6 +25,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "rtl.h"
 #include "tree.h"
 #include "gimple.h"
+#include "gimple-iterator.h"
+#include "gimple-fold.h"
 #include "ssa.h"
 #include "expmed.h"
 #include "optabs-tree.h"
@@ -1392,7 +1394,7 @@ vect_recog_sad_pattern (vec_info *vinfo,
 static gimple *
 vect_recog_widen_op_pattern (vec_info *vinfo,
 			     stmt_vec_info last_stmt_info, tree *type_out,
-			     tree_code orig_code, tree_code wide_code,
+			     tree_code orig_code, code_helper wide_code,
 			     bool shift_p, const char *name)
 {
   gimple *last_stmt = last_stmt_info->stmt;
@@ -1435,7 +1437,7 @@ vect_recog_widen_op_pattern (vec_info *vinfo,
       vecctype = get_vectype_for_scalar_type (vinfo, ctype);
     }
 
-  enum tree_code dummy_code;
+  code_helper dummy_code;
   int dummy_int;
   auto_vec<tree> dummy_vec;
   if (!vectype
@@ -1456,8 +1458,7 @@ vect_recog_widen_op_pattern (vec_info *vinfo,
 		       2, oprnd, half_type, unprom, vectype);
 
   tree var = vect_recog_temp_ssa_var (itype, NULL);
-  gimple *pattern_stmt = gimple_build_assign (var, wide_code,
-					      oprnd[0], oprnd[1]);
+  gimple *pattern_stmt = vect_gimple_build (var, wide_code, oprnd[0], oprnd[1]);
 
   if (vecctype != vecitype)
     pattern_stmt = vect_convert_output (vinfo, last_stmt_info, ctype,
@@ -6808,3 +6809,20 @@ vect_pattern_recog (vec_info *vinfo)
   /* After this no more add_stmt calls are allowed.  */
   vinfo->stmt_vec_info_ro = true;
 }
+
+/* Build a GIMPLE_ASSIGN or GIMPLE_CALL with the tree_code,
+   or internal_fn contained in ch, respectively.  */
+gimple *
+vect_gimple_build (tree lhs, code_helper ch, tree op0, tree op1)
+{
+  gcc_assert (op0 != NULL_TREE);
+  if (ch.is_tree_code ())
+    return gimple_build_assign (lhs, (tree_code) ch, op0, op1);
+
+  gcc_assert (ch.is_internal_fn ());
+  gimple* stmt = gimple_build_call_internal (as_internal_fn ((combined_fn) ch),
+					     op1 == NULL_TREE ? 1 : 2,
+					     op0, op1);
+  gimple_call_set_lhs (stmt, lhs);
+  return stmt;
+}
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 61a2da4ecee9c449c1469cab3c4cfa1a782471d5..d152ae9ab10b361b88c0f839d6951c43b954750a 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -3261,13 +3261,13 @@ vectorizable_bswap (vec_info *vinfo,
 
 static bool
 simple_integer_narrowing (tree vectype_out, tree vectype_in,
-			  tree_code *convert_code)
+			  code_helper *convert_code)
 {
   if (!INTEGRAL_TYPE_P (TREE_TYPE (vectype_out))
       || !INTEGRAL_TYPE_P (TREE_TYPE (vectype_in)))
     return false;
 
-  tree_code code;
+  code_helper code;
   int multi_step_cvt = 0;
   auto_vec <tree, 8> interm_types;
   if (!supportable_narrowing_operation (NOP_EXPR, vectype_out, vectype_in,
@@ -3481,7 +3481,7 @@ vectorizable_call (vec_info *vinfo,
   tree callee = gimple_call_fndecl (stmt);
 
   /* First try using an internal function.  */
-  tree_code convert_code = ERROR_MARK;
+  code_helper convert_code = MAX_TREE_CODES;
   if (cfn != CFN_LAST
       && (modifier == NONE
 	  || (modifier == NARROW
@@ -3667,8 +3667,8 @@ vectorizable_call (vec_info *vinfo,
 			  continue;
 			}
 		      new_temp = make_ssa_name (vec_dest);
-		      new_stmt = gimple_build_assign (new_temp, convert_code,
-						      prev_res, half_res);
+		      new_stmt = vect_gimple_build (new_temp, convert_code,
+						    prev_res, half_res);
 		      vect_finish_stmt_generation (vinfo, stmt_info,
 						   new_stmt, gsi);
 		    }
@@ -3758,8 +3758,8 @@ vectorizable_call (vec_info *vinfo,
 		  continue;
 		}
 	      new_temp = make_ssa_name (vec_dest);
-	      new_stmt = gimple_build_assign (new_temp, convert_code,
-					      prev_res, half_res);
+	      new_stmt = vect_gimple_build (new_temp, convert_code, prev_res,
+					    half_res);
 	      vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
 	    }
 	  else
@@ -4771,7 +4771,7 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info,
    STMT_INFO is the original scalar stmt that we are vectorizing.  */
 
 static gimple *
-vect_gen_widened_results_half (vec_info *vinfo, enum tree_code code,
+vect_gen_widened_results_half (vec_info *vinfo, code_helper ch,
                                tree vec_oprnd0, tree vec_oprnd1, int op_type,
 			       tree vec_dest, gimple_stmt_iterator *gsi,
 			       stmt_vec_info stmt_info)
@@ -4780,12 +4780,11 @@ vect_gen_widened_results_half (vec_info *vinfo, enum tree_code code,
   tree new_temp;
 
   /* Generate half of the widened result:  */
-  gcc_assert (op_type == TREE_CODE_LENGTH (code));
   if (op_type != binary_op)
     vec_oprnd1 = NULL;
-  new_stmt = gimple_build_assign (vec_dest, code, vec_oprnd0, vec_oprnd1);
+  new_stmt = vect_gimple_build (vec_dest, ch, vec_oprnd0, vec_oprnd1);
   new_temp = make_ssa_name (vec_dest, new_stmt);
-  gimple_assign_set_lhs (new_stmt, new_temp);
+  gimple_set_lhs (new_stmt, new_temp);
   vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
 
   return new_stmt;
@@ -4802,7 +4801,7 @@ vect_create_vectorized_demotion_stmts (vec_info *vinfo, vec<tree> *vec_oprnds,
 				       stmt_vec_info stmt_info,
 				       vec<tree> &vec_dsts,
 				       gimple_stmt_iterator *gsi,
-				       slp_tree slp_node, enum tree_code code)
+				       slp_tree slp_node, code_helper code)
 {
   unsigned int i;
   tree vop0, vop1, new_tmp, vec_dest;
@@ -4814,9 +4813,9 @@ vect_create_vectorized_demotion_stmts (vec_info *vinfo, vec<tree> *vec_oprnds,
       /* Create demotion operation.  */
       vop0 = (*vec_oprnds)[i];
       vop1 = (*vec_oprnds)[i + 1];
-      gassign *new_stmt = gimple_build_assign (vec_dest, code, vop0, vop1);
+      gimple *new_stmt = vect_gimple_build (vec_dest, code, vop0, vop1);
       new_tmp = make_ssa_name (vec_dest, new_stmt);
-      gimple_assign_set_lhs (new_stmt, new_tmp);
+      gimple_set_lhs (new_stmt, new_tmp);
       vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
 
       if (multi_step_cvt)
@@ -4864,8 +4863,8 @@ vect_create_vectorized_promotion_stmts (vec_info *vinfo,
 					vec<tree> *vec_oprnds1,
 					stmt_vec_info stmt_info, tree vec_dest,
 					gimple_stmt_iterator *gsi,
-					enum tree_code code1,
-					enum tree_code code2, int op_type)
+					code_helper ch1,
+					code_helper ch2, int op_type)
 {
   int i;
   tree vop0, vop1, new_tmp1, new_tmp2;
@@ -4881,10 +4880,10 @@ vect_create_vectorized_promotion_stmts (vec_info *vinfo,
 	vop1 = NULL_TREE;
 
       /* Generate the two halves of promotion operation.  */
-      new_stmt1 = vect_gen_widened_results_half (vinfo, code1, vop0, vop1,
+      new_stmt1 = vect_gen_widened_results_half (vinfo, ch1, vop0, vop1,
 						 op_type, vec_dest, gsi,
 						 stmt_info);
-      new_stmt2 = vect_gen_widened_results_half (vinfo, code2, vop0, vop1,
+      new_stmt2 = vect_gen_widened_results_half (vinfo, ch2, vop0, vop1,
 						 op_type, vec_dest, gsi,
 						 stmt_info);
       if (is_gimple_call (new_stmt1))
@@ -4915,7 +4914,7 @@ vect_create_half_widening_stmts (vec_info *vinfo,
 					vec<tree> *vec_oprnds1,
 					stmt_vec_info stmt_info, tree vec_dest,
 					gimple_stmt_iterator *gsi,
-					enum tree_code code1,
+					code_helper code1,
 					int op_type)
 {
   int i;
@@ -4945,13 +4944,13 @@ vect_create_half_widening_stmts (vec_info *vinfo,
 	  new_stmt2 = gimple_build_assign (new_tmp2, NOP_EXPR, vop1);
 	  vect_finish_stmt_generation (vinfo, stmt_info, new_stmt2, gsi);
 	  /* Perform the operation.  With both vector inputs widened.  */
-	  new_stmt3 = gimple_build_assign (vec_dest, code1, new_tmp1, new_tmp2);
+	  new_stmt3 = vect_gimple_build (vec_dest, code1, new_tmp1, new_tmp2);
 	}
       else
 	{
 	  /* Perform the operation.  With the single vector input widened.  */
-	  new_stmt3 = gimple_build_assign (vec_dest, code1, new_tmp1, vop1);
-      }
+	  new_stmt3 = vect_gimple_build (vec_dest, code1, new_tmp1, vop1);
+	}
 
       new_tmp3 = make_ssa_name (vec_dest, new_stmt3);
       gimple_assign_set_lhs (new_stmt3, new_tmp3);
@@ -4981,8 +4980,9 @@ vectorizable_conversion (vec_info *vinfo,
   tree scalar_dest;
   tree op0, op1 = NULL_TREE;
   loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo);
-  enum tree_code code, code1 = ERROR_MARK, code2 = ERROR_MARK;
-  enum tree_code codecvt1 = ERROR_MARK, codecvt2 = ERROR_MARK;
+  tree_code tc1;
+  code_helper code, code1, code2;
+  code_helper codecvt1 = ERROR_MARK, codecvt2 = ERROR_MARK;
   tree new_temp;
   enum vect_def_type dt[2] = {vect_unknown_def_type, vect_unknown_def_type};
   int ndts = 2;
@@ -5011,31 +5011,43 @@ vectorizable_conversion (vec_info *vinfo,
       && ! vec_stmt)
     return false;
 
-  gassign *stmt = dyn_cast <gassign *> (stmt_info->stmt);
-  if (!stmt)
+  gimple* stmt = stmt_info->stmt;
+  if (!(is_gimple_assign (stmt) || is_gimple_call (stmt)))
     return false;
 
-  if (TREE_CODE (gimple_assign_lhs (stmt)) != SSA_NAME)
+  if (gimple_get_lhs (stmt) == NULL_TREE
+      || TREE_CODE (gimple_get_lhs (stmt)) != SSA_NAME)
     return false;
 
-  code = gimple_assign_rhs_code (stmt);
-  if (!CONVERT_EXPR_CODE_P (code)
-      && code != FIX_TRUNC_EXPR
-      && code != FLOAT_EXPR
-      && code != WIDEN_PLUS_EXPR
-      && code != WIDEN_MINUS_EXPR
-      && code != WIDEN_MULT_EXPR
-      && code != WIDEN_LSHIFT_EXPR)
+  if (TREE_CODE (gimple_get_lhs (stmt)) != SSA_NAME)
+    return false;
+
+  if (is_gimple_assign (stmt))
+    {
+      code = gimple_assign_rhs_code (stmt);
+      op_type = TREE_CODE_LENGTH ((tree_code) code);
+    }
+  else if (gimple_call_internal_p (stmt))
+    {
+      code = gimple_call_internal_fn (stmt);
+      op_type = gimple_call_num_args (stmt);
+    }
+  else
     return false;
 
   bool widen_arith = (code == WIDEN_PLUS_EXPR
-		      || code == WIDEN_MINUS_EXPR
-		      || code == WIDEN_MULT_EXPR
-		      || code == WIDEN_LSHIFT_EXPR);
-  op_type = TREE_CODE_LENGTH (code);
+		 || code == WIDEN_MINUS_EXPR
+		 || code == WIDEN_MULT_EXPR
+		 || code == WIDEN_LSHIFT_EXPR);
+
+  if (!widen_arith
+      && !CONVERT_EXPR_CODE_P (code)
+      && code != FIX_TRUNC_EXPR
+      && code != FLOAT_EXPR)
+    return false;
 
   /* Check types of lhs and rhs.  */
-  scalar_dest = gimple_assign_lhs (stmt);
+  scalar_dest = gimple_get_lhs (stmt);
   lhs_type = TREE_TYPE (scalar_dest);
   vectype_out = STMT_VINFO_VECTYPE (stmt_info);
 
@@ -5073,10 +5085,14 @@ vectorizable_conversion (vec_info *vinfo,
 
   if (op_type == binary_op)
     {
-      gcc_assert (code == WIDEN_MULT_EXPR || code == WIDEN_LSHIFT_EXPR
-		  || code == WIDEN_PLUS_EXPR || code == WIDEN_MINUS_EXPR);
+      gcc_assert (code == WIDEN_MULT_EXPR
+		  || code == WIDEN_LSHIFT_EXPR
+		  || code == WIDEN_PLUS_EXPR
+		  || code == WIDEN_MINUS_EXPR);
+
 
-      op1 = gimple_assign_rhs2 (stmt);
+      op1 = is_gimple_assign (stmt) ? gimple_assign_rhs2 (stmt) :
+				     gimple_call_arg (stmt, 0);
       tree vectype1_in;
       if (!vect_is_simple_use (vinfo, stmt_info, slp_node, 1,
 			       &op1, &slp_op1, &dt[1], &vectype1_in))
@@ -5160,8 +5176,13 @@ vectorizable_conversion (vec_info *vinfo,
 	  && code != FLOAT_EXPR
 	  && !CONVERT_EXPR_CODE_P (code))
 	return false;
-      if (supportable_convert_operation (code, vectype_out, vectype_in, &code1))
+      gcc_assert (code.is_tree_code ());
+      if (supportable_convert_operation ((tree_code) code, vectype_out,
+					 vectype_in, &tc1))
+      {
+	code1 = tc1;
 	break;
+      }
       /* FALLTHRU */
     unsupported:
       if (dump_enabled_p ())
@@ -5172,9 +5193,12 @@ vectorizable_conversion (vec_info *vinfo,
     case WIDEN:
       if (known_eq (nunits_in, nunits_out))
 	{
-	  if (!supportable_half_widening_operation (code, vectype_out,
-						   vectype_in, &code1))
+	  if (!(code.is_tree_code ()
+		&& supportable_half_widening_operation ((tree_code) code,
+							vectype_out, vectype_in,
+							&tc1)))
 	    goto unsupported;
+	  code1 = tc1;
 	  gcc_assert (!(multi_step_cvt && op_type == binary_op));
 	  break;
 	}
@@ -5208,14 +5232,17 @@ vectorizable_conversion (vec_info *vinfo,
 
 	  if (GET_MODE_SIZE (rhs_mode) == fltsz)
 	    {
-	      if (!supportable_convert_operation (code, vectype_out,
-						  cvt_type, &codecvt1))
+	      tc1 = ERROR_MARK;
+	      gcc_assert (code.is_tree_code ());
+	      if (!supportable_convert_operation ((tree_code) code, vectype_out,
+						  cvt_type, &tc1))
 		goto unsupported;
+	      codecvt1 = tc1;
 	    }
-	  else if (!supportable_widening_operation (vinfo, code, stmt_info,
-						    vectype_out, cvt_type,
-						    &codecvt1, &codecvt2,
-						    &multi_step_cvt,
+	  else if (!supportable_widening_operation (vinfo, code,
+						    stmt_info, vectype_out,
+						    cvt_type, &codecvt1,
+						    &codecvt2, &multi_step_cvt,
 						    &interm_types))
 	    continue;
 	  else
@@ -5223,8 +5250,9 @@ vectorizable_conversion (vec_info *vinfo,
 
 	  if (supportable_widening_operation (vinfo, NOP_EXPR, stmt_info,
 					      cvt_type,
-					      vectype_in, &code1, &code2,
-					      &multi_step_cvt, &interm_types))
+					      vectype_in, &code1,
+					      &code2, &multi_step_cvt,
+					      &interm_types))
 	    {
 	      found_mode = true;
 	      break;
@@ -5260,9 +5288,11 @@ vectorizable_conversion (vec_info *vinfo,
       cvt_type = get_same_sized_vectype (cvt_type, vectype_in);
       if (cvt_type == NULL_TREE)
 	goto unsupported;
-      if (!supportable_convert_operation (code, cvt_type, vectype_in,
-					  &codecvt1))
+      if (!code.is_tree_code ()
+	  || !supportable_convert_operation ((tree_code) code, cvt_type,
+					     vectype_in, &tc1))
 	goto unsupported;
+      codecvt1 = tc1;
       if (supportable_narrowing_operation (NOP_EXPR, vectype_out, cvt_type,
 					   &code1, &multi_step_cvt,
 					   &interm_types))
@@ -5380,10 +5410,9 @@ vectorizable_conversion (vec_info *vinfo,
       FOR_EACH_VEC_ELT (vec_oprnds0, i, vop0)
 	{
 	  /* Arguments are ready, create the new vector stmt.  */
-	  gcc_assert (TREE_CODE_LENGTH (code1) == unary_op);
-	  gassign *new_stmt = gimple_build_assign (vec_dest, code1, vop0);
+	  gimple *new_stmt = vect_gimple_build (vec_dest, code1, vop0);
 	  new_temp = make_ssa_name (vec_dest, new_stmt);
-	  gimple_assign_set_lhs (new_stmt, new_temp);
+	  gimple_set_lhs (new_stmt, new_temp);
 	  vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
 
 	  if (slp_node)
@@ -5413,17 +5442,16 @@ vectorizable_conversion (vec_info *vinfo,
       for (i = multi_step_cvt; i >= 0; i--)
 	{
 	  tree this_dest = vec_dsts[i];
-	  enum tree_code c1 = code1, c2 = code2;
+	  code_helper c1 = code1, c2 = code2;
 	  if (i == 0 && codecvt2 != ERROR_MARK)
 	    {
 	      c1 = codecvt1;
 	      c2 = codecvt2;
 	    }
 	  if (known_eq (nunits_out, nunits_in))
-	    vect_create_half_widening_stmts (vinfo, &vec_oprnds0,
-						    &vec_oprnds1, stmt_info,
-						    this_dest, gsi,
-						    c1, op_type);
+	    vect_create_half_widening_stmts (vinfo, &vec_oprnds0, &vec_oprnds1,
+					     stmt_info, this_dest, gsi, c1,
+					     op_type);
 	  else
 	    vect_create_vectorized_promotion_stmts (vinfo, &vec_oprnds0,
 						    &vec_oprnds1, stmt_info,
@@ -5436,9 +5464,8 @@ vectorizable_conversion (vec_info *vinfo,
 	  gimple *new_stmt;
 	  if (cvt_type)
 	    {
-	      gcc_assert (TREE_CODE_LENGTH (codecvt1) == unary_op);
 	      new_temp = make_ssa_name (vec_dest);
-	      new_stmt = gimple_build_assign (new_temp, codecvt1, vop0);
+	      new_stmt = vect_gimple_build (new_temp, codecvt1, vop0);
 	      vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
 	    }
 	  else
@@ -5462,10 +5489,8 @@ vectorizable_conversion (vec_info *vinfo,
       if (cvt_type)
 	FOR_EACH_VEC_ELT (vec_oprnds0, i, vop0)
 	  {
-	    gcc_assert (TREE_CODE_LENGTH (codecvt1) == unary_op);
 	    new_temp = make_ssa_name (vec_dest);
-	    gassign *new_stmt
-	      = gimple_build_assign (new_temp, codecvt1, vop0);
+	    gimple *new_stmt = vect_gimple_build (new_temp, codecvt1, vop0);
 	    vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
 	    vec_oprnds0[i] = new_temp;
 	  }
@@ -12294,9 +12319,11 @@ vect_maybe_update_slp_op_vectype (slp_tree op, tree vectype)
 
 bool
 supportable_widening_operation (vec_info *vinfo,
-				enum tree_code code, stmt_vec_info stmt_info,
+				code_helper code,
+				stmt_vec_info stmt_info,
 				tree vectype_out, tree vectype_in,
-                                enum tree_code *code1, enum tree_code *code2,
+				code_helper *code1,
+				code_helper *code2,
                                 int *multi_step_cvt,
                                 vec<tree> *interm_types)
 {
@@ -12307,7 +12334,7 @@ supportable_widening_operation (vec_info *vinfo,
   optab optab1, optab2;
   tree vectype = vectype_in;
   tree wide_vectype = vectype_out;
-  enum tree_code c1, c2;
+  tree_code c1 = MAX_TREE_CODES, c2 = MAX_TREE_CODES;
   int i;
   tree prev_type, intermediate_type;
   machine_mode intermediate_mode, prev_mode;
@@ -12317,8 +12344,12 @@ supportable_widening_operation (vec_info *vinfo,
   if (loop_info)
     vect_loop = LOOP_VINFO_LOOP (loop_info);
 
-  switch (code)
+  switch (code.safe_as_tree_code ())
     {
+    case MAX_TREE_CODES:
+      /* Don't set c1 and c2 if code is not a tree_code.  */
+      break;
+
     case WIDEN_MULT_EXPR:
       /* The result of a vectorized widening operation usually requires
 	 two vectors (because the widened results do not fit into one vector).
@@ -12358,8 +12389,9 @@ supportable_widening_operation (vec_info *vinfo,
 	  && !nested_in_vect_loop_p (vect_loop, stmt_info)
 	  && supportable_widening_operation (vinfo, VEC_WIDEN_MULT_EVEN_EXPR,
 					     stmt_info, vectype_out,
-					     vectype_in, code1, code2,
-					     multi_step_cvt, interm_types))
+					     vectype_in, code1,
+					     code2, multi_step_cvt,
+					     interm_types))
         {
           /* Elements in a vector with vect_used_by_reduction property cannot
              be reordered if the use chain with this property does not have the
@@ -12435,7 +12467,7 @@ supportable_widening_operation (vec_info *vinfo,
       optab1 = optab_for_tree_code (c1, vectype_out, optab_default);
       optab2 = optab_for_tree_code (c2, vectype_out, optab_default);
     }
-  else if (CONVERT_EXPR_CODE_P (code)
+  else if (CONVERT_EXPR_CODE_P (code.safe_as_tree_code ())
 	   && VECTOR_BOOLEAN_TYPE_P (wide_vectype)
 	   && VECTOR_BOOLEAN_TYPE_P (vectype)
 	   && TYPE_MODE (wide_vectype) == TYPE_MODE (vectype)
@@ -12460,8 +12492,12 @@ supportable_widening_operation (vec_info *vinfo,
        || (icode2 = optab_handler (optab2, vec_mode)) == CODE_FOR_nothing)
     return false;
 
-  *code1 = c1;
-  *code2 = c2;
+  if (code.is_tree_code ())
+  {
+    *code1 = c1;
+    *code2 = c2;
+  }
+
 
   if (insn_data[icode1].operand[0].mode == TYPE_MODE (wide_vectype)
       && insn_data[icode2].operand[0].mode == TYPE_MODE (wide_vectype))
@@ -12482,7 +12518,7 @@ supportable_widening_operation (vec_info *vinfo,
   prev_type = vectype;
   prev_mode = vec_mode;
 
-  if (!CONVERT_EXPR_CODE_P (code))
+  if (!CONVERT_EXPR_CODE_P (code.safe_as_tree_code ()))
     return false;
 
   /* We assume here that there will not be more than MAX_INTERM_CVT_STEPS
@@ -12580,9 +12616,9 @@ supportable_widening_operation (vec_info *vinfo,
    narrowing operation (short in the above example).   */
 
 bool
-supportable_narrowing_operation (enum tree_code code,
+supportable_narrowing_operation (code_helper code,
 				 tree vectype_out, tree vectype_in,
-				 enum tree_code *code1, int *multi_step_cvt,
+				 code_helper *code1, int *multi_step_cvt,
                                  vec<tree> *interm_types)
 {
   machine_mode vec_mode;
@@ -12597,8 +12633,11 @@ supportable_narrowing_operation (enum tree_code code,
   unsigned HOST_WIDE_INT n_elts;
   bool uns;
 
+  if (!code.is_tree_code ())
+    return false;
+
   *multi_step_cvt = 0;
-  switch (code)
+  switch ((tree_code) code)
     {
     CASE_CONVERT:
       c1 = VEC_PACK_TRUNC_EXPR;
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index 9cf2fb23fe397b467d89aa7cc5ebeaa293ed4cce..f215cd0639bcf803c9d0554cfdc57823431991d5 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -2139,13 +2139,12 @@ extern bool vect_is_simple_use (vec_info *, stmt_vec_info, slp_tree,
 				enum vect_def_type *,
 				tree *, stmt_vec_info * = NULL);
 extern bool vect_maybe_update_slp_op_vectype (slp_tree, tree);
-extern bool supportable_widening_operation (vec_info *,
-					    enum tree_code, stmt_vec_info,
-					    tree, tree, enum tree_code *,
-					    enum tree_code *, int *,
-					    vec<tree> *);
-extern bool supportable_narrowing_operation (enum tree_code, tree, tree,
-					     enum tree_code *, int *,
+extern bool supportable_widening_operation (vec_info*, code_helper,
+					    stmt_vec_info, tree, tree,
+					    code_helper*, code_helper*,
+					    int*, vec<tree> *);
+extern bool supportable_narrowing_operation (code_helper, tree, tree,
+					     code_helper *, int *,
 					     vec<tree> *);
 
 extern unsigned record_stmt_cost (stmt_vector_for_cost *, int,
@@ -2583,4 +2582,7 @@ vect_is_integer_truncation (stmt_vec_info stmt_info)
 	  && TYPE_PRECISION (lhs_type) < TYPE_PRECISION (rhs_type));
 }
 
+/* Build a GIMPLE_ASSIGN or GIMPLE_CALL with the tree_code,
+   or internal_fn contained in ch, respectively.  */
+gimple * vect_gimple_build (tree, code_helper, tree, tree = NULL_TREE);
 #endif  /* GCC_TREE_VECTORIZER_H  */
diff --git a/gcc/tree.h b/gcc/tree.h
index 0b72663e6a1a94406127f6253460f498b7a3ea9c..6dcb28ebc1df456e3798b8f0b43bae42c145d43d 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -93,6 +93,8 @@ public:
   bool is_internal_fn () const;
   bool is_builtin_fn () const;
   int get_rep () const { return rep; }
+  tree_code safe_as_tree_code () const;
+  combined_fn safe_as_fn_code () const;
   bool operator== (const code_helper &other) { return rep == other.rep; }
   bool operator!= (const code_helper &other) { return rep != other.rep; }
   bool operator== (tree_code c) { return rep == code_helper (c).rep; }
@@ -102,6 +104,25 @@ private:
   int rep;
 };
 
+/* Helper function that returns the tree_code representation of THIS
+   code_helper if it is a tree_code and MAX_TREE_CODES otherwise.  This is
+   useful when passing a code_helper to a tree_code only check.  */
+
+inline tree_code
+code_helper::safe_as_tree_code () const
+{
+  return is_tree_code () ? (tree_code) *this : MAX_TREE_CODES;
+}
+
+/* Helper function that returns the combined_fn representation of THIS
+   code_helper if it is a fn_code and CFN_LAST otherwise.  This is useful when
+   passing a code_helper to a combined_fn only check.  */
+
+inline combined_fn
+code_helper::safe_as_fn_code () const {
+  return is_fn_code () ? (combined_fn) *this : CFN_LAST;
+}
+
 inline code_helper::operator internal_fn () const
 {
   return as_internal_fn (combined_fn (*this));

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2/3] Refactor widen_plus as internal_fn
  2023-05-03 19:07                                 ` Richard Sandiford
@ 2023-05-12 12:16                                   ` Andre Vieira (lists)
  2023-05-12 13:28                                     ` Richard Biener
  0 siblings, 1 reply; 53+ messages in thread
From: Andre Vieira (lists) @ 2023-05-12 12:16 UTC (permalink / raw)
  To: Richard Biener, Richard Biener, gcc-patches, richard.sandiford

[-- Attachment #1: Type: text/plain, Size: 4743 bytes --]

I think I have dealt with most of your comments. There are quite a few
changes, and I think it's all a bit simpler now. I made some other
changes to the costing in tree-inline.cc and to gimple-range-op.cc, in
which I try to preserve the same behaviour as we had with the tree codes
before. I also added some extra checks to tree-cfg.cc that made sense to me.

I am still regression testing the gimple-range-op change, as that was a
last-minute change, but the rest survived a bootstrap and regression
test on aarch64-unknown-linux-gnu.

cover letter:

This patch replaces the existing tree_code widen_plus and widen_minus
patterns with internal_fn versions.

DEF_INTERNAL_OPTAB_WIDENING_HILO_FN and 
DEF_INTERNAL_OPTAB_NARROWING_HILO_FN are like 
DEF_INTERNAL_SIGNED_OPTAB_FN and DEF_INTERNAL_OPTAB_FN respectively 
except they provide convenience wrappers for defining conversions that 
require a hi/lo split.  Each definition for <NAME> will require optabs 
for _hi and _lo and each of those will also require a signed and 
unsigned version in the case of widening. The hi/lo pair is necessary 
because the widening and narrowing operations take n narrow elements as 
inputs and return n/2 wide elements as outputs. The 'lo' operation 
operates on the first n/2 elements of input. The 'hi' operation operates 
on the second n/2 elements of input. Defining an internal_fn along with 
hi/lo variations allows a single internal function to be returned from a 
vect_recog function that will later be expanded to hi/lo.
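
To make the hi/lo split concrete, here is a small scalar sketch in C++.
It is not GCC code: widen_add_lo and widen_add_hi are hypothetical
helpers that model the semantics described above for n = 8 lanes of
int8_t, each producing n/2 = 4 lanes of int16_t. It assumes 'lo' means
the first n/2 elements, as in the description; how that maps to
registers on a real target is outside the sketch.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of the hi/lo split; illustrative only,
// none of these names exist in GCC.
constexpr std::size_t N = 8;
using narrow_vec = std::array<std::int8_t, N>;
using wide_vec = std::array<std::int16_t, N / 2>;

// 'lo': widen and add the first N/2 elements of each input.
wide_vec
widen_add_lo (const narrow_vec &a, const narrow_vec &b)
{
  wide_vec r{};
  for (std::size_t i = 0; i < N / 2; ++i)
    r[i] = static_cast<std::int16_t> (a[i] + b[i]);
  return r;
}

// 'hi': widen and add the second N/2 elements of each input.
wide_vec
widen_add_hi (const narrow_vec &a, const narrow_vec &b)
{
  wide_vec r{};
  for (std::size_t i = 0; i < N / 2; ++i)
    r[i] = static_cast<std::int16_t> (a[N / 2 + i] + b[N / 2 + i]);
  return r;
}
```

Taken together, the two results cover all n input lanes without
truncating any sum to the narrow type, which is why one widening
operation needs both halves.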


For example:
  IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO
  for aarch64:
    IFN_VEC_WIDEN_PLUS_HI -> vec_widen_<su>add_hi_<mode> -> (u/s)addl2
    IFN_VEC_WIDEN_PLUS_LO -> vec_widen_<su>add_lo_<mode> -> (u/s)addl

This gives the same functionality as the previous WIDEN_PLUS/WIDEN_MINUS 
tree codes which are expanded into VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI.
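
In the attached patch the _LO and _HI functions are found from the
combined one by enum adjacency: the .def macro declares NAME_HILO,
NAME_LO and NAME_HI in that order, and lookup_hilo_internal_fn returns
ifn + 1 and ifn + 2. A standalone C++ miniature of that scheme, with a
hypothetical mini_ifn enum standing in for internal_fn, might look like
this:

```cpp
// Hypothetical miniature of the enum layout used by the patch.  The
// MINI_* names and both helpers are illustrative stand-ins, not GCC code.
enum mini_ifn
{
  MINI_VEC_WIDEN_PLUS_HILO,
  MINI_VEC_WIDEN_PLUS_LO,
  MINI_VEC_WIDEN_PLUS_HI,
  MINI_VEC_WIDEN_MINUS_HILO,
  MINI_VEC_WIDEN_MINUS_LO,
  MINI_VEC_WIDEN_MINUS_HI,
  MINI_LAST
};

// Return true if FN is a combined function that splits into _LO/_HI.
bool
mini_decomposes_to_hilo_p (mini_ifn fn)
{
  switch (fn)
    {
    case MINI_VEC_WIDEN_PLUS_HILO:
    case MINI_VEC_WIDEN_MINUS_HILO:
      return true;
    default:
      return false;
    }
}

// Relies on the declaration order above: _LO directly follows _HILO
// and _HI directly follows _LO.
void
mini_lookup_hilo (mini_ifn fn, mini_ifn *lo, mini_ifn *hi)
{
  *lo = static_cast<mini_ifn> (fn + 1);
  *hi = static_cast<mini_ifn> (fn + 2);
}
```

The lookup is only meaningful for the combined functions, which is why
the patch asserts decomposes_to_hilo_fn_p before doing the arithmetic.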

gcc/ChangeLog:

2023-05-12  Andre Vieira  <andre.simoesdiasvieira@arm.com>
             Joel Hutton  <joel.hutton@arm.com>
             Tamar Christina  <tamar.christina@arm.com>

	* config/aarch64/aarch64-simd.md (vec_widen_<su>addl_lo_<mode>): Rename
	this ...
	(vec_widen_<su>add_lo_<mode>): ... to this.
	(vec_widen_<su>addl_hi_<mode>): Rename this ...
	(vec_widen_<su>add_hi_<mode>): ... to this.
	(vec_widen_<su>subl_lo_<mode>): Rename this ...
	(vec_widen_<su>sub_lo_<mode>): ... to this.
	(vec_widen_<su>subl_hi_<mode>): Rename this ...
	(vec_widen_<su>sub_hi_<mode>): ... to this.
	* doc/generic.texi: Document new IFN codes.
	* internal-fn.cc (DEF_INTERNAL_OPTAB_WIDENING_HILO_FN): Macro to define an
	internal_fn that expands into multiple internal_fns for widening.
	(DEF_INTERNAL_OPTAB_NARROWING_HILO_FN): Likewise but for narrowing.
	(ifn_cmp): Function to compare ifn's for sorting/searching.
	(lookup_hilo_internal_fn): Add lookup function.
	(commutative_binary_fn_p): Add widen_plus fn's.
	(widening_fn_p): New function.
	(narrowing_fn_p): New function.
	(decomposes_to_hilo_fn_p): New function.
	(direct_internal_fn_optab): Change visibility.
	* internal-fn.def (DEF_INTERNAL_OPTAB_WIDENING_HILO_FN): Define widening
	plus,minus functions.
	(VEC_WIDEN_PLUS): Replacement for VEC_WIDEN_PLUS_EXPR tree code.
	(VEC_WIDEN_MINUS): Replacement for VEC_WIDEN_MINUS_EXPR tree code.
	* internal-fn.h (GCC_INTERNAL_FN_H): Add headers.
	(direct_internal_fn_optab): Declare new prototype.
	(lookup_hilo_internal_fn): Likewise.
	(widening_fn_p): Likewise.
	(narrowing_fn_p): Likewise.
	(decomposes_to_hilo_fn_p): Likewise.
	* optabs.cc (commutative_optab_p): Add widening plus optabs.
	* optabs.def (OPTAB_D): Define widen add, sub optabs.
	* tree-cfg.cc (verify_gimple_call): Add checks for new widen
	add and sub IFNs.
	* tree-inline.cc (estimate_num_insns): Return same
	cost for widen add and sub IFNs as previous tree_codes.
	* tree-vect-patterns.cc (vect_recog_widen_op_pattern): Support
	patterns with a hi/lo split.
	(vect_recog_sad_pattern): Refactor to use new IFN codes.
	(vect_recog_widen_plus_pattern): Likewise.
	(vect_recog_widen_minus_pattern): Likewise.
	(vect_recog_average_pattern): Likewise.
	* tree-vect-stmts.cc (vectorizable_conversion): Add support for
	_HILO IFNs.
	(supportable_widening_operation): Likewise.
	* tree.def (WIDEN_SUM_EXPR): Update example to use new IFNs.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/vect-widen-add.c: Test that new
	IFN_VEC_WIDEN_PLUS is being used.
	* gcc.target/aarch64/vect-widen-sub.c: Test that new
	IFN_VEC_WIDEN_MINUS is being used.

[-- Attachment #2: ifn1v3.patch --]
[-- Type: text/plain, Size: 34240 bytes --]

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index bfc98a8d943467b33390defab9682f44efab5907..ffbbecb9409e1c2835d658c2a8855cd0e955c0f2 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -4626,7 +4626,7 @@
   [(set_attr "type" "neon_<ADDSUB:optab>_long")]
 )
 
-(define_expand "vec_widen_<su>addl_lo_<mode>"
+(define_expand "vec_widen_<su>add_lo_<mode>"
   [(match_operand:<VWIDE> 0 "register_operand")
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand"))
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))]
@@ -4638,7 +4638,7 @@
   DONE;
 })
 
-(define_expand "vec_widen_<su>addl_hi_<mode>"
+(define_expand "vec_widen_<su>add_hi_<mode>"
   [(match_operand:<VWIDE> 0 "register_operand")
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand"))
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))]
@@ -4650,7 +4650,7 @@
   DONE;
 })
 
-(define_expand "vec_widen_<su>subl_lo_<mode>"
+(define_expand "vec_widen_<su>sub_lo_<mode>"
   [(match_operand:<VWIDE> 0 "register_operand")
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand"))
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))]
@@ -4662,7 +4662,7 @@
   DONE;
 })
 
-(define_expand "vec_widen_<su>subl_hi_<mode>"
+(define_expand "vec_widen_<su>sub_hi_<mode>"
   [(match_operand:<VWIDE> 0 "register_operand")
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand"))
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))]
diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi
index 8b2882da4fe7da07d22b4e5384d049ba7d3907bf..0fd7e6cce8bbd4ecb8027b702722adcf6c32eb55 100644
--- a/gcc/doc/generic.texi
+++ b/gcc/doc/generic.texi
@@ -1811,6 +1811,10 @@ a value from @code{enum annot_expr_kind}, the third is an @code{INTEGER_CST}.
 @tindex VEC_RSHIFT_EXPR
 @tindex VEC_WIDEN_MULT_HI_EXPR
 @tindex VEC_WIDEN_MULT_LO_EXPR
+@tindex IFN_VEC_WIDEN_PLUS_HI
+@tindex IFN_VEC_WIDEN_PLUS_LO
+@tindex IFN_VEC_WIDEN_MINUS_HI
+@tindex IFN_VEC_WIDEN_MINUS_LO
 @tindex VEC_WIDEN_PLUS_HI_EXPR
 @tindex VEC_WIDEN_PLUS_LO_EXPR
 @tindex VEC_WIDEN_MINUS_HI_EXPR
@@ -1861,6 +1865,33 @@ vector of @code{N/2} products. In the case of @code{VEC_WIDEN_MULT_LO_EXPR} the
 low @code{N/2} elements of the two vector are multiplied to produce the
 vector of @code{N/2} products.
 
+@item IFN_VEC_WIDEN_PLUS_HI
+@itemx IFN_VEC_WIDEN_PLUS_LO
+These internal functions represent widening vector addition of the high and low
+parts of the two input vectors, respectively.  Their operands are vectors that
+contain the same number of elements (@code{N}) of the same integral type. The
+result is a vector that contains half as many elements, of an integral type
+whose size is twice as wide.  In the case of @code{IFN_VEC_WIDEN_PLUS_HI} the
+high @code{N/2} elements of the two vectors are added to produce the vector of
+@code{N/2} sums.  In the case of @code{IFN_VEC_WIDEN_PLUS_LO} the low
+@code{N/2} elements of the two vectors are added to produce the vector of
+@code{N/2} sums.
+
+@item IFN_VEC_WIDEN_MINUS_HI
+@itemx IFN_VEC_WIDEN_MINUS_LO
+These internal functions represent widening vector subtraction of the high and
+low parts of the two input vectors, respectively.  Their operands are vectors
+that contain the same number of elements (@code{N}) of the same integral type.
+The high/low elements of the second vector are subtracted from the high/low
+elements of the first. The result is a vector that contains half as many
+elements, of an integral type whose size is twice as wide.  In the case of
+@code{IFN_VEC_WIDEN_MINUS_HI} the high @code{N/2} elements of the second
+vector are subtracted from the high @code{N/2} of the first to produce the
+vector of @code{N/2} differences.  In the case of
+@code{IFN_VEC_WIDEN_MINUS_LO} the low @code{N/2} elements of the second
+vector are subtracted from the low @code{N/2} of the first to produce the
+vector of @code{N/2} differences.
+
 @item VEC_WIDEN_PLUS_HI_EXPR
 @itemx VEC_WIDEN_PLUS_LO_EXPR
 These nodes represent widening vector addition of the high and low parts of
diff --git a/gcc/gimple-range-op.cc b/gcc/gimple-range-op.cc
index 594bd3043f0e944299ddfff219f757ef15a3dd61..66636d82df27626e7911efd0cb8526921b39633f 100644
--- a/gcc/gimple-range-op.cc
+++ b/gcc/gimple-range-op.cc
@@ -1187,6 +1187,7 @@ gimple_range_op_handler::maybe_non_standard ()
 {
   range_operator *signed_op = ptr_op_widen_mult_signed;
   range_operator *unsigned_op = ptr_op_widen_mult_unsigned;
+  bool signed1, signed2, signed_ret;
   if (gimple_code (m_stmt) == GIMPLE_ASSIGN)
     switch (gimple_assign_rhs_code (m_stmt))
       {
@@ -1202,32 +1203,55 @@ gimple_range_op_handler::maybe_non_standard ()
 	  m_op1 = gimple_assign_rhs1 (m_stmt);
 	  m_op2 = gimple_assign_rhs2 (m_stmt);
 	  tree ret = gimple_assign_lhs (m_stmt);
-	  bool signed1 = TYPE_SIGN (TREE_TYPE (m_op1)) == SIGNED;
-	  bool signed2 = TYPE_SIGN (TREE_TYPE (m_op2)) == SIGNED;
-	  bool signed_ret = TYPE_SIGN (TREE_TYPE (ret)) == SIGNED;
-
-	  /* Normally these operands should all have the same sign, but
-	     some passes and violate this by taking mismatched sign args.  At
-	     the moment the only one that's possible is mismatch inputs and
-	     unsigned output.  Once ranger supports signs for the operands we
-	     can properly fix it,  for now only accept the case we can do
-	     correctly.  */
-	  if ((signed1 ^ signed2) && signed_ret)
-	    return;
-
-	  m_valid = true;
-	  if (signed2 && !signed1)
-	    std::swap (m_op1, m_op2);
-
-	  if (signed1 || signed2)
-	    m_int = signed_op;
-	  else
-	    m_int = unsigned_op;
+	  signed1 = TYPE_SIGN (TREE_TYPE (m_op1)) == SIGNED;
+	  signed2 = TYPE_SIGN (TREE_TYPE (m_op2)) == SIGNED;
+	  signed_ret = TYPE_SIGN (TREE_TYPE (ret)) == SIGNED;
 	  break;
 	}
 	default:
-	  break;
+	  return;
       }
+  else if (gimple_code (m_stmt) == GIMPLE_CALL
+      && gimple_call_internal_p (m_stmt)
+      && gimple_get_lhs (m_stmt) != NULL_TREE)
+    switch (gimple_call_internal_fn (m_stmt))
+      {
+      case IFN_VEC_WIDEN_PLUS_LO:
+      case IFN_VEC_WIDEN_PLUS_HI:
+	  {
+	    signed_op = ptr_op_widen_plus_signed;
+	    unsigned_op = ptr_op_widen_plus_unsigned;
+	    m_valid = false;
+	    m_op1 = gimple_call_arg (m_stmt, 0);
+	    m_op2 = gimple_call_arg (m_stmt, 1);
+	    tree ret = gimple_get_lhs (m_stmt);
+	    signed1 = TYPE_SIGN (TREE_TYPE (m_op1)) == SIGNED;
+	    signed2 = TYPE_SIGN (TREE_TYPE (m_op2)) == SIGNED;
+	    signed_ret = TYPE_SIGN (TREE_TYPE (ret)) == SIGNED;
+	    break;
+	  }
+      default:
+	return;
+      }
+  else
+    return;
+
+    /* Normally these operands should all have the same sign, but some passes
+       violate this by taking mismatched sign args.  At the moment the only
+       possible case is mismatched inputs and an unsigned output.  Once ranger
+       supports signs for the operands we can properly fix it; for now only
+       accept the case we can do correctly.  */
+    if ((signed1 ^ signed2) && signed_ret)
+      return;
+
+    m_valid = true;
+    if (signed2 && !signed1)
+      std::swap (m_op1, m_op2);
+
+    if (signed1 || signed2)
+      m_int = signed_op;
+    else
+      m_int = unsigned_op;
 }
 
 // Set up a gimple_range_op_handler for any built in function which can be
diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 5c9da73ea11f8060b18dcf513599c9694fa4f2ad..1acea5ae33046b70de247b1688aea874d9956abc 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -90,6 +90,19 @@ lookup_internal_fn (const char *name)
   return entry ? *entry : IFN_LAST;
 }
 
+/*  Given an internal_fn IFN that is a HILO function, return its corresponding
+    LO and HI internal_fns.  */
+
+extern void
+lookup_hilo_internal_fn (internal_fn ifn, internal_fn *lo, internal_fn *hi)
+{
+  gcc_assert (decomposes_to_hilo_fn_p (ifn));
+
+  *lo = internal_fn (ifn + 1);
+  *hi = internal_fn (ifn + 2);
+}
+
+
 /* Fnspec of each internal function, indexed by function number.  */
 const_tree internal_fn_fnspec_array[IFN_LAST + 1];
 
@@ -137,7 +150,16 @@ const direct_internal_fn_info direct_internal_fn_array[IFN_LAST + 1] = {
 #define DEF_INTERNAL_OPTAB_FN(CODE, FLAGS, OPTAB, TYPE) TYPE##_direct,
 #define DEF_INTERNAL_SIGNED_OPTAB_FN(CODE, FLAGS, SELECTOR, SIGNED_OPTAB, \
 				     UNSIGNED_OPTAB, TYPE) TYPE##_direct,
+#undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN
+#undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
+#define DEF_INTERNAL_OPTAB_WIDENING_HILO_FN(CODE, FLAGS, SELECTOR, SIGNED_OPTAB, \
+					    UNSIGNED_OPTAB, TYPE)		  \
+TYPE##_direct, TYPE##_direct, TYPE##_direct,
+#define DEF_INTERNAL_OPTAB_NARROWING_HILO_FN(CODE, FLAGS, OPTAB, TYPE)	\
+TYPE##_direct, TYPE##_direct, TYPE##_direct,
 #include "internal-fn.def"
+#undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN
+#undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
   not_direct
 };
 
@@ -3852,7 +3874,7 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types,
 
 /* Return the optab used by internal function FN.  */
 
-static optab
+optab
 direct_internal_fn_optab (internal_fn fn, tree_pair types)
 {
   switch (fn)
@@ -3971,6 +3993,9 @@ commutative_binary_fn_p (internal_fn fn)
     case IFN_UBSAN_CHECK_MUL:
     case IFN_ADD_OVERFLOW:
     case IFN_MUL_OVERFLOW:
+    case IFN_VEC_WIDEN_PLUS_HILO:
+    case IFN_VEC_WIDEN_PLUS_LO:
+    case IFN_VEC_WIDEN_PLUS_HI:
       return true;
 
     default:
@@ -4044,6 +4069,88 @@ first_commutative_argument (internal_fn fn)
     }
 }
 
+/* Return true if this CODE describes an internal_fn that returns a vector with
+   elements twice as wide as the element size of the input vectors.  */
+
+bool
+widening_fn_p (code_helper code)
+{
+  if (!code.is_fn_code ())
+    return false;
+
+  if (!internal_fn_p ((combined_fn) code))
+    return false;
+
+  internal_fn fn = as_internal_fn ((combined_fn) code);
+  switch (fn)
+    {
+    #undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN
+    #define DEF_INTERNAL_OPTAB_WIDENING_HILO_FN(NAME, F, S, SO, UO, T) \
+    case IFN_##NAME##_HILO:\
+    case IFN_##NAME##_HI: \
+    case IFN_##NAME##_LO: \
+      return true;
+    #include "internal-fn.def"
+    #undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN
+
+    default:
+      return false;
+    }
+}
+
+/* Return true if this CODE describes an internal_fn that returns a vector with
+   elements twice as narrow as the element size of the input vectors.  */
+
+bool
+narrowing_fn_p (code_helper code)
+{
+  if (!code.is_fn_code ())
+    return false;
+
+  if (!internal_fn_p ((combined_fn) code))
+    return false;
+
+  internal_fn fn = as_internal_fn ((combined_fn) code);
+  switch (fn)
+    {
+    #undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
+    #define DEF_INTERNAL_OPTAB_NARROWING_HILO_FN(NAME, F, O, T) \
+    case IFN_##NAME##_HILO:\
+    case IFN_##NAME##_HI: \
+    case IFN_##NAME##_LO: \
+      return true;
+    #include "internal-fn.def"
+    #undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
+
+    default:
+      return false;
+    }
+}
+
+/* Return true if FN decomposes to _hi and _lo IFN.  */
+
+bool
+decomposes_to_hilo_fn_p (internal_fn fn)
+{
+  switch (fn)
+    {
+    #undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN
+    #define DEF_INTERNAL_OPTAB_WIDENING_HILO_FN(NAME, F, S, SO, UO, T) \
+    case IFN_##NAME##_HILO:\
+      return true;
+    #undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
+    #define DEF_INTERNAL_OPTAB_NARROWING_HILO_FN(NAME, F, O, T) \
+    case IFN_##NAME##_HILO:\
+      return true;
+    #include "internal-fn.def"
+    #undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN
+    #undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
+
+    default:
+      return false;
+    }
+}
+
 /* Return true if IFN_SET_EDOM is supported.  */
 
 bool
@@ -4071,7 +4178,33 @@ set_edom_supported_p (void)
     optab which_optab = direct_internal_fn_optab (fn, types);		\
     expand_##TYPE##_optab_fn (fn, stmt, which_optab);			\
   }
+#define DEF_INTERNAL_OPTAB_WIDENING_HILO_FN(CODE, FLAGS, SELECTOR,	    \
+					    SIGNED_OPTAB, UNSIGNED_OPTAB,   \
+					    TYPE)			    \
+  static void								    \
+  expand_##CODE##_HILO (internal_fn fn ATTRIBUTE_UNUSED,		    \
+			gcall *stmt ATTRIBUTE_UNUSED)			    \
+  {									    \
+    gcc_unreachable ();							    \
+  }									    \
+  DEF_INTERNAL_SIGNED_OPTAB_FN(CODE##_HI, FLAGS, SELECTOR, SIGNED_OPTAB,    \
+			       UNSIGNED_OPTAB, TYPE)			    \
+  DEF_INTERNAL_SIGNED_OPTAB_FN(CODE##_LO, FLAGS, SELECTOR, SIGNED_OPTAB,    \
+			       UNSIGNED_OPTAB, TYPE)
+#define DEF_INTERNAL_OPTAB_NARROWING_HILO_FN(CODE, FLAGS, OPTAB, TYPE)	\
+  static void								\
+  expand_##CODE##_HILO (internal_fn fn ATTRIBUTE_UNUSED,		\
+			gcall *stmt ATTRIBUTE_UNUSED)			\
+  {									\
+    gcc_unreachable ();							\
+  }									\
+  DEF_INTERNAL_OPTAB_FN(CODE##_LO, FLAGS, OPTAB, TYPE)			\
+  DEF_INTERNAL_OPTAB_FN(CODE##_HI, FLAGS, OPTAB, TYPE)
 #include "internal-fn.def"
+#undef DEF_INTERNAL_OPTAB_FN
+#undef DEF_INTERNAL_SIGNED_OPTAB_FN
+#undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN
+#undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
 
 /* Routines to expand each internal function, indexed by function number.
    Each routine has the prototype:
@@ -4080,6 +4213,7 @@ set_edom_supported_p (void)
 
    where STMT is the statement that performs the call. */
 static void (*const internal_fn_expanders[]) (internal_fn, gcall *) = {
+
 #define DEF_INTERNAL_FN(CODE, FLAGS, FNSPEC) expand_##CODE,
 #include "internal-fn.def"
   0
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 7fe742c2ae713e7152ab05cfdfba86e4e0aa3456..012dd323b86dd7cfcc5c13d3a2bb2a453937155d 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -85,6 +85,13 @@ along with GCC; see the file COPYING3.  If not see
    says that the function extends the C-level BUILT_IN_<NAME>{,L,LL,IMAX}
    group of functions to any integral mode (including vector modes).
 
+   DEF_INTERNAL_OPTAB_WIDENING_HILO_FN and DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
+   are like DEF_INTERNAL_SIGNED_OPTAB_FN and DEF_INTERNAL_OPTAB_FN respectively,
+   except that they define conversions that require a hi/lo split.  Each
+   definition for <NAME> requires optabs with _hi and _lo suffixes, and in the
+   widening case a signed and an unsigned version of each.
+
+
    Each entry must have a corresponding expander of the form:
 
      void expand_NAME (gimple_call stmt)
@@ -123,6 +130,20 @@ along with GCC; see the file COPYING3.  If not see
   DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE)
 #endif
 
+#ifndef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN
+#define DEF_INTERNAL_OPTAB_WIDENING_HILO_FN(NAME, FLAGS, SELECTOR, SOPTAB, UOPTAB, TYPE) \
+  DEF_INTERNAL_FN (NAME##_HILO, FLAGS | ECF_LEAF, NULL) \
+  DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _LO, FLAGS, SELECTOR, SOPTAB##_lo, UOPTAB##_lo, TYPE) \
+  DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _HI, FLAGS, SELECTOR, SOPTAB##_hi, UOPTAB##_hi, TYPE)
+#endif
+
+#ifndef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
+#define DEF_INTERNAL_OPTAB_NARROWING_HILO_FN(NAME, FLAGS, OPTAB, TYPE) \
+  DEF_INTERNAL_FN (NAME##_HILO, FLAGS | ECF_LEAF, NULL) \
+  DEF_INTERNAL_OPTAB_FN (NAME ## _LO, FLAGS, OPTAB##_lo, TYPE) \
+  DEF_INTERNAL_OPTAB_FN (NAME ## _HI, FLAGS, OPTAB##_hi, TYPE)
+#endif
+
 DEF_INTERNAL_OPTAB_FN (MASK_LOAD, ECF_PURE, maskload, mask_load)
 DEF_INTERNAL_OPTAB_FN (LOAD_LANES, ECF_CONST, vec_load_lanes, load_lanes)
 DEF_INTERNAL_OPTAB_FN (MASK_LOAD_LANES, ECF_PURE,
@@ -315,6 +336,16 @@ DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT270, ECF_CONST, cadd270, binary)
 DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL, ECF_CONST, cmul, binary)
 DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL_CONJ, ECF_CONST, cmul_conj, binary)
 DEF_INTERNAL_OPTAB_FN (VEC_ADDSUB, ECF_CONST, vec_addsub, binary)
+DEF_INTERNAL_OPTAB_WIDENING_HILO_FN (VEC_WIDEN_PLUS,
+				     ECF_CONST | ECF_NOTHROW,
+				     first,
+				     vec_widen_sadd, vec_widen_uadd,
+				     binary)
+DEF_INTERNAL_OPTAB_WIDENING_HILO_FN (VEC_WIDEN_MINUS,
+				     ECF_CONST | ECF_NOTHROW,
+				     first,
+				     vec_widen_ssub, vec_widen_usub,
+				     binary)
 DEF_INTERNAL_OPTAB_FN (VEC_FMADDSUB, ECF_CONST, vec_fmaddsub, ternary)
 DEF_INTERNAL_OPTAB_FN (VEC_FMSUBADD, ECF_CONST, vec_fmsubadd, ternary)
 
diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h
index 08922ed4254898f5fffca3f33973e96ed9ce772f..8ba07d6d1338e75bc5a451d9e403112a608f3ea2 100644
--- a/gcc/internal-fn.h
+++ b/gcc/internal-fn.h
@@ -20,6 +20,10 @@ along with GCC; see the file COPYING3.  If not see
 #ifndef GCC_INTERNAL_FN_H
 #define GCC_INTERNAL_FN_H
 
+#include "insn-codes.h"
+#include "insn-opinit.h"
+
+
 /* INTEGER_CST values for IFN_UNIQUE function arg-0.
 
    UNSPEC: Undifferentiated UNIQUE.
@@ -112,6 +116,8 @@ internal_fn_name (enum internal_fn fn)
 }
 
 extern internal_fn lookup_internal_fn (const char *);
+extern void lookup_hilo_internal_fn (internal_fn, internal_fn *, internal_fn *);
+extern optab direct_internal_fn_optab (internal_fn, tree_pair);
 
 /* Return the ECF_* flags for function FN.  */
 
@@ -210,6 +216,9 @@ extern bool commutative_binary_fn_p (internal_fn);
 extern bool commutative_ternary_fn_p (internal_fn);
 extern int first_commutative_argument (internal_fn);
 extern bool associative_binary_fn_p (internal_fn);
+extern bool widening_fn_p (code_helper);
+extern bool narrowing_fn_p (code_helper);
+extern bool decomposes_to_hilo_fn_p (internal_fn);
 
 extern bool set_edom_supported_p (void);
 
diff --git a/gcc/optabs.cc b/gcc/optabs.cc
index c8e39c82d57a7d726e7da33d247b80f32ec9236c..5a08d91e550b2d92e9572211f811fdba99a33a38 100644
--- a/gcc/optabs.cc
+++ b/gcc/optabs.cc
@@ -1314,7 +1314,15 @@ commutative_optab_p (optab binoptab)
 	  || binoptab == smul_widen_optab
 	  || binoptab == umul_widen_optab
 	  || binoptab == smul_highpart_optab
-	  || binoptab == umul_highpart_optab);
+	  || binoptab == umul_highpart_optab
+	  || binoptab == vec_widen_saddl_hi_optab
+	  || binoptab == vec_widen_saddl_lo_optab
+	  || binoptab == vec_widen_uaddl_hi_optab
+	  || binoptab == vec_widen_uaddl_lo_optab
+	  || binoptab == vec_widen_sadd_hi_optab
+	  || binoptab == vec_widen_sadd_lo_optab
+	  || binoptab == vec_widen_uadd_hi_optab
+	  || binoptab == vec_widen_uadd_lo_optab);
 }
 
 /* X is to be used in mode MODE as operand OPN to BINOPTAB.  If we're
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 695f5911b300c9ca5737de9be809fa01aabe5e01..16d121722c8c5723d9b164f5a2c616dc7ec143de 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -410,6 +410,10 @@ OPTAB_D (vec_widen_ssubl_hi_optab, "vec_widen_ssubl_hi_$a")
 OPTAB_D (vec_widen_ssubl_lo_optab, "vec_widen_ssubl_lo_$a")
 OPTAB_D (vec_widen_saddl_hi_optab, "vec_widen_saddl_hi_$a")
 OPTAB_D (vec_widen_saddl_lo_optab, "vec_widen_saddl_lo_$a")
+OPTAB_D (vec_widen_ssub_hi_optab, "vec_widen_ssub_hi_$a")
+OPTAB_D (vec_widen_ssub_lo_optab, "vec_widen_ssub_lo_$a")
+OPTAB_D (vec_widen_sadd_hi_optab, "vec_widen_sadd_hi_$a")
+OPTAB_D (vec_widen_sadd_lo_optab, "vec_widen_sadd_lo_$a")
 OPTAB_D (vec_widen_sshiftl_hi_optab, "vec_widen_sshiftl_hi_$a")
 OPTAB_D (vec_widen_sshiftl_lo_optab, "vec_widen_sshiftl_lo_$a")
 OPTAB_D (vec_widen_umult_even_optab, "vec_widen_umult_even_$a")
@@ -422,6 +426,10 @@ OPTAB_D (vec_widen_usubl_hi_optab, "vec_widen_usubl_hi_$a")
 OPTAB_D (vec_widen_usubl_lo_optab, "vec_widen_usubl_lo_$a")
 OPTAB_D (vec_widen_uaddl_hi_optab, "vec_widen_uaddl_hi_$a")
 OPTAB_D (vec_widen_uaddl_lo_optab, "vec_widen_uaddl_lo_$a")
+OPTAB_D (vec_widen_usub_hi_optab, "vec_widen_usub_hi_$a")
+OPTAB_D (vec_widen_usub_lo_optab, "vec_widen_usub_lo_$a")
+OPTAB_D (vec_widen_uadd_hi_optab, "vec_widen_uadd_hi_$a")
+OPTAB_D (vec_widen_uadd_lo_optab, "vec_widen_uadd_lo_$a")
 OPTAB_D (vec_addsub_optab, "vec_addsub$a3")
 OPTAB_D (vec_fmaddsub_optab, "vec_fmaddsub$a4")
 OPTAB_D (vec_fmsubadd_optab, "vec_fmsubadd$a4")
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
index 220bd9352a4c7acd2e3713e441d74898d3e92b30..7037673d32bd780e1c9b58a51e58e2bac3b30b7e 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
+++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O3 -save-temps" } */
+/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */
 #include <stdint.h>
 #include <string.h>
 
@@ -86,6 +86,8 @@ main()
     return 0;
 }
 
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_LO" "vect"   } } */
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_HI" "vect"   } } */
 /* { dg-final { scan-assembler-times {\tuaddl\t} 1} } */
 /* { dg-final { scan-assembler-times {\tuaddl2\t} 1} } */
 /* { dg-final { scan-assembler-times {\tsaddl\t} 1} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c
index a2bed63affbd091977df95a126da1f5b8c1d41d2..83bc1edb6105f47114b665e24a13e6194b2179a2 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c
+++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O3 -save-temps" } */
+/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */
 #include <stdint.h>
 #include <string.h>
 
@@ -86,6 +86,8 @@ main()
     return 0;
 }
 
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_LO" "vect"   } } */
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_HI" "vect"   } } */
 /* { dg-final { scan-assembler-times {\tusubl\t} 1} } */
 /* { dg-final { scan-assembler-times {\tusubl2\t} 1} } */
 /* { dg-final { scan-assembler-times {\tssubl\t} 1} } */
diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
index 0aeebb67fac864db284985f4a6f0653af281d62b..28464ad9e3a7ea25557ffebcdbdbc1340f9e0d8b 100644
--- a/gcc/tree-cfg.cc
+++ b/gcc/tree-cfg.cc
@@ -65,6 +65,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "asan.h"
 #include "profile.h"
 #include "sreal.h"
+#include "internal-fn.h"
 
 /* This file contains functions for building the Control Flow Graph (CFG)
    for a function tree.  */
@@ -3411,6 +3412,52 @@ verify_gimple_call (gcall *stmt)
 	  debug_generic_stmt (fn);
 	  return true;
 	}
+      internal_fn ifn = gimple_call_internal_fn (stmt);
+      if (ifn == IFN_LAST)
+	{
+	  error ("gimple call has an invalid IFN");
+	  debug_generic_stmt (fn);
+	  return true;
+	}
+      else if (decomposes_to_hilo_fn_p (ifn))
+	{
+	  /* Non-decomposed HILO stmts should not appear in the IL; they are
+	     merely used as an internal representation by the auto-vectorizer
+	     pass and should have been expanded to their _LO and _HI variants.  */
+	  error ("gimple call has a non-decomposed HILO IFN");
+	  debug_generic_stmt (fn);
+	  return true;
+	}
+      else if (ifn == IFN_VEC_WIDEN_PLUS_LO
+	       || ifn == IFN_VEC_WIDEN_PLUS_HI
+	       || ifn == IFN_VEC_WIDEN_MINUS_LO
+	       || ifn == IFN_VEC_WIDEN_MINUS_HI)
+	{
+	  tree rhs1_type = TREE_TYPE (gimple_call_arg (stmt, 0));
+	  tree rhs2_type = TREE_TYPE (gimple_call_arg (stmt, 1));
+	  tree lhs_type = TREE_TYPE (gimple_get_lhs (stmt));
+	  if (TREE_CODE (lhs_type) == VECTOR_TYPE)
+	    {
+	      if (TREE_CODE (rhs1_type) != VECTOR_TYPE
+		  || TREE_CODE (rhs2_type) != VECTOR_TYPE)
+		{
+		  error ("invalid non-vector operands in vector IFN call");
+		  debug_generic_stmt (fn);
+		  return true;
+		}
+	      lhs_type = TREE_TYPE (lhs_type);
+	      rhs1_type = TREE_TYPE (rhs1_type);
+	      rhs2_type = TREE_TYPE (rhs2_type);
+	    }
+	  if (POINTER_TYPE_P (lhs_type)
+	      || POINTER_TYPE_P (rhs1_type)
+	      || POINTER_TYPE_P (rhs2_type))
+	    {
+	      error ("invalid (pointer) operands in vector IFN call");
+	      debug_generic_stmt (fn);
+	      return true;
+	    }
+	}
     }
   else
     {
diff --git a/gcc/tree-inline.cc b/gcc/tree-inline.cc
index 63a19f8d1d89c6bd5d8e55a299cbffaa324b4b84..d74d8db2173b1ab117250fea89de5212d5e354ec 100644
--- a/gcc/tree-inline.cc
+++ b/gcc/tree-inline.cc
@@ -4433,7 +4433,20 @@ estimate_num_insns (gimple *stmt, eni_weights *weights)
 	tree decl;
 
 	if (gimple_call_internal_p (stmt))
-	  return 0;
+	  {
+	    internal_fn fn = gimple_call_internal_fn (stmt);
+	    switch (fn)
+	      {
+	      case IFN_VEC_WIDEN_PLUS_HI:
+	      case IFN_VEC_WIDEN_PLUS_LO:
+	      case IFN_VEC_WIDEN_MINUS_HI:
+	      case IFN_VEC_WIDEN_MINUS_LO:
+		return 1;
+
+	      default:
+		return 0;
+	      }
+	  }
 	else if ((decl = gimple_call_fndecl (stmt))
 		 && fndecl_built_in_p (decl))
 	  {
diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index 1778af0242898e3dc73d94d22a5b8505628a53b5..93cebc72beb4f65249a69b2665dfeb8a0991c1d1 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -562,21 +562,30 @@ vect_joust_widened_type (tree type, tree new_type, tree *common_type)
 
 static unsigned int
 vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code,
-		      tree_code widened_code, bool shift_p,
+		      code_helper widened_code, bool shift_p,
 		      unsigned int max_nops,
 		      vect_unpromoted_value *unprom, tree *common_type,
 		      enum optab_subtype *subtype = NULL)
 {
   /* Check for an integer operation with the right code.  */
-  gassign *assign = dyn_cast <gassign *> (stmt_info->stmt);
-  if (!assign)
+  gimple* stmt = stmt_info->stmt;
+  if (!(is_gimple_assign (stmt) || is_gimple_call (stmt)))
+    return 0;
+
+  code_helper rhs_code;
+  if (is_gimple_assign (stmt))
+    rhs_code = gimple_assign_rhs_code (stmt);
+  else if (is_gimple_call (stmt))
+    rhs_code = gimple_call_combined_fn (stmt);
+  else
     return 0;
 
-  tree_code rhs_code = gimple_assign_rhs_code (assign);
-  if (rhs_code != code && rhs_code != widened_code)
+  if (rhs_code != code
+      && rhs_code != widened_code)
     return 0;
 
-  tree type = TREE_TYPE (gimple_assign_lhs (assign));
+  tree lhs = gimple_get_lhs (stmt);
+  tree type = TREE_TYPE (lhs);
   if (!INTEGRAL_TYPE_P (type))
     return 0;
 
@@ -589,7 +598,7 @@ vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code,
     {
       vect_unpromoted_value *this_unprom = &unprom[next_op];
       unsigned int nops = 1;
-      tree op = gimple_op (assign, i + 1);
+      tree op = gimple_arg (stmt, i);
       if (i == 1 && TREE_CODE (op) == INTEGER_CST)
 	{
 	  /* We already have a common type from earlier operands.
@@ -1343,7 +1352,8 @@ vect_recog_sad_pattern (vec_info *vinfo,
   /* FORNOW.  Can continue analyzing the def-use chain when this stmt in a phi
      inside the loop (in case we are analyzing an outer-loop).  */
   vect_unpromoted_value unprom[2];
-  if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR, WIDEN_MINUS_EXPR,
+  if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR,
+			     IFN_VEC_WIDEN_MINUS_HILO,
 			     false, 2, unprom, &half_type))
     return NULL;
 
@@ -1395,14 +1405,16 @@ static gimple *
 vect_recog_widen_op_pattern (vec_info *vinfo,
 			     stmt_vec_info last_stmt_info, tree *type_out,
 			     tree_code orig_code, code_helper wide_code,
-			     bool shift_p, const char *name)
+			     bool shift_p, const char *name,
+			     optab_subtype *subtype = NULL)
 {
   gimple *last_stmt = last_stmt_info->stmt;
 
   vect_unpromoted_value unprom[2];
   tree half_type;
   if (!vect_widened_op_tree (vinfo, last_stmt_info, orig_code, orig_code,
-			     shift_p, 2, unprom, &half_type))
+			     shift_p, 2, unprom, &half_type, subtype))
+
     return NULL;
 
   /* Pattern detected.  */
@@ -1468,6 +1480,20 @@ vect_recog_widen_op_pattern (vec_info *vinfo,
 			      type, pattern_stmt, vecctype);
 }
 
+static gimple *
+vect_recog_widen_op_pattern (vec_info *vinfo,
+			     stmt_vec_info last_stmt_info, tree *type_out,
+			     tree_code orig_code, internal_fn wide_ifn,
+			     bool shift_p, const char *name,
+			     optab_subtype *subtype = NULL)
+{
+  combined_fn ifn = as_combined_fn (wide_ifn);
+  return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
+				      orig_code, ifn, shift_p, name,
+				      subtype);
+}
+
+
 /* Try to detect multiplication on widened inputs, converting MULT_EXPR
    to WIDEN_MULT_EXPR.  See vect_recog_widen_op_pattern for details.  */
 
@@ -1481,26 +1507,30 @@ vect_recog_widen_mult_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
 }
 
 /* Try to detect addition on widened inputs, converting PLUS_EXPR
-   to WIDEN_PLUS_EXPR.  See vect_recog_widen_op_pattern for details.  */
+   to IFN_VEC_WIDEN_PLUS_HILO.  See vect_recog_widen_op_pattern for details.  */
 
 static gimple *
 vect_recog_widen_plus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
 			       tree *type_out)
 {
+  optab_subtype subtype;
   return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
-				      PLUS_EXPR, WIDEN_PLUS_EXPR, false,
-				      "vect_recog_widen_plus_pattern");
+				      PLUS_EXPR, IFN_VEC_WIDEN_PLUS_HILO,
+				      false, "vect_recog_widen_plus_pattern",
+				      &subtype);
 }
 
 /* Try to detect subtraction on widened inputs, converting MINUS_EXPR
-   to WIDEN_MINUS_EXPR.  See vect_recog_widen_op_pattern for details.  */
+   to IFN_VEC_WIDEN_MINUS_HILO.  See vect_recog_widen_op_pattern for details.  */
 static gimple *
 vect_recog_widen_minus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
 			       tree *type_out)
 {
+  optab_subtype subtype;
   return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
-				      MINUS_EXPR, WIDEN_MINUS_EXPR, false,
-				      "vect_recog_widen_minus_pattern");
+				      MINUS_EXPR, IFN_VEC_WIDEN_MINUS_HILO,
+				      false, "vect_recog_widen_minus_pattern",
+				      &subtype);
 }
 
 /* Function vect_recog_ctz_ffs_pattern
@@ -3078,7 +3108,7 @@ vect_recog_average_pattern (vec_info *vinfo,
   vect_unpromoted_value unprom[3];
   tree new_type;
   unsigned int nops = vect_widened_op_tree (vinfo, plus_stmt_info, PLUS_EXPR,
-					    WIDEN_PLUS_EXPR, false, 3,
+					    IFN_VEC_WIDEN_PLUS_HILO, false, 3,
 					    unprom, &new_type);
   if (nops == 0)
     return NULL;
@@ -6469,6 +6499,7 @@ static vect_recog_func vect_vect_recog_func_ptrs[] = {
   { vect_recog_mask_conversion_pattern, "mask_conversion" },
   { vect_recog_widen_plus_pattern, "widen_plus" },
   { vect_recog_widen_minus_pattern, "widen_minus" },
+  /* These must come after the double widening ones.  */
 };
 
 const unsigned int NUM_PATTERNS = ARRAY_SIZE (vect_vect_recog_func_ptrs);
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index d152ae9ab10b361b88c0f839d6951c43b954750a..24c811ebe01fb8b003100dea494cf64fea72a975 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -5038,7 +5038,9 @@ vectorizable_conversion (vec_info *vinfo,
   bool widen_arith = (code == WIDEN_PLUS_EXPR
 		 || code == WIDEN_MINUS_EXPR
 		 || code == WIDEN_MULT_EXPR
-		 || code == WIDEN_LSHIFT_EXPR);
+		 || code == WIDEN_LSHIFT_EXPR
+		 || code == IFN_VEC_WIDEN_PLUS_HILO
+		 || code == IFN_VEC_WIDEN_MINUS_HILO);
 
   if (!widen_arith
       && !CONVERT_EXPR_CODE_P (code)
@@ -5088,7 +5090,9 @@ vectorizable_conversion (vec_info *vinfo,
       gcc_assert (code == WIDEN_MULT_EXPR
 		  || code == WIDEN_LSHIFT_EXPR
 		  || code == WIDEN_PLUS_EXPR
-		  || code == WIDEN_MINUS_EXPR);
+		  || code == WIDEN_MINUS_EXPR
+		  || code == IFN_VEC_WIDEN_PLUS_HILO
+		  || code == IFN_VEC_WIDEN_MINUS_HILO);
 
 
       op1 = is_gimple_assign (stmt) ? gimple_assign_rhs2 (stmt) :
@@ -12478,10 +12482,43 @@ supportable_widening_operation (vec_info *vinfo,
       optab1 = vec_unpacks_sbool_lo_optab;
       optab2 = vec_unpacks_sbool_hi_optab;
     }
-  else
+
+  if (code.is_fn_code ())
+    {
+      internal_fn ifn = as_internal_fn ((combined_fn) code);
+      gcc_assert (decomposes_to_hilo_fn_p (ifn));
+
+      internal_fn lo, hi;
+      lookup_hilo_internal_fn (ifn, &lo, &hi);
+      *code1 = as_combined_fn (lo);
+      *code2 = as_combined_fn (hi);
+      optab1 = direct_internal_fn_optab (lo, {vectype, vectype});
+      optab2 = direct_internal_fn_optab (hi, {vectype, vectype});
+    }
+  else if (code.is_tree_code ())
     {
-      optab1 = optab_for_tree_code (c1, vectype, optab_default);
-      optab2 = optab_for_tree_code (c2, vectype, optab_default);
+      if (code == FIX_TRUNC_EXPR)
+	{
+	  /* The signedness is determined from output operand.  */
+	  optab1 = optab_for_tree_code (c1, vectype_out, optab_default);
+	  optab2 = optab_for_tree_code (c2, vectype_out, optab_default);
+	}
+      else if (CONVERT_EXPR_CODE_P ((tree_code) code.safe_as_tree_code ())
+	       && VECTOR_BOOLEAN_TYPE_P (wide_vectype)
+	       && VECTOR_BOOLEAN_TYPE_P (vectype)
+	       && TYPE_MODE (wide_vectype) == TYPE_MODE (vectype)
+	       && SCALAR_INT_MODE_P (TYPE_MODE (vectype)))
+	{
+	  /* If the input and result modes are the same, a different optab
+	     is needed where we pass in the number of units in vectype.  */
+	  optab1 = vec_unpacks_sbool_lo_optab;
+	  optab2 = vec_unpacks_sbool_hi_optab;
+	}
+      else
+	{
+	  optab1 = optab_for_tree_code (c1, vectype, optab_default);
+	  optab2 = optab_for_tree_code (c2, vectype, optab_default);
+	}
     }
 
   if (!optab1 || !optab2)
diff --git a/gcc/tree.def b/gcc/tree.def
index 90ceeec0b512bfa5f983359c0af03cc71de32007..b37b0b35927b92a6536e5c2d9805ffce8319a240 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -1374,15 +1374,16 @@ DEFTREECODE (DOT_PROD_EXPR, "dot_prod_expr", tcc_expression, 3)
 DEFTREECODE (WIDEN_SUM_EXPR, "widen_sum_expr", tcc_binary, 2)
 
 /* Widening sad (sum of absolute differences).
-   The first two arguments are of type t1 which should be integer.
-   The third argument and the result are of type t2, such that t2 is at least
-   twice the size of t1.  Like DOT_PROD_EXPR, SAD_EXPR (arg1,arg2,arg3) is
+   The first two arguments are of type t1 which should be a vector of integers.
+   The third argument and the result are of type t2, such that the size of
+   the elements of t2 is at least twice the size of the elements of t1.
+   Like DOT_PROD_EXPR, SAD_EXPR (arg1,arg2,arg3) is
    equivalent to:
-       tmp = WIDEN_MINUS_EXPR (arg1, arg2)
+       tmp = IFN_VEC_WIDEN_MINUS_HILO (arg1, arg2)
        tmp2 = ABS_EXPR (tmp)
        arg3 = PLUS_EXPR (tmp2, arg3)
   or:
-       tmp = WIDEN_MINUS_EXPR (arg1, arg2)
+       tmp = IFN_VEC_WIDEN_MINUS_HILO (arg1, arg2)
        tmp2 = ABS_EXPR (tmp)
        arg3 = WIDEN_SUM_EXPR (tmp2, arg3)
  */


* Re: [PATCH 3/3] Remove widen_plus/minus_expr tree codes
  2023-05-10  9:15                                 ` Andre Vieira (lists)
@ 2023-05-12 12:18                                   ` Andre Vieira (lists)
  0 siblings, 0 replies; 53+ messages in thread
From: Andre Vieira (lists) @ 2023-05-12 12:18 UTC (permalink / raw)
  To: Richard Biener; +Cc: Richard Biener, Richard Sandiford, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1583 bytes --]

I moved the 'changes' from this patch back into the second one, so this
patch is now purely about removing code that we no longer use. I don't
really know why Joel split the patches this way, but I thought I'd keep
the split as is for now.

cover letter:

This patch removes the old widen plus/minus tree codes which have been
replaced by internal functions.
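
For readers less familiar with the operations involved, here is a rough
scalar model of what the _LO/_HI halves of a widening add compute. This is
only an illustrative sketch, not code from the patch: the function names
model_widen_plus_lo and model_widen_plus_hi are hypothetical, and it
assumes that "lo" means the first n/2 elements, whereas the real lane
order is target-dependent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of the "lo" half of a widening add:
   widen the first n/2 elements of A and B to 16 bits and add them.  */
static void
model_widen_plus_lo (const uint8_t *a, const uint8_t *b,
		     uint16_t *out, size_t n)
{
  assert (n % 2 == 0);
  for (size_t i = 0; i < n / 2; i++)
    out[i] = (uint16_t) (a[i] + b[i]);
}

/* Hypothetical scalar model of the "hi" half: the same operation on
   the last n/2 elements.  */
static void
model_widen_plus_hi (const uint8_t *a, const uint8_t *b,
		     uint16_t *out, size_t n)
{
  assert (n % 2 == 0);
  for (size_t i = 0; i < n / 2; i++)
    out[i] = (uint16_t) (a[n / 2 + i] + b[n / 2 + i]);
}
```

Each half yields n/2 results of twice the element width, so sums that
would overflow the narrow type (e.g. 250 + 10) are kept intact.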

gcc/ChangeLog:

2023-05-12  Andre Vieira  <andre.simoesdiasvieira@arm.com>
             Joel Hutton  <joel.hutton@arm.com>

	* cfgexpand.cc (expand_debug_expr): Remove old tree codes.
	* doc/generic.texi: Likewise.
	* expr.cc (expand_expr_real_2): Likewise.
	* gimple-pretty-print.cc (dump_binary_rhs): Likewise.
	* gimple-range-op.cc (gimple_range_op_handler::maybe_non_standard):
	Likewise.
	* optabs-tree.cc (optab_for_tree_code): Likewise.
	(supportable_half_widening_operation): Likewise.
	* optabs.cc (commutative_optab_p): Likewise.
	* optabs.def (OPTAB_D): Likewise.
	* tree-cfg.cc (verify_gimple_assign_binary): Likewise.
	* tree-inline.cc (estimate_operator_cost): Likewise.
	(op_symbol_code): Likewise.
	* tree-pretty-print.cc (dump_generic_node): Remove tree code definition.
	* tree-vect-data-refs.cc (vect_get_smallest_scalar_type): Likewise.
	(vect_analyze_data_ref_accesses): Likewise.
	* tree-vect-generic.cc (expand_vector_operations_1): Likewise.
	* tree-vect-stmts.cc (vectorizable_conversion): Likewise.
	(supportable_widening_operation): Likewise.
	* tree.def (WIDEN_PLUS_EXPR, WIDEN_MINUS_EXPR, VEC_WIDEN_PLUS_HI_EXPR,
	VEC_WIDEN_PLUS_LO_EXPR, VEC_WIDEN_MINUS_HI_EXPR,
	VEC_WIDEN_MINUS_LO_EXPR): Likewise.

[-- Attachment #2: ifn2v3.patch --]
[-- Type: text/plain, Size: 17304 bytes --]

diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
index 1a1b26b1c6c23ce273bcd08dc9a973f777174007..25b1558dcb941ea491a19aeeb2cd8f4d2dbdf7c6 100644
--- a/gcc/cfgexpand.cc
+++ b/gcc/cfgexpand.cc
@@ -5365,10 +5365,6 @@ expand_debug_expr (tree exp)
     case VEC_WIDEN_MULT_ODD_EXPR:
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
-    case VEC_WIDEN_PLUS_HI_EXPR:
-    case VEC_WIDEN_PLUS_LO_EXPR:
-    case VEC_WIDEN_MINUS_HI_EXPR:
-    case VEC_WIDEN_MINUS_LO_EXPR:
     case VEC_PERM_EXPR:
     case VEC_DUPLICATE_EXPR:
     case VEC_SERIES_EXPR:
@@ -5405,8 +5401,6 @@ expand_debug_expr (tree exp)
     case WIDEN_MULT_EXPR:
     case WIDEN_MULT_PLUS_EXPR:
     case WIDEN_MULT_MINUS_EXPR:
-    case WIDEN_PLUS_EXPR:
-    case WIDEN_MINUS_EXPR:
       if (SCALAR_INT_MODE_P (GET_MODE (op0))
 	  && SCALAR_INT_MODE_P (mode))
 	{
@@ -5419,10 +5413,6 @@ expand_debug_expr (tree exp)
 	    op1 = simplify_gen_unary (ZERO_EXTEND, mode, op1, inner_mode);
 	  else
 	    op1 = simplify_gen_unary (SIGN_EXTEND, mode, op1, inner_mode);
-	  if (TREE_CODE (exp) == WIDEN_PLUS_EXPR)
-	    return simplify_gen_binary (PLUS, mode, op0, op1);
-	  else if (TREE_CODE (exp) == WIDEN_MINUS_EXPR)
-	    return simplify_gen_binary (MINUS, mode, op0, op1);
 	  op0 = simplify_gen_binary (MULT, mode, op0, op1);
 	  if (TREE_CODE (exp) == WIDEN_MULT_EXPR)
 	    return op0;
diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi
index 0fd7e6cce8bbd4ecb8027b702722adcf6c32eb55..a23d57af20610e0bb4809f06fb0c91253ae56d11 100644
--- a/gcc/doc/generic.texi
+++ b/gcc/doc/generic.texi
@@ -1815,10 +1815,6 @@ a value from @code{enum annot_expr_kind}, the third is an @code{INTEGER_CST}.
 @tindex IFN_VEC_WIDEN_PLUS_LO
 @tindex IFN_VEC_WIDEN_MINUS_HI
 @tindex IFN_VEC_WIDEN_MINUS_LO
-@tindex VEC_WIDEN_PLUS_HI_EXPR
-@tindex VEC_WIDEN_PLUS_LO_EXPR
-@tindex VEC_WIDEN_MINUS_HI_EXPR
-@tindex VEC_WIDEN_MINUS_LO_EXPR
 @tindex VEC_UNPACK_HI_EXPR
 @tindex VEC_UNPACK_LO_EXPR
 @tindex VEC_UNPACK_FLOAT_HI_EXPR
@@ -1892,33 +1888,6 @@ vector of @code{N/2} products.  In the case of
 vector are subtracted from the low @code{N/2} of the first to produce the
 vector of @code{N/2} products.
 
-@item VEC_WIDEN_PLUS_HI_EXPR
-@itemx VEC_WIDEN_PLUS_LO_EXPR
-These nodes represent widening vector addition of the high and low parts of
-the two input vectors, respectively.  Their operands are vectors that contain
-the same number of elements (@code{N}) of the same integral type. The result
-is a vector that contains half as many elements, of an integral type whose size
-is twice as wide.  In the case of @code{VEC_WIDEN_PLUS_HI_EXPR} the high
-@code{N/2} elements of the two vectors are added to produce the vector of
-@code{N/2} products.  In the case of @code{VEC_WIDEN_PLUS_LO_EXPR} the low
-@code{N/2} elements of the two vectors are added to produce the vector of
-@code{N/2} products.
-
-@item VEC_WIDEN_MINUS_HI_EXPR
-@itemx VEC_WIDEN_MINUS_LO_EXPR
-These nodes represent widening vector subtraction of the high and low parts of
-the two input vectors, respectively.  Their operands are vectors that contain
-the same number of elements (@code{N}) of the same integral type. The high/low
-elements of the second vector are subtracted from the high/low elements of the
-first. The result is a vector that contains half as many elements, of an
-integral type whose size is twice as wide.  In the case of
-@code{VEC_WIDEN_MINUS_HI_EXPR} the high @code{N/2} elements of the second
-vector are subtracted from the high @code{N/2} of the first to produce the
-vector of @code{N/2} products.  In the case of
-@code{VEC_WIDEN_MINUS_LO_EXPR} the low @code{N/2} elements of the second
-vector are subtracted from the low @code{N/2} of the first to produce the
-vector of @code{N/2} products.
-
 @item VEC_UNPACK_HI_EXPR
 @itemx VEC_UNPACK_LO_EXPR
 These nodes represent unpacking of the high and low parts of the input vector,
diff --git a/gcc/expr.cc b/gcc/expr.cc
index 758dda9ec68a8ba7a7b0e247aee50fd7996aa1d7..dd03688167b04299be213a5c379876499cb6a317 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -9600,8 +9600,6 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
 					  target, unsignedp);
       return target;
 
-    case WIDEN_PLUS_EXPR:
-    case WIDEN_MINUS_EXPR:
     case WIDEN_MULT_EXPR:
       /* If first operand is constant, swap them.
 	 Thus the following special case checks need only
@@ -10379,10 +10377,6 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
 	return temp;
       }
 
-    case VEC_WIDEN_PLUS_HI_EXPR:
-    case VEC_WIDEN_PLUS_LO_EXPR:
-    case VEC_WIDEN_MINUS_HI_EXPR:
-    case VEC_WIDEN_MINUS_LO_EXPR:
     case VEC_WIDEN_MULT_HI_EXPR:
     case VEC_WIDEN_MULT_LO_EXPR:
     case VEC_WIDEN_MULT_EVEN_EXPR:
diff --git a/gcc/gimple-pretty-print.cc b/gcc/gimple-pretty-print.cc
index e46f7d5f55a31bf6453cd33683aa536f7fbe606f..8db221f65fe7e2fc1ce25685240f11516af87fe6 100644
--- a/gcc/gimple-pretty-print.cc
+++ b/gcc/gimple-pretty-print.cc
@@ -459,10 +459,6 @@ dump_binary_rhs (pretty_printer *buffer, const gassign *gs, int spc,
     case VEC_PACK_FLOAT_EXPR:
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
-    case VEC_WIDEN_PLUS_HI_EXPR:
-    case VEC_WIDEN_PLUS_LO_EXPR:
-    case VEC_WIDEN_MINUS_HI_EXPR:
-    case VEC_WIDEN_MINUS_LO_EXPR:
     case VEC_SERIES_EXPR:
       for (p = get_tree_code_name (code); *p; p++)
 	pp_character (buffer, TOUPPER (*p));
diff --git a/gcc/gimple-range-op.cc b/gcc/gimple-range-op.cc
index 66636d82df27626e7911efd0cb8526921b39633f..466985bfd39a147d47ac525b7fe9bc3fd2d0b7b3 100644
--- a/gcc/gimple-range-op.cc
+++ b/gcc/gimple-range-op.cc
@@ -1191,12 +1191,6 @@ gimple_range_op_handler::maybe_non_standard ()
   if (gimple_code (m_stmt) == GIMPLE_ASSIGN)
     switch (gimple_assign_rhs_code (m_stmt))
       {
-	case WIDEN_PLUS_EXPR:
-	{
-	  signed_op = ptr_op_widen_plus_signed;
-	  unsigned_op = ptr_op_widen_plus_unsigned;
-	}
-	gcc_fallthrough ();
 	case WIDEN_MULT_EXPR:
 	{
 	  m_valid = false;
diff --git a/gcc/optabs-tree.cc b/gcc/optabs-tree.cc
index 8010046c6a8b3e809c989ddef7a06ddaa68ae32a..ee1aa8c9676ee9c67edbf403e6295da391826a62 100644
--- a/gcc/optabs-tree.cc
+++ b/gcc/optabs-tree.cc
@@ -190,22 +190,6 @@ optab_for_tree_code (enum tree_code code, const_tree type,
       return (TYPE_UNSIGNED (type)
 	      ? vec_widen_ushiftl_lo_optab : vec_widen_sshiftl_lo_optab);
 
-    case VEC_WIDEN_PLUS_LO_EXPR:
-      return (TYPE_UNSIGNED (type)
-	      ? vec_widen_uaddl_lo_optab : vec_widen_saddl_lo_optab);
-
-    case VEC_WIDEN_PLUS_HI_EXPR:
-      return (TYPE_UNSIGNED (type)
-	      ? vec_widen_uaddl_hi_optab : vec_widen_saddl_hi_optab);
-
-    case VEC_WIDEN_MINUS_LO_EXPR:
-      return (TYPE_UNSIGNED (type)
-	      ? vec_widen_usubl_lo_optab : vec_widen_ssubl_lo_optab);
-
-    case VEC_WIDEN_MINUS_HI_EXPR:
-      return (TYPE_UNSIGNED (type)
-	      ? vec_widen_usubl_hi_optab : vec_widen_ssubl_hi_optab);
-
     case VEC_UNPACK_HI_EXPR:
       return (TYPE_UNSIGNED (type)
 	      ? vec_unpacku_hi_optab : vec_unpacks_hi_optab);
@@ -312,8 +296,6 @@ optab_for_tree_code (enum tree_code code, const_tree type,
    'hi'/'lo' pair using codes such as VEC_WIDEN_MINUS_HI/LO.
 
    Supported widening operations:
-    WIDEN_MINUS_EXPR
-    WIDEN_PLUS_EXPR
     WIDEN_MULT_EXPR
     WIDEN_LSHIFT_EXPR
 
@@ -345,12 +327,6 @@ supportable_half_widening_operation (enum tree_code code, tree vectype_out,
     case WIDEN_LSHIFT_EXPR:
       *code1 = LSHIFT_EXPR;
       break;
-    case WIDEN_MINUS_EXPR:
-      *code1 = MINUS_EXPR;
-      break;
-    case WIDEN_PLUS_EXPR:
-      *code1 = PLUS_EXPR;
-      break;
     case WIDEN_MULT_EXPR:
       *code1 = MULT_EXPR;
       break;
diff --git a/gcc/optabs.cc b/gcc/optabs.cc
index 5a08d91e550b2d92e9572211f811fdba99a33a38..4309733a39be3d2a82dd2b13a50d73e6ddc2e0ff 100644
--- a/gcc/optabs.cc
+++ b/gcc/optabs.cc
@@ -1315,10 +1315,6 @@ commutative_optab_p (optab binoptab)
 	  || binoptab == umul_widen_optab
 	  || binoptab == smul_highpart_optab
 	  || binoptab == umul_highpart_optab
-	  || binoptab == vec_widen_saddl_hi_optab
-	  || binoptab == vec_widen_saddl_lo_optab
-	  || binoptab == vec_widen_uaddl_hi_optab
-	  || binoptab == vec_widen_uaddl_lo_optab
 	  || binoptab == vec_widen_sadd_hi_optab
 	  || binoptab == vec_widen_sadd_lo_optab
 	  || binoptab == vec_widen_uadd_hi_optab
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 16d121722c8c5723d9b164f5a2c616dc7ec143de..d4b3befdb822b98f12a9a440261f1b8e81432639 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -406,10 +406,6 @@ OPTAB_D (vec_widen_smult_even_optab, "vec_widen_smult_even_$a")
 OPTAB_D (vec_widen_smult_hi_optab, "vec_widen_smult_hi_$a")
 OPTAB_D (vec_widen_smult_lo_optab, "vec_widen_smult_lo_$a")
 OPTAB_D (vec_widen_smult_odd_optab, "vec_widen_smult_odd_$a")
-OPTAB_D (vec_widen_ssubl_hi_optab, "vec_widen_ssubl_hi_$a")
-OPTAB_D (vec_widen_ssubl_lo_optab, "vec_widen_ssubl_lo_$a")
-OPTAB_D (vec_widen_saddl_hi_optab, "vec_widen_saddl_hi_$a")
-OPTAB_D (vec_widen_saddl_lo_optab, "vec_widen_saddl_lo_$a")
 OPTAB_D (vec_widen_ssub_hi_optab, "vec_widen_ssub_hi_$a")
 OPTAB_D (vec_widen_ssub_lo_optab, "vec_widen_ssub_lo_$a")
 OPTAB_D (vec_widen_sadd_hi_optab, "vec_widen_sadd_hi_$a")
@@ -422,10 +418,6 @@ OPTAB_D (vec_widen_umult_lo_optab, "vec_widen_umult_lo_$a")
 OPTAB_D (vec_widen_umult_odd_optab, "vec_widen_umult_odd_$a")
 OPTAB_D (vec_widen_ushiftl_hi_optab, "vec_widen_ushiftl_hi_$a")
 OPTAB_D (vec_widen_ushiftl_lo_optab, "vec_widen_ushiftl_lo_$a")
-OPTAB_D (vec_widen_usubl_hi_optab, "vec_widen_usubl_hi_$a")
-OPTAB_D (vec_widen_usubl_lo_optab, "vec_widen_usubl_lo_$a")
-OPTAB_D (vec_widen_uaddl_hi_optab, "vec_widen_uaddl_hi_$a")
-OPTAB_D (vec_widen_uaddl_lo_optab, "vec_widen_uaddl_lo_$a")
 OPTAB_D (vec_widen_usub_hi_optab, "vec_widen_usub_hi_$a")
 OPTAB_D (vec_widen_usub_lo_optab, "vec_widen_usub_lo_$a")
 OPTAB_D (vec_widen_uadd_hi_optab, "vec_widen_uadd_hi_$a")
diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
index 28464ad9e3a7ea25557ffebcdbdbc1340f9e0d8b..dc28f5bbfa6272a92b68489fe67446bd3eba0caf 100644
--- a/gcc/tree-cfg.cc
+++ b/gcc/tree-cfg.cc
@@ -4068,8 +4068,6 @@ verify_gimple_assign_binary (gassign *stmt)
         return false;
       }
 
-    case WIDEN_PLUS_EXPR:
-    case WIDEN_MINUS_EXPR:
     case PLUS_EXPR:
     case MINUS_EXPR:
       {
@@ -4190,10 +4188,6 @@ verify_gimple_assign_binary (gassign *stmt)
         return false;
       }
 
-    case VEC_WIDEN_MINUS_HI_EXPR:
-    case VEC_WIDEN_MINUS_LO_EXPR:
-    case VEC_WIDEN_PLUS_HI_EXPR:
-    case VEC_WIDEN_PLUS_LO_EXPR:
     case VEC_WIDEN_MULT_HI_EXPR:
     case VEC_WIDEN_MULT_LO_EXPR:
     case VEC_WIDEN_MULT_EVEN_EXPR:
diff --git a/gcc/tree-inline.cc b/gcc/tree-inline.cc
index d74d8db2173b1ab117250fea89de5212d5e354ec..7b056c7dc7e173b0bc9981a5c98f0d50685b6b66 100644
--- a/gcc/tree-inline.cc
+++ b/gcc/tree-inline.cc
@@ -4273,8 +4273,6 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
 
     case REALIGN_LOAD_EXPR:
 
-    case WIDEN_PLUS_EXPR:
-    case WIDEN_MINUS_EXPR:
     case WIDEN_SUM_EXPR:
     case WIDEN_MULT_EXPR:
     case DOT_PROD_EXPR:
@@ -4283,10 +4281,6 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
     case WIDEN_MULT_MINUS_EXPR:
     case WIDEN_LSHIFT_EXPR:
 
-    case VEC_WIDEN_PLUS_HI_EXPR:
-    case VEC_WIDEN_PLUS_LO_EXPR:
-    case VEC_WIDEN_MINUS_HI_EXPR:
-    case VEC_WIDEN_MINUS_LO_EXPR:
     case VEC_WIDEN_MULT_HI_EXPR:
     case VEC_WIDEN_MULT_LO_EXPR:
     case VEC_WIDEN_MULT_EVEN_EXPR:
diff --git a/gcc/tree-pretty-print.cc b/gcc/tree-pretty-print.cc
index 7947f9647a15110b52d195643ad7d28ee32d4236..9941d8bf80535a98e647b8928619a6bf08bc434c 100644
--- a/gcc/tree-pretty-print.cc
+++ b/gcc/tree-pretty-print.cc
@@ -2874,8 +2874,6 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags,
       break;
 
       /* Binary arithmetic and logic expressions.  */
-    case WIDEN_PLUS_EXPR:
-    case WIDEN_MINUS_EXPR:
     case WIDEN_SUM_EXPR:
     case WIDEN_MULT_EXPR:
     case MULT_EXPR:
@@ -3831,10 +3829,6 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags,
     case VEC_SERIES_EXPR:
     case VEC_WIDEN_MULT_HI_EXPR:
     case VEC_WIDEN_MULT_LO_EXPR:
-    case VEC_WIDEN_PLUS_HI_EXPR:
-    case VEC_WIDEN_PLUS_LO_EXPR:
-    case VEC_WIDEN_MINUS_HI_EXPR:
-    case VEC_WIDEN_MINUS_LO_EXPR:
     case VEC_WIDEN_MULT_EVEN_EXPR:
     case VEC_WIDEN_MULT_ODD_EXPR:
     case VEC_WIDEN_LSHIFT_HI_EXPR:
@@ -4352,12 +4346,6 @@ op_symbol_code (enum tree_code code)
     case WIDEN_LSHIFT_EXPR:
       return "w<<";
 
-    case WIDEN_PLUS_EXPR:
-      return "w+";
-
-    case WIDEN_MINUS_EXPR:
-      return "w-";
-
     case POINTER_PLUS_EXPR:
       return "+";
 
diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
index 6721ab6efc4f029be8e2315c31ba87d94230cda5..68b29ee4661a0d08cf8f7048c23f992d4440b08b 100644
--- a/gcc/tree-vect-data-refs.cc
+++ b/gcc/tree-vect-data-refs.cc
@@ -136,8 +136,6 @@ vect_get_smallest_scalar_type (stmt_vec_info stmt_info, tree scalar_type)
 	  || gimple_assign_rhs_code (assign) == WIDEN_SUM_EXPR
 	  || gimple_assign_rhs_code (assign) == WIDEN_MULT_EXPR
 	  || gimple_assign_rhs_code (assign) == WIDEN_LSHIFT_EXPR
-	  || gimple_assign_rhs_code (assign) == WIDEN_PLUS_EXPR
-	  || gimple_assign_rhs_code (assign) == WIDEN_MINUS_EXPR
 	  || gimple_assign_rhs_code (assign) == FLOAT_EXPR)
 	{
 	  tree rhs_type = TREE_TYPE (gimple_assign_rhs1 (assign));
diff --git a/gcc/tree-vect-generic.cc b/gcc/tree-vect-generic.cc
index 59115b2e1629358e85cb770f6da04cc5a2adb27a..7f55966310cee67238b2561e333ea45ee6153d9a 100644
--- a/gcc/tree-vect-generic.cc
+++ b/gcc/tree-vect-generic.cc
@@ -2198,10 +2198,6 @@ expand_vector_operations_1 (gimple_stmt_iterator *gsi,
      arguments, not the widened result.  VEC_UNPACK_FLOAT_*_EXPR is
      calculated in the same way above.  */
   if (code == WIDEN_SUM_EXPR
-      || code == VEC_WIDEN_PLUS_HI_EXPR
-      || code == VEC_WIDEN_PLUS_LO_EXPR
-      || code == VEC_WIDEN_MINUS_HI_EXPR
-      || code == VEC_WIDEN_MINUS_LO_EXPR
       || code == VEC_WIDEN_MULT_HI_EXPR
       || code == VEC_WIDEN_MULT_LO_EXPR
       || code == VEC_WIDEN_MULT_EVEN_EXPR
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 24c811ebe01fb8b003100dea494cf64fea72a975..7a818a3b7ad4c9e6b1f45abcc1e4fbd056aa1d29 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -5035,9 +5035,7 @@ vectorizable_conversion (vec_info *vinfo,
   else
     return false;
 
-  bool widen_arith = (code == WIDEN_PLUS_EXPR
-		 || code == WIDEN_MINUS_EXPR
-		 || code == WIDEN_MULT_EXPR
+  bool widen_arith = (code == WIDEN_MULT_EXPR
 		 || code == WIDEN_LSHIFT_EXPR
 		 || code == IFN_VEC_WIDEN_PLUS_HILO
 		 || code == IFN_VEC_WIDEN_MINUS_HILO);
@@ -5089,8 +5087,6 @@ vectorizable_conversion (vec_info *vinfo,
     {
       gcc_assert (code == WIDEN_MULT_EXPR
 		  || code == WIDEN_LSHIFT_EXPR
-		  || code == WIDEN_PLUS_EXPR
-		  || code == WIDEN_MINUS_EXPR
 		  || code == IFN_VEC_WIDEN_PLUS_HILO
 		  || code == IFN_VEC_WIDEN_MINUS_HILO);
 
@@ -12335,7 +12331,7 @@ supportable_widening_operation (vec_info *vinfo,
   class loop *vect_loop = NULL;
   machine_mode vec_mode;
   enum insn_code icode1, icode2;
-  optab optab1, optab2;
+  optab optab1 = unknown_optab, optab2 = unknown_optab;
   tree vectype = vectype_in;
   tree wide_vectype = vectype_out;
   tree_code c1 = MAX_TREE_CODES, c2 = MAX_TREE_CODES;
@@ -12433,16 +12429,6 @@ supportable_widening_operation (vec_info *vinfo,
       c2 = VEC_WIDEN_LSHIFT_HI_EXPR;
       break;
 
-    case WIDEN_PLUS_EXPR:
-      c1 = VEC_WIDEN_PLUS_LO_EXPR;
-      c2 = VEC_WIDEN_PLUS_HI_EXPR;
-      break;
-
-    case WIDEN_MINUS_EXPR:
-      c1 = VEC_WIDEN_MINUS_LO_EXPR;
-      c2 = VEC_WIDEN_MINUS_HI_EXPR;
-      break;
-
     CASE_CONVERT:
       c1 = VEC_UNPACK_LO_EXPR;
       c2 = VEC_UNPACK_HI_EXPR;
diff --git a/gcc/tree.def b/gcc/tree.def
index b37b0b35927b92a6536e5c2d9805ffce8319a240..1fc2ca7a7249d4767aa2448219bc21a8c650aeb4 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -1422,8 +1422,6 @@ DEFTREECODE (WIDEN_MULT_MINUS_EXPR, "widen_mult_minus_expr", tcc_expression, 3)
    the first argument from type t1 to type t2, and then shifting it
    by the second argument.  */
 DEFTREECODE (WIDEN_LSHIFT_EXPR, "widen_lshift_expr", tcc_binary, 2)
-DEFTREECODE (WIDEN_PLUS_EXPR, "widen_plus_expr", tcc_binary, 2)
-DEFTREECODE (WIDEN_MINUS_EXPR, "widen_minus_expr", tcc_binary, 2)
 
 /* Widening vector multiplication.
    The two operands are vectors with N elements of size S. Multiplying the
@@ -1488,10 +1486,6 @@ DEFTREECODE (VEC_PACK_FLOAT_EXPR, "vec_pack_float_expr", tcc_binary, 2)
  */
 DEFTREECODE (VEC_WIDEN_LSHIFT_HI_EXPR, "widen_lshift_hi_expr", tcc_binary, 2)
 DEFTREECODE (VEC_WIDEN_LSHIFT_LO_EXPR, "widen_lshift_lo_expr", tcc_binary, 2)
-DEFTREECODE (VEC_WIDEN_PLUS_HI_EXPR, "widen_plus_hi_expr", tcc_binary, 2)
-DEFTREECODE (VEC_WIDEN_PLUS_LO_EXPR, "widen_plus_lo_expr", tcc_binary, 2)
-DEFTREECODE (VEC_WIDEN_MINUS_HI_EXPR, "widen_minus_hi_expr", tcc_binary, 2)
-DEFTREECODE (VEC_WIDEN_MINUS_LO_EXPR, "widen_minus_lo_expr", tcc_binary, 2)
 
 /* PREDICT_EXPR.  Specify hint for branch prediction.  The
    PREDICT_EXPR_PREDICTOR specify predictor and PREDICT_EXPR_OUTCOME the

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 1/3] Refactor to allow internal_fn's
  2023-05-12 12:14                                     ` Andre Vieira (lists)
@ 2023-05-12 13:18                                       ` Richard Biener
  0 siblings, 0 replies; 53+ messages in thread
From: Richard Biener @ 2023-05-12 13:18 UTC (permalink / raw)
  To: Andre Vieira (lists); +Cc: Richard Biener, Richard Sandiford, gcc-patches

On Fri, 12 May 2023, Andre Vieira (lists) wrote:

> Hi,
> 
> I think I tackled all of your comments, let me know if I missed something.

This first and the last patch look good to me now.  Let me comment on the
second.

Thanks,
Richard.

> 
> gcc/ChangeLog:
> 
> 2023-05-12  Andre Vieira  <andre.simoesdiasvieira@arm.com>
>             Joel Hutton  <joel.hutton@arm.com>
> 
>         * tree-vect-patterns.cc (vect_gimple_build): New Function.
>         (vect_recog_widen_op_pattern): Refactor to use code_helper.
>         * tree-vect-stmts.cc (vect_gen_widened_results_half): Likewise.
>         (vect_create_vectorized_demotion_stmts): Likewise.
>         (vect_create_vectorized_promotion_stmts): Likewise.
>         (vect_create_half_widening_stmts): Likewise.
>         (vectorizable_conversion): Likewise.
>         (vectorizable_call): Likewise.
>         (supportable_widening_operation): Likewise.
>         (supportable_narrowing_operation): Likewise.
>         (simple_integer_narrowing): Likewise.
>         * tree-vectorizer.h (supportable_widening_operation): Likewise.
>         (supportable_narrowing_operation): Likewise.
>         (vect_gimple_build): New function prototype.
>         * tree.h (code_helper::safe_as_tree_code): New function.
>         (code_helper::safe_as_fn_code): New function.
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2/3] Refactor widen_plus as internal_fn
  2023-05-12 12:16                                   ` Andre Vieira (lists)
@ 2023-05-12 13:28                                     ` Richard Biener
  2023-05-12 13:55                                       ` Andre Vieira (lists)
  2023-05-12 14:01                                       ` Richard Sandiford
  0 siblings, 2 replies; 53+ messages in thread
From: Richard Biener @ 2023-05-12 13:28 UTC (permalink / raw)
  To: Andre Vieira (lists); +Cc: Richard Biener, gcc-patches, richard.sandiford

On Fri, 12 May 2023, Andre Vieira (lists) wrote:

> I have dealt with, I think..., most of your comments. There's quite a few
> changes, I think it's all a bit simpler now. I made some other changes to the
> costing in tree-inline.cc and gimple-range-op.cc in which I try to preserve
> the same behaviour as we had with the tree codes before. Also added some extra
> checks to tree-cfg.cc that made sense to me.
> 
> I am still regression testing the gimple-range-op change, as that was a last
> minute change, but the rest survived a bootstrap and regression test on
> aarch64-unknown-linux-gnu.
> 
> cover letter:
> 
> This patch replaces the existing tree_code widen_plus and widen_minus
> patterns with internal_fn versions.
> 
> DEF_INTERNAL_OPTAB_WIDENING_HILO_FN and DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
> are like DEF_INTERNAL_SIGNED_OPTAB_FN and DEF_INTERNAL_OPTAB_FN respectively
> except they provide convenience wrappers for defining conversions that require
> a hi/lo split.  Each definition for <NAME> will require optabs for _hi and _lo
> and each of those will also require a signed and unsigned version in the case
> of widening. The hi/lo pair is necessary because the widening and narrowing
> operations take n narrow elements as inputs and return n/2 wide elements as
> outputs. The 'lo' operation operates on the first n/2 elements of input. The
> 'hi' operation operates on the second n/2 elements of input. Defining an
> internal_fn along with hi/lo variations allows a single internal function to
> be returned from a vect_recog function that will later be expanded to hi/lo.
> 
> 
>  For example:
>  IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO
> for aarch64: IFN_VEC_WIDEN_PLUS_HI   -> vec_widen_<su>add_hi_<mode> ->
> (u/s)addl2
>                        IFN_VEC_WIDEN_PLUS_LO  -> vec_widen_<su>add_lo_<mode>
> -> (u/s)addl
> 
> This gives the same functionality as the previous WIDEN_PLUS/WIDEN_MINUS tree
> codes which are expanded into VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI.

What I still don't understand is why we are so narrowly focused on
HI/LO.  We need a combined scalar IFN for pattern selection (not
sure why that's now called _HILO, I expected no suffix).  Then there
are three ways the target can implement this:

 1) with a widen_[su]add<mode> instruction - I _think_ that's what
    RISCV is going to offer since it is a target where vector modes
    have "padding" (aka you cannot subreg a V2SI to get V4HI).  Instead
    RVV can do a V4HI to V4SI widening and widening add/subtract
    using vwadd[u] and vwsub[u] (the HI->SI widening is actually
    done with a widening add of zero - eh).
    IIRC GCN is the same here.
 2) with a widen_[su]add{_lo,_hi}<mode> combo - that's what the tree
    codes currently support (exclusively)
 3) similar, but widen_[su]add{_even,_odd}<mode>

that said, things like decomposes_to_hilo_fn_p look to paint us into
a 2) corner without good reason.
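
To make the three options concrete, here is a scalar model of what each
one computes for a signed 16-bit to 32-bit widening add.  This is only an
illustration and not GCC code: the function names, the element types and
the use of std::vector are all made up for the example, and "lo" is taken
to mean the first n/2 elements, as in the cover letter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using narrow_t = std::int16_t;
using wide_t = std::int32_t;
using narrow_vec = std::vector<narrow_t>;
using wide_vec = std::vector<wide_t>;

/* 1) Single operation: n narrow inputs give n wide outputs.  */
wide_vec
widen_add_full (const narrow_vec &a, const narrow_vec &b)
{
  assert (a.size () == b.size ());
  wide_vec r (a.size ());
  for (std::size_t i = 0; i < a.size (); ++i)
    r[i] = wide_t (a[i]) + wide_t (b[i]);
  return r;
}

/* 2) lo/hi pair: each call consumes one contiguous half of the inputs
   and produces n/2 wide outputs.  */
wide_vec
widen_add_half (const narrow_vec &a, const narrow_vec &b, bool hi)
{
  assert (a.size () == b.size () && a.size () % 2 == 0);
  const std::size_t half = a.size () / 2;
  const std::size_t base = hi ? half : 0;
  wide_vec r (half);
  for (std::size_t i = 0; i < half; ++i)
    r[i] = wide_t (a[base + i]) + wide_t (b[base + i]);
  return r;
}

/* 3) even/odd pair: each call consumes every second input element and
   produces n/2 wide outputs.  */
wide_vec
widen_add_parity (const narrow_vec &a, const narrow_vec &b, bool odd)
{
  assert (a.size () == b.size () && a.size () % 2 == 0);
  const std::size_t half = a.size () / 2;
  const std::size_t first = odd ? 1 : 0;
  wide_vec r (half);
  for (std::size_t i = 0; i < half; ++i)
    r[i] = wide_t (a[2 * i + first]) + wide_t (b[2 * i + first]);
  return r;
}
```

In this model, 2) and 3) both cover every input element once across
their two calls; they differ only in which elements land in which result
vector, which is why the pattern-level IFN need not care which one the
target provides.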

Richard.

> gcc/ChangeLog:
> 
> 2023-05-12  Andre Vieira  <andre.simoesdiasvieira@arm.com>
>             Joel Hutton  <joel.hutton@arm.com>
>             Tamar Christina  <tamar.christina@arm.com>
> 
>         * config/aarch64/aarch64-simd.md (vec_widen_<su>addl_lo_<mode>):
> Rename
>         this ...
>         (vec_widen_<su>add_lo_<mode>): ... to this.
>         (vec_widen_<su>addl_hi_<mode>): Rename this ...
>         (vec_widen_<su>add_hi_<mode>): ... to this.
>         (vec_widen_<su>subl_lo_<mode>): Rename this ...
>         (vec_widen_<su>sub_lo_<mode>): ... to this.
>         (vec_widen_<su>subl_hi_<mode>): Rename this ...
>         (vec_widen_<su>sub_hi_<mode>): ...to this.
>         * doc/generic.texi: Document new IFN codes.
> 	* internal-fn.cc (DEF_INTERNAL_OPTAB_WIDENING_HILO_FN): Macro to
> 	define an
>         internal_fn that expands into multiple internal_fns for widening.
>         (DEF_INTERNAL_OPTAB_NARROWING_HILO_FN): Likewise but for narrowing.
>  	(ifn_cmp): Function to compare ifn's for sorting/searching.
> 	(lookup_hilo_internal_fn): Add lookup function.
> 	(commutative_binary_fn_p): Add widen_plus fn's.
> 	(widening_fn_p): New function.
> 	(narrowing_fn_p): New function.
> 	(decomposes_to_hilo_fn_p): New function.
> 	         (direct_internal_fn_optab): Change visibility.
>     	* internal-fn.def (DEF_INTERNAL_OPTAB_WIDENING_HILO_FN): Define
>     widening
>     plus,minus functions.
> 	(VEC_WIDEN_PLUS): Replacement for VEC_WIDEN_PLUS_EXPR tree code.
> 	(VEC_WIDEN_MINUS): Replacement for VEC_WIDEN_MINUS_EXPR tree code.
> 	* internal-fn.h (GCC_INTERNAL_FN_H): Add headers.
> 	         (direct_internal_fn_optab): Declare new prototype.
> 	(lookup_hilo_internal_fn): Likewise.
> 	(widening_fn_p): Likewise.
> 	(Narrowing_fn_p): Likewise.
> 	(decomposes_to_hilo_fn_p): Likewise.
> 	* optabs.cc (commutative_optab_p): Add widening plus optabs.
> 	* optabs.def (OPTAB_D): Define widen add, sub optabs.
>         * tree-cfg.cc (verify_gimple_call): Add checks for new widen
>         add and sub IFNs.
>         * tree-inline.cc (estimate_num_insns): Return same
>         cost for widen add and sub IFNs as previous tree_codes.
>     	* tree-vect-patterns.cc (vect_recog_widen_op_pattern): Support
>     patterns with a hi/lo split.
>         (vect_recog_sad_pattern): Refactor to use new IFN codes.
>         (vect_recog_widen_plus_pattern): Likewise.
>         (vect_recog_widen_minus_pattern): Likewise.
>         (vect_recog_average_pattern): Likewise.
> 	* tree-vect-stmts.cc (vectorizable_conversion): Add support for
> 	         _HILO IFNs.
> 	(supportable_widening_operation): Likewise.
>         * tree.def (WIDEN_SUM_EXPR): Update example to use new IFNs.
> 
> gcc/testsuite/ChangeLog:
> 
>     	* gcc.target/aarch64/vect-widen-add.c: Test that new
>     IFN_VEC_WIDEN_PLUS is being used.
>     	* gcc.target/aarch64/vect-widen-sub.c: Test that new
>     IFN_VEC_WIDEN_MINUS is being used.
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2/3] Refactor widen_plus as internal_fn
  2023-05-12 13:28                                     ` Richard Biener
@ 2023-05-12 13:55                                       ` Andre Vieira (lists)
  2023-05-12 14:01                                       ` Richard Sandiford
  1 sibling, 0 replies; 53+ messages in thread
From: Andre Vieira (lists) @ 2023-05-12 13:55 UTC (permalink / raw)
  To: Richard Biener; +Cc: Richard Biener, gcc-patches, richard.sandiford



On 12/05/2023 14:28, Richard Biener wrote:
> On Fri, 12 May 2023, Andre Vieira (lists) wrote:
> 
>> I have dealt with, I think..., most of your comments. There's quite a few
>> changes, I think it's all a bit simpler now. I made some other changes to the
>> costing in tree-inline.cc and gimple-range-op.cc in which I try to preserve
>> the same behaviour as we had with the tree codes before. Also added some extra
>> checks to tree-cfg.cc that made sense to me.
>>
>> I am still regression testing the gimple-range-op change, as that was a last
>> minute change, but the rest survived a bootstrap and regression test on
>> aarch64-unknown-linux-gnu.
>>
>> cover letter:
>>
>> This patch replaces the existing tree_code widen_plus and widen_minus
>> patterns with internal_fn versions.
>>
>> DEF_INTERNAL_OPTAB_WIDENING_HILO_FN and DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
>> are like DEF_INTERNAL_SIGNED_OPTAB_FN and DEF_INTERNAL_OPTAB_FN respectively
>> except they provide convenience wrappers for defining conversions that require
>> a hi/lo split.  Each definition for <NAME> will require optabs for _hi and _lo
>> and each of those will also require a signed and unsigned version in the case
>> of widening. The hi/lo pair is necessary because the widening and narrowing
>> operations take n narrow elements as inputs and return n/2 wide elements as
>> outputs. The 'lo' operation operates on the first n/2 elements of input. The
>> 'hi' operation operates on the second n/2 elements of input. Defining an
>> internal_fn along with hi/lo variations allows a single internal function to
>> be returned from a vect_recog function that will later be expanded to hi/lo.
>>
>>
>>   For example:
>>   IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO
>> for aarch64: IFN_VEC_WIDEN_PLUS_HI   -> vec_widen_<su>add_hi_<mode> ->
>> (u/s)addl2
>>                         IFN_VEC_WIDEN_PLUS_LO  -> vec_widen_<su>add_lo_<mode>
>> -> (u/s)addl
>>
>> This gives the same functionality as the previous WIDEN_PLUS/WIDEN_MINUS tree
>> codes which are expanded into VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI.
> 
> What I still don't understand is how we are so narrowly focused on
> HI/LO?  We need a combined scalar IFN for pattern selection (not
> sure why that's now called _HILO, I expected no suffix).  Then there's
> three possibilities the target can implement this:
> 
>   1) with a widen_[su]add<mode> instruction - I _think_ that's what
>      RISCV is going to offer since it is a target where vector modes
>      have "padding" (aka you cannot subreg a V2SI to get V4HI).  Instead
>      RVV can do a V4HI to V4SI widening and widening add/subtract
>      using vwadd[u] and vwsub[u] (the HI->SI widening is actually
>      done with a widening add of zero - eh).
>      IIRC GCN is the same here.
>   2) with a widen_[su]add{_lo,_hi}<mode> combo - that's what the tree
>      codes currently support (exclusively)
>   3) similar, but widen_[su]add{_even,_odd}<mode>
> 
> that said, things like decomposes_to_hilo_fn_p look to paint us into
> a 2) corner without good reason.

I was kind of just keeping the naming.  I had forgotten to mention I was
also going to add _EVENODD, but you are right, the pattern selection IFN
does not need to be restrictive.

At supportable_widening_operation we could then check what the target
offers support for (either 1, 2 or 3).  We can then get rid of
decomposes_to_hilo_fn_p and just assume that for all narrowing or
widening IFNs there are optabs (that may or may not be implemented by a
target) for all three variants.

Having said that, that means we should have an optab to cover 1, which 
should probably just have the original name. Let me write it out...

Say we have an IFN_VEC_WIDEN_PLUS pattern and assume it's signed.
supportable_widening_operation would then first check whether the target
supports vec_widen_sadd_optab for, say, V8HI -> V8SI.  RISC-V would take
this path, I guess?

If the target doesn't then it could check for support for:
vec_widen_sadd_lo_optab V4HI -> V4SI
vec_widen_sadd_hi_optab V4HI -> V4SI

AArch64 Advanced SIMD would implement this.

If the target still didn't support this it would check for (not sure 
about the modes here):
vec_widen_sadd_even_optab VNx8HI -> VNx4SI
vec_widen_sadd_odd_optab VNx8HI -> VNx4SI

This is the one SVE would implement.
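
The order of checks described above could be sketched roughly like this.
It is a standalone model, not a patch: widen_support, widen_strategy and
choose_widen_strategy are hypothetical names, and the booleans stand in
for the optab queries that supportable_widening_operation would make for
a given mode.

```cpp
#include <optional>

enum class widen_strategy { full, lo_hi, even_odd };

/* Hypothetical summary of which optabs a target implements for one
   widening operation and mode.  */
struct widen_support
{
  bool full = false;
  bool lo = false;
  bool hi = false;
  bool even = false;
  bool odd = false;
};

/* Try the single-instruction form first, then the lo/hi pair, then the
   even/odd pair.  A pair is only usable if both halves are present.  */
std::optional<widen_strategy>
choose_widen_strategy (const widen_support &s)
{
  if (s.full)
    return widen_strategy::full;
  if (s.lo && s.hi)
    return widen_strategy::lo_hi;
  if (s.even && s.odd)
    return widen_strategy::even_odd;
  return std::nullopt;
}
```

Whether this is the right priority order is a separate question; the
sketch only shows that the pattern IFN stays the same and the choice is
made per target.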


So that would mean that I'd probably end up rewriting
#define DEF_INTERNAL_OPTAB_WIDENING_FN (NAME, FLAGS, SELECTOR, SOPTAB,
                                        UOPTAB, TYPE)
as:
for 1)
   DEF_INTERNAL_SIGNED_OPTAB_FN (NAME, FLAGS, SELECTOR, SOPTAB, UOPTAB,
                                 TYPE)

for 2)
   DEF_INTERNAL_SIGNED_OPTAB_FN (NAME##_LO, FLAGS, SELECTOR, SOPTAB,
                                 UOPTAB, TYPE)
   DEF_INTERNAL_SIGNED_OPTAB_FN (NAME##_HI, FLAGS, SELECTOR, SOPTAB,
                                 UOPTAB, TYPE)

for 3)
   DEF_INTERNAL_SIGNED_OPTAB_FN (NAME##_EVEN, FLAGS, SELECTOR, SOPTAB,
                                 UOPTAB, TYPE)
   DEF_INTERNAL_SIGNED_OPTAB_FN (NAME##_ODD, FLAGS, SELECTOR, SOPTAB,
                                 UOPTAB, TYPE)

And the same for narrowing (but with DEF_INTERNAL_OPTAB_FN instead of 
SIGNED_OPTAB).
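
As a rough illustration of the grouping idea, the following is a
stripped-down X-macro model.  It is not the internal-fn.def machinery:
DEF_FN, DEF_WIDENING_FN, ALL_FNS and the demo_ifn names are invented for
the example, and the FLAGS/SELECTOR/optab/TYPE arguments are dropped so
that only the name expansion is visible.

```cpp
#include <string>

/* Hypothetical grouping macro: one definition yields the combined
   function plus its lo/hi and even/odd variants.  */
#define DEF_WIDENING_FN(NAME) \
  DEF_FN (NAME)               \
  DEF_FN (NAME##_LO)          \
  DEF_FN (NAME##_HI)          \
  DEF_FN (NAME##_EVEN)        \
  DEF_FN (NAME##_ODD)

#define ALL_FNS                    \
  DEF_WIDENING_FN (VEC_WIDEN_PLUS) \
  DEF_WIDENING_FN (VEC_WIDEN_MINUS)

/* First expansion: enumerators.  */
enum demo_ifn
{
#define DEF_FN(NAME) DEMO_IFN_##NAME,
  ALL_FNS
#undef DEF_FN
  DEMO_IFN_LAST
};

/* Second expansion: printable names, in the same order.  */
static const char *const demo_ifn_names[] = {
#define DEF_FN(NAME) #NAME,
  ALL_FNS
#undef DEF_FN
};

std::string
demo_ifn_name (demo_ifn fn)
{
  return demo_ifn_names[fn];
}
```

The point is that adding a new widening operation is a single line in
ALL_FNS, and every operation automatically gets the same set of
variants.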

So each widening and narrowing IFN would have optabs for all its 
variants and each target implements the ones it supports.

I'm happy to do this, but implementing support to handle the 1 and 3 
variants without having optabs for them right now seems a bit odd and it 
would delay this patch, so I suggest I add the framework and the optabs 
but leave adding the vectorizer support for later? I can add comments to 
where I think that should go.

> Richard.
> 
>> gcc/ChangeLog:
>>
>> 2023-05-12  Andre Vieira  <andre.simoesdiasvieira@arm.com>
>>              Joel Hutton  <joel.hutton@arm.com>
>>              Tamar Christina  <tamar.christina@arm.com>
>>
>>          * config/aarch64/aarch64-simd.md (vec_widen_<su>addl_lo_<mode>):
>> Rename
>>          this ...
>>          (vec_widen_<su>add_lo_<mode>): ... to this.
>>          (vec_widen_<su>addl_hi_<mode>): Rename this ...
>>          (vec_widen_<su>add_hi_<mode>): ... to this.
>>          (vec_widen_<su>subl_lo_<mode>): Rename this ...
>>          (vec_widen_<su>sub_lo_<mode>): ... to this.
>>          (vec_widen_<su>subl_hi_<mode>): Rename this ...
>>          (vec_widen_<su>sub_hi_<mode>): ...to this.
>>          * doc/generic.texi: Document new IFN codes.
>> 	* internal-fn.cc (DEF_INTERNAL_OPTAB_WIDENING_HILO_FN): Macro to
>> 	define an
>>          internal_fn that expands into multiple internal_fns for widening.
>>          (DEF_INTERNAL_OPTAB_NARROWING_HILO_FN): Likewise but for narrowing.
>>   	(ifn_cmp): Function to compare ifn's for sorting/searching.
>> 	(lookup_hilo_internal_fn): Add lookup function.
>> 	(commutative_binary_fn_p): Add widen_plus fn's.
>> 	(widening_fn_p): New function.
>> 	(narrowing_fn_p): New function.
>> 	(decomposes_to_hilo_fn_p): New function.
>> 	         (direct_internal_fn_optab): Change visibility.
>>      	* internal-fn.def (DEF_INTERNAL_OPTAB_WIDENING_HILO_FN): Define
>>      widening
>>      plus,minus functions.
>> 	(VEC_WIDEN_PLUS): Replacement for VEC_WIDEN_PLUS_EXPR tree code.
>> 	(VEC_WIDEN_MINUS): Replacement for VEC_WIDEN_MINUS_EXPR tree code.
>> 	* internal-fn.h (GCC_INTERNAL_FN_H): Add headers.
>> 	         (direct_internal_fn_optab): Declare new prototype.
>> 	(lookup_hilo_internal_fn): Likewise.
>> 	(widening_fn_p): Likewise.
>> 	(Narrowing_fn_p): Likewise.
>> 	(decomposes_to_hilo_fn_p): Likewise.
>> 	* optabs.cc (commutative_optab_p): Add widening plus optabs.
>> 	* optabs.def (OPTAB_D): Define widen add, sub optabs.
>>          * tree-cfg.cc (verify_gimple_call): Add checks for new widen
>>          add and sub IFNs.
>>          * tree-inline.cc (estimate_num_insns): Return same
>>          cost for widen add and sub IFNs as previous tree_codes.
>>      	* tree-vect-patterns.cc (vect_recog_widen_op_pattern): Support
>>      patterns with a hi/lo split.
>>          (vect_recog_sad_pattern): Refactor to use new IFN codes.
>>          (vect_recog_widen_plus_pattern): Likewise.
>>          (vect_recog_widen_minus_pattern): Likewise.
>>          (vect_recog_average_pattern): Likewise.
>> 	* tree-vect-stmts.cc (vectorizable_conversion): Add support for
>> 	         _HILO IFNs.
>> 	(supportable_widening_operation): Likewise.
>>          * tree.def (WIDEN_SUM_EXPR): Update example to use new IFNs.
>>
>> gcc/testsuite/ChangeLog:
>>
>>      	* gcc.target/aarch64/vect-widen-add.c: Test that new
>>      IFN_VEC_WIDEN_PLUS is being used.
>>      	* gcc.target/aarch64/vect-widen-sub.c: Test that new
>>      IFN_VEC_WIDEN_MINUS is being used.
>>
> 

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2/3] Refactor widen_plus as internal_fn
  2023-05-12 13:28                                     ` Richard Biener
  2023-05-12 13:55                                       ` Andre Vieira (lists)
@ 2023-05-12 14:01                                       ` Richard Sandiford
  2023-05-15 10:20                                         ` Richard Biener
  1 sibling, 1 reply; 53+ messages in thread
From: Richard Sandiford @ 2023-05-12 14:01 UTC (permalink / raw)
  To: Richard Biener; +Cc: Andre Vieira (lists), Richard Biener, gcc-patches

Richard Biener <rguenther@suse.de> writes:
> On Fri, 12 May 2023, Andre Vieira (lists) wrote:
>
>> I have dealt with, I think..., most of your comments. There's quite a few
>> changes, I think it's all a bit simpler now. I made some other changes to the
>> costing in tree-inline.cc and gimple-range-op.cc in which I try to preserve
>> the same behaviour as we had with the tree codes before. Also added some extra
>> checks to tree-cfg.cc that made sense to me.
>> 
>> I am still regression testing the gimple-range-op change, as that was a last
>> minute change, but the rest survived a bootstrap and regression test on
>> aarch64-unknown-linux-gnu.
>> 
>> cover letter:
>> 
>> This patch replaces the existing tree_code widen_plus and widen_minus
>> patterns with internal_fn versions.
>> 
>> DEF_INTERNAL_OPTAB_WIDENING_HILO_FN and DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
>> are like DEF_INTERNAL_SIGNED_OPTAB_FN and DEF_INTERNAL_OPTAB_FN respectively
>> except they provide convenience wrappers for defining conversions that require
>> a hi/lo split.  Each definition for <NAME> will require optabs for _hi and _lo
>> and each of those will also require a signed and unsigned version in the case
>> of widening. The hi/lo pair is necessary because the widening and narrowing
>> operations take n narrow elements as inputs and return n/2 wide elements as
>> outputs. The 'lo' operation operates on the first n/2 elements of input. The
>> 'hi' operation operates on the second n/2 elements of input. Defining an
>> internal_fn along with hi/lo variations allows a single internal function to
>> be returned from a vect_recog function that will later be expanded to hi/lo.
>> 
>> 
>>  For example:
>>  IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO
>> for aarch64: IFN_VEC_WIDEN_PLUS_HI   -> vec_widen_<su>add_hi_<mode> ->
>> (u/s)addl2
>>                        IFN_VEC_WIDEN_PLUS_LO  -> vec_widen_<su>add_lo_<mode>
>> -> (u/s)addl
>> 
>> This gives the same functionality as the previous WIDEN_PLUS/WIDEN_MINUS tree
>> codes which are expanded into VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI.
>
> What I still don't understand is how we are so narrowly focused on
> HI/LO?  We need a combined scalar IFN for pattern selection (not
> sure why that's now called _HILO, I expected no suffix).  Then there's
> three possibilities the target can implement this:
>
>  1) with a widen_[su]add<mode> instruction - I _think_ that's what
>     RISCV is going to offer since it is a target where vector modes
>     have "padding" (aka you cannot subreg a V2SI to get V4HI).  Instead
>     RVV can do a V4HI to V4SI widening and widening add/subtract
>     using vwadd[u] and vwsub[u] (the HI->SI widening is actually
>     done with a widening add of zero - eh).
>     IIRC GCN is the same here.

SVE currently does this too, but the addition and widening are
separate operations.  E.g. in principle there's no reason why
you can't sign-extend one operand, zero-extend the other, and
then add the results together.  Or you could extend them from
different sizes (QI and HI).  All of those are supported
(if the costing allows them).

If the target has operations to do combined extending and adding (or
whatever), then at the moment we rely on combine to generate them.

So I think this case is separate from Andre's work.  The addition
itself is just an ordinary addition, and any widening happens by
vectorising a CONVERT/NOP_EXPR.
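
As a scalar sketch of what "separate operations" means here (purely
illustrative, with a made-up function name and element types chosen for
the example): one operand is sign-extended from QI, the other is
zero-extended from HI, and the add itself is an ordinary SI addition.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

/* Extend each operand independently, then do a plain 32-bit add.
   Nothing ties the two conversions to each other or to the addition.  */
std::vector<std::int32_t>
extend_then_add (const std::vector<std::int8_t> &a,
		 const std::vector<std::uint16_t> &b)
{
  assert (a.size () == b.size ());
  std::vector<std::int32_t> r (a.size ());
  for (std::size_t i = 0; i < a.size (); ++i)
    {
      std::int32_t wa = a[i];	/* sign extension from 8 bits */
      std::int32_t wb = b[i];	/* zero extension from 16 bits */
      r[i] = wa + wb;		/* ordinary addition */
    }
  return r;
}
```

A fused widening add as in 2) or 3) could not express this mix, since it
assumes both inputs have the same narrow type and signedness.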

>  2) with a widen_[su]add{_lo,_hi}<mode> combo - that's what the tree
>     codes currently support (exclusively)
>  3) similar, but widen_[su]add{_even,_odd}<mode>
>
> that said, things like decomposes_to_hilo_fn_p look to paint us into
> a 2) corner without good reason.

I suppose one question is: how much of the patch is really specific
to HI/LO, and how much is just grouping two halves together?  The nice
thing about the internal-fn grouping macros is that, if (3) is
implemented in future, the structure will strongly encourage even/odd
pairs to be supported for all operations that support hi/lo.  That is,
I would expect the grouping macros to be extended to define even/odd
ifns alongside hi/lo ones, rather than adding separate definitions
for even/odd functions.

If so, at least from the internal-fn.* side of things, I think the question
is whether it's OK to stick with hilo names for now, or whether we should
use more forward-looking names.

Thanks,
Richard

>
> Richard.
>
>> gcc/ChangeLog:
>> 
>> 2023-05-12  Andre Vieira  <andre.simoesdiasvieira@arm.com>
>>             Joel Hutton  <joel.hutton@arm.com>
>>             Tamar Christina  <tamar.christina@arm.com>
>> 
>>         * config/aarch64/aarch64-simd.md (vec_widen_<su>addl_lo_<mode>):
>> Rename
>>         this ...
>>         (vec_widen_<su>add_lo_<mode>): ... to this.
>>         (vec_widen_<su>addl_hi_<mode>): Rename this ...
>>         (vec_widen_<su>add_hi_<mode>): ... to this.
>>         (vec_widen_<su>subl_lo_<mode>): Rename this ...
>>         (vec_widen_<su>sub_lo_<mode>): ... to this.
>>         (vec_widen_<su>subl_hi_<mode>): Rename this ...
>>         (vec_widen_<su>sub_hi_<mode>): ...to this.
>>         * doc/generic.texi: Document new IFN codes.
>> 	* internal-fn.cc (DEF_INTERNAL_OPTAB_WIDENING_HILO_FN): Macro to
>> 	define an
>>         internal_fn that expands into multiple internal_fns for widening.
>>         (DEF_INTERNAL_OPTAB_NARROWING_HILO_FN): Likewise but for narrowing.
>>  	(ifn_cmp): Function to compare ifn's for sorting/searching.
>> 	(lookup_hilo_internal_fn): Add lookup function.
>> 	(commutative_binary_fn_p): Add widen_plus fn's.
>> 	(widening_fn_p): New function.
>> 	(narrowing_fn_p): New function.
>> 	(decomposes_to_hilo_fn_p): New function.
>> 	         (direct_internal_fn_optab): Change visibility.
>>     	* internal-fn.def (DEF_INTERNAL_OPTAB_WIDENING_HILO_FN): Define
>>     widening
>>     plus,minus functions.
>> 	(VEC_WIDEN_PLUS): Replacement for VEC_WIDEN_PLUS_EXPR tree code.
>> 	(VEC_WIDEN_MINUS): Replacement for VEC_WIDEN_MINUS_EXPR tree code.
>> 	* internal-fn.h (GCC_INTERNAL_FN_H): Add headers.
>> 	         (direct_internal_fn_optab): Declare new prototype.
>> 	(lookup_hilo_internal_fn): Likewise.
>> 	(widening_fn_p): Likewise.
>> 	(Narrowing_fn_p): Likewise.
>> 	(decomposes_to_hilo_fn_p): Likewise.
>> 	* optabs.cc (commutative_optab_p): Add widening plus optabs.
>> 	* optabs.def (OPTAB_D): Define widen add, sub optabs.
>>         * tree-cfg.cc (verify_gimple_call): Add checks for new widen
>>         add and sub IFNs.
>>         * tree-inline.cc (estimate_num_insns): Return same
>>         cost for widen add and sub IFNs as previous tree_codes.
>>     	* tree-vect-patterns.cc (vect_recog_widen_op_pattern): Support
>>     patterns with a hi/lo split.
>>         (vect_recog_sad_pattern): Refactor to use new IFN codes.
>>         (vect_recog_widen_plus_pattern): Likewise.
>>         (vect_recog_widen_minus_pattern): Likewise.
>>         (vect_recog_average_pattern): Likewise.
>> 	* tree-vect-stmts.cc (vectorizable_conversion): Add support for
>> 	         _HILO IFNs.
>> 	(supportable_widening_operation): Likewise.
>>         * tree.def (WIDEN_SUM_EXPR): Update example to use new IFNs.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>>     	* gcc.target/aarch64/vect-widen-add.c: Test that new
>>     IFN_VEC_WIDEN_PLUS is being used.
>>     	* gcc.target/aarch64/vect-widen-sub.c: Test that new
>>     IFN_VEC_WIDEN_MINUS is being used.
>> 

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2/3] Refactor widen_plus as internal_fn
  2023-05-12 14:01                                       ` Richard Sandiford
@ 2023-05-15 10:20                                         ` Richard Biener
  2023-05-15 10:47                                           ` Richard Sandiford
  0 siblings, 1 reply; 53+ messages in thread
From: Richard Biener @ 2023-05-15 10:20 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: Andre Vieira (lists), Richard Biener, gcc-patches

On Fri, 12 May 2023, Richard Sandiford wrote:

> Richard Biener <rguenther@suse.de> writes:
> > On Fri, 12 May 2023, Andre Vieira (lists) wrote:
> >
> >> I have dealt with, I think..., most of your comments. There's quite a few
> >> changes, I think it's all a bit simpler now. I made some other changes to the
> >> costing in tree-inline.cc and gimple-range-op.cc in which I try to preserve
> >> the same behaviour as we had with the tree codes before. Also added some extra
> >> checks to tree-cfg.cc that made sense to me.
> >> 
> >> I am still regression testing the gimple-range-op change, as that was a last
> >> minute change, but the rest survived a bootstrap and regression test on
> >> aarch64-unknown-linux-gnu.
> >> 
> >> cover letter:
> >> 
> >> This patch replaces the existing tree_code widen_plus and widen_minus
> >> patterns with internal_fn versions.
> >> 
> >> DEF_INTERNAL_OPTAB_WIDENING_HILO_FN and DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
> >> are like DEF_INTERNAL_SIGNED_OPTAB_FN and DEF_INTERNAL_OPTAB_FN respectively
> >> except they provide convenience wrappers for defining conversions that require
> >> a hi/lo split.  Each definition for <NAME> will require optabs for _hi and _lo
> >> and each of those will also require a signed and unsigned version in the case
> >> of widening. The hi/lo pair is necessary because the widening and narrowing
> >> operations take n narrow elements as inputs and return n/2 wide elements as
> >> outputs. The 'lo' operation operates on the first n/2 elements of input. The
> >> 'hi' operation operates on the second n/2 elements of input. Defining an
> >> internal_fn along with hi/lo variations allows a single internal function to
> >> be returned from a vect_recog function that will later be expanded to hi/lo.
> >> 
> >> 
> >> For example:
> >>   IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO
> >> for aarch64:
> >>   IFN_VEC_WIDEN_PLUS_HI -> vec_widen_<su>add_hi_<mode> -> (u/s)addl2
> >>   IFN_VEC_WIDEN_PLUS_LO -> vec_widen_<su>add_lo_<mode> -> (u/s)addl
> >> 
> >> This gives the same functionality as the previous WIDEN_PLUS/WIDEN_MINUS tree
> >> codes which are expanded into VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI.
> >
> > What I still don't understand is how we are so narrowly focused on
> > HI/LO?  We need a combined scalar IFN for pattern selection (not
> > sure why that's now called _HILO, I expected no suffix).  Then there's
> > three possibilities the target can implement this:
> >
> >  1) with a widen_[su]add<mode> instruction - I _think_ that's what
> >     RISCV is going to offer since it is a target where vector modes
> >     have "padding" (aka you cannot subreg a V2SI to get V4HI).  Instead
> >     RVV can do a V4HI to V4SI widening and widening add/subtract
> >     using vwadd[u] and vwsub[u] (the HI->SI widening is actually
> >     done with a widening add of zero - eh).
> >     IIRC GCN is the same here.
> 
> SVE currently does this too, but the addition and widening are
> separate operations.  E.g. in principle there's no reason why
> you can't sign-extend one operand, zero-extend the other, and
> then add the result together.  Or you could extend them from
> different sizes (QI and HI).  All of those are supported
> (if the costing allows them).

I see.  So why does the target then expose widen_[su]add<mode> at all?

> If the target has operations to do combined extending and adding (or
> whatever), then at the moment we rely on combine to generate them.
> 
> So I think this case is separate from Andre's work.  The addition
> itself is just an ordinary addition, and any widening happens by
> vectorising a CONVERT/NOP_EXPR.
> 
> >  2) with a widen_[su]add{_lo,_hi}<mode> combo - that's what the tree
> >     codes currently support (exclusively)
> >  3) similar, but widen_[su]add{_even,_odd}<mode>
> >
> > that said, things like decomposes_to_hilo_fn_p look to paint us into
> > a 2) corner without good reason.
> 
> I suppose one question is: how much of the patch is really specific
> to HI/LO, and how much is just grouping two halves together?

Yep, that I don't know for sure.

>  The nice
> thing about the internal-fn grouping macros is that, if (3) is
> implemented in future, the structure will strongly encourage even/odd
> pairs to be supported for all operations that support hi/lo.  That is,
> I would expect the grouping macros to be extended to define even/odd
> ifns alongside hi/lo ones, rather than adding separate definitions
> for even/odd functions.
> 
> If so, at least from the internal-fn.* side of things, I think the question
> is whether it's OK to stick with hilo names for now, or whether we should
> use more forward-looking names.

I think for parts that are independent we could use a more
forward-looking name.  Maybe _halves?  But I'm also not sure
how much of that is really needed (it seems to be tied around
optimizing optabs space?)

Richard.

> Thanks,
> Richard
> 
> >
> > Richard.
> >
> >> gcc/ChangeLog:
> >> 
> >> 2023-05-12  Andre Vieira  <andre.simoesdiasvieira@arm.com>
> >>             Joel Hutton  <joel.hutton@arm.com>
> >>             Tamar Christina  <tamar.christina@arm.com>
> >> 
> >> 	* config/aarch64/aarch64-simd.md (vec_widen_<su>addl_lo_<mode>):
> >> 	Rename this ...
> >> 	(vec_widen_<su>add_lo_<mode>): ... to this.
> >> 	(vec_widen_<su>addl_hi_<mode>): Rename this ...
> >> 	(vec_widen_<su>add_hi_<mode>): ... to this.
> >> 	(vec_widen_<su>subl_lo_<mode>): Rename this ...
> >> 	(vec_widen_<su>sub_lo_<mode>): ... to this.
> >> 	(vec_widen_<su>subl_hi_<mode>): Rename this ...
> >> 	(vec_widen_<su>sub_hi_<mode>): ... to this.
> >> 	* doc/generic.texi: Document new IFN codes.
> >> 	* internal-fn.cc (DEF_INTERNAL_OPTAB_WIDENING_HILO_FN): Macro to
> >> 	define an internal_fn that expands into multiple internal_fns for
> >> 	widening.
> >> 	(DEF_INTERNAL_OPTAB_NARROWING_HILO_FN): Likewise but for narrowing.
> >> 	(ifn_cmp): Function to compare ifn's for sorting/searching.
> >> 	(lookup_hilo_internal_fn): Add lookup function.
> >> 	(commutative_binary_fn_p): Add widen_plus fn's.
> >> 	(widening_fn_p): New function.
> >> 	(narrowing_fn_p): New function.
> >> 	(decomposes_to_hilo_fn_p): New function.
> >> 	(direct_internal_fn_optab): Change visibility.
> >> 	* internal-fn.def (DEF_INTERNAL_OPTAB_WIDENING_HILO_FN): Define
> >> 	widening plus, minus functions.
> >> 	(VEC_WIDEN_PLUS): Replacement for VEC_WIDEN_PLUS_EXPR tree code.
> >> 	(VEC_WIDEN_MINUS): Replacement for VEC_WIDEN_MINUS_EXPR tree code.
> >> 	* internal-fn.h (GCC_INTERNAL_FN_H): Add headers.
> >> 	(direct_internal_fn_optab): Declare new prototype.
> >> 	(lookup_hilo_internal_fn): Likewise.
> >> 	(widening_fn_p): Likewise.
> >> 	(narrowing_fn_p): Likewise.
> >> 	(decomposes_to_hilo_fn_p): Likewise.
> >> 	* optabs.cc (commutative_optab_p): Add widening plus optabs.
> >> 	* optabs.def (OPTAB_D): Define widen add, sub optabs.
> >> 	* tree-cfg.cc (verify_gimple_call): Add checks for new widen
> >> 	add and sub IFNs.
> >> 	* tree-inline.cc (estimate_num_insns): Return same
> >> 	cost for widen add and sub IFNs as previous tree_codes.
> >> 	* tree-vect-patterns.cc (vect_recog_widen_op_pattern): Support
> >> 	patterns with a hi/lo split.
> >> 	(vect_recog_sad_pattern): Refactor to use new IFN codes.
> >> 	(vect_recog_widen_plus_pattern): Likewise.
> >> 	(vect_recog_widen_minus_pattern): Likewise.
> >> 	(vect_recog_average_pattern): Likewise.
> >> 	* tree-vect-stmts.cc (vectorizable_conversion): Add support for
> >> 	_HILO IFNs.
> >> 	(supportable_widening_operation): Likewise.
> >> 	* tree.def (WIDEN_SUM_EXPR): Update example to use new IFNs.
> >> 
> >> gcc/testsuite/ChangeLog:
> >> 
> >> 	* gcc.target/aarch64/vect-widen-add.c: Test that new
> >> 	IFN_VEC_WIDEN_PLUS is being used.
> >> 	* gcc.target/aarch64/vect-widen-sub.c: Test that new
> >> 	IFN_VEC_WIDEN_MINUS is being used.
> >> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)


* Re: [PATCH 2/3] Refactor widen_plus as internal_fn
  2023-05-15 10:20                                         ` Richard Biener
@ 2023-05-15 10:47                                           ` Richard Sandiford
  2023-05-15 11:01                                             ` Richard Biener
  0 siblings, 1 reply; 53+ messages in thread
From: Richard Sandiford @ 2023-05-15 10:47 UTC (permalink / raw)
  To: Richard Biener; +Cc: Andre Vieira (lists), Richard Biener, gcc-patches

Richard Biener <rguenther@suse.de> writes:
> On Fri, 12 May 2023, Richard Sandiford wrote:
>
>> Richard Biener <rguenther@suse.de> writes:
>> > On Fri, 12 May 2023, Andre Vieira (lists) wrote:
>> >
>> >> I have dealt with, I think..., most of your comments. There's quite a few
>> >> changes, I think it's all a bit simpler now. I made some other changes to the
>> >> costing in tree-inline.cc and gimple-range-op.cc in which I try to preserve
>> >> the same behaviour as we had with the tree codes before. Also added some extra
>> >> checks to tree-cfg.cc that made sense to me.
>> >> 
>> >> I am still regression testing the gimple-range-op change, as that was a last
>> >> minute change, but the rest survived a bootstrap and regression test on
>> >> aarch64-unknown-linux-gnu.
>> >> 
>> >> cover letter:
>> >> 
>> >> This patch replaces the existing tree_code widen_plus and widen_minus
>> >> patterns with internal_fn versions.
>> >> 
>> >> DEF_INTERNAL_OPTAB_WIDENING_HILO_FN and DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
>> >> are like DEF_INTERNAL_SIGNED_OPTAB_FN and DEF_INTERNAL_OPTAB_FN respectively
>> >> except they provide convenience wrappers for defining conversions that require
>> >> a hi/lo split.  Each definition for <NAME> will require optabs for _hi and _lo
>> >> and each of those will also require a signed and unsigned version in the case
>> >> of widening. The hi/lo pair is necessary because the widening and narrowing
>> >> operations take n narrow elements as inputs and return n/2 wide elements as
>> >> outputs. The 'lo' operation operates on the first n/2 elements of input. The
>> >> 'hi' operation operates on the second n/2 elements of input. Defining an
>> >> internal_fn along with hi/lo variations allows a single internal function to
>> >> be returned from a vect_recog function that will later be expanded to hi/lo.
>> >> 
>> >> 
>> >> For example:
>> >>   IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO
>> >> for aarch64:
>> >>   IFN_VEC_WIDEN_PLUS_HI -> vec_widen_<su>add_hi_<mode> -> (u/s)addl2
>> >>   IFN_VEC_WIDEN_PLUS_LO -> vec_widen_<su>add_lo_<mode> -> (u/s)addl
>> >> 
>> >> This gives the same functionality as the previous WIDEN_PLUS/WIDEN_MINUS tree
>> >> codes which are expanded into VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI.
>> >
>> > What I still don't understand is how we are so narrowly focused on
>> > HI/LO?  We need a combined scalar IFN for pattern selection (not
>> > sure why that's now called _HILO, I expected no suffix).  Then there's
>> > three possibilities the target can implement this:
>> >
>> >  1) with a widen_[su]add<mode> instruction - I _think_ that's what
>> >     RISCV is going to offer since it is a target where vector modes
>> >     have "padding" (aka you cannot subreg a V2SI to get V4HI).  Instead
>> >     RVV can do a V4HI to V4SI widening and widening add/subtract
>> >     using vwadd[u] and vwsub[u] (the HI->SI widening is actually
>> >     done with a widening add of zero - eh).
>> >     IIRC GCN is the same here.
>> 
>> SVE currently does this too, but the addition and widening are
>> separate operations.  E.g. in principle there's no reason why
>> you can't sign-extend one operand, zero-extend the other, and
>> then add the result together.  Or you could extend them from
>> different sizes (QI and HI).  All of those are supported
>> (if the costing allows them).
>
> I see.  So why does the target then expose widen_[su]add<mode> at all?

It shouldn't (need to) do that.  I don't think we should have an optab
for the unsplit operation.

At least on SVE, we really want the extensions to be fused with loads
(where possible) rather than with arithmetic.

We can still do the widening arithmetic in one go.  It's just that
fusing with the loads works for the mixed-sign and mixed-size cases,
and can handle more than just doubling the element size.
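
As a rough scalar illustration of the point above, the sketch below extends
each operand on its own (as an extending load would) and then performs an
ordinary addition.  The function name and the choice of element types are
assumptions made for the example; nothing here is taken from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical scalar model: operand A is sign-extended from 8 bits (QI),
// operand B is zero-extended from 16 bits (HI), and the addition itself is
// a plain 32-bit (SI) add.  Both inputs are assumed to have the same length.
std::vector<std::int32_t>
extend_then_add (const std::vector<std::int8_t> &a,
                 const std::vector<std::uint16_t> &b)
{
  std::vector<std::int32_t> result (a.size ());
  for (std::size_t i = 0; i < a.size (); ++i)
    {
      std::int32_t ea = a[i];  // sign extension, QI -> SI
      std::int32_t eb = b[i];  // zero extension, HI -> SI
      result[i] = ea + eb;     // ordinary addition on the wide type
    }
  return result;
}
```

A single widening-add operation with one signedness and one input width
could not express this mixed case, which is the motivation for keeping the
extension and the addition separate here.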

>> If the target has operations to do combined extending and adding (or
>> whatever), then at the moment we rely on combine to generate them.
>> 
>> So I think this case is separate from Andre's work.  The addition
>> itself is just an ordinary addition, and any widening happens by
>> vectorising a CONVERT/NOP_EXPR.
>> 
>> >  2) with a widen_[su]add{_lo,_hi}<mode> combo - that's what the tree
>> >     codes currently support (exclusively)
>> >  3) similar, but widen_[su]add{_even,_odd}<mode>
>> >
>> > that said, things like decomposes_to_hilo_fn_p look to paint us into
>> > a 2) corner without good reason.
>> 
>> I suppose one question is: how much of the patch is really specific
>> to HI/LO, and how much is just grouping two halves together?
>
> Yep, that I don't know for sure.
>
>>  The nice
>> thing about the internal-fn grouping macros is that, if (3) is
>> implemented in future, the structure will strongly encourage even/odd
>> pairs to be supported for all operations that support hi/lo.  That is,
>> I would expect the grouping macros to be extended to define even/odd
>> ifns alongside hi/lo ones, rather than adding separate definitions
>> for even/odd functions.
>> 
>> If so, at least from the internal-fn.* side of things, I think the question
>> is whether it's OK to stick with hilo names for now, or whether we should
>> use more forward-looking names.
>
> I think for parts that are independent we could use a more
> forward-looking name.  Maybe _halves?

Using _halves for the ifn macros sounds good to me FWIW.
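
To make the difference between the two pairings concrete, here is a minimal
scalar sketch.  It is not GCC code: the aliases and the four function names
are invented for illustration, and the lo/hi lane assignment simply follows
the cover letter's description (lo is the first n/2 input elements, hi the
second n/2), ignoring any endianness considerations.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model: N narrow (16-bit) input lanes, N/2 wide (32-bit)
// output lanes per operation, so two operations cover the whole input.
template <std::size_t N> using Narrow = std::array<std::int16_t, N>;
template <std::size_t N> using Wide = std::array<std::int32_t, N / 2>;

// lo: widen and add input lanes [0, N/2).
template <std::size_t N>
Wide<N> widen_add_lo (const Narrow<N> &a, const Narrow<N> &b)
{
  Wide<N> r{};
  for (std::size_t i = 0; i < N / 2; ++i)
    r[i] = static_cast<std::int32_t> (a[i]) + b[i];
  return r;
}

// hi: widen and add input lanes [N/2, N).
template <std::size_t N>
Wide<N> widen_add_hi (const Narrow<N> &a, const Narrow<N> &b)
{
  Wide<N> r{};
  for (std::size_t i = 0; i < N / 2; ++i)
    r[i] = static_cast<std::int32_t> (a[N / 2 + i]) + b[N / 2 + i];
  return r;
}

// even: widen and add input lanes 0, 2, 4, ...
template <std::size_t N>
Wide<N> widen_add_even (const Narrow<N> &a, const Narrow<N> &b)
{
  Wide<N> r{};
  for (std::size_t i = 0; i < N / 2; ++i)
    r[i] = static_cast<std::int32_t> (a[2 * i]) + b[2 * i];
  return r;
}

// odd: widen and add input lanes 1, 3, 5, ...
template <std::size_t N>
Wide<N> widen_add_odd (const Narrow<N> &a, const Narrow<N> &b)
{
  Wide<N> r{};
  for (std::size_t i = 0; i < N / 2; ++i)
    r[i] = static_cast<std::int32_t> (a[2 * i + 1]) + b[2 * i + 1];
  return r;
}
```

In both pairings the two operations together cover every input lane exactly
once, which is why a neutral name such as _halves can describe either split.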

> But I'm also not sure
> how much of that is really needed (it seems to be tied around
> optimizing optabs space?)

Not sure what you mean by "this".  Optabs space shouldn't be a problem
though.  The optab encoding gives us a full int to play with, and it
could easily go up to 64 bits if necessary/convenient.

At least on the internal-fn.* side, the aim is really just to establish
a regular structure, so that we don't have arbitrary differences between
different widening operations, or too much cut-&-paste.

Thanks,
Richard


* Re: [PATCH 2/3] Refactor widen_plus as internal_fn
  2023-05-15 10:47                                           ` Richard Sandiford
@ 2023-05-15 11:01                                             ` Richard Biener
  2023-05-15 11:10                                               ` Richard Sandiford
  2023-05-15 11:53                                               ` Andre Vieira (lists)
  0 siblings, 2 replies; 53+ messages in thread
From: Richard Biener @ 2023-05-15 11:01 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: Andre Vieira (lists), Richard Biener, gcc-patches

On Mon, 15 May 2023, Richard Sandiford wrote:

> Richard Biener <rguenther@suse.de> writes:
> > On Fri, 12 May 2023, Richard Sandiford wrote:
> >
> >> Richard Biener <rguenther@suse.de> writes:
> >> > On Fri, 12 May 2023, Andre Vieira (lists) wrote:
> >> >
> >> >> I have dealt with, I think..., most of your comments. There's quite a few
> >> >> changes, I think it's all a bit simpler now. I made some other changes to the
> >> >> costing in tree-inline.cc and gimple-range-op.cc in which I try to preserve
> >> >> the same behaviour as we had with the tree codes before. Also added some extra
> >> >> checks to tree-cfg.cc that made sense to me.
> >> >> 
> >> >> I am still regression testing the gimple-range-op change, as that was a last
> >> >> minute change, but the rest survived a bootstrap and regression test on
> >> >> aarch64-unknown-linux-gnu.
> >> >> 
> >> >> cover letter:
> >> >> 
> >> >> This patch replaces the existing tree_code widen_plus and widen_minus
> >> >> patterns with internal_fn versions.
> >> >> 
> >> >> DEF_INTERNAL_OPTAB_WIDENING_HILO_FN and DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
> >> >> are like DEF_INTERNAL_SIGNED_OPTAB_FN and DEF_INTERNAL_OPTAB_FN respectively
> >> >> except they provide convenience wrappers for defining conversions that require
> >> >> a hi/lo split.  Each definition for <NAME> will require optabs for _hi and _lo
> >> >> and each of those will also require a signed and unsigned version in the case
> >> >> of widening. The hi/lo pair is necessary because the widening and narrowing
> >> >> operations take n narrow elements as inputs and return n/2 wide elements as
> >> >> outputs. The 'lo' operation operates on the first n/2 elements of input. The
> >> >> 'hi' operation operates on the second n/2 elements of input. Defining an
> >> >> internal_fn along with hi/lo variations allows a single internal function to
> >> >> be returned from a vect_recog function that will later be expanded to hi/lo.
> >> >> 
> >> >> 
> >> >> For example:
> >> >>   IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO
> >> >> for aarch64:
> >> >>   IFN_VEC_WIDEN_PLUS_HI -> vec_widen_<su>add_hi_<mode> -> (u/s)addl2
> >> >>   IFN_VEC_WIDEN_PLUS_LO -> vec_widen_<su>add_lo_<mode> -> (u/s)addl
> >> >> 
> >> >> This gives the same functionality as the previous WIDEN_PLUS/WIDEN_MINUS tree
> >> >> codes which are expanded into VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI.
> >> >
> >> > What I still don't understand is how we are so narrowly focused on
> >> > HI/LO?  We need a combined scalar IFN for pattern selection (not
> >> > sure why that's now called _HILO, I expected no suffix).  Then there's
> >> > three possibilities the target can implement this:
> >> >
> >> >  1) with a widen_[su]add<mode> instruction - I _think_ that's what
> >> >     RISCV is going to offer since it is a target where vector modes
> >> >     have "padding" (aka you cannot subreg a V2SI to get V4HI).  Instead
> >> >     RVV can do a V4HI to V4SI widening and widening add/subtract
> >> >     using vwadd[u] and vwsub[u] (the HI->SI widening is actually
> >> >     done with a widening add of zero - eh).
> >> >     IIRC GCN is the same here.
> >> 
> >> SVE currently does this too, but the addition and widening are
> >> separate operations.  E.g. in principle there's no reason why
> >> you can't sign-extend one operand, zero-extend the other, and
> >> then add the result together.  Or you could extend them from
> >> different sizes (QI and HI).  All of those are supported
> >> (if the costing allows them).
> >
> > I see.  So why does the target then expose widen_[su]add<mode> at all?
> 
> It shouldn't (need to) do that.  I don't think we should have an optab
> for the unsplit operation.
> 
> At least on SVE, we really want the extensions to be fused with loads
> (where possible) rather than with arithmetic.
> 
> We can still do the widening arithmetic in one go.  It's just that
> fusing with the loads works for the mixed-sign and mixed-size cases,
> and can handle more than just doubling the element size.
> 
> >> If the target has operations to do combined extending and adding (or
> >> whatever), then at the moment we rely on combine to generate them.
> >> 
> >> So I think this case is separate from Andre's work.  The addition
> >> itself is just an ordinary addition, and any widening happens by
> >> vectorising a CONVERT/NOP_EXPR.
> >> 
> >> >  2) with a widen_[su]add{_lo,_hi}<mode> combo - that's what the tree
> >> >     codes currently support (exclusively)
> >> >  3) similar, but widen_[su]add{_even,_odd}<mode>
> >> >
> >> > that said, things like decomposes_to_hilo_fn_p look to paint us into
> >> > a 2) corner without good reason.
> >> 
> >> I suppose one question is: how much of the patch is really specific
> >> to HI/LO, and how much is just grouping two halves together?
> >
> > Yep, that I don't know for sure.
> >
> >>  The nice
> >> thing about the internal-fn grouping macros is that, if (3) is
> >> implemented in future, the structure will strongly encourage even/odd
> >> pairs to be supported for all operations that support hi/lo.  That is,
> >> I would expect the grouping macros to be extended to define even/odd
> >> ifns alongside hi/lo ones, rather than adding separate definitions
> >> for even/odd functions.
> >> 
> >> If so, at least from the internal-fn.* side of things, I think the question
> >> is whether it's OK to stick with hilo names for now, or whether we should
> >> use more forward-looking names.
> >
> > I think for parts that are independent we could use a more
> > forward-looking name.  Maybe _halves?
> 
> Using _halves for the ifn macros sounds good to me FWIW.
> 
> > But I'm also not sure
> > how much of that is really needed (it seems to be tied around
> > optimizing optabs space?)
> 
> Not sure what you mean by "this".  Optabs space shouldn't be a problem
> though.  The optab encoding gives us a full int to play with, and it
> could easily go up to 64 bits if necessary/convenient.
> 
> At least on the internal-fn.* side, the aim is really just to establish
> a regular structure, so that we don't have arbitrary differences between
> different widening operations, or too much cut-&-paste.

Hmm, I'm looking at the need for the std::map and 
internal_fn_hilo_keys_array and internal_fn_hilo_values_array.
The vectorizer pieces contain

+  if (code.is_fn_code ())
+     {
+      internal_fn ifn = as_internal_fn ((combined_fn) code);
+      gcc_assert (decomposes_to_hilo_fn_p (ifn));
+
+      internal_fn lo, hi;
+      lookup_hilo_internal_fn (ifn, &lo, &hi);
+      *code1 = as_combined_fn (lo);
+      *code2 = as_combined_fn (hi);
+      optab1 = lookup_hilo_ifn_optab (lo, !TYPE_UNSIGNED (vectype));
+      optab2 = lookup_hilo_ifn_optab (hi, !TYPE_UNSIGNED (vectype));

so that tries to automatically associate the scalar widening IFN
with the set(s) of IFN pairs we can split to.  But then this
list should be static and there's no need to create a std::map?
Maybe gencfn-macros.cc can be enhanced to output these static
cases?  Or the vectorizer could (as it did previously) simply
open-code the handled cases (I guess since we deal with two
cases only now I'd prefer that).
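
A minimal sketch of what open-coding the two cases might look like is shown
below.  The enum is a stand-in for the real internal_fn type, the _MINUS_LO
and _MINUS_HI enumerators are assumed by analogy with the PLUS ones named
in the cover letter, and split_widening_fn is a hypothetical helper, not a
function from the patch.

```cpp
#include <cassert>

// Stand-in for GCC's internal_fn enum; only the entries needed here.
enum internal_fn
{
  IFN_VEC_WIDEN_PLUS,
  IFN_VEC_WIDEN_PLUS_LO,
  IFN_VEC_WIDEN_PLUS_HI,
  IFN_VEC_WIDEN_MINUS,
  IFN_VEC_WIDEN_MINUS_LO,
  IFN_VEC_WIDEN_MINUS_HI,
  IFN_LAST
};

// Hypothetical helper: if IFN is a combined widening function, store its
// lo/hi halves in *LO and *HI and return true.  Otherwise return false and
// leave *LO and *HI untouched.  No run-time table has to be built.
bool
split_widening_fn (internal_fn ifn, internal_fn *lo, internal_fn *hi)
{
  switch (ifn)
    {
    case IFN_VEC_WIDEN_PLUS:
      *lo = IFN_VEC_WIDEN_PLUS_LO;
      *hi = IFN_VEC_WIDEN_PLUS_HI;
      return true;
    case IFN_VEC_WIDEN_MINUS:
      *lo = IFN_VEC_WIDEN_MINUS_LO;
      *hi = IFN_VEC_WIDEN_MINUS_HI;
      return true;
    default:
      return false;
    }
}
```

With only two handled cases the switch stays short; the trade-off is that
every new widening operation needs another hand-written case.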

Thanks,
Richard.


> Thanks,
> Richard
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)


* Re: [PATCH 2/3] Refactor widen_plus as internal_fn
  2023-05-15 11:01                                             ` Richard Biener
@ 2023-05-15 11:10                                               ` Richard Sandiford
  2023-05-15 11:53                                               ` Andre Vieira (lists)
  1 sibling, 0 replies; 53+ messages in thread
From: Richard Sandiford @ 2023-05-15 11:10 UTC (permalink / raw)
  To: Richard Biener; +Cc: Andre Vieira (lists), Richard Biener, gcc-patches

Richard Biener <rguenther@suse.de> writes:
> On Mon, 15 May 2023, Richard Sandiford wrote:
>
>> Richard Biener <rguenther@suse.de> writes:
>> > But I'm also not sure
>> > how much of that is really needed (it seems to be tied around
>> > optimizing optabs space?)
>> 
>> Not sure what you mean by "this".  Optabs space shouldn't be a problem
>> though.  The optab encoding gives us a full int to play with, and it
>> could easily go up to 64 bits if necessary/convenient.
>> 
>> At least on the internal-fn.* side, the aim is really just to establish
>> a regular structure, so that we don't have arbitrary differences between
>> different widening operations, or too much cut-&-paste.
>
> Hmm, I'm looking at the need for the std::map and 
> internal_fn_hilo_keys_array and internal_fn_hilo_values_array.
> The vectorizer pieces contain
>
> +  if (code.is_fn_code ())
> +     {
> +      internal_fn ifn = as_internal_fn ((combined_fn) code);
> +      gcc_assert (decomposes_to_hilo_fn_p (ifn));
> +
> +      internal_fn lo, hi;
> +      lookup_hilo_internal_fn (ifn, &lo, &hi);
> +      *code1 = as_combined_fn (lo);
> +      *code2 = as_combined_fn (hi);
> +      optab1 = lookup_hilo_ifn_optab (lo, !TYPE_UNSIGNED (vectype));
> +      optab2 = lookup_hilo_ifn_optab (hi, !TYPE_UNSIGNED (vectype));
>
> so that tries to automatically associate the scalar widening IFN
> with the set(s) of IFN pairs we can split to.  But then this
> list should be static and there's no need to create a std::map?
> Maybe gencfn-macros.cc can be enhanced to output these static
> cases?  Or the vectorizer could (as it did previously) simply
> open-code the handled cases (I guess since we deal with two
> cases only now I'd prefer that).

Ah, yeah, I pushed back against that too.  I think it should be possible
to do it using the preprocessor, if the macros are defined appropriately.
But if it isn't possible to do it with macros then I agree that a
generator would be better than initialisation within the compiler.
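
For illustration, one way the preprocessor route could look is an X-macro
list that generates both the enumerators and a static lookup table.  This
is only a sketch under assumptions: WIDENING_HALVES_FNS, halves_entry,
halves_table and lookup_halves are invented names, and the real
internal-fn.def macros are structured differently.

```cpp
#include <cassert>

// Hypothetical master list: one line per combined widening function.
#define WIDENING_HALVES_FNS(X) \
  X (VEC_WIDEN_PLUS) \
  X (VEC_WIDEN_MINUS)

// Each list entry expands to the combined function plus its two halves.
enum internal_fn
{
#define DEF_ENUM(NAME) IFN_##NAME, IFN_##NAME##_LO, IFN_##NAME##_HI,
  WIDENING_HALVES_FNS (DEF_ENUM)
#undef DEF_ENUM
  IFN_LAST
};

struct halves_entry
{
  internal_fn combined, lo, hi;
};

// The same list expands to a table that is fixed at compile time, so
// nothing needs to be initialised when the compiler starts up.
static const halves_entry halves_table[] = {
#define DEF_ENTRY(NAME) { IFN_##NAME, IFN_##NAME##_LO, IFN_##NAME##_HI },
  WIDENING_HALVES_FNS (DEF_ENTRY)
#undef DEF_ENTRY
};

// Return the table entry for IFN, or nullptr if IFN has no halves.
const halves_entry *
lookup_halves (internal_fn ifn)
{
  for (const halves_entry &entry : halves_table)
    if (entry.combined == ifn)
      return &entry;
  return nullptr;
}
```

Adding an operation then means adding one line to the list, and an even/odd
pairing could in principle be added by extending the two expansion macros
rather than by writing separate definitions.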

Thanks,
Richard


* Re: [PATCH 2/3] Refactor widen_plus as internal_fn
  2023-05-15 11:01                                             ` Richard Biener
  2023-05-15 11:10                                               ` Richard Sandiford
@ 2023-05-15 11:53                                               ` Andre Vieira (lists)
  2023-05-15 12:21                                                 ` Richard Biener
  1 sibling, 1 reply; 53+ messages in thread
From: Andre Vieira (lists) @ 2023-05-15 11:53 UTC (permalink / raw)
  To: Richard Biener, Richard Sandiford; +Cc: Richard Biener, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 7818 bytes --]



On 15/05/2023 12:01, Richard Biener wrote:
> On Mon, 15 May 2023, Richard Sandiford wrote:
> 
>> Richard Biener <rguenther@suse.de> writes:
>>> On Fri, 12 May 2023, Richard Sandiford wrote:
>>>
>>>> Richard Biener <rguenther@suse.de> writes:
>>>>> On Fri, 12 May 2023, Andre Vieira (lists) wrote:
>>>>>
>>>>>> I have dealt with, I think..., most of your comments. There's quite a few
>>>>>> changes, I think it's all a bit simpler now. I made some other changes to the
>>>>>> costing in tree-inline.cc and gimple-range-op.cc in which I try to preserve
>>>>>> the same behaviour as we had with the tree codes before. Also added some extra
>>>>>> checks to tree-cfg.cc that made sense to me.
>>>>>>
>>>>>> I am still regression testing the gimple-range-op change, as that was a last
>>>>>> minute change, but the rest survived a bootstrap and regression test on
>>>>>> aarch64-unknown-linux-gnu.
>>>>>>
>>>>>> cover letter:
>>>>>>
>>>>>> This patch replaces the existing tree_code widen_plus and widen_minus
>>>>>> patterns with internal_fn versions.
>>>>>>
>>>>>> DEF_INTERNAL_OPTAB_WIDENING_HILO_FN and DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
>>>>>> are like DEF_INTERNAL_SIGNED_OPTAB_FN and DEF_INTERNAL_OPTAB_FN respectively
>>>>>> except they provide convenience wrappers for defining conversions that require
>>>>>> a hi/lo split.  Each definition for <NAME> will require optabs for _hi and _lo
>>>>>> and each of those will also require a signed and unsigned version in the case
>>>>>> of widening. The hi/lo pair is necessary because the widening and narrowing
>>>>>> operations take n narrow elements as inputs and return n/2 wide elements as
>>>>>> outputs. The 'lo' operation operates on the first n/2 elements of input. The
>>>>>> 'hi' operation operates on the second n/2 elements of input. Defining an
>>>>>> internal_fn along with hi/lo variations allows a single internal function to
>>>>>> be returned from a vect_recog function that will later be expanded to hi/lo.
>>>>>>
>>>>>>
>>>>>> For example:
>>>>>>   IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO
>>>>>> for aarch64:
>>>>>>   IFN_VEC_WIDEN_PLUS_HI -> vec_widen_<su>add_hi_<mode> -> (u/s)addl2
>>>>>>   IFN_VEC_WIDEN_PLUS_LO -> vec_widen_<su>add_lo_<mode> -> (u/s)addl
>>>>>>
>>>>>> This gives the same functionality as the previous WIDEN_PLUS/WIDEN_MINUS tree
>>>>>> codes which are expanded into VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI.
>>>>>
>>>>> What I still don't understand is how we are so narrowly focused on
>>>>> HI/LO?  We need a combined scalar IFN for pattern selection (not
>>>>> sure why that's now called _HILO, I expected no suffix).  Then there's
>>>>> three possibilities the target can implement this:
>>>>>
>>>>>   1) with a widen_[su]add<mode> instruction - I _think_ that's what
>>>>>      RISCV is going to offer since it is a target where vector modes
>>>>>      have "padding" (aka you cannot subreg a V2SI to get V4HI).  Instead
>>>>>      RVV can do a V4HI to V4SI widening and widening add/subtract
>>>>>      using vwadd[u] and vwsub[u] (the HI->SI widening is actually
>>>>>      done with a widening add of zero - eh).
>>>>>      IIRC GCN is the same here.
>>>>
>>>> SVE currently does this too, but the addition and widening are
>>>> separate operations.  E.g. in principle there's no reason why
>>>> you can't sign-extend one operand, zero-extend the other, and
>>>> then add the result together.  Or you could extend them from
>>>> different sizes (QI and HI).  All of those are supported
>>>> (if the costing allows them).
>>>
>>> I see.  So why does the target then expose widen_[su]add<mode> at all?
>>
>> It shouldn't (need to) do that.  I don't think we should have an optab
>> for the unsplit operation.
>>
>> At least on SVE, we really want the extensions to be fused with loads
>> (where possible) rather than with arithmetic.
>>
>> We can still do the widening arithmetic in one go.  It's just that
>> fusing with the loads works for the mixed-sign and mixed-size cases,
>> and can handle more than just doubling the element size.
>>
>>>> If the target has operations to do combined extending and adding (or
>>>> whatever), then at the moment we rely on combine to generate them.
>>>>
>>>> So I think this case is separate from Andre's work.  The addition
>>>> itself is just an ordinary addition, and any widening happens by
>>>> vectorising a CONVERT/NOP_EXPR.
>>>>
>>>>>   2) with a widen_[su]add{_lo,_hi}<mode> combo - that's what the tree
>>>>>      codes currently support (exclusively)
>>>>>   3) similar, but widen_[su]add{_even,_odd}<mode>
>>>>>
>>>>> that said, things like decomposes_to_hilo_fn_p look to paint us into
>>>>> a 2) corner without good reason.
>>>>
>>>> I suppose one question is: how much of the patch is really specific
>>>> to HI/LO, and how much is just grouping two halves together?
>>>
>>> Yep, that I don't know for sure.
>>>
>>>>   The nice
>>>> thing about the internal-fn grouping macros is that, if (3) is
>>>> implemented in future, the structure will strongly encourage even/odd
>>>> pairs to be supported for all operations that support hi/lo.  That is,
>>>> I would expect the grouping macros to be extended to define even/odd
>>>> ifns alongside hi/lo ones, rather than adding separate definitions
>>>> for even/odd functions.
>>>>
>>>> If so, at least from the internal-fn.* side of things, I think the question
>>>> is whether it's OK to stick with hilo names for now, or whether we should
>>>> use more forward-looking names.
>>>
>>> I think for parts that are independent we could use a more
>>> forward-looking name.  Maybe _halves?
>>
>> Using _halves for the ifn macros sounds good to me FWIW.
>>
>>> But I'm also not sure
>>> how much of that is really needed (it seems to be tied around
>>> optimizing optabs space?)
>>
>> Not sure what you mean by "this".  Optabs space shouldn't be a problem
>> though.  The optab encoding gives us a full int to play with, and it
>> could easily go up to 64 bits if necessary/convenient.
>>
>> At least on the internal-fn.* side, the aim is really just to establish
>> a regular structure, so that we don't have arbitrary differences between
>> different widening operations, or too much cut-&-paste.
> 
> Hmm, I'm looking at the need for the std::map and
> internal_fn_hilo_keys_array and internal_fn_hilo_values_array.
> The vectorizer pieces contain
> 
> +  if (code.is_fn_code ())
> +     {
> +      internal_fn ifn = as_internal_fn ((combined_fn) code);
> +      gcc_assert (decomposes_to_hilo_fn_p (ifn));
> +
> +      internal_fn lo, hi;
> +      lookup_hilo_internal_fn (ifn, &lo, &hi);
> +      *code1 = as_combined_fn (lo);
> +      *code2 = as_combined_fn (hi);
> +      optab1 = lookup_hilo_ifn_optab (lo, !TYPE_UNSIGNED (vectype));
> +      optab2 = lookup_hilo_ifn_optab (hi, !TYPE_UNSIGNED (vectype));
> 
> so that tries to automatically associate the scalar widening IFN
> with the set(s) of IFN pairs we can split to.  But then this
> list should be static and there's no need to create a std::map?
> Maybe gencfn-macros.cc can be enhanced to output these static
> cases?  Or the vectorizer could (as it did previously) simply
> open-code the handled cases (I guess since we deal with two
> cases only now I'd prefer that).
> 
> Thanks,
> Richard.
> 
> 
>> Thanks,
>> Richard
>>
> 
The last patch I uploaded no longer has the std::map, nor
internal_fn_hilo_keys_array and internal_fn_hilo_values_array (I've
attached it again).
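
For reference, the lookup now relies only on the enumerator order that the
grouping macro produces.  A minimal standalone sketch of that idea follows;
the toy_* names and the WIDENING_FNS list are hypothetical stand-ins for
internal-fn.def and are not part of the patch:

```cpp
#include <cassert>

// Hypothetical stand-in for internal-fn.def: each grouped entry expands to
// three consecutive enumerators, NAME_HILO, NAME_LO and NAME_HI.
#define WIDENING_FNS(X) \
  X (WIDEN_PLUS)        \
  X (WIDEN_MINUS)

enum toy_fn
{
#define DEF_GROUP(NAME) NAME##_HILO, NAME##_LO, NAME##_HI,
  WIDENING_FNS (DEF_GROUP)
#undef DEF_GROUP
  TOY_FN_LAST
};

// Return true if FN is one of the combined _HILO enumerators.
static bool
toy_decomposes_to_hilo_p (toy_fn fn)
{
  switch (fn)
    {
#define DEF_GROUP(NAME) case NAME##_HILO:
      WIDENING_FNS (DEF_GROUP)
#undef DEF_GROUP
      return true;
    default:
      return false;
    }
}

// Because _LO and _HI directly follow _HILO in the enum, no table or map
// is needed: the halves are found by offset.
static void
toy_lookup_hilo (toy_fn fn, toy_fn *lo, toy_fn *hi)
{
  assert (toy_decomposes_to_hilo_p (fn));
  *lo = toy_fn (fn + 1);
  *hi = toy_fn (fn + 2);
}
```

The offsets only hold as long as every grouped definition emits its
enumerators in the same HILO, LO, HI order.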

I'm not sure I understand the _halves suggestion.  Do you mean that where
I had _hilo or _HILO before, we rename that to _halves/_HALVES, so that it
can later represent both the _hi/_lo separation and _even/_odd?
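
To check that I have the two splits right, here is a toy sketch of how I
read them.  It assumes "lo" means the first N/2 lanes and ignores
endianness; widen_add and the start/stride parameters are made up for
illustration and do not exist in GCC:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t N = 8;
using narrow_vec = std::array<std::uint8_t, N>;
using wide_vec = std::array<std::uint16_t, N / 2>;

// Widening add of N/2 lanes picked by START and STRIDE:
//   lo   = (0, 1)      hi  = (N/2, 1)
//   even = (0, 2)      odd = (1, 2)
// Either pair of calls covers every input lane exactly once, which is why
// a single "_halves" name could describe both splits.
static wide_vec
widen_add (const narrow_vec &a, const narrow_vec &b,
	   std::size_t start, std::size_t stride)
{
  wide_vec r{};
  for (std::size_t i = 0; i < N / 2; ++i)
    {
      std::size_t lane = start + i * stride;
      // The sum is computed in int, so it cannot wrap at 8 bits.
      r[i] = static_cast<std::uint16_t> (a[lane] + b[lane]);
    }
  return r;
}

// Small self-check showing which input lanes feed each result.
static void
check_example ()
{
  const narrow_vec a = {250, 1, 2, 3, 4, 5, 6, 7};
  const narrow_vec b = {10, 1, 1, 1, 1, 1, 1, 1};
  assert (widen_add (a, b, 0, 1)[0] == 260);	 // lo: lane 0, widened
  assert (widen_add (a, b, N / 2, 1)[0] == 5);	 // hi: lane 4
  assert (widen_add (a, b, 0, 2)[1] == 3);	 // even: lane 2
  assert (widen_add (a, b, 1, 2)[0] == 2);	 // odd: lane 1
}
```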

And am I correct to assume we are just giving up on the
INTERNAL_OPTAB_FN idea for 1)?

Kind regards,
Andre

[-- Attachment #2: ifn1v3.patch --]
[-- Type: text/plain, Size: 34240 bytes --]

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index bfc98a8d943467b33390defab9682f44efab5907..ffbbecb9409e1c2835d658c2a8855cd0e955c0f2 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -4626,7 +4626,7 @@
   [(set_attr "type" "neon_<ADDSUB:optab>_long")]
 )
 
-(define_expand "vec_widen_<su>addl_lo_<mode>"
+(define_expand "vec_widen_<su>add_lo_<mode>"
   [(match_operand:<VWIDE> 0 "register_operand")
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand"))
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))]
@@ -4638,7 +4638,7 @@
   DONE;
 })
 
-(define_expand "vec_widen_<su>addl_hi_<mode>"
+(define_expand "vec_widen_<su>add_hi_<mode>"
   [(match_operand:<VWIDE> 0 "register_operand")
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand"))
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))]
@@ -4650,7 +4650,7 @@
   DONE;
 })
 
-(define_expand "vec_widen_<su>subl_lo_<mode>"
+(define_expand "vec_widen_<su>sub_lo_<mode>"
   [(match_operand:<VWIDE> 0 "register_operand")
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand"))
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))]
@@ -4662,7 +4662,7 @@
   DONE;
 })
 
-(define_expand "vec_widen_<su>subl_hi_<mode>"
+(define_expand "vec_widen_<su>sub_hi_<mode>"
   [(match_operand:<VWIDE> 0 "register_operand")
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand"))
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))]
diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi
index 8b2882da4fe7da07d22b4e5384d049ba7d3907bf..0fd7e6cce8bbd4ecb8027b702722adcf6c32eb55 100644
--- a/gcc/doc/generic.texi
+++ b/gcc/doc/generic.texi
@@ -1811,6 +1811,10 @@ a value from @code{enum annot_expr_kind}, the third is an @code{INTEGER_CST}.
 @tindex VEC_RSHIFT_EXPR
 @tindex VEC_WIDEN_MULT_HI_EXPR
 @tindex VEC_WIDEN_MULT_LO_EXPR
+@tindex IFN_VEC_WIDEN_PLUS_HI
+@tindex IFN_VEC_WIDEN_PLUS_LO
+@tindex IFN_VEC_WIDEN_MINUS_HI
+@tindex IFN_VEC_WIDEN_MINUS_LO
 @tindex VEC_WIDEN_PLUS_HI_EXPR
 @tindex VEC_WIDEN_PLUS_LO_EXPR
 @tindex VEC_WIDEN_MINUS_HI_EXPR
@@ -1861,6 +1865,33 @@ vector of @code{N/2} products. In the case of @code{VEC_WIDEN_MULT_LO_EXPR} the
 low @code{N/2} elements of the two vector are multiplied to produce the
 vector of @code{N/2} products.
 
+@item IFN_VEC_WIDEN_PLUS_HI
+@itemx IFN_VEC_WIDEN_PLUS_LO
+These internal functions represent widening vector addition of the high and low
+parts of the two input vectors, respectively.  Their operands are vectors that
+contain the same number of elements (@code{N}) of the same integral type. The
+result is a vector that contains half as many elements, of an integral type
+whose size is twice as wide.  In the case of @code{IFN_VEC_WIDEN_PLUS_HI} the
+high @code{N/2} elements of the two vectors are added to produce the vector of
+@code{N/2} sums.  In the case of @code{IFN_VEC_WIDEN_PLUS_LO} the low
+@code{N/2} elements of the two vectors are added to produce the vector of
+@code{N/2} sums.
+
+@item IFN_VEC_WIDEN_MINUS_HI
+@itemx IFN_VEC_WIDEN_MINUS_LO
+These internal functions represent widening vector subtraction of the high and
+low parts of the two input vectors, respectively.  Their operands are vectors
+that contain the same number of elements (@code{N}) of the same integral type.
+The high/low elements of the second vector are subtracted from the high/low
+elements of the first. The result is a vector that contains half as many
+elements, of an integral type whose size is twice as wide.  In the case of
+@code{IFN_VEC_WIDEN_MINUS_HI} the high @code{N/2} elements of the second
+vector are subtracted from the high @code{N/2} of the first to produce the
+vector of @code{N/2} differences.  In the case of
+@code{IFN_VEC_WIDEN_MINUS_LO} the low @code{N/2} elements of the second
+vector are subtracted from the low @code{N/2} of the first to produce the
+vector of @code{N/2} differences.
+
 @item VEC_WIDEN_PLUS_HI_EXPR
 @itemx VEC_WIDEN_PLUS_LO_EXPR
 These nodes represent widening vector addition of the high and low parts of
diff --git a/gcc/gimple-range-op.cc b/gcc/gimple-range-op.cc
index 594bd3043f0e944299ddfff219f757ef15a3dd61..66636d82df27626e7911efd0cb8526921b39633f 100644
--- a/gcc/gimple-range-op.cc
+++ b/gcc/gimple-range-op.cc
@@ -1187,6 +1187,7 @@ gimple_range_op_handler::maybe_non_standard ()
 {
   range_operator *signed_op = ptr_op_widen_mult_signed;
   range_operator *unsigned_op = ptr_op_widen_mult_unsigned;
+  bool signed1, signed2, signed_ret;
   if (gimple_code (m_stmt) == GIMPLE_ASSIGN)
     switch (gimple_assign_rhs_code (m_stmt))
       {
@@ -1202,32 +1203,55 @@ gimple_range_op_handler::maybe_non_standard ()
 	  m_op1 = gimple_assign_rhs1 (m_stmt);
 	  m_op2 = gimple_assign_rhs2 (m_stmt);
 	  tree ret = gimple_assign_lhs (m_stmt);
-	  bool signed1 = TYPE_SIGN (TREE_TYPE (m_op1)) == SIGNED;
-	  bool signed2 = TYPE_SIGN (TREE_TYPE (m_op2)) == SIGNED;
-	  bool signed_ret = TYPE_SIGN (TREE_TYPE (ret)) == SIGNED;
-
-	  /* Normally these operands should all have the same sign, but
-	     some passes and violate this by taking mismatched sign args.  At
-	     the moment the only one that's possible is mismatch inputs and
-	     unsigned output.  Once ranger supports signs for the operands we
-	     can properly fix it,  for now only accept the case we can do
-	     correctly.  */
-	  if ((signed1 ^ signed2) && signed_ret)
-	    return;
-
-	  m_valid = true;
-	  if (signed2 && !signed1)
-	    std::swap (m_op1, m_op2);
-
-	  if (signed1 || signed2)
-	    m_int = signed_op;
-	  else
-	    m_int = unsigned_op;
+	  signed1 = TYPE_SIGN (TREE_TYPE (m_op1)) == SIGNED;
+	  signed2 = TYPE_SIGN (TREE_TYPE (m_op2)) == SIGNED;
+	  signed_ret = TYPE_SIGN (TREE_TYPE (ret)) == SIGNED;
 	  break;
 	}
 	default:
-	  break;
+	  return;
       }
+  else if (gimple_code (m_stmt) == GIMPLE_CALL
+      && gimple_call_internal_p (m_stmt)
+      && gimple_get_lhs (m_stmt) != NULL_TREE)
+    switch (gimple_call_internal_fn (m_stmt))
+      {
+      case IFN_VEC_WIDEN_PLUS_LO:
+      case IFN_VEC_WIDEN_PLUS_HI:
+	  {
+	    signed_op = ptr_op_widen_plus_signed;
+	    unsigned_op = ptr_op_widen_plus_unsigned;
+	    m_valid = false;
+	    m_op1 = gimple_call_arg (m_stmt, 0);
+	    m_op2 = gimple_call_arg (m_stmt, 1);
+	    tree ret = gimple_get_lhs (m_stmt);
+	    signed1 = TYPE_SIGN (TREE_TYPE (m_op1)) == SIGNED;
+	    signed2 = TYPE_SIGN (TREE_TYPE (m_op2)) == SIGNED;
+	    signed_ret = TYPE_SIGN (TREE_TYPE (ret)) == SIGNED;
+	    break;
+	  }
+      default:
+	return;
+      }
+  else
+    return;
+
+    /* Normally these operands should all have the same sign, but some passes
+       violate this by taking mismatched sign args.  At the moment the only
+       case that's possible is mismatched inputs and unsigned output.  Once
+       ranger supports signs for the operands we can properly fix it; for now
+       only accept the case we can do correctly.  */
+    if ((signed1 ^ signed2) && signed_ret)
+      return;
+
+    m_valid = true;
+    if (signed2 && !signed1)
+      std::swap (m_op1, m_op2);
+
+    if (signed1 || signed2)
+      m_int = signed_op;
+    else
+      m_int = unsigned_op;
 }
 
 // Set up a gimple_range_op_handler for any built in function which can be
diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 5c9da73ea11f8060b18dcf513599c9694fa4f2ad..1acea5ae33046b70de247b1688aea874d9956abc 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -90,6 +90,19 @@ lookup_internal_fn (const char *name)
   return entry ? *entry : IFN_LAST;
 }
 
+/*  Given an internal_fn IFN that is a HILO function, return its corresponding
+    LO and HI internal_fns.  */
+
+extern void
+lookup_hilo_internal_fn (internal_fn ifn, internal_fn *lo, internal_fn *hi)
+{
+  gcc_assert (decomposes_to_hilo_fn_p (ifn));
+
+  *lo = internal_fn (ifn + 1);
+  *hi = internal_fn (ifn + 2);
+}
+
+
 /* Fnspec of each internal function, indexed by function number.  */
 const_tree internal_fn_fnspec_array[IFN_LAST + 1];
 
@@ -137,7 +150,16 @@ const direct_internal_fn_info direct_internal_fn_array[IFN_LAST + 1] = {
 #define DEF_INTERNAL_OPTAB_FN(CODE, FLAGS, OPTAB, TYPE) TYPE##_direct,
 #define DEF_INTERNAL_SIGNED_OPTAB_FN(CODE, FLAGS, SELECTOR, SIGNED_OPTAB, \
 				     UNSIGNED_OPTAB, TYPE) TYPE##_direct,
+#undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN
+#undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
+#define DEF_INTERNAL_OPTAB_WIDENING_HILO_FN(CODE, FLAGS, SELECTOR, SIGNED_OPTAB, \
+					    UNSIGNED_OPTAB, TYPE)		  \
+TYPE##_direct, TYPE##_direct, TYPE##_direct,
+#define DEF_INTERNAL_OPTAB_NARROWING_HILO_FN(CODE, FLAGS, OPTAB, TYPE)	\
+TYPE##_direct, TYPE##_direct, TYPE##_direct,
 #include "internal-fn.def"
+#undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN
+#undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
   not_direct
 };
 
@@ -3852,7 +3874,7 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types,
 
 /* Return the optab used by internal function FN.  */
 
-static optab
+optab
 direct_internal_fn_optab (internal_fn fn, tree_pair types)
 {
   switch (fn)
@@ -3971,6 +3993,9 @@ commutative_binary_fn_p (internal_fn fn)
     case IFN_UBSAN_CHECK_MUL:
     case IFN_ADD_OVERFLOW:
     case IFN_MUL_OVERFLOW:
+    case IFN_VEC_WIDEN_PLUS_HILO:
+    case IFN_VEC_WIDEN_PLUS_LO:
+    case IFN_VEC_WIDEN_PLUS_HI:
       return true;
 
     default:
@@ -4044,6 +4069,88 @@ first_commutative_argument (internal_fn fn)
     }
 }
 
+/* Return true if this CODE describes an internal_fn that returns a vector with
+   elements twice as wide as the element size of the input vectors.  */
+
+bool
+widening_fn_p (code_helper code)
+{
+  if (!code.is_fn_code ())
+    return false;
+
+  if (!internal_fn_p ((combined_fn) code))
+    return false;
+
+  internal_fn fn = as_internal_fn ((combined_fn) code);
+  switch (fn)
+    {
+    #undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN
+    #define DEF_INTERNAL_OPTAB_WIDENING_HILO_FN(NAME, F, S, SO, UO, T) \
+    case IFN_##NAME##_HILO:\
+    case IFN_##NAME##_HI: \
+    case IFN_##NAME##_LO: \
+      return true;
+    #include "internal-fn.def"
+    #undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN
+
+    default:
+      return false;
+    }
+}
+
+/* Return true if this CODE describes an internal_fn that returns a vector with
+   elements half as wide as the elements of the input vectors.  */
+
+bool
+narrowing_fn_p (code_helper code)
+{
+  if (!code.is_fn_code ())
+    return false;
+
+  if (!internal_fn_p ((combined_fn) code))
+    return false;
+
+  internal_fn fn = as_internal_fn ((combined_fn) code);
+  switch (fn)
+    {
+    #undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
+    #define DEF_INTERNAL_OPTAB_NARROWING_HILO_FN(NAME, F, O, T) \
+    case IFN_##NAME##_HILO:\
+    case IFN_##NAME##_HI: \
+    case IFN_##NAME##_LO: \
+      return true;
+    #include "internal-fn.def"
+    #undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
+
+    default:
+      return false;
+    }
+}
+
+/* Return true if FN decomposes to _hi and _lo IFN.  */
+
+bool
+decomposes_to_hilo_fn_p (internal_fn fn)
+{
+  switch (fn)
+    {
+    #undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN
+    #define DEF_INTERNAL_OPTAB_WIDENING_HILO_FN(NAME, F, S, SO, UO, T) \
+    case IFN_##NAME##_HILO:\
+      return true;
+    #undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
+    #define DEF_INTERNAL_OPTAB_NARROWING_HILO_FN(NAME, F, O, T) \
+    case IFN_##NAME##_HILO:\
+      return true;
+    #include "internal-fn.def"
+    #undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN
+    #undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
+
+    default:
+      return false;
+    }
+}
+
 /* Return true if IFN_SET_EDOM is supported.  */
 
 bool
@@ -4071,7 +4178,33 @@ set_edom_supported_p (void)
     optab which_optab = direct_internal_fn_optab (fn, types);		\
     expand_##TYPE##_optab_fn (fn, stmt, which_optab);			\
   }
+#define DEF_INTERNAL_OPTAB_WIDENING_HILO_FN(CODE, FLAGS, SELECTOR,	    \
+					    SIGNED_OPTAB, UNSIGNED_OPTAB,   \
+					    TYPE)			    \
+  static void								    \
+  expand_##CODE##_HILO (internal_fn fn ATTRIBUTE_UNUSED,		    \
+			gcall *stmt ATTRIBUTE_UNUSED)			    \
+  {									    \
+    gcc_unreachable ();							    \
+  }									    \
+  DEF_INTERNAL_SIGNED_OPTAB_FN(CODE##_HI, FLAGS, SELECTOR, SIGNED_OPTAB,    \
+			       UNSIGNED_OPTAB, TYPE)			    \
+  DEF_INTERNAL_SIGNED_OPTAB_FN(CODE##_LO, FLAGS, SELECTOR, SIGNED_OPTAB,    \
+			       UNSIGNED_OPTAB, TYPE)
+#define DEF_INTERNAL_OPTAB_NARROWING_HILO_FN(CODE, FLAGS, OPTAB, TYPE)	\
+  static void								\
+  expand_##CODE##_HILO (internal_fn fn ATTRIBUTE_UNUSED,		\
+			gcall *stmt ATTRIBUTE_UNUSED)			\
+  {									\
+    gcc_unreachable ();							\
+  }									\
+  DEF_INTERNAL_OPTAB_FN(CODE##_LO, FLAGS, OPTAB, TYPE)			\
+  DEF_INTERNAL_OPTAB_FN(CODE##_HI, FLAGS, OPTAB, TYPE)
 #include "internal-fn.def"
+#undef DEF_INTERNAL_OPTAB_FN
+#undef DEF_INTERNAL_SIGNED_OPTAB_FN
+#undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN
+#undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
 
 /* Routines to expand each internal function, indexed by function number.
    Each routine has the prototype:
@@ -4080,6 +4213,7 @@ set_edom_supported_p (void)
 
    where STMT is the statement that performs the call. */
 static void (*const internal_fn_expanders[]) (internal_fn, gcall *) = {
+
 #define DEF_INTERNAL_FN(CODE, FLAGS, FNSPEC) expand_##CODE,
 #include "internal-fn.def"
   0
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 7fe742c2ae713e7152ab05cfdfba86e4e0aa3456..012dd323b86dd7cfcc5c13d3a2bb2a453937155d 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -85,6 +85,13 @@ along with GCC; see the file COPYING3.  If not see
    says that the function extends the C-level BUILT_IN_<NAME>{,L,LL,IMAX}
    group of functions to any integral mode (including vector modes).
 
+   DEF_INTERNAL_OPTAB_WIDENING_HILO_FN and
+   DEF_INTERNAL_OPTAB_NARROWING_HILO_FN are convenience wrappers for
+   operations that require a hi/lo split.  Each defines IFN_<NAME>_HILO,
+   IFN_<NAME>_LO and IFN_<NAME>_HI; the _LO and _HI functions use optabs
+   <OPTAB>_lo and <OPTAB>_hi (signed and unsigned ones when widening).
+
+
    Each entry must have a corresponding expander of the form:
 
      void expand_NAME (gimple_call stmt)
@@ -123,6 +130,20 @@ along with GCC; see the file COPYING3.  If not see
   DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE)
 #endif
 
+#ifndef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN
+#define DEF_INTERNAL_OPTAB_WIDENING_HILO_FN(NAME, FLAGS, SELECTOR, SOPTAB, UOPTAB, TYPE) \
+  DEF_INTERNAL_FN (NAME##_HILO, FLAGS | ECF_LEAF, NULL) \
+  DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _LO, FLAGS, SELECTOR, SOPTAB##_lo, UOPTAB##_lo, TYPE) \
+  DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _HI, FLAGS, SELECTOR, SOPTAB##_hi, UOPTAB##_hi, TYPE)
+#endif
+
+#ifndef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
+#define DEF_INTERNAL_OPTAB_NARROWING_HILO_FN(NAME, FLAGS, OPTAB, TYPE) \
+  DEF_INTERNAL_FN (NAME##_HILO, FLAGS | ECF_LEAF, NULL) \
+  DEF_INTERNAL_OPTAB_FN (NAME ## _LO, FLAGS, OPTAB##_lo, TYPE) \
+  DEF_INTERNAL_OPTAB_FN (NAME ## _HI, FLAGS, OPTAB##_hi, TYPE)
+#endif
+
 DEF_INTERNAL_OPTAB_FN (MASK_LOAD, ECF_PURE, maskload, mask_load)
 DEF_INTERNAL_OPTAB_FN (LOAD_LANES, ECF_CONST, vec_load_lanes, load_lanes)
 DEF_INTERNAL_OPTAB_FN (MASK_LOAD_LANES, ECF_PURE,
@@ -315,6 +336,16 @@ DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT270, ECF_CONST, cadd270, binary)
 DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL, ECF_CONST, cmul, binary)
 DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL_CONJ, ECF_CONST, cmul_conj, binary)
 DEF_INTERNAL_OPTAB_FN (VEC_ADDSUB, ECF_CONST, vec_addsub, binary)
+DEF_INTERNAL_OPTAB_WIDENING_HILO_FN (VEC_WIDEN_PLUS,
+				     ECF_CONST | ECF_NOTHROW,
+				     first,
+				     vec_widen_sadd, vec_widen_uadd,
+				     binary)
+DEF_INTERNAL_OPTAB_WIDENING_HILO_FN (VEC_WIDEN_MINUS,
+				     ECF_CONST | ECF_NOTHROW,
+				     first,
+				     vec_widen_ssub, vec_widen_usub,
+				     binary)
 DEF_INTERNAL_OPTAB_FN (VEC_FMADDSUB, ECF_CONST, vec_fmaddsub, ternary)
 DEF_INTERNAL_OPTAB_FN (VEC_FMSUBADD, ECF_CONST, vec_fmsubadd, ternary)
 
diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h
index 08922ed4254898f5fffca3f33973e96ed9ce772f..8ba07d6d1338e75bc5a451d9e403112a608f3ea2 100644
--- a/gcc/internal-fn.h
+++ b/gcc/internal-fn.h
@@ -20,6 +20,10 @@ along with GCC; see the file COPYING3.  If not see
 #ifndef GCC_INTERNAL_FN_H
 #define GCC_INTERNAL_FN_H
 
+#include "insn-codes.h"
+#include "insn-opinit.h"
+
+
 /* INTEGER_CST values for IFN_UNIQUE function arg-0.
 
    UNSPEC: Undifferentiated UNIQUE.
@@ -112,6 +116,8 @@ internal_fn_name (enum internal_fn fn)
 }
 
 extern internal_fn lookup_internal_fn (const char *);
+extern void lookup_hilo_internal_fn (internal_fn, internal_fn *, internal_fn *);
+extern optab direct_internal_fn_optab (internal_fn, tree_pair);
 
 /* Return the ECF_* flags for function FN.  */
 
@@ -210,6 +216,9 @@ extern bool commutative_binary_fn_p (internal_fn);
 extern bool commutative_ternary_fn_p (internal_fn);
 extern int first_commutative_argument (internal_fn);
 extern bool associative_binary_fn_p (internal_fn);
+extern bool widening_fn_p (code_helper);
+extern bool narrowing_fn_p (code_helper);
+extern bool decomposes_to_hilo_fn_p (internal_fn);
 
 extern bool set_edom_supported_p (void);
 
diff --git a/gcc/optabs.cc b/gcc/optabs.cc
index c8e39c82d57a7d726e7da33d247b80f32ec9236c..5a08d91e550b2d92e9572211f811fdba99a33a38 100644
--- a/gcc/optabs.cc
+++ b/gcc/optabs.cc
@@ -1314,7 +1314,15 @@ commutative_optab_p (optab binoptab)
 	  || binoptab == smul_widen_optab
 	  || binoptab == umul_widen_optab
 	  || binoptab == smul_highpart_optab
-	  || binoptab == umul_highpart_optab);
+	  || binoptab == umul_highpart_optab
+	  || binoptab == vec_widen_saddl_hi_optab
+	  || binoptab == vec_widen_saddl_lo_optab
+	  || binoptab == vec_widen_uaddl_hi_optab
+	  || binoptab == vec_widen_uaddl_lo_optab
+	  || binoptab == vec_widen_sadd_hi_optab
+	  || binoptab == vec_widen_sadd_lo_optab
+	  || binoptab == vec_widen_uadd_hi_optab
+	  || binoptab == vec_widen_uadd_lo_optab);
 }
 
 /* X is to be used in mode MODE as operand OPN to BINOPTAB.  If we're
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 695f5911b300c9ca5737de9be809fa01aabe5e01..16d121722c8c5723d9b164f5a2c616dc7ec143de 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -410,6 +410,10 @@ OPTAB_D (vec_widen_ssubl_hi_optab, "vec_widen_ssubl_hi_$a")
 OPTAB_D (vec_widen_ssubl_lo_optab, "vec_widen_ssubl_lo_$a")
 OPTAB_D (vec_widen_saddl_hi_optab, "vec_widen_saddl_hi_$a")
 OPTAB_D (vec_widen_saddl_lo_optab, "vec_widen_saddl_lo_$a")
+OPTAB_D (vec_widen_ssub_hi_optab, "vec_widen_ssub_hi_$a")
+OPTAB_D (vec_widen_ssub_lo_optab, "vec_widen_ssub_lo_$a")
+OPTAB_D (vec_widen_sadd_hi_optab, "vec_widen_sadd_hi_$a")
+OPTAB_D (vec_widen_sadd_lo_optab, "vec_widen_sadd_lo_$a")
 OPTAB_D (vec_widen_sshiftl_hi_optab, "vec_widen_sshiftl_hi_$a")
 OPTAB_D (vec_widen_sshiftl_lo_optab, "vec_widen_sshiftl_lo_$a")
 OPTAB_D (vec_widen_umult_even_optab, "vec_widen_umult_even_$a")
@@ -422,6 +426,10 @@ OPTAB_D (vec_widen_usubl_hi_optab, "vec_widen_usubl_hi_$a")
 OPTAB_D (vec_widen_usubl_lo_optab, "vec_widen_usubl_lo_$a")
 OPTAB_D (vec_widen_uaddl_hi_optab, "vec_widen_uaddl_hi_$a")
 OPTAB_D (vec_widen_uaddl_lo_optab, "vec_widen_uaddl_lo_$a")
+OPTAB_D (vec_widen_usub_hi_optab, "vec_widen_usub_hi_$a")
+OPTAB_D (vec_widen_usub_lo_optab, "vec_widen_usub_lo_$a")
+OPTAB_D (vec_widen_uadd_hi_optab, "vec_widen_uadd_hi_$a")
+OPTAB_D (vec_widen_uadd_lo_optab, "vec_widen_uadd_lo_$a")
 OPTAB_D (vec_addsub_optab, "vec_addsub$a3")
 OPTAB_D (vec_fmaddsub_optab, "vec_fmaddsub$a4")
 OPTAB_D (vec_fmsubadd_optab, "vec_fmsubadd$a4")
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
index 220bd9352a4c7acd2e3713e441d74898d3e92b30..7037673d32bd780e1c9b58a51e58e2bac3b30b7e 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
+++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O3 -save-temps" } */
+/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */
 #include <stdint.h>
 #include <string.h>
 
@@ -86,6 +86,8 @@ main()
     return 0;
 }
 
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_LO" "vect"   } } */
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_HI" "vect"   } } */
 /* { dg-final { scan-assembler-times {\tuaddl\t} 1} } */
 /* { dg-final { scan-assembler-times {\tuaddl2\t} 1} } */
 /* { dg-final { scan-assembler-times {\tsaddl\t} 1} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c
index a2bed63affbd091977df95a126da1f5b8c1d41d2..83bc1edb6105f47114b665e24a13e6194b2179a2 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c
+++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O3 -save-temps" } */
+/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */
 #include <stdint.h>
 #include <string.h>
 
@@ -86,6 +86,8 @@ main()
     return 0;
 }
 
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_LO" "vect"   } } */
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_HI" "vect"   } } */
 /* { dg-final { scan-assembler-times {\tusubl\t} 1} } */
 /* { dg-final { scan-assembler-times {\tusubl2\t} 1} } */
 /* { dg-final { scan-assembler-times {\tssubl\t} 1} } */
diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
index 0aeebb67fac864db284985f4a6f0653af281d62b..28464ad9e3a7ea25557ffebcdbdbc1340f9e0d8b 100644
--- a/gcc/tree-cfg.cc
+++ b/gcc/tree-cfg.cc
@@ -65,6 +65,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "asan.h"
 #include "profile.h"
 #include "sreal.h"
+#include "internal-fn.h"
 
 /* This file contains functions for building the Control Flow Graph (CFG)
    for a function tree.  */
@@ -3411,6 +3412,52 @@ verify_gimple_call (gcall *stmt)
 	  debug_generic_stmt (fn);
 	  return true;
 	}
+      internal_fn ifn = gimple_call_internal_fn (stmt);
+      if (ifn == IFN_LAST)
+	{
+	  error ("gimple call has an invalid IFN");
+	  debug_generic_stmt (fn);
+	  return true;
+	}
+      else if (decomposes_to_hilo_fn_p (ifn))
+	{
+	  /* Non-decomposed HILO stmts should not appear in the IL; these are
+	     merely used as an internal representation by the auto-vectorizer
+	     pass and should have been expanded to their _LO _HI variants.  */
+	  error ("gimple call has a non-decomposed HILO IFN");
+	  debug_generic_stmt (fn);
+	  return true;
+	}
+      else if (ifn == IFN_VEC_WIDEN_PLUS_LO
+	       || ifn == IFN_VEC_WIDEN_PLUS_HI
+	       || ifn == IFN_VEC_WIDEN_MINUS_LO
+	       || ifn == IFN_VEC_WIDEN_MINUS_HI)
+	{
+	  tree rhs1_type = TREE_TYPE (gimple_call_arg (stmt, 0));
+	  tree rhs2_type = TREE_TYPE (gimple_call_arg (stmt, 1));
+	  tree lhs_type = TREE_TYPE (gimple_get_lhs (stmt));
+	  if (TREE_CODE (lhs_type) == VECTOR_TYPE)
+	    {
+	      if (TREE_CODE (rhs1_type) != VECTOR_TYPE
+		  || TREE_CODE (rhs2_type) != VECTOR_TYPE)
+		{
+		  error ("invalid non-vector operands in vector IFN call");
+		  debug_generic_stmt (fn);
+		  return true;
+		}
+	      lhs_type = TREE_TYPE (lhs_type);
+	      rhs1_type = TREE_TYPE (rhs1_type);
+	      rhs2_type = TREE_TYPE (rhs2_type);
+	    }
+	  if (POINTER_TYPE_P (lhs_type)
+	      || POINTER_TYPE_P (rhs1_type)
+	      || POINTER_TYPE_P (rhs2_type))
+	    {
+	      error ("invalid (pointer) operands in vector IFN call");
+	      debug_generic_stmt (fn);
+	      return true;
+	    }
+	}
     }
   else
     {
diff --git a/gcc/tree-inline.cc b/gcc/tree-inline.cc
index 63a19f8d1d89c6bd5d8e55a299cbffaa324b4b84..d74d8db2173b1ab117250fea89de5212d5e354ec 100644
--- a/gcc/tree-inline.cc
+++ b/gcc/tree-inline.cc
@@ -4433,7 +4433,20 @@ estimate_num_insns (gimple *stmt, eni_weights *weights)
 	tree decl;
 
 	if (gimple_call_internal_p (stmt))
-	  return 0;
+	  {
+	    internal_fn fn = gimple_call_internal_fn (stmt);
+	    switch (fn)
+	      {
+	      case IFN_VEC_WIDEN_PLUS_HI:
+	      case IFN_VEC_WIDEN_PLUS_LO:
+	      case IFN_VEC_WIDEN_MINUS_HI:
+	      case IFN_VEC_WIDEN_MINUS_LO:
+		return 1;
+
+	      default:
+		return 0;
+	      }
+	  }
 	else if ((decl = gimple_call_fndecl (stmt))
 		 && fndecl_built_in_p (decl))
 	  {
diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index 1778af0242898e3dc73d94d22a5b8505628a53b5..93cebc72beb4f65249a69b2665dfeb8a0991c1d1 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -562,21 +562,30 @@ vect_joust_widened_type (tree type, tree new_type, tree *common_type)
 
 static unsigned int
 vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code,
-		      tree_code widened_code, bool shift_p,
+		      code_helper widened_code, bool shift_p,
 		      unsigned int max_nops,
 		      vect_unpromoted_value *unprom, tree *common_type,
 		      enum optab_subtype *subtype = NULL)
 {
   /* Check for an integer operation with the right code.  */
-  gassign *assign = dyn_cast <gassign *> (stmt_info->stmt);
-  if (!assign)
+  gimple *stmt = stmt_info->stmt;
+  if (!(is_gimple_assign (stmt) || is_gimple_call (stmt)))
+    return 0;
+
+  code_helper rhs_code;
+  if (is_gimple_assign (stmt))
+    rhs_code = gimple_assign_rhs_code (stmt);
+  else if (is_gimple_call (stmt))
+    rhs_code = gimple_call_combined_fn (stmt);
+  else
     return 0;
 
-  tree_code rhs_code = gimple_assign_rhs_code (assign);
-  if (rhs_code != code && rhs_code != widened_code)
+  if (rhs_code != code
+      && rhs_code != widened_code)
     return 0;
 
-  tree type = TREE_TYPE (gimple_assign_lhs (assign));
+  tree lhs = gimple_get_lhs (stmt);
+  tree type = TREE_TYPE (lhs);
   if (!INTEGRAL_TYPE_P (type))
     return 0;
 
@@ -589,7 +598,7 @@ vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code,
     {
       vect_unpromoted_value *this_unprom = &unprom[next_op];
       unsigned int nops = 1;
-      tree op = gimple_op (assign, i + 1);
+      tree op = gimple_arg (stmt, i);
       if (i == 1 && TREE_CODE (op) == INTEGER_CST)
 	{
 	  /* We already have a common type from earlier operands.
@@ -1343,7 +1352,8 @@ vect_recog_sad_pattern (vec_info *vinfo,
   /* FORNOW.  Can continue analyzing the def-use chain when this stmt in a phi
      inside the loop (in case we are analyzing an outer-loop).  */
   vect_unpromoted_value unprom[2];
-  if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR, WIDEN_MINUS_EXPR,
+  if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR,
+			     IFN_VEC_WIDEN_MINUS_HILO,
 			     false, 2, unprom, &half_type))
     return NULL;
 
@@ -1395,14 +1405,16 @@ static gimple *
 vect_recog_widen_op_pattern (vec_info *vinfo,
 			     stmt_vec_info last_stmt_info, tree *type_out,
 			     tree_code orig_code, code_helper wide_code,
-			     bool shift_p, const char *name)
+			     bool shift_p, const char *name,
+			     optab_subtype *subtype = NULL)
 {
   gimple *last_stmt = last_stmt_info->stmt;
 
   vect_unpromoted_value unprom[2];
   tree half_type;
   if (!vect_widened_op_tree (vinfo, last_stmt_info, orig_code, orig_code,
-			     shift_p, 2, unprom, &half_type))
+			     shift_p, 2, unprom, &half_type, subtype))
+
     return NULL;
 
   /* Pattern detected.  */
@@ -1468,6 +1480,20 @@ vect_recog_widen_op_pattern (vec_info *vinfo,
 			      type, pattern_stmt, vecctype);
 }
 
+static gimple *
+vect_recog_widen_op_pattern (vec_info *vinfo,
+			     stmt_vec_info last_stmt_info, tree *type_out,
+			     tree_code orig_code, internal_fn wide_ifn,
+			     bool shift_p, const char *name,
+			     optab_subtype *subtype = NULL)
+{
+  combined_fn ifn = as_combined_fn (wide_ifn);
+  return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
+				      orig_code, ifn, shift_p, name,
+				      subtype);
+}
+
+
 /* Try to detect multiplication on widened inputs, converting MULT_EXPR
    to WIDEN_MULT_EXPR.  See vect_recog_widen_op_pattern for details.  */
 
@@ -1481,26 +1507,30 @@ vect_recog_widen_mult_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
 }
 
 /* Try to detect addition on widened inputs, converting PLUS_EXPR
-   to WIDEN_PLUS_EXPR.  See vect_recog_widen_op_pattern for details.  */
+   to IFN_VEC_WIDEN_PLUS_HILO.  See vect_recog_widen_op_pattern for details.  */
 
 static gimple *
 vect_recog_widen_plus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
 			       tree *type_out)
 {
+  optab_subtype subtype;
   return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
-				      PLUS_EXPR, WIDEN_PLUS_EXPR, false,
-				      "vect_recog_widen_plus_pattern");
+				      PLUS_EXPR, IFN_VEC_WIDEN_PLUS_HILO,
+				      false, "vect_recog_widen_plus_pattern",
+				      &subtype);
 }
 
 /* Try to detect subtraction on widened inputs, converting MINUS_EXPR
-   to WIDEN_MINUS_EXPR.  See vect_recog_widen_op_pattern for details.  */
+   to IFN_VEC_WIDEN_MINUS_HILO.  See vect_recog_widen_op_pattern for details.  */
 static gimple *
 vect_recog_widen_minus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
 			       tree *type_out)
 {
+  optab_subtype subtype;
   return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
-				      MINUS_EXPR, WIDEN_MINUS_EXPR, false,
-				      "vect_recog_widen_minus_pattern");
+				      MINUS_EXPR, IFN_VEC_WIDEN_MINUS_HILO,
+				      false, "vect_recog_widen_minus_pattern",
+				      &subtype);
 }
 
 /* Function vect_recog_ctz_ffs_pattern
@@ -3078,7 +3108,7 @@ vect_recog_average_pattern (vec_info *vinfo,
   vect_unpromoted_value unprom[3];
   tree new_type;
   unsigned int nops = vect_widened_op_tree (vinfo, plus_stmt_info, PLUS_EXPR,
-					    WIDEN_PLUS_EXPR, false, 3,
+					    IFN_VEC_WIDEN_PLUS_HILO, false, 3,
 					    unprom, &new_type);
   if (nops == 0)
     return NULL;
@@ -6469,6 +6499,7 @@ static vect_recog_func vect_vect_recog_func_ptrs[] = {
   { vect_recog_mask_conversion_pattern, "mask_conversion" },
   { vect_recog_widen_plus_pattern, "widen_plus" },
   { vect_recog_widen_minus_pattern, "widen_minus" },
+  /* These must come after the double widening ones.  */
 };
 
 const unsigned int NUM_PATTERNS = ARRAY_SIZE (vect_vect_recog_func_ptrs);
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index d152ae9ab10b361b88c0f839d6951c43b954750a..24c811ebe01fb8b003100dea494cf64fea72a975 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -5038,7 +5038,9 @@ vectorizable_conversion (vec_info *vinfo,
   bool widen_arith = (code == WIDEN_PLUS_EXPR
 		 || code == WIDEN_MINUS_EXPR
 		 || code == WIDEN_MULT_EXPR
-		 || code == WIDEN_LSHIFT_EXPR);
+		 || code == WIDEN_LSHIFT_EXPR
+		 || code == IFN_VEC_WIDEN_PLUS_HILO
+		 || code == IFN_VEC_WIDEN_MINUS_HILO);
 
   if (!widen_arith
       && !CONVERT_EXPR_CODE_P (code)
@@ -5088,7 +5090,9 @@ vectorizable_conversion (vec_info *vinfo,
       gcc_assert (code == WIDEN_MULT_EXPR
 		  || code == WIDEN_LSHIFT_EXPR
 		  || code == WIDEN_PLUS_EXPR
-		  || code == WIDEN_MINUS_EXPR);
+		  || code == WIDEN_MINUS_EXPR
+		  || code == IFN_VEC_WIDEN_PLUS_HILO
+		  || code == IFN_VEC_WIDEN_MINUS_HILO);
 
 
       op1 = is_gimple_assign (stmt) ? gimple_assign_rhs2 (stmt) :
@@ -12478,10 +12482,43 @@ supportable_widening_operation (vec_info *vinfo,
       optab1 = vec_unpacks_sbool_lo_optab;
       optab2 = vec_unpacks_sbool_hi_optab;
     }
-  else
+
+  if (code.is_fn_code ())
+     {
+      internal_fn ifn = as_internal_fn ((combined_fn) code);
+      gcc_assert (decomposes_to_hilo_fn_p (ifn));
+
+      internal_fn lo, hi;
+      lookup_hilo_internal_fn (ifn, &lo, &hi);
+      *code1 = as_combined_fn (lo);
+      *code2 = as_combined_fn (hi);
+      optab1 = direct_internal_fn_optab (lo, {vectype, vectype});
+      optab2 = direct_internal_fn_optab (hi, {vectype, vectype});
+    }
+  else if (code.is_tree_code ())
     {
-      optab1 = optab_for_tree_code (c1, vectype, optab_default);
-      optab2 = optab_for_tree_code (c2, vectype, optab_default);
+      if (code == FIX_TRUNC_EXPR)
+	{
+	  /* The signedness is determined from output operand.  */
+	  optab1 = optab_for_tree_code (c1, vectype_out, optab_default);
+	  optab2 = optab_for_tree_code (c2, vectype_out, optab_default);
+	}
+      else if (CONVERT_EXPR_CODE_P ((tree_code) code.safe_as_tree_code ())
+	       && VECTOR_BOOLEAN_TYPE_P (wide_vectype)
+	       && VECTOR_BOOLEAN_TYPE_P (vectype)
+	       && TYPE_MODE (wide_vectype) == TYPE_MODE (vectype)
+	       && SCALAR_INT_MODE_P (TYPE_MODE (vectype)))
+	{
+	  /* If the input and result modes are the same, a different optab
+	     is needed where we pass in the number of units in vectype.  */
+	  optab1 = vec_unpacks_sbool_lo_optab;
+	  optab2 = vec_unpacks_sbool_hi_optab;
+	}
+      else
+	{
+	  optab1 = optab_for_tree_code (c1, vectype, optab_default);
+	  optab2 = optab_for_tree_code (c2, vectype, optab_default);
+	}
     }
 
   if (!optab1 || !optab2)
diff --git a/gcc/tree.def b/gcc/tree.def
index 90ceeec0b512bfa5f983359c0af03cc71de32007..b37b0b35927b92a6536e5c2d9805ffce8319a240 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -1374,15 +1374,16 @@ DEFTREECODE (DOT_PROD_EXPR, "dot_prod_expr", tcc_expression, 3)
 DEFTREECODE (WIDEN_SUM_EXPR, "widen_sum_expr", tcc_binary, 2)
 
 /* Widening sad (sum of absolute differences).
-   The first two arguments are of type t1 which should be integer.
-   The third argument and the result are of type t2, such that t2 is at least
-   twice the size of t1.  Like DOT_PROD_EXPR, SAD_EXPR (arg1,arg2,arg3) is
+   The first two arguments are of type t1 which should be a vector of integers.
+   The third argument and the result are of type t2, such that the size of
+   the elements of t2 is at least twice the size of the elements of t1.
+   Like DOT_PROD_EXPR, SAD_EXPR (arg1,arg2,arg3) is
    equivalent to:
-       tmp = WIDEN_MINUS_EXPR (arg1, arg2)
+       tmp = IFN_VEC_WIDEN_MINUS_EXPR (arg1, arg2)
        tmp2 = ABS_EXPR (tmp)
        arg3 = PLUS_EXPR (tmp2, arg3)
   or:
-       tmp = WIDEN_MINUS_EXPR (arg1, arg2)
+       tmp = IFN_VEC_WIDEN_MINUS_EXPR (arg1, arg2)
        tmp2 = ABS_EXPR (tmp)
        arg3 = WIDEN_SUM_EXPR (tmp2, arg3)
  */


* Re: [PATCH 2/3] Refactor widen_plus as internal_fn
  2023-05-15 11:53                                               ` Andre Vieira (lists)
@ 2023-05-15 12:21                                                 ` Richard Biener
  2023-05-18 17:15                                                   ` Andre Vieira (lists)
  0 siblings, 1 reply; 53+ messages in thread
From: Richard Biener @ 2023-05-15 12:21 UTC (permalink / raw)
  To: Andre Vieira (lists); +Cc: Richard Sandiford, Richard Biener, gcc-patches

On Mon, 15 May 2023, Andre Vieira (lists) wrote:

> 
> 
> On 15/05/2023 12:01, Richard Biener wrote:
> > On Mon, 15 May 2023, Richard Sandiford wrote:
> > 
> >> Richard Biener <rguenther@suse.de> writes:
> >>> On Fri, 12 May 2023, Richard Sandiford wrote:
> >>>
> >>>> Richard Biener <rguenther@suse.de> writes:
> >>>>> On Fri, 12 May 2023, Andre Vieira (lists) wrote:
> >>>>>
> >>>>>> I have dealt with, I think..., most of your comments. There's quite a
> >>>>>> few
> >>>>>> changes, I think it's all a bit simpler now. I made some other changes
> >>>>>> to the
> >>>>>> costing in tree-inline.cc and gimple-range-op.cc in which I try to
> >>>>>> preserve
> >>>>>> the same behaviour as we had with the tree codes before. Also added
> >>>>>> some extra
> >>>>>> checks to tree-cfg.cc that made sense to me.
> >>>>>>
> >>>>>> I am still regression testing the gimple-range-op change, as that was a
> >>>>>> last
> >>>>>> minute change, but the rest survived a bootstrap and regression test on
> >>>>>> aarch64-unknown-linux-gnu.
> >>>>>>
> >>>>>> cover letter:
> >>>>>>
> >>>>>> This patch replaces the existing tree_code widen_plus and widen_minus
> >>>>>> patterns with internal_fn versions.
> >>>>>>
> >>>>>> DEF_INTERNAL_OPTAB_WIDENING_HILO_FN and
> >>>>>> DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
> >>>>>> are like DEF_INTERNAL_SIGNED_OPTAB_FN and DEF_INTERNAL_OPTAB_FN
> >>>>>> respectively
> >>>>>> except they provide convenience wrappers for defining conversions that
> >>>>>> require
> >>>>>> a hi/lo split.  Each definition for <NAME> will require optabs for _hi
> >>>>>> and _lo
> >>>>>> and each of those will also require a signed and unsigned version in
> >>>>>> the case
> >>>>>> of widening. The hi/lo pair is necessary because the widening and
> >>>>>> narrowing
> >>>>>> operations take n narrow elements as inputs and return n/2 wide
> >>>>>> elements as
> >>>>>> outputs. The 'lo' operation operates on the first n/2 elements of
> >>>>>> input. The
> >>>>>> 'hi' operation operates on the second n/2 elements of input. Defining
> >>>>>> an
> >>>>>> internal_fn along with hi/lo variations allows a single internal
> >>>>>> function to
> >>>>>> be returned from a vect_recog function that will later be expanded to
> >>>>>> hi/lo.
> >>>>>>
> >>>>>>
> >>>>>>   For example:
> >>>>>>   IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO
> >>>>>> for aarch64: IFN_VEC_WIDEN_PLUS_HI   -> vec_widen_<su>add_hi_<mode> ->
> >>>>>> (u/s)addl2
> >>>>>>                         IFN_VEC_WIDEN_PLUS_LO  ->
> >>>>>> vec_widen_<su>add_lo_<mode>
> >>>>>> -> (u/s)addl
> >>>>>>
> >>>>>> This gives the same functionality as the previous
> >>>>>> WIDEN_PLUS/WIDEN_MINUS tree
> >>>>>> codes which are expanded into VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI.
> >>>>>
> >>>>> What I still don't understand is how we are so narrowly focused on
> >>>>> HI/LO?  We need a combined scalar IFN for pattern selection (not
> >>>>> sure why that's now called _HILO, I expected no suffix).  Then there's
> >>>>> three possibilities the target can implement this:
> >>>>>
> >>>>>   1) with a widen_[su]add<mode> instruction - I _think_ that's what
> >>>>>      RISCV is going to offer since it is a target where vector modes
> >>>>>      have "padding" (aka you cannot subreg a V2SI to get V4HI).  Instead
> >>>>>      RVV can do a V4HI to V4SI widening and widening add/subtract
> >>>>>      using vwadd[u] and vwsub[u] (the HI->SI widening is actually
> >>>>>      done with a widening add of zero - eh).
> >>>>>      IIRC GCN is the same here.
> >>>>
> >>>> SVE currently does this too, but the addition and widening are
> >>>> separate operations.  E.g. in principle there's no reason why
> >>>> you can't sign-extend one operand, zero-extend the other, and
> >>>> then add the result together.  Or you could extend them from
> >>>> different sizes (QI and HI).  All of those are supported
> >>>> (if the costing allows them).
> >>>
> >>> I see.  So why does the target expose widen_[su]add<mode> at all?
> >>
> >> It shouldn't (need to) do that.  I don't think we should have an optab
> >> for the unsplit operation.
> >>
> >> At least on SVE, we really want the extensions to be fused with loads
> >> (where possible) rather than with arithmetic.
> >>
> >> We can still do the widening arithmetic in one go.  It's just that
> >> fusing with the loads works for the mixed-sign and mixed-size cases,
> >> and can handle more than just doubling the element size.
> >>
> >>>> If the target has operations to do combined extending and adding (or
> >>>> whatever), then at the moment we rely on combine to generate them.
> >>>>
> >>>> So I think this case is separate from Andre's work.  The addition
> >>>> itself is just an ordinary addition, and any widening happens by
> >>>> vectorising a CONVERT/NOP_EXPR.
> >>>>
> >>>>>   2) with a widen_[su]add{_lo,_hi}<mode> combo - that's what the tree
> >>>>>      codes currently support (exclusively)
> >>>>>   3) similar, but widen_[su]add{_even,_odd}<mode>
> >>>>>
> >>>>> that said, things like decomposes_to_hilo_fn_p look to paint us into
> >>>>> a 2) corner without good reason.
> >>>>
> >>>> I suppose one question is: how much of the patch is really specific
> >>>> to HI/LO, and how much is just grouping two halves together?
> >>>
> >>> Yep, that I don't know for sure.
> >>>
> >>>>   The nice
> >>>> thing about the internal-fn grouping macros is that, if (3) is
> >>>> implemented in future, the structure will strongly encourage even/odd
> >>>> pairs to be supported for all operations that support hi/lo.  That is,
> >>>> I would expect the grouping macros to be extended to define even/odd
> >>>> ifns alongside hi/lo ones, rather than adding separate definitions
> >>>> for even/odd functions.
> >>>>
> >>>> If so, at least from the internal-fn.* side of things, I think the
> >>>> question
> >>>> is whether it's OK to stick with hilo names for now, or whether we should
> >>>> use more forward-looking names.
> >>>
> >>> I think for parts that are independent we could use a more
> >>> forward-looking name.  Maybe _halves?
> >>
> >> Using _halves for the ifn macros sounds good to me FWIW.
> >>
> >>> But I'm also not sure
> >>> how much of that is really needed (it seems to be tied around
> >>> optimizing optabs space?)
> >>
> >> Not sure what you mean by "this".  Optabs space shouldn't be a problem
> >> though.  The optab encoding gives us a full int to play with, and it
> >> could easily go up to 64 bits if necessary/convenient.
> >>
> >> At least on the internal-fn.* side, the aim is really just to establish
> >> a regular structure, so that we don't have arbitrary differences between
> >> different widening operations, or too much cut-&-paste.
> > 
> > Hmm, I'm looking at the need for the std::map and
> > internal_fn_hilo_keys_array and internal_fn_hilo_values_array.
> > The vectorizer pieces contain
> > 
> > +  if (code.is_fn_code ())
> > +     {
> > +      internal_fn ifn = as_internal_fn ((combined_fn) code);
> > +      gcc_assert (decomposes_to_hilo_fn_p (ifn));
> > +
> > +      internal_fn lo, hi;
> > +      lookup_hilo_internal_fn (ifn, &lo, &hi);
> > +      *code1 = as_combined_fn (lo);
> > +      *code2 = as_combined_fn (hi);
> > +      optab1 = lookup_hilo_ifn_optab (lo, !TYPE_UNSIGNED (vectype));
> > +      optab2 = lookup_hilo_ifn_optab (hi, !TYPE_UNSIGNED (vectype));
> > 
> > so that tries to automatically associate the scalar widening IFN
> > with the set(s) of IFN pairs we can split to.  But then this
> > list should be static and there's no need to create a std::map?
> > Maybe gencfn-macros.cc can be enhanced to output these static
> > cases?  Or the vectorizer could (as it did previously) simply
> > open-code the handled cases (I guess since we deal with two
> > cases only now I'd prefer that).
> > 
> > Thanks,
> > Richard.
> > 
> > 
> >> Thanks,
> >> Richard
> >>
> > 
> The patch I uploaded last no longer has std::map nor
> internal_fn_hilo_keys_array and internal_fn_hilo_values_array. (I've attached
> it again)

Whoops, too many patches ...

> I'm not sure I understand the _halves, do you mean that for the case where I
> had _hilo or _HILO before we rename that to _halves/_HALVES such that it later
> represents both _hi/_lo separation and _even/_odd?

I don't see much shared stuff, but I guess we'd see when we add a case
for EVEN/ODD.  The verifier contains

+      else if (decomposes_to_hilo_fn_p (ifn))
+       {
+         /* Non decomposed HILO stmts should not appear in IL, these are
+            merely used as an internal representation to the auto-vectorizer
+            pass and should have been expanded to their _LO _HI variants.  */
+         error ("gimple call has an non decomposed HILO IFN");
+         debug_generic_stmt (fn);
+         return true;

I think to support case 1) that's not wanted.  Instead what you could
check is that the types involved are vector types, so a subset of
what you check for IFN_VEC_WIDEN_PLUS_LO etc. (but oddly it's not
verified those are all operating on vector types only?)

+/*  Given an internal_fn IFN that is a HILO function, return its corresponding
+    LO and HI internal_fns.  */
+
+extern void
+lookup_hilo_internal_fn (internal_fn ifn, internal_fn *lo, internal_fn *hi)
+{
+  gcc_assert (decomposes_to_hilo_fn_p (ifn));
+
+  *lo = internal_fn (ifn + 1);
+  *hi = internal_fn (ifn + 2);

that might become fragile if we add EVEN/ODD besides HI/LO unless
we merge those with a DEF_INTERNAL_OPTAB_WIDENING_HILO_EVENODD_FN
case, right?
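
To make the concern concrete, here is a minimal stand-alone sketch (not taken
from the patch).  The enum `ifn` and both lookup functions are hypothetical
stand-ins for internal_fn and its helpers; the point is only that an explicit
mapping stays correct when EVEN/ODD entries sit next to LO/HI, whereas
`ifn + 1` / `ifn + 2` silently depends on the enum layout.

```cpp
#include <cassert>

// Hypothetical stand-in for GCC's internal_fn enum.  The names and their
// order are assumptions made for this illustration only.
enum ifn
{
  WIDEN_PLUS,
  WIDEN_PLUS_LO,
  WIDEN_PLUS_HI,
  WIDEN_PLUS_EVEN,
  WIDEN_PLUS_ODD,
  WIDEN_MINUS,
  WIDEN_MINUS_LO,
  WIDEN_MINUS_HI,
  WIDEN_MINUS_EVEN,
  WIDEN_MINUS_ODD,
  IFN_NONE
};

struct ifn_pair
{
  ifn first;
  ifn second;
};

// Explicit LO/HI mapping: correct however the enum happens to be laid out.
ifn_pair
lookup_hilo (ifn fn)
{
  switch (fn)
    {
    case WIDEN_PLUS:
      return { WIDEN_PLUS_LO, WIDEN_PLUS_HI };
    case WIDEN_MINUS:
      return { WIDEN_MINUS_LO, WIDEN_MINUS_HI };
    default:
      return { IFN_NONE, IFN_NONE };
    }
}

// Explicit EVEN/ODD mapping, written the same way.
ifn_pair
lookup_evenodd (ifn fn)
{
  switch (fn)
    {
    case WIDEN_PLUS:
      return { WIDEN_PLUS_EVEN, WIDEN_PLUS_ODD };
    case WIDEN_MINUS:
      return { WIDEN_MINUS_EVEN, WIDEN_MINUS_ODD };
    default:
      return { IFN_NONE, IFN_NONE };
    }
}

// Self-check: the offset trick only agrees with the explicit mapping
// because of the particular layout chosen above.
void
self_check ()
{
  assert (lookup_hilo (WIDEN_PLUS).first == ifn (WIDEN_PLUS + 1));
  assert (lookup_hilo (WIDEN_PLUS).second == ifn (WIDEN_PLUS + 2));
  assert (lookup_evenodd (WIDEN_PLUS).first != ifn (WIDEN_PLUS + 1));
}
```

In this sketch, reordering the enumerators would break the offset arithmetic
but leave both switch-based lookups untouched.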

> And am I correct to assume we are just giving up on having a INTERNAL_OPTAB_FN
> idea for 1)?

Well, I think we want all of them in the end (or at least support them
if target need arises).  full vector, hi/lo and even/odd.
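
As an illustration only, the three shapes can be modelled on plain arrays.
This is a sketch under stated assumptions, not GCC code: the type aliases and
function names are hypothetical, "lo" is taken to mean the first N/2 elements
as in the cover letter (endianness is ignored), and the inputs are assumed to
have the same, even length.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using narrow_vec = std::vector<std::int16_t>;
using wide_vec = std::vector<std::int32_t>;

// Add COUNT element pairs, starting at START and advancing by STEP.
// Each input is widened before the addition, so the sum cannot wrap.
static wide_vec
widen_plus_part (const narrow_vec &a, const narrow_vec &b,
		 std::size_t start, std::size_t step, std::size_t count)
{
  assert (a.size () == b.size ());
  wide_vec result;
  for (std::size_t i = 0; i < count; ++i)
    {
      std::size_t idx = start + i * step;
      result.push_back (std::int32_t (a[idx]) + std::int32_t (b[idx]));
    }
  return result;
}

// Full vector: N narrow inputs, N wide outputs.
wide_vec
widen_plus (const narrow_vec &a, const narrow_vec &b)
{
  return widen_plus_part (a, b, 0, 1, a.size ());
}

// hi/lo: each half produces N/2 wide outputs from contiguous inputs.
wide_vec
widen_plus_lo (const narrow_vec &a, const narrow_vec &b)
{
  return widen_plus_part (a, b, 0, 1, a.size () / 2);
}

wide_vec
widen_plus_hi (const narrow_vec &a, const narrow_vec &b)
{
  return widen_plus_part (a, b, a.size () / 2, 1, a.size () / 2);
}

// even/odd: each half produces N/2 wide outputs from interleaved inputs.
wide_vec
widen_plus_even (const narrow_vec &a, const narrow_vec &b)
{
  return widen_plus_part (a, b, 0, 2, a.size () / 2);
}

wide_vec
widen_plus_odd (const narrow_vec &a, const narrow_vec &b)
{
  return widen_plus_part (a, b, 1, 2, a.size () / 2);
}
```

In this model the lo and hi results concatenate to the full-vector result,
while the even and odd results have to be interleaved to recover it.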

Richard.


* Re: [PATCH 2/3] Refactor widen_plus as internal_fn
  2023-05-15 12:21                                                 ` Richard Biener
@ 2023-05-18 17:15                                                   ` Andre Vieira (lists)
  2023-05-22 13:06                                                     ` Richard Biener
  0 siblings, 1 reply; 53+ messages in thread
From: Andre Vieira (lists) @ 2023-05-18 17:15 UTC (permalink / raw)
  To: Richard Biener; +Cc: Richard Sandiford, Richard Biener, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 2935 bytes --]

How about this?

Not sure about the DEF_INTERNAL documentation I rewrote in 
internal-fn.def, was struggling to word these, so improvements welcome!

gcc/ChangeLog:

2023-04-25  Andre Vieira  <andre.simoesdiasvieira@arm.com>
             Joel Hutton  <joel.hutton@arm.com>
             Tamar Christina  <tamar.christina@arm.com>

         * config/aarch64/aarch64-simd.md (vec_widen_<su>addl_lo_<mode>): Rename
         this ...
         (vec_widen_<su>add_lo_<mode>): ... to this.
         (vec_widen_<su>addl_hi_<mode>): Rename this ...
         (vec_widen_<su>add_hi_<mode>): ... to this.
         (vec_widen_<su>subl_lo_<mode>): Rename this ...
         (vec_widen_<su>sub_lo_<mode>): ... to this.
         (vec_widen_<su>subl_hi_<mode>): Rename this ...
         (vec_widen_<su>sub_hi_<mode>): ... to this.
         * doc/generic.texi: Document new IFN codes.
	* internal-fn.cc (ifn_cmp): Function to compare ifn's for sorting/searching.
	(lookup_hilo_internal_fn): Add lookup function.
	(commutative_binary_fn_p): Add widen_plus fn's.
	(widening_fn_p): New function.
	(narrowing_fn_p): New function.
         (direct_internal_fn_optab): Change visibility.
	* internal-fn.def (DEF_INTERNAL_WIDENING_OPTAB_FN): Macro to define an
         internal_fn that expands into multiple internal_fns for widening.
         (DEF_INTERNAL_NARROWING_OPTAB_FN): Likewise but for narrowing.
         (IFN_VEC_WIDEN_PLUS, IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO,
          IFN_VEC_WIDEN_PLUS_EVEN, IFN_VEC_WIDEN_PLUS_ODD,
          IFN_VEC_WIDEN_MINUS, IFN_VEC_WIDEN_MINUS_HI, IFN_VEC_WIDEN_MINUS_LO,
          IFN_VEC_WIDEN_MINUS_ODD, IFN_VEC_WIDEN_MINUS_EVEN): Define widening
         plus,minus functions.
	* internal-fn.h (direct_internal_fn_optab): Declare new prototype.
	(lookup_hilo_internal_fn): Likewise.
	(widening_fn_p): Likewise.
	(narrowing_fn_p): Likewise.
	* optabs.cc (commutative_optab_p): Add widening plus optabs.
	* optabs.def (OPTAB_D): Define widen add, sub optabs.
         * tree-cfg.cc (verify_gimple_call): Add checks for widening ifns.
         * tree-inline.cc (estimate_num_insns): Return same
         cost for widen add and sub IFNs as previous tree_codes.
	* tree-vect-patterns.cc (vect_recog_widen_op_pattern): Support
         patterns with a hi/lo or even/odd split.
         (vect_recog_sad_pattern): Refactor to use new IFN codes.
         (vect_recog_widen_plus_pattern): Likewise.
         (vect_recog_widen_minus_pattern): Likewise.
         (vect_recog_average_pattern): Likewise.
	* tree-vect-stmts.cc (vectorizable_conversion): Add support for
         _HILO IFNs.
	(supportable_widening_operation): Likewise.
         * tree.def (WIDEN_SUM_EXPR): Update example to use new IFNs.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/vect-widen-add.c: Test that new
     IFN_VEC_WIDEN_PLUS is being used.
	* gcc.target/aarch64/vect-widen-sub.c: Test that new
     IFN_VEC_WIDEN_MINUS is being used.

[-- Attachment #2: ifn1v4.patch --]
[-- Type: text/plain, Size: 39515 bytes --]

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index bfc98a8d943467b33390defab9682f44efab5907..ffbbecb9409e1c2835d658c2a8855cd0e955c0f2 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -4626,7 +4626,7 @@
   [(set_attr "type" "neon_<ADDSUB:optab>_long")]
 )
 
-(define_expand "vec_widen_<su>addl_lo_<mode>"
+(define_expand "vec_widen_<su>add_lo_<mode>"
   [(match_operand:<VWIDE> 0 "register_operand")
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand"))
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))]
@@ -4638,7 +4638,7 @@
   DONE;
 })
 
-(define_expand "vec_widen_<su>addl_hi_<mode>"
+(define_expand "vec_widen_<su>add_hi_<mode>"
   [(match_operand:<VWIDE> 0 "register_operand")
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand"))
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))]
@@ -4650,7 +4650,7 @@
   DONE;
 })
 
-(define_expand "vec_widen_<su>subl_lo_<mode>"
+(define_expand "vec_widen_<su>sub_lo_<mode>"
   [(match_operand:<VWIDE> 0 "register_operand")
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand"))
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))]
@@ -4662,7 +4662,7 @@
   DONE;
 })
 
-(define_expand "vec_widen_<su>subl_hi_<mode>"
+(define_expand "vec_widen_<su>sub_hi_<mode>"
   [(match_operand:<VWIDE> 0 "register_operand")
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand"))
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))]
diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi
index 8b2882da4fe7da07d22b4e5384d049ba7d3907bf..5e36dac2b1a10257616f12cdfb0b12d0f2879ae9 100644
--- a/gcc/doc/generic.texi
+++ b/gcc/doc/generic.texi
@@ -1811,10 +1811,16 @@ a value from @code{enum annot_expr_kind}, the third is an @code{INTEGER_CST}.
 @tindex VEC_RSHIFT_EXPR
 @tindex VEC_WIDEN_MULT_HI_EXPR
 @tindex VEC_WIDEN_MULT_LO_EXPR
-@tindex VEC_WIDEN_PLUS_HI_EXPR
-@tindex VEC_WIDEN_PLUS_LO_EXPR
-@tindex VEC_WIDEN_MINUS_HI_EXPR
-@tindex VEC_WIDEN_MINUS_LO_EXPR
+@tindex IFN_VEC_WIDEN_PLUS
+@tindex IFN_VEC_WIDEN_PLUS_HI
+@tindex IFN_VEC_WIDEN_PLUS_LO
+@tindex IFN_VEC_WIDEN_PLUS_EVEN
+@tindex IFN_VEC_WIDEN_PLUS_ODD
+@tindex IFN_VEC_WIDEN_MINUS
+@tindex IFN_VEC_WIDEN_MINUS_HI
+@tindex IFN_VEC_WIDEN_MINUS_LO
+@tindex IFN_VEC_WIDEN_MINUS_EVEN
+@tindex IFN_VEC_WIDEN_MINUS_ODD
 @tindex VEC_UNPACK_HI_EXPR
 @tindex VEC_UNPACK_LO_EXPR
 @tindex VEC_UNPACK_FLOAT_HI_EXPR
@@ -1861,6 +1867,82 @@ vector of @code{N/2} products. In the case of @code{VEC_WIDEN_MULT_LO_EXPR} the
 low @code{N/2} elements of the two vector are multiplied to produce the
 vector of @code{N/2} products.
 
+@item IFN_VEC_WIDEN_PLUS
+This internal function represents widening vector addition of two input
+vectors.  Its operands are vectors that contain the same number of elements
+(@code{N}) of the same integral type.  The result is a vector that contains
+the same number (@code{N}) of elements, of an integral type twice as wide as
+that of the input vectors.  If the current target does not implement the
+corresponding optabs the vectorizer may choose to split it into either a pair
+of @code{IFN_VEC_WIDEN_PLUS_HI} and @code{IFN_VEC_WIDEN_PLUS_LO} or
+@code{IFN_VEC_WIDEN_PLUS_EVEN} and @code{IFN_VEC_WIDEN_PLUS_ODD}, depending
+on what optabs the target implements.
+
+@item IFN_VEC_WIDEN_PLUS_HI
+@itemx IFN_VEC_WIDEN_PLUS_LO
+These internal functions represent widening vector addition of the high and low
+parts of the two input vectors, respectively.  Their operands are vectors that
+contain the same number of elements (@code{N}) of the same integral type. The
+result is a vector that contains half as many elements, of an integral type
+whose size is twice as wide.  In the case of @code{IFN_VEC_WIDEN_PLUS_HI} the
+high @code{N/2} elements of the two vectors are added to produce the vector of
+@code{N/2} additions.  In the case of @code{IFN_VEC_WIDEN_PLUS_LO} the low
+@code{N/2} elements of the two vectors are added to produce the vector of
+@code{N/2} additions.
+
+@item IFN_VEC_WIDEN_PLUS_EVEN
+@itemx IFN_VEC_WIDEN_PLUS_ODD
+These internal functions represent widening vector addition of the even and odd
+elements of the two input vectors, respectively.  Their operands are vectors
+that contain the same number of elements (@code{N}) of the same integral type.
+The result is a vector that contains half as many elements, of an integral type
+whose size is twice as wide.  In the case of @code{IFN_VEC_WIDEN_PLUS_EVEN} the
+even @code{N/2} elements of the two vectors are added to produce the vector of
+@code{N/2} additions.  In the case of @code{IFN_VEC_WIDEN_PLUS_ODD} the odd
+@code{N/2} elements of the two vectors are added to produce the vector of
+@code{N/2} additions.
+
+@item IFN_VEC_WIDEN_MINUS
+This internal function represents widening vector subtraction of two input
+vectors.  Its operands are vectors that contain the same number of elements
+(@code{N}) of the same integral type.  The result is a vector that contains
+the same number (@code{N}) of elements, of an integral type twice as wide as
+that of the input vectors.  If the current target does not implement the
+corresponding optabs the vectorizer may choose to split it into either a pair
+of @code{IFN_VEC_WIDEN_MINUS_HI} and @code{IFN_VEC_WIDEN_MINUS_LO} or
+@code{IFN_VEC_WIDEN_MINUS_EVEN} and @code{IFN_VEC_WIDEN_MINUS_ODD}, depending
+on what optabs the target implements.
+
+@item IFN_VEC_WIDEN_MINUS_HI
+@itemx IFN_VEC_WIDEN_MINUS_LO
+These internal functions represent widening vector subtraction of the high and
+low parts of the two input vectors, respectively.  Their operands are vectors
+that contain the same number of elements (@code{N}) of the same integral type.
+The high/low elements of the second vector are subtracted from the high/low
+elements of the first. The result is a vector that contains half as many
+elements, of an integral type whose size is twice as wide.  In the case of
+@code{IFN_VEC_WIDEN_MINUS_HI} the high @code{N/2} elements of the second
+vector are subtracted from the high @code{N/2} of the first to produce the
+vector of @code{N/2} subtractions.  In the case of
+@code{IFN_VEC_WIDEN_MINUS_LO} the low @code{N/2} elements of the second
+vector are subtracted from the low @code{N/2} of the first to produce the
+vector of @code{N/2} subtractions.
+
+@item IFN_VEC_WIDEN_MINUS_EVEN
+@itemx IFN_VEC_WIDEN_MINUS_ODD
+These internal functions represent widening vector subtraction of the even and
+odd parts of the two input vectors, respectively.  Their operands are vectors
+that contain the same number of elements (@code{N}) of the same integral type.
+The even/odd elements of the second vector are subtracted from the even/odd
+elements of the first. The result is a vector that contains half as many
+elements, of an integral type whose size is twice as wide.  In the case of
+@code{IFN_VEC_WIDEN_MINUS_EVEN} the even @code{N/2} elements of the second
+vector are subtracted from the even @code{N/2} of the first to produce the
+vector of @code{N/2} subtractions.  In the case of
+@code{IFN_VEC_WIDEN_MINUS_ODD} the odd @code{N/2} elements of the second
+vector are subtracted from the odd @code{N/2} of the first to produce the
+vector of @code{N/2} subtractions.
+
 @item VEC_WIDEN_PLUS_HI_EXPR
 @itemx VEC_WIDEN_PLUS_LO_EXPR
 These nodes represent widening vector addition of the high and low parts of
diff --git a/gcc/gimple-range-op.cc b/gcc/gimple-range-op.cc
index 594bd3043f0e944299ddfff219f757ef15a3dd61..33f4b7064a2a22aad49f27b24b409e91a5b89c69 100644
--- a/gcc/gimple-range-op.cc
+++ b/gcc/gimple-range-op.cc
@@ -1187,6 +1187,7 @@ gimple_range_op_handler::maybe_non_standard ()
 {
   range_operator *signed_op = ptr_op_widen_mult_signed;
   range_operator *unsigned_op = ptr_op_widen_mult_unsigned;
+  bool signed1, signed2, signed_ret;
   if (gimple_code (m_stmt) == GIMPLE_ASSIGN)
     switch (gimple_assign_rhs_code (m_stmt))
       {
@@ -1202,32 +1203,55 @@ gimple_range_op_handler::maybe_non_standard ()
 	  m_op1 = gimple_assign_rhs1 (m_stmt);
 	  m_op2 = gimple_assign_rhs2 (m_stmt);
 	  tree ret = gimple_assign_lhs (m_stmt);
-	  bool signed1 = TYPE_SIGN (TREE_TYPE (m_op1)) == SIGNED;
-	  bool signed2 = TYPE_SIGN (TREE_TYPE (m_op2)) == SIGNED;
-	  bool signed_ret = TYPE_SIGN (TREE_TYPE (ret)) == SIGNED;
-
-	  /* Normally these operands should all have the same sign, but
-	     some passes and violate this by taking mismatched sign args.  At
-	     the moment the only one that's possible is mismatch inputs and
-	     unsigned output.  Once ranger supports signs for the operands we
-	     can properly fix it,  for now only accept the case we can do
-	     correctly.  */
-	  if ((signed1 ^ signed2) && signed_ret)
-	    return;
-
-	  m_valid = true;
-	  if (signed2 && !signed1)
-	    std::swap (m_op1, m_op2);
-
-	  if (signed1 || signed2)
-	    m_int = signed_op;
-	  else
-	    m_int = unsigned_op;
+	  signed1 = TYPE_SIGN (TREE_TYPE (m_op1)) == SIGNED;
+	  signed2 = TYPE_SIGN (TREE_TYPE (m_op2)) == SIGNED;
+	  signed_ret = TYPE_SIGN (TREE_TYPE (ret)) == SIGNED;
 	  break;
 	}
 	default:
-	  break;
+	  return;
+      }
+  else if (gimple_code (m_stmt) == GIMPLE_CALL
+      && gimple_call_internal_p (m_stmt)
+      && gimple_get_lhs (m_stmt) != NULL_TREE)
+    switch (gimple_call_internal_fn (m_stmt))
+      {
+      case IFN_VEC_WIDEN_PLUS_LO:
+      case IFN_VEC_WIDEN_PLUS_HI:
+	  {
+	    signed_op = ptr_op_widen_plus_signed;
+	    unsigned_op = ptr_op_widen_plus_unsigned;
+	    m_valid = false;
+	    m_op1 = gimple_call_arg (m_stmt, 0);
+	    m_op2 = gimple_call_arg (m_stmt, 1);
+	    tree ret = gimple_get_lhs (m_stmt);
+	    signed1 = TYPE_SIGN (TREE_TYPE (m_op1)) == SIGNED;
+	    signed2 = TYPE_SIGN (TREE_TYPE (m_op2)) == SIGNED;
+	    signed_ret = TYPE_SIGN (TREE_TYPE (ret)) == SIGNED;
+	    break;
+	  }
+      default:
+	return;
       }
+  else
+    return;
+
+  /* Normally these operands should all have the same sign, but some passes
+     violate this by taking mismatched sign args.  At the moment the only
+     one that's possible is mismatched inputs and unsigned output.  Once ranger
+     supports signs for the operands we can properly fix it; for now only
+     accept the case we can do correctly.  */
+  if ((signed1 ^ signed2) && signed_ret)
+    return;
+
+  m_valid = true;
+  if (signed2 && !signed1)
+    std::swap (m_op1, m_op2);
+
+  if (signed1 || signed2)
+    m_int = signed_op;
+  else
+    m_int = unsigned_op;
 }
 
 // Set up a gimple_range_op_handler for any built in function which can be
diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 5c9da73ea11f8060b18dcf513599c9694fa4f2ad..348bee35a35ae4ed9a8652f5349f430c2733e1cb 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -90,6 +90,71 @@ lookup_internal_fn (const char *name)
   return entry ? *entry : IFN_LAST;
 }
 
+/*  Given an internal_fn IFN that is either a widening or narrowing function, return its
+    corresponding LO and HI internal_fns.  */
+
+extern void
+lookup_hilo_internal_fn (internal_fn ifn, internal_fn *lo, internal_fn *hi)
+{
+  gcc_assert (widening_fn_p (ifn) || narrowing_fn_p (ifn));
+
+  switch (ifn)
+    {
+    default:
+      gcc_unreachable ();
+#undef DEF_INTERNAL_FN
+#undef DEF_INTERNAL_WIDENING_OPTAB_FN
+#undef DEF_INTERNAL_NARROWING_OPTAB_FN
+#define DEF_INTERNAL_FN(NAME, FLAGS, TYPE)
+#define DEF_INTERNAL_WIDENING_OPTAB_FN(NAME, F, S, SO, UO, T)	\
+    case IFN_##NAME:						\
+      *lo = internal_fn (IFN_##NAME##_LO);			\
+      *hi = internal_fn (IFN_##NAME##_HI);			\
+      break;
+#define DEF_INTERNAL_NARROWING_OPTAB_FN(NAME, F, O, T)	\
+    case IFN_##NAME:					\
+      *lo = internal_fn (IFN_##NAME##_LO);		\
+      *hi = internal_fn (IFN_##NAME##_HI);		\
+      break;
+#include "internal-fn.def"
+#undef DEF_INTERNAL_FN
+#undef DEF_INTERNAL_WIDENING_OPTAB_FN
+#undef DEF_INTERNAL_NARROWING_OPTAB_FN
+    }
+}
+
+extern void
+lookup_evenodd_internal_fn (internal_fn ifn, internal_fn *even,
+			    internal_fn *odd)
+{
+  gcc_assert (widening_fn_p (ifn) || narrowing_fn_p (ifn));
+
+  switch (ifn)
+    {
+    default:
+      gcc_unreachable ();
+#undef DEF_INTERNAL_FN
+#undef DEF_INTERNAL_WIDENING_OPTAB_FN
+#undef DEF_INTERNAL_NARROWING_OPTAB_FN
+#define DEF_INTERNAL_FN(NAME, FLAGS, TYPE)
+#define DEF_INTERNAL_WIDENING_OPTAB_FN(NAME, F, S, SO, UO, T)	\
+    case IFN_##NAME:						\
+      *even = internal_fn (IFN_##NAME##_EVEN);			\
+      *odd = internal_fn (IFN_##NAME##_ODD);			\
+      break;
+#define DEF_INTERNAL_NARROWING_OPTAB_FN(NAME, F, O, T)	\
+    case IFN_##NAME:					\
+      *even = internal_fn (IFN_##NAME##_EVEN);		\
+      *odd = internal_fn (IFN_##NAME##_ODD);		\
+      break;
+#include "internal-fn.def"
+#undef DEF_INTERNAL_FN
+#undef DEF_INTERNAL_WIDENING_OPTAB_FN
+#undef DEF_INTERNAL_NARROWING_OPTAB_FN
+    }
+}
+
+
 /* Fnspec of each internal function, indexed by function number.  */
 const_tree internal_fn_fnspec_array[IFN_LAST + 1];
 
@@ -3852,7 +3917,7 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types,
 
 /* Return the optab used by internal function FN.  */
 
-static optab
+optab
 direct_internal_fn_optab (internal_fn fn, tree_pair types)
 {
   switch (fn)
@@ -3971,6 +4036,9 @@ commutative_binary_fn_p (internal_fn fn)
     case IFN_UBSAN_CHECK_MUL:
     case IFN_ADD_OVERFLOW:
     case IFN_MUL_OVERFLOW:
+    case IFN_VEC_WIDEN_PLUS:
+    case IFN_VEC_WIDEN_PLUS_LO:
+    case IFN_VEC_WIDEN_PLUS_HI:
       return true;
 
     default:
@@ -4044,6 +4112,68 @@ first_commutative_argument (internal_fn fn)
     }
 }
 
+/* Return true if this CODE describes an internal_fn that returns a vector with
+   elements twice as wide as the element size of the input vectors.  */
+
+bool
+widening_fn_p (code_helper code)
+{
+  if (!code.is_fn_code ())
+    return false;
+
+  if (!internal_fn_p ((combined_fn) code))
+    return false;
+
+  internal_fn fn = as_internal_fn ((combined_fn) code);
+  switch (fn)
+    {
+    #undef DEF_INTERNAL_WIDENING_OPTAB_FN
+    #define DEF_INTERNAL_WIDENING_OPTAB_FN(NAME, F, S, SO, UO, T) \
+    case IFN_##NAME:						  \
+    case IFN_##NAME##_HI:					  \
+    case IFN_##NAME##_LO:					  \
+    case IFN_##NAME##_EVEN:					  \
+    case IFN_##NAME##_ODD:					  \
+      return true;
+    #include "internal-fn.def"
+    #undef DEF_INTERNAL_WIDENING_OPTAB_FN
+
+    default:
+      return false;
+    }
+}
+
+/* Return true if this CODE describes an internal_fn that returns a vector with
+   elements half as wide as the elements of the input vectors.  */
+
+bool
+narrowing_fn_p (code_helper code)
+{
+  if (!code.is_fn_code ())
+    return false;
+
+  if (!internal_fn_p ((combined_fn) code))
+    return false;
+
+  internal_fn fn = as_internal_fn ((combined_fn) code);
+  switch (fn)
+    {
+    #undef DEF_INTERNAL_NARROWING_OPTAB_FN
+    #define DEF_INTERNAL_NARROWING_OPTAB_FN(NAME, F, O, T)  \
+    case IFN_##NAME:					    \
+    case IFN_##NAME##_HI:				    \
+    case IFN_##NAME##_LO:				    \
+    case IFN_##NAME##_EVEN:				    \
+    case IFN_##NAME##_ODD:				    \
+      return true;
+    #include "internal-fn.def"
+    #undef DEF_INTERNAL_NARROWING_OPTAB_FN
+
+    default:
+      return false;
+    }
+}
+
 /* Return true if IFN_SET_EDOM is supported.  */
 
 bool
@@ -4072,6 +4202,8 @@ set_edom_supported_p (void)
     expand_##TYPE##_optab_fn (fn, stmt, which_optab);			\
   }
 #include "internal-fn.def"
+#undef DEF_INTERNAL_OPTAB_FN
+#undef DEF_INTERNAL_SIGNED_OPTAB_FN
 
 /* Routines to expand each internal function, indexed by function number.
    Each routine has the prototype:
@@ -4080,6 +4212,7 @@ set_edom_supported_p (void)
 
    where STMT is the statement that performs the call. */
 static void (*const internal_fn_expanders[]) (internal_fn, gcall *) = {
+
 #define DEF_INTERNAL_FN(CODE, FLAGS, FNSPEC) expand_##CODE,
 #include "internal-fn.def"
   0
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 7fe742c2ae713e7152ab05cfdfba86e4e0aa3456..e9edaa201ad4ad171a49119efa9d6bff49add9f4 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -85,6 +85,34 @@ along with GCC; see the file COPYING3.  If not see
    says that the function extends the C-level BUILT_IN_<NAME>{,L,LL,IMAX}
    group of functions to any integral mode (including vector modes).
 
+   DEF_INTERNAL_WIDENING_OPTAB_FN is a wrapper that defines five internal
+   functions with DEF_INTERNAL_SIGNED_OPTAB_FN:
+   - one that describes a widening operation with the same number of elements
+   in the output and input vectors,
+   - two that describe a pair of high-low widening operations where the output
+   vectors each have half the number of elements of the input vectors,
+   corresponding to the result of the widening operation on the top half and
+   bottom half, these have the suffixes _HI and _LO,
+   - and two that describe a pair of even-odd widening operations where the
+   output vectors each have half the number of elements of the input vectors,
+   corresponding to the result of the widening operation on the even and odd
+   elements, these have the suffixes _EVEN and _ODD.
+   These five internal functions will require two optabs each, a SIGNED_OPTAB
+   and an UNSIGNED_OPTAB.
+
+   DEF_INTERNAL_NARROWING_OPTAB_FN is a wrapper that defines five internal
+   functions with DEF_INTERNAL_OPTAB_FN:
+   - one that describes a narrowing operation with the same number of elements
+   in the output and input vectors,
+   - two that describe a pair of high-low narrowing operations where the output
+   vector has the same number of elements in the top or bottom halves as the
+   full input vectors, these have the suffixes _HI and _LO.
+   - and two that describe a pair of even-odd narrowing operations where the
+   output vector has the same number of elements, in the even or odd positions,
+   as the full input vectors, these have the suffixes _EVEN and _ODD.
+   These five internal functions will require an optab each.
+
+
    Each entry must have a corresponding expander of the form:
 
      void expand_NAME (gimple_call stmt)
@@ -123,6 +151,24 @@ along with GCC; see the file COPYING3.  If not see
   DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE)
 #endif
 
+#ifndef DEF_INTERNAL_WIDENING_OPTAB_FN
+#define DEF_INTERNAL_WIDENING_OPTAB_FN(NAME, FLAGS, SELECTOR, SOPTAB, UOPTAB, TYPE)		    \
+  DEF_INTERNAL_SIGNED_OPTAB_FN (NAME, FLAGS, SELECTOR, SOPTAB, UOPTAB, TYPE)			    \
+  DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _LO, FLAGS, SELECTOR, SOPTAB##_lo, UOPTAB##_lo, TYPE)	    \
+  DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _HI, FLAGS, SELECTOR, SOPTAB##_hi, UOPTAB##_hi, TYPE)	    \
+  DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _EVEN, FLAGS, SELECTOR, SOPTAB##_even, UOPTAB##_even, TYPE) \
+  DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _ODD, FLAGS, SELECTOR, SOPTAB##_odd, UOPTAB##_odd, TYPE)
+#endif
+
+#ifndef DEF_INTERNAL_NARROWING_OPTAB_FN
+#define DEF_INTERNAL_NARROWING_OPTAB_FN(NAME, FLAGS, OPTAB, TYPE)   \
+  DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE)		    \
+  DEF_INTERNAL_OPTAB_FN (NAME ## _LO, FLAGS, OPTAB##_lo, TYPE)	    \
+  DEF_INTERNAL_OPTAB_FN (NAME ## _HI, FLAGS, OPTAB##_hi, TYPE)	    \
+  DEF_INTERNAL_OPTAB_FN (NAME ## _EVEN, FLAGS, OPTAB##_even, TYPE)  \
+  DEF_INTERNAL_OPTAB_FN (NAME ## _ODD, FLAGS, OPTAB##_odd, TYPE)
+#endif
+
 DEF_INTERNAL_OPTAB_FN (MASK_LOAD, ECF_PURE, maskload, mask_load)
 DEF_INTERNAL_OPTAB_FN (LOAD_LANES, ECF_CONST, vec_load_lanes, load_lanes)
 DEF_INTERNAL_OPTAB_FN (MASK_LOAD_LANES, ECF_PURE,
@@ -315,6 +361,16 @@ DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT270, ECF_CONST, cadd270, binary)
 DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL, ECF_CONST, cmul, binary)
 DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL_CONJ, ECF_CONST, cmul_conj, binary)
 DEF_INTERNAL_OPTAB_FN (VEC_ADDSUB, ECF_CONST, vec_addsub, binary)
+DEF_INTERNAL_WIDENING_OPTAB_FN (VEC_WIDEN_PLUS,
+				ECF_CONST | ECF_NOTHROW,
+				first,
+				vec_widen_sadd, vec_widen_uadd,
+				binary)
+DEF_INTERNAL_WIDENING_OPTAB_FN (VEC_WIDEN_MINUS,
+				ECF_CONST | ECF_NOTHROW,
+				first,
+				vec_widen_ssub, vec_widen_usub,
+				binary)
 DEF_INTERNAL_OPTAB_FN (VEC_FMADDSUB, ECF_CONST, vec_fmaddsub, ternary)
 DEF_INTERNAL_OPTAB_FN (VEC_FMSUBADD, ECF_CONST, vec_fmsubadd, ternary)
 
diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h
index 08922ed4254898f5fffca3f33973e96ed9ce772f..3904ba3ca36949d844532a6a9303f550533311a4 100644
--- a/gcc/internal-fn.h
+++ b/gcc/internal-fn.h
@@ -20,6 +20,10 @@ along with GCC; see the file COPYING3.  If not see
 #ifndef GCC_INTERNAL_FN_H
 #define GCC_INTERNAL_FN_H
 
+#include "insn-codes.h"
+#include "insn-opinit.h"
+
+
 /* INTEGER_CST values for IFN_UNIQUE function arg-0.
 
    UNSPEC: Undifferentiated UNIQUE.
@@ -112,6 +116,10 @@ internal_fn_name (enum internal_fn fn)
 }
 
 extern internal_fn lookup_internal_fn (const char *);
+extern void lookup_hilo_internal_fn (internal_fn, internal_fn *, internal_fn *);
+extern void lookup_evenodd_internal_fn (internal_fn, internal_fn *,
+					internal_fn *);
+extern optab direct_internal_fn_optab (internal_fn, tree_pair);
 
 /* Return the ECF_* flags for function FN.  */
 
@@ -210,6 +218,8 @@ extern bool commutative_binary_fn_p (internal_fn);
 extern bool commutative_ternary_fn_p (internal_fn);
 extern int first_commutative_argument (internal_fn);
 extern bool associative_binary_fn_p (internal_fn);
+extern bool widening_fn_p (code_helper);
+extern bool narrowing_fn_p (code_helper);
 
 extern bool set_edom_supported_p (void);
 
diff --git a/gcc/optabs.cc b/gcc/optabs.cc
index c8e39c82d57a7d726e7da33d247b80f32ec9236c..5a08d91e550b2d92e9572211f811fdba99a33a38 100644
--- a/gcc/optabs.cc
+++ b/gcc/optabs.cc
@@ -1314,7 +1314,15 @@ commutative_optab_p (optab binoptab)
 	  || binoptab == smul_widen_optab
 	  || binoptab == umul_widen_optab
 	  || binoptab == smul_highpart_optab
-	  || binoptab == umul_highpart_optab);
+	  || binoptab == umul_highpart_optab
+	  || binoptab == vec_widen_saddl_hi_optab
+	  || binoptab == vec_widen_saddl_lo_optab
+	  || binoptab == vec_widen_uaddl_hi_optab
+	  || binoptab == vec_widen_uaddl_lo_optab
+	  || binoptab == vec_widen_sadd_hi_optab
+	  || binoptab == vec_widen_sadd_lo_optab
+	  || binoptab == vec_widen_uadd_hi_optab
+	  || binoptab == vec_widen_uadd_lo_optab);
 }
 
 /* X is to be used in mode MODE as operand OPN to BINOPTAB.  If we're
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 695f5911b300c9ca5737de9be809fa01aabe5e01..d41ed6e1afaddd019c7470f965c0ad21c8b2b9d7 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -410,6 +410,16 @@ OPTAB_D (vec_widen_ssubl_hi_optab, "vec_widen_ssubl_hi_$a")
 OPTAB_D (vec_widen_ssubl_lo_optab, "vec_widen_ssubl_lo_$a")
 OPTAB_D (vec_widen_saddl_hi_optab, "vec_widen_saddl_hi_$a")
 OPTAB_D (vec_widen_saddl_lo_optab, "vec_widen_saddl_lo_$a")
+OPTAB_D (vec_widen_ssub_optab, "vec_widen_ssub_$a")
+OPTAB_D (vec_widen_ssub_hi_optab, "vec_widen_ssub_hi_$a")
+OPTAB_D (vec_widen_ssub_lo_optab, "vec_widen_ssub_lo_$a")
+OPTAB_D (vec_widen_ssub_odd_optab, "vec_widen_ssub_odd_$a")
+OPTAB_D (vec_widen_ssub_even_optab, "vec_widen_ssub_even_$a")
+OPTAB_D (vec_widen_sadd_optab, "vec_widen_sadd_$a")
+OPTAB_D (vec_widen_sadd_hi_optab, "vec_widen_sadd_hi_$a")
+OPTAB_D (vec_widen_sadd_lo_optab, "vec_widen_sadd_lo_$a")
+OPTAB_D (vec_widen_sadd_odd_optab, "vec_widen_sadd_odd_$a")
+OPTAB_D (vec_widen_sadd_even_optab, "vec_widen_sadd_even_$a")
 OPTAB_D (vec_widen_sshiftl_hi_optab, "vec_widen_sshiftl_hi_$a")
 OPTAB_D (vec_widen_sshiftl_lo_optab, "vec_widen_sshiftl_lo_$a")
 OPTAB_D (vec_widen_umult_even_optab, "vec_widen_umult_even_$a")
@@ -422,6 +432,16 @@ OPTAB_D (vec_widen_usubl_hi_optab, "vec_widen_usubl_hi_$a")
 OPTAB_D (vec_widen_usubl_lo_optab, "vec_widen_usubl_lo_$a")
 OPTAB_D (vec_widen_uaddl_hi_optab, "vec_widen_uaddl_hi_$a")
 OPTAB_D (vec_widen_uaddl_lo_optab, "vec_widen_uaddl_lo_$a")
+OPTAB_D (vec_widen_usub_optab, "vec_widen_usub_$a")
+OPTAB_D (vec_widen_usub_hi_optab, "vec_widen_usub_hi_$a")
+OPTAB_D (vec_widen_usub_lo_optab, "vec_widen_usub_lo_$a")
+OPTAB_D (vec_widen_usub_odd_optab, "vec_widen_usub_odd_$a")
+OPTAB_D (vec_widen_usub_even_optab, "vec_widen_usub_even_$a")
+OPTAB_D (vec_widen_uadd_optab, "vec_widen_uadd_$a")
+OPTAB_D (vec_widen_uadd_hi_optab, "vec_widen_uadd_hi_$a")
+OPTAB_D (vec_widen_uadd_lo_optab, "vec_widen_uadd_lo_$a")
+OPTAB_D (vec_widen_uadd_odd_optab, "vec_widen_uadd_odd_$a")
+OPTAB_D (vec_widen_uadd_even_optab, "vec_widen_uadd_even_$a")
 OPTAB_D (vec_addsub_optab, "vec_addsub$a3")
 OPTAB_D (vec_fmaddsub_optab, "vec_fmaddsub$a4")
 OPTAB_D (vec_fmsubadd_optab, "vec_fmsubadd$a4")
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
index 220bd9352a4c7acd2e3713e441d74898d3e92b30..7037673d32bd780e1c9b58a51e58e2bac3b30b7e 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
+++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O3 -save-temps" } */
+/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */
 #include <stdint.h>
 #include <string.h>
 
@@ -86,6 +86,8 @@ main()
     return 0;
 }
 
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_LO" "vect"   } } */
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_HI" "vect"   } } */
 /* { dg-final { scan-assembler-times {\tuaddl\t} 1} } */
 /* { dg-final { scan-assembler-times {\tuaddl2\t} 1} } */
 /* { dg-final { scan-assembler-times {\tsaddl\t} 1} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c
index a2bed63affbd091977df95a126da1f5b8c1d41d2..83bc1edb6105f47114b665e24a13e6194b2179a2 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c
+++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O3 -save-temps" } */
+/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */
 #include <stdint.h>
 #include <string.h>
 
@@ -86,6 +86,8 @@ main()
     return 0;
 }
 
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_LO" "vect"   } } */
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_HI" "vect"   } } */
 /* { dg-final { scan-assembler-times {\tusubl\t} 1} } */
 /* { dg-final { scan-assembler-times {\tusubl2\t} 1} } */
 /* { dg-final { scan-assembler-times {\tssubl\t} 1} } */
diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
index 0aeebb67fac864db284985f4a6f0653af281d62b..0e847cd04ca6e33f67a86a78a36d35d42aba2627 100644
--- a/gcc/tree-cfg.cc
+++ b/gcc/tree-cfg.cc
@@ -65,6 +65,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "asan.h"
 #include "profile.h"
 #include "sreal.h"
+#include "internal-fn.h"
 
 /* This file contains functions for building the Control Flow Graph (CFG)
    for a function tree.  */
@@ -3411,6 +3412,40 @@ verify_gimple_call (gcall *stmt)
 	  debug_generic_stmt (fn);
 	  return true;
 	}
+      internal_fn ifn = gimple_call_internal_fn (stmt);
+      if (ifn == IFN_LAST)
+	{
+	  error ("gimple call has an invalid IFN");
+	  debug_generic_stmt (fn);
+	  return true;
+	}
+      else if (widening_fn_p (ifn)
+	       || narrowing_fn_p (ifn))
+	{
+	  tree lhs = gimple_get_lhs (stmt);
+	  if (!lhs)
+	    {
+	      error ("vector IFN call with no lhs");
+	      debug_generic_stmt (fn);
+	      return true;
+	    }
+
+	  bool non_vector_operands = false;
+	  for (unsigned i = 0; i < gimple_call_num_args (stmt); ++i)
+	    if (!VECTOR_TYPE_P (TREE_TYPE (gimple_call_arg (stmt, i))))
+	      {
+		non_vector_operands = true;
+		break;
+	      }
+
+	  if (non_vector_operands
+	      || !VECTOR_TYPE_P (TREE_TYPE (lhs)))
+	    {
+	      error ("invalid non-vector operands in vector IFN call");
+	      debug_generic_stmt (fn);
+	      return true;
+	    }
+	}
     }
   else
     {
diff --git a/gcc/tree-inline.cc b/gcc/tree-inline.cc
index 63a19f8d1d89c6bd5d8e55a299cbffaa324b4b84..d74d8db2173b1ab117250fea89de5212d5e354ec 100644
--- a/gcc/tree-inline.cc
+++ b/gcc/tree-inline.cc
@@ -4433,7 +4433,20 @@ estimate_num_insns (gimple *stmt, eni_weights *weights)
 	tree decl;
 
 	if (gimple_call_internal_p (stmt))
-	  return 0;
+	  {
+	    internal_fn fn = gimple_call_internal_fn (stmt);
+	    switch (fn)
+	      {
+	      case IFN_VEC_WIDEN_PLUS_HI:
+	      case IFN_VEC_WIDEN_PLUS_LO:
+	      case IFN_VEC_WIDEN_MINUS_HI:
+	      case IFN_VEC_WIDEN_MINUS_LO:
+		return 1;
+
+	      default:
+		return 0;
+	      }
+	  }
 	else if ((decl = gimple_call_fndecl (stmt))
 		 && fndecl_built_in_p (decl))
 	  {
diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index 1778af0242898e3dc73d94d22a5b8505628a53b5..dcd4b5561600346a2c10bd5133507329206e8837 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -562,21 +562,30 @@ vect_joust_widened_type (tree type, tree new_type, tree *common_type)
 
 static unsigned int
 vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code,
-		      tree_code widened_code, bool shift_p,
+		      code_helper widened_code, bool shift_p,
 		      unsigned int max_nops,
 		      vect_unpromoted_value *unprom, tree *common_type,
 		      enum optab_subtype *subtype = NULL)
 {
   /* Check for an integer operation with the right code.  */
-  gassign *assign = dyn_cast <gassign *> (stmt_info->stmt);
-  if (!assign)
+  gimple* stmt = stmt_info->stmt;
+  if (!(is_gimple_assign (stmt) || is_gimple_call (stmt)))
+    return 0;
+
+  code_helper rhs_code;
+  if (is_gimple_assign (stmt))
+    rhs_code = gimple_assign_rhs_code (stmt);
+  else if (is_gimple_call (stmt))
+    rhs_code = gimple_call_combined_fn (stmt);
+  else
     return 0;
 
-  tree_code rhs_code = gimple_assign_rhs_code (assign);
-  if (rhs_code != code && rhs_code != widened_code)
+  if (rhs_code != code
+      && rhs_code != widened_code)
     return 0;
 
-  tree type = TREE_TYPE (gimple_assign_lhs (assign));
+  tree lhs = gimple_get_lhs (stmt);
+  tree type = TREE_TYPE (lhs);
   if (!INTEGRAL_TYPE_P (type))
     return 0;
 
@@ -589,7 +598,7 @@ vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code,
     {
       vect_unpromoted_value *this_unprom = &unprom[next_op];
       unsigned int nops = 1;
-      tree op = gimple_op (assign, i + 1);
+      tree op = gimple_arg (stmt, i);
       if (i == 1 && TREE_CODE (op) == INTEGER_CST)
 	{
 	  /* We already have a common type from earlier operands.
@@ -1343,7 +1352,8 @@ vect_recog_sad_pattern (vec_info *vinfo,
   /* FORNOW.  Can continue analyzing the def-use chain when this stmt in a phi
      inside the loop (in case we are analyzing an outer-loop).  */
   vect_unpromoted_value unprom[2];
-  if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR, WIDEN_MINUS_EXPR,
+  if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR,
+			     IFN_VEC_WIDEN_MINUS,
 			     false, 2, unprom, &half_type))
     return NULL;
 
@@ -1395,14 +1405,16 @@ static gimple *
 vect_recog_widen_op_pattern (vec_info *vinfo,
 			     stmt_vec_info last_stmt_info, tree *type_out,
 			     tree_code orig_code, code_helper wide_code,
-			     bool shift_p, const char *name)
+			     bool shift_p, const char *name,
+			     optab_subtype *subtype = NULL)
 {
   gimple *last_stmt = last_stmt_info->stmt;
 
   vect_unpromoted_value unprom[2];
   tree half_type;
   if (!vect_widened_op_tree (vinfo, last_stmt_info, orig_code, orig_code,
-			     shift_p, 2, unprom, &half_type))
+			     shift_p, 2, unprom, &half_type, subtype))
+
     return NULL;
 
   /* Pattern detected.  */
@@ -1468,6 +1480,20 @@ vect_recog_widen_op_pattern (vec_info *vinfo,
 			      type, pattern_stmt, vecctype);
 }
 
+static gimple *
+vect_recog_widen_op_pattern (vec_info *vinfo,
+			     stmt_vec_info last_stmt_info, tree *type_out,
+			     tree_code orig_code, internal_fn wide_ifn,
+			     bool shift_p, const char *name,
+			     optab_subtype *subtype = NULL)
+{
+  combined_fn ifn = as_combined_fn (wide_ifn);
+  return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
+				      orig_code, ifn, shift_p, name,
+				      subtype);
+}
+
+
 /* Try to detect multiplication on widened inputs, converting MULT_EXPR
    to WIDEN_MULT_EXPR.  See vect_recog_widen_op_pattern for details.  */
 
@@ -1481,26 +1507,30 @@ vect_recog_widen_mult_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
 }
 
 /* Try to detect addition on widened inputs, converting PLUS_EXPR
-   to WIDEN_PLUS_EXPR.  See vect_recog_widen_op_pattern for details.  */
+   to IFN_VEC_WIDEN_PLUS.  See vect_recog_widen_op_pattern for details.  */
 
 static gimple *
 vect_recog_widen_plus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
 			       tree *type_out)
 {
+  optab_subtype subtype;
   return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
-				      PLUS_EXPR, WIDEN_PLUS_EXPR, false,
-				      "vect_recog_widen_plus_pattern");
+				      PLUS_EXPR, IFN_VEC_WIDEN_PLUS,
+				      false, "vect_recog_widen_plus_pattern",
+				      &subtype);
 }
 
 /* Try to detect subtraction on widened inputs, converting MINUS_EXPR
-   to WIDEN_MINUS_EXPR.  See vect_recog_widen_op_pattern for details.  */
+   to IFN_VEC_WIDEN_MINUS.  See vect_recog_widen_op_pattern for details.  */
 static gimple *
 vect_recog_widen_minus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
 			       tree *type_out)
 {
+  optab_subtype subtype;
   return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
-				      MINUS_EXPR, WIDEN_MINUS_EXPR, false,
-				      "vect_recog_widen_minus_pattern");
+				      MINUS_EXPR, IFN_VEC_WIDEN_MINUS,
+				      false, "vect_recog_widen_minus_pattern",
+				      &subtype);
 }
 
 /* Function vect_recog_ctz_ffs_pattern
@@ -3078,7 +3108,7 @@ vect_recog_average_pattern (vec_info *vinfo,
   vect_unpromoted_value unprom[3];
   tree new_type;
   unsigned int nops = vect_widened_op_tree (vinfo, plus_stmt_info, PLUS_EXPR,
-					    WIDEN_PLUS_EXPR, false, 3,
+					    IFN_VEC_WIDEN_PLUS, false, 3,
 					    unprom, &new_type);
   if (nops == 0)
     return NULL;
@@ -6469,6 +6499,7 @@ static vect_recog_func vect_vect_recog_func_ptrs[] = {
   { vect_recog_mask_conversion_pattern, "mask_conversion" },
   { vect_recog_widen_plus_pattern, "widen_plus" },
   { vect_recog_widen_minus_pattern, "widen_minus" },
+  /* These must come after the double widening ones.  */
 };
 
 const unsigned int NUM_PATTERNS = ARRAY_SIZE (vect_vect_recog_func_ptrs);
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index d152ae9ab10b361b88c0f839d6951c43b954750a..132c0337b7f541bfb114c0a3d2abbeffdad79880 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -5038,7 +5038,8 @@ vectorizable_conversion (vec_info *vinfo,
   bool widen_arith = (code == WIDEN_PLUS_EXPR
 		 || code == WIDEN_MINUS_EXPR
 		 || code == WIDEN_MULT_EXPR
-		 || code == WIDEN_LSHIFT_EXPR);
+		 || code == WIDEN_LSHIFT_EXPR
+		 || widening_fn_p (code));
 
   if (!widen_arith
       && !CONVERT_EXPR_CODE_P (code)
@@ -5088,8 +5089,8 @@ vectorizable_conversion (vec_info *vinfo,
       gcc_assert (code == WIDEN_MULT_EXPR
 		  || code == WIDEN_LSHIFT_EXPR
 		  || code == WIDEN_PLUS_EXPR
-		  || code == WIDEN_MINUS_EXPR);
-
+		  || code == WIDEN_MINUS_EXPR
+		  || widening_fn_p (code));
 
       op1 = is_gimple_assign (stmt) ? gimple_assign_rhs2 (stmt) :
 				     gimple_call_arg (stmt, 0);
@@ -12478,26 +12479,69 @@ supportable_widening_operation (vec_info *vinfo,
       optab1 = vec_unpacks_sbool_lo_optab;
       optab2 = vec_unpacks_sbool_hi_optab;
     }
-  else
-    {
-      optab1 = optab_for_tree_code (c1, vectype, optab_default);
-      optab2 = optab_for_tree_code (c2, vectype, optab_default);
+
+  vec_mode = TYPE_MODE (vectype);
+  if (widening_fn_p (code))
+     {
+       /* If this is an internal fn then we must check whether the target
+	  supports either a low-high split or an even-odd split.  */
+      internal_fn ifn = as_internal_fn ((combined_fn) code);
+
+      internal_fn lo, hi, even, odd;
+      lookup_hilo_internal_fn (ifn, &lo, &hi);
+      *code1 = as_combined_fn (lo);
+      *code2 = as_combined_fn (hi);
+      optab1 = direct_internal_fn_optab (lo, {vectype, vectype});
+      optab2 = direct_internal_fn_optab (hi, {vectype, vectype});
+
+      /* If we don't support low-high, then check for even-odd.  */
+      if (!optab1
+	  || (icode1 = optab_handler (optab1, vec_mode)) == CODE_FOR_nothing
+	  || !optab2
+	  || (icode2 = optab_handler (optab2, vec_mode)) == CODE_FOR_nothing)
+	{
+	  lookup_evenodd_internal_fn (ifn, &even, &odd);
+	  *code1 = as_combined_fn (even);
+	  *code2 = as_combined_fn (odd);
+	  optab1 = direct_internal_fn_optab (even, {vectype, vectype});
+	  optab2 = direct_internal_fn_optab (odd, {vectype, vectype});
+	}
+    }
+  else if (code.is_tree_code ())
+    {
+      if (code == FIX_TRUNC_EXPR)
+	{
+	  /* The signedness is determined from output operand.  */
+	  optab1 = optab_for_tree_code (c1, vectype_out, optab_default);
+	  optab2 = optab_for_tree_code (c2, vectype_out, optab_default);
+	}
+      else if (CONVERT_EXPR_CODE_P ((tree_code) code.safe_as_tree_code ())
+	       && VECTOR_BOOLEAN_TYPE_P (wide_vectype)
+	       && VECTOR_BOOLEAN_TYPE_P (vectype)
+	       && TYPE_MODE (wide_vectype) == TYPE_MODE (vectype)
+	       && SCALAR_INT_MODE_P (TYPE_MODE (vectype)))
+	{
+	  /* If the input and result modes are the same, a different optab
+	     is needed where we pass in the number of units in vectype.  */
+	  optab1 = vec_unpacks_sbool_lo_optab;
+	  optab2 = vec_unpacks_sbool_hi_optab;
+	}
+      else
+	{
+	  optab1 = optab_for_tree_code (c1, vectype, optab_default);
+	  optab2 = optab_for_tree_code (c2, vectype, optab_default);
+	}
+      *code1 = c1;
+      *code2 = c2;
     }
 
   if (!optab1 || !optab2)
     return false;
 
-  vec_mode = TYPE_MODE (vectype);
   if ((icode1 = optab_handler (optab1, vec_mode)) == CODE_FOR_nothing
        || (icode2 = optab_handler (optab2, vec_mode)) == CODE_FOR_nothing)
     return false;
 
-  if (code.is_tree_code ())
-  {
-    *code1 = c1;
-    *code2 = c2;
-  }
-
 
   if (insn_data[icode1].operand[0].mode == TYPE_MODE (wide_vectype)
       && insn_data[icode2].operand[0].mode == TYPE_MODE (wide_vectype))
diff --git a/gcc/tree.def b/gcc/tree.def
index 90ceeec0b512bfa5f983359c0af03cc71de32007..b37b0b35927b92a6536e5c2d9805ffce8319a240 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -1374,15 +1374,16 @@ DEFTREECODE (DOT_PROD_EXPR, "dot_prod_expr", tcc_expression, 3)
 DEFTREECODE (WIDEN_SUM_EXPR, "widen_sum_expr", tcc_binary, 2)
 
 /* Widening sad (sum of absolute differences).
-   The first two arguments are of type t1 which should be integer.
-   The third argument and the result are of type t2, such that t2 is at least
-   twice the size of t1.  Like DOT_PROD_EXPR, SAD_EXPR (arg1,arg2,arg3) is
+   The first two arguments are of type t1 which should be a vector of integers.
+   The third argument and the result are of type t2, such that the size of
+   the elements of t2 is at least twice the size of the elements of t1.
+   Like DOT_PROD_EXPR, SAD_EXPR (arg1,arg2,arg3) is
    equivalent to:
-       tmp = WIDEN_MINUS_EXPR (arg1, arg2)
+       tmp = IFN_VEC_WIDEN_MINUS (arg1, arg2)
        tmp2 = ABS_EXPR (tmp)
        arg3 = PLUS_EXPR (tmp2, arg3)
   or:
-       tmp = WIDEN_MINUS_EXPR (arg1, arg2)
+       tmp = IFN_VEC_WIDEN_MINUS (arg1, arg2)
        tmp2 = ABS_EXPR (tmp)
        arg3 = WIDEN_SUM_EXPR (tmp2, arg3)
  */

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2/3] Refactor widen_plus as internal_fn
  2023-05-18 17:15                                                   ` Andre Vieira (lists)
@ 2023-05-22 13:06                                                     ` Richard Biener
  2023-06-01 16:27                                                       ` Andre Vieira (lists)
  0 siblings, 1 reply; 53+ messages in thread
From: Richard Biener @ 2023-05-22 13:06 UTC (permalink / raw)
  To: Andre Vieira (lists); +Cc: Richard Sandiford, Richard Biener, gcc-patches

On Thu, 18 May 2023, Andre Vieira (lists) wrote:

> How about this?
> 
> Not sure about the DEF_INTERNAL documentation I rewrote in internal-fn.def,
> was struggling to word these, so improvements welcome!

The even/odd variant optabs are also commutative_optab_p, so is
the vec_widen_sadd without hi/lo or even/odd.

+/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */

do you really want -all?  I think you want -details

+      else if (widening_fn_p (ifn)
+              || narrowing_fn_p (ifn))
+       {
+         tree lhs = gimple_get_lhs (stmt);
+         if (!lhs)
+           {
+             error ("vector IFN call with no lhs");
+             debug_generic_stmt (fn);

that's an error because ...?  Maybe we want to verify this
for all ECF_CONST|ECF_NOTHROW (or pure instead of const) internal
function calls, but I wouldn't add any verification as part
of this patch (not special to widening/narrowing fns either).

        if (gimple_call_internal_p (stmt))
-         return 0;
+         {
+           internal_fn fn = gimple_call_internal_fn (stmt);
+           switch (fn)
+             {
+             case IFN_VEC_WIDEN_PLUS_HI:
+             case IFN_VEC_WIDEN_PLUS_LO:
+             case IFN_VEC_WIDEN_MINUS_HI:
+             case IFN_VEC_WIDEN_MINUS_LO:
+               return 1;

this now looks incomplete.  I think that we want instead to
have a default: returning 1 and then special-cases we want
to cost as zero.  Not sure which - maybe blame tells why
this was added?  I think we can deal with this as followup
(likewise the ranger additions).
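
A minimal sketch of that inverted default, using a hypothetical cut-down enum
(the toy_* names and the choice of which function is free are assumptions for
illustration, not taken from tree-inline.cc):

```cpp
#include <cassert>

// Hypothetical, cut-down set of internal functions for illustration only.
enum toy_ifn
{
  TOY_IFN_VEC_WIDEN_PLUS_LO,
  TOY_IFN_VEC_WIDEN_PLUS_EVEN,
  TOY_IFN_ANNOTATION   // assumed here to expand to no instructions
};

// Cost every internal function as 1 unless it is explicitly listed as free,
// so newly added functions (e.g. the _EVEN/_ODD variants) need no extra case.
static int
toy_estimate_internal_fn_cost (toy_ifn fn)
{
  switch (fn)
    {
    case TOY_IFN_ANNOTATION:
      return 0;

    default:
      return 1;
    }
}
```

With this shape the hi/lo and even/odd variants are all costed alike without
being enumerated.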

Otherwise looks good to me.

Thanks,
Richard.

> gcc/ChangeLog:
> 
> 2023-04-25  Andre Vieira  <andre.simoesdiasvieira@arm.com>
>             Joel Hutton  <joel.hutton@arm.com>
>             Tamar Christina  <tamar.christina@arm.com>
> 
>         * config/aarch64/aarch64-simd.md (vec_widen_<su>addl_lo_<mode>):
> Rename
>         this ...
>         (vec_widen_<su>add_lo_<mode>): ... to this.
>         (vec_widen_<su>addl_hi_<mode>): Rename this ...
>         (vec_widen_<su>add_hi_<mode>): ... to this.
>         (vec_widen_<su>subl_lo_<mode>): Rename this ...
>         (vec_widen_<su>sub_lo_<mode>): ... to this.
>         (vec_widen_<su>subl_hi_<mode>): Rename this ...
>         (vec_widen_<su>sub_hi_<mode>): ...to this.
>         * doc/generic.texi: Document new IFN codes.
> 	* internal-fn.cc (ifn_cmp): Function to compare ifn's for
> sorting/searching.
> 	(lookup_hilo_internal_fn): Add lookup function.
> 	(commutative_binary_fn_p): Add widen_plus fn's.
> 	(widening_fn_p): New function.
> 	(narrowing_fn_p): New function.
> 	         (direct_internal_fn_optab): Change visibility.
> 	* internal-fn.def (DEF_INTERNAL_WIDENING_OPTAB_FN): Macro to define an
>         internal_fn that expands into multiple internal_fns for widening.
>         (DEF_INTERNAL_NARROWING_OPTAB_FN): Likewise but for narrowing.
>         (IFN_VEC_WIDEN_PLUS, IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO,
>          IFN_VEC_WIDEN_PLUS_EVEN, IFN_VEC_WIDEN_PLUS_ODD,
>          IFN_VEC_WIDEN_MINUS, IFN_VEC_WIDEN_MINUS_HI, 
> IFN_VEC_WIDEN_MINUS_LO,
>          IFN_VEC_WIDEN_MINUS_ODD, IFN_VEC_WIDEN_MINUS_EVEN): Define widening
> 	         plus,minus functions.
> 	* internal-fn.h (direct_internal_fn_optab): Declare new prototype.
> 	(lookup_hilo_internal_fn): Likewise.
> 	(widening_fn_p): Likewise.
> 	(narrowing_fn_p): Likewise.
> 	* optabs.cc (commutative_optab_p): Add widening plus optabs.
> 	* optabs.def (OPTAB_D): Define widen add, sub optabs.
>         * tree-cfg.cc (verify_gimple_call): Add checks for widening ifns.
>         * tree-inline.cc (estimate_num_insns): Return same
>         cost for widen add and sub IFNs as previous tree_codes.
> 	* tree-vect-patterns.cc (vect_recog_widen_op_pattern): Support
>         patterns with a hi/lo or even/odd split.
>         (vect_recog_sad_pattern): Refactor to use new IFN codes.
>         (vect_recog_widen_plus_pattern): Likewise.
>         (vect_recog_widen_minus_pattern): Likewise.
>         (vect_recog_average_pattern): Likewise.
> 	* tree-vect-stmts.cc (vectorizable_conversion): Add support for
> 	         _HILO IFNs.
> 	(supportable_widening_operation): Likewise.
>         * tree.def (WIDEN_SUM_EXPR): Update example to use new IFNs.
> 
> gcc/testsuite/ChangeLog:
> 
>     	* gcc.target/aarch64/vect-widen-add.c: Test that new
>     IFN_VEC_WIDEN_PLUS is being used.
>     	* gcc.target/aarch64/vect-widen-sub.c: Test that new
>     IFN_VEC_WIDEN_MINUS is being used.
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 2/3] Refactor widen_plus as internal_fn
  2023-05-22 13:06                                                     ` Richard Biener
@ 2023-06-01 16:27                                                       ` Andre Vieira (lists)
  2023-06-02 12:00                                                         ` Richard Sandiford
  2023-06-06 19:00                                                         ` Jakub Jelinek
  0 siblings, 2 replies; 53+ messages in thread
From: Andre Vieira (lists) @ 2023-06-01 16:27 UTC (permalink / raw)
  To: Richard Biener; +Cc: Richard Sandiford, Richard Biener, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 8566 bytes --]

Hi,

This is the updated patch and cover letter. Patches for inline and 
gimple-op changes will follow soon.

     DEF_INTERNAL_WIDENING_OPTAB_FN and DEF_INTERNAL_NARROWING_OPTAB_FN
are like DEF_INTERNAL_SIGNED_OPTAB_FN and DEF_INTERNAL_OPTAB_FN
respectively, except that they provide convenience wrappers for a
single vector-to-vector conversion, a hi/lo split or an even/odd
split.  For each of the five functions that a definition for <NAME>
creates, it requires either a pair of signed and unsigned optabs
derived from <SOPTAB> and <UOPTAB> (for widening) or a single optab
derived from <OPTAB> (for narrowing).
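
As a rough illustration of the wrapper idea (not the actual internal-fn.def
machinery), the sketch below shows how one list entry can expand into five
enumerators and how the same list can then drive a predicate in the style of
widening_fn_p.  The WIDENING_FNS list, the DEF_ENUM/DEF_CASES helpers and the
toy_* names are hypothetical.

```cpp
#include <cassert>

// Hypothetical stand-in for internal-fn.def: one widening entry per line.
#define WIDENING_FNS(DEF) \
  DEF (VEC_WIDEN_PLUS)    \
  DEF (VEC_WIDEN_MINUS)

// Each entry expands to five enumerators: the plain form plus the
// _LO/_HI and _EVEN/_ODD split forms.
#define DEF_ENUM(NAME)                          \
  IFN_##NAME, IFN_##NAME##_LO, IFN_##NAME##_HI, \
  IFN_##NAME##_EVEN, IFN_##NAME##_ODD,

enum toy_internal_fn { WIDENING_FNS (DEF_ENUM) IFN_OTHER, IFN_LAST };

// The same list, expanded a second time into case labels.
#define DEF_CASES(NAME)   \
  case IFN_##NAME:        \
  case IFN_##NAME##_LO:   \
  case IFN_##NAME##_HI:   \
  case IFN_##NAME##_EVEN: \
  case IFN_##NAME##_ODD:

// Return true for any of the five variants of any widening entry.
static bool
toy_widening_fn_p (toy_internal_fn fn)
{
  switch (fn)
    {
    WIDENING_FNS (DEF_CASES)
      return true;

    default:
      return false;
    }
}
```

Adding a third entry to the list would extend both the enum and the predicate
without touching either definition.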

      For example, for widening addition
DEF_INTERNAL_WIDENING_OPTAB_FN creates five internal functions:
IFN_VEC_WIDEN_PLUS, IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO,
IFN_VEC_WIDEN_PLUS_EVEN and IFN_VEC_WIDEN_PLUS_ODD, each requiring two
optabs, one for signed and one for unsigned.
      AArch64 implements the hi/lo split optabs:
      IFN_VEC_WIDEN_PLUS_HI   -> vec_widen_<su>add_hi_<mode> -> (u/s)addl2
      IFN_VEC_WIDEN_PLUS_LO   -> vec_widen_<su>add_lo_<mode> -> (u/s)addl

     This gives the same functionality as the previous 
WIDEN_PLUS/WIDEN_MINUS tree codes which are expanded into 
VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI.
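
To make the hi/lo split concrete, the following is a scalar model of what
the full and split widening additions compute.  It is only a sketch under
stated assumptions, not vectorizer code: the demo_* functions are
hypothetical, the element types are fixed to uint8_t -> uint16_t, and "low"
is taken to mean the first N/2 lanes in array order (the real lane mapping
depends on the target).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model, not GCC code.  "Low" is taken to mean the
   first n/2 lanes in array order, which is an assumption of this sketch.  */

/* Widen and add all N lanes: OUT receives N results.  */
void
demo_widen_plus (const uint8_t *a, const uint8_t *b, size_t n, uint16_t *out)
{
  for (size_t i = 0; i < n; i++)
    out[i] = (uint16_t) (a[i] + b[i]);
}

/* Widen and add the low N/2 lanes: OUT receives N/2 results.  */
void
demo_widen_plus_lo (const uint8_t *a, const uint8_t *b, size_t n,
		    uint16_t *out)
{
  assert (n % 2 == 0);
  for (size_t i = 0; i < n / 2; i++)
    out[i] = (uint16_t) (a[i] + b[i]);
}

/* Widen and add the high N/2 lanes: OUT receives N/2 results.  */
void
demo_widen_plus_hi (const uint8_t *a, const uint8_t *b, size_t n,
		    uint16_t *out)
{
  assert (n % 2 == 0);
  for (size_t i = 0; i < n / 2; i++)
    out[i] = (uint16_t) (a[n / 2 + i] + b[n / 2 + i]);
}
```

In this model the LO result followed by the HI result equals the result of
the unsplit operation, which is the property the split relies on.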

gcc/ChangeLog:

2023-04-25  Andre Vieira  <andre.simoesdiasvieira@arm.com>
             Joel Hutton  <joel.hutton@arm.com>
             Tamar Christina  <tamar.christina@arm.com>

         * config/aarch64/aarch64-simd.md (vec_widen_<su>addl_lo_<mode>):
         Rename this ...
         (vec_widen_<su>add_lo_<mode>): ... to this.
         (vec_widen_<su>addl_hi_<mode>): Rename this ...
         (vec_widen_<su>add_hi_<mode>): ... to this.
         (vec_widen_<su>subl_lo_<mode>): Rename this ...
         (vec_widen_<su>sub_lo_<mode>): ... to this.
         (vec_widen_<su>subl_hi_<mode>): Rename this ...
         (vec_widen_<su>sub_hi_<mode>): ... to this.
         * doc/generic.texi: Document new IFN codes.
	* internal-fn.cc (ifn_cmp): Function to compare ifn's for
	sorting/searching.
	(lookup_hilo_internal_fn): Add lookup function.
	(commutative_binary_fn_p): Add widen_plus fn's.
	(widening_fn_p): New function.
	(narrowing_fn_p): New function.
         (direct_internal_fn_optab): Change visibility.
	* internal-fn.def (DEF_INTERNAL_WIDENING_OPTAB_FN): Macro to define an
         internal_fn that expands into multiple internal_fns for widening.
         (DEF_INTERNAL_NARROWING_OPTAB_FN): Likewise but for narrowing.
         (IFN_VEC_WIDEN_PLUS, IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO,
          IFN_VEC_WIDEN_PLUS_EVEN, IFN_VEC_WIDEN_PLUS_ODD,
          IFN_VEC_WIDEN_MINUS, IFN_VEC_WIDEN_MINUS_HI, IFN_VEC_WIDEN_MINUS_LO,
          IFN_VEC_WIDEN_MINUS_ODD, IFN_VEC_WIDEN_MINUS_EVEN): Define widening
         plus,minus functions.
	* internal-fn.h (direct_internal_fn_optab): Declare new prototype.
	(lookup_hilo_internal_fn): Likewise.
	(widening_fn_p): Likewise.
	(narrowing_fn_p): Likewise.
	* optabs.cc (commutative_optab_p): Add widening plus optabs.
	* optabs.def (OPTAB_D): Define widen add, sub optabs.
	* tree-vect-patterns.cc (vect_recog_widen_op_pattern): Support
         patterns with a hi/lo or even/odd split.
         (vect_recog_sad_pattern): Refactor to use new IFN codes.
         (vect_recog_widen_plus_pattern): Likewise.
         (vect_recog_widen_minus_pattern): Likewise.
         (vect_recog_average_pattern): Likewise.
	* tree-vect-stmts.cc (vectorizable_conversion): Add support for
         _HILO IFNs.
	(supportable_widening_operation): Likewise.
         * tree.def (WIDEN_SUM_EXPR): Update example to use new IFNs.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/vect-widen-add.c: Test that new
     IFN_VEC_WIDEN_PLUS is being used.
	* gcc.target/aarch64/vect-widen-sub.c: Test that new
     IFN_VEC_WIDEN_MINUS is being used.

On 22/05/2023 14:06, Richard Biener wrote:
> On Thu, 18 May 2023, Andre Vieira (lists) wrote:
> 
>> How about this?
>>
>> Not sure about the DEF_INTERNAL documentation I rewrote in internal-fn.def,
>> was struggling to word these, so improvements welcome!
> 
> The even/odd variant optabs are also commutative_optab_p, so is
> the vec_widen_sadd without hi/lo or even/odd.
> 
> +/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */
> 
> do you really want -all?  I think you want -details
> 
> +      else if (widening_fn_p (ifn)
> +              || narrowing_fn_p (ifn))
> +       {
> +         tree lhs = gimple_get_lhs (stmt);
> +         if (!lhs)
> +           {
> +             error ("vector IFN call with no lhs");
> +             debug_generic_stmt (fn);
> 
> that's an error because ...?  Maybe we want to verify this
> for all ECF_CONST|ECF_NOTHROW (or pure instead of const) internal
> function calls, but I wouldn't add any verification as part
> of this patch (not special to widening/narrowing fns either).
> 
>          if (gimple_call_internal_p (stmt))
> -         return 0;
> +         {
> +           internal_fn fn = gimple_call_internal_fn (stmt);
> +           switch (fn)
> +             {
> +             case IFN_VEC_WIDEN_PLUS_HI:
> +             case IFN_VEC_WIDEN_PLUS_LO:
> +             case IFN_VEC_WIDEN_MINUS_HI:
> +             case IFN_VEC_WIDEN_MINUS_LO:
> +               return 1;
> 
> this now looks incomplete.  I think that we want instead to
> have a default: returning 1 and then special-cases we want
> to cost as zero.  Not sure which - maybe blame tells why
> this was added?  I think we can deal with this as followup
> (likewise the ranger additions).
> 
> Otherwise looks good to me.
> 
> Thanks,
> Richard.
> 
>> gcc/ChangeLog:
>>
>> 2023-04-25  Andre Vieira  <andre.simoesdiasvieira@arm.com>
>>              Joel Hutton  <joel.hutton@arm.com>
>>              Tamar Christina  <tamar.christina@arm.com>
>>
>>          * config/aarch64/aarch64-simd.md (vec_widen_<su>addl_lo_<mode>):
>> Rename
>>          this ...
>>          (vec_widen_<su>add_lo_<mode>): ... to this.
>>          (vec_widen_<su>addl_hi_<mode>): Rename this ...
>>          (vec_widen_<su>add_hi_<mode>): ... to this.
>>          (vec_widen_<su>subl_lo_<mode>): Rename this ...
>>          (vec_widen_<su>sub_lo_<mode>): ... to this.
>>          (vec_widen_<su>subl_hi_<mode>): Rename this ...
>>          (vec_widen_<su>sub_hi_<mode>): ...to this.
>>          * doc/generic.texi: Document new IFN codes.
>> 	* internal-fn.cc (ifn_cmp): Function to compare ifn's for
>> sorting/searching.
>> 	(lookup_hilo_internal_fn): Add lookup function.
>> 	(commutative_binary_fn_p): Add widen_plus fn's.
>> 	(widening_fn_p): New function.
>> 	(narrowing_fn_p): New function.
>> 	         (direct_internal_fn_optab): Change visibility.
>> 	* internal-fn.def (DEF_INTERNAL_WIDENING_OPTAB_FN): Macro to define an
>>          internal_fn that expands into multiple internal_fns for widening.
>>          (DEF_INTERNAL_NARROWING_OPTAB_FN): Likewise but for narrowing.
>>          (IFN_VEC_WIDEN_PLUS, IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO,
>>           IFN_VEC_WIDEN_PLUS_EVEN, IFN_VEC_WIDEN_PLUS_ODD,
>>           IFN_VEC_WIDEN_MINUS, IFN_VEC_WIDEN_MINUS_HI,
>> IFN_VEC_WIDEN_MINUS_LO,
>>           IFN_VEC_WIDEN_MINUS_ODD, IFN_VEC_WIDEN_MINUS_EVEN): Define widening
>> 	         plus,minus functions.
>> 	* internal-fn.h (direct_internal_fn_optab): Declare new prototype.
>> 	(lookup_hilo_internal_fn): Likewise.
>> 	(widening_fn_p): Likewise.
>> 	(narrowing_fn_p): Likewise.
>> 	* optabs.cc (commutative_optab_p): Add widening plus optabs.
>> 	* optabs.def (OPTAB_D): Define widen add, sub optabs.
>>          * tree-cfg.cc (verify_gimple_call): Add checks for widening ifns.
>>          * tree-inline.cc (estimate_num_insns): Return same
>>          cost for widen add and sub IFNs as previous tree_codes.
>> 	* tree-vect-patterns.cc (vect_recog_widen_op_pattern): Support
>>          patterns with a hi/lo or even/odd split.
>>          (vect_recog_sad_pattern): Refactor to use new IFN codes.
>>          (vect_recog_widen_plus_pattern): Likewise.
>>          (vect_recog_widen_minus_pattern): Likewise.
>>          (vect_recog_average_pattern): Likewise.
>> 	* tree-vect-stmts.cc (vectorizable_conversion): Add support for
>> 	         _HILO IFNs.
>> 	(supportable_widening_operation): Likewise.
>>          * tree.def (WIDEN_SUM_EXPR): Update example to use new IFNs.
>>
>> gcc/testsuite/ChangeLog:
>>
>>      	* gcc.target/aarch64/vect-widen-add.c: Test that new
>>      IFN_VEC_WIDEN_PLUS is being used.
>>      	* gcc.target/aarch64/vect-widen-sub.c: Test that new
>>      IFN_VEC_WIDEN_MINUS is being used.
>>
> 

[-- Attachment #2: ifn1v5.patch --]
[-- Type: text/plain, Size: 34259 bytes --]

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index da9c59e655465a74926b81b95b4ac8c353efb1b7..b404d5cabf9df8ea8c70ea4537deb978d351c51e 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -4626,7 +4626,7 @@
   [(set_attr "type" "neon_<ADDSUB:optab>_long")]
 )
 
-(define_expand "vec_widen_<su>addl_lo_<mode>"
+(define_expand "vec_widen_<su>add_lo_<mode>"
   [(match_operand:<VWIDE> 0 "register_operand")
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand"))
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))]
@@ -4638,7 +4638,7 @@
   DONE;
 })
 
-(define_expand "vec_widen_<su>addl_hi_<mode>"
+(define_expand "vec_widen_<su>add_hi_<mode>"
   [(match_operand:<VWIDE> 0 "register_operand")
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand"))
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))]
@@ -4650,7 +4650,7 @@
   DONE;
 })
 
-(define_expand "vec_widen_<su>subl_lo_<mode>"
+(define_expand "vec_widen_<su>sub_lo_<mode>"
   [(match_operand:<VWIDE> 0 "register_operand")
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand"))
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))]
@@ -4662,7 +4662,7 @@
   DONE;
 })
 
-(define_expand "vec_widen_<su>subl_hi_<mode>"
+(define_expand "vec_widen_<su>sub_hi_<mode>"
   [(match_operand:<VWIDE> 0 "register_operand")
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand"))
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))]
diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi
index 8b2882da4fe7da07d22b4e5384d049ba7d3907bf..5e36dac2b1a10257616f12cdfb0b12d0f2879ae9 100644
--- a/gcc/doc/generic.texi
+++ b/gcc/doc/generic.texi
@@ -1811,10 +1811,16 @@ a value from @code{enum annot_expr_kind}, the third is an @code{INTEGER_CST}.
 @tindex VEC_RSHIFT_EXPR
 @tindex VEC_WIDEN_MULT_HI_EXPR
 @tindex VEC_WIDEN_MULT_LO_EXPR
-@tindex VEC_WIDEN_PLUS_HI_EXPR
-@tindex VEC_WIDEN_PLUS_LO_EXPR
-@tindex VEC_WIDEN_MINUS_HI_EXPR
-@tindex VEC_WIDEN_MINUS_LO_EXPR
+@tindex IFN_VEC_WIDEN_PLUS
+@tindex IFN_VEC_WIDEN_PLUS_HI
+@tindex IFN_VEC_WIDEN_PLUS_LO
+@tindex IFN_VEC_WIDEN_PLUS_EVEN
+@tindex IFN_VEC_WIDEN_PLUS_ODD
+@tindex IFN_VEC_WIDEN_MINUS
+@tindex IFN_VEC_WIDEN_MINUS_HI
+@tindex IFN_VEC_WIDEN_MINUS_LO
+@tindex IFN_VEC_WIDEN_MINUS_EVEN
+@tindex IFN_VEC_WIDEN_MINUS_ODD
 @tindex VEC_UNPACK_HI_EXPR
 @tindex VEC_UNPACK_LO_EXPR
 @tindex VEC_UNPACK_FLOAT_HI_EXPR
@@ -1861,6 +1867,82 @@ vector of @code{N/2} products. In the case of @code{VEC_WIDEN_MULT_LO_EXPR} the
 low @code{N/2} elements of the two vector are multiplied to produce the
 vector of @code{N/2} products.
 
+@item IFN_VEC_WIDEN_PLUS
+This internal function represents widening vector addition of two input
+vectors.  Its operands are vectors that contain the same number of elements
+(@code{N}) of the same integral type.  The result is a vector that contains
+the same number (@code{N}) of elements as the input vectors, of an integral
+type whose size is twice as wide.  If the current target does not implement the
+corresponding optabs the vectorizer may choose to split it into either a pair
+of @code{IFN_VEC_WIDEN_PLUS_HI} and @code{IFN_VEC_WIDEN_PLUS_LO} or
+@code{IFN_VEC_WIDEN_PLUS_EVEN} and @code{IFN_VEC_WIDEN_PLUS_ODD}, depending
+on what optabs the target implements.
+
+@item IFN_VEC_WIDEN_PLUS_HI
+@itemx IFN_VEC_WIDEN_PLUS_LO
+These internal functions represent widening vector addition of the high and low
+parts of the two input vectors, respectively.  Their operands are vectors that
+contain the same number of elements (@code{N}) of the same integral type. The
+result is a vector that contains half as many elements, of an integral type
+whose size is twice as wide.  In the case of @code{IFN_VEC_WIDEN_PLUS_HI} the
+high @code{N/2} elements of the two vectors are added to produce the vector of
+@code{N/2} additions.  In the case of @code{IFN_VEC_WIDEN_PLUS_LO} the low
+@code{N/2} elements of the two vectors are added to produce the vector of
+@code{N/2} additions.
+
+@item IFN_VEC_WIDEN_PLUS_EVEN
+@itemx IFN_VEC_WIDEN_PLUS_ODD
+These internal functions represent widening vector addition of the even and odd
+elements of the two input vectors, respectively.  Their operands are vectors
+that contain the same number of elements (@code{N}) of the same integral type.
+The result is a vector that contains half as many elements, of an integral type
+whose size is twice as wide.  In the case of @code{IFN_VEC_WIDEN_PLUS_EVEN} the
+even @code{N/2} elements of the two vectors are added to produce the vector of
+@code{N/2} additions.  In the case of @code{IFN_VEC_WIDEN_PLUS_ODD} the odd
+@code{N/2} elements of the two vectors are added to produce the vector of
+@code{N/2} additions.
+
+@item IFN_VEC_WIDEN_MINUS
+This internal function represents widening vector subtraction of two input
+vectors.  Its operands are vectors that contain the same number of elements
+(@code{N}) of the same integral type.  The result is a vector that contains
+the same number (@code{N}) of elements as the input vectors, of an integral
+type whose size is twice as wide.  If the current target does not implement the
+corresponding optabs the vectorizer may choose to split it into either a pair
+of @code{IFN_VEC_WIDEN_MINUS_HI} and @code{IFN_VEC_WIDEN_MINUS_LO} or
+@code{IFN_VEC_WIDEN_MINUS_EVEN} and @code{IFN_VEC_WIDEN_MINUS_ODD}, depending
+on what optabs the target implements.
+
+@item IFN_VEC_WIDEN_MINUS_HI
+@itemx IFN_VEC_WIDEN_MINUS_LO
+These internal functions represent widening vector subtraction of the high and
+low parts of the two input vectors, respectively.  Their operands are vectors
+that contain the same number of elements (@code{N}) of the same integral type.
+The high/low elements of the second vector are subtracted from the high/low
+elements of the first. The result is a vector that contains half as many
+elements, of an integral type whose size is twice as wide.  In the case of
+@code{IFN_VEC_WIDEN_MINUS_HI} the high @code{N/2} elements of the second
+vector are subtracted from the high @code{N/2} of the first to produce the
+vector of @code{N/2} subtractions.  In the case of
+@code{IFN_VEC_WIDEN_MINUS_LO} the low @code{N/2} elements of the second
+vector are subtracted from the low @code{N/2} of the first to produce the
+vector of @code{N/2} subtractions.
+
+@item IFN_VEC_WIDEN_MINUS_EVEN
+@itemx IFN_VEC_WIDEN_MINUS_ODD
+These internal functions represent widening vector subtraction of the even and
+odd parts of the two input vectors, respectively.  Their operands are vectors
+that contain the same number of elements (@code{N}) of the same integral type.
+The even/odd elements of the second vector are subtracted from the even/odd
+elements of the first. The result is a vector that contains half as many
+elements, of an integral type whose size is twice as wide.  In the case of
+@code{IFN_VEC_WIDEN_MINUS_EVEN} the even @code{N/2} elements of the second
+vector are subtracted from the even @code{N/2} of the first to produce the
+vector of @code{N/2} subtractions.  In the case of
+@code{IFN_VEC_WIDEN_MINUS_ODD} the odd @code{N/2} elements of the second
+vector are subtracted from the odd @code{N/2} of the first to produce the
+vector of @code{N/2} subtractions.
+
 @item VEC_WIDEN_PLUS_HI_EXPR
 @itemx VEC_WIDEN_PLUS_LO_EXPR
 These nodes represent widening vector addition of the high and low parts of
diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 5c9da73ea11f8060b18dcf513599c9694fa4f2ad..348bee35a35ae4ed9a8652f5349f430c2733e1cb 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -90,6 +90,71 @@ lookup_internal_fn (const char *name)
   return entry ? *entry : IFN_LAST;
 }
 
+/* Given an internal_fn IFN that is either a widening or narrowing function,
+   return its corresponding LO and HI internal_fns.  */
+
+extern void
+lookup_hilo_internal_fn (internal_fn ifn, internal_fn *lo, internal_fn *hi)
+{
+  gcc_assert (widening_fn_p (ifn) || narrowing_fn_p (ifn));
+
+  switch (ifn)
+    {
+    default:
+      gcc_unreachable ();
+#undef DEF_INTERNAL_FN
+#undef DEF_INTERNAL_WIDENING_OPTAB_FN
+#undef DEF_INTERNAL_NARROWING_OPTAB_FN
+#define DEF_INTERNAL_FN(NAME, FLAGS, TYPE)
+#define DEF_INTERNAL_WIDENING_OPTAB_FN(NAME, F, S, SO, UO, T)	\
+    case IFN_##NAME:						\
+      *lo = internal_fn (IFN_##NAME##_LO);			\
+      *hi = internal_fn (IFN_##NAME##_HI);			\
+      break;
+#define DEF_INTERNAL_NARROWING_OPTAB_FN(NAME, F, O, T)	\
+    case IFN_##NAME:					\
+      *lo = internal_fn (IFN_##NAME##_LO);		\
+      *hi = internal_fn (IFN_##NAME##_HI);		\
+      break;
+#include "internal-fn.def"
+#undef DEF_INTERNAL_FN
+#undef DEF_INTERNAL_WIDENING_OPTAB_FN
+#undef DEF_INTERNAL_NARROWING_OPTAB_FN
+    }
+}
+
+extern void
+lookup_evenodd_internal_fn (internal_fn ifn, internal_fn *even,
+			    internal_fn *odd)
+{
+  gcc_assert (widening_fn_p (ifn) || narrowing_fn_p (ifn));
+
+  switch (ifn)
+    {
+    default:
+      gcc_unreachable ();
+#undef DEF_INTERNAL_FN
+#undef DEF_INTERNAL_WIDENING_OPTAB_FN
+#undef DEF_INTERNAL_NARROWING_OPTAB_FN
+#define DEF_INTERNAL_FN(NAME, FLAGS, TYPE)
+#define DEF_INTERNAL_WIDENING_OPTAB_FN(NAME, F, S, SO, UO, T)	\
+    case IFN_##NAME:						\
+      *even = internal_fn (IFN_##NAME##_EVEN);			\
+      *odd = internal_fn (IFN_##NAME##_ODD);			\
+      break;
+#define DEF_INTERNAL_NARROWING_OPTAB_FN(NAME, F, O, T)	\
+    case IFN_##NAME:					\
+      *even = internal_fn (IFN_##NAME##_EVEN);		\
+      *odd = internal_fn (IFN_##NAME##_ODD);		\
+      break;
+#include "internal-fn.def"
+#undef DEF_INTERNAL_FN
+#undef DEF_INTERNAL_WIDENING_OPTAB_FN
+#undef DEF_INTERNAL_NARROWING_OPTAB_FN
+    }
+}
+
+
 /* Fnspec of each internal function, indexed by function number.  */
 const_tree internal_fn_fnspec_array[IFN_LAST + 1];
 
@@ -3852,7 +3917,7 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types,
 
 /* Return the optab used by internal function FN.  */
 
-static optab
+optab
 direct_internal_fn_optab (internal_fn fn, tree_pair types)
 {
   switch (fn)
@@ -3971,6 +4036,9 @@ commutative_binary_fn_p (internal_fn fn)
     case IFN_UBSAN_CHECK_MUL:
     case IFN_ADD_OVERFLOW:
     case IFN_MUL_OVERFLOW:
+    case IFN_VEC_WIDEN_PLUS:
+    case IFN_VEC_WIDEN_PLUS_LO:
+    case IFN_VEC_WIDEN_PLUS_HI:
       return true;
 
     default:
@@ -4044,6 +4112,68 @@ first_commutative_argument (internal_fn fn)
     }
 }
 
+/* Return true if this CODE describes an internal_fn that returns a vector with
+   elements twice as wide as the element size of the input vectors.  */
+
+bool
+widening_fn_p (code_helper code)
+{
+  if (!code.is_fn_code ())
+    return false;
+
+  if (!internal_fn_p ((combined_fn) code))
+    return false;
+
+  internal_fn fn = as_internal_fn ((combined_fn) code);
+  switch (fn)
+    {
+    #undef DEF_INTERNAL_WIDENING_OPTAB_FN
+    #define DEF_INTERNAL_WIDENING_OPTAB_FN(NAME, F, S, SO, UO, T) \
+    case IFN_##NAME:						  \
+    case IFN_##NAME##_HI:					  \
+    case IFN_##NAME##_LO:					  \
+    case IFN_##NAME##_EVEN:					  \
+    case IFN_##NAME##_ODD:					  \
+      return true;
+    #include "internal-fn.def"
+    #undef DEF_INTERNAL_WIDENING_OPTAB_FN
+
+    default:
+      return false;
+    }
+}
+
+/* Return true if this CODE describes an internal_fn that returns a vector with
+   elements half as wide as the elements of the input vectors.  */
+
+bool
+narrowing_fn_p (code_helper code)
+{
+  if (!code.is_fn_code ())
+    return false;
+
+  if (!internal_fn_p ((combined_fn) code))
+    return false;
+
+  internal_fn fn = as_internal_fn ((combined_fn) code);
+  switch (fn)
+    {
+    #undef DEF_INTERNAL_NARROWING_OPTAB_FN
+    #define DEF_INTERNAL_NARROWING_OPTAB_FN(NAME, F, O, T)  \
+    case IFN_##NAME:					    \
+    case IFN_##NAME##_HI:				    \
+    case IFN_##NAME##_LO:				    \
+    case IFN_##NAME##_EVEN:				    \
+    case IFN_##NAME##_ODD:				    \
+      return true;
+    #include "internal-fn.def"
+    #undef DEF_INTERNAL_NARROWING_OPTAB_FN
+
+    default:
+      return false;
+    }
+}
+
 /* Return true if IFN_SET_EDOM is supported.  */
 
 bool
@@ -4072,6 +4202,8 @@ set_edom_supported_p (void)
     expand_##TYPE##_optab_fn (fn, stmt, which_optab);			\
   }
 #include "internal-fn.def"
+#undef DEF_INTERNAL_OPTAB_FN
+#undef DEF_INTERNAL_SIGNED_OPTAB_FN
 
 /* Routines to expand each internal function, indexed by function number.
    Each routine has the prototype:
@@ -4080,6 +4212,7 @@ set_edom_supported_p (void)
 
    where STMT is the statement that performs the call. */
 static void (*const internal_fn_expanders[]) (internal_fn, gcall *) = {
+
 #define DEF_INTERNAL_FN(CODE, FLAGS, FNSPEC) expand_##CODE,
 #include "internal-fn.def"
   0
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 7fe742c2ae713e7152ab05cfdfba86e4e0aa3456..e9edaa201ad4ad171a49119efa9d6bff49add9f4 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -85,6 +85,34 @@ along with GCC; see the file COPYING3.  If not see
    says that the function extends the C-level BUILT_IN_<NAME>{,L,LL,IMAX}
    group of functions to any integral mode (including vector modes).
 
+   DEF_INTERNAL_WIDENING_OPTAB_FN is a wrapper that defines five internal
+   functions with DEF_INTERNAL_SIGNED_OPTAB_FN:
+   - one that describes a widening operation with the same number of elements
+   in the output and input vectors,
+   - two that describe a pair of high-low widening operations where the output
+   vectors each have half the number of elements of the input vectors,
+   corresponding to the result of the widening operation on the top half and
+   bottom half, these have the suffixes _HI and _LO,
+   - and two that describe a pair of even-odd widening operations where the
+   output vectors each have half the number of elements of the input vectors,
+   corresponding to the result of the widening operation on the even and odd
+   elements, these have the suffixes _EVEN and _ODD.
+   These five internal functions will require two optabs each, a SIGNED_OPTAB
+   and an UNSIGNED_OPTAB.
+
+   DEF_INTERNAL_NARROWING_OPTAB_FN is a wrapper that defines five internal
+   functions with DEF_INTERNAL_OPTAB_FN:
+   - one that describes a narrowing operation with the same number of elements
+   in the output and input vectors,
+   - two that describe a pair of high-low narrowing operations where the output
+   vector has the same number of elements in the top or bottom halves as the
+   full input vectors, these have the suffixes _HI and _LO.
+   - and two that describe a pair of even-odd narrowing operations where the
+   output vector has the same number of elements, in the even or odd positions,
+   as the full input vectors, these have the suffixes _EVEN and _ODD.
+   These five internal functions will require an optab each.
+
+
    Each entry must have a corresponding expander of the form:
 
      void expand_NAME (gimple_call stmt)
@@ -123,6 +151,24 @@ along with GCC; see the file COPYING3.  If not see
   DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE)
 #endif
 
+#ifndef DEF_INTERNAL_WIDENING_OPTAB_FN
+#define DEF_INTERNAL_WIDENING_OPTAB_FN(NAME, FLAGS, SELECTOR, SOPTAB, UOPTAB, TYPE)		    \
+  DEF_INTERNAL_SIGNED_OPTAB_FN (NAME, FLAGS, SELECTOR, SOPTAB, UOPTAB, TYPE)			    \
+  DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _LO, FLAGS, SELECTOR, SOPTAB##_lo, UOPTAB##_lo, TYPE)	    \
+  DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _HI, FLAGS, SELECTOR, SOPTAB##_hi, UOPTAB##_hi, TYPE)	    \
+  DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _EVEN, FLAGS, SELECTOR, SOPTAB##_even, UOPTAB##_even, TYPE) \
+  DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _ODD, FLAGS, SELECTOR, SOPTAB##_odd, UOPTAB##_odd, TYPE)
+#endif
+
+#ifndef DEF_INTERNAL_NARROWING_OPTAB_FN
+#define DEF_INTERNAL_NARROWING_OPTAB_FN(NAME, FLAGS, OPTAB, TYPE)   \
+  DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE)		    \
+  DEF_INTERNAL_OPTAB_FN (NAME ## _LO, FLAGS, OPTAB##_lo, TYPE)	    \
+  DEF_INTERNAL_OPTAB_FN (NAME ## _HI, FLAGS, OPTAB##_hi, TYPE)	    \
+  DEF_INTERNAL_OPTAB_FN (NAME ## _EVEN, FLAGS, OPTAB##_even, TYPE)  \
+  DEF_INTERNAL_OPTAB_FN (NAME ## _ODD, FLAGS, OPTAB##_odd, TYPE)
+#endif
+
 DEF_INTERNAL_OPTAB_FN (MASK_LOAD, ECF_PURE, maskload, mask_load)
 DEF_INTERNAL_OPTAB_FN (LOAD_LANES, ECF_CONST, vec_load_lanes, load_lanes)
 DEF_INTERNAL_OPTAB_FN (MASK_LOAD_LANES, ECF_PURE,
@@ -315,6 +361,16 @@ DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT270, ECF_CONST, cadd270, binary)
 DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL, ECF_CONST, cmul, binary)
 DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL_CONJ, ECF_CONST, cmul_conj, binary)
 DEF_INTERNAL_OPTAB_FN (VEC_ADDSUB, ECF_CONST, vec_addsub, binary)
+DEF_INTERNAL_WIDENING_OPTAB_FN (VEC_WIDEN_PLUS,
+				ECF_CONST | ECF_NOTHROW,
+				first,
+				vec_widen_sadd, vec_widen_uadd,
+				binary)
+DEF_INTERNAL_WIDENING_OPTAB_FN (VEC_WIDEN_MINUS,
+				ECF_CONST | ECF_NOTHROW,
+				first,
+				vec_widen_ssub, vec_widen_usub,
+				binary)
 DEF_INTERNAL_OPTAB_FN (VEC_FMADDSUB, ECF_CONST, vec_fmaddsub, ternary)
 DEF_INTERNAL_OPTAB_FN (VEC_FMSUBADD, ECF_CONST, vec_fmsubadd, ternary)
 
diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h
index 08922ed4254898f5fffca3f33973e96ed9ce772f..3904ba3ca36949d844532a6a9303f550533311a4 100644
--- a/gcc/internal-fn.h
+++ b/gcc/internal-fn.h
@@ -20,6 +20,10 @@ along with GCC; see the file COPYING3.  If not see
 #ifndef GCC_INTERNAL_FN_H
 #define GCC_INTERNAL_FN_H
 
+#include "insn-codes.h"
+#include "insn-opinit.h"
+
+
 /* INTEGER_CST values for IFN_UNIQUE function arg-0.
 
    UNSPEC: Undifferentiated UNIQUE.
@@ -112,6 +116,10 @@ internal_fn_name (enum internal_fn fn)
 }
 
 extern internal_fn lookup_internal_fn (const char *);
+extern void lookup_hilo_internal_fn (internal_fn, internal_fn *, internal_fn *);
+extern void lookup_evenodd_internal_fn (internal_fn, internal_fn *,
+					internal_fn *);
+extern optab direct_internal_fn_optab (internal_fn, tree_pair);
 
 /* Return the ECF_* flags for function FN.  */
 
@@ -210,6 +218,8 @@ extern bool commutative_binary_fn_p (internal_fn);
 extern bool commutative_ternary_fn_p (internal_fn);
 extern int first_commutative_argument (internal_fn);
 extern bool associative_binary_fn_p (internal_fn);
+extern bool widening_fn_p (code_helper);
+extern bool narrowing_fn_p (code_helper);
 
 extern bool set_edom_supported_p (void);
 
diff --git a/gcc/optabs.cc b/gcc/optabs.cc
index a12333c7169fc6219b0e34b6169780f78e033ee3..aab6ab6faf244a8236dac81be2d68fc28819bc9a 100644
--- a/gcc/optabs.cc
+++ b/gcc/optabs.cc
@@ -1314,7 +1314,17 @@ commutative_optab_p (optab binoptab)
 	  || binoptab == smul_widen_optab
 	  || binoptab == umul_widen_optab
 	  || binoptab == smul_highpart_optab
-	  || binoptab == umul_highpart_optab);
+	  || binoptab == umul_highpart_optab
+	  || binoptab == vec_widen_sadd_optab
+	  || binoptab == vec_widen_uadd_optab
+	  || binoptab == vec_widen_sadd_hi_optab
+	  || binoptab == vec_widen_sadd_lo_optab
+	  || binoptab == vec_widen_uadd_hi_optab
+	  || binoptab == vec_widen_uadd_lo_optab
+	  || binoptab == vec_widen_sadd_even_optab
+	  || binoptab == vec_widen_sadd_odd_optab
+	  || binoptab == vec_widen_uadd_even_optab
+	  || binoptab == vec_widen_uadd_odd_optab);
 }
 
 /* X is to be used in mode MODE as operand OPN to BINOPTAB.  If we're
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 695f5911b300c9ca5737de9be809fa01aabe5e01..d41ed6e1afaddd019c7470f965c0ad21c8b2b9d7 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -410,6 +410,16 @@ OPTAB_D (vec_widen_ssubl_hi_optab, "vec_widen_ssubl_hi_$a")
 OPTAB_D (vec_widen_ssubl_lo_optab, "vec_widen_ssubl_lo_$a")
 OPTAB_D (vec_widen_saddl_hi_optab, "vec_widen_saddl_hi_$a")
 OPTAB_D (vec_widen_saddl_lo_optab, "vec_widen_saddl_lo_$a")
+OPTAB_D (vec_widen_ssub_optab, "vec_widen_ssub_$a")
+OPTAB_D (vec_widen_ssub_hi_optab, "vec_widen_ssub_hi_$a")
+OPTAB_D (vec_widen_ssub_lo_optab, "vec_widen_ssub_lo_$a")
+OPTAB_D (vec_widen_ssub_odd_optab, "vec_widen_ssub_odd_$a")
+OPTAB_D (vec_widen_ssub_even_optab, "vec_widen_ssub_even_$a")
+OPTAB_D (vec_widen_sadd_optab, "vec_widen_sadd_$a")
+OPTAB_D (vec_widen_sadd_hi_optab, "vec_widen_sadd_hi_$a")
+OPTAB_D (vec_widen_sadd_lo_optab, "vec_widen_sadd_lo_$a")
+OPTAB_D (vec_widen_sadd_odd_optab, "vec_widen_sadd_odd_$a")
+OPTAB_D (vec_widen_sadd_even_optab, "vec_widen_sadd_even_$a")
 OPTAB_D (vec_widen_sshiftl_hi_optab, "vec_widen_sshiftl_hi_$a")
 OPTAB_D (vec_widen_sshiftl_lo_optab, "vec_widen_sshiftl_lo_$a")
 OPTAB_D (vec_widen_umult_even_optab, "vec_widen_umult_even_$a")
@@ -422,6 +432,16 @@ OPTAB_D (vec_widen_usubl_hi_optab, "vec_widen_usubl_hi_$a")
 OPTAB_D (vec_widen_usubl_lo_optab, "vec_widen_usubl_lo_$a")
 OPTAB_D (vec_widen_uaddl_hi_optab, "vec_widen_uaddl_hi_$a")
 OPTAB_D (vec_widen_uaddl_lo_optab, "vec_widen_uaddl_lo_$a")
+OPTAB_D (vec_widen_usub_optab, "vec_widen_usub_$a")
+OPTAB_D (vec_widen_usub_hi_optab, "vec_widen_usub_hi_$a")
+OPTAB_D (vec_widen_usub_lo_optab, "vec_widen_usub_lo_$a")
+OPTAB_D (vec_widen_usub_odd_optab, "vec_widen_usub_odd_$a")
+OPTAB_D (vec_widen_usub_even_optab, "vec_widen_usub_even_$a")
+OPTAB_D (vec_widen_uadd_optab, "vec_widen_uadd_$a")
+OPTAB_D (vec_widen_uadd_hi_optab, "vec_widen_uadd_hi_$a")
+OPTAB_D (vec_widen_uadd_lo_optab, "vec_widen_uadd_lo_$a")
+OPTAB_D (vec_widen_uadd_odd_optab, "vec_widen_uadd_odd_$a")
+OPTAB_D (vec_widen_uadd_even_optab, "vec_widen_uadd_even_$a")
 OPTAB_D (vec_addsub_optab, "vec_addsub$a3")
 OPTAB_D (vec_fmaddsub_optab, "vec_fmaddsub$a4")
 OPTAB_D (vec_fmsubadd_optab, "vec_fmsubadd$a4")
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
index 220bd9352a4c7acd2e3713e441d74898d3e92b30..b5a73867e44ec3fa04d1201decf81353a67b4c82 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
+++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O3 -save-temps" } */
+/* { dg-options "-O3 -save-temps -fdump-tree-vect-details" } */
 #include <stdint.h>
 #include <string.h>
 
@@ -86,6 +86,8 @@ main()
     return 0;
 }
 
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_LO" "vect"   } } */
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_HI" "vect"   } } */
 /* { dg-final { scan-assembler-times {\tuaddl\t} 1} } */
 /* { dg-final { scan-assembler-times {\tuaddl2\t} 1} } */
 /* { dg-final { scan-assembler-times {\tsaddl\t} 1} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c
index a2bed63affbd091977df95a126da1f5b8c1d41d2..1686c3f2f344c367ebb9cf34e558d0878849f9bc 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c
+++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O3 -save-temps" } */
+/* { dg-options "-O3 -save-temps -fdump-tree-vect-details" } */
 #include <stdint.h>
 #include <string.h>
 
@@ -86,6 +86,8 @@ main()
     return 0;
 }
 
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_LO" "vect"   } } */
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_HI" "vect"   } } */
 /* { dg-final { scan-assembler-times {\tusubl\t} 1} } */
 /* { dg-final { scan-assembler-times {\tusubl2\t} 1} } */
 /* { dg-final { scan-assembler-times {\tssubl\t} 1} } */
diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index 1778af0242898e3dc73d94d22a5b8505628a53b5..dcd4b5561600346a2c10bd5133507329206e8837 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -562,21 +562,30 @@ vect_joust_widened_type (tree type, tree new_type, tree *common_type)
 
 static unsigned int
 vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code,
-		      tree_code widened_code, bool shift_p,
+		      code_helper widened_code, bool shift_p,
 		      unsigned int max_nops,
 		      vect_unpromoted_value *unprom, tree *common_type,
 		      enum optab_subtype *subtype = NULL)
 {
   /* Check for an integer operation with the right code.  */
-  gassign *assign = dyn_cast <gassign *> (stmt_info->stmt);
-  if (!assign)
+  gimple *stmt = stmt_info->stmt;
+  if (!(is_gimple_assign (stmt) || is_gimple_call (stmt)))
+    return 0;
+
+  code_helper rhs_code;
+  if (is_gimple_assign (stmt))
+    rhs_code = gimple_assign_rhs_code (stmt);
+  else if (is_gimple_call (stmt))
+    rhs_code = gimple_call_combined_fn (stmt);
+  else
     return 0;
 
-  tree_code rhs_code = gimple_assign_rhs_code (assign);
-  if (rhs_code != code && rhs_code != widened_code)
+  if (rhs_code != code
+      && rhs_code != widened_code)
     return 0;
 
-  tree type = TREE_TYPE (gimple_assign_lhs (assign));
+  tree lhs = gimple_get_lhs (stmt);
+  tree type = TREE_TYPE (lhs);
   if (!INTEGRAL_TYPE_P (type))
     return 0;
 
@@ -589,7 +598,7 @@ vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code,
     {
       vect_unpromoted_value *this_unprom = &unprom[next_op];
       unsigned int nops = 1;
-      tree op = gimple_op (assign, i + 1);
+      tree op = gimple_arg (stmt, i);
       if (i == 1 && TREE_CODE (op) == INTEGER_CST)
 	{
 	  /* We already have a common type from earlier operands.
@@ -1343,7 +1352,8 @@ vect_recog_sad_pattern (vec_info *vinfo,
   /* FORNOW.  Can continue analyzing the def-use chain when this stmt in a phi
      inside the loop (in case we are analyzing an outer-loop).  */
   vect_unpromoted_value unprom[2];
-  if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR, WIDEN_MINUS_EXPR,
+  if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR,
+			     IFN_VEC_WIDEN_MINUS,
 			     false, 2, unprom, &half_type))
     return NULL;
 
@@ -1395,14 +1405,16 @@ static gimple *
 vect_recog_widen_op_pattern (vec_info *vinfo,
 			     stmt_vec_info last_stmt_info, tree *type_out,
 			     tree_code orig_code, code_helper wide_code,
-			     bool shift_p, const char *name)
+			     bool shift_p, const char *name,
+			     optab_subtype *subtype = NULL)
 {
   gimple *last_stmt = last_stmt_info->stmt;
 
   vect_unpromoted_value unprom[2];
   tree half_type;
   if (!vect_widened_op_tree (vinfo, last_stmt_info, orig_code, orig_code,
-			     shift_p, 2, unprom, &half_type))
+			     shift_p, 2, unprom, &half_type, subtype))
+
     return NULL;
 
   /* Pattern detected.  */
@@ -1468,6 +1480,20 @@ vect_recog_widen_op_pattern (vec_info *vinfo,
 			      type, pattern_stmt, vecctype);
 }
 
+static gimple *
+vect_recog_widen_op_pattern (vec_info *vinfo,
+			     stmt_vec_info last_stmt_info, tree *type_out,
+			     tree_code orig_code, internal_fn wide_ifn,
+			     bool shift_p, const char *name,
+			     optab_subtype *subtype = NULL)
+{
+  combined_fn ifn = as_combined_fn (wide_ifn);
+  return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
+				      orig_code, ifn, shift_p, name,
+				      subtype);
+}
+
+
 /* Try to detect multiplication on widened inputs, converting MULT_EXPR
    to WIDEN_MULT_EXPR.  See vect_recog_widen_op_pattern for details.  */
 
@@ -1481,26 +1507,30 @@ vect_recog_widen_mult_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
 }
 
 /* Try to detect addition on widened inputs, converting PLUS_EXPR
-   to WIDEN_PLUS_EXPR.  See vect_recog_widen_op_pattern for details.  */
+   to IFN_VEC_WIDEN_PLUS.  See vect_recog_widen_op_pattern for details.  */
 
 static gimple *
 vect_recog_widen_plus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
 			       tree *type_out)
 {
+  optab_subtype subtype;
   return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
-				      PLUS_EXPR, WIDEN_PLUS_EXPR, false,
-				      "vect_recog_widen_plus_pattern");
+				      PLUS_EXPR, IFN_VEC_WIDEN_PLUS,
+				      false, "vect_recog_widen_plus_pattern",
+				      &subtype);
 }
 
 /* Try to detect subtraction on widened inputs, converting MINUS_EXPR
-   to WIDEN_MINUS_EXPR.  See vect_recog_widen_op_pattern for details.  */
+   to IFN_VEC_WIDEN_MINUS.  See vect_recog_widen_op_pattern for details.  */
 static gimple *
 vect_recog_widen_minus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
 			       tree *type_out)
 {
+  optab_subtype subtype;
   return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
-				      MINUS_EXPR, WIDEN_MINUS_EXPR, false,
-				      "vect_recog_widen_minus_pattern");
+				      MINUS_EXPR, IFN_VEC_WIDEN_MINUS,
+				      false, "vect_recog_widen_minus_pattern",
+				      &subtype);
 }
 
 /* Function vect_recog_ctz_ffs_pattern
@@ -3078,7 +3108,7 @@ vect_recog_average_pattern (vec_info *vinfo,
   vect_unpromoted_value unprom[3];
   tree new_type;
   unsigned int nops = vect_widened_op_tree (vinfo, plus_stmt_info, PLUS_EXPR,
-					    WIDEN_PLUS_EXPR, false, 3,
+					    IFN_VEC_WIDEN_PLUS, false, 3,
 					    unprom, &new_type);
   if (nops == 0)
     return NULL;
@@ -6469,6 +6499,7 @@ static vect_recog_func vect_vect_recog_func_ptrs[] = {
   { vect_recog_mask_conversion_pattern, "mask_conversion" },
   { vect_recog_widen_plus_pattern, "widen_plus" },
   { vect_recog_widen_minus_pattern, "widen_minus" },
+  /* These must come after the double widening ones.  */
 };
 
 const unsigned int NUM_PATTERNS = ARRAY_SIZE (vect_vect_recog_func_ptrs);
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index d73e7f0936435951fe05fa6b787ba053233635aa..4f1569023a4e42ad6d058bccf62687dc3fe1302e 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -5038,7 +5038,8 @@ vectorizable_conversion (vec_info *vinfo,
   bool widen_arith = (code == WIDEN_PLUS_EXPR
 		 || code == WIDEN_MINUS_EXPR
 		 || code == WIDEN_MULT_EXPR
-		 || code == WIDEN_LSHIFT_EXPR);
+		 || code == WIDEN_LSHIFT_EXPR
+		 || widening_fn_p (code));
 
   if (!widen_arith
       && !CONVERT_EXPR_CODE_P (code)
@@ -5088,8 +5089,8 @@ vectorizable_conversion (vec_info *vinfo,
       gcc_assert (code == WIDEN_MULT_EXPR
 		  || code == WIDEN_LSHIFT_EXPR
 		  || code == WIDEN_PLUS_EXPR
-		  || code == WIDEN_MINUS_EXPR);
-
+		  || code == WIDEN_MINUS_EXPR
+		  || widening_fn_p (code));
 
       op1 = is_gimple_assign (stmt) ? gimple_assign_rhs2 (stmt) :
 				     gimple_call_arg (stmt, 0);
@@ -12500,26 +12501,69 @@ supportable_widening_operation (vec_info *vinfo,
       optab1 = vec_unpacks_sbool_lo_optab;
       optab2 = vec_unpacks_sbool_hi_optab;
     }
-  else
-    {
-      optab1 = optab_for_tree_code (c1, vectype, optab_default);
-      optab2 = optab_for_tree_code (c2, vectype, optab_default);
+
+  vec_mode = TYPE_MODE (vectype);
+  if (widening_fn_p (code))
+     {
+       /* If this is an internal fn then we must check whether the target
+	  supports either a low-high split or an even-odd split.  */
+      internal_fn ifn = as_internal_fn ((combined_fn) code);
+
+      internal_fn lo, hi, even, odd;
+      lookup_hilo_internal_fn (ifn, &lo, &hi);
+      *code1 = as_combined_fn (lo);
+      *code2 = as_combined_fn (hi);
+      optab1 = direct_internal_fn_optab (lo, {vectype, vectype});
+      optab2 = direct_internal_fn_optab (hi, {vectype, vectype});
+
+      /* If we don't support low-high, then check for even-odd.  */
+      if (!optab1
+	  || (icode1 = optab_handler (optab1, vec_mode)) == CODE_FOR_nothing
+	  || !optab2
+	  || (icode2 = optab_handler (optab2, vec_mode)) == CODE_FOR_nothing)
+	{
+	  lookup_evenodd_internal_fn (ifn, &even, &odd);
+	  *code1 = as_combined_fn (even);
+	  *code2 = as_combined_fn (odd);
+	  optab1 = direct_internal_fn_optab (even, {vectype, vectype});
+	  optab2 = direct_internal_fn_optab (odd, {vectype, vectype});
+	}
+    }
+  else if (code.is_tree_code ())
+    {
+      if (code == FIX_TRUNC_EXPR)
+	{
+	  /* The signedness is determined from output operand.  */
+	  optab1 = optab_for_tree_code (c1, vectype_out, optab_default);
+	  optab2 = optab_for_tree_code (c2, vectype_out, optab_default);
+	}
+      else if (CONVERT_EXPR_CODE_P ((tree_code) code.safe_as_tree_code ())
+	       && VECTOR_BOOLEAN_TYPE_P (wide_vectype)
+	       && VECTOR_BOOLEAN_TYPE_P (vectype)
+	       && TYPE_MODE (wide_vectype) == TYPE_MODE (vectype)
+	       && SCALAR_INT_MODE_P (TYPE_MODE (vectype)))
+	{
+	  /* If the input and result modes are the same, a different optab
+	     is needed where we pass in the number of units in vectype.  */
+	  optab1 = vec_unpacks_sbool_lo_optab;
+	  optab2 = vec_unpacks_sbool_hi_optab;
+	}
+      else
+	{
+	  optab1 = optab_for_tree_code (c1, vectype, optab_default);
+	  optab2 = optab_for_tree_code (c2, vectype, optab_default);
+	}
+      *code1 = c1;
+      *code2 = c2;
     }
 
   if (!optab1 || !optab2)
     return false;
 
-  vec_mode = TYPE_MODE (vectype);
   if ((icode1 = optab_handler (optab1, vec_mode)) == CODE_FOR_nothing
        || (icode2 = optab_handler (optab2, vec_mode)) == CODE_FOR_nothing)
     return false;
 
-  if (code.is_tree_code ())
-  {
-    *code1 = c1;
-    *code2 = c2;
-  }
-
 
   if (insn_data[icode1].operand[0].mode == TYPE_MODE (wide_vectype)
       && insn_data[icode2].operand[0].mode == TYPE_MODE (wide_vectype))
diff --git a/gcc/tree.def b/gcc/tree.def
index 90ceeec0b512bfa5f983359c0af03cc71de32007..b37b0b35927b92a6536e5c2d9805ffce8319a240 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -1374,15 +1374,16 @@ DEFTREECODE (DOT_PROD_EXPR, "dot_prod_expr", tcc_expression, 3)
 DEFTREECODE (WIDEN_SUM_EXPR, "widen_sum_expr", tcc_binary, 2)
 
 /* Widening sad (sum of absolute differences).
-   The first two arguments are of type t1 which should be integer.
-   The third argument and the result are of type t2, such that t2 is at least
-   twice the size of t1.  Like DOT_PROD_EXPR, SAD_EXPR (arg1,arg2,arg3) is
+   The first two arguments are of type t1 which should be a vector of integers.
+   The third argument and the result are of type t2, such that the size of
+   the elements of t2 is at least twice the size of the elements of t1.
+   Like DOT_PROD_EXPR, SAD_EXPR (arg1,arg2,arg3) is
    equivalent to:
-       tmp = WIDEN_MINUS_EXPR (arg1, arg2)
+       tmp = IFN_VEC_WIDEN_MINUS (arg1, arg2)
        tmp2 = ABS_EXPR (tmp)
        arg3 = PLUS_EXPR (tmp2, arg3)
   or:
-       tmp = WIDEN_MINUS_EXPR (arg1, arg2)
+       tmp = IFN_VEC_WIDEN_MINUS (arg1, arg2)
        tmp2 = ABS_EXPR (tmp)
        arg3 = WIDEN_SUM_EXPR (tmp2, arg3)
  */
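
The low-high / even-odd fallback added to supportable_widening_operation
in the tree-vect-stmts.cc hunk above can be modelled outside GCC.  The
following is only a rough sketch: target_fns and choose_split are
hypothetical names, and a set of strings stands in for the
direct_internal_fn_optab / optab_handler checks that the real code does.

```cpp
#include <set>
#include <string>
#include <utility>

// Hypothetical model of a target: the names of the half-vector widening
// operations it implements.  The patch queries optabs instead.
typedef std::set<std::string> target_fns;

// Choose the pair of half operations used to implement widening function
// BASE on TARGET.  Prefer the _LO/_HI split and fall back to _EVEN/_ODD,
// mirroring the order of the checks in the patch.  Return false if the
// target implements neither split completely.
static bool
choose_split (const target_fns &target, const std::string &base,
              std::pair<std::string, std::string> *out)
{
  const char *const suffixes[2][2] = { { "_LO", "_HI" }, { "_EVEN", "_ODD" } };
  for (const auto &s : suffixes)
    {
      std::string first = base + s[0];
      std::string second = base + s[1];
      // Both halves must be available for a split to be usable.
      if (target.count (first) && target.count (second))
        {
          *out = std::make_pair (first, second);
          return true;
        }
    }
  return false;
}
```

A target that provides only one half of the low-high pair is treated the
same as one that provides neither, so the even-odd pair is tried next.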


* Re: [PATCH 2/3] Refactor widen_plus as internal_fn
  2023-06-01 16:27                                                       ` Andre Vieira (lists)
@ 2023-06-02 12:00                                                         ` Richard Sandiford
  2023-06-06 19:00                                                         ` Jakub Jelinek
  1 sibling, 0 replies; 53+ messages in thread
From: Richard Sandiford @ 2023-06-02 12:00 UTC (permalink / raw)
  To: Andre Vieira (lists); +Cc: Richard Biener, Richard Biener, gcc-patches

Just some very minor things.

"Andre Vieira (lists)" <andre.simoesdiasvieira@arm.com> writes:
> diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
> index 5c9da73ea11f8060b18dcf513599c9694fa4f2ad..348bee35a35ae4ed9a8652f5349f430c2733e1cb 100644
> --- a/gcc/internal-fn.cc
> +++ b/gcc/internal-fn.cc
> @@ -90,6 +90,71 @@ lookup_internal_fn (const char *name)
>    return entry ? *entry : IFN_LAST;
>  }
>  
> +/*  Given an internal_fn IFN that is either a widening or narrowing function, return its
> +    corresponding LO and HI internal_fns.  */

Long line and too much space after "/*":

/* Given an internal_fn IFN that is either a widening or narrowing function,
   return its corresponding _LO and _HI internal_fns in *LO and *HI.  */

> +extern void
> +lookup_hilo_internal_fn (internal_fn ifn, internal_fn *lo, internal_fn *hi)
> +{
> +  gcc_assert (widening_fn_p (ifn) || narrowing_fn_p (ifn));
> +
> +  switch (ifn)
> +    {
> +    default:
> +      gcc_unreachable ();
> +#undef DEF_INTERNAL_FN
> +#undef DEF_INTERNAL_WIDENING_OPTAB_FN
> +#undef DEF_INTERNAL_NARROWING_OPTAB_FN
> +#define DEF_INTERNAL_FN(NAME, FLAGS, TYPE)
> +#define DEF_INTERNAL_WIDENING_OPTAB_FN(NAME, F, S, SO, UO, T)	\
> +    case IFN_##NAME:						\
> +      *lo = internal_fn (IFN_##NAME##_LO);			\
> +      *hi = internal_fn (IFN_##NAME##_HI);			\
> +      break;
> +#define DEF_INTERNAL_NARROWING_OPTAB_FN(NAME, F, O, T)	\
> +    case IFN_##NAME:					\
> +      *lo = internal_fn (IFN_##NAME##_LO);		\
> +      *hi = internal_fn (IFN_##NAME##_HI);		\
> +      break;
> +#include "internal-fn.def"
> +#undef DEF_INTERNAL_FN
> +#undef DEF_INTERNAL_WIDENING_OPTAB_FN
> +#undef DEF_INTERNAL_NARROWING_OPTAB_FN
> +    }
> +}
> +
> +extern void
> +lookup_evenodd_internal_fn (internal_fn ifn, internal_fn *even,
> +			    internal_fn *odd)

This needs a similar comment:

/* Given an internal_fn IFN that is either a widening or narrowing function,
   return its corresponding _EVEN and _ODD internal_fns in *EVEN and *ODD.  */

> @@ -3971,6 +4036,9 @@ commutative_binary_fn_p (internal_fn fn)
>      case IFN_UBSAN_CHECK_MUL:
>      case IFN_ADD_OVERFLOW:
>      case IFN_MUL_OVERFLOW:
> +    case IFN_VEC_WIDEN_PLUS:
> +    case IFN_VEC_WIDEN_PLUS_LO:
> +    case IFN_VEC_WIDEN_PLUS_HI:

Should include even & odd as well.

I'd suggest leaving out the narrowing stuff for now.  There are some
questions that would be easier to answer once we add the first use,
such as whether one of the hi/lo pair and one of the even/odd pair
merge with a vector containing the other half, whether all four
define the other half to be zero, etc.

OK for the optab/internal-fn parts with those changes from my POV.

Thanks again for doing this!

Richard
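
The lookup_hilo_internal_fn quoted in the review above generates its switch
cases by redefining the DEF_INTERNAL_*_OPTAB_FN macros and re-including
internal-fn.def.  A self-contained sketch of the same X-macro technique
follows.  It is a toy under stated assumptions: WIDENING_FNS is a
hypothetical one-argument stand-in for internal-fn.def, the enum is not
GCC's real internal_fn, and lookup_hilo returns false where the patch
calls gcc_unreachable.

```cpp
// Hypothetical stand-in for internal-fn.def.  Each entry names one
// widening function; the real .def entries take more arguments.
#define WIDENING_FNS(X) \
  X (VEC_WIDEN_PLUS)    \
  X (VEC_WIDEN_MINUS)

// Toy enum: declare IFN_<NAME>, IFN_<NAME>_LO and IFN_<NAME>_HI for
// every entry in the list.
enum internal_fn
{
#define X(NAME) IFN_##NAME, IFN_##NAME##_LO, IFN_##NAME##_HI,
  WIDENING_FNS (X)
#undef X
  IFN_LAST
};

// Given a widening function IFN, return its corresponding _LO and _HI
// functions in *LO and *HI.  Return false if IFN is not in the list.
static bool
lookup_hilo (internal_fn ifn, internal_fn *lo, internal_fn *hi)
{
  switch (ifn)
    {
// One case per list entry, generated from the same list as the enum.
#define X(NAME)              \
    case IFN_##NAME:         \
      *lo = IFN_##NAME##_LO; \
      *hi = IFN_##NAME##_HI; \
      return true;
      WIDENING_FNS (X)
#undef X
    default:
      return false;
    }
}
```

Because the enum and the switch are expanded from one list, adding an
entry to the list extends both at once, which is the property the patch
relies on.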


* Re: [PATCH 2/3] Refactor widen_plus as internal_fn
  2023-06-01 16:27                                                       ` Andre Vieira (lists)
  2023-06-02 12:00                                                         ` Richard Sandiford
@ 2023-06-06 19:00                                                         ` Jakub Jelinek
  2023-06-06 21:28                                                           ` [PATCH] modula2: Fix bootstrap Jakub Jelinek
  1 sibling, 1 reply; 53+ messages in thread
From: Jakub Jelinek @ 2023-06-06 19:00 UTC (permalink / raw)
  To: Andre Vieira (lists)
  Cc: Richard Biener, Richard Sandiford, Richard Biener, gcc-patches

On Thu, Jun 01, 2023 at 05:27:56PM +0100, Andre Vieira (lists) via Gcc-patches wrote:
> --- a/gcc/internal-fn.h
> +++ b/gcc/internal-fn.h
> @@ -20,6 +20,10 @@ along with GCC; see the file COPYING3.  If not see
>  #ifndef GCC_INTERNAL_FN_H
>  #define GCC_INTERNAL_FN_H
>  
> +#include "insn-codes.h"
> +#include "insn-opinit.h"

My i686-linux build configured with
../configure --enable-languages=default,obj-c++,lto,go,d,rust,m2 --enable-checking=yes,rtl,extra --enable-libstdcxx-backtrace=yes
just died with
In file included from ../../gcc/m2/gm2-gcc/gcc-consolidation.h:74,
                 from ../../gcc/m2/gm2-gcc/m2except.cc:22:
../../gcc/internal-fn.h:24:10: fatal error: insn-opinit.h: No such file or directory
   24 | #include "insn-opinit.h"
      |          ^~~~~~~~~~~~~~~
compilation terminated.
In file included from ../../gcc/m2/gm2-gcc/gcc-consolidation.h:74,
                 from ../../gcc/m2/m2pp.cc:23:
../../gcc/internal-fn.h:24:10: fatal error: insn-opinit.h: No such file or directory
   24 | #include "insn-opinit.h"
      |          ^~~~~~~~~~~~~~~
In file included from ../../gcc/m2/gm2-gcc/gcc-consolidation.h:74,
                 from ../../gcc/m2/gm2-gcc/rtegraph.cc:22:
../../gcc/internal-fn.h:24:10: fatal error: insn-opinit.h: No such file or directory
   24 | #include "insn-opinit.h"
      |          ^~~~~~~~~~~~~~~
compilation terminated.
compilation terminated.
supposedly because of this change.

Do you really need those includes there?
If yes, what is supposed to ensure that the generated includes
are generated before compiling files which include those?

From what I can see, gcc/Makefile.in has
generated_files var which includes among other things insn-opinit.h,
and
# Dependency information.
  
# In order for parallel make to really start compiling the expensive
# objects from $(OBJS) as early as possible, build all their
# prerequisites strictly before all objects.
$(ALL_HOST_OBJS) : | $(generated_files)

rule, plus I see $(generated_files) mentioned in a couple of dependencies
in gcc/m2/Make-lang.in .  But supposedly because of this change it now
needs to be added to tons of other spots.

	Jakub



* [PATCH] modula2: Fix bootstrap
  2023-06-06 19:00                                                         ` Jakub Jelinek
@ 2023-06-06 21:28                                                           ` Jakub Jelinek
  2023-06-06 22:18                                                             ` Gaius Mulley
  2023-06-07  8:42                                                             ` Andre Vieira (lists)
  0 siblings, 2 replies; 53+ messages in thread
From: Jakub Jelinek @ 2023-06-06 21:28 UTC (permalink / raw)
  To: Gaius Mulley; +Cc: Andre Vieira, Richard Biener, Richard Sandiford, gcc-patches

Hi!

Since yesterday, internal-fn.h includes insn-opinit.h, which is a generated
header.
One of my bootstraps today failed because some m2 sources started compiling
before insn-opinit.h had been generated.

Normally, gcc/Makefile.in has
# In order for parallel make to really start compiling the expensive
# objects from $(OBJS) as early as possible, build all their
# prerequisites strictly before all objects.
$(ALL_HOST_OBJS) : | $(generated_files)

rule, which ensures that all the generated files are generated before
any $(ALL_HOST_OBJS) objects start building.  It uses an order-only
dependency because we don't want to rebuild most of the objects whenever
one generated header is regenerated.  After the initial build in an empty
directory, the .deps/ files contain the detailed dependencies.

$(ALL_HOST_OBJS) even includes some FE files; I think in the m2 case that
would be m2_OBJS, but m2/Make-lang.in doesn't define it.

The following patch just adds a similar rule to m2/Make-lang.in.
Another option would be to set the m2_OBJS variable in m2/Make-lang.in to
something, but I'm not really sure to what exactly, or why that isn't
done already.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2023-06-06  Jakub Jelinek  <jakub@redhat.com>

	* Make-lang.in: Build $(generated_files) before building
	all $(GM2_C_OBJS).

--- gcc/m2/Make-lang.in.jj	2023-05-04 09:31:27.289948109 +0200
+++ gcc/m2/Make-lang.in	2023-06-06 21:38:26.655336041 +0200
@@ -511,6 +511,8 @@ GM2_LIBS_BOOT     = m2/gm2-compiler-boot
                     m2/gm2-libs-boot/libgm2.a \
                     $(GM2-BOOT-O)
 
+$(GM2_C_OBJS) : | $(generated_files)
+
 cc1gm2$(exeext): m2/stage1/cc1gm2$(exeext) $(m2.prev)
 	cp -p $< $@
 


	Jakub



* Re: [PATCH] modula2: Fix bootstrap
  2023-06-06 21:28                                                           ` [PATCH] modula2: Fix bootstrap Jakub Jelinek
@ 2023-06-06 22:18                                                             ` Gaius Mulley
  2023-06-07  8:42                                                             ` Andre Vieira (lists)
  1 sibling, 0 replies; 53+ messages in thread
From: Gaius Mulley @ 2023-06-06 22:18 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Andre Vieira, Richard Biener, Richard Sandiford, gcc-patches

Jakub Jelinek <jakub@redhat.com> writes:

> Hi!
>
> internal-fn.h since yesterday includes insn-opinit.h, which is a generated
> header.
> One of my bootstraps today failed because some m2 sources started compiling
> before insn-opinit.h has been generated.
>
> Normally, gcc/Makefile.in has
> # In order for parallel make to really start compiling the expensive
> # objects from $(OBJS) as early as possible, build all their
> # prerequisites strictly before all objects.
> $(ALL_HOST_OBJS) : | $(generated_files)
>
> rule which ensures that all the generated files are generated before
> any $(ALL_HOST_OBJS) objects start, but use order-only dependency for
> this because we don't want to rebuild most of the objects whenever one
> generated header is regenerated.  After the initial build in an empty
> directory we'll have .deps/ files contain the detailed dependencies.
>
> $(ALL_HOST_OBJS) includes even some FE files, I think in the m2 case
> would be m2_OBJS, but m2/Make-lang.in doesn't define those.
>
> The following patch just adds a similar rule to m2/Make-lang.in.
> Another option would be to set m2_OBJS variable in m2/Make-lang.in to
> something, but not really sure to which exactly and why it isn't
> done.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?


> 2023-06-06  Jakub Jelinek  <jakub@redhat.com>
>
> 	* Make-lang.in: Build $(generated_files) before building
> 	all $(GM2_C_OBJS).
>
> --- gcc/m2/Make-lang.in.jj	2023-05-04 09:31:27.289948109 +0200
> +++ gcc/m2/Make-lang.in	2023-06-06 21:38:26.655336041 +0200
> @@ -511,6 +511,8 @@ GM2_LIBS_BOOT     = m2/gm2-compiler-boot
>                      m2/gm2-libs-boot/libgm2.a \
>                      $(GM2-BOOT-O)
>  
> +$(GM2_C_OBJS) : | $(generated_files)
> +
>  cc1gm2$(exeext): m2/stage1/cc1gm2$(exeext) $(m2.prev)
>  	cp -p $< $@
>  
>
>
> 	Jakub

Hi Jakub,

sure looks good to me - thanks for the patch,

regards,
Gaius


* Re: [PATCH] modula2: Fix bootstrap
  2023-06-06 21:28                                                           ` [PATCH] modula2: Fix bootstrap Jakub Jelinek
  2023-06-06 22:18                                                             ` Gaius Mulley
@ 2023-06-07  8:42                                                             ` Andre Vieira (lists)
  2023-06-13 14:48                                                               ` Jakub Jelinek
  1 sibling, 1 reply; 53+ messages in thread
From: Andre Vieira (lists) @ 2023-06-07  8:42 UTC (permalink / raw)
  To: Jakub Jelinek, Gaius Mulley
  Cc: Richard Biener, Richard Sandiford, gcc-patches

Thanks Jakub!

I do need those includes, and sorry I broke your bootstrap.  It didn't show
up on my aarch64-unknown-linux-gnu bootstrap; I'm guessing the rules
there were just run in a different order.  Glad you were able to fix it :)

On 06/06/2023 22:28, Jakub Jelinek wrote:
> Hi!
> 
> internal-fn.h since yesterday includes insn-opinit.h, which is a generated
> header.
> One of my bootstraps today failed because some m2 sources started compiling
> before insn-opinit.h has been generated.
> 
> Normally, gcc/Makefile.in has
> # In order for parallel make to really start compiling the expensive
> # objects from $(OBJS) as early as possible, build all their
> # prerequisites strictly before all objects.
> $(ALL_HOST_OBJS) : | $(generated_files)
> 
> rule which ensures that all the generated files are generated before
> any $(ALL_HOST_OBJS) objects start, but use order-only dependency for
> this because we don't want to rebuild most of the objects whenever one
> generated header is regenerated.  After the initial build in an empty
> directory we'll have .deps/ files contain the detailed dependencies.
> 
> $(ALL_HOST_OBJS) includes even some FE files, I think in the m2 case
> would be m2_OBJS, but m2/Make-lang.in doesn't define those.
> 
> The following patch just adds a similar rule to m2/Make-lang.in.
> Another option would be to set m2_OBJS variable in m2/Make-lang.in to
> something, but not really sure to which exactly and why it isn't
> done.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2023-06-06  Jakub Jelinek  <jakub@redhat.com>
> 
> 	* Make-lang.in: Build $(generated_files) before building
> 	all $(GM2_C_OBJS).
> 
> --- gcc/m2/Make-lang.in.jj	2023-05-04 09:31:27.289948109 +0200
> +++ gcc/m2/Make-lang.in	2023-06-06 21:38:26.655336041 +0200
> @@ -511,6 +511,8 @@ GM2_LIBS_BOOT     = m2/gm2-compiler-boot
>                       m2/gm2-libs-boot/libgm2.a \
>                       $(GM2-BOOT-O)
>   
> +$(GM2_C_OBJS) : | $(generated_files)
> +
>   cc1gm2$(exeext): m2/stage1/cc1gm2$(exeext) $(m2.prev)
>   	cp -p $< $@
>   
> 
> 
> 	Jakub
> 


* Re: [PATCH] modula2: Fix bootstrap
  2023-06-07  8:42                                                             ` Andre Vieira (lists)
@ 2023-06-13 14:48                                                               ` Jakub Jelinek
  0 siblings, 0 replies; 53+ messages in thread
From: Jakub Jelinek @ 2023-06-13 14:48 UTC (permalink / raw)
  To: Andre Vieira (lists)
  Cc: Gaius Mulley, Richard Biener, Richard Sandiford, gcc-patches

On Wed, Jun 07, 2023 at 09:42:22AM +0100, Andre Vieira (lists) wrote:
> I do need those includes and sorry I broke your bootstrap it didn't show up
> on my aarch64-unknown-linux-gnu bootstrap, I'm guessing the rules there were
> just run in a different order. Glad you were able to fix it :)

Unfortunately, it doesn't really work.
My x86_64-linux bootstrap today died again with:
In file included from ../../gcc/m2/gm2-gcc/gcc-consolidation.h:74,
                 from ../../gcc/m2/gm2-lang.cc:24:
../../gcc/internal-fn.h:24:10: fatal error: insn-opinit.h: No such file or directory
   24 | #include "insn-opinit.h"
      |          ^~~~~~~~~~~~~~~
compilation terminated.
/home/jakub/src/gcc/obj36/./prev-gcc/xg++ -B/home/jakub/src/gcc/obj36/./prev-gcc/ -B/usr/local/x86_64-pc-linux-gnu/bin/ -nostdinc++ -B/home/jakub/src/gcc/obj36/prev-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs -B/home/jakub/src/gcc/obj36/prev-x86_64-pc-linux-gnu/libstdc++-v3/libsupc++/.libs  -I/home/jakub/src/gcc/obj36/prev-x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu  -I/home/jakub/src/gcc/obj36/prev-x86_64-pc-linux-gnu/libstdc++-v3/include  -I/home/jakub/src/gcc/libstdc++-v3/libsupc++ -L/home/jakub/src/gcc/obj36/prev-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs -L/home/jakub/src/gcc/obj36/prev-x86_64-pc-linux-gnu/libstdc++-v3/libsupc++/.libs  -fno-PIE -c -g   -g -O2 -fchecking=1 -DIN_GCC     -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual  -fno-common  -DHAVE_CONFIG_H \
             -I. -Im2/gm2-gcc -I../../gcc -I../../gcc/m2/gm2-gcc -I../../gcc/../include  -I../../gcc/../libcpp/include -I../../gcc/../libcody  -I../../gcc/../libdecnumber -I../../gcc/../libdecnumber/bid -I../libdecnumber -I../../gcc/../libbacktrace   -I. -Im2/gm2-gcc -I../../gcc -I../../gcc/m2/gm2-gcc -I../../gcc/../include  -I../../gcc/../libcpp/include -I../../gcc/../libcody  -I../../gcc/../libdecnumber -I../../gcc/../libdecnumber/bid -I../libdecnumber -I../../gcc/../libbacktrace  ../../gcc/m2/gm2-gcc/m2type.cc -o m2/gm2-gcc/m2type.o
make[3]: *** [../../gcc/m2/Make-lang.in:570: m2/gm2-lang.o] Error 1
make[3]: *** Waiting for unfinished jobs....
errors.  Dunno what is going on.
I've tried
--- gcc/m2/Make-lang.in.jj	2023-06-07 15:56:07.112684198 +0200
+++ gcc/m2/Make-lang.in	2023-06-13 16:08:55.409364765 +0200
@@ -511,7 +511,7 @@ GM2_LIBS_BOOT     = m2/gm2-compiler-boot
                     m2/gm2-libs-boot/libgm2.a \
                     $(GM2-BOOT-O)
 
-$(GM2_C_OBJS) : | $(generated_files)
+m2_OBJS = $(GM2_C_OBJS)
 
 cc1gm2$(exeext): m2/stage1/cc1gm2$(exeext) $(m2.prev)
 	cp -p $< $@

but that doesn't really work either: this time the bootstrap breaks every
time rather than just occasionally.
Including GM2_C_OBJS in m2_OBJS is, I think, the right thing, but that
results in predefining the IN_GCC_FRONTEND macro, and we have e.g.

/* Front ends should never have to include middle-end headers.  Enforce
   this by poisoning the header double-include protection defines.  */
#ifdef IN_GCC_FRONTEND
#pragma GCC poison GCC_RTL_H GCC_EXCEPT_H GCC_EXPR_H
#endif

in system.h to make sure that FE sources don't include rtl.h, except.h or
expr.h.  But m2/gm2-gcc/gcc-consolidation.h includes tons of the RTL
headers: rtl.h, df.h (twice), except.h; why?
Also, it seems one of the GM2_C_OBJS is a special copy of stor-layout.cc,
which really isn't an FE file and so needs the RTL-ish headers.

	Jakub



Thread overview: 53+ messages
2022-05-25  9:11 [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns Joel Hutton
2022-05-27 13:23 ` Richard Biener
2022-05-31 10:07   ` Joel Hutton
2022-05-31 16:46     ` Tamar Christina
2022-06-01 10:11     ` Richard Biener
2022-06-06 17:20       ` Joel Hutton
2022-06-07  8:18         ` Richard Sandiford
2022-06-07  9:01           ` Joel Hutton
2022-06-09 14:03             ` Joel Hutton
2022-06-13  9:02             ` Richard Biener
2022-06-30 13:20               ` Joel Hutton
2022-07-12 12:32                 ` Richard Biener
2023-03-17 10:14                   ` Andre Vieira (lists)
2023-03-17 11:52                     ` Richard Biener
2023-04-20 13:23                       ` Andre Vieira (lists)
2023-04-24 11:57                         ` Richard Biener
2023-04-24 13:01                           ` Richard Sandiford
2023-04-25 12:30                             ` Richard Biener
2023-04-28 16:06                               ` Andre Vieira (lists)
2023-04-25  9:55                           ` Andre Vieira (lists)
2023-04-28 12:36                             ` [PATCH 1/3] Refactor to allow internal_fn's Andre Vieira (lists)
2023-05-03 11:55                               ` Richard Biener
2023-05-04 15:20                                 ` Andre Vieira (lists)
2023-05-05  6:09                                   ` Richard Biener
2023-05-12 12:14                                     ` Andre Vieira (lists)
2023-05-12 13:18                                       ` Richard Biener
2023-04-28 12:37                             ` [PATCH 2/3] Refactor widen_plus as internal_fn Andre Vieira (lists)
2023-05-03 12:11                               ` Richard Biener
2023-05-03 19:07                                 ` Richard Sandiford
2023-05-12 12:16                                   ` Andre Vieira (lists)
2023-05-12 13:28                                     ` Richard Biener
2023-05-12 13:55                                       ` Andre Vieira (lists)
2023-05-12 14:01                                       ` Richard Sandiford
2023-05-15 10:20                                         ` Richard Biener
2023-05-15 10:47                                           ` Richard Sandiford
2023-05-15 11:01                                             ` Richard Biener
2023-05-15 11:10                                               ` Richard Sandiford
2023-05-15 11:53                                               ` Andre Vieira (lists)
2023-05-15 12:21                                                 ` Richard Biener
2023-05-18 17:15                                                   ` Andre Vieira (lists)
2023-05-22 13:06                                                     ` Richard Biener
2023-06-01 16:27                                                       ` Andre Vieira (lists)
2023-06-02 12:00                                                         ` Richard Sandiford
2023-06-06 19:00                                                         ` Jakub Jelinek
2023-06-06 21:28                                                           ` [PATCH] modula2: Fix bootstrap Jakub Jelinek
2023-06-06 22:18                                                             ` Gaius Mulley
2023-06-07  8:42                                                             ` Andre Vieira (lists)
2023-06-13 14:48                                                               ` Jakub Jelinek
2023-04-28 12:37                             ` [PATCH 3/3] Remove widen_plus/minus_expr tree codes Andre Vieira (lists)
2023-05-03 12:29                               ` Richard Biener
2023-05-10  9:15                                 ` Andre Vieira (lists)
2023-05-12 12:18                                   ` Andre Vieira (lists)
2022-06-13  9:18           ` [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns Richard Biener
