* [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns
@ 2022-05-25 9:11 Joel Hutton
2022-05-27 13:23 ` Richard Biener
0 siblings, 1 reply; 53+ messages in thread
From: Joel Hutton @ 2022-05-25 9:11 UTC (permalink / raw)
To: Richard Sandiford; +Cc: Richard Biener, gcc-patches
Ping!
Just checking there is still interest in this. I'm assuming you've been busy with release.
Joel
> -----Original Message-----
> From: Joel Hutton
> Sent: 13 April 2022 16:53
> To: Richard Sandiford <richard.sandiford@arm.com>
> Cc: Richard Biener <rguenther@suse.de>; gcc-patches@gcc.gnu.org
> Subject: [vect-patterns] Refactor widen_plus/widen_minus as internal_fns
>
> Hi all,
>
> These patches refactor the widening patterns in vect-patterns to use
> internal_fn instead of tree_codes.
>
> Sorry about the delay, some changes to master made it a bit messier.
>
> Bootstrapped and regression tested on aarch64.
>
> Joel
>
> > > diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
> > > index 854cbcff390..4a8ea67e62f 100644
> > > --- a/gcc/tree-vect-patterns.c
> > > +++ b/gcc/tree-vect-patterns.c
> > > @@ -1245,7 +1245,7 @@ vect_recog_sad_pattern (vec_info *vinfo,
> > > static gimple * vect_recog_widen_op_pattern (vec_info *vinfo,
> > > stmt_vec_info last_stmt_info, tree *type_out,
> > > - tree_code orig_code, tree_code wide_code,
> > > + tree_code orig_code, code_helper
> > wide_code_or_ifn,
> >
> > I think it'd be better to keep the original “wide_code” name and try
> > to remove as many places as possible in which switching based on
> > tree_code or internal_fn is necessary. The recent gimple-match.h
> > patches should help with that, but more routines might be needed.
>
> Done.
>
> > > @@ -1309,8 +1310,16 @@ vect_recog_widen_op_pattern (vec_info
> *vinfo,
> > > 2, oprnd, half_type, unprom, vectype);
> > >
> > > tree var = vect_recog_temp_ssa_var (itype, NULL);
> > > - gimple *pattern_stmt = gimple_build_assign (var, wide_code,
> > > - oprnd[0], oprnd[1]);
> > > + gimple *pattern_stmt;
> > > + if (wide_code_or_ifn.is_tree_code ())
> > > + pattern_stmt = gimple_build_assign (var, wide_code_or_ifn,
> > > + oprnd[0], oprnd[1]);
> > > + else
> > > + {
> > > + internal_fn fn = as_internal_fn ((combined_fn) wide_code_or_ifn);
> > > + pattern_stmt = gimple_build_call_internal (fn, 2, oprnd[0], oprnd[1]);
> > > + gimple_call_set_lhs (pattern_stmt, var);
> > > + }
> >
> > For example, I think we should hide this inside a new:
> >
> > gimple_build (var, wide_code, oprnd[0], oprnd[1]);
> >
> > that works directly on code_helper, similarly to the new code_helper
> > gimple_build interfaces.
>
> Done.
>
> > > @@ -4513,14 +4513,16 @@ vect_gen_widened_results_half (vec_info
> > *vinfo, enum tree_code code,
> > > tree new_temp;
> > >
> > > /* Generate half of the widened result: */
> > > - gcc_assert (op_type == TREE_CODE_LENGTH (code));
> > > if (op_type != binary_op)
> > > vec_oprnd1 = NULL;
> > > - new_stmt = gimple_build_assign (vec_dest, code, vec_oprnd0,
> > vec_oprnd1);
> > > + if (ch.is_tree_code ())
> > > + new_stmt = gimple_build_assign (vec_dest, ch, vec_oprnd0,
> > vec_oprnd1);
> > > + else
> > > + new_stmt = gimple_build_call_internal (as_internal_fn
> > > + ((combined_fn)
> > ch),
> > > + 2, vec_oprnd0, vec_oprnd1);
> >
> > Similarly here. I guess the combined_fn/internal_fn path will also
> > need to cope with null trailing operands, for consistency with the tree_code
> one.
> >
>
> Done.
>
> > > @@ -4744,31 +4747,28 @@ vectorizable_conversion (vec_info *vinfo,
> > > && ! vec_stmt)
> > > return false;
> > >
> > > - gassign *stmt = dyn_cast <gassign *> (stmt_info->stmt);
> > > - if (!stmt)
> > > + gimple* stmt = stmt_info->stmt;
> > > + if (!(is_gimple_assign (stmt) || is_gimple_call (stmt)))
> > > return false;
> > >
> > > - if (TREE_CODE (gimple_assign_lhs (stmt)) != SSA_NAME)
> > > - return false;
> > > + if (is_gimple_assign (stmt))
> > > + {
> > > > +      code_or_ifn = gimple_assign_rhs_code (stmt);
> > > > +    }
> > > > +  else
> > > + code_or_ifn = gimple_call_combined_fn (stmt);
> >
> > It might be possible to use gimple_extract_op here (only recently added).
> > This would also provide the number of operands directly, instead of
> > needing “op_type”. It would also provide an array of operands.
> >
>
> Done.
>
> > > - code = gimple_assign_rhs_code (stmt);
> > > - if (!CONVERT_EXPR_CODE_P (code)
> > > - && code != FIX_TRUNC_EXPR
> > > - && code != FLOAT_EXPR
> > > - && code != WIDEN_PLUS_EXPR
> > > - && code != WIDEN_MINUS_EXPR
> > > - && code != WIDEN_MULT_EXPR
> > > - && code != WIDEN_LSHIFT_EXPR)
> >
> > Is it safe to drop this check independently of parts 2 and 3?
> > (Genuine question, haven't checked in detail.)
>
> It requires the parts 2 and 3. I've moved that change into this first patch.
>
> > > @@ -4784,7 +4784,8 @@ vectorizable_conversion (vec_info *vinfo,
> > > }
> > >
> > > rhs_type = TREE_TYPE (op0);
> > > - if ((code != FIX_TRUNC_EXPR && code != FLOAT_EXPR)
> > > + if ((code_or_ifn.is_tree_code () && code_or_ifn != FIX_TRUNC_EXPR
> > > + && code_or_ifn != FLOAT_EXPR)
> >
> > I don't think we want the is_tree_code condition here. The existing
> > != should work.
> >
>
> Done.
>
> > > @@ -11856,13 +11888,13 @@ supportable_widening_operation
> (vec_info
> > *vinfo,
> > > if (BYTES_BIG_ENDIAN && c1 != VEC_WIDEN_MULT_EVEN_EXPR)
> > > std::swap (c1, c2);
> > >
> > > - if (code == FIX_TRUNC_EXPR)
> > > + if (code_or_ifn == FIX_TRUNC_EXPR)
> > > {
> > > /* The signedness is determined from output operand. */
> > > optab1 = optab_for_tree_code (c1, vectype_out, optab_default);
> > > optab2 = optab_for_tree_code (c2, vectype_out, optab_default);
> > > }
> > > - else if (CONVERT_EXPR_CODE_P (code)
> > > + else if (CONVERT_EXPR_CODE_P ((tree_code) code_or_ifn)
> >
> > I think this should be as_tree_code (), so that it's safe for internal
> > functions if (tree_code) ever becomes a checked conversion in future.
> > Same for other instances.
> >
>
> Done.
>
> > > && VECTOR_BOOLEAN_TYPE_P (wide_vectype)
> > > && VECTOR_BOOLEAN_TYPE_P (vectype)
> > > && TYPE_MODE (wide_vectype) == TYPE_MODE (vectype) […] @@
> > > -12000,7 +12031,7 @@ supportable_widening_operation (vec_info
> > *vinfo,
> > > bool
> > > supportable_narrowing_operation (enum tree_code code,
> > > tree vectype_out, tree vectype_in,
> > > - enum tree_code *code1, int *multi_step_cvt,
> > > + void* _code1, int *multi_step_cvt,
> >
> > This might be rehashing an old conversation, sorry, but why does this
> > need to be void?
> >
>
> Reworked to avoid using void*.
>
> > > vec<tree> *interm_types) {
> > > machine_mode vec_mode;
> > > @@ -12013,6 +12044,7 @@ supportable_narrowing_operation (enum
> > tree_code code,
> > > machine_mode intermediate_mode, prev_mode;
> > > int i;
> > > bool uns;
> > > + tree_code * code1 = (tree_code*) _code1;
> > >
> > > *multi_step_cvt = 0;
> > > switch (code)
> > > diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h index
> > > bd6f334d15f..70c06264c11 100644
> > > --- a/gcc/tree-vectorizer.h
> > > +++ b/gcc/tree-vectorizer.h
> > > @@ -2030,13 +2030,16 @@ extern bool vect_is_simple_use (vec_info *,
> > stmt_vec_info, slp_tree,
> > > enum vect_def_type *,
> > > tree *, stmt_vec_info * = NULL); extern bool
> > > vect_maybe_update_slp_op_vectype (slp_tree, tree); -extern bool
> > > supportable_widening_operation (vec_info *,
> > > - enum tree_code, stmt_vec_info,
> > > - tree, tree, enum tree_code *,
> > > - enum tree_code *, int *,
> > > - vec<tree> *);
> > > +extern bool supportable_widening_operation (vec_info *vinfo,
> > > + code_helper code_or_ifn,
> > > + stmt_vec_info stmt_info,
> > > + tree vectype_out, tree vectype_in,
> > > + code_helper *code_or_ifn1,
> > > + code_helper *code_or_ifn2,
> > > + int *multi_step_cvt,
> > > + vec<tree> *interm_types);
> >
> > Normal style is to keep the variable names out of the header.
> > The documentation lives in the .c file, so in practice, anyone who
> > wants to add a new caller will need to look there anyway.
> >
> > Thanks,
> > Richard
> >
> > > extern bool supportable_narrowing_operation (enum tree_code, tree,
> > tree,
> > > - enum tree_code *, int *,
> > > + void *, int *,
> > > vec<tree> *);
> > >
> > > extern unsigned record_stmt_cost (stmt_vector_for_cost *, int, diff
> > > --git a/gcc/tree.h b/gcc/tree.h index f62c00bc870..346565f84ce
> > > 100644
> > > --- a/gcc/tree.h
> > > +++ b/gcc/tree.h
> > > @@ -6546,5 +6546,31 @@ extern unsigned fndecl_dealloc_argno (tree);
> > > if nonnull, set the second argument to the referenced enclosing
> > > object or pointer. Otherwise return null. */ extern tree
> > > get_attr_nonstring_decl (tree, tree * = NULL);
> > > +/* Helper to transparently allow tree codes and builtin function codes
> > > > +   exist in one storage entity.  */
> > > > +class code_helper {
> > > +public:
> > > + code_helper () {}
> > > + code_helper (tree_code code) : rep ((int) code) {}
> > > + code_helper (combined_fn fn) : rep (-(int) fn) {}
> > > + operator tree_code () const { return is_tree_code () ?
> > > + (tree_code) rep :
> > > + ERROR_MARK; }
> > > + operator combined_fn () const { return is_fn_code () ?
> > > + (combined_fn) -rep:
> > > + CFN_LAST; }
> > > + bool is_tree_code () const { return rep > 0; }
> > > + bool is_fn_code () const { return rep < 0; }
> > > + int get_rep () const { return rep; }
> > > +
> > > > +  enum tree_code as_tree_code () const
> > > > +  { return is_tree_code () ? (tree_code) *this : MAX_TREE_CODES; }
> > > > +  combined_fn as_fn_code () const
> > > > +  { return is_fn_code () ? (combined_fn) *this : CFN_LAST; }
> > > +
> > > +private:
> > > + int rep;
> > > +};
> > >
> > > #endif /* GCC_TREE_H */
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns
  2022-05-25  9:11 [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns Joel Hutton
@ 2022-05-27 13:23 ` Richard Biener
  2022-05-31 10:07   ` Joel Hutton
  0 siblings, 1 reply; 53+ messages in thread
From: Richard Biener @ 2022-05-27 13:23 UTC (permalink / raw)
To: Joel Hutton; +Cc: Richard Sandiford, gcc-patches

On Wed, 25 May 2022, Joel Hutton wrote:

> Ping!
>
> Just checking there is still interest in this. I'm assuming you've been
> busy with release.

Can you post an updated patch (after the .cc renaming, and code_helper
now already moved to tree.h).

Thanks,
Richard.

> Joel

--
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 53+ messages in thread
* RE: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns
  2022-05-27 13:23 ` Richard Biener
@ 2022-05-31 10:07   ` Joel Hutton
  2022-05-31 16:46     ` Tamar Christina
  2022-06-01 10:11     ` Richard Biener
  0 siblings, 2 replies; 53+ messages in thread
From: Joel Hutton @ 2022-05-31 10:07 UTC (permalink / raw)
To: Richard Biener; +Cc: Richard Sandiford, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 250 bytes --]

> Can you post an updated patch (after the .cc renaming, and code_helper
> now already moved to tree.h).
>
> Thanks,
> Richard.

Patches attached. They already incorporated the .cc rename, now rebased
to be after the change to tree.h.

Joel

[-- Attachment #2: 0001-Refactor-to-allow-internal_fn-s.patch --]
[-- Type: application/octet-stream, Size: 26740 bytes --]

From 3467bf531402d83c6427716954b9fab933f858ef Mon Sep 17 00:00:00 2001
From: Joel Hutton <joel.hutton@arm.com>
Date: Wed, 25 Aug 2021 14:31:15 +0100
Subject: [PATCH 1/3] Refactor to allow internal_fn's

Hi all,

This refactor allows widening patterns (such as widen_plus/widen_minus)
to be represented as either internal_fns or tree_codes.

[vect-patterns] Refactor as internal_fn's

Refactor vect-patterns to allow patterns to be internal_fns starting
with widening_plus/minus patterns.

gcc/ChangeLog:

        * gimple-match.h (class code_helper): Add as_internal_fn,
        as_tree_code helper functions.
        * gimple.cc (gimple_build): Function to build a GIMPLE_CALL or
        GIMPLE_ASSIGN as appropriate, given a code_helper.
        * gimple.h (gimple_build): Function prototype.
        * tree-core.h (ECF_WIDEN): Flag to mark internal_fn as widening.
        * tree-vect-patterns.cc (vect_recog_widen_op_pattern): Refactor
        to use code_helper.
        * tree-vect-stmts.cc (vect_gen_widened_results_half): Refactor
        to use code_helper.
        (vect_create_vectorized_promotion_stmts): Refactor to use
        code_helper.
        (vectorizable_conversion): Refactor to use code_helper,
        gimple_call or gimple_assign.
        (supportable_widening_operation): Refactor to use code_helper.
        (supportable_narrowing_operation): Refactor to use code_helper.
        * tree-vectorizer.h (supportable_widening_operation): Change
        prototype to use code_helper.
        (supportable_narrowing_operation): Change prototype to use
        code_helper.
---
 gcc/gimple.cc             |  24 +++++
 gcc/gimple.h              |   1 +
 gcc/tree-core.h           |   3 +
 gcc/tree-vect-patterns.cc |   7 +-
 gcc/tree-vect-stmts.cc    | 216 +++++++++++++++++++++++---------------
 gcc/tree-vectorizer.h     |  11 +-
 gcc/tree.h                |  52 +++++++++
 7 files changed, 221 insertions(+), 93 deletions(-)

diff --git a/gcc/gimple.cc b/gcc/gimple.cc
index b70ab4d25230374f0c90f93d77f9caf8d57587ee..bd3210cd35298daf7c74276e38658b5151d5cc1f 100644
--- a/gcc/gimple.cc
+++ b/gcc/gimple.cc
@@ -502,6 +502,30 @@ gimple_build_assign (tree lhs, enum tree_code subcode, tree op1 MEM_STAT_DECL)
 			      PASS_MEM_STAT);
 }

+/* Build a GIMPLE_ASSIGN or GIMPLE_CALL with the tree_code,
+   or internal_fn contained in ch, respectively.  */
+gimple *
+gimple_build (tree lhs, code_helper ch, tree op0, tree op1)
+{
+  if (op0 == NULL_TREE)
+    return NULL;
+  if (ch.is_tree_code ())
+    return op1 == NULL_TREE ? gimple_build_assign (lhs, ch.as_tree_code (),
+						   op0) :
+			      gimple_build_assign (lhs, ch.as_tree_code (),
+						   op0, op1);
+  else
+    {
+      internal_fn fn = as_internal_fn (ch.as_fn_code ());
+      gimple* stmt;
+      if (op1 == NULL_TREE)
+	stmt = gimple_build_call_internal (fn, 1, op0);
+      else
+	stmt = gimple_build_call_internal (fn, 2, op0, op1);
+      gimple_call_set_lhs (stmt, lhs);
+      return stmt;
+    }
+}

 /* Build a GIMPLE_COND statement.

diff --git a/gcc/gimple.h b/gcc/gimple.h
index 6b1e89ad74e6b22dd534ff48e48fef688032f844..5350f14f6f29bd8e011b79c2aa79c2bfaef8c58f 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -1523,6 +1523,7 @@ gcall *gimple_build_call_valist (tree, unsigned, va_list);
 gcall *gimple_build_call_internal (enum internal_fn, unsigned, ...);
 gcall *gimple_build_call_internal_vec (enum internal_fn, const vec<tree> &);
 gcall *gimple_build_call_from_tree (tree, tree);
+gimple* gimple_build (tree, code_helper, tree, tree);
 gassign *gimple_build_assign (tree, tree CXX_MEM_STAT_INFO);
 gassign *gimple_build_assign (tree, enum tree_code, tree, tree,
 			      tree CXX_MEM_STAT_INFO);
diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index ab5fa01e5cb5fb56c1964b93b014ed55a4aa704a..cff6211080bced0bffb39e98039a6550897acf77 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -96,6 +96,9 @@ struct die_struct;
 /* Nonzero if this is a cold function.  */
 #define ECF_COLD		  (1 << 15)

+/* Nonzero if this is a widening function.  */
+#define ECF_WIDEN		  (1 << 16)
+
 /* Call argument flags.  */

 /* Nonzero if the argument is not used by the function.  */
diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index 0fad4dbd0945c6c176f3457b751e812f17fcd148..36e362e1daf3f946c6074600a6a322b3bda67755 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -1348,7 +1348,7 @@ vect_recog_sad_pattern (vec_info *vinfo,
 static gimple *
 vect_recog_widen_op_pattern (vec_info *vinfo,
 			     stmt_vec_info last_stmt_info, tree *type_out,
-			     tree_code orig_code, tree_code wide_code,
+			     tree_code orig_code, code_helper wide_code,
 			     bool shift_p, const char *name)
 {
   gimple *last_stmt = last_stmt_info->stmt;
@@ -1391,7 +1391,7 @@ vect_recog_widen_op_pattern (vec_info *vinfo,
       vecctype = get_vectype_for_scalar_type (vinfo, ctype);
     }

-  enum tree_code dummy_code;
+  code_helper dummy_code;
   int dummy_int;
   auto_vec<tree> dummy_vec;
   if (!vectype
@@ -1412,8 +1412,7 @@ vect_recog_widen_op_pattern (vec_info *vinfo,
 			  2, oprnd, half_type, unprom, vectype);

   tree var = vect_recog_temp_ssa_var (itype, NULL);
-  gimple *pattern_stmt = gimple_build_assign (var, wide_code,
-					      oprnd[0], oprnd[1]);
+  gimple *pattern_stmt = gimple_build (var, wide_code, oprnd[0], oprnd[1]);

   if (vecctype != vecitype)
     pattern_stmt = vect_convert_output (vinfo, last_stmt_info, ctype,
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 346d8ce280437e00bfeb19a4b4adc59eb96207f9..61b51a29f99bcdf0ff6b4ead4a69163ebf8ed383 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -4636,7 +4636,7 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info,
    STMT_INFO is the original scalar stmt that we are vectorizing.  */

 static gimple *
-vect_gen_widened_results_half (vec_info *vinfo, enum tree_code code,
+vect_gen_widened_results_half (vec_info *vinfo, code_helper ch,
                                tree vec_oprnd0, tree vec_oprnd1, int op_type,
 			       tree vec_dest, gimple_stmt_iterator *gsi,
 			       stmt_vec_info stmt_info)
@@ -4645,14 +4645,12 @@ vect_gen_widened_results_half (vec_info *vinfo, enum tree_code code,
   tree new_temp;

   /* Generate half of the widened result:  */
-  gcc_assert (op_type == TREE_CODE_LENGTH (code));
   if (op_type != binary_op)
     vec_oprnd1 = NULL;
-  new_stmt = gimple_build_assign (vec_dest, code, vec_oprnd0, vec_oprnd1);
+  new_stmt = gimple_build (vec_dest, ch, vec_oprnd0, vec_oprnd1);
   new_temp = make_ssa_name (vec_dest, new_stmt);
-  gimple_assign_set_lhs (new_stmt, new_temp);
+  gimple_set_lhs (new_stmt, new_temp);
   vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
-
   return new_stmt;
 }

@@ -4729,8 +4727,8 @@ vect_create_vectorized_promotion_stmts (vec_info *vinfo,
 					vec<tree> *vec_oprnds1,
 					stmt_vec_info stmt_info, tree vec_dest,
 					gimple_stmt_iterator *gsi,
-					enum tree_code code1,
-					enum tree_code code2, int op_type)
+					code_helper ch1,
+					code_helper ch2, int op_type)
 {
   int i;
   tree vop0, vop1, new_tmp1, new_tmp2;
@@ -4746,10 +4744,10 @@ vect_create_vectorized_promotion_stmts (vec_info *vinfo,
 	vop1 = NULL_TREE;

       /* Generate the two halves of promotion operation.  */
-      new_stmt1 = vect_gen_widened_results_half (vinfo, code1, vop0, vop1,
+      new_stmt1 = vect_gen_widened_results_half (vinfo, ch1, vop0, vop1,
 						 op_type, vec_dest, gsi,
 						 stmt_info);
-      new_stmt2 = vect_gen_widened_results_half (vinfo, code2, vop0, vop1,
+      new_stmt2 = vect_gen_widened_results_half (vinfo, ch2, vop0, vop1,
 						 op_type, vec_dest, gsi,
 						 stmt_info);
       if (is_gimple_call (new_stmt1))
@@ -4846,8 +4844,9 @@ vectorizable_conversion (vec_info *vinfo,
   tree scalar_dest;
   tree op0, op1 = NULL_TREE;
   loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo);
-  enum tree_code code, code1 = ERROR_MARK, code2 = ERROR_MARK;
-  enum tree_code codecvt1 = ERROR_MARK, codecvt2 = ERROR_MARK;
+  tree_code tc1;
+  code_helper code, code1, code2;
+  code_helper codecvt1 = ERROR_MARK, codecvt2 = ERROR_MARK;
   tree new_temp;
   enum vect_def_type dt[2] = {vect_unknown_def_type, vect_unknown_def_type};
   int ndts = 2;
@@ -4876,31 +4875,42 @@ vectorizable_conversion (vec_info *vinfo,
       && ! vec_stmt)
     return false;

-  gassign *stmt = dyn_cast <gassign *> (stmt_info->stmt);
-  if (!stmt)
+  gimple* stmt = stmt_info->stmt;
+  if (!(is_gimple_assign (stmt) || is_gimple_call (stmt)))
     return false;

-  if (TREE_CODE (gimple_assign_lhs (stmt)) != SSA_NAME)
+  if (gimple_get_lhs (stmt) == NULL_TREE ||
+      TREE_CODE(gimple_get_lhs (stmt)) != SSA_NAME)
     return false;

-  code = gimple_assign_rhs_code (stmt);
-  if (!CONVERT_EXPR_CODE_P (code)
-      && code != FIX_TRUNC_EXPR
-      && code != FLOAT_EXPR
-      && code != WIDEN_PLUS_EXPR
-      && code != WIDEN_MINUS_EXPR
-      && code != WIDEN_MULT_EXPR
-      && code != WIDEN_LSHIFT_EXPR)
+  if (TREE_CODE (gimple_get_lhs (stmt)) != SSA_NAME)
     return false;

-  bool widen_arith = (code == WIDEN_PLUS_EXPR
-		      || code == WIDEN_MINUS_EXPR
-		      || code == WIDEN_MULT_EXPR
-		      || code == WIDEN_LSHIFT_EXPR);
-  op_type = TREE_CODE_LENGTH (code);
+  bool widen_arith = false;
+  gimple_match_op res_op;
+  if (!gimple_extract_op (stmt, &res_op))
+    return false;
+  code = res_op.code;
+  op_type = res_op.num_ops;
+
+  if (is_gimple_assign (stmt))
+    {
+      widen_arith = (code == WIDEN_PLUS_EXPR
+		     || code == WIDEN_MINUS_EXPR
+		     || code == WIDEN_MULT_EXPR
+		     || code == WIDEN_LSHIFT_EXPR);
+    }
+  else
+    widen_arith = gimple_call_flags (stmt) & ECF_WIDEN;
+
+  if (!widen_arith
+      && !CONVERT_EXPR_CODE_P (code)
+      && code != FIX_TRUNC_EXPR
+      && code != FLOAT_EXPR)
+    return false;

   /* Check types of lhs and rhs.  */
-  scalar_dest = gimple_assign_lhs (stmt);
+  scalar_dest = gimple_get_lhs (stmt);
   lhs_type = TREE_TYPE (scalar_dest);
   vectype_out = STMT_VINFO_VECTYPE (stmt_info);
@@ -4938,10 +4948,15 @@ vectorizable_conversion (vec_info *vinfo,

   if (op_type == binary_op)
     {
-      gcc_assert (code == WIDEN_MULT_EXPR || code == WIDEN_LSHIFT_EXPR
-		  || code == WIDEN_PLUS_EXPR || code == WIDEN_MINUS_EXPR);
+      gcc_assert (code == WIDEN_MULT_EXPR
+		  || code == WIDEN_LSHIFT_EXPR
+		  || code == WIDEN_PLUS_EXPR
+		  || code == WIDEN_MINUS_EXPR
+		  || widen_arith);
+
-      op1 = gimple_assign_rhs2 (stmt);
+      op1 = is_gimple_assign (stmt) ? gimple_assign_rhs2 (stmt) :
+				      gimple_call_arg (stmt, 0);
       tree vectype1_in;
       if (!vect_is_simple_use (vinfo, stmt_info, slp_node, 1,
			       &op1, &slp_op1, &dt[1], &vectype1_in))
@@ -5025,8 +5040,12 @@ vectorizable_conversion (vec_info *vinfo,
 	  && code != FLOAT_EXPR
 	  && !CONVERT_EXPR_CODE_P (code))
 	return false;
-      if (supportable_convert_operation (code, vectype_out, vectype_in, &code1))
+      if (supportable_convert_operation (code.as_tree_code (), vectype_out,
+					 vectype_in, &tc1))
+	{
+	  code1 = tc1;
	  break;
+	}
       /* FALLTHRU */
     unsupported:
       if (dump_enabled_p ())
@@ -5037,9 +5056,11 @@ vectorizable_conversion (vec_info *vinfo,
     case WIDEN:
       if (known_eq (nunits_in, nunits_out))
 	{
-	  if (!supportable_half_widening_operation (code, vectype_out,
-						    vectype_in, &code1))
+	  if (!supportable_half_widening_operation (code.as_tree_code (),
+						    vectype_out, vectype_in,
+						    &tc1))
 	    goto unsupported;
+	  code1 = tc1;
 	  gcc_assert (!(multi_step_cvt && op_type == binary_op));
 	  break;
 	}
@@ -5073,14 +5094,17 @@ vectorizable_conversion (vec_info *vinfo,

 	  if (GET_MODE_SIZE (rhs_mode) == fltsz)
 	    {
-	      if (!supportable_convert_operation (code, vectype_out,
-						  cvt_type, &codecvt1))
+	      tc1 = ERROR_MARK;
+	      if (!supportable_convert_operation (code.as_tree_code (),
+						  vectype_out,
+						  cvt_type, &tc1))
 		goto unsupported;
+	      codecvt1 = tc1;
 	    }
-	  else if (!supportable_widening_operation (vinfo, code, stmt_info,
-						    vectype_out, cvt_type,
-						    &codecvt1, &codecvt2,
-						    &multi_step_cvt,
+	  else if (!supportable_widening_operation (vinfo, code,
+						    stmt_info, vectype_out,
+						    cvt_type, &codecvt1,
+						    &codecvt2, &multi_step_cvt,
 						    &interm_types))
 	    continue;
 	  else
@@ -5088,8 +5112,9 @@ vectorizable_conversion (vec_info *vinfo,

 	  if (supportable_widening_operation (vinfo, NOP_EXPR, stmt_info,
 					      cvt_type,
-					      vectype_in, &code1, &code2,
-					      &multi_step_cvt, &interm_types))
+					      vectype_in, &code1,
+					      &code2, &multi_step_cvt,
+					      &interm_types))
 	    {
 	      found_mode = true;
 	      break;
@@ -5111,10 +5136,14 @@ vectorizable_conversion (vec_info *vinfo,

     case NARROW:
       gcc_assert (op_type == unary_op);
-      if (supportable_narrowing_operation (code, vectype_out, vectype_in,
-					   &code1, &multi_step_cvt,
+      if (supportable_narrowing_operation (code.as_tree_code (), vectype_out,
+					   vectype_in,
+					   &tc1, &multi_step_cvt,
 					   &interm_types))
-	break;
+	{
+	  code1 = tc1;
+	  break;
+	}

       if (code != FIX_TRUNC_EXPR
 	  || GET_MODE_SIZE (lhs_mode) >= GET_MODE_SIZE (rhs_mode))
@@ -5125,13 +5154,18 @@ vectorizable_conversion (vec_info *vinfo,
       cvt_type = get_same_sized_vectype (cvt_type, vectype_in);
       if (cvt_type == NULL_TREE)
 	goto unsupported;
-      if (!supportable_convert_operation (code, cvt_type, vectype_in,
-					  &codecvt1))
+      if (!supportable_convert_operation (code.as_tree_code (), cvt_type,
+					  vectype_in,
+					  &tc1))
 	goto unsupported;
+      codecvt1 = tc1;
       if (supportable_narrowing_operation (NOP_EXPR, vectype_out, cvt_type,
-					   &code1, &multi_step_cvt,
+					   &tc1, &multi_step_cvt,
 					   &interm_types))
-	break;
+	{
+	  code1 = tc1;
+	  break;
+	}

       goto unsupported;

     default:
@@ -5245,8 +5279,9 @@ vectorizable_conversion (vec_info *vinfo,
       FOR_EACH_VEC_ELT (vec_oprnds0, i, vop0)
 	{
 	  /* Arguments are ready, create the new vector stmt.  */
-	  gcc_assert (TREE_CODE_LENGTH (code1) == unary_op);
-	  gassign *new_stmt = gimple_build_assign (vec_dest, code1, vop0);
+	  gcc_assert (TREE_CODE_LENGTH ((tree_code) code1) == unary_op);
+	  gassign *new_stmt = gimple_build_assign (vec_dest,
+						   code1.as_tree_code (), vop0);
 	  new_temp = make_ssa_name (vec_dest, new_stmt);
 	  gimple_assign_set_lhs (new_stmt, new_temp);
 	  vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
@@ -5278,7 +5313,7 @@ vectorizable_conversion (vec_info *vinfo,
       for (i = multi_step_cvt; i >= 0; i--)
 	{
 	  tree this_dest = vec_dsts[i];
-	  enum tree_code c1 = code1, c2 = code2;
+	  code_helper c1 = code1, c2 = code2;
 	  if (i == 0 && codecvt2 != ERROR_MARK)
 	    {
 	      c1 = codecvt1;
@@ -5288,7 +5323,8 @@ vectorizable_conversion (vec_info *vinfo,
 	    vect_create_half_widening_stmts (vinfo, &vec_oprnds0,
 					     &vec_oprnds1, stmt_info,
 					     this_dest, gsi,
-					     c1, op_type);
+					     c1.as_tree_code (),
+					     op_type);
 	  else
 	    vect_create_vectorized_promotion_stmts (vinfo, &vec_oprnds0,
 						    &vec_oprnds1, stmt_info,
@@ -5301,9 +5337,11 @@ vectorizable_conversion (vec_info *vinfo,
 	  gimple *new_stmt;
 	  if (cvt_type)
 	    {
-	      gcc_assert (TREE_CODE_LENGTH (codecvt1) == unary_op);
+	      gcc_assert (TREE_CODE_LENGTH ((tree_code) codecvt1) == unary_op);
 	      new_temp = make_ssa_name (vec_dest);
-	      new_stmt = gimple_build_assign (new_temp, codecvt1, vop0);
+	      new_stmt = gimple_build_assign (new_temp,
+					      codecvt1.as_tree_code (),
+					      vop0);
 	      vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
 	    }
 	  else
@@ -5327,10 +5365,10 @@ vectorizable_conversion (vec_info *vinfo,
       if (cvt_type)
 	FOR_EACH_VEC_ELT (vec_oprnds0, i, vop0)
 	  {
-	    gcc_assert (TREE_CODE_LENGTH (codecvt1) == unary_op);
+	    gcc_assert (TREE_CODE_LENGTH (((tree_code) codecvt1)) == unary_op);
 	    new_temp = make_ssa_name (vec_dest);
 	    gassign *new_stmt
-	      = gimple_build_assign (new_temp, codecvt1, vop0);
+	      = gimple_build_assign (new_temp, codecvt1.as_tree_code (), vop0);
 	    vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
 	    vec_oprnds0[i] = new_temp;
 	  }
@@ -5338,7 +5376,7 @@ vectorizable_conversion (vec_info *vinfo,
       vect_create_vectorized_demotion_stmts (vinfo, &vec_oprnds0,
 					     multi_step_cvt, stmt_info,
 					     vec_dsts, gsi,
-					     slp_node, code1);
+					     slp_node, code1.as_tree_code ());
       break;
     }
   if (!slp_node)
@@ -11926,9 +11964,11 @@ vect_maybe_update_slp_op_vectype (slp_tree op, tree vectype)

 bool
 supportable_widening_operation (vec_info *vinfo,
-				enum tree_code code, stmt_vec_info stmt_info,
+				code_helper code,
+				stmt_vec_info stmt_info,
 				tree vectype_out, tree vectype_in,
-				enum tree_code *code1, enum tree_code *code2,
+				code_helper *code1,
+				code_helper *code2,
 				int *multi_step_cvt,
 				vec<tree> *interm_types)
 {
@@ -11939,7 +11979,7 @@ supportable_widening_operation (vec_info *vinfo,
   optab optab1, optab2;
   tree vectype = vectype_in;
   tree wide_vectype = vectype_out;
-  enum tree_code c1, c2;
+  code_helper c1=MAX_TREE_CODES, c2=MAX_TREE_CODES;
   int i;
   tree prev_type, intermediate_type;
   machine_mode intermediate_mode, prev_mode;
@@ -11949,7 +11989,7 @@ supportable_widening_operation (vec_info *vinfo,
   if (loop_info)
     vect_loop = LOOP_VINFO_LOOP (loop_info);

-  switch (code)
+  switch (code.as_tree_code ())
     {
     case WIDEN_MULT_EXPR:
       /* The result of a vectorized widening operation usually requires
@@ -11990,8 +12030,9 @@ supportable_widening_operation (vec_info *vinfo,
 	  && !nested_in_vect_loop_p (vect_loop, stmt_info)
 	  && supportable_widening_operation (vinfo, VEC_WIDEN_MULT_EVEN_EXPR,
 					     stmt_info, vectype_out,
-					     vectype_in, code1, code2,
-					     multi_step_cvt, interm_types))
+					     vectype_in, code1,
+					     code2, multi_step_cvt,
+					     interm_types))
 	{
 	  /* Elements in a vector with vect_used_by_reduction property cannot
 	     be reordered if the use chain with this property does not have the
@@ -12054,6 +12095,9 @@ supportable_widening_operation (vec_info *vinfo,
       c2 = VEC_UNPACK_FIX_TRUNC_HI_EXPR;
       break;

+    case MAX_TREE_CODES:
+      break;
+
     default:
       gcc_unreachable ();
     }

@@ -12061,13
vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi); vec_oprnds0[i] = new_temp; } @@ -5338,7 +5376,7 @@ vectorizable_conversion (vec_info *vinfo, vect_create_vectorized_demotion_stmts (vinfo, &vec_oprnds0, multi_step_cvt, stmt_info, vec_dsts, gsi, - slp_node, code1); + slp_node, code1.as_tree_code ()); break; } if (!slp_node) @@ -11926,9 +11964,11 @@ vect_maybe_update_slp_op_vectype (slp_tree op, tree vectype) bool supportable_widening_operation (vec_info *vinfo, - enum tree_code code, stmt_vec_info stmt_info, + code_helper code, + stmt_vec_info stmt_info, tree vectype_out, tree vectype_in, - enum tree_code *code1, enum tree_code *code2, + code_helper *code1, + code_helper *code2, int *multi_step_cvt, vec<tree> *interm_types) { @@ -11939,7 +11979,7 @@ supportable_widening_operation (vec_info *vinfo, optab optab1, optab2; tree vectype = vectype_in; tree wide_vectype = vectype_out; - enum tree_code c1, c2; + code_helper c1=MAX_TREE_CODES, c2=MAX_TREE_CODES; int i; tree prev_type, intermediate_type; machine_mode intermediate_mode, prev_mode; @@ -11949,7 +11989,7 @@ supportable_widening_operation (vec_info *vinfo, if (loop_info) vect_loop = LOOP_VINFO_LOOP (loop_info); - switch (code) + switch (code.as_tree_code ()) { case WIDEN_MULT_EXPR: /* The result of a vectorized widening operation usually requires @@ -11990,8 +12030,9 @@ supportable_widening_operation (vec_info *vinfo, && !nested_in_vect_loop_p (vect_loop, stmt_info) && supportable_widening_operation (vinfo, VEC_WIDEN_MULT_EVEN_EXPR, stmt_info, vectype_out, - vectype_in, code1, code2, - multi_step_cvt, interm_types)) + vectype_in, code1, + code2, multi_step_cvt, + interm_types)) { /* Elements in a vector with vect_used_by_reduction property cannot be reordered if the use chain with this property does not have the @@ -12054,6 +12095,9 @@ supportable_widening_operation (vec_info *vinfo, c2 = VEC_UNPACK_FIX_TRUNC_HI_EXPR; break; + case MAX_TREE_CODES: + break; + default: gcc_unreachable (); } @@ -12061,13 
+12105,16 @@ supportable_widening_operation (vec_info *vinfo, if (BYTES_BIG_ENDIAN && c1 != VEC_WIDEN_MULT_EVEN_EXPR) std::swap (c1, c2); + if (code == FIX_TRUNC_EXPR) { /* The signedness is determined from output operand. */ - optab1 = optab_for_tree_code (c1, vectype_out, optab_default); - optab2 = optab_for_tree_code (c2, vectype_out, optab_default); + optab1 = optab_for_tree_code (c1.as_tree_code (), vectype_out, + optab_default); + optab2 = optab_for_tree_code (c2.as_tree_code (), vectype_out, + optab_default); } - else if (CONVERT_EXPR_CODE_P (code) + else if (CONVERT_EXPR_CODE_P (code.as_tree_code ()) && VECTOR_BOOLEAN_TYPE_P (wide_vectype) && VECTOR_BOOLEAN_TYPE_P (vectype) && TYPE_MODE (wide_vectype) == TYPE_MODE (vectype) @@ -12080,8 +12127,8 @@ supportable_widening_operation (vec_info *vinfo, } else { - optab1 = optab_for_tree_code (c1, vectype, optab_default); - optab2 = optab_for_tree_code (c2, vectype, optab_default); + optab1 = optab_for_tree_code (c1.as_tree_code (), vectype, optab_default); + optab2 = optab_for_tree_code (c2.as_tree_code (), vectype, optab_default); } if (!optab1 || !optab2) @@ -12092,8 +12139,12 @@ supportable_widening_operation (vec_info *vinfo, || (icode2 = optab_handler (optab2, vec_mode)) == CODE_FOR_nothing) return false; - *code1 = c1; - *code2 = c2; + if (code.is_tree_code ()) + { + *code1 = c1; + *code2 = c2; + } + if (insn_data[icode1].operand[0].mode == TYPE_MODE (wide_vectype) && insn_data[icode2].operand[0].mode == TYPE_MODE (wide_vectype)) @@ -12114,7 +12165,7 @@ supportable_widening_operation (vec_info *vinfo, prev_type = vectype; prev_mode = vec_mode; - if (!CONVERT_EXPR_CODE_P (code)) + if (!CONVERT_EXPR_CODE_P ((tree_code) code)) return false; /* We assume here that there will not be more than MAX_INTERM_CVT_STEPS @@ -12145,8 +12196,10 @@ supportable_widening_operation (vec_info *vinfo, } else { - optab3 = optab_for_tree_code (c1, intermediate_type, optab_default); - optab4 = optab_for_tree_code (c2, 
intermediate_type, optab_default); + optab3 = optab_for_tree_code (c1.as_tree_code (), intermediate_type, + optab_default); + optab4 = optab_for_tree_code (c2.as_tree_code (), intermediate_type, + optab_default); } if (!optab3 || !optab4 @@ -12181,7 +12234,6 @@ supportable_widening_operation (vec_info *vinfo, return false; } - /* Function supportable_narrowing_operation Check whether an operation represented by the code CODE is a @@ -12205,7 +12257,7 @@ supportable_widening_operation (vec_info *vinfo, bool supportable_narrowing_operation (enum tree_code code, tree vectype_out, tree vectype_in, - enum tree_code *code1, int *multi_step_cvt, + tree_code* _code1, int *multi_step_cvt, vec<tree> *interm_types) { machine_mode vec_mode; @@ -12217,8 +12269,8 @@ supportable_narrowing_operation (enum tree_code code, tree intermediate_type, prev_type; machine_mode intermediate_mode, prev_mode; int i; - unsigned HOST_WIDE_INT n_elts; bool uns; + tree_code * code1 = (tree_code*) _code1; *multi_step_cvt = 0; switch (code) @@ -12227,9 +12279,8 @@ supportable_narrowing_operation (enum tree_code code, c1 = VEC_PACK_TRUNC_EXPR; if (VECTOR_BOOLEAN_TYPE_P (narrow_vectype) && VECTOR_BOOLEAN_TYPE_P (vectype) - && SCALAR_INT_MODE_P (TYPE_MODE (vectype)) - && TYPE_VECTOR_SUBPARTS (vectype).is_constant (&n_elts) - && n_elts < BITS_PER_UNIT) + && TYPE_MODE (narrow_vectype) == TYPE_MODE (vectype) + && SCALAR_INT_MODE_P (TYPE_MODE (vectype))) optab1 = vec_pack_sbool_trunc_optab; else optab1 = optab_for_tree_code (c1, vectype, optab_default); @@ -12320,9 +12371,8 @@ supportable_narrowing_operation (enum tree_code code, = lang_hooks.types.type_for_mode (intermediate_mode, uns); if (VECTOR_BOOLEAN_TYPE_P (intermediate_type) && VECTOR_BOOLEAN_TYPE_P (prev_type) - && SCALAR_INT_MODE_P (prev_mode) - && TYPE_VECTOR_SUBPARTS (intermediate_type).is_constant (&n_elts) - && n_elts < BITS_PER_UNIT) + && intermediate_mode == prev_mode + && SCALAR_INT_MODE_P (prev_mode)) interm_optab = 
vec_pack_sbool_trunc_optab; else interm_optab diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h index 642eb0aeb21264cd736a479b1ec25357abef29cd..50ff8eeac1e6b9859302bd784f10ee3d8ff4b4dc 100644 --- a/gcc/tree-vectorizer.h +++ b/gcc/tree-vectorizer.h @@ -2120,13 +2120,12 @@ extern bool vect_is_simple_use (vec_info *, stmt_vec_info, slp_tree, enum vect_def_type *, tree *, stmt_vec_info * = NULL); extern bool vect_maybe_update_slp_op_vectype (slp_tree, tree); -extern bool supportable_widening_operation (vec_info *, - enum tree_code, stmt_vec_info, - tree, tree, enum tree_code *, - enum tree_code *, int *, - vec<tree> *); +extern bool supportable_widening_operation (vec_info*, code_helper, + stmt_vec_info, tree, tree, + code_helper*, code_helper*, + int*, vec<tree> *); extern bool supportable_narrowing_operation (enum tree_code, tree, tree, - enum tree_code *, int *, + tree_code *, int *, vec<tree> *); extern unsigned record_stmt_cost (stmt_vector_for_cost *, int, diff --git a/gcc/tree.h b/gcc/tree.h index f84958933d51144bb6ce7cc41eca5f7f06814550..e51e34c051d9b91d1c02a4b2fefdb2b15606a36f 100644 --- a/gcc/tree.h +++ b/gcc/tree.h @@ -92,6 +92,10 @@ public: bool is_fn_code () const { return rep < 0; } bool is_internal_fn () const; bool is_builtin_fn () const; + enum tree_code as_tree_code () const { return is_tree_code () ? + (tree_code)* this : MAX_TREE_CODES; } + combined_fn as_fn_code () const { return is_fn_code () ? (combined_fn) *this + : CFN_LAST;} int get_rep () const { return rep; } bool operator== (const code_helper &other) { return rep == other.rep; } bool operator!= (const code_helper &other) { return rep != other.rep; } @@ -6657,6 +6661,54 @@ extern unsigned fndecl_dealloc_argno (tree); if nonnull, set the second argument to the referenced enclosing object or pointer. Otherwise return null. */ extern tree get_attr_nonstring_decl (tree, tree * = NULL); +/* Helper to transparently allow tree codes and builtin function codes + exist in one storage entity. 
 */
+class code_helper
+{
+public:
+  code_helper () {}
+  code_helper (tree_code code) : rep ((int) code) {}
+  code_helper (combined_fn fn) : rep (-(int) fn) {}
+  code_helper (internal_fn fn) : rep (-(int) as_combined_fn (fn)) {}
+  explicit operator tree_code () const { return (tree_code) rep; }
+  explicit operator combined_fn () const { return (combined_fn) -rep; }
+  explicit operator internal_fn () const;
+  explicit operator built_in_function () const;
+  bool is_tree_code () const { return rep > 0; }
+  bool is_fn_code () const { return rep < 0; }
+  bool is_internal_fn () const;
+  bool is_builtin_fn () const;
+  int get_rep () const { return rep; }
+  bool operator== (const code_helper &other) { return rep == other.rep; }
+  bool operator!= (const code_helper &other) { return rep != other.rep; }
+  bool operator== (tree_code c) { return rep == code_helper (c).rep; }
+  bool operator!= (tree_code c) { return rep != code_helper (c).rep; }
+
+private:
+  int rep;
+};
+
+inline code_helper::operator internal_fn () const
+{
+  return as_internal_fn (combined_fn (*this));
+}
+
+inline code_helper::operator built_in_function () const
+{
+  return as_builtin_fn (combined_fn (*this));
+}
+
+inline bool
+code_helper::is_internal_fn () const
+{
+  return is_fn_code () && internal_fn_p (combined_fn (*this));
+}
+
+inline bool
+code_helper::is_builtin_fn () const
+{
+  return is_fn_code () && builtin_fn_p (combined_fn (*this));
+}

 extern int get_target_clone_attr_len (tree);
-- 
2.17.1

[-- Attachment #3: 0002-Refactor-widen_plus-as-internal_fn.patch --]
[-- Type: application/octet-stream, Size: 22720 bytes --]

From 70fa5fc3e8282a73b973bdd79fccd3450d5b312c Mon Sep 17 00:00:00 2001
From: Joel Hutton <joel.hutton@arm.com>
Date: Wed, 26 Jan 2022 14:00:17 +0000
Subject: [PATCH 2/3] Refactor widen_plus as internal_fn

This patch replaces the existing tree_code widen_plus and widen_minus
patterns with internal_fn versions.
DEF_INTERNAL_OPTAB_MULTI_FN is like DEF_INTERNAL_OPTAB_FN except it
provides convenience wrappers for defining conversions that require a
hi/lo split, like widening and narrowing operations.  Each definition
for <NAME> will require an optab named <OPTAB> and two other optabs
that you specify for signed and unsigned.

The hi/lo pair is necessary because the widening operations take n
narrow elements as inputs and return n/2 wide elements as outputs.
The 'lo' operation operates on the first n/2 elements of input.  The
'hi' operation operates on the second n/2 elements of input.  Defining
an internal_fn along with hi/lo variations allows a single internal
function to be returned from a vect_recog function that will later be
expanded to hi/lo.

DEF_INTERNAL_OPTAB_MULTI_FN is used in internal-fn.def to register a
widening internal_fn.  It is defined differently in different places
and internal-fn.def is sourced from those places so the parameters
given can be reused:

  internal-fn.c: defined to expand to hi/lo signed/unsigned optabs,
  later defined to generate the 'expand_' functions for the hi/lo
  versions of the fn.

  internal-fn.def: defined to invoke DEF_INTERNAL_OPTAB_FN for the
  original and hi/lo variants of the internal_fn.

For example:

  IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO

for aarch64:

  IFN_VEC_WIDEN_PLUS_HI -> vec_widen_<su>addl_hi_<mode> -> (u/s)addl2
  IFN_VEC_WIDEN_PLUS_LO -> vec_widen_<su>addl_lo_<mode> -> (u/s)addl

This gives the same functionality as the previous WIDEN_PLUS/WIDEN_MINUS
tree codes which are expanded into VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI.

gcc/ChangeLog:

2022-04-13  Joel Hutton  <joel.hutton@arm.com>
2022-04-13  Tamar Christina  <tamar.christina@arm.com>

	* internal-fn.cc (INCLUDE_MAP): Include maps for use in optab
	lookup.
	(DEF_INTERNAL_OPTAB_MULTI_FN): Macro to define an internal_fn
	that expands into multiple internal_fns (for widening).
	(ifn_cmp): Function to compare ifn's for sorting/searching.
	(lookup_multi_ifn_optab): Add lookup function.
	(lookup_multi_internal_fn): Add lookup function.
	(commutative_binary_fn_p): Add widen_plus fn's.
	* internal-fn.def (DEF_INTERNAL_OPTAB_MULTI_FN): Define widening
	plus,minus functions.
	(VEC_WIDEN_PLUS): Replacement for VEC_WIDEN_PLUS tree code.
	(VEC_WIDEN_MINUS): Replacement for VEC_WIDEN_MINUS tree code.
	* internal-fn.h (GCC_INTERNAL_FN_H): Add headers.
	(lookup_multi_ifn_optab): Add prototype.
	(lookup_multi_internal_fn): Add prototype.
	* optabs.cc (commutative_optab_p): Add widening plus, minus
	optabs.
	* optabs.def (OPTAB_CD): Add widen add, sub optabs.
	* tree-core.h (ECF_MULTI): Flag to indicate if a function decays
	into hi/lo parts.
	* tree-vect-patterns.cc (vect_recog_widen_op_pattern): Support
	patterns with a hi/lo split.
	(vect_recog_widen_plus_pattern): Refactor to return
	IFN_VEC_WIDEN_PLUS.
	(vect_recog_widen_minus_pattern): Refactor to return new
	IFN_VEC_WIDEN_MINUS.
	* tree-vect-stmts.cc (vectorizable_conversion): Add widen
	plus/minus ifn support.
	(supportable_widening_operation): Add widen plus/minus ifn
	support.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/vect-widen-add.c: Test that new
	IFN_VEC_WIDEN_PLUS is being used.
	* gcc.target/aarch64/vect-widen-sub.c: Test that new
	IFN_VEC_WIDEN_MINUS is being used.
---
 gcc/internal-fn.cc                        | 107 ++++++++++++++++++
 gcc/internal-fn.def                       |  23 ++++
 gcc/internal-fn.h                         |   7 ++
 gcc/optabs.cc                             |  12 +-
 gcc/optabs.def                            |   2 +
 .../gcc.target/aarch64/vect-widen-add.c   |   4 +-
 .../gcc.target/aarch64/vect-widen-sub.c   |   4 +-
 gcc/tree-core.h                           |   4 +
 gcc/tree-vect-patterns.cc                 |  37 ++++--
 gcc/tree-vect-stmts.cc                    |  60 +++++++++-
 10 files changed, 244 insertions(+), 16 deletions(-)

diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 8b1733e20c4455e4e8c383c92fe859f4256cae69..e95b13af884f67990ad43c286990a351e2bd641b 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -17,6 +17,7 @@
 You should have received a copy of the GNU General Public License
 along with GCC; see the file COPYING3.
If not see <http://www.gnu.org/licenses/>. */ +#define INCLUDE_MAP #include "config.h" #include "system.h" #include "coretypes.h" @@ -70,6 +71,26 @@ const int internal_fn_flags_array[] = { 0 }; +const enum internal_fn internal_fn_hilo_keys_array[] = { +#undef DEF_INTERNAL_OPTAB_MULTI_FN +#define DEF_INTERNAL_OPTAB_MULTI_FN(NAME, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \ + IFN_##NAME##_LO, \ + IFN_##NAME##_HI, +#include "internal-fn.def" + IFN_LAST +#undef DEF_INTERNAL_OPTAB_MULTI_FN +}; + +const optab internal_fn_hilo_values_array[] = { +#undef DEF_INTERNAL_OPTAB_MULTI_FN +#define DEF_INTERNAL_OPTAB_MULTI_FN(NAME, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \ + SOPTAB##_lo_optab, UOPTAB##_lo_optab, \ + SOPTAB##_hi_optab, UOPTAB##_hi_optab, +#include "internal-fn.def" + unknown_optab, unknown_optab +#undef DEF_INTERNAL_OPTAB_MULTI_FN +}; + /* Return the internal function called NAME, or IFN_LAST if there's no such function. */ @@ -90,6 +111,62 @@ lookup_internal_fn (const char *name) return entry ? *entry : IFN_LAST; } +static int +ifn_cmp (const void *a_, const void *b_) +{ + typedef std::pair<enum internal_fn, unsigned> ifn_pair; + auto *a = (const std::pair<ifn_pair, optab> *)a_; + auto *b = (const std::pair<ifn_pair, optab> *)b_; + return (int) (a->first.first) - (b->first.first); +} + +/* Return the optab belonging to the given internal function NAME for the given + SIGN or unknown_optab. 
*/ + +optab +lookup_multi_ifn_optab (enum internal_fn fn, unsigned sign) +{ + typedef std::pair<enum internal_fn, unsigned> ifn_pair; + typedef auto_vec <std::pair<ifn_pair, optab>>fn_to_optab_map_type; + static fn_to_optab_map_type *fn_to_optab_map; + + if (!fn_to_optab_map) + { + unsigned num + = sizeof (internal_fn_hilo_keys_array) / sizeof (enum internal_fn); + fn_to_optab_map = new fn_to_optab_map_type (); + for (unsigned int i = 0; i < num - 1; ++i) + { + enum internal_fn fn = internal_fn_hilo_keys_array[i]; + optab v1 = internal_fn_hilo_values_array[2*i]; + optab v2 = internal_fn_hilo_values_array[2*i + 1]; + ifn_pair key1 (fn, 0); + fn_to_optab_map->safe_push ({key1, v1}); + ifn_pair key2 (fn, 1); + fn_to_optab_map->safe_push ({key2, v2}); + } + fn_to_optab_map->qsort(ifn_cmp); + } + + ifn_pair new_pair (fn, sign ? 1 : 0); + optab tmp; + std::pair<ifn_pair,optab> pair_wrap (new_pair, tmp); + auto entry = fn_to_optab_map->bsearch (&pair_wrap, ifn_cmp); + return entry != fn_to_optab_map->end () ? entry->second : unknown_optab; +} + +extern void +lookup_multi_internal_fn (enum internal_fn ifn, enum internal_fn *lo, + enum internal_fn *hi) +{ + int ecf_flags = internal_fn_flags (ifn); + gcc_assert (ecf_flags & ECF_MULTI); + + *lo = internal_fn (ifn + 1); + *hi = internal_fn (ifn + 2); +} + + /* Fnspec of each internal function, indexed by function number. 
*/ const_tree internal_fn_fnspec_array[IFN_LAST + 1]; @@ -3906,6 +3983,9 @@ commutative_binary_fn_p (internal_fn fn) case IFN_UBSAN_CHECK_MUL: case IFN_ADD_OVERFLOW: case IFN_MUL_OVERFLOW: + case IFN_VEC_WIDEN_PLUS: + case IFN_VEC_WIDEN_PLUS_LO: + case IFN_VEC_WIDEN_PLUS_HI: return true; default: @@ -3991,6 +4071,32 @@ set_edom_supported_p (void) #endif } +#undef DEF_INTERNAL_OPTAB_MULTI_FN +#define DEF_INTERNAL_OPTAB_MULTI_FN(CODE, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \ + static void \ + expand_##CODE (internal_fn, gcall *) \ + { \ + gcc_unreachable (); \ + } \ + static void \ + expand_##CODE##_LO (internal_fn fn, gcall *stmt) \ + { \ + tree ty = TREE_TYPE (gimple_get_lhs (stmt)); \ + if (!TYPE_UNSIGNED (ty)) \ + expand_##TYPE##_optab_fn (fn, stmt, SOPTAB##_lo##_optab); \ + else \ + expand_##TYPE##_optab_fn (fn, stmt, UOPTAB##_lo##_optab); \ + } \ + static void \ + expand_##CODE##_HI (internal_fn fn, gcall *stmt) \ + { \ + tree ty = TREE_TYPE (gimple_get_lhs (stmt)); \ + if (!TYPE_UNSIGNED (ty)) \ + expand_##TYPE##_optab_fn (fn, stmt, SOPTAB##_hi##_optab); \ + else \ + expand_##TYPE##_optab_fn (fn, stmt, UOPTAB##_hi##_optab); \ + } + #define DEF_INTERNAL_OPTAB_FN(CODE, FLAGS, OPTAB, TYPE) \ static void \ expand_##CODE (internal_fn fn, gcall *stmt) \ @@ -4007,6 +4113,7 @@ set_edom_supported_p (void) expand_##TYPE##_optab_fn (fn, stmt, which_optab); \ } #include "internal-fn.def" +#undef DEF_INTERNAL_OPTAB_MULTI_FN /* Routines to expand each internal function, indexed by function number. Each routine has the prototype: diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def index d2d550d358606022b1cb44fa842f06e0be507bc3..4635a9c8af9ad27bb05d7510388d0fe2270428e5 100644 --- a/gcc/internal-fn.def +++ b/gcc/internal-fn.def @@ -82,6 +82,13 @@ along with GCC; see the file COPYING3. If not see says that the function extends the C-level BUILT_IN_<NAME>{,L,LL,IMAX} group of functions to any integral mode (including vector modes). 
+ DEF_INTERNAL_OPTAB_MULTI_FN is like DEF_INTERNAL_OPTAB_FN except it + provides convenience wrappers for defining conversions that require a + hi/lo split, like widening and narrowing operations. Each definition + for <NAME> will require an optab named <OPTAB> and two other optabs that + you specify for signed and unsigned. + + Each entry must have a corresponding expander of the form: void expand_NAME (gimple_call stmt) @@ -120,6 +127,14 @@ along with GCC; see the file COPYING3. If not see DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE) #endif +#ifndef DEF_INTERNAL_OPTAB_MULTI_FN +#define DEF_INTERNAL_OPTAB_MULTI_FN(NAME, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \ + DEF_INTERNAL_OPTAB_FN (NAME, FLAGS | ECF_MULTI, OPTAB, TYPE) \ + DEF_INTERNAL_OPTAB_FN (NAME ## _LO, FLAGS, unknown, TYPE) \ + DEF_INTERNAL_OPTAB_FN (NAME ## _HI, FLAGS, unknown, TYPE) +#endif + + DEF_INTERNAL_OPTAB_FN (MASK_LOAD, ECF_PURE, maskload, mask_load) DEF_INTERNAL_OPTAB_FN (LOAD_LANES, ECF_CONST, vec_load_lanes, load_lanes) DEF_INTERNAL_OPTAB_FN (MASK_LOAD_LANES, ECF_PURE, @@ -292,6 +307,14 @@ DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT270, ECF_CONST, cadd270, binary) DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL, ECF_CONST, cmul, binary) DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL_CONJ, ECF_CONST, cmul_conj, binary) DEF_INTERNAL_OPTAB_FN (VEC_ADDSUB, ECF_CONST, vec_addsub, binary) +DEF_INTERNAL_OPTAB_MULTI_FN (VEC_WIDEN_PLUS, + ECF_CONST | ECF_WIDEN | ECF_NOTHROW, + vec_widen_add, vec_widen_saddl, vec_widen_uaddl, + binary) +DEF_INTERNAL_OPTAB_MULTI_FN (VEC_WIDEN_MINUS, + ECF_CONST | ECF_WIDEN | ECF_NOTHROW, + vec_widen_sub, vec_widen_ssubl, vec_widen_usubl, + binary) DEF_INTERNAL_OPTAB_FN (VEC_FMADDSUB, ECF_CONST, vec_fmaddsub, ternary) DEF_INTERNAL_OPTAB_FN (VEC_FMSUBADD, ECF_CONST, vec_fmsubadd, ternary) diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h index 23c014a963c4d72da92c763db87ee486a2adb485..b35de19747d251d19dc13de1e0323368bd2ebdf2 100644 --- a/gcc/internal-fn.h +++ b/gcc/internal-fn.h @@ -20,6 +20,10 @@ 
along with GCC; see the file COPYING3. If not see #ifndef GCC_INTERNAL_FN_H #define GCC_INTERNAL_FN_H +#include "insn-codes.h" +#include "insn-opinit.h" + + /* INTEGER_CST values for IFN_UNIQUE function arg-0. UNSPEC: Undifferentiated UNIQUE. @@ -112,6 +116,9 @@ internal_fn_name (enum internal_fn fn) } extern internal_fn lookup_internal_fn (const char *); +extern optab lookup_multi_ifn_optab (enum internal_fn, unsigned); +extern void lookup_multi_internal_fn (enum internal_fn, enum internal_fn *, + enum internal_fn *); /* Return the ECF_* flags for function FN. */ diff --git a/gcc/optabs.cc b/gcc/optabs.cc index c0a68471d2ddf08bc0e6a3fd592ebb9f05e516c1..7e904b3e154d018779bb1a36de74e6997f70e193 100644 --- a/gcc/optabs.cc +++ b/gcc/optabs.cc @@ -1314,7 +1314,17 @@ commutative_optab_p (optab binoptab) || binoptab == smul_widen_optab || binoptab == umul_widen_optab || binoptab == smul_highpart_optab - || binoptab == umul_highpart_optab); + || binoptab == umul_highpart_optab + || binoptab == vec_widen_add_optab + || binoptab == vec_widen_sub_optab + || binoptab == vec_widen_saddl_hi_optab + || binoptab == vec_widen_saddl_lo_optab + || binoptab == vec_widen_ssubl_hi_optab + || binoptab == vec_widen_ssubl_lo_optab + || binoptab == vec_widen_uaddl_hi_optab + || binoptab == vec_widen_uaddl_lo_optab + || binoptab == vec_widen_usubl_hi_optab + || binoptab == vec_widen_usubl_lo_optab); } /* X is to be used in mode MODE as operand OPN to BINOPTAB. 
If we're diff --git a/gcc/optabs.def b/gcc/optabs.def index 801310ebaa7d469520809bb7efed6820f8eb866b..a7881dcb49e4ef07d8f07aa31214eb3a7a944e2e 100644 --- a/gcc/optabs.def +++ b/gcc/optabs.def @@ -78,6 +78,8 @@ OPTAB_CD(smsub_widen_optab, "msub$b$a4") OPTAB_CD(umsub_widen_optab, "umsub$b$a4") OPTAB_CD(ssmsub_widen_optab, "ssmsub$b$a4") OPTAB_CD(usmsub_widen_optab, "usmsub$a$b4") +OPTAB_CD(vec_widen_add_optab, "add$a$b3") +OPTAB_CD(vec_widen_sub_optab, "sub$a$b3") OPTAB_CD(vec_load_lanes_optab, "vec_load_lanes$a$b") OPTAB_CD(vec_store_lanes_optab, "vec_store_lanes$a$b") OPTAB_CD(vec_mask_load_lanes_optab, "vec_mask_load_lanes$a$b") diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c index 220bd9352a4c7acd2e3713e441d74898d3e92b30..7037673d32bd780e1c9b58a51e58e2bac3b30b7e 100644 --- a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c +++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c @@ -1,5 +1,5 @@ /* { dg-do run } */ -/* { dg-options "-O3 -save-temps" } */ +/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */ #include <stdint.h> #include <string.h> @@ -86,6 +86,8 @@ main() return 0; } +/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_LO" "vect" } } */ +/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_HI" "vect" } } */ /* { dg-final { scan-assembler-times {\tuaddl\t} 1} } */ /* { dg-final { scan-assembler-times {\tuaddl2\t} 1} } */ /* { dg-final { scan-assembler-times {\tsaddl\t} 1} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c index a2bed63affbd091977df95a126da1f5b8c1d41d2..83bc1edb6105f47114b665e24a13e6194b2179a2 100644 --- a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c +++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c @@ -1,5 +1,5 @@ /* { dg-do run } */ -/* { dg-options "-O3 -save-temps" } */ +/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */ #include <stdint.h> #include 
<string.h> @@ -86,6 +86,8 @@ main() return 0; } +/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_LO" "vect" } } */ +/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_HI" "vect" } } */ /* { dg-final { scan-assembler-times {\tusubl\t} 1} } */ /* { dg-final { scan-assembler-times {\tusubl2\t} 1} } */ /* { dg-final { scan-assembler-times {\tssubl\t} 1} } */ diff --git a/gcc/tree-core.h b/gcc/tree-core.h index cff6211080bced0bffb39e98039a6550897acf77..d0c8b812cfb9c3ac83bf25fff0431b08cb7d823d 100644 --- a/gcc/tree-core.h +++ b/gcc/tree-core.h @@ -99,6 +99,10 @@ struct die_struct; /* Nonzero if this is a widening function. */ #define ECF_WIDEN (1 << 16) +/* Nonzero if this is a function that decomposes into a lo/hi operation. */ +#define ECF_MULTI (1 << 17) + + /* Call argument flags. */ /* Nonzero if the argument is not used by the function. */ diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc index 36e362e1daf3f946c6074600a6a322b3bda67755..0fd587da12b4a17b238327ae60f5a2a7a0efc514 100644 --- a/gcc/tree-vect-patterns.cc +++ b/gcc/tree-vect-patterns.cc @@ -1349,14 +1349,16 @@ static gimple * vect_recog_widen_op_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info, tree *type_out, tree_code orig_code, code_helper wide_code, - bool shift_p, const char *name) + bool shift_p, const char *name, + enum optab_subtype *subtype = NULL) { gimple *last_stmt = last_stmt_info->stmt; vect_unpromoted_value unprom[2]; tree half_type; if (!vect_widened_op_tree (vinfo, last_stmt_info, orig_code, orig_code, - shift_p, 2, unprom, &half_type)) + shift_p, 2, unprom, &half_type, subtype)) + return NULL; /* Pattern detected. 
*/ @@ -1422,6 +1424,20 @@ vect_recog_widen_op_pattern (vec_info *vinfo, type, pattern_stmt, vecctype); } +static gimple * +vect_recog_widen_op_pattern (vec_info *vinfo, + stmt_vec_info last_stmt_info, tree *type_out, + tree_code orig_code, internal_fn wide_ifn, + bool shift_p, const char *name, + enum optab_subtype *subtype = NULL) +{ + combined_fn ifn = as_combined_fn (wide_ifn); + return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out, + orig_code, ifn, shift_p, name, + subtype); +} + + /* Try to detect multiplication on widened inputs, converting MULT_EXPR to WIDEN_MULT_EXPR. See vect_recog_widen_op_pattern for details. */ @@ -1435,26 +1451,30 @@ vect_recog_widen_mult_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info, } /* Try to detect addition on widened inputs, converting PLUS_EXPR - to WIDEN_PLUS_EXPR. See vect_recog_widen_op_pattern for details. */ + to IFN_VEC_WIDEN_PLUS. See vect_recog_widen_op_pattern for details. */ static gimple * vect_recog_widen_plus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info, tree *type_out) { + enum optab_subtype subtype; return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out, - PLUS_EXPR, WIDEN_PLUS_EXPR, false, - "vect_recog_widen_plus_pattern"); + PLUS_EXPR, IFN_VEC_WIDEN_PLUS, + false, "vect_recog_widen_plus_pattern", + &subtype); } /* Try to detect subtraction on widened inputs, converting MINUS_EXPR - to WIDEN_MINUS_EXPR. See vect_recog_widen_op_pattern for details. */ + to IFN_VEC_WIDEN_MINUS. See vect_recog_widen_op_pattern for details. 
*/ static gimple * vect_recog_widen_minus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info, tree *type_out) { + enum optab_subtype subtype; return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out, - MINUS_EXPR, WIDEN_MINUS_EXPR, false, - "vect_recog_widen_minus_pattern"); + MINUS_EXPR, IFN_VEC_WIDEN_MINUS, + false, "vect_recog_widen_minus_pattern", + &subtype); } /* Function vect_recog_popcount_pattern @@ -5618,6 +5638,7 @@ static vect_recog_func vect_vect_recog_func_ptrs[] = { { vect_recog_mask_conversion_pattern, "mask_conversion" }, { vect_recog_widen_plus_pattern, "widen_plus" }, { vect_recog_widen_minus_pattern, "widen_minus" }, + /* These must come after the double widening ones. */ }; const unsigned int NUM_PATTERNS = ARRAY_SIZE (vect_vect_recog_func_ptrs); diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index 61b51a29f99bcdf0ff6b4ead4a69163ebf8ed383..c31831df723eeae8ea4fca2790a18b562106c889 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -4880,7 +4880,7 @@ vectorizable_conversion (vec_info *vinfo, return false; if (gimple_get_lhs (stmt) == NULL_TREE || - TREE_CODE(gimple_get_lhs (stmt)) != SSA_NAME) + TREE_CODE (gimple_get_lhs (stmt)) != SSA_NAME) return false; if (TREE_CODE (gimple_get_lhs (stmt)) != SSA_NAME) @@ -12125,12 +12125,62 @@ supportable_widening_operation (vec_info *vinfo, optab1 = vec_unpacks_sbool_lo_optab; optab2 = vec_unpacks_sbool_hi_optab; } - else - { - optab1 = optab_for_tree_code (c1.as_tree_code (), vectype, optab_default); - optab2 = optab_for_tree_code (c2.as_tree_code (), vectype, optab_default); + + if (code.is_fn_code ()) + { + internal_fn ifn = as_internal_fn (code.as_fn_code ()); + int ecf_flags = internal_fn_flags (ifn); + gcc_assert (ecf_flags & ECF_MULTI); + + switch (code.as_fn_code ()) + { + case CFN_VEC_WIDEN_PLUS: + break; + case CFN_VEC_WIDEN_MINUS: + break; + case CFN_LAST: + default: + return false; + } + + internal_fn lo, hi; + lookup_multi_internal_fn (ifn, &lo, 
&hi); + *code1 = as_combined_fn (lo); + *code2 = as_combined_fn (hi); + optab1 = lookup_multi_ifn_optab (lo, !TYPE_UNSIGNED (vectype)); + optab2 = lookup_multi_ifn_optab (hi, !TYPE_UNSIGNED (vectype)); } + if (code.is_tree_code ()) + { + if (code == FIX_TRUNC_EXPR) + { + /* The signedness is determined from output operand. */ + optab1 = optab_for_tree_code (c1.as_tree_code (), vectype_out, + optab_default); + optab2 = optab_for_tree_code (c2.as_tree_code (), vectype_out, + optab_default); + } + else if (CONVERT_EXPR_CODE_P (code.as_tree_code ()) + && VECTOR_BOOLEAN_TYPE_P (wide_vectype) + && VECTOR_BOOLEAN_TYPE_P (vectype) + && TYPE_MODE (wide_vectype) == TYPE_MODE (vectype) + && SCALAR_INT_MODE_P (TYPE_MODE (vectype))) + { + /* If the input and result modes are the same, a different optab + is needed where we pass in the number of units in vectype. */ + optab1 = vec_unpacks_sbool_lo_optab; + optab2 = vec_unpacks_sbool_hi_optab; + } + else + { + optab1 = optab_for_tree_code (c1.as_tree_code (), vectype, + optab_default); + optab2 = optab_for_tree_code (c2.as_tree_code (), vectype, + optab_default); + } + } + if (!optab1 || !optab2) return false; -- 2.17.1 [-- Attachment #4: 0003-Remove-widen_plus-minus_expr-tree-codes.patch --] [-- Type: application/octet-stream, Size: 19457 bytes --] From 40224aa09ccd4f44aa6bd843f6c7ce0dbb3b6970 Mon Sep 17 00:00:00 2001 From: Joel Hutton <joel.hutton@arm.com> Date: Fri, 28 Jan 2022 12:04:44 +0000 Subject: [PATCH 3/3] Remove widen_plus/minus_expr tree codes This patch removes the old widen plus/minus tree codes which have been replaced by internal functions. gcc/ChangeLog: * doc/generic.texi: Remove old tree codes. * expr.cc (expand_expr_real_2): Remove old tree code cases. * gimple-pretty-print.cc (dump_binary_rhs): Remove old tree code cases. * optabs-tree.cc (optab_for_tree_code): Remove old tree code cases. (supportable_half_widening_operation): Remove old tree code cases. 
* tree-cfg.cc (verify_gimple_assign_binary): Remove old tree code cases. * tree-inline.cc (estimate_operator_cost): Remove old tree code cases. * tree-pretty-print.cc (dump_generic_node): Remove tree code definition. (op_symbol_code): Remove old tree code cases. * tree-vect-data-refs.cc (vect_get_smallest_scalar_type): Remove old tree code cases. (vect_analyze_data_ref_accesses): Remove old tree code cases. * tree-vect-generic.cc (expand_vector_operations_1): Remove old tree code cases. * tree-vect-patterns.cc (vect_widened_op_tree): Refactor to replace usage in vect_recog_sad_pattern. (vect_recog_sad_pattern): Replace tree code widening pattern with internal function. (vect_recog_average_pattern): Replace tree code widening pattern with internal function. * tree-vect-stmts.cc (vectorizable_conversion): Remove old tree code cases. (supportable_widening_operation): Remove old tree code cases. * tree.def (WIDEN_PLUS_EXPR): Remove tree code definition. (WIDEN_MINUS_EXPR): Remove tree code definition. (VEC_WIDEN_PLUS_HI_EXPR): Remove tree code definition. (VEC_WIDEN_PLUS_LO_EXPR): Remove tree code definition. (VEC_WIDEN_MINUS_HI_EXPR): Remove tree code definition. (VEC_WIDEN_MINUS_LO_EXPR): Remove tree code definition.
--- gcc/doc/generic.texi | 31 ------------------------------- gcc/expr.cc | 6 ------ gcc/gimple-pretty-print.cc | 4 ---- gcc/optabs-tree.cc | 24 ------------------------ gcc/tree-cfg.cc | 6 ------ gcc/tree-inline.cc | 6 ------ gcc/tree-pretty-print.cc | 12 ------------ gcc/tree-vect-data-refs.cc | 8 +++----- gcc/tree-vect-generic.cc | 4 ---- gcc/tree-vect-patterns.cc | 36 +++++++++++++++++++++++++----------- gcc/tree-vect-stmts.cc | 18 ++---------------- gcc/tree.def | 6 ------ 12 files changed, 30 insertions(+), 131 deletions(-) diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi index e5f9d1be8ea81f3da002ec3bb925590d331a2551..344045efd419b0cc3a11771acf70d2fd279c48ac 100644 --- a/gcc/doc/generic.texi +++ b/gcc/doc/generic.texi @@ -1811,10 +1811,6 @@ a value from @code{enum annot_expr_kind}, the third is an @code{INTEGER_CST}. @tindex VEC_RSHIFT_EXPR @tindex VEC_WIDEN_MULT_HI_EXPR @tindex VEC_WIDEN_MULT_LO_EXPR -@tindex VEC_WIDEN_PLUS_HI_EXPR -@tindex VEC_WIDEN_PLUS_LO_EXPR -@tindex VEC_WIDEN_MINUS_HI_EXPR -@tindex VEC_WIDEN_MINUS_LO_EXPR @tindex VEC_UNPACK_HI_EXPR @tindex VEC_UNPACK_LO_EXPR @tindex VEC_UNPACK_FLOAT_HI_EXPR @@ -1861,33 +1857,6 @@ vector of @code{N/2} products. In the case of @code{VEC_WIDEN_MULT_LO_EXPR} the low @code{N/2} elements of the two vector are multiplied to produce the vector of @code{N/2} products. -@item VEC_WIDEN_PLUS_HI_EXPR -@itemx VEC_WIDEN_PLUS_LO_EXPR -These nodes represent widening vector addition of the high and low parts of -the two input vectors, respectively. Their operands are vectors that contain -the same number of elements (@code{N}) of the same integral type. The result -is a vector that contains half as many elements, of an integral type whose size -is twice as wide. In the case of @code{VEC_WIDEN_PLUS_HI_EXPR} the high -@code{N/2} elements of the two vectors are added to produce the vector of -@code{N/2} products. 
In the case of @code{VEC_WIDEN_PLUS_LO_EXPR} the low -@code{N/2} elements of the two vectors are added to produce the vector of -@code{N/2} products. - -@item VEC_WIDEN_MINUS_HI_EXPR -@itemx VEC_WIDEN_MINUS_LO_EXPR -These nodes represent widening vector subtraction of the high and low parts of -the two input vectors, respectively. Their operands are vectors that contain -the same number of elements (@code{N}) of the same integral type. The high/low -elements of the second vector are subtracted from the high/low elements of the -first. The result is a vector that contains half as many elements, of an -integral type whose size is twice as wide. In the case of -@code{VEC_WIDEN_MINUS_HI_EXPR} the high @code{N/2} elements of the second -vector are subtracted from the high @code{N/2} of the first to produce the -vector of @code{N/2} products. In the case of -@code{VEC_WIDEN_MINUS_LO_EXPR} the low @code{N/2} elements of the second -vector are subtracted from the low @code{N/2} of the first to produce the -vector of @code{N/2} products. - @item VEC_UNPACK_HI_EXPR @itemx VEC_UNPACK_LO_EXPR These nodes represent unpacking of the high and low parts of the input vector, diff --git a/gcc/expr.cc b/gcc/expr.cc index 7197996cec7d24dd43d60928d5618b32b77677a1..1f941efc9e995c7f6a35ff93aaa6bd3c35faaa1f 100644 --- a/gcc/expr.cc +++ b/gcc/expr.cc @@ -9337,8 +9337,6 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode, target, unsignedp); return target; - case WIDEN_PLUS_EXPR: - case WIDEN_MINUS_EXPR: case WIDEN_MULT_EXPR: /* If first operand is constant, swap them. 
Thus the following special case checks need only @@ -10116,10 +10114,6 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode, return temp; } - case VEC_WIDEN_PLUS_HI_EXPR: - case VEC_WIDEN_PLUS_LO_EXPR: - case VEC_WIDEN_MINUS_HI_EXPR: - case VEC_WIDEN_MINUS_LO_EXPR: case VEC_WIDEN_MULT_HI_EXPR: case VEC_WIDEN_MULT_LO_EXPR: case VEC_WIDEN_MULT_EVEN_EXPR: diff --git a/gcc/gimple-pretty-print.cc b/gcc/gimple-pretty-print.cc index ebd87b20a0adc080c4a8f9429e75f49b96e72f9a..2a1a5b7f811ca341e8ee7e85a9701d3a37ff80bf 100644 --- a/gcc/gimple-pretty-print.cc +++ b/gcc/gimple-pretty-print.cc @@ -459,10 +459,6 @@ dump_binary_rhs (pretty_printer *buffer, const gassign *gs, int spc, case VEC_PACK_FLOAT_EXPR: case VEC_WIDEN_LSHIFT_HI_EXPR: case VEC_WIDEN_LSHIFT_LO_EXPR: - case VEC_WIDEN_PLUS_HI_EXPR: - case VEC_WIDEN_PLUS_LO_EXPR: - case VEC_WIDEN_MINUS_HI_EXPR: - case VEC_WIDEN_MINUS_LO_EXPR: case VEC_SERIES_EXPR: for (p = get_tree_code_name (code); *p; p++) pp_character (buffer, TOUPPER (*p)); diff --git a/gcc/optabs-tree.cc b/gcc/optabs-tree.cc index 8383fe820b080f6d66f83dcf3b77d3c9f869f4bc..2f5f93dc6624f86f6b5618cf6e7aa2b508053a64 100644 --- a/gcc/optabs-tree.cc +++ b/gcc/optabs-tree.cc @@ -190,22 +190,6 @@ optab_for_tree_code (enum tree_code code, const_tree type, return (TYPE_UNSIGNED (type) ? vec_widen_ushiftl_lo_optab : vec_widen_sshiftl_lo_optab); - case VEC_WIDEN_PLUS_LO_EXPR: - return (TYPE_UNSIGNED (type) - ? vec_widen_uaddl_lo_optab : vec_widen_saddl_lo_optab); - - case VEC_WIDEN_PLUS_HI_EXPR: - return (TYPE_UNSIGNED (type) - ? vec_widen_uaddl_hi_optab : vec_widen_saddl_hi_optab); - - case VEC_WIDEN_MINUS_LO_EXPR: - return (TYPE_UNSIGNED (type) - ? vec_widen_usubl_lo_optab : vec_widen_ssubl_lo_optab); - - case VEC_WIDEN_MINUS_HI_EXPR: - return (TYPE_UNSIGNED (type) - ? vec_widen_usubl_hi_optab : vec_widen_ssubl_hi_optab); - case VEC_UNPACK_HI_EXPR: return (TYPE_UNSIGNED (type) ? 
vec_unpacku_hi_optab : vec_unpacks_hi_optab); @@ -312,8 +296,6 @@ optab_for_tree_code (enum tree_code code, const_tree type, 'hi'/'lo' pair using codes such as VEC_WIDEN_MINUS_HI/LO. Supported widening operations: - WIDEN_MINUS_EXPR - WIDEN_PLUS_EXPR WIDEN_MULT_EXPR WIDEN_LSHIFT_EXPR @@ -345,12 +327,6 @@ supportable_half_widening_operation (enum tree_code code, tree vectype_out, case WIDEN_LSHIFT_EXPR: *code1 = LSHIFT_EXPR; break; - case WIDEN_MINUS_EXPR: - *code1 = MINUS_EXPR; - break; - case WIDEN_PLUS_EXPR: - *code1 = PLUS_EXPR; - break; case WIDEN_MULT_EXPR: *code1 = MULT_EXPR; break; diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc index 8de1b144a426776bf464765477c71ee8f2e52b81..46eed1e1f22052fc077f2fc25e5be627bce541b6 100644 --- a/gcc/tree-cfg.cc +++ b/gcc/tree-cfg.cc @@ -3948,8 +3948,6 @@ verify_gimple_assign_binary (gassign *stmt) return false; } - case WIDEN_PLUS_EXPR: - case WIDEN_MINUS_EXPR: case PLUS_EXPR: case MINUS_EXPR: { @@ -4070,10 +4068,6 @@ verify_gimple_assign_binary (gassign *stmt) return false; } - case VEC_WIDEN_MINUS_HI_EXPR: - case VEC_WIDEN_MINUS_LO_EXPR: - case VEC_WIDEN_PLUS_HI_EXPR: - case VEC_WIDEN_PLUS_LO_EXPR: case VEC_WIDEN_MULT_HI_EXPR: case VEC_WIDEN_MULT_LO_EXPR: case VEC_WIDEN_MULT_EVEN_EXPR: diff --git a/gcc/tree-inline.cc b/gcc/tree-inline.cc index 043e1d5987a4c4b0159109dafb85a805ca828c1e..c0bebb7f4de36838341ed62389ad0e2b79f03034 100644 --- a/gcc/tree-inline.cc +++ b/gcc/tree-inline.cc @@ -4288,8 +4288,6 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights, case REALIGN_LOAD_EXPR: - case WIDEN_PLUS_EXPR: - case WIDEN_MINUS_EXPR: case WIDEN_SUM_EXPR: case WIDEN_MULT_EXPR: case DOT_PROD_EXPR: @@ -4298,10 +4296,6 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights, case WIDEN_MULT_MINUS_EXPR: case WIDEN_LSHIFT_EXPR: - case VEC_WIDEN_PLUS_HI_EXPR: - case VEC_WIDEN_PLUS_LO_EXPR: - case VEC_WIDEN_MINUS_HI_EXPR: - case VEC_WIDEN_MINUS_LO_EXPR: case VEC_WIDEN_MULT_HI_EXPR: case VEC_WIDEN_MULT_LO_EXPR: 
case VEC_WIDEN_MULT_EVEN_EXPR: diff --git a/gcc/tree-pretty-print.cc b/gcc/tree-pretty-print.cc index 6acd394a0790ad2ad989f195a3288f0f0a8cc489..53ca62dc1a6873ae9365f199061bde9edd486196 100644 --- a/gcc/tree-pretty-print.cc +++ b/gcc/tree-pretty-print.cc @@ -2825,8 +2825,6 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags, break; /* Binary arithmetic and logic expressions. */ - case WIDEN_PLUS_EXPR: - case WIDEN_MINUS_EXPR: case WIDEN_SUM_EXPR: case WIDEN_MULT_EXPR: case MULT_EXPR: @@ -3790,10 +3788,6 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags, case VEC_SERIES_EXPR: case VEC_WIDEN_MULT_HI_EXPR: case VEC_WIDEN_MULT_LO_EXPR: - case VEC_WIDEN_PLUS_HI_EXPR: - case VEC_WIDEN_PLUS_LO_EXPR: - case VEC_WIDEN_MINUS_HI_EXPR: - case VEC_WIDEN_MINUS_LO_EXPR: case VEC_WIDEN_MULT_EVEN_EXPR: case VEC_WIDEN_MULT_ODD_EXPR: case VEC_WIDEN_LSHIFT_HI_EXPR: @@ -4311,12 +4305,6 @@ op_symbol_code (enum tree_code code) case WIDEN_LSHIFT_EXPR: return "w<<"; - case WIDEN_PLUS_EXPR: - return "w+"; - - case WIDEN_MINUS_EXPR: - return "w-"; - case POINTER_PLUS_EXPR: return "+"; diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc index d20a10a1524164eef788ab4b88ba57c7a09c3387..98dd56ff022233ccead36a1f5a5e896e352f9f5b 100644 --- a/gcc/tree-vect-data-refs.cc +++ b/gcc/tree-vect-data-refs.cc @@ -136,8 +136,6 @@ vect_get_smallest_scalar_type (stmt_vec_info stmt_info, tree scalar_type) || gimple_assign_rhs_code (assign) == WIDEN_SUM_EXPR || gimple_assign_rhs_code (assign) == WIDEN_MULT_EXPR || gimple_assign_rhs_code (assign) == WIDEN_LSHIFT_EXPR - || gimple_assign_rhs_code (assign) == WIDEN_PLUS_EXPR - || gimple_assign_rhs_code (assign) == WIDEN_MINUS_EXPR || gimple_assign_rhs_code (assign) == FLOAT_EXPR) { tree rhs_type = TREE_TYPE (gimple_assign_rhs1 (assign)); @@ -3172,8 +3170,8 @@ vect_analyze_data_ref_accesses (vec_info *vinfo, break; /* Check that the DR_INITs are compile-time constants. 
*/ - if (!tree_fits_shwi_p (DR_INIT (dra)) - || !tree_fits_shwi_p (DR_INIT (drb))) + if (TREE_CODE (DR_INIT (dra)) != INTEGER_CST + || TREE_CODE (DR_INIT (drb)) != INTEGER_CST) break; /* Different .GOMP_SIMD_LANE calls still give the same lane, @@ -3225,7 +3223,7 @@ vect_analyze_data_ref_accesses (vec_info *vinfo, unsigned HOST_WIDE_INT step = absu_hwi (tree_to_shwi (DR_STEP (dra))); if (step != 0 - && step <= ((unsigned HOST_WIDE_INT)init_b - init_a)) + && step <= (unsigned HOST_WIDE_INT)(init_b - init_a)) break; } } diff --git a/gcc/tree-vect-generic.cc b/gcc/tree-vect-generic.cc index 92aba5d4af61dd478ec3f1b94854e4ad84166774..5823b08baf70b89b22ecc148b0702a84671ad084 100644 --- a/gcc/tree-vect-generic.cc +++ b/gcc/tree-vect-generic.cc @@ -2209,10 +2209,6 @@ expand_vector_operations_1 (gimple_stmt_iterator *gsi, arguments, not the widened result. VEC_UNPACK_FLOAT_*_EXPR is calculated in the same way above. */ if (code == WIDEN_SUM_EXPR - || code == VEC_WIDEN_PLUS_HI_EXPR - || code == VEC_WIDEN_PLUS_LO_EXPR - || code == VEC_WIDEN_MINUS_HI_EXPR - || code == VEC_WIDEN_MINUS_LO_EXPR || code == VEC_WIDEN_MULT_HI_EXPR || code == VEC_WIDEN_MULT_LO_EXPR || code == VEC_WIDEN_MULT_EVEN_EXPR diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc index 0fd587da12b4a17b238327ae60f5a2a7a0efc514..0d821413b971d983c6562c3e8fbe60e2c3d0cb94 100644 --- a/gcc/tree-vect-patterns.cc +++ b/gcc/tree-vect-patterns.cc @@ -557,21 +557,29 @@ vect_joust_widened_type (tree type, tree new_type, tree *common_type) static unsigned int vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code, - tree_code widened_code, bool shift_p, + code_helper widened_code, bool shift_p, unsigned int max_nops, vect_unpromoted_value *unprom, tree *common_type, enum optab_subtype *subtype = NULL) { /* Check for an integer operation with the right code. 
*/ - gassign *assign = dyn_cast <gassign *> (stmt_info->stmt); - if (!assign) + gimple* stmt = stmt_info->stmt; + if (!(is_gimple_assign (stmt) || is_gimple_call (stmt))) return 0; - tree_code rhs_code = gimple_assign_rhs_code (assign); - if (rhs_code != code && rhs_code != widened_code) + code_helper rhs_code; + if (is_gimple_assign (stmt)) + rhs_code = gimple_assign_rhs_code (stmt); + else + rhs_code = gimple_call_combined_fn (stmt); + + if (rhs_code.as_tree_code () != code + && rhs_code.get_rep () != widened_code.get_rep ()) return 0; - tree type = TREE_TYPE (gimple_assign_lhs (assign)); + tree lhs = is_gimple_assign (stmt) ? gimple_assign_lhs (stmt): + gimple_call_lhs (stmt); + tree type = TREE_TYPE (lhs); if (!INTEGRAL_TYPE_P (type)) return 0; @@ -584,7 +592,11 @@ vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code, { vect_unpromoted_value *this_unprom = &unprom[next_op]; unsigned int nops = 1; - tree op = gimple_op (assign, i + 1); + tree op; + if (is_gimple_assign (stmt)) + op = gimple_op (stmt, i + 1); + else + op = gimple_call_arg (stmt, i); if (i == 1 && TREE_CODE (op) == INTEGER_CST) { /* We already have a common type from earlier operands. @@ -1297,8 +1309,9 @@ vect_recog_sad_pattern (vec_info *vinfo, /* FORNOW. Can continue analyzing the def-use chain when this stmt in a phi inside the loop (in case we are analyzing an outer-loop). 
*/ vect_unpromoted_value unprom[2]; - if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR, WIDEN_MINUS_EXPR, - false, 2, unprom, &half_type)) + if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR, + CFN_VEC_WIDEN_MINUS, false, 2, unprom, + &half_type)) return NULL; vect_pattern_detected ("vect_recog_sad_pattern", last_stmt); @@ -2335,9 +2348,10 @@ vect_recog_average_pattern (vec_info *vinfo, internal_fn ifn = IFN_AVG_FLOOR; vect_unpromoted_value unprom[3]; tree new_type; + enum optab_subtype subtype; unsigned int nops = vect_widened_op_tree (vinfo, plus_stmt_info, PLUS_EXPR, - WIDEN_PLUS_EXPR, false, 3, - unprom, &new_type); + CFN_VEC_WIDEN_PLUS, false, 3, + unprom, &new_type, &subtype); if (nops == 0) return NULL; if (nops == 3) diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index c31831df723eeae8ea4fca2790a18b562106c889..9adbb9fbf116ef316d5bed2c84a7074722055717 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -4895,9 +4895,7 @@ vectorizable_conversion (vec_info *vinfo, if (is_gimple_assign (stmt)) { - widen_arith = (code == WIDEN_PLUS_EXPR - || code == WIDEN_MINUS_EXPR - || code == WIDEN_MULT_EXPR + widen_arith = (code == WIDEN_MULT_EXPR || code == WIDEN_LSHIFT_EXPR); } else @@ -4950,8 +4948,6 @@ vectorizable_conversion (vec_info *vinfo, { gcc_assert (code == WIDEN_MULT_EXPR || code == WIDEN_LSHIFT_EXPR - || code == WIDEN_PLUS_EXPR - || code == WIDEN_MINUS_EXPR || widen_arith); @@ -11976,7 +11972,7 @@ supportable_widening_operation (vec_info *vinfo, class loop *vect_loop = NULL; machine_mode vec_mode; enum insn_code icode1, icode2; - optab optab1, optab2; + optab optab1 = unknown_optab, optab2 = unknown_optab; tree vectype = vectype_in; tree wide_vectype = vectype_out; code_helper c1=MAX_TREE_CODES, c2=MAX_TREE_CODES; @@ -12070,16 +12066,6 @@ supportable_widening_operation (vec_info *vinfo, c2 = VEC_WIDEN_LSHIFT_HI_EXPR; break; - case WIDEN_PLUS_EXPR: - c1 = VEC_WIDEN_PLUS_LO_EXPR; - c2 = 
VEC_WIDEN_PLUS_HI_EXPR; - break; - - case WIDEN_MINUS_EXPR: - c1 = VEC_WIDEN_MINUS_LO_EXPR; - c2 = VEC_WIDEN_MINUS_HI_EXPR; - break; - CASE_CONVERT: c1 = VEC_UNPACK_LO_EXPR; c2 = VEC_UNPACK_HI_EXPR; diff --git a/gcc/tree.def b/gcc/tree.def index 62650b6934b337c5d56e5393dc114173d72c9aa9..9b2dce3576440c445d3240b9ed937fe67c9a5992 100644 --- a/gcc/tree.def +++ b/gcc/tree.def @@ -1383,8 +1383,6 @@ DEFTREECODE (WIDEN_MULT_MINUS_EXPR, "widen_mult_minus_expr", tcc_expression, 3) the first argument from type t1 to type t2, and then shifting it by the second argument. */ DEFTREECODE (WIDEN_LSHIFT_EXPR, "widen_lshift_expr", tcc_binary, 2) -DEFTREECODE (WIDEN_PLUS_EXPR, "widen_plus_expr", tcc_binary, 2) -DEFTREECODE (WIDEN_MINUS_EXPR, "widen_minus_expr", tcc_binary, 2) /* Widening vector multiplication. The two operands are vectors with N elements of size S. Multiplying the @@ -1449,10 +1447,6 @@ DEFTREECODE (VEC_PACK_FLOAT_EXPR, "vec_pack_float_expr", tcc_binary, 2) */ DEFTREECODE (VEC_WIDEN_LSHIFT_HI_EXPR, "widen_lshift_hi_expr", tcc_binary, 2) DEFTREECODE (VEC_WIDEN_LSHIFT_LO_EXPR, "widen_lshift_lo_expr", tcc_binary, 2) -DEFTREECODE (VEC_WIDEN_PLUS_HI_EXPR, "widen_plus_hi_expr", tcc_binary, 2) -DEFTREECODE (VEC_WIDEN_PLUS_LO_EXPR, "widen_plus_lo_expr", tcc_binary, 2) -DEFTREECODE (VEC_WIDEN_MINUS_HI_EXPR, "widen_minus_hi_expr", tcc_binary, 2) -DEFTREECODE (VEC_WIDEN_MINUS_LO_EXPR, "widen_minus_lo_expr", tcc_binary, 2) /* PREDICT_EXPR. Specify hint for branch prediction. The PREDICT_EXPR_PREDICTOR specify predictor and PREDICT_EXPR_OUTCOME the -- 2.17.1 ^ permalink raw reply [flat|nested] 53+ messages in thread
* RE: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns 2022-05-31 10:07 ` Joel Hutton @ 2022-05-31 16:46 ` Tamar Christina 2022-06-01 10:11 ` Richard Biener 1 sibling, 0 replies; 53+ messages in thread From: Tamar Christina @ 2022-05-31 16:46 UTC (permalink / raw) To: Joel Hutton, Richard Biener; +Cc: Richard Sandiford, gcc-patches > Just checking there is still interest in this Definitely, I am waiting for this to be able to send a new patch upstream 😊 Cheers, Tamar. > -----Original Message----- > From: Gcc-patches <gcc-patches- > bounces+tamar.christina=arm.com@gcc.gnu.org> On Behalf Of Joel Hutton > via Gcc-patches > Sent: Tuesday, May 31, 2022 11:08 AM > To: Richard Biener <rguenther@suse.de> > Cc: Richard Sandiford <Richard.Sandiford@arm.com>; gcc- > patches@gcc.gnu.org > Subject: RE: [ping][vect-patterns] Refactor widen_plus/widen_minus as > internal_fns > > > Can you post an updated patch (after the .cc renaming, and code_helper > > now already moved to tree.h). > > > > Thanks, > > Richard. > > Patches attached. They already incorporated the .cc rename, now rebased to > be after the change to tree.h > > Joel ^ permalink raw reply [flat|nested] 53+ messages in thread
* RE: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns 2022-05-31 10:07 ` Joel Hutton 2022-05-31 16:46 ` Tamar Christina @ 2022-06-01 10:11 ` Richard Biener 2022-06-06 17:20 ` Joel Hutton 1 sibling, 1 reply; 53+ messages in thread From: Richard Biener @ 2022-06-01 10:11 UTC (permalink / raw) To: Joel Hutton; +Cc: Richard Sandiford, gcc-patches On Tue, 31 May 2022, Joel Hutton wrote: > > Can you post an updated patch (after the .cc renaming, and code_helper > > now already moved to tree.h). > > > > Thanks, > > Richard. > > Patches attached. They already incorporated the .cc rename, now rebased to be after the change to tree.h @@ -1412,8 +1412,7 @@ vect_recog_widen_op_pattern (vec_info *vinfo, 2, oprnd, half_type, unprom, vectype); tree var = vect_recog_temp_ssa_var (itype, NULL); - gimple *pattern_stmt = gimple_build_assign (var, wide_code, - oprnd[0], oprnd[1]); + gimple *pattern_stmt = gimple_build (var, wide_code, oprnd[0], oprnd[1]); you should be able to do without the new gimple_build overload by using gimple_seq stmts = NULL; gimple_build (&stmts, wide_code, itype, oprnd[0], oprnd[1]); gimple *pattern_stmt = gimple_seq_last_stmt (stmts); because 'gimple_build' is an existing API. - if (TREE_CODE (gimple_assign_lhs (stmt)) != SSA_NAME) + if (gimple_get_lhs (stmt) == NULL_TREE || + TREE_CODE(gimple_get_lhs (stmt)) != SSA_NAME) return false; || go to the next line, space after TREE_CODE + bool widen_arith = false; + gimple_match_op res_op; + if (!gimple_extract_op (stmt, &res_op)) + return false; + code = res_op.code; + op_type = res_op.num_ops; + + if (is_gimple_assign (stmt)) + { + widen_arith = (code == WIDEN_PLUS_EXPR + || code == WIDEN_MINUS_EXPR + || code == WIDEN_MULT_EXPR + || code == WIDEN_LSHIFT_EXPR); + } + else + widen_arith = gimple_call_flags (stmt) & ECF_WIDEN; there seem to be formatting issues. Also shouldn't you check if (res_op.code.is_tree_code ()) instead if is_gimple_assign? 
I also don't like the ECF_WIDEN "trick", just do as with the tree codes and explicitely enumerate widening ifns here. gimple_extract_op is a bit heavy-weight as well, so maybe instead simply do if (is_gimple_assign (stmt)) { code = gimple_assign_rhs_code (stmt); ... } else if (gimple_call_internal_p (stmt)) { code = gimple_call_internal_fn (stmt); ... } else return false; + code_helper c1=MAX_TREE_CODES, c2=MAX_TREE_CODES; spaces before/after '=' @@ -12061,13 +12105,16 @@ supportable_widening_operation (vec_info *vinfo, if (BYTES_BIG_ENDIAN && c1 != VEC_WIDEN_MULT_EVEN_EXPR) std::swap (c1, c2); + if (code == FIX_TRUNC_EXPR) { unnecessary whitespace change. diff --git a/gcc/tree.h b/gcc/tree.h index f84958933d51144bb6ce7cc41eca5f7f06814550..e51e34c051d9b91d1c02a4b2fefdb2b15606a36f 100644 --- a/gcc/tree.h +++ b/gcc/tree.h @@ -92,6 +92,10 @@ public: bool is_fn_code () const { return rep < 0; } bool is_internal_fn () const; bool is_builtin_fn () const; + enum tree_code as_tree_code () const { return is_tree_code () ? + (tree_code)* this : MAX_TREE_CODES; } + combined_fn as_fn_code () const { return is_fn_code () ? (combined_fn) *this + : CFN_LAST;} hmm, the other as_* functions we have are not member functions. Also this semantically differs from the tree_code () conversion operator (that one was supposed to be "cheap"). The existing as_internal_fn for example is documented as being valid only if the code is actually an internal fn. I see you are introducing the new function as convenience to get a "safe" not-a-X value, so maybe they should be called safe_as_tree_code () instead? int get_rep () const { return rep; } bool operator== (const code_helper &other) { return rep == other.rep; } bool operator!= (const code_helper &other) { return rep != other.rep; } @@ -6657,6 +6661,54 @@ extern unsigned fndecl_dealloc_argno (tree); if nonnull, set the second argument to the referenced enclosing object or pointer. Otherwise return null. 
*/ extern tree get_attr_nonstring_decl (tree, tree * = NULL); +/* Helper to transparently allow tree codes and builtin function codes + exist in one storage entity. */ +class code_helper +{ duplicate add of code_helper. Sorry to raise these issues so late. Richard. ^ permalink raw reply [flat|nested] 53+ messages in thread
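[Editorial sketch] The storage scheme under review — one int holding either a tree code (non-negative) or a function code (negative), plus the "safe" accessors Richard suggests — can be illustrated standalone. This is a sketch only: the enum values and sentinels below are stand-ins, and GCC's real code_helper in tree.h differs in detail:

```cpp
// Stand-in enums; in GCC these come from tree.def and combined_fn.
enum tree_code { PLUS_EXPR, MINUS_EXPR, MAX_TREE_CODES };
enum combined_fn { CFN_VEC_WIDEN_PLUS, CFN_VEC_WIDEN_MINUS, CFN_LAST };

class code_helper
{
  int rep;	// >= 0: a tree_code; < 0: an encoded combined_fn
public:
  code_helper (tree_code c) : rep (int (c)) {}
  code_helper (combined_fn f) : rep (-int (f) - 1) {}	// i.e. ~f
  bool is_tree_code () const { return rep >= 0; }
  bool is_fn_code () const { return rep < 0; }
  // "Safe" accessors: return a sentinel rather than misinterpreting the
  // representation when the stored kind does not match.
  tree_code safe_as_tree_code () const
  { return is_tree_code () ? tree_code (rep) : MAX_TREE_CODES; }
  combined_fn safe_as_fn_code () const
  { return is_fn_code () ? combined_fn (-rep - 1) : CFN_LAST; }
};
```

Callers can then compare against MAX_TREE_CODES / CFN_LAST instead of first branching on the kind, which is the convenience the review is weighing against the existing "cheap" conversion operators.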
* RE: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns 2022-06-01 10:11 ` Richard Biener @ 2022-06-06 17:20 ` Joel Hutton 2022-06-07 8:18 ` Richard Sandiford 0 siblings, 1 reply; 53+ messages in thread From: Joel Hutton @ 2022-06-06 17:20 UTC (permalink / raw) To: Richard Biener; +Cc: Richard Sandiford, gcc-patches [-- Attachment #1: Type: text/plain, Size: 5524 bytes --] > > Patches attached. They already incorporated the .cc rename, now > > rebased to be after the change to tree.h > > @@ -1412,8 +1412,7 @@ vect_recog_widen_op_pattern (vec_info *vinfo, > 2, oprnd, half_type, unprom, vectype); > > tree var = vect_recog_temp_ssa_var (itype, NULL); > - gimple *pattern_stmt = gimple_build_assign (var, wide_code, > - oprnd[0], oprnd[1]); > + gimple *pattern_stmt = gimple_build (var, wide_code, oprnd[0], > oprnd[1]); > > > you should be able to do without the new gimple_build overload > by using > > gimple_seq stmts = NULL; > gimple_build (&stmts, wide_code, itype, oprnd[0], oprnd[1]); > gimple *pattern_stmt = gimple_seq_last_stmt (stmts); > > because 'gimple_build' is an existing API. Done. The gimple_build overload was at the request of Richard Sandiford, I assume removing it is ok with you Richard S? From Richard Sandiford: > For example, I think we should hide this inside a new: > > gimple_build (var, wide_code, oprnd[0], oprnd[1]); > > that works directly on code_helper, similarly to the new code_helper > gimple_build interfaces. > > > - if (TREE_CODE (gimple_assign_lhs (stmt)) != SSA_NAME) > + if (gimple_get_lhs (stmt) == NULL_TREE || > + TREE_CODE(gimple_get_lhs (stmt)) != SSA_NAME) > return false; > > || go to the next line, space after TREE_CODE > Done. 
> + bool widen_arith = false; > + gimple_match_op res_op; > + if (!gimple_extract_op (stmt, &res_op)) > + return false; > + code = res_op.code; > + op_type = res_op.num_ops; > + > + if (is_gimple_assign (stmt)) > + { > + widen_arith = (code == WIDEN_PLUS_EXPR > + || code == WIDEN_MINUS_EXPR > + || code == WIDEN_MULT_EXPR > + || code == WIDEN_LSHIFT_EXPR); > + } > + else > + widen_arith = gimple_call_flags (stmt) & ECF_WIDEN; > > there seem to be formatting issues. Also shouldn't you check > if (res_op.code.is_tree_code ()) instead if is_gimple_assign? > I also don't like the ECF_WIDEN "trick", just do as with the > tree codes and explicitely enumerate widening ifns here. > Done. I've set widen_arith to False for the first patch as the second patch introduces the widening ifns. > gimple_extract_op is a bit heavy-weight as well, so maybe > instead simply do > > if (is_gimple_assign (stmt)) > { > code = gimple_assign_rhs_code (stmt); > ... > } > else if (gimple_call_internal_p (stmt)) > { > code = gimple_call_internal_fn (stmt); > ... > } > else > return false; The patch was originally written as above, it was changed to use gimple_extract_op at the request of Richard Sandiford. I prefer gimple_extract_op as it's more compact, but I don't have strong feelings. If the Richards can agree on either version I'm happy. From Richard Sandiford: > > + if (is_gimple_assign (stmt)) > > + { > > + code_or_ifn = gimple_assign_rhs_code (stmt); } else > > + code_or_ifn = gimple_call_combined_fn (stmt); > > It might be possible to use gimple_extract_op here (only recently added). > This would also provide the number of operands directly, instead of > needing "op_type". It would also provide an array of operands. > > + code_helper c1=MAX_TREE_CODES, c2=MAX_TREE_CODES; > > spaces before/after '=' > Done. 
> @@ -12061,13 +12105,16 @@ supportable_widening_operation (vec_info > *vinfo, > if (BYTES_BIG_ENDIAN && c1 != VEC_WIDEN_MULT_EVEN_EXPR) > std::swap (c1, c2); > > + > if (code == FIX_TRUNC_EXPR) > { > > unnecessary whitespace change. > Fixed. > diff --git a/gcc/tree.h b/gcc/tree.h > index > f84958933d51144bb6ce7cc41eca5f7f06814550..e51e34c051d9b91d1c02a4b2 > fefdb2b15606a36f > 100644 > --- a/gcc/tree.h > +++ b/gcc/tree.h > @@ -92,6 +92,10 @@ public: > bool is_fn_code () const { return rep < 0; } > bool is_internal_fn () const; > bool is_builtin_fn () const; > + enum tree_code as_tree_code () const { return is_tree_code () ? > + (tree_code)* this : MAX_TREE_CODES; } > + combined_fn as_fn_code () const { return is_fn_code () ? (combined_fn) > *this > + : CFN_LAST;} > > hmm, the other as_* functions we have are not member functions. > Also this semantically differs from the tree_code () conversion > operator (that one was supposed to be "cheap"). The existing > as_internal_fn for example is documented as being valid only if > the code is actually an internal fn. I see you are introducing > the new function as convenience to get a "safe" not-a-X value, > so maybe they should be called safe_as_tree_code () instead? > SGTM. Done > > int get_rep () const { return rep; } > bool operator== (const code_helper &other) { return rep == other.rep; } > bool operator!= (const code_helper &other) { return rep != other.rep; } > @@ -6657,6 +6661,54 @@ extern unsigned fndecl_dealloc_argno (tree); > if nonnull, set the second argument to the referenced enclosing > object or pointer. Otherwise return null. */ > extern tree get_attr_nonstring_decl (tree, tree * = NULL); > +/* Helper to transparently allow tree codes and builtin function codes > + exist in one storage entity. */ > +class code_helper > +{ > > duplicate add of code_helper. Fixed. Tests are being re-run. Ok, with changes? 
[-- Attachment #2: 0001-Refactor-to-allow-internal_fn-s.patch --] [-- Type: application/octet-stream, Size: 23246 bytes --] From 58d1f19224bd6501b5238916871cf2c0f3ba8bd0 Mon Sep 17 00:00:00 2001 From: Joel Hutton <joel.hutton@arm.com> Date: Wed, 25 Aug 2021 14:31:15 +0100 Subject: [PATCH 1/3] Refactor to allow internal_fn's Hi all, This refactor allows widening patterns (such as widen_plus/widen_minus) to be represented as either internal_fns or tree_codes. [vect-patterns] Refactor as internal_fn's Refactor vect-patterns to allow patterns to be internal_fns starting with widening_plus/minus patterns gcc/ChangeLog: * gimple-match.h (class code_helper): Add safe_as_internal_fn, safe_as_tree_code helper functions. * tree-core.h (ECF_WIDEN): Flag to mark internal_fn as widening. * tree-vect-patterns.cc (vect_recog_widen_op_pattern): Refactor to use code_helper. * tree-vect-stmts.cc (vect_gen_widened_results_half): Refactor to use code_helper. (vect_create_vectorized_promotion_stmts): Refactor to use code_helper. (vectorizable_conversion): Refactor to use code_helper. gimple_call or gimple_assign. (supportable_widening_operation): Refactor to use code_helper. (supportable_narrowing_operation): Refactor to use code_helper. * tree-vectorizer.h (supportable_widening_operation): Change prototype to use code_helper. (supportable_narrowing_operation): change prototype to use code_helper. --- gcc/tree-core.h | 3 + gcc/tree-vect-patterns.cc | 11 +- gcc/tree-vect-stmts.cc | 217 +++++++++++++++++++++++--------------- gcc/tree-vectorizer.h | 11 +- gcc/tree.h | 4 + 5 files changed, 153 insertions(+), 93 deletions(-) diff --git a/gcc/tree-core.h b/gcc/tree-core.h index ab5fa01e5cb5fb56c1964b93b014ed55a4aa704a..cff6211080bced0bffb39e98039a6550897acf77 100644 --- a/gcc/tree-core.h +++ b/gcc/tree-core.h @@ -96,6 +96,9 @@ struct die_struct; /* Nonzero if this is a cold function. */ #define ECF_COLD (1 << 15) +/* Nonzero if this is a widening function. 
*/ +#define ECF_WIDEN (1 << 16) + /* Call argument flags. */ /* Nonzero if the argument is not used by the function. */ diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc index 0fad4dbd0945c6c176f3457b751e812f17fcd148..c011b8ede3c266b59f731e316efbec7d98e91068 100644 --- a/gcc/tree-vect-patterns.cc +++ b/gcc/tree-vect-patterns.cc @@ -25,6 +25,8 @@ along with GCC; see the file COPYING3. If not see #include "rtl.h" #include "tree.h" #include "gimple.h" +#include "gimple-iterator.h" +#include "gimple-fold.h" #include "ssa.h" #include "expmed.h" #include "optabs-tree.h" @@ -1348,7 +1350,7 @@ vect_recog_sad_pattern (vec_info *vinfo, static gimple * vect_recog_widen_op_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info, tree *type_out, - tree_code orig_code, tree_code wide_code, + tree_code orig_code, code_helper wide_code, bool shift_p, const char *name) { gimple *last_stmt = last_stmt_info->stmt; @@ -1391,7 +1393,7 @@ vect_recog_widen_op_pattern (vec_info *vinfo, vecctype = get_vectype_for_scalar_type (vinfo, ctype); } - enum tree_code dummy_code; + code_helper dummy_code; int dummy_int; auto_vec<tree> dummy_vec; if (!vectype @@ -1412,8 +1414,9 @@ vect_recog_widen_op_pattern (vec_info *vinfo, 2, oprnd, half_type, unprom, vectype); tree var = vect_recog_temp_ssa_var (itype, NULL); - gimple *pattern_stmt = gimple_build_assign (var, wide_code, - oprnd[0], oprnd[1]); + gimple_seq stmts = NULL; + gimple_build (&stmts, wide_code, itype, oprnd[0], oprnd[1]); + gimple *pattern_stmt = gimple_seq_last_stmt (stmts); if (vecctype != vecitype) pattern_stmt = vect_convert_output (vinfo, last_stmt_info, ctype, diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index 346d8ce280437e00bfeb19a4b4adc59eb96207f9..9b31425352689d409b8c0aa0c1d5c69e72db869a 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -4636,7 +4636,7 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, STMT_INFO is the original scalar stmt that we are 
vectorizing. */ static gimple * -vect_gen_widened_results_half (vec_info *vinfo, enum tree_code code, +vect_gen_widened_results_half (vec_info *vinfo, code_helper ch, tree vec_oprnd0, tree vec_oprnd1, int op_type, tree vec_dest, gimple_stmt_iterator *gsi, stmt_vec_info stmt_info) @@ -4645,14 +4645,15 @@ vect_gen_widened_results_half (vec_info *vinfo, enum tree_code code, tree new_temp; /* Generate half of the widened result: */ - gcc_assert (op_type == TREE_CODE_LENGTH (code)); if (op_type != binary_op) vec_oprnd1 = NULL; - new_stmt = gimple_build_assign (vec_dest, code, vec_oprnd0, vec_oprnd1); + + gimple_seq stmts = NULL; + gimple_build (&stmts, ch, vec_oprnd0, vec_oprnd1); + new_stmt = gimple_seq_last_stmt (stmts); new_temp = make_ssa_name (vec_dest, new_stmt); - gimple_assign_set_lhs (new_stmt, new_temp); + gimple_set_lhs (new_stmt, new_temp); vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi); - return new_stmt; } @@ -4729,8 +4730,8 @@ vect_create_vectorized_promotion_stmts (vec_info *vinfo, vec<tree> *vec_oprnds1, stmt_vec_info stmt_info, tree vec_dest, gimple_stmt_iterator *gsi, - enum tree_code code1, - enum tree_code code2, int op_type) + code_helper ch1, + code_helper ch2, int op_type) { int i; tree vop0, vop1, new_tmp1, new_tmp2; @@ -4746,10 +4747,10 @@ vect_create_vectorized_promotion_stmts (vec_info *vinfo, vop1 = NULL_TREE; /* Generate the two halves of promotion operation. 
*/ - new_stmt1 = vect_gen_widened_results_half (vinfo, code1, vop0, vop1, + new_stmt1 = vect_gen_widened_results_half (vinfo, ch1, vop0, vop1, op_type, vec_dest, gsi, stmt_info); - new_stmt2 = vect_gen_widened_results_half (vinfo, code2, vop0, vop1, + new_stmt2 = vect_gen_widened_results_half (vinfo, ch2, vop0, vop1, op_type, vec_dest, gsi, stmt_info); if (is_gimple_call (new_stmt1)) @@ -4846,8 +4847,9 @@ vectorizable_conversion (vec_info *vinfo, tree scalar_dest; tree op0, op1 = NULL_TREE; loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo); - enum tree_code code, code1 = ERROR_MARK, code2 = ERROR_MARK; - enum tree_code codecvt1 = ERROR_MARK, codecvt2 = ERROR_MARK; + tree_code tc1; + code_helper code, code1, code2; + code_helper codecvt1 = ERROR_MARK, codecvt2 = ERROR_MARK; tree new_temp; enum vect_def_type dt[2] = {vect_unknown_def_type, vect_unknown_def_type}; int ndts = 2; @@ -4876,31 +4878,42 @@ vectorizable_conversion (vec_info *vinfo, && ! vec_stmt) return false; - gassign *stmt = dyn_cast <gassign *> (stmt_info->stmt); - if (!stmt) + gimple* stmt = stmt_info->stmt; + if (!(is_gimple_assign (stmt) || is_gimple_call (stmt))) return false; - if (TREE_CODE (gimple_assign_lhs (stmt)) != SSA_NAME) + if (gimple_get_lhs (stmt) == NULL_TREE + || TREE_CODE (gimple_get_lhs (stmt)) != SSA_NAME) return false; - code = gimple_assign_rhs_code (stmt); - if (!CONVERT_EXPR_CODE_P (code) - && code != FIX_TRUNC_EXPR - && code != FLOAT_EXPR - && code != WIDEN_PLUS_EXPR - && code != WIDEN_MINUS_EXPR - && code != WIDEN_MULT_EXPR - && code != WIDEN_LSHIFT_EXPR) + if (TREE_CODE (gimple_get_lhs (stmt)) != SSA_NAME) return false; - bool widen_arith = (code == WIDEN_PLUS_EXPR - || code == WIDEN_MINUS_EXPR - || code == WIDEN_MULT_EXPR - || code == WIDEN_LSHIFT_EXPR); - op_type = TREE_CODE_LENGTH (code); + bool widen_arith = false; + gimple_match_op res_op; + if (!gimple_extract_op (stmt, &res_op)) + return false; + code = res_op.code; + op_type = res_op.num_ops; + + if 
(res_op.code.is_tree_code ()) + { + widen_arith = (code == WIDEN_PLUS_EXPR + || code == WIDEN_MINUS_EXPR + || code == WIDEN_MULT_EXPR + || code == WIDEN_LSHIFT_EXPR); + } + else + widen_arith = false; + + if (!widen_arith + && !CONVERT_EXPR_CODE_P (code) + && code != FIX_TRUNC_EXPR + && code != FLOAT_EXPR) + return false; /* Check types of lhs and rhs. */ - scalar_dest = gimple_assign_lhs (stmt); + scalar_dest = gimple_get_lhs (stmt); lhs_type = TREE_TYPE (scalar_dest); vectype_out = STMT_VINFO_VECTYPE (stmt_info); @@ -4938,10 +4951,14 @@ vectorizable_conversion (vec_info *vinfo, if (op_type == binary_op) { - gcc_assert (code == WIDEN_MULT_EXPR || code == WIDEN_LSHIFT_EXPR - || code == WIDEN_PLUS_EXPR || code == WIDEN_MINUS_EXPR); + gcc_assert (code == WIDEN_MULT_EXPR + || code == WIDEN_LSHIFT_EXPR + || code == WIDEN_PLUS_EXPR + || code == WIDEN_MINUS_EXPR); - op1 = gimple_assign_rhs2 (stmt); + + op1 = is_gimple_assign (stmt) ? gimple_assign_rhs2 (stmt) : + gimple_call_arg (stmt, 0); tree vectype1_in; if (!vect_is_simple_use (vinfo, stmt_info, slp_node, 1, &op1, &slp_op1, &dt[1], &vectype1_in)) @@ -5025,8 +5042,12 @@ vectorizable_conversion (vec_info *vinfo, && code != FLOAT_EXPR && !CONVERT_EXPR_CODE_P (code)) return false; - if (supportable_convert_operation (code, vectype_out, vectype_in, &code1)) + if (supportable_convert_operation (code.safe_as_tree_code (), vectype_out, + vectype_in, &tc1)) + { + code1 = tc1; break; + } /* FALLTHRU */ unsupported: if (dump_enabled_p ()) @@ -5037,9 +5058,11 @@ vectorizable_conversion (vec_info *vinfo, case WIDEN: if (known_eq (nunits_in, nunits_out)) { - if (!supportable_half_widening_operation (code, vectype_out, - vectype_in, &code1)) + if (!supportable_half_widening_operation (code.safe_as_tree_code (), + vectype_out, vectype_in, + &tc1)) goto unsupported; + code1 = tc1; gcc_assert (!(multi_step_cvt && op_type == binary_op)); break; } @@ -5073,14 +5096,17 @@ vectorizable_conversion (vec_info *vinfo, if (GET_MODE_SIZE 
(rhs_mode) == fltsz) { - if (!supportable_convert_operation (code, vectype_out, - cvt_type, &codecvt1)) + tc1 = ERROR_MARK; + if (!supportable_convert_operation (code.safe_as_tree_code (), + vectype_out, + cvt_type, &tc1)) goto unsupported; + codecvt1 = tc1; } - else if (!supportable_widening_operation (vinfo, code, stmt_info, - vectype_out, cvt_type, - &codecvt1, &codecvt2, - &multi_step_cvt, + else if (!supportable_widening_operation (vinfo, code, + stmt_info, vectype_out, + cvt_type, &codecvt1, + &codecvt2, &multi_step_cvt, &interm_types)) continue; else @@ -5088,8 +5114,9 @@ vectorizable_conversion (vec_info *vinfo, if (supportable_widening_operation (vinfo, NOP_EXPR, stmt_info, cvt_type, - vectype_in, &code1, &code2, - &multi_step_cvt, &interm_types)) + vectype_in, &code1, + &code2, &multi_step_cvt, + &interm_types)) { found_mode = true; break; @@ -5111,10 +5138,14 @@ vectorizable_conversion (vec_info *vinfo, case NARROW: gcc_assert (op_type == unary_op); - if (supportable_narrowing_operation (code, vectype_out, vectype_in, - &code1, &multi_step_cvt, + if (supportable_narrowing_operation (code.safe_as_tree_code (), vectype_out, + vectype_in, + &tc1, &multi_step_cvt, &interm_types)) - break; + { + code1 = tc1; + break; + } if (code != FIX_TRUNC_EXPR || GET_MODE_SIZE (lhs_mode) >= GET_MODE_SIZE (rhs_mode)) @@ -5125,13 +5156,18 @@ vectorizable_conversion (vec_info *vinfo, cvt_type = get_same_sized_vectype (cvt_type, vectype_in); if (cvt_type == NULL_TREE) goto unsupported; - if (!supportable_convert_operation (code, cvt_type, vectype_in, - &codecvt1)) + if (!supportable_convert_operation (code.safe_as_tree_code (), cvt_type, + vectype_in, + &tc1)) goto unsupported; + codecvt1 = tc1; if (supportable_narrowing_operation (NOP_EXPR, vectype_out, cvt_type, - &code1, &multi_step_cvt, + &tc1, &multi_step_cvt, &interm_types)) - break; + { + code1 = tc1; + break; + } goto unsupported; default: @@ -5245,8 +5281,9 @@ vectorizable_conversion (vec_info *vinfo, 
FOR_EACH_VEC_ELT (vec_oprnds0, i, vop0) { /* Arguments are ready, create the new vector stmt. */ - gcc_assert (TREE_CODE_LENGTH (code1) == unary_op); - gassign *new_stmt = gimple_build_assign (vec_dest, code1, vop0); + gcc_assert (TREE_CODE_LENGTH ((tree_code) code1) == unary_op); + gassign *new_stmt = gimple_build_assign (vec_dest, + code1.safe_as_tree_code (), vop0); new_temp = make_ssa_name (vec_dest, new_stmt); gimple_assign_set_lhs (new_stmt, new_temp); vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi); @@ -5278,7 +5315,7 @@ vectorizable_conversion (vec_info *vinfo, for (i = multi_step_cvt; i >= 0; i--) { tree this_dest = vec_dsts[i]; - enum tree_code c1 = code1, c2 = code2; + code_helper c1 = code1, c2 = code2; if (i == 0 && codecvt2 != ERROR_MARK) { c1 = codecvt1; @@ -5288,7 +5325,8 @@ vectorizable_conversion (vec_info *vinfo, vect_create_half_widening_stmts (vinfo, &vec_oprnds0, &vec_oprnds1, stmt_info, this_dest, gsi, - c1, op_type); + c1.safe_as_tree_code (), + op_type); else vect_create_vectorized_promotion_stmts (vinfo, &vec_oprnds0, &vec_oprnds1, stmt_info, @@ -5301,9 +5339,11 @@ vectorizable_conversion (vec_info *vinfo, gimple *new_stmt; if (cvt_type) { - gcc_assert (TREE_CODE_LENGTH (codecvt1) == unary_op); + gcc_assert (TREE_CODE_LENGTH ((tree_code) codecvt1) == unary_op); new_temp = make_ssa_name (vec_dest); - new_stmt = gimple_build_assign (new_temp, codecvt1, vop0); + new_stmt = gimple_build_assign (new_temp, + codecvt1.safe_as_tree_code (), + vop0); vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi); } else @@ -5327,10 +5367,10 @@ vectorizable_conversion (vec_info *vinfo, if (cvt_type) FOR_EACH_VEC_ELT (vec_oprnds0, i, vop0) { - gcc_assert (TREE_CODE_LENGTH (codecvt1) == unary_op); + gcc_assert (TREE_CODE_LENGTH (((tree_code) codecvt1)) == unary_op); new_temp = make_ssa_name (vec_dest); gassign *new_stmt - = gimple_build_assign (new_temp, codecvt1, vop0); + = gimple_build_assign (new_temp, codecvt1.safe_as_tree_code (), 
vop0); vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi); vec_oprnds0[i] = new_temp; } @@ -5338,7 +5378,7 @@ vectorizable_conversion (vec_info *vinfo, vect_create_vectorized_demotion_stmts (vinfo, &vec_oprnds0, multi_step_cvt, stmt_info, vec_dsts, gsi, - slp_node, code1); + slp_node, code1.safe_as_tree_code ()); break; } if (!slp_node) @@ -11926,9 +11966,11 @@ vect_maybe_update_slp_op_vectype (slp_tree op, tree vectype) bool supportable_widening_operation (vec_info *vinfo, - enum tree_code code, stmt_vec_info stmt_info, + code_helper code, + stmt_vec_info stmt_info, tree vectype_out, tree vectype_in, - enum tree_code *code1, enum tree_code *code2, + code_helper *code1, + code_helper *code2, int *multi_step_cvt, vec<tree> *interm_types) { @@ -11939,7 +11981,7 @@ supportable_widening_operation (vec_info *vinfo, optab optab1, optab2; tree vectype = vectype_in; tree wide_vectype = vectype_out; - enum tree_code c1, c2; + code_helper c1 = MAX_TREE_CODES, c2 = MAX_TREE_CODES; int i; tree prev_type, intermediate_type; machine_mode intermediate_mode, prev_mode; @@ -11949,7 +11991,7 @@ supportable_widening_operation (vec_info *vinfo, if (loop_info) vect_loop = LOOP_VINFO_LOOP (loop_info); - switch (code) + switch (code.safe_as_tree_code ()) { case WIDEN_MULT_EXPR: /* The result of a vectorized widening operation usually requires @@ -11990,8 +12032,9 @@ supportable_widening_operation (vec_info *vinfo, && !nested_in_vect_loop_p (vect_loop, stmt_info) && supportable_widening_operation (vinfo, VEC_WIDEN_MULT_EVEN_EXPR, stmt_info, vectype_out, - vectype_in, code1, code2, - multi_step_cvt, interm_types)) + vectype_in, code1, + code2, multi_step_cvt, + interm_types)) { /* Elements in a vector with vect_used_by_reduction property cannot be reordered if the use chain with this property does not have the @@ -12054,6 +12097,9 @@ supportable_widening_operation (vec_info *vinfo, c2 = VEC_UNPACK_FIX_TRUNC_HI_EXPR; break; + case MAX_TREE_CODES: + break; + default: 
gcc_unreachable (); } @@ -12064,10 +12110,12 @@ supportable_widening_operation (vec_info *vinfo, if (code == FIX_TRUNC_EXPR) { /* The signedness is determined from output operand. */ - optab1 = optab_for_tree_code (c1, vectype_out, optab_default); - optab2 = optab_for_tree_code (c2, vectype_out, optab_default); + optab1 = optab_for_tree_code (c1.safe_as_tree_code (), vectype_out, + optab_default); + optab2 = optab_for_tree_code (c2.safe_as_tree_code (), vectype_out, + optab_default); } - else if (CONVERT_EXPR_CODE_P (code) + else if (CONVERT_EXPR_CODE_P (code.safe_as_tree_code ()) && VECTOR_BOOLEAN_TYPE_P (wide_vectype) && VECTOR_BOOLEAN_TYPE_P (vectype) && TYPE_MODE (wide_vectype) == TYPE_MODE (vectype) @@ -12080,8 +12128,8 @@ supportable_widening_operation (vec_info *vinfo, } else { - optab1 = optab_for_tree_code (c1, vectype, optab_default); - optab2 = optab_for_tree_code (c2, vectype, optab_default); + optab1 = optab_for_tree_code (c1.safe_as_tree_code (), vectype, optab_default); + optab2 = optab_for_tree_code (c2.safe_as_tree_code (), vectype, optab_default); } if (!optab1 || !optab2) @@ -12092,8 +12140,12 @@ supportable_widening_operation (vec_info *vinfo, || (icode2 = optab_handler (optab2, vec_mode)) == CODE_FOR_nothing) return false; - *code1 = c1; - *code2 = c2; + if (code.is_tree_code ()) + { + *code1 = c1; + *code2 = c2; + } + if (insn_data[icode1].operand[0].mode == TYPE_MODE (wide_vectype) && insn_data[icode2].operand[0].mode == TYPE_MODE (wide_vectype)) @@ -12114,7 +12166,7 @@ supportable_widening_operation (vec_info *vinfo, prev_type = vectype; prev_mode = vec_mode; - if (!CONVERT_EXPR_CODE_P (code)) + if (!CONVERT_EXPR_CODE_P ((tree_code) code)) return false; /* We assume here that there will not be more than MAX_INTERM_CVT_STEPS @@ -12145,8 +12197,10 @@ supportable_widening_operation (vec_info *vinfo, } else { - optab3 = optab_for_tree_code (c1, intermediate_type, optab_default); - optab4 = optab_for_tree_code (c2, intermediate_type, 
optab_default); + optab3 = optab_for_tree_code (c1.safe_as_tree_code (), intermediate_type, + optab_default); + optab4 = optab_for_tree_code (c2.safe_as_tree_code (), intermediate_type, + optab_default); } if (!optab3 || !optab4 @@ -12181,7 +12235,6 @@ supportable_widening_operation (vec_info *vinfo, return false; } - /* Function supportable_narrowing_operation Check whether an operation represented by the code CODE is a @@ -12205,7 +12258,7 @@ supportable_widening_operation (vec_info *vinfo, bool supportable_narrowing_operation (enum tree_code code, tree vectype_out, tree vectype_in, - enum tree_code *code1, int *multi_step_cvt, + tree_code* _code1, int *multi_step_cvt, vec<tree> *interm_types) { machine_mode vec_mode; @@ -12217,8 +12270,8 @@ supportable_narrowing_operation (enum tree_code code, tree intermediate_type, prev_type; machine_mode intermediate_mode, prev_mode; int i; - unsigned HOST_WIDE_INT n_elts; bool uns; + tree_code * code1 = (tree_code*) _code1; *multi_step_cvt = 0; switch (code) @@ -12227,9 +12280,8 @@ supportable_narrowing_operation (enum tree_code code, c1 = VEC_PACK_TRUNC_EXPR; if (VECTOR_BOOLEAN_TYPE_P (narrow_vectype) && VECTOR_BOOLEAN_TYPE_P (vectype) - && SCALAR_INT_MODE_P (TYPE_MODE (vectype)) - && TYPE_VECTOR_SUBPARTS (vectype).is_constant (&n_elts) - && n_elts < BITS_PER_UNIT) + && TYPE_MODE (narrow_vectype) == TYPE_MODE (vectype) + && SCALAR_INT_MODE_P (TYPE_MODE (vectype))) optab1 = vec_pack_sbool_trunc_optab; else optab1 = optab_for_tree_code (c1, vectype, optab_default); @@ -12320,9 +12372,8 @@ supportable_narrowing_operation (enum tree_code code, = lang_hooks.types.type_for_mode (intermediate_mode, uns); if (VECTOR_BOOLEAN_TYPE_P (intermediate_type) && VECTOR_BOOLEAN_TYPE_P (prev_type) - && SCALAR_INT_MODE_P (prev_mode) - && TYPE_VECTOR_SUBPARTS (intermediate_type).is_constant (&n_elts) - && n_elts < BITS_PER_UNIT) + && intermediate_mode == prev_mode + && SCALAR_INT_MODE_P (prev_mode)) interm_optab = vec_pack_sbool_trunc_optab; 
else interm_optab diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h index 642eb0aeb21264cd736a479b1ec25357abef29cd..50ff8eeac1e6b9859302bd784f10ee3d8ff4b4dc 100644 --- a/gcc/tree-vectorizer.h +++ b/gcc/tree-vectorizer.h @@ -2120,13 +2120,12 @@ extern bool vect_is_simple_use (vec_info *, stmt_vec_info, slp_tree, enum vect_def_type *, tree *, stmt_vec_info * = NULL); extern bool vect_maybe_update_slp_op_vectype (slp_tree, tree); -extern bool supportable_widening_operation (vec_info *, - enum tree_code, stmt_vec_info, - tree, tree, enum tree_code *, - enum tree_code *, int *, - vec<tree> *); +extern bool supportable_widening_operation (vec_info*, code_helper, + stmt_vec_info, tree, tree, + code_helper*, code_helper*, + int*, vec<tree> *); extern bool supportable_narrowing_operation (enum tree_code, tree, tree, - enum tree_code *, int *, + tree_code *, int *, vec<tree> *); extern unsigned record_stmt_cost (stmt_vector_for_cost *, int, diff --git a/gcc/tree.h b/gcc/tree.h index f84958933d51144bb6ce7cc41eca5f7f06814550..00b0e4d1c696633fe38baad5295b1f90398d53fc 100644 --- a/gcc/tree.h +++ b/gcc/tree.h @@ -92,6 +92,10 @@ public: bool is_fn_code () const { return rep < 0; } bool is_internal_fn () const; bool is_builtin_fn () const; + enum tree_code safe_as_tree_code () const { return is_tree_code () ? + (tree_code)* this : MAX_TREE_CODES; } + combined_fn safe_as_fn_code () const { return is_fn_code () ? 
(combined_fn) *this
+						: CFN_LAST;}
 int get_rep () const { return rep; }
 bool operator== (const code_helper &other) { return rep == other.rep; }
 bool operator!= (const code_helper &other) { return rep != other.rep; }
-- 
2.17.1

[-- Attachment #3: 0002-Refactor-widen_plus-as-internal_fn.patch --] [-- Type: application/octet-stream, Size: 23132 bytes --]

From 233a24f2a4eeced2fd4e99578e6ea81ec8622192 Mon Sep 17 00:00:00 2001
From: Joel Hutton <joel.hutton@arm.com>
Date: Wed, 26 Jan 2022 14:00:17 +0000
Subject: [PATCH 2/3] Refactor widen_plus as internal_fn

This patch replaces the existing tree_code widen_plus and widen_minus
patterns with internal_fn versions.

DEF_INTERNAL_OPTAB_MULTI_FN is like DEF_INTERNAL_OPTAB_FN except it
provides convenience wrappers for defining conversions that require a
hi/lo split, like widening and narrowing operations.  Each definition
for <NAME> will require an optab named <OPTAB> and two other optabs
that you specify for signed and unsigned.

The hi/lo pair is necessary because the widening operations take n
narrow elements as inputs and return n/2 wide elements as outputs.
The 'lo' operation operates on the first n/2 elements of the input,
and the 'hi' operation on the second n/2 elements.  Defining an
internal_fn along with hi/lo variations allows a single internal
function to be returned from a vect_recog function that will later be
expanded to hi/lo.

DEF_INTERNAL_OPTAB_MULTI_FN is used in internal-fn.def to register a
widening internal_fn.  It is defined differently in different places,
and internal-fn.def is sourced from those places so the parameters
given can be reused:

internal-fn.c: defined to expand to hi/lo signed/unsigned optabs,
later defined to generate the 'expand_' functions for the hi/lo
versions of the fn.
internal-fn.def: defined to invoke DEF_INTERNAL_OPTAB_FN for the
original and the hi/lo variants of the internal_fn.

For example:

  IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO

for aarch64:

  IFN_VEC_WIDEN_PLUS_HI -> vec_widen_<su>addl_hi_<mode> -> (u/s)addl2
  IFN_VEC_WIDEN_PLUS_LO -> vec_widen_<su>addl_lo_<mode> -> (u/s)addl

This gives the same functionality as the previous WIDEN_PLUS/WIDEN_MINUS
tree codes, which were expanded into VEC_WIDEN_PLUS_LO and
VEC_WIDEN_PLUS_HI.

gcc/ChangeLog:

2022-04-13  Joel Hutton  <joel.hutton@arm.com>
2022-04-13  Tamar Christina  <tamar.christina@arm.com>

	* internal-fn.cc (INCLUDE_MAP): Include maps for use in optab
	lookup.
	(DEF_INTERNAL_OPTAB_MULTI_FN): Macro to define an internal_fn
	that expands into multiple internal_fns (for widening).
	(ifn_cmp): Function to compare ifns for sorting/searching.
	(lookup_multi_ifn_optab): Add lookup function.
	(lookup_multi_internal_fn): Add lookup function.
	(commutative_binary_fn_p): Add widen_plus fns.
	* internal-fn.def (DEF_INTERNAL_OPTAB_MULTI_FN): Define widening
	plus/minus functions.
	(VEC_WIDEN_PLUS): Replacement for the VEC_WIDEN_PLUS tree code.
	(VEC_WIDEN_MINUS): Replacement for the VEC_WIDEN_MINUS tree code.
	* internal-fn.h (GCC_INTERNAL_FN_H): Add headers.
	(lookup_multi_ifn_optab): Add prototype.
	(lookup_multi_internal_fn): Add prototype.
	* optabs.cc (commutative_optab_p): Add widening plus and minus
	optabs.
	* optabs.def (OPTAB_CD): Add widening add and sub optabs.
	* tree-core.h (ECF_MULTI): Flag to indicate that a function
	decays into hi/lo parts.
	* tree-vect-patterns.cc (vect_recog_widen_op_pattern): Support
	patterns with a hi/lo split.
	(vect_recog_widen_plus_pattern): Refactor to return
	IFN_VEC_WIDEN_PLUS.
	(vect_recog_widen_minus_pattern): Refactor to return the new
	IFN_VEC_WIDEN_MINUS.
	* tree-vect-stmts.cc (vectorizable_conversion): Add widen
	plus/minus ifn support.
	(supportable_widening_operation): Add widen plus/minus ifn
	support.
gcc/testsuite/ChangeLog: * gcc.target/aarch64/vect-widen-add.c: Test that new IFN_VEC_WIDEN_PLUS is being used. * gcc.target/aarch64/vect-widen-sub.c: Test that new IFN_VEC_WIDEN_MINUS is being used. --- gcc/internal-fn.cc | 107 ++++++++++++++++++ gcc/internal-fn.def | 23 ++++ gcc/internal-fn.h | 7 ++ gcc/optabs.cc | 12 +- gcc/optabs.def | 2 + .../gcc.target/aarch64/vect-widen-add.c | 4 +- .../gcc.target/aarch64/vect-widen-sub.c | 4 +- gcc/tree-core.h | 4 + gcc/tree-vect-patterns.cc | 37 ++++-- gcc/tree-vect-stmts.cc | 65 ++++++++++- 10 files changed, 248 insertions(+), 17 deletions(-) diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc index 8b1733e20c4455e4e8c383c92fe859f4256cae69..e95b13af884f67990ad43c286990a351e2bd641b 100644 --- a/gcc/internal-fn.cc +++ b/gcc/internal-fn.cc @@ -17,6 +17,7 @@ You should have received a copy of the GNU General Public License along with GCC; see the file COPYING3. If not see <http://www.gnu.org/licenses/>. */ +#define INCLUDE_MAP #include "config.h" #include "system.h" #include "coretypes.h" @@ -70,6 +71,26 @@ const int internal_fn_flags_array[] = { 0 }; +const enum internal_fn internal_fn_hilo_keys_array[] = { +#undef DEF_INTERNAL_OPTAB_MULTI_FN +#define DEF_INTERNAL_OPTAB_MULTI_FN(NAME, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \ + IFN_##NAME##_LO, \ + IFN_##NAME##_HI, +#include "internal-fn.def" + IFN_LAST +#undef DEF_INTERNAL_OPTAB_MULTI_FN +}; + +const optab internal_fn_hilo_values_array[] = { +#undef DEF_INTERNAL_OPTAB_MULTI_FN +#define DEF_INTERNAL_OPTAB_MULTI_FN(NAME, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \ + SOPTAB##_lo_optab, UOPTAB##_lo_optab, \ + SOPTAB##_hi_optab, UOPTAB##_hi_optab, +#include "internal-fn.def" + unknown_optab, unknown_optab +#undef DEF_INTERNAL_OPTAB_MULTI_FN +}; + /* Return the internal function called NAME, or IFN_LAST if there's no such function. */ @@ -90,6 +111,62 @@ lookup_internal_fn (const char *name) return entry ? 
*entry : IFN_LAST; } +static int +ifn_cmp (const void *a_, const void *b_) +{ + typedef std::pair<enum internal_fn, unsigned> ifn_pair; + auto *a = (const std::pair<ifn_pair, optab> *)a_; + auto *b = (const std::pair<ifn_pair, optab> *)b_; + return (int) (a->first.first) - (b->first.first); +} + +/* Return the optab belonging to the given internal function NAME for the given + SIGN or unknown_optab. */ + +optab +lookup_multi_ifn_optab (enum internal_fn fn, unsigned sign) +{ + typedef std::pair<enum internal_fn, unsigned> ifn_pair; + typedef auto_vec <std::pair<ifn_pair, optab>>fn_to_optab_map_type; + static fn_to_optab_map_type *fn_to_optab_map; + + if (!fn_to_optab_map) + { + unsigned num + = sizeof (internal_fn_hilo_keys_array) / sizeof (enum internal_fn); + fn_to_optab_map = new fn_to_optab_map_type (); + for (unsigned int i = 0; i < num - 1; ++i) + { + enum internal_fn fn = internal_fn_hilo_keys_array[i]; + optab v1 = internal_fn_hilo_values_array[2*i]; + optab v2 = internal_fn_hilo_values_array[2*i + 1]; + ifn_pair key1 (fn, 0); + fn_to_optab_map->safe_push ({key1, v1}); + ifn_pair key2 (fn, 1); + fn_to_optab_map->safe_push ({key2, v2}); + } + fn_to_optab_map->qsort(ifn_cmp); + } + + ifn_pair new_pair (fn, sign ? 1 : 0); + optab tmp; + std::pair<ifn_pair,optab> pair_wrap (new_pair, tmp); + auto entry = fn_to_optab_map->bsearch (&pair_wrap, ifn_cmp); + return entry != fn_to_optab_map->end () ? entry->second : unknown_optab; +} + +extern void +lookup_multi_internal_fn (enum internal_fn ifn, enum internal_fn *lo, + enum internal_fn *hi) +{ + int ecf_flags = internal_fn_flags (ifn); + gcc_assert (ecf_flags & ECF_MULTI); + + *lo = internal_fn (ifn + 1); + *hi = internal_fn (ifn + 2); +} + + /* Fnspec of each internal function, indexed by function number. 
*/ const_tree internal_fn_fnspec_array[IFN_LAST + 1]; @@ -3906,6 +3983,9 @@ commutative_binary_fn_p (internal_fn fn) case IFN_UBSAN_CHECK_MUL: case IFN_ADD_OVERFLOW: case IFN_MUL_OVERFLOW: + case IFN_VEC_WIDEN_PLUS: + case IFN_VEC_WIDEN_PLUS_LO: + case IFN_VEC_WIDEN_PLUS_HI: return true; default: @@ -3991,6 +4071,32 @@ set_edom_supported_p (void) #endif } +#undef DEF_INTERNAL_OPTAB_MULTI_FN +#define DEF_INTERNAL_OPTAB_MULTI_FN(CODE, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \ + static void \ + expand_##CODE (internal_fn, gcall *) \ + { \ + gcc_unreachable (); \ + } \ + static void \ + expand_##CODE##_LO (internal_fn fn, gcall *stmt) \ + { \ + tree ty = TREE_TYPE (gimple_get_lhs (stmt)); \ + if (!TYPE_UNSIGNED (ty)) \ + expand_##TYPE##_optab_fn (fn, stmt, SOPTAB##_lo##_optab); \ + else \ + expand_##TYPE##_optab_fn (fn, stmt, UOPTAB##_lo##_optab); \ + } \ + static void \ + expand_##CODE##_HI (internal_fn fn, gcall *stmt) \ + { \ + tree ty = TREE_TYPE (gimple_get_lhs (stmt)); \ + if (!TYPE_UNSIGNED (ty)) \ + expand_##TYPE##_optab_fn (fn, stmt, SOPTAB##_hi##_optab); \ + else \ + expand_##TYPE##_optab_fn (fn, stmt, UOPTAB##_hi##_optab); \ + } + #define DEF_INTERNAL_OPTAB_FN(CODE, FLAGS, OPTAB, TYPE) \ static void \ expand_##CODE (internal_fn fn, gcall *stmt) \ @@ -4007,6 +4113,7 @@ set_edom_supported_p (void) expand_##TYPE##_optab_fn (fn, stmt, which_optab); \ } #include "internal-fn.def" +#undef DEF_INTERNAL_OPTAB_MULTI_FN /* Routines to expand each internal function, indexed by function number. Each routine has the prototype: diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def index d2d550d358606022b1cb44fa842f06e0be507bc3..4635a9c8af9ad27bb05d7510388d0fe2270428e5 100644 --- a/gcc/internal-fn.def +++ b/gcc/internal-fn.def @@ -82,6 +82,13 @@ along with GCC; see the file COPYING3. If not see says that the function extends the C-level BUILT_IN_<NAME>{,L,LL,IMAX} group of functions to any integral mode (including vector modes). 
+ DEF_INTERNAL_OPTAB_MULTI_FN is like DEF_INTERNAL_OPTAB_FN except it + provides convenience wrappers for defining conversions that require a + hi/lo split, like widening and narrowing operations. Each definition + for <NAME> will require an optab named <OPTAB> and two other optabs that + you specify for signed and unsigned. + + Each entry must have a corresponding expander of the form: void expand_NAME (gimple_call stmt) @@ -120,6 +127,14 @@ along with GCC; see the file COPYING3. If not see DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE) #endif +#ifndef DEF_INTERNAL_OPTAB_MULTI_FN +#define DEF_INTERNAL_OPTAB_MULTI_FN(NAME, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \ + DEF_INTERNAL_OPTAB_FN (NAME, FLAGS | ECF_MULTI, OPTAB, TYPE) \ + DEF_INTERNAL_OPTAB_FN (NAME ## _LO, FLAGS, unknown, TYPE) \ + DEF_INTERNAL_OPTAB_FN (NAME ## _HI, FLAGS, unknown, TYPE) +#endif + + DEF_INTERNAL_OPTAB_FN (MASK_LOAD, ECF_PURE, maskload, mask_load) DEF_INTERNAL_OPTAB_FN (LOAD_LANES, ECF_CONST, vec_load_lanes, load_lanes) DEF_INTERNAL_OPTAB_FN (MASK_LOAD_LANES, ECF_PURE, @@ -292,6 +307,14 @@ DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT270, ECF_CONST, cadd270, binary) DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL, ECF_CONST, cmul, binary) DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL_CONJ, ECF_CONST, cmul_conj, binary) DEF_INTERNAL_OPTAB_FN (VEC_ADDSUB, ECF_CONST, vec_addsub, binary) +DEF_INTERNAL_OPTAB_MULTI_FN (VEC_WIDEN_PLUS, + ECF_CONST | ECF_WIDEN | ECF_NOTHROW, + vec_widen_add, vec_widen_saddl, vec_widen_uaddl, + binary) +DEF_INTERNAL_OPTAB_MULTI_FN (VEC_WIDEN_MINUS, + ECF_CONST | ECF_WIDEN | ECF_NOTHROW, + vec_widen_sub, vec_widen_ssubl, vec_widen_usubl, + binary) DEF_INTERNAL_OPTAB_FN (VEC_FMADDSUB, ECF_CONST, vec_fmaddsub, ternary) DEF_INTERNAL_OPTAB_FN (VEC_FMSUBADD, ECF_CONST, vec_fmsubadd, ternary) diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h index 23c014a963c4d72da92c763db87ee486a2adb485..b35de19747d251d19dc13de1e0323368bd2ebdf2 100644 --- a/gcc/internal-fn.h +++ b/gcc/internal-fn.h @@ -20,6 +20,10 @@ 
along with GCC; see the file COPYING3. If not see #ifndef GCC_INTERNAL_FN_H #define GCC_INTERNAL_FN_H +#include "insn-codes.h" +#include "insn-opinit.h" + + /* INTEGER_CST values for IFN_UNIQUE function arg-0. UNSPEC: Undifferentiated UNIQUE. @@ -112,6 +116,9 @@ internal_fn_name (enum internal_fn fn) } extern internal_fn lookup_internal_fn (const char *); +extern optab lookup_multi_ifn_optab (enum internal_fn, unsigned); +extern void lookup_multi_internal_fn (enum internal_fn, enum internal_fn *, + enum internal_fn *); /* Return the ECF_* flags for function FN. */ diff --git a/gcc/optabs.cc b/gcc/optabs.cc index c0a68471d2ddf08bc0e6a3fd592ebb9f05e516c1..7e904b3e154d018779bb1a36de74e6997f70e193 100644 --- a/gcc/optabs.cc +++ b/gcc/optabs.cc @@ -1314,7 +1314,17 @@ commutative_optab_p (optab binoptab) || binoptab == smul_widen_optab || binoptab == umul_widen_optab || binoptab == smul_highpart_optab - || binoptab == umul_highpart_optab); + || binoptab == umul_highpart_optab + || binoptab == vec_widen_add_optab + || binoptab == vec_widen_sub_optab + || binoptab == vec_widen_saddl_hi_optab + || binoptab == vec_widen_saddl_lo_optab + || binoptab == vec_widen_ssubl_hi_optab + || binoptab == vec_widen_ssubl_lo_optab + || binoptab == vec_widen_uaddl_hi_optab + || binoptab == vec_widen_uaddl_lo_optab + || binoptab == vec_widen_usubl_hi_optab + || binoptab == vec_widen_usubl_lo_optab); } /* X is to be used in mode MODE as operand OPN to BINOPTAB. 
If we're diff --git a/gcc/optabs.def b/gcc/optabs.def index 801310ebaa7d469520809bb7efed6820f8eb866b..a7881dcb49e4ef07d8f07aa31214eb3a7a944e2e 100644 --- a/gcc/optabs.def +++ b/gcc/optabs.def @@ -78,6 +78,8 @@ OPTAB_CD(smsub_widen_optab, "msub$b$a4") OPTAB_CD(umsub_widen_optab, "umsub$b$a4") OPTAB_CD(ssmsub_widen_optab, "ssmsub$b$a4") OPTAB_CD(usmsub_widen_optab, "usmsub$a$b4") +OPTAB_CD(vec_widen_add_optab, "add$a$b3") +OPTAB_CD(vec_widen_sub_optab, "sub$a$b3") OPTAB_CD(vec_load_lanes_optab, "vec_load_lanes$a$b") OPTAB_CD(vec_store_lanes_optab, "vec_store_lanes$a$b") OPTAB_CD(vec_mask_load_lanes_optab, "vec_mask_load_lanes$a$b") diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c index 220bd9352a4c7acd2e3713e441d74898d3e92b30..7037673d32bd780e1c9b58a51e58e2bac3b30b7e 100644 --- a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c +++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c @@ -1,5 +1,5 @@ /* { dg-do run } */ -/* { dg-options "-O3 -save-temps" } */ +/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */ #include <stdint.h> #include <string.h> @@ -86,6 +86,8 @@ main() return 0; } +/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_LO" "vect" } } */ +/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_HI" "vect" } } */ /* { dg-final { scan-assembler-times {\tuaddl\t} 1} } */ /* { dg-final { scan-assembler-times {\tuaddl2\t} 1} } */ /* { dg-final { scan-assembler-times {\tsaddl\t} 1} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c index a2bed63affbd091977df95a126da1f5b8c1d41d2..83bc1edb6105f47114b665e24a13e6194b2179a2 100644 --- a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c +++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c @@ -1,5 +1,5 @@ /* { dg-do run } */ -/* { dg-options "-O3 -save-temps" } */ +/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */ #include <stdint.h> #include 
<string.h> @@ -86,6 +86,8 @@ main() return 0; } +/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_LO" "vect" } } */ +/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_HI" "vect" } } */ /* { dg-final { scan-assembler-times {\tusubl\t} 1} } */ /* { dg-final { scan-assembler-times {\tusubl2\t} 1} } */ /* { dg-final { scan-assembler-times {\tssubl\t} 1} } */ diff --git a/gcc/tree-core.h b/gcc/tree-core.h index cff6211080bced0bffb39e98039a6550897acf77..d0c8b812cfb9c3ac83bf25fff0431b08cb7d823d 100644 --- a/gcc/tree-core.h +++ b/gcc/tree-core.h @@ -99,6 +99,10 @@ struct die_struct; /* Nonzero if this is a widening function. */ #define ECF_WIDEN (1 << 16) +/* Nonzero if this is a function that decomposes into a lo/hi operation. */ +#define ECF_MULTI (1 << 17) + + /* Call argument flags. */ /* Nonzero if the argument is not used by the function. */ diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc index c011b8ede3c266b59f731e316efbec7d98e91068..268f5402fcdd5ec5bfb806db8c410e701c771275 100644 --- a/gcc/tree-vect-patterns.cc +++ b/gcc/tree-vect-patterns.cc @@ -1351,14 +1351,16 @@ static gimple * vect_recog_widen_op_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info, tree *type_out, tree_code orig_code, code_helper wide_code, - bool shift_p, const char *name) + bool shift_p, const char *name, + enum optab_subtype *subtype = NULL) { gimple *last_stmt = last_stmt_info->stmt; vect_unpromoted_value unprom[2]; tree half_type; if (!vect_widened_op_tree (vinfo, last_stmt_info, orig_code, orig_code, - shift_p, 2, unprom, &half_type)) + shift_p, 2, unprom, &half_type, subtype)) + return NULL; /* Pattern detected. 
*/ @@ -1426,6 +1428,20 @@ vect_recog_widen_op_pattern (vec_info *vinfo, type, pattern_stmt, vecctype); } +static gimple * +vect_recog_widen_op_pattern (vec_info *vinfo, + stmt_vec_info last_stmt_info, tree *type_out, + tree_code orig_code, internal_fn wide_ifn, + bool shift_p, const char *name, + enum optab_subtype *subtype = NULL) +{ + combined_fn ifn = as_combined_fn (wide_ifn); + return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out, + orig_code, ifn, shift_p, name, + subtype); +} + + /* Try to detect multiplication on widened inputs, converting MULT_EXPR to WIDEN_MULT_EXPR. See vect_recog_widen_op_pattern for details. */ @@ -1439,26 +1455,30 @@ vect_recog_widen_mult_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info, } /* Try to detect addition on widened inputs, converting PLUS_EXPR - to WIDEN_PLUS_EXPR. See vect_recog_widen_op_pattern for details. */ + to IFN_VEC_WIDEN_PLUS. See vect_recog_widen_op_pattern for details. */ static gimple * vect_recog_widen_plus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info, tree *type_out) { + enum optab_subtype subtype; return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out, - PLUS_EXPR, WIDEN_PLUS_EXPR, false, - "vect_recog_widen_plus_pattern"); + PLUS_EXPR, IFN_VEC_WIDEN_PLUS, + false, "vect_recog_widen_plus_pattern", + &subtype); } /* Try to detect subtraction on widened inputs, converting MINUS_EXPR - to WIDEN_MINUS_EXPR. See vect_recog_widen_op_pattern for details. */ + to IFN_VEC_WIDEN_MINUS. See vect_recog_widen_op_pattern for details. 
*/ static gimple * vect_recog_widen_minus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info, tree *type_out) { + enum optab_subtype subtype; return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out, - MINUS_EXPR, WIDEN_MINUS_EXPR, false, - "vect_recog_widen_minus_pattern"); + MINUS_EXPR, IFN_VEC_WIDEN_MINUS, + false, "vect_recog_widen_minus_pattern", + &subtype); } /* Function vect_recog_popcount_pattern @@ -5622,6 +5642,7 @@ static vect_recog_func vect_vect_recog_func_ptrs[] = { { vect_recog_mask_conversion_pattern, "mask_conversion" }, { vect_recog_widen_plus_pattern, "widen_plus" }, { vect_recog_widen_minus_pattern, "widen_minus" }, + /* These must come after the double widening ones. */ }; const unsigned int NUM_PATTERNS = ARRAY_SIZE (vect_vect_recog_func_ptrs); diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index 9b31425352689d409b8c0aa0c1d5c69e72db869a..9af0d107fdafb959db10d87e4e0ba5fda4e47bd7 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -4904,7 +4904,8 @@ vectorizable_conversion (vec_info *vinfo, || code == WIDEN_LSHIFT_EXPR); } else - widen_arith = false; + widen_arith = (code == IFN_VEC_WIDEN_PLUS + || code == IFN_VEC_WIDEN_MINUS); if (!widen_arith && !CONVERT_EXPR_CODE_P (code) @@ -4954,7 +4955,9 @@ vectorizable_conversion (vec_info *vinfo, gcc_assert (code == WIDEN_MULT_EXPR || code == WIDEN_LSHIFT_EXPR || code == WIDEN_PLUS_EXPR - || code == WIDEN_MINUS_EXPR); + || code == WIDEN_MINUS_EXPR + || code == IFN_VEC_WIDEN_PLUS + || code == IFN_VEC_WIDEN_MINUS); op1 = is_gimple_assign (stmt) ? 
gimple_assign_rhs2 (stmt) : @@ -12126,12 +12129,62 @@ supportable_widening_operation (vec_info *vinfo, optab1 = vec_unpacks_sbool_lo_optab; optab2 = vec_unpacks_sbool_hi_optab; } - else - { - optab1 = optab_for_tree_code (c1.safe_as_tree_code (), vectype, optab_default); - optab2 = optab_for_tree_code (c2.safe_as_tree_code (), vectype, optab_default); + + if (code.is_fn_code ()) + { + internal_fn ifn = as_internal_fn (code.safe_as_fn_code ()); + int ecf_flags = internal_fn_flags (ifn); + gcc_assert (ecf_flags & ECF_MULTI); + + switch (code.safe_as_fn_code ()) + { + case CFN_VEC_WIDEN_PLUS: + break; + case CFN_VEC_WIDEN_MINUS: + break; + case CFN_LAST: + default: + return false; + } + + internal_fn lo, hi; + lookup_multi_internal_fn (ifn, &lo, &hi); + *code1 = as_combined_fn (lo); + *code2 = as_combined_fn (hi); + optab1 = lookup_multi_ifn_optab (lo, !TYPE_UNSIGNED (vectype)); + optab2 = lookup_multi_ifn_optab (hi, !TYPE_UNSIGNED (vectype)); } + if (code.is_tree_code ()) + { + if (code == FIX_TRUNC_EXPR) + { + /* The signedness is determined from output operand. */ + optab1 = optab_for_tree_code (c1.safe_as_tree_code (), vectype_out, + optab_default); + optab2 = optab_for_tree_code (c2.safe_as_tree_code (), vectype_out, + optab_default); + } + else if (CONVERT_EXPR_CODE_P (code.safe_as_tree_code ()) + && VECTOR_BOOLEAN_TYPE_P (wide_vectype) + && VECTOR_BOOLEAN_TYPE_P (vectype) + && TYPE_MODE (wide_vectype) == TYPE_MODE (vectype) + && SCALAR_INT_MODE_P (TYPE_MODE (vectype))) + { + /* If the input and result modes are the same, a different optab + is needed where we pass in the number of units in vectype. 
*/ + optab1 = vec_unpacks_sbool_lo_optab; + optab2 = vec_unpacks_sbool_hi_optab; + } + else + { + optab1 = optab_for_tree_code (c1.safe_as_tree_code (), vectype, + optab_default); + optab2 = optab_for_tree_code (c2.safe_as_tree_code (), vectype, + optab_default); + } + } + if (!optab1 || !optab2) return false; -- 2.17.1 [-- Attachment #4: 0003-Remove-widen_plus-minus_expr-tree-codes.patch --] [-- Type: application/octet-stream, Size: 19519 bytes --] From 061e11fa3c76c42640ab6467858e057e3067a6d3 Mon Sep 17 00:00:00 2001 From: Joel Hutton <joel.hutton@arm.com> Date: Fri, 28 Jan 2022 12:04:44 +0000 Subject: [PATCH 3/3] Remove widen_plus/minus_expr tree codes This patch removes the old widen plus/minus tree codes which have been replaced by internal functions. gcc/ChangeLog: * doc/generic.texi: Remove old tree codes. * expr.cc (expand_expr_real_2): Remove old tree code cases. * gimple-pretty-print.cc (dump_binary_rhs): Remove old tree code cases. * optabs-tree.cc (optab_for_tree_code): Remove old tree code cases. (supportable_half_widening_operation): Remove old tree code cases. * tree-cfg.cc (verify_gimple_assign_binary): Remove old tree code cases. * tree-inline.cc (estimate_operator_cost): Remove old tree code cases. * tree-pretty-print.cc (dump_generic_node): Remove tree code definition. (op_symbol_code): Remove old tree code cases. * tree-vect-data-refs.cc (vect_get_smallest_scalar_type): Remove old tree code cases. (vect_analyze_data_ref_accesses): Remove old tree code cases. * tree-vect-generic.cc (expand_vector_operations_1): Remove old tree code cases. * tree-vect-patterns.cc (vect_widened_op_tree): Refactor ot replace usage in vect_recog_sad_pattern. (vect_recog_sad_pattern): Replace tree code widening pattern with internal function. (vect_recog_average_pattern): Replace tree code widening pattern with internal function. * tree-vect-stmts.cc (vectorizable_conversion): Remove old tree code cases. (supportable_widening_operation): Remove old tree code cases. 
* tree.def (WIDEN_PLUS_EXPR): Remove tree code definition. (WIDEN_MINUS_EXPR): Remove tree code definition. (VEC_WIDEN_PLUS_HI_EXPR): Remove tree code definition. (VEC_WIDEN_PLUS_LO_EXPR): Remove tree code definition. (VEC_WIDEN_MINUS_HI_EXPR): Remove tree code definition. (VEC_WIDEN_MINUS_LO_EXPR): Remove tree code definition. --- gcc/doc/generic.texi | 31 ------------------------------- gcc/expr.cc | 6 ------ gcc/gimple-pretty-print.cc | 4 ---- gcc/optabs-tree.cc | 24 ------------------------ gcc/tree-cfg.cc | 6 ------ gcc/tree-inline.cc | 6 ------ gcc/tree-pretty-print.cc | 12 ------------ gcc/tree-vect-data-refs.cc | 8 +++----- gcc/tree-vect-generic.cc | 4 ---- gcc/tree-vect-patterns.cc | 36 +++++++++++++++++++++++++----------- gcc/tree-vect-stmts.cc | 18 ++---------------- gcc/tree.def | 6 ------ 12 files changed, 30 insertions(+), 131 deletions(-) diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi index e5f9d1be8ea81f3da002ec3bb925590d331a2551..344045efd419b0cc3a11771acf70d2fd279c48ac 100644 --- a/gcc/doc/generic.texi +++ b/gcc/doc/generic.texi @@ -1811,10 +1811,6 @@ a value from @code{enum annot_expr_kind}, the third is an @code{INTEGER_CST}. @tindex VEC_RSHIFT_EXPR @tindex VEC_WIDEN_MULT_HI_EXPR @tindex VEC_WIDEN_MULT_LO_EXPR -@tindex VEC_WIDEN_PLUS_HI_EXPR -@tindex VEC_WIDEN_PLUS_LO_EXPR -@tindex VEC_WIDEN_MINUS_HI_EXPR -@tindex VEC_WIDEN_MINUS_LO_EXPR @tindex VEC_UNPACK_HI_EXPR @tindex VEC_UNPACK_LO_EXPR @tindex VEC_UNPACK_FLOAT_HI_EXPR @@ -1861,33 +1857,6 @@ vector of @code{N/2} products. In the case of @code{VEC_WIDEN_MULT_LO_EXPR} the low @code{N/2} elements of the two vector are multiplied to produce the vector of @code{N/2} products. -@item VEC_WIDEN_PLUS_HI_EXPR -@itemx VEC_WIDEN_PLUS_LO_EXPR -These nodes represent widening vector addition of the high and low parts of -the two input vectors, respectively. Their operands are vectors that contain -the same number of elements (@code{N}) of the same integral type. 
The result -is a vector that contains half as many elements, of an integral type whose size -is twice as wide. In the case of @code{VEC_WIDEN_PLUS_HI_EXPR} the high -@code{N/2} elements of the two vectors are added to produce the vector of -@code{N/2} products. In the case of @code{VEC_WIDEN_PLUS_LO_EXPR} the low -@code{N/2} elements of the two vectors are added to produce the vector of -@code{N/2} products. - -@item VEC_WIDEN_MINUS_HI_EXPR -@itemx VEC_WIDEN_MINUS_LO_EXPR -These nodes represent widening vector subtraction of the high and low parts of -the two input vectors, respectively. Their operands are vectors that contain -the same number of elements (@code{N}) of the same integral type. The high/low -elements of the second vector are subtracted from the high/low elements of the -first. The result is a vector that contains half as many elements, of an -integral type whose size is twice as wide. In the case of -@code{VEC_WIDEN_MINUS_HI_EXPR} the high @code{N/2} elements of the second -vector are subtracted from the high @code{N/2} of the first to produce the -vector of @code{N/2} products. In the case of -@code{VEC_WIDEN_MINUS_LO_EXPR} the low @code{N/2} elements of the second -vector are subtracted from the low @code{N/2} of the first to produce the -vector of @code{N/2} products. - @item VEC_UNPACK_HI_EXPR @itemx VEC_UNPACK_LO_EXPR These nodes represent unpacking of the high and low parts of the input vector, diff --git a/gcc/expr.cc b/gcc/expr.cc index fb062dc847577ec9dc2c951330f4cfadcc869325..4e3655070400cee086c2fdc6ac5bbe08d303de5e 100644 --- a/gcc/expr.cc +++ b/gcc/expr.cc @@ -9371,8 +9371,6 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode, target, unsignedp); return target; - case WIDEN_PLUS_EXPR: - case WIDEN_MINUS_EXPR: case WIDEN_MULT_EXPR: /* If first operand is constant, swap them. 
Thus the following special case checks need only @@ -10150,10 +10148,6 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode, return temp; } - case VEC_WIDEN_PLUS_HI_EXPR: - case VEC_WIDEN_PLUS_LO_EXPR: - case VEC_WIDEN_MINUS_HI_EXPR: - case VEC_WIDEN_MINUS_LO_EXPR: case VEC_WIDEN_MULT_HI_EXPR: case VEC_WIDEN_MULT_LO_EXPR: case VEC_WIDEN_MULT_EVEN_EXPR: diff --git a/gcc/gimple-pretty-print.cc b/gcc/gimple-pretty-print.cc index ebd87b20a0adc080c4a8f9429e75f49b96e72f9a..2a1a5b7f811ca341e8ee7e85a9701d3a37ff80bf 100644 --- a/gcc/gimple-pretty-print.cc +++ b/gcc/gimple-pretty-print.cc @@ -459,10 +459,6 @@ dump_binary_rhs (pretty_printer *buffer, const gassign *gs, int spc, case VEC_PACK_FLOAT_EXPR: case VEC_WIDEN_LSHIFT_HI_EXPR: case VEC_WIDEN_LSHIFT_LO_EXPR: - case VEC_WIDEN_PLUS_HI_EXPR: - case VEC_WIDEN_PLUS_LO_EXPR: - case VEC_WIDEN_MINUS_HI_EXPR: - case VEC_WIDEN_MINUS_LO_EXPR: case VEC_SERIES_EXPR: for (p = get_tree_code_name (code); *p; p++) pp_character (buffer, TOUPPER (*p)); diff --git a/gcc/optabs-tree.cc b/gcc/optabs-tree.cc index 8383fe820b080f6d66f83dcf3b77d3c9f869f4bc..2f5f93dc6624f86f6b5618cf6e7aa2b508053a64 100644 --- a/gcc/optabs-tree.cc +++ b/gcc/optabs-tree.cc @@ -190,22 +190,6 @@ optab_for_tree_code (enum tree_code code, const_tree type, return (TYPE_UNSIGNED (type) ? vec_widen_ushiftl_lo_optab : vec_widen_sshiftl_lo_optab); - case VEC_WIDEN_PLUS_LO_EXPR: - return (TYPE_UNSIGNED (type) - ? vec_widen_uaddl_lo_optab : vec_widen_saddl_lo_optab); - - case VEC_WIDEN_PLUS_HI_EXPR: - return (TYPE_UNSIGNED (type) - ? vec_widen_uaddl_hi_optab : vec_widen_saddl_hi_optab); - - case VEC_WIDEN_MINUS_LO_EXPR: - return (TYPE_UNSIGNED (type) - ? vec_widen_usubl_lo_optab : vec_widen_ssubl_lo_optab); - - case VEC_WIDEN_MINUS_HI_EXPR: - return (TYPE_UNSIGNED (type) - ? vec_widen_usubl_hi_optab : vec_widen_ssubl_hi_optab); - case VEC_UNPACK_HI_EXPR: return (TYPE_UNSIGNED (type) ? 
vec_unpacku_hi_optab : vec_unpacks_hi_optab); @@ -312,8 +296,6 @@ optab_for_tree_code (enum tree_code code, const_tree type, 'hi'/'lo' pair using codes such as VEC_WIDEN_MINUS_HI/LO. Supported widening operations: - WIDEN_MINUS_EXPR - WIDEN_PLUS_EXPR WIDEN_MULT_EXPR WIDEN_LSHIFT_EXPR @@ -345,12 +327,6 @@ supportable_half_widening_operation (enum tree_code code, tree vectype_out, case WIDEN_LSHIFT_EXPR: *code1 = LSHIFT_EXPR; break; - case WIDEN_MINUS_EXPR: - *code1 = MINUS_EXPR; - break; - case WIDEN_PLUS_EXPR: - *code1 = PLUS_EXPR; - break; case WIDEN_MULT_EXPR: *code1 = MULT_EXPR; break; diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc index 8de1b144a426776bf464765477c71ee8f2e52b81..46eed1e1f22052fc077f2fc25e5be627bce541b6 100644 --- a/gcc/tree-cfg.cc +++ b/gcc/tree-cfg.cc @@ -3948,8 +3948,6 @@ verify_gimple_assign_binary (gassign *stmt) return false; } - case WIDEN_PLUS_EXPR: - case WIDEN_MINUS_EXPR: case PLUS_EXPR: case MINUS_EXPR: { @@ -4070,10 +4068,6 @@ verify_gimple_assign_binary (gassign *stmt) return false; } - case VEC_WIDEN_MINUS_HI_EXPR: - case VEC_WIDEN_MINUS_LO_EXPR: - case VEC_WIDEN_PLUS_HI_EXPR: - case VEC_WIDEN_PLUS_LO_EXPR: case VEC_WIDEN_MULT_HI_EXPR: case VEC_WIDEN_MULT_LO_EXPR: case VEC_WIDEN_MULT_EVEN_EXPR: diff --git a/gcc/tree-inline.cc b/gcc/tree-inline.cc index 043e1d5987a4c4b0159109dafb85a805ca828c1e..c0bebb7f4de36838341ed62389ad0e2b79f03034 100644 --- a/gcc/tree-inline.cc +++ b/gcc/tree-inline.cc @@ -4288,8 +4288,6 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights, case REALIGN_LOAD_EXPR: - case WIDEN_PLUS_EXPR: - case WIDEN_MINUS_EXPR: case WIDEN_SUM_EXPR: case WIDEN_MULT_EXPR: case DOT_PROD_EXPR: @@ -4298,10 +4296,6 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights, case WIDEN_MULT_MINUS_EXPR: case WIDEN_LSHIFT_EXPR: - case VEC_WIDEN_PLUS_HI_EXPR: - case VEC_WIDEN_PLUS_LO_EXPR: - case VEC_WIDEN_MINUS_HI_EXPR: - case VEC_WIDEN_MINUS_LO_EXPR: case VEC_WIDEN_MULT_HI_EXPR: case VEC_WIDEN_MULT_LO_EXPR: 
case VEC_WIDEN_MULT_EVEN_EXPR: diff --git a/gcc/tree-pretty-print.cc b/gcc/tree-pretty-print.cc index 6acd394a0790ad2ad989f195a3288f0f0a8cc489..53ca62dc1a6873ae9365f199061bde9edd486196 100644 --- a/gcc/tree-pretty-print.cc +++ b/gcc/tree-pretty-print.cc @@ -2825,8 +2825,6 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags, break; /* Binary arithmetic and logic expressions. */ - case WIDEN_PLUS_EXPR: - case WIDEN_MINUS_EXPR: case WIDEN_SUM_EXPR: case WIDEN_MULT_EXPR: case MULT_EXPR: @@ -3790,10 +3788,6 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags, case VEC_SERIES_EXPR: case VEC_WIDEN_MULT_HI_EXPR: case VEC_WIDEN_MULT_LO_EXPR: - case VEC_WIDEN_PLUS_HI_EXPR: - case VEC_WIDEN_PLUS_LO_EXPR: - case VEC_WIDEN_MINUS_HI_EXPR: - case VEC_WIDEN_MINUS_LO_EXPR: case VEC_WIDEN_MULT_EVEN_EXPR: case VEC_WIDEN_MULT_ODD_EXPR: case VEC_WIDEN_LSHIFT_HI_EXPR: @@ -4311,12 +4305,6 @@ op_symbol_code (enum tree_code code) case WIDEN_LSHIFT_EXPR: return "w<<"; - case WIDEN_PLUS_EXPR: - return "w+"; - - case WIDEN_MINUS_EXPR: - return "w-"; - case POINTER_PLUS_EXPR: return "+"; diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc index d20a10a1524164eef788ab4b88ba57c7a09c3387..98dd56ff022233ccead36a1f5a5e896e352f9f5b 100644 --- a/gcc/tree-vect-data-refs.cc +++ b/gcc/tree-vect-data-refs.cc @@ -136,8 +136,6 @@ vect_get_smallest_scalar_type (stmt_vec_info stmt_info, tree scalar_type) || gimple_assign_rhs_code (assign) == WIDEN_SUM_EXPR || gimple_assign_rhs_code (assign) == WIDEN_MULT_EXPR || gimple_assign_rhs_code (assign) == WIDEN_LSHIFT_EXPR - || gimple_assign_rhs_code (assign) == WIDEN_PLUS_EXPR - || gimple_assign_rhs_code (assign) == WIDEN_MINUS_EXPR || gimple_assign_rhs_code (assign) == FLOAT_EXPR) { tree rhs_type = TREE_TYPE (gimple_assign_rhs1 (assign)); @@ -3172,8 +3170,8 @@ vect_analyze_data_ref_accesses (vec_info *vinfo, break; /* Check that the DR_INITs are compile-time constants. 
*/ - if (!tree_fits_shwi_p (DR_INIT (dra)) - || !tree_fits_shwi_p (DR_INIT (drb))) + if (TREE_CODE (DR_INIT (dra)) != INTEGER_CST + || TREE_CODE (DR_INIT (drb)) != INTEGER_CST) break; /* Different .GOMP_SIMD_LANE calls still give the same lane, @@ -3225,7 +3223,7 @@ vect_analyze_data_ref_accesses (vec_info *vinfo, unsigned HOST_WIDE_INT step = absu_hwi (tree_to_shwi (DR_STEP (dra))); if (step != 0 - && step <= ((unsigned HOST_WIDE_INT)init_b - init_a)) + && step <= (unsigned HOST_WIDE_INT)(init_b - init_a)) break; } } diff --git a/gcc/tree-vect-generic.cc b/gcc/tree-vect-generic.cc index 92aba5d4af61dd478ec3f1b94854e4ad84166774..5823b08baf70b89b22ecc148b0702a84671ad084 100644 --- a/gcc/tree-vect-generic.cc +++ b/gcc/tree-vect-generic.cc @@ -2209,10 +2209,6 @@ expand_vector_operations_1 (gimple_stmt_iterator *gsi, arguments, not the widened result. VEC_UNPACK_FLOAT_*_EXPR is calculated in the same way above. */ if (code == WIDEN_SUM_EXPR - || code == VEC_WIDEN_PLUS_HI_EXPR - || code == VEC_WIDEN_PLUS_LO_EXPR - || code == VEC_WIDEN_MINUS_HI_EXPR - || code == VEC_WIDEN_MINUS_LO_EXPR || code == VEC_WIDEN_MULT_HI_EXPR || code == VEC_WIDEN_MULT_LO_EXPR || code == VEC_WIDEN_MULT_EVEN_EXPR diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc index 268f5402fcdd5ec5bfb806db8c410e701c771275..64a0bde05bfcf62d2dc4bd18a9b6f1cb5f8698b5 100644 --- a/gcc/tree-vect-patterns.cc +++ b/gcc/tree-vect-patterns.cc @@ -559,21 +559,29 @@ vect_joust_widened_type (tree type, tree new_type, tree *common_type) static unsigned int vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code, - tree_code widened_code, bool shift_p, + code_helper widened_code, bool shift_p, unsigned int max_nops, vect_unpromoted_value *unprom, tree *common_type, enum optab_subtype *subtype = NULL) { /* Check for an integer operation with the right code. 
*/ - gassign *assign = dyn_cast <gassign *> (stmt_info->stmt); - if (!assign) + gimple* stmt = stmt_info->stmt; + if (!(is_gimple_assign (stmt) || is_gimple_call (stmt))) return 0; - tree_code rhs_code = gimple_assign_rhs_code (assign); - if (rhs_code != code && rhs_code != widened_code) + code_helper rhs_code; + if (is_gimple_assign (stmt)) + rhs_code = gimple_assign_rhs_code (stmt); + else + rhs_code = gimple_call_combined_fn (stmt); + + if (rhs_code.safe_as_tree_code () != code + && rhs_code.get_rep () != widened_code.get_rep ()) return 0; - tree type = TREE_TYPE (gimple_assign_lhs (assign)); + tree lhs = is_gimple_assign (stmt) ? gimple_assign_lhs (stmt): + gimple_call_lhs (stmt); + tree type = TREE_TYPE (lhs); if (!INTEGRAL_TYPE_P (type)) return 0; @@ -586,7 +594,11 @@ vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code, { vect_unpromoted_value *this_unprom = &unprom[next_op]; unsigned int nops = 1; - tree op = gimple_op (assign, i + 1); + tree op; + if (is_gimple_assign (stmt)) + op = gimple_op (stmt, i + 1); + else + op = gimple_call_arg (stmt, i); if (i == 1 && TREE_CODE (op) == INTEGER_CST) { /* We already have a common type from earlier operands. @@ -1299,8 +1311,9 @@ vect_recog_sad_pattern (vec_info *vinfo, /* FORNOW. Can continue analyzing the def-use chain when this stmt in a phi inside the loop (in case we are analyzing an outer-loop). 
*/ vect_unpromoted_value unprom[2]; - if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR, WIDEN_MINUS_EXPR, - false, 2, unprom, &half_type)) + if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR, + CFN_VEC_WIDEN_MINUS, false, 2, unprom, + &half_type)) return NULL; vect_pattern_detected ("vect_recog_sad_pattern", last_stmt); @@ -2339,9 +2352,10 @@ vect_recog_average_pattern (vec_info *vinfo, internal_fn ifn = IFN_AVG_FLOOR; vect_unpromoted_value unprom[3]; tree new_type; + enum optab_subtype subtype; unsigned int nops = vect_widened_op_tree (vinfo, plus_stmt_info, PLUS_EXPR, - WIDEN_PLUS_EXPR, false, 3, - unprom, &new_type); + CFN_VEC_WIDEN_PLUS, false, 3, + unprom, &new_type, &subtype); if (nops == 0) return NULL; if (nops == 3) diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index 9af0d107fdafb959db10d87e4e0ba5fda4e47bd7..a2c0947864a64aea7c19a9c2bc53961821838e00 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -4898,9 +4898,7 @@ vectorizable_conversion (vec_info *vinfo, if (res_op.code.is_tree_code ()) { - widen_arith = (code == WIDEN_PLUS_EXPR - || code == WIDEN_MINUS_EXPR - || code == WIDEN_MULT_EXPR + widen_arith = (code == WIDEN_MULT_EXPR || code == WIDEN_LSHIFT_EXPR); } else @@ -4954,8 +4952,6 @@ vectorizable_conversion (vec_info *vinfo, { gcc_assert (code == WIDEN_MULT_EXPR || code == WIDEN_LSHIFT_EXPR - || code == WIDEN_PLUS_EXPR - || code == WIDEN_MINUS_EXPR || code == IFN_VEC_WIDEN_PLUS || code == IFN_VEC_WIDEN_MINUS); @@ -11981,7 +11977,7 @@ supportable_widening_operation (vec_info *vinfo, class loop *vect_loop = NULL; machine_mode vec_mode; enum insn_code icode1, icode2; - optab optab1, optab2; + optab optab1 = unknown_optab, optab2 = unknown_optab; tree vectype = vectype_in; tree wide_vectype = vectype_out; code_helper c1 = MAX_TREE_CODES, c2 = MAX_TREE_CODES; @@ -12075,16 +12071,6 @@ supportable_widening_operation (vec_info *vinfo, c2 = VEC_WIDEN_LSHIFT_HI_EXPR; break; - case WIDEN_PLUS_EXPR: - c1 = 
VEC_WIDEN_PLUS_LO_EXPR; - c2 = VEC_WIDEN_PLUS_HI_EXPR; - break; - - case WIDEN_MINUS_EXPR: - c1 = VEC_WIDEN_MINUS_LO_EXPR; - c2 = VEC_WIDEN_MINUS_HI_EXPR; - break; - CASE_CONVERT: c1 = VEC_UNPACK_LO_EXPR; c2 = VEC_UNPACK_HI_EXPR; diff --git a/gcc/tree.def b/gcc/tree.def index 62650b6934b337c5d56e5393dc114173d72c9aa9..9b2dce3576440c445d3240b9ed937fe67c9a5992 100644 --- a/gcc/tree.def +++ b/gcc/tree.def @@ -1383,8 +1383,6 @@ DEFTREECODE (WIDEN_MULT_MINUS_EXPR, "widen_mult_minus_expr", tcc_expression, 3) the first argument from type t1 to type t2, and then shifting it by the second argument. */ DEFTREECODE (WIDEN_LSHIFT_EXPR, "widen_lshift_expr", tcc_binary, 2) -DEFTREECODE (WIDEN_PLUS_EXPR, "widen_plus_expr", tcc_binary, 2) -DEFTREECODE (WIDEN_MINUS_EXPR, "widen_minus_expr", tcc_binary, 2) /* Widening vector multiplication. The two operands are vectors with N elements of size S. Multiplying the @@ -1449,10 +1447,6 @@ DEFTREECODE (VEC_PACK_FLOAT_EXPR, "vec_pack_float_expr", tcc_binary, 2) */ DEFTREECODE (VEC_WIDEN_LSHIFT_HI_EXPR, "widen_lshift_hi_expr", tcc_binary, 2) DEFTREECODE (VEC_WIDEN_LSHIFT_LO_EXPR, "widen_lshift_lo_expr", tcc_binary, 2) -DEFTREECODE (VEC_WIDEN_PLUS_HI_EXPR, "widen_plus_hi_expr", tcc_binary, 2) -DEFTREECODE (VEC_WIDEN_PLUS_LO_EXPR, "widen_plus_lo_expr", tcc_binary, 2) -DEFTREECODE (VEC_WIDEN_MINUS_HI_EXPR, "widen_minus_hi_expr", tcc_binary, 2) -DEFTREECODE (VEC_WIDEN_MINUS_LO_EXPR, "widen_minus_lo_expr", tcc_binary, 2) /* PREDICT_EXPR. Specify hint for branch prediction. The PREDICT_EXPR_PREDICTOR specify predictor and PREDICT_EXPR_OUTCOME the -- 2.17.1 ^ permalink raw reply [flat|nested] 53+ messages in thread
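[Editorial note: the generic.texi text deleted by the patch above describes what VEC_WIDEN_PLUS_HI/LO compute, and the replacement IFN_VEC_WIDEN_PLUS lo/hi internal functions carry the same semantics. A purely illustrative scalar model is sketched below — this is not GCC-internal code, and the function names are invented for the example:]

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scalar model of widening vector addition: given two input vectors of
// N narrow elements, the "lo" operation widens and adds the low N/2
// element pairs, and the "hi" operation the high N/2 pairs.  Each
// result element is twice as wide as the inputs, so the sums do not
// wrap at the narrow width (e.g. 250 + 10 yields 260, not 4).
static void
widen_plus_lo (const uint8_t *a, const uint8_t *b, uint16_t *out, size_t n)
{
  for (size_t i = 0; i < n / 2; i++)
    out[i] = (uint16_t) a[i] + (uint16_t) b[i];
}

static void
widen_plus_hi (const uint8_t *a, const uint8_t *b, uint16_t *out, size_t n)
{
  for (size_t i = 0; i < n / 2; i++)
    out[i] = (uint16_t) a[n / 2 + i] + (uint16_t) b[n / 2 + i];
}
```

On AArch64 the lo/hi pair maps onto instructions such as uaddl/uaddl2, which is what the scan-assembler tests in vect-widen-add.c check for.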
* Re: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns 2022-06-06 17:20 ` Joel Hutton @ 2022-06-07 8:18 ` Richard Sandiford 2022-06-07 9:01 ` Joel Hutton 2022-06-13 9:18 ` [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns Richard Biener 0 siblings, 2 replies; 53+ messages in thread From: Richard Sandiford @ 2022-06-07 8:18 UTC (permalink / raw) To: Joel Hutton; +Cc: Richard Biener, gcc-patches Joel Hutton <Joel.Hutton@arm.com> writes: >> > Patches attached. They already incorporated the .cc rename, now >> > rebased to be after the change to tree.h >> >> @@ -1412,8 +1412,7 @@ vect_recog_widen_op_pattern (vec_info *vinfo, >> 2, oprnd, half_type, unprom, vectype); >> >> tree var = vect_recog_temp_ssa_var (itype, NULL); >> - gimple *pattern_stmt = gimple_build_assign (var, wide_code, >> - oprnd[0], oprnd[1]); >> + gimple *pattern_stmt = gimple_build (var, wide_code, oprnd[0], >> oprnd[1]); >> >> >> you should be able to do without the new gimple_build overload >> by using >> >> gimple_seq stmts = NULL; >> gimple_build (&stmts, wide_code, itype, oprnd[0], oprnd[1]); >> gimple *pattern_stmt = gimple_seq_last_stmt (stmts); >> >> because 'gimple_build' is an existing API. > > Done. > > The gimple_build overload was at the request of Richard Sandiford, I assume removing it is ok with you Richard S? > From Richard Sandiford: >> For example, I think we should hide this inside a new: >> >> gimple_build (var, wide_code, oprnd[0], oprnd[1]); >> >> that works directly on code_helper, similarly to the new code_helper >> gimple_build interfaces. I thought the potential problem with the above is that gimple_build is a folding interface, so in principle it's allowed to return an existing SSA_NAME set by an existing statement (or even a constant). I think in this context we do need to force a new statement to be created. 
Of course, the hope is that there wouldn't still be such folding opportunities at this stage, but I don't think it's guaranteed (especially with options fuzzing). Since I was mentioned :-) ... Could you run the patch through contrib/check_GNU_style.py? There seem to be a few long lines. > + if (res_op.code.is_tree_code ()) Do you need this is_tree_code ()? These comparisons…
> @@ -12205,7 +12258,7 @@ supportable_widening_operation (vec_info *vinfo, > bool > supportable_narrowing_operation (enum tree_code code, > tree vectype_out, tree vectype_in, > - enum tree_code *code1, int *multi_step_cvt, > + tree_code* _code1, int *multi_step_cvt, The original formatting (space before the “*”) was correct. Names beginning with _ are reserved, so I think we need a different name here. Also, the name in the comment should stay in sync with the name in the code. That said though, I'm not sure… > vec<tree> *interm_types) > { > machine_mode vec_mode; > @@ -12217,8 +12270,8 @@ supportable_narrowing_operation (enum tree_code code, > tree intermediate_type, prev_type; > machine_mode intermediate_mode, prev_mode; > int i; > - unsigned HOST_WIDE_INT n_elts; > bool uns; > + tree_code * code1 = (tree_code*) _code1; …the combination of these two changes makes sense on their own. > > *multi_step_cvt = 0; > switch (code) > @@ -12227,9 +12280,8 @@ supportable_narrowing_operation (enum tree_code code, > c1 = VEC_PACK_TRUNC_EXPR; > if (VECTOR_BOOLEAN_TYPE_P (narrow_vectype) > && VECTOR_BOOLEAN_TYPE_P (vectype) > - && SCALAR_INT_MODE_P (TYPE_MODE (vectype)) > - && TYPE_VECTOR_SUBPARTS (vectype).is_constant (&n_elts) > - && n_elts < BITS_PER_UNIT) > + && TYPE_MODE (narrow_vectype) == TYPE_MODE (vectype) > + && SCALAR_INT_MODE_P (TYPE_MODE (vectype))) > optab1 = vec_pack_sbool_trunc_optab; > else > optab1 = optab_for_tree_code (c1, vectype, optab_default); > @@ -12320,9 +12372,8 @@ supportable_narrowing_operation (enum tree_code code, > = lang_hooks.types.type_for_mode (intermediate_mode, uns); > if (VECTOR_BOOLEAN_TYPE_P (intermediate_type) > && VECTOR_BOOLEAN_TYPE_P (prev_type) > - && SCALAR_INT_MODE_P (prev_mode) > - && TYPE_VECTOR_SUBPARTS (intermediate_type).is_constant (&n_elts) > - && n_elts < BITS_PER_UNIT) > + && intermediate_mode == prev_mode > + && SCALAR_INT_MODE_P (prev_mode)) > interm_optab = vec_pack_sbool_trunc_optab; > else > interm_optab This part 
looks like a behavioural change, so I think it should be part of a separate patch. > diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h > index 642eb0aeb21264cd736a479b1ec25357abef29cd..50ff8eeac1e6b9859302bd784f10ee3d8ff4b4dc 100644 > --- a/gcc/tree-vectorizer.h > +++ b/gcc/tree-vectorizer.h > @@ -2120,13 +2120,12 @@ extern bool vect_is_simple_use (vec_info *, stmt_vec_info, slp_tree, > enum vect_def_type *, > tree *, stmt_vec_info * = NULL); > extern bool vect_maybe_update_slp_op_vectype (slp_tree, tree); > -extern bool supportable_widening_operation (vec_info *, > - enum tree_code, stmt_vec_info, > - tree, tree, enum tree_code *, > - enum tree_code *, int *, > - vec<tree> *); > +extern bool supportable_widening_operation (vec_info*, code_helper, > + stmt_vec_info, tree, tree, > + code_helper*, code_helper*, > + int*, vec<tree> *); > extern bool supportable_narrowing_operation (enum tree_code, tree, tree, > - enum tree_code *, int *, > + tree_code *, int *, > vec<tree> *); > > extern unsigned record_stmt_cost (stmt_vector_for_cost *, int, > diff --git a/gcc/tree.h b/gcc/tree.h > index f84958933d51144bb6ce7cc41eca5f7f06814550..00b0e4d1c696633fe38baad5295b1f90398d53fc 100644 > --- a/gcc/tree.h > +++ b/gcc/tree.h > @@ -92,6 +92,10 @@ public: > bool is_fn_code () const { return rep < 0; } > bool is_internal_fn () const; > bool is_builtin_fn () const; > + enum tree_code safe_as_tree_code () const { return is_tree_code () ? > + (tree_code)* this : MAX_TREE_CODES; } > + combined_fn safe_as_fn_code () const { return is_fn_code () ? (combined_fn) *this > + : CFN_LAST;} Since these don't fit on a line, the coding convention says that they should be defined outside of the class. Thanks, Richard > int get_rep () const { return rep; } > bool operator== (const code_helper &other) { return rep == other.rep; } > bool operator!= (const code_helper &other) { return rep != other.rep; } ^ permalink raw reply [flat|nested] 53+ messages in thread
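[Editorial note: the tree.h hunk under review adds `safe_as_tree_code`/`safe_as_fn_code` accessors to code_helper. A simplified stand-in is sketched below to show the idea — a hypothetical model, not the real tree.h class; the rep layout and enum values are invented for the example:]

```cpp
// Model of code_helper: one int holds either a tree_code (non-negative
// rep) or a function code (negative rep).  The safe_as_* accessors
// return a sentinel value instead of asserting when the stored variant
// does not match, so callers can compare against either kind of code
// without switching on the variant first.
enum tree_code { PLUS_EXPR = 1, MINUS_EXPR = 2, MAX_TREE_CODES = 1000 };
enum combined_fn { CFN_VEC_WIDEN_PLUS = 1, CFN_VEC_WIDEN_MINUS = 2,
		   CFN_LAST = 500 };

class code_helper
{
  int rep;
public:
  code_helper (tree_code c) : rep ((int) c) {}
  code_helper (combined_fn f) : rep (-(int) f) {}
  bool is_tree_code () const { return rep >= 0; }
  bool is_fn_code () const { return rep < 0; }
  tree_code safe_as_tree_code () const
  { return is_tree_code () ? (tree_code) rep : MAX_TREE_CODES; }
  combined_fn safe_as_fn_code () const
  { return is_fn_code () ? (combined_fn) -rep : CFN_LAST; }
  int get_rep () const { return rep; }
};
```

This is why `rhs_code.safe_as_tree_code () != code` in the patch is safe even when rhs_code wraps an internal function: the sentinel MAX_TREE_CODES simply fails the comparison.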
* RE: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns 2022-06-07 8:18 ` Richard Sandiford @ 2022-06-07 9:01 ` Joel Hutton 2022-06-09 14:03 ` Joel Hutton 2022-06-13 9:02 ` Richard Biener 2022-06-13 9:18 ` [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns Richard Biener 1 sibling, 2 replies; 53+ messages in thread From: Joel Hutton @ 2022-06-07 9:01 UTC (permalink / raw) To: Richard Sandiford; +Cc: Richard Biener, gcc-patches Thanks Richard, > I thought the potential problem with the above is that gimple_build is a > folding interface, so in principle it's allowed to return an existing SSA_NAME > set by an existing statement (or even a constant). > I think in this context we do need to force a new statement to be created. Before I make any changes, I'd like to check we're all on the same page. richi, are you ok with the gimple_build function, perhaps with a different name if you are concerned with overloading? we could use gimple_ch_build or gimple_code_helper_build? Similarly are you ok with the use of gimple_extract_op? I would lean towards using it as it is cleaner, but I don't have strong feelings. Joel > -----Original Message----- > From: Richard Sandiford <richard.sandiford@arm.com> > Sent: 07 June 2022 09:18 > To: Joel Hutton <Joel.Hutton@arm.com> > Cc: Richard Biener <rguenther@suse.de>; gcc-patches@gcc.gnu.org > Subject: Re: [ping][vect-patterns] Refactor widen_plus/widen_minus as > internal_fns > > Joel Hutton <Joel.Hutton@arm.com> writes: > >> > Patches attached. 
They already incorporated the .cc rename, now > >> > rebased to be after the change to tree.h > >> > >> @@ -1412,8 +1412,7 @@ vect_recog_widen_op_pattern (vec_info *vinfo, > >> 2, oprnd, half_type, unprom, vectype); > >> > >> tree var = vect_recog_temp_ssa_var (itype, NULL); > >> - gimple *pattern_stmt = gimple_build_assign (var, wide_code, > >> - oprnd[0], oprnd[1]); > >> + gimple *pattern_stmt = gimple_build (var, wide_code, oprnd[0], > >> oprnd[1]); > >> > >> > >> you should be able to do without the new gimple_build overload by > >> using > >> > >> gimple_seq stmts = NULL; > >> gimple_build (&stmts, wide_code, itype, oprnd[0], oprnd[1]); > >> gimple *pattern_stmt = gimple_seq_last_stmt (stmts); > >> > >> because 'gimple_build' is an existing API. > > > > Done. > > > > The gimple_build overload was at the request of Richard Sandiford, I > assume removing it is ok with you Richard S? > > From Richard Sandiford: > >> For example, I think we should hide this inside a new: > >> > >> gimple_build (var, wide_code, oprnd[0], oprnd[1]); > >> > >> that works directly on code_helper, similarly to the new code_helper > >> gimple_build interfaces. > > I thought the potential problem with the above is that gimple_build is a > folding interface, so in principle it's allowed to return an existing SSA_NAME > set by an existing statement (or even a constant). > I think in this context we do need to force a new statement to be created. > > Of course, the hope is that there wouldn't still be such folding opportunities > at this stage, but I don't think it's guaranteed (especially with options > fuzzing). > > Sind I was mentioned :-) ... > > Could you run the patch through contrib/check_GNU_style.py? > There seem to be a few long lines. > > > + if (res_op.code.is_tree_code ()) > > Do you need this is_tree_code ()? 
These comparisons… > > > + { > > + widen_arith = (code == WIDEN_PLUS_EXPR > > + || code == WIDEN_MINUS_EXPR > > + || code == WIDEN_MULT_EXPR > > + || code == WIDEN_LSHIFT_EXPR); > > …ought to be safe unconditionally. > > > + } > > + else > > + widen_arith = false; > > + > > + if (!widen_arith > > + && !CONVERT_EXPR_CODE_P (code) > > + && code != FIX_TRUNC_EXPR > > + && code != FLOAT_EXPR) > > + return false; > > > > /* Check types of lhs and rhs. */ > > - scalar_dest = gimple_assign_lhs (stmt); > > + scalar_dest = gimple_get_lhs (stmt); > > lhs_type = TREE_TYPE (scalar_dest); > > vectype_out = STMT_VINFO_VECTYPE (stmt_info); > > > > @@ -4938,10 +4951,14 @@ vectorizable_conversion (vec_info *vinfo, > > > > if (op_type == binary_op) > > { > > - gcc_assert (code == WIDEN_MULT_EXPR || code == > WIDEN_LSHIFT_EXPR > > - || code == WIDEN_PLUS_EXPR || code == > WIDEN_MINUS_EXPR); > > + gcc_assert (code == WIDEN_MULT_EXPR > > + || code == WIDEN_LSHIFT_EXPR > > + || code == WIDEN_PLUS_EXPR > > + || code == WIDEN_MINUS_EXPR); > > > > - op1 = gimple_assign_rhs2 (stmt); > > + > > + op1 = is_gimple_assign (stmt) ? gimple_assign_rhs2 (stmt) : > > + gimple_call_arg (stmt, 0); > > tree vectype1_in; > > if (!vect_is_simple_use (vinfo, stmt_info, slp_node, 1, > > &op1, &slp_op1, &dt[1], &vectype1_in)) […] @@ > -12181,7 > > +12235,6 @@ supportable_widening_operation (vec_info *vinfo, > > return false; > > } > > > > - > > /* Function supportable_narrowing_operation > > > > Check whether an operation represented by the code CODE is a > > Seems like a spurious change. > > > @@ -12205,7 +12258,7 @@ supportable_widening_operation (vec_info > > *vinfo, bool supportable_narrowing_operation (enum tree_code code, > > tree vectype_out, tree vectype_in, > > - enum tree_code *code1, int *multi_step_cvt, > > + tree_code* _code1, int *multi_step_cvt, > > The original formatting (space before the “*”) was correct. > Names beginning with _ are reserved, so I think we need a different > name here. 
Also, the name in the comment should stay in sync with > the name in the code. > > That said though, I'm not sure… > > > vec<tree> *interm_types) > > { > > machine_mode vec_mode; > > @@ -12217,8 +12270,8 @@ supportable_narrowing_operation (enum > tree_code code, > > tree intermediate_type, prev_type; > > machine_mode intermediate_mode, prev_mode; > > int i; > > - unsigned HOST_WIDE_INT n_elts; > > bool uns; > > + tree_code * code1 = (tree_code*) _code1; > > …the combination of these two changes makes sense on their own. > > > > > *multi_step_cvt = 0; > > switch (code) > > @@ -12227,9 +12280,8 @@ supportable_narrowing_operation (enum > tree_code code, > > c1 = VEC_PACK_TRUNC_EXPR; > > if (VECTOR_BOOLEAN_TYPE_P (narrow_vectype) > > && VECTOR_BOOLEAN_TYPE_P (vectype) > > - && SCALAR_INT_MODE_P (TYPE_MODE (vectype)) > > - && TYPE_VECTOR_SUBPARTS (vectype).is_constant (&n_elts) > > - && n_elts < BITS_PER_UNIT) > > + && TYPE_MODE (narrow_vectype) == TYPE_MODE (vectype) > > + && SCALAR_INT_MODE_P (TYPE_MODE (vectype))) > > optab1 = vec_pack_sbool_trunc_optab; > > else > > optab1 = optab_for_tree_code (c1, vectype, optab_default); > > @@ -12320,9 +12372,8 @@ supportable_narrowing_operation (enum > tree_code code, > > = lang_hooks.types.type_for_mode (intermediate_mode, uns); > > if (VECTOR_BOOLEAN_TYPE_P (intermediate_type) > > && VECTOR_BOOLEAN_TYPE_P (prev_type) > > - && SCALAR_INT_MODE_P (prev_mode) > > - && TYPE_VECTOR_SUBPARTS (intermediate_type).is_constant > (&n_elts) > > - && n_elts < BITS_PER_UNIT) > > + && intermediate_mode == prev_mode > > + && SCALAR_INT_MODE_P (prev_mode)) > > interm_optab = vec_pack_sbool_trunc_optab; > > else > > interm_optab > > This part looks like a behavioural change, so I think it should be part > of a separate patch. 
> > > diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h > > index > 642eb0aeb21264cd736a479b1ec25357abef29cd..50ff8eeac1e6b9859302bd78 > 4f10ee3d8ff4b4dc 100644 > > --- a/gcc/tree-vectorizer.h > > +++ b/gcc/tree-vectorizer.h > > @@ -2120,13 +2120,12 @@ extern bool vect_is_simple_use (vec_info *, > stmt_vec_info, slp_tree, > > enum vect_def_type *, > > tree *, stmt_vec_info * = NULL); > > extern bool vect_maybe_update_slp_op_vectype (slp_tree, tree); > > -extern bool supportable_widening_operation (vec_info *, > > - enum tree_code, stmt_vec_info, > > - tree, tree, enum tree_code *, > > - enum tree_code *, int *, > > - vec<tree> *); > > +extern bool supportable_widening_operation (vec_info*, code_helper, > > + stmt_vec_info, tree, tree, > > + code_helper*, code_helper*, > > + int*, vec<tree> *); > > extern bool supportable_narrowing_operation (enum tree_code, tree, > tree, > > - enum tree_code *, int *, > > + tree_code *, int *, > > vec<tree> *); > > > > extern unsigned record_stmt_cost (stmt_vector_for_cost *, int, > > diff --git a/gcc/tree.h b/gcc/tree.h > > index > f84958933d51144bb6ce7cc41eca5f7f06814550..00b0e4d1c696633fe38baad5 > 295b1f90398d53fc 100644 > > --- a/gcc/tree.h > > +++ b/gcc/tree.h > > @@ -92,6 +92,10 @@ public: > > bool is_fn_code () const { return rep < 0; } > > bool is_internal_fn () const; > > bool is_builtin_fn () const; > > + enum tree_code safe_as_tree_code () const { return is_tree_code () ? > > + (tree_code)* this : MAX_TREE_CODES; } > > + combined_fn safe_as_fn_code () const { return is_fn_code () ? > (combined_fn) *this > > + : CFN_LAST;} > > Since these don't fit on a line, the coding convention says that they > should be defined outside of the class. > > Thanks, > Richard > > > int get_rep () const { return rep; } > > bool operator== (const code_helper &other) { return rep == other.rep; } > > bool operator!= (const code_helper &other) { return rep != other.rep; } ^ permalink raw reply [flat|nested] 53+ messages in thread
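[The `code_helper` accessors quoted at the end of the message above can be illustrated with a self-contained mock. The encoding below — non-negative `rep` holds a `tree_code`, negative `rep` a `combined_fn` — is an assumption for illustration, not GCC's exact representation; the point it demonstrates is the review comment: the `safe_as_*` bodies do not fit on one line, so the coding convention wants them defined outside the class, and they return a sentinel (`MAX_TREE_CODES` / `CFN_LAST`) when the discriminant does not match rather than asserting.]

```cpp
#include <cassert>

/* Simplified stand-ins for GCC's enums (assumed shapes, not the real ones). */
enum tree_code { NOP_EXPR, PLUS_EXPR, WIDEN_PLUS_EXPR, MAX_TREE_CODES };
enum combined_fn { CFN_VEC_WIDEN_PLUS, CFN_LAST };

/* Minimal mock of the code_helper wrapper discussed above: one int whose
   sign records which enum it currently holds.  The negative encoding for
   function codes is a guess made for illustration.  */
class code_helper
{
public:
  code_helper (tree_code c) : rep ((int) c) {}
  code_helper (combined_fn fn) : rep (-(int) fn - 1) {}
  bool is_tree_code () const { return rep >= 0; }
  bool is_fn_code () const { return rep < 0; }
  /* Declared here, defined below: the bodies do not fit on one line, so
     per the review comment they live outside the class.  */
  tree_code safe_as_tree_code () const;
  combined_fn safe_as_fn_code () const;
private:
  int rep;
};

/* "Safe" accessors: return a sentinel instead of a wrong-variant value.  */

tree_code
code_helper::safe_as_tree_code () const
{
  return is_tree_code () ? (tree_code) rep : MAX_TREE_CODES;
}

combined_fn
code_helper::safe_as_fn_code () const
{
  return is_fn_code () ? (combined_fn) (-rep - 1) : CFN_LAST;
}
```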
* [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns 2022-06-07 9:01 ` Joel Hutton @ 2022-06-09 14:03 ` Joel Hutton 2022-06-13 9:02 ` Richard Biener 1 sibling, 0 replies; 53+ messages in thread From: Joel Hutton @ 2022-06-09 14:03 UTC (permalink / raw) To: Richard Sandiford; +Cc: Richard Biener, gcc-patches > Before I make any changes, I'd like to check we're all on the same page. > > richi, are you ok with the gimple_build function, perhaps with a different > name if you are concerned with overloading? we could use gimple_ch_build > or gimple_code_helper_build? > > Similarly are you ok with the use of gimple_extract_op? I would lean towards > using it as it is cleaner, but I don't have strong feelings. > > Joel Ping. Just looking for some confirmation before I rework this patch. It would be good to get some agreement on this as Tamar is blocked on this patch. Joel > -----Original Message----- > From: Joel Hutton > Sent: 07 June 2022 10:02 > To: Richard Sandiford <richard.sandiford@arm.com> > Cc: Richard Biener <rguenther@suse.de>; gcc-patches@gcc.gnu.org > Subject: RE: [ping][vect-patterns] Refactor widen_plus/widen_minus as > internal_fns > > Thanks Richard, > > > I thought the potential problem with the above is that gimple_build is > > a folding interface, so in principle it's allowed to return an > > existing SSA_NAME set by an existing statement (or even a constant). > > I think in this context we do need to force a new statement to be created. > > Before I make any changes, I'd like to check we're all on the same page. > > richi, are you ok with the gimple_build function, perhaps with a different > name if you are concerned with overloading? we could use gimple_ch_build > or gimple_code_helper_build? > > Similarly are you ok with the use of gimple_extract_op? I would lean towards > using it as it is cleaner, but I don't have strong feelings. 
> > Joel > > > -----Original Message----- > > From: Richard Sandiford <richard.sandiford@arm.com> > > Sent: 07 June 2022 09:18 > > To: Joel Hutton <Joel.Hutton@arm.com> > > Cc: Richard Biener <rguenther@suse.de>; gcc-patches@gcc.gnu.org > > Subject: Re: [ping][vect-patterns] Refactor widen_plus/widen_minus as > > internal_fns > > > > Joel Hutton <Joel.Hutton@arm.com> writes: > > >> > Patches attached. They already incorporated the .cc rename, now > > >> > rebased to be after the change to tree.h > > >> > > >> @@ -1412,8 +1412,7 @@ vect_recog_widen_op_pattern (vec_info > *vinfo, > > >> 2, oprnd, half_type, unprom, vectype); > > >> > > >> tree var = vect_recog_temp_ssa_var (itype, NULL); > > >> - gimple *pattern_stmt = gimple_build_assign (var, wide_code, > > >> - oprnd[0], oprnd[1]); > > >> + gimple *pattern_stmt = gimple_build (var, wide_code, oprnd[0], > > >> oprnd[1]); > > >> > > >> > > >> you should be able to do without the new gimple_build overload by > > >> using > > >> > > >> gimple_seq stmts = NULL; > > >> gimple_build (&stmts, wide_code, itype, oprnd[0], oprnd[1]); > > >> gimple *pattern_stmt = gimple_seq_last_stmt (stmts); > > >> > > >> because 'gimple_build' is an existing API. > > > > > > Done. > > > > > > The gimple_build overload was at the request of Richard Sandiford, I > > assume removing it is ok with you Richard S? > > > From Richard Sandiford: > > >> For example, I think we should hide this inside a new: > > >> > > >> gimple_build (var, wide_code, oprnd[0], oprnd[1]); > > >> > > >> that works directly on code_helper, similarly to the new > > >> code_helper gimple_build interfaces. > > > > I thought the potential problem with the above is that gimple_build is > > a folding interface, so in principle it's allowed to return an > > existing SSA_NAME set by an existing statement (or even a constant). > > I think in this context we do need to force a new statement to be created. 
> > > > Of course, the hope is that there wouldn't still be such folding > > opportunities at this stage, but I don't think it's guaranteed > > (especially with options fuzzing). > > > > Sind I was mentioned :-) ... > > > > Could you run the patch through contrib/check_GNU_style.py? > > There seem to be a few long lines. > > > > > + if (res_op.code.is_tree_code ()) > > > > Do you need this is_tree_code ()? These comparisons… > > > > > + { > > > + widen_arith = (code == WIDEN_PLUS_EXPR > > > + || code == WIDEN_MINUS_EXPR > > > + || code == WIDEN_MULT_EXPR > > > + || code == WIDEN_LSHIFT_EXPR); > > > > …ought to be safe unconditionally. > > > > > + } > > > + else > > > + widen_arith = false; > > > + > > > + if (!widen_arith > > > + && !CONVERT_EXPR_CODE_P (code) > > > + && code != FIX_TRUNC_EXPR > > > + && code != FLOAT_EXPR) > > > + return false; > > > > > > /* Check types of lhs and rhs. */ > > > - scalar_dest = gimple_assign_lhs (stmt); > > > + scalar_dest = gimple_get_lhs (stmt); > > > lhs_type = TREE_TYPE (scalar_dest); > > > vectype_out = STMT_VINFO_VECTYPE (stmt_info); > > > > > > @@ -4938,10 +4951,14 @@ vectorizable_conversion (vec_info *vinfo, > > > > > > if (op_type == binary_op) > > > { > > > - gcc_assert (code == WIDEN_MULT_EXPR || code == > > WIDEN_LSHIFT_EXPR > > > - || code == WIDEN_PLUS_EXPR || code == > > WIDEN_MINUS_EXPR); > > > + gcc_assert (code == WIDEN_MULT_EXPR > > > + || code == WIDEN_LSHIFT_EXPR > > > + || code == WIDEN_PLUS_EXPR > > > + || code == WIDEN_MINUS_EXPR); > > > > > > - op1 = gimple_assign_rhs2 (stmt); > > > + > > > + op1 = is_gimple_assign (stmt) ? 
gimple_assign_rhs2 (stmt) : > > > + gimple_call_arg (stmt, 0); > > > tree vectype1_in; > > > if (!vect_is_simple_use (vinfo, stmt_info, slp_node, 1, > > > &op1, &slp_op1, &dt[1], &vectype1_in)) […] @@ > > -12181,7 > > > +12235,6 @@ supportable_widening_operation (vec_info *vinfo, > > > return false; > > > } > > > > > > - > > > /* Function supportable_narrowing_operation > > > > > > Check whether an operation represented by the code CODE is a > > > > Seems like a spurious change. > > > > > @@ -12205,7 +12258,7 @@ supportable_widening_operation (vec_info > > > *vinfo, bool supportable_narrowing_operation (enum tree_code code, > > > tree vectype_out, tree vectype_in, > > > - enum tree_code *code1, int *multi_step_cvt, > > > + tree_code* _code1, int *multi_step_cvt, > > > > The original formatting (space before the “*”) was correct. > > Names beginning with _ are reserved, so I think we need a different > > name here. Also, the name in the comment should stay in sync with the > > name in the code. > > > > That said though, I'm not sure… > > > > > vec<tree> *interm_types) { > > > machine_mode vec_mode; > > > @@ -12217,8 +12270,8 @@ supportable_narrowing_operation (enum > > tree_code code, > > > tree intermediate_type, prev_type; > > > machine_mode intermediate_mode, prev_mode; > > > int i; > > > - unsigned HOST_WIDE_INT n_elts; > > > bool uns; > > > + tree_code * code1 = (tree_code*) _code1; > > > > …the combination of these two changes makes sense on their own. 
> > > > > > > > *multi_step_cvt = 0; > > > switch (code) > > > @@ -12227,9 +12280,8 @@ supportable_narrowing_operation (enum > > tree_code code, > > > c1 = VEC_PACK_TRUNC_EXPR; > > > if (VECTOR_BOOLEAN_TYPE_P (narrow_vectype) > > > && VECTOR_BOOLEAN_TYPE_P (vectype) > > > - && SCALAR_INT_MODE_P (TYPE_MODE (vectype)) > > > - && TYPE_VECTOR_SUBPARTS (vectype).is_constant (&n_elts) > > > - && n_elts < BITS_PER_UNIT) > > > + && TYPE_MODE (narrow_vectype) == TYPE_MODE (vectype) > > > + && SCALAR_INT_MODE_P (TYPE_MODE (vectype))) > > > optab1 = vec_pack_sbool_trunc_optab; > > > else > > > optab1 = optab_for_tree_code (c1, vectype, optab_default); @@ > > > -12320,9 +12372,8 @@ supportable_narrowing_operation (enum > > tree_code code, > > > = lang_hooks.types.type_for_mode (intermediate_mode, uns); > > > if (VECTOR_BOOLEAN_TYPE_P (intermediate_type) > > > && VECTOR_BOOLEAN_TYPE_P (prev_type) > > > - && SCALAR_INT_MODE_P (prev_mode) > > > - && TYPE_VECTOR_SUBPARTS (intermediate_type).is_constant > > (&n_elts) > > > - && n_elts < BITS_PER_UNIT) > > > + && intermediate_mode == prev_mode > > > + && SCALAR_INT_MODE_P (prev_mode)) > > > interm_optab = vec_pack_sbool_trunc_optab; > > > else > > > interm_optab > > > > This part looks like a behavioural change, so I think it should be > > part of a separate patch. 
> > > > > diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h index > > > 642eb0aeb21264cd736a479b1ec25357abef29cd..50ff8eeac1e6b9859302bd78 > > 4f10ee3d8ff4b4dc 100644 > > > --- a/gcc/tree-vectorizer.h > > > +++ b/gcc/tree-vectorizer.h > > > @@ -2120,13 +2120,12 @@ extern bool vect_is_simple_use (vec_info *, > > stmt_vec_info, slp_tree, > > > enum vect_def_type *, > > > tree *, stmt_vec_info * = NULL); extern bool > > > vect_maybe_update_slp_op_vectype (slp_tree, tree); -extern bool > > > supportable_widening_operation (vec_info *, > > > - enum tree_code, stmt_vec_info, > > > - tree, tree, enum tree_code *, > > > - enum tree_code *, int *, > > > - vec<tree> *); > > > +extern bool supportable_widening_operation (vec_info*, code_helper, > > > + stmt_vec_info, tree, tree, > > > + code_helper*, code_helper*, > > > + int*, vec<tree> *); > > > extern bool supportable_narrowing_operation (enum tree_code, tree, > > tree, > > > - enum tree_code *, int *, > > > + tree_code *, int *, > > > vec<tree> *); > > > > > > extern unsigned record_stmt_cost (stmt_vector_for_cost *, int, diff > > > --git a/gcc/tree.h b/gcc/tree.h index > > > f84958933d51144bb6ce7cc41eca5f7f06814550..00b0e4d1c696633fe38baad5 > > 295b1f90398d53fc 100644 > > > --- a/gcc/tree.h > > > +++ b/gcc/tree.h > > > @@ -92,6 +92,10 @@ public: > > > bool is_fn_code () const { return rep < 0; } > > > bool is_internal_fn () const; > > > bool is_builtin_fn () const; > > > + enum tree_code safe_as_tree_code () const { return is_tree_code () ? > > > + (tree_code)* this : MAX_TREE_CODES; } combined_fn > > > + safe_as_fn_code () const { return is_fn_code () ? > > (combined_fn) *this > > > + : CFN_LAST;} > > > > Since these don't fit on a line, the coding convention says that they > > should be defined outside of the class. 
> > > > Thanks, > > Richard > > > > > int get_rep () const { return rep; } > > > bool operator== (const code_helper &other) { return rep == other.rep; } > > > bool operator!= (const code_helper &other) { return rep != > > > other.rep; } ^ permalink raw reply [flat|nested] 53+ messages in thread
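[The sticking point Joel is pinging about — `gimple_build` is a folding interface, so it may hand back an existing SSA name or a constant instead of emitting a fresh statement, while the pattern code needs a new statement it can record — can be modeled with a toy builder. Everything below is illustrative scaffolding, none of it GCC's actual types or API:]

```cpp
#include <cassert>
#include <string>

/* Toy value: either a folded constant or a named temporary.  */
struct value
{
  bool is_constant;
  int cst;
  std::string name;
};

static int tmp_counter;

/* Folding-style builder, like gimple_build in the thread: constant
   operands are folded away and no statement is created at all.  */
value
fold_build_plus (const value &a, const value &b, int *stmts_emitted)
{
  if (a.is_constant && b.is_constant)
    return { true, a.cst + b.cst, "" };
  ++*stmts_emitted;
  return { false, 0, "_" + std::to_string (++tmp_counter) };
}

/* Pattern-style builder, like gimple_build_assign: a pattern statement
   must exist, so a fresh definition is always emitted.  */
value
force_build_plus (const value &, const value &, int *stmts_emitted)
{
  ++*stmts_emitted;
  return { false, 0, "_" + std::to_string (++tmp_counter) };
}
```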
* RE: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns 2022-06-07 9:01 ` Joel Hutton 2022-06-09 14:03 ` Joel Hutton @ 2022-06-13 9:02 ` Richard Biener 2022-06-30 13:20 ` Joel Hutton 1 sibling, 1 reply; 53+ messages in thread From: Richard Biener @ 2022-06-13 9:02 UTC (permalink / raw) To: Joel Hutton; +Cc: Richard Sandiford, gcc-patches On Tue, 7 Jun 2022, Joel Hutton wrote: > Thanks Richard, > > > I thought the potential problem with the above is that gimple_build is a > > folding interface, so in principle it's allowed to return an existing SSA_NAME > > set by an existing statement (or even a constant). > > I think in this context we do need to force a new statement to be created. > > Before I make any changes, I'd like to check we're all on the same page. > > richi, are you ok with the gimple_build function, perhaps with a > different name if you are concerned with overloading? we could use > gimple_ch_build or gimple_code_helper_build? We can go with a private vect_gimple_build function until we sort out the API issue to unblock Tamar (I'll reply to Richards reply with further thoughts on this) > Similarly are you ok with the use of gimple_extract_op? I would lean towards using it as it is cleaner, but I don't have strong feelings. I don't like using gimple_extract_op here, I think I outlined a variant that is even shorter. Richard. > Joel > > > -----Original Message----- > > From: Richard Sandiford <richard.sandiford@arm.com> > > Sent: 07 June 2022 09:18 > > To: Joel Hutton <Joel.Hutton@arm.com> > > Cc: Richard Biener <rguenther@suse.de>; gcc-patches@gcc.gnu.org > > Subject: Re: [ping][vect-patterns] Refactor widen_plus/widen_minus as > > internal_fns > > > > Joel Hutton <Joel.Hutton@arm.com> writes: > > >> > Patches attached. 
They already incorporated the .cc rename, now > > >> > rebased to be after the change to tree.h > > >> > > >> @@ -1412,8 +1412,7 @@ vect_recog_widen_op_pattern (vec_info *vinfo, > > >> 2, oprnd, half_type, unprom, vectype); > > >> > > >> tree var = vect_recog_temp_ssa_var (itype, NULL); > > >> - gimple *pattern_stmt = gimple_build_assign (var, wide_code, > > >> - oprnd[0], oprnd[1]); > > >> + gimple *pattern_stmt = gimple_build (var, wide_code, oprnd[0], > > >> oprnd[1]); > > >> > > >> > > >> you should be able to do without the new gimple_build overload by > > >> using > > >> > > >> gimple_seq stmts = NULL; > > >> gimple_build (&stmts, wide_code, itype, oprnd[0], oprnd[1]); > > >> gimple *pattern_stmt = gimple_seq_last_stmt (stmts); > > >> > > >> because 'gimple_build' is an existing API. > > > > > > Done. > > > > > > The gimple_build overload was at the request of Richard Sandiford, I > > assume removing it is ok with you Richard S? > > > From Richard Sandiford: > > >> For example, I think we should hide this inside a new: > > >> > > >> gimple_build (var, wide_code, oprnd[0], oprnd[1]); > > >> > > >> that works directly on code_helper, similarly to the new code_helper > > >> gimple_build interfaces. > > > > I thought the potential problem with the above is that gimple_build is a > > folding interface, so in principle it's allowed to return an existing SSA_NAME > > set by an existing statement (or even a constant). > > I think in this context we do need to force a new statement to be created. > > > > Of course, the hope is that there wouldn't still be such folding opportunities > > at this stage, but I don't think it's guaranteed (especially with options > > fuzzing). > > > > Sind I was mentioned :-) ... > > > > Could you run the patch through contrib/check_GNU_style.py? > > There seem to be a few long lines. > > > > > + if (res_op.code.is_tree_code ()) > > > > Do you need this is_tree_code ()? 
These comparisons… > > > > > + { > > > + widen_arith = (code == WIDEN_PLUS_EXPR > > > + || code == WIDEN_MINUS_EXPR > > > + || code == WIDEN_MULT_EXPR > > > + || code == WIDEN_LSHIFT_EXPR); > > > > …ought to be safe unconditionally. > > > > > + } > > > + else > > > + widen_arith = false; > > > + > > > + if (!widen_arith > > > + && !CONVERT_EXPR_CODE_P (code) > > > + && code != FIX_TRUNC_EXPR > > > + && code != FLOAT_EXPR) > > > + return false; > > > > > > /* Check types of lhs and rhs. */ > > > - scalar_dest = gimple_assign_lhs (stmt); > > > + scalar_dest = gimple_get_lhs (stmt); > > > lhs_type = TREE_TYPE (scalar_dest); > > > vectype_out = STMT_VINFO_VECTYPE (stmt_info); > > > > > > @@ -4938,10 +4951,14 @@ vectorizable_conversion (vec_info *vinfo, > > > > > > if (op_type == binary_op) > > > { > > > - gcc_assert (code == WIDEN_MULT_EXPR || code == > > WIDEN_LSHIFT_EXPR > > > - || code == WIDEN_PLUS_EXPR || code == > > WIDEN_MINUS_EXPR); > > > + gcc_assert (code == WIDEN_MULT_EXPR > > > + || code == WIDEN_LSHIFT_EXPR > > > + || code == WIDEN_PLUS_EXPR > > > + || code == WIDEN_MINUS_EXPR); > > > > > > - op1 = gimple_assign_rhs2 (stmt); > > > + > > > + op1 = is_gimple_assign (stmt) ? gimple_assign_rhs2 (stmt) : > > > + gimple_call_arg (stmt, 0); > > > tree vectype1_in; > > > if (!vect_is_simple_use (vinfo, stmt_info, slp_node, 1, > > > &op1, &slp_op1, &dt[1], &vectype1_in)) […] @@ > > -12181,7 > > > +12235,6 @@ supportable_widening_operation (vec_info *vinfo, > > > return false; > > > } > > > > > > - > > > /* Function supportable_narrowing_operation > > > > > > Check whether an operation represented by the code CODE is a > > > > Seems like a spurious change. 
> > > > > @@ -12205,7 +12258,7 @@ supportable_widening_operation (vec_info > > > *vinfo, bool supportable_narrowing_operation (enum tree_code code, > > > tree vectype_out, tree vectype_in, > > > - enum tree_code *code1, int *multi_step_cvt, > > > + tree_code* _code1, int *multi_step_cvt, > > > > The original formatting (space before the “*”) was correct. > > Names beginning with _ are reserved, so I think we need a different > > name here. Also, the name in the comment should stay in sync with > > the name in the code. > > > > That said though, I'm not sure… > > > > > vec<tree> *interm_types) > > > { > > > machine_mode vec_mode; > > > @@ -12217,8 +12270,8 @@ supportable_narrowing_operation (enum > > tree_code code, > > > tree intermediate_type, prev_type; > > > machine_mode intermediate_mode, prev_mode; > > > int i; > > > - unsigned HOST_WIDE_INT n_elts; > > > bool uns; > > > + tree_code * code1 = (tree_code*) _code1; > > > > …the combination of these two changes makes sense on their own. 
> > > > > > > > *multi_step_cvt = 0; > > > switch (code) > > > @@ -12227,9 +12280,8 @@ supportable_narrowing_operation (enum > > tree_code code, > > > c1 = VEC_PACK_TRUNC_EXPR; > > > if (VECTOR_BOOLEAN_TYPE_P (narrow_vectype) > > > && VECTOR_BOOLEAN_TYPE_P (vectype) > > > - && SCALAR_INT_MODE_P (TYPE_MODE (vectype)) > > > - && TYPE_VECTOR_SUBPARTS (vectype).is_constant (&n_elts) > > > - && n_elts < BITS_PER_UNIT) > > > + && TYPE_MODE (narrow_vectype) == TYPE_MODE (vectype) > > > + && SCALAR_INT_MODE_P (TYPE_MODE (vectype))) > > > optab1 = vec_pack_sbool_trunc_optab; > > > else > > > optab1 = optab_for_tree_code (c1, vectype, optab_default); > > > @@ -12320,9 +12372,8 @@ supportable_narrowing_operation (enum > > tree_code code, > > > = lang_hooks.types.type_for_mode (intermediate_mode, uns); > > > if (VECTOR_BOOLEAN_TYPE_P (intermediate_type) > > > && VECTOR_BOOLEAN_TYPE_P (prev_type) > > > - && SCALAR_INT_MODE_P (prev_mode) > > > - && TYPE_VECTOR_SUBPARTS (intermediate_type).is_constant > > (&n_elts) > > > - && n_elts < BITS_PER_UNIT) > > > + && intermediate_mode == prev_mode > > > + && SCALAR_INT_MODE_P (prev_mode)) > > > interm_optab = vec_pack_sbool_trunc_optab; > > > else > > > interm_optab > > > > This part looks like a behavioural change, so I think it should be part > > of a separate patch. 
> > > > > diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h > > > index > > 642eb0aeb21264cd736a479b1ec25357abef29cd..50ff8eeac1e6b9859302bd78 > > 4f10ee3d8ff4b4dc 100644 > > > --- a/gcc/tree-vectorizer.h > > > +++ b/gcc/tree-vectorizer.h > > > @@ -2120,13 +2120,12 @@ extern bool vect_is_simple_use (vec_info *, > > stmt_vec_info, slp_tree, > > > enum vect_def_type *, > > > tree *, stmt_vec_info * = NULL); > > > extern bool vect_maybe_update_slp_op_vectype (slp_tree, tree); > > > -extern bool supportable_widening_operation (vec_info *, > > > - enum tree_code, stmt_vec_info, > > > - tree, tree, enum tree_code *, > > > - enum tree_code *, int *, > > > - vec<tree> *); > > > +extern bool supportable_widening_operation (vec_info*, code_helper, > > > + stmt_vec_info, tree, tree, > > > + code_helper*, code_helper*, > > > + int*, vec<tree> *); > > > extern bool supportable_narrowing_operation (enum tree_code, tree, > > tree, > > > - enum tree_code *, int *, > > > + tree_code *, int *, > > > vec<tree> *); > > > > > > extern unsigned record_stmt_cost (stmt_vector_for_cost *, int, > > > diff --git a/gcc/tree.h b/gcc/tree.h > > > index > > f84958933d51144bb6ce7cc41eca5f7f06814550..00b0e4d1c696633fe38baad5 > > 295b1f90398d53fc 100644 > > > --- a/gcc/tree.h > > > +++ b/gcc/tree.h > > > @@ -92,6 +92,10 @@ public: > > > bool is_fn_code () const { return rep < 0; } > > > bool is_internal_fn () const; > > > bool is_builtin_fn () const; > > > + enum tree_code safe_as_tree_code () const { return is_tree_code () ? > > > + (tree_code)* this : MAX_TREE_CODES; } > > > + combined_fn safe_as_fn_code () const { return is_fn_code () ? > > (combined_fn) *this > > > + : CFN_LAST;} > > > > Since these don't fit on a line, the coding convention says that they > > should be defined outside of the class. 
> > > > Thanks, > > Richard > > > > > int get_rep () const { return rep; } > > > bool operator== (const code_helper &other) { return rep == other.rep; } > > > bool operator!= (const code_helper &other) { return rep != other.rep; } > -- Richard Biener <rguenther@suse.de> SUSE Software Solutions Germany GmbH, Frankenstraße 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman; HRB 36809 (AG Nuernberg) ^ permalink raw reply [flat|nested] 53+ messages in thread
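[Richard Biener's suggested resolution above — a private `vect_gimple_build` dispatching on the `code_helper` discriminant — has the shape sketched below, with strings standing in for gimple statements. This is a toy model under assumed simplifications, not the GCC API (the real function is attached as a patch in the next message): a tree_code becomes an assignment, an internal_fn becomes an internal call, mirroring `gimple_build_assign` vs `gimple_build_call_internal`.]

```cpp
#include <cassert>
#include <string>

/* Toy two-variant tag; GCC's code_helper plays this role.  */
struct toy_code
{
  bool is_tree_code;
};

/* Shape of the dispatch in the proposed vect_gimple_build: tree_code
   path builds an assignment statement, internal_fn path builds an
   internal call (shown in GIMPLE-dump-like notation).  */
std::string
build_widen_stmt (const std::string &lhs, toy_code ch,
		  const std::string &op0, const std::string &op1)
{
  if (ch.is_tree_code)
    /* gimple_build_assign analogue.  */
    return lhs + " = " + op0 + " w+ " + op1;
  /* gimple_build_call_internal analogue.  */
  return lhs + " = .VEC_WIDEN_PLUS (" + op0 + ", " + op1 + ")";
}
```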
* RE: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns 2022-06-13 9:02 ` Richard Biener @ 2022-06-30 13:20 ` Joel Hutton 2022-07-12 12:32 ` Richard Biener 0 siblings, 1 reply; 53+ messages in thread From: Joel Hutton @ 2022-06-30 13:20 UTC (permalink / raw) To: Richard Biener; +Cc: Richard Sandiford, gcc-patches, Andre Simoes Dias Vieira [-- Attachment #1: Type: text/plain, Size: 647 bytes --] > We can go with a private vect_gimple_build function until we sort out the API > issue to unblock Tamar (I'll reply to Richards reply with further thoughts on > this) > Done. > > Similarly are you ok with the use of gimple_extract_op? I would lean > towards using it as it is cleaner, but I don't have strong feelings. > > I don't like using gimple_extract_op here, I think I outlined a variant that is > even shorter. > Done. Updated patches attached, bootstrapped and regression tested on aarch64. Tomorrow is my last working day at Arm, so it will likely be Andre that commits this/addresses any further comments. [-- Attachment #2: 0001-Refactor-to-allow-internal_fn-s.patch --] [-- Type: application/octet-stream, Size: 23390 bytes --] From f1321c617838e94044cbae357a63db002fbd3edb Mon Sep 17 00:00:00 2001 From: Joel Hutton <joel.hutton@arm.com> Date: Wed, 25 Aug 2021 14:31:15 +0100 Subject: [PATCH 1/3] Refactor to allow internal_fn's Hi all, This refactor allows widening patterns (such as widen_plus/widen_minus) to be represented as either internal_fns or tree_codes. [vect-patterns] Refactor as internal_fn's Refactor vect-patterns to allow patterns to be internal_fns starting with widening_plus/minus patterns gcc/ChangeLog: * tree-core.h (ECF_WIDEN): New flag. * gimple-match.h (class code_helper): * tree-core.h (ECF_WIDEN): Flag to mark internal_fn as widening. * tree-vect-patterns.cc (vect_recog_widen_op_pattern): Refactor to use code_helper. (vect_gimple_build): New function. * tree-vect-stmts.cc (vect_gen_widened_results_half): Refactor to use code_helper. 
(vect_create_vectorized_promotion_stmts): Refactor to use code_helper. (vectorizable_conversion): Refactor to use code_helper. gimple_call or gimple_assign. (supportable_widening_operation): Refactor to use code_helper. (supportable_narrowing_operation): Refactor to use code_helper. * tree-vectorizer.h (supportable_widening_operation): Change prototype to use code_helper. (supportable_narrowing_operation): change prototype to use code_helper. (vect_gimple_build): New function prototype. * tree.h (code_helper::safe_as_tree_code): New function. helper functions. (code_helper::safe_as_fn_code): New function. --- gcc/tree-core.h | 3 + gcc/tree-vect-patterns.cc | 34 ++++++- gcc/tree-vect-stmts.cc | 208 +++++++++++++++++++++++++------------- gcc/tree-vectorizer.h | 14 +-- gcc/tree.h | 13 +++ 5 files changed, 189 insertions(+), 83 deletions(-) diff --git a/gcc/tree-core.h b/gcc/tree-core.h index ab5fa01e5cb5fb56c1964b93b014ed55a4aa704a..cff6211080bced0bffb39e98039a6550897acf77 100644 --- a/gcc/tree-core.h +++ b/gcc/tree-core.h @@ -96,6 +96,9 @@ struct die_struct; /* Nonzero if this is a cold function. */ #define ECF_COLD (1 << 15) +/* Nonzero if this is a widening function. */ +#define ECF_WIDEN (1 << 16) + /* Call argument flags. */ /* Nonzero if the argument is not used by the function. */ diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc index 8f624863971392c891fde7278949c8818f646576..d892158f024fc045b897aebe76f2e2b66211cf83 100644 --- a/gcc/tree-vect-patterns.cc +++ b/gcc/tree-vect-patterns.cc @@ -25,6 +25,8 @@ along with GCC; see the file COPYING3. 
If not see #include "rtl.h" #include "tree.h" #include "gimple.h" +#include "gimple-iterator.h" +#include "gimple-fold.h" #include "ssa.h" #include "expmed.h" #include "optabs-tree.h" @@ -1348,7 +1350,7 @@ vect_recog_sad_pattern (vec_info *vinfo, static gimple * vect_recog_widen_op_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info, tree *type_out, - tree_code orig_code, tree_code wide_code, + tree_code orig_code, code_helper wide_code, bool shift_p, const char *name) { gimple *last_stmt = last_stmt_info->stmt; @@ -1391,7 +1393,7 @@ vect_recog_widen_op_pattern (vec_info *vinfo, vecctype = get_vectype_for_scalar_type (vinfo, ctype); } - enum tree_code dummy_code; + code_helper dummy_code; int dummy_int; auto_vec<tree> dummy_vec; if (!vectype @@ -1412,8 +1414,7 @@ vect_recog_widen_op_pattern (vec_info *vinfo, 2, oprnd, half_type, unprom, vectype); tree var = vect_recog_temp_ssa_var (itype, NULL); - gimple *pattern_stmt = gimple_build_assign (var, wide_code, - oprnd[0], oprnd[1]); + gimple *pattern_stmt = vect_gimple_build (var, wide_code, oprnd[0], oprnd[1]); if (vecctype != vecitype) pattern_stmt = vect_convert_output (vinfo, last_stmt_info, ctype, @@ -5968,3 +5969,28 @@ vect_pattern_recog (vec_info *vinfo) /* After this no more add_stmt calls are allowed. */ vinfo->stmt_vec_info_ro = true; } + +/* Build a GIMPLE_ASSIGN or GIMPLE_CALL with the tree_code, + or internal_fn contained in ch, respectively. */ +gimple * +vect_gimple_build (tree lhs, code_helper ch, tree op0, tree op1) +{ + if (op0 == NULL_TREE) + return NULL; + if (ch.is_tree_code ()) + return op1 == NULL_TREE ? 
gimple_build_assign (lhs, ch.safe_as_tree_code (), + op0) : + gimple_build_assign (lhs, ch.safe_as_tree_code (), + op0, op1); + else + { + internal_fn fn = as_internal_fn (ch.safe_as_fn_code ()); + gimple* stmt; + if (op1 == NULL_TREE) + stmt = gimple_build_call_internal (fn, 1, op0); + else + stmt = gimple_build_call_internal (fn, 2, op0, op1); + gimple_call_set_lhs (stmt, lhs); + return stmt; + } +} diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index 346d8ce280437e00bfeb19a4b4adc59eb96207f9..d6aabb873c86ab8ff0bae41c7f6c3bad34d583c5 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -4636,7 +4636,7 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, STMT_INFO is the original scalar stmt that we are vectorizing. */ static gimple * -vect_gen_widened_results_half (vec_info *vinfo, enum tree_code code, +vect_gen_widened_results_half (vec_info *vinfo, code_helper ch, tree vec_oprnd0, tree vec_oprnd1, int op_type, tree vec_dest, gimple_stmt_iterator *gsi, stmt_vec_info stmt_info) @@ -4645,12 +4645,11 @@ vect_gen_widened_results_half (vec_info *vinfo, enum tree_code code, tree new_temp; /* Generate half of the widened result: */ - gcc_assert (op_type == TREE_CODE_LENGTH (code)); if (op_type != binary_op) vec_oprnd1 = NULL; - new_stmt = gimple_build_assign (vec_dest, code, vec_oprnd0, vec_oprnd1); + new_stmt = vect_gimple_build (vec_dest, ch, vec_oprnd0, vec_oprnd1); new_temp = make_ssa_name (vec_dest, new_stmt); - gimple_assign_set_lhs (new_stmt, new_temp); + gimple_set_lhs (new_stmt, new_temp); vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi); return new_stmt; @@ -4729,8 +4728,8 @@ vect_create_vectorized_promotion_stmts (vec_info *vinfo, vec<tree> *vec_oprnds1, stmt_vec_info stmt_info, tree vec_dest, gimple_stmt_iterator *gsi, - enum tree_code code1, - enum tree_code code2, int op_type) + code_helper ch1, + code_helper ch2, int op_type) { int i; tree vop0, vop1, new_tmp1, new_tmp2; @@ -4746,10 +4745,10 
@@ vect_create_vectorized_promotion_stmts (vec_info *vinfo, vop1 = NULL_TREE; /* Generate the two halves of promotion operation. */ - new_stmt1 = vect_gen_widened_results_half (vinfo, code1, vop0, vop1, + new_stmt1 = vect_gen_widened_results_half (vinfo, ch1, vop0, vop1, op_type, vec_dest, gsi, stmt_info); - new_stmt2 = vect_gen_widened_results_half (vinfo, code2, vop0, vop1, + new_stmt2 = vect_gen_widened_results_half (vinfo, ch2, vop0, vop1, op_type, vec_dest, gsi, stmt_info); if (is_gimple_call (new_stmt1)) @@ -4846,8 +4845,9 @@ vectorizable_conversion (vec_info *vinfo, tree scalar_dest; tree op0, op1 = NULL_TREE; loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo); - enum tree_code code, code1 = ERROR_MARK, code2 = ERROR_MARK; - enum tree_code codecvt1 = ERROR_MARK, codecvt2 = ERROR_MARK; + tree_code tc1; + code_helper code, code1, code2; + code_helper codecvt1 = ERROR_MARK, codecvt2 = ERROR_MARK; tree new_temp; enum vect_def_type dt[2] = {vect_unknown_def_type, vect_unknown_def_type}; int ndts = 2; @@ -4876,31 +4876,43 @@ vectorizable_conversion (vec_info *vinfo, && ! 
vec_stmt) return false; - gassign *stmt = dyn_cast <gassign *> (stmt_info->stmt); - if (!stmt) + gimple* stmt = stmt_info->stmt; + if (!(is_gimple_assign (stmt) || is_gimple_call (stmt))) return false; - if (TREE_CODE (gimple_assign_lhs (stmt)) != SSA_NAME) + if (gimple_get_lhs (stmt) == NULL_TREE + || TREE_CODE (gimple_get_lhs (stmt)) != SSA_NAME) return false; - code = gimple_assign_rhs_code (stmt); - if (!CONVERT_EXPR_CODE_P (code) - && code != FIX_TRUNC_EXPR - && code != FLOAT_EXPR - && code != WIDEN_PLUS_EXPR - && code != WIDEN_MINUS_EXPR - && code != WIDEN_MULT_EXPR - && code != WIDEN_LSHIFT_EXPR) + if (TREE_CODE (gimple_get_lhs (stmt)) != SSA_NAME) + return false; + + if (is_gimple_assign (stmt)) + { + code = gimple_assign_rhs_code (stmt); + op_type = TREE_CODE_LENGTH (code.safe_as_tree_code ()); + } + else if (gimple_call_internal_p (stmt)) + { + code = gimple_call_internal_fn (stmt); + op_type = gimple_call_num_args (stmt); + } + else return false; bool widen_arith = (code == WIDEN_PLUS_EXPR - || code == WIDEN_MINUS_EXPR - || code == WIDEN_MULT_EXPR - || code == WIDEN_LSHIFT_EXPR); - op_type = TREE_CODE_LENGTH (code); + || code == WIDEN_MINUS_EXPR + || code == WIDEN_MULT_EXPR + || code == WIDEN_LSHIFT_EXPR); + + if (!widen_arith + && !CONVERT_EXPR_CODE_P (code) + && code != FIX_TRUNC_EXPR + && code != FLOAT_EXPR) + return false; /* Check types of lhs and rhs. */ - scalar_dest = gimple_assign_lhs (stmt); + scalar_dest = gimple_get_lhs (stmt); lhs_type = TREE_TYPE (scalar_dest); vectype_out = STMT_VINFO_VECTYPE (stmt_info); @@ -4938,10 +4950,14 @@ vectorizable_conversion (vec_info *vinfo, if (op_type == binary_op) { - gcc_assert (code == WIDEN_MULT_EXPR || code == WIDEN_LSHIFT_EXPR - || code == WIDEN_PLUS_EXPR || code == WIDEN_MINUS_EXPR); + gcc_assert (code == WIDEN_MULT_EXPR + || code == WIDEN_LSHIFT_EXPR + || code == WIDEN_PLUS_EXPR + || code == WIDEN_MINUS_EXPR); - op1 = gimple_assign_rhs2 (stmt); + + op1 = is_gimple_assign (stmt) ? 
gimple_assign_rhs2 (stmt) : + gimple_call_arg (stmt, 0); tree vectype1_in; if (!vect_is_simple_use (vinfo, stmt_info, slp_node, 1, &op1, &slp_op1, &dt[1], &vectype1_in)) @@ -5025,8 +5041,12 @@ vectorizable_conversion (vec_info *vinfo, && code != FLOAT_EXPR && !CONVERT_EXPR_CODE_P (code)) return false; - if (supportable_convert_operation (code, vectype_out, vectype_in, &code1)) + if (supportable_convert_operation (code.safe_as_tree_code (), vectype_out, + vectype_in, &tc1)) + { + code1 = tc1; break; + } /* FALLTHRU */ unsupported: if (dump_enabled_p ()) @@ -5037,9 +5057,11 @@ vectorizable_conversion (vec_info *vinfo, case WIDEN: if (known_eq (nunits_in, nunits_out)) { - if (!supportable_half_widening_operation (code, vectype_out, - vectype_in, &code1)) + if (!supportable_half_widening_operation (code.safe_as_tree_code (), + vectype_out, vectype_in, + &tc1)) goto unsupported; + code1 = tc1; gcc_assert (!(multi_step_cvt && op_type == binary_op)); break; } @@ -5073,14 +5095,17 @@ vectorizable_conversion (vec_info *vinfo, if (GET_MODE_SIZE (rhs_mode) == fltsz) { - if (!supportable_convert_operation (code, vectype_out, - cvt_type, &codecvt1)) + tc1 = ERROR_MARK; + if (!supportable_convert_operation (code.safe_as_tree_code (), + vectype_out, + cvt_type, &tc1)) goto unsupported; + codecvt1 = tc1; } - else if (!supportable_widening_operation (vinfo, code, stmt_info, - vectype_out, cvt_type, - &codecvt1, &codecvt2, - &multi_step_cvt, + else if (!supportable_widening_operation (vinfo, code, + stmt_info, vectype_out, + cvt_type, &codecvt1, + &codecvt2, &multi_step_cvt, &interm_types)) continue; else @@ -5088,8 +5113,9 @@ vectorizable_conversion (vec_info *vinfo, if (supportable_widening_operation (vinfo, NOP_EXPR, stmt_info, cvt_type, - vectype_in, &code1, &code2, - &multi_step_cvt, &interm_types)) + vectype_in, &code1, + &code2, &multi_step_cvt, + &interm_types)) { found_mode = true; break; @@ -5111,10 +5137,15 @@ vectorizable_conversion (vec_info *vinfo, case NARROW: 
gcc_assert (op_type == unary_op); - if (supportable_narrowing_operation (code, vectype_out, vectype_in, - &code1, &multi_step_cvt, + if (supportable_narrowing_operation (code.safe_as_tree_code (), + vectype_out, + vectype_in, + &tc1, &multi_step_cvt, &interm_types)) - break; + { + code1 = tc1; + break; + } if (code != FIX_TRUNC_EXPR || GET_MODE_SIZE (lhs_mode) >= GET_MODE_SIZE (rhs_mode)) @@ -5125,13 +5156,18 @@ vectorizable_conversion (vec_info *vinfo, cvt_type = get_same_sized_vectype (cvt_type, vectype_in); if (cvt_type == NULL_TREE) goto unsupported; - if (!supportable_convert_operation (code, cvt_type, vectype_in, - &codecvt1)) + if (!supportable_convert_operation (code.safe_as_tree_code (), cvt_type, + vectype_in, + &tc1)) goto unsupported; + codecvt1 = tc1; if (supportable_narrowing_operation (NOP_EXPR, vectype_out, cvt_type, - &code1, &multi_step_cvt, + &tc1, &multi_step_cvt, &interm_types)) - break; + { + code1 = tc1; + break; + } goto unsupported; default: @@ -5245,8 +5281,10 @@ vectorizable_conversion (vec_info *vinfo, FOR_EACH_VEC_ELT (vec_oprnds0, i, vop0) { /* Arguments are ready, create the new vector stmt. 
*/ - gcc_assert (TREE_CODE_LENGTH (code1) == unary_op); - gassign *new_stmt = gimple_build_assign (vec_dest, code1, vop0); + gcc_assert (TREE_CODE_LENGTH ((tree_code) code1) == unary_op); + gassign *new_stmt = gimple_build_assign (vec_dest, + code1.safe_as_tree_code (), + vop0); new_temp = make_ssa_name (vec_dest, new_stmt); gimple_assign_set_lhs (new_stmt, new_temp); vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi); @@ -5278,7 +5316,7 @@ vectorizable_conversion (vec_info *vinfo, for (i = multi_step_cvt; i >= 0; i--) { tree this_dest = vec_dsts[i]; - enum tree_code c1 = code1, c2 = code2; + code_helper c1 = code1, c2 = code2; if (i == 0 && codecvt2 != ERROR_MARK) { c1 = codecvt1; @@ -5288,7 +5326,8 @@ vectorizable_conversion (vec_info *vinfo, vect_create_half_widening_stmts (vinfo, &vec_oprnds0, &vec_oprnds1, stmt_info, this_dest, gsi, - c1, op_type); + c1.safe_as_tree_code (), + op_type); else vect_create_vectorized_promotion_stmts (vinfo, &vec_oprnds0, &vec_oprnds1, stmt_info, @@ -5301,9 +5340,11 @@ vectorizable_conversion (vec_info *vinfo, gimple *new_stmt; if (cvt_type) { - gcc_assert (TREE_CODE_LENGTH (codecvt1) == unary_op); + gcc_assert (TREE_CODE_LENGTH ((tree_code) codecvt1) == unary_op); new_temp = make_ssa_name (vec_dest); - new_stmt = gimple_build_assign (new_temp, codecvt1, vop0); + new_stmt = gimple_build_assign (new_temp, + codecvt1.safe_as_tree_code (), + vop0); vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi); } else @@ -5327,10 +5368,12 @@ vectorizable_conversion (vec_info *vinfo, if (cvt_type) FOR_EACH_VEC_ELT (vec_oprnds0, i, vop0) { - gcc_assert (TREE_CODE_LENGTH (codecvt1) == unary_op); + gcc_assert (TREE_CODE_LENGTH (((tree_code) codecvt1)) == unary_op); new_temp = make_ssa_name (vec_dest); gassign *new_stmt - = gimple_build_assign (new_temp, codecvt1, vop0); + = gimple_build_assign (new_temp, + codecvt1.safe_as_tree_code (), + vop0); vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi); vec_oprnds0[i] = 
new_temp; } @@ -5338,7 +5381,8 @@ vectorizable_conversion (vec_info *vinfo, vect_create_vectorized_demotion_stmts (vinfo, &vec_oprnds0, multi_step_cvt, stmt_info, vec_dsts, gsi, - slp_node, code1); + slp_node, + code1.safe_as_tree_code ()); break; } if (!slp_node) @@ -11926,9 +11970,11 @@ vect_maybe_update_slp_op_vectype (slp_tree op, tree vectype) bool supportable_widening_operation (vec_info *vinfo, - enum tree_code code, stmt_vec_info stmt_info, + code_helper code, + stmt_vec_info stmt_info, tree vectype_out, tree vectype_in, - enum tree_code *code1, enum tree_code *code2, + code_helper *code1, + code_helper *code2, int *multi_step_cvt, vec<tree> *interm_types) { @@ -11939,7 +11985,7 @@ supportable_widening_operation (vec_info *vinfo, optab optab1, optab2; tree vectype = vectype_in; tree wide_vectype = vectype_out; - enum tree_code c1, c2; + code_helper c1 = MAX_TREE_CODES, c2 = MAX_TREE_CODES; int i; tree prev_type, intermediate_type; machine_mode intermediate_mode, prev_mode; @@ -11949,7 +11995,7 @@ supportable_widening_operation (vec_info *vinfo, if (loop_info) vect_loop = LOOP_VINFO_LOOP (loop_info); - switch (code) + switch (code.safe_as_tree_code ()) { case WIDEN_MULT_EXPR: /* The result of a vectorized widening operation usually requires @@ -11990,8 +12036,9 @@ supportable_widening_operation (vec_info *vinfo, && !nested_in_vect_loop_p (vect_loop, stmt_info) && supportable_widening_operation (vinfo, VEC_WIDEN_MULT_EVEN_EXPR, stmt_info, vectype_out, - vectype_in, code1, code2, - multi_step_cvt, interm_types)) + vectype_in, code1, + code2, multi_step_cvt, + interm_types)) { /* Elements in a vector with vect_used_by_reduction property cannot be reordered if the use chain with this property does not have the @@ -12054,6 +12101,9 @@ supportable_widening_operation (vec_info *vinfo, c2 = VEC_UNPACK_FIX_TRUNC_HI_EXPR; break; + case MAX_TREE_CODES: + break; + default: gcc_unreachable (); } @@ -12064,10 +12114,12 @@ supportable_widening_operation (vec_info *vinfo, 
if (code == FIX_TRUNC_EXPR) { /* The signedness is determined from output operand. */ - optab1 = optab_for_tree_code (c1, vectype_out, optab_default); - optab2 = optab_for_tree_code (c2, vectype_out, optab_default); + optab1 = optab_for_tree_code (c1.safe_as_tree_code (), vectype_out, + optab_default); + optab2 = optab_for_tree_code (c2.safe_as_tree_code (), vectype_out, + optab_default); } - else if (CONVERT_EXPR_CODE_P (code) + else if (CONVERT_EXPR_CODE_P (code.safe_as_tree_code ()) && VECTOR_BOOLEAN_TYPE_P (wide_vectype) && VECTOR_BOOLEAN_TYPE_P (vectype) && TYPE_MODE (wide_vectype) == TYPE_MODE (vectype) @@ -12080,8 +12132,10 @@ supportable_widening_operation (vec_info *vinfo, } else { - optab1 = optab_for_tree_code (c1, vectype, optab_default); - optab2 = optab_for_tree_code (c2, vectype, optab_default); + optab1 = optab_for_tree_code (c1.safe_as_tree_code (), vectype, + optab_default); + optab2 = optab_for_tree_code (c2.safe_as_tree_code (), vectype, + optab_default); } if (!optab1 || !optab2) @@ -12092,8 +12146,12 @@ supportable_widening_operation (vec_info *vinfo, || (icode2 = optab_handler (optab2, vec_mode)) == CODE_FOR_nothing) return false; - *code1 = c1; - *code2 = c2; + if (code.is_tree_code ()) + { + *code1 = c1; + *code2 = c2; + } + if (insn_data[icode1].operand[0].mode == TYPE_MODE (wide_vectype) && insn_data[icode2].operand[0].mode == TYPE_MODE (wide_vectype)) @@ -12114,7 +12172,7 @@ supportable_widening_operation (vec_info *vinfo, prev_type = vectype; prev_mode = vec_mode; - if (!CONVERT_EXPR_CODE_P (code)) + if (!CONVERT_EXPR_CODE_P ((tree_code) code)) return false; /* We assume here that there will not be more than MAX_INTERM_CVT_STEPS @@ -12145,8 +12203,12 @@ supportable_widening_operation (vec_info *vinfo, } else { - optab3 = optab_for_tree_code (c1, intermediate_type, optab_default); - optab4 = optab_for_tree_code (c2, intermediate_type, optab_default); + optab3 = optab_for_tree_code (c1.safe_as_tree_code (), + intermediate_type, + 
optab_default); + optab4 = optab_for_tree_code (c2.safe_as_tree_code (), + intermediate_type, + optab_default); } if (!optab3 || !optab4 @@ -12205,7 +12267,7 @@ supportable_widening_operation (vec_info *vinfo, bool supportable_narrowing_operation (enum tree_code code, tree vectype_out, tree vectype_in, - enum tree_code *code1, int *multi_step_cvt, + tree_code *code1, int *multi_step_cvt, vec<tree> *interm_types) { machine_mode vec_mode; diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h index 642eb0aeb21264cd736a479b1ec25357abef29cd..6f70bd622c4a4dea8c432cd26c96d24af399ef3e 100644 --- a/gcc/tree-vectorizer.h +++ b/gcc/tree-vectorizer.h @@ -2120,13 +2120,12 @@ extern bool vect_is_simple_use (vec_info *, stmt_vec_info, slp_tree, enum vect_def_type *, tree *, stmt_vec_info * = NULL); extern bool vect_maybe_update_slp_op_vectype (slp_tree, tree); -extern bool supportable_widening_operation (vec_info *, - enum tree_code, stmt_vec_info, - tree, tree, enum tree_code *, - enum tree_code *, int *, - vec<tree> *); +extern bool supportable_widening_operation (vec_info*, code_helper, + stmt_vec_info, tree, tree, + code_helper*, code_helper*, + int*, vec<tree> *); extern bool supportable_narrowing_operation (enum tree_code, tree, tree, - enum tree_code *, int *, + tree_code *, int *, vec<tree> *); extern unsigned record_stmt_cost (stmt_vector_for_cost *, int, @@ -2558,4 +2557,7 @@ vect_is_integer_truncation (stmt_vec_info stmt_info) && TYPE_PRECISION (lhs_type) < TYPE_PRECISION (rhs_type)); } +/* Build a GIMPLE_ASSIGN or GIMPLE_CALL with the tree_code, + or internal_fn contained in ch, respectively. 
*/ +gimple * vect_gimple_build (tree, code_helper, tree, tree); #endif /* GCC_TREE_VECTORIZER_H */ diff --git a/gcc/tree.h b/gcc/tree.h index 6f6ad5a3a5f4dd4173482dfe259acf539ba24000..24b5184122550fe21ab0a5387867b6c65c20bb03 100644 --- a/gcc/tree.h +++ b/gcc/tree.h @@ -93,6 +93,8 @@ public: bool is_internal_fn () const; bool is_builtin_fn () const; int get_rep () const { return rep; } + enum tree_code safe_as_tree_code () const; + combined_fn safe_as_fn_code () const; bool operator== (const code_helper &other) { return rep == other.rep; } bool operator!= (const code_helper &other) { return rep != other.rep; } bool operator== (tree_code c) { return rep == code_helper (c).rep; } @@ -102,6 +104,17 @@ private: int rep; }; +inline enum tree_code +code_helper::safe_as_tree_code () const +{ + return is_tree_code () ? (tree_code)* this : MAX_TREE_CODES; +} + +inline combined_fn +code_helper::safe_as_fn_code () const { + return is_fn_code () ? (combined_fn) *this : CFN_LAST; +} + inline code_helper::operator internal_fn () const { return as_internal_fn (combined_fn (*this)); -- 2.17.1 [-- Attachment #3: 0002-Refactor-widen_plus-as-internal_fn.patch --] [-- Type: application/octet-stream, Size: 22832 bytes --] From 1e8afa697157c3cb520a36304326e14891444226 Mon Sep 17 00:00:00 2001 From: Joel Hutton <joel.hutton@arm.com> Date: Wed, 26 Jan 2022 14:00:17 +0000 Subject: [PATCH 2/3] Refactor widen_plus as internal_fn This patch replaces the existing tree_code widen_plus and widen_minus patterns with internal_fn versions. DEF_INTERNAL_OPTAB_MULTI_FN is like DEF_INTERNAL_OPTAB_FN except it provides convenience wrappers for defining conversions that require a hi/lo split, like widening and narrowing operations. Each definition for <NAME> will require an optab named <OPTAB> and two other optabs that you specify for signed and unsigned. The hi/lo pair is necessary because the widening operations take n narrow elements as inputs and return n/2 wide elements as outputs. 
The 'lo' operation operates on the first n/2 elements of input. The 'hi' operation operates on the second n/2 elements of input. Defining an internal_fn along with hi/lo variations allows a single internal function to be returned from a vect_recog function that will later be expanded to hi/lo. DEF_INTERNAL_OPTAB_MULTI_FN is used in internal-fn.def to register a widening internal_fn. It is defined differently in different places and internal-fn.def is sourced from those places so the parameters given can be reused. internal-fn.c: defined to expand to hi/lo signed/unsigned optabs, later defined to generate the 'expand_' functions for the hi/lo versions of the fn. internal-fn.def: defined to invoke DEF_INTERNAL_OPTAB_FN for the original and hi/lo variants of the internal_fn For example: IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO for aarch64: IFN_VEC_WIDEN_PLUS_HI -> vec_widen_<su>addl_hi_<mode> -> (u/s)addl2 IFN_VEC_WIDEN_PLUS_LO -> vec_widen_<su>addl_lo_<mode> -> (u/s)addl This gives the same functionality as the previous WIDEN_PLUS/WIDEN_MINUS tree codes which are expanded into VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI. gcc/ChangeLog: 2022-04-13 Joel Hutton <joel.hutton@arm.com> 2022-04-13 Tamar Christina <tamar.christina@arm.com> * internal-fn.cc (INCLUDE_MAP): Include maps for use in optab lookup. (DEF_INTERNAL_OPTAB_MULTI_FN): Macro to define an internal_fn that expands into multiple internal_fns (for widening). (ifn_cmp): Function to compare ifn's for sorting/searching. (lookup_multi_ifn_optab): Add lookup function. (lookup_multi_internal_fn): Add lookup function. (commutative_binary_fn_p): Add widen_plus fn's. * internal-fn.def (DEF_INTERNAL_OPTAB_MULTI_FN): Define widening plus,minus functions. (VEC_WIDEN_PLUS): Replacement for VEC_WIDEN_PLUS tree code. (VEC_WIDEN_MINUS): Replacement for VEC_WIDEN_MINUS tree code. * internal-fn.h (GCC_INTERNAL_FN_H): Add headers. (lookup_multi_ifn_optab): Add prototype. 
(lookup_multi_internal_fn): Add prototype. * optabs.cc (commutative_optab_p): Add widening plus, minus optabs. * optabs.def (OPTAB_CD): widen add, sub optabs * tree-core.h (ECF_MULTI): Flag to indicate if a function decays into hi/lo parts. * tree-vect-patterns.cc (vect_recog_widen_op_pattern): Support patterns with a hi/lo split. (vect_recog_widen_plus_pattern): Refactor to return IFN_VECT_WIDEN_PLUS. (vect_recog_widen_minus_pattern): Refactor to return new IFN_VEC_WIDEN_MINUS. * tree-vect-stmts.cc (vectorizable_conversion): Add widen plus/minus ifn support. (supportable_widening_operation): Add widen plus/minus ifn support. gcc/testsuite/ChangeLog: * gcc.target/aarch64/vect-widen-add.c: Test that new IFN_VEC_WIDEN_PLUS is being used. * gcc.target/aarch64/vect-widen-sub.c: Test that new IFN_VEC_WIDEN_MINUS is being used. --- gcc/internal-fn.cc | 107 ++++++++++++++++++ gcc/internal-fn.def | 23 ++++ gcc/internal-fn.h | 7 ++ gcc/optabs.cc | 12 +- gcc/optabs.def | 2 + .../gcc.target/aarch64/vect-widen-add.c | 4 +- .../gcc.target/aarch64/vect-widen-sub.c | 4 +- gcc/tree-core.h | 4 + gcc/tree-vect-patterns.cc | 37 ++++-- gcc/tree-vect-stmts.cc | 68 +++++++++-- 10 files changed, 249 insertions(+), 19 deletions(-) diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc index 91588f8bc9f7c3fe2bac17f3c4e6078cddb7b4d2..b2cb3e508027a84e4456d676d78b27b6c04b7b61 100644 --- a/gcc/internal-fn.cc +++ b/gcc/internal-fn.cc @@ -17,6 +17,7 @@ You should have received a copy of the GNU General Public License along with GCC; see the file COPYING3. If not see <http://www.gnu.org/licenses/>. 
*/ +#define INCLUDE_MAP #include "config.h" #include "system.h" #include "coretypes.h" @@ -70,6 +71,26 @@ const int internal_fn_flags_array[] = { 0 }; +const enum internal_fn internal_fn_hilo_keys_array[] = { +#undef DEF_INTERNAL_OPTAB_MULTI_FN +#define DEF_INTERNAL_OPTAB_MULTI_FN(NAME, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \ + IFN_##NAME##_LO, \ + IFN_##NAME##_HI, +#include "internal-fn.def" + IFN_LAST +#undef DEF_INTERNAL_OPTAB_MULTI_FN +}; + +const optab internal_fn_hilo_values_array[] = { +#undef DEF_INTERNAL_OPTAB_MULTI_FN +#define DEF_INTERNAL_OPTAB_MULTI_FN(NAME, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \ + SOPTAB##_lo_optab, UOPTAB##_lo_optab, \ + SOPTAB##_hi_optab, UOPTAB##_hi_optab, +#include "internal-fn.def" + unknown_optab, unknown_optab +#undef DEF_INTERNAL_OPTAB_MULTI_FN +}; + /* Return the internal function called NAME, or IFN_LAST if there's no such function. */ @@ -90,6 +111,62 @@ lookup_internal_fn (const char *name) return entry ? *entry : IFN_LAST; } +static int +ifn_cmp (const void *a_, const void *b_) +{ + typedef std::pair<enum internal_fn, unsigned> ifn_pair; + auto *a = (const std::pair<ifn_pair, optab> *)a_; + auto *b = (const std::pair<ifn_pair, optab> *)b_; + return (int) (a->first.first) - (b->first.first); +} + +/* Return the optab belonging to the given internal function NAME for the given + SIGN or unknown_optab. 
*/ + +optab +lookup_multi_ifn_optab (enum internal_fn fn, unsigned sign) +{ + typedef std::pair<enum internal_fn, unsigned> ifn_pair; + typedef auto_vec <std::pair<ifn_pair, optab>>fn_to_optab_map_type; + static fn_to_optab_map_type *fn_to_optab_map; + + if (!fn_to_optab_map) + { + unsigned num + = sizeof (internal_fn_hilo_keys_array) / sizeof (enum internal_fn); + fn_to_optab_map = new fn_to_optab_map_type (); + for (unsigned int i = 0; i < num - 1; ++i) + { + enum internal_fn fn = internal_fn_hilo_keys_array[i]; + optab v1 = internal_fn_hilo_values_array[2*i]; + optab v2 = internal_fn_hilo_values_array[2*i + 1]; + ifn_pair key1 (fn, 0); + fn_to_optab_map->safe_push ({key1, v1}); + ifn_pair key2 (fn, 1); + fn_to_optab_map->safe_push ({key2, v2}); + } + fn_to_optab_map->qsort (ifn_cmp); + } + + ifn_pair new_pair (fn, sign ? 1 : 0); + optab tmp; + std::pair<ifn_pair,optab> pair_wrap (new_pair, tmp); + auto entry = fn_to_optab_map->bsearch (&pair_wrap, ifn_cmp); + return entry != fn_to_optab_map->end () ? entry->second : unknown_optab; +} + +extern void +lookup_multi_internal_fn (enum internal_fn ifn, enum internal_fn *lo, + enum internal_fn *hi) +{ + int ecf_flags = internal_fn_flags (ifn); + gcc_assert (ecf_flags & ECF_MULTI); + + *lo = internal_fn (ifn + 1); + *hi = internal_fn (ifn + 2); +} + + /* Fnspec of each internal function, indexed by function number. 
*/ const_tree internal_fn_fnspec_array[IFN_LAST + 1]; @@ -3928,6 +4005,9 @@ commutative_binary_fn_p (internal_fn fn) case IFN_UBSAN_CHECK_MUL: case IFN_ADD_OVERFLOW: case IFN_MUL_OVERFLOW: + case IFN_VEC_WIDEN_PLUS: + case IFN_VEC_WIDEN_PLUS_LO: + case IFN_VEC_WIDEN_PLUS_HI: return true; default: @@ -4013,6 +4093,32 @@ set_edom_supported_p (void) #endif } +#undef DEF_INTERNAL_OPTAB_MULTI_FN +#define DEF_INTERNAL_OPTAB_MULTI_FN(CODE, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \ + static void \ + expand_##CODE (internal_fn, gcall *) \ + { \ + gcc_unreachable (); \ + } \ + static void \ + expand_##CODE##_LO (internal_fn fn, gcall *stmt) \ + { \ + tree ty = TREE_TYPE (gimple_get_lhs (stmt)); \ + if (!TYPE_UNSIGNED (ty)) \ + expand_##TYPE##_optab_fn (fn, stmt, SOPTAB##_lo##_optab); \ + else \ + expand_##TYPE##_optab_fn (fn, stmt, UOPTAB##_lo##_optab); \ + } \ + static void \ + expand_##CODE##_HI (internal_fn fn, gcall *stmt) \ + { \ + tree ty = TREE_TYPE (gimple_get_lhs (stmt)); \ + if (!TYPE_UNSIGNED (ty)) \ + expand_##TYPE##_optab_fn (fn, stmt, SOPTAB##_hi##_optab); \ + else \ + expand_##TYPE##_optab_fn (fn, stmt, UOPTAB##_hi##_optab); \ + } + #define DEF_INTERNAL_OPTAB_FN(CODE, FLAGS, OPTAB, TYPE) \ static void \ expand_##CODE (internal_fn fn, gcall *stmt) \ @@ -4029,6 +4135,7 @@ set_edom_supported_p (void) expand_##TYPE##_optab_fn (fn, stmt, which_optab); \ } #include "internal-fn.def" +#undef DEF_INTERNAL_OPTAB_MULTI_FN /* Routines to expand each internal function, indexed by function number. Each routine has the prototype: diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def index d2d550d358606022b1cb44fa842f06e0be507bc3..4635a9c8af9ad27bb05d7510388d0fe2270428e5 100644 --- a/gcc/internal-fn.def +++ b/gcc/internal-fn.def @@ -82,6 +82,13 @@ along with GCC; see the file COPYING3. If not see says that the function extends the C-level BUILT_IN_<NAME>{,L,LL,IMAX} group of functions to any integral mode (including vector modes). 
+ DEF_INTERNAL_OPTAB_MULTI_FN is like DEF_INTERNAL_OPTAB_FN except it + provides convenience wrappers for defining conversions that require a + hi/lo split, like widening and narrowing operations. Each definition + for <NAME> will require an optab named <OPTAB> and two other optabs that + you specify for signed and unsigned. + + Each entry must have a corresponding expander of the form: void expand_NAME (gimple_call stmt) @@ -120,6 +127,14 @@ along with GCC; see the file COPYING3. If not see DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE) #endif +#ifndef DEF_INTERNAL_OPTAB_MULTI_FN +#define DEF_INTERNAL_OPTAB_MULTI_FN(NAME, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \ + DEF_INTERNAL_OPTAB_FN (NAME, FLAGS | ECF_MULTI, OPTAB, TYPE) \ + DEF_INTERNAL_OPTAB_FN (NAME ## _LO, FLAGS, unknown, TYPE) \ + DEF_INTERNAL_OPTAB_FN (NAME ## _HI, FLAGS, unknown, TYPE) +#endif + + DEF_INTERNAL_OPTAB_FN (MASK_LOAD, ECF_PURE, maskload, mask_load) DEF_INTERNAL_OPTAB_FN (LOAD_LANES, ECF_CONST, vec_load_lanes, load_lanes) DEF_INTERNAL_OPTAB_FN (MASK_LOAD_LANES, ECF_PURE, @@ -292,6 +307,14 @@ DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT270, ECF_CONST, cadd270, binary) DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL, ECF_CONST, cmul, binary) DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL_CONJ, ECF_CONST, cmul_conj, binary) DEF_INTERNAL_OPTAB_FN (VEC_ADDSUB, ECF_CONST, vec_addsub, binary) +DEF_INTERNAL_OPTAB_MULTI_FN (VEC_WIDEN_PLUS, + ECF_CONST | ECF_WIDEN | ECF_NOTHROW, + vec_widen_add, vec_widen_saddl, vec_widen_uaddl, + binary) +DEF_INTERNAL_OPTAB_MULTI_FN (VEC_WIDEN_MINUS, + ECF_CONST | ECF_WIDEN | ECF_NOTHROW, + vec_widen_sub, vec_widen_ssubl, vec_widen_usubl, + binary) DEF_INTERNAL_OPTAB_FN (VEC_FMADDSUB, ECF_CONST, vec_fmaddsub, ternary) DEF_INTERNAL_OPTAB_FN (VEC_FMSUBADD, ECF_CONST, vec_fmsubadd, ternary) diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h index 23c014a963c4d72da92c763db87ee486a2adb485..b35de19747d251d19dc13de1e0323368bd2ebdf2 100644 --- a/gcc/internal-fn.h +++ b/gcc/internal-fn.h @@ -20,6 +20,10 @@ 
along with GCC; see the file COPYING3. If not see #ifndef GCC_INTERNAL_FN_H #define GCC_INTERNAL_FN_H +#include "insn-codes.h" +#include "insn-opinit.h" + + /* INTEGER_CST values for IFN_UNIQUE function arg-0. UNSPEC: Undifferentiated UNIQUE. @@ -112,6 +116,9 @@ internal_fn_name (enum internal_fn fn) } extern internal_fn lookup_internal_fn (const char *); +extern optab lookup_multi_ifn_optab (enum internal_fn, unsigned); +extern void lookup_multi_internal_fn (enum internal_fn, enum internal_fn *, + enum internal_fn *); /* Return the ECF_* flags for function FN. */ diff --git a/gcc/optabs.cc b/gcc/optabs.cc index a50dd798f2a454ac54e247f3e6cbab17577ea304..f9be369a6c5b99de5bbad664a11364d1c2cc4b95 100644 --- a/gcc/optabs.cc +++ b/gcc/optabs.cc @@ -1312,7 +1312,17 @@ commutative_optab_p (optab binoptab) || binoptab == smul_widen_optab || binoptab == umul_widen_optab || binoptab == smul_highpart_optab - || binoptab == umul_highpart_optab); + || binoptab == umul_highpart_optab + || binoptab == vec_widen_add_optab + || binoptab == vec_widen_sub_optab + || binoptab == vec_widen_saddl_hi_optab + || binoptab == vec_widen_saddl_lo_optab + || binoptab == vec_widen_ssubl_hi_optab + || binoptab == vec_widen_ssubl_lo_optab + || binoptab == vec_widen_uaddl_hi_optab + || binoptab == vec_widen_uaddl_lo_optab + || binoptab == vec_widen_usubl_hi_optab + || binoptab == vec_widen_usubl_lo_optab); } /* X is to be used in mode MODE as operand OPN to BINOPTAB. 
If we're diff --git a/gcc/optabs.def b/gcc/optabs.def index 801310ebaa7d469520809bb7efed6820f8eb866b..a7881dcb49e4ef07d8f07aa31214eb3a7a944e2e 100644 --- a/gcc/optabs.def +++ b/gcc/optabs.def @@ -78,6 +78,8 @@ OPTAB_CD(smsub_widen_optab, "msub$b$a4") OPTAB_CD(umsub_widen_optab, "umsub$b$a4") OPTAB_CD(ssmsub_widen_optab, "ssmsub$b$a4") OPTAB_CD(usmsub_widen_optab, "usmsub$a$b4") +OPTAB_CD(vec_widen_add_optab, "add$a$b3") +OPTAB_CD(vec_widen_sub_optab, "sub$a$b3") OPTAB_CD(vec_load_lanes_optab, "vec_load_lanes$a$b") OPTAB_CD(vec_store_lanes_optab, "vec_store_lanes$a$b") OPTAB_CD(vec_mask_load_lanes_optab, "vec_mask_load_lanes$a$b") diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c index 220bd9352a4c7acd2e3713e441d74898d3e92b30..7037673d32bd780e1c9b58a51e58e2bac3b30b7e 100644 --- a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c +++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c @@ -1,5 +1,5 @@ /* { dg-do run } */ -/* { dg-options "-O3 -save-temps" } */ +/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */ #include <stdint.h> #include <string.h> @@ -86,6 +86,8 @@ main() return 0; } +/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_LO" "vect" } } */ +/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_HI" "vect" } } */ /* { dg-final { scan-assembler-times {\tuaddl\t} 1} } */ /* { dg-final { scan-assembler-times {\tuaddl2\t} 1} } */ /* { dg-final { scan-assembler-times {\tsaddl\t} 1} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c index a2bed63affbd091977df95a126da1f5b8c1d41d2..83bc1edb6105f47114b665e24a13e6194b2179a2 100644 --- a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c +++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c @@ -1,5 +1,5 @@ /* { dg-do run } */ -/* { dg-options "-O3 -save-temps" } */ +/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */ #include <stdint.h> #include 
<string.h> @@ -86,6 +86,8 @@ main() return 0; } +/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_LO" "vect" } } */ +/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_HI" "vect" } } */ /* { dg-final { scan-assembler-times {\tusubl\t} 1} } */ /* { dg-final { scan-assembler-times {\tusubl2\t} 1} } */ /* { dg-final { scan-assembler-times {\tssubl\t} 1} } */ diff --git a/gcc/tree-core.h b/gcc/tree-core.h index cff6211080bced0bffb39e98039a6550897acf77..d0c8b812cfb9c3ac83bf25fff0431b08cb7d823d 100644 --- a/gcc/tree-core.h +++ b/gcc/tree-core.h @@ -99,6 +99,10 @@ struct die_struct; /* Nonzero if this is a widening function. */ #define ECF_WIDEN (1 << 16) +/* Nonzero if this is a function that decomposes into a lo/hi operation. */ +#define ECF_MULTI (1 << 17) + + /* Call argument flags. */ /* Nonzero if the argument is not used by the function. */ diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc index d892158f024fc045b897aebe76f2e2b66211cf83..62ca28d725ed4ac8d7e4d493119e40772a0fbac6 100644 --- a/gcc/tree-vect-patterns.cc +++ b/gcc/tree-vect-patterns.cc @@ -1351,14 +1351,16 @@ static gimple * vect_recog_widen_op_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info, tree *type_out, tree_code orig_code, code_helper wide_code, - bool shift_p, const char *name) + bool shift_p, const char *name, + enum optab_subtype *subtype = NULL) { gimple *last_stmt = last_stmt_info->stmt; vect_unpromoted_value unprom[2]; tree half_type; if (!vect_widened_op_tree (vinfo, last_stmt_info, orig_code, orig_code, - shift_p, 2, unprom, &half_type)) + shift_p, 2, unprom, &half_type, subtype)) + return NULL; /* Pattern detected. 
*/ @@ -1424,6 +1426,20 @@ vect_recog_widen_op_pattern (vec_info *vinfo, type, pattern_stmt, vecctype); } +static gimple * +vect_recog_widen_op_pattern (vec_info *vinfo, + stmt_vec_info last_stmt_info, tree *type_out, + tree_code orig_code, internal_fn wide_ifn, + bool shift_p, const char *name, + enum optab_subtype *subtype = NULL) +{ + combined_fn ifn = as_combined_fn (wide_ifn); + return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out, + orig_code, ifn, shift_p, name, + subtype); +} + + /* Try to detect multiplication on widened inputs, converting MULT_EXPR to WIDEN_MULT_EXPR. See vect_recog_widen_op_pattern for details. */ @@ -1437,26 +1453,30 @@ vect_recog_widen_mult_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info, } /* Try to detect addition on widened inputs, converting PLUS_EXPR - to WIDEN_PLUS_EXPR. See vect_recog_widen_op_pattern for details. */ + to IFN_VEC_WIDEN_PLUS. See vect_recog_widen_op_pattern for details. */ static gimple * vect_recog_widen_plus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info, tree *type_out) { + enum optab_subtype subtype; return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out, - PLUS_EXPR, WIDEN_PLUS_EXPR, false, - "vect_recog_widen_plus_pattern"); + PLUS_EXPR, IFN_VEC_WIDEN_PLUS, + false, "vect_recog_widen_plus_pattern", + &subtype); } /* Try to detect subtraction on widened inputs, converting MINUS_EXPR - to WIDEN_MINUS_EXPR. See vect_recog_widen_op_pattern for details. */ + to IFN_VEC_WIDEN_MINUS. See vect_recog_widen_op_pattern for details. 
*/ static gimple * vect_recog_widen_minus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info, tree *type_out) { + enum optab_subtype subtype; return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out, - MINUS_EXPR, WIDEN_MINUS_EXPR, false, - "vect_recog_widen_minus_pattern"); + MINUS_EXPR, IFN_VEC_WIDEN_MINUS, + false, "vect_recog_widen_minus_pattern", + &subtype); } /* Function vect_recog_popcount_pattern @@ -5629,6 +5649,7 @@ static vect_recog_func vect_vect_recog_func_ptrs[] = { { vect_recog_mask_conversion_pattern, "mask_conversion" }, { vect_recog_widen_plus_pattern, "widen_plus" }, { vect_recog_widen_minus_pattern, "widen_minus" }, + /* These must come after the double widening ones. */ }; const unsigned int NUM_PATTERNS = ARRAY_SIZE (vect_vect_recog_func_ptrs); diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index d6aabb873c86ab8ff0bae41c7f6c3bad34d583c5..6fa0669fdfc8630842b3f9f32f4b4a253e79bb92 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -4903,7 +4903,9 @@ vectorizable_conversion (vec_info *vinfo, bool widen_arith = (code == WIDEN_PLUS_EXPR || code == WIDEN_MINUS_EXPR || code == WIDEN_MULT_EXPR - || code == WIDEN_LSHIFT_EXPR); + || code == WIDEN_LSHIFT_EXPR + || code == IFN_VEC_WIDEN_PLUS + || code == IFN_VEC_WIDEN_MINUS); if (!widen_arith && !CONVERT_EXPR_CODE_P (code) @@ -4953,7 +4955,9 @@ vectorizable_conversion (vec_info *vinfo, gcc_assert (code == WIDEN_MULT_EXPR || code == WIDEN_LSHIFT_EXPR || code == WIDEN_PLUS_EXPR - || code == WIDEN_MINUS_EXPR); + || code == WIDEN_MINUS_EXPR + || code == IFN_VEC_WIDEN_PLUS + || code == IFN_VEC_WIDEN_MINUS); op1 = is_gimple_assign (stmt) ? 
gimple_assign_rhs2 (stmt) : @@ -12130,14 +12134,62 @@ supportable_widening_operation (vec_info *vinfo, optab1 = vec_unpacks_sbool_lo_optab; optab2 = vec_unpacks_sbool_hi_optab; } - else - { - optab1 = optab_for_tree_code (c1.safe_as_tree_code (), vectype, - optab_default); - optab2 = optab_for_tree_code (c2.safe_as_tree_code (), vectype, - optab_default); + + if (code.is_fn_code ()) + { + internal_fn ifn = as_internal_fn (code.safe_as_fn_code ()); + int ecf_flags = internal_fn_flags (ifn); + gcc_assert (ecf_flags & ECF_MULTI); + + switch (code.safe_as_fn_code ()) + { + case CFN_VEC_WIDEN_PLUS: + break; + case CFN_VEC_WIDEN_MINUS: + break; + case CFN_LAST: + default: + return false; + } + + internal_fn lo, hi; + lookup_multi_internal_fn (ifn, &lo, &hi); + *code1 = as_combined_fn (lo); + *code2 = as_combined_fn (hi); + optab1 = lookup_multi_ifn_optab (lo, !TYPE_UNSIGNED (vectype)); + optab2 = lookup_multi_ifn_optab (hi, !TYPE_UNSIGNED (vectype)); } + if (code.is_tree_code ()) + { + if (code == FIX_TRUNC_EXPR) + { + /* The signedness is determined from output operand. */ + optab1 = optab_for_tree_code (c1.safe_as_tree_code (), vectype_out, + optab_default); + optab2 = optab_for_tree_code (c2.safe_as_tree_code (), vectype_out, + optab_default); + } + else if (CONVERT_EXPR_CODE_P (code.safe_as_tree_code ()) + && VECTOR_BOOLEAN_TYPE_P (wide_vectype) + && VECTOR_BOOLEAN_TYPE_P (vectype) + && TYPE_MODE (wide_vectype) == TYPE_MODE (vectype) + && SCALAR_INT_MODE_P (TYPE_MODE (vectype))) + { + /* If the input and result modes are the same, a different optab + is needed where we pass in the number of units in vectype. 
*/ + optab1 = vec_unpacks_sbool_lo_optab; + optab2 = vec_unpacks_sbool_hi_optab; + } + else + { + optab1 = optab_for_tree_code (c1.safe_as_tree_code (), vectype, + optab_default); + optab2 = optab_for_tree_code (c2.safe_as_tree_code (), vectype, + optab_default); + } + } + if (!optab1 || !optab2) return false; -- 2.17.1 [-- Attachment #4: 0003-Remove-widen_plus-minus_expr-tree-codes.patch --] [-- Type: application/octet-stream, Size: 19552 bytes --] From 60664218e6e59510f02fb64b49a236e9e5b26c9f Mon Sep 17 00:00:00 2001 From: Joel Hutton <joel.hutton@arm.com> Date: Fri, 28 Jan 2022 12:04:44 +0000 Subject: [PATCH 3/3] Remove widen_plus/minus_expr tree codes This patch removes the old widen plus/minus tree codes which have been replaced by internal functions. gcc/ChangeLog: * doc/generic.texi: Remove old tree codes. * expr.cc (expand_expr_real_2): Remove old tree code cases. * gimple-pretty-print.cc (dump_binary_rhs): Remove old tree code cases. * optabs-tree.cc (optab_for_tree_code): Remove old tree code cases. (supportable_half_widening_operation): Remove old tree code cases. * tree-cfg.cc (verify_gimple_assign_binary): Remove old tree code cases. * tree-inline.cc (estimate_operator_cost): Remove old tree code cases. * tree-pretty-print.cc (dump_generic_node): Remove tree code definition. (op_symbol_code): Remove old tree code cases. * tree-vect-data-refs.cc (vect_get_smallest_scalar_type): Remove old tree code cases. (vect_analyze_data_ref_accesses): Remove old tree code cases. * tree-vect-generic.cc (expand_vector_operations_1): Remove old tree code cases. * tree-vect-patterns.cc (vect_widened_op_tree): Refactor ot replace usage in vect_recog_sad_pattern. (vect_recog_sad_pattern): Replace tree code widening pattern with internal function. (vect_recog_average_pattern): Replace tree code widening pattern with internal function. * tree-vect-stmts.cc (vectorizable_conversion): Remove old tree code cases. (supportable_widening_operation): Remove old tree code cases. 
* tree.def (WIDEN_PLUS_EXPR): Remove tree code definition. (WIDEN_MINUS_EXPR): Remove tree code definition. (VEC_WIDEN_PLUS_HI_EXPR): Remove tree code definition. (VEC_WIDEN_PLUS_LO_EXPR): Remove tree code definition. (VEC_WIDEN_MINUS_HI_EXPR): Remove tree code definition. (VEC_WIDEN_MINUS_LO_EXPR): Remove tree code definition. --- gcc/doc/generic.texi | 31 ------------------------------- gcc/expr.cc | 6 ------ gcc/gimple-pretty-print.cc | 4 ---- gcc/optabs-tree.cc | 24 ------------------------ gcc/tree-cfg.cc | 6 ------ gcc/tree-inline.cc | 6 ------ gcc/tree-pretty-print.cc | 12 ------------ gcc/tree-vect-data-refs.cc | 8 +++----- gcc/tree-vect-generic.cc | 4 ---- gcc/tree-vect-patterns.cc | 36 +++++++++++++++++++++++++----------- gcc/tree-vect-stmts.cc | 18 ++---------------- gcc/tree.def | 6 ------ 12 files changed, 30 insertions(+), 131 deletions(-) diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi index e5f9d1be8ea81f3da002ec3bb925590d331a2551..344045efd419b0cc3a11771acf70d2fd279c48ac 100644 --- a/gcc/doc/generic.texi +++ b/gcc/doc/generic.texi @@ -1811,10 +1811,6 @@ a value from @code{enum annot_expr_kind}, the third is an @code{INTEGER_CST}. @tindex VEC_RSHIFT_EXPR @tindex VEC_WIDEN_MULT_HI_EXPR @tindex VEC_WIDEN_MULT_LO_EXPR -@tindex VEC_WIDEN_PLUS_HI_EXPR -@tindex VEC_WIDEN_PLUS_LO_EXPR -@tindex VEC_WIDEN_MINUS_HI_EXPR -@tindex VEC_WIDEN_MINUS_LO_EXPR @tindex VEC_UNPACK_HI_EXPR @tindex VEC_UNPACK_LO_EXPR @tindex VEC_UNPACK_FLOAT_HI_EXPR @@ -1861,33 +1857,6 @@ vector of @code{N/2} products. In the case of @code{VEC_WIDEN_MULT_LO_EXPR} the low @code{N/2} elements of the two vector are multiplied to produce the vector of @code{N/2} products. -@item VEC_WIDEN_PLUS_HI_EXPR -@itemx VEC_WIDEN_PLUS_LO_EXPR -These nodes represent widening vector addition of the high and low parts of -the two input vectors, respectively. Their operands are vectors that contain -the same number of elements (@code{N}) of the same integral type. 
The result -is a vector that contains half as many elements, of an integral type whose size -is twice as wide. In the case of @code{VEC_WIDEN_PLUS_HI_EXPR} the high -@code{N/2} elements of the two vectors are added to produce the vector of -@code{N/2} products. In the case of @code{VEC_WIDEN_PLUS_LO_EXPR} the low -@code{N/2} elements of the two vectors are added to produce the vector of -@code{N/2} products. - -@item VEC_WIDEN_MINUS_HI_EXPR -@itemx VEC_WIDEN_MINUS_LO_EXPR -These nodes represent widening vector subtraction of the high and low parts of -the two input vectors, respectively. Their operands are vectors that contain -the same number of elements (@code{N}) of the same integral type. The high/low -elements of the second vector are subtracted from the high/low elements of the -first. The result is a vector that contains half as many elements, of an -integral type whose size is twice as wide. In the case of -@code{VEC_WIDEN_MINUS_HI_EXPR} the high @code{N/2} elements of the second -vector are subtracted from the high @code{N/2} of the first to produce the -vector of @code{N/2} products. In the case of -@code{VEC_WIDEN_MINUS_LO_EXPR} the low @code{N/2} elements of the second -vector are subtracted from the low @code{N/2} of the first to produce the -vector of @code{N/2} products. - @item VEC_UNPACK_HI_EXPR @itemx VEC_UNPACK_LO_EXPR These nodes represent unpacking of the high and low parts of the input vector, diff --git a/gcc/expr.cc b/gcc/expr.cc index 5d66c9f21f0ccd2eafb322eb9001f0dc873e35b4..b80385d51ba22172750d94535e04c82f75661255 100644 --- a/gcc/expr.cc +++ b/gcc/expr.cc @@ -9386,8 +9386,6 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode, target, unsignedp); return target; - case WIDEN_PLUS_EXPR: - case WIDEN_MINUS_EXPR: case WIDEN_MULT_EXPR: /* If first operand is constant, swap them. 
Thus the following special case checks need only @@ -10165,10 +10163,6 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode, return temp; } - case VEC_WIDEN_PLUS_HI_EXPR: - case VEC_WIDEN_PLUS_LO_EXPR: - case VEC_WIDEN_MINUS_HI_EXPR: - case VEC_WIDEN_MINUS_LO_EXPR: case VEC_WIDEN_MULT_HI_EXPR: case VEC_WIDEN_MULT_LO_EXPR: case VEC_WIDEN_MULT_EVEN_EXPR: diff --git a/gcc/gimple-pretty-print.cc b/gcc/gimple-pretty-print.cc index ebd87b20a0adc080c4a8f9429e75f49b96e72f9a..2a1a5b7f811ca341e8ee7e85a9701d3a37ff80bf 100644 --- a/gcc/gimple-pretty-print.cc +++ b/gcc/gimple-pretty-print.cc @@ -459,10 +459,6 @@ dump_binary_rhs (pretty_printer *buffer, const gassign *gs, int spc, case VEC_PACK_FLOAT_EXPR: case VEC_WIDEN_LSHIFT_HI_EXPR: case VEC_WIDEN_LSHIFT_LO_EXPR: - case VEC_WIDEN_PLUS_HI_EXPR: - case VEC_WIDEN_PLUS_LO_EXPR: - case VEC_WIDEN_MINUS_HI_EXPR: - case VEC_WIDEN_MINUS_LO_EXPR: case VEC_SERIES_EXPR: for (p = get_tree_code_name (code); *p; p++) pp_character (buffer, TOUPPER (*p)); diff --git a/gcc/optabs-tree.cc b/gcc/optabs-tree.cc index 8383fe820b080f6d66f83dcf3b77d3c9f869f4bc..2f5f93dc6624f86f6b5618cf6e7aa2b508053a64 100644 --- a/gcc/optabs-tree.cc +++ b/gcc/optabs-tree.cc @@ -190,22 +190,6 @@ optab_for_tree_code (enum tree_code code, const_tree type, return (TYPE_UNSIGNED (type) ? vec_widen_ushiftl_lo_optab : vec_widen_sshiftl_lo_optab); - case VEC_WIDEN_PLUS_LO_EXPR: - return (TYPE_UNSIGNED (type) - ? vec_widen_uaddl_lo_optab : vec_widen_saddl_lo_optab); - - case VEC_WIDEN_PLUS_HI_EXPR: - return (TYPE_UNSIGNED (type) - ? vec_widen_uaddl_hi_optab : vec_widen_saddl_hi_optab); - - case VEC_WIDEN_MINUS_LO_EXPR: - return (TYPE_UNSIGNED (type) - ? vec_widen_usubl_lo_optab : vec_widen_ssubl_lo_optab); - - case VEC_WIDEN_MINUS_HI_EXPR: - return (TYPE_UNSIGNED (type) - ? vec_widen_usubl_hi_optab : vec_widen_ssubl_hi_optab); - case VEC_UNPACK_HI_EXPR: return (TYPE_UNSIGNED (type) ? 
vec_unpacku_hi_optab : vec_unpacks_hi_optab); @@ -312,8 +296,6 @@ optab_for_tree_code (enum tree_code code, const_tree type, 'hi'/'lo' pair using codes such as VEC_WIDEN_MINUS_HI/LO. Supported widening operations: - WIDEN_MINUS_EXPR - WIDEN_PLUS_EXPR WIDEN_MULT_EXPR WIDEN_LSHIFT_EXPR @@ -345,12 +327,6 @@ supportable_half_widening_operation (enum tree_code code, tree vectype_out, case WIDEN_LSHIFT_EXPR: *code1 = LSHIFT_EXPR; break; - case WIDEN_MINUS_EXPR: - *code1 = MINUS_EXPR; - break; - case WIDEN_PLUS_EXPR: - *code1 = PLUS_EXPR; - break; case WIDEN_MULT_EXPR: *code1 = MULT_EXPR; break; diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc index bfcb1425f7e2e46e3d525808adda11560041dd68..757c6c73e351c13bc6695699d9f449530546f70f 100644 --- a/gcc/tree-cfg.cc +++ b/gcc/tree-cfg.cc @@ -3951,8 +3951,6 @@ verify_gimple_assign_binary (gassign *stmt) return false; } - case WIDEN_PLUS_EXPR: - case WIDEN_MINUS_EXPR: case PLUS_EXPR: case MINUS_EXPR: { @@ -4073,10 +4071,6 @@ verify_gimple_assign_binary (gassign *stmt) return false; } - case VEC_WIDEN_MINUS_HI_EXPR: - case VEC_WIDEN_MINUS_LO_EXPR: - case VEC_WIDEN_PLUS_HI_EXPR: - case VEC_WIDEN_PLUS_LO_EXPR: case VEC_WIDEN_MULT_HI_EXPR: case VEC_WIDEN_MULT_LO_EXPR: case VEC_WIDEN_MULT_EVEN_EXPR: diff --git a/gcc/tree-inline.cc b/gcc/tree-inline.cc index 043e1d5987a4c4b0159109dafb85a805ca828c1e..c0bebb7f4de36838341ed62389ad0e2b79f03034 100644 --- a/gcc/tree-inline.cc +++ b/gcc/tree-inline.cc @@ -4288,8 +4288,6 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights, case REALIGN_LOAD_EXPR: - case WIDEN_PLUS_EXPR: - case WIDEN_MINUS_EXPR: case WIDEN_SUM_EXPR: case WIDEN_MULT_EXPR: case DOT_PROD_EXPR: @@ -4298,10 +4296,6 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights, case WIDEN_MULT_MINUS_EXPR: case WIDEN_LSHIFT_EXPR: - case VEC_WIDEN_PLUS_HI_EXPR: - case VEC_WIDEN_PLUS_LO_EXPR: - case VEC_WIDEN_MINUS_HI_EXPR: - case VEC_WIDEN_MINUS_LO_EXPR: case VEC_WIDEN_MULT_HI_EXPR: case VEC_WIDEN_MULT_LO_EXPR: 
case VEC_WIDEN_MULT_EVEN_EXPR: diff --git a/gcc/tree-pretty-print.cc b/gcc/tree-pretty-print.cc index bfabe9e76279d7c3383b684ed61cc92228de4500..0ca8802576656f098e60cb77fa4312d1375ff3f0 100644 --- a/gcc/tree-pretty-print.cc +++ b/gcc/tree-pretty-print.cc @@ -2846,8 +2846,6 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags, break; /* Binary arithmetic and logic expressions. */ - case WIDEN_PLUS_EXPR: - case WIDEN_MINUS_EXPR: case WIDEN_SUM_EXPR: case WIDEN_MULT_EXPR: case MULT_EXPR: @@ -3811,10 +3809,6 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags, case VEC_SERIES_EXPR: case VEC_WIDEN_MULT_HI_EXPR: case VEC_WIDEN_MULT_LO_EXPR: - case VEC_WIDEN_PLUS_HI_EXPR: - case VEC_WIDEN_PLUS_LO_EXPR: - case VEC_WIDEN_MINUS_HI_EXPR: - case VEC_WIDEN_MINUS_LO_EXPR: case VEC_WIDEN_MULT_EVEN_EXPR: case VEC_WIDEN_MULT_ODD_EXPR: case VEC_WIDEN_LSHIFT_HI_EXPR: @@ -4332,12 +4326,6 @@ op_symbol_code (enum tree_code code) case WIDEN_LSHIFT_EXPR: return "w<<"; - case WIDEN_PLUS_EXPR: - return "w+"; - - case WIDEN_MINUS_EXPR: - return "w-"; - case POINTER_PLUS_EXPR: return "+"; diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc index d20a10a1524164eef788ab4b88ba57c7a09c3387..98dd56ff022233ccead36a1f5a5e896e352f9f5b 100644 --- a/gcc/tree-vect-data-refs.cc +++ b/gcc/tree-vect-data-refs.cc @@ -136,8 +136,6 @@ vect_get_smallest_scalar_type (stmt_vec_info stmt_info, tree scalar_type) || gimple_assign_rhs_code (assign) == WIDEN_SUM_EXPR || gimple_assign_rhs_code (assign) == WIDEN_MULT_EXPR || gimple_assign_rhs_code (assign) == WIDEN_LSHIFT_EXPR - || gimple_assign_rhs_code (assign) == WIDEN_PLUS_EXPR - || gimple_assign_rhs_code (assign) == WIDEN_MINUS_EXPR || gimple_assign_rhs_code (assign) == FLOAT_EXPR) { tree rhs_type = TREE_TYPE (gimple_assign_rhs1 (assign)); @@ -3172,8 +3170,8 @@ vect_analyze_data_ref_accesses (vec_info *vinfo, break; /* Check that the DR_INITs are compile-time constants. 
*/ - if (!tree_fits_shwi_p (DR_INIT (dra)) - || !tree_fits_shwi_p (DR_INIT (drb))) + if (TREE_CODE (DR_INIT (dra)) != INTEGER_CST + || TREE_CODE (DR_INIT (drb)) != INTEGER_CST) break; /* Different .GOMP_SIMD_LANE calls still give the same lane, @@ -3225,7 +3223,7 @@ vect_analyze_data_ref_accesses (vec_info *vinfo, unsigned HOST_WIDE_INT step = absu_hwi (tree_to_shwi (DR_STEP (dra))); if (step != 0 - && step <= ((unsigned HOST_WIDE_INT)init_b - init_a)) + && step <= (unsigned HOST_WIDE_INT)(init_b - init_a)) break; } } diff --git a/gcc/tree-vect-generic.cc b/gcc/tree-vect-generic.cc index 350129555a0c71c0896c4f1003163f3b3557c11b..066f05873118c2288c90604e6287c91ef9aed72b 100644 --- a/gcc/tree-vect-generic.cc +++ b/gcc/tree-vect-generic.cc @@ -2209,10 +2209,6 @@ expand_vector_operations_1 (gimple_stmt_iterator *gsi, arguments, not the widened result. VEC_UNPACK_FLOAT_*_EXPR is calculated in the same way above. */ if (code == WIDEN_SUM_EXPR - || code == VEC_WIDEN_PLUS_HI_EXPR - || code == VEC_WIDEN_PLUS_LO_EXPR - || code == VEC_WIDEN_MINUS_HI_EXPR - || code == VEC_WIDEN_MINUS_LO_EXPR || code == VEC_WIDEN_MULT_HI_EXPR || code == VEC_WIDEN_MULT_LO_EXPR || code == VEC_WIDEN_MULT_EVEN_EXPR diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc index 62ca28d725ed4ac8d7e4d493119e40772a0fbac6..9cd3989656c024b1d0394b2fcde6f6d774dff74e 100644 --- a/gcc/tree-vect-patterns.cc +++ b/gcc/tree-vect-patterns.cc @@ -559,21 +559,29 @@ vect_joust_widened_type (tree type, tree new_type, tree *common_type) static unsigned int vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code, - tree_code widened_code, bool shift_p, + code_helper widened_code, bool shift_p, unsigned int max_nops, vect_unpromoted_value *unprom, tree *common_type, enum optab_subtype *subtype = NULL) { /* Check for an integer operation with the right code. 
*/ - gassign *assign = dyn_cast <gassign *> (stmt_info->stmt); - if (!assign) + gimple* stmt = stmt_info->stmt; + if (!(is_gimple_assign (stmt) || is_gimple_call (stmt))) return 0; - tree_code rhs_code = gimple_assign_rhs_code (assign); - if (rhs_code != code && rhs_code != widened_code) + code_helper rhs_code; + if (is_gimple_assign (stmt)) + rhs_code = gimple_assign_rhs_code (stmt); + else + rhs_code = gimple_call_combined_fn (stmt); + + if (rhs_code.safe_as_tree_code () != code + && rhs_code.get_rep () != widened_code.get_rep ()) return 0; - tree type = TREE_TYPE (gimple_assign_lhs (assign)); + tree lhs = is_gimple_assign (stmt) ? gimple_assign_lhs (stmt): + gimple_call_lhs (stmt); + tree type = TREE_TYPE (lhs); if (!INTEGRAL_TYPE_P (type)) return 0; @@ -586,7 +594,11 @@ vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code, { vect_unpromoted_value *this_unprom = &unprom[next_op]; unsigned int nops = 1; - tree op = gimple_op (assign, i + 1); + tree op; + if (is_gimple_assign (stmt)) + op = gimple_op (stmt, i + 1); + else + op = gimple_call_arg (stmt, i); if (i == 1 && TREE_CODE (op) == INTEGER_CST) { /* We already have a common type from earlier operands. @@ -1299,8 +1311,9 @@ vect_recog_sad_pattern (vec_info *vinfo, /* FORNOW. Can continue analyzing the def-use chain when this stmt in a phi inside the loop (in case we are analyzing an outer-loop). 
*/ vect_unpromoted_value unprom[2]; - if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR, WIDEN_MINUS_EXPR, - false, 2, unprom, &half_type)) + if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR, + CFN_VEC_WIDEN_MINUS, false, 2, unprom, + &half_type)) return NULL; vect_pattern_detected ("vect_recog_sad_pattern", last_stmt); @@ -2337,9 +2350,10 @@ vect_recog_average_pattern (vec_info *vinfo, internal_fn ifn = IFN_AVG_FLOOR; vect_unpromoted_value unprom[3]; tree new_type; + enum optab_subtype subtype; unsigned int nops = vect_widened_op_tree (vinfo, plus_stmt_info, PLUS_EXPR, - WIDEN_PLUS_EXPR, false, 3, - unprom, &new_type); + CFN_VEC_WIDEN_PLUS, false, 3, + unprom, &new_type, &subtype); if (nops == 0) return NULL; if (nops == 3) diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index 6fa0669fdfc8630842b3f9f32f4b4a253e79bb92..92b17e6d0ec18ce3d90290dba9efec5d1968264c 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -4900,9 +4900,7 @@ vectorizable_conversion (vec_info *vinfo, else return false; - bool widen_arith = (code == WIDEN_PLUS_EXPR - || code == WIDEN_MINUS_EXPR - || code == WIDEN_MULT_EXPR + bool widen_arith = (code == WIDEN_MULT_EXPR || code == WIDEN_LSHIFT_EXPR || code == IFN_VEC_WIDEN_PLUS || code == IFN_VEC_WIDEN_MINUS); @@ -4954,8 +4952,6 @@ vectorizable_conversion (vec_info *vinfo, { gcc_assert (code == WIDEN_MULT_EXPR || code == WIDEN_LSHIFT_EXPR - || code == WIDEN_PLUS_EXPR - || code == WIDEN_MINUS_EXPR || code == IFN_VEC_WIDEN_PLUS || code == IFN_VEC_WIDEN_MINUS); @@ -11986,7 +11982,7 @@ supportable_widening_operation (vec_info *vinfo, class loop *vect_loop = NULL; machine_mode vec_mode; enum insn_code icode1, icode2; - optab optab1, optab2; + optab optab1 = unknown_optab, optab2 = unknown_optab; tree vectype = vectype_in; tree wide_vectype = vectype_out; code_helper c1 = MAX_TREE_CODES, c2 = MAX_TREE_CODES; @@ -12080,16 +12076,6 @@ supportable_widening_operation (vec_info *vinfo, c2 = 
VEC_WIDEN_LSHIFT_HI_EXPR; break; - case WIDEN_PLUS_EXPR: - c1 = VEC_WIDEN_PLUS_LO_EXPR; - c2 = VEC_WIDEN_PLUS_HI_EXPR; - break; - - case WIDEN_MINUS_EXPR: - c1 = VEC_WIDEN_MINUS_LO_EXPR; - c2 = VEC_WIDEN_MINUS_HI_EXPR; - break; - CASE_CONVERT: c1 = VEC_UNPACK_LO_EXPR; c2 = VEC_UNPACK_HI_EXPR; diff --git a/gcc/tree.def b/gcc/tree.def index 62650b6934b337c5d56e5393dc114173d72c9aa9..9b2dce3576440c445d3240b9ed937fe67c9a5992 100644 --- a/gcc/tree.def +++ b/gcc/tree.def @@ -1383,8 +1383,6 @@ DEFTREECODE (WIDEN_MULT_MINUS_EXPR, "widen_mult_minus_expr", tcc_expression, 3) the first argument from type t1 to type t2, and then shifting it by the second argument. */ DEFTREECODE (WIDEN_LSHIFT_EXPR, "widen_lshift_expr", tcc_binary, 2) -DEFTREECODE (WIDEN_PLUS_EXPR, "widen_plus_expr", tcc_binary, 2) -DEFTREECODE (WIDEN_MINUS_EXPR, "widen_minus_expr", tcc_binary, 2) /* Widening vector multiplication. The two operands are vectors with N elements of size S. Multiplying the @@ -1449,10 +1447,6 @@ DEFTREECODE (VEC_PACK_FLOAT_EXPR, "vec_pack_float_expr", tcc_binary, 2) */ DEFTREECODE (VEC_WIDEN_LSHIFT_HI_EXPR, "widen_lshift_hi_expr", tcc_binary, 2) DEFTREECODE (VEC_WIDEN_LSHIFT_LO_EXPR, "widen_lshift_lo_expr", tcc_binary, 2) -DEFTREECODE (VEC_WIDEN_PLUS_HI_EXPR, "widen_plus_hi_expr", tcc_binary, 2) -DEFTREECODE (VEC_WIDEN_PLUS_LO_EXPR, "widen_plus_lo_expr", tcc_binary, 2) -DEFTREECODE (VEC_WIDEN_MINUS_HI_EXPR, "widen_minus_hi_expr", tcc_binary, 2) -DEFTREECODE (VEC_WIDEN_MINUS_LO_EXPR, "widen_minus_lo_expr", tcc_binary, 2) /* PREDICT_EXPR. Specify hint for branch prediction. The PREDICT_EXPR_PREDICTOR specify predictor and PREDICT_EXPR_OUTCOME the -- 2.17.1 ^ permalink raw reply [flat|nested] 53+ messages in thread
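[For readers following the thread: the lo/hi widening semantics that this patch moves from tree codes to internal functions can be modeled outside GCC. The following is a standalone scalar sketch of what the generic.texi text above describes for VEC_WIDEN_PLUS_LO/HI — it is an illustration only, ignoring lane-order and endianness subtleties, and none of these names are GCC APIs.]

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scalar model of VEC_WIDEN_PLUS_LO/HI: the inputs are two vectors of N
// narrow elements; one half (low or high) is widened and added, producing
// N/2 elements of twice the width.
template <std::size_t N>
std::array<int16_t, N / 2>
widen_plus_half (const std::array<int8_t, N> &a,
                 const std::array<int8_t, N> &b, bool hi)
{
  std::array<int16_t, N / 2> r{};
  std::size_t base = hi ? N / 2 : 0;  // HI takes the high N/2 lanes
  for (std::size_t i = 0; i < N / 2; ++i)
    r[i] = int16_t (a[base + i]) + int16_t (b[base + i]);
  return r;
}
```

On AArch64 the lo/hi pair corresponds to the uaddl/uaddl2 (or saddl/saddl2) instructions that the testcases above scan for.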
* RE: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns 2022-06-30 13:20 ` Joel Hutton @ 2022-07-12 12:32 ` Richard Biener 2023-03-17 10:14 ` Andre Vieira (lists) 0 siblings, 1 reply; 53+ messages in thread From: Richard Biener @ 2022-07-12 12:32 UTC (permalink / raw) To: Joel Hutton; +Cc: Richard Sandiford, gcc-patches, Andre Simoes Dias Vieira On Thu, 30 Jun 2022, Joel Hutton wrote: > > We can go with a private vect_gimple_build function until we sort out the API > > issue to unblock Tamar (I'll reply to Richards reply with further thoughts on > > this) > > > > Done. > > > > Similarly are you ok with the use of gimple_extract_op? I would lean > > towards using it as it is cleaner, but I don't have strong feelings. > > > > I don't like using gimple_extract_op here, I think I outlined a variant that is > > even shorter. > > > > Done. > > Updated patches attached, bootstrapped and regression tested on aarch64. > > Tomorrow is my last working day at Arm, so it will likely be Andre that commits this/addresses any further comments. First sorry for the (repeated) delays. In the first patch I still see ECF_WIDEN, I don't like that, we use things like associative_binary_fn_p so for widening internal functions similar predicates should be used. 
In the second patch you add vec_widen_{add,sub} optabs +OPTAB_CD(vec_widen_add_optab, "add$a$b3") +OPTAB_CD(vec_widen_sub_optab, "sub$a$b3") but a) the names are that of regular adds which is at least confusing (if not wrong), b) there's no documentation for them in md.texi which, c) doesn't explain why they are necessary when we have vec_widen_[su]{add,sub}l_optab + internal_fn ifn = as_internal_fn (code.safe_as_fn_code ()); asks for safe_as_internal_fn () (just complete the API, also with safe_as_builtin_fn) + internal_fn lo, hi; + lookup_multi_internal_fn (ifn, &lo, &hi); + *code1 = as_combined_fn (lo); + *code2 = as_combined_fn (hi); in fact this probably shows that the guarding condition should be if (code.is_internal_fn ()) instead of if (code.is_fn_code ()). + optab1 = lookup_multi_ifn_optab (lo, !TYPE_UNSIGNED (vectype)); + optab2 = lookup_multi_ifn_optab (hi, !TYPE_UNSIGNED (vectype)); this shows the two lookup_ APIs are inconsistent in having two vs. one output, please make them consistent. I'd say give lookup_multi_internal_fn a enum { LO, HI } argument, returning the result. Given VEC_WIDEN_MULT has HI, LO, EVEN and ODD variants that sounds more future proof. The internal_fn stuff could probably get a 2nd eye from Richard. In the third patch I see unrelated and wrong changes like /* Check that the DR_INITs are compile-time constants. */ - if (!tree_fits_shwi_p (DR_INIT (dra)) - || !tree_fits_shwi_p (DR_INIT (drb))) + if (TREE_CODE (DR_INIT (dra)) != INTEGER_CST + || TREE_CODE (DR_INIT (drb)) != INTEGER_CST) break; please strip the patch down to relevant changes. - tree op = gimple_op (assign, i + 1); + tree op; + if (is_gimple_assign (stmt)) + op = gimple_op (stmt, i + 1); + else + op = gimple_call_arg (stmt, i); somebody added gimple_arg which can be used here doing op = gimple_arg (stmt, i); + tree lhs = is_gimple_assign (stmt) ? 
gimple_assign_lhs (stmt): + gimple_call_lhs (stmt); tree lhs = gimple_get_lhs (stmt); /* Check for an integer operation with the right code. */ - gassign *assign = dyn_cast <gassign *> (stmt_info->stmt); - if (!assign) + gimple* stmt = stmt_info->stmt; + if (!(is_gimple_assign (stmt) || is_gimple_call (stmt))) return 0; - tree_code rhs_code = gimple_assign_rhs_code (assign); - if (rhs_code != code && rhs_code != widened_code) + code_helper rhs_code; + if (is_gimple_assign (stmt)) + rhs_code = gimple_assign_rhs_code (stmt); + else + rhs_code = gimple_call_combined_fn (stmt); + + if (rhs_code.safe_as_tree_code () != code + && rhs_code.get_rep () != widened_code.get_rep ()) return 0; that's probably better refactored as if (is_gimple_assign (stmt)) { if (code check) return 0; } else if (is_gimple_call (..)) { .. } else return 0; otherwise the last patch looks reasonable. Richard. ^ permalink raw reply [flat|nested] 53+ messages in thread
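[The "enum { LO, HI } argument, returning the result" shape Richard suggests for lookup_multi_internal_fn can be sketched as follows. This is a toy model with hypothetical stand-in enum values, not the real GCC internal_fn enum or its lookup machinery.]

```cpp
#include <cassert>

// Stand-ins for the internal functions discussed in this thread; the
// actual values and names in internal-fn.def differ.
enum internal_fn {
  IFN_VEC_WIDEN_PLUS, IFN_VEC_WIDEN_PLUS_LO, IFN_VEC_WIDEN_PLUS_HI,
  IFN_VEC_WIDEN_MINUS, IFN_VEC_WIDEN_MINUS_LO, IFN_VEC_WIDEN_MINUS_HI,
  IFN_LAST
};

// Selector instead of two out-parameters: ask for one half at a time.
// An EVEN/ODD pair (as VEC_WIDEN_MULT has) would extend this enum.
enum multi_fn_part { LO, HI };

internal_fn
lookup_multi_internal_fn (internal_fn ifn, multi_fn_part part)
{
  switch (ifn)
    {
    case IFN_VEC_WIDEN_PLUS:
      return part == LO ? IFN_VEC_WIDEN_PLUS_LO : IFN_VEC_WIDEN_PLUS_HI;
    case IFN_VEC_WIDEN_MINUS:
      return part == LO ? IFN_VEC_WIDEN_MINUS_LO : IFN_VEC_WIDEN_MINUS_HI;
    default:
      return IFN_LAST;  // not a multi (lo/hi) internal function
    }
}
```

Returning IFN_LAST for non-multi functions gives callers a cheap failure path, matching the "return false" cases in supportable_widening_operation above.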
* Re: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns 2022-07-12 12:32 ` Richard Biener @ 2023-03-17 10:14 ` Andre Vieira (lists) 2023-03-17 11:52 ` Richard Biener 0 siblings, 1 reply; 53+ messages in thread From: Andre Vieira (lists) @ 2023-03-17 10:14 UTC (permalink / raw) To: Richard Biener; +Cc: Richard Sandiford, gcc-patches Hi Richard, I'm only picking this up now. Just going through your earlier comments and stuff and I noticed we didn't address the situation with the gimple::build. Do you want me to add overloaded static member functions to cover all gimple_build_* functions, or just create one to replace vect_gimple_build and we create them as needed? It's more work but I think adding them all would be better. I'd even argue that it would be nice to replace the old ones with the new ones, but I can imagine you might not want that as it makes backporting and the likes a bit annoying... Let me know what you prefer, I'll go work on your latest comments too. Cheers, Andre ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns 2023-03-17 10:14 ` Andre Vieira (lists) @ 2023-03-17 11:52 ` Richard Biener 2023-04-20 13:23 ` Andre Vieira (lists) 0 siblings, 1 reply; 53+ messages in thread From: Richard Biener @ 2023-03-17 11:52 UTC (permalink / raw) To: Andre Vieira (lists); +Cc: Richard Sandiford, gcc-patches On Fri, 17 Mar 2023, Andre Vieira (lists) wrote: > Hi Richard, > > I'm only picking this up now. Just going through your earlier comments and > stuff and I noticed we didn't address the situation with the gimple::build. Do > you want me to add overloaded static member functions to cover all > gimple_build_* functions, or just create one to replace vect_gimple_build and > we create them as needed? It's more work but I think adding them all would be > better. I'd even argue that it would be nice to replace the old ones with the > new ones, but I can imagine you might not want that as it makes backporting > and the likes a bit annoying... > > Let me know what you prefer, I'll go work on your latest comments too. I think the series was resolved and I approved it. As for vect_gimple_build the better way forward would be to use gimple_build () as existing but add a vect_finish_stmt_* handling a gimple_seq. Richard. ^ permalink raw reply [flat|nested] 53+ messages in thread
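[The core of the vect_gimple_build helper in the patch below is a dispatch on whether a code_helper holds a tree_code or an internal_fn. A self-contained toy model of that encoding and dispatch — GCC's real code_helper, in gimple-match.h, is richer, and these enum values are placeholders — is:]

```cpp
#include <cassert>
#include <string>

// Placeholder stand-ins for GCC's tree_code and internal_fn enums.
enum tree_code { PLUS_EXPR, MINUS_EXPR, MAX_TREE_CODES };
enum internal_fn { IFN_VEC_WIDEN_PLUS, IFN_VEC_WIDEN_MINUS };

// Minimal model of code_helper: a single integer whose sign says which
// enum it encodes (non-negative = tree_code, negative = internal_fn).
class code_helper
{
  int rep;
public:
  code_helper (tree_code c) : rep (int (c)) {}
  code_helper (internal_fn f) : rep (-1 - int (f)) {}
  bool is_tree_code () const { return rep >= 0; }
  bool is_fn_code () const { return rep < 0; }
  tree_code as_tree_code () const { return tree_code (rep); }
  internal_fn as_fn_code () const { return internal_fn (-1 - rep); }
};

// Dispatch in the style of vect_gimple_build: a GIMPLE_ASSIGN for tree
// codes, a GIMPLE_CALL for internal functions (here just tagged strings).
std::string
build_stmt (code_helper ch)
{
  if (ch.is_tree_code ())
    return "gimple_build_assign:" + std::to_string (int (ch.as_tree_code ()));
  return "gimple_build_call_internal:" + std::to_string (int (ch.as_fn_code ()));
}
```

This one-integer encoding is why the thread keeps asking for safe_as_tree_code ()/safe_as_fn_code () style accessors rather than open-coded switches at every call site.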
* Re: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns 2023-03-17 11:52 ` Richard Biener @ 2023-04-20 13:23 ` Andre Vieira (lists) 2023-04-24 11:57 ` Richard Biener 0 siblings, 1 reply; 53+ messages in thread From: Andre Vieira (lists) @ 2023-04-20 13:23 UTC (permalink / raw) To: Richard Biener; +Cc: Richard Sandiford, gcc-patches [-- Attachment #1: Type: text/plain, Size: 1105 bytes --] Rebased all three patches and made some small changes to the second one: - removed sub and abd optabs from commutative_optab_p, I suspect this was a copy paste mistake, - removed what I believe to be a superfluous switch case in vectorizable conversion, the one that was here: + if (code.is_fn_code ()) + { + internal_fn ifn = as_internal_fn (code.as_fn_code ()); + int ecf_flags = internal_fn_flags (ifn); + gcc_assert (ecf_flags & ECF_MULTI); + + switch (code.as_fn_code ()) + { + case CFN_VEC_WIDEN_PLUS: + break; + case CFN_VEC_WIDEN_MINUS: + break; + case CFN_LAST: + default: + return false; + } + + internal_fn lo, hi; + lookup_multi_internal_fn (ifn, &lo, &hi); + *code1 = as_combined_fn (lo); + *code2 = as_combined_fn (hi); + optab1 = lookup_multi_ifn_optab (lo, !TYPE_UNSIGNED (vectype)); + optab2 = lookup_multi_ifn_optab (hi, !TYPE_UNSIGNED (vectype)); } I don't think we need to check they are a specfic fn code, as we look-up optabs and if they succeed then surely we can vectorize? OK for trunk? Kind regards, Andre [-- Attachment #2: ifn0.patch --] [-- Type: text/plain, Size: 21121 bytes --] diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc index 8802141cd6edb298866025b8a55843eae1f0eb17..68dfba266d679c9738a3d5d70551a91cbdafcf66 100644 --- a/gcc/tree-vect-patterns.cc +++ b/gcc/tree-vect-patterns.cc @@ -25,6 +25,8 @@ along with GCC; see the file COPYING3. 
If not see #include "rtl.h" #include "tree.h" #include "gimple.h" +#include "gimple-iterator.h" +#include "gimple-fold.h" #include "ssa.h" #include "expmed.h" #include "optabs-tree.h" @@ -1391,7 +1393,7 @@ vect_recog_sad_pattern (vec_info *vinfo, static gimple * vect_recog_widen_op_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info, tree *type_out, - tree_code orig_code, tree_code wide_code, + tree_code orig_code, code_helper wide_code, bool shift_p, const char *name) { gimple *last_stmt = last_stmt_info->stmt; @@ -1434,7 +1436,7 @@ vect_recog_widen_op_pattern (vec_info *vinfo, vecctype = get_vectype_for_scalar_type (vinfo, ctype); } - enum tree_code dummy_code; + code_helper dummy_code; int dummy_int; auto_vec<tree> dummy_vec; if (!vectype @@ -1455,8 +1457,7 @@ vect_recog_widen_op_pattern (vec_info *vinfo, 2, oprnd, half_type, unprom, vectype); tree var = vect_recog_temp_ssa_var (itype, NULL); - gimple *pattern_stmt = gimple_build_assign (var, wide_code, - oprnd[0], oprnd[1]); + gimple *pattern_stmt = vect_gimple_build (var, wide_code, oprnd[0], oprnd[1]); if (vecctype != vecitype) pattern_stmt = vect_convert_output (vinfo, last_stmt_info, ctype, @@ -6406,3 +6407,28 @@ vect_pattern_recog (vec_info *vinfo) /* After this no more add_stmt calls are allowed. */ vinfo->stmt_vec_info_ro = true; } + +/* Build a GIMPLE_ASSIGN or GIMPLE_CALL with the tree_code, + or internal_fn contained in ch, respectively. */ +gimple * +vect_gimple_build (tree lhs, code_helper ch, tree op0, tree op1) +{ + if (op0 == NULL_TREE) + return NULL; + if (ch.is_tree_code ()) + return op1 == NULL_TREE ? 
gimple_build_assign (lhs, ch.safe_as_tree_code (), + op0) : + gimple_build_assign (lhs, ch.safe_as_tree_code (), + op0, op1); + else + { + internal_fn fn = as_internal_fn (ch.safe_as_fn_code ()); + gimple* stmt; + if (op1 == NULL_TREE) + stmt = gimple_build_call_internal (fn, 1, op0); + else + stmt = gimple_build_call_internal (fn, 2, op0, op1); + gimple_call_set_lhs (stmt, lhs); + return stmt; + } +} diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index 6b7dbfd4a231baec24e740ffe0ce0b0bf7a1de6b..715ec2e30a4de620b8a5076c0e7f2f7fd1b0654e 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -4768,7 +4768,7 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, STMT_INFO is the original scalar stmt that we are vectorizing. */ static gimple * -vect_gen_widened_results_half (vec_info *vinfo, enum tree_code code, +vect_gen_widened_results_half (vec_info *vinfo, code_helper ch, tree vec_oprnd0, tree vec_oprnd1, int op_type, tree vec_dest, gimple_stmt_iterator *gsi, stmt_vec_info stmt_info) @@ -4777,12 +4777,11 @@ vect_gen_widened_results_half (vec_info *vinfo, enum tree_code code, tree new_temp; /* Generate half of the widened result: */ - gcc_assert (op_type == TREE_CODE_LENGTH (code)); if (op_type != binary_op) vec_oprnd1 = NULL; - new_stmt = gimple_build_assign (vec_dest, code, vec_oprnd0, vec_oprnd1); + new_stmt = vect_gimple_build (vec_dest, ch, vec_oprnd0, vec_oprnd1); new_temp = make_ssa_name (vec_dest, new_stmt); - gimple_assign_set_lhs (new_stmt, new_temp); + gimple_set_lhs (new_stmt, new_temp); vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi); return new_stmt; @@ -4861,8 +4860,8 @@ vect_create_vectorized_promotion_stmts (vec_info *vinfo, vec<tree> *vec_oprnds1, stmt_vec_info stmt_info, tree vec_dest, gimple_stmt_iterator *gsi, - enum tree_code code1, - enum tree_code code2, int op_type) + code_helper ch1, + code_helper ch2, int op_type) { int i; tree vop0, vop1, new_tmp1, new_tmp2; @@ -4878,10 +4877,10 
@@ vect_create_vectorized_promotion_stmts (vec_info *vinfo, vop1 = NULL_TREE; /* Generate the two halves of promotion operation. */ - new_stmt1 = vect_gen_widened_results_half (vinfo, code1, vop0, vop1, + new_stmt1 = vect_gen_widened_results_half (vinfo, ch1, vop0, vop1, op_type, vec_dest, gsi, stmt_info); - new_stmt2 = vect_gen_widened_results_half (vinfo, code2, vop0, vop1, + new_stmt2 = vect_gen_widened_results_half (vinfo, ch2, vop0, vop1, op_type, vec_dest, gsi, stmt_info); if (is_gimple_call (new_stmt1)) @@ -4978,8 +4977,9 @@ vectorizable_conversion (vec_info *vinfo, tree scalar_dest; tree op0, op1 = NULL_TREE; loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo); - enum tree_code code, code1 = ERROR_MARK, code2 = ERROR_MARK; - enum tree_code codecvt1 = ERROR_MARK, codecvt2 = ERROR_MARK; + tree_code tc1; + code_helper code, code1, code2; + code_helper codecvt1 = ERROR_MARK, codecvt2 = ERROR_MARK; tree new_temp; enum vect_def_type dt[2] = {vect_unknown_def_type, vect_unknown_def_type}; int ndts = 2; @@ -5008,31 +5008,43 @@ vectorizable_conversion (vec_info *vinfo, && ! 
vec_stmt) return false; - gassign *stmt = dyn_cast <gassign *> (stmt_info->stmt); - if (!stmt) + gimple* stmt = stmt_info->stmt; + if (!(is_gimple_assign (stmt) || is_gimple_call (stmt))) return false; - if (TREE_CODE (gimple_assign_lhs (stmt)) != SSA_NAME) + if (gimple_get_lhs (stmt) == NULL_TREE + || TREE_CODE (gimple_get_lhs (stmt)) != SSA_NAME) return false; - code = gimple_assign_rhs_code (stmt); - if (!CONVERT_EXPR_CODE_P (code) - && code != FIX_TRUNC_EXPR - && code != FLOAT_EXPR - && code != WIDEN_PLUS_EXPR - && code != WIDEN_MINUS_EXPR - && code != WIDEN_MULT_EXPR - && code != WIDEN_LSHIFT_EXPR) + if (TREE_CODE (gimple_get_lhs (stmt)) != SSA_NAME) + return false; + + if (is_gimple_assign (stmt)) + { + code = gimple_assign_rhs_code (stmt); + op_type = TREE_CODE_LENGTH (code.safe_as_tree_code ()); + } + else if (gimple_call_internal_p (stmt)) + { + code = gimple_call_internal_fn (stmt); + op_type = gimple_call_num_args (stmt); + } + else return false; bool widen_arith = (code == WIDEN_PLUS_EXPR - || code == WIDEN_MINUS_EXPR - || code == WIDEN_MULT_EXPR - || code == WIDEN_LSHIFT_EXPR); - op_type = TREE_CODE_LENGTH (code); + || code == WIDEN_MINUS_EXPR + || code == WIDEN_MULT_EXPR + || code == WIDEN_LSHIFT_EXPR); + + if (!widen_arith + && !CONVERT_EXPR_CODE_P (code) + && code != FIX_TRUNC_EXPR + && code != FLOAT_EXPR) + return false; /* Check types of lhs and rhs. */ - scalar_dest = gimple_assign_lhs (stmt); + scalar_dest = gimple_get_lhs (stmt); lhs_type = TREE_TYPE (scalar_dest); vectype_out = STMT_VINFO_VECTYPE (stmt_info); @@ -5070,10 +5082,14 @@ vectorizable_conversion (vec_info *vinfo, if (op_type == binary_op) { - gcc_assert (code == WIDEN_MULT_EXPR || code == WIDEN_LSHIFT_EXPR - || code == WIDEN_PLUS_EXPR || code == WIDEN_MINUS_EXPR); + gcc_assert (code == WIDEN_MULT_EXPR + || code == WIDEN_LSHIFT_EXPR + || code == WIDEN_PLUS_EXPR + || code == WIDEN_MINUS_EXPR); + - op1 = gimple_assign_rhs2 (stmt); + op1 = is_gimple_assign (stmt) ? 
gimple_assign_rhs2 (stmt) : + gimple_call_arg (stmt, 0); tree vectype1_in; if (!vect_is_simple_use (vinfo, stmt_info, slp_node, 1, &op1, &slp_op1, &dt[1], &vectype1_in)) @@ -5157,8 +5173,12 @@ vectorizable_conversion (vec_info *vinfo, && code != FLOAT_EXPR && !CONVERT_EXPR_CODE_P (code)) return false; - if (supportable_convert_operation (code, vectype_out, vectype_in, &code1)) + if (supportable_convert_operation (code.safe_as_tree_code (), vectype_out, + vectype_in, &tc1)) + { + code1 = tc1; break; + } /* FALLTHRU */ unsupported: if (dump_enabled_p ()) @@ -5169,9 +5189,11 @@ vectorizable_conversion (vec_info *vinfo, case WIDEN: if (known_eq (nunits_in, nunits_out)) { - if (!supportable_half_widening_operation (code, vectype_out, - vectype_in, &code1)) + if (!supportable_half_widening_operation (code.safe_as_tree_code (), + vectype_out, vectype_in, + &tc1)) goto unsupported; + code1 = tc1; gcc_assert (!(multi_step_cvt && op_type == binary_op)); break; } @@ -5205,14 +5227,17 @@ vectorizable_conversion (vec_info *vinfo, if (GET_MODE_SIZE (rhs_mode) == fltsz) { - if (!supportable_convert_operation (code, vectype_out, - cvt_type, &codecvt1)) + tc1 = ERROR_MARK; + if (!supportable_convert_operation (code.safe_as_tree_code (), + vectype_out, + cvt_type, &tc1)) goto unsupported; + codecvt1 = tc1; } - else if (!supportable_widening_operation (vinfo, code, stmt_info, - vectype_out, cvt_type, - &codecvt1, &codecvt2, - &multi_step_cvt, + else if (!supportable_widening_operation (vinfo, code, + stmt_info, vectype_out, + cvt_type, &codecvt1, + &codecvt2, &multi_step_cvt, &interm_types)) continue; else @@ -5220,8 +5245,9 @@ vectorizable_conversion (vec_info *vinfo, if (supportable_widening_operation (vinfo, NOP_EXPR, stmt_info, cvt_type, - vectype_in, &code1, &code2, - &multi_step_cvt, &interm_types)) + vectype_in, &code1, + &code2, &multi_step_cvt, + &interm_types)) { found_mode = true; break; @@ -5243,10 +5269,15 @@ vectorizable_conversion (vec_info *vinfo, case NARROW: 
gcc_assert (op_type == unary_op); - if (supportable_narrowing_operation (code, vectype_out, vectype_in, - &code1, &multi_step_cvt, + if (supportable_narrowing_operation (code.safe_as_tree_code (), + vectype_out, + vectype_in, + &tc1, &multi_step_cvt, &interm_types)) - break; + { + code1 = tc1; + break; + } if (code != FIX_TRUNC_EXPR || GET_MODE_SIZE (lhs_mode) >= GET_MODE_SIZE (rhs_mode)) @@ -5257,13 +5288,18 @@ vectorizable_conversion (vec_info *vinfo, cvt_type = get_same_sized_vectype (cvt_type, vectype_in); if (cvt_type == NULL_TREE) goto unsupported; - if (!supportable_convert_operation (code, cvt_type, vectype_in, - &codecvt1)) + if (!supportable_convert_operation (code.safe_as_tree_code (), cvt_type, + vectype_in, + &tc1)) goto unsupported; + codecvt1 = tc1; if (supportable_narrowing_operation (NOP_EXPR, vectype_out, cvt_type, - &code1, &multi_step_cvt, + &tc1, &multi_step_cvt, &interm_types)) - break; + { + code1 = tc1; + break; + } goto unsupported; default: @@ -5377,8 +5413,10 @@ vectorizable_conversion (vec_info *vinfo, FOR_EACH_VEC_ELT (vec_oprnds0, i, vop0) { /* Arguments are ready, create the new vector stmt. 
*/ - gcc_assert (TREE_CODE_LENGTH (code1) == unary_op); - gassign *new_stmt = gimple_build_assign (vec_dest, code1, vop0); + gcc_assert (TREE_CODE_LENGTH ((tree_code) code1) == unary_op); + gassign *new_stmt = gimple_build_assign (vec_dest, + code1.safe_as_tree_code (), + vop0); new_temp = make_ssa_name (vec_dest, new_stmt); gimple_assign_set_lhs (new_stmt, new_temp); vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi); @@ -5410,7 +5448,7 @@ vectorizable_conversion (vec_info *vinfo, for (i = multi_step_cvt; i >= 0; i--) { tree this_dest = vec_dsts[i]; - enum tree_code c1 = code1, c2 = code2; + code_helper c1 = code1, c2 = code2; if (i == 0 && codecvt2 != ERROR_MARK) { c1 = codecvt1; @@ -5420,7 +5458,8 @@ vectorizable_conversion (vec_info *vinfo, vect_create_half_widening_stmts (vinfo, &vec_oprnds0, &vec_oprnds1, stmt_info, this_dest, gsi, - c1, op_type); + c1.safe_as_tree_code (), + op_type); else vect_create_vectorized_promotion_stmts (vinfo, &vec_oprnds0, &vec_oprnds1, stmt_info, @@ -5433,9 +5472,11 @@ vectorizable_conversion (vec_info *vinfo, gimple *new_stmt; if (cvt_type) { - gcc_assert (TREE_CODE_LENGTH (codecvt1) == unary_op); + gcc_assert (TREE_CODE_LENGTH ((tree_code) codecvt1) == unary_op); new_temp = make_ssa_name (vec_dest); - new_stmt = gimple_build_assign (new_temp, codecvt1, vop0); + new_stmt = gimple_build_assign (new_temp, + codecvt1.safe_as_tree_code (), + vop0); vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi); } else @@ -5459,10 +5500,12 @@ vectorizable_conversion (vec_info *vinfo, if (cvt_type) FOR_EACH_VEC_ELT (vec_oprnds0, i, vop0) { - gcc_assert (TREE_CODE_LENGTH (codecvt1) == unary_op); + gcc_assert (TREE_CODE_LENGTH (((tree_code) codecvt1)) == unary_op); new_temp = make_ssa_name (vec_dest); gassign *new_stmt - = gimple_build_assign (new_temp, codecvt1, vop0); + = gimple_build_assign (new_temp, + codecvt1.safe_as_tree_code (), + vop0); vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi); vec_oprnds0[i] = 
new_temp; } @@ -5470,7 +5513,8 @@ vectorizable_conversion (vec_info *vinfo, vect_create_vectorized_demotion_stmts (vinfo, &vec_oprnds0, multi_step_cvt, stmt_info, vec_dsts, gsi, - slp_node, code1); + slp_node, + code1.safe_as_tree_code ()); break; } if (!slp_node) @@ -12151,9 +12195,11 @@ vect_maybe_update_slp_op_vectype (slp_tree op, tree vectype) bool supportable_widening_operation (vec_info *vinfo, - enum tree_code code, stmt_vec_info stmt_info, + code_helper code, + stmt_vec_info stmt_info, tree vectype_out, tree vectype_in, - enum tree_code *code1, enum tree_code *code2, + code_helper *code1, + code_helper *code2, int *multi_step_cvt, vec<tree> *interm_types) { @@ -12164,7 +12210,7 @@ supportable_widening_operation (vec_info *vinfo, optab optab1, optab2; tree vectype = vectype_in; tree wide_vectype = vectype_out; - enum tree_code c1, c2; + code_helper c1 = MAX_TREE_CODES, c2 = MAX_TREE_CODES; int i; tree prev_type, intermediate_type; machine_mode intermediate_mode, prev_mode; @@ -12174,7 +12220,7 @@ supportable_widening_operation (vec_info *vinfo, if (loop_info) vect_loop = LOOP_VINFO_LOOP (loop_info); - switch (code) + switch (code.safe_as_tree_code ()) { case WIDEN_MULT_EXPR: /* The result of a vectorized widening operation usually requires @@ -12215,8 +12261,9 @@ supportable_widening_operation (vec_info *vinfo, && !nested_in_vect_loop_p (vect_loop, stmt_info) && supportable_widening_operation (vinfo, VEC_WIDEN_MULT_EVEN_EXPR, stmt_info, vectype_out, - vectype_in, code1, code2, - multi_step_cvt, interm_types)) + vectype_in, code1, + code2, multi_step_cvt, + interm_types)) { /* Elements in a vector with vect_used_by_reduction property cannot be reordered if the use chain with this property does not have the @@ -12279,6 +12326,9 @@ supportable_widening_operation (vec_info *vinfo, c2 = VEC_UNPACK_FIX_TRUNC_HI_EXPR; break; + case MAX_TREE_CODES: + break; + default: gcc_unreachable (); } @@ -12289,10 +12339,12 @@ supportable_widening_operation (vec_info *vinfo, 
if (code == FIX_TRUNC_EXPR) { /* The signedness is determined from output operand. */ - optab1 = optab_for_tree_code (c1, vectype_out, optab_default); - optab2 = optab_for_tree_code (c2, vectype_out, optab_default); + optab1 = optab_for_tree_code (c1.safe_as_tree_code (), vectype_out, + optab_default); + optab2 = optab_for_tree_code (c2.safe_as_tree_code (), vectype_out, + optab_default); } - else if (CONVERT_EXPR_CODE_P (code) + else if (CONVERT_EXPR_CODE_P (code.safe_as_tree_code ()) && VECTOR_BOOLEAN_TYPE_P (wide_vectype) && VECTOR_BOOLEAN_TYPE_P (vectype) && TYPE_MODE (wide_vectype) == TYPE_MODE (vectype) @@ -12305,8 +12357,10 @@ supportable_widening_operation (vec_info *vinfo, } else { - optab1 = optab_for_tree_code (c1, vectype, optab_default); - optab2 = optab_for_tree_code (c2, vectype, optab_default); + optab1 = optab_for_tree_code (c1.safe_as_tree_code (), vectype, + optab_default); + optab2 = optab_for_tree_code (c2.safe_as_tree_code (), vectype, + optab_default); } if (!optab1 || !optab2) @@ -12317,8 +12371,12 @@ supportable_widening_operation (vec_info *vinfo, || (icode2 = optab_handler (optab2, vec_mode)) == CODE_FOR_nothing) return false; - *code1 = c1; - *code2 = c2; + if (code.is_tree_code ()) + { + *code1 = c1; + *code2 = c2; + } + if (insn_data[icode1].operand[0].mode == TYPE_MODE (wide_vectype) && insn_data[icode2].operand[0].mode == TYPE_MODE (wide_vectype)) @@ -12339,7 +12397,7 @@ supportable_widening_operation (vec_info *vinfo, prev_type = vectype; prev_mode = vec_mode; - if (!CONVERT_EXPR_CODE_P (code)) + if (!CONVERT_EXPR_CODE_P ((tree_code) code)) return false; /* We assume here that there will not be more than MAX_INTERM_CVT_STEPS @@ -12379,8 +12437,12 @@ supportable_widening_operation (vec_info *vinfo, } else { - optab3 = optab_for_tree_code (c1, intermediate_type, optab_default); - optab4 = optab_for_tree_code (c2, intermediate_type, optab_default); + optab3 = optab_for_tree_code (c1.safe_as_tree_code (), + intermediate_type, + 
optab_default); + optab4 = optab_for_tree_code (c2.safe_as_tree_code (), + intermediate_type, + optab_default); } if (!optab3 || !optab4 @@ -12439,7 +12501,7 @@ supportable_widening_operation (vec_info *vinfo, bool supportable_narrowing_operation (enum tree_code code, tree vectype_out, tree vectype_in, - enum tree_code *code1, int *multi_step_cvt, + tree_code *code1, int *multi_step_cvt, vec<tree> *interm_types) { machine_mode vec_mode; diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h index 9cf2fb23fe397b467d89aa7cc5ebeaa293ed4cce..d241eba6ef3302225bbe37b374baa11e6472c280 100644 --- a/gcc/tree-vectorizer.h +++ b/gcc/tree-vectorizer.h @@ -2139,13 +2139,12 @@ extern bool vect_is_simple_use (vec_info *, stmt_vec_info, slp_tree, enum vect_def_type *, tree *, stmt_vec_info * = NULL); extern bool vect_maybe_update_slp_op_vectype (slp_tree, tree); -extern bool supportable_widening_operation (vec_info *, - enum tree_code, stmt_vec_info, - tree, tree, enum tree_code *, - enum tree_code *, int *, - vec<tree> *); +extern bool supportable_widening_operation (vec_info*, code_helper, + stmt_vec_info, tree, tree, + code_helper*, code_helper*, + int*, vec<tree> *); extern bool supportable_narrowing_operation (enum tree_code, tree, tree, - enum tree_code *, int *, + tree_code *, int *, vec<tree> *); extern unsigned record_stmt_cost (stmt_vector_for_cost *, int, @@ -2583,4 +2582,7 @@ vect_is_integer_truncation (stmt_vec_info stmt_info) && TYPE_PRECISION (lhs_type) < TYPE_PRECISION (rhs_type)); } +/* Build a GIMPLE_ASSIGN or GIMPLE_CALL with the tree_code, + or internal_fn contained in ch, respectively. 
 */
+gimple * vect_gimple_build (tree, code_helper, tree, tree);
 #endif  /* GCC_TREE_VECTORIZER_H  */
diff --git a/gcc/tree.h b/gcc/tree.h
index abcdb5638d49aea4ccc46efa8e540b1fa78aa27a..a250a80e0321241e1158086acb2dd837d5827e10 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -93,6 +93,8 @@ public:
   bool is_internal_fn () const;
   bool is_builtin_fn () const;
   int get_rep () const { return rep; }
+  enum tree_code safe_as_tree_code () const;
+  combined_fn safe_as_fn_code () const;
   bool operator== (const code_helper &other) { return rep == other.rep; }
   bool operator!= (const code_helper &other) { return rep != other.rep; }
   bool operator== (tree_code c) { return rep == code_helper (c).rep; }
@@ -102,6 +104,17 @@ private:
   int rep;
 };
 
+inline enum tree_code
+code_helper::safe_as_tree_code () const
+{
+  return is_tree_code () ? (tree_code)* this : MAX_TREE_CODES;
+}
+
+inline combined_fn
+code_helper::safe_as_fn_code () const {
+  return is_fn_code () ? (combined_fn) *this : CFN_LAST;
+}
+
 inline
 code_helper::operator internal_fn () const
 {
   return as_internal_fn (combined_fn (*this))

[-- Attachment #3: ifn1.patch --]
[-- Type: text/plain, Size: 18605 bytes --]

diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 6e81dc05e0e0714256759b0594816df451415a2d..e4d815cd577d266d2bccf6fb68d62aac91a8b4cf 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -17,6 +17,7 @@ You should have received a copy of the GNU General Public License
 along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.
 */
+#define INCLUDE_MAP
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
@@ -70,6 +71,26 @@ const int internal_fn_flags_array[] = {
   0
 };
 
+const enum internal_fn internal_fn_hilo_keys_array[] = {
+#undef DEF_INTERNAL_OPTAB_HILO_FN
+#define DEF_INTERNAL_OPTAB_HILO_FN(NAME, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \
+  IFN_##NAME##_LO, \
+  IFN_##NAME##_HI,
+#include "internal-fn.def"
+  IFN_LAST
+#undef DEF_INTERNAL_OPTAB_HILO_FN
+};
+
+const optab internal_fn_hilo_values_array[] = {
+#undef DEF_INTERNAL_OPTAB_HILO_FN
+#define DEF_INTERNAL_OPTAB_HILO_FN(NAME, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \
+  SOPTAB##_lo_optab, UOPTAB##_lo_optab, \
+  SOPTAB##_hi_optab, UOPTAB##_hi_optab,
+#include "internal-fn.def"
+  unknown_optab, unknown_optab
+#undef DEF_INTERNAL_OPTAB_HILO_FN
+};
+
 /* Return the internal function called NAME, or IFN_LAST if there's
    no such function.  */
 
@@ -90,6 +111,61 @@ lookup_internal_fn (const char *name)
   return entry ? *entry : IFN_LAST;
 }
 
+static int
+ifn_cmp (const void *a_, const void *b_)
+{
+  typedef std::pair<enum internal_fn, unsigned> ifn_pair;
+  auto *a = (const std::pair<ifn_pair, optab> *)a_;
+  auto *b = (const std::pair<ifn_pair, optab> *)b_;
+  return (int) (a->first.first) - (b->first.first);
+}
+
+/* Return the optab belonging to the given internal function NAME for the given
+   SIGN or unknown_optab.
*/ + +optab +lookup_hilo_ifn_optab (enum internal_fn fn, unsigned sign) +{ + typedef std::pair<enum internal_fn, unsigned> ifn_pair; + typedef auto_vec <std::pair<ifn_pair, optab>>fn_to_optab_map_type; + static fn_to_optab_map_type *fn_to_optab_map; + + if (!fn_to_optab_map) + { + unsigned num + = sizeof (internal_fn_hilo_keys_array) / sizeof (enum internal_fn); + fn_to_optab_map = new fn_to_optab_map_type (); + for (unsigned int i = 0; i < num - 1; ++i) + { + enum internal_fn fn = internal_fn_hilo_keys_array[i]; + optab v1 = internal_fn_hilo_values_array[2*i]; + optab v2 = internal_fn_hilo_values_array[2*i + 1]; + ifn_pair key1 (fn, 0); + fn_to_optab_map->safe_push ({key1, v1}); + ifn_pair key2 (fn, 1); + fn_to_optab_map->safe_push ({key2, v2}); + } + fn_to_optab_map->qsort (ifn_cmp); + } + + ifn_pair new_pair (fn, sign ? 1 : 0); + optab tmp; + std::pair<ifn_pair,optab> pair_wrap (new_pair, tmp); + auto entry = fn_to_optab_map->bsearch (&pair_wrap, ifn_cmp); + return entry != fn_to_optab_map->end () ? entry->second : unknown_optab; +} + +extern void +lookup_hilo_internal_fn (enum internal_fn ifn, enum internal_fn *lo, + enum internal_fn *hi) +{ + gcc_assert (decomposes_to_hilo_fn_p (ifn)); + + *lo = internal_fn (ifn + 1); + *hi = internal_fn (ifn + 2); +} + + /* Fnspec of each internal function, indexed by function number. */ const_tree internal_fn_fnspec_array[IFN_LAST + 1]; @@ -3970,6 +4046,9 @@ commutative_binary_fn_p (internal_fn fn) case IFN_UBSAN_CHECK_MUL: case IFN_ADD_OVERFLOW: case IFN_MUL_OVERFLOW: + case IFN_VEC_WIDEN_PLUS: + case IFN_VEC_WIDEN_PLUS_LO: + case IFN_VEC_WIDEN_PLUS_HI: return true; default: @@ -4043,6 +4122,42 @@ first_commutative_argument (internal_fn fn) } } +/* Return true if FN has a wider output type than its argument types. 
*/ + +bool +widening_fn_p (internal_fn fn) +{ + switch (fn) + { + case IFN_VEC_WIDEN_PLUS: + case IFN_VEC_WIDEN_MINUS: + return true; + + default: + return false; + } +} + +/* Return true if FN decomposes to _hi and _lo IFN. If true this should also + be a widening function. */ + +bool +decomposes_to_hilo_fn_p (internal_fn fn) +{ + if (!widening_fn_p (fn)) + return false; + + switch (fn) + { + case IFN_VEC_WIDEN_PLUS: + case IFN_VEC_WIDEN_MINUS: + return true; + + default: + return false; + } +} + /* Return true if IFN_SET_EDOM is supported. */ bool @@ -4055,6 +4170,32 @@ set_edom_supported_p (void) #endif } +#undef DEF_INTERNAL_OPTAB_HILO_FN +#define DEF_INTERNAL_OPTAB_HILO_FN(CODE, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \ + static void \ + expand_##CODE (internal_fn, gcall *) \ + { \ + gcc_unreachable (); \ + } \ + static void \ + expand_##CODE##_LO (internal_fn fn, gcall *stmt) \ + { \ + tree ty = TREE_TYPE (gimple_get_lhs (stmt)); \ + if (!TYPE_UNSIGNED (ty)) \ + expand_##TYPE##_optab_fn (fn, stmt, SOPTAB##_lo##_optab); \ + else \ + expand_##TYPE##_optab_fn (fn, stmt, UOPTAB##_lo##_optab); \ + } \ + static void \ + expand_##CODE##_HI (internal_fn fn, gcall *stmt) \ + { \ + tree ty = TREE_TYPE (gimple_get_lhs (stmt)); \ + if (!TYPE_UNSIGNED (ty)) \ + expand_##TYPE##_optab_fn (fn, stmt, SOPTAB##_hi##_optab); \ + else \ + expand_##TYPE##_optab_fn (fn, stmt, UOPTAB##_hi##_optab); \ + } + #define DEF_INTERNAL_OPTAB_FN(CODE, FLAGS, OPTAB, TYPE) \ static void \ expand_##CODE (internal_fn fn, gcall *stmt) \ @@ -4071,6 +4212,7 @@ set_edom_supported_p (void) expand_##TYPE##_optab_fn (fn, stmt, which_optab); \ } #include "internal-fn.def" +#undef DEF_INTERNAL_OPTAB_HILO_FN /* Routines to expand each internal function, indexed by function number. 
Each routine has the prototype: diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def index 7fe742c2ae713e7152ab05cfdfba86e4e0aa3456..347ed667d92620e0ee3ea15c58ecac6c242ebe73 100644 --- a/gcc/internal-fn.def +++ b/gcc/internal-fn.def @@ -85,6 +85,13 @@ along with GCC; see the file COPYING3. If not see says that the function extends the C-level BUILT_IN_<NAME>{,L,LL,IMAX} group of functions to any integral mode (including vector modes). + DEF_INTERNAL_OPTAB_HILO_FN is like DEF_INTERNAL_OPTAB_FN except it + provides convenience wrappers for defining conversions that require a + hi/lo split, like widening and narrowing operations. Each definition + for <NAME> will require an optab named <OPTAB> and two other optabs that + you specify for signed and unsigned. + + Each entry must have a corresponding expander of the form: void expand_NAME (gimple_call stmt) @@ -123,6 +130,14 @@ along with GCC; see the file COPYING3. If not see DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE) #endif +#ifndef DEF_INTERNAL_OPTAB_HILO_FN +#define DEF_INTERNAL_OPTAB_HILO_FN(NAME, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \ + DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE) \ + DEF_INTERNAL_OPTAB_FN (NAME ## _LO, FLAGS, unknown, TYPE) \ + DEF_INTERNAL_OPTAB_FN (NAME ## _HI, FLAGS, unknown, TYPE) +#endif + + DEF_INTERNAL_OPTAB_FN (MASK_LOAD, ECF_PURE, maskload, mask_load) DEF_INTERNAL_OPTAB_FN (LOAD_LANES, ECF_CONST, vec_load_lanes, load_lanes) DEF_INTERNAL_OPTAB_FN (MASK_LOAD_LANES, ECF_PURE, @@ -315,6 +330,14 @@ DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT270, ECF_CONST, cadd270, binary) DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL, ECF_CONST, cmul, binary) DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL_CONJ, ECF_CONST, cmul_conj, binary) DEF_INTERNAL_OPTAB_FN (VEC_ADDSUB, ECF_CONST, vec_addsub, binary) +DEF_INTERNAL_OPTAB_HILO_FN (VEC_WIDEN_PLUS, + ECF_CONST | ECF_NOTHROW, + vec_widen_add, vec_widen_saddl, vec_widen_uaddl, + binary) +DEF_INTERNAL_OPTAB_HILO_FN (VEC_WIDEN_MINUS, + ECF_CONST | ECF_NOTHROW, + vec_widen_sub, 
vec_widen_ssubl, vec_widen_usubl, + binary) DEF_INTERNAL_OPTAB_FN (VEC_FMADDSUB, ECF_CONST, vec_fmaddsub, ternary) DEF_INTERNAL_OPTAB_FN (VEC_FMSUBADD, ECF_CONST, vec_fmsubadd, ternary) diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h index 08922ed4254898f5fffca3f33973e96ed9ce772f..6a5f8762e872ad2ef64ce2986a678e3b40622d81 100644 --- a/gcc/internal-fn.h +++ b/gcc/internal-fn.h @@ -20,6 +20,10 @@ along with GCC; see the file COPYING3. If not see #ifndef GCC_INTERNAL_FN_H #define GCC_INTERNAL_FN_H +#include "insn-codes.h" +#include "insn-opinit.h" + + /* INTEGER_CST values for IFN_UNIQUE function arg-0. UNSPEC: Undifferentiated UNIQUE. @@ -112,6 +116,9 @@ internal_fn_name (enum internal_fn fn) } extern internal_fn lookup_internal_fn (const char *); +extern optab lookup_hilo_ifn_optab (enum internal_fn, unsigned); +extern void lookup_hilo_internal_fn (enum internal_fn, enum internal_fn *, + enum internal_fn *); /* Return the ECF_* flags for function FN. */ @@ -210,6 +217,8 @@ extern bool commutative_binary_fn_p (internal_fn); extern bool commutative_ternary_fn_p (internal_fn); extern int first_commutative_argument (internal_fn); extern bool associative_binary_fn_p (internal_fn); +extern bool widening_fn_p (internal_fn); +extern bool decomposes_to_hilo_fn_p (internal_fn); extern bool set_edom_supported_p (void); diff --git a/gcc/optabs.cc b/gcc/optabs.cc index c8e39c82d57a7d726e7da33d247b80f32ec9236c..d4dd7ee3d34d01c32ab432ae4e4ce9e4b522b2f7 100644 --- a/gcc/optabs.cc +++ b/gcc/optabs.cc @@ -1314,7 +1314,12 @@ commutative_optab_p (optab binoptab) || binoptab == smul_widen_optab || binoptab == umul_widen_optab || binoptab == smul_highpart_optab - || binoptab == umul_highpart_optab); + || binoptab == umul_highpart_optab + || binoptab == vec_widen_add_optab + || binoptab == vec_widen_saddl_hi_optab + || binoptab == vec_widen_saddl_lo_optab + || binoptab == vec_widen_uaddl_hi_optab + || binoptab == vec_widen_uaddl_lo_optab); } /* X is to be used in mode MODE as operand 
OPN to BINOPTAB. If we're diff --git a/gcc/optabs.def b/gcc/optabs.def index 695f5911b300c9ca5737de9be809fa01aabe5e01..e064189103b3be70644468d11f3c91ac45ffe0d0 100644 --- a/gcc/optabs.def +++ b/gcc/optabs.def @@ -78,6 +78,8 @@ OPTAB_CD(smsub_widen_optab, "msub$b$a4") OPTAB_CD(umsub_widen_optab, "umsub$b$a4") OPTAB_CD(ssmsub_widen_optab, "ssmsub$b$a4") OPTAB_CD(usmsub_widen_optab, "usmsub$a$b4") +OPTAB_CD(vec_widen_add_optab, "add$a$b3") +OPTAB_CD(vec_widen_sub_optab, "sub$a$b3") OPTAB_CD(vec_load_lanes_optab, "vec_load_lanes$a$b") OPTAB_CD(vec_store_lanes_optab, "vec_store_lanes$a$b") OPTAB_CD(vec_mask_load_lanes_optab, "vec_mask_load_lanes$a$b") diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c index 220bd9352a4c7acd2e3713e441d74898d3e92b30..7037673d32bd780e1c9b58a51e58e2bac3b30b7e 100644 --- a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c +++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c @@ -1,5 +1,5 @@ /* { dg-do run } */ -/* { dg-options "-O3 -save-temps" } */ +/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */ #include <stdint.h> #include <string.h> @@ -86,6 +86,8 @@ main() return 0; } +/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_LO" "vect" } } */ +/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_HI" "vect" } } */ /* { dg-final { scan-assembler-times {\tuaddl\t} 1} } */ /* { dg-final { scan-assembler-times {\tuaddl2\t} 1} } */ /* { dg-final { scan-assembler-times {\tsaddl\t} 1} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c index a2bed63affbd091977df95a126da1f5b8c1d41d2..83bc1edb6105f47114b665e24a13e6194b2179a2 100644 --- a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c +++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c @@ -1,5 +1,5 @@ /* { dg-do run } */ -/* { dg-options "-O3 -save-temps" } */ +/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */ #include 
<stdint.h> #include <string.h> @@ -86,6 +86,8 @@ main() return 0; } +/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_LO" "vect" } } */ +/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_HI" "vect" } } */ /* { dg-final { scan-assembler-times {\tusubl\t} 1} } */ /* { dg-final { scan-assembler-times {\tusubl2\t} 1} } */ /* { dg-final { scan-assembler-times {\tssubl\t} 1} } */ diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc index 68dfba266d679c9738a3d5d70551a91cbdafcf66..1a514461b2ca416f45a5fa9abe417980d33ef4df 100644 --- a/gcc/tree-vect-patterns.cc +++ b/gcc/tree-vect-patterns.cc @@ -1394,14 +1394,16 @@ static gimple * vect_recog_widen_op_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info, tree *type_out, tree_code orig_code, code_helper wide_code, - bool shift_p, const char *name) + bool shift_p, const char *name, + enum optab_subtype *subtype = NULL) { gimple *last_stmt = last_stmt_info->stmt; vect_unpromoted_value unprom[2]; tree half_type; if (!vect_widened_op_tree (vinfo, last_stmt_info, orig_code, orig_code, - shift_p, 2, unprom, &half_type)) + shift_p, 2, unprom, &half_type, subtype)) + return NULL; /* Pattern detected. */ @@ -1467,6 +1469,20 @@ vect_recog_widen_op_pattern (vec_info *vinfo, type, pattern_stmt, vecctype); } +static gimple * +vect_recog_widen_op_pattern (vec_info *vinfo, + stmt_vec_info last_stmt_info, tree *type_out, + tree_code orig_code, internal_fn wide_ifn, + bool shift_p, const char *name, + enum optab_subtype *subtype = NULL) +{ + combined_fn ifn = as_combined_fn (wide_ifn); + return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out, + orig_code, ifn, shift_p, name, + subtype); +} + + /* Try to detect multiplication on widened inputs, converting MULT_EXPR to WIDEN_MULT_EXPR. See vect_recog_widen_op_pattern for details. 
*/ @@ -1480,26 +1496,30 @@ vect_recog_widen_mult_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info, } /* Try to detect addition on widened inputs, converting PLUS_EXPR - to WIDEN_PLUS_EXPR. See vect_recog_widen_op_pattern for details. */ + to IFN_VEC_WIDEN_PLUS. See vect_recog_widen_op_pattern for details. */ static gimple * vect_recog_widen_plus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info, tree *type_out) { + enum optab_subtype subtype; return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out, - PLUS_EXPR, WIDEN_PLUS_EXPR, false, - "vect_recog_widen_plus_pattern"); + PLUS_EXPR, IFN_VEC_WIDEN_PLUS, + false, "vect_recog_widen_plus_pattern", + &subtype); } /* Try to detect subtraction on widened inputs, converting MINUS_EXPR - to WIDEN_MINUS_EXPR. See vect_recog_widen_op_pattern for details. */ + to IFN_VEC_WIDEN_MINUS. See vect_recog_widen_op_pattern for details. */ static gimple * vect_recog_widen_minus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info, tree *type_out) { + enum optab_subtype subtype; return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out, - MINUS_EXPR, WIDEN_MINUS_EXPR, false, - "vect_recog_widen_minus_pattern"); + MINUS_EXPR, IFN_VEC_WIDEN_MINUS, + false, "vect_recog_widen_minus_pattern", + &subtype); } /* Function vect_recog_popcount_pattern @@ -6067,6 +6087,7 @@ static vect_recog_func vect_vect_recog_func_ptrs[] = { { vect_recog_mask_conversion_pattern, "mask_conversion" }, { vect_recog_widen_plus_pattern, "widen_plus" }, { vect_recog_widen_minus_pattern, "widen_minus" }, + /* These must come after the double widening ones. 
*/ }; const unsigned int NUM_PATTERNS = ARRAY_SIZE (vect_vect_recog_func_ptrs); diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index 715ec2e30a4de620b8a5076c0e7f2f7fd1b0654e..f4806073f48d4dedea3ac9bd855792b152d78919 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -5035,7 +5035,9 @@ vectorizable_conversion (vec_info *vinfo, bool widen_arith = (code == WIDEN_PLUS_EXPR || code == WIDEN_MINUS_EXPR || code == WIDEN_MULT_EXPR - || code == WIDEN_LSHIFT_EXPR); + || code == WIDEN_LSHIFT_EXPR + || code == IFN_VEC_WIDEN_PLUS + || code == IFN_VEC_WIDEN_MINUS); if (!widen_arith && !CONVERT_EXPR_CODE_P (code) @@ -5085,7 +5087,9 @@ vectorizable_conversion (vec_info *vinfo, gcc_assert (code == WIDEN_MULT_EXPR || code == WIDEN_LSHIFT_EXPR || code == WIDEN_PLUS_EXPR - || code == WIDEN_MINUS_EXPR); + || code == WIDEN_MINUS_EXPR + || code == IFN_VEC_WIDEN_PLUS + || code == IFN_VEC_WIDEN_MINUS); op1 = is_gimple_assign (stmt) ? gimple_assign_rhs2 (stmt) : @@ -12355,14 +12359,50 @@ supportable_widening_operation (vec_info *vinfo, optab1 = vec_unpacks_sbool_lo_optab; optab2 = vec_unpacks_sbool_hi_optab; } - else - { - optab1 = optab_for_tree_code (c1.safe_as_tree_code (), vectype, - optab_default); - optab2 = optab_for_tree_code (c2.safe_as_tree_code (), vectype, - optab_default); + + if (code.is_fn_code ()) + { + internal_fn ifn = as_internal_fn (code.safe_as_fn_code ()); + gcc_assert (decomposes_to_hilo_fn_p (ifn)); + + internal_fn lo, hi; + lookup_hilo_internal_fn (ifn, &lo, &hi); + *code1 = as_combined_fn (lo); + *code2 = as_combined_fn (hi); + optab1 = lookup_hilo_ifn_optab (lo, !TYPE_UNSIGNED (vectype)); + optab2 = lookup_hilo_ifn_optab (hi, !TYPE_UNSIGNED (vectype)); } + if (code.is_tree_code ()) + { + if (code == FIX_TRUNC_EXPR) + { + /* The signedness is determined from output operand. 
*/ + optab1 = optab_for_tree_code (c1.safe_as_tree_code (), vectype_out, + optab_default); + optab2 = optab_for_tree_code (c2.safe_as_tree_code (), vectype_out, + optab_default); + } + else if (CONVERT_EXPR_CODE_P (code.safe_as_tree_code ()) + && VECTOR_BOOLEAN_TYPE_P (wide_vectype) + && VECTOR_BOOLEAN_TYPE_P (vectype) + && TYPE_MODE (wide_vectype) == TYPE_MODE (vectype) + && SCALAR_INT_MODE_P (TYPE_MODE (vectype))) + { + /* If the input and result modes are the same, a different optab + is needed where we pass in the number of units in vectype. */ + optab1 = vec_unpacks_sbool_lo_optab; + optab2 = vec_unpacks_sbool_hi_optab; + } + else + { + optab1 = optab_for_tree_code (c1.safe_as_tree_code (), vectype, + optab_default); + optab2 = optab_for_tree_code (c2.safe_as_tree_code (), vectype, + optab_default); + } + } + if (!optab1 || !optab2) return false; [-- Attachment #4: ifn2.patch --] [-- Type: text/plain, Size: 19234 bytes --] diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc index 1a1b26b1c6c23ce273bcd08dc9a973f777174007..25b1558dcb941ea491a19aeeb2cd8f4d2dbdf7c6 100644 --- a/gcc/cfgexpand.cc +++ b/gcc/cfgexpand.cc @@ -5365,10 +5365,6 @@ expand_debug_expr (tree exp) case VEC_WIDEN_MULT_ODD_EXPR: case VEC_WIDEN_LSHIFT_HI_EXPR: case VEC_WIDEN_LSHIFT_LO_EXPR: - case VEC_WIDEN_PLUS_HI_EXPR: - case VEC_WIDEN_PLUS_LO_EXPR: - case VEC_WIDEN_MINUS_HI_EXPR: - case VEC_WIDEN_MINUS_LO_EXPR: case VEC_PERM_EXPR: case VEC_DUPLICATE_EXPR: case VEC_SERIES_EXPR: @@ -5405,8 +5401,6 @@ expand_debug_expr (tree exp) case WIDEN_MULT_EXPR: case WIDEN_MULT_PLUS_EXPR: case WIDEN_MULT_MINUS_EXPR: - case WIDEN_PLUS_EXPR: - case WIDEN_MINUS_EXPR: if (SCALAR_INT_MODE_P (GET_MODE (op0)) && SCALAR_INT_MODE_P (mode)) { @@ -5419,10 +5413,6 @@ expand_debug_expr (tree exp) op1 = simplify_gen_unary (ZERO_EXTEND, mode, op1, inner_mode); else op1 = simplify_gen_unary (SIGN_EXTEND, mode, op1, inner_mode); - if (TREE_CODE (exp) == WIDEN_PLUS_EXPR) - return simplify_gen_binary (PLUS, mode, op0, op1); - 
else if (TREE_CODE (exp) == WIDEN_MINUS_EXPR) - return simplify_gen_binary (MINUS, mode, op0, op1); op0 = simplify_gen_binary (MULT, mode, op0, op1); if (TREE_CODE (exp) == WIDEN_MULT_EXPR) return op0; diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi index 2c14b7abce2db0a3da0a21e916907947cb56a265..3816abaaf4d364d604a44942317f96f3f303e5b6 100644 --- a/gcc/doc/generic.texi +++ b/gcc/doc/generic.texi @@ -1811,10 +1811,6 @@ a value from @code{enum annot_expr_kind}, the third is an @code{INTEGER_CST}. @tindex VEC_RSHIFT_EXPR @tindex VEC_WIDEN_MULT_HI_EXPR @tindex VEC_WIDEN_MULT_LO_EXPR -@tindex VEC_WIDEN_PLUS_HI_EXPR -@tindex VEC_WIDEN_PLUS_LO_EXPR -@tindex VEC_WIDEN_MINUS_HI_EXPR -@tindex VEC_WIDEN_MINUS_LO_EXPR @tindex VEC_UNPACK_HI_EXPR @tindex VEC_UNPACK_LO_EXPR @tindex VEC_UNPACK_FLOAT_HI_EXPR @@ -1861,33 +1857,6 @@ vector of @code{N/2} products. In the case of @code{VEC_WIDEN_MULT_LO_EXPR} the low @code{N/2} elements of the two vector are multiplied to produce the vector of @code{N/2} products. -@item VEC_WIDEN_PLUS_HI_EXPR -@itemx VEC_WIDEN_PLUS_LO_EXPR -These nodes represent widening vector addition of the high and low parts of -the two input vectors, respectively. Their operands are vectors that contain -the same number of elements (@code{N}) of the same integral type. The result -is a vector that contains half as many elements, of an integral type whose size -is twice as wide. In the case of @code{VEC_WIDEN_PLUS_HI_EXPR} the high -@code{N/2} elements of the two vectors are added to produce the vector of -@code{N/2} products. In the case of @code{VEC_WIDEN_PLUS_LO_EXPR} the low -@code{N/2} elements of the two vectors are added to produce the vector of -@code{N/2} products. - -@item VEC_WIDEN_MINUS_HI_EXPR -@itemx VEC_WIDEN_MINUS_LO_EXPR -These nodes represent widening vector subtraction of the high and low parts of -the two input vectors, respectively. 
Their operands are vectors that contain -the same number of elements (@code{N}) of the same integral type. The high/low -elements of the second vector are subtracted from the high/low elements of the -first. The result is a vector that contains half as many elements, of an -integral type whose size is twice as wide. In the case of -@code{VEC_WIDEN_MINUS_HI_EXPR} the high @code{N/2} elements of the second -vector are subtracted from the high @code{N/2} of the first to produce the -vector of @code{N/2} products. In the case of -@code{VEC_WIDEN_MINUS_LO_EXPR} the low @code{N/2} elements of the second -vector are subtracted from the low @code{N/2} of the first to produce the -vector of @code{N/2} products. - @item VEC_UNPACK_HI_EXPR @itemx VEC_UNPACK_LO_EXPR These nodes represent unpacking of the high and low parts of the input vector, diff --git a/gcc/expr.cc b/gcc/expr.cc index f8f5cc5a6ca67f291b3c8b7246d593c0be80272f..454d1391b19a7d2aa53f0a88876d1eaf0494de51 100644 --- a/gcc/expr.cc +++ b/gcc/expr.cc @@ -9601,8 +9601,6 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode, target, unsignedp); return target; - case WIDEN_PLUS_EXPR: - case WIDEN_MINUS_EXPR: case WIDEN_MULT_EXPR: /* If first operand is constant, swap them. 
Thus the following special case checks need only @@ -10380,10 +10378,6 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode, return temp; } - case VEC_WIDEN_PLUS_HI_EXPR: - case VEC_WIDEN_PLUS_LO_EXPR: - case VEC_WIDEN_MINUS_HI_EXPR: - case VEC_WIDEN_MINUS_LO_EXPR: case VEC_WIDEN_MULT_HI_EXPR: case VEC_WIDEN_MULT_LO_EXPR: case VEC_WIDEN_MULT_EVEN_EXPR: diff --git a/gcc/gimple-pretty-print.cc b/gcc/gimple-pretty-print.cc index 300e9d7ed1e7be73f30875e08c461a8880c3134e..d903826894e7f0dfd34dc0caad92eea3caa45e05 100644 --- a/gcc/gimple-pretty-print.cc +++ b/gcc/gimple-pretty-print.cc @@ -459,10 +459,6 @@ dump_binary_rhs (pretty_printer *buffer, const gassign *gs, int spc, case VEC_PACK_FLOAT_EXPR: case VEC_WIDEN_LSHIFT_HI_EXPR: case VEC_WIDEN_LSHIFT_LO_EXPR: - case VEC_WIDEN_PLUS_HI_EXPR: - case VEC_WIDEN_PLUS_LO_EXPR: - case VEC_WIDEN_MINUS_HI_EXPR: - case VEC_WIDEN_MINUS_LO_EXPR: case VEC_SERIES_EXPR: for (p = get_tree_code_name (code); *p; p++) pp_character (buffer, TOUPPER (*p)); diff --git a/gcc/gimple-range-op.cc b/gcc/gimple-range-op.cc index 4ca32a7b5d52f8426b09d1446a336650e143b41f..5ae7f7596c6fc6f901e4e47ae44f00185f4602b2 100644 --- a/gcc/gimple-range-op.cc +++ b/gcc/gimple-range-op.cc @@ -797,12 +797,6 @@ gimple_range_op_handler::maybe_non_standard () if (gimple_code (m_stmt) == GIMPLE_ASSIGN) switch (gimple_assign_rhs_code (m_stmt)) { - case WIDEN_PLUS_EXPR: - { - signed_op = ptr_op_widen_plus_signed; - unsigned_op = ptr_op_widen_plus_unsigned; - } - gcc_fallthrough (); case WIDEN_MULT_EXPR: { m_valid = false; diff --git a/gcc/optabs-tree.cc b/gcc/optabs-tree.cc index 8010046c6a8b3e809c989ddef7a06ddaa68ae32a..ee1aa8c9676ee9c67edbf403e6295da391826a62 100644 --- a/gcc/optabs-tree.cc +++ b/gcc/optabs-tree.cc @@ -190,22 +190,6 @@ optab_for_tree_code (enum tree_code code, const_tree type, return (TYPE_UNSIGNED (type) ? vec_widen_ushiftl_lo_optab : vec_widen_sshiftl_lo_optab); - case VEC_WIDEN_PLUS_LO_EXPR: - return (TYPE_UNSIGNED (type) - ? 
vec_widen_uaddl_lo_optab : vec_widen_saddl_lo_optab); - - case VEC_WIDEN_PLUS_HI_EXPR: - return (TYPE_UNSIGNED (type) - ? vec_widen_uaddl_hi_optab : vec_widen_saddl_hi_optab); - - case VEC_WIDEN_MINUS_LO_EXPR: - return (TYPE_UNSIGNED (type) - ? vec_widen_usubl_lo_optab : vec_widen_ssubl_lo_optab); - - case VEC_WIDEN_MINUS_HI_EXPR: - return (TYPE_UNSIGNED (type) - ? vec_widen_usubl_hi_optab : vec_widen_ssubl_hi_optab); - case VEC_UNPACK_HI_EXPR: return (TYPE_UNSIGNED (type) ? vec_unpacku_hi_optab : vec_unpacks_hi_optab); @@ -312,8 +296,6 @@ optab_for_tree_code (enum tree_code code, const_tree type, 'hi'/'lo' pair using codes such as VEC_WIDEN_MINUS_HI/LO. Supported widening operations: - WIDEN_MINUS_EXPR - WIDEN_PLUS_EXPR WIDEN_MULT_EXPR WIDEN_LSHIFT_EXPR @@ -345,12 +327,6 @@ supportable_half_widening_operation (enum tree_code code, tree vectype_out, case WIDEN_LSHIFT_EXPR: *code1 = LSHIFT_EXPR; break; - case WIDEN_MINUS_EXPR: - *code1 = MINUS_EXPR; - break; - case WIDEN_PLUS_EXPR: - *code1 = PLUS_EXPR; - break; case WIDEN_MULT_EXPR: *code1 = MULT_EXPR; break; diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc index a9fcc7fd050f871437ef336ecfb8d6cc81280ee0..f80cd1465df83b5540492e619e56b9af249e9f31 100644 --- a/gcc/tree-cfg.cc +++ b/gcc/tree-cfg.cc @@ -4017,8 +4017,6 @@ verify_gimple_assign_binary (gassign *stmt) return false; } - case WIDEN_PLUS_EXPR: - case WIDEN_MINUS_EXPR: case PLUS_EXPR: case MINUS_EXPR: { @@ -4139,10 +4137,6 @@ verify_gimple_assign_binary (gassign *stmt) return false; } - case VEC_WIDEN_MINUS_HI_EXPR: - case VEC_WIDEN_MINUS_LO_EXPR: - case VEC_WIDEN_PLUS_HI_EXPR: - case VEC_WIDEN_PLUS_LO_EXPR: case VEC_WIDEN_MULT_HI_EXPR: case VEC_WIDEN_MULT_LO_EXPR: case VEC_WIDEN_MULT_EVEN_EXPR: diff --git a/gcc/tree-inline.cc b/gcc/tree-inline.cc index c702f0032a19203a7c536a01c1e7f47fc7b77add..6e5fd45a0c2435109dd3d50e8fc8e1d4969a1fd0 100644 --- a/gcc/tree-inline.cc +++ b/gcc/tree-inline.cc @@ -4273,8 +4273,6 @@ estimate_operator_cost (enum tree_code code, 
eni_weights *weights, case REALIGN_LOAD_EXPR: - case WIDEN_PLUS_EXPR: - case WIDEN_MINUS_EXPR: case WIDEN_SUM_EXPR: case WIDEN_MULT_EXPR: case DOT_PROD_EXPR: @@ -4283,10 +4281,6 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights, case WIDEN_MULT_MINUS_EXPR: case WIDEN_LSHIFT_EXPR: - case VEC_WIDEN_PLUS_HI_EXPR: - case VEC_WIDEN_PLUS_LO_EXPR: - case VEC_WIDEN_MINUS_HI_EXPR: - case VEC_WIDEN_MINUS_LO_EXPR: case VEC_WIDEN_MULT_HI_EXPR: case VEC_WIDEN_MULT_LO_EXPR: case VEC_WIDEN_MULT_EVEN_EXPR: diff --git a/gcc/tree-pretty-print.cc b/gcc/tree-pretty-print.cc index 7947f9647a15110b52d195643ad7d28ee32d4236..9941d8bf80535a98e647b8928619a6bf08bc434c 100644 --- a/gcc/tree-pretty-print.cc +++ b/gcc/tree-pretty-print.cc @@ -2874,8 +2874,6 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags, break; /* Binary arithmetic and logic expressions. */ - case WIDEN_PLUS_EXPR: - case WIDEN_MINUS_EXPR: case WIDEN_SUM_EXPR: case WIDEN_MULT_EXPR: case MULT_EXPR: @@ -3831,10 +3829,6 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags, case VEC_SERIES_EXPR: case VEC_WIDEN_MULT_HI_EXPR: case VEC_WIDEN_MULT_LO_EXPR: - case VEC_WIDEN_PLUS_HI_EXPR: - case VEC_WIDEN_PLUS_LO_EXPR: - case VEC_WIDEN_MINUS_HI_EXPR: - case VEC_WIDEN_MINUS_LO_EXPR: case VEC_WIDEN_MULT_EVEN_EXPR: case VEC_WIDEN_MULT_ODD_EXPR: case VEC_WIDEN_LSHIFT_HI_EXPR: @@ -4352,12 +4346,6 @@ op_symbol_code (enum tree_code code) case WIDEN_LSHIFT_EXPR: return "w<<"; - case WIDEN_PLUS_EXPR: - return "w+"; - - case WIDEN_MINUS_EXPR: - return "w-"; - case POINTER_PLUS_EXPR: return "+"; diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc index 8daf7bd7dd34d043b1d7b4cba1779f0ecf9f520a..213a3899a6c145bb057cd118bec1df7a05728aef 100644 --- a/gcc/tree-vect-data-refs.cc +++ b/gcc/tree-vect-data-refs.cc @@ -136,8 +136,6 @@ vect_get_smallest_scalar_type (stmt_vec_info stmt_info, tree scalar_type) || gimple_assign_rhs_code (assign) == WIDEN_SUM_EXPR || 
gimple_assign_rhs_code (assign) == WIDEN_MULT_EXPR || gimple_assign_rhs_code (assign) == WIDEN_LSHIFT_EXPR - || gimple_assign_rhs_code (assign) == WIDEN_PLUS_EXPR - || gimple_assign_rhs_code (assign) == WIDEN_MINUS_EXPR || gimple_assign_rhs_code (assign) == FLOAT_EXPR) { tree rhs_type = TREE_TYPE (gimple_assign_rhs1 (assign)); diff --git a/gcc/tree-vect-generic.cc b/gcc/tree-vect-generic.cc index 445da53292e9d1d2db62ca962fc017bb0e6c9bbe..342ffc5fa7f3b8f37e6bd4658d2f1fccf1d2c7fa 100644 --- a/gcc/tree-vect-generic.cc +++ b/gcc/tree-vect-generic.cc @@ -2227,10 +2227,6 @@ expand_vector_operations_1 (gimple_stmt_iterator *gsi, arguments, not the widened result. VEC_UNPACK_FLOAT_*_EXPR is calculated in the same way above. */ if (code == WIDEN_SUM_EXPR - || code == VEC_WIDEN_PLUS_HI_EXPR - || code == VEC_WIDEN_PLUS_LO_EXPR - || code == VEC_WIDEN_MINUS_HI_EXPR - || code == VEC_WIDEN_MINUS_LO_EXPR || code == VEC_WIDEN_MULT_HI_EXPR || code == VEC_WIDEN_MULT_LO_EXPR || code == VEC_WIDEN_MULT_EVEN_EXPR diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc index 1a514461b2ca416f45a5fa9abe417980d33ef4df..13c69133d7ae565cf0334390cb0c303c89f98ac8 100644 --- a/gcc/tree-vect-patterns.cc +++ b/gcc/tree-vect-patterns.cc @@ -561,21 +561,35 @@ vect_joust_widened_type (tree type, tree new_type, tree *common_type) static unsigned int vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code, - tree_code widened_code, bool shift_p, + code_helper widened_code, bool shift_p, unsigned int max_nops, vect_unpromoted_value *unprom, tree *common_type, enum optab_subtype *subtype = NULL) { /* Check for an integer operation with the right code. 
*/ - gassign *assign = dyn_cast <gassign *> (stmt_info->stmt); - if (!assign) + gimple* stmt = stmt_info->stmt; + if (!(is_gimple_assign (stmt) || is_gimple_call (stmt))) return 0; - tree_code rhs_code = gimple_assign_rhs_code (assign); - if (rhs_code != code && rhs_code != widened_code) + code_helper rhs_code; + if (is_gimple_assign (stmt)) + { + rhs_code = gimple_assign_rhs_code (stmt); + if (rhs_code.safe_as_tree_code () != code + && rhs_code.get_rep () != widened_code.get_rep ()) + return 0; + } + else if (is_gimple_call (stmt)) + { + rhs_code = gimple_call_combined_fn (stmt); + if (rhs_code.get_rep () != widened_code.get_rep ()) + return 0; + } + else return 0; - tree type = TREE_TYPE (gimple_assign_lhs (assign)); + tree lhs = gimple_get_lhs (stmt); + tree type = TREE_TYPE (lhs); if (!INTEGRAL_TYPE_P (type)) return 0; @@ -588,7 +602,7 @@ vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code, { vect_unpromoted_value *this_unprom = &unprom[next_op]; unsigned int nops = 1; - tree op = gimple_op (assign, i + 1); + tree op = gimple_arg (stmt, i); if (i == 1 && TREE_CODE (op) == INTEGER_CST) { /* We already have a common type from earlier operands. @@ -1342,8 +1356,9 @@ vect_recog_sad_pattern (vec_info *vinfo, /* FORNOW. Can continue analyzing the def-use chain when this stmt in a phi inside the loop (in case we are analyzing an outer-loop). 
*/ vect_unpromoted_value unprom[2]; - if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR, WIDEN_MINUS_EXPR, - false, 2, unprom, &half_type)) + if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR, + CFN_VEC_WIDEN_MINUS, false, 2, unprom, + &half_type)) return NULL; vect_pattern_detected ("vect_recog_sad_pattern", last_stmt); @@ -2696,9 +2711,10 @@ vect_recog_average_pattern (vec_info *vinfo, internal_fn ifn = IFN_AVG_FLOOR; vect_unpromoted_value unprom[3]; tree new_type; + enum optab_subtype subtype; unsigned int nops = vect_widened_op_tree (vinfo, plus_stmt_info, PLUS_EXPR, - WIDEN_PLUS_EXPR, false, 3, - unprom, &new_type); + CFN_VEC_WIDEN_PLUS, false, 3, + unprom, &new_type, &subtype); if (nops == 0) return NULL; if (nops == 3) diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index f4806073f48d4dedea3ac9bd855792b152d78919..38f4680d45ab80e8f86327327c13667d96bc5bea 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -5032,9 +5032,7 @@ vectorizable_conversion (vec_info *vinfo, else return false; - bool widen_arith = (code == WIDEN_PLUS_EXPR - || code == WIDEN_MINUS_EXPR - || code == WIDEN_MULT_EXPR + bool widen_arith = (code == WIDEN_MULT_EXPR || code == WIDEN_LSHIFT_EXPR || code == IFN_VEC_WIDEN_PLUS || code == IFN_VEC_WIDEN_MINUS); @@ -5086,8 +5084,6 @@ vectorizable_conversion (vec_info *vinfo, { gcc_assert (code == WIDEN_MULT_EXPR || code == WIDEN_LSHIFT_EXPR - || code == WIDEN_PLUS_EXPR - || code == WIDEN_MINUS_EXPR || code == IFN_VEC_WIDEN_PLUS || code == IFN_VEC_WIDEN_MINUS); @@ -12211,7 +12207,7 @@ supportable_widening_operation (vec_info *vinfo, class loop *vect_loop = NULL; machine_mode vec_mode; enum insn_code icode1, icode2; - optab optab1, optab2; + optab optab1 = unknown_optab, optab2 = unknown_optab; tree vectype = vectype_in; tree wide_vectype = vectype_out; code_helper c1 = MAX_TREE_CODES, c2 = MAX_TREE_CODES; @@ -12305,16 +12301,6 @@ supportable_widening_operation (vec_info *vinfo, c2 = 
VEC_WIDEN_LSHIFT_HI_EXPR; break; - case WIDEN_PLUS_EXPR: - c1 = VEC_WIDEN_PLUS_LO_EXPR; - c2 = VEC_WIDEN_PLUS_HI_EXPR; - break; - - case WIDEN_MINUS_EXPR: - c1 = VEC_WIDEN_MINUS_LO_EXPR; - c2 = VEC_WIDEN_MINUS_HI_EXPR; - break; - CASE_CONVERT: c1 = VEC_UNPACK_LO_EXPR; c2 = VEC_UNPACK_HI_EXPR; diff --git a/gcc/tree.def b/gcc/tree.def index ee02754354f015a16737c7e879d89c3e3be0d5aa..a58e608a90078818a7ade9d1173ac7ec84c48c7a 100644 --- a/gcc/tree.def +++ b/gcc/tree.def @@ -1374,15 +1374,16 @@ DEFTREECODE (DOT_PROD_EXPR, "dot_prod_expr", tcc_expression, 3) DEFTREECODE (WIDEN_SUM_EXPR, "widen_sum_expr", tcc_binary, 2) /* Widening sad (sum of absolute differences). - The first two arguments are of type t1 which should be integer. - The third argument and the result are of type t2, such that t2 is at least - twice the size of t1. Like DOT_PROD_EXPR, SAD_EXPR (arg1,arg2,arg3) is + The first two arguments are of type t1 which should be a vector of integers. + The third argument and the result are of type t2, such that the size of + the elements of t2 is at least twice the size of the elements of t1. + Like DOT_PROD_EXPR, SAD_EXPR (arg1,arg2,arg3) is equivalent to: - tmp = WIDEN_MINUS_EXPR (arg1, arg2) + tmp = IFN_VEC_WIDEN_MINUS_EXPR (arg1, arg2) tmp2 = ABS_EXPR (tmp) arg3 = PLUS_EXPR (tmp2, arg3) or: - tmp = WIDEN_MINUS_EXPR (arg1, arg2) + tmp = IFN_VEC_WIDEN_MINUS_EXPR (arg1, arg2) tmp2 = ABS_EXPR (tmp) arg3 = WIDEN_SUM_EXPR (tmp2, arg3) */ @@ -1421,8 +1422,6 @@ DEFTREECODE (WIDEN_MULT_MINUS_EXPR, "widen_mult_minus_expr", tcc_expression, 3) the first argument from type t1 to type t2, and then shifting it by the second argument. */ DEFTREECODE (WIDEN_LSHIFT_EXPR, "widen_lshift_expr", tcc_binary, 2) -DEFTREECODE (WIDEN_PLUS_EXPR, "widen_plus_expr", tcc_binary, 2) -DEFTREECODE (WIDEN_MINUS_EXPR, "widen_minus_expr", tcc_binary, 2) /* Widening vector multiplication. The two operands are vectors with N elements of size S. 
Multiplying the @@ -1487,10 +1486,6 @@ DEFTREECODE (VEC_PACK_FLOAT_EXPR, "vec_pack_float_expr", tcc_binary, 2) */ DEFTREECODE (VEC_WIDEN_LSHIFT_HI_EXPR, "widen_lshift_hi_expr", tcc_binary, 2) DEFTREECODE (VEC_WIDEN_LSHIFT_LO_EXPR, "widen_lshift_lo_expr", tcc_binary, 2) -DEFTREECODE (VEC_WIDEN_PLUS_HI_EXPR, "widen_plus_hi_expr", tcc_binary, 2) -DEFTREECODE (VEC_WIDEN_PLUS_LO_EXPR, "widen_plus_lo_expr", tcc_binary, 2) -DEFTREECODE (VEC_WIDEN_MINUS_HI_EXPR, "widen_minus_hi_expr", tcc_binary, 2) -DEFTREECODE (VEC_WIDEN_MINUS_LO_EXPR, "widen_minus_lo_expr", tcc_binary, 2) /* PREDICT_EXPR. Specify hint for branch prediction. The PREDICT_EXPR_PREDICTOR specify predictor and PREDICT_EXPR_OUTCOME the ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns 2023-04-20 13:23 ` Andre Vieira (lists) @ 2023-04-24 11:57 ` Richard Biener 2023-04-24 13:01 ` Richard Sandiford 2023-04-25 9:55 ` Andre Vieira (lists) 0 siblings, 2 replies; 53+ messages in thread From: Richard Biener @ 2023-04-24 11:57 UTC (permalink / raw) To: Andre Vieira (lists); +Cc: Richard Biener, Richard Sandiford, gcc-patches On Thu, Apr 20, 2023 at 3:24 PM Andre Vieira (lists) via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: > > Rebased all three patches and made some small changes to the second one: > - removed sub and abd optabs from commutative_optab_p, I suspect this > was a copy paste mistake, > - removed what I believe to be a superfluous switch case in vectorizable > conversion, the one that was here: > + if (code.is_fn_code ()) > + { > + internal_fn ifn = as_internal_fn (code.as_fn_code ()); > + int ecf_flags = internal_fn_flags (ifn); > + gcc_assert (ecf_flags & ECF_MULTI); > + > + switch (code.as_fn_code ()) > + { > + case CFN_VEC_WIDEN_PLUS: > + break; > + case CFN_VEC_WIDEN_MINUS: > + break; > + case CFN_LAST: > + default: > + return false; > + } > + > + internal_fn lo, hi; > + lookup_multi_internal_fn (ifn, &lo, &hi); > + *code1 = as_combined_fn (lo); > + *code2 = as_combined_fn (hi); > + optab1 = lookup_multi_ifn_optab (lo, !TYPE_UNSIGNED (vectype)); > + optab2 = lookup_multi_ifn_optab (hi, !TYPE_UNSIGNED (vectype)); > } > > I don't think we need to check they are a specfic fn code, as we look-up > optabs and if they succeed then surely we can vectorize? > > OK for trunk? In the first patch I see some uses of safe_as_tree_code like + if (ch.is_tree_code ()) + return op1 == NULL_TREE ? gimple_build_assign (lhs, ch.safe_as_tree_code (), + op0) : + gimple_build_assign (lhs, ch.safe_as_tree_code (), + op0, op1); + else + { + internal_fn fn = as_internal_fn (ch.safe_as_fn_code ()); + gimple* stmt; where the context actually requires a valid tree code. 
Please change those to force to tree code / ifn code. Just use explicit casts here and the other places that are similar. Before the as_internal_fn just put a gcc_assert (ch.is_internal_fn ()). Maybe the need for the (ugly) safe_as_tree_code/fn_code goes away then. Otherwise patch1 looks OK. Unfortunately there are no ChangeLog / patch descriptions on the changes. patch2 has - tree_code rhs_code = gimple_assign_rhs_code (assign); - if (rhs_code != code && rhs_code != widened_code) + code_helper rhs_code; + if (is_gimple_assign (stmt)) + { + rhs_code = gimple_assign_rhs_code (stmt); + if (rhs_code.safe_as_tree_code () != code + && rhs_code.get_rep () != widened_code.get_rep ()) + return 0; + } + else if (is_gimple_call (stmt)) + { + rhs_code = gimple_call_combined_fn (stmt); + if (rhs_code.get_rep () != widened_code.get_rep ()) + return 0; + } that looks mightly complicated - esp. the use of get_rep () looks dangerous? What's the intent of this? Not that I understand the existing code much. A comment would clearly help (also indicating test coverage). > Kind regards, > Andre ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns 2023-04-24 11:57 ` Richard Biener @ 2023-04-24 13:01 ` Richard Sandiford 2023-04-25 12:30 ` Richard Biener 2023-04-25 9:55 ` Andre Vieira (lists) 1 sibling, 1 reply; 53+ messages in thread From: Richard Sandiford @ 2023-04-24 13:01 UTC (permalink / raw) To: Richard Biener; +Cc: Andre Vieira (lists), Richard Biener, gcc-patches Richard Biener <richard.guenther@gmail.com> writes: > On Thu, Apr 20, 2023 at 3:24 PM Andre Vieira (lists) via Gcc-patches > <gcc-patches@gcc.gnu.org> wrote: >> >> Rebased all three patches and made some small changes to the second one: >> - removed sub and abd optabs from commutative_optab_p, I suspect this >> was a copy paste mistake, >> - removed what I believe to be a superfluous switch case in vectorizable >> conversion, the one that was here: >> + if (code.is_fn_code ()) >> + { >> + internal_fn ifn = as_internal_fn (code.as_fn_code ()); >> + int ecf_flags = internal_fn_flags (ifn); >> + gcc_assert (ecf_flags & ECF_MULTI); >> + >> + switch (code.as_fn_code ()) >> + { >> + case CFN_VEC_WIDEN_PLUS: >> + break; >> + case CFN_VEC_WIDEN_MINUS: >> + break; >> + case CFN_LAST: >> + default: >> + return false; >> + } >> + >> + internal_fn lo, hi; >> + lookup_multi_internal_fn (ifn, &lo, &hi); >> + *code1 = as_combined_fn (lo); >> + *code2 = as_combined_fn (hi); >> + optab1 = lookup_multi_ifn_optab (lo, !TYPE_UNSIGNED (vectype)); >> + optab2 = lookup_multi_ifn_optab (hi, !TYPE_UNSIGNED (vectype)); >> } >> >> I don't think we need to check they are a specfic fn code, as we look-up >> optabs and if they succeed then surely we can vectorize? >> >> OK for trunk? > > In the first patch I see some uses of safe_as_tree_code like > > + if (ch.is_tree_code ()) > + return op1 == NULL_TREE ? 
gimple_build_assign (lhs, > ch.safe_as_tree_code (), > + op0) : > + gimple_build_assign (lhs, ch.safe_as_tree_code (), > + op0, op1); > + else > + { > + internal_fn fn = as_internal_fn (ch.safe_as_fn_code ()); > + gimple* stmt; > > where the context actually requires a valid tree code. Please change those > to force to tree code / ifn code. Just use explicit casts here and the other > places that are similar. Before the as_internal_fn just put a > gcc_assert (ch.is_internal_fn ()). Also, doesn't the above ?: simplify to the "else" arm? Null trailing arguments would be ignored for unary operators. I wasn't sure what to make of the op0 handling: > +/* Build a GIMPLE_ASSIGN or GIMPLE_CALL with the tree_code, > + or internal_fn contained in ch, respectively. */ > +gimple * > +vect_gimple_build (tree lhs, code_helper ch, tree op0, tree op1) > +{ > + if (op0 == NULL_TREE) > + return NULL; Can that happen, and if so, does returning null make sense? Maybe an assert would be safer. Thanks, Richard ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns 2023-04-24 13:01 ` Richard Sandiford @ 2023-04-25 12:30 ` Richard Biener 2023-04-28 16:06 ` Andre Vieira (lists) 0 siblings, 1 reply; 53+ messages in thread From: Richard Biener @ 2023-04-25 12:30 UTC (permalink / raw) To: Richard Sandiford; +Cc: Richard Biener, Andre Vieira (lists), gcc-patches On Mon, 24 Apr 2023, Richard Sandiford wrote: > Richard Biener <richard.guenther@gmail.com> writes: > > On Thu, Apr 20, 2023 at 3:24?PM Andre Vieira (lists) via Gcc-patches > > <gcc-patches@gcc.gnu.org> wrote: > >> > >> Rebased all three patches and made some small changes to the second one: > >> - removed sub and abd optabs from commutative_optab_p, I suspect this > >> was a copy paste mistake, > >> - removed what I believe to be a superfluous switch case in vectorizable > >> conversion, the one that was here: > >> + if (code.is_fn_code ()) > >> + { > >> + internal_fn ifn = as_internal_fn (code.as_fn_code ()); > >> + int ecf_flags = internal_fn_flags (ifn); > >> + gcc_assert (ecf_flags & ECF_MULTI); > >> + > >> + switch (code.as_fn_code ()) > >> + { > >> + case CFN_VEC_WIDEN_PLUS: > >> + break; > >> + case CFN_VEC_WIDEN_MINUS: > >> + break; > >> + case CFN_LAST: > >> + default: > >> + return false; > >> + } > >> + > >> + internal_fn lo, hi; > >> + lookup_multi_internal_fn (ifn, &lo, &hi); > >> + *code1 = as_combined_fn (lo); > >> + *code2 = as_combined_fn (hi); > >> + optab1 = lookup_multi_ifn_optab (lo, !TYPE_UNSIGNED (vectype)); > >> + optab2 = lookup_multi_ifn_optab (hi, !TYPE_UNSIGNED (vectype)); > >> } > >> > >> I don't think we need to check they are a specfic fn code, as we look-up > >> optabs and if they succeed then surely we can vectorize? > >> > >> OK for trunk? > > > > In the first patch I see some uses of safe_as_tree_code like > > > > + if (ch.is_tree_code ()) > > + return op1 == NULL_TREE ? 
gimple_build_assign (lhs, > > ch.safe_as_tree_code (), > > + op0) : > > + gimple_build_assign (lhs, ch.safe_as_tree_code (), > > + op0, op1); > > + else > > + { > > + internal_fn fn = as_internal_fn (ch.safe_as_fn_code ()); > > + gimple* stmt; > > > > where the context actually requires a valid tree code. Please change those > > to force to tree code / ifn code. Just use explicit casts here and the other > > places that are similar. Before the as_internal_fn just put a > > gcc_assert (ch.is_internal_fn ()). > > Also, doesn't the above ?: simplify to the "else" arm? Null trailing > arguments would be ignored for unary operators. > > I wasn't sure what to make of the op0 handling: > > > +/* Build a GIMPLE_ASSIGN or GIMPLE_CALL with the tree_code, > > + or internal_fn contained in ch, respectively. */ > > +gimple * > > +vect_gimple_build (tree lhs, code_helper ch, tree op0, tree op1) > > +{ > > + if (op0 == NULL_TREE) > > + return NULL; > > Can that happen, and if so, does returning null make sense? > Maybe an assert would be safer. Yeah, I was hoping to have a look whether the new gimple_build overloads could be used to make this all better (but hoped we can finally get this series in in some way). Richard. ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns 2023-04-25 12:30 ` Richard Biener @ 2023-04-28 16:06 ` Andre Vieira (lists) 0 siblings, 0 replies; 53+ messages in thread From: Andre Vieira (lists) @ 2023-04-28 16:06 UTC (permalink / raw) To: Richard Biener, Richard Sandiford; +Cc: Richard Biener, gcc-patches On 25/04/2023 13:30, Richard Biener wrote: > On Mon, 24 Apr 2023, Richard Sandiford wrote: > >> Richard Biener <richard.guenther@gmail.com> writes: >>> On Thu, Apr 20, 2023 at 3:24?PM Andre Vieira (lists) via Gcc-patches >>> <gcc-patches@gcc.gnu.org> wrote: >>>> >>>> Rebased all three patches and made some small changes to the second one: >>>> - removed sub and abd optabs from commutative_optab_p, I suspect this >>>> was a copy paste mistake, >>>> - removed what I believe to be a superfluous switch case in vectorizable >>>> conversion, the one that was here: >>>> + if (code.is_fn_code ()) >>>> + { >>>> + internal_fn ifn = as_internal_fn (code.as_fn_code ()); >>>> + int ecf_flags = internal_fn_flags (ifn); >>>> + gcc_assert (ecf_flags & ECF_MULTI); >>>> + >>>> + switch (code.as_fn_code ()) >>>> + { >>>> + case CFN_VEC_WIDEN_PLUS: >>>> + break; >>>> + case CFN_VEC_WIDEN_MINUS: >>>> + break; >>>> + case CFN_LAST: >>>> + default: >>>> + return false; >>>> + } >>>> + >>>> + internal_fn lo, hi; >>>> + lookup_multi_internal_fn (ifn, &lo, &hi); >>>> + *code1 = as_combined_fn (lo); >>>> + *code2 = as_combined_fn (hi); >>>> + optab1 = lookup_multi_ifn_optab (lo, !TYPE_UNSIGNED (vectype)); >>>> + optab2 = lookup_multi_ifn_optab (hi, !TYPE_UNSIGNED (vectype)); >>>> } >>>> >>>> I don't think we need to check they are a specfic fn code, as we look-up >>>> optabs and if they succeed then surely we can vectorize? >>>> >>>> OK for trunk? >>> >>> In the first patch I see some uses of safe_as_tree_code like >>> >>> + if (ch.is_tree_code ()) >>> + return op1 == NULL_TREE ? 
gimple_build_assign (lhs, >>> ch.safe_as_tree_code (), >>> + op0) : >>> + gimple_build_assign (lhs, ch.safe_as_tree_code (), >>> + op0, op1); >>> + else >>> + { >>> + internal_fn fn = as_internal_fn (ch.safe_as_fn_code ()); >>> + gimple* stmt; >>> >>> where the context actually requires a valid tree code. Please change those >>> to force to tree code / ifn code. Just use explicit casts here and the other >>> places that are similar. Before the as_internal_fn just put a >>> gcc_assert (ch.is_internal_fn ()). >> >> Also, doesn't the above ?: simplify to the "else" arm? Null trailing >> arguments would be ignored for unary operators. >> >> I wasn't sure what to make of the op0 handling: >> >>> +/* Build a GIMPLE_ASSIGN or GIMPLE_CALL with the tree_code, >>> + or internal_fn contained in ch, respectively. */ >>> +gimple * >>> +vect_gimple_build (tree lhs, code_helper ch, tree op0, tree op1) >>> +{ >>> + if (op0 == NULL_TREE) >>> + return NULL; >> >> Can that happen, and if so, does returning null make sense? >> Maybe an assert would be safer. > > Yeah, I was hoping to have a look whether the new gimple_build > overloads could be used to make this all better (but hoped we can > finally get this series in in some way). > > Richard. Yeah, in the newest version of the first patch of the series I found that most of the time I can get away with only really needing to distinguish between tree_code and internal_fn when building gimple, for which it currently uses vect_gimple_build, but it does feel like that could easily be a gimple function. Having said that, as I partially mention in the patch, I didn't rewrite the optabs-tree supportable_half_widening and supportable_conversion (or whatever they are called) because those also at some point need to access the stmt and there is a massive difference in how we handle gassigns and gcall's from that perspective, but maybe we can generalize that too somehow... 
Anyway, have a look at the new versions (posted just some minutes after the email I'm replying to, haha! timing :P) ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns 2023-04-24 11:57 ` Richard Biener 2023-04-24 13:01 ` Richard Sandiford @ 2023-04-25 9:55 ` Andre Vieira (lists) 2023-04-28 12:36 ` [PATCH 1/3] Refactor to allow internal_fn's Andre Vieira (lists) ` (2 more replies) 1 sibling, 3 replies; 53+ messages in thread From: Andre Vieira (lists) @ 2023-04-25 9:55 UTC (permalink / raw) To: Richard Biener; +Cc: Richard Biener, Richard Sandiford, gcc-patches On 24/04/2023 12:57, Richard Biener wrote: > On Thu, Apr 20, 2023 at 3:24 PM Andre Vieira (lists) via Gcc-patches > <gcc-patches@gcc.gnu.org> wrote: >> >> Rebased all three patches and made some small changes to the second one: >> - removed sub and abd optabs from commutative_optab_p, I suspect this >> was a copy-paste mistake, >> - removed what I believe to be a superfluous switch case in vectorizable >> conversion, the one that was here: >> + if (code.is_fn_code ()) >> + { >> + internal_fn ifn = as_internal_fn (code.as_fn_code ()); >> + int ecf_flags = internal_fn_flags (ifn); >> + gcc_assert (ecf_flags & ECF_MULTI); >> + >> + switch (code.as_fn_code ()) >> + { >> + case CFN_VEC_WIDEN_PLUS: >> + break; >> + case CFN_VEC_WIDEN_MINUS: >> + break; >> + case CFN_LAST: >> + default: >> + return false; >> + } >> + >> + internal_fn lo, hi; >> + lookup_multi_internal_fn (ifn, &lo, &hi); >> + *code1 = as_combined_fn (lo); >> + *code2 = as_combined_fn (hi); >> + optab1 = lookup_multi_ifn_optab (lo, !TYPE_UNSIGNED (vectype)); >> + optab2 = lookup_multi_ifn_optab (hi, !TYPE_UNSIGNED (vectype)); >> } >> >> I don't think we need to check they are a specific fn code, as we look up >> optabs and if they succeed then surely we can vectorize? >> >> OK for trunk? > > In the first patch I see some uses of safe_as_tree_code like > > + if (ch.is_tree_code ()) > + return op1 == NULL_TREE ?
gimple_build_assign (lhs, > ch.safe_as_tree_code (), > + op0) : > + gimple_build_assign (lhs, ch.safe_as_tree_code (), > + op0, op1); > + else > + { > + internal_fn fn = as_internal_fn (ch.safe_as_fn_code ()); > + gimple* stmt; > > where the context actually requires a valid tree code. Please change those > to force to tree code / ifn code. Just use explicit casts here and the other > places that are similar. Before the as_internal_fn just put a > gcc_assert (ch.is_internal_fn ()). > > Maybe the need for the (ugly) safe_as_tree_code/fn_code goes away then. > > Otherwise patch1 looks OK. > > Unfortunately there are no ChangeLog / patch descriptions on the changes. > patch2 has > > - tree_code rhs_code = gimple_assign_rhs_code (assign); > - if (rhs_code != code && rhs_code != widened_code) > + code_helper rhs_code; > + if (is_gimple_assign (stmt)) > + { > + rhs_code = gimple_assign_rhs_code (stmt); > + if (rhs_code.safe_as_tree_code () != code > + && rhs_code.get_rep () != widened_code.get_rep ()) > + return 0; > + } > + else if (is_gimple_call (stmt)) > + { > + rhs_code = gimple_call_combined_fn (stmt); > + if (rhs_code.get_rep () != widened_code.get_rep ()) > + return 0; > + } > > that looks mightily complicated - esp. the use of get_rep () > looks dangerous? What's the intent of this? Not that I > understand the existing code much. A comment would > clearly help (also indicating test coverage). I don't think the use of get_rep here is dangerous, it's meant to avoid having to check whether widened_code is the same 'kind' as rhs_code. With get_rep we don't have to do this check first because tree_codes will have positive reps and combined_fns negative reps.
Having said that, this can all be simplified and we don't need to use get_rep either, as the == operator has been overloaded to use get_rep and even to use the constructor on the rhs of the ==, so I suggest moving the check to after assigning rhs_code and just doing: if (rhs_code != code && rhs_code != widened_code) return 0; > > >> Kind regards, >> Andre ^ permalink raw reply [flat|nested] 53+ messages in thread
* [PATCH 1/3] Refactor to allow internal_fn's 2023-04-25 9:55 ` Andre Vieira (lists) @ 2023-04-28 12:36 ` Andre Vieira (lists) 2023-05-03 11:55 ` Richard Biener 2023-04-28 12:37 ` [PATCH 2/3] Refactor widen_plus as internal_fn Andre Vieira (lists) 2023-04-28 12:37 ` [PATCH 3/3] Remove widen_plus/minus_expr tree codes Andre Vieira (lists) 2 siblings, 1 reply; 53+ messages in thread From: Andre Vieira (lists) @ 2023-04-28 12:36 UTC (permalink / raw) To: Richard Biener; +Cc: Richard Biener, Richard Sandiford, gcc-patches [-- Attachment #1: Type: text/plain, Size: 1769 bytes --] Hi, I'm posting the patches separately now with ChangeLogs. I made the suggested changes and tried to simplify the code a bit further. Where internal to tree-vect-stmts I changed most functions to use code_helper to avoid having to check at places we didn't need to. I was trying to simplify things further by also modifying supportable_half_widening_operation and supportable_convert_operation but the result of that was that I ended up moving the code to cast to tree code inside them rather than at the call site and it didn't look simpler, so I left those. Though if we did make those changes we'd no longer need to keep around the tc1 variable in vectorizable_conversion... Let me know what you think. gcc/ChangeLog: 2023-04-28 Andre Vieira <andre.simoesdiasvieira@arm.com> Joel Hutton <joel.hutton@arm.com> * tree-vect-patterns.cc (vect_gimple_build): New Function. (vect_recog_widen_op_pattern): Refactor to use code_helper. * tree-vect-stmts.cc (vect_gen_widened_results_half): Likewise. (vect_create_vectorized_demotion_stmts): Likewise. (vect_create_vectorized_promotion_stmts): Likewise. (vect_create_half_widening_stmts): Likewise. (vectorizable_conversion): Likewise. (vectorizable_call): Likewise. (supportable_widening_operation): Likewise. (supportable_narrowing_operation): Likewise. (simple_integer_narrowing): Likewise. * tree-vectorizer.h (supportable_widening_operation): Likewise. 
(supportable_narrowing_operation): Likewise. (vect_gimple_build): New function prototype. * tree.h (code_helper::safe_as_tree_code): New function. (code_helper::safe_as_fn_code): New function. [-- Attachment #2: ifn0_v2.patch --] [-- Type: text/plain, Size: 22695 bytes --] diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc index 8802141cd6edb298866025b8a55843eae1f0eb17..b35023adade94c1996cd076c4b7419560e819c6b 100644 --- a/gcc/tree-vect-patterns.cc +++ b/gcc/tree-vect-patterns.cc @@ -25,6 +25,8 @@ along with GCC; see the file COPYING3. If not see #include "rtl.h" #include "tree.h" #include "gimple.h" +#include "gimple-iterator.h" +#include "gimple-fold.h" #include "ssa.h" #include "expmed.h" #include "optabs-tree.h" @@ -1391,7 +1393,7 @@ vect_recog_sad_pattern (vec_info *vinfo, static gimple * vect_recog_widen_op_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info, tree *type_out, - tree_code orig_code, tree_code wide_code, + tree_code orig_code, code_helper wide_code, bool shift_p, const char *name) { gimple *last_stmt = last_stmt_info->stmt; @@ -1434,7 +1436,7 @@ vect_recog_widen_op_pattern (vec_info *vinfo, vecctype = get_vectype_for_scalar_type (vinfo, ctype); } - enum tree_code dummy_code; + code_helper dummy_code; int dummy_int; auto_vec<tree> dummy_vec; if (!vectype @@ -1455,8 +1457,7 @@ vect_recog_widen_op_pattern (vec_info *vinfo, 2, oprnd, half_type, unprom, vectype); tree var = vect_recog_temp_ssa_var (itype, NULL); - gimple *pattern_stmt = gimple_build_assign (var, wide_code, - oprnd[0], oprnd[1]); + gimple *pattern_stmt = vect_gimple_build (var, wide_code, oprnd[0], oprnd[1]); if (vecctype != vecitype) pattern_stmt = vect_convert_output (vinfo, last_stmt_info, ctype, @@ -6406,3 +6407,20 @@ vect_pattern_recog (vec_info *vinfo) /* After this no more add_stmt calls are allowed. */ vinfo->stmt_vec_info_ro = true; } + +/* Build a GIMPLE_ASSIGN or GIMPLE_CALL with the tree_code, + or internal_fn contained in ch, respectively. 
*/ +gimple * +vect_gimple_build (tree lhs, code_helper ch, tree op0, tree op1) +{ + gcc_assert (op0 != NULL_TREE); + if (ch.is_tree_code ()) + return gimple_build_assign (lhs, (tree_code) ch, op0, op1); + + gcc_assert (ch.is_internal_fn ()); + gimple* stmt = gimple_build_call_internal (as_internal_fn ((combined_fn) ch), + op1 == NULL_TREE ? 1 : 2, + op0, op1); + gimple_call_set_lhs (stmt, lhs); + return stmt; +} diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index 6b7dbfd4a231baec24e740ffe0ce0b0bf7a1de6b..ce47f4940fa9a1baca4ba1162065cfc3b4072eba 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -3258,13 +3258,13 @@ vectorizable_bswap (vec_info *vinfo, static bool simple_integer_narrowing (tree vectype_out, tree vectype_in, - tree_code *convert_code) + code_helper *convert_code) { if (!INTEGRAL_TYPE_P (TREE_TYPE (vectype_out)) || !INTEGRAL_TYPE_P (TREE_TYPE (vectype_in))) return false; - tree_code code; + code_helper code; int multi_step_cvt = 0; auto_vec <tree, 8> interm_types; if (!supportable_narrowing_operation (NOP_EXPR, vectype_out, vectype_in, @@ -3478,7 +3478,7 @@ vectorizable_call (vec_info *vinfo, tree callee = gimple_call_fndecl (stmt); /* First try using an internal function. 
*/ - tree_code convert_code = ERROR_MARK; + code_helper convert_code = MAX_TREE_CODES; if (cfn != CFN_LAST && (modifier == NONE || (modifier == NARROW @@ -3664,8 +3664,8 @@ vectorizable_call (vec_info *vinfo, continue; } new_temp = make_ssa_name (vec_dest); - new_stmt = gimple_build_assign (new_temp, convert_code, - prev_res, half_res); + new_stmt = vect_gimple_build (new_temp, convert_code, + prev_res, half_res); vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi); } @@ -3755,8 +3755,8 @@ vectorizable_call (vec_info *vinfo, continue; } new_temp = make_ssa_name (vec_dest); - new_stmt = gimple_build_assign (new_temp, convert_code, - prev_res, half_res); + new_stmt = vect_gimple_build (new_temp, convert_code, prev_res, + half_res); vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi); } else @@ -4768,7 +4768,7 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, STMT_INFO is the original scalar stmt that we are vectorizing. */ static gimple * -vect_gen_widened_results_half (vec_info *vinfo, enum tree_code code, +vect_gen_widened_results_half (vec_info *vinfo, code_helper ch, tree vec_oprnd0, tree vec_oprnd1, int op_type, tree vec_dest, gimple_stmt_iterator *gsi, stmt_vec_info stmt_info) @@ -4777,12 +4777,11 @@ vect_gen_widened_results_half (vec_info *vinfo, enum tree_code code, tree new_temp; /* Generate half of the widened result: */ - gcc_assert (op_type == TREE_CODE_LENGTH (code)); if (op_type != binary_op) vec_oprnd1 = NULL; - new_stmt = gimple_build_assign (vec_dest, code, vec_oprnd0, vec_oprnd1); + new_stmt = vect_gimple_build (vec_dest, ch, vec_oprnd0, vec_oprnd1); new_temp = make_ssa_name (vec_dest, new_stmt); - gimple_assign_set_lhs (new_stmt, new_temp); + gimple_set_lhs (new_stmt, new_temp); vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi); return new_stmt; @@ -4799,7 +4798,7 @@ vect_create_vectorized_demotion_stmts (vec_info *vinfo, vec<tree> *vec_oprnds, stmt_vec_info stmt_info, vec<tree> &vec_dsts, 
gimple_stmt_iterator *gsi, - slp_tree slp_node, enum tree_code code) + slp_tree slp_node, code_helper code) { unsigned int i; tree vop0, vop1, new_tmp, vec_dest; @@ -4811,9 +4810,9 @@ vect_create_vectorized_demotion_stmts (vec_info *vinfo, vec<tree> *vec_oprnds, /* Create demotion operation. */ vop0 = (*vec_oprnds)[i]; vop1 = (*vec_oprnds)[i + 1]; - gassign *new_stmt = gimple_build_assign (vec_dest, code, vop0, vop1); + gimple *new_stmt = vect_gimple_build (vec_dest, code, vop0, vop1); new_tmp = make_ssa_name (vec_dest, new_stmt); - gimple_assign_set_lhs (new_stmt, new_tmp); + gimple_set_lhs (new_stmt, new_tmp); vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi); if (multi_step_cvt) @@ -4861,8 +4860,8 @@ vect_create_vectorized_promotion_stmts (vec_info *vinfo, vec<tree> *vec_oprnds1, stmt_vec_info stmt_info, tree vec_dest, gimple_stmt_iterator *gsi, - enum tree_code code1, - enum tree_code code2, int op_type) + code_helper ch1, + code_helper ch2, int op_type) { int i; tree vop0, vop1, new_tmp1, new_tmp2; @@ -4878,10 +4877,10 @@ vect_create_vectorized_promotion_stmts (vec_info *vinfo, vop1 = NULL_TREE; /* Generate the two halves of promotion operation. 
*/ - new_stmt1 = vect_gen_widened_results_half (vinfo, code1, vop0, vop1, + new_stmt1 = vect_gen_widened_results_half (vinfo, ch1, vop0, vop1, op_type, vec_dest, gsi, stmt_info); - new_stmt2 = vect_gen_widened_results_half (vinfo, code2, vop0, vop1, + new_stmt2 = vect_gen_widened_results_half (vinfo, ch2, vop0, vop1, op_type, vec_dest, gsi, stmt_info); if (is_gimple_call (new_stmt1)) @@ -4912,7 +4911,7 @@ vect_create_half_widening_stmts (vec_info *vinfo, vec<tree> *vec_oprnds1, stmt_vec_info stmt_info, tree vec_dest, gimple_stmt_iterator *gsi, - enum tree_code code1, + code_helper code1, int op_type) { int i; @@ -4942,13 +4941,13 @@ vect_create_half_widening_stmts (vec_info *vinfo, new_stmt2 = gimple_build_assign (new_tmp2, NOP_EXPR, vop1); vect_finish_stmt_generation (vinfo, stmt_info, new_stmt2, gsi); /* Perform the operation. With both vector inputs widened. */ - new_stmt3 = gimple_build_assign (vec_dest, code1, new_tmp1, new_tmp2); + new_stmt3 = vect_gimple_build (vec_dest, code1, new_tmp1, new_tmp2); } else { /* Perform the operation. With the single vector input widened. */ - new_stmt3 = gimple_build_assign (vec_dest, code1, new_tmp1, vop1); - } + new_stmt3 = vect_gimple_build (vec_dest, code1, new_tmp1, vop1); + } new_tmp3 = make_ssa_name (vec_dest, new_stmt3); gimple_assign_set_lhs (new_stmt3, new_tmp3); @@ -4978,8 +4977,9 @@ vectorizable_conversion (vec_info *vinfo, tree scalar_dest; tree op0, op1 = NULL_TREE; loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo); - enum tree_code code, code1 = ERROR_MARK, code2 = ERROR_MARK; - enum tree_code codecvt1 = ERROR_MARK, codecvt2 = ERROR_MARK; + tree_code tc1; + code_helper code, code1, code2; + code_helper codecvt1 = ERROR_MARK, codecvt2 = ERROR_MARK; tree new_temp; enum vect_def_type dt[2] = {vect_unknown_def_type, vect_unknown_def_type}; int ndts = 2; @@ -5008,31 +5008,43 @@ vectorizable_conversion (vec_info *vinfo, && ! 
vec_stmt) return false; - gassign *stmt = dyn_cast <gassign *> (stmt_info->stmt); - if (!stmt) + gimple* stmt = stmt_info->stmt; + if (!(is_gimple_assign (stmt) || is_gimple_call (stmt))) return false; - if (TREE_CODE (gimple_assign_lhs (stmt)) != SSA_NAME) + if (gimple_get_lhs (stmt) == NULL_TREE + || TREE_CODE (gimple_get_lhs (stmt)) != SSA_NAME) return false; - code = gimple_assign_rhs_code (stmt); - if (!CONVERT_EXPR_CODE_P (code) - && code != FIX_TRUNC_EXPR - && code != FLOAT_EXPR - && code != WIDEN_PLUS_EXPR - && code != WIDEN_MINUS_EXPR - && code != WIDEN_MULT_EXPR - && code != WIDEN_LSHIFT_EXPR) + if (TREE_CODE (gimple_get_lhs (stmt)) != SSA_NAME) + return false; + + if (is_gimple_assign (stmt)) + { + code = gimple_assign_rhs_code (stmt); + op_type = TREE_CODE_LENGTH ((tree_code) code); + } + else if (gimple_call_internal_p (stmt)) + { + code = gimple_call_internal_fn (stmt); + op_type = gimple_call_num_args (stmt); + } + else return false; bool widen_arith = (code == WIDEN_PLUS_EXPR - || code == WIDEN_MINUS_EXPR - || code == WIDEN_MULT_EXPR - || code == WIDEN_LSHIFT_EXPR); - op_type = TREE_CODE_LENGTH (code); + || code == WIDEN_MINUS_EXPR + || code == WIDEN_MULT_EXPR + || code == WIDEN_LSHIFT_EXPR); + + if (!widen_arith + && !CONVERT_EXPR_CODE_P (code) + && code != FIX_TRUNC_EXPR + && code != FLOAT_EXPR) + return false; /* Check types of lhs and rhs. */ - scalar_dest = gimple_assign_lhs (stmt); + scalar_dest = gimple_get_lhs (stmt); lhs_type = TREE_TYPE (scalar_dest); vectype_out = STMT_VINFO_VECTYPE (stmt_info); @@ -5070,10 +5082,14 @@ vectorizable_conversion (vec_info *vinfo, if (op_type == binary_op) { - gcc_assert (code == WIDEN_MULT_EXPR || code == WIDEN_LSHIFT_EXPR - || code == WIDEN_PLUS_EXPR || code == WIDEN_MINUS_EXPR); + gcc_assert (code == WIDEN_MULT_EXPR + || code == WIDEN_LSHIFT_EXPR + || code == WIDEN_PLUS_EXPR + || code == WIDEN_MINUS_EXPR); + - op1 = gimple_assign_rhs2 (stmt); + op1 = is_gimple_assign (stmt) ? 
gimple_assign_rhs2 (stmt) : + gimple_call_arg (stmt, 0); tree vectype1_in; if (!vect_is_simple_use (vinfo, stmt_info, slp_node, 1, &op1, &slp_op1, &dt[1], &vectype1_in)) @@ -5157,8 +5173,13 @@ vectorizable_conversion (vec_info *vinfo, && code != FLOAT_EXPR && !CONVERT_EXPR_CODE_P (code)) return false; - if (supportable_convert_operation (code, vectype_out, vectype_in, &code1)) + gcc_assert (code.is_tree_code ()); + if (supportable_convert_operation ((tree_code) code, vectype_out, + vectype_in, &tc1)) + { + code1 = tc1; break; + } /* FALLTHRU */ unsupported: if (dump_enabled_p ()) @@ -5169,9 +5190,12 @@ vectorizable_conversion (vec_info *vinfo, case WIDEN: if (known_eq (nunits_in, nunits_out)) { - if (!supportable_half_widening_operation (code, vectype_out, - vectype_in, &code1)) + if (!(code.is_tree_code () + && supportable_half_widening_operation ((tree_code) code, + vectype_out, vectype_in, + &tc1))) goto unsupported; + code1 = tc1; gcc_assert (!(multi_step_cvt && op_type == binary_op)); break; } @@ -5205,14 +5229,17 @@ vectorizable_conversion (vec_info *vinfo, if (GET_MODE_SIZE (rhs_mode) == fltsz) { - if (!supportable_convert_operation (code, vectype_out, - cvt_type, &codecvt1)) + tc1 = ERROR_MARK; + gcc_assert (code.is_tree_code ()); + if (!supportable_convert_operation ((tree_code) code, vectype_out, + cvt_type, &tc1)) goto unsupported; + codecvt1 = tc1; } - else if (!supportable_widening_operation (vinfo, code, stmt_info, - vectype_out, cvt_type, - &codecvt1, &codecvt2, - &multi_step_cvt, + else if (!supportable_widening_operation (vinfo, code, + stmt_info, vectype_out, + cvt_type, &codecvt1, + &codecvt2, &multi_step_cvt, &interm_types)) continue; else @@ -5220,8 +5247,9 @@ vectorizable_conversion (vec_info *vinfo, if (supportable_widening_operation (vinfo, NOP_EXPR, stmt_info, cvt_type, - vectype_in, &code1, &code2, - &multi_step_cvt, &interm_types)) + vectype_in, &code1, + &code2, &multi_step_cvt, + &interm_types)) { found_mode = true; break; @@ -5257,9 
+5285,11 @@ vectorizable_conversion (vec_info *vinfo, cvt_type = get_same_sized_vectype (cvt_type, vectype_in); if (cvt_type == NULL_TREE) goto unsupported; - if (!supportable_convert_operation (code, cvt_type, vectype_in, - &codecvt1)) + if (!code.is_tree_code () + || !supportable_convert_operation ((tree_code) code, cvt_type, + vectype_in, &tc1)) goto unsupported; + codecvt1 = tc1; if (supportable_narrowing_operation (NOP_EXPR, vectype_out, cvt_type, &code1, &multi_step_cvt, &interm_types)) @@ -5377,10 +5407,9 @@ vectorizable_conversion (vec_info *vinfo, FOR_EACH_VEC_ELT (vec_oprnds0, i, vop0) { /* Arguments are ready, create the new vector stmt. */ - gcc_assert (TREE_CODE_LENGTH (code1) == unary_op); - gassign *new_stmt = gimple_build_assign (vec_dest, code1, vop0); + gimple *new_stmt = vect_gimple_build (vec_dest, code1, vop0); new_temp = make_ssa_name (vec_dest, new_stmt); - gimple_assign_set_lhs (new_stmt, new_temp); + gimple_set_lhs (new_stmt, new_temp); vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi); if (slp_node) @@ -5410,17 +5439,16 @@ vectorizable_conversion (vec_info *vinfo, for (i = multi_step_cvt; i >= 0; i--) { tree this_dest = vec_dsts[i]; - enum tree_code c1 = code1, c2 = code2; + code_helper c1 = code1, c2 = code2; if (i == 0 && codecvt2 != ERROR_MARK) { c1 = codecvt1; c2 = codecvt2; } if (known_eq (nunits_out, nunits_in)) - vect_create_half_widening_stmts (vinfo, &vec_oprnds0, - &vec_oprnds1, stmt_info, - this_dest, gsi, - c1, op_type); + vect_create_half_widening_stmts (vinfo, &vec_oprnds0, &vec_oprnds1, + stmt_info, this_dest, gsi, c1, + op_type); else vect_create_vectorized_promotion_stmts (vinfo, &vec_oprnds0, &vec_oprnds1, stmt_info, @@ -5433,9 +5461,8 @@ vectorizable_conversion (vec_info *vinfo, gimple *new_stmt; if (cvt_type) { - gcc_assert (TREE_CODE_LENGTH (codecvt1) == unary_op); new_temp = make_ssa_name (vec_dest); - new_stmt = gimple_build_assign (new_temp, codecvt1, vop0); + new_stmt = vect_gimple_build (new_temp, 
codecvt1, vop0); vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi); } else @@ -5459,10 +5486,8 @@ vectorizable_conversion (vec_info *vinfo, if (cvt_type) FOR_EACH_VEC_ELT (vec_oprnds0, i, vop0) { - gcc_assert (TREE_CODE_LENGTH (codecvt1) == unary_op); new_temp = make_ssa_name (vec_dest); - gassign *new_stmt - = gimple_build_assign (new_temp, codecvt1, vop0); + gimple *new_stmt = vect_gimple_build (new_temp, codecvt1, vop0); vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi); vec_oprnds0[i] = new_temp; } @@ -12151,9 +12176,11 @@ vect_maybe_update_slp_op_vectype (slp_tree op, tree vectype) bool supportable_widening_operation (vec_info *vinfo, - enum tree_code code, stmt_vec_info stmt_info, + code_helper code, + stmt_vec_info stmt_info, tree vectype_out, tree vectype_in, - enum tree_code *code1, enum tree_code *code2, + code_helper *code1, + code_helper *code2, int *multi_step_cvt, vec<tree> *interm_types) { @@ -12164,7 +12191,7 @@ supportable_widening_operation (vec_info *vinfo, optab optab1, optab2; tree vectype = vectype_in; tree wide_vectype = vectype_out; - enum tree_code c1, c2; + tree_code c1 = MAX_TREE_CODES, c2 = MAX_TREE_CODES; int i; tree prev_type, intermediate_type; machine_mode intermediate_mode, prev_mode; @@ -12174,8 +12201,12 @@ supportable_widening_operation (vec_info *vinfo, if (loop_info) vect_loop = LOOP_VINFO_LOOP (loop_info); - switch (code) + switch (code.safe_as_tree_code ()) { + case MAX_TREE_CODES: + /* Don't set c1 and c2 if code is not a tree_code. */ + break; + case WIDEN_MULT_EXPR: /* The result of a vectorized widening operation usually requires two vectors (because the widened results do not fit into one vector). 
@@ -12215,8 +12246,9 @@ supportable_widening_operation (vec_info *vinfo, && !nested_in_vect_loop_p (vect_loop, stmt_info) && supportable_widening_operation (vinfo, VEC_WIDEN_MULT_EVEN_EXPR, stmt_info, vectype_out, - vectype_in, code1, code2, - multi_step_cvt, interm_types)) + vectype_in, code1, + code2, multi_step_cvt, + interm_types)) { /* Elements in a vector with vect_used_by_reduction property cannot be reordered if the use chain with this property does not have the @@ -12292,7 +12324,7 @@ supportable_widening_operation (vec_info *vinfo, optab1 = optab_for_tree_code (c1, vectype_out, optab_default); optab2 = optab_for_tree_code (c2, vectype_out, optab_default); } - else if (CONVERT_EXPR_CODE_P (code) + else if (CONVERT_EXPR_CODE_P (code.safe_as_tree_code ()) && VECTOR_BOOLEAN_TYPE_P (wide_vectype) && VECTOR_BOOLEAN_TYPE_P (vectype) && TYPE_MODE (wide_vectype) == TYPE_MODE (vectype) @@ -12317,8 +12349,12 @@ supportable_widening_operation (vec_info *vinfo, || (icode2 = optab_handler (optab2, vec_mode)) == CODE_FOR_nothing) return false; - *code1 = c1; - *code2 = c2; + if (code.is_tree_code ()) + { + *code1 = c1; + *code2 = c2; + } + if (insn_data[icode1].operand[0].mode == TYPE_MODE (wide_vectype) && insn_data[icode2].operand[0].mode == TYPE_MODE (wide_vectype)) @@ -12339,7 +12375,7 @@ supportable_widening_operation (vec_info *vinfo, prev_type = vectype; prev_mode = vec_mode; - if (!CONVERT_EXPR_CODE_P (code)) + if (!CONVERT_EXPR_CODE_P ((tree_code) code)) return false; /* We assume here that there will not be more than MAX_INTERM_CVT_STEPS @@ -12437,9 +12473,9 @@ supportable_widening_operation (vec_info *vinfo, narrowing operation (short in the above example). 
*/ bool -supportable_narrowing_operation (enum tree_code code, +supportable_narrowing_operation (code_helper code, tree vectype_out, tree vectype_in, - enum tree_code *code1, int *multi_step_cvt, + code_helper *code1, int *multi_step_cvt, vec<tree> *interm_types) { machine_mode vec_mode; @@ -12454,8 +12490,11 @@ supportable_narrowing_operation (enum tree_code code, unsigned HOST_WIDE_INT n_elts; bool uns; + if (!code.is_tree_code ()) + return false; + *multi_step_cvt = 0; - switch (code) + switch ((tree_code) code) { CASE_CONVERT: c1 = VEC_PACK_TRUNC_EXPR; diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h index 9cf2fb23fe397b467d89aa7cc5ebeaa293ed4cce..f215cd0639bcf803c9d0554cfdc57823431991d5 100644 --- a/gcc/tree-vectorizer.h +++ b/gcc/tree-vectorizer.h @@ -2139,13 +2139,12 @@ extern bool vect_is_simple_use (vec_info *, stmt_vec_info, slp_tree, enum vect_def_type *, tree *, stmt_vec_info * = NULL); extern bool vect_maybe_update_slp_op_vectype (slp_tree, tree); -extern bool supportable_widening_operation (vec_info *, - enum tree_code, stmt_vec_info, - tree, tree, enum tree_code *, - enum tree_code *, int *, - vec<tree> *); -extern bool supportable_narrowing_operation (enum tree_code, tree, tree, - enum tree_code *, int *, +extern bool supportable_widening_operation (vec_info*, code_helper, + stmt_vec_info, tree, tree, + code_helper*, code_helper*, + int*, vec<tree> *); +extern bool supportable_narrowing_operation (code_helper, tree, tree, + code_helper *, int *, vec<tree> *); extern unsigned record_stmt_cost (stmt_vector_for_cost *, int, @@ -2583,4 +2582,7 @@ vect_is_integer_truncation (stmt_vec_info stmt_info) && TYPE_PRECISION (lhs_type) < TYPE_PRECISION (rhs_type)); } +/* Build a GIMPLE_ASSIGN or GIMPLE_CALL with the tree_code, + or internal_fn contained in ch, respectively. 
*/ +gimple * vect_gimple_build (tree, code_helper, tree, tree = NULL_TREE); #endif /* GCC_TREE_VECTORIZER_H */ diff --git a/gcc/tree.h b/gcc/tree.h index abcdb5638d49aea4ccc46efa8e540b1fa78aa27a..f6cd528e7d789c3f81fb2da3c1e1a29fa11f6e0f 100644 --- a/gcc/tree.h +++ b/gcc/tree.h @@ -93,6 +93,8 @@ public: bool is_internal_fn () const; bool is_builtin_fn () const; int get_rep () const { return rep; } + enum tree_code safe_as_tree_code () const; + combined_fn safe_as_fn_code () const; bool operator== (const code_helper &other) { return rep == other.rep; } bool operator!= (const code_helper &other) { return rep != other.rep; } bool operator== (tree_code c) { return rep == code_helper (c).rep; } @@ -102,6 +104,17 @@ private: int rep; }; +inline enum tree_code +code_helper::safe_as_tree_code () const +{ + return is_tree_code () ? (tree_code) *this : MAX_TREE_CODES; +} + +inline combined_fn +code_helper::safe_as_fn_code () const { + return is_fn_code () ? (combined_fn) *this : CFN_LAST; +} + inline code_helper::operator internal_fn () const { return as_internal_fn (combined_fn (*this)); ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 1/3] Refactor to allow internal_fn's 2023-04-28 12:36 ` [PATCH 1/3] Refactor to allow internal_fn's Andre Vieira (lists) @ 2023-05-03 11:55 ` Richard Biener 2023-05-04 15:20 ` Andre Vieira (lists) 0 siblings, 1 reply; 53+ messages in thread From: Richard Biener @ 2023-05-03 11:55 UTC (permalink / raw) To: Andre Vieira (lists); +Cc: Richard Biener, Richard Sandiford, gcc-patches On Fri, 28 Apr 2023, Andre Vieira (lists) wrote: > Hi, > > I'm posting the patches separately now with ChangeLogs. > > I made the suggested changes and tried to simplify the code a bit further. > Where internal to tree-vect-stmts I changed most functions to use code_helper > to avoid having to check at places we didn't need to. I was trying to simplify > things further by also modifying supportable_half_widening_operation and > supportable_convert_operation but the result of that was that I ended up > moving the code to cast to tree code inside them rather than at the call site > and it didn't look simpler, so I left those. Though if we did make those > changes we'd no longer need to keep around the tc1 variable in > vectorizable_conversion... Let me know what you think. I see that - else if (CONVERT_EXPR_CODE_P (code) + else if (CONVERT_EXPR_CODE_P (code.safe_as_tree_code ()) is convenient (as much as I dislike safe_as_tree_code). Isn't the following - if (!CONVERT_EXPR_CODE_P (code)) + if (!CONVERT_EXPR_CODE_P ((tree_code) code)) return false; then wrong? In other places you added an assert - I assume that we might want a checking assert in the cast operators? (those were added mainly for convenience, maybe we want as_a <>, etc. here - not sure if those will play well with enums though). Just suggestions for eventual followups in this area. +inline enum tree_code +code_helper::safe_as_tree_code () const +{ + return is_tree_code () ? (tree_code) *this : MAX_TREE_CODES; +} + +inline combined_fn +code_helper::safe_as_fn_code () const { + return is_fn_code () ?
(combined_fn) *this : CFN_LAST; +} + newline after the last 'const'. Can you place a comment before these to explain their intended use? Aka give the case the code isn't the desired kind a safe value? The patch is OK with just the last bit fixed. Thanks, Richard. > gcc/ChangeLog: > > 2023-04-28 Andre Vieira <andre.simoesdiasvieira@arm.com> > Joel Hutton <joel.hutton@arm.com> > > * tree-vect-patterns.cc (vect_gimple_build): New Function. > (vect_recog_widen_op_pattern): Refactor to use code_helper. > * tree-vect-stmts.cc (vect_gen_widened_results_half): Likewise. > (vect_create_vectorized_demotion_stmts): Likewise. > (vect_create_vectorized_promotion_stmts): Likewise. > (vect_create_half_widening_stmts): Likewise. > (vectorizable_conversion): Likewise. > (vectorizable_call): Likewise. > (supportable_widening_operation): Likewise. > (supportable_narrowing_operation): Likewise. > (simple_integer_narrowing): Likewise. > * tree-vectorizer.h (supportable_widening_operation): Likewise. > (supportable_narrowing_operation): Likewise. > (vect_gimple_build): New function prototype. > * tree.h (code_helper::safe_as_tree_code): New function. > (code_helper::safe_as_fn_code): New function. > -- Richard Biener <rguenther@suse.de> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman; HRB 36809 (AG Nuernberg) ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 1/3] Refactor to allow internal_fn's 2023-05-03 11:55 ` Richard Biener @ 2023-05-04 15:20 ` Andre Vieira (lists) 2023-05-05 6:09 ` Richard Biener 0 siblings, 1 reply; 53+ messages in thread From: Andre Vieira (lists) @ 2023-05-04 15:20 UTC (permalink / raw) To: Richard Biener; +Cc: Richard Biener, Richard Sandiford, gcc-patches On 03/05/2023 12:55, Richard Biener wrote: > On Fri, 28 Apr 2023, Andre Vieira (lists) wrote: > >> Hi, >> >> I'm posting the patches separately now with ChangeLogs. >> >> I made the suggested changes and tried to simplify the code a bit further. >> Where internal to tree-vect-stmts I changed most functions to use code_helper >> to avoid having to check at places we didn't need to. I was trying to simplify >> things further by also modifying supportable_half_widening_operation and >> supportable_convert_operation but the result of that was that I ended up >> moving the code to cast to tree code inside them rather than at the call site >> and it didn't look simpler, so I left those. Though if we did make those >> changes we'd no longer need to keep around the tc1 variable in >> vectorizable_conversion... Let me know what you think. > > I see that > > - else if (CONVERT_EXPR_CODE_P (code) > + else if (CONVERT_EXPR_CODE_P (code.safe_as_tree_code ()) > > is convenient (as much as I dislike safe_as_tree_code). Isn't > the following > > - if (!CONVERT_EXPR_CODE_P (code)) > + if (!CONVERT_EXPR_CODE_P ((tree_code) code)) > return false; For some reason I thought the code could only reach here if code was a tree code, but I guess if we have an ifn and the modes aren't the same as the wide_vectype it would fall to this, which for an ifn would fail. I am wondering whether it needs to though, the multi-step widening should also work for ifn's no?
Though I'm thinking, maybe this should be a follow-up and just not have that 'feature' for now. The feature being, supporting multi-step conversion for new widening IFN's. ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 1/3] Refactor to allow internal_fn's 2023-05-04 15:20 ` Andre Vieira (lists) @ 2023-05-05 6:09 ` Richard Biener 2023-05-12 12:14 ` Andre Vieira (lists) 0 siblings, 1 reply; 53+ messages in thread From: Richard Biener @ 2023-05-05 6:09 UTC (permalink / raw) To: Andre Vieira (lists); +Cc: Richard Biener, Richard Sandiford, gcc-patches On Thu, 4 May 2023, Andre Vieira (lists) wrote: > > > On 03/05/2023 12:55, Richard Biener wrote: > > On Fri, 28 Apr 2023, Andre Vieira (lists) wrote: > > > >> Hi, > >> > >> I'm posting the patches separately now with ChangeLogs. > >> > >> I made the suggested changes and tried to simplify the code a bit further. > >> Where internal to tree-vect-stmts I changed most functions to use > >> code_helper > >> to avoid having to check at places we didn't need to. I was trying to > >> simplify > >> things further by also modifying supportable_half_widening_operation and > >> supportable_convert_operation but the result of that was that I ended up > >> moving the code to cast to tree code inside them rather than at the call > >> site > >> and it didn't look simpler, so I left those. Though if we did make those > >> changes we'd no longer need to keep around the tc1 variable in > >> vectorizable_conversion... Let me know what you think. > > > > I see that > > > > - else if (CONVERT_EXPR_CODE_P (code) > > + else if (CONVERT_EXPR_CODE_P (code.safe_as_tree_code ()) > > > > is convenient (as much as I dislike safe_as_tree_code). Isn't > > the following > > > > - if (!CONVERT_EXPR_CODE_P (code)) > > + if (!CONVERT_EXPR_CODE_P ((tree_code) code)) > > return false; > For some reason I thought the code could only reach here if code was a tree > code, but I guess if we have an ifn and the modes aren't the same as the > wide_vectype it would fall to this, which for an ifn this would fail. I am > wondering whether it needs to though, the multi-step widening should also work > for ifn's no? 
We'd need to adapt it, to not use c1, c2 but hi, lo in case of > ifn I guess.. and then use a different optab look up too? > > Though I'm thinking, maybe this should be a follow-up and just not have that > 'feature' for now. The feature being, supporting multi-step conversion for new > widening IFN's. Yes, I think we should address this in a followup if needed. Richard. ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 1/3] Refactor to allow internal_fn's 2023-05-05 6:09 ` Richard Biener @ 2023-05-12 12:14 ` Andre Vieira (lists) 2023-05-12 13:18 ` Richard Biener 0 siblings, 1 reply; 53+ messages in thread From: Andre Vieira (lists) @ 2023-05-12 12:14 UTC (permalink / raw) To: Richard Biener; +Cc: Richard Biener, Richard Sandiford, gcc-patches [-- Attachment #1: Type: text/plain, Size: 1132 bytes --] Hi, I think I tackled all of your comments, let me know if I missed something. gcc/ChangeLog: 2023-05-12 Andre Vieira <andre.simoesdiasvieira@arm.com> Joel Hutton <joel.hutton@arm.com> * tree-vect-patterns.cc (vect_gimple_build): New Function. (vect_recog_widen_op_pattern): Refactor to use code_helper. * tree-vect-stmts.cc (vect_gen_widened_results_half): Likewise. (vect_create_vectorized_demotion_stmts): Likewise. (vect_create_vectorized_promotion_stmts): Likewise. (vect_create_half_widening_stmts): Likewise. (vectorizable_conversion): Likewise. (vectorizable_call): Likewise. (supportable_widening_operation): Likewise. (supportable_narrowing_operation): Likewise. (simple_integer_narrowing): Likewise. * tree-vectorizer.h (supportable_widening_operation): Likewise. (supportable_narrowing_operation): Likewise. (vect_gimple_build): New function prototype. * tree.h (code_helper::safe_as_tree_code): New function. (code_helper::safe_as_fn_code): New function. [-- Attachment #2: ifn0v3.patch --] [-- Type: text/plain, Size: 23124 bytes --] diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc index 33a8b2bb60601dc1a67de62a56bbf3c355e12dbd..1778af0242898e3dc73d94d22a5b8505628a53b5 100644 --- a/gcc/tree-vect-patterns.cc +++ b/gcc/tree-vect-patterns.cc @@ -25,6 +25,8 @@ along with GCC; see the file COPYING3. 
If not see #include "rtl.h" #include "tree.h" #include "gimple.h" +#include "gimple-iterator.h" +#include "gimple-fold.h" #include "ssa.h" #include "expmed.h" #include "optabs-tree.h" @@ -1392,7 +1394,7 @@ vect_recog_sad_pattern (vec_info *vinfo, static gimple * vect_recog_widen_op_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info, tree *type_out, - tree_code orig_code, tree_code wide_code, + tree_code orig_code, code_helper wide_code, bool shift_p, const char *name) { gimple *last_stmt = last_stmt_info->stmt; @@ -1435,7 +1437,7 @@ vect_recog_widen_op_pattern (vec_info *vinfo, vecctype = get_vectype_for_scalar_type (vinfo, ctype); } - enum tree_code dummy_code; + code_helper dummy_code; int dummy_int; auto_vec<tree> dummy_vec; if (!vectype @@ -1456,8 +1458,7 @@ vect_recog_widen_op_pattern (vec_info *vinfo, 2, oprnd, half_type, unprom, vectype); tree var = vect_recog_temp_ssa_var (itype, NULL); - gimple *pattern_stmt = gimple_build_assign (var, wide_code, - oprnd[0], oprnd[1]); + gimple *pattern_stmt = vect_gimple_build (var, wide_code, oprnd[0], oprnd[1]); if (vecctype != vecitype) pattern_stmt = vect_convert_output (vinfo, last_stmt_info, ctype, @@ -6808,3 +6809,20 @@ vect_pattern_recog (vec_info *vinfo) /* After this no more add_stmt calls are allowed. */ vinfo->stmt_vec_info_ro = true; } + +/* Build a GIMPLE_ASSIGN or GIMPLE_CALL with the tree_code, + or internal_fn contained in ch, respectively. */ +gimple * +vect_gimple_build (tree lhs, code_helper ch, tree op0, tree op1) +{ + gcc_assert (op0 != NULL_TREE); + if (ch.is_tree_code ()) + return gimple_build_assign (lhs, (tree_code) ch, op0, op1); + + gcc_assert (ch.is_internal_fn ()); + gimple* stmt = gimple_build_call_internal (as_internal_fn ((combined_fn) ch), + op1 == NULL_TREE ? 
1 : 2, + op0, op1); + gimple_call_set_lhs (stmt, lhs); + return stmt; +} diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index 61a2da4ecee9c449c1469cab3c4cfa1a782471d5..d152ae9ab10b361b88c0f839d6951c43b954750a 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -3261,13 +3261,13 @@ vectorizable_bswap (vec_info *vinfo, static bool simple_integer_narrowing (tree vectype_out, tree vectype_in, - tree_code *convert_code) + code_helper *convert_code) { if (!INTEGRAL_TYPE_P (TREE_TYPE (vectype_out)) || !INTEGRAL_TYPE_P (TREE_TYPE (vectype_in))) return false; - tree_code code; + code_helper code; int multi_step_cvt = 0; auto_vec <tree, 8> interm_types; if (!supportable_narrowing_operation (NOP_EXPR, vectype_out, vectype_in, @@ -3481,7 +3481,7 @@ vectorizable_call (vec_info *vinfo, tree callee = gimple_call_fndecl (stmt); /* First try using an internal function. */ - tree_code convert_code = ERROR_MARK; + code_helper convert_code = MAX_TREE_CODES; if (cfn != CFN_LAST && (modifier == NONE || (modifier == NARROW @@ -3667,8 +3667,8 @@ vectorizable_call (vec_info *vinfo, continue; } new_temp = make_ssa_name (vec_dest); - new_stmt = gimple_build_assign (new_temp, convert_code, - prev_res, half_res); + new_stmt = vect_gimple_build (new_temp, convert_code, + prev_res, half_res); vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi); } @@ -3758,8 +3758,8 @@ vectorizable_call (vec_info *vinfo, continue; } new_temp = make_ssa_name (vec_dest); - new_stmt = gimple_build_assign (new_temp, convert_code, - prev_res, half_res); + new_stmt = vect_gimple_build (new_temp, convert_code, prev_res, + half_res); vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi); } else @@ -4771,7 +4771,7 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, STMT_INFO is the original scalar stmt that we are vectorizing. 
*/ static gimple * -vect_gen_widened_results_half (vec_info *vinfo, enum tree_code code, +vect_gen_widened_results_half (vec_info *vinfo, code_helper ch, tree vec_oprnd0, tree vec_oprnd1, int op_type, tree vec_dest, gimple_stmt_iterator *gsi, stmt_vec_info stmt_info) @@ -4780,12 +4780,11 @@ vect_gen_widened_results_half (vec_info *vinfo, enum tree_code code, tree new_temp; /* Generate half of the widened result: */ - gcc_assert (op_type == TREE_CODE_LENGTH (code)); if (op_type != binary_op) vec_oprnd1 = NULL; - new_stmt = gimple_build_assign (vec_dest, code, vec_oprnd0, vec_oprnd1); + new_stmt = vect_gimple_build (vec_dest, ch, vec_oprnd0, vec_oprnd1); new_temp = make_ssa_name (vec_dest, new_stmt); - gimple_assign_set_lhs (new_stmt, new_temp); + gimple_set_lhs (new_stmt, new_temp); vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi); return new_stmt; @@ -4802,7 +4801,7 @@ vect_create_vectorized_demotion_stmts (vec_info *vinfo, vec<tree> *vec_oprnds, stmt_vec_info stmt_info, vec<tree> &vec_dsts, gimple_stmt_iterator *gsi, - slp_tree slp_node, enum tree_code code) + slp_tree slp_node, code_helper code) { unsigned int i; tree vop0, vop1, new_tmp, vec_dest; @@ -4814,9 +4813,9 @@ vect_create_vectorized_demotion_stmts (vec_info *vinfo, vec<tree> *vec_oprnds, /* Create demotion operation. 
*/ vop0 = (*vec_oprnds)[i]; vop1 = (*vec_oprnds)[i + 1]; - gassign *new_stmt = gimple_build_assign (vec_dest, code, vop0, vop1); + gimple *new_stmt = vect_gimple_build (vec_dest, code, vop0, vop1); new_tmp = make_ssa_name (vec_dest, new_stmt); - gimple_assign_set_lhs (new_stmt, new_tmp); + gimple_set_lhs (new_stmt, new_tmp); vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi); if (multi_step_cvt) @@ -4864,8 +4863,8 @@ vect_create_vectorized_promotion_stmts (vec_info *vinfo, vec<tree> *vec_oprnds1, stmt_vec_info stmt_info, tree vec_dest, gimple_stmt_iterator *gsi, - enum tree_code code1, - enum tree_code code2, int op_type) + code_helper ch1, + code_helper ch2, int op_type) { int i; tree vop0, vop1, new_tmp1, new_tmp2; @@ -4881,10 +4880,10 @@ vect_create_vectorized_promotion_stmts (vec_info *vinfo, vop1 = NULL_TREE; /* Generate the two halves of promotion operation. */ - new_stmt1 = vect_gen_widened_results_half (vinfo, code1, vop0, vop1, + new_stmt1 = vect_gen_widened_results_half (vinfo, ch1, vop0, vop1, op_type, vec_dest, gsi, stmt_info); - new_stmt2 = vect_gen_widened_results_half (vinfo, code2, vop0, vop1, + new_stmt2 = vect_gen_widened_results_half (vinfo, ch2, vop0, vop1, op_type, vec_dest, gsi, stmt_info); if (is_gimple_call (new_stmt1)) @@ -4915,7 +4914,7 @@ vect_create_half_widening_stmts (vec_info *vinfo, vec<tree> *vec_oprnds1, stmt_vec_info stmt_info, tree vec_dest, gimple_stmt_iterator *gsi, - enum tree_code code1, + code_helper code1, int op_type) { int i; @@ -4945,13 +4944,13 @@ vect_create_half_widening_stmts (vec_info *vinfo, new_stmt2 = gimple_build_assign (new_tmp2, NOP_EXPR, vop1); vect_finish_stmt_generation (vinfo, stmt_info, new_stmt2, gsi); /* Perform the operation. With both vector inputs widened. */ - new_stmt3 = gimple_build_assign (vec_dest, code1, new_tmp1, new_tmp2); + new_stmt3 = vect_gimple_build (vec_dest, code1, new_tmp1, new_tmp2); } else { /* Perform the operation. With the single vector input widened. 
*/ - new_stmt3 = gimple_build_assign (vec_dest, code1, new_tmp1, vop1); - } + new_stmt3 = vect_gimple_build (vec_dest, code1, new_tmp1, vop1); + } new_tmp3 = make_ssa_name (vec_dest, new_stmt3); gimple_assign_set_lhs (new_stmt3, new_tmp3); @@ -4981,8 +4980,9 @@ vectorizable_conversion (vec_info *vinfo, tree scalar_dest; tree op0, op1 = NULL_TREE; loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo); - enum tree_code code, code1 = ERROR_MARK, code2 = ERROR_MARK; - enum tree_code codecvt1 = ERROR_MARK, codecvt2 = ERROR_MARK; + tree_code tc1; + code_helper code, code1, code2; + code_helper codecvt1 = ERROR_MARK, codecvt2 = ERROR_MARK; tree new_temp; enum vect_def_type dt[2] = {vect_unknown_def_type, vect_unknown_def_type}; int ndts = 2; @@ -5011,31 +5011,43 @@ vectorizable_conversion (vec_info *vinfo, && ! vec_stmt) return false; - gassign *stmt = dyn_cast <gassign *> (stmt_info->stmt); - if (!stmt) + gimple* stmt = stmt_info->stmt; + if (!(is_gimple_assign (stmt) || is_gimple_call (stmt))) return false; - if (TREE_CODE (gimple_assign_lhs (stmt)) != SSA_NAME) + if (gimple_get_lhs (stmt) == NULL_TREE + || TREE_CODE (gimple_get_lhs (stmt)) != SSA_NAME) return false; - code = gimple_assign_rhs_code (stmt); - if (!CONVERT_EXPR_CODE_P (code) - && code != FIX_TRUNC_EXPR - && code != FLOAT_EXPR - && code != WIDEN_PLUS_EXPR - && code != WIDEN_MINUS_EXPR - && code != WIDEN_MULT_EXPR - && code != WIDEN_LSHIFT_EXPR) + if (TREE_CODE (gimple_get_lhs (stmt)) != SSA_NAME) + return false; + + if (is_gimple_assign (stmt)) + { + code = gimple_assign_rhs_code (stmt); + op_type = TREE_CODE_LENGTH ((tree_code) code); + } + else if (gimple_call_internal_p (stmt)) + { + code = gimple_call_internal_fn (stmt); + op_type = gimple_call_num_args (stmt); + } + else return false; bool widen_arith = (code == WIDEN_PLUS_EXPR - || code == WIDEN_MINUS_EXPR - || code == WIDEN_MULT_EXPR - || code == WIDEN_LSHIFT_EXPR); - op_type = TREE_CODE_LENGTH (code); + || code == WIDEN_MINUS_EXPR + || code 
== WIDEN_MULT_EXPR + || code == WIDEN_LSHIFT_EXPR); + + if (!widen_arith + && !CONVERT_EXPR_CODE_P (code) + && code != FIX_TRUNC_EXPR + && code != FLOAT_EXPR) + return false; /* Check types of lhs and rhs. */ - scalar_dest = gimple_assign_lhs (stmt); + scalar_dest = gimple_get_lhs (stmt); lhs_type = TREE_TYPE (scalar_dest); vectype_out = STMT_VINFO_VECTYPE (stmt_info); @@ -5073,10 +5085,14 @@ vectorizable_conversion (vec_info *vinfo, if (op_type == binary_op) { - gcc_assert (code == WIDEN_MULT_EXPR || code == WIDEN_LSHIFT_EXPR - || code == WIDEN_PLUS_EXPR || code == WIDEN_MINUS_EXPR); + gcc_assert (code == WIDEN_MULT_EXPR + || code == WIDEN_LSHIFT_EXPR + || code == WIDEN_PLUS_EXPR + || code == WIDEN_MINUS_EXPR); + - op1 = gimple_assign_rhs2 (stmt); + op1 = is_gimple_assign (stmt) ? gimple_assign_rhs2 (stmt) : + gimple_call_arg (stmt, 0); tree vectype1_in; if (!vect_is_simple_use (vinfo, stmt_info, slp_node, 1, &op1, &slp_op1, &dt[1], &vectype1_in)) @@ -5160,8 +5176,13 @@ vectorizable_conversion (vec_info *vinfo, && code != FLOAT_EXPR && !CONVERT_EXPR_CODE_P (code)) return false; - if (supportable_convert_operation (code, vectype_out, vectype_in, &code1)) + gcc_assert (code.is_tree_code ()); + if (supportable_convert_operation ((tree_code) code, vectype_out, + vectype_in, &tc1)) + { + code1 = tc1; break; + } /* FALLTHRU */ unsupported: if (dump_enabled_p ()) @@ -5172,9 +5193,12 @@ vectorizable_conversion (vec_info *vinfo, case WIDEN: if (known_eq (nunits_in, nunits_out)) { - if (!supportable_half_widening_operation (code, vectype_out, - vectype_in, &code1)) + if (!(code.is_tree_code () + && supportable_half_widening_operation ((tree_code) code, + vectype_out, vectype_in, + &tc1))) goto unsupported; + code1 = tc1; gcc_assert (!(multi_step_cvt && op_type == binary_op)); break; } @@ -5208,14 +5232,17 @@ vectorizable_conversion (vec_info *vinfo, if (GET_MODE_SIZE (rhs_mode) == fltsz) { - if (!supportable_convert_operation (code, vectype_out, - cvt_type, &codecvt1)) + 
tc1 = ERROR_MARK; + gcc_assert (code.is_tree_code ()); + if (!supportable_convert_operation ((tree_code) code, vectype_out, + cvt_type, &tc1)) goto unsupported; + codecvt1 = tc1; } - else if (!supportable_widening_operation (vinfo, code, stmt_info, - vectype_out, cvt_type, - &codecvt1, &codecvt2, - &multi_step_cvt, + else if (!supportable_widening_operation (vinfo, code, + stmt_info, vectype_out, + cvt_type, &codecvt1, + &codecvt2, &multi_step_cvt, &interm_types)) continue; else @@ -5223,8 +5250,9 @@ vectorizable_conversion (vec_info *vinfo, if (supportable_widening_operation (vinfo, NOP_EXPR, stmt_info, cvt_type, - vectype_in, &code1, &code2, - &multi_step_cvt, &interm_types)) + vectype_in, &code1, + &code2, &multi_step_cvt, + &interm_types)) { found_mode = true; break; @@ -5260,9 +5288,11 @@ vectorizable_conversion (vec_info *vinfo, cvt_type = get_same_sized_vectype (cvt_type, vectype_in); if (cvt_type == NULL_TREE) goto unsupported; - if (!supportable_convert_operation (code, cvt_type, vectype_in, - &codecvt1)) + if (!code.is_tree_code () + || !supportable_convert_operation ((tree_code) code, cvt_type, + vectype_in, &tc1)) goto unsupported; + codecvt1 = tc1; if (supportable_narrowing_operation (NOP_EXPR, vectype_out, cvt_type, &code1, &multi_step_cvt, &interm_types)) @@ -5380,10 +5410,9 @@ vectorizable_conversion (vec_info *vinfo, FOR_EACH_VEC_ELT (vec_oprnds0, i, vop0) { /* Arguments are ready, create the new vector stmt. 
*/ - gcc_assert (TREE_CODE_LENGTH (code1) == unary_op); - gassign *new_stmt = gimple_build_assign (vec_dest, code1, vop0); + gimple *new_stmt = vect_gimple_build (vec_dest, code1, vop0); new_temp = make_ssa_name (vec_dest, new_stmt); - gimple_assign_set_lhs (new_stmt, new_temp); + gimple_set_lhs (new_stmt, new_temp); vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi); if (slp_node) @@ -5413,17 +5442,16 @@ vectorizable_conversion (vec_info *vinfo, for (i = multi_step_cvt; i >= 0; i--) { tree this_dest = vec_dsts[i]; - enum tree_code c1 = code1, c2 = code2; + code_helper c1 = code1, c2 = code2; if (i == 0 && codecvt2 != ERROR_MARK) { c1 = codecvt1; c2 = codecvt2; } if (known_eq (nunits_out, nunits_in)) - vect_create_half_widening_stmts (vinfo, &vec_oprnds0, - &vec_oprnds1, stmt_info, - this_dest, gsi, - c1, op_type); + vect_create_half_widening_stmts (vinfo, &vec_oprnds0, &vec_oprnds1, + stmt_info, this_dest, gsi, c1, + op_type); else vect_create_vectorized_promotion_stmts (vinfo, &vec_oprnds0, &vec_oprnds1, stmt_info, @@ -5436,9 +5464,8 @@ vectorizable_conversion (vec_info *vinfo, gimple *new_stmt; if (cvt_type) { - gcc_assert (TREE_CODE_LENGTH (codecvt1) == unary_op); new_temp = make_ssa_name (vec_dest); - new_stmt = gimple_build_assign (new_temp, codecvt1, vop0); + new_stmt = vect_gimple_build (new_temp, codecvt1, vop0); vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi); } else @@ -5462,10 +5489,8 @@ vectorizable_conversion (vec_info *vinfo, if (cvt_type) FOR_EACH_VEC_ELT (vec_oprnds0, i, vop0) { - gcc_assert (TREE_CODE_LENGTH (codecvt1) == unary_op); new_temp = make_ssa_name (vec_dest); - gassign *new_stmt - = gimple_build_assign (new_temp, codecvt1, vop0); + gimple *new_stmt = vect_gimple_build (new_temp, codecvt1, vop0); vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi); vec_oprnds0[i] = new_temp; } @@ -12294,9 +12319,11 @@ vect_maybe_update_slp_op_vectype (slp_tree op, tree vectype) bool supportable_widening_operation 
(vec_info *vinfo, - enum tree_code code, stmt_vec_info stmt_info, + code_helper code, + stmt_vec_info stmt_info, tree vectype_out, tree vectype_in, - enum tree_code *code1, enum tree_code *code2, + code_helper *code1, + code_helper *code2, int *multi_step_cvt, vec<tree> *interm_types) { @@ -12307,7 +12334,7 @@ supportable_widening_operation (vec_info *vinfo, optab optab1, optab2; tree vectype = vectype_in; tree wide_vectype = vectype_out; - enum tree_code c1, c2; + tree_code c1 = MAX_TREE_CODES, c2 = MAX_TREE_CODES; int i; tree prev_type, intermediate_type; machine_mode intermediate_mode, prev_mode; @@ -12317,8 +12344,12 @@ supportable_widening_operation (vec_info *vinfo, if (loop_info) vect_loop = LOOP_VINFO_LOOP (loop_info); - switch (code) + switch (code.safe_as_tree_code ()) { + case MAX_TREE_CODES: + /* Don't set c1 and c2 if code is not a tree_code. */ + break; + case WIDEN_MULT_EXPR: /* The result of a vectorized widening operation usually requires two vectors (because the widened results do not fit into one vector). 
@@ -12358,8 +12389,9 @@ supportable_widening_operation (vec_info *vinfo, && !nested_in_vect_loop_p (vect_loop, stmt_info) && supportable_widening_operation (vinfo, VEC_WIDEN_MULT_EVEN_EXPR, stmt_info, vectype_out, - vectype_in, code1, code2, - multi_step_cvt, interm_types)) + vectype_in, code1, + code2, multi_step_cvt, + interm_types)) { /* Elements in a vector with vect_used_by_reduction property cannot be reordered if the use chain with this property does not have the @@ -12435,7 +12467,7 @@ supportable_widening_operation (vec_info *vinfo, optab1 = optab_for_tree_code (c1, vectype_out, optab_default); optab2 = optab_for_tree_code (c2, vectype_out, optab_default); } - else if (CONVERT_EXPR_CODE_P (code) + else if (CONVERT_EXPR_CODE_P (code.safe_as_tree_code ()) && VECTOR_BOOLEAN_TYPE_P (wide_vectype) && VECTOR_BOOLEAN_TYPE_P (vectype) && TYPE_MODE (wide_vectype) == TYPE_MODE (vectype) @@ -12460,8 +12492,12 @@ supportable_widening_operation (vec_info *vinfo, || (icode2 = optab_handler (optab2, vec_mode)) == CODE_FOR_nothing) return false; - *code1 = c1; - *code2 = c2; + if (code.is_tree_code ()) + { + *code1 = c1; + *code2 = c2; + } + if (insn_data[icode1].operand[0].mode == TYPE_MODE (wide_vectype) && insn_data[icode2].operand[0].mode == TYPE_MODE (wide_vectype)) @@ -12482,7 +12518,7 @@ supportable_widening_operation (vec_info *vinfo, prev_type = vectype; prev_mode = vec_mode; - if (!CONVERT_EXPR_CODE_P (code)) + if (!CONVERT_EXPR_CODE_P (code.safe_as_tree_code ())) return false; /* We assume here that there will not be more than MAX_INTERM_CVT_STEPS @@ -12580,9 +12616,9 @@ supportable_widening_operation (vec_info *vinfo, narrowing operation (short in the above example). 
*/ bool -supportable_narrowing_operation (enum tree_code code, +supportable_narrowing_operation (code_helper code, tree vectype_out, tree vectype_in, - enum tree_code *code1, int *multi_step_cvt, + code_helper *code1, int *multi_step_cvt, vec<tree> *interm_types) { machine_mode vec_mode; @@ -12597,8 +12633,11 @@ supportable_narrowing_operation (enum tree_code code, unsigned HOST_WIDE_INT n_elts; bool uns; + if (!code.is_tree_code ()) + return false; + *multi_step_cvt = 0; - switch (code) + switch ((tree_code) code) { CASE_CONVERT: c1 = VEC_PACK_TRUNC_EXPR; diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h index 9cf2fb23fe397b467d89aa7cc5ebeaa293ed4cce..f215cd0639bcf803c9d0554cfdc57823431991d5 100644 --- a/gcc/tree-vectorizer.h +++ b/gcc/tree-vectorizer.h @@ -2139,13 +2139,12 @@ extern bool vect_is_simple_use (vec_info *, stmt_vec_info, slp_tree, enum vect_def_type *, tree *, stmt_vec_info * = NULL); extern bool vect_maybe_update_slp_op_vectype (slp_tree, tree); -extern bool supportable_widening_operation (vec_info *, - enum tree_code, stmt_vec_info, - tree, tree, enum tree_code *, - enum tree_code *, int *, - vec<tree> *); -extern bool supportable_narrowing_operation (enum tree_code, tree, tree, - enum tree_code *, int *, +extern bool supportable_widening_operation (vec_info*, code_helper, + stmt_vec_info, tree, tree, + code_helper*, code_helper*, + int*, vec<tree> *); +extern bool supportable_narrowing_operation (code_helper, tree, tree, + code_helper *, int *, vec<tree> *); extern unsigned record_stmt_cost (stmt_vector_for_cost *, int, @@ -2583,4 +2582,7 @@ vect_is_integer_truncation (stmt_vec_info stmt_info) && TYPE_PRECISION (lhs_type) < TYPE_PRECISION (rhs_type)); } +/* Build a GIMPLE_ASSIGN or GIMPLE_CALL with the tree_code, + or internal_fn contained in ch, respectively. 
*/ +gimple * vect_gimple_build (tree, code_helper, tree, tree = NULL_TREE); #endif /* GCC_TREE_VECTORIZER_H */ diff --git a/gcc/tree.h b/gcc/tree.h index 0b72663e6a1a94406127f6253460f498b7a3ea9c..6dcb28ebc1df456e3798b8f0b43bae42c145d43d 100644 --- a/gcc/tree.h +++ b/gcc/tree.h @@ -93,6 +93,8 @@ public: bool is_internal_fn () const; bool is_builtin_fn () const; int get_rep () const { return rep; } + tree_code safe_as_tree_code () const; + combined_fn safe_as_fn_code () const; bool operator== (const code_helper &other) { return rep == other.rep; } bool operator!= (const code_helper &other) { return rep != other.rep; } bool operator== (tree_code c) { return rep == code_helper (c).rep; } @@ -102,6 +104,25 @@ private: int rep; }; +/* Helper function that returns the tree_code representation of THIS + code_helper if it is a tree_code and MAX_TREE_CODES otherwise. This is + useful when passing a code_helper to a tree_code only check. */ + +inline tree_code +code_helper::safe_as_tree_code () const +{ + return is_tree_code () ? (tree_code) *this : MAX_TREE_CODES; +} + +/* Helper function that returns the combined_fn representation of THIS + code_helper if it is a fn_code and CFN_LAST otherwise. This is useful when + passing a code_helper to a combined_fn only check. */ + +inline combined_fn +code_helper::safe_as_fn_code () const { + return is_fn_code () ? (combined_fn) *this : CFN_LAST; +} + inline code_helper::operator internal_fn () const { return as_internal_fn (combined_fn (*this)); ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 1/3] Refactor to allow internal_fn's 2023-05-12 12:14 ` Andre Vieira (lists) @ 2023-05-12 13:18 ` Richard Biener 0 siblings, 0 replies; 53+ messages in thread From: Richard Biener @ 2023-05-12 13:18 UTC (permalink / raw) To: Andre Vieira (lists); +Cc: Richard Biener, Richard Sandiford, gcc-patches On Fri, 12 May 2023, Andre Vieira (lists) wrote: > Hi, > > I think I tackled all of your comments, let me know if I missed something. This first and the last patch look good to me now. Let me comment on the second. Thanks, Richard. > > gcc/ChangeLog: > > 2023-05-12 Andre Vieira <andre.simoesdiasvieira@arm.com> > Joel Hutton <joel.hutton@arm.com> > > * tree-vect-patterns.cc (vect_gimple_build): New Function. > (vect_recog_widen_op_pattern): Refactor to use code_helper. > * tree-vect-stmts.cc (vect_gen_widened_results_half): Likewise. > (vect_create_vectorized_demotion_stmts): Likewise. > (vect_create_vectorized_promotion_stmts): Likewise. > (vect_create_half_widening_stmts): Likewise. > (vectorizable_conversion): Likewise. > (vectorizable_call): Likewise. > (supportable_widening_operation): Likewise. > (supportable_narrowing_operation): Likewise. > (simple_integer_narrowing): Likewise. > * tree-vectorizer.h (supportable_widening_operation): Likewise. > (supportable_narrowing_operation): Likewise. > (vect_gimple_build): New function prototype. > * tree.h (code_helper::safe_as_tree_code): New function. > (code_helper::safe_as_fn_code): New function. > -- Richard Biener <rguenther@suse.de> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman; HRB 36809 (AG Nuernberg) ^ permalink raw reply [flat|nested] 53+ messages in thread
* [PATCH 2/3] Refactor widen_plus as internal_fn 2023-04-25 9:55 ` Andre Vieira (lists) 2023-04-28 12:36 ` [PATCH 1/3] Refactor to allow internal_fn's Andre Vieira (lists) @ 2023-04-28 12:37 ` Andre Vieira (lists) 2023-05-03 12:11 ` Richard Biener 2023-04-28 12:37 ` [PATCH 3/3] Remove widen_plus/minus_expr tree codes Andre Vieira (lists) 2 siblings, 1 reply; 53+ messages in thread From: Andre Vieira (lists) @ 2023-04-28 12:37 UTC (permalink / raw) To: Richard Biener; +Cc: Richard Biener, Richard Sandiford, gcc-patches [-- Attachment #1: Type: text/plain, Size: 3598 bytes --] This patch replaces the existing tree_code widen_plus and widen_minus patterns with internal_fn versions. DEF_INTERNAL_OPTAB_HILO_FN is like DEF_INTERNAL_OPTAB_FN except it provides convenience wrappers for defining conversions that require a hi/lo split, like widening and narrowing operations. Each definition for <NAME> will require an optab named <OPTAB> and two other optabs that you specify for signed and unsigned. The hi/lo pair is necessary because the widening operations take n narrow elements as inputs and return n/2 wide elements as outputs. The 'lo' operation operates on the first n/2 elements of input. The 'hi' operation operates on the second n/2 elements of input. Defining an internal_fn along with hi/lo variations allows a single internal function to be returned from a vect_recog function that will later be expanded to hi/lo. DEF_INTERNAL_OPTAB_HILO_FN is used in internal-fn.def to register a widening internal_fn. It is defined differently in different places and internal-fn.def is sourced from those places so the parameters given can be reused. internal-fn.c: defined to expand to hi/lo signed/unsigned optabs, later defined to generate the 'expand_' functions for the hi/lo versions of the fn. 
internal-fn.def: defined to invoke DEF_INTERNAL_OPTAB_FN for the original and hi/lo variants of the internal_fn For example: IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO for aarch64: IFN_VEC_WIDEN_PLUS_HI -> vec_widen_<su>addl_hi_<mode> -> (u/s)addl2 IFN_VEC_WIDEN_PLUS_LO -> vec_widen_<su>addl_lo_<mode> -> (u/s)addl This gives the same functionality as the previous WIDEN_PLUS/WIDEN_MINUS tree codes which are expanded into VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI. gcc/ChangeLog: 2023-04-28 Andre Vieira <andre.simoesdiasvieira@arm.com> Joel Hutton <joel.hutton@arm.com> Tamar Christina <tamar.christina@arm.com> * internal-fn.cc (INCLUDE_MAP): Include maps for use in optab lookup. (DEF_INTERNAL_OPTAB_HILO_FN): Macro to define an internal_fn that expands into multiple internal_fns (for widening). (ifn_cmp): Function to compare ifn's for sorting/searching. (lookup_hilo_ifn_optab): Add lookup function. (lookup_hilo_internal_fn): Add lookup function. (commutative_binary_fn_p): Add widen_plus fn's. (widening_fn_p): New function. (decomposes_to_hilo_fn_p): New function. * internal-fn.def (DEF_INTERNAL_OPTAB_HILO_FN): Define widening plus,minus functions. (VEC_WIDEN_PLUS): Replacement for VEC_WIDEN_PLUS tree code. (VEC_WIDEN_MINUS): Replacement for VEC_WIDEN_MINUS tree code. * internal-fn.h (GCC_INTERNAL_FN_H): Add headers. (lookup_hilo_ifn_optab): Add prototype. (lookup_hilo_internal_fn): Likewise. (widening_fn_p): Likewise. (decomposes_to_hilo_fn_p): Likewise. * optabs.cc (commutative_optab_p): Add widening plus, minus optabs. * optabs.def (OPTAB_CD): widen add, sub optabs * tree-vect-patterns.cc (vect_recog_widen_op_pattern): Support patterns with a hi/lo split. (vect_recog_widen_plus_pattern): Refactor to return IFN_VECT_WIDEN_PLUS. (vect_recog_widen_minus_pattern): Refactor to return new IFN_VEC_WIDEN_MINUS. * tree-vect-stmts.cc (vectorizable_conversion): Add widen plus/minus ifn support. 
(supportable_widening_operation): Add widen plus/minus ifn support. gcc/testsuite/ChangeLog: * gcc.target/aarch64/vect-widen-add.c: Test that new IFN_VEC_WIDEN_PLUS is being used. * gcc.target/aarch64/vect-widen-sub.c: Test that new IFN_VEC_WIDEN_MINUS is being used. [-- Attachment #2: ifn1_v2.patch --] [-- Type: text/plain, Size: 18412 bytes --] diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc index 6e81dc05e0e0714256759b0594816df451415a2d..e4d815cd577d266d2bccf6fb68d62aac91a8b4cf 100644 --- a/gcc/internal-fn.cc +++ b/gcc/internal-fn.cc @@ -17,6 +17,7 @@ You should have received a copy of the GNU General Public License along with GCC; see the file COPYING3. If not see <http://www.gnu.org/licenses/>. */ +#define INCLUDE_MAP #include "config.h" #include "system.h" #include "coretypes.h" @@ -70,6 +71,26 @@ const int internal_fn_flags_array[] = { 0 }; +const enum internal_fn internal_fn_hilo_keys_array[] = { +#undef DEF_INTERNAL_OPTAB_HILO_FN +#define DEF_INTERNAL_OPTAB_HILO_FN(NAME, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \ + IFN_##NAME##_LO, \ + IFN_##NAME##_HI, +#include "internal-fn.def" + IFN_LAST +#undef DEF_INTERNAL_OPTAB_HILO_FN +}; + +const optab internal_fn_hilo_values_array[] = { +#undef DEF_INTERNAL_OPTAB_HILO_FN +#define DEF_INTERNAL_OPTAB_HILO_FN(NAME, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \ + SOPTAB##_lo_optab, UOPTAB##_lo_optab, \ + SOPTAB##_hi_optab, UOPTAB##_hi_optab, +#include "internal-fn.def" + unknown_optab, unknown_optab +#undef DEF_INTERNAL_OPTAB_HILO_FN +}; + /* Return the internal function called NAME, or IFN_LAST if there's no such function. */ @@ -90,6 +111,61 @@ lookup_internal_fn (const char *name) return entry ? 
*entry : IFN_LAST; } +static int +ifn_cmp (const void *a_, const void *b_) +{ + typedef std::pair<enum internal_fn, unsigned> ifn_pair; + auto *a = (const std::pair<ifn_pair, optab> *)a_; + auto *b = (const std::pair<ifn_pair, optab> *)b_; + return (int) (a->first.first) - (b->first.first); +} + +/* Return the optab belonging to the given internal function NAME for the given + SIGN or unknown_optab. */ + +optab +lookup_hilo_ifn_optab (enum internal_fn fn, unsigned sign) +{ + typedef std::pair<enum internal_fn, unsigned> ifn_pair; + typedef auto_vec <std::pair<ifn_pair, optab>>fn_to_optab_map_type; + static fn_to_optab_map_type *fn_to_optab_map; + + if (!fn_to_optab_map) + { + unsigned num + = sizeof (internal_fn_hilo_keys_array) / sizeof (enum internal_fn); + fn_to_optab_map = new fn_to_optab_map_type (); + for (unsigned int i = 0; i < num - 1; ++i) + { + enum internal_fn fn = internal_fn_hilo_keys_array[i]; + optab v1 = internal_fn_hilo_values_array[2*i]; + optab v2 = internal_fn_hilo_values_array[2*i + 1]; + ifn_pair key1 (fn, 0); + fn_to_optab_map->safe_push ({key1, v1}); + ifn_pair key2 (fn, 1); + fn_to_optab_map->safe_push ({key2, v2}); + } + fn_to_optab_map->qsort (ifn_cmp); + } + + ifn_pair new_pair (fn, sign ? 1 : 0); + optab tmp; + std::pair<ifn_pair,optab> pair_wrap (new_pair, tmp); + auto entry = fn_to_optab_map->bsearch (&pair_wrap, ifn_cmp); + return entry != fn_to_optab_map->end () ? entry->second : unknown_optab; +} + +extern void +lookup_hilo_internal_fn (enum internal_fn ifn, enum internal_fn *lo, + enum internal_fn *hi) +{ + gcc_assert (decomposes_to_hilo_fn_p (ifn)); + + *lo = internal_fn (ifn + 1); + *hi = internal_fn (ifn + 2); +} + + /* Fnspec of each internal function, indexed by function number. 
*/ const_tree internal_fn_fnspec_array[IFN_LAST + 1]; @@ -3970,6 +4046,9 @@ commutative_binary_fn_p (internal_fn fn) case IFN_UBSAN_CHECK_MUL: case IFN_ADD_OVERFLOW: case IFN_MUL_OVERFLOW: + case IFN_VEC_WIDEN_PLUS: + case IFN_VEC_WIDEN_PLUS_LO: + case IFN_VEC_WIDEN_PLUS_HI: return true; default: @@ -4043,6 +4122,42 @@ first_commutative_argument (internal_fn fn) } } +/* Return true if FN has a wider output type than its argument types. */ + +bool +widening_fn_p (internal_fn fn) +{ + switch (fn) + { + case IFN_VEC_WIDEN_PLUS: + case IFN_VEC_WIDEN_MINUS: + return true; + + default: + return false; + } +} + +/* Return true if FN decomposes to _hi and _lo IFN. If true this should also + be a widening function. */ + +bool +decomposes_to_hilo_fn_p (internal_fn fn) +{ + if (!widening_fn_p (fn)) + return false; + + switch (fn) + { + case IFN_VEC_WIDEN_PLUS: + case IFN_VEC_WIDEN_MINUS: + return true; + + default: + return false; + } +} + /* Return true if IFN_SET_EDOM is supported. */ bool @@ -4055,6 +4170,32 @@ set_edom_supported_p (void) #endif } +#undef DEF_INTERNAL_OPTAB_HILO_FN +#define DEF_INTERNAL_OPTAB_HILO_FN(CODE, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \ + static void \ + expand_##CODE (internal_fn, gcall *) \ + { \ + gcc_unreachable (); \ + } \ + static void \ + expand_##CODE##_LO (internal_fn fn, gcall *stmt) \ + { \ + tree ty = TREE_TYPE (gimple_get_lhs (stmt)); \ + if (!TYPE_UNSIGNED (ty)) \ + expand_##TYPE##_optab_fn (fn, stmt, SOPTAB##_lo##_optab); \ + else \ + expand_##TYPE##_optab_fn (fn, stmt, UOPTAB##_lo##_optab); \ + } \ + static void \ + expand_##CODE##_HI (internal_fn fn, gcall *stmt) \ + { \ + tree ty = TREE_TYPE (gimple_get_lhs (stmt)); \ + if (!TYPE_UNSIGNED (ty)) \ + expand_##TYPE##_optab_fn (fn, stmt, SOPTAB##_hi##_optab); \ + else \ + expand_##TYPE##_optab_fn (fn, stmt, UOPTAB##_hi##_optab); \ + } + #define DEF_INTERNAL_OPTAB_FN(CODE, FLAGS, OPTAB, TYPE) \ static void \ expand_##CODE (internal_fn fn, gcall *stmt) \ @@ -4071,6 +4212,7 @@ 
set_edom_supported_p (void) expand_##TYPE##_optab_fn (fn, stmt, which_optab); \ } #include "internal-fn.def" +#undef DEF_INTERNAL_OPTAB_HILO_FN /* Routines to expand each internal function, indexed by function number. Each routine has the prototype: diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def index 7fe742c2ae713e7152ab05cfdfba86e4e0aa3456..347ed667d92620e0ee3ea15c58ecac6c242ebe73 100644 --- a/gcc/internal-fn.def +++ b/gcc/internal-fn.def @@ -85,6 +85,13 @@ along with GCC; see the file COPYING3. If not see says that the function extends the C-level BUILT_IN_<NAME>{,L,LL,IMAX} group of functions to any integral mode (including vector modes). + DEF_INTERNAL_OPTAB_HILO_FN is like DEF_INTERNAL_OPTAB_FN except it + provides convenience wrappers for defining conversions that require a + hi/lo split, like widening and narrowing operations. Each definition + for <NAME> will require an optab named <OPTAB> and two other optabs that + you specify for signed and unsigned. + + Each entry must have a corresponding expander of the form: void expand_NAME (gimple_call stmt) @@ -123,6 +130,14 @@ along with GCC; see the file COPYING3. 
If not see DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE) #endif +#ifndef DEF_INTERNAL_OPTAB_HILO_FN +#define DEF_INTERNAL_OPTAB_HILO_FN(NAME, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \ + DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE) \ + DEF_INTERNAL_OPTAB_FN (NAME ## _LO, FLAGS, unknown, TYPE) \ + DEF_INTERNAL_OPTAB_FN (NAME ## _HI, FLAGS, unknown, TYPE) +#endif + + DEF_INTERNAL_OPTAB_FN (MASK_LOAD, ECF_PURE, maskload, mask_load) DEF_INTERNAL_OPTAB_FN (LOAD_LANES, ECF_CONST, vec_load_lanes, load_lanes) DEF_INTERNAL_OPTAB_FN (MASK_LOAD_LANES, ECF_PURE, @@ -315,6 +330,14 @@ DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT270, ECF_CONST, cadd270, binary) DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL, ECF_CONST, cmul, binary) DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL_CONJ, ECF_CONST, cmul_conj, binary) DEF_INTERNAL_OPTAB_FN (VEC_ADDSUB, ECF_CONST, vec_addsub, binary) +DEF_INTERNAL_OPTAB_HILO_FN (VEC_WIDEN_PLUS, + ECF_CONST | ECF_NOTHROW, + vec_widen_add, vec_widen_saddl, vec_widen_uaddl, + binary) +DEF_INTERNAL_OPTAB_HILO_FN (VEC_WIDEN_MINUS, + ECF_CONST | ECF_NOTHROW, + vec_widen_sub, vec_widen_ssubl, vec_widen_usubl, + binary) DEF_INTERNAL_OPTAB_FN (VEC_FMADDSUB, ECF_CONST, vec_fmaddsub, ternary) DEF_INTERNAL_OPTAB_FN (VEC_FMSUBADD, ECF_CONST, vec_fmsubadd, ternary) diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h index 08922ed4254898f5fffca3f33973e96ed9ce772f..6a5f8762e872ad2ef64ce2986a678e3b40622d81 100644 --- a/gcc/internal-fn.h +++ b/gcc/internal-fn.h @@ -20,6 +20,10 @@ along with GCC; see the file COPYING3. If not see #ifndef GCC_INTERNAL_FN_H #define GCC_INTERNAL_FN_H +#include "insn-codes.h" +#include "insn-opinit.h" + + /* INTEGER_CST values for IFN_UNIQUE function arg-0. UNSPEC: Undifferentiated UNIQUE. 
@@ -112,6 +116,9 @@ internal_fn_name (enum internal_fn fn) } extern internal_fn lookup_internal_fn (const char *); +extern optab lookup_hilo_ifn_optab (enum internal_fn, unsigned); +extern void lookup_hilo_internal_fn (enum internal_fn, enum internal_fn *, + enum internal_fn *); /* Return the ECF_* flags for function FN. */ @@ -210,6 +217,8 @@ extern bool commutative_binary_fn_p (internal_fn); extern bool commutative_ternary_fn_p (internal_fn); extern int first_commutative_argument (internal_fn); extern bool associative_binary_fn_p (internal_fn); +extern bool widening_fn_p (internal_fn); +extern bool decomposes_to_hilo_fn_p (internal_fn); extern bool set_edom_supported_p (void); diff --git a/gcc/optabs.cc b/gcc/optabs.cc index c8e39c82d57a7d726e7da33d247b80f32ec9236c..d4dd7ee3d34d01c32ab432ae4e4ce9e4b522b2f7 100644 --- a/gcc/optabs.cc +++ b/gcc/optabs.cc @@ -1314,7 +1314,12 @@ commutative_optab_p (optab binoptab) || binoptab == smul_widen_optab || binoptab == umul_widen_optab || binoptab == smul_highpart_optab - || binoptab == umul_highpart_optab); + || binoptab == umul_highpart_optab + || binoptab == vec_widen_add_optab + || binoptab == vec_widen_saddl_hi_optab + || binoptab == vec_widen_saddl_lo_optab + || binoptab == vec_widen_uaddl_hi_optab + || binoptab == vec_widen_uaddl_lo_optab); } /* X is to be used in mode MODE as operand OPN to BINOPTAB. 
If we're diff --git a/gcc/optabs.def b/gcc/optabs.def index 695f5911b300c9ca5737de9be809fa01aabe5e01..e064189103b3be70644468d11f3c91ac45ffe0d0 100644 --- a/gcc/optabs.def +++ b/gcc/optabs.def @@ -78,6 +78,8 @@ OPTAB_CD(smsub_widen_optab, "msub$b$a4") OPTAB_CD(umsub_widen_optab, "umsub$b$a4") OPTAB_CD(ssmsub_widen_optab, "ssmsub$b$a4") OPTAB_CD(usmsub_widen_optab, "usmsub$a$b4") +OPTAB_CD(vec_widen_add_optab, "add$a$b3") +OPTAB_CD(vec_widen_sub_optab, "sub$a$b3") OPTAB_CD(vec_load_lanes_optab, "vec_load_lanes$a$b") OPTAB_CD(vec_store_lanes_optab, "vec_store_lanes$a$b") OPTAB_CD(vec_mask_load_lanes_optab, "vec_mask_load_lanes$a$b") diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c index 220bd9352a4c7acd2e3713e441d74898d3e92b30..7037673d32bd780e1c9b58a51e58e2bac3b30b7e 100644 --- a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c +++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c @@ -1,5 +1,5 @@ /* { dg-do run } */ -/* { dg-options "-O3 -save-temps" } */ +/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */ #include <stdint.h> #include <string.h> @@ -86,6 +86,8 @@ main() return 0; } +/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_LO" "vect" } } */ +/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_HI" "vect" } } */ /* { dg-final { scan-assembler-times {\tuaddl\t} 1} } */ /* { dg-final { scan-assembler-times {\tuaddl2\t} 1} } */ /* { dg-final { scan-assembler-times {\tsaddl\t} 1} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c index a2bed63affbd091977df95a126da1f5b8c1d41d2..83bc1edb6105f47114b665e24a13e6194b2179a2 100644 --- a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c +++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c @@ -1,5 +1,5 @@ /* { dg-do run } */ -/* { dg-options "-O3 -save-temps" } */ +/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */ #include <stdint.h> #include 
<string.h> @@ -86,6 +86,8 @@ main() return 0; } +/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_LO" "vect" } } */ +/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_HI" "vect" } } */ /* { dg-final { scan-assembler-times {\tusubl\t} 1} } */ /* { dg-final { scan-assembler-times {\tusubl2\t} 1} } */ /* { dg-final { scan-assembler-times {\tssubl\t} 1} } */ diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc index b35023adade94c1996cd076c4b7419560e819c6b..3175dd92187c0935f78ebbf2eb476bdcf8b4ccd1 100644 --- a/gcc/tree-vect-patterns.cc +++ b/gcc/tree-vect-patterns.cc @@ -1394,14 +1394,16 @@ static gimple * vect_recog_widen_op_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info, tree *type_out, tree_code orig_code, code_helper wide_code, - bool shift_p, const char *name) + bool shift_p, const char *name, + enum optab_subtype *subtype = NULL) { gimple *last_stmt = last_stmt_info->stmt; vect_unpromoted_value unprom[2]; tree half_type; if (!vect_widened_op_tree (vinfo, last_stmt_info, orig_code, orig_code, - shift_p, 2, unprom, &half_type)) + shift_p, 2, unprom, &half_type, subtype)) + return NULL; /* Pattern detected. */ @@ -1467,6 +1469,20 @@ vect_recog_widen_op_pattern (vec_info *vinfo, type, pattern_stmt, vecctype); } +static gimple * +vect_recog_widen_op_pattern (vec_info *vinfo, + stmt_vec_info last_stmt_info, tree *type_out, + tree_code orig_code, internal_fn wide_ifn, + bool shift_p, const char *name, + enum optab_subtype *subtype = NULL) +{ + combined_fn ifn = as_combined_fn (wide_ifn); + return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out, + orig_code, ifn, shift_p, name, + subtype); +} + + /* Try to detect multiplication on widened inputs, converting MULT_EXPR to WIDEN_MULT_EXPR. See vect_recog_widen_op_pattern for details. 
*/ @@ -1480,26 +1496,30 @@ vect_recog_widen_mult_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info, } /* Try to detect addition on widened inputs, converting PLUS_EXPR - to WIDEN_PLUS_EXPR. See vect_recog_widen_op_pattern for details. */ + to IFN_VEC_WIDEN_PLUS. See vect_recog_widen_op_pattern for details. */ static gimple * vect_recog_widen_plus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info, tree *type_out) { + enum optab_subtype subtype; return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out, - PLUS_EXPR, WIDEN_PLUS_EXPR, false, - "vect_recog_widen_plus_pattern"); + PLUS_EXPR, IFN_VEC_WIDEN_PLUS, + false, "vect_recog_widen_plus_pattern", + &subtype); } /* Try to detect subtraction on widened inputs, converting MINUS_EXPR - to WIDEN_MINUS_EXPR. See vect_recog_widen_op_pattern for details. */ + to IFN_VEC_WIDEN_MINUS. See vect_recog_widen_op_pattern for details. */ static gimple * vect_recog_widen_minus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info, tree *type_out) { + enum optab_subtype subtype; return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out, - MINUS_EXPR, WIDEN_MINUS_EXPR, false, - "vect_recog_widen_minus_pattern"); + MINUS_EXPR, IFN_VEC_WIDEN_MINUS, + false, "vect_recog_widen_minus_pattern", + &subtype); } /* Function vect_recog_popcount_pattern @@ -6067,6 +6087,7 @@ static vect_recog_func vect_vect_recog_func_ptrs[] = { { vect_recog_mask_conversion_pattern, "mask_conversion" }, { vect_recog_widen_plus_pattern, "widen_plus" }, { vect_recog_widen_minus_pattern, "widen_minus" }, + /* These must come after the double widening ones. 
*/ }; const unsigned int NUM_PATTERNS = ARRAY_SIZE (vect_vect_recog_func_ptrs); diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index ce47f4940fa9a1baca4ba1162065cfc3b4072eba..2a7ef2439e12d1966e8884433963a3d387a856b7 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -5035,7 +5035,9 @@ vectorizable_conversion (vec_info *vinfo, bool widen_arith = (code == WIDEN_PLUS_EXPR || code == WIDEN_MINUS_EXPR || code == WIDEN_MULT_EXPR - || code == WIDEN_LSHIFT_EXPR); + || code == WIDEN_LSHIFT_EXPR + || code == IFN_VEC_WIDEN_PLUS + || code == IFN_VEC_WIDEN_MINUS); if (!widen_arith && !CONVERT_EXPR_CODE_P (code) @@ -5085,7 +5087,9 @@ vectorizable_conversion (vec_info *vinfo, gcc_assert (code == WIDEN_MULT_EXPR || code == WIDEN_LSHIFT_EXPR || code == WIDEN_PLUS_EXPR - || code == WIDEN_MINUS_EXPR); + || code == WIDEN_MINUS_EXPR + || code == IFN_VEC_WIDEN_PLUS + || code == IFN_VEC_WIDEN_MINUS); op1 = is_gimple_assign (stmt) ? gimple_assign_rhs2 (stmt) : @@ -12335,12 +12339,46 @@ supportable_widening_operation (vec_info *vinfo, optab1 = vec_unpacks_sbool_lo_optab; optab2 = vec_unpacks_sbool_hi_optab; } - else - { - optab1 = optab_for_tree_code (c1, vectype, optab_default); - optab2 = optab_for_tree_code (c2, vectype, optab_default); + + if (code.is_fn_code ()) + { + internal_fn ifn = as_internal_fn ((combined_fn) code); + gcc_assert (decomposes_to_hilo_fn_p (ifn)); + + internal_fn lo, hi; + lookup_hilo_internal_fn (ifn, &lo, &hi); + *code1 = as_combined_fn (lo); + *code2 = as_combined_fn (hi); + optab1 = lookup_hilo_ifn_optab (lo, !TYPE_UNSIGNED (vectype)); + optab2 = lookup_hilo_ifn_optab (hi, !TYPE_UNSIGNED (vectype)); } + if (code.is_tree_code ()) + { + if (code == FIX_TRUNC_EXPR) + { + /* The signedness is determined from output operand. 
*/ + optab1 = optab_for_tree_code (c1, vectype_out, optab_default); + optab2 = optab_for_tree_code (c2, vectype_out, optab_default); + } + else if (CONVERT_EXPR_CODE_P (code.safe_as_tree_code ()) + && VECTOR_BOOLEAN_TYPE_P (wide_vectype) + && VECTOR_BOOLEAN_TYPE_P (vectype) + && TYPE_MODE (wide_vectype) == TYPE_MODE (vectype) + && SCALAR_INT_MODE_P (TYPE_MODE (vectype))) + { + /* If the input and result modes are the same, a different optab + is needed where we pass in the number of units in vectype. */ + optab1 = vec_unpacks_sbool_lo_optab; + optab2 = vec_unpacks_sbool_hi_optab; + } + else + { + optab1 = optab_for_tree_code (c1, vectype, optab_default); + optab2 = optab_for_tree_code (c2, vectype, optab_default); + } + } + if (!optab1 || !optab2) return false; ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 2/3] Refactor widen_plus as internal_fn 2023-04-28 12:37 ` [PATCH 2/3] Refactor widen_plus as internal_fn Andre Vieira (lists) @ 2023-05-03 12:11 ` Richard Biener 2023-05-03 19:07 ` Richard Sandiford 0 siblings, 1 reply; 53+ messages in thread From: Richard Biener @ 2023-05-03 12:11 UTC (permalink / raw) To: Andre Vieira (lists); +Cc: Richard Biener, Richard Sandiford, gcc-patches On Fri, 28 Apr 2023, Andre Vieira (lists) wrote: > This patch replaces the existing tree_code widen_plus and widen_minus > patterns with internal_fn versions. > > DEF_INTERNAL_OPTAB_HILO_FN is like DEF_INTERNAL_OPTAB_FN except it provides > convenience wrappers for defining conversions that require a hi/lo split, like > widening and narrowing operations. Each definition for <NAME> will require an > optab named <OPTAB> and two other optabs that you specify for signed and > unsigned. The hi/lo pair is necessary because the widening operations take n > narrow elements as inputs and return n/2 wide elements as outputs. The 'lo' > operation operates on the first n/2 elements of input. The 'hi' operation > operates on the second n/2 elements of input. Defining an internal_fn along > with hi/lo variations allows a single internal function to be returned from a > vect_recog function that will later be expanded to hi/lo. > > DEF_INTERNAL_OPTAB_HILO_FN is used in internal-fn.def to register a widening > internal_fn. It is defined differently in different places and internal-fn.def > is sourced from those places so the parameters given can be reused. > internal-fn.c: defined to expand to hi/lo signed/unsigned optabs, later > defined to generate the 'expand_' functions for the hi/lo versions of the fn. 
> internal-fn.def: defined to invoke DEF_INTERNAL_OPTAB_FN for the original > and hi/lo variants of the internal_fn > > For example: > IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO > for aarch64: IFN_VEC_WIDEN_PLUS_HI -> vec_widen_<su>addl_hi_<mode> -> > (u/s)addl2 > IFN_VEC_WIDEN_PLUS_LO -> vec_widen_<su>addl_lo_<mode> > -> (u/s)addl > > This gives the same functionality as the previous WIDEN_PLUS/WIDEN_MINUS tree > codes which are expanded into VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI. I'll note that it's interesting we have widen multiplication as the only existing example where we have both HI/LO and EVEN/ODD cases. I think we want to share as much of the infrastructure to eventually support targets doing even/odd (I guess all VLA vector targets will be even/odd?). DEF_INTERNAL_OPTAB_HILO_FN also looks to be implicitely directed to widening operations (otherwise no signed/unsigned variants would be necessary). What I don't understand is why we need an optab without _hi/_lo but in that case no signed/unsigned variant? Looks like all plus, plus_lo and plus_hi are commutative but only plus is widening?! So is the setup that the vectorizer doesn't know about the split and uses 'plus' but then the expander performs the split? It does look a bit awkward here (the plain 'plus' is just used for the scalar case during pattern recog it seems). I'd rather have DEF_INTERNAL_OPTAB_HILO_FN split up, declaring the hi/lo pairs and the scalar variant separately using DEF_INTERNAL_FN without expander for that, and having DEF_INTERNAL_HILO_WIDEN_OPTAB_FN and DEF_INTERNAL_EVENODD_WIDEN_OPTAB_FN for the signed/unsigned pairs? (if we need that helper at all) Targets shouldn't need to implement the plain optab (it shouldn't exist) and the vectorizer should query the hi/lo or even/odd optabs for support instead. The vectorizer parts look OK to me, I'd like Richard to chime in on the optab parts as well. Thanks, Richard. 
> gcc/ChangeLog: > > 2023-04-28 Andre Vieira <andre.simoesdiasvieira@arm.com> > Joel Hutton <joel.hutton@arm.com> > Tamar Christina <tamar.christina@arm.com> > > * internal-fn.cc (INCLUDE_MAP): Include maps for use in optab > lookup. > (DEF_INTERNAL_OPTAB_HILO_FN): Macro to define an internal_fn that > expands into multiple internal_fns (for widening). > (ifn_cmp): Function to compare ifn's for sorting/searching. > (lookup_hilo_ifn_optab): Add lookup function. > (lookup_hilo_internal_fn): Add lookup function. > (commutative_binary_fn_p): Add widen_plus fn's. > (widening_fn_p): New function. > (decomposes_to_hilo_fn_p): New function. > * internal-fn.def (DEF_INTERNAL_OPTAB_HILO_FN): Define widening > plus,minus functions. > (VEC_WIDEN_PLUS): Replacement for VEC_WIDEN_PLUS tree code. > (VEC_WIDEN_MINUS): Replacement for VEC_WIDEN_MINUS tree code. > * internal-fn.h (GCC_INTERNAL_FN_H): Add headers. > (lookup_hilo_ifn_optab): Add prototype. > (lookup_hilo_internal_fn): Likewise. > (widening_fn_p): Likewise. > (decomposes_to_hilo_fn_p): Likewise. > * optabs.cc (commutative_optab_p): Add widening plus, minus optabs. > * optabs.def (OPTAB_CD): widen add, sub optabs > * tree-vect-patterns.cc (vect_recog_widen_op_pattern): Support > patterns with a hi/lo split. > (vect_recog_widen_plus_pattern): Refactor to return > IFN_VECT_WIDEN_PLUS. > (vect_recog_widen_minus_pattern): Refactor to return new > IFN_VEC_WIDEN_MINUS. > * tree-vect-stmts.cc (vectorizable_conversion): Add widen plus/minus > ifn > support. > (supportable_widening_operation): Add widen plus/minus ifn support. > > gcc/testsuite/ChangeLog: > > * gcc.target/aarch64/vect-widen-add.c: Test that new > IFN_VEC_WIDEN_PLUS is being used. > * gcc.target/aarch64/vect-widen-sub.c: Test that new > IFN_VEC_WIDEN_MINUS is being used. 
> -- Richard Biener <rguenther@suse.de> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman; HRB 36809 (AG Nuernberg) ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 2/3] Refactor widen_plus as internal_fn 2023-05-03 12:11 ` Richard Biener @ 2023-05-03 19:07 ` Richard Sandiford 2023-05-12 12:16 ` Andre Vieira (lists) 0 siblings, 1 reply; 53+ messages in thread From: Richard Sandiford @ 2023-05-03 19:07 UTC (permalink / raw) To: Richard Biener; +Cc: Andre Vieira (lists), Richard Biener, gcc-patches Richard Biener <rguenther@suse.de> writes: > On Fri, 28 Apr 2023, Andre Vieira (lists) wrote: > >> This patch replaces the existing tree_code widen_plus and widen_minus >> patterns with internal_fn versions. >> >> DEF_INTERNAL_OPTAB_HILO_FN is like DEF_INTERNAL_OPTAB_FN except it provides >> convenience wrappers for defining conversions that require a hi/lo split, like >> widening and narrowing operations. Each definition for <NAME> will require an >> optab named <OPTAB> and two other optabs that you specify for signed and >> unsigned. The hi/lo pair is necessary because the widening operations take n >> narrow elements as inputs and return n/2 wide elements as outputs. The 'lo' >> operation operates on the first n/2 elements of input. The 'hi' operation >> operates on the second n/2 elements of input. Defining an internal_fn along >> with hi/lo variations allows a single internal function to be returned from a >> vect_recog function that will later be expanded to hi/lo. >> >> DEF_INTERNAL_OPTAB_HILO_FN is used in internal-fn.def to register a widening >> internal_fn. It is defined differently in different places and internal-fn.def >> is sourced from those places so the parameters given can be reused. >> internal-fn.c: defined to expand to hi/lo signed/unsigned optabs, later >> defined to generate the 'expand_' functions for the hi/lo versions of the fn. 
>> internal-fn.def: defined to invoke DEF_INTERNAL_OPTAB_FN for the original >> and hi/lo variants of the internal_fn >> >> For example: >> IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO >> for aarch64: IFN_VEC_WIDEN_PLUS_HI -> vec_widen_<su>addl_hi_<mode> -> >> (u/s)addl2 >> IFN_VEC_WIDEN_PLUS_LO -> vec_widen_<su>addl_lo_<mode> >> -> (u/s)addl >> >> This gives the same functionality as the previous WIDEN_PLUS/WIDEN_MINUS tree >> codes which are expanded into VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI. > > I'll note that it's interesting we have widen multiplication as > the only existing example where we have both HI/LO and EVEN/ODD cases. > I think we want to share as much of the infrastructure to eventually > support targets doing even/odd (I guess all VLA vector targets will > be even/odd?). Can't speak for all, but SVE2 certainly is. > DEF_INTERNAL_OPTAB_HILO_FN also looks to be implicitely directed to > widening operations (otherwise no signed/unsigned variants would be > necessary). What I don't understand is why we need an optab > without _hi/_lo but in that case no signed/unsigned variant? > > Looks like all plus, plus_lo and plus_hi are commutative but > only plus is widening?! So is the setup that the vectorizer > doesn't know about the split and uses 'plus' but then the > expander performs the split? It does look a bit awkward here > (the plain 'plus' is just used for the scalar case during > pattern recog it seems). > > I'd rather have DEF_INTERNAL_OPTAB_HILO_FN split up, declaring > the hi/lo pairs and the scalar variant separately using > DEF_INTERNAL_FN without expander for that, and having > DEF_INTERNAL_HILO_WIDEN_OPTAB_FN and DEF_INTERNAL_EVENODD_WIDEN_OPTAB_FN > for the signed/unsigned pairs? (if we need that helper at all) > > Targets shouldn't need to implement the plain optab (it shouldn't > exist) and the vectorizer should query the hi/lo or even/odd > optabs for support instead. 
I dread these kinds of review because I think I'm almost certain to flatly contradict something I said last time round, but +1 FWIW. It seems OK to define an ifn to represent the combined effect, for the scalar case, but that shouldn't leak into optabs unless we actually want to use the ifn for "real" scalar ops (as opposed to a temporary placeholder during pattern recognition). On the optabs/ifn bits: > +static int > +ifn_cmp (const void *a_, const void *b_) > +{ > + typedef std::pair<enum internal_fn, unsigned> ifn_pair; > + auto *a = (const std::pair<ifn_pair, optab> *)a_; > + auto *b = (const std::pair<ifn_pair, optab> *)b_; > + return (int) (a->first.first) - (b->first.first); > +} > + > +/* Return the optab belonging to the given internal function NAME for the given > + SIGN or unknown_optab. */ > + > +optab > +lookup_hilo_ifn_optab (enum internal_fn fn, unsigned sign) There is no NAME parameter. It also isn't clear what SIGN means: is 1 for unsigned or signed? Would be better to use signop and TYPE_SIGN IMO. > +{ > + typedef std::pair<enum internal_fn, unsigned> ifn_pair; > + typedef auto_vec <std::pair<ifn_pair, optab>>fn_to_optab_map_type; > + static fn_to_optab_map_type *fn_to_optab_map; > + > + if (!fn_to_optab_map) > + { > + unsigned num > + = sizeof (internal_fn_hilo_keys_array) / sizeof (enum internal_fn); > + fn_to_optab_map = new fn_to_optab_map_type (); > + for (unsigned int i = 0; i < num - 1; ++i) > + { > + enum internal_fn fn = internal_fn_hilo_keys_array[i]; > + optab v1 = internal_fn_hilo_values_array[2*i]; > + optab v2 = internal_fn_hilo_values_array[2*i + 1]; > + ifn_pair key1 (fn, 0); > + fn_to_optab_map->safe_push ({key1, v1}); > + ifn_pair key2 (fn, 1); > + fn_to_optab_map->safe_push ({key2, v2}); > + } > + fn_to_optab_map->qsort (ifn_cmp); > + } > + > + ifn_pair new_pair (fn, sign ? 
1 : 0); > + optab tmp; > + std::pair<ifn_pair,optab> pair_wrap (new_pair, tmp); > + auto entry = fn_to_optab_map->bsearch (&pair_wrap, ifn_cmp); > + return entry != fn_to_optab_map->end () ? entry->second : unknown_optab; > +} > + Do we need to use a map for this? It seems like it follows mechanically from the macro definition and could be handled using a switch statement and preprocessor logic. Also, it would be good to make direct_internal_fn_optab DTRT for this case, rather than needing a separate function. > +extern void > +lookup_hilo_internal_fn (enum internal_fn ifn, enum internal_fn *lo, > + enum internal_fn *hi) > +{ > + gcc_assert (decomposes_to_hilo_fn_p (ifn)); > + > + *lo = internal_fn (ifn + 1); > + *hi = internal_fn (ifn + 2); > +} Nit: spurious extern. Function needs a comment. There have been requests to drop redundant "enum" keywords from new code. > +/* Return true if FN decomposes to _hi and _lo IFN. If true this should also > + be a widening function. */ > + > +bool > +decomposes_to_hilo_fn_p (internal_fn fn) > +{ > + if (!widening_fn_p (fn)) > + return false; > + > + switch (fn) > + { > + case IFN_VEC_WIDEN_PLUS: > + case IFN_VEC_WIDEN_MINUS: > + return true; > + > + default: > + return false; > + } > +} > + Similarly here I think we should use the preprocessor. It isn't clear why this returns false for !widening_fn_p. Narrowing hi/lo functions would decompose in a similar way. As a general comment, how about naming the new macro: DEF_INTERNAL_SIGNED_HILO_OPTAB_FN and make it invoke DEF_INTERNAL_SIGNED_OPTAB_FN twice, once for the hi and once for the lo? The new optabs need to be documented in md.texi. I think it'd be better to drop the "l" suffix in "addl" and "subl", since that's an Arm convention and is redundant with the earlier "widen". Sorry for the nitpicks and thanks for picking up this work. Richard ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 2/3] Refactor widen_plus as internal_fn 2023-05-03 19:07 ` Richard Sandiford @ 2023-05-12 12:16 ` Andre Vieira (lists) 2023-05-12 13:28 ` Richard Biener 0 siblings, 1 reply; 53+ messages in thread From: Andre Vieira (lists) @ 2023-05-12 12:16 UTC (permalink / raw) To: Richard Biener, Richard Biener, gcc-patches, richard.sandiford [-- Attachment #1: Type: text/plain, Size: 4743 bytes --] I have dealt with, I think..., most of your comments. There's quite a few changes, I think it's all a bit simpler now. I made some other changes to the costing in tree-inline.cc and gimple-range-op.cc in which I try to preserve the same behaviour as we had with the tree codes before. Also added some extra checks to tree-cfg.cc that made sense to me. I am still regression testing the gimple-range-op change, as that was a last minute change, but the rest survived a bootstrap and regression test on aarch64-unknown-linux-gnu. cover letter: This patch replaces the existing tree_code widen_plus and widen_minus patterns with internal_fn versions. DEF_INTERNAL_OPTAB_WIDENING_HILO_FN and DEF_INTERNAL_OPTAB_NARROWING_HILO_FN are like DEF_INTERNAL_SIGNED_OPTAB_FN and DEF_INTERNAL_OPTAB_FN respectively except they provide convenience wrappers for defining conversions that require a hi/lo split. Each definition for <NAME> will require optabs for _hi and _lo and each of those will also require a signed and unsigned version in the case of widening. The hi/lo pair is necessary because the widening and narrowing operations take n narrow elements as inputs and return n/2 wide elements as outputs. The 'lo' operation operates on the first n/2 elements of input. The 'hi' operation operates on the second n/2 elements of input. Defining an internal_fn along with hi/lo variations allows a single internal function to be returned from a vect_recog function that will later be expanded to hi/lo. 
For example: IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO for aarch64: IFN_VEC_WIDEN_PLUS_HI -> vec_widen_<su>add_hi_<mode> -> (u/s)addl2 IFN_VEC_WIDEN_PLUS_LO -> vec_widen_<su>add_lo_<mode> -> (u/s)addl This gives the same functionality as the previous WIDEN_PLUS/WIDEN_MINUS tree codes which are expanded into VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI. gcc/ChangeLog: 2023-05-12 Andre Vieira <andre.simoesdiasvieira@arm.com> Joel Hutton <joel.hutton@arm.com> Tamar Christina <tamar.christina@arm.com> * config/aarch64/aarch64-simd.md (vec_widen_<su>addl_lo_<mode>): Rename this ... (vec_widen_<su>add_lo_<mode>): ... to this. (vec_widen_<su>addl_hi_<mode>): Rename this ... (vec_widen_<su>add_hi_<mode>): ... to this. (vec_widen_<su>subl_lo_<mode>): Rename this ... (vec_widen_<su>sub_lo_<mode>): ... to this. (vec_widen_<su>subl_hi_<mode>): Rename this ... (vec_widen_<su>sub_hi_<mode>): ...to this. * doc/generic.texi: Document new IFN codes. * internal-fn.cc (DEF_INTERNAL_OPTAB_WIDENING_HILO_FN): Macro to define an internal_fn that expands into multiple internal_fns for widening. (DEF_INTERNAL_OPTAB_NARROWING_HILO_FN): Likewise but for narrowing. (ifn_cmp): Function to compare ifn's for sorting/searching. (lookup_hilo_internal_fn): Add lookup function. (commutative_binary_fn_p): Add widen_plus fn's. (widening_fn_p): New function. (narrowing_fn_p): New function. (decomposes_to_hilo_fn_p): New function. (direct_internal_fn_optab): Change visibility. * internal-fn.def (DEF_INTERNAL_OPTAB_WIDENING_HILO_FN): Define widening plus,minus functions. (VEC_WIDEN_PLUS): Replacement for VEC_WIDEN_PLUS_EXPR tree code. (VEC_WIDEN_MINUS): Replacement for VEC_WIDEN_MINUS_EXPR tree code. * internal-fn.h (GCC_INTERNAL_FN_H): Add headers. (direct_internal_fn_optab): Declare new prototype. (lookup_hilo_internal_fn): Likewise. (widening_fn_p): Likewise. (Narrowing_fn_p): Likewise. (decomposes_to_hilo_fn_p): Likewise. * optabs.cc (commutative_optab_p): Add widening plus optabs. 
* optabs.def (OPTAB_D): Define widen add, sub optabs. * tree-cfg.cc (verify_gimple_call): Add checks for new widen add and sub IFNs. * tree-inline.cc (estimate_num_insns): Return same cost for widen add and sub IFNs as previous tree_codes. * tree-vect-patterns.cc (vect_recog_widen_op_pattern): Support patterns with a hi/lo split. (vect_recog_sad_pattern): Refactor to use new IFN codes. (vect_recog_widen_plus_pattern): Likewise. (vect_recog_widen_minus_pattern): Likewise. (vect_recog_average_pattern): Likewise. * tree-vect-stmts.cc (vectorizable_conversion): Add support for _HILO IFNs. (supportable_widening_operation): Likewise. * tree.def (WIDEN_SUM_EXPR): Update example to use new IFNs. gcc/testsuite/ChangeLog: * gcc.target/aarch64/vect-widen-add.c: Test that new IFN_VEC_WIDEN_PLUS is being used. * gcc.target/aarch64/vect-widen-sub.c: Test that new IFN_VEC_WIDEN_MINUS is being used. [-- Attachment #2: ifn1v3.patch --] [-- Type: text/plain, Size: 34240 bytes --] diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index bfc98a8d943467b33390defab9682f44efab5907..ffbbecb9409e1c2835d658c2a8855cd0e955c0f2 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -4626,7 +4626,7 @@ [(set_attr "type" "neon_<ADDSUB:optab>_long")] ) -(define_expand "vec_widen_<su>addl_lo_<mode>" +(define_expand "vec_widen_<su>add_lo_<mode>" [(match_operand:<VWIDE> 0 "register_operand") (ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand")) (ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))] @@ -4638,7 +4638,7 @@ DONE; }) -(define_expand "vec_widen_<su>addl_hi_<mode>" +(define_expand "vec_widen_<su>add_hi_<mode>" [(match_operand:<VWIDE> 0 "register_operand") (ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand")) (ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))] @@ -4650,7 +4650,7 @@ DONE; }) -(define_expand "vec_widen_<su>subl_lo_<mode>" +(define_expand "vec_widen_<su>sub_lo_<mode>" 
[(match_operand:<VWIDE> 0 "register_operand") (ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand")) (ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))] @@ -4662,7 +4662,7 @@ DONE; }) -(define_expand "vec_widen_<su>subl_hi_<mode>" +(define_expand "vec_widen_<su>sub_hi_<mode>" [(match_operand:<VWIDE> 0 "register_operand") (ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand")) (ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))] diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi index 8b2882da4fe7da07d22b4e5384d049ba7d3907bf..0fd7e6cce8bbd4ecb8027b702722adcf6c32eb55 100644 --- a/gcc/doc/generic.texi +++ b/gcc/doc/generic.texi @@ -1811,6 +1811,10 @@ a value from @code{enum annot_expr_kind}, the third is an @code{INTEGER_CST}. @tindex VEC_RSHIFT_EXPR @tindex VEC_WIDEN_MULT_HI_EXPR @tindex VEC_WIDEN_MULT_LO_EXPR +@tindex IFN_VEC_WIDEN_PLUS_HI +@tindex IFN_VEC_WIDEN_PLUS_LO +@tindex IFN_VEC_WIDEN_MINUS_HI +@tindex IFN_VEC_WIDEN_MINUS_LO @tindex VEC_WIDEN_PLUS_HI_EXPR @tindex VEC_WIDEN_PLUS_LO_EXPR @tindex VEC_WIDEN_MINUS_HI_EXPR @@ -1861,6 +1865,33 @@ vector of @code{N/2} products. In the case of @code{VEC_WIDEN_MULT_LO_EXPR} the low @code{N/2} elements of the two vector are multiplied to produce the vector of @code{N/2} products. +@item IFN_VEC_WIDEN_PLUS_HI +@itemx IFN_VEC_WIDEN_PLUS_LO +These internal functions represent widening vector addition of the high and low +parts of the two input vectors, respectively. Their operands are vectors that +contain the same number of elements (@code{N}) of the same integral type. The +result is a vector that contains half as many elements, of an integral type +whose size is twice as wide. In the case of @code{IFN_VEC_WIDEN_PLUS_HI} the +high @code{N/2} elements of the two vectors are added to produce the vector of +@code{N/2} products. In the case of @code{IFN_VEC_WIDEN_PLUS_LO} the low +@code{N/2} elements of the two vectors are added to produce the vector of +@code{N/2} products. 
+ +@item IFN_VEC_WIDEN_MINUS_HI +@itemx IFN_VEC_WIDEN_MINUS_LO +These internal functions represent widening vector subtraction of the high and +low parts of the two input vectors, respectively. Their operands are vectors +that contain the same number of elements (@code{N}) of the same integral type. +The high/low elements of the second vector are subtracted from the high/low +elements of the first. The result is a vector that contains half as many +elements, of an integral type whose size is twice as wide. In the case of +@code{IFN_VEC_WIDEN_MINUS_HI} the high @code{N/2} elements of the second +vector are subtracted from the high @code{N/2} of the first to produce the +vector of @code{N/2} products. In the case of +@code{IFN_VEC_WIDEN_MINUS_LO} the low @code{N/2} elements of the second +vector are subtracted from the low @code{N/2} of the first to produce the +vector of @code{N/2} products. + @item VEC_WIDEN_PLUS_HI_EXPR @itemx VEC_WIDEN_PLUS_LO_EXPR These nodes represent widening vector addition of the high and low parts of diff --git a/gcc/gimple-range-op.cc b/gcc/gimple-range-op.cc index 594bd3043f0e944299ddfff219f757ef15a3dd61..66636d82df27626e7911efd0cb8526921b39633f 100644 --- a/gcc/gimple-range-op.cc +++ b/gcc/gimple-range-op.cc @@ -1187,6 +1187,7 @@ gimple_range_op_handler::maybe_non_standard () { range_operator *signed_op = ptr_op_widen_mult_signed; range_operator *unsigned_op = ptr_op_widen_mult_unsigned; + bool signed1, signed2, signed_ret; if (gimple_code (m_stmt) == GIMPLE_ASSIGN) switch (gimple_assign_rhs_code (m_stmt)) { @@ -1202,32 +1203,55 @@ gimple_range_op_handler::maybe_non_standard () m_op1 = gimple_assign_rhs1 (m_stmt); m_op2 = gimple_assign_rhs2 (m_stmt); tree ret = gimple_assign_lhs (m_stmt); - bool signed1 = TYPE_SIGN (TREE_TYPE (m_op1)) == SIGNED; - bool signed2 = TYPE_SIGN (TREE_TYPE (m_op2)) == SIGNED; - bool signed_ret = TYPE_SIGN (TREE_TYPE (ret)) == SIGNED; - - /* Normally these operands should all have the same sign, but - some 
passes and violate this by taking mismatched sign args. At - the moment the only one that's possible is mismatch inputs and - unsigned output. Once ranger supports signs for the operands we - can properly fix it, for now only accept the case we can do - correctly. */ - if ((signed1 ^ signed2) && signed_ret) - return; - - m_valid = true; - if (signed2 && !signed1) - std::swap (m_op1, m_op2); - - if (signed1 || signed2) - m_int = signed_op; - else - m_int = unsigned_op; + signed1 = TYPE_SIGN (TREE_TYPE (m_op1)) == SIGNED; + signed2 = TYPE_SIGN (TREE_TYPE (m_op2)) == SIGNED; + signed_ret = TYPE_SIGN (TREE_TYPE (ret)) == SIGNED; break; } default: - break; + return; } + else if (gimple_code (m_stmt) == GIMPLE_CALL + && gimple_call_internal_p (m_stmt) + && gimple_get_lhs (m_stmt) != NULL_TREE) + switch (gimple_call_internal_fn (m_stmt)) + { + case IFN_VEC_WIDEN_PLUS_LO: + case IFN_VEC_WIDEN_PLUS_HI: + { + signed_op = ptr_op_widen_plus_signed; + unsigned_op = ptr_op_widen_plus_unsigned; + m_valid = false; + m_op1 = gimple_call_arg (m_stmt, 0); + m_op2 = gimple_call_arg (m_stmt, 1); + tree ret = gimple_get_lhs (m_stmt); + signed1 = TYPE_SIGN (TREE_TYPE (m_op1)) == SIGNED; + signed2 = TYPE_SIGN (TREE_TYPE (m_op2)) == SIGNED; + signed_ret = TYPE_SIGN (TREE_TYPE (ret)) == SIGNED; + break; + } + default: + return; + } + else + return; + + /* Normally these operands should all have the same sign, but some passes + and violate this by taking mismatched sign args. At the moment the only + one that's possible is mismatch inputs and unsigned output. Once ranger + supports signs for the operands we can properly fix it, for now only + accept the case we can do correctly. 
*/ + if ((signed1 ^ signed2) && signed_ret) + return; + + m_valid = true; + if (signed2 && !signed1) + std::swap (m_op1, m_op2); + + if (signed1 || signed2) + m_int = signed_op; + else + m_int = unsigned_op; } // Set up a gimple_range_op_handler for any built in function which can be diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc index 5c9da73ea11f8060b18dcf513599c9694fa4f2ad..1acea5ae33046b70de247b1688aea874d9956abc 100644 --- a/gcc/internal-fn.cc +++ b/gcc/internal-fn.cc @@ -90,6 +90,19 @@ lookup_internal_fn (const char *name) return entry ? *entry : IFN_LAST; } +/* Given an internal_fn IFN that is a HILO function, return its corresponding + LO and HI internal_fns. */ + +extern void +lookup_hilo_internal_fn (internal_fn ifn, internal_fn *lo, internal_fn *hi) +{ + gcc_assert (decomposes_to_hilo_fn_p (ifn)); + + *lo = internal_fn (ifn + 1); + *hi = internal_fn (ifn + 2); +} + + /* Fnspec of each internal function, indexed by function number. */ const_tree internal_fn_fnspec_array[IFN_LAST + 1]; @@ -137,7 +150,16 @@ const direct_internal_fn_info direct_internal_fn_array[IFN_LAST + 1] = { #define DEF_INTERNAL_OPTAB_FN(CODE, FLAGS, OPTAB, TYPE) TYPE##_direct, #define DEF_INTERNAL_SIGNED_OPTAB_FN(CODE, FLAGS, SELECTOR, SIGNED_OPTAB, \ UNSIGNED_OPTAB, TYPE) TYPE##_direct, +#undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN +#undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN +#define DEF_INTERNAL_OPTAB_WIDENING_HILO_FN(CODE, FLAGS, SELECTOR, SIGNED_OPTAB, \ + UNSIGNED_OPTAB, TYPE) \ +TYPE##_direct, TYPE##_direct, TYPE##_direct, +#define DEF_INTERNAL_OPTAB_NARROWING_HILO_FN(CODE, FLAGS, OPTAB, TYPE) \ +TYPE##_direct, TYPE##_direct, TYPE##_direct, #include "internal-fn.def" +#undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN +#undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN not_direct }; @@ -3852,7 +3874,7 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types, /* Return the optab used by internal function FN. 
*/ -static optab +optab direct_internal_fn_optab (internal_fn fn, tree_pair types) { switch (fn) @@ -3971,6 +3993,9 @@ commutative_binary_fn_p (internal_fn fn) case IFN_UBSAN_CHECK_MUL: case IFN_ADD_OVERFLOW: case IFN_MUL_OVERFLOW: + case IFN_VEC_WIDEN_PLUS_HILO: + case IFN_VEC_WIDEN_PLUS_LO: + case IFN_VEC_WIDEN_PLUS_HI: return true; default: @@ -4044,6 +4069,88 @@ first_commutative_argument (internal_fn fn) } } +/* Return true if this CODE describes an internal_fn that returns a vector with + elements twice as wide as the element size of the input vectors. */ + +bool +widening_fn_p (code_helper code) +{ + if (!code.is_fn_code ()) + return false; + + if (!internal_fn_p ((combined_fn) code)) + return false; + + internal_fn fn = as_internal_fn ((combined_fn) code); + switch (fn) + { + #undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN + #define DEF_INTERNAL_OPTAB_WIDENING_HILO_FN(NAME, F, S, SO, UO, T) \ + case IFN_##NAME##_HILO:\ + case IFN_##NAME##_HI: \ + case IFN_##NAME##_LO: \ + return true; + #include "internal-fn.def" + #undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN + + default: + return false; + } +} + +/* Return true if this CODE describes an internal_fn that returns a vector with + elements twice as narrow as the element size of the input vectors. */ + +bool +narrowing_fn_p (code_helper code) +{ + if (!code.is_fn_code ()) + return false; + + if (!internal_fn_p ((combined_fn) code)) + return false; + + internal_fn fn = as_internal_fn ((combined_fn) code); + switch (fn) + { + #undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN + #define DEF_INTERNAL_OPTAB_NARROWING_HILO_FN(NAME, F, O, T) \ + case IFN_##NAME##_HILO:\ + case IFN_##NAME##_HI: \ + case IFN_##NAME##_LO: \ + return true; + #include "internal-fn.def" + #undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN + + default: + return false; + } +} + +/* Return true if FN decomposes to _hi and _lo IFN. 
*/ + +bool +decomposes_to_hilo_fn_p (internal_fn fn) +{ + switch (fn) + { + #undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN + #define DEF_INTERNAL_OPTAB_WIDENING_HILO_FN(NAME, F, S, SO, UO, T) \ + case IFN_##NAME##_HILO:\ + return true; + #undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN + #define DEF_INTERNAL_OPTAB_NARROWING_HILO_FN(NAME, F, O, T) \ + case IFN_##NAME##_HILO:\ + return true; + #include "internal-fn.def" + #undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN + #undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN + + default: + return false; + } +} + /* Return true if IFN_SET_EDOM is supported. */ bool @@ -4071,7 +4178,33 @@ set_edom_supported_p (void) optab which_optab = direct_internal_fn_optab (fn, types); \ expand_##TYPE##_optab_fn (fn, stmt, which_optab); \ } +#define DEF_INTERNAL_OPTAB_WIDENING_HILO_FN(CODE, FLAGS, SELECTOR, \ + SIGNED_OPTAB, UNSIGNED_OPTAB, \ + TYPE) \ + static void \ + expand_##CODE##_HILO (internal_fn fn ATTRIBUTE_UNUSED, \ + gcall *stmt ATTRIBUTE_UNUSED) \ + { \ + gcc_unreachable (); \ + } \ + DEF_INTERNAL_SIGNED_OPTAB_FN(CODE##_HI, FLAGS, SELECTOR, SIGNED_OPTAB, \ + UNSIGNED_OPTAB, TYPE) \ + DEF_INTERNAL_SIGNED_OPTAB_FN(CODE##_LO, FLAGS, SELECTOR, SIGNED_OPTAB, \ + UNSIGNED_OPTAB, TYPE) +#define DEF_INTERNAL_OPTAB_NARROWING_HILO_FN(CODE, FLAGS, OPTAB, TYPE) \ + static void \ + expand_##CODE##_HILO (internal_fn fn ATTRIBUTE_UNUSED, \ + gcall *stmt ATTRIBUTE_UNUSED) \ + { \ + gcc_unreachable (); \ + } \ + DEF_INTERNAL_OPTAB_FN(CODE##_LO, FLAGS, OPTAB, TYPE) \ + DEF_INTERNAL_OPTAB_FN(CODE##_HI, FLAGS, OPTAB, TYPE) #include "internal-fn.def" +#undef DEF_INTERNAL_OPTAB_FN +#undef DEF_INTERNAL_SIGNED_OPTAB_FN +#undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN +#undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN /* Routines to expand each internal function, indexed by function number. Each routine has the prototype: @@ -4080,6 +4213,7 @@ set_edom_supported_p (void) where STMT is the statement that performs the call. 
*/ static void (*const internal_fn_expanders[]) (internal_fn, gcall *) = { + #define DEF_INTERNAL_FN(CODE, FLAGS, FNSPEC) expand_##CODE, #include "internal-fn.def" 0 diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def index 7fe742c2ae713e7152ab05cfdfba86e4e0aa3456..012dd323b86dd7cfcc5c13d3a2bb2a453937155d 100644 --- a/gcc/internal-fn.def +++ b/gcc/internal-fn.def @@ -85,6 +85,13 @@ along with GCC; see the file COPYING3. If not see says that the function extends the C-level BUILT_IN_<NAME>{,L,LL,IMAX} group of functions to any integral mode (including vector modes). + DEF_INTERNAL_SIGNED_OPTAB_HILO_FN is like DEF_INTERNAL_OPTAB_FN except it + provides convenience wrappers for defining conversions that require a + hi/lo split, like widening and narrowing operations. Each definition + for <NAME> will require an optab named <OPTAB> and two other optabs that + you specify for signed and unsigned. + + Each entry must have a corresponding expander of the form: void expand_NAME (gimple_call stmt) @@ -123,6 +130,20 @@ along with GCC; see the file COPYING3. 
If not see DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE) #endif +#ifndef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN +#define DEF_INTERNAL_OPTAB_WIDENING_HILO_FN(NAME, FLAGS, SELECTOR, SOPTAB, UOPTAB, TYPE) \ + DEF_INTERNAL_FN (NAME##_HILO, FLAGS | ECF_LEAF, NULL) \ + DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _LO, FLAGS, SELECTOR, SOPTAB##_lo, UOPTAB##_lo, TYPE) \ + DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _HI, FLAGS, SELECTOR, SOPTAB##_hi, UOPTAB##_hi, TYPE) +#endif + +#ifndef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN +#define DEF_INTERNAL_OPTAB_NARROWING_HILO_FN(NAME, FLAGS, OPTAB, TYPE) \ + DEF_INTERNAL_FN (NAME##_HILO, FLAGS | ECF_LEAF, NULL) \ + DEF_INTERNAL_OPTAB_FN (NAME ## _LO, FLAGS, OPTAB##_lo, TYPE) \ + DEF_INTERNAL_OPTAB_FN (NAME ## _HI, FLAGS, OPTAB##_hi, TYPE) +#endif + DEF_INTERNAL_OPTAB_FN (MASK_LOAD, ECF_PURE, maskload, mask_load) DEF_INTERNAL_OPTAB_FN (LOAD_LANES, ECF_CONST, vec_load_lanes, load_lanes) DEF_INTERNAL_OPTAB_FN (MASK_LOAD_LANES, ECF_PURE, @@ -315,6 +336,16 @@ DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT270, ECF_CONST, cadd270, binary) DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL, ECF_CONST, cmul, binary) DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL_CONJ, ECF_CONST, cmul_conj, binary) DEF_INTERNAL_OPTAB_FN (VEC_ADDSUB, ECF_CONST, vec_addsub, binary) +DEF_INTERNAL_OPTAB_WIDENING_HILO_FN (VEC_WIDEN_PLUS, + ECF_CONST | ECF_NOTHROW, + first, + vec_widen_sadd, vec_widen_uadd, + binary) +DEF_INTERNAL_OPTAB_WIDENING_HILO_FN (VEC_WIDEN_MINUS, + ECF_CONST | ECF_NOTHROW, + first, + vec_widen_ssub, vec_widen_usub, + binary) DEF_INTERNAL_OPTAB_FN (VEC_FMADDSUB, ECF_CONST, vec_fmaddsub, ternary) DEF_INTERNAL_OPTAB_FN (VEC_FMSUBADD, ECF_CONST, vec_fmsubadd, ternary) diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h index 08922ed4254898f5fffca3f33973e96ed9ce772f..8ba07d6d1338e75bc5a451d9e403112a608f3ea2 100644 --- a/gcc/internal-fn.h +++ b/gcc/internal-fn.h @@ -20,6 +20,10 @@ along with GCC; see the file COPYING3. 
If not see #ifndef GCC_INTERNAL_FN_H #define GCC_INTERNAL_FN_H +#include "insn-codes.h" +#include "insn-opinit.h" + + /* INTEGER_CST values for IFN_UNIQUE function arg-0. UNSPEC: Undifferentiated UNIQUE. @@ -112,6 +116,8 @@ internal_fn_name (enum internal_fn fn) } extern internal_fn lookup_internal_fn (const char *); +extern void lookup_hilo_internal_fn (internal_fn, internal_fn *, internal_fn *); +extern optab direct_internal_fn_optab (internal_fn, tree_pair); /* Return the ECF_* flags for function FN. */ @@ -210,6 +216,9 @@ extern bool commutative_binary_fn_p (internal_fn); extern bool commutative_ternary_fn_p (internal_fn); extern int first_commutative_argument (internal_fn); extern bool associative_binary_fn_p (internal_fn); +extern bool widening_fn_p (code_helper); +extern bool narrowing_fn_p (code_helper); +extern bool decomposes_to_hilo_fn_p (internal_fn); extern bool set_edom_supported_p (void); diff --git a/gcc/optabs.cc b/gcc/optabs.cc index c8e39c82d57a7d726e7da33d247b80f32ec9236c..5a08d91e550b2d92e9572211f811fdba99a33a38 100644 --- a/gcc/optabs.cc +++ b/gcc/optabs.cc @@ -1314,7 +1314,15 @@ commutative_optab_p (optab binoptab) || binoptab == smul_widen_optab || binoptab == umul_widen_optab || binoptab == smul_highpart_optab - || binoptab == umul_highpart_optab); + || binoptab == umul_highpart_optab + || binoptab == vec_widen_saddl_hi_optab + || binoptab == vec_widen_saddl_lo_optab + || binoptab == vec_widen_uaddl_hi_optab + || binoptab == vec_widen_uaddl_lo_optab + || binoptab == vec_widen_sadd_hi_optab + || binoptab == vec_widen_sadd_lo_optab + || binoptab == vec_widen_uadd_hi_optab + || binoptab == vec_widen_uadd_lo_optab); } /* X is to be used in mode MODE as operand OPN to BINOPTAB. 
If we're diff --git a/gcc/optabs.def b/gcc/optabs.def index 695f5911b300c9ca5737de9be809fa01aabe5e01..16d121722c8c5723d9b164f5a2c616dc7ec143de 100644 --- a/gcc/optabs.def +++ b/gcc/optabs.def @@ -410,6 +410,10 @@ OPTAB_D (vec_widen_ssubl_hi_optab, "vec_widen_ssubl_hi_$a") OPTAB_D (vec_widen_ssubl_lo_optab, "vec_widen_ssubl_lo_$a") OPTAB_D (vec_widen_saddl_hi_optab, "vec_widen_saddl_hi_$a") OPTAB_D (vec_widen_saddl_lo_optab, "vec_widen_saddl_lo_$a") +OPTAB_D (vec_widen_ssub_hi_optab, "vec_widen_ssub_hi_$a") +OPTAB_D (vec_widen_ssub_lo_optab, "vec_widen_ssub_lo_$a") +OPTAB_D (vec_widen_sadd_hi_optab, "vec_widen_sadd_hi_$a") +OPTAB_D (vec_widen_sadd_lo_optab, "vec_widen_sadd_lo_$a") OPTAB_D (vec_widen_sshiftl_hi_optab, "vec_widen_sshiftl_hi_$a") OPTAB_D (vec_widen_sshiftl_lo_optab, "vec_widen_sshiftl_lo_$a") OPTAB_D (vec_widen_umult_even_optab, "vec_widen_umult_even_$a") @@ -422,6 +426,10 @@ OPTAB_D (vec_widen_usubl_hi_optab, "vec_widen_usubl_hi_$a") OPTAB_D (vec_widen_usubl_lo_optab, "vec_widen_usubl_lo_$a") OPTAB_D (vec_widen_uaddl_hi_optab, "vec_widen_uaddl_hi_$a") OPTAB_D (vec_widen_uaddl_lo_optab, "vec_widen_uaddl_lo_$a") +OPTAB_D (vec_widen_usub_hi_optab, "vec_widen_usub_hi_$a") +OPTAB_D (vec_widen_usub_lo_optab, "vec_widen_usub_lo_$a") +OPTAB_D (vec_widen_uadd_hi_optab, "vec_widen_uadd_hi_$a") +OPTAB_D (vec_widen_uadd_lo_optab, "vec_widen_uadd_lo_$a") OPTAB_D (vec_addsub_optab, "vec_addsub$a3") OPTAB_D (vec_fmaddsub_optab, "vec_fmaddsub$a4") OPTAB_D (vec_fmsubadd_optab, "vec_fmsubadd$a4") diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c index 220bd9352a4c7acd2e3713e441d74898d3e92b30..7037673d32bd780e1c9b58a51e58e2bac3b30b7e 100644 --- a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c +++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c @@ -1,5 +1,5 @@ /* { dg-do run } */ -/* { dg-options "-O3 -save-temps" } */ +/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */ #include <stdint.h> 
#include <string.h> @@ -86,6 +86,8 @@ main() return 0; } +/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_LO" "vect" } } */ +/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_HI" "vect" } } */ /* { dg-final { scan-assembler-times {\tuaddl\t} 1} } */ /* { dg-final { scan-assembler-times {\tuaddl2\t} 1} } */ /* { dg-final { scan-assembler-times {\tsaddl\t} 1} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c index a2bed63affbd091977df95a126da1f5b8c1d41d2..83bc1edb6105f47114b665e24a13e6194b2179a2 100644 --- a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c +++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c @@ -1,5 +1,5 @@ /* { dg-do run } */ -/* { dg-options "-O3 -save-temps" } */ +/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */ #include <stdint.h> #include <string.h> @@ -86,6 +86,8 @@ main() return 0; } +/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_LO" "vect" } } */ +/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_HI" "vect" } } */ /* { dg-final { scan-assembler-times {\tusubl\t} 1} } */ /* { dg-final { scan-assembler-times {\tusubl2\t} 1} } */ /* { dg-final { scan-assembler-times {\tssubl\t} 1} } */ diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc index 0aeebb67fac864db284985f4a6f0653af281d62b..28464ad9e3a7ea25557ffebcdbdbc1340f9e0d8b 100644 --- a/gcc/tree-cfg.cc +++ b/gcc/tree-cfg.cc @@ -65,6 +65,7 @@ along with GCC; see the file COPYING3. If not see #include "asan.h" #include "profile.h" #include "sreal.h" +#include "internal-fn.h" /* This file contains functions for building the Control Flow Graph (CFG) for a function tree. 
*/ @@ -3411,6 +3412,52 @@ verify_gimple_call (gcall *stmt) debug_generic_stmt (fn); return true; } + internal_fn ifn = gimple_call_internal_fn (stmt); + if (ifn == IFN_LAST) + { + error ("gimple call has an invalid IFN"); + debug_generic_stmt (fn); + return true; + } + else if (decomposes_to_hilo_fn_p (ifn)) + { + /* Non decomposed HILO stmts should not appear in IL, these are + merely used as an internal representation to the auto-vectorizer + pass and should have been expanded to their _LO _HI variants. */ + error ("gimple call has an non decomposed HILO IFN"); + debug_generic_stmt (fn); + return true; + } + else if (ifn == IFN_VEC_WIDEN_PLUS_LO + || ifn == IFN_VEC_WIDEN_PLUS_HI + || ifn == IFN_VEC_WIDEN_MINUS_LO + || ifn == IFN_VEC_WIDEN_MINUS_HI) + { + tree rhs1_type = TREE_TYPE (gimple_call_arg (stmt, 0)); + tree rhs2_type = TREE_TYPE (gimple_call_arg (stmt, 1)); + tree lhs_type = TREE_TYPE (gimple_get_lhs (stmt)); + if (TREE_CODE (lhs_type) == VECTOR_TYPE) + { + if (TREE_CODE (rhs1_type) != VECTOR_TYPE + || TREE_CODE (rhs2_type) != VECTOR_TYPE) + { + error ("invalid non-vector operands in vector IFN call"); + debug_generic_stmt (fn); + return true; + } + lhs_type = TREE_TYPE (lhs_type); + rhs1_type = TREE_TYPE (rhs1_type); + rhs2_type = TREE_TYPE (rhs2_type); + } + if (POINTER_TYPE_P (lhs_type) + || POINTER_TYPE_P (rhs1_type) + || POINTER_TYPE_P (rhs2_type)) + { + error ("invalid (pointer) operands in vector IFN call"); + debug_generic_stmt (fn); + return true; + } + } } else { diff --git a/gcc/tree-inline.cc b/gcc/tree-inline.cc index 63a19f8d1d89c6bd5d8e55a299cbffaa324b4b84..d74d8db2173b1ab117250fea89de5212d5e354ec 100644 --- a/gcc/tree-inline.cc +++ b/gcc/tree-inline.cc @@ -4433,7 +4433,20 @@ estimate_num_insns (gimple *stmt, eni_weights *weights) tree decl; if (gimple_call_internal_p (stmt)) - return 0; + { + internal_fn fn = gimple_call_internal_fn (stmt); + switch (fn) + { + case IFN_VEC_WIDEN_PLUS_HI: + case IFN_VEC_WIDEN_PLUS_LO: + case 
IFN_VEC_WIDEN_MINUS_HI: + case IFN_VEC_WIDEN_MINUS_LO: + return 1; + + default: + return 0; + } + } else if ((decl = gimple_call_fndecl (stmt)) && fndecl_built_in_p (decl)) { diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc index 1778af0242898e3dc73d94d22a5b8505628a53b5..93cebc72beb4f65249a69b2665dfeb8a0991c1d1 100644 --- a/gcc/tree-vect-patterns.cc +++ b/gcc/tree-vect-patterns.cc @@ -562,21 +562,30 @@ vect_joust_widened_type (tree type, tree new_type, tree *common_type) static unsigned int vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code, - tree_code widened_code, bool shift_p, + code_helper widened_code, bool shift_p, unsigned int max_nops, vect_unpromoted_value *unprom, tree *common_type, enum optab_subtype *subtype = NULL) { /* Check for an integer operation with the right code. */ - gassign *assign = dyn_cast <gassign *> (stmt_info->stmt); - if (!assign) + gimple* stmt = stmt_info->stmt; + if (!(is_gimple_assign (stmt) || is_gimple_call (stmt))) + return 0; + + code_helper rhs_code; + if (is_gimple_assign (stmt)) + rhs_code = gimple_assign_rhs_code (stmt); + else if (is_gimple_call (stmt)) + rhs_code = gimple_call_combined_fn (stmt); + else return 0; - tree_code rhs_code = gimple_assign_rhs_code (assign); - if (rhs_code != code && rhs_code != widened_code) + if (rhs_code != code + && rhs_code != widened_code) return 0; - tree type = TREE_TYPE (gimple_assign_lhs (assign)); + tree lhs = gimple_get_lhs (stmt); + tree type = TREE_TYPE (lhs); if (!INTEGRAL_TYPE_P (type)) return 0; @@ -589,7 +598,7 @@ vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code, { vect_unpromoted_value *this_unprom = &unprom[next_op]; unsigned int nops = 1; - tree op = gimple_op (assign, i + 1); + tree op = gimple_arg (stmt, i); if (i == 1 && TREE_CODE (op) == INTEGER_CST) { /* We already have a common type from earlier operands. @@ -1343,7 +1352,8 @@ vect_recog_sad_pattern (vec_info *vinfo, /* FORNOW. 
Can continue analyzing the def-use chain when this stmt in a phi inside the loop (in case we are analyzing an outer-loop). */ vect_unpromoted_value unprom[2]; - if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR, WIDEN_MINUS_EXPR, + if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR, + IFN_VEC_WIDEN_MINUS_HILO, false, 2, unprom, &half_type)) return NULL; @@ -1395,14 +1405,16 @@ static gimple * vect_recog_widen_op_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info, tree *type_out, tree_code orig_code, code_helper wide_code, - bool shift_p, const char *name) + bool shift_p, const char *name, + optab_subtype *subtype = NULL) { gimple *last_stmt = last_stmt_info->stmt; vect_unpromoted_value unprom[2]; tree half_type; if (!vect_widened_op_tree (vinfo, last_stmt_info, orig_code, orig_code, - shift_p, 2, unprom, &half_type)) + shift_p, 2, unprom, &half_type, subtype)) + return NULL; /* Pattern detected. */ @@ -1468,6 +1480,20 @@ vect_recog_widen_op_pattern (vec_info *vinfo, type, pattern_stmt, vecctype); } +static gimple * +vect_recog_widen_op_pattern (vec_info *vinfo, + stmt_vec_info last_stmt_info, tree *type_out, + tree_code orig_code, internal_fn wide_ifn, + bool shift_p, const char *name, + optab_subtype *subtype = NULL) +{ + combined_fn ifn = as_combined_fn (wide_ifn); + return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out, + orig_code, ifn, shift_p, name, + subtype); +} + + /* Try to detect multiplication on widened inputs, converting MULT_EXPR to WIDEN_MULT_EXPR. See vect_recog_widen_op_pattern for details. */ @@ -1481,26 +1507,30 @@ vect_recog_widen_mult_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info, } /* Try to detect addition on widened inputs, converting PLUS_EXPR - to WIDEN_PLUS_EXPR. See vect_recog_widen_op_pattern for details. */ + to IFN_VEC_WIDEN_PLUS_HILO. See vect_recog_widen_op_pattern for details. 
*/ static gimple * vect_recog_widen_plus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info, tree *type_out) { + optab_subtype subtype; return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out, - PLUS_EXPR, WIDEN_PLUS_EXPR, false, - "vect_recog_widen_plus_pattern"); + PLUS_EXPR, IFN_VEC_WIDEN_PLUS_HILO, + false, "vect_recog_widen_plus_pattern", + &subtype); } /* Try to detect subtraction on widened inputs, converting MINUS_EXPR - to WIDEN_MINUS_EXPR. See vect_recog_widen_op_pattern for details. */ + to IFN_VEC_WIDEN_MINUS_HILO. See vect_recog_widen_op_pattern for details. */ static gimple * vect_recog_widen_minus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info, tree *type_out) { + optab_subtype subtype; return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out, - MINUS_EXPR, WIDEN_MINUS_EXPR, false, - "vect_recog_widen_minus_pattern"); + MINUS_EXPR, IFN_VEC_WIDEN_MINUS_HILO, + false, "vect_recog_widen_minus_pattern", + &subtype); } /* Function vect_recog_ctz_ffs_pattern @@ -3078,7 +3108,7 @@ vect_recog_average_pattern (vec_info *vinfo, vect_unpromoted_value unprom[3]; tree new_type; unsigned int nops = vect_widened_op_tree (vinfo, plus_stmt_info, PLUS_EXPR, - WIDEN_PLUS_EXPR, false, 3, + IFN_VEC_WIDEN_PLUS_HILO, false, 3, unprom, &new_type); if (nops == 0) return NULL; @@ -6469,6 +6499,7 @@ static vect_recog_func vect_vect_recog_func_ptrs[] = { { vect_recog_mask_conversion_pattern, "mask_conversion" }, { vect_recog_widen_plus_pattern, "widen_plus" }, { vect_recog_widen_minus_pattern, "widen_minus" }, + /* These must come after the double widening ones. 
*/ }; const unsigned int NUM_PATTERNS = ARRAY_SIZE (vect_vect_recog_func_ptrs); diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index d152ae9ab10b361b88c0f839d6951c43b954750a..24c811ebe01fb8b003100dea494cf64fea72a975 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -5038,7 +5038,9 @@ vectorizable_conversion (vec_info *vinfo, bool widen_arith = (code == WIDEN_PLUS_EXPR || code == WIDEN_MINUS_EXPR || code == WIDEN_MULT_EXPR - || code == WIDEN_LSHIFT_EXPR); + || code == WIDEN_LSHIFT_EXPR + || code == IFN_VEC_WIDEN_PLUS_HILO + || code == IFN_VEC_WIDEN_MINUS_HILO); if (!widen_arith && !CONVERT_EXPR_CODE_P (code) @@ -5088,7 +5090,9 @@ vectorizable_conversion (vec_info *vinfo, gcc_assert (code == WIDEN_MULT_EXPR || code == WIDEN_LSHIFT_EXPR || code == WIDEN_PLUS_EXPR - || code == WIDEN_MINUS_EXPR); + || code == WIDEN_MINUS_EXPR + || code == IFN_VEC_WIDEN_PLUS_HILO + || code == IFN_VEC_WIDEN_MINUS_HILO); op1 = is_gimple_assign (stmt) ? gimple_assign_rhs2 (stmt) : @@ -12478,10 +12482,43 @@ supportable_widening_operation (vec_info *vinfo, optab1 = vec_unpacks_sbool_lo_optab; optab2 = vec_unpacks_sbool_hi_optab; } - else + + if (code.is_fn_code ()) + { + internal_fn ifn = as_internal_fn ((combined_fn) code); + gcc_assert (decomposes_to_hilo_fn_p (ifn)); + + internal_fn lo, hi; + lookup_hilo_internal_fn (ifn, &lo, &hi); + *code1 = as_combined_fn (lo); + *code2 = as_combined_fn (hi); + optab1 = direct_internal_fn_optab (lo, {vectype, vectype}); + optab2 = direct_internal_fn_optab (hi, {vectype, vectype}); + } + else if (code.is_tree_code ()) { - optab1 = optab_for_tree_code (c1, vectype, optab_default); - optab2 = optab_for_tree_code (c2, vectype, optab_default); + if (code == FIX_TRUNC_EXPR) + { + /* The signedness is determined from output operand. 
*/ + optab1 = optab_for_tree_code (c1, vectype_out, optab_default); + optab2 = optab_for_tree_code (c2, vectype_out, optab_default); + } + else if (CONVERT_EXPR_CODE_P ((tree_code) code.safe_as_tree_code ()) + && VECTOR_BOOLEAN_TYPE_P (wide_vectype) + && VECTOR_BOOLEAN_TYPE_P (vectype) + && TYPE_MODE (wide_vectype) == TYPE_MODE (vectype) + && SCALAR_INT_MODE_P (TYPE_MODE (vectype))) + { + /* If the input and result modes are the same, a different optab + is needed where we pass in the number of units in vectype. */ + optab1 = vec_unpacks_sbool_lo_optab; + optab2 = vec_unpacks_sbool_hi_optab; + } + else + { + optab1 = optab_for_tree_code (c1, vectype, optab_default); + optab2 = optab_for_tree_code (c2, vectype, optab_default); + } } if (!optab1 || !optab2) diff --git a/gcc/tree.def b/gcc/tree.def index 90ceeec0b512bfa5f983359c0af03cc71de32007..b37b0b35927b92a6536e5c2d9805ffce8319a240 100644 --- a/gcc/tree.def +++ b/gcc/tree.def @@ -1374,15 +1374,16 @@ DEFTREECODE (DOT_PROD_EXPR, "dot_prod_expr", tcc_expression, 3) DEFTREECODE (WIDEN_SUM_EXPR, "widen_sum_expr", tcc_binary, 2) /* Widening sad (sum of absolute differences). - The first two arguments are of type t1 which should be integer. - The third argument and the result are of type t2, such that t2 is at least - twice the size of t1. Like DOT_PROD_EXPR, SAD_EXPR (arg1,arg2,arg3) is + The first two arguments are of type t1 which should be a vector of integers. + The third argument and the result are of type t2, such that the size of + the elements of t2 is at least twice the size of the elements of t1. + Like DOT_PROD_EXPR, SAD_EXPR (arg1,arg2,arg3) is equivalent to: - tmp = WIDEN_MINUS_EXPR (arg1, arg2) + tmp = IFN_VEC_WIDEN_MINUS_EXPR (arg1, arg2) tmp2 = ABS_EXPR (tmp) arg3 = PLUS_EXPR (tmp2, arg3) or: - tmp = WIDEN_MINUS_EXPR (arg1, arg2) + tmp = IFN_VEC_WIDEN_MINUS_EXPR (arg1, arg2) tmp2 = ABS_EXPR (tmp) arg3 = WIDEN_SUM_EXPR (tmp2, arg3) */ ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 2/3] Refactor widen_plus as internal_fn
  2023-05-12 12:16 ` Andre Vieira (lists)
@ 2023-05-12 13:28 ` Richard Biener
  2023-05-12 13:55 ` Andre Vieira (lists)
  2023-05-12 14:01 ` Richard Sandiford
  0 siblings, 2 replies; 53+ messages in thread
From: Richard Biener @ 2023-05-12 13:28 UTC (permalink / raw)
To: Andre Vieira (lists); +Cc: Richard Biener, gcc-patches, richard.sandiford

On Fri, 12 May 2023, Andre Vieira (lists) wrote:

> I have dealt with, I think..., most of your comments. There's quite a few
> changes, I think it's all a bit simpler now. I made some other changes to the
> costing in tree-inline.cc and gimple-range-op.cc in which I try to preserve
> the same behaviour as we had with the tree codes before. Also added some extra
> checks to tree-cfg.cc that made sense to me.
>
> I am still regression testing the gimple-range-op change, as that was a last
> minute change, but the rest survived a bootstrap and regression test on
> aarch64-unknown-linux-gnu.
>
> cover letter:
>
> This patch replaces the existing tree_code widen_plus and widen_minus
> patterns with internal_fn versions.
>
> DEF_INTERNAL_OPTAB_WIDENING_HILO_FN and DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
> are like DEF_INTERNAL_SIGNED_OPTAB_FN and DEF_INTERNAL_OPTAB_FN respectively
> except they provide convenience wrappers for defining conversions that require
> a hi/lo split. Each definition for <NAME> will require optabs for _hi and _lo
> and each of those will also require a signed and unsigned version in the case
> of widening. The hi/lo pair is necessary because the widening and narrowing
> operations take n narrow elements as inputs and return n/2 wide elements as
> outputs. The 'lo' operation operates on the first n/2 elements of input. The
> 'hi' operation operates on the second n/2 elements of input. Defining an
> internal_fn along with hi/lo variations allows a single internal function to
> be returned from a vect_recog function that will later be expanded to hi/lo.
> > > For example: > IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO > for aarch64: IFN_VEC_WIDEN_PLUS_HI -> vec_widen_<su>add_hi_<mode> -> > (u/s)addl2 > IFN_VEC_WIDEN_PLUS_LO -> vec_widen_<su>add_lo_<mode> > -> (u/s)addl > > This gives the same functionality as the previous WIDEN_PLUS/WIDEN_MINUS tree > codes which are expanded into VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI. What I still don't understand is how we are so narrowly focused on HI/LO? We need a combined scalar IFN for pattern selection (not sure why that's now called _HILO, I expected no suffix). Then there's three possibilities the target can implement this: 1) with a widen_[su]add<mode> instruction - I _think_ that's what RISCV is going to offer since it is a target where vector modes have "padding" (aka you cannot subreg a V2SI to get V4HI). Instead RVV can do a V4HI to V4SI widening and widening add/subtract using vwadd[u] and vwsub[u] (the HI->SI widening is actually done with a widening add of zero - eh). IIRC GCN is the same here. 2) with a widen_[su]add{_lo,_hi}<mode> combo - that's what the tree codes currently support (exclusively) 3) similar, but widen_[su]add{_even,_odd}<mode> that said, things like decomposes_to_hilo_fn_p look to paint us into a 2) corner without good reason. Richard. > gcc/ChangeLog: > > 2023-05-12 Andre Vieira <andre.simoesdiasvieira@arm.com> > Joel Hutton <joel.hutton@arm.com> > Tamar Christina <tamar.christina@arm.com> > > * config/aarch64/aarch64-simd.md (vec_widen_<su>addl_lo_<mode>): > Rename > this ... > (vec_widen_<su>add_lo_<mode>): ... to this. > (vec_widen_<su>addl_hi_<mode>): Rename this ... > (vec_widen_<su>add_hi_<mode>): ... to this. > (vec_widen_<su>subl_lo_<mode>): Rename this ... > (vec_widen_<su>sub_lo_<mode>): ... to this. > (vec_widen_<su>subl_hi_<mode>): Rename this ... > (vec_widen_<su>sub_hi_<mode>): ...to this. > * doc/generic.texi: Document new IFN codes. 
> 	* internal-fn.cc (DEF_INTERNAL_OPTAB_WIDENING_HILO_FN): Macro to
> 	define an internal_fn that expands into multiple internal_fns
> 	for widening.
> 	(DEF_INTERNAL_OPTAB_NARROWING_HILO_FN): Likewise but for narrowing.
> 	(ifn_cmp): Function to compare ifn's for sorting/searching.
> 	(lookup_hilo_internal_fn): Add lookup function.
> 	(commutative_binary_fn_p): Add widen_plus fn's.
> 	(widening_fn_p): New function.
> 	(narrowing_fn_p): New function.
> 	(decomposes_to_hilo_fn_p): New function.
> 	(direct_internal_fn_optab): Change visibility.
> 	* internal-fn.def (DEF_INTERNAL_OPTAB_WIDENING_HILO_FN): Define
> 	widening plus,minus functions.
> 	(VEC_WIDEN_PLUS): Replacement for VEC_WIDEN_PLUS_EXPR tree code.
> 	(VEC_WIDEN_MINUS): Replacement for VEC_WIDEN_MINUS_EXPR tree code.
> 	* internal-fn.h (GCC_INTERNAL_FN_H): Add headers.
> 	(direct_internal_fn_optab): Declare new prototype.
> 	(lookup_hilo_internal_fn): Likewise.
> 	(widening_fn_p): Likewise.
> 	(narrowing_fn_p): Likewise.
> 	(decomposes_to_hilo_fn_p): Likewise.
> 	* optabs.cc (commutative_optab_p): Add widening plus optabs.
> 	* optabs.def (OPTAB_D): Define widen add, sub optabs.
> 	* tree-cfg.cc (verify_gimple_call): Add checks for new widen
> 	add and sub IFNs.
> 	* tree-inline.cc (estimate_num_insns): Return same
> 	cost for widen add and sub IFNs as previous tree_codes.
> 	* tree-vect-patterns.cc (vect_recog_widen_op_pattern): Support
> 	patterns with a hi/lo split.
> 	(vect_recog_sad_pattern): Refactor to use new IFN codes.
> 	(vect_recog_widen_plus_pattern): Likewise.
> 	(vect_recog_widen_minus_pattern): Likewise.
> 	(vect_recog_average_pattern): Likewise.
> 	* tree-vect-stmts.cc (vectorizable_conversion): Add support for
> 	_HILO IFNs.
> 	(supportable_widening_operation): Likewise.
> 	* tree.def (WIDEN_SUM_EXPR): Update example to use new IFNs.
>
> gcc/testsuite/ChangeLog:
>
> 	* gcc.target/aarch64/vect-widen-add.c: Test that new
> 	IFN_VEC_WIDEN_PLUS is being used.
> * gcc.target/aarch64/vect-widen-sub.c: Test that new > IFN_VEC_WIDEN_MINUS is being used. > -- Richard Biener <rguenther@suse.de> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman; HRB 36809 (AG Nuernberg) ^ permalink raw reply [flat|nested] 53+ messages in thread
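As a concrete illustration of the hi/lo semantics described in the cover letter above, here is a scalar C model (an illustrative sketch only; the function names are made up and this is not GCC code): each half of a widening add reads n/2 of the n narrow inputs and produces n/2 widened outputs, with 'lo' covering elements [0, n/2) and 'hi' covering [n/2, n).

```c
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

/* Scalar model of the IFN_VEC_WIDEN_PLUS_LO / _HI decomposition:
   'lo' operates on the first n/2 narrow elements, 'hi' on the
   second n/2, each producing n/2 wide results.  */

static void
widen_add_lo (const int16_t *a, const int16_t *b, int32_t *out, size_t n)
{
  for (size_t i = 0; i < n / 2; i++)
    out[i] = (int32_t) a[i] + (int32_t) b[i];
}

static void
widen_add_hi (const int16_t *a, const int16_t *b, int32_t *out, size_t n)
{
  for (size_t i = 0; i < n / 2; i++)
    out[i] = (int32_t) a[n / 2 + i] + (int32_t) b[n / 2 + i];
}
```

Because the result elements are twice as wide, the sum 32767 + 1 yields 32768 rather than wrapping, which is the whole point of the widening operation.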
* Re: [PATCH 2/3] Refactor widen_plus as internal_fn 2023-05-12 13:28 ` Richard Biener @ 2023-05-12 13:55 ` Andre Vieira (lists) 2023-05-12 14:01 ` Richard Sandiford 1 sibling, 0 replies; 53+ messages in thread From: Andre Vieira (lists) @ 2023-05-12 13:55 UTC (permalink / raw) To: Richard Biener; +Cc: Richard Biener, gcc-patches, richard.sandiford On 12/05/2023 14:28, Richard Biener wrote: > On Fri, 12 May 2023, Andre Vieira (lists) wrote: > >> I have dealt with, I think..., most of your comments. There's quite a few >> changes, I think it's all a bit simpler now. I made some other changes to the >> costing in tree-inline.cc and gimple-range-op.cc in which I try to preserve >> the same behaviour as we had with the tree codes before. Also added some extra >> checks to tree-cfg.cc that made sense to me. >> >> I am still regression testing the gimple-range-op change, as that was a last >> minute change, but the rest survived a bootstrap and regression test on >> aarch64-unknown-linux-gnu. >> >> cover letter: >> >> This patch replaces the existing tree_code widen_plus and widen_minus >> patterns with internal_fn versions. >> >> DEF_INTERNAL_OPTAB_WIDENING_HILO_FN and DEF_INTERNAL_OPTAB_NARROWING_HILO_FN >> are like DEF_INTERNAL_SIGNED_OPTAB_FN and DEF_INTERNAL_OPTAB_FN respectively >> except they provide convenience wrappers for defining conversions that require >> a hi/lo split. Each definition for <NAME> will require optabs for _hi and _lo >> and each of those will also require a signed and unsigned version in the case >> of widening. The hi/lo pair is necessary because the widening and narrowing >> operations take n narrow elements as inputs and return n/2 wide elements as >> outputs. The 'lo' operation operates on the first n/2 elements of input. The >> 'hi' operation operates on the second n/2 elements of input. 
Defining an >> internal_fn along with hi/lo variations allows a single internal function to >> be returned from a vect_recog function that will later be expanded to hi/lo. >> >> >> For example: >> IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO >> for aarch64: IFN_VEC_WIDEN_PLUS_HI -> vec_widen_<su>add_hi_<mode> -> >> (u/s)addl2 >> IFN_VEC_WIDEN_PLUS_LO -> vec_widen_<su>add_lo_<mode> >> -> (u/s)addl >> >> This gives the same functionality as the previous WIDEN_PLUS/WIDEN_MINUS tree >> codes which are expanded into VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI. > > What I still don't understand is how we are so narrowly focused on > HI/LO? We need a combined scalar IFN for pattern selection (not > sure why that's now called _HILO, I expected no suffix). Then there's > three possibilities the target can implement this: > > 1) with a widen_[su]add<mode> instruction - I _think_ that's what > RISCV is going to offer since it is a target where vector modes > have "padding" (aka you cannot subreg a V2SI to get V4HI). Instead > RVV can do a V4HI to V4SI widening and widening add/subtract > using vwadd[u] and vwsub[u] (the HI->SI widening is actually > done with a widening add of zero - eh). > IIRC GCN is the same here. > 2) with a widen_[su]add{_lo,_hi}<mode> combo - that's what the tree > codes currently support (exclusively) > 3) similar, but widen_[su]add{_even,_odd}<mode> > > that said, things like decomposes_to_hilo_fn_p look to paint us into > a 2) corner without good reason. I was kind of just keeping the naming, I had forgotten to mention I was also going to add _EVENODD but you are right, the pattern selection IFN does not need to be restrictive. And then at supportable_widening_operation we could check what the target offers support for (either 1, 2 or 3). 
We can then actually just get rid of decomposes_to_hilo_fn_p and just assume that for all narrowing or widening IFN's there are optabs (that may or may not be implemented by a target) for all three variants.

Having said that, that means we should have an optab to cover 1), which should probably just have the original name. Let me write it out...

Say we have an IFN_VEC_WIDEN_PLUS pattern and assume it's signed; supportable_widening_operation would then first check if the target supports vec_widen_sadd_optab for, say, V8HI -> V8SI? RISC-V would take this path I guess?

If the target doesn't, then it could check for support for:
vec_widen_sadd_lo_optab V4HI -> V4SI
vec_widen_sadd_hi_optab V4HI -> V4SI
AArch64 Advanced SIMD would implement this.

If the target still didn't support this it would check for (not sure about the modes here):
vec_widen_sadd_even_optab VNx8HI -> VNx4SI
vec_widen_sadd_odd_optab VNx8HI -> VNx4SI
This is the one SVE would implement.

So that would mean that I'd probably end up rewriting
#define DEF_INTERNAL_OPTAB_WIDENING_FN (NAME, FLAGS, SELECTOR, SOPTAB, UOPTAB, TYPE)
as:
for 1)
DEF_INTERNAL_SIGNED_OPTAB_FN (NAME, FLAGS, SELECTOR, SOPTAB, UOPTAB, TYPE)
for 2)
DEF_INTERNAL_SIGNED_OPTAB_FN (NAME##_LO, FLAGS, SELECTOR, SOPTAB, UOPTAB, TYPE)
DEF_INTERNAL_SIGNED_OPTAB_FN (NAME##_HI, FLAGS, SELECTOR, SOPTAB, UOPTAB, TYPE)
for 3)
DEF_INTERNAL_SIGNED_OPTAB_FN (NAME##_EVEN, FLAGS, SELECTOR, SOPTAB, UOPTAB, TYPE)
DEF_INTERNAL_SIGNED_OPTAB_FN (NAME##_ODD, FLAGS, SELECTOR, SOPTAB, UOPTAB, TYPE)

And the same for narrowing (but with DEF_INTERNAL_OPTAB_FN instead of SIGNED_OPTAB). So each widening and narrowing IFN would have optabs for all its variants and each target implements the ones it supports.

I'm happy to do this, but implementing support to handle the 1) and 3) variants without having optabs for them right now seems a bit odd and it would delay this patch, so I suggest I add the framework and the optabs but leave adding the vectorizer support for later?
I can add comments to where I think that should go.

> Richard.
>
>> gcc/ChangeLog:
>>
>> 2023-05-12  Andre Vieira  <andre.simoesdiasvieira@arm.com>
>> 	    Joel Hutton  <joel.hutton@arm.com>
>> 	    Tamar Christina  <tamar.christina@arm.com>
>>
>> 	* config/aarch64/aarch64-simd.md (vec_widen_<su>addl_lo_<mode>):
>> 	Rename this ...
>> 	(vec_widen_<su>add_lo_<mode>): ... to this.
>> 	(vec_widen_<su>addl_hi_<mode>): Rename this ...
>> 	(vec_widen_<su>add_hi_<mode>): ... to this.
>> 	(vec_widen_<su>subl_lo_<mode>): Rename this ...
>> 	(vec_widen_<su>sub_lo_<mode>): ... to this.
>> 	(vec_widen_<su>subl_hi_<mode>): Rename this ...
>> 	(vec_widen_<su>sub_hi_<mode>): ... to this.
>> 	* doc/generic.texi: Document new IFN codes.
>> 	* internal-fn.cc (DEF_INTERNAL_OPTAB_WIDENING_HILO_FN): Macro to
>> 	define an internal_fn that expands into multiple internal_fns
>> 	for widening.
>> 	(DEF_INTERNAL_OPTAB_NARROWING_HILO_FN): Likewise but for narrowing.
>> 	(ifn_cmp): Function to compare ifn's for sorting/searching.
>> 	(lookup_hilo_internal_fn): Add lookup function.
>> 	(commutative_binary_fn_p): Add widen_plus fn's.
>> 	(widening_fn_p): New function.
>> 	(narrowing_fn_p): New function.
>> 	(decomposes_to_hilo_fn_p): New function.
>> 	(direct_internal_fn_optab): Change visibility.
>> 	* internal-fn.def (DEF_INTERNAL_OPTAB_WIDENING_HILO_FN): Define
>> 	widening plus,minus functions.
>> 	(VEC_WIDEN_PLUS): Replacement for VEC_WIDEN_PLUS_EXPR tree code.
>> 	(VEC_WIDEN_MINUS): Replacement for VEC_WIDEN_MINUS_EXPR tree code.
>> 	* internal-fn.h (GCC_INTERNAL_FN_H): Add headers.
>> 	(direct_internal_fn_optab): Declare new prototype.
>> 	(lookup_hilo_internal_fn): Likewise.
>> 	(widening_fn_p): Likewise.
>> 	(narrowing_fn_p): Likewise.
>> 	(decomposes_to_hilo_fn_p): Likewise.
>> 	* optabs.cc (commutative_optab_p): Add widening plus optabs.
>> 	* optabs.def (OPTAB_D): Define widen add, sub optabs.
>> 	* tree-cfg.cc (verify_gimple_call): Add checks for new widen
>> 	add and sub IFNs.
>> 	* tree-inline.cc (estimate_num_insns): Return same
>> 	cost for widen add and sub IFNs as previous tree_codes.
>> 	* tree-vect-patterns.cc (vect_recog_widen_op_pattern): Support
>> 	patterns with a hi/lo split.
>> 	(vect_recog_sad_pattern): Refactor to use new IFN codes.
>> 	(vect_recog_widen_plus_pattern): Likewise.
>> 	(vect_recog_widen_minus_pattern): Likewise.
>> 	(vect_recog_average_pattern): Likewise.
>> 	* tree-vect-stmts.cc (vectorizable_conversion): Add support for
>> 	_HILO IFNs.
>> 	(supportable_widening_operation): Likewise.
>> 	* tree.def (WIDEN_SUM_EXPR): Update example to use new IFNs.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 	* gcc.target/aarch64/vect-widen-add.c: Test that new
>> 	IFN_VEC_WIDEN_PLUS is being used.
>> 	* gcc.target/aarch64/vect-widen-sub.c: Test that new
>> 	IFN_VEC_WIDEN_MINUS is being used.
>>
^ permalink raw reply	[flat|nested] 53+ messages in thread
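The fallback order sketched in the message above (try the single-instruction form first, then the lo/hi pair, then the even/odd pair) can be modelled as a simple selection function. This is an illustrative sketch only: the type and function names are hypothetical stand-ins, where the real code would query optab_handler for the relevant optabs and modes.

```c
#include <stdbool.h>
#include <assert.h>

/* Hypothetical model of the variant selection order discussed for
   supportable_widening_operation.  */

enum widen_variant
{
  WIDEN_NONE,		/* Target cannot widen this operation.  */
  WIDEN_SINGLE,		/* Variant 1: vec_widen_sadd<mode> (RVV-style).  */
  WIDEN_HILO,		/* Variant 2: vec_widen_sadd_{lo,hi} (Advanced SIMD).  */
  WIDEN_EVENODD		/* Variant 3: vec_widen_sadd_{even,odd} (SVE-style).  */
};

/* Stand-in for the per-target optab availability checks.  */
struct target_caps
{
  bool has_single;
  bool has_hilo;
  bool has_evenodd;
};

static enum widen_variant
select_widen_variant (const struct target_caps *caps)
{
  if (caps->has_single)
    return WIDEN_SINGLE;
  if (caps->has_hilo)
    return WIDEN_HILO;
  if (caps->has_evenodd)
    return WIDEN_EVENODD;
  return WIDEN_NONE;
}
```

The ordering encodes a preference for the single-instruction form when a target offers it, falling back to the decomposed forms otherwise; whether that preference is right for every target is a costing question the thread leaves open.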
* Re: [PATCH 2/3] Refactor widen_plus as internal_fn 2023-05-12 13:28 ` Richard Biener 2023-05-12 13:55 ` Andre Vieira (lists) @ 2023-05-12 14:01 ` Richard Sandiford 2023-05-15 10:20 ` Richard Biener 1 sibling, 1 reply; 53+ messages in thread From: Richard Sandiford @ 2023-05-12 14:01 UTC (permalink / raw) To: Richard Biener; +Cc: Andre Vieira (lists), Richard Biener, gcc-patches Richard Biener <rguenther@suse.de> writes: > On Fri, 12 May 2023, Andre Vieira (lists) wrote: > >> I have dealt with, I think..., most of your comments. There's quite a few >> changes, I think it's all a bit simpler now. I made some other changes to the >> costing in tree-inline.cc and gimple-range-op.cc in which I try to preserve >> the same behaviour as we had with the tree codes before. Also added some extra >> checks to tree-cfg.cc that made sense to me. >> >> I am still regression testing the gimple-range-op change, as that was a last >> minute change, but the rest survived a bootstrap and regression test on >> aarch64-unknown-linux-gnu. >> >> cover letter: >> >> This patch replaces the existing tree_code widen_plus and widen_minus >> patterns with internal_fn versions. >> >> DEF_INTERNAL_OPTAB_WIDENING_HILO_FN and DEF_INTERNAL_OPTAB_NARROWING_HILO_FN >> are like DEF_INTERNAL_SIGNED_OPTAB_FN and DEF_INTERNAL_OPTAB_FN respectively >> except they provide convenience wrappers for defining conversions that require >> a hi/lo split. Each definition for <NAME> will require optabs for _hi and _lo >> and each of those will also require a signed and unsigned version in the case >> of widening. The hi/lo pair is necessary because the widening and narrowing >> operations take n narrow elements as inputs and return n/2 wide elements as >> outputs. The 'lo' operation operates on the first n/2 elements of input. The >> 'hi' operation operates on the second n/2 elements of input. 
Defining an >> internal_fn along with hi/lo variations allows a single internal function to >> be returned from a vect_recog function that will later be expanded to hi/lo. >> >> >> For example: >> IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO >> for aarch64: IFN_VEC_WIDEN_PLUS_HI -> vec_widen_<su>add_hi_<mode> -> >> (u/s)addl2 >> IFN_VEC_WIDEN_PLUS_LO -> vec_widen_<su>add_lo_<mode> >> -> (u/s)addl >> >> This gives the same functionality as the previous WIDEN_PLUS/WIDEN_MINUS tree >> codes which are expanded into VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI. > > What I still don't understand is how we are so narrowly focused on > HI/LO? We need a combined scalar IFN for pattern selection (not > sure why that's now called _HILO, I expected no suffix). Then there's > three possibilities the target can implement this: > > 1) with a widen_[su]add<mode> instruction - I _think_ that's what > RISCV is going to offer since it is a target where vector modes > have "padding" (aka you cannot subreg a V2SI to get V4HI). Instead > RVV can do a V4HI to V4SI widening and widening add/subtract > using vwadd[u] and vwsub[u] (the HI->SI widening is actually > done with a widening add of zero - eh). > IIRC GCN is the same here. SVE currently does this too, but the addition and widening are separate operations. E.g. in principle there's no reason why you can't sign-extend one operand, zero-extend the other, and then add the result together. Or you could extend them from different sizes (QI and HI). All of those are supported (if the costing allows them). If the target has operations to do combined extending and adding (or whatever), then at the moment we rely on combine to generate them. So I think this case is separate from Andre's work. The addition itself is just an ordinary addition, and any widening happens by vectorising a CONVERT/NOP_EXPR. 
> 2) with a widen_[su]add{_lo,_hi}<mode> combo - that's what the tree > codes currently support (exclusively) > 3) similar, but widen_[su]add{_even,_odd}<mode> > > that said, things like decomposes_to_hilo_fn_p look to paint us into > a 2) corner without good reason. I suppose one question is: how much of the patch is really specific to HI/LO, and how much is just grouping two halves together? The nice thing about the internal-fn grouping macros is that, if (3) is implemented in future, the structure will strongly encourage even/odd pairs to be supported for all operations that support hi/lo. That is, I would expect the grouping macros to be extended to define even/odd ifns alongside hi/lo ones, rather than adding separate definitions for even/odd functions. If so, at least from the internal-fn.* side of things, I think the question is whether it's OK to stick with hilo names for now, or whether we should use more forward-looking names. Thanks, Richard > > Richard. > >> gcc/ChangeLog: >> >> 2023-05-12 Andre Vieira <andre.simoesdiasvieira@arm.com> >> Joel Hutton <joel.hutton@arm.com> >> Tamar Christina <tamar.christina@arm.com> >> >> * config/aarch64/aarch64-simd.md (vec_widen_<su>addl_lo_<mode>): >> Rename >> this ... >> (vec_widen_<su>add_lo_<mode>): ... to this. >> (vec_widen_<su>addl_hi_<mode>): Rename this ... >> (vec_widen_<su>add_hi_<mode>): ... to this. >> (vec_widen_<su>subl_lo_<mode>): Rename this ... >> (vec_widen_<su>sub_lo_<mode>): ... to this. >> (vec_widen_<su>subl_hi_<mode>): Rename this ... >> (vec_widen_<su>sub_hi_<mode>): ...to this. >> * doc/generic.texi: Document new IFN codes. >> * internal-fn.cc (DEF_INTERNAL_OPTAB_WIDENING_HILO_FN): Macro to >> define an >> internal_fn that expands into multiple internal_fns for widening. >> (DEF_INTERNAL_OPTAB_NARROWING_HILO_FN): Likewise but for narrowing. >> (ifn_cmp): Function to compare ifn's for sorting/searching. >> (lookup_hilo_internal_fn): Add lookup function. 
>> (commutative_binary_fn_p): Add widen_plus fn's. >> (widening_fn_p): New function. >> (narrowing_fn_p): New function. >> (decomposes_to_hilo_fn_p): New function. >> (direct_internal_fn_optab): Change visibility. >> * internal-fn.def (DEF_INTERNAL_OPTAB_WIDENING_HILO_FN): Define >> widening >> plus,minus functions. >> (VEC_WIDEN_PLUS): Replacement for VEC_WIDEN_PLUS_EXPR tree code. >> (VEC_WIDEN_MINUS): Replacement for VEC_WIDEN_MINUS_EXPR tree code. >> * internal-fn.h (GCC_INTERNAL_FN_H): Add headers. >> (direct_internal_fn_optab): Declare new prototype. >> (lookup_hilo_internal_fn): Likewise. >> (widening_fn_p): Likewise. >> (Narrowing_fn_p): Likewise. >> (decomposes_to_hilo_fn_p): Likewise. >> * optabs.cc (commutative_optab_p): Add widening plus optabs. >> * optabs.def (OPTAB_D): Define widen add, sub optabs. >> * tree-cfg.cc (verify_gimple_call): Add checks for new widen >> add and sub IFNs. >> * tree-inline.cc (estimate_num_insns): Return same >> cost for widen add and sub IFNs as previous tree_codes. >> * tree-vect-patterns.cc (vect_recog_widen_op_pattern): Support >> patterns with a hi/lo split. >> (vect_recog_sad_pattern): Refactor to use new IFN codes. >> (vect_recog_widen_plus_pattern): Likewise. >> (vect_recog_widen_minus_pattern): Likewise. >> (vect_recog_average_pattern): Likewise. >> * tree-vect-stmts.cc (vectorizable_conversion): Add support for >> _HILO IFNs. >> (supportable_widening_operation): Likewise. >> * tree.def (WIDEN_SUM_EXPR): Update example to use new IFNs. >> >> gcc/testsuite/ChangeLog: >> >> * gcc.target/aarch64/vect-widen-add.c: Test that new >> IFN_VEC_WIDEN_PLUS is being used. >> * gcc.target/aarch64/vect-widen-sub.c: Test that new >> IFN_VEC_WIDEN_MINUS is being used. >> ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 2/3] Refactor widen_plus as internal_fn 2023-05-12 14:01 ` Richard Sandiford @ 2023-05-15 10:20 ` Richard Biener 2023-05-15 10:47 ` Richard Sandiford 0 siblings, 1 reply; 53+ messages in thread From: Richard Biener @ 2023-05-15 10:20 UTC (permalink / raw) To: Richard Sandiford; +Cc: Andre Vieira (lists), Richard Biener, gcc-patches On Fri, 12 May 2023, Richard Sandiford wrote: > Richard Biener <rguenther@suse.de> writes: > > On Fri, 12 May 2023, Andre Vieira (lists) wrote: > > > >> I have dealt with, I think..., most of your comments. There's quite a few > >> changes, I think it's all a bit simpler now. I made some other changes to the > >> costing in tree-inline.cc and gimple-range-op.cc in which I try to preserve > >> the same behaviour as we had with the tree codes before. Also added some extra > >> checks to tree-cfg.cc that made sense to me. > >> > >> I am still regression testing the gimple-range-op change, as that was a last > >> minute change, but the rest survived a bootstrap and regression test on > >> aarch64-unknown-linux-gnu. > >> > >> cover letter: > >> > >> This patch replaces the existing tree_code widen_plus and widen_minus > >> patterns with internal_fn versions. > >> > >> DEF_INTERNAL_OPTAB_WIDENING_HILO_FN and DEF_INTERNAL_OPTAB_NARROWING_HILO_FN > >> are like DEF_INTERNAL_SIGNED_OPTAB_FN and DEF_INTERNAL_OPTAB_FN respectively > >> except they provide convenience wrappers for defining conversions that require > >> a hi/lo split. Each definition for <NAME> will require optabs for _hi and _lo > >> and each of those will also require a signed and unsigned version in the case > >> of widening. The hi/lo pair is necessary because the widening and narrowing > >> operations take n narrow elements as inputs and return n/2 wide elements as > >> outputs. The 'lo' operation operates on the first n/2 elements of input. The > >> 'hi' operation operates on the second n/2 elements of input. 
Defining an > >> internal_fn along with hi/lo variations allows a single internal function to > >> be returned from a vect_recog function that will later be expanded to hi/lo. > >> > >> > >> For example: > >> IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO > >> for aarch64: IFN_VEC_WIDEN_PLUS_HI -> vec_widen_<su>add_hi_<mode> -> > >> (u/s)addl2 > >> IFN_VEC_WIDEN_PLUS_LO -> vec_widen_<su>add_lo_<mode> > >> -> (u/s)addl > >> > >> This gives the same functionality as the previous WIDEN_PLUS/WIDEN_MINUS tree > >> codes which are expanded into VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI. > > > > What I still don't understand is how we are so narrowly focused on > > HI/LO? We need a combined scalar IFN for pattern selection (not > > sure why that's now called _HILO, I expected no suffix). Then there's > > three possibilities the target can implement this: > > > > 1) with a widen_[su]add<mode> instruction - I _think_ that's what > > RISCV is going to offer since it is a target where vector modes > > have "padding" (aka you cannot subreg a V2SI to get V4HI). Instead > > RVV can do a V4HI to V4SI widening and widening add/subtract > > using vwadd[u] and vwsub[u] (the HI->SI widening is actually > > done with a widening add of zero - eh). > > IIRC GCN is the same here. > > SVE currently does this too, but the addition and widening are > separate operations. E.g. in principle there's no reason why > you can't sign-extend one operand, zero-extend the other, and > then add the result together. Or you could extend them from > different sizes (QI and HI). All of those are supported > (if the costing allows them). I see. So why does the target the expose widen_[su]add<mode> at all? > If the target has operations to do combined extending and adding (or > whatever), then at the moment we rely on combine to generate them. > > So I think this case is separate from Andre's work. 
The addition > itself is just an ordinary addition, and any widening happens by > vectorising a CONVERT/NOP_EXPR. > > > 2) with a widen_[su]add{_lo,_hi}<mode> combo - that's what the tree > > codes currently support (exclusively) > > 3) similar, but widen_[su]add{_even,_odd}<mode> > > > > that said, things like decomposes_to_hilo_fn_p look to paint us into > > a 2) corner without good reason. > > I suppose one question is: how much of the patch is really specific > to HI/LO, and how much is just grouping two halves together? Yep, that I don't know for sure. > The nice > thing about the internal-fn grouping macros is that, if (3) is > implemented in future, the structure will strongly encourage even/odd > pairs to be supported for all operations that support hi/lo. That is, > I would expect the grouping macros to be extended to define even/odd > ifns alongside hi/lo ones, rather than adding separate definitions > for even/odd functions. > > If so, at least from the internal-fn.* side of things, I think the question > is whether it's OK to stick with hilo names for now, or whether we should > use more forward-looking names. I think for parts that are independent we could use a more forward-looking name. Maybe _halves? But I'm also not sure how much of that is really needed (it seems to be tied around optimizing optabs space?) Richard. > Thanks, > Richard > > > > > Richard. > > > >> gcc/ChangeLog: > >> > >> 2023-05-12 Andre Vieira <andre.simoesdiasvieira@arm.com> > >> Joel Hutton <joel.hutton@arm.com> > >> Tamar Christina <tamar.christina@arm.com> > >> > >> * config/aarch64/aarch64-simd.md (vec_widen_<su>addl_lo_<mode>): > >> Rename > >> this ... > >> (vec_widen_<su>add_lo_<mode>): ... to this. > >> (vec_widen_<su>addl_hi_<mode>): Rename this ... > >> (vec_widen_<su>add_hi_<mode>): ... to this. > >> (vec_widen_<su>subl_lo_<mode>): Rename this ... > >> (vec_widen_<su>sub_lo_<mode>): ... to this. > >> (vec_widen_<su>subl_hi_<mode>): Rename this ... 
> >> 	(vec_widen_<su>sub_hi_<mode>): ... to this.
> >> 	* doc/generic.texi: Document new IFN codes.
> >> 	* internal-fn.cc (DEF_INTERNAL_OPTAB_WIDENING_HILO_FN): Macro to
> >> 	define an internal_fn that expands into multiple internal_fns
> >> 	for widening.
> >> 	(DEF_INTERNAL_OPTAB_NARROWING_HILO_FN): Likewise but for narrowing.
> >> 	(ifn_cmp): Function to compare ifn's for sorting/searching.
> >> 	(lookup_hilo_internal_fn): Add lookup function.
> >> 	(commutative_binary_fn_p): Add widen_plus fn's.
> >> 	(widening_fn_p): New function.
> >> 	(narrowing_fn_p): New function.
> >> 	(decomposes_to_hilo_fn_p): New function.
> >> 	(direct_internal_fn_optab): Change visibility.
> >> 	* internal-fn.def (DEF_INTERNAL_OPTAB_WIDENING_HILO_FN): Define
> >> 	widening plus,minus functions.
> >> 	(VEC_WIDEN_PLUS): Replacement for VEC_WIDEN_PLUS_EXPR tree code.
> >> 	(VEC_WIDEN_MINUS): Replacement for VEC_WIDEN_MINUS_EXPR tree code.
> >> 	* internal-fn.h (GCC_INTERNAL_FN_H): Add headers.
> >> 	(direct_internal_fn_optab): Declare new prototype.
> >> 	(lookup_hilo_internal_fn): Likewise.
> >> 	(widening_fn_p): Likewise.
> >> 	(narrowing_fn_p): Likewise.
> >> 	(decomposes_to_hilo_fn_p): Likewise.
> >> 	* optabs.cc (commutative_optab_p): Add widening plus optabs.
> >> 	* optabs.def (OPTAB_D): Define widen add, sub optabs.
> >> 	* tree-cfg.cc (verify_gimple_call): Add checks for new widen
> >> 	add and sub IFNs.
> >> 	* tree-inline.cc (estimate_num_insns): Return same
> >> 	cost for widen add and sub IFNs as previous tree_codes.
> >> 	* tree-vect-patterns.cc (vect_recog_widen_op_pattern): Support
> >> 	patterns with a hi/lo split.
> >> 	(vect_recog_sad_pattern): Refactor to use new IFN codes.
> >> 	(vect_recog_widen_plus_pattern): Likewise.
> >> 	(vect_recog_widen_minus_pattern): Likewise.
> >> 	(vect_recog_average_pattern): Likewise.
> >> 	* tree-vect-stmts.cc (vectorizable_conversion): Add support for
> >> 	_HILO IFNs.
> >> 	(supportable_widening_operation): Likewise.
> >> * tree.def (WIDEN_SUM_EXPR): Update example to use new IFNs. > >> > >> gcc/testsuite/ChangeLog: > >> > >> * gcc.target/aarch64/vect-widen-add.c: Test that new > >> IFN_VEC_WIDEN_PLUS is being used. > >> * gcc.target/aarch64/vect-widen-sub.c: Test that new > >> IFN_VEC_WIDEN_MINUS is being used. > >> > -- Richard Biener <rguenther@suse.de> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman; HRB 36809 (AG Nuernberg) ^ permalink raw reply [flat|nested] 53+ messages in thread
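For contrast with the hi/lo split, here is the same kind of scalar model for the even/odd decomposition (variant 3 in the discussion, the form SVE-style targets would implement). This is an illustrative sketch with made-up names, not GCC code: 'even' reads elements 0, 2, 4, ... and 'odd' reads elements 1, 3, 5, ..., each still producing n/2 wide results.

```c
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

/* Scalar model of the even/odd widening-add decomposition:
   each operation reads every other narrow element.  */

static void
widen_add_even (const int16_t *a, const int16_t *b, int32_t *out, size_t n)
{
  for (size_t i = 0; i < n / 2; i++)
    out[i] = (int32_t) a[2 * i] + (int32_t) b[2 * i];
}

static void
widen_add_odd (const int16_t *a, const int16_t *b, int32_t *out, size_t n)
{
  for (size_t i = 0; i < n / 2; i++)
    out[i] = (int32_t) a[2 * i + 1] + (int32_t) b[2 * i + 1];
}
```

The hi/lo and even/odd forms compute the same set of sums; they differ only in how the results are interleaved, which is why the thread argues the pattern-selection IFN should not bake in either decomposition.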
* Re: [PATCH 2/3] Refactor widen_plus as internal_fn 2023-05-15 10:20 ` Richard Biener @ 2023-05-15 10:47 ` Richard Sandiford 2023-05-15 11:01 ` Richard Biener 0 siblings, 1 reply; 53+ messages in thread From: Richard Sandiford @ 2023-05-15 10:47 UTC (permalink / raw) To: Richard Biener; +Cc: Andre Vieira (lists), Richard Biener, gcc-patches Richard Biener <rguenther@suse.de> writes: > On Fri, 12 May 2023, Richard Sandiford wrote: > >> Richard Biener <rguenther@suse.de> writes: >> > On Fri, 12 May 2023, Andre Vieira (lists) wrote: >> > >> >> I have dealt with, I think..., most of your comments. There's quite a few >> >> changes, I think it's all a bit simpler now. I made some other changes to the >> >> costing in tree-inline.cc and gimple-range-op.cc in which I try to preserve >> >> the same behaviour as we had with the tree codes before. Also added some extra >> >> checks to tree-cfg.cc that made sense to me. >> >> >> >> I am still regression testing the gimple-range-op change, as that was a last >> >> minute change, but the rest survived a bootstrap and regression test on >> >> aarch64-unknown-linux-gnu. >> >> >> >> cover letter: >> >> >> >> This patch replaces the existing tree_code widen_plus and widen_minus >> >> patterns with internal_fn versions. >> >> >> >> DEF_INTERNAL_OPTAB_WIDENING_HILO_FN and DEF_INTERNAL_OPTAB_NARROWING_HILO_FN >> >> are like DEF_INTERNAL_SIGNED_OPTAB_FN and DEF_INTERNAL_OPTAB_FN respectively >> >> except they provide convenience wrappers for defining conversions that require >> >> a hi/lo split. Each definition for <NAME> will require optabs for _hi and _lo >> >> and each of those will also require a signed and unsigned version in the case >> >> of widening. The hi/lo pair is necessary because the widening and narrowing >> >> operations take n narrow elements as inputs and return n/2 wide elements as >> >> outputs. The 'lo' operation operates on the first n/2 elements of input. 
The >> >> 'hi' operation operates on the second n/2 elements of input. Defining an >> >> internal_fn along with hi/lo variations allows a single internal function to >> >> be returned from a vect_recog function that will later be expanded to hi/lo. >> >> >> >> >> >> For example: >> >> IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO >> >> for aarch64: IFN_VEC_WIDEN_PLUS_HI -> vec_widen_<su>add_hi_<mode> -> >> >> (u/s)addl2 >> >> IFN_VEC_WIDEN_PLUS_LO -> vec_widen_<su>add_lo_<mode> >> >> -> (u/s)addl >> >> >> >> This gives the same functionality as the previous WIDEN_PLUS/WIDEN_MINUS tree >> >> codes which are expanded into VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI. >> > >> > What I still don't understand is how we are so narrowly focused on >> > HI/LO? We need a combined scalar IFN for pattern selection (not >> > sure why that's now called _HILO, I expected no suffix). Then there's >> > three possibilities the target can implement this: >> > >> > 1) with a widen_[su]add<mode> instruction - I _think_ that's what >> > RISCV is going to offer since it is a target where vector modes >> > have "padding" (aka you cannot subreg a V2SI to get V4HI). Instead >> > RVV can do a V4HI to V4SI widening and widening add/subtract >> > using vwadd[u] and vwsub[u] (the HI->SI widening is actually >> > done with a widening add of zero - eh). >> > IIRC GCN is the same here. >> >> SVE currently does this too, but the addition and widening are >> separate operations. E.g. in principle there's no reason why >> you can't sign-extend one operand, zero-extend the other, and >> then add the result together. Or you could extend them from >> different sizes (QI and HI). All of those are supported >> (if the costing allows them). > > I see. So why does the target the expose widen_[su]add<mode> at all? It shouldn't (need to) do that. I don't think we should have an optab for the unsplit operation. 
At least on SVE, we really want the extensions to be fused with loads (where possible) rather than with arithmetic. We can still do the widening arithmetic in one go. It's just that fusing with the loads works for the mixed-sign and mixed-size cases, and can handle more than just doubling the element size. >> If the target has operations to do combined extending and adding (or >> whatever), then at the moment we rely on combine to generate them. >> >> So I think this case is separate from Andre's work. The addition >> itself is just an ordinary addition, and any widening happens by >> vectorising a CONVERT/NOP_EXPR. >> >> > 2) with a widen_[su]add{_lo,_hi}<mode> combo - that's what the tree >> > codes currently support (exclusively) >> > 3) similar, but widen_[su]add{_even,_odd}<mode> >> > >> > that said, things like decomposes_to_hilo_fn_p look to paint us into >> > a 2) corner without good reason. >> >> I suppose one question is: how much of the patch is really specific >> to HI/LO, and how much is just grouping two halves together? > > Yep, that I don't know for sure. > >> The nice >> thing about the internal-fn grouping macros is that, if (3) is >> implemented in future, the structure will strongly encourage even/odd >> pairs to be supported for all operations that support hi/lo. That is, >> I would expect the grouping macros to be extended to define even/odd >> ifns alongside hi/lo ones, rather than adding separate definitions >> for even/odd functions. >> >> If so, at least from the internal-fn.* side of things, I think the question >> is whether it's OK to stick with hilo names for now, or whether we should >> use more forward-looking names. > > I think for parts that are independent we could use a more > forward-looking name. Maybe _halves? Using _halves for the ifn macros sounds good to me FWIW. > But I'm also not sure > how much of that is really needed (it seems to be tied around > optimizing optabs space?) Not sure what you mean by "this". 
Optabs space shouldn't be a problem though. The optab encoding gives us a full int to play with, and it could easily go up to 64 bits if necessary/convenient. At least on the internal-fn.* side, the aim is really just to establish a regular structure, so that we don't have arbitrary differences between different widening operations, or too much cut-&-paste. Thanks, Richard ^ permalink raw reply [flat|nested] 53+ messages in thread
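[Editorial note: the hi/lo decomposition described in the cover letter above — each half consuming n/2 narrow inputs and producing n/2 wide outputs — can be modeled in plain scalar C++. This is only an illustrative sketch of the semantics; the element types and counts are examples, not the vectorizer's actual representation.]

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Model of IFN_VEC_WIDEN_PLUS_LO: widening add over the first N/2
// elements of the two N-element narrow input vectors.
template <std::size_t N>
std::array<int32_t, N / 2>
widen_plus_lo (const std::array<int16_t, N> &a,
	       const std::array<int16_t, N> &b)
{
  std::array<int32_t, N / 2> r{};
  for (std::size_t i = 0; i < N / 2; i++)
    r[i] = int32_t (a[i]) + int32_t (b[i]);
  return r;
}

// Model of IFN_VEC_WIDEN_PLUS_HI: the same operation over the
// second N/2 elements.
template <std::size_t N>
std::array<int32_t, N / 2>
widen_plus_hi (const std::array<int16_t, N> &a,
	       const std::array<int16_t, N> &b)
{
  std::array<int32_t, N / 2> r{};
  for (std::size_t i = 0; i < N / 2; i++)
    r[i] = int32_t (a[i + N / 2]) + int32_t (b[i + N / 2]);
  return r;
}
```

On aarch64 these two halves correspond to the (u/s)addl and (u/s)addl2 instructions mentioned in the cover letter; the point of widening is visible in the lo half, where 30000 + 30000 fits in the wide int32_t result even though it would overflow int16_t.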
* Re: [PATCH 2/3] Refactor widen_plus as internal_fn 2023-05-15 10:47 ` Richard Sandiford @ 2023-05-15 11:01 ` Richard Biener 2023-05-15 11:10 ` Richard Sandiford 2023-05-15 11:53 ` Andre Vieira (lists) 0 siblings, 2 replies; 53+ messages in thread From: Richard Biener @ 2023-05-15 11:01 UTC (permalink / raw) To: Richard Sandiford; +Cc: Andre Vieira (lists), Richard Biener, gcc-patches On Mon, 15 May 2023, Richard Sandiford wrote: > Richard Biener <rguenther@suse.de> writes: > > On Fri, 12 May 2023, Richard Sandiford wrote: > > > >> Richard Biener <rguenther@suse.de> writes: > >> > On Fri, 12 May 2023, Andre Vieira (lists) wrote: > >> > > >> >> I have dealt with, I think..., most of your comments. There's quite a few > >> >> changes, I think it's all a bit simpler now. I made some other changes to the > >> >> costing in tree-inline.cc and gimple-range-op.cc in which I try to preserve > >> >> the same behaviour as we had with the tree codes before. Also added some extra > >> >> checks to tree-cfg.cc that made sense to me. > >> >> > >> >> I am still regression testing the gimple-range-op change, as that was a last > >> >> minute change, but the rest survived a bootstrap and regression test on > >> >> aarch64-unknown-linux-gnu. > >> >> > >> >> cover letter: > >> >> > >> >> This patch replaces the existing tree_code widen_plus and widen_minus > >> >> patterns with internal_fn versions. > >> >> > >> >> DEF_INTERNAL_OPTAB_WIDENING_HILO_FN and DEF_INTERNAL_OPTAB_NARROWING_HILO_FN > >> >> are like DEF_INTERNAL_SIGNED_OPTAB_FN and DEF_INTERNAL_OPTAB_FN respectively > >> >> except they provide convenience wrappers for defining conversions that require > >> >> a hi/lo split. Each definition for <NAME> will require optabs for _hi and _lo > >> >> and each of those will also require a signed and unsigned version in the case > >> >> of widening. 
The hi/lo pair is necessary because the widening and narrowing > >> >> operations take n narrow elements as inputs and return n/2 wide elements as > >> >> outputs. The 'lo' operation operates on the first n/2 elements of input. The > >> >> 'hi' operation operates on the second n/2 elements of input. Defining an > >> >> internal_fn along with hi/lo variations allows a single internal function to > >> >> be returned from a vect_recog function that will later be expanded to hi/lo. > >> >> > >> >> > >> >> For example: > >> >> IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO > >> >> for aarch64: IFN_VEC_WIDEN_PLUS_HI -> vec_widen_<su>add_hi_<mode> -> > >> >> (u/s)addl2 > >> >> IFN_VEC_WIDEN_PLUS_LO -> vec_widen_<su>add_lo_<mode> > >> >> -> (u/s)addl > >> >> > >> >> This gives the same functionality as the previous WIDEN_PLUS/WIDEN_MINUS tree > >> >> codes which are expanded into VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI. > >> > > >> > What I still don't understand is how we are so narrowly focused on > >> > HI/LO? We need a combined scalar IFN for pattern selection (not > >> > sure why that's now called _HILO, I expected no suffix). Then there's > >> > three possibilities the target can implement this: > >> > > >> > 1) with a widen_[su]add<mode> instruction - I _think_ that's what > >> > RISCV is going to offer since it is a target where vector modes > >> > have "padding" (aka you cannot subreg a V2SI to get V4HI). Instead > >> > RVV can do a V4HI to V4SI widening and widening add/subtract > >> > using vwadd[u] and vwsub[u] (the HI->SI widening is actually > >> > done with a widening add of zero - eh). > >> > IIRC GCN is the same here. > >> > >> SVE currently does this too, but the addition and widening are > >> separate operations. E.g. in principle there's no reason why > >> you can't sign-extend one operand, zero-extend the other, and > >> then add the result together. Or you could extend them from > >> different sizes (QI and HI). 
All of those are supported > >> (if the costing allows them). > > > > I see. So why does the target the expose widen_[su]add<mode> at all? > > It shouldn't (need to) do that. I don't think we should have an optab > for the unsplit operation. > > At least on SVE, we really want the extensions to be fused with loads > (where possible) rather than with arithmetic. > > We can still do the widening arithmetic in one go. It's just that > fusing with the loads works for the mixed-sign and mixed-size cases, > and can handle more than just doubling the element size. > > >> If the target has operations to do combined extending and adding (or > >> whatever), then at the moment we rely on combine to generate them. > >> > >> So I think this case is separate from Andre's work. The addition > >> itself is just an ordinary addition, and any widening happens by > >> vectorising a CONVERT/NOP_EXPR. > >> > >> > 2) with a widen_[su]add{_lo,_hi}<mode> combo - that's what the tree > >> > codes currently support (exclusively) > >> > 3) similar, but widen_[su]add{_even,_odd}<mode> > >> > > >> > that said, things like decomposes_to_hilo_fn_p look to paint us into > >> > a 2) corner without good reason. > >> > >> I suppose one question is: how much of the patch is really specific > >> to HI/LO, and how much is just grouping two halves together? > > > > Yep, that I don't know for sure. > > > >> The nice > >> thing about the internal-fn grouping macros is that, if (3) is > >> implemented in future, the structure will strongly encourage even/odd > >> pairs to be supported for all operations that support hi/lo. That is, > >> I would expect the grouping macros to be extended to define even/odd > >> ifns alongside hi/lo ones, rather than adding separate definitions > >> for even/odd functions. > >> > >> If so, at least from the internal-fn.* side of things, I think the question > >> is whether it's OK to stick with hilo names for now, or whether we should > >> use more forward-looking names. 
> > > > I think for parts that are independent we could use a more > > forward-looking name. Maybe _halves? > > Using _halves for the ifn macros sounds good to me FWIW. > > > But I'm also not sure > > how much of that is really needed (it seems to be tied around > > optimizing optabs space?) > > Not sure what you mean by "this". Optabs space shouldn't be a problem > though. The optab encoding gives us a full int to play with, and it > could easily go up to 64 bits if necessary/convenient. > > At least on the internal-fn.* side, the aim is really just to establish > a regular structure, so that we don't have arbitrary differences between > different widening operations, or too much cut-&-paste. Hmm, I'm looking at the need for the std::map and internal_fn_hilo_keys_array and internal_fn_hilo_values_array. The vectorizer pieces contain + if (code.is_fn_code ()) + { + internal_fn ifn = as_internal_fn ((combined_fn) code); + gcc_assert (decomposes_to_hilo_fn_p (ifn)); + + internal_fn lo, hi; + lookup_hilo_internal_fn (ifn, &lo, &hi); + *code1 = as_combined_fn (lo); + *code2 = as_combined_fn (hi); + optab1 = lookup_hilo_ifn_optab (lo, !TYPE_UNSIGNED (vectype)); + optab2 = lookup_hilo_ifn_optab (hi, !TYPE_UNSIGNED (vectype)); so that tries to automatically associate the scalar widening IFN with the set(s) of IFN pairs we can split to. But then this list should be static and there's no need to create a std::map? Maybe gencfn-macros.cc can be enhanced to output these static cases? Or the vectorizer could (as it did previously) simply open-code the handled cases (I guess since we deal with two cases only now I'd prefer that). Thanks, Richard. > Thanks, > Richard > -- Richard Biener <rguenther@suse.de> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman; HRB 36809 (AG Nuernberg) ^ permalink raw reply [flat|nested] 53+ messages in thread
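[Editorial note: the open-coded alternative suggested above — a static switch over the two handled cases instead of a std::map — might look like the following sketch. The enum here is a stand-in that only mirrors the +1/+2 adjacency the patch's lookup_hilo_internal_fn relies on; it is not GCC's real internal_fn enum.]

```cpp
// Stand-in for GCC's internal_fn enum: each combined widening IFN is
// immediately followed by its _LO and _HI variants, matching the layout
// the grouping macros in the patch produce.
enum internal_fn
{
  IFN_VEC_WIDEN_PLUS,  IFN_VEC_WIDEN_PLUS_LO,  IFN_VEC_WIDEN_PLUS_HI,
  IFN_VEC_WIDEN_MINUS, IFN_VEC_WIDEN_MINUS_LO, IFN_VEC_WIDEN_MINUS_HI,
  IFN_LAST
};

// Open-coded hi/lo lookup: no runtime map, just a switch over the
// two widening IFNs currently handled.
static bool
lookup_hilo (internal_fn ifn, internal_fn *lo, internal_fn *hi)
{
  switch (ifn)
    {
    case IFN_VEC_WIDEN_PLUS:
    case IFN_VEC_WIDEN_MINUS:
      /* The _LO and _HI variants sit directly after the combined code.  */
      *lo = internal_fn (ifn + 1);
      *hi = internal_fn (ifn + 2);
      return true;
    default:
      return false;  /* Not a decomposable widening IFN.  */
    }
}
```

This keeps the association fully static, which is the point of the suggestion: the set of decomposable IFNs is known at build time, so nothing needs to be built at runtime.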
* Re: [PATCH 2/3] Refactor widen_plus as internal_fn 2023-05-15 11:01 ` Richard Biener @ 2023-05-15 11:10 ` Richard Sandiford 2023-05-15 11:53 ` Andre Vieira (lists) 1 sibling, 0 replies; 53+ messages in thread From: Richard Sandiford @ 2023-05-15 11:10 UTC (permalink / raw) To: Richard Biener; +Cc: Andre Vieira (lists), Richard Biener, gcc-patches Richard Biener <rguenther@suse.de> writes: > On Mon, 15 May 2023, Richard Sandiford wrote: > >> Richard Biener <rguenther@suse.de> writes: >> > But I'm also not sure >> > how much of that is really needed (it seems to be tied around >> > optimizing optabs space?) >> >> Not sure what you mean by "this". Optabs space shouldn't be a problem >> though. The optab encoding gives us a full int to play with, and it >> could easily go up to 64 bits if necessary/convenient. >> >> At least on the internal-fn.* side, the aim is really just to establish >> a regular structure, so that we don't have arbitrary differences between >> different widening operations, or too much cut-&-paste. > > Hmm, I'm looking at the need for the std::map and > internal_fn_hilo_keys_array and internal_fn_hilo_values_array. > The vectorizer pieces contain > > + if (code.is_fn_code ()) > + { > + internal_fn ifn = as_internal_fn ((combined_fn) code); > + gcc_assert (decomposes_to_hilo_fn_p (ifn)); > + > + internal_fn lo, hi; > + lookup_hilo_internal_fn (ifn, &lo, &hi); > + *code1 = as_combined_fn (lo); > + *code2 = as_combined_fn (hi); > + optab1 = lookup_hilo_ifn_optab (lo, !TYPE_UNSIGNED (vectype)); > + optab2 = lookup_hilo_ifn_optab (hi, !TYPE_UNSIGNED (vectype)); > > so that tries to automatically associate the scalar widening IFN > with the set(s) of IFN pairs we can split to. But then this > list should be static and there's no need to create a std::map? > Maybe gencfn-macros.cc can be enhanced to output these static > cases? 
Or the vectorizer could (as it did previously) simply > open-code the handled cases (I guess since we deal with two > cases only now I'd prefer that). Ah, yeah, I pushed back against that too. I think it should be possible to do it using the preprocessor, if the macros are defined appropriately. But if it isn't possible to do it with macros then I agree that a generator would be better than initialisation within the compiler. Thanks, Richard ^ permalink raw reply [flat|nested] 53+ messages in thread
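[Editorial note: the preprocessor route mentioned above could follow the usual .def/X-macro pattern GCC already uses elsewhere — re-expand the definition list with the grouping macro redefined to emit switch cases. A toy version, with a made-up two-entry list standing in for internal-fn.def:]

```cpp
// Toy stand-in for internal-fn.def: one grouping-macro use per
// widening operation (names illustrative).
#define WIDENING_DEFS \
  DEF_WIDENING_FN (VEC_WIDEN_PLUS) \
  DEF_WIDENING_FN (VEC_WIDEN_MINUS)

// First expansion: build the enum, with _LO/_HI adjacent to the
// combined code.
enum internal_fn
{
#define DEF_WIDENING_FN(NAME) IFN_##NAME, IFN_##NAME##_LO, IFN_##NAME##_HI,
  WIDENING_DEFS
#undef DEF_WIDENING_FN
  IFN_LAST
};

// Second expansion: emit the switch cases statically, so no table or
// map has to be initialised inside the compiler.
static bool
decomposes_to_hilo_fn_p (internal_fn fn)
{
  switch (fn)
    {
#define DEF_WIDENING_FN(NAME) case IFN_##NAME:
      WIDENING_DEFS
#undef DEF_WIDENING_FN
      return true;
    default:
      return false;
    }
}
```

The same trick generalises: if even/odd variants are added later, the single grouping macro is the one place that has to change, which is the "defined appropriately" part of the suggestion.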
* Re: [PATCH 2/3] Refactor widen_plus as internal_fn 2023-05-15 11:01 ` Richard Biener 2023-05-15 11:10 ` Richard Sandiford @ 2023-05-15 11:53 ` Andre Vieira (lists) 2023-05-15 12:21 ` Richard Biener 1 sibling, 1 reply; 53+ messages in thread From: Andre Vieira (lists) @ 2023-05-15 11:53 UTC (permalink / raw) To: Richard Biener, Richard Sandiford; +Cc: Richard Biener, gcc-patches [-- Attachment #1: Type: text/plain, Size: 7818 bytes --] On 15/05/2023 12:01, Richard Biener wrote: > On Mon, 15 May 2023, Richard Sandiford wrote: > >> Richard Biener <rguenther@suse.de> writes: >>> On Fri, 12 May 2023, Richard Sandiford wrote: >>> >>>> Richard Biener <rguenther@suse.de> writes: >>>>> On Fri, 12 May 2023, Andre Vieira (lists) wrote: >>>>> >>>>>> I have dealt with, I think..., most of your comments. There's quite a few >>>>>> changes, I think it's all a bit simpler now. I made some other changes to the >>>>>> costing in tree-inline.cc and gimple-range-op.cc in which I try to preserve >>>>>> the same behaviour as we had with the tree codes before. Also added some extra >>>>>> checks to tree-cfg.cc that made sense to me. >>>>>> >>>>>> I am still regression testing the gimple-range-op change, as that was a last >>>>>> minute change, but the rest survived a bootstrap and regression test on >>>>>> aarch64-unknown-linux-gnu. >>>>>> >>>>>> cover letter: >>>>>> >>>>>> This patch replaces the existing tree_code widen_plus and widen_minus >>>>>> patterns with internal_fn versions. >>>>>> >>>>>> DEF_INTERNAL_OPTAB_WIDENING_HILO_FN and DEF_INTERNAL_OPTAB_NARROWING_HILO_FN >>>>>> are like DEF_INTERNAL_SIGNED_OPTAB_FN and DEF_INTERNAL_OPTAB_FN respectively >>>>>> except they provide convenience wrappers for defining conversions that require >>>>>> a hi/lo split. Each definition for <NAME> will require optabs for _hi and _lo >>>>>> and each of those will also require a signed and unsigned version in the case >>>>>> of widening. 
The hi/lo pair is necessary because the widening and narrowing >>>>>> operations take n narrow elements as inputs and return n/2 wide elements as >>>>>> outputs. The 'lo' operation operates on the first n/2 elements of input. The >>>>>> 'hi' operation operates on the second n/2 elements of input. Defining an >>>>>> internal_fn along with hi/lo variations allows a single internal function to >>>>>> be returned from a vect_recog function that will later be expanded to hi/lo. >>>>>> >>>>>> >>>>>> For example: >>>>>> IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO >>>>>> for aarch64: IFN_VEC_WIDEN_PLUS_HI -> vec_widen_<su>add_hi_<mode> -> >>>>>> (u/s)addl2 >>>>>> IFN_VEC_WIDEN_PLUS_LO -> vec_widen_<su>add_lo_<mode> >>>>>> -> (u/s)addl >>>>>> >>>>>> This gives the same functionality as the previous WIDEN_PLUS/WIDEN_MINUS tree >>>>>> codes which are expanded into VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI. >>>>> >>>>> What I still don't understand is how we are so narrowly focused on >>>>> HI/LO? We need a combined scalar IFN for pattern selection (not >>>>> sure why that's now called _HILO, I expected no suffix). Then there's >>>>> three possibilities the target can implement this: >>>>> >>>>> 1) with a widen_[su]add<mode> instruction - I _think_ that's what >>>>> RISCV is going to offer since it is a target where vector modes >>>>> have "padding" (aka you cannot subreg a V2SI to get V4HI). Instead >>>>> RVV can do a V4HI to V4SI widening and widening add/subtract >>>>> using vwadd[u] and vwsub[u] (the HI->SI widening is actually >>>>> done with a widening add of zero - eh). >>>>> IIRC GCN is the same here. >>>> >>>> SVE currently does this too, but the addition and widening are >>>> separate operations. E.g. in principle there's no reason why >>>> you can't sign-extend one operand, zero-extend the other, and >>>> then add the result together. Or you could extend them from >>>> different sizes (QI and HI). 
All of those are supported >>>> (if the costing allows them). >>> >>> I see. So why does the target the expose widen_[su]add<mode> at all? >> >> It shouldn't (need to) do that. I don't think we should have an optab >> for the unsplit operation. >> >> At least on SVE, we really want the extensions to be fused with loads >> (where possible) rather than with arithmetic. >> >> We can still do the widening arithmetic in one go. It's just that >> fusing with the loads works for the mixed-sign and mixed-size cases, >> and can handle more than just doubling the element size. >> >>>> If the target has operations to do combined extending and adding (or >>>> whatever), then at the moment we rely on combine to generate them. >>>> >>>> So I think this case is separate from Andre's work. The addition >>>> itself is just an ordinary addition, and any widening happens by >>>> vectorising a CONVERT/NOP_EXPR. >>>> >>>>> 2) with a widen_[su]add{_lo,_hi}<mode> combo - that's what the tree >>>>> codes currently support (exclusively) >>>>> 3) similar, but widen_[su]add{_even,_odd}<mode> >>>>> >>>>> that said, things like decomposes_to_hilo_fn_p look to paint us into >>>>> a 2) corner without good reason. >>>> >>>> I suppose one question is: how much of the patch is really specific >>>> to HI/LO, and how much is just grouping two halves together? >>> >>> Yep, that I don't know for sure. >>> >>>> The nice >>>> thing about the internal-fn grouping macros is that, if (3) is >>>> implemented in future, the structure will strongly encourage even/odd >>>> pairs to be supported for all operations that support hi/lo. That is, >>>> I would expect the grouping macros to be extended to define even/odd >>>> ifns alongside hi/lo ones, rather than adding separate definitions >>>> for even/odd functions. >>>> >>>> If so, at least from the internal-fn.* side of things, I think the question >>>> is whether it's OK to stick with hilo names for now, or whether we should >>>> use more forward-looking names. 
>>> >>> I think for parts that are independent we could use a more >>> forward-looking name. Maybe _halves? >> >> Using _halves for the ifn macros sounds good to me FWIW. >> >>> But I'm also not sure >>> how much of that is really needed (it seems to be tied around >>> optimizing optabs space?) >> >> Not sure what you mean by "this". Optabs space shouldn't be a problem >> though. The optab encoding gives us a full int to play with, and it >> could easily go up to 64 bits if necessary/convenient. >> >> At least on the internal-fn.* side, the aim is really just to establish >> a regular structure, so that we don't have arbitrary differences between >> different widening operations, or too much cut-&-paste. > > Hmm, I'm looking at the need for the std::map and > internal_fn_hilo_keys_array and internal_fn_hilo_values_array. > The vectorizer pieces contain > > + if (code.is_fn_code ()) > + { > + internal_fn ifn = as_internal_fn ((combined_fn) code); > + gcc_assert (decomposes_to_hilo_fn_p (ifn)); > + > + internal_fn lo, hi; > + lookup_hilo_internal_fn (ifn, &lo, &hi); > + *code1 = as_combined_fn (lo); > + *code2 = as_combined_fn (hi); > + optab1 = lookup_hilo_ifn_optab (lo, !TYPE_UNSIGNED (vectype)); > + optab2 = lookup_hilo_ifn_optab (hi, !TYPE_UNSIGNED (vectype)); > > so that tries to automatically associate the scalar widening IFN > with the set(s) of IFN pairs we can split to. But then this > list should be static and there's no need to create a std::map? > Maybe gencfn-macros.cc can be enhanced to output these static > cases? Or the vectorizer could (as it did previously) simply > open-code the handled cases (I guess since we deal with two > cases only now I'd prefer that). > > Thanks, > Richard. > > >> Thanks, >> Richard >> > The patch I uploaded last no longer has std::map nor internal_fn_hilo_keys_array and internal_fn_hilo_values_array. 
(I've attached it again) I'm not sure I understand the _halves, do you mean that for the case where I had _hilo or _HILO before we rename that to _halves/_HALVES such that it later represents both _hi/_lo separation and _even/_odd? And am I correct to assume we are just giving up on having a INTERNAL_OPTAB_FN idea for 1)? Kind regards, Andre [-- Attachment #2: ifn1v3.patch --] [-- Type: text/plain, Size: 34240 bytes --] diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index bfc98a8d943467b33390defab9682f44efab5907..ffbbecb9409e1c2835d658c2a8855cd0e955c0f2 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -4626,7 +4626,7 @@ [(set_attr "type" "neon_<ADDSUB:optab>_long")] ) -(define_expand "vec_widen_<su>addl_lo_<mode>" +(define_expand "vec_widen_<su>add_lo_<mode>" [(match_operand:<VWIDE> 0 "register_operand") (ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand")) (ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))] @@ -4638,7 +4638,7 @@ DONE; }) -(define_expand "vec_widen_<su>addl_hi_<mode>" +(define_expand "vec_widen_<su>add_hi_<mode>" [(match_operand:<VWIDE> 0 "register_operand") (ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand")) (ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))] @@ -4650,7 +4650,7 @@ DONE; }) -(define_expand "vec_widen_<su>subl_lo_<mode>" +(define_expand "vec_widen_<su>sub_lo_<mode>" [(match_operand:<VWIDE> 0 "register_operand") (ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand")) (ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))] @@ -4662,7 +4662,7 @@ DONE; }) -(define_expand "vec_widen_<su>subl_hi_<mode>" +(define_expand "vec_widen_<su>sub_hi_<mode>" [(match_operand:<VWIDE> 0 "register_operand") (ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand")) (ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))] diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi index 
8b2882da4fe7da07d22b4e5384d049ba7d3907bf..0fd7e6cce8bbd4ecb8027b702722adcf6c32eb55 100644 --- a/gcc/doc/generic.texi +++ b/gcc/doc/generic.texi @@ -1811,6 +1811,10 @@ a value from @code{enum annot_expr_kind}, the third is an @code{INTEGER_CST}. @tindex VEC_RSHIFT_EXPR @tindex VEC_WIDEN_MULT_HI_EXPR @tindex VEC_WIDEN_MULT_LO_EXPR +@tindex IFN_VEC_WIDEN_PLUS_HI +@tindex IFN_VEC_WIDEN_PLUS_LO +@tindex IFN_VEC_WIDEN_MINUS_HI +@tindex IFN_VEC_WIDEN_MINUS_LO @tindex VEC_WIDEN_PLUS_HI_EXPR @tindex VEC_WIDEN_PLUS_LO_EXPR @tindex VEC_WIDEN_MINUS_HI_EXPR @@ -1861,6 +1865,33 @@ vector of @code{N/2} products. In the case of @code{VEC_WIDEN_MULT_LO_EXPR} the low @code{N/2} elements of the two vector are multiplied to produce the vector of @code{N/2} products. +@item IFN_VEC_WIDEN_PLUS_HI +@itemx IFN_VEC_WIDEN_PLUS_LO +These internal functions represent widening vector addition of the high and low +parts of the two input vectors, respectively. Their operands are vectors that +contain the same number of elements (@code{N}) of the same integral type. The +result is a vector that contains half as many elements, of an integral type +whose size is twice as wide. In the case of @code{IFN_VEC_WIDEN_PLUS_HI} the +high @code{N/2} elements of the two vectors are added to produce the vector of +@code{N/2} products. In the case of @code{IFN_VEC_WIDEN_PLUS_LO} the low +@code{N/2} elements of the two vectors are added to produce the vector of +@code{N/2} products. + +@item IFN_VEC_WIDEN_MINUS_HI +@itemx IFN_VEC_WIDEN_MINUS_LO +These internal functions represent widening vector subtraction of the high and +low parts of the two input vectors, respectively. Their operands are vectors +that contain the same number of elements (@code{N}) of the same integral type. +The high/low elements of the second vector are subtracted from the high/low +elements of the first. The result is a vector that contains half as many +elements, of an integral type whose size is twice as wide. 
In the case of +@code{IFN_VEC_WIDEN_MINUS_HI} the high @code{N/2} elements of the second +vector are subtracted from the high @code{N/2} of the first to produce the +vector of @code{N/2} products. In the case of +@code{IFN_VEC_WIDEN_MINUS_LO} the low @code{N/2} elements of the second +vector are subtracted from the low @code{N/2} of the first to produce the +vector of @code{N/2} products. + @item VEC_WIDEN_PLUS_HI_EXPR @itemx VEC_WIDEN_PLUS_LO_EXPR These nodes represent widening vector addition of the high and low parts of diff --git a/gcc/gimple-range-op.cc b/gcc/gimple-range-op.cc index 594bd3043f0e944299ddfff219f757ef15a3dd61..66636d82df27626e7911efd0cb8526921b39633f 100644 --- a/gcc/gimple-range-op.cc +++ b/gcc/gimple-range-op.cc @@ -1187,6 +1187,7 @@ gimple_range_op_handler::maybe_non_standard () { range_operator *signed_op = ptr_op_widen_mult_signed; range_operator *unsigned_op = ptr_op_widen_mult_unsigned; + bool signed1, signed2, signed_ret; if (gimple_code (m_stmt) == GIMPLE_ASSIGN) switch (gimple_assign_rhs_code (m_stmt)) { @@ -1202,32 +1203,55 @@ gimple_range_op_handler::maybe_non_standard () m_op1 = gimple_assign_rhs1 (m_stmt); m_op2 = gimple_assign_rhs2 (m_stmt); tree ret = gimple_assign_lhs (m_stmt); - bool signed1 = TYPE_SIGN (TREE_TYPE (m_op1)) == SIGNED; - bool signed2 = TYPE_SIGN (TREE_TYPE (m_op2)) == SIGNED; - bool signed_ret = TYPE_SIGN (TREE_TYPE (ret)) == SIGNED; - - /* Normally these operands should all have the same sign, but - some passes and violate this by taking mismatched sign args. At - the moment the only one that's possible is mismatch inputs and - unsigned output. Once ranger supports signs for the operands we - can properly fix it, for now only accept the case we can do - correctly. 
*/ - if ((signed1 ^ signed2) && signed_ret) - return; - - m_valid = true; - if (signed2 && !signed1) - std::swap (m_op1, m_op2); - - if (signed1 || signed2) - m_int = signed_op; - else - m_int = unsigned_op; + signed1 = TYPE_SIGN (TREE_TYPE (m_op1)) == SIGNED; + signed2 = TYPE_SIGN (TREE_TYPE (m_op2)) == SIGNED; + signed_ret = TYPE_SIGN (TREE_TYPE (ret)) == SIGNED; break; } default: - break; + return; } + else if (gimple_code (m_stmt) == GIMPLE_CALL + && gimple_call_internal_p (m_stmt) + && gimple_get_lhs (m_stmt) != NULL_TREE) + switch (gimple_call_internal_fn (m_stmt)) + { + case IFN_VEC_WIDEN_PLUS_LO: + case IFN_VEC_WIDEN_PLUS_HI: + { + signed_op = ptr_op_widen_plus_signed; + unsigned_op = ptr_op_widen_plus_unsigned; + m_valid = false; + m_op1 = gimple_call_arg (m_stmt, 0); + m_op2 = gimple_call_arg (m_stmt, 1); + tree ret = gimple_get_lhs (m_stmt); + signed1 = TYPE_SIGN (TREE_TYPE (m_op1)) == SIGNED; + signed2 = TYPE_SIGN (TREE_TYPE (m_op2)) == SIGNED; + signed_ret = TYPE_SIGN (TREE_TYPE (ret)) == SIGNED; + break; + } + default: + return; + } + else + return; + + /* Normally these operands should all have the same sign, but some passes + and violate this by taking mismatched sign args. At the moment the only + one that's possible is mismatch inputs and unsigned output. Once ranger + supports signs for the operands we can properly fix it, for now only + accept the case we can do correctly. */ + if ((signed1 ^ signed2) && signed_ret) + return; + + m_valid = true; + if (signed2 && !signed1) + std::swap (m_op1, m_op2); + + if (signed1 || signed2) + m_int = signed_op; + else + m_int = unsigned_op; } // Set up a gimple_range_op_handler for any built in function which can be diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc index 5c9da73ea11f8060b18dcf513599c9694fa4f2ad..1acea5ae33046b70de247b1688aea874d9956abc 100644 --- a/gcc/internal-fn.cc +++ b/gcc/internal-fn.cc @@ -90,6 +90,19 @@ lookup_internal_fn (const char *name) return entry ? 
*entry : IFN_LAST; } +/* Given an internal_fn IFN that is a HILO function, return its corresponding + LO and HI internal_fns. */ + +extern void +lookup_hilo_internal_fn (internal_fn ifn, internal_fn *lo, internal_fn *hi) +{ + gcc_assert (decomposes_to_hilo_fn_p (ifn)); + + *lo = internal_fn (ifn + 1); + *hi = internal_fn (ifn + 2); +} + + /* Fnspec of each internal function, indexed by function number. */ const_tree internal_fn_fnspec_array[IFN_LAST + 1]; @@ -137,7 +150,16 @@ const direct_internal_fn_info direct_internal_fn_array[IFN_LAST + 1] = { #define DEF_INTERNAL_OPTAB_FN(CODE, FLAGS, OPTAB, TYPE) TYPE##_direct, #define DEF_INTERNAL_SIGNED_OPTAB_FN(CODE, FLAGS, SELECTOR, SIGNED_OPTAB, \ UNSIGNED_OPTAB, TYPE) TYPE##_direct, +#undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN +#undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN +#define DEF_INTERNAL_OPTAB_WIDENING_HILO_FN(CODE, FLAGS, SELECTOR, SIGNED_OPTAB, \ + UNSIGNED_OPTAB, TYPE) \ +TYPE##_direct, TYPE##_direct, TYPE##_direct, +#define DEF_INTERNAL_OPTAB_NARROWING_HILO_FN(CODE, FLAGS, OPTAB, TYPE) \ +TYPE##_direct, TYPE##_direct, TYPE##_direct, #include "internal-fn.def" +#undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN +#undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN not_direct }; @@ -3852,7 +3874,7 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types, /* Return the optab used by internal function FN. */ -static optab +optab direct_internal_fn_optab (internal_fn fn, tree_pair types) { switch (fn) @@ -3971,6 +3993,9 @@ commutative_binary_fn_p (internal_fn fn) case IFN_UBSAN_CHECK_MUL: case IFN_ADD_OVERFLOW: case IFN_MUL_OVERFLOW: + case IFN_VEC_WIDEN_PLUS_HILO: + case IFN_VEC_WIDEN_PLUS_LO: + case IFN_VEC_WIDEN_PLUS_HI: return true; default: @@ -4044,6 +4069,88 @@ first_commutative_argument (internal_fn fn) } } +/* Return true if this CODE describes an internal_fn that returns a vector with + elements twice as wide as the element size of the input vectors. 
*/ + +bool +widening_fn_p (code_helper code) +{ + if (!code.is_fn_code ()) + return false; + + if (!internal_fn_p ((combined_fn) code)) + return false; + + internal_fn fn = as_internal_fn ((combined_fn) code); + switch (fn) + { + #undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN + #define DEF_INTERNAL_OPTAB_WIDENING_HILO_FN(NAME, F, S, SO, UO, T) \ + case IFN_##NAME##_HILO:\ + case IFN_##NAME##_HI: \ + case IFN_##NAME##_LO: \ + return true; + #include "internal-fn.def" + #undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN + + default: + return false; + } +} + +/* Return true if this CODE describes an internal_fn that returns a vector with + elements twice as narrow as the element size of the input vectors. */ + +bool +narrowing_fn_p (code_helper code) +{ + if (!code.is_fn_code ()) + return false; + + if (!internal_fn_p ((combined_fn) code)) + return false; + + internal_fn fn = as_internal_fn ((combined_fn) code); + switch (fn) + { + #undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN + #define DEF_INTERNAL_OPTAB_NARROWING_HILO_FN(NAME, F, O, T) \ + case IFN_##NAME##_HILO:\ + case IFN_##NAME##_HI: \ + case IFN_##NAME##_LO: \ + return true; + #include "internal-fn.def" + #undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN + + default: + return false; + } +} + +/* Return true if FN decomposes to _hi and _lo IFN. */ + +bool +decomposes_to_hilo_fn_p (internal_fn fn) +{ + switch (fn) + { + #undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN + #define DEF_INTERNAL_OPTAB_WIDENING_HILO_FN(NAME, F, S, SO, UO, T) \ + case IFN_##NAME##_HILO:\ + return true; + #undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN + #define DEF_INTERNAL_OPTAB_NARROWING_HILO_FN(NAME, F, O, T) \ + case IFN_##NAME##_HILO:\ + return true; + #include "internal-fn.def" + #undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN + #undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN + + default: + return false; + } +} + /* Return true if IFN_SET_EDOM is supported. 
*/ bool @@ -4071,7 +4178,33 @@ set_edom_supported_p (void) optab which_optab = direct_internal_fn_optab (fn, types); \ expand_##TYPE##_optab_fn (fn, stmt, which_optab); \ } +#define DEF_INTERNAL_OPTAB_WIDENING_HILO_FN(CODE, FLAGS, SELECTOR, \ + SIGNED_OPTAB, UNSIGNED_OPTAB, \ + TYPE) \ + static void \ + expand_##CODE##_HILO (internal_fn fn ATTRIBUTE_UNUSED, \ + gcall *stmt ATTRIBUTE_UNUSED) \ + { \ + gcc_unreachable (); \ + } \ + DEF_INTERNAL_SIGNED_OPTAB_FN(CODE##_HI, FLAGS, SELECTOR, SIGNED_OPTAB, \ + UNSIGNED_OPTAB, TYPE) \ + DEF_INTERNAL_SIGNED_OPTAB_FN(CODE##_LO, FLAGS, SELECTOR, SIGNED_OPTAB, \ + UNSIGNED_OPTAB, TYPE) +#define DEF_INTERNAL_OPTAB_NARROWING_HILO_FN(CODE, FLAGS, OPTAB, TYPE) \ + static void \ + expand_##CODE##_HILO (internal_fn fn ATTRIBUTE_UNUSED, \ + gcall *stmt ATTRIBUTE_UNUSED) \ + { \ + gcc_unreachable (); \ + } \ + DEF_INTERNAL_OPTAB_FN(CODE##_LO, FLAGS, OPTAB, TYPE) \ + DEF_INTERNAL_OPTAB_FN(CODE##_HI, FLAGS, OPTAB, TYPE) #include "internal-fn.def" +#undef DEF_INTERNAL_OPTAB_FN +#undef DEF_INTERNAL_SIGNED_OPTAB_FN +#undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN +#undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN /* Routines to expand each internal function, indexed by function number. Each routine has the prototype: @@ -4080,6 +4213,7 @@ set_edom_supported_p (void) where STMT is the statement that performs the call. */ static void (*const internal_fn_expanders[]) (internal_fn, gcall *) = { + #define DEF_INTERNAL_FN(CODE, FLAGS, FNSPEC) expand_##CODE, #include "internal-fn.def" 0 diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def index 7fe742c2ae713e7152ab05cfdfba86e4e0aa3456..012dd323b86dd7cfcc5c13d3a2bb2a453937155d 100644 --- a/gcc/internal-fn.def +++ b/gcc/internal-fn.def @@ -85,6 +85,13 @@ along with GCC; see the file COPYING3. If not see says that the function extends the C-level BUILT_IN_<NAME>{,L,LL,IMAX} group of functions to any integral mode (including vector modes). 
+ DEF_INTERNAL_SIGNED_OPTAB_HILO_FN is like DEF_INTERNAL_OPTAB_FN except it + provides convenience wrappers for defining conversions that require a + hi/lo split, like widening and narrowing operations. Each definition + for <NAME> will require an optab named <OPTAB> and two other optabs that + you specify for signed and unsigned. + + Each entry must have a corresponding expander of the form: void expand_NAME (gimple_call stmt) @@ -123,6 +130,20 @@ along with GCC; see the file COPYING3. If not see DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE) #endif +#ifndef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN +#define DEF_INTERNAL_OPTAB_WIDENING_HILO_FN(NAME, FLAGS, SELECTOR, SOPTAB, UOPTAB, TYPE) \ + DEF_INTERNAL_FN (NAME##_HILO, FLAGS | ECF_LEAF, NULL) \ + DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _LO, FLAGS, SELECTOR, SOPTAB##_lo, UOPTAB##_lo, TYPE) \ + DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _HI, FLAGS, SELECTOR, SOPTAB##_hi, UOPTAB##_hi, TYPE) +#endif + +#ifndef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN +#define DEF_INTERNAL_OPTAB_NARROWING_HILO_FN(NAME, FLAGS, OPTAB, TYPE) \ + DEF_INTERNAL_FN (NAME##_HILO, FLAGS | ECF_LEAF, NULL) \ + DEF_INTERNAL_OPTAB_FN (NAME ## _LO, FLAGS, OPTAB##_lo, TYPE) \ + DEF_INTERNAL_OPTAB_FN (NAME ## _HI, FLAGS, OPTAB##_hi, TYPE) +#endif + DEF_INTERNAL_OPTAB_FN (MASK_LOAD, ECF_PURE, maskload, mask_load) DEF_INTERNAL_OPTAB_FN (LOAD_LANES, ECF_CONST, vec_load_lanes, load_lanes) DEF_INTERNAL_OPTAB_FN (MASK_LOAD_LANES, ECF_PURE, @@ -315,6 +336,16 @@ DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT270, ECF_CONST, cadd270, binary) DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL, ECF_CONST, cmul, binary) DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL_CONJ, ECF_CONST, cmul_conj, binary) DEF_INTERNAL_OPTAB_FN (VEC_ADDSUB, ECF_CONST, vec_addsub, binary) +DEF_INTERNAL_OPTAB_WIDENING_HILO_FN (VEC_WIDEN_PLUS, + ECF_CONST | ECF_NOTHROW, + first, + vec_widen_sadd, vec_widen_uadd, + binary) +DEF_INTERNAL_OPTAB_WIDENING_HILO_FN (VEC_WIDEN_MINUS, + ECF_CONST | ECF_NOTHROW, + first, + vec_widen_ssub, 
vec_widen_usub, + binary) DEF_INTERNAL_OPTAB_FN (VEC_FMADDSUB, ECF_CONST, vec_fmaddsub, ternary) DEF_INTERNAL_OPTAB_FN (VEC_FMSUBADD, ECF_CONST, vec_fmsubadd, ternary) diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h index 08922ed4254898f5fffca3f33973e96ed9ce772f..8ba07d6d1338e75bc5a451d9e403112a608f3ea2 100644 --- a/gcc/internal-fn.h +++ b/gcc/internal-fn.h @@ -20,6 +20,10 @@ along with GCC; see the file COPYING3. If not see #ifndef GCC_INTERNAL_FN_H #define GCC_INTERNAL_FN_H +#include "insn-codes.h" +#include "insn-opinit.h" + + /* INTEGER_CST values for IFN_UNIQUE function arg-0. UNSPEC: Undifferentiated UNIQUE. @@ -112,6 +116,8 @@ internal_fn_name (enum internal_fn fn) } extern internal_fn lookup_internal_fn (const char *); +extern void lookup_hilo_internal_fn (internal_fn, internal_fn *, internal_fn *); +extern optab direct_internal_fn_optab (internal_fn, tree_pair); /* Return the ECF_* flags for function FN. */ @@ -210,6 +216,9 @@ extern bool commutative_binary_fn_p (internal_fn); extern bool commutative_ternary_fn_p (internal_fn); extern int first_commutative_argument (internal_fn); extern bool associative_binary_fn_p (internal_fn); +extern bool widening_fn_p (code_helper); +extern bool narrowing_fn_p (code_helper); +extern bool decomposes_to_hilo_fn_p (internal_fn); extern bool set_edom_supported_p (void); diff --git a/gcc/optabs.cc b/gcc/optabs.cc index c8e39c82d57a7d726e7da33d247b80f32ec9236c..5a08d91e550b2d92e9572211f811fdba99a33a38 100644 --- a/gcc/optabs.cc +++ b/gcc/optabs.cc @@ -1314,7 +1314,15 @@ commutative_optab_p (optab binoptab) || binoptab == smul_widen_optab || binoptab == umul_widen_optab || binoptab == smul_highpart_optab - || binoptab == umul_highpart_optab); + || binoptab == umul_highpart_optab + || binoptab == vec_widen_saddl_hi_optab + || binoptab == vec_widen_saddl_lo_optab + || binoptab == vec_widen_uaddl_hi_optab + || binoptab == vec_widen_uaddl_lo_optab + || binoptab == vec_widen_sadd_hi_optab + || binoptab == 
vec_widen_sadd_lo_optab + || binoptab == vec_widen_uadd_hi_optab + || binoptab == vec_widen_uadd_lo_optab); } /* X is to be used in mode MODE as operand OPN to BINOPTAB. If we're diff --git a/gcc/optabs.def b/gcc/optabs.def index 695f5911b300c9ca5737de9be809fa01aabe5e01..16d121722c8c5723d9b164f5a2c616dc7ec143de 100644 --- a/gcc/optabs.def +++ b/gcc/optabs.def @@ -410,6 +410,10 @@ OPTAB_D (vec_widen_ssubl_hi_optab, "vec_widen_ssubl_hi_$a") OPTAB_D (vec_widen_ssubl_lo_optab, "vec_widen_ssubl_lo_$a") OPTAB_D (vec_widen_saddl_hi_optab, "vec_widen_saddl_hi_$a") OPTAB_D (vec_widen_saddl_lo_optab, "vec_widen_saddl_lo_$a") +OPTAB_D (vec_widen_ssub_hi_optab, "vec_widen_ssub_hi_$a") +OPTAB_D (vec_widen_ssub_lo_optab, "vec_widen_ssub_lo_$a") +OPTAB_D (vec_widen_sadd_hi_optab, "vec_widen_sadd_hi_$a") +OPTAB_D (vec_widen_sadd_lo_optab, "vec_widen_sadd_lo_$a") OPTAB_D (vec_widen_sshiftl_hi_optab, "vec_widen_sshiftl_hi_$a") OPTAB_D (vec_widen_sshiftl_lo_optab, "vec_widen_sshiftl_lo_$a") OPTAB_D (vec_widen_umult_even_optab, "vec_widen_umult_even_$a") @@ -422,6 +426,10 @@ OPTAB_D (vec_widen_usubl_hi_optab, "vec_widen_usubl_hi_$a") OPTAB_D (vec_widen_usubl_lo_optab, "vec_widen_usubl_lo_$a") OPTAB_D (vec_widen_uaddl_hi_optab, "vec_widen_uaddl_hi_$a") OPTAB_D (vec_widen_uaddl_lo_optab, "vec_widen_uaddl_lo_$a") +OPTAB_D (vec_widen_usub_hi_optab, "vec_widen_usub_hi_$a") +OPTAB_D (vec_widen_usub_lo_optab, "vec_widen_usub_lo_$a") +OPTAB_D (vec_widen_uadd_hi_optab, "vec_widen_uadd_hi_$a") +OPTAB_D (vec_widen_uadd_lo_optab, "vec_widen_uadd_lo_$a") OPTAB_D (vec_addsub_optab, "vec_addsub$a3") OPTAB_D (vec_fmaddsub_optab, "vec_fmaddsub$a4") OPTAB_D (vec_fmsubadd_optab, "vec_fmsubadd$a4") diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c index 220bd9352a4c7acd2e3713e441d74898d3e92b30..7037673d32bd780e1c9b58a51e58e2bac3b30b7e 100644 --- a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c +++ 
b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c @@ -1,5 +1,5 @@ /* { dg-do run } */ -/* { dg-options "-O3 -save-temps" } */ +/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */ #include <stdint.h> #include <string.h> @@ -86,6 +86,8 @@ main() return 0; } +/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_LO" "vect" } } */ +/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_HI" "vect" } } */ /* { dg-final { scan-assembler-times {\tuaddl\t} 1} } */ /* { dg-final { scan-assembler-times {\tuaddl2\t} 1} } */ /* { dg-final { scan-assembler-times {\tsaddl\t} 1} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c index a2bed63affbd091977df95a126da1f5b8c1d41d2..83bc1edb6105f47114b665e24a13e6194b2179a2 100644 --- a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c +++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c @@ -1,5 +1,5 @@ /* { dg-do run } */ -/* { dg-options "-O3 -save-temps" } */ +/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */ #include <stdint.h> #include <string.h> @@ -86,6 +86,8 @@ main() return 0; } +/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_LO" "vect" } } */ +/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_HI" "vect" } } */ /* { dg-final { scan-assembler-times {\tusubl\t} 1} } */ /* { dg-final { scan-assembler-times {\tusubl2\t} 1} } */ /* { dg-final { scan-assembler-times {\tssubl\t} 1} } */ diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc index 0aeebb67fac864db284985f4a6f0653af281d62b..28464ad9e3a7ea25557ffebcdbdbc1340f9e0d8b 100644 --- a/gcc/tree-cfg.cc +++ b/gcc/tree-cfg.cc @@ -65,6 +65,7 @@ along with GCC; see the file COPYING3. If not see #include "asan.h" #include "profile.h" #include "sreal.h" +#include "internal-fn.h" /* This file contains functions for building the Control Flow Graph (CFG) for a function tree. 
*/ @@ -3411,6 +3412,52 @@ verify_gimple_call (gcall *stmt) debug_generic_stmt (fn); return true; } + internal_fn ifn = gimple_call_internal_fn (stmt); + if (ifn == IFN_LAST) + { + error ("gimple call has an invalid IFN"); + debug_generic_stmt (fn); + return true; + } + else if (decomposes_to_hilo_fn_p (ifn)) + { + /* Non decomposed HILO stmts should not appear in IL, these are + merely used as an internal representation to the auto-vectorizer + pass and should have been expanded to their _LO _HI variants. */ + error ("gimple call has an non decomposed HILO IFN"); + debug_generic_stmt (fn); + return true; + } + else if (ifn == IFN_VEC_WIDEN_PLUS_LO + || ifn == IFN_VEC_WIDEN_PLUS_HI + || ifn == IFN_VEC_WIDEN_MINUS_LO + || ifn == IFN_VEC_WIDEN_MINUS_HI) + { + tree rhs1_type = TREE_TYPE (gimple_call_arg (stmt, 0)); + tree rhs2_type = TREE_TYPE (gimple_call_arg (stmt, 1)); + tree lhs_type = TREE_TYPE (gimple_get_lhs (stmt)); + if (TREE_CODE (lhs_type) == VECTOR_TYPE) + { + if (TREE_CODE (rhs1_type) != VECTOR_TYPE + || TREE_CODE (rhs2_type) != VECTOR_TYPE) + { + error ("invalid non-vector operands in vector IFN call"); + debug_generic_stmt (fn); + return true; + } + lhs_type = TREE_TYPE (lhs_type); + rhs1_type = TREE_TYPE (rhs1_type); + rhs2_type = TREE_TYPE (rhs2_type); + } + if (POINTER_TYPE_P (lhs_type) + || POINTER_TYPE_P (rhs1_type) + || POINTER_TYPE_P (rhs2_type)) + { + error ("invalid (pointer) operands in vector IFN call"); + debug_generic_stmt (fn); + return true; + } + } } else { diff --git a/gcc/tree-inline.cc b/gcc/tree-inline.cc index 63a19f8d1d89c6bd5d8e55a299cbffaa324b4b84..d74d8db2173b1ab117250fea89de5212d5e354ec 100644 --- a/gcc/tree-inline.cc +++ b/gcc/tree-inline.cc @@ -4433,7 +4433,20 @@ estimate_num_insns (gimple *stmt, eni_weights *weights) tree decl; if (gimple_call_internal_p (stmt)) - return 0; + { + internal_fn fn = gimple_call_internal_fn (stmt); + switch (fn) + { + case IFN_VEC_WIDEN_PLUS_HI: + case IFN_VEC_WIDEN_PLUS_LO: + case 
IFN_VEC_WIDEN_MINUS_HI: + case IFN_VEC_WIDEN_MINUS_LO: + return 1; + + default: + return 0; + } + } else if ((decl = gimple_call_fndecl (stmt)) && fndecl_built_in_p (decl)) { diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc index 1778af0242898e3dc73d94d22a5b8505628a53b5..93cebc72beb4f65249a69b2665dfeb8a0991c1d1 100644 --- a/gcc/tree-vect-patterns.cc +++ b/gcc/tree-vect-patterns.cc @@ -562,21 +562,30 @@ vect_joust_widened_type (tree type, tree new_type, tree *common_type) static unsigned int vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code, - tree_code widened_code, bool shift_p, + code_helper widened_code, bool shift_p, unsigned int max_nops, vect_unpromoted_value *unprom, tree *common_type, enum optab_subtype *subtype = NULL) { /* Check for an integer operation with the right code. */ - gassign *assign = dyn_cast <gassign *> (stmt_info->stmt); - if (!assign) + gimple* stmt = stmt_info->stmt; + if (!(is_gimple_assign (stmt) || is_gimple_call (stmt))) + return 0; + + code_helper rhs_code; + if (is_gimple_assign (stmt)) + rhs_code = gimple_assign_rhs_code (stmt); + else if (is_gimple_call (stmt)) + rhs_code = gimple_call_combined_fn (stmt); + else return 0; - tree_code rhs_code = gimple_assign_rhs_code (assign); - if (rhs_code != code && rhs_code != widened_code) + if (rhs_code != code + && rhs_code != widened_code) return 0; - tree type = TREE_TYPE (gimple_assign_lhs (assign)); + tree lhs = gimple_get_lhs (stmt); + tree type = TREE_TYPE (lhs); if (!INTEGRAL_TYPE_P (type)) return 0; @@ -589,7 +598,7 @@ vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code, { vect_unpromoted_value *this_unprom = &unprom[next_op]; unsigned int nops = 1; - tree op = gimple_op (assign, i + 1); + tree op = gimple_arg (stmt, i); if (i == 1 && TREE_CODE (op) == INTEGER_CST) { /* We already have a common type from earlier operands. @@ -1343,7 +1352,8 @@ vect_recog_sad_pattern (vec_info *vinfo, /* FORNOW. 
Can continue analyzing the def-use chain when this stmt in a phi inside the loop (in case we are analyzing an outer-loop). */ vect_unpromoted_value unprom[2]; - if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR, WIDEN_MINUS_EXPR, + if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR, + IFN_VEC_WIDEN_MINUS_HILO, false, 2, unprom, &half_type)) return NULL; @@ -1395,14 +1405,16 @@ static gimple * vect_recog_widen_op_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info, tree *type_out, tree_code orig_code, code_helper wide_code, - bool shift_p, const char *name) + bool shift_p, const char *name, + optab_subtype *subtype = NULL) { gimple *last_stmt = last_stmt_info->stmt; vect_unpromoted_value unprom[2]; tree half_type; if (!vect_widened_op_tree (vinfo, last_stmt_info, orig_code, orig_code, - shift_p, 2, unprom, &half_type)) + shift_p, 2, unprom, &half_type, subtype)) + return NULL; /* Pattern detected. */ @@ -1468,6 +1480,20 @@ vect_recog_widen_op_pattern (vec_info *vinfo, type, pattern_stmt, vecctype); } +static gimple * +vect_recog_widen_op_pattern (vec_info *vinfo, + stmt_vec_info last_stmt_info, tree *type_out, + tree_code orig_code, internal_fn wide_ifn, + bool shift_p, const char *name, + optab_subtype *subtype = NULL) +{ + combined_fn ifn = as_combined_fn (wide_ifn); + return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out, + orig_code, ifn, shift_p, name, + subtype); +} + + /* Try to detect multiplication on widened inputs, converting MULT_EXPR to WIDEN_MULT_EXPR. See vect_recog_widen_op_pattern for details. */ @@ -1481,26 +1507,30 @@ vect_recog_widen_mult_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info, } /* Try to detect addition on widened inputs, converting PLUS_EXPR - to WIDEN_PLUS_EXPR. See vect_recog_widen_op_pattern for details. */ + to IFN_VEC_WIDEN_PLUS_HILO. See vect_recog_widen_op_pattern for details. 
*/ static gimple * vect_recog_widen_plus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info, tree *type_out) { + optab_subtype subtype; return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out, - PLUS_EXPR, WIDEN_PLUS_EXPR, false, - "vect_recog_widen_plus_pattern"); + PLUS_EXPR, IFN_VEC_WIDEN_PLUS_HILO, + false, "vect_recog_widen_plus_pattern", + &subtype); } /* Try to detect subtraction on widened inputs, converting MINUS_EXPR - to WIDEN_MINUS_EXPR. See vect_recog_widen_op_pattern for details. */ + to IFN_VEC_WIDEN_MINUS_HILO. See vect_recog_widen_op_pattern for details. */ static gimple * vect_recog_widen_minus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info, tree *type_out) { + optab_subtype subtype; return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out, - MINUS_EXPR, WIDEN_MINUS_EXPR, false, - "vect_recog_widen_minus_pattern"); + MINUS_EXPR, IFN_VEC_WIDEN_MINUS_HILO, + false, "vect_recog_widen_minus_pattern", + &subtype); } /* Function vect_recog_ctz_ffs_pattern @@ -3078,7 +3108,7 @@ vect_recog_average_pattern (vec_info *vinfo, vect_unpromoted_value unprom[3]; tree new_type; unsigned int nops = vect_widened_op_tree (vinfo, plus_stmt_info, PLUS_EXPR, - WIDEN_PLUS_EXPR, false, 3, + IFN_VEC_WIDEN_PLUS_HILO, false, 3, unprom, &new_type); if (nops == 0) return NULL; @@ -6469,6 +6499,7 @@ static vect_recog_func vect_vect_recog_func_ptrs[] = { { vect_recog_mask_conversion_pattern, "mask_conversion" }, { vect_recog_widen_plus_pattern, "widen_plus" }, { vect_recog_widen_minus_pattern, "widen_minus" }, + /* These must come after the double widening ones. 
*/ }; const unsigned int NUM_PATTERNS = ARRAY_SIZE (vect_vect_recog_func_ptrs); diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index d152ae9ab10b361b88c0f839d6951c43b954750a..24c811ebe01fb8b003100dea494cf64fea72a975 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -5038,7 +5038,9 @@ vectorizable_conversion (vec_info *vinfo, bool widen_arith = (code == WIDEN_PLUS_EXPR || code == WIDEN_MINUS_EXPR || code == WIDEN_MULT_EXPR - || code == WIDEN_LSHIFT_EXPR); + || code == WIDEN_LSHIFT_EXPR + || code == IFN_VEC_WIDEN_PLUS_HILO + || code == IFN_VEC_WIDEN_MINUS_HILO); if (!widen_arith && !CONVERT_EXPR_CODE_P (code) @@ -5088,7 +5090,9 @@ vectorizable_conversion (vec_info *vinfo, gcc_assert (code == WIDEN_MULT_EXPR || code == WIDEN_LSHIFT_EXPR || code == WIDEN_PLUS_EXPR - || code == WIDEN_MINUS_EXPR); + || code == WIDEN_MINUS_EXPR + || code == IFN_VEC_WIDEN_PLUS_HILO + || code == IFN_VEC_WIDEN_MINUS_HILO); op1 = is_gimple_assign (stmt) ? gimple_assign_rhs2 (stmt) : @@ -12478,10 +12482,43 @@ supportable_widening_operation (vec_info *vinfo, optab1 = vec_unpacks_sbool_lo_optab; optab2 = vec_unpacks_sbool_hi_optab; } - else + + if (code.is_fn_code ()) + { + internal_fn ifn = as_internal_fn ((combined_fn) code); + gcc_assert (decomposes_to_hilo_fn_p (ifn)); + + internal_fn lo, hi; + lookup_hilo_internal_fn (ifn, &lo, &hi); + *code1 = as_combined_fn (lo); + *code2 = as_combined_fn (hi); + optab1 = direct_internal_fn_optab (lo, {vectype, vectype}); + optab2 = direct_internal_fn_optab (hi, {vectype, vectype}); + } + else if (code.is_tree_code ()) { - optab1 = optab_for_tree_code (c1, vectype, optab_default); - optab2 = optab_for_tree_code (c2, vectype, optab_default); + if (code == FIX_TRUNC_EXPR) + { + /* The signedness is determined from output operand. 
*/ + optab1 = optab_for_tree_code (c1, vectype_out, optab_default); + optab2 = optab_for_tree_code (c2, vectype_out, optab_default); + } + else if (CONVERT_EXPR_CODE_P ((tree_code) code.safe_as_tree_code ()) + && VECTOR_BOOLEAN_TYPE_P (wide_vectype) + && VECTOR_BOOLEAN_TYPE_P (vectype) + && TYPE_MODE (wide_vectype) == TYPE_MODE (vectype) + && SCALAR_INT_MODE_P (TYPE_MODE (vectype))) + { + /* If the input and result modes are the same, a different optab + is needed where we pass in the number of units in vectype. */ + optab1 = vec_unpacks_sbool_lo_optab; + optab2 = vec_unpacks_sbool_hi_optab; + } + else + { + optab1 = optab_for_tree_code (c1, vectype, optab_default); + optab2 = optab_for_tree_code (c2, vectype, optab_default); + } } if (!optab1 || !optab2) diff --git a/gcc/tree.def b/gcc/tree.def index 90ceeec0b512bfa5f983359c0af03cc71de32007..b37b0b35927b92a6536e5c2d9805ffce8319a240 100644 --- a/gcc/tree.def +++ b/gcc/tree.def @@ -1374,15 +1374,16 @@ DEFTREECODE (DOT_PROD_EXPR, "dot_prod_expr", tcc_expression, 3) DEFTREECODE (WIDEN_SUM_EXPR, "widen_sum_expr", tcc_binary, 2) /* Widening sad (sum of absolute differences). - The first two arguments are of type t1 which should be integer. - The third argument and the result are of type t2, such that t2 is at least - twice the size of t1. Like DOT_PROD_EXPR, SAD_EXPR (arg1,arg2,arg3) is + The first two arguments are of type t1 which should be a vector of integers. + The third argument and the result are of type t2, such that the size of + the elements of t2 is at least twice the size of the elements of t1. + Like DOT_PROD_EXPR, SAD_EXPR (arg1,arg2,arg3) is equivalent to: - tmp = WIDEN_MINUS_EXPR (arg1, arg2) + tmp = IFN_VEC_WIDEN_MINUS_EXPR (arg1, arg2) tmp2 = ABS_EXPR (tmp) arg3 = PLUS_EXPR (tmp2, arg3) or: - tmp = WIDEN_MINUS_EXPR (arg1, arg2) + tmp = IFN_VEC_WIDEN_MINUS_EXPR (arg1, arg2) tmp2 = ABS_EXPR (tmp) arg3 = WIDEN_SUM_EXPR (tmp2, arg3) */ ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 2/3] Refactor widen_plus as internal_fn 2023-05-15 11:53 ` Andre Vieira (lists) @ 2023-05-15 12:21 ` Richard Biener 2023-05-18 17:15 ` Andre Vieira (lists) 0 siblings, 1 reply; 53+ messages in thread From: Richard Biener @ 2023-05-15 12:21 UTC (permalink / raw) To: Andre Vieira (lists); +Cc: Richard Sandiford, Richard Biener, gcc-patches On Mon, 15 May 2023, Andre Vieira (lists) wrote: > > > On 15/05/2023 12:01, Richard Biener wrote: > > On Mon, 15 May 2023, Richard Sandiford wrote: > > > >> Richard Biener <rguenther@suse.de> writes: > >>> On Fri, 12 May 2023, Richard Sandiford wrote: > >>> > >>>> Richard Biener <rguenther@suse.de> writes: > >>>>> On Fri, 12 May 2023, Andre Vieira (lists) wrote: > >>>>> > >>>>>> I have dealt with, I think..., most of your comments. There's quite a > >>>>>> few > >>>>>> changes, I think it's all a bit simpler now. I made some other changes > >>>>>> to the > >>>>>> costing in tree-inline.cc and gimple-range-op.cc in which I try to > >>>>>> preserve > >>>>>> the same behaviour as we had with the tree codes before. Also added > >>>>>> some extra > >>>>>> checks to tree-cfg.cc that made sense to me. > >>>>>> > >>>>>> I am still regression testing the gimple-range-op change, as that was a > >>>>>> last > >>>>>> minute change, but the rest survived a bootstrap and regression test on > >>>>>> aarch64-unknown-linux-gnu. > >>>>>> > >>>>>> cover letter: > >>>>>> > >>>>>> This patch replaces the existing tree_code widen_plus and widen_minus > >>>>>> patterns with internal_fn versions. > >>>>>> > >>>>>> DEF_INTERNAL_OPTAB_WIDENING_HILO_FN and > >>>>>> DEF_INTERNAL_OPTAB_NARROWING_HILO_FN > >>>>>> are like DEF_INTERNAL_SIGNED_OPTAB_FN and DEF_INTERNAL_OPTAB_FN > >>>>>> respectively > >>>>>> except they provide convenience wrappers for defining conversions that > >>>>>> require > >>>>>> a hi/lo split. 
Each definition for <NAME> will require optabs for _hi > >>>>>> and _lo > >>>>>> and each of those will also require a signed and unsigned version in > >>>>>> the case > >>>>>> of widening. The hi/lo pair is necessary because the widening and > >>>>>> narrowing > >>>>>> operations take n narrow elements as inputs and return n/2 wide > >>>>>> elements as > >>>>>> outputs. The 'lo' operation operates on the first n/2 elements of > >>>>>> input. The > >>>>>> 'hi' operation operates on the second n/2 elements of input. Defining > >>>>>> an > >>>>>> internal_fn along with hi/lo variations allows a single internal > >>>>>> function to > >>>>>> be returned from a vect_recog function that will later be expanded to > >>>>>> hi/lo. > >>>>>> > >>>>>> > >>>>>> For example: > >>>>>> IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO > >>>>>> for aarch64: IFN_VEC_WIDEN_PLUS_HI -> vec_widen_<su>add_hi_<mode> -> > >>>>>> (u/s)addl2 > >>>>>> IFN_VEC_WIDEN_PLUS_LO -> > >>>>>> vec_widen_<su>add_lo_<mode> > >>>>>> -> (u/s)addl > >>>>>> > >>>>>> This gives the same functionality as the previous > >>>>>> WIDEN_PLUS/WIDEN_MINUS tree > >>>>>> codes which are expanded into VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI. > >>>>> > >>>>> What I still don't understand is how we are so narrowly focused on > >>>>> HI/LO? We need a combined scalar IFN for pattern selection (not > >>>>> sure why that's now called _HILO, I expected no suffix). Then there's > >>>>> three possibilities the target can implement this: > >>>>> > >>>>> 1) with a widen_[su]add<mode> instruction - I _think_ that's what > >>>>> RISCV is going to offer since it is a target where vector modes > >>>>> have "padding" (aka you cannot subreg a V2SI to get V4HI). Instead > >>>>> RVV can do a V4HI to V4SI widening and widening add/subtract > >>>>> using vwadd[u] and vwsub[u] (the HI->SI widening is actually > >>>>> done with a widening add of zero - eh). > >>>>> IIRC GCN is the same here. 
> >>>> > >>>> SVE currently does this too, but the addition and widening are > >>>> separate operations. E.g. in principle there's no reason why > >>>> you can't sign-extend one operand, zero-extend the other, and > >>>> then add the result together. Or you could extend them from > >>>> different sizes (QI and HI). All of those are supported > >>>> (if the costing allows them). > >>> > >>> I see. So why does the target the expose widen_[su]add<mode> at all? > >> > >> It shouldn't (need to) do that. I don't think we should have an optab > >> for the unsplit operation. > >> > >> At least on SVE, we really want the extensions to be fused with loads > >> (where possible) rather than with arithmetic. > >> > >> We can still do the widening arithmetic in one go. It's just that > >> fusing with the loads works for the mixed-sign and mixed-size cases, > >> and can handle more than just doubling the element size. > >> > >>>> If the target has operations to do combined extending and adding (or > >>>> whatever), then at the moment we rely on combine to generate them. > >>>> > >>>> So I think this case is separate from Andre's work. The addition > >>>> itself is just an ordinary addition, and any widening happens by > >>>> vectorising a CONVERT/NOP_EXPR. > >>>> > >>>>> 2) with a widen_[su]add{_lo,_hi}<mode> combo - that's what the tree > >>>>> codes currently support (exclusively) > >>>>> 3) similar, but widen_[su]add{_even,_odd}<mode> > >>>>> > >>>>> that said, things like decomposes_to_hilo_fn_p look to paint us into > >>>>> a 2) corner without good reason. > >>>> > >>>> I suppose one question is: how much of the patch is really specific > >>>> to HI/LO, and how much is just grouping two halves together? > >>> > >>> Yep, that I don't know for sure. 
> >>> > >>>> The nice > >>>> thing about the internal-fn grouping macros is that, if (3) is > >>>> implemented in future, the structure will strongly encourage even/odd > >>>> pairs to be supported for all operations that support hi/lo. That is, > >>>> I would expect the grouping macros to be extended to define even/odd > >>>> ifns alongside hi/lo ones, rather than adding separate definitions > >>>> for even/odd functions. > >>>> > >>>> If so, at least from the internal-fn.* side of things, I think the > >>>> question > >>>> is whether it's OK to stick with hilo names for now, or whether we should > >>>> use more forward-looking names. > >>> > >>> I think for parts that are independent we could use a more > >>> forward-looking name. Maybe _halves? > >> > >> Using _halves for the ifn macros sounds good to me FWIW. > >> > >>> But I'm also not sure > >>> how much of that is really needed (it seems to be tied around > >>> optimizing optabs space?) > >> > >> Not sure what you mean by "this". Optabs space shouldn't be a problem > >> though. The optab encoding gives us a full int to play with, and it > >> could easily go up to 64 bits if necessary/convenient. > >> > >> At least on the internal-fn.* side, the aim is really just to establish > >> a regular structure, so that we don't have arbitrary differences between > >> different widening operations, or too much cut-&-paste. > > > > Hmm, I'm looking at the need for the std::map and > > internal_fn_hilo_keys_array and internal_fn_hilo_values_array. 
> > The vectorizer pieces contain > > > > + if (code.is_fn_code ()) > > + { > > + internal_fn ifn = as_internal_fn ((combined_fn) code); > > + gcc_assert (decomposes_to_hilo_fn_p (ifn)); > > + > > + internal_fn lo, hi; > > + lookup_hilo_internal_fn (ifn, &lo, &hi); > > + *code1 = as_combined_fn (lo); > > + *code2 = as_combined_fn (hi); > > + optab1 = lookup_hilo_ifn_optab (lo, !TYPE_UNSIGNED (vectype)); > > + optab2 = lookup_hilo_ifn_optab (hi, !TYPE_UNSIGNED (vectype)); > > > > so that tries to automatically associate the scalar widening IFN > > with the set(s) of IFN pairs we can split to. But then this > > list should be static and there's no need to create a std::map? > > Maybe gencfn-macros.cc can be enhanced to output these static > > cases? Or the vectorizer could (as it did previously) simply > > open-code the handled cases (I guess since we deal with two > > cases only now I'd prefer that). > > > > Thanks, > > Richard. > > > > > >> Thanks, > >> Richard > >> > > > The patch I uploaded last no longer has std::map nor > internal_fn_hilo_keys_array and internal_fn_hilo_values_array. (I've attached > it again) Whoops, too many patches ... > I'm not sure I understand the _halves, do you mean that for the case where I > had _hilo or _HILO before we rename that to _halves/_HALVES such that it later > represents both _hi/_lo separation and _even/_odd? I don't see much shared stuff, but I guess we'd see when we add a case for EVEN/ODD. The verifier contains + else if (decomposes_to_hilo_fn_p (ifn)) + { + /* Non decomposed HILO stmts should not appear in IL, these are + merely used as an internal representation to the auto-vectorizer + pass and should have been expanded to their _LO _HI variants. */ + error ("gimple call has an non decomposed HILO IFN"); + debug_generic_stmt (fn); + return true; I think to support case 1) that's not wanted. 
Instead what you could check is that the types involved are vector types, so a subset of what you check for IFN_VEC_WIDEN_PLUS_LO etc. (but oddly it's not verified those are all operating on vector types only?) +/* Given an internal_fn IFN that is a HILO function, return its corresponding + LO and HI internal_fns. */ + +extern void +lookup_hilo_internal_fn (internal_fn ifn, internal_fn *lo, internal_fn *hi) +{ + gcc_assert (decomposes_to_hilo_fn_p (ifn)); + + *lo = internal_fn (ifn + 1); + *hi = internal_fn (ifn + 2); that might become fragile if we add EVEN/ODD besides HI/LO unless we merge those with a DEF_INTERNAL_OPTAB_WIDENING_HILO_EVENODD_FN case, right? > And am I correct to assume we are just giving up on having a INTERNAL_OPTAB_FN > idea for 1)? Well, I think we want all of them in the end (or at least support them if target need arises). full vector, hi/lo and even/odd. Richard. ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 2/3] Refactor widen_plus as internal_fn 2023-05-15 12:21 ` Richard Biener @ 2023-05-18 17:15 ` Andre Vieira (lists) 2023-05-22 13:06 ` Richard Biener 0 siblings, 1 reply; 53+ messages in thread From: Andre Vieira (lists) @ 2023-05-18 17:15 UTC (permalink / raw) To: Richard Biener; +Cc: Richard Sandiford, Richard Biener, gcc-patches [-- Attachment #1: Type: text/plain, Size: 2935 bytes --] How about this? Not sure about the DEF_INTERNAL documentation I rewrote in internal-fn.def, was struggling to word these, so improvements welcome! gcc/ChangeLog: 2023-04-25 Andre Vieira <andre.simoesdiasvieira@arm.com> Joel Hutton <joel.hutton@arm.com> Tamar Christina <tamar.christina@arm.com> * config/aarch64/aarch64-simd.md (vec_widen_<su>addl_lo_<mode>): Rename this ... (vec_widen_<su>add_lo_<mode>): ... to this. (vec_widen_<su>addl_hi_<mode>): Rename this ... (vec_widen_<su>add_hi_<mode>): ... to this. (vec_widen_<su>subl_lo_<mode>): Rename this ... (vec_widen_<su>sub_lo_<mode>): ... to this. (vec_widen_<su>subl_hi_<mode>): Rename this ... (vec_widen_<su>sub_hi_<mode>): ...to this. * doc/generic.texi: Document new IFN codes. * internal-fn.cc (ifn_cmp): Function to compare ifn's for sorting/searching. (lookup_hilo_internal_fn): Add lookup function. (commutative_binary_fn_p): Add widen_plus fn's. (widening_fn_p): New function. (narrowing_fn_p): New function. (direct_internal_fn_optab): Change visibility. * internal-fn.def (DEF_INTERNAL_WIDENING_OPTAB_FN): Macro to define an internal_fn that expands into multiple internal_fns for widening. (DEF_INTERNAL_NARROWING_OPTAB_FN): Likewise but for narrowing. (IFN_VEC_WIDEN_PLUS, IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO, IFN_VEC_WIDEN_PLUS_EVEN, IFN_VEC_WIDEN_PLUS_ODD, IFN_VEC_WIDEN_MINUS, IFN_VEC_WIDEN_MINUS_HI, IFN_VEC_WIDEN_MINUS_LO, IFN_VEC_WIDEN_MINUS_ODD, IFN_VEC_WIDEN_MINUS_EVEN): Define widening plus,minus functions. * internal-fn.h (direct_internal_fn_optab): Declare new prototype. 
(lookup_hilo_internal_fn): Likewise. (widening_fn_p): Likewise. (Narrowing_fn_p): Likewise. * optabs.cc (commutative_optab_p): Add widening plus optabs. * optabs.def (OPTAB_D): Define widen add, sub optabs. * tree-cfg.cc (verify_gimple_call): Add checks for widening ifns. * tree-inline.cc (estimate_num_insns): Return same cost for widen add and sub IFNs as previous tree_codes. * tree-vect-patterns.cc (vect_recog_widen_op_pattern): Support patterns with a hi/lo or even/odd split. (vect_recog_sad_pattern): Refactor to use new IFN codes. (vect_recog_widen_plus_pattern): Likewise. (vect_recog_widen_minus_pattern): Likewise. (vect_recog_average_pattern): Likewise. * tree-vect-stmts.cc (vectorizable_conversion): Add support for _HILO IFNs. (supportable_widening_operation): Likewise. * tree.def (WIDEN_SUM_EXPR): Update example to use new IFNs. gcc/testsuite/ChangeLog: * gcc.target/aarch64/vect-widen-add.c: Test that new IFN_VEC_WIDEN_PLUS is being used. * gcc.target/aarch64/vect-widen-sub.c: Test that new IFN_VEC_WIDEN_MINUS is being used. 
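[Reader's note: the .def-include machinery the ChangeLog refers to follows GCC's usual X-macro pattern — one list is included several times under different macro definitions. A self-contained sketch of the idea, with hypothetical names rather than the actual internal-fn.def contents:]

```cpp
#include <cassert>

// A .def-style list of widening operations, expanded repeatedly below.
#define WIDENING_FN_LIST(X) X (VEC_WIDEN_PLUS) X (VEC_WIDEN_MINUS)

// First expansion: generate _HI/_LO enumerators for each entry.
enum internal_fn
{
#define X(NAME) IFN_##NAME##_HI, IFN_##NAME##_LO,
  WIDENING_FN_LIST (X)
#undef X
  IFN_LAST
};

// Second expansion: generate the classification switch, analogous to what
// widening_fn_p does with DEF_INTERNAL_OPTAB_WIDENING_HILO_FN in the patch.
bool
widening_fn_p (internal_fn fn)
{
  switch (fn)
    {
#define X(NAME) case IFN_##NAME##_HI: case IFN_##NAME##_LO:
      WIDENING_FN_LIST (X)
#undef X
      return true;
    default:
      return false;
    }
}
```

Keeping the enum and the predicate generated from one list is what guarantees they can never drift out of sync when a new widening operation is added.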
[-- Attachment #2: ifn1v4.patch --] [-- Type: text/plain, Size: 39515 bytes --] diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index bfc98a8d943467b33390defab9682f44efab5907..ffbbecb9409e1c2835d658c2a8855cd0e955c0f2 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -4626,7 +4626,7 @@ [(set_attr "type" "neon_<ADDSUB:optab>_long")] ) -(define_expand "vec_widen_<su>addl_lo_<mode>" +(define_expand "vec_widen_<su>add_lo_<mode>" [(match_operand:<VWIDE> 0 "register_operand") (ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand")) (ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))] @@ -4638,7 +4638,7 @@ DONE; }) -(define_expand "vec_widen_<su>addl_hi_<mode>" +(define_expand "vec_widen_<su>add_hi_<mode>" [(match_operand:<VWIDE> 0 "register_operand") (ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand")) (ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))] @@ -4650,7 +4650,7 @@ DONE; }) -(define_expand "vec_widen_<su>subl_lo_<mode>" +(define_expand "vec_widen_<su>sub_lo_<mode>" [(match_operand:<VWIDE> 0 "register_operand") (ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand")) (ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))] @@ -4662,7 +4662,7 @@ DONE; }) -(define_expand "vec_widen_<su>subl_hi_<mode>" +(define_expand "vec_widen_<su>sub_hi_<mode>" [(match_operand:<VWIDE> 0 "register_operand") (ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand")) (ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))] diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi index 8b2882da4fe7da07d22b4e5384d049ba7d3907bf..5e36dac2b1a10257616f12cdfb0b12d0f2879ae9 100644 --- a/gcc/doc/generic.texi +++ b/gcc/doc/generic.texi @@ -1811,10 +1811,16 @@ a value from @code{enum annot_expr_kind}, the third is an @code{INTEGER_CST}. 
@tindex VEC_RSHIFT_EXPR @tindex VEC_WIDEN_MULT_HI_EXPR @tindex VEC_WIDEN_MULT_LO_EXPR -@tindex VEC_WIDEN_PLUS_HI_EXPR -@tindex VEC_WIDEN_PLUS_LO_EXPR -@tindex VEC_WIDEN_MINUS_HI_EXPR -@tindex VEC_WIDEN_MINUS_LO_EXPR +@tindex IFN_VEC_WIDEN_PLUS +@tindex IFN_VEC_WIDEN_PLUS_HI +@tindex IFN_VEC_WIDEN_PLUS_LO +@tindex IFN_VEC_WIDEN_PLUS_EVEN +@tindex IFN_VEC_WIDEN_PLUS_ODD +@tindex IFN_VEC_WIDEN_MINUS +@tindex IFN_VEC_WIDEN_MINUS_HI +@tindex IFN_VEC_WIDEN_MINUS_LO +@tindex IFN_VEC_WIDEN_MINUS_EVEN +@tindex IFN_VEC_WIDEN_MINUS_ODD @tindex VEC_UNPACK_HI_EXPR @tindex VEC_UNPACK_LO_EXPR @tindex VEC_UNPACK_FLOAT_HI_EXPR @@ -1861,6 +1867,82 @@ vector of @code{N/2} products. In the case of @code{VEC_WIDEN_MULT_LO_EXPR} the low @code{N/2} elements of the two vector are multiplied to produce the vector of @code{N/2} products. +@item IFN_VEC_WIDEN_PLUS +This internal function represents widening vector addition of two input +vectors. Its operands are vectors that contain the same number of elements +(@code{N}) of the same integral type. The result is a vector that contains +the same number (@code{N}) of elements as the input vectors, of an integral +type whose size is twice as wide. If the current target does not implement the +corresponding optabs the vectorizer may choose to split it into either a pair +of @code{IFN_VEC_WIDEN_PLUS_HI} and @code{IFN_VEC_WIDEN_PLUS_LO} or +@code{IFN_VEC_WIDEN_PLUS_EVEN} and @code{IFN_VEC_WIDEN_PLUS_ODD}, depending +on what optabs the target implements. + +@item IFN_VEC_WIDEN_PLUS_HI +@itemx IFN_VEC_WIDEN_PLUS_LO +These internal functions represent widening vector addition of the high and low +parts of the two input vectors, respectively. Their operands are vectors that +contain the same number of elements (@code{N}) of the same integral type. The +result is a vector that contains half as many elements, of an integral type +whose size is twice as wide.
In the case of @code{IFN_VEC_WIDEN_PLUS_HI} the +high @code{N/2} elements of the two vectors are added to produce the vector of +@code{N/2} additions. In the case of @code{IFN_VEC_WIDEN_PLUS_LO} the low +@code{N/2} elements of the two vectors are added to produce the vector of +@code{N/2} additions. + +@item IFN_VEC_WIDEN_PLUS_EVEN +@itemx IFN_VEC_WIDEN_PLUS_ODD +These internal functions represent widening vector addition of the even and odd +elements of the two input vectors, respectively. Their operands are vectors +that contain the same number of elements (@code{N}) of the same integral type. +The result is a vector that contains half as many elements, of an integral type +whose size is twice as wide. In the case of @code{IFN_VEC_WIDEN_PLUS_EVEN} the +even @code{N/2} elements of the two vectors are added to produce the vector of +@code{N/2} additions. In the case of @code{IFN_VEC_WIDEN_PLUS_ODD} the odd +@code{N/2} elements of the two vectors are added to produce the vector of +@code{N/2} additions. + +@item IFN_VEC_WIDEN_MINUS +This internal function represents widening vector subtraction of two input +vectors. Its operands are vectors that contain the same number of elements +(@code{N}) of the same integral type. The result is a vector that contains +the same number (@code{N}) of elements as the input vectors, of an integral +type whose size is twice as wide. If the current target does not implement the +corresponding optabs the vectorizer may choose to split it into either a pair +of @code{IFN_VEC_WIDEN_MINUS_HI} and @code{IFN_VEC_WIDEN_MINUS_LO} or +@code{IFN_VEC_WIDEN_MINUS_EVEN} and @code{IFN_VEC_WIDEN_MINUS_ODD}, depending +on what optabs the target implements. + +@item IFN_VEC_WIDEN_MINUS_HI +@itemx IFN_VEC_WIDEN_MINUS_LO +These internal functions represent widening vector subtraction of the high and +low parts of the two input vectors, respectively.
Their operands are vectors +that contain the same number of elements (@code{N}) of the same integral type. +The high/low elements of the second vector are subtracted from the high/low +elements of the first. The result is a vector that contains half as many +elements, of an integral type whose size is twice as wide. In the case of +@code{IFN_VEC_WIDEN_MINUS_HI} the high @code{N/2} elements of the second +vector are subtracted from the high @code{N/2} of the first to produce the +vector of @code{N/2} subtractions. In the case of +@code{IFN_VEC_WIDEN_MINUS_LO} the low @code{N/2} elements of the second +vector are subtracted from the low @code{N/2} of the first to produce the +vector of @code{N/2} subtractions. + +@item IFN_VEC_WIDEN_MINUS_EVEN +@itemx IFN_VEC_WIDEN_MINUS_ODD +These internal functions represent widening vector subtraction of the even and +odd parts of the two input vectors, respectively. Their operands are vectors +that contain the same number of elements (@code{N}) of the same integral type. +The even/odd elements of the second vector are subtracted from the even/odd +elements of the first. The result is a vector that contains half as many +elements, of an integral type whose size is twice as wide. In the case of +@code{IFN_VEC_WIDEN_MINUS_EVEN} the even @code{N/2} elements of the second +vector are subtracted from the even @code{N/2} of the first to produce the +vector of @code{N/2} subtractions. In the case of +@code{IFN_VEC_WIDEN_MINUS_ODD} the odd @code{N/2} elements of the second +vector are subtracted from the odd @code{N/2} of the first to produce the +vector of @code{N/2} subtractions. 
+ @item VEC_WIDEN_PLUS_HI_EXPR @itemx VEC_WIDEN_PLUS_LO_EXPR These nodes represent widening vector addition of the high and low parts of diff --git a/gcc/gimple-range-op.cc b/gcc/gimple-range-op.cc index 594bd3043f0e944299ddfff219f757ef15a3dd61..33f4b7064a2a22aad49f27b24b409e91a5b89c69 100644 --- a/gcc/gimple-range-op.cc +++ b/gcc/gimple-range-op.cc @@ -1187,6 +1187,7 @@ gimple_range_op_handler::maybe_non_standard () { range_operator *signed_op = ptr_op_widen_mult_signed; range_operator *unsigned_op = ptr_op_widen_mult_unsigned; + bool signed1, signed2, signed_ret; if (gimple_code (m_stmt) == GIMPLE_ASSIGN) switch (gimple_assign_rhs_code (m_stmt)) { @@ -1202,32 +1203,55 @@ gimple_range_op_handler::maybe_non_standard () m_op1 = gimple_assign_rhs1 (m_stmt); m_op2 = gimple_assign_rhs2 (m_stmt); tree ret = gimple_assign_lhs (m_stmt); - bool signed1 = TYPE_SIGN (TREE_TYPE (m_op1)) == SIGNED; - bool signed2 = TYPE_SIGN (TREE_TYPE (m_op2)) == SIGNED; - bool signed_ret = TYPE_SIGN (TREE_TYPE (ret)) == SIGNED; - - /* Normally these operands should all have the same sign, but - some passes and violate this by taking mismatched sign args. At - the moment the only one that's possible is mismatch inputs and - unsigned output. Once ranger supports signs for the operands we - can properly fix it, for now only accept the case we can do - correctly. 
*/ - if ((signed1 ^ signed2) && signed_ret) - return; - - m_valid = true; - if (signed2 && !signed1) - std::swap (m_op1, m_op2); - - if (signed1 || signed2) - m_int = signed_op; - else - m_int = unsigned_op; + signed1 = TYPE_SIGN (TREE_TYPE (m_op1)) == SIGNED; + signed2 = TYPE_SIGN (TREE_TYPE (m_op2)) == SIGNED; + signed_ret = TYPE_SIGN (TREE_TYPE (ret)) == SIGNED; break; } default: - break; + return; + } + else if (gimple_code (m_stmt) == GIMPLE_CALL + && gimple_call_internal_p (m_stmt) + && gimple_get_lhs (m_stmt) != NULL_TREE) + switch (gimple_call_internal_fn (m_stmt)) + { + case IFN_VEC_WIDEN_PLUS_LO: + case IFN_VEC_WIDEN_PLUS_HI: + { + signed_op = ptr_op_widen_plus_signed; + unsigned_op = ptr_op_widen_plus_unsigned; + m_valid = false; + m_op1 = gimple_call_arg (m_stmt, 0); + m_op2 = gimple_call_arg (m_stmt, 1); + tree ret = gimple_get_lhs (m_stmt); + signed1 = TYPE_SIGN (TREE_TYPE (m_op1)) == SIGNED; + signed2 = TYPE_SIGN (TREE_TYPE (m_op2)) == SIGNED; + signed_ret = TYPE_SIGN (TREE_TYPE (ret)) == SIGNED; + break; + } + default: + return; } + else + return; + + /* Normally these operands should all have the same sign, but some passes + and violate this by taking mismatched sign args. At the moment the only + one that's possible is mismatch inputs and unsigned output. Once ranger + supports signs for the operands we can properly fix it, for now only + accept the case we can do correctly. */ + if ((signed1 ^ signed2) && signed_ret) + return; + + m_valid = true; + if (signed2 && !signed1) + std::swap (m_op1, m_op2); + + if (signed1 || signed2) + m_int = signed_op; + else + m_int = unsigned_op; } // Set up a gimple_range_op_handler for any built in function which can be diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc index 5c9da73ea11f8060b18dcf513599c9694fa4f2ad..348bee35a35ae4ed9a8652f5349f430c2733e1cb 100644 --- a/gcc/internal-fn.cc +++ b/gcc/internal-fn.cc @@ -90,6 +90,71 @@ lookup_internal_fn (const char *name) return entry ? 
*entry : IFN_LAST; } +/* Given an internal_fn IFN that is either a widening or narrowing function, return its + corresponding LO and HI internal_fns. */ + +extern void +lookup_hilo_internal_fn (internal_fn ifn, internal_fn *lo, internal_fn *hi) +{ + gcc_assert (widening_fn_p (ifn) || narrowing_fn_p (ifn)); + + switch (ifn) + { + default: + gcc_unreachable (); +#undef DEF_INTERNAL_FN +#undef DEF_INTERNAL_WIDENING_OPTAB_FN +#undef DEF_INTERNAL_NARROWING_OPTAB_FN +#define DEF_INTERNAL_FN(NAME, FLAGS, TYPE) +#define DEF_INTERNAL_WIDENING_OPTAB_FN(NAME, F, S, SO, UO, T) \ + case IFN_##NAME: \ + *lo = internal_fn (IFN_##NAME##_LO); \ + *hi = internal_fn (IFN_##NAME##_HI); \ + break; +#define DEF_INTERNAL_NARROWING_OPTAB_FN(NAME, F, O, T) \ + case IFN_##NAME: \ + *lo = internal_fn (IFN_##NAME##_LO); \ + *hi = internal_fn (IFN_##NAME##_HI); \ + break; +#include "internal-fn.def" +#undef DEF_INTERNAL_FN +#undef DEF_INTERNAL_WIDENING_OPTAB_FN +#undef DEF_INTERNAL_NARROWING_OPTAB_FN + } +} + +extern void +lookup_evenodd_internal_fn (internal_fn ifn, internal_fn *even, + internal_fn *odd) +{ + gcc_assert (widening_fn_p (ifn) || narrowing_fn_p (ifn)); + + switch (ifn) + { + default: + gcc_unreachable (); +#undef DEF_INTERNAL_FN +#undef DEF_INTERNAL_WIDENING_OPTAB_FN +#undef DEF_INTERNAL_NARROWING_OPTAB_FN +#define DEF_INTERNAL_FN(NAME, FLAGS, TYPE) +#define DEF_INTERNAL_WIDENING_OPTAB_FN(NAME, F, S, SO, UO, T) \ + case IFN_##NAME: \ + *even = internal_fn (IFN_##NAME##_EVEN); \ + *odd = internal_fn (IFN_##NAME##_ODD); \ + break; +#define DEF_INTERNAL_NARROWING_OPTAB_FN(NAME, F, O, T) \ + case IFN_##NAME: \ + *even = internal_fn (IFN_##NAME##_EVEN); \ + *odd = internal_fn (IFN_##NAME##_ODD); \ + break; +#include "internal-fn.def" +#undef DEF_INTERNAL_FN +#undef DEF_INTERNAL_WIDENING_OPTAB_FN +#undef DEF_INTERNAL_NARROWING_OPTAB_FN + } +} + + /* Fnspec of each internal function, indexed by function number. 
*/ const_tree internal_fn_fnspec_array[IFN_LAST + 1]; @@ -3852,7 +3917,7 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types, /* Return the optab used by internal function FN. */ -static optab +optab direct_internal_fn_optab (internal_fn fn, tree_pair types) { switch (fn) @@ -3971,6 +4036,9 @@ commutative_binary_fn_p (internal_fn fn) case IFN_UBSAN_CHECK_MUL: case IFN_ADD_OVERFLOW: case IFN_MUL_OVERFLOW: + case IFN_VEC_WIDEN_PLUS: + case IFN_VEC_WIDEN_PLUS_LO: + case IFN_VEC_WIDEN_PLUS_HI: return true; default: @@ -4044,6 +4112,68 @@ first_commutative_argument (internal_fn fn) } } +/* Return true if this CODE describes an internal_fn that returns a vector with + elements twice as wide as the element size of the input vectors. */ + +bool +widening_fn_p (code_helper code) +{ + if (!code.is_fn_code ()) + return false; + + if (!internal_fn_p ((combined_fn) code)) + return false; + + internal_fn fn = as_internal_fn ((combined_fn) code); + switch (fn) + { + #undef DEF_INTERNAL_WIDENING_OPTAB_FN + #define DEF_INTERNAL_WIDENING_OPTAB_FN(NAME, F, S, SO, UO, T) \ + case IFN_##NAME: \ + case IFN_##NAME##_HI: \ + case IFN_##NAME##_LO: \ + case IFN_##NAME##_EVEN: \ + case IFN_##NAME##_ODD: \ + return true; + #include "internal-fn.def" + #undef DEF_INTERNAL_WIDENING_OPTAB_FN + + default: + return false; + } +} + +/* Return true if this CODE describes an internal_fn that returns a vector with + elements twice as narrow as the element size of the input vectors. 
*/ + +bool +narrowing_fn_p (code_helper code) +{ + if (!code.is_fn_code ()) + return false; + + if (!internal_fn_p ((combined_fn) code)) + return false; + + internal_fn fn = as_internal_fn ((combined_fn) code); + switch (fn) + { + #undef DEF_INTERNAL_NARROWING_OPTAB_FN + #define DEF_INTERNAL_NARROWING_OPTAB_FN(NAME, F, O, T) \ + case IFN_##NAME: \ + case IFN_##NAME##_HI: \ + case IFN_##NAME##_LO: \ + case IFN_##NAME##_EVEN: \ + case IFN_##NAME##_ODD: \ + return true; + #include "internal-fn.def" + #undef DEF_INTERNAL_NARROWING_OPTAB_FN + + default: + return false; + } +} + /* Return true if IFN_SET_EDOM is supported. */ bool @@ -4072,6 +4202,8 @@ set_edom_supported_p (void) expand_##TYPE##_optab_fn (fn, stmt, which_optab); \ } #include "internal-fn.def" +#undef DEF_INTERNAL_OPTAB_FN +#undef DEF_INTERNAL_SIGNED_OPTAB_FN /* Routines to expand each internal function, indexed by function number. Each routine has the prototype: @@ -4080,6 +4212,7 @@ set_edom_supported_p (void) where STMT is the statement that performs the call. */ static void (*const internal_fn_expanders[]) (internal_fn, gcall *) = { + #define DEF_INTERNAL_FN(CODE, FLAGS, FNSPEC) expand_##CODE, #include "internal-fn.def" 0 diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def index 7fe742c2ae713e7152ab05cfdfba86e4e0aa3456..e9edaa201ad4ad171a49119efa9d6bff49add9f4 100644 --- a/gcc/internal-fn.def +++ b/gcc/internal-fn.def @@ -85,6 +85,34 @@ along with GCC; see the file COPYING3. If not see says that the function extends the C-level BUILT_IN_<NAME>{,L,LL,IMAX} group of functions to any integral mode (including vector modes).
+ DEF_INTERNAL_WIDENING_OPTAB_FN is a wrapper that defines five internal + functions with DEF_INTERNAL_SIGNED_OPTAB_FN: + - one that describes a widening operation with the same number of elements + in the output and input vectors, + - two that describe a pair of high-low widening operations where the output + vectors each have half the number of elements of the input vectors, + corresponding to the result of the widening operation on the top half and + bottom half; these have the suffixes _HI and _LO, + - and two that describe a pair of even-odd widening operations where the + output vectors each have half the number of elements of the input vectors, + corresponding to the result of the widening operation on the even and odd + elements; these have the suffixes _EVEN and _ODD. + These five internal functions will require two optabs each, a SIGNED_OPTAB + and an UNSIGNED_OPTAB. + + DEF_INTERNAL_NARROWING_OPTAB_FN is a wrapper that defines five internal + functions with DEF_INTERNAL_OPTAB_FN: + - one that describes a narrowing operation with the same number of elements + in the output and input vectors, + - two that describe a pair of high-low narrowing operations where the output + vector has the same number of elements in the top or bottom halves as the + full input vectors; these have the suffixes _HI and _LO, + - and two that describe a pair of even-odd narrowing operations where the + output vector has the same number of elements, in the even or odd positions, + as the full input vectors; these have the suffixes _EVEN and _ODD. + These five internal functions will require an optab each. + + Each entry must have a corresponding expander of the form: void expand_NAME (gimple_call stmt)
If not see DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE) #endif +#ifndef DEF_INTERNAL_WIDENING_OPTAB_FN +#define DEF_INTERNAL_WIDENING_OPTAB_FN(NAME, FLAGS, SELECTOR, SOPTAB, UOPTAB, TYPE) \ + DEF_INTERNAL_SIGNED_OPTAB_FN (NAME, FLAGS, SELECTOR, SOPTAB, UOPTAB, TYPE) \ + DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _LO, FLAGS, SELECTOR, SOPTAB##_lo, UOPTAB##_lo, TYPE) \ + DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _HI, FLAGS, SELECTOR, SOPTAB##_hi, UOPTAB##_hi, TYPE) \ + DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _EVEN, FLAGS, SELECTOR, SOPTAB##_even, UOPTAB##_even, TYPE) \ + DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _ODD, FLAGS, SELECTOR, SOPTAB##_odd, UOPTAB##_odd, TYPE) +#endif + +#ifndef DEF_INTERNAL_NARROWING_OPTAB_FN +#define DEF_INTERNAL_NARROWING_OPTAB_FN(NAME, FLAGS, OPTAB, TYPE) \ + DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE) \ + DEF_INTERNAL_OPTAB_FN (NAME ## _LO, FLAGS, OPTAB##_lo, TYPE) \ + DEF_INTERNAL_OPTAB_FN (NAME ## _HI, FLAGS, OPTAB##_hi, TYPE) \ + DEF_INTERNAL_OPTAB_FN (NAME ## _EVEN, FLAGS, OPTAB##_even, TYPE) \ + DEF_INTERNAL_OPTAB_FN (NAME ## _ODD, FLAGS, OPTAB##_odd, TYPE) +#endif + DEF_INTERNAL_OPTAB_FN (MASK_LOAD, ECF_PURE, maskload, mask_load) DEF_INTERNAL_OPTAB_FN (LOAD_LANES, ECF_CONST, vec_load_lanes, load_lanes) DEF_INTERNAL_OPTAB_FN (MASK_LOAD_LANES, ECF_PURE, @@ -315,6 +361,16 @@ DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT270, ECF_CONST, cadd270, binary) DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL, ECF_CONST, cmul, binary) DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL_CONJ, ECF_CONST, cmul_conj, binary) DEF_INTERNAL_OPTAB_FN (VEC_ADDSUB, ECF_CONST, vec_addsub, binary) +DEF_INTERNAL_WIDENING_OPTAB_FN (VEC_WIDEN_PLUS, + ECF_CONST | ECF_NOTHROW, + first, + vec_widen_sadd, vec_widen_uadd, + binary) +DEF_INTERNAL_WIDENING_OPTAB_FN (VEC_WIDEN_MINUS, + ECF_CONST | ECF_NOTHROW, + first, + vec_widen_ssub, vec_widen_usub, + binary) DEF_INTERNAL_OPTAB_FN (VEC_FMADDSUB, ECF_CONST, vec_fmaddsub, ternary) DEF_INTERNAL_OPTAB_FN (VEC_FMSUBADD, ECF_CONST, vec_fmsubadd, ternary) diff 
--git a/gcc/internal-fn.h b/gcc/internal-fn.h index 08922ed4254898f5fffca3f33973e96ed9ce772f..3904ba3ca36949d844532a6a9303f550533311a4 100644 --- a/gcc/internal-fn.h +++ b/gcc/internal-fn.h @@ -20,6 +20,10 @@ along with GCC; see the file COPYING3. If not see #ifndef GCC_INTERNAL_FN_H #define GCC_INTERNAL_FN_H +#include "insn-codes.h" +#include "insn-opinit.h" + + /* INTEGER_CST values for IFN_UNIQUE function arg-0. UNSPEC: Undifferentiated UNIQUE. @@ -112,6 +116,10 @@ internal_fn_name (enum internal_fn fn) } extern internal_fn lookup_internal_fn (const char *); +extern void lookup_hilo_internal_fn (internal_fn, internal_fn *, internal_fn *); +extern void lookup_evenodd_internal_fn (internal_fn, internal_fn *, + internal_fn *); +extern optab direct_internal_fn_optab (internal_fn, tree_pair); /* Return the ECF_* flags for function FN. */ @@ -210,6 +218,8 @@ extern bool commutative_binary_fn_p (internal_fn); extern bool commutative_ternary_fn_p (internal_fn); extern int first_commutative_argument (internal_fn); extern bool associative_binary_fn_p (internal_fn); +extern bool widening_fn_p (code_helper); +extern bool narrowing_fn_p (code_helper); extern bool set_edom_supported_p (void); diff --git a/gcc/optabs.cc b/gcc/optabs.cc index c8e39c82d57a7d726e7da33d247b80f32ec9236c..5a08d91e550b2d92e9572211f811fdba99a33a38 100644 --- a/gcc/optabs.cc +++ b/gcc/optabs.cc @@ -1314,7 +1314,15 @@ commutative_optab_p (optab binoptab) || binoptab == smul_widen_optab || binoptab == umul_widen_optab || binoptab == smul_highpart_optab - || binoptab == umul_highpart_optab); + || binoptab == umul_highpart_optab + || binoptab == vec_widen_saddl_hi_optab + || binoptab == vec_widen_saddl_lo_optab + || binoptab == vec_widen_uaddl_hi_optab + || binoptab == vec_widen_uaddl_lo_optab + || binoptab == vec_widen_sadd_hi_optab + || binoptab == vec_widen_sadd_lo_optab + || binoptab == vec_widen_uadd_hi_optab + || binoptab == vec_widen_uadd_lo_optab); } /* X is to be used in mode MODE as operand OPN 
to BINOPTAB. If we're diff --git a/gcc/optabs.def b/gcc/optabs.def index 695f5911b300c9ca5737de9be809fa01aabe5e01..d41ed6e1afaddd019c7470f965c0ad21c8b2b9d7 100644 --- a/gcc/optabs.def +++ b/gcc/optabs.def @@ -410,6 +410,16 @@ OPTAB_D (vec_widen_ssubl_hi_optab, "vec_widen_ssubl_hi_$a") OPTAB_D (vec_widen_ssubl_lo_optab, "vec_widen_ssubl_lo_$a") OPTAB_D (vec_widen_saddl_hi_optab, "vec_widen_saddl_hi_$a") OPTAB_D (vec_widen_saddl_lo_optab, "vec_widen_saddl_lo_$a") +OPTAB_D (vec_widen_ssub_optab, "vec_widen_ssub_$a") +OPTAB_D (vec_widen_ssub_hi_optab, "vec_widen_ssub_hi_$a") +OPTAB_D (vec_widen_ssub_lo_optab, "vec_widen_ssub_lo_$a") +OPTAB_D (vec_widen_ssub_odd_optab, "vec_widen_ssub_odd_$a") +OPTAB_D (vec_widen_ssub_even_optab, "vec_widen_ssub_even_$a") +OPTAB_D (vec_widen_sadd_optab, "vec_widen_sadd_$a") +OPTAB_D (vec_widen_sadd_hi_optab, "vec_widen_sadd_hi_$a") +OPTAB_D (vec_widen_sadd_lo_optab, "vec_widen_sadd_lo_$a") +OPTAB_D (vec_widen_sadd_odd_optab, "vec_widen_sadd_odd_$a") +OPTAB_D (vec_widen_sadd_even_optab, "vec_widen_sadd_even_$a") OPTAB_D (vec_widen_sshiftl_hi_optab, "vec_widen_sshiftl_hi_$a") OPTAB_D (vec_widen_sshiftl_lo_optab, "vec_widen_sshiftl_lo_$a") OPTAB_D (vec_widen_umult_even_optab, "vec_widen_umult_even_$a") @@ -422,6 +432,16 @@ OPTAB_D (vec_widen_usubl_hi_optab, "vec_widen_usubl_hi_$a") OPTAB_D (vec_widen_usubl_lo_optab, "vec_widen_usubl_lo_$a") OPTAB_D (vec_widen_uaddl_hi_optab, "vec_widen_uaddl_hi_$a") OPTAB_D (vec_widen_uaddl_lo_optab, "vec_widen_uaddl_lo_$a") +OPTAB_D (vec_widen_usub_optab, "vec_widen_usub_$a") +OPTAB_D (vec_widen_usub_hi_optab, "vec_widen_usub_hi_$a") +OPTAB_D (vec_widen_usub_lo_optab, "vec_widen_usub_lo_$a") +OPTAB_D (vec_widen_usub_odd_optab, "vec_widen_usub_odd_$a") +OPTAB_D (vec_widen_usub_even_optab, "vec_widen_usub_even_$a") +OPTAB_D (vec_widen_uadd_optab, "vec_widen_uadd_$a") +OPTAB_D (vec_widen_uadd_hi_optab, "vec_widen_uadd_hi_$a") +OPTAB_D (vec_widen_uadd_lo_optab, "vec_widen_uadd_lo_$a") +OPTAB_D 
(vec_widen_uadd_odd_optab, "vec_widen_uadd_odd_$a") +OPTAB_D (vec_widen_uadd_even_optab, "vec_widen_uadd_even_$a") OPTAB_D (vec_addsub_optab, "vec_addsub$a3") OPTAB_D (vec_fmaddsub_optab, "vec_fmaddsub$a4") OPTAB_D (vec_fmsubadd_optab, "vec_fmsubadd$a4") diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c index 220bd9352a4c7acd2e3713e441d74898d3e92b30..7037673d32bd780e1c9b58a51e58e2bac3b30b7e 100644 --- a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c +++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c @@ -1,5 +1,5 @@ /* { dg-do run } */ -/* { dg-options "-O3 -save-temps" } */ +/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */ #include <stdint.h> #include <string.h> @@ -86,6 +86,8 @@ main() return 0; } +/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_LO" "vect" } } */ +/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_HI" "vect" } } */ /* { dg-final { scan-assembler-times {\tuaddl\t} 1} } */ /* { dg-final { scan-assembler-times {\tuaddl2\t} 1} } */ /* { dg-final { scan-assembler-times {\tsaddl\t} 1} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c index a2bed63affbd091977df95a126da1f5b8c1d41d2..83bc1edb6105f47114b665e24a13e6194b2179a2 100644 --- a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c +++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c @@ -1,5 +1,5 @@ /* { dg-do run } */ -/* { dg-options "-O3 -save-temps" } */ +/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */ #include <stdint.h> #include <string.h> @@ -86,6 +86,8 @@ main() return 0; } +/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_LO" "vect" } } */ +/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_HI" "vect" } } */ /* { dg-final { scan-assembler-times {\tusubl\t} 1} } */ /* { dg-final { scan-assembler-times {\tusubl2\t} 1} } */ /* { dg-final { scan-assembler-times {\tssubl\t} 1} } */ diff 
--git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc index 0aeebb67fac864db284985f4a6f0653af281d62b..0e847cd04ca6e33f67a86a78a36d35d42aba2627 100644 --- a/gcc/tree-cfg.cc +++ b/gcc/tree-cfg.cc @@ -65,6 +65,7 @@ along with GCC; see the file COPYING3. If not see #include "asan.h" #include "profile.h" #include "sreal.h" +#include "internal-fn.h" /* This file contains functions for building the Control Flow Graph (CFG) for a function tree. */ @@ -3411,6 +3412,40 @@ verify_gimple_call (gcall *stmt) debug_generic_stmt (fn); return true; } + internal_fn ifn = gimple_call_internal_fn (stmt); + if (ifn == IFN_LAST) + { + error ("gimple call has an invalid IFN"); + debug_generic_stmt (fn); + return true; + } + else if (widening_fn_p (ifn) + || narrowing_fn_p (ifn)) + { + tree lhs = gimple_get_lhs (stmt); + if (!lhs) + { + error ("vector IFN call with no lhs"); + debug_generic_stmt (fn); + return true; + } + + bool non_vector_operands = false; + for (unsigned i = 0; i < gimple_call_num_args (stmt); ++i) + if (!VECTOR_TYPE_P (TREE_TYPE (gimple_call_arg (stmt, i)))) + { + non_vector_operands = true; + break; + } + + if (non_vector_operands + || !VECTOR_TYPE_P (TREE_TYPE (lhs))) + { + error ("invalid non-vector operands in vector IFN call"); + debug_generic_stmt (fn); + return true; + } + } } else { diff --git a/gcc/tree-inline.cc b/gcc/tree-inline.cc index 63a19f8d1d89c6bd5d8e55a299cbffaa324b4b84..d74d8db2173b1ab117250fea89de5212d5e354ec 100644 --- a/gcc/tree-inline.cc +++ b/gcc/tree-inline.cc @@ -4433,7 +4433,20 @@ estimate_num_insns (gimple *stmt, eni_weights *weights) tree decl; if (gimple_call_internal_p (stmt)) - return 0; + { + internal_fn fn = gimple_call_internal_fn (stmt); + switch (fn) + { + case IFN_VEC_WIDEN_PLUS_HI: + case IFN_VEC_WIDEN_PLUS_LO: + case IFN_VEC_WIDEN_MINUS_HI: + case IFN_VEC_WIDEN_MINUS_LO: + return 1; + + default: + return 0; + } + } else if ((decl = gimple_call_fndecl (stmt)) && fndecl_built_in_p (decl)) { diff --git a/gcc/tree-vect-patterns.cc 
b/gcc/tree-vect-patterns.cc index 1778af0242898e3dc73d94d22a5b8505628a53b5..dcd4b5561600346a2c10bd5133507329206e8837 100644 --- a/gcc/tree-vect-patterns.cc +++ b/gcc/tree-vect-patterns.cc @@ -562,21 +562,30 @@ vect_joust_widened_type (tree type, tree new_type, tree *common_type) static unsigned int vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code, - tree_code widened_code, bool shift_p, + code_helper widened_code, bool shift_p, unsigned int max_nops, vect_unpromoted_value *unprom, tree *common_type, enum optab_subtype *subtype = NULL) { /* Check for an integer operation with the right code. */ - gassign *assign = dyn_cast <gassign *> (stmt_info->stmt); - if (!assign) + gimple* stmt = stmt_info->stmt; + if (!(is_gimple_assign (stmt) || is_gimple_call (stmt))) + return 0; + + code_helper rhs_code; + if (is_gimple_assign (stmt)) + rhs_code = gimple_assign_rhs_code (stmt); + else if (is_gimple_call (stmt)) + rhs_code = gimple_call_combined_fn (stmt); + else return 0; - tree_code rhs_code = gimple_assign_rhs_code (assign); - if (rhs_code != code && rhs_code != widened_code) + if (rhs_code != code + && rhs_code != widened_code) return 0; - tree type = TREE_TYPE (gimple_assign_lhs (assign)); + tree lhs = gimple_get_lhs (stmt); + tree type = TREE_TYPE (lhs); if (!INTEGRAL_TYPE_P (type)) return 0; @@ -589,7 +598,7 @@ vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code, { vect_unpromoted_value *this_unprom = &unprom[next_op]; unsigned int nops = 1; - tree op = gimple_op (assign, i + 1); + tree op = gimple_arg (stmt, i); if (i == 1 && TREE_CODE (op) == INTEGER_CST) { /* We already have a common type from earlier operands. @@ -1343,7 +1352,8 @@ vect_recog_sad_pattern (vec_info *vinfo, /* FORNOW. Can continue analyzing the def-use chain when this stmt in a phi inside the loop (in case we are analyzing an outer-loop). 
*/ vect_unpromoted_value unprom[2]; - if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR, WIDEN_MINUS_EXPR, + if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR, + IFN_VEC_WIDEN_MINUS, false, 2, unprom, &half_type)) return NULL; @@ -1395,14 +1405,16 @@ static gimple * vect_recog_widen_op_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info, tree *type_out, tree_code orig_code, code_helper wide_code, - bool shift_p, const char *name) + bool shift_p, const char *name, + optab_subtype *subtype = NULL) { gimple *last_stmt = last_stmt_info->stmt; vect_unpromoted_value unprom[2]; tree half_type; if (!vect_widened_op_tree (vinfo, last_stmt_info, orig_code, orig_code, - shift_p, 2, unprom, &half_type)) + shift_p, 2, unprom, &half_type, subtype)) + return NULL; /* Pattern detected. */ @@ -1468,6 +1480,20 @@ vect_recog_widen_op_pattern (vec_info *vinfo, type, pattern_stmt, vecctype); } +static gimple * +vect_recog_widen_op_pattern (vec_info *vinfo, + stmt_vec_info last_stmt_info, tree *type_out, + tree_code orig_code, internal_fn wide_ifn, + bool shift_p, const char *name, + optab_subtype *subtype = NULL) +{ + combined_fn ifn = as_combined_fn (wide_ifn); + return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out, + orig_code, ifn, shift_p, name, + subtype); +} + + /* Try to detect multiplication on widened inputs, converting MULT_EXPR to WIDEN_MULT_EXPR. See vect_recog_widen_op_pattern for details. */ @@ -1481,26 +1507,30 @@ vect_recog_widen_mult_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info, } /* Try to detect addition on widened inputs, converting PLUS_EXPR - to WIDEN_PLUS_EXPR. See vect_recog_widen_op_pattern for details. */ + to IFN_VEC_WIDEN_PLUS. See vect_recog_widen_op_pattern for details. 
*/ static gimple * vect_recog_widen_plus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info, tree *type_out) { + optab_subtype subtype; return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out, - PLUS_EXPR, WIDEN_PLUS_EXPR, false, - "vect_recog_widen_plus_pattern"); + PLUS_EXPR, IFN_VEC_WIDEN_PLUS, + false, "vect_recog_widen_plus_pattern", + &subtype); } /* Try to detect subtraction on widened inputs, converting MINUS_EXPR - to WIDEN_MINUS_EXPR. See vect_recog_widen_op_pattern for details. */ + to IFN_VEC_WIDEN_MINUS. See vect_recog_widen_op_pattern for details. */ static gimple * vect_recog_widen_minus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info, tree *type_out) { + optab_subtype subtype; return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out, - MINUS_EXPR, WIDEN_MINUS_EXPR, false, - "vect_recog_widen_minus_pattern"); + MINUS_EXPR, IFN_VEC_WIDEN_MINUS, + false, "vect_recog_widen_minus_pattern", + &subtype); } /* Function vect_recog_ctz_ffs_pattern @@ -3078,7 +3108,7 @@ vect_recog_average_pattern (vec_info *vinfo, vect_unpromoted_value unprom[3]; tree new_type; unsigned int nops = vect_widened_op_tree (vinfo, plus_stmt_info, PLUS_EXPR, - WIDEN_PLUS_EXPR, false, 3, + IFN_VEC_WIDEN_PLUS, false, 3, unprom, &new_type); if (nops == 0) return NULL; @@ -6469,6 +6499,7 @@ static vect_recog_func vect_vect_recog_func_ptrs[] = { { vect_recog_mask_conversion_pattern, "mask_conversion" }, { vect_recog_widen_plus_pattern, "widen_plus" }, { vect_recog_widen_minus_pattern, "widen_minus" }, + /* These must come after the double widening ones. 
*/ }; const unsigned int NUM_PATTERNS = ARRAY_SIZE (vect_vect_recog_func_ptrs); diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index d152ae9ab10b361b88c0f839d6951c43b954750a..132c0337b7f541bfb114c0a3d2abbeffdad79880 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -5038,7 +5038,8 @@ vectorizable_conversion (vec_info *vinfo, bool widen_arith = (code == WIDEN_PLUS_EXPR || code == WIDEN_MINUS_EXPR || code == WIDEN_MULT_EXPR - || code == WIDEN_LSHIFT_EXPR); + || code == WIDEN_LSHIFT_EXPR + || widening_fn_p (code)); if (!widen_arith && !CONVERT_EXPR_CODE_P (code) @@ -5088,8 +5089,8 @@ vectorizable_conversion (vec_info *vinfo, gcc_assert (code == WIDEN_MULT_EXPR || code == WIDEN_LSHIFT_EXPR || code == WIDEN_PLUS_EXPR - || code == WIDEN_MINUS_EXPR); - + || code == WIDEN_MINUS_EXPR + || widening_fn_p (code)); op1 = is_gimple_assign (stmt) ? gimple_assign_rhs2 (stmt) : gimple_call_arg (stmt, 0); @@ -12478,26 +12479,69 @@ supportable_widening_operation (vec_info *vinfo, optab1 = vec_unpacks_sbool_lo_optab; optab2 = vec_unpacks_sbool_hi_optab; } - else - { - optab1 = optab_for_tree_code (c1, vectype, optab_default); - optab2 = optab_for_tree_code (c2, vectype, optab_default); + + vec_mode = TYPE_MODE (vectype); + if (widening_fn_p (code)) + { + /* If this is an internal fn then we must check whether the target + supports either a low-high split or an even-odd split. */ + internal_fn ifn = as_internal_fn ((combined_fn) code); + + internal_fn lo, hi, even, odd; + lookup_hilo_internal_fn (ifn, &lo, &hi); + *code1 = as_combined_fn (lo); + *code2 = as_combined_fn (hi); + optab1 = direct_internal_fn_optab (lo, {vectype, vectype}); + optab2 = direct_internal_fn_optab (hi, {vectype, vectype}); + + /* If we don't support low-high, then check for even-odd. 
*/ + if (!optab1 + || (icode1 = optab_handler (optab1, vec_mode)) == CODE_FOR_nothing + || !optab2 + || (icode2 = optab_handler (optab2, vec_mode)) == CODE_FOR_nothing) + { + lookup_evenodd_internal_fn (ifn, &even, &odd); + *code1 = as_combined_fn (even); + *code2 = as_combined_fn (odd); + optab1 = direct_internal_fn_optab (even, {vectype, vectype}); + optab2 = direct_internal_fn_optab (odd, {vectype, vectype}); + } + } + else if (code.is_tree_code ()) + { + if (code == FIX_TRUNC_EXPR) + { + /* The signedness is determined from output operand. */ + optab1 = optab_for_tree_code (c1, vectype_out, optab_default); + optab2 = optab_for_tree_code (c2, vectype_out, optab_default); + } + else if (CONVERT_EXPR_CODE_P ((tree_code) code.safe_as_tree_code ()) + && VECTOR_BOOLEAN_TYPE_P (wide_vectype) + && VECTOR_BOOLEAN_TYPE_P (vectype) + && TYPE_MODE (wide_vectype) == TYPE_MODE (vectype) + && SCALAR_INT_MODE_P (TYPE_MODE (vectype))) + { + /* If the input and result modes are the same, a different optab + is needed where we pass in the number of units in vectype. 
*/ + optab1 = vec_unpacks_sbool_lo_optab; + optab2 = vec_unpacks_sbool_hi_optab; + } + else + { + optab1 = optab_for_tree_code (c1, vectype, optab_default); + optab2 = optab_for_tree_code (c2, vectype, optab_default); + } + *code1 = c1; + *code2 = c2; } if (!optab1 || !optab2) return false; - vec_mode = TYPE_MODE (vectype); if ((icode1 = optab_handler (optab1, vec_mode)) == CODE_FOR_nothing || (icode2 = optab_handler (optab2, vec_mode)) == CODE_FOR_nothing) return false; - if (code.is_tree_code ()) - { - *code1 = c1; - *code2 = c2; - } - if (insn_data[icode1].operand[0].mode == TYPE_MODE (wide_vectype) && insn_data[icode2].operand[0].mode == TYPE_MODE (wide_vectype)) diff --git a/gcc/tree.def b/gcc/tree.def index 90ceeec0b512bfa5f983359c0af03cc71de32007..b37b0b35927b92a6536e5c2d9805ffce8319a240 100644 --- a/gcc/tree.def +++ b/gcc/tree.def @@ -1374,15 +1374,16 @@ DEFTREECODE (DOT_PROD_EXPR, "dot_prod_expr", tcc_expression, 3) DEFTREECODE (WIDEN_SUM_EXPR, "widen_sum_expr", tcc_binary, 2) /* Widening sad (sum of absolute differences). - The first two arguments are of type t1 which should be integer. - The third argument and the result are of type t2, such that t2 is at least - twice the size of t1. Like DOT_PROD_EXPR, SAD_EXPR (arg1,arg2,arg3) is + The first two arguments are of type t1 which should be a vector of integers. + The third argument and the result are of type t2, such that the size of + the elements of t2 is at least twice the size of the elements of t1. + Like DOT_PROD_EXPR, SAD_EXPR (arg1,arg2,arg3) is equivalent to: - tmp = WIDEN_MINUS_EXPR (arg1, arg2) + tmp = IFN_VEC_WIDEN_MINUS (arg1, arg2) tmp2 = ABS_EXPR (tmp) arg3 = PLUS_EXPR (tmp2, arg3) or: - tmp = WIDEN_MINUS_EXPR (arg1, arg2) + tmp = IFN_VEC_WIDEN_MINUS (arg1, arg2) tmp2 = ABS_EXPR (tmp) arg3 = WIDEN_SUM_EXPR (tmp2, arg3) */ ^ permalink raw reply [flat|nested] 53+ messages in thread
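For readers following the vect_recog_sad_pattern change above, the scalar shape that pattern matches looks roughly like this (an illustrative sketch, not code from the patch; the function name and types are invented):

```c
#include <stdint.h>
#include <stdlib.h>

/* Sum of absolute differences over n bytes.  The 8-bit subtraction is
   the step the SAD_EXPR expansion models with a widening minus, the
   abs() is the ABS_EXPR, and the accumulation is the PLUS_EXPR or
   WIDEN_SUM_EXPR.  */
int
sad8 (const uint8_t *a, const uint8_t *b, int n)
{
  int sum = 0;
  for (int i = 0; i < n; i++)
    sum += abs (a[i] - b[i]);
  return sum;
}
```

On a target that provides the widening-minus optabs, a loop of this shape would be expected to vectorize via the SAD pattern rather than as separate widen/abs/add steps.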
* Re: [PATCH 2/3] Refactor widen_plus as internal_fn 2023-05-18 17:15 ` Andre Vieira (lists) @ 2023-05-22 13:06 ` Richard Biener 2023-06-01 16:27 ` Andre Vieira (lists) 0 siblings, 1 reply; 53+ messages in thread From: Richard Biener @ 2023-05-22 13:06 UTC (permalink / raw) To: Andre Vieira (lists); +Cc: Richard Sandiford, Richard Biener, gcc-patches On Thu, 18 May 2023, Andre Vieira (lists) wrote: > How about this? > > Not sure about the DEF_INTERNAL documentation I rewrote in internal-fn.def, > was struggling to word these, so improvements welcome! The even/odd variant optabs are also commutative_optab_p, so is the vec_widen_sadd without hi/lo or even/odd. +/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */ do you really want -all? I think you want -details + else if (widening_fn_p (ifn) + || narrowing_fn_p (ifn)) + { + tree lhs = gimple_get_lhs (stmt); + if (!lhs) + { + error ("vector IFN call with no lhs"); + debug_generic_stmt (fn); that's an error because ...? Maybe we want to verify this for all ECF_CONST|ECF_NOTHROW (or pure instead of const) internal function calls, but I wouldn't add any verification as part of this patch (not special to widening/narrowing fns either). if (gimple_call_internal_p (stmt)) - return 0; + { + internal_fn fn = gimple_call_internal_fn (stmt); + switch (fn) + { + case IFN_VEC_WIDEN_PLUS_HI: + case IFN_VEC_WIDEN_PLUS_LO: + case IFN_VEC_WIDEN_MINUS_HI: + case IFN_VEC_WIDEN_MINUS_LO: + return 1; this now looks incomplete. I think that we want instead to have a default: returning 1 and then special-cases we want to cost as zero. Not sure which - maybe blame tells why this was added? I think we can deal with this as followup (likewise the ranger additions). Otherwise looks good to me. Thanks, Richard. 
> gcc/ChangeLog: > > 2023-04-25 Andre Vieira <andre.simoesdiasvieira@arm.com> > Joel Hutton <joel.hutton@arm.com> > Tamar Christina <tamar.christina@arm.com> > > * config/aarch64/aarch64-simd.md (vec_widen_<su>addl_lo_<mode>): > Rename > this ... > (vec_widen_<su>add_lo_<mode>): ... to this. > (vec_widen_<su>addl_hi_<mode>): Rename this ... > (vec_widen_<su>add_hi_<mode>): ... to this. > (vec_widen_<su>subl_lo_<mode>): Rename this ... > (vec_widen_<su>sub_lo_<mode>): ... to this. > (vec_widen_<su>subl_hi_<mode>): Rename this ... > (vec_widen_<su>sub_hi_<mode>): ...to this. > * doc/generic.texi: Document new IFN codes. > * internal-fn.cc (ifn_cmp): Function to compare ifn's for > sorting/searching. > (lookup_hilo_internal_fn): Add lookup function. > (commutative_binary_fn_p): Add widen_plus fn's. > (widening_fn_p): New function. > (narrowing_fn_p): New function. > (direct_internal_fn_optab): Change visibility. > * internal-fn.def (DEF_INTERNAL_WIDENING_OPTAB_FN): Macro to define an > internal_fn that expands into multiple internal_fns for widening. > (DEF_INTERNAL_NARROWING_OPTAB_FN): Likewise but for narrowing. > (IFN_VEC_WIDEN_PLUS, IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO, > IFN_VEC_WIDEN_PLUS_EVEN, IFN_VEC_WIDEN_PLUS_ODD, > IFN_VEC_WIDEN_MINUS, IFN_VEC_WIDEN_MINUS_HI, > IFN_VEC_WIDEN_MINUS_LO, > IFN_VEC_WIDEN_MINUS_ODD, IFN_VEC_WIDEN_MINUS_EVEN): Define widening > plus,minus functions. > * internal-fn.h (direct_internal_fn_optab): Declare new prototype. > (lookup_hilo_internal_fn): Likewise. > (widening_fn_p): Likewise. > (Narrowing_fn_p): Likewise. > * optabs.cc (commutative_optab_p): Add widening plus optabs. > * optabs.def (OPTAB_D): Define widen add, sub optabs. > * tree-cfg.cc (verify_gimple_call): Add checks for widening ifns. > * tree-inline.cc (estimate_num_insns): Return same > cost for widen add and sub IFNs as previous tree_codes. > * tree-vect-patterns.cc (vect_recog_widen_op_pattern): Support > patterns with a hi/lo or even/odd split. 
> (vect_recog_sad_pattern): Refactor to use new IFN codes. > (vect_recog_widen_plus_pattern): Likewise. > (vect_recog_widen_minus_pattern): Likewise. > (vect_recog_average_pattern): Likewise. > * tree-vect-stmts.cc (vectorizable_conversion): Add support for > _HILO IFNs. > (supportable_widening_operation): Likewise. > * tree.def (WIDEN_SUM_EXPR): Update example to use new IFNs. > > gcc/testsuite/ChangeLog: > > * gcc.target/aarch64/vect-widen-add.c: Test that new > IFN_VEC_WIDEN_PLUS is being used. > * gcc.target/aarch64/vect-widen-sub.c: Test that new > IFN_VEC_WIDEN_MINUS is being used. > -- Richard Biener <rguenther@suse.de> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman; HRB 36809 (AG Nuernberg) ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 2/3] Refactor widen_plus as internal_fn 2023-05-22 13:06 ` Richard Biener @ 2023-06-01 16:27 ` Andre Vieira (lists) 2023-06-02 12:00 ` Richard Sandiford 2023-06-06 19:00 ` Jakub Jelinek 0 siblings, 2 replies; 53+ messages in thread From: Andre Vieira (lists) @ 2023-06-01 16:27 UTC (permalink / raw) To: Richard Biener; +Cc: Richard Sandiford, Richard Biener, gcc-patches [-- Attachment #1: Type: text/plain, Size: 8566 bytes --] Hi, This is the updated patch and cover letter. Patches for the inline and gimple-op changes will follow soon. DEF_INTERNAL_WIDENING_OPTAB_FN and DEF_INTERNAL_NARROWING_OPTAB_FN are like DEF_INTERNAL_SIGNED_OPTAB_FN and DEF_INTERNAL_OPTAB_FN respectively, except that they provide convenience wrappers for a single vector-to-vector conversion, a hi/lo split or an even/odd split. Each definition for <NAME> requires either a pair of signed and unsigned optabs named <SOPTAB> and <UOPTAB> (for widening) or a single <OPTAB> (for narrowing) for each of the five functions it creates. For example, for widening addition DEF_INTERNAL_WIDENING_OPTAB_FN will create five internal functions: IFN_VEC_WIDEN_PLUS, IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO, IFN_VEC_WIDEN_PLUS_EVEN and IFN_VEC_WIDEN_PLUS_ODD. Each requires two optabs, one for signed and one for unsigned. AArch64 implements the hi/lo split optabs: IFN_VEC_WIDEN_PLUS_HI -> vec_widen_<su>add_hi_<mode> -> (u/s)addl2 IFN_VEC_WIDEN_PLUS_LO -> vec_widen_<su>add_lo_<mode> -> (u/s)addl This gives the same functionality as the previous WIDEN_PLUS/WIDEN_MINUS tree codes, which were expanded into VEC_WIDEN_PLUS_LO and VEC_WIDEN_PLUS_HI. gcc/ChangeLog: 2023-04-25 Andre Vieira <andre.simoesdiasvieira@arm.com> Joel Hutton <joel.hutton@arm.com> Tamar Christina <tamar.christina@arm.com> * config/aarch64/aarch64-simd.md (vec_widen_<su>addl_lo_<mode>): Rename this ... (vec_widen_<su>add_lo_<mode>): ... to this. (vec_widen_<su>addl_hi_<mode>): Rename this ... (vec_widen_<su>add_hi_<mode>): ... to this.
(vec_widen_<su>subl_lo_<mode>): Rename this ... (vec_widen_<su>sub_lo_<mode>): ... to this. (vec_widen_<su>subl_hi_<mode>): Rename this ... (vec_widen_<su>sub_hi_<mode>): ... to this. * doc/generic.texi: Document new IFN codes. * internal-fn.cc (ifn_cmp): Function to compare ifn's for sorting/searching. (lookup_hilo_internal_fn): Add lookup function. (commutative_binary_fn_p): Add widen_plus fn's. (widening_fn_p): New function. (narrowing_fn_p): New function. (direct_internal_fn_optab): Change visibility. * internal-fn.def (DEF_INTERNAL_WIDENING_OPTAB_FN): Macro to define an internal_fn that expands into multiple internal_fns for widening. (DEF_INTERNAL_NARROWING_OPTAB_FN): Likewise but for narrowing. (IFN_VEC_WIDEN_PLUS, IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO, IFN_VEC_WIDEN_PLUS_EVEN, IFN_VEC_WIDEN_PLUS_ODD, IFN_VEC_WIDEN_MINUS, IFN_VEC_WIDEN_MINUS_HI, IFN_VEC_WIDEN_MINUS_LO, IFN_VEC_WIDEN_MINUS_ODD, IFN_VEC_WIDEN_MINUS_EVEN): Define widening plus, minus functions. * internal-fn.h (direct_internal_fn_optab): Declare new prototype. (lookup_hilo_internal_fn): Likewise. (widening_fn_p): Likewise. (narrowing_fn_p): Likewise. * optabs.cc (commutative_optab_p): Add widening plus optabs. * optabs.def (OPTAB_D): Define widen add, sub optabs. * tree-vect-patterns.cc (vect_recog_widen_op_pattern): Support patterns with a hi/lo or even/odd split. (vect_recog_sad_pattern): Refactor to use new IFN codes. (vect_recog_widen_plus_pattern): Likewise. (vect_recog_widen_minus_pattern): Likewise. (vect_recog_average_pattern): Likewise. * tree-vect-stmts.cc (vectorizable_conversion): Add support for _HILO IFNs. (supportable_widening_operation): Likewise. * tree.def (WIDEN_SUM_EXPR): Update example to use new IFNs. gcc/testsuite/ChangeLog: * gcc.target/aarch64/vect-widen-add.c: Test that new IFN_VEC_WIDEN_PLUS is being used. * gcc.target/aarch64/vect-widen-sub.c: Test that new IFN_VEC_WIDEN_MINUS is being used.
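As a concrete picture of what the renamed optabs implement, a widening-add loop of the kind the vect-widen-add.c testcase exercises might look like this (a sketch only; the actual testcase in the patch may differ):

```c
#include <stdint.h>

/* Each uint8_t addition produces a uint16_t result.  On AArch64 the
   vectorized loop would be expected to use uaddl for the _LO half of
   the input vectors and uaddl2 for the _HI half.  */
void
widen_add (uint16_t *restrict c, const uint8_t *restrict a,
	   const uint8_t *restrict b, int n)
{
  for (int i = 0; i < n; i++)
    c[i] = (uint16_t) a[i] + b[i];
}
```

The cast to uint16_t before the addition is what makes the operation a genuine widening add rather than an 8-bit add followed by a zero-extension.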
On 22/05/2023 14:06, Richard Biener wrote: > On Thu, 18 May 2023, Andre Vieira (lists) wrote: > >> How about this? >> >> Not sure about the DEF_INTERNAL documentation I rewrote in internal-fn.def, >> was struggling to word these, so improvements welcome! > > The even/odd variant optabs are also commutative_optab_p, so is > the vec_widen_sadd without hi/lo or even/odd. > > +/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */ > > do you really want -all? I think you want -details > > + else if (widening_fn_p (ifn) > + || narrowing_fn_p (ifn)) > + { > + tree lhs = gimple_get_lhs (stmt); > + if (!lhs) > + { > + error ("vector IFN call with no lhs"); > + debug_generic_stmt (fn); > > that's an error because ...? Maybe we want to verify this > for all ECF_CONST|ECF_NOTHROW (or pure instead of const) internal > function calls, but I wouldn't add any verification as part > of this patch (not special to widening/narrowing fns either). > > if (gimple_call_internal_p (stmt)) > - return 0; > + { > + internal_fn fn = gimple_call_internal_fn (stmt); > + switch (fn) > + { > + case IFN_VEC_WIDEN_PLUS_HI: > + case IFN_VEC_WIDEN_PLUS_LO: > + case IFN_VEC_WIDEN_MINUS_HI: > + case IFN_VEC_WIDEN_MINUS_LO: > + return 1; > > this now looks incomplete. I think that we want instead to > have a default: returning 1 and then special-cases we want > to cost as zero. Not sure which - maybe blame tells why > this was added? I think we can deal with this as followup > (likewise the ranger additions). > > Otherwise looks good to me. > > Thanks, > Richard. > >> gcc/ChangeLog: >> >> 2023-04-25 Andre Vieira <andre.simoesdiasvieira@arm.com> >> Joel Hutton <joel.hutton@arm.com> >> Tamar Christina <tamar.christina@arm.com> >> >> * config/aarch64/aarch64-simd.md (vec_widen_<su>addl_lo_<mode>): >> Rename >> this ... >> (vec_widen_<su>add_lo_<mode>): ... to this. >> (vec_widen_<su>addl_hi_<mode>): Rename this ... >> (vec_widen_<su>add_hi_<mode>): ... to this. 
>> (vec_widen_<su>subl_lo_<mode>): Rename this ... >> (vec_widen_<su>sub_lo_<mode>): ... to this. >> (vec_widen_<su>subl_hi_<mode>): Rename this ... >> (vec_widen_<su>sub_hi_<mode>): ...to this. >> * doc/generic.texi: Document new IFN codes. >> * internal-fn.cc (ifn_cmp): Function to compare ifn's for >> sorting/searching. >> (lookup_hilo_internal_fn): Add lookup function. >> (commutative_binary_fn_p): Add widen_plus fn's. >> (widening_fn_p): New function. >> (narrowing_fn_p): New function. >> (direct_internal_fn_optab): Change visibility. >> * internal-fn.def (DEF_INTERNAL_WIDENING_OPTAB_FN): Macro to define an >> internal_fn that expands into multiple internal_fns for widening. >> (DEF_INTERNAL_NARROWING_OPTAB_FN): Likewise but for narrowing. >> (IFN_VEC_WIDEN_PLUS, IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO, >> IFN_VEC_WIDEN_PLUS_EVEN, IFN_VEC_WIDEN_PLUS_ODD, >> IFN_VEC_WIDEN_MINUS, IFN_VEC_WIDEN_MINUS_HI, >> IFN_VEC_WIDEN_MINUS_LO, >> IFN_VEC_WIDEN_MINUS_ODD, IFN_VEC_WIDEN_MINUS_EVEN): Define widening >> plus,minus functions. >> * internal-fn.h (direct_internal_fn_optab): Declare new prototype. >> (lookup_hilo_internal_fn): Likewise. >> (widening_fn_p): Likewise. >> (Narrowing_fn_p): Likewise. >> * optabs.cc (commutative_optab_p): Add widening plus optabs. >> * optabs.def (OPTAB_D): Define widen add, sub optabs. >> * tree-cfg.cc (verify_gimple_call): Add checks for widening ifns. >> * tree-inline.cc (estimate_num_insns): Return same >> cost for widen add and sub IFNs as previous tree_codes. >> * tree-vect-patterns.cc (vect_recog_widen_op_pattern): Support >> patterns with a hi/lo or even/odd split. >> (vect_recog_sad_pattern): Refactor to use new IFN codes. >> (vect_recog_widen_plus_pattern): Likewise. >> (vect_recog_widen_minus_pattern): Likewise. >> (vect_recog_average_pattern): Likewise. >> * tree-vect-stmts.cc (vectorizable_conversion): Add support for >> _HILO IFNs. >> (supportable_widening_operation): Likewise. 
>> * tree.def (WIDEN_SUM_EXPR): Update example to use new IFNs. >> >> gcc/testsuite/ChangeLog: >> >> * gcc.target/aarch64/vect-widen-add.c: Test that new >> IFN_VEC_WIDEN_PLUS is being used. >> * gcc.target/aarch64/vect-widen-sub.c: Test that new >> IFN_VEC_WIDEN_MINUS is being used. >> > [-- Attachment #2: ifn1v5.patch --] [-- Type: text/plain, Size: 34259 bytes --] diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index da9c59e655465a74926b81b95b4ac8c353efb1b7..b404d5cabf9df8ea8c70ea4537deb978d351c51e 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -4626,7 +4626,7 @@ [(set_attr "type" "neon_<ADDSUB:optab>_long")] ) -(define_expand "vec_widen_<su>addl_lo_<mode>" +(define_expand "vec_widen_<su>add_lo_<mode>" [(match_operand:<VWIDE> 0 "register_operand") (ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand")) (ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))] @@ -4638,7 +4638,7 @@ DONE; }) -(define_expand "vec_widen_<su>addl_hi_<mode>" +(define_expand "vec_widen_<su>add_hi_<mode>" [(match_operand:<VWIDE> 0 "register_operand") (ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand")) (ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))] @@ -4650,7 +4650,7 @@ DONE; }) -(define_expand "vec_widen_<su>subl_lo_<mode>" +(define_expand "vec_widen_<su>sub_lo_<mode>" [(match_operand:<VWIDE> 0 "register_operand") (ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand")) (ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))] @@ -4662,7 +4662,7 @@ DONE; }) -(define_expand "vec_widen_<su>subl_hi_<mode>" +(define_expand "vec_widen_<su>sub_hi_<mode>" [(match_operand:<VWIDE> 0 "register_operand") (ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand")) (ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))] diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi index 8b2882da4fe7da07d22b4e5384d049ba7d3907bf..5e36dac2b1a10257616f12cdfb0b12d0f2879ae9 
100644 --- a/gcc/doc/generic.texi +++ b/gcc/doc/generic.texi @@ -1811,10 +1811,16 @@ a value from @code{enum annot_expr_kind}, the third is an @code{INTEGER_CST}. @tindex VEC_RSHIFT_EXPR @tindex VEC_WIDEN_MULT_HI_EXPR @tindex VEC_WIDEN_MULT_LO_EXPR -@tindex VEC_WIDEN_PLUS_HI_EXPR -@tindex VEC_WIDEN_PLUS_LO_EXPR -@tindex VEC_WIDEN_MINUS_HI_EXPR -@tindex VEC_WIDEN_MINUS_LO_EXPR +@tindex IFN_VEC_WIDEN_PLUS +@tindex IFN_VEC_WIDEN_PLUS_HI +@tindex IFN_VEC_WIDEN_PLUS_LO +@tindex IFN_VEC_WIDEN_PLUS_EVEN +@tindex IFN_VEC_WIDEN_PLUS_ODD +@tindex IFN_VEC_WIDEN_MINUS +@tindex IFN_VEC_WIDEN_MINUS_HI +@tindex IFN_VEC_WIDEN_MINUS_LO +@tindex IFN_VEC_WIDEN_MINUS_EVEN +@tindex IFN_VEC_WIDEN_MINUS_ODD @tindex VEC_UNPACK_HI_EXPR @tindex VEC_UNPACK_LO_EXPR @tindex VEC_UNPACK_FLOAT_HI_EXPR @@ -1861,6 +1867,82 @@ vector of @code{N/2} products. In the case of @code{VEC_WIDEN_MULT_LO_EXPR} the low @code{N/2} elements of the two vector are multiplied to produce the vector of @code{N/2} products. +@item IFN_VEC_WIDEN_PLUS +This internal function represents widening vector addition of two input +vectors. Its operands are vectors that contain the same number of elements +(@code{N}) of the same integral type. The result is a vector that contains +the same amount (@code{N}) of elements, of an integral type whose size is twice +as wide, as the input vectors. If the current target does not implement the +corresponding optabs the vectorizer may choose to split it into either a pair +of @code{IFN_VEC_WIDEN_PLUS_HI} and @code{IFN_VEC_WIDEN_PLUS_LO} or +@code{IFN_VEC_WIDEN_PLUS_EVEN} and @code{IFN_VEC_WIDEN_PLUS_ODD}, depending +on what optabs the target implements. + +@item IFN_VEC_WIDEN_PLUS_HI +@itemx IFN_VEC_WIDEN_PLUS_LO +These internal functions represent widening vector addition of the high and low +parts of the two input vectors, respectively. Their operands are vectors that +contain the same number of elements (@code{N}) of the same integral type. 
The +result is a vector that contains half as many elements, of an integral type +whose size is twice as wide. In the case of @code{IFN_VEC_WIDEN_PLUS_HI} the +high @code{N/2} elements of the two vectors are added to produce the vector of +@code{N/2} additions. In the case of @code{IFN_VEC_WIDEN_PLUS_LO} the low +@code{N/2} elements of the two vectors are added to produce the vector of +@code{N/2} additions. + +@item IFN_VEC_WIDEN_PLUS_EVEN +@itemx IFN_VEC_WIDEN_PLUS_ODD +These internal functions represent widening vector addition of the even and odd +elements of the two input vectors, respectively. Their operands are vectors +that contain the same number of elements (@code{N}) of the same integral type. +The result is a vector that contains half as many elements, of an integral type +whose size is twice as wide. In the case of @code{IFN_VEC_WIDEN_PLUS_EVEN} the +even @code{N/2} elements of the two vectors are added to produce the vector of +@code{N/2} additions. In the case of @code{IFN_VEC_WIDEN_PLUS_ODD} the odd +@code{N/2} elements of the two vectors are added to produce the vector of +@code{N/2} additions. + +@item IFN_VEC_WIDEN_MINUS +This internal function represents widening vector subtraction of two input +vectors. Its operands are vectors that contain the same number of elements +(@code{N}) of the same integral type. The result is a vector that contains +the same amount (@code{N}) of elements, of an integral type whose size is twice +as wide, as the input vectors. If the current target does not implement the +corresponding optabs the vectorizer may choose to split it into either a pair +of @code{IFN_VEC_WIDEN_MINUS_HI} and @code{IFN_VEC_WIDEN_MINUS_LO} or +@code{IFN_VEC_WIDEN_MINUS_EVEN} and @code{IFN_VEC_WIDEN_MINUS_ODD}, depending +on what optabs the target implements. 
+ +@item IFN_VEC_WIDEN_MINUS_HI +@itemx IFN_VEC_WIDEN_MINUS_LO +These internal functions represent widening vector subtraction of the high and +low parts of the two input vectors, respectively. Their operands are vectors +that contain the same number of elements (@code{N}) of the same integral type. +The high/low elements of the second vector are subtracted from the high/low +elements of the first. The result is a vector that contains half as many +elements, of an integral type whose size is twice as wide. In the case of +@code{IFN_VEC_WIDEN_MINUS_HI} the high @code{N/2} elements of the second +vector are subtracted from the high @code{N/2} of the first to produce the +vector of @code{N/2} subtractions. In the case of +@code{IFN_VEC_WIDEN_MINUS_LO} the low @code{N/2} elements of the second +vector are subtracted from the low @code{N/2} of the first to produce the +vector of @code{N/2} subtractions. + +@item IFN_VEC_WIDEN_MINUS_EVEN +@itemx IFN_VEC_WIDEN_MINUS_ODD +These internal functions represent widening vector subtraction of the even and +odd parts of the two input vectors, respectively. Their operands are vectors +that contain the same number of elements (@code{N}) of the same integral type. +The even/odd elements of the second vector are subtracted from the even/odd +elements of the first. The result is a vector that contains half as many +elements, of an integral type whose size is twice as wide. In the case of +@code{IFN_VEC_WIDEN_MINUS_EVEN} the even @code{N/2} elements of the second +vector are subtracted from the even @code{N/2} of the first to produce the +vector of @code{N/2} subtractions. In the case of +@code{IFN_VEC_WIDEN_MINUS_ODD} the odd @code{N/2} elements of the second +vector are subtracted from the odd @code{N/2} of the first to produce the +vector of @code{N/2} subtractions. 
+ @item VEC_WIDEN_PLUS_HI_EXPR @itemx VEC_WIDEN_PLUS_LO_EXPR These nodes represent widening vector addition of the high and low parts of diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc index 5c9da73ea11f8060b18dcf513599c9694fa4f2ad..348bee35a35ae4ed9a8652f5349f430c2733e1cb 100644 --- a/gcc/internal-fn.cc +++ b/gcc/internal-fn.cc @@ -90,6 +90,71 @@ lookup_internal_fn (const char *name) return entry ? *entry : IFN_LAST; } +/* Given an internal_fn IFN that is either a widening or narrowing function, return its + corresponding LO and HI internal_fns. */ + +extern void +lookup_hilo_internal_fn (internal_fn ifn, internal_fn *lo, internal_fn *hi) +{ + gcc_assert (widening_fn_p (ifn) || narrowing_fn_p (ifn)); + + switch (ifn) + { + default: + gcc_unreachable (); +#undef DEF_INTERNAL_FN +#undef DEF_INTERNAL_WIDENING_OPTAB_FN +#undef DEF_INTERNAL_NARROWING_OPTAB_FN +#define DEF_INTERNAL_FN(NAME, FLAGS, TYPE) +#define DEF_INTERNAL_WIDENING_OPTAB_FN(NAME, F, S, SO, UO, T) \ + case IFN_##NAME: \ + *lo = internal_fn (IFN_##NAME##_LO); \ + *hi = internal_fn (IFN_##NAME##_HI); \ + break; +#define DEF_INTERNAL_NARROWING_OPTAB_FN(NAME, F, O, T) \ + case IFN_##NAME: \ + *lo = internal_fn (IFN_##NAME##_LO); \ + *hi = internal_fn (IFN_##NAME##_HI); \ + break; +#include "internal-fn.def" +#undef DEF_INTERNAL_FN +#undef DEF_INTERNAL_WIDENING_OPTAB_FN +#undef DEF_INTERNAL_NARROWING_OPTAB_FN + } +} + +extern void +lookup_evenodd_internal_fn (internal_fn ifn, internal_fn *even, + internal_fn *odd) +{ + gcc_assert (widening_fn_p (ifn) || narrowing_fn_p (ifn)); + + switch (ifn) + { + default: + gcc_unreachable (); +#undef DEF_INTERNAL_FN +#undef DEF_INTERNAL_WIDENING_OPTAB_FN +#undef DEF_INTERNAL_NARROWING_OPTAB_FN +#define DEF_INTERNAL_FN(NAME, FLAGS, TYPE) +#define DEF_INTERNAL_WIDENING_OPTAB_FN(NAME, F, S, SO, UO, T) \ + case IFN_##NAME: \ + *even = internal_fn (IFN_##NAME##_EVEN); \ + *odd = internal_fn (IFN_##NAME##_ODD); \ + break; +#define DEF_INTERNAL_NARROWING_OPTAB_FN(NAME, F, 
O, T) \ + case IFN_##NAME: \ + *even = internal_fn (IFN_##NAME##_EVEN); \ + *odd = internal_fn (IFN_##NAME##_ODD); \ + break; +#include "internal-fn.def" +#undef DEF_INTERNAL_FN +#undef DEF_INTERNAL_WIDENING_OPTAB_FN +#undef DEF_INTERNAL_NARROWING_OPTAB_FN + } +} + + /* Fnspec of each internal function, indexed by function number. */ const_tree internal_fn_fnspec_array[IFN_LAST + 1]; @@ -3852,7 +3917,7 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types, /* Return the optab used by internal function FN. */ -static optab +optab direct_internal_fn_optab (internal_fn fn, tree_pair types) { switch (fn) @@ -3971,6 +4036,9 @@ commutative_binary_fn_p (internal_fn fn) case IFN_UBSAN_CHECK_MUL: case IFN_ADD_OVERFLOW: case IFN_MUL_OVERFLOW: + case IFN_VEC_WIDEN_PLUS: + case IFN_VEC_WIDEN_PLUS_LO: + case IFN_VEC_WIDEN_PLUS_HI: return true; default: @@ -4044,6 +4112,68 @@ first_commutative_argument (internal_fn fn) } } +/* Return true if this CODE describes an internal_fn that returns a vector with + elements twice as wide as the element size of the input vectors. */ + +bool +widening_fn_p (code_helper code) +{ + if (!code.is_fn_code ()) + return false; + + if (!internal_fn_p ((combined_fn) code)) + return false; + + internal_fn fn = as_internal_fn ((combined_fn) code); + switch (fn) + { + #undef DEF_INTERNAL_WIDENING_OPTAB_FN + #define DEF_INTERNAL_WIDENING_OPTAB_FN(NAME, F, S, SO, UO, T) \ + case IFN_##NAME: \ + case IFN_##NAME##_HI: \ + case IFN_##NAME##_LO: \ + case IFN_##NAME##_EVEN: \ + case IFN_##NAME##_ODD: \ + return true; + #include "internal-fn.def" + #undef DEF_INTERNAL_WIDENING_OPTAB_FN + + default: + return false; + } +} + +/* Return true if this CODE describes an internal_fn that returns a vector with + elements twice as narrow as the element size of the input vectors. 
*/ + +bool +narrowing_fn_p (code_helper code) +{ + if (!code.is_fn_code ()) + return false; + + if (!internal_fn_p ((combined_fn) code)) + return false; + + internal_fn fn = as_internal_fn ((combined_fn) code); + switch (fn) + { + #undef DEF_INTERNAL_NARROWING_OPTAB_FN + #define DEF_INTERNAL_NARROWING_OPTAB_FN(NAME, F, O, T) \ + case IFN_##NAME: \ + case IFN_##NAME##_HI: \ + case IFN_##NAME##_LO: \ + case IFN_##NAME##_EVEN: \ + case IFN_##NAME##_ODD: \ + return true; + #include "internal-fn.def" + #undef DEF_INTERNAL_NARROWING_OPTAB_FN + + default: + return false; + } +} + /* Return true if IFN_SET_EDOM is supported. */ bool @@ -4072,6 +4202,8 @@ set_edom_supported_p (void) expand_##TYPE##_optab_fn (fn, stmt, which_optab); \ } #include "internal-fn.def" +#undef DEF_INTERNAL_OPTAB_FN +#undef DEF_INTERNAL_SIGNED_OPTAB_FN /* Routines to expand each internal function, indexed by function number. Each routine has the prototype: @@ -4080,6 +4212,7 @@ set_edom_supported_p (void) where STMT is the statement that performs the call. */ static void (*const internal_fn_expanders[]) (internal_fn, gcall *) = { + #define DEF_INTERNAL_FN(CODE, FLAGS, FNSPEC) expand_##CODE, #include "internal-fn.def" 0 diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def index 7fe742c2ae713e7152ab05cfdfba86e4e0aa3456..e9edaa201ad4ad171a49119efa9d6bff49add9f4 100644 --- a/gcc/internal-fn.def +++ b/gcc/internal-fn.def @@ -85,6 +85,34 @@ along with GCC; see the file COPYING3. If not see says that the function extends the C-level BUILT_IN_<NAME>{,L,LL,IMAX} group of functions to any integral mode (including vector modes).
+ DEF_INTERNAL_WIDENING_OPTAB_FN is a wrapper that defines five internal + functions with DEF_INTERNAL_SIGNED_OPTAB_FN: + - one that describes a widening operation with the same number of elements + in the output and input vectors, + - two that describe a pair of high-low widening operations where the output + vectors each have half the number of elements of the input vectors, + corresponding to the result of the widening operation on the top half and + bottom half, these have the suffixes _HI and _LO, + - and two that describe a pair of even-odd widening operations where the + output vectors each have half the number of elements of the input vectors, + corresponding to the result of the widening operation on the even and odd + elements, these have the suffixes _EVEN and _ODD. + These five internal functions will require two optabs each, a SIGNED_OPTAB + and an UNSIGNED_OTPAB. + + DEF_INTERNAL_NARROWING_OPTAB_FN is a wrapper that defines five internal + functions with DEF_INTERNAL_OPTAB_FN: + - one that describes a narrowing operation with the same number of elements + in the output and input vectors, + - two that describe a pair of high-low narrowing operations where the output + vector has the same number of elements in the top or bottom halves as the + full input vectors, these have the suffixes _HI and _LO. + - and two that describe a pair of even-odd narrowing operations where the + output vector has the same number of elements, in the even or odd positions, + as the full input vectors, these have the suffixes _EVEN and _ODD. + These five internal functions will require an optab each. + + Each entry must have a corresponding expander of the form: void expand_NAME (gimple_call stmt) @@ -123,6 +151,24 @@ along with GCC; see the file COPYING3. 
   If not see

   DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE)
 #endif

+#ifndef DEF_INTERNAL_WIDENING_OPTAB_FN
+#define DEF_INTERNAL_WIDENING_OPTAB_FN(NAME, FLAGS, SELECTOR, SOPTAB, UOPTAB, TYPE) \
+  DEF_INTERNAL_SIGNED_OPTAB_FN (NAME, FLAGS, SELECTOR, SOPTAB, UOPTAB, TYPE) \
+  DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _LO, FLAGS, SELECTOR, SOPTAB##_lo, UOPTAB##_lo, TYPE) \
+  DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _HI, FLAGS, SELECTOR, SOPTAB##_hi, UOPTAB##_hi, TYPE) \
+  DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _EVEN, FLAGS, SELECTOR, SOPTAB##_even, UOPTAB##_even, TYPE) \
+  DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _ODD, FLAGS, SELECTOR, SOPTAB##_odd, UOPTAB##_odd, TYPE)
+#endif
+
+#ifndef DEF_INTERNAL_NARROWING_OPTAB_FN
+#define DEF_INTERNAL_NARROWING_OPTAB_FN(NAME, FLAGS, OPTAB, TYPE) \
+  DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE) \
+  DEF_INTERNAL_OPTAB_FN (NAME ## _LO, FLAGS, OPTAB##_lo, TYPE) \
+  DEF_INTERNAL_OPTAB_FN (NAME ## _HI, FLAGS, OPTAB##_hi, TYPE) \
+  DEF_INTERNAL_OPTAB_FN (NAME ## _EVEN, FLAGS, OPTAB##_even, TYPE) \
+  DEF_INTERNAL_OPTAB_FN (NAME ## _ODD, FLAGS, OPTAB##_odd, TYPE)
+#endif
+
 DEF_INTERNAL_OPTAB_FN (MASK_LOAD, ECF_PURE, maskload, mask_load)
 DEF_INTERNAL_OPTAB_FN (LOAD_LANES, ECF_CONST, vec_load_lanes, load_lanes)
 DEF_INTERNAL_OPTAB_FN (MASK_LOAD_LANES, ECF_PURE,
@@ -315,6 +361,16 @@ DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT270, ECF_CONST, cadd270, binary)
 DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL, ECF_CONST, cmul, binary)
 DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL_CONJ, ECF_CONST, cmul_conj, binary)
 DEF_INTERNAL_OPTAB_FN (VEC_ADDSUB, ECF_CONST, vec_addsub, binary)
+DEF_INTERNAL_WIDENING_OPTAB_FN (VEC_WIDEN_PLUS,
+				ECF_CONST | ECF_NOTHROW,
+				first,
+				vec_widen_sadd, vec_widen_uadd,
+				binary)
+DEF_INTERNAL_WIDENING_OPTAB_FN (VEC_WIDEN_MINUS,
+				ECF_CONST | ECF_NOTHROW,
+				first,
+				vec_widen_ssub, vec_widen_usub,
+				binary)
 DEF_INTERNAL_OPTAB_FN (VEC_FMADDSUB, ECF_CONST, vec_fmaddsub, ternary)
 DEF_INTERNAL_OPTAB_FN (VEC_FMSUBADD, ECF_CONST, vec_fmsubadd, ternary)
diff
--git a/gcc/internal-fn.h b/gcc/internal-fn.h index 08922ed4254898f5fffca3f33973e96ed9ce772f..3904ba3ca36949d844532a6a9303f550533311a4 100644 --- a/gcc/internal-fn.h +++ b/gcc/internal-fn.h @@ -20,6 +20,10 @@ along with GCC; see the file COPYING3. If not see #ifndef GCC_INTERNAL_FN_H #define GCC_INTERNAL_FN_H +#include "insn-codes.h" +#include "insn-opinit.h" + + /* INTEGER_CST values for IFN_UNIQUE function arg-0. UNSPEC: Undifferentiated UNIQUE. @@ -112,6 +116,10 @@ internal_fn_name (enum internal_fn fn) } extern internal_fn lookup_internal_fn (const char *); +extern void lookup_hilo_internal_fn (internal_fn, internal_fn *, internal_fn *); +extern void lookup_evenodd_internal_fn (internal_fn, internal_fn *, + internal_fn *); +extern optab direct_internal_fn_optab (internal_fn, tree_pair); /* Return the ECF_* flags for function FN. */ @@ -210,6 +218,8 @@ extern bool commutative_binary_fn_p (internal_fn); extern bool commutative_ternary_fn_p (internal_fn); extern int first_commutative_argument (internal_fn); extern bool associative_binary_fn_p (internal_fn); +extern bool widening_fn_p (code_helper); +extern bool narrowing_fn_p (code_helper); extern bool set_edom_supported_p (void); diff --git a/gcc/optabs.cc b/gcc/optabs.cc index a12333c7169fc6219b0e34b6169780f78e033ee3..aab6ab6faf244a8236dac81be2d68fc28819bc9a 100644 --- a/gcc/optabs.cc +++ b/gcc/optabs.cc @@ -1314,7 +1314,17 @@ commutative_optab_p (optab binoptab) || binoptab == smul_widen_optab || binoptab == umul_widen_optab || binoptab == smul_highpart_optab - || binoptab == umul_highpart_optab); + || binoptab == umul_highpart_optab + || binoptab == vec_widen_sadd_optab + || binoptab == vec_widen_uadd_optab + || binoptab == vec_widen_sadd_hi_optab + || binoptab == vec_widen_sadd_lo_optab + || binoptab == vec_widen_uadd_hi_optab + || binoptab == vec_widen_uadd_lo_optab + || binoptab == vec_widen_sadd_even_optab + || binoptab == vec_widen_sadd_odd_optab + || binoptab == vec_widen_uadd_even_optab + || binoptab == 
vec_widen_uadd_odd_optab); } /* X is to be used in mode MODE as operand OPN to BINOPTAB. If we're diff --git a/gcc/optabs.def b/gcc/optabs.def index 695f5911b300c9ca5737de9be809fa01aabe5e01..d41ed6e1afaddd019c7470f965c0ad21c8b2b9d7 100644 --- a/gcc/optabs.def +++ b/gcc/optabs.def @@ -410,6 +410,16 @@ OPTAB_D (vec_widen_ssubl_hi_optab, "vec_widen_ssubl_hi_$a") OPTAB_D (vec_widen_ssubl_lo_optab, "vec_widen_ssubl_lo_$a") OPTAB_D (vec_widen_saddl_hi_optab, "vec_widen_saddl_hi_$a") OPTAB_D (vec_widen_saddl_lo_optab, "vec_widen_saddl_lo_$a") +OPTAB_D (vec_widen_ssub_optab, "vec_widen_ssub_$a") +OPTAB_D (vec_widen_ssub_hi_optab, "vec_widen_ssub_hi_$a") +OPTAB_D (vec_widen_ssub_lo_optab, "vec_widen_ssub_lo_$a") +OPTAB_D (vec_widen_ssub_odd_optab, "vec_widen_ssub_odd_$a") +OPTAB_D (vec_widen_ssub_even_optab, "vec_widen_ssub_even_$a") +OPTAB_D (vec_widen_sadd_optab, "vec_widen_sadd_$a") +OPTAB_D (vec_widen_sadd_hi_optab, "vec_widen_sadd_hi_$a") +OPTAB_D (vec_widen_sadd_lo_optab, "vec_widen_sadd_lo_$a") +OPTAB_D (vec_widen_sadd_odd_optab, "vec_widen_sadd_odd_$a") +OPTAB_D (vec_widen_sadd_even_optab, "vec_widen_sadd_even_$a") OPTAB_D (vec_widen_sshiftl_hi_optab, "vec_widen_sshiftl_hi_$a") OPTAB_D (vec_widen_sshiftl_lo_optab, "vec_widen_sshiftl_lo_$a") OPTAB_D (vec_widen_umult_even_optab, "vec_widen_umult_even_$a") @@ -422,6 +432,16 @@ OPTAB_D (vec_widen_usubl_hi_optab, "vec_widen_usubl_hi_$a") OPTAB_D (vec_widen_usubl_lo_optab, "vec_widen_usubl_lo_$a") OPTAB_D (vec_widen_uaddl_hi_optab, "vec_widen_uaddl_hi_$a") OPTAB_D (vec_widen_uaddl_lo_optab, "vec_widen_uaddl_lo_$a") +OPTAB_D (vec_widen_usub_optab, "vec_widen_usub_$a") +OPTAB_D (vec_widen_usub_hi_optab, "vec_widen_usub_hi_$a") +OPTAB_D (vec_widen_usub_lo_optab, "vec_widen_usub_lo_$a") +OPTAB_D (vec_widen_usub_odd_optab, "vec_widen_usub_odd_$a") +OPTAB_D (vec_widen_usub_even_optab, "vec_widen_usub_even_$a") +OPTAB_D (vec_widen_uadd_optab, "vec_widen_uadd_$a") +OPTAB_D (vec_widen_uadd_hi_optab, "vec_widen_uadd_hi_$a") 
+OPTAB_D (vec_widen_uadd_lo_optab, "vec_widen_uadd_lo_$a") +OPTAB_D (vec_widen_uadd_odd_optab, "vec_widen_uadd_odd_$a") +OPTAB_D (vec_widen_uadd_even_optab, "vec_widen_uadd_even_$a") OPTAB_D (vec_addsub_optab, "vec_addsub$a3") OPTAB_D (vec_fmaddsub_optab, "vec_fmaddsub$a4") OPTAB_D (vec_fmsubadd_optab, "vec_fmsubadd$a4") diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c index 220bd9352a4c7acd2e3713e441d74898d3e92b30..b5a73867e44ec3fa04d1201decf81353a67b4c82 100644 --- a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c +++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c @@ -1,5 +1,5 @@ /* { dg-do run } */ -/* { dg-options "-O3 -save-temps" } */ +/* { dg-options "-O3 -save-temps -fdump-tree-vect-details" } */ #include <stdint.h> #include <string.h> @@ -86,6 +86,8 @@ main() return 0; } +/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_LO" "vect" } } */ +/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_HI" "vect" } } */ /* { dg-final { scan-assembler-times {\tuaddl\t} 1} } */ /* { dg-final { scan-assembler-times {\tuaddl2\t} 1} } */ /* { dg-final { scan-assembler-times {\tsaddl\t} 1} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c index a2bed63affbd091977df95a126da1f5b8c1d41d2..1686c3f2f344c367ebb9cf34e558d0878849f9bc 100644 --- a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c +++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c @@ -1,5 +1,5 @@ /* { dg-do run } */ -/* { dg-options "-O3 -save-temps" } */ +/* { dg-options "-O3 -save-temps -fdump-tree-vect-details" } */ #include <stdint.h> #include <string.h> @@ -86,6 +86,8 @@ main() return 0; } +/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_LO" "vect" } } */ +/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_HI" "vect" } } */ /* { dg-final { scan-assembler-times {\tusubl\t} 1} } */ /* { dg-final { scan-assembler-times 
{\tusubl2\t} 1} } */ /* { dg-final { scan-assembler-times {\tssubl\t} 1} } */ diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc index 1778af0242898e3dc73d94d22a5b8505628a53b5..dcd4b5561600346a2c10bd5133507329206e8837 100644 --- a/gcc/tree-vect-patterns.cc +++ b/gcc/tree-vect-patterns.cc @@ -562,21 +562,30 @@ vect_joust_widened_type (tree type, tree new_type, tree *common_type) static unsigned int vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code, - tree_code widened_code, bool shift_p, + code_helper widened_code, bool shift_p, unsigned int max_nops, vect_unpromoted_value *unprom, tree *common_type, enum optab_subtype *subtype = NULL) { /* Check for an integer operation with the right code. */ - gassign *assign = dyn_cast <gassign *> (stmt_info->stmt); - if (!assign) + gimple* stmt = stmt_info->stmt; + if (!(is_gimple_assign (stmt) || is_gimple_call (stmt))) + return 0; + + code_helper rhs_code; + if (is_gimple_assign (stmt)) + rhs_code = gimple_assign_rhs_code (stmt); + else if (is_gimple_call (stmt)) + rhs_code = gimple_call_combined_fn (stmt); + else return 0; - tree_code rhs_code = gimple_assign_rhs_code (assign); - if (rhs_code != code && rhs_code != widened_code) + if (rhs_code != code + && rhs_code != widened_code) return 0; - tree type = TREE_TYPE (gimple_assign_lhs (assign)); + tree lhs = gimple_get_lhs (stmt); + tree type = TREE_TYPE (lhs); if (!INTEGRAL_TYPE_P (type)) return 0; @@ -589,7 +598,7 @@ vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code, { vect_unpromoted_value *this_unprom = &unprom[next_op]; unsigned int nops = 1; - tree op = gimple_op (assign, i + 1); + tree op = gimple_arg (stmt, i); if (i == 1 && TREE_CODE (op) == INTEGER_CST) { /* We already have a common type from earlier operands. @@ -1343,7 +1352,8 @@ vect_recog_sad_pattern (vec_info *vinfo, /* FORNOW. 
Can continue analyzing the def-use chain when this stmt in a phi inside the loop (in case we are analyzing an outer-loop). */ vect_unpromoted_value unprom[2]; - if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR, WIDEN_MINUS_EXPR, + if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR, + IFN_VEC_WIDEN_MINUS, false, 2, unprom, &half_type)) return NULL; @@ -1395,14 +1405,16 @@ static gimple * vect_recog_widen_op_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info, tree *type_out, tree_code orig_code, code_helper wide_code, - bool shift_p, const char *name) + bool shift_p, const char *name, + optab_subtype *subtype = NULL) { gimple *last_stmt = last_stmt_info->stmt; vect_unpromoted_value unprom[2]; tree half_type; if (!vect_widened_op_tree (vinfo, last_stmt_info, orig_code, orig_code, - shift_p, 2, unprom, &half_type)) + shift_p, 2, unprom, &half_type, subtype)) + return NULL; /* Pattern detected. */ @@ -1468,6 +1480,20 @@ vect_recog_widen_op_pattern (vec_info *vinfo, type, pattern_stmt, vecctype); } +static gimple * +vect_recog_widen_op_pattern (vec_info *vinfo, + stmt_vec_info last_stmt_info, tree *type_out, + tree_code orig_code, internal_fn wide_ifn, + bool shift_p, const char *name, + optab_subtype *subtype = NULL) +{ + combined_fn ifn = as_combined_fn (wide_ifn); + return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out, + orig_code, ifn, shift_p, name, + subtype); +} + + /* Try to detect multiplication on widened inputs, converting MULT_EXPR to WIDEN_MULT_EXPR. See vect_recog_widen_op_pattern for details. */ @@ -1481,26 +1507,30 @@ vect_recog_widen_mult_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info, } /* Try to detect addition on widened inputs, converting PLUS_EXPR - to WIDEN_PLUS_EXPR. See vect_recog_widen_op_pattern for details. */ + to IFN_VEC_WIDEN_PLUS. See vect_recog_widen_op_pattern for details. 
*/ static gimple * vect_recog_widen_plus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info, tree *type_out) { + optab_subtype subtype; return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out, - PLUS_EXPR, WIDEN_PLUS_EXPR, false, - "vect_recog_widen_plus_pattern"); + PLUS_EXPR, IFN_VEC_WIDEN_PLUS, + false, "vect_recog_widen_plus_pattern", + &subtype); } /* Try to detect subtraction on widened inputs, converting MINUS_EXPR - to WIDEN_MINUS_EXPR. See vect_recog_widen_op_pattern for details. */ + to IFN_VEC_WIDEN_MINUS. See vect_recog_widen_op_pattern for details. */ static gimple * vect_recog_widen_minus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info, tree *type_out) { + optab_subtype subtype; return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out, - MINUS_EXPR, WIDEN_MINUS_EXPR, false, - "vect_recog_widen_minus_pattern"); + MINUS_EXPR, IFN_VEC_WIDEN_MINUS, + false, "vect_recog_widen_minus_pattern", + &subtype); } /* Function vect_recog_ctz_ffs_pattern @@ -3078,7 +3108,7 @@ vect_recog_average_pattern (vec_info *vinfo, vect_unpromoted_value unprom[3]; tree new_type; unsigned int nops = vect_widened_op_tree (vinfo, plus_stmt_info, PLUS_EXPR, - WIDEN_PLUS_EXPR, false, 3, + IFN_VEC_WIDEN_PLUS, false, 3, unprom, &new_type); if (nops == 0) return NULL; @@ -6469,6 +6499,7 @@ static vect_recog_func vect_vect_recog_func_ptrs[] = { { vect_recog_mask_conversion_pattern, "mask_conversion" }, { vect_recog_widen_plus_pattern, "widen_plus" }, { vect_recog_widen_minus_pattern, "widen_minus" }, + /* These must come after the double widening ones. 
*/ }; const unsigned int NUM_PATTERNS = ARRAY_SIZE (vect_vect_recog_func_ptrs); diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index d73e7f0936435951fe05fa6b787ba053233635aa..4f1569023a4e42ad6d058bccf62687dc3fe1302e 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -5038,7 +5038,8 @@ vectorizable_conversion (vec_info *vinfo, bool widen_arith = (code == WIDEN_PLUS_EXPR || code == WIDEN_MINUS_EXPR || code == WIDEN_MULT_EXPR - || code == WIDEN_LSHIFT_EXPR); + || code == WIDEN_LSHIFT_EXPR + || widening_fn_p (code)); if (!widen_arith && !CONVERT_EXPR_CODE_P (code) @@ -5088,8 +5089,8 @@ vectorizable_conversion (vec_info *vinfo, gcc_assert (code == WIDEN_MULT_EXPR || code == WIDEN_LSHIFT_EXPR || code == WIDEN_PLUS_EXPR - || code == WIDEN_MINUS_EXPR); - + || code == WIDEN_MINUS_EXPR + || widening_fn_p (code)); op1 = is_gimple_assign (stmt) ? gimple_assign_rhs2 (stmt) : gimple_call_arg (stmt, 0); @@ -12500,26 +12501,69 @@ supportable_widening_operation (vec_info *vinfo, optab1 = vec_unpacks_sbool_lo_optab; optab2 = vec_unpacks_sbool_hi_optab; } - else - { - optab1 = optab_for_tree_code (c1, vectype, optab_default); - optab2 = optab_for_tree_code (c2, vectype, optab_default); + + vec_mode = TYPE_MODE (vectype); + if (widening_fn_p (code)) + { + /* If this is an internal fn then we must check whether the target + supports either a low-high split or an even-odd split. */ + internal_fn ifn = as_internal_fn ((combined_fn) code); + + internal_fn lo, hi, even, odd; + lookup_hilo_internal_fn (ifn, &lo, &hi); + *code1 = as_combined_fn (lo); + *code2 = as_combined_fn (hi); + optab1 = direct_internal_fn_optab (lo, {vectype, vectype}); + optab2 = direct_internal_fn_optab (hi, {vectype, vectype}); + + /* If we don't support low-high, then check for even-odd. 
*/ + if (!optab1 + || (icode1 = optab_handler (optab1, vec_mode)) == CODE_FOR_nothing + || !optab2 + || (icode2 = optab_handler (optab2, vec_mode)) == CODE_FOR_nothing) + { + lookup_evenodd_internal_fn (ifn, &even, &odd); + *code1 = as_combined_fn (even); + *code2 = as_combined_fn (odd); + optab1 = direct_internal_fn_optab (even, {vectype, vectype}); + optab2 = direct_internal_fn_optab (odd, {vectype, vectype}); + } + } + else if (code.is_tree_code ()) + { + if (code == FIX_TRUNC_EXPR) + { + /* The signedness is determined from output operand. */ + optab1 = optab_for_tree_code (c1, vectype_out, optab_default); + optab2 = optab_for_tree_code (c2, vectype_out, optab_default); + } + else if (CONVERT_EXPR_CODE_P ((tree_code) code.safe_as_tree_code ()) + && VECTOR_BOOLEAN_TYPE_P (wide_vectype) + && VECTOR_BOOLEAN_TYPE_P (vectype) + && TYPE_MODE (wide_vectype) == TYPE_MODE (vectype) + && SCALAR_INT_MODE_P (TYPE_MODE (vectype))) + { + /* If the input and result modes are the same, a different optab + is needed where we pass in the number of units in vectype. 
*/ + optab1 = vec_unpacks_sbool_lo_optab; + optab2 = vec_unpacks_sbool_hi_optab; + } + else + { + optab1 = optab_for_tree_code (c1, vectype, optab_default); + optab2 = optab_for_tree_code (c2, vectype, optab_default); + } + *code1 = c1; + *code2 = c2; } if (!optab1 || !optab2) return false; - vec_mode = TYPE_MODE (vectype); if ((icode1 = optab_handler (optab1, vec_mode)) == CODE_FOR_nothing || (icode2 = optab_handler (optab2, vec_mode)) == CODE_FOR_nothing) return false; - if (code.is_tree_code ()) - { - *code1 = c1; - *code2 = c2; - } - if (insn_data[icode1].operand[0].mode == TYPE_MODE (wide_vectype) && insn_data[icode2].operand[0].mode == TYPE_MODE (wide_vectype)) diff --git a/gcc/tree.def b/gcc/tree.def index 90ceeec0b512bfa5f983359c0af03cc71de32007..b37b0b35927b92a6536e5c2d9805ffce8319a240 100644 --- a/gcc/tree.def +++ b/gcc/tree.def @@ -1374,15 +1374,16 @@ DEFTREECODE (DOT_PROD_EXPR, "dot_prod_expr", tcc_expression, 3) DEFTREECODE (WIDEN_SUM_EXPR, "widen_sum_expr", tcc_binary, 2) /* Widening sad (sum of absolute differences). - The first two arguments are of type t1 which should be integer. - The third argument and the result are of type t2, such that t2 is at least - twice the size of t1. Like DOT_PROD_EXPR, SAD_EXPR (arg1,arg2,arg3) is + The first two arguments are of type t1 which should be a vector of integers. + The third argument and the result are of type t2, such that the size of + the elements of t2 is at least twice the size of the elements of t1. + Like DOT_PROD_EXPR, SAD_EXPR (arg1,arg2,arg3) is equivalent to: - tmp = WIDEN_MINUS_EXPR (arg1, arg2) + tmp = IFN_VEC_WIDEN_MINUS_EXPR (arg1, arg2) tmp2 = ABS_EXPR (tmp) arg3 = PLUS_EXPR (tmp2, arg3) or: - tmp = WIDEN_MINUS_EXPR (arg1, arg2) + tmp = IFN_VEC_WIDEN_MINUS_EXPR (arg1, arg2) tmp2 = ABS_EXPR (tmp) arg3 = WIDEN_SUM_EXPR (tmp2, arg3) */ ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 2/3] Refactor widen_plus as internal_fn 2023-06-01 16:27 ` Andre Vieira (lists) @ 2023-06-02 12:00 ` Richard Sandiford 2023-06-06 19:00 ` Jakub Jelinek 1 sibling, 0 replies; 53+ messages in thread From: Richard Sandiford @ 2023-06-02 12:00 UTC (permalink / raw) To: Andre Vieira (lists); +Cc: Richard Biener, Richard Biener, gcc-patches Just some very minor things. "Andre Vieira (lists)" <andre.simoesdiasvieira@arm.com> writes: > diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc > index 5c9da73ea11f8060b18dcf513599c9694fa4f2ad..348bee35a35ae4ed9a8652f5349f430c2733e1cb 100644 > --- a/gcc/internal-fn.cc > +++ b/gcc/internal-fn.cc > @@ -90,6 +90,71 @@ lookup_internal_fn (const char *name) > return entry ? *entry : IFN_LAST; > } > > +/* Given an internal_fn IFN that is either a widening or narrowing function, return its > + corresponding LO and HI internal_fns. */ Long line and too much space after "/*": /* Given an internal_fn IFN that is either a widening or narrowing function, return its corresponding _LO and _HI internal_fns in *LO and *HI. 
*/ > +extern void > +lookup_hilo_internal_fn (internal_fn ifn, internal_fn *lo, internal_fn *hi) > +{ > + gcc_assert (widening_fn_p (ifn) || narrowing_fn_p (ifn)); > + > + switch (ifn) > + { > + default: > + gcc_unreachable (); > +#undef DEF_INTERNAL_FN > +#undef DEF_INTERNAL_WIDENING_OPTAB_FN > +#undef DEF_INTERNAL_NARROWING_OPTAB_FN > +#define DEF_INTERNAL_FN(NAME, FLAGS, TYPE) > +#define DEF_INTERNAL_WIDENING_OPTAB_FN(NAME, F, S, SO, UO, T) \ > + case IFN_##NAME: \ > + *lo = internal_fn (IFN_##NAME##_LO); \ > + *hi = internal_fn (IFN_##NAME##_HI); \ > + break; > +#define DEF_INTERNAL_NARROWING_OPTAB_FN(NAME, F, O, T) \ > + case IFN_##NAME: \ > + *lo = internal_fn (IFN_##NAME##_LO); \ > + *hi = internal_fn (IFN_##NAME##_HI); \ > + break; > +#include "internal-fn.def" > +#undef DEF_INTERNAL_FN > +#undef DEF_INTERNAL_WIDENING_OPTAB_FN > +#undef DEF_INTERNAL_NARROWING_OPTAB_FN > + } > +} > + > +extern void > +lookup_evenodd_internal_fn (internal_fn ifn, internal_fn *even, > + internal_fn *odd) This needs a similar comment: /* Given an internal_fn IFN that is either a widening or narrowing function, return its corresponding _EVEN and _ODD internal_fns in *EVEN and *ODD. */ > @@ -3971,6 +4036,9 @@ commutative_binary_fn_p (internal_fn fn) > case IFN_UBSAN_CHECK_MUL: > case IFN_ADD_OVERFLOW: > case IFN_MUL_OVERFLOW: > + case IFN_VEC_WIDEN_PLUS: > + case IFN_VEC_WIDEN_PLUS_LO: > + case IFN_VEC_WIDEN_PLUS_HI: Should include even & odd as well. I'd suggest leaving out the narrowing stuff for now. There are some questions that would be easier to answer once we add the first use, such as whether one of the hi/lo pair and one or the even/odd pair merge with a vector containing the other half, whether all four define the other half to be zero, etc. OK for the optab/internal-fn parts with those changes from my POV. Thanks again for doing this! Richard ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 2/3] Refactor widen_plus as internal_fn 2023-06-01 16:27 ` Andre Vieira (lists) 2023-06-02 12:00 ` Richard Sandiford @ 2023-06-06 19:00 ` Jakub Jelinek 2023-06-06 21:28 ` [PATCH] modula2: Fix bootstrap Jakub Jelinek 1 sibling, 1 reply; 53+ messages in thread From: Jakub Jelinek @ 2023-06-06 19:00 UTC (permalink / raw) To: Andre Vieira (lists) Cc: Richard Biener, Richard Sandiford, Richard Biener, gcc-patches On Thu, Jun 01, 2023 at 05:27:56PM +0100, Andre Vieira (lists) via Gcc-patches wrote: > --- a/gcc/internal-fn.h > +++ b/gcc/internal-fn.h > @@ -20,6 +20,10 @@ along with GCC; see the file COPYING3. If not see > #ifndef GCC_INTERNAL_FN_H > #define GCC_INTERNAL_FN_H > > +#include "insn-codes.h" > +#include "insn-opinit.h" My i686-linux build configured with ../configure --enable-languages=default,obj-c++,lto,go,d,rust,m2 --enable-checking=yes,rtl,extra --enable-libstdcxx-backtrace=yes just died with In file included from ../../gcc/m2/gm2-gcc/gcc-consolidation.h:74, from ../../gcc/m2/gm2-gcc/m2except.cc:22: ../../gcc/internal-fn.h:24:10: fatal error: insn-opinit.h: No such file or directory 24 | #include "insn-opinit.h" | ^~~~~~~~~~~~~~~ compilation terminated. In file included from ../../gcc/m2/gm2-gcc/gcc-consolidation.h:74, from ../../gcc/m2/m2pp.cc:23: ../../gcc/internal-fn.h:24:10: fatal error: insn-opinit.h: No such file or directory 24 | #include "insn-opinit.h" | ^~~~~~~~~~~~~~~ In file included from ../../gcc/m2/gm2-gcc/gcc-consolidation.h:74, from ../../gcc/m2/gm2-gcc/rtegraph.cc:22: ../../gcc/internal-fn.h:24:10: fatal error: insn-opinit.h: No such file or directory 24 | #include "insn-opinit.h" | ^~~~~~~~~~~~~~~ compilation terminated. compilation terminated. supposedly because of this change. Do you really need those includes there? If yes, what is supposed to ensure that the generated includes are generated before compiling files which include those? 
From what I can see, gcc/Makefile.in has generated_files var which includes among other things insn-opinit.h, and # Dependency information. # In order for parallel make to really start compiling the expensive # objects from $(OBJS) as early as possible, build all their # prerequisites strictly before all objects. $(ALL_HOST_OBJS) : | $(generated_files) rule, plus I see $(generated_files) mentioned in a couple of dependencies in gcc/m2/Make-lang.in . But supposedly because of this change it now needs to be added to tons of other spots. Jakub ^ permalink raw reply [flat|nested] 53+ messages in thread
* [PATCH] modula2: Fix bootstrap 2023-06-06 19:00 ` Jakub Jelinek @ 2023-06-06 21:28 ` Jakub Jelinek 2023-06-06 22:18 ` Gaius Mulley 2023-06-07 8:42 ` Andre Vieira (lists) 0 siblings, 2 replies; 53+ messages in thread From: Jakub Jelinek @ 2023-06-06 21:28 UTC (permalink / raw) To: Gaius Mulley; +Cc: Andre Vieira, Richard Biener, Richard Sandiford, gcc-patches Hi! internal-fn.h since yesterday includes insn-opinit.h, which is a generated header. One of my bootstraps today failed because some m2 sources started compiling before insn-opinit.h has been generated. Normally, gcc/Makefile.in has # In order for parallel make to really start compiling the expensive # objects from $(OBJS) as early as possible, build all their # prerequisites strictly before all objects. $(ALL_HOST_OBJS) : | $(generated_files) rule which ensures that all the generated files are generated before any $(ALL_HOST_OBJS) objects start, but use order-only dependency for this because we don't want to rebuild most of the objects whenever one generated header is regenerated. After the initial build in an empty directory we'll have .deps/ files contain the detailed dependencies. $(ALL_HOST_OBJS) includes even some FE files, I think in the m2 case would be m2_OBJS, but m2/Make-lang.in doesn't define those. The following patch just adds a similar rule to m2/Make-lang.in. Another option would be to set m2_OBJS variable in m2/Make-lang.in to something, but not really sure to which exactly and why it isn't done. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2023-06-06 Jakub Jelinek <jakub@redhat.com> * Make-lang.in: Build $(generated_files) before building all $(GM2_C_OBJS). 
--- gcc/m2/Make-lang.in.jj 2023-05-04 09:31:27.289948109 +0200 +++ gcc/m2/Make-lang.in 2023-06-06 21:38:26.655336041 +0200 @@ -511,6 +511,8 @@ GM2_LIBS_BOOT = m2/gm2-compiler-boot m2/gm2-libs-boot/libgm2.a \ $(GM2-BOOT-O) +$(GM2_C_OBJS) : | $(generated_files) + cc1gm2$(exeext): m2/stage1/cc1gm2$(exeext) $(m2.prev) cp -p $< $@ Jakub ^ permalink raw reply [flat|nested] 53+ messages in thread
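[Editorial aside: for readers unfamiliar with the `|` syntax in the patch above — prerequisites listed after a `|` are *order-only*: make builds them before the target, but a prerequisite being newer than the target does not by itself trigger a rebuild. A minimal illustration, with invented file names:]

```make
# Normal prerequisite: touching foo.c recompiles foo.o.
# Order-only prerequisite: gen.h is generated before foo.o is compiled,
# but regenerating gen.h alone does not force foo.o to be rebuilt.
foo.o: foo.c | gen.h
	$(CC) -c foo.c -o foo.o

gen.h:
	./genheader > gen.h
```

[This is exactly the role of `$(ALL_HOST_OBJS) : | $(generated_files)` in gcc/Makefile.in: generated headers must exist before any host object compiles, without making every object rebuild whenever one generated header changes; the precise per-file dependencies come from the `.deps/` files after the first build.]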
* Re: [PATCH] modula2: Fix bootstrap 2023-06-06 21:28 ` [PATCH] modula2: Fix bootstrap Jakub Jelinek @ 2023-06-06 22:18 ` Gaius Mulley 2023-06-07 8:42 ` Andre Vieira (lists) 1 sibling, 0 replies; 53+ messages in thread From: Gaius Mulley @ 2023-06-06 22:18 UTC (permalink / raw) To: Jakub Jelinek Cc: Andre Vieira, Richard Biener, Richard Sandiford, gcc-patches Jakub Jelinek <jakub@redhat.com> writes: > Hi! > > internal-fn.h since yesterday includes insn-opinit.h, which is a generated > header. > One of my bootstraps today failed because some m2 sources started compiling > before insn-opinit.h has been generated. > > Normally, gcc/Makefile.in has > # In order for parallel make to really start compiling the expensive > # objects from $(OBJS) as early as possible, build all their > # prerequisites strictly before all objects. > $(ALL_HOST_OBJS) : | $(generated_files) > > rule which ensures that all the generated files are generated before > any $(ALL_HOST_OBJS) objects start, but use order-only dependency for > this because we don't want to rebuild most of the objects whenever one > generated header is regenerated. After the initial build in an empty > directory we'll have .deps/ files contain the detailed dependencies. > > $(ALL_HOST_OBJS) includes even some FE files, I think in the m2 case > would be m2_OBJS, but m2/Make-lang.in doesn't define those. > > The following patch just adds a similar rule to m2/Make-lang.in. > Another option would be to set m2_OBJS variable in m2/Make-lang.in to > something, but not really sure to which exactly and why it isn't > done. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > 2023-06-06 Jakub Jelinek <jakub@redhat.com> > > * Make-lang.in: Build $(generated_files) before building > all $(GM2_C_OBJS). 
> > --- gcc/m2/Make-lang.in.jj 2023-05-04 09:31:27.289948109 +0200 > +++ gcc/m2/Make-lang.in 2023-06-06 21:38:26.655336041 +0200 > @@ -511,6 +511,8 @@ GM2_LIBS_BOOT = m2/gm2-compiler-boot > m2/gm2-libs-boot/libgm2.a \ > $(GM2-BOOT-O) > > +$(GM2_C_OBJS) : | $(generated_files) > + > cc1gm2$(exeext): m2/stage1/cc1gm2$(exeext) $(m2.prev) > cp -p $< $@ > > > > Jakub Hi Jakub, sure looks good to me - thanks for the patch, regards, Gaius ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH] modula2: Fix bootstrap 2023-06-06 21:28 ` [PATCH] modula2: Fix bootstrap Jakub Jelinek 2023-06-06 22:18 ` Gaius Mulley @ 2023-06-07 8:42 ` Andre Vieira (lists) 2023-06-13 14:48 ` Jakub Jelinek 1 sibling, 1 reply; 53+ messages in thread From: Andre Vieira (lists) @ 2023-06-07 8:42 UTC (permalink / raw) To: Jakub Jelinek, Gaius Mulley Cc: Richard Biener, Richard Sandiford, gcc-patches Thanks Jakub! I do need those includes and sorry I broke your bootstrap it didn't show up on my aarch64-unknown-linux-gnu bootstrap, I'm guessing the rules there were just run in a different order. Glad you were able to fix it :) On 06/06/2023 22:28, Jakub Jelinek wrote: > Hi! > > internal-fn.h since yesterday includes insn-opinit.h, which is a generated > header. > One of my bootstraps today failed because some m2 sources started compiling > before insn-opinit.h has been generated. > > Normally, gcc/Makefile.in has > # In order for parallel make to really start compiling the expensive > # objects from $(OBJS) as early as possible, build all their > # prerequisites strictly before all objects. > $(ALL_HOST_OBJS) : | $(generated_files) > > rule which ensures that all the generated files are generated before > any $(ALL_HOST_OBJS) objects start, but use order-only dependency for > this because we don't want to rebuild most of the objects whenever one > generated header is regenerated. After the initial build in an empty > directory we'll have .deps/ files contain the detailed dependencies. > > $(ALL_HOST_OBJS) includes even some FE files, I think in the m2 case > would be m2_OBJS, but m2/Make-lang.in doesn't define those. > > The following patch just adds a similar rule to m2/Make-lang.in. > Another option would be to set m2_OBJS variable in m2/Make-lang.in to > something, but not really sure to which exactly and why it isn't > done. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 
> > 2023-06-06 Jakub Jelinek <jakub@redhat.com> > > * Make-lang.in: Build $(generated_files) before building > all $(GM2_C_OBJS). > > --- gcc/m2/Make-lang.in.jj 2023-05-04 09:31:27.289948109 +0200 > +++ gcc/m2/Make-lang.in 2023-06-06 21:38:26.655336041 +0200 > @@ -511,6 +511,8 @@ GM2_LIBS_BOOT = m2/gm2-compiler-boot > m2/gm2-libs-boot/libgm2.a \ > $(GM2-BOOT-O) > > +$(GM2_C_OBJS) : | $(generated_files) > + > cc1gm2$(exeext): m2/stage1/cc1gm2$(exeext) $(m2.prev) > cp -p $< $@ > > > > Jakub > ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH] modula2: Fix bootstrap 2023-06-07 8:42 ` Andre Vieira (lists) @ 2023-06-13 14:48 ` Jakub Jelinek 0 siblings, 0 replies; 53+ messages in thread From: Jakub Jelinek @ 2023-06-13 14:48 UTC (permalink / raw) To: Andre Vieira (lists) Cc: Gaius Mulley, Richard Biener, Richard Sandiford, gcc-patches On Wed, Jun 07, 2023 at 09:42:22AM +0100, Andre Vieira (lists) wrote: > I do need those includes and sorry I broke your bootstrap it didn't show up > on my aarch64-unknown-linux-gnu bootstrap, I'm guessing the rules there were > just run in a different order. Glad you were able to fix it :) Unfortunately, it doesn't really work. My x86_64-linux bootstrap today died again with: In file included from ../../gcc/m2/gm2-gcc/gcc-consolidation.h:74, from ../../gcc/m2/gm2-lang.cc:24: ../../gcc/internal-fn.h:24:10: fatal error: insn-opinit.h: No such file or directory 24 | #include "insn-opinit.h" | ^~~~~~~~~~~~~~~ compilation terminated. /home/jakub/src/gcc/obj36/./prev-gcc/xg++ -B/home/jakub/src/gcc/obj36/./prev-gcc/ -B/usr/local/x86_64-pc-linux-gnu/bin/ -nostdinc++ -B/home/jakub/src/gcc/obj36/prev-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs -B/home/jakub/src/gcc/obj36/prev-x86_64-pc-linux-gnu/libstdc++-v3/libsupc++/.libs -I/home/jakub/src/gcc/obj36/prev-x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu -I/home/jakub/src/gcc/obj36/prev-x86_64-pc-linux-gnu/libstdc++-v3/include -I/home/jakub/src/gcc/libstdc++-v3/libsupc++ -L/home/jakub/src/gcc/obj36/prev-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs -L/home/jakub/src/gcc/obj36/prev-x86_64-pc-linux-gnu/libstdc++-v3/libsupc++/.libs -fno-PIE -c -g -g -O2 -fchecking=1 -DIN_GCC -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -fno-common -DHAVE_CONFIG_H \ -I. 
-Im2/gm2-gcc -I../../gcc -I../../gcc/m2/gm2-gcc -I../../gcc/../include -I../../gcc/../libcpp/include -I../../gcc/../libcody -I../../gcc/../libdecnumber -I../../gcc/../libdecnumber/bid -I../libdecnumber -I../../gcc/../libbacktrace -I. -Im2/gm2-gcc -I../../gcc -I../../gcc/m2/gm2-gcc -I../../gcc/../include -I../../gcc/../libcpp/include -I../../gcc/../libcody -I../../gcc/../libdecnumber -I../../gcc/../libdecnumber/bid -I../libdecnumber -I../../gcc/../libbacktrace ../../gcc/m2/gm2-gcc/m2type.cc -o m2/gm2-gcc/m2type.o make[3]: *** [../../gcc/m2/Make-lang.in:570: m2/gm2-lang.o] Error 1 make[3]: *** Waiting for unfinished jobs.... errors. Dunno what is going on. I've tried --- gcc/m2/Make-lang.in.jj 2023-06-07 15:56:07.112684198 +0200 +++ gcc/m2/Make-lang.in 2023-06-13 16:08:55.409364765 +0200 @@ -511,7 +511,7 @@ GM2_LIBS_BOOT = m2/gm2-compiler-boot m2/gm2-libs-boot/libgm2.a \ $(GM2-BOOT-O) -$(GM2_C_OBJS) : | $(generated_files) +m2_OBJS = $(GM2_C_OBJS) cc1gm2$(exeext): m2/stage1/cc1gm2$(exeext) $(m2.prev) cp -p $< $@ but that doesn't really work either, this time not just random bootstrap breakages from time to time, but all the time. Including GM2_C_OBJS in m2_OBJS is I think the right thing, but that results in predefining IN_GCC_FRONTEND macro and we have e.g. /* Front ends should never have to include middle-end headers. Enforce this by poisoning the header double-include protection defines. */ #ifdef IN_GCC_FRONTEND #pragma GCC poison GCC_RTL_H GCC_EXCEPT_H GCC_EXPR_H #endif in system.h to make sure that FE sources don't include rtl.h, except.h, expr.h. But m2/gm2-gcc/gcc-consolidation.h includes tons of the RTL headers, rtl.h, df.h (twice), except.h; why? Also, seems one of GM2_C_OBJS is some special copy of stor-layout.cc which really isn't a FE file and so needs the RTL-ish headers. Jakub ^ permalink raw reply [flat|nested] 53+ messages in thread
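[Editorial note: the poisoning mechanism Jakub quotes from system.h can be sketched as follows. This is a minimal standalone illustration, assuming only standard GCC/Clang preprocessor behaviour; the identifiers mirror the quoted system.h fragment.]

```cpp
#include <cassert>

// When a file is built as part of a front end, IN_GCC_FRONTEND is
// predefined, and the include-guard macros of the middle-end headers
// are poisoned: any later occurrence of GCC_RTL_H etc. as a token --
// e.g. via an #include of rtl.h, whose guard defines it -- becomes a
// hard compile-time error ("attempt to use poisoned ...").
#define IN_GCC_FRONTEND
#ifdef IN_GCC_FRONTEND
#pragma GCC poison GCC_RTL_H GCC_EXCEPT_H GCC_EXPR_H
#endif

// This translation unit still compiles because it never uses a
// poisoned identifier outside comments; gm2-gcc/gcc-consolidation.h
// trips the trap because it pulls in rtl.h, df.h and except.h.
inline bool
frontend_poison_in_effect ()
{
#ifdef IN_GCC_FRONTEND
  return true;
#else
  return false;
#endif
}
```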
* [PATCH 3/3] Remove widen_plus/minus_expr tree codes 2023-04-25 9:55 ` Andre Vieira (lists) 2023-04-28 12:36 ` [PATCH 1/3] Refactor to allow internal_fn's Andre Vieira (lists) 2023-04-28 12:37 ` [PATCH 2/3] Refactor widen_plus as internal_fn Andre Vieira (lists) @ 2023-04-28 12:37 ` Andre Vieira (lists) 2023-05-03 12:29 ` Richard Biener 2 siblings, 1 reply; 53+ messages in thread From: Andre Vieira (lists) @ 2023-04-28 12:37 UTC (permalink / raw) To: Richard Biener; +Cc: Richard Biener, Richard Sandiford, gcc-patches [-- Attachment #1: Type: text/plain, Size: 1543 bytes --] This is a rebase of Joel's previous patch. This patch removes the old widen plus/minus tree codes which have been replaced by internal functions. gcc/ChangeLog: 2023-04-28 Andre Vieira <andre.simoesdiasvieira@arm.com> Joel Hutton <joel.hutton@arm.com> * doc/generic.texi: Remove old tree codes. * expr.cc (expand_expr_real_2): Remove old tree code cases. * gimple-pretty-print.cc (dump_binary_rhs): Likewise. * optabs-tree.cc (optab_for_tree_code): Likewise. (supportable_half_widening_operation): Likewise. * tree-cfg.cc (verify_gimple_assign_binary): Likewise. * tree-inline.cc (estimate_operator_cost): Likewise. (op_symbol_code): Likewise. * tree-vect-data-refs.cc (vect_get_smallest_scalar_type): Likewise. (vect_analyze_data_ref_accesses): Likewise. * tree-vect-generic.cc (expand_vector_operations_1): Likewise. * cfgexpand.cc (expand_debug_expr): Likewise. * tree-vect-stmts.cc (vectorizable_conversion): Likewise. (supportable_widening_operation): Likewise. * gimple-range-op.cc (gimple_range_op_handler::maybe_non_standard): Likewise. * tree-vect-patterns.cc (vect_widened_op_tree): Refactor to replace usage in vect_recog_sad_pattern. (vect_recog_sad_pattern): Replace tree code widening pattern with internal function. (vect_recog_average_pattern): Likewise. * tree-pretty-print.cc (dump_generic_node): Remove tree code definition. 
* tree.def (WIDEN_PLUS_EXPR, WIDEN_MINUS_EXPR, VEC_WIDEN_PLUS_HI_EXPR, VEC_WIDEN_PLUS_LO_EXPR, VEC_WIDEN_MINUS_HI_EXPR, VEC_WIDEN_MINUS_LO_EXPR): Likewise [-- Attachment #2: ifn2_v2.patch --] [-- Type: text/plain, Size: 19090 bytes --] diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc index 1a1b26b1c6c23ce273bcd08dc9a973f777174007..25b1558dcb941ea491a19aeeb2cd8f4d2dbdf7c6 100644 --- a/gcc/cfgexpand.cc +++ b/gcc/cfgexpand.cc @@ -5365,10 +5365,6 @@ expand_debug_expr (tree exp) case VEC_WIDEN_MULT_ODD_EXPR: case VEC_WIDEN_LSHIFT_HI_EXPR: case VEC_WIDEN_LSHIFT_LO_EXPR: - case VEC_WIDEN_PLUS_HI_EXPR: - case VEC_WIDEN_PLUS_LO_EXPR: - case VEC_WIDEN_MINUS_HI_EXPR: - case VEC_WIDEN_MINUS_LO_EXPR: case VEC_PERM_EXPR: case VEC_DUPLICATE_EXPR: case VEC_SERIES_EXPR: @@ -5405,8 +5401,6 @@ expand_debug_expr (tree exp) case WIDEN_MULT_EXPR: case WIDEN_MULT_PLUS_EXPR: case WIDEN_MULT_MINUS_EXPR: - case WIDEN_PLUS_EXPR: - case WIDEN_MINUS_EXPR: if (SCALAR_INT_MODE_P (GET_MODE (op0)) && SCALAR_INT_MODE_P (mode)) { @@ -5419,10 +5413,6 @@ expand_debug_expr (tree exp) op1 = simplify_gen_unary (ZERO_EXTEND, mode, op1, inner_mode); else op1 = simplify_gen_unary (SIGN_EXTEND, mode, op1, inner_mode); - if (TREE_CODE (exp) == WIDEN_PLUS_EXPR) - return simplify_gen_binary (PLUS, mode, op0, op1); - else if (TREE_CODE (exp) == WIDEN_MINUS_EXPR) - return simplify_gen_binary (MINUS, mode, op0, op1); op0 = simplify_gen_binary (MULT, mode, op0, op1); if (TREE_CODE (exp) == WIDEN_MULT_EXPR) return op0; diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi index 2c14b7abce2db0a3da0a21e916907947cb56a265..3816abaaf4d364d604a44942317f96f3f303e5b6 100644 --- a/gcc/doc/generic.texi +++ b/gcc/doc/generic.texi @@ -1811,10 +1811,6 @@ a value from @code{enum annot_expr_kind}, the third is an @code{INTEGER_CST}. 
@tindex VEC_RSHIFT_EXPR @tindex VEC_WIDEN_MULT_HI_EXPR @tindex VEC_WIDEN_MULT_LO_EXPR -@tindex VEC_WIDEN_PLUS_HI_EXPR -@tindex VEC_WIDEN_PLUS_LO_EXPR -@tindex VEC_WIDEN_MINUS_HI_EXPR -@tindex VEC_WIDEN_MINUS_LO_EXPR @tindex VEC_UNPACK_HI_EXPR @tindex VEC_UNPACK_LO_EXPR @tindex VEC_UNPACK_FLOAT_HI_EXPR @@ -1861,33 +1857,6 @@ vector of @code{N/2} products. In the case of @code{VEC_WIDEN_MULT_LO_EXPR} the low @code{N/2} elements of the two vector are multiplied to produce the vector of @code{N/2} products. -@item VEC_WIDEN_PLUS_HI_EXPR -@itemx VEC_WIDEN_PLUS_LO_EXPR -These nodes represent widening vector addition of the high and low parts of -the two input vectors, respectively. Their operands are vectors that contain -the same number of elements (@code{N}) of the same integral type. The result -is a vector that contains half as many elements, of an integral type whose size -is twice as wide. In the case of @code{VEC_WIDEN_PLUS_HI_EXPR} the high -@code{N/2} elements of the two vectors are added to produce the vector of -@code{N/2} products. In the case of @code{VEC_WIDEN_PLUS_LO_EXPR} the low -@code{N/2} elements of the two vectors are added to produce the vector of -@code{N/2} products. - -@item VEC_WIDEN_MINUS_HI_EXPR -@itemx VEC_WIDEN_MINUS_LO_EXPR -These nodes represent widening vector subtraction of the high and low parts of -the two input vectors, respectively. Their operands are vectors that contain -the same number of elements (@code{N}) of the same integral type. The high/low -elements of the second vector are subtracted from the high/low elements of the -first. The result is a vector that contains half as many elements, of an -integral type whose size is twice as wide. In the case of -@code{VEC_WIDEN_MINUS_HI_EXPR} the high @code{N/2} elements of the second -vector are subtracted from the high @code{N/2} of the first to produce the -vector of @code{N/2} products. 
In the case of -@code{VEC_WIDEN_MINUS_LO_EXPR} the low @code{N/2} elements of the second -vector are subtracted from the low @code{N/2} of the first to produce the -vector of @code{N/2} products. - @item VEC_UNPACK_HI_EXPR @itemx VEC_UNPACK_LO_EXPR These nodes represent unpacking of the high and low parts of the input vector, diff --git a/gcc/expr.cc b/gcc/expr.cc index f8f5cc5a6ca67f291b3c8b7246d593c0be80272f..454d1391b19a7d2aa53f0a88876d1eaf0494de51 100644 --- a/gcc/expr.cc +++ b/gcc/expr.cc @@ -9601,8 +9601,6 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode, target, unsignedp); return target; - case WIDEN_PLUS_EXPR: - case WIDEN_MINUS_EXPR: case WIDEN_MULT_EXPR: /* If first operand is constant, swap them. Thus the following special case checks need only @@ -10380,10 +10378,6 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode, return temp; } - case VEC_WIDEN_PLUS_HI_EXPR: - case VEC_WIDEN_PLUS_LO_EXPR: - case VEC_WIDEN_MINUS_HI_EXPR: - case VEC_WIDEN_MINUS_LO_EXPR: case VEC_WIDEN_MULT_HI_EXPR: case VEC_WIDEN_MULT_LO_EXPR: case VEC_WIDEN_MULT_EVEN_EXPR: diff --git a/gcc/gimple-pretty-print.cc b/gcc/gimple-pretty-print.cc index 300e9d7ed1e7be73f30875e08c461a8880c3134e..d903826894e7f0dfd34dc0caad92eea3caa45e05 100644 --- a/gcc/gimple-pretty-print.cc +++ b/gcc/gimple-pretty-print.cc @@ -459,10 +459,6 @@ dump_binary_rhs (pretty_printer *buffer, const gassign *gs, int spc, case VEC_PACK_FLOAT_EXPR: case VEC_WIDEN_LSHIFT_HI_EXPR: case VEC_WIDEN_LSHIFT_LO_EXPR: - case VEC_WIDEN_PLUS_HI_EXPR: - case VEC_WIDEN_PLUS_LO_EXPR: - case VEC_WIDEN_MINUS_HI_EXPR: - case VEC_WIDEN_MINUS_LO_EXPR: case VEC_SERIES_EXPR: for (p = get_tree_code_name (code); *p; p++) pp_character (buffer, TOUPPER (*p)); diff --git a/gcc/gimple-range-op.cc b/gcc/gimple-range-op.cc index 4ca32a7b5d52f8426b09d1446a336650e143b41f..5ae7f7596c6fc6f901e4e47ae44f00185f4602b2 100644 --- a/gcc/gimple-range-op.cc +++ b/gcc/gimple-range-op.cc @@ -797,12 +797,6 @@ 
gimple_range_op_handler::maybe_non_standard () if (gimple_code (m_stmt) == GIMPLE_ASSIGN) switch (gimple_assign_rhs_code (m_stmt)) { - case WIDEN_PLUS_EXPR: - { - signed_op = ptr_op_widen_plus_signed; - unsigned_op = ptr_op_widen_plus_unsigned; - } - gcc_fallthrough (); case WIDEN_MULT_EXPR: { m_valid = false; diff --git a/gcc/optabs-tree.cc b/gcc/optabs-tree.cc index 8010046c6a8b3e809c989ddef7a06ddaa68ae32a..ee1aa8c9676ee9c67edbf403e6295da391826a62 100644 --- a/gcc/optabs-tree.cc +++ b/gcc/optabs-tree.cc @@ -190,22 +190,6 @@ optab_for_tree_code (enum tree_code code, const_tree type, return (TYPE_UNSIGNED (type) ? vec_widen_ushiftl_lo_optab : vec_widen_sshiftl_lo_optab); - case VEC_WIDEN_PLUS_LO_EXPR: - return (TYPE_UNSIGNED (type) - ? vec_widen_uaddl_lo_optab : vec_widen_saddl_lo_optab); - - case VEC_WIDEN_PLUS_HI_EXPR: - return (TYPE_UNSIGNED (type) - ? vec_widen_uaddl_hi_optab : vec_widen_saddl_hi_optab); - - case VEC_WIDEN_MINUS_LO_EXPR: - return (TYPE_UNSIGNED (type) - ? vec_widen_usubl_lo_optab : vec_widen_ssubl_lo_optab); - - case VEC_WIDEN_MINUS_HI_EXPR: - return (TYPE_UNSIGNED (type) - ? vec_widen_usubl_hi_optab : vec_widen_ssubl_hi_optab); - case VEC_UNPACK_HI_EXPR: return (TYPE_UNSIGNED (type) ? vec_unpacku_hi_optab : vec_unpacks_hi_optab); @@ -312,8 +296,6 @@ optab_for_tree_code (enum tree_code code, const_tree type, 'hi'/'lo' pair using codes such as VEC_WIDEN_MINUS_HI/LO. 
Supported widening operations: - WIDEN_MINUS_EXPR - WIDEN_PLUS_EXPR WIDEN_MULT_EXPR WIDEN_LSHIFT_EXPR @@ -345,12 +327,6 @@ supportable_half_widening_operation (enum tree_code code, tree vectype_out, case WIDEN_LSHIFT_EXPR: *code1 = LSHIFT_EXPR; break; - case WIDEN_MINUS_EXPR: - *code1 = MINUS_EXPR; - break; - case WIDEN_PLUS_EXPR: - *code1 = PLUS_EXPR; - break; case WIDEN_MULT_EXPR: *code1 = MULT_EXPR; break; diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc index a9fcc7fd050f871437ef336ecfb8d6cc81280ee0..f80cd1465df83b5540492e619e56b9af249e9f31 100644 --- a/gcc/tree-cfg.cc +++ b/gcc/tree-cfg.cc @@ -4017,8 +4017,6 @@ verify_gimple_assign_binary (gassign *stmt) return false; } - case WIDEN_PLUS_EXPR: - case WIDEN_MINUS_EXPR: case PLUS_EXPR: case MINUS_EXPR: { @@ -4139,10 +4137,6 @@ verify_gimple_assign_binary (gassign *stmt) return false; } - case VEC_WIDEN_MINUS_HI_EXPR: - case VEC_WIDEN_MINUS_LO_EXPR: - case VEC_WIDEN_PLUS_HI_EXPR: - case VEC_WIDEN_PLUS_LO_EXPR: case VEC_WIDEN_MULT_HI_EXPR: case VEC_WIDEN_MULT_LO_EXPR: case VEC_WIDEN_MULT_EVEN_EXPR: diff --git a/gcc/tree-inline.cc b/gcc/tree-inline.cc index c702f0032a19203a7c536a01c1e7f47fc7b77add..6e5fd45a0c2435109dd3d50e8fc8e1d4969a1fd0 100644 --- a/gcc/tree-inline.cc +++ b/gcc/tree-inline.cc @@ -4273,8 +4273,6 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights, case REALIGN_LOAD_EXPR: - case WIDEN_PLUS_EXPR: - case WIDEN_MINUS_EXPR: case WIDEN_SUM_EXPR: case WIDEN_MULT_EXPR: case DOT_PROD_EXPR: @@ -4283,10 +4281,6 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights, case WIDEN_MULT_MINUS_EXPR: case WIDEN_LSHIFT_EXPR: - case VEC_WIDEN_PLUS_HI_EXPR: - case VEC_WIDEN_PLUS_LO_EXPR: - case VEC_WIDEN_MINUS_HI_EXPR: - case VEC_WIDEN_MINUS_LO_EXPR: case VEC_WIDEN_MULT_HI_EXPR: case VEC_WIDEN_MULT_LO_EXPR: case VEC_WIDEN_MULT_EVEN_EXPR: diff --git a/gcc/tree-pretty-print.cc b/gcc/tree-pretty-print.cc index 7947f9647a15110b52d195643ad7d28ee32d4236..9941d8bf80535a98e647b8928619a6bf08bc434c 
100644 --- a/gcc/tree-pretty-print.cc +++ b/gcc/tree-pretty-print.cc @@ -2874,8 +2874,6 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags, break; /* Binary arithmetic and logic expressions. */ - case WIDEN_PLUS_EXPR: - case WIDEN_MINUS_EXPR: case WIDEN_SUM_EXPR: case WIDEN_MULT_EXPR: case MULT_EXPR: @@ -3831,10 +3829,6 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags, case VEC_SERIES_EXPR: case VEC_WIDEN_MULT_HI_EXPR: case VEC_WIDEN_MULT_LO_EXPR: - case VEC_WIDEN_PLUS_HI_EXPR: - case VEC_WIDEN_PLUS_LO_EXPR: - case VEC_WIDEN_MINUS_HI_EXPR: - case VEC_WIDEN_MINUS_LO_EXPR: case VEC_WIDEN_MULT_EVEN_EXPR: case VEC_WIDEN_MULT_ODD_EXPR: case VEC_WIDEN_LSHIFT_HI_EXPR: @@ -4352,12 +4346,6 @@ op_symbol_code (enum tree_code code) case WIDEN_LSHIFT_EXPR: return "w<<"; - case WIDEN_PLUS_EXPR: - return "w+"; - - case WIDEN_MINUS_EXPR: - return "w-"; - case POINTER_PLUS_EXPR: return "+"; diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc index 8daf7bd7dd34d043b1d7b4cba1779f0ecf9f520a..213a3899a6c145bb057cd118bec1df7a05728aef 100644 --- a/gcc/tree-vect-data-refs.cc +++ b/gcc/tree-vect-data-refs.cc @@ -136,8 +136,6 @@ vect_get_smallest_scalar_type (stmt_vec_info stmt_info, tree scalar_type) || gimple_assign_rhs_code (assign) == WIDEN_SUM_EXPR || gimple_assign_rhs_code (assign) == WIDEN_MULT_EXPR || gimple_assign_rhs_code (assign) == WIDEN_LSHIFT_EXPR - || gimple_assign_rhs_code (assign) == WIDEN_PLUS_EXPR - || gimple_assign_rhs_code (assign) == WIDEN_MINUS_EXPR || gimple_assign_rhs_code (assign) == FLOAT_EXPR) { tree rhs_type = TREE_TYPE (gimple_assign_rhs1 (assign)); diff --git a/gcc/tree-vect-generic.cc b/gcc/tree-vect-generic.cc index 445da53292e9d1d2db62ca962fc017bb0e6c9bbe..342ffc5fa7f3b8f37e6bd4658d2f1fccf1d2c7fa 100644 --- a/gcc/tree-vect-generic.cc +++ b/gcc/tree-vect-generic.cc @@ -2227,10 +2227,6 @@ expand_vector_operations_1 (gimple_stmt_iterator *gsi, arguments, not the widened result. 
VEC_UNPACK_FLOAT_*_EXPR is calculated in the same way above. */ if (code == WIDEN_SUM_EXPR - || code == VEC_WIDEN_PLUS_HI_EXPR - || code == VEC_WIDEN_PLUS_LO_EXPR - || code == VEC_WIDEN_MINUS_HI_EXPR - || code == VEC_WIDEN_MINUS_LO_EXPR || code == VEC_WIDEN_MULT_HI_EXPR || code == VEC_WIDEN_MULT_LO_EXPR || code == VEC_WIDEN_MULT_EVEN_EXPR diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc index 3175dd92187c0935f78ebbf2eb476bdcf8b4ccd1..ab3162b5ac66ea8a96c0ea7c45138ca5ee13423f 100644 --- a/gcc/tree-vect-patterns.cc +++ b/gcc/tree-vect-patterns.cc @@ -561,21 +561,30 @@ vect_joust_widened_type (tree type, tree new_type, tree *common_type) static unsigned int vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code, - tree_code widened_code, bool shift_p, + code_helper widened_code, bool shift_p, unsigned int max_nops, vect_unpromoted_value *unprom, tree *common_type, enum optab_subtype *subtype = NULL) { /* Check for an integer operation with the right code. 
*/ - gassign *assign = dyn_cast <gassign *> (stmt_info->stmt); - if (!assign) + gimple* stmt = stmt_info->stmt; + if (!(is_gimple_assign (stmt) || is_gimple_call (stmt))) + return 0; + + code_helper rhs_code; + if (is_gimple_assign (stmt)) + rhs_code = gimple_assign_rhs_code (stmt); + else if (is_gimple_call (stmt)) + rhs_code = gimple_call_combined_fn (stmt); + else return 0; - tree_code rhs_code = gimple_assign_rhs_code (assign); - if (rhs_code != code && rhs_code != widened_code) + if (rhs_code != code + && rhs_code != widened_code) return 0; - tree type = TREE_TYPE (gimple_assign_lhs (assign)); + tree lhs = gimple_get_lhs (stmt); + tree type = TREE_TYPE (lhs); if (!INTEGRAL_TYPE_P (type)) return 0; @@ -588,7 +597,7 @@ vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code, { vect_unpromoted_value *this_unprom = &unprom[next_op]; unsigned int nops = 1; - tree op = gimple_op (assign, i + 1); + tree op = gimple_arg (stmt, i); if (i == 1 && TREE_CODE (op) == INTEGER_CST) { /* We already have a common type from earlier operands. @@ -1342,8 +1351,9 @@ vect_recog_sad_pattern (vec_info *vinfo, /* FORNOW. Can continue analyzing the def-use chain when this stmt in a phi inside the loop (in case we are analyzing an outer-loop). 
*/ vect_unpromoted_value unprom[2]; - if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR, WIDEN_MINUS_EXPR, - false, 2, unprom, &half_type)) + if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR, + CFN_VEC_WIDEN_MINUS, false, 2, unprom, + &half_type)) return NULL; vect_pattern_detected ("vect_recog_sad_pattern", last_stmt); @@ -2696,9 +2706,10 @@ vect_recog_average_pattern (vec_info *vinfo, internal_fn ifn = IFN_AVG_FLOOR; vect_unpromoted_value unprom[3]; tree new_type; + enum optab_subtype subtype; unsigned int nops = vect_widened_op_tree (vinfo, plus_stmt_info, PLUS_EXPR, - WIDEN_PLUS_EXPR, false, 3, - unprom, &new_type); + CFN_VEC_WIDEN_PLUS, false, 3, + unprom, &new_type, &subtype); if (nops == 0) return NULL; if (nops == 3) diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index 2a7ef2439e12d1966e8884433963a3d387a856b7..ef3ac551f7fe247893b021d98e43c581e2078dbb 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -5032,9 +5032,7 @@ vectorizable_conversion (vec_info *vinfo, else return false; - bool widen_arith = (code == WIDEN_PLUS_EXPR - || code == WIDEN_MINUS_EXPR - || code == WIDEN_MULT_EXPR + bool widen_arith = (code == WIDEN_MULT_EXPR || code == WIDEN_LSHIFT_EXPR || code == IFN_VEC_WIDEN_PLUS || code == IFN_VEC_WIDEN_MINUS); @@ -5086,8 +5084,6 @@ vectorizable_conversion (vec_info *vinfo, { gcc_assert (code == WIDEN_MULT_EXPR || code == WIDEN_LSHIFT_EXPR - || code == WIDEN_PLUS_EXPR - || code == WIDEN_MINUS_EXPR || code == IFN_VEC_WIDEN_PLUS || code == IFN_VEC_WIDEN_MINUS); @@ -12192,7 +12188,7 @@ supportable_widening_operation (vec_info *vinfo, class loop *vect_loop = NULL; machine_mode vec_mode; enum insn_code icode1, icode2; - optab optab1, optab2; + optab optab1 = unknown_optab, optab2 = unknown_optab; tree vectype = vectype_in; tree wide_vectype = vectype_out; tree_code c1 = MAX_TREE_CODES, c2 = MAX_TREE_CODES; @@ -12290,16 +12286,6 @@ supportable_widening_operation (vec_info *vinfo, c2 = 
VEC_WIDEN_LSHIFT_HI_EXPR; break; - case WIDEN_PLUS_EXPR: - c1 = VEC_WIDEN_PLUS_LO_EXPR; - c2 = VEC_WIDEN_PLUS_HI_EXPR; - break; - - case WIDEN_MINUS_EXPR: - c1 = VEC_WIDEN_MINUS_LO_EXPR; - c2 = VEC_WIDEN_MINUS_HI_EXPR; - break; - CASE_CONVERT: c1 = VEC_UNPACK_LO_EXPR; c2 = VEC_UNPACK_HI_EXPR; diff --git a/gcc/tree.def b/gcc/tree.def index ee02754354f015a16737c7e879d89c3e3be0d5aa..a58e608a90078818a7ade9d1173ac7ec84c48c7a 100644 --- a/gcc/tree.def +++ b/gcc/tree.def @@ -1374,15 +1374,16 @@ DEFTREECODE (DOT_PROD_EXPR, "dot_prod_expr", tcc_expression, 3) DEFTREECODE (WIDEN_SUM_EXPR, "widen_sum_expr", tcc_binary, 2) /* Widening sad (sum of absolute differences). - The first two arguments are of type t1 which should be integer. - The third argument and the result are of type t2, such that t2 is at least - twice the size of t1. Like DOT_PROD_EXPR, SAD_EXPR (arg1,arg2,arg3) is + The first two arguments are of type t1 which should be a vector of integers. + The third argument and the result are of type t2, such that the size of + the elements of t2 is at least twice the size of the elements of t1. + Like DOT_PROD_EXPR, SAD_EXPR (arg1,arg2,arg3) is equivalent to: - tmp = WIDEN_MINUS_EXPR (arg1, arg2) + tmp = IFN_VEC_WIDEN_MINUS_EXPR (arg1, arg2) tmp2 = ABS_EXPR (tmp) arg3 = PLUS_EXPR (tmp2, arg3) or: - tmp = WIDEN_MINUS_EXPR (arg1, arg2) + tmp = IFN_VEC_WIDEN_MINUS_EXPR (arg1, arg2) tmp2 = ABS_EXPR (tmp) arg3 = WIDEN_SUM_EXPR (tmp2, arg3) */ @@ -1421,8 +1422,6 @@ DEFTREECODE (WIDEN_MULT_MINUS_EXPR, "widen_mult_minus_expr", tcc_expression, 3) the first argument from type t1 to type t2, and then shifting it by the second argument. */ DEFTREECODE (WIDEN_LSHIFT_EXPR, "widen_lshift_expr", tcc_binary, 2) -DEFTREECODE (WIDEN_PLUS_EXPR, "widen_plus_expr", tcc_binary, 2) -DEFTREECODE (WIDEN_MINUS_EXPR, "widen_minus_expr", tcc_binary, 2) /* Widening vector multiplication. The two operands are vectors with N elements of size S. 
Multiplying the @@ -1487,10 +1486,6 @@ DEFTREECODE (VEC_PACK_FLOAT_EXPR, "vec_pack_float_expr", tcc_binary, 2) */ DEFTREECODE (VEC_WIDEN_LSHIFT_HI_EXPR, "widen_lshift_hi_expr", tcc_binary, 2) DEFTREECODE (VEC_WIDEN_LSHIFT_LO_EXPR, "widen_lshift_lo_expr", tcc_binary, 2) -DEFTREECODE (VEC_WIDEN_PLUS_HI_EXPR, "widen_plus_hi_expr", tcc_binary, 2) -DEFTREECODE (VEC_WIDEN_PLUS_LO_EXPR, "widen_plus_lo_expr", tcc_binary, 2) -DEFTREECODE (VEC_WIDEN_MINUS_HI_EXPR, "widen_minus_hi_expr", tcc_binary, 2) -DEFTREECODE (VEC_WIDEN_MINUS_LO_EXPR, "widen_minus_lo_expr", tcc_binary, 2) /* PREDICT_EXPR. Specify hint for branch prediction. The PREDICT_EXPR_PREDICTOR specify predictor and PREDICT_EXPR_OUTCOME the ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 3/3] Remove widen_plus/minus_expr tree codes 2023-04-28 12:37 ` [PATCH 3/3] Remove widen_plus/minus_expr tree codes Andre Vieira (lists) @ 2023-05-03 12:29 ` Richard Biener 2023-05-10 9:15 ` Andre Vieira (lists) 0 siblings, 1 reply; 53+ messages in thread From: Richard Biener @ 2023-05-03 12:29 UTC (permalink / raw) To: Andre Vieira (lists); +Cc: Richard Biener, Richard Sandiford, gcc-patches On Fri, 28 Apr 2023, Andre Vieira (lists) wrote: > This is a rebase of Joel's previous patch. > > This patch removes the old widen plus/minus tree codes which have been > replaced by internal functions. I guess that's obvious then. I wonder what we do to internal fns in debug stmts? Looks like we throw those away and do not generate debug stmts from calls. Given you remove handling of the scalar WIDEN_PLUS/MINUS_EXPR codes everywhere do we want to add checking code the scalar IFNs do not appear in the IL? For at least some cases there are corresponding functions handling internal functions that you could have amended otherwise. Richard. > gcc/ChangeLog: > > 2023-04-28 Andre Vieira <andre.simoesdiasvieira@arm.com> > Joel Hutton <joel.hutton@arm.com> > > * doc/generic.texi: Remove old tree codes. > * expr.cc (expand_expr_real_2): Remove old tree code cases. > * gimple-pretty-print.cc (dump_binary_rhs): Likewise. > * optabs-tree.cc (optab_for_tree_code): Likewise. > (supportable_half_widening_operation): Likewise. > * tree-cfg.cc (verify_gimple_assign_binary): Likewise. > * tree-inline.cc (estimate_operator_cost): Likewise. > (op_symbol_code): Likewise. > * tree-vect-data-refs.cc (vect_get_smallest_scalar_type): Likewise. > (vect_analyze_data_ref_accesses): Likewise. > * tree-vect-generic.cc (expand_vector_operations_1): Likewise. > * cfgexpand.cc (expand_debug_expr): Likewise. > * tree-vect-stmts.cc (vectorizable_conversion): Likewise. > (supportable_widening_operation): Likewise. > * gimple-range-op.cc (gimple_range_op_handler::maybe_non_standard): > Likewise. 
> * tree-vect-patterns.cc (vect_widened_op_tree): Refactor to replace > usage in vect_recog_sad_pattern. > (vect_recog_sad_pattern): Replace tree code widening pattern with > internal function. > (vect_recog_average_pattern): Likewise. > * tree-pretty-print.cc (dump_generic_node): Remove tree code > definition. > * tree.def (WIDEN_PLUS_EXPR, WIDEN_MINUS_EXPR, VEC_WIDEN_PLUS_HI_EXPR, > VEC_WIDEN_PLUS_LO_EXPR, VEC_WIDEN_MINUS_HI_EXPR, > VEC_WIDEN_MINUS_LO_EXPR): Likewise > -- Richard Biener <rguenther@suse.de> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman; HRB 36809 (AG Nuernberg) ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH 3/3] Remove widen_plus/minus_expr tree codes 2023-05-03 12:29 ` Richard Biener @ 2023-05-10 9:15 ` Andre Vieira (lists) 2023-05-12 12:18 ` Andre Vieira (lists) 0 siblings, 1 reply; 53+ messages in thread From: Andre Vieira (lists) @ 2023-05-10 9:15 UTC (permalink / raw) To: Richard Biener; +Cc: Richard Biener, Richard Sandiford, gcc-patches On 03/05/2023 13:29, Richard Biener wrote: > On Fri, 28 Apr 2023, Andre Vieira (lists) wrote: > >> This is a rebase of Joel's previous patch. >> >> This patch removes the old widen plus/minus tree codes which have been >> replaced by internal functions. > > I guess that's obvious then. I wonder what we do to internal > fns in debug stmts? Looks like we throw those away and do not > generate debug stmts from calls. See the comment above the removed lines in expand_debug_expr: /* Vector stuff. For most of the codes we don't have rtl codes. */ And it then just returns NULL for those expr's. So the behaviour there remains unchanged, not saying we couldn't do anything but I don > > Given you remove handling of the scalar WIDEN_PLUS/MINUS_EXPR > codes everywhere do we want to add checking code the scalar > IFNs do not appear in the IL? For at least some cases there > are corresponding functions handling internal functions that > you could have amended otherwise. I am making some changes to PATCH 2 of this series, in the new version I am adding some extra code to the gimple checks, one of which is to error if it comes a cross an IFN that decomposes to HILO as that should only occur as an intermediary representation of the vect pass. > > Richard. > >> gcc/ChangeLog: >> >> 2023-04-28 Andre Vieira <andre.simoesdiasvieira@arm.com> >> Joel Hutton <joel.hutton@arm.com> >> >> * doc/generic.texi: Remove old tree codes. >> * expr.cc (expand_expr_real_2): Remove old tree code cases. >> * gimple-pretty-print.cc (dump_binary_rhs): Likewise. >> * optabs-tree.cc (optab_for_tree_code): Likewise. 
>> (supportable_half_widening_operation): Likewise. >> * tree-cfg.cc (verify_gimple_assign_binary): Likewise. >> * tree-inline.cc (estimate_operator_cost): Likewise. >> (op_symbol_code): Likewise. >> * tree-vect-data-refs.cc (vect_get_smallest_scalar_type): Likewise. >> (vect_analyze_data_ref_accesses): Likewise. >> * tree-vect-generic.cc (expand_vector_operations_1): Likewise. >> * cfgexpand.cc (expand_debug_expr): Likewise. >> * tree-vect-stmts.cc (vectorizable_conversion): Likewise. >> (supportable_widening_operation): Likewise. >> * gimple-range-op.cc (gimple_range_op_handler::maybe_non_standard): >> Likewise. >> * tree-vect-patterns.cc (vect_widened_op_tree): Refactor to replace >> usage in vect_recog_sad_pattern. >> (vect_recog_sad_pattern): Replace tree code widening pattern with >> internal function. >> (vect_recog_average_pattern): Likewise. >> * tree-pretty-print.cc (dump_generic_node): Remove tree code >> definition. >> * tree.def (WIDEN_PLUS_EXPR, WIDEN_MINUS_EXPR, VEC_WIDEN_PLUS_HI_EXPR, >> VEC_WIDEN_PLUS_LO_EXPR, VEC_WIDEN_MINUS_HI_EXPR, >> VEC_WIDEN_MINUS_LO_EXPR): Likewise >> > ^ permalink raw reply [flat|nested] 53+ messages in thread
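[Editorial note: the extra gimple checking Andre describes -- erroring when an IFN that decomposes to HILO escapes the vectorizer -- can be sketched as below. The function names and the HILO IFN names are illustrative, not necessarily the ones the final patch uses.]

```cpp
#include <set>
#include <string>

// The *_HILO internal functions only exist as an intermediate
// representation inside the vect pass, so a verifier can reject any
// call to them that survives outside that window.
inline bool
hilo_call_ok (const std::string &ifn_name, bool inside_vectorizer)
{
  static const std::set<std::string> hilo_ifns = {
    "IFN_VEC_WIDEN_PLUS_HILO", "IFN_VEC_WIDEN_MINUS_HILO"
  };
  if (hilo_ifns.count (ifn_name))
    return inside_vectorizer;  // error if it leaked past vectorization
  return true;                 // other internal fns are unaffected
}
```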
* Re: [PATCH 3/3] Remove widen_plus/minus_expr tree codes 2023-05-10 9:15 ` Andre Vieira (lists) @ 2023-05-12 12:18 ` Andre Vieira (lists) 0 siblings, 0 replies; 53+ messages in thread From: Andre Vieira (lists) @ 2023-05-12 12:18 UTC (permalink / raw) To: Richard Biener; +Cc: Richard Biener, Richard Sandiford, gcc-patches [-- Attachment #1: Type: text/plain, Size: 1583 bytes --] Moved the 'changes' from this patch back to the second so it's all just about removing code that we no longer use. I don't really know why Joel formatted the patches this way, but I thought I'd keep it as is for now. cover letter: This patch removes the old widen plus/minus tree codes which have been replaced by internal functions. gcc/ChangeLog: 2023-05-12 Andre Vieira <andre.simoesdiasvieira@arm.com> Joel Hutton <joel.hutton@arm.com> * cfgexpand.cc (expand_debug_expr): Remove old tree codes. * doc/generic.texi: Likewise. * expr.cc (expand_expr_real_2): Likewise. * gimple-pretty-print.cc (dump_binary_rhs): Likewise. * gimple-range-op.cc (gimple_range_op_handler::maybe_non_standard): Likewise. * optabs-tree.cc (optab_for_tree_code): Likewise. (supportable_half_widening_operation): Likewise. * optabs.cc (commutative_optab_p): Likewise. * optabs.def (OPTAB_D): Likewise. * tree-cfg.cc (verify_gimple_assign_binary): Likewise. * tree-inline.cc (estimate_operator_cost): Likewise. (op_symbol_code): Likewise. * tree-pretty-print.cc (dump_generic_node): Remove tree code definition. * tree-vect-data-refs.cc (vect_get_smallest_scalar_type): Likewise. (vect_analyze_data_ref_accesses): Likewise. * tree-vect-generic.cc (expand_vector_operations_1): Likewise. * tree-vect-stmts.cc (vectorizable_conversion): Likewise. (supportable_widening_operation): Likewise. * tree.def (WIDEN_PLUS_EXPR, WIDEN_MINUS_EXPR, VEC_WIDEN_PLUS_HI_EXPR, VEC_WIDEN_PLUS_LO_EXPR, VEC_WIDEN_MINUS_HI_EXPR, VEC_WIDEN_MINUS_LO_EXPR): Likewise. 
[-- Attachment #2: ifn2v3.patch --]
[-- Type: text/plain, Size: 17304 bytes --]

diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
index 1a1b26b1c6c23ce273bcd08dc9a973f777174007..25b1558dcb941ea491a19aeeb2cd8f4d2dbdf7c6 100644
--- a/gcc/cfgexpand.cc
+++ b/gcc/cfgexpand.cc
@@ -5365,10 +5365,6 @@ expand_debug_expr (tree exp)
     case VEC_WIDEN_MULT_ODD_EXPR:
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
-    case VEC_WIDEN_PLUS_HI_EXPR:
-    case VEC_WIDEN_PLUS_LO_EXPR:
-    case VEC_WIDEN_MINUS_HI_EXPR:
-    case VEC_WIDEN_MINUS_LO_EXPR:
     case VEC_PERM_EXPR:
     case VEC_DUPLICATE_EXPR:
     case VEC_SERIES_EXPR:
@@ -5405,8 +5401,6 @@ expand_debug_expr (tree exp)
     case WIDEN_MULT_EXPR:
     case WIDEN_MULT_PLUS_EXPR:
     case WIDEN_MULT_MINUS_EXPR:
-    case WIDEN_PLUS_EXPR:
-    case WIDEN_MINUS_EXPR:
       if (SCALAR_INT_MODE_P (GET_MODE (op0))
	  && SCALAR_INT_MODE_P (mode))
	{
@@ -5419,10 +5413,6 @@ expand_debug_expr (tree exp)
	    op1 = simplify_gen_unary (ZERO_EXTEND, mode, op1, inner_mode);
	  else
	    op1 = simplify_gen_unary (SIGN_EXTEND, mode, op1, inner_mode);
-	  if (TREE_CODE (exp) == WIDEN_PLUS_EXPR)
-	    return simplify_gen_binary (PLUS, mode, op0, op1);
-	  else if (TREE_CODE (exp) == WIDEN_MINUS_EXPR)
-	    return simplify_gen_binary (MINUS, mode, op0, op1);
	  op0 = simplify_gen_binary (MULT, mode, op0, op1);
	  if (TREE_CODE (exp) == WIDEN_MULT_EXPR)
	    return op0;
diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi
index 0fd7e6cce8bbd4ecb8027b702722adcf6c32eb55..a23d57af20610e0bb4809f06fb0c91253ae56d11 100644
--- a/gcc/doc/generic.texi
+++ b/gcc/doc/generic.texi
@@ -1815,10 +1815,6 @@ a value from @code{enum annot_expr_kind}, the third is an @code{INTEGER_CST}.
 @tindex IFN_VEC_WIDEN_PLUS_LO
 @tindex IFN_VEC_WIDEN_MINUS_HI
 @tindex IFN_VEC_WIDEN_MINUS_LO
-@tindex VEC_WIDEN_PLUS_HI_EXPR
-@tindex VEC_WIDEN_PLUS_LO_EXPR
-@tindex VEC_WIDEN_MINUS_HI_EXPR
-@tindex VEC_WIDEN_MINUS_LO_EXPR
 @tindex VEC_UNPACK_HI_EXPR
 @tindex VEC_UNPACK_LO_EXPR
 @tindex VEC_UNPACK_FLOAT_HI_EXPR
@@ -1892,33 +1888,6 @@ vector of @code{N/2} products.  In the case of
 vector are subtracted from the low @code{N/2} of the first to produce the
 vector of @code{N/2} products.
 
-@item VEC_WIDEN_PLUS_HI_EXPR
-@itemx VEC_WIDEN_PLUS_LO_EXPR
-These nodes represent widening vector addition of the high and low parts of
-the two input vectors, respectively.  Their operands are vectors that contain
-the same number of elements (@code{N}) of the same integral type.  The result
-is a vector that contains half as many elements, of an integral type whose size
-is twice as wide.  In the case of @code{VEC_WIDEN_PLUS_HI_EXPR} the high
-@code{N/2} elements of the two vectors are added to produce the vector of
-@code{N/2} products.  In the case of @code{VEC_WIDEN_PLUS_LO_EXPR} the low
-@code{N/2} elements of the two vectors are added to produce the vector of
-@code{N/2} products.
-
-@item VEC_WIDEN_MINUS_HI_EXPR
-@itemx VEC_WIDEN_MINUS_LO_EXPR
-These nodes represent widening vector subtraction of the high and low parts of
-the two input vectors, respectively.  Their operands are vectors that contain
-the same number of elements (@code{N}) of the same integral type.  The high/low
-elements of the second vector are subtracted from the high/low elements of the
-first.  The result is a vector that contains half as many elements, of an
-integral type whose size is twice as wide.  In the case of
-@code{VEC_WIDEN_MINUS_HI_EXPR} the high @code{N/2} elements of the second
-vector are subtracted from the high @code{N/2} of the first to produce the
-vector of @code{N/2} products.  In the case of
-@code{VEC_WIDEN_MINUS_LO_EXPR} the low @code{N/2} elements of the second
-vector are subtracted from the low @code{N/2} of the first to produce the
-vector of @code{N/2} products.
-
 @item VEC_UNPACK_HI_EXPR
 @itemx VEC_UNPACK_LO_EXPR
 These nodes represent unpacking of the high and low parts of the input vector,
diff --git a/gcc/expr.cc b/gcc/expr.cc
index 758dda9ec68a8ba7a7b0e247aee50fd7996aa1d7..dd03688167b04299be213a5c379876499cb6a317 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -9600,8 +9600,6 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
			 target, unsignedp);
       return target;
 
-    case WIDEN_PLUS_EXPR:
-    case WIDEN_MINUS_EXPR:
     case WIDEN_MULT_EXPR:
       /* If first operand is constant, swap them.
	 Thus the following special case checks need only
@@ -10379,10 +10377,6 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
	  return temp;
	}
 
-    case VEC_WIDEN_PLUS_HI_EXPR:
-    case VEC_WIDEN_PLUS_LO_EXPR:
-    case VEC_WIDEN_MINUS_HI_EXPR:
-    case VEC_WIDEN_MINUS_LO_EXPR:
     case VEC_WIDEN_MULT_HI_EXPR:
     case VEC_WIDEN_MULT_LO_EXPR:
     case VEC_WIDEN_MULT_EVEN_EXPR:
diff --git a/gcc/gimple-pretty-print.cc b/gcc/gimple-pretty-print.cc
index e46f7d5f55a31bf6453cd33683aa536f7fbe606f..8db221f65fe7e2fc1ce25685240f11516af87fe6 100644
--- a/gcc/gimple-pretty-print.cc
+++ b/gcc/gimple-pretty-print.cc
@@ -459,10 +459,6 @@ dump_binary_rhs (pretty_printer *buffer, const gassign *gs, int spc,
     case VEC_PACK_FLOAT_EXPR:
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
-    case VEC_WIDEN_PLUS_HI_EXPR:
-    case VEC_WIDEN_PLUS_LO_EXPR:
-    case VEC_WIDEN_MINUS_HI_EXPR:
-    case VEC_WIDEN_MINUS_LO_EXPR:
     case VEC_SERIES_EXPR:
       for (p = get_tree_code_name (code); *p; p++)
	pp_character (buffer, TOUPPER (*p));
diff --git a/gcc/gimple-range-op.cc b/gcc/gimple-range-op.cc
index 66636d82df27626e7911efd0cb8526921b39633f..466985bfd39a147d47ac525b7fe9bc3fd2d0b7b3 100644
--- a/gcc/gimple-range-op.cc
+++ b/gcc/gimple-range-op.cc
@@ -1191,12 +1191,6 @@ gimple_range_op_handler::maybe_non_standard ()
   if (gimple_code (m_stmt) == GIMPLE_ASSIGN)
     switch (gimple_assign_rhs_code (m_stmt))
       {
-	case WIDEN_PLUS_EXPR:
-	  {
-	    signed_op = ptr_op_widen_plus_signed;
-	    unsigned_op = ptr_op_widen_plus_unsigned;
-	  }
-	  gcc_fallthrough ();
	case WIDEN_MULT_EXPR:
	  {
	    m_valid = false;
diff --git a/gcc/optabs-tree.cc b/gcc/optabs-tree.cc
index 8010046c6a8b3e809c989ddef7a06ddaa68ae32a..ee1aa8c9676ee9c67edbf403e6295da391826a62 100644
--- a/gcc/optabs-tree.cc
+++ b/gcc/optabs-tree.cc
@@ -190,22 +190,6 @@ optab_for_tree_code (enum tree_code code, const_tree type,
       return (TYPE_UNSIGNED (type)
	      ? vec_widen_ushiftl_lo_optab : vec_widen_sshiftl_lo_optab);
 
-    case VEC_WIDEN_PLUS_LO_EXPR:
-      return (TYPE_UNSIGNED (type)
-	      ? vec_widen_uaddl_lo_optab : vec_widen_saddl_lo_optab);
-
-    case VEC_WIDEN_PLUS_HI_EXPR:
-      return (TYPE_UNSIGNED (type)
-	      ? vec_widen_uaddl_hi_optab : vec_widen_saddl_hi_optab);
-
-    case VEC_WIDEN_MINUS_LO_EXPR:
-      return (TYPE_UNSIGNED (type)
-	      ? vec_widen_usubl_lo_optab : vec_widen_ssubl_lo_optab);
-
-    case VEC_WIDEN_MINUS_HI_EXPR:
-      return (TYPE_UNSIGNED (type)
-	      ? vec_widen_usubl_hi_optab : vec_widen_ssubl_hi_optab);
-
     case VEC_UNPACK_HI_EXPR:
       return (TYPE_UNSIGNED (type)
	      ? vec_unpacku_hi_optab : vec_unpacks_hi_optab);
@@ -312,8 +296,6 @@ optab_for_tree_code (enum tree_code code, const_tree type,
    'hi'/'lo' pair using codes such as VEC_WIDEN_MINUS_HI/LO.
 
    Supported widening operations:
-    WIDEN_MINUS_EXPR
-    WIDEN_PLUS_EXPR
     WIDEN_MULT_EXPR
     WIDEN_LSHIFT_EXPR
@@ -345,12 +327,6 @@ supportable_half_widening_operation (enum tree_code code, tree vectype_out,
     case WIDEN_LSHIFT_EXPR:
       *code1 = LSHIFT_EXPR;
       break;
-    case WIDEN_MINUS_EXPR:
-      *code1 = MINUS_EXPR;
-      break;
-    case WIDEN_PLUS_EXPR:
-      *code1 = PLUS_EXPR;
-      break;
     case WIDEN_MULT_EXPR:
       *code1 = MULT_EXPR;
       break;
diff --git a/gcc/optabs.cc b/gcc/optabs.cc
index 5a08d91e550b2d92e9572211f811fdba99a33a38..4309733a39be3d2a82dd2b13a50d73e6ddc2e0ff 100644
--- a/gcc/optabs.cc
+++ b/gcc/optabs.cc
@@ -1315,10 +1315,6 @@ commutative_optab_p (optab binoptab)
	  || binoptab == umul_widen_optab
	  || binoptab == smul_highpart_optab
	  || binoptab == umul_highpart_optab
-	  || binoptab == vec_widen_saddl_hi_optab
-	  || binoptab == vec_widen_saddl_lo_optab
-	  || binoptab == vec_widen_uaddl_hi_optab
-	  || binoptab == vec_widen_uaddl_lo_optab
	  || binoptab == vec_widen_sadd_hi_optab
	  || binoptab == vec_widen_sadd_lo_optab
	  || binoptab == vec_widen_uadd_hi_optab
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 16d121722c8c5723d9b164f5a2c616dc7ec143de..d4b3befdb822b98f12a9a440261f1b8e81432639 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -406,10 +406,6 @@ OPTAB_D (vec_widen_smult_even_optab, "vec_widen_smult_even_$a")
 OPTAB_D (vec_widen_smult_hi_optab, "vec_widen_smult_hi_$a")
 OPTAB_D (vec_widen_smult_lo_optab, "vec_widen_smult_lo_$a")
 OPTAB_D (vec_widen_smult_odd_optab, "vec_widen_smult_odd_$a")
-OPTAB_D (vec_widen_ssubl_hi_optab, "vec_widen_ssubl_hi_$a")
-OPTAB_D (vec_widen_ssubl_lo_optab, "vec_widen_ssubl_lo_$a")
-OPTAB_D (vec_widen_saddl_hi_optab, "vec_widen_saddl_hi_$a")
-OPTAB_D (vec_widen_saddl_lo_optab, "vec_widen_saddl_lo_$a")
 OPTAB_D (vec_widen_ssub_hi_optab, "vec_widen_ssub_hi_$a")
 OPTAB_D (vec_widen_ssub_lo_optab, "vec_widen_ssub_lo_$a")
 OPTAB_D (vec_widen_sadd_hi_optab, "vec_widen_sadd_hi_$a")
@@ -422,10 +418,6 @@ OPTAB_D (vec_widen_umult_lo_optab, "vec_widen_umult_lo_$a")
 OPTAB_D (vec_widen_umult_odd_optab, "vec_widen_umult_odd_$a")
 OPTAB_D (vec_widen_ushiftl_hi_optab, "vec_widen_ushiftl_hi_$a")
 OPTAB_D (vec_widen_ushiftl_lo_optab, "vec_widen_ushiftl_lo_$a")
-OPTAB_D (vec_widen_usubl_hi_optab, "vec_widen_usubl_hi_$a")
-OPTAB_D (vec_widen_usubl_lo_optab, "vec_widen_usubl_lo_$a")
-OPTAB_D (vec_widen_uaddl_hi_optab, "vec_widen_uaddl_hi_$a")
-OPTAB_D (vec_widen_uaddl_lo_optab, "vec_widen_uaddl_lo_$a")
 OPTAB_D (vec_widen_usub_hi_optab, "vec_widen_usub_hi_$a")
 OPTAB_D (vec_widen_usub_lo_optab, "vec_widen_usub_lo_$a")
 OPTAB_D (vec_widen_uadd_hi_optab, "vec_widen_uadd_hi_$a")
diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
index 28464ad9e3a7ea25557ffebcdbdbc1340f9e0d8b..dc28f5bbfa6272a92b68489fe67446bd3eba0caf 100644
--- a/gcc/tree-cfg.cc
+++ b/gcc/tree-cfg.cc
@@ -4068,8 +4068,6 @@ verify_gimple_assign_binary (gassign *stmt)
	return false;
       }
 
-    case WIDEN_PLUS_EXPR:
-    case WIDEN_MINUS_EXPR:
     case PLUS_EXPR:
     case MINUS_EXPR:
       {
@@ -4190,10 +4188,6 @@ verify_gimple_assign_binary (gassign *stmt)
	  return false;
	}
 
-    case VEC_WIDEN_MINUS_HI_EXPR:
-    case VEC_WIDEN_MINUS_LO_EXPR:
-    case VEC_WIDEN_PLUS_HI_EXPR:
-    case VEC_WIDEN_PLUS_LO_EXPR:
     case VEC_WIDEN_MULT_HI_EXPR:
     case VEC_WIDEN_MULT_LO_EXPR:
     case VEC_WIDEN_MULT_EVEN_EXPR:
diff --git a/gcc/tree-inline.cc b/gcc/tree-inline.cc
index d74d8db2173b1ab117250fea89de5212d5e354ec..7b056c7dc7e173b0bc9981a5c98f0d50685b6b66 100644
--- a/gcc/tree-inline.cc
+++ b/gcc/tree-inline.cc
@@ -4273,8 +4273,6 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
 
     case REALIGN_LOAD_EXPR:
 
-    case WIDEN_PLUS_EXPR:
-    case WIDEN_MINUS_EXPR:
     case WIDEN_SUM_EXPR:
     case WIDEN_MULT_EXPR:
     case DOT_PROD_EXPR:
@@ -4283,10 +4281,6 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
     case WIDEN_MULT_MINUS_EXPR:
     case WIDEN_LSHIFT_EXPR:
 
-    case VEC_WIDEN_PLUS_HI_EXPR:
-    case VEC_WIDEN_PLUS_LO_EXPR:
-    case VEC_WIDEN_MINUS_HI_EXPR:
-    case VEC_WIDEN_MINUS_LO_EXPR:
     case VEC_WIDEN_MULT_HI_EXPR:
     case VEC_WIDEN_MULT_LO_EXPR:
     case VEC_WIDEN_MULT_EVEN_EXPR:
diff --git a/gcc/tree-pretty-print.cc b/gcc/tree-pretty-print.cc
index 7947f9647a15110b52d195643ad7d28ee32d4236..9941d8bf80535a98e647b8928619a6bf08bc434c 100644
--- a/gcc/tree-pretty-print.cc
+++ b/gcc/tree-pretty-print.cc
@@ -2874,8 +2874,6 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags,
       break;
 
       /* Binary arithmetic and logic expressions.  */
-    case WIDEN_PLUS_EXPR:
-    case WIDEN_MINUS_EXPR:
     case WIDEN_SUM_EXPR:
     case WIDEN_MULT_EXPR:
     case MULT_EXPR:
@@ -3831,10 +3829,6 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags,
     case VEC_SERIES_EXPR:
     case VEC_WIDEN_MULT_HI_EXPR:
     case VEC_WIDEN_MULT_LO_EXPR:
-    case VEC_WIDEN_PLUS_HI_EXPR:
-    case VEC_WIDEN_PLUS_LO_EXPR:
-    case VEC_WIDEN_MINUS_HI_EXPR:
-    case VEC_WIDEN_MINUS_LO_EXPR:
     case VEC_WIDEN_MULT_EVEN_EXPR:
     case VEC_WIDEN_MULT_ODD_EXPR:
     case VEC_WIDEN_LSHIFT_HI_EXPR:
@@ -4352,12 +4346,6 @@ op_symbol_code (enum tree_code code)
     case WIDEN_LSHIFT_EXPR:
       return "w<<";
 
-    case WIDEN_PLUS_EXPR:
-      return "w+";
-
-    case WIDEN_MINUS_EXPR:
-      return "w-";
-
     case POINTER_PLUS_EXPR:
       return "+";
diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
index 6721ab6efc4f029be8e2315c31ba87d94230cda5..68b29ee4661a0d08cf8f7048c23f992d4440b08b 100644
--- a/gcc/tree-vect-data-refs.cc
+++ b/gcc/tree-vect-data-refs.cc
@@ -136,8 +136,6 @@ vect_get_smallest_scalar_type (stmt_vec_info stmt_info, tree scalar_type)
	  || gimple_assign_rhs_code (assign) == WIDEN_SUM_EXPR
	  || gimple_assign_rhs_code (assign) == WIDEN_MULT_EXPR
	  || gimple_assign_rhs_code (assign) == WIDEN_LSHIFT_EXPR
-	  || gimple_assign_rhs_code (assign) == WIDEN_PLUS_EXPR
-	  || gimple_assign_rhs_code (assign) == WIDEN_MINUS_EXPR
	  || gimple_assign_rhs_code (assign) == FLOAT_EXPR)
	{
	  tree rhs_type = TREE_TYPE (gimple_assign_rhs1 (assign));
diff --git a/gcc/tree-vect-generic.cc b/gcc/tree-vect-generic.cc
index 59115b2e1629358e85cb770f6da04cc5a2adb27a..7f55966310cee67238b2561e333ea45ee6153d9a 100644
--- a/gcc/tree-vect-generic.cc
+++ b/gcc/tree-vect-generic.cc
@@ -2198,10 +2198,6 @@ expand_vector_operations_1 (gimple_stmt_iterator *gsi,
	 arguments, not the widened result.  VEC_UNPACK_FLOAT_*_EXPR is
	 calculated in the same way above.  */
       if (code == WIDEN_SUM_EXPR
-	  || code == VEC_WIDEN_PLUS_HI_EXPR
-	  || code == VEC_WIDEN_PLUS_LO_EXPR
-	  || code == VEC_WIDEN_MINUS_HI_EXPR
-	  || code == VEC_WIDEN_MINUS_LO_EXPR
	  || code == VEC_WIDEN_MULT_HI_EXPR
	  || code == VEC_WIDEN_MULT_LO_EXPR
	  || code == VEC_WIDEN_MULT_EVEN_EXPR
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 24c811ebe01fb8b003100dea494cf64fea72a975..7a818a3b7ad4c9e6b1f45abcc1e4fbd056aa1d29 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -5035,9 +5035,7 @@ vectorizable_conversion (vec_info *vinfo,
   else
     return false;
 
-  bool widen_arith = (code == WIDEN_PLUS_EXPR
-		      || code == WIDEN_MINUS_EXPR
-		      || code == WIDEN_MULT_EXPR
+  bool widen_arith = (code == WIDEN_MULT_EXPR
		      || code == WIDEN_LSHIFT_EXPR
		      || code == IFN_VEC_WIDEN_PLUS_HILO
		      || code == IFN_VEC_WIDEN_MINUS_HILO);
@@ -5089,8 +5087,6 @@ vectorizable_conversion (vec_info *vinfo,
    {
      gcc_assert (code == WIDEN_MULT_EXPR
		  || code == WIDEN_LSHIFT_EXPR
-		  || code == WIDEN_PLUS_EXPR
-		  || code == WIDEN_MINUS_EXPR
		  || code == IFN_VEC_WIDEN_PLUS_HILO
		  || code == IFN_VEC_WIDEN_MINUS_HILO);
@@ -12335,7 +12331,7 @@ supportable_widening_operation (vec_info *vinfo,
   class loop *vect_loop = NULL;
   machine_mode vec_mode;
   enum insn_code icode1, icode2;
-  optab optab1, optab2;
+  optab optab1 = unknown_optab, optab2 = unknown_optab;
   tree vectype = vectype_in;
   tree wide_vectype = vectype_out;
   tree_code c1 = MAX_TREE_CODES, c2 = MAX_TREE_CODES;
@@ -12433,16 +12429,6 @@ supportable_widening_operation (vec_info *vinfo,
       c2 = VEC_WIDEN_LSHIFT_HI_EXPR;
       break;
 
-    case WIDEN_PLUS_EXPR:
-      c1 = VEC_WIDEN_PLUS_LO_EXPR;
-      c2 = VEC_WIDEN_PLUS_HI_EXPR;
-      break;
-
-    case WIDEN_MINUS_EXPR:
-      c1 = VEC_WIDEN_MINUS_LO_EXPR;
-      c2 = VEC_WIDEN_MINUS_HI_EXPR;
-      break;
-
     CASE_CONVERT:
       c1 = VEC_UNPACK_LO_EXPR;
       c2 = VEC_UNPACK_HI_EXPR;
diff --git a/gcc/tree.def b/gcc/tree.def
index b37b0b35927b92a6536e5c2d9805ffce8319a240..1fc2ca7a7249d4767aa2448219bc21a8c650aeb4 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -1422,8 +1422,6 @@ DEFTREECODE (WIDEN_MULT_MINUS_EXPR, "widen_mult_minus_expr", tcc_expression, 3)
    the first argument from type t1 to type t2, and then shifting it
    by the second argument.  */
 DEFTREECODE (WIDEN_LSHIFT_EXPR, "widen_lshift_expr", tcc_binary, 2)
-DEFTREECODE (WIDEN_PLUS_EXPR, "widen_plus_expr", tcc_binary, 2)
-DEFTREECODE (WIDEN_MINUS_EXPR, "widen_minus_expr", tcc_binary, 2)
 
 /* Widening vector multiplication.
    The two operands are vectors with N elements of size S.  Multiplying the
@@ -1488,10 +1486,6 @@ DEFTREECODE (VEC_PACK_FLOAT_EXPR, "vec_pack_float_expr", tcc_binary, 2)
  */
 DEFTREECODE (VEC_WIDEN_LSHIFT_HI_EXPR, "widen_lshift_hi_expr", tcc_binary, 2)
 DEFTREECODE (VEC_WIDEN_LSHIFT_LO_EXPR, "widen_lshift_lo_expr", tcc_binary, 2)
-DEFTREECODE (VEC_WIDEN_PLUS_HI_EXPR, "widen_plus_hi_expr", tcc_binary, 2)
-DEFTREECODE (VEC_WIDEN_PLUS_LO_EXPR, "widen_plus_lo_expr", tcc_binary, 2)
-DEFTREECODE (VEC_WIDEN_MINUS_HI_EXPR, "widen_minus_hi_expr", tcc_binary, 2)
-DEFTREECODE (VEC_WIDEN_MINUS_LO_EXPR, "widen_minus_lo_expr", tcc_binary, 2)
 
 /* PREDICT_EXPR.  Specify hint for branch prediction.  The
    PREDICT_EXPR_PREDICTOR specify predictor and PREDICT_EXPR_OUTCOME the
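For reference, the semantics that the generic.texi hunks delete for the tree codes (and that the IFN_VEC_WIDEN_PLUS/MINUS internal functions keep providing) can be modelled in plain C. This is an illustrative sketch only, not GCC code; the function names and element types are made up for the example:

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Model of a widening vector add on a vector of N narrow elements:
   the "lo" form adds the low N/2 element pairs and the "hi" form the
   high N/2 pairs, each producing N/2 results of twice the width, so
   the sums cannot wrap.  */

static void
model_widen_plus_lo (const uint8_t *a, const uint8_t *b, uint16_t *out,
                     size_t n)
{
  for (size_t i = 0; i < n / 2; i++)
    out[i] = (uint16_t) a[i] + (uint16_t) b[i];
}

static void
model_widen_plus_hi (const uint8_t *a, const uint8_t *b, uint16_t *out,
                     size_t n)
{
  for (size_t i = 0; i < n / 2; i++)
    out[i] = (uint16_t) a[n / 2 + i] + (uint16_t) b[n / 2 + i];
}
```

Whether the representation is a pair of tree codes or a pair of internal functions, the operation computed is the same; the patch only changes how it is spelled in gimple.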
* Re: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns
  2022-06-07  8:18 ` Richard Sandiford
  2022-06-07  9:01   ` Joel Hutton
@ 2022-06-13  9:18   ` Richard Biener
  1 sibling, 0 replies; 53+ messages in thread
From: Richard Biener @ 2022-06-13  9:18 UTC (permalink / raw)
To: Richard Sandiford; +Cc: Joel Hutton, gcc-patches

On Tue, 7 Jun 2022, Richard Sandiford wrote:

> Joel Hutton <Joel.Hutton@arm.com> writes:
> >> > Patches attached. They already incorporated the .cc rename, now
> >> > rebased to be after the change to tree.h
> >>
> >> @@ -1412,8 +1412,7 @@ vect_recog_widen_op_pattern (vec_info *vinfo,
> >>                            2, oprnd, half_type, unprom, vectype);
> >>
> >>    tree var = vect_recog_temp_ssa_var (itype, NULL);
> >> -  gimple *pattern_stmt = gimple_build_assign (var, wide_code,
> >> -                                              oprnd[0], oprnd[1]);
> >> +  gimple *pattern_stmt = gimple_build (var, wide_code, oprnd[0],
> >> oprnd[1]);
> >>
> >> you should be able to do without the new gimple_build overload
> >> by using
> >>
> >>   gimple_seq stmts = NULL;
> >>   gimple_build (&stmts, wide_code, itype, oprnd[0], oprnd[1]);
> >>   gimple *pattern_stmt = gimple_seq_last_stmt (stmts);
> >>
> >> because 'gimple_build' is an existing API.
> >
> > Done.
> >
> > The gimple_build overload was at the request of Richard Sandiford,
> > I assume removing it is ok with you Richard S?
> > From Richard Sandiford:
> >> For example, I think we should hide this inside a new:
> >>
> >>   gimple_build (var, wide_code, oprnd[0], oprnd[1]);
> >>
> >> that works directly on code_helper, similarly to the new code_helper
> >> gimple_build interfaces.
>
> I thought the potential problem with the above is that gimple_build
> is a folding interface, so in principle it's allowed to return an
> existing SSA_NAME set by an existing statement (or even a constant).
> I think in this context we do need to force a new statement to be
> created.

Yes, that's due to how we use vect_finish_stmt_generation (only?).
It might be useful to add an overload that takes a gimple_seq instead
of a single gimple * for the vectorized stmt and leave all the magic
to that.

Now - we have the additional issue that we have STMT_VINFO_VEC_STMTS
instead of STMT_VINFO_VEC_DEFS (in the end we'll only ever need the
defs, never the stmts I think).

I do think that we eventually want to 'enhance' the gimple.h
non-folding stmt building API, unfortunately I took the 'gimple_build'
name for the folding one, so alternatively we can unify assign/call
with gimple_build_assign_or_call (...).  I don't really like the idea
of having folding and non-folding APIs being overloads :/

Maybe the non-folding API should be CTORs (guess GTY won't like that)
or static member functions:

  gimple *gimple::build (tree, code_helper, tree, tree);

and in the long run the gimple_build API should be (for some uses?)
off a class as well, like instead of

  gimple_seq seq = NULL;
  op = gimple_build (&seq, ...);

do

  gimple_builder b (location); // location defaulted to UNKNOWN
  op = b.build (...);

So - writing the above I somewhat like the idea of static member
functions in 'gimple' (yes, at the root of the class hierarchy,
definitely not at gimple_statement_with_memory_ops_base, not sure
if we want gassign::build for assigns and only the code_helper
'overloads' at the class root).

Richard.

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstraße 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)
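The folding-vs-forcing distinction Richard Sandiford raises can be made concrete with a toy model: a folding builder may hand back an existing operand without emitting anything, while pattern generation needs a real statement to exist. This sketch is purely illustrative and is not GCC's gimple API; all names here are invented:

```c
#include <assert.h>

/* Toy model of folding vs. non-folding statement building.  The
   folding builder simplifies x + 0 to x and emits nothing; the
   forcing builder always emits a statement.  */

struct toy_seq
{
  int num_stmts;  /* statements emitted so far */
};

static int
toy_build_plus_folding (struct toy_seq *seq, int a, int b)
{
  if (b == 0)
    return a;  /* folded: the caller gets an existing value back,
                  and no new statement exists for the result */
  seq->num_stmts++;
  return a + b;
}

static int
toy_build_plus_forcing (struct toy_seq *seq, int a, int b)
{
  seq->num_stmts++;  /* always emit, so a statement always exists */
  return a + b;
}
```

A caller that must attach the result to a newly created statement (as the vectorizer pattern code does) can only rely on the forcing variant; with the folding variant it would have to cope with the "no statement was emitted" case.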
end of thread, other threads:[~2023-06-13 14:48 UTC | newest]

Thread overview: 53+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-25  9:11 [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns Joel Hutton
2022-05-27 13:23 ` Richard Biener
2022-05-31 10:07 ` Joel Hutton
2022-05-31 16:46 ` Tamar Christina
2022-06-01 10:11 ` Richard Biener
2022-06-06 17:20 ` Joel Hutton
2022-06-07  8:18 ` Richard Sandiford
2022-06-07  9:01 ` Joel Hutton
2022-06-09 14:03 ` Joel Hutton
2022-06-13  9:02 ` Richard Biener
2022-06-30 13:20 ` Joel Hutton
2022-07-12 12:32 ` Richard Biener
2023-03-17 10:14 ` Andre Vieira (lists)
2023-03-17 11:52 ` Richard Biener
2023-04-20 13:23 ` Andre Vieira (lists)
2023-04-24 11:57 ` Richard Biener
2023-04-24 13:01 ` Richard Sandiford
2023-04-25 12:30 ` Richard Biener
2023-04-28 16:06 ` Andre Vieira (lists)
2023-04-25  9:55 ` Andre Vieira (lists)
2023-04-28 12:36 ` [PATCH 1/3] Refactor to allow internal_fn's Andre Vieira (lists)
2023-05-03 11:55 ` Richard Biener
2023-05-04 15:20 ` Andre Vieira (lists)
2023-05-05  6:09 ` Richard Biener
2023-05-12 12:14 ` Andre Vieira (lists)
2023-05-12 13:18 ` Richard Biener
2023-04-28 12:37 ` [PATCH 2/3] Refactor widen_plus as internal_fn Andre Vieira (lists)
2023-05-03 12:11 ` Richard Biener
2023-05-03 19:07 ` Richard Sandiford
2023-05-12 12:16 ` Andre Vieira (lists)
2023-05-12 13:28 ` Richard Biener
2023-05-12 13:55 ` Andre Vieira (lists)
2023-05-12 14:01 ` Richard Sandiford
2023-05-15 10:20 ` Richard Biener
2023-05-15 10:47 ` Richard Sandiford
2023-05-15 11:01 ` Richard Biener
2023-05-15 11:10 ` Richard Sandiford
2023-05-15 11:53 ` Andre Vieira (lists)
2023-05-15 12:21 ` Richard Biener
2023-05-18 17:15 ` Andre Vieira (lists)
2023-05-22 13:06 ` Richard Biener
2023-06-01 16:27 ` Andre Vieira (lists)
2023-06-02 12:00 ` Richard Sandiford
2023-06-06 19:00 ` Jakub Jelinek
2023-06-06 21:28 ` [PATCH] modula2: Fix bootstrap Jakub Jelinek
2023-06-06 22:18 ` Gaius Mulley
2023-06-07  8:42 ` Andre Vieira (lists)
2023-06-13 14:48 ` Jakub Jelinek
2023-04-28 12:37 ` [PATCH 3/3] Remove widen_plus/minus_expr tree codes Andre Vieira (lists)
2023-05-03 12:29 ` Richard Biener
2023-05-10  9:15 ` Andre Vieira (lists)
2023-05-12 12:18 ` Andre Vieira (lists)
2022-06-13  9:18 ` [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns Richard Biener