From: Richard Biener <rguenther@suse.de>
To: Richard Sandiford <richard.sandiford@arm.com>
Cc: gcc-patches@gcc.gnu.org, hubicka@ucw.cz, ubizjak@gmail.com,
kirill.yukhin@gmail.com, hongtao.liu@intel.com,
dje.gcc@gmail.com, segher@kernel.crashing.org
Subject: Re: [RFC] vect: Convert cost hooks to classes
Date: Thu, 21 Oct 2021 14:29:13 +0200 (CEST)
Message-ID: <r53843o4-83o-1q20-2556-spps5o363811@fhfr.qr>
In-Reply-To: <mpta6jbep0l.fsf@arm.com>
On Thu, 14 Oct 2021, Richard Sandiford wrote:
> The current vector cost interface has quite a bit of redundancy
> built in. Each target that defines its own hooks has to replicate
> the basic unsigned[3] management. Currently each target also
> duplicates the cost adjustment for inner loops.
>
> This patch instead defines a vector_costs class for holding
> the scalar or vector cost and allows targets to subclass it.
> There is then only one costing hook: to create a new costs
> structure of the appropriate type. Everything else can be
> virtual functions, with common concepts implemented in the
> base class rather than in each target's derivation.
>
> This might seem like excess C++-ification, but it shaves
> ~100 LOC. I've also got some follow-on changes that become
> significantly easier with this patch. Maybe it could help
> with things like weighting blocks based on frequency too.
>
> This will clash with Andre's unrolling patches. His patches
> have priority so this patch should queue behind them.
>
> The x86 and rs6000 parts fully convert to a self-contained class.
> The equivalent aarch64 changes are more complex, so this patch
> just does the bare minimum. A later patch will rework the
> aarch64 bits.
>
> Tested on aarch64-linux-gnu, arm-linux-gnueabihf, x86_64-linux-gnu
> and powerpc64le-linux-gnu. WDYT?
I like it! Thus OK.
I suggested something similar to Martin for the backend state
'[PATCH 3/N] Come up with casm global state.', abstracting
varasm global state and allowing targets to override this
via the adjusted init_sections target hook.
Richard.
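
For readers skimming the thread, the shape of the proposal (one
factory hook returning a costs object, shared bookkeeping in the base
class, target tweaks in a subclass) can be sketched roughly as below.
This is a standalone illustration, not the patch's code: region_info,
costs_base, example_target_costs, create_costs and the "double the
unit cost" rule are all hypothetical stand-ins, and the real
vector_costs interface takes more parameters.

```cpp
// Standalone sketch; none of these names are GCC's real declarations.
#include <cassert>
#include <memory>

enum cost_location { loc_prologue = 0, loc_body = 1, loc_epilogue = 2 };

// Hypothetical stand-in for vec_info: just enough state for the example.
struct region_info
{
  unsigned inner_loop_cost_factor;  // weight for inner-loop statements
};

// Base class: owns the three accumulators and the shared adjustment
// logic, so that targets no longer duplicate either.
class costs_base
{
public:
  costs_base (const region_info *region, bool costing_for_scalar)
    : m_region (region), m_costing_for_scalar (costing_for_scalar) {}
  virtual ~costs_base () {}

  // Record COUNT statements of UNIT_COST each; return the amount added.
  virtual unsigned add_stmt_cost (unsigned count, unsigned unit_cost,
                                  bool in_inner_loop, cost_location where)
  {
    return record_stmt_cost (in_inner_loop, where, count * unit_cost);
  }

  // Mark the calculation as complete; may only be called once.
  virtual void finish_cost ()
  {
    assert (!m_finished);
    m_finished = true;
  }

  unsigned prologue_cost () const { return m_costs[loc_prologue]; }
  unsigned body_cost () const { return m_costs[loc_body]; }
  unsigned epilogue_cost () const { return m_costs[loc_epilogue]; }

protected:
  // Apply the inner-loop weighting (body statements only), then accumulate.
  unsigned record_stmt_cost (bool in_inner_loop, cost_location where,
                             unsigned cost)
  {
    if (where == loc_body && in_inner_loop)
      cost *= m_region->inner_loop_cost_factor;
    m_costs[where] += cost;
    return cost;
  }

  const region_info *m_region;
  bool m_costing_for_scalar;
  unsigned m_costs[3] = { 0, 0, 0 };
  bool m_finished = false;
};

// Hypothetical target: vector statements cost twice the base unit cost.
class example_target_costs : public costs_base
{
public:
  using costs_base::costs_base;

  unsigned add_stmt_cost (unsigned count, unsigned unit_cost,
                          bool in_inner_loop, cost_location where) override
  {
    if (!m_costing_for_scalar)
      unit_cost *= 2;
    return record_stmt_cost (in_inner_loop, where, count * unit_cost);
  }
};

// The single "hook": create a costs object of the target's own type.
std::unique_ptr<costs_base>
create_costs (const region_info *region, bool costing_for_scalar)
{
  return std::make_unique<example_target_costs> (region, costing_for_scalar);
}
```

The caller only ever sees the base-class pointer, so destroying the
object replaces the old destroy_cost_data hook, and the inner-loop
weighting lives in one place instead of in every target.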
> Richard
>
>
> gcc/
> * target.def (targetm.vectorize.init_cost): Replace with...
> (targetm.vectorize.create_costs): ...this.
> (targetm.vectorize.add_stmt_cost): Delete.
> (targetm.vectorize.finish_cost): Likewise.
> (targetm.vectorize.destroy_cost_data): Likewise.
> * doc/tm.texi.in (TARGET_VECTORIZE_INIT_COST): Replace with...
> (TARGET_VECTORIZE_CREATE_COSTS): ...this.
> (TARGET_VECTORIZE_ADD_STMT_COST): Delete.
> (TARGET_VECTORIZE_FINISH_COST): Likewise.
> (TARGET_VECTORIZE_DESTROY_COST_DATA): Likewise.
> * doc/tm.texi: Regenerate.
> * tree-vectorizer.h (vec_info::vec_info): Remove target_cost_data
> parameter.
> (vec_info::target_cost_data): Change from a void * to a vector_costs *.
> (vector_costs): New class.
> (init_cost): Take a vec_info and return a vector_costs.
> (dump_stmt_cost): Remove data parameter.
> (add_stmt_cost): Replace vinfo and data parameters with a vector_costs.
> (add_stmt_costs): Likewise.
> (finish_cost): Replace data parameter with a vector_costs.
> (destroy_cost_data): Delete.
> * tree-vectorizer.c (dump_stmt_cost): Remove data argument and
> don't print it.
> (vec_info::vec_info): Remove the target_cost_data parameter and
> initialize the member variable to null instead.
> (vec_info::~vec_info): Delete target_cost_data instead of calling
> destroy_cost_data.
> (vector_costs::add_stmt_cost): New function.
> (vector_costs::finish_cost): Likewise.
> (vector_costs::record_stmt_cost): Likewise.
> (vector_costs::adjust_cost_for_freq): Likewise.
> * tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Update
> call to vec_info::vec_info.
> (vect_compute_single_scalar_iteration_cost): Update after above
> changes to costing interface.
> (vect_analyze_loop_operations): Likewise.
> (vect_estimate_min_profitable_iters): Likewise.
> (vect_analyze_loop_2): Initialize LOOP_VINFO_TARGET_COST_DATA
> at the start_over point, where it needs to be recreated after
> trying without slp. Update retry code accordingly.
> * tree-vect-slp.c (_bb_vec_info::_bb_vec_info): Update call
> to vec_info::vec_info.
> (vect_slp_analyze_operations): Update after above changes to costing
> interface.
> (vect_bb_vectorization_profitable_p): Likewise.
> * targhooks.h (default_init_cost): Replace with...
> (default_vectorize_create_costs): ...this.
> (default_add_stmt_cost): Delete.
> (default_finish_cost, default_destroy_cost_data): Likewise.
> * targhooks.c (default_init_cost): Replace with...
> (default_vectorize_create_costs): ...this.
> (default_add_stmt_cost): Delete, moving logic to vector_costs instead.
> (default_finish_cost, default_destroy_cost_data): Delete.
> * config/aarch64/aarch64.c (aarch64_vector_costs): Inherit from
> vector_costs. Add a constructor.
> (aarch64_init_cost): Replace with...
> (aarch64_vectorize_create_costs): ...this.
> (aarch64_add_stmt_cost): Replace with...
> (aarch64_vector_costs::add_stmt_cost): ...this. Use record_stmt_cost
> to adjust the cost for inner loops.
> (aarch64_finish_cost): Replace with...
> (aarch64_vector_costs::finish_cost): ...this.
> (aarch64_destroy_cost_data): Delete.
> (TARGET_VECTORIZE_INIT_COST): Replace with...
> (TARGET_VECTORIZE_CREATE_COSTS): ...this.
> (TARGET_VECTORIZE_ADD_STMT_COST): Delete.
> (TARGET_VECTORIZE_FINISH_COST): Likewise.
> (TARGET_VECTORIZE_DESTROY_COST_DATA): Likewise.
> * config/i386/i386.c (ix86_vector_costs): New structure.
> (ix86_init_cost): Replace with...
> (ix86_vectorize_create_costs): ...this.
> (ix86_add_stmt_cost): Replace with...
> (ix86_vector_costs::add_stmt_cost): ...this. Use adjust_cost_for_freq
> to adjust the cost for inner loops.
> (ix86_finish_cost, ix86_destroy_cost_data): Delete.
> (TARGET_VECTORIZE_INIT_COST): Replace with...
> (TARGET_VECTORIZE_CREATE_COSTS): ...this.
> (TARGET_VECTORIZE_ADD_STMT_COST): Delete.
> (TARGET_VECTORIZE_FINISH_COST): Likewise.
> (TARGET_VECTORIZE_DESTROY_COST_DATA): Likewise.
> * config/rs6000/rs6000.c (TARGET_VECTORIZE_INIT_COST): Replace with...
> (TARGET_VECTORIZE_CREATE_COSTS): ...this.
> (TARGET_VECTORIZE_ADD_STMT_COST): Delete.
> (TARGET_VECTORIZE_FINISH_COST): Likewise.
> (TARGET_VECTORIZE_DESTROY_COST_DATA): Likewise.
> (rs6000_cost_data): Inherit from vector_costs.
> Add a constructor. Drop loop_info, cost and costing_for_scalar
> in favor of the corresponding vector_costs member variables.
> Add "m_" to the names of the remaining member variables and
> initialize them.
> (rs6000_density_test): Replace with...
> (rs6000_cost_data::density_test): ...this.
> (rs6000_init_cost): Replace with...
> (rs6000_vectorize_create_costs): ...this.
> (rs6000_update_target_cost_per_stmt): Replace with...
> (rs6000_cost_data::update_target_cost_per_stmt): ...this.
> (rs6000_add_stmt_cost): Replace with...
> (rs6000_cost_data::add_stmt_cost): ...this. Use adjust_cost_for_freq
> to adjust the cost for inner loops.
> (rs6000_adjust_vect_cost_per_loop): Replace with...
> (rs6000_cost_data::adjust_vect_cost_per_loop): ...this.
> (rs6000_finish_cost): Replace with...
> (rs6000_cost_data::finish_cost): ...this. Group loop code
> into a single if statement and pass the loop_vinfo down to
> subroutines.
> (rs6000_destroy_cost_data): Delete.
> ---
> gcc/config/aarch64/aarch64.c | 137 ++++++++++--------------
> gcc/config/i386/i386.c | 76 ++++----------
> gcc/config/rs6000/rs6000.c | 196 ++++++++++++++---------------------
> gcc/doc/tm.texi | 25 +----
> gcc/doc/tm.texi.in | 8 +-
> gcc/target.def | 49 +--------
> gcc/targhooks.c | 61 +----------
> gcc/targhooks.h | 8 +-
> gcc/tree-vect-loop.c | 51 ++++-----
> gcc/tree-vect-slp.c | 18 ++--
> gcc/tree-vectorizer.c | 67 ++++++++++--
> gcc/tree-vectorizer.h | 141 ++++++++++++++++++++-----
> 12 files changed, 374 insertions(+), 463 deletions(-)
>
> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> index 902402d7503..f2e90990d9f 100644
> --- a/gcc/doc/tm.texi
> +++ b/gcc/doc/tm.texi
> @@ -6255,7 +6255,7 @@ type @code{internal_fn}) should be considered expensive when the mask is
> all zeros. GCC can then try to branch around the instruction instead.
> @end deftypefn
>
> -@deftypefn {Target Hook} {void *} TARGET_VECTORIZE_INIT_COST (class loop *@var{loop_info}, bool @var{costing_for_scalar})
> +@deftypefn {Target Hook} {class vector_costs *} TARGET_VECTORIZE_CREATE_COSTS (vec_info *@var{vinfo}, bool @var{costing_for_scalar})
> This hook should initialize target-specific data structures in preparation
> for modeling the costs of vectorizing a loop or basic block. The default
> allocates three unsigned integers for accumulating costs for the prologue,
> @@ -6266,29 +6266,6 @@ current cost model is for the scalar version of a loop or block; otherwise
> it is for the vector version.
> @end deftypefn
>
> -@deftypefn {Target Hook} unsigned TARGET_VECTORIZE_ADD_STMT_COST (class vec_info *@var{}, void *@var{data}, int @var{count}, enum vect_cost_for_stmt @var{kind}, class _stmt_vec_info *@var{stmt_info}, tree @var{vectype}, int @var{misalign}, enum vect_cost_model_location @var{where})
> -This hook should update the target-specific @var{data} in response to
> -adding @var{count} copies of the given @var{kind} of statement to a
> -loop or basic block. The default adds the builtin vectorizer cost for
> -the copies of the statement to the accumulator specified by @var{where},
> -(the prologue, body, or epilogue) and returns the amount added. The
> -return value should be viewed as a tentative cost that may later be
> -revised.
> -@end deftypefn
> -
> -@deftypefn {Target Hook} void TARGET_VECTORIZE_FINISH_COST (void *@var{data}, unsigned *@var{prologue_cost}, unsigned *@var{body_cost}, unsigned *@var{epilogue_cost})
> -This hook should complete calculations of the cost of vectorizing a loop
> -or basic block based on @var{data}, and return the prologue, body, and
> -epilogue costs as unsigned integers. The default returns the value of
> -the three accumulators.
> -@end deftypefn
> -
> -@deftypefn {Target Hook} void TARGET_VECTORIZE_DESTROY_COST_DATA (void *@var{data})
> -This hook should release @var{data} and any related data structures
> -allocated by TARGET_VECTORIZE_INIT_COST. The default releases the
> -accumulator.
> -@end deftypefn
> -
> @deftypefn {Target Hook} tree TARGET_VECTORIZE_BUILTIN_GATHER (const_tree @var{mem_vectype}, const_tree @var{index_type}, int @var{scale})
> Target builtin that implements vector gather operation. @var{mem_vectype}
> is the vector type of the load and @var{index_type} is scalar type of
> diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
> index 86352dc9bd2..738e7b8c19e 100644
> --- a/gcc/doc/tm.texi.in
> +++ b/gcc/doc/tm.texi.in
> @@ -4190,13 +4190,7 @@ address; but often a machine-dependent strategy can generate better code.
>
> @hook TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE
>
> -@hook TARGET_VECTORIZE_INIT_COST
> -
> -@hook TARGET_VECTORIZE_ADD_STMT_COST
> -
> -@hook TARGET_VECTORIZE_FINISH_COST
> -
> -@hook TARGET_VECTORIZE_DESTROY_COST_DATA
> +@hook TARGET_VECTORIZE_CREATE_COSTS
>
> @hook TARGET_VECTORIZE_BUILTIN_GATHER
>
> diff --git a/gcc/target.def b/gcc/target.def
> index c5d90cace80..1baaba4cd0f 100644
> --- a/gcc/target.def
> +++ b/gcc/target.def
> @@ -2051,7 +2051,7 @@ stores.",
>
> /* Target function to initialize the cost model for a loop or block. */
> DEFHOOK
> -(init_cost,
> +(create_costs,
> "This hook should initialize target-specific data structures in preparation\n\
> for modeling the costs of vectorizing a loop or basic block. The default\n\
> allocates three unsigned integers for accumulating costs for the prologue,\n\
> @@ -2060,50 +2060,9 @@ non-NULL, it identifies the loop being vectorized; otherwise a single block\n\
> is being vectorized. If @var{costing_for_scalar} is true, it indicates the\n\
> current cost model is for the scalar version of a loop or block; otherwise\n\
> it is for the vector version.",
> - void *,
> - (class loop *loop_info, bool costing_for_scalar),
> - default_init_cost)
> -
> -/* Target function to record N statements of the given kind using the
> - given vector type within the cost model data for the current loop or
> - block. */
> -DEFHOOK
> -(add_stmt_cost,
> - "This hook should update the target-specific @var{data} in response to\n\
> -adding @var{count} copies of the given @var{kind} of statement to a\n\
> -loop or basic block. The default adds the builtin vectorizer cost for\n\
> -the copies of the statement to the accumulator specified by @var{where},\n\
> -(the prologue, body, or epilogue) and returns the amount added. The\n\
> -return value should be viewed as a tentative cost that may later be\n\
> -revised.",
> - unsigned,
> - (class vec_info *, void *data, int count, enum vect_cost_for_stmt kind,
> - class _stmt_vec_info *stmt_info, tree vectype, int misalign,
> - enum vect_cost_model_location where),
> - default_add_stmt_cost)
> -
> -/* Target function to calculate the total cost of the current vectorized
> - loop or block. */
> -DEFHOOK
> -(finish_cost,
> - "This hook should complete calculations of the cost of vectorizing a loop\n\
> -or basic block based on @var{data}, and return the prologue, body, and\n\
> -epilogue costs as unsigned integers. The default returns the value of\n\
> -the three accumulators.",
> - void,
> - (void *data, unsigned *prologue_cost, unsigned *body_cost,
> - unsigned *epilogue_cost),
> - default_finish_cost)
> -
> -/* Function to delete target-specific cost modeling data. */
> -DEFHOOK
> -(destroy_cost_data,
> - "This hook should release @var{data} and any related data structures\n\
> -allocated by TARGET_VECTORIZE_INIT_COST. The default releases the\n\
> -accumulator.",
> - void,
> - (void *data),
> - default_destroy_cost_data)
> + class vector_costs *,
> + (vec_info *vinfo, bool costing_for_scalar),
> + default_vectorize_create_costs)
>
> HOOK_VECTOR_END (vectorize)
>
> diff --git a/gcc/targhooks.c b/gcc/targhooks.c
> index cbbcedf790f..ee8798cc84b 100644
> --- a/gcc/targhooks.c
> +++ b/gcc/targhooks.c
> @@ -1474,65 +1474,10 @@ default_empty_mask_is_expensive (unsigned ifn)
> loop body, and epilogue) for a vectorized loop or block. So allocate an
> array of three unsigned ints, set it to zero, and return its address. */
>
> -void *
> -default_init_cost (class loop *loop_info ATTRIBUTE_UNUSED,
> - bool costing_for_scalar ATTRIBUTE_UNUSED)
> -{
> - unsigned *cost = XNEWVEC (unsigned, 3);
> - cost[vect_prologue] = cost[vect_body] = cost[vect_epilogue] = 0;
> - return cost;
> -}
> -
> -/* By default, the cost model looks up the cost of the given statement
> - kind and mode, multiplies it by the occurrence count, accumulates
> - it into the cost specified by WHERE, and returns the cost added. */
> -
> -unsigned
> -default_add_stmt_cost (class vec_info *vinfo, void *data, int count,
> - enum vect_cost_for_stmt kind,
> - class _stmt_vec_info *stmt_info, tree vectype,
> - int misalign,
> - enum vect_cost_model_location where)
> -{
> - unsigned *cost = (unsigned *) data;
> - unsigned retval = 0;
> - int stmt_cost = targetm.vectorize.builtin_vectorization_cost (kind, vectype,
> - misalign);
> - /* Statements in an inner loop relative to the loop being
> - vectorized are weighted more heavily. The value here is
> - arbitrary and could potentially be improved with analysis. */
> - if (where == vect_body && stmt_info
> - && stmt_in_inner_loop_p (vinfo, stmt_info))
> - {
> - loop_vec_info loop_vinfo = dyn_cast<loop_vec_info> (vinfo);
> - gcc_assert (loop_vinfo);
> - count *= LOOP_VINFO_INNER_LOOP_COST_FACTOR (loop_vinfo);
> - }
> -
> - retval = (unsigned) (count * stmt_cost);
> - cost[where] += retval;
> -
> - return retval;
> -}
> -
> -/* By default, the cost model just returns the accumulated costs. */
> -
> -void
> -default_finish_cost (void *data, unsigned *prologue_cost,
> - unsigned *body_cost, unsigned *epilogue_cost)
> -{
> - unsigned *cost = (unsigned *) data;
> - *prologue_cost = cost[vect_prologue];
> - *body_cost = cost[vect_body];
> - *epilogue_cost = cost[vect_epilogue];
> -}
> -
> -/* Free the cost data. */
> -
> -void
> -default_destroy_cost_data (void *data)
> +vector_costs *
> +default_vectorize_create_costs (vec_info *vinfo, bool costing_for_scalar)
> {
> - free (data);
> + return new vector_costs (vinfo, costing_for_scalar);
> }
>
> /* Determine whether or not a pointer mode is valid. Assume defaults
> diff --git a/gcc/targhooks.h b/gcc/targhooks.h
> index 92d51992e62..64f2dc7e4a6 100644
> --- a/gcc/targhooks.h
> +++ b/gcc/targhooks.h
> @@ -118,13 +118,7 @@ extern opt_machine_mode default_vectorize_related_mode (machine_mode,
> poly_uint64);
> extern opt_machine_mode default_get_mask_mode (machine_mode);
> extern bool default_empty_mask_is_expensive (unsigned);
> -extern void *default_init_cost (class loop *, bool);
> -extern unsigned default_add_stmt_cost (class vec_info *, void *, int,
> - enum vect_cost_for_stmt,
> - class _stmt_vec_info *, tree, int,
> - enum vect_cost_model_location);
> -extern void default_finish_cost (void *, unsigned *, unsigned *, unsigned *);
> -extern void default_destroy_cost_data (void *);
> +extern vector_costs *default_vectorize_create_costs (vec_info *, bool);
>
> /* OpenACC hooks. */
> extern bool default_goacc_validate_dims (tree, int [], int, unsigned);
> diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
> index 961c1623f81..201000af425 100644
> --- a/gcc/tree-vect-loop.c
> +++ b/gcc/tree-vect-loop.c
> @@ -814,7 +814,7 @@ bb_in_loop_p (const_basic_block bb, const void *data)
> stmt_vec_info structs for all the stmts in LOOP_IN. */
>
> _loop_vec_info::_loop_vec_info (class loop *loop_in, vec_info_shared *shared)
> - : vec_info (vec_info::loop, init_cost (loop_in, false), shared),
> + : vec_info (vec_info::loop, shared),
> loop (loop_in),
> bbs (XCNEWVEC (basic_block, loop->num_nodes)),
> num_itersm1 (NULL_TREE),
> @@ -1292,18 +1292,18 @@ vect_compute_single_scalar_iteration_cost (loop_vec_info loop_vinfo)
> }
>
> /* Now accumulate cost. */
> - void *target_cost_data = init_cost (loop, true);
> + vector_costs *target_cost_data = init_cost (loop_vinfo, true);
> stmt_info_for_cost *si;
> int j;
> FOR_EACH_VEC_ELT (LOOP_VINFO_SCALAR_ITERATION_COST (loop_vinfo),
> j, si)
> - (void) add_stmt_cost (loop_vinfo, target_cost_data, si->count,
> + (void) add_stmt_cost (target_cost_data, si->count,
> si->kind, si->stmt_info, si->vectype,
> si->misalign, si->where);
> unsigned prologue_cost = 0, body_cost = 0, epilogue_cost = 0;
> finish_cost (target_cost_data, &prologue_cost, &body_cost,
> &epilogue_cost);
> - destroy_cost_data (target_cost_data);
> + delete target_cost_data;
> LOOP_VINFO_SINGLE_SCALAR_ITERATION_COST (loop_vinfo)
> = prologue_cost + body_cost + epilogue_cost;
> }
> @@ -1783,7 +1783,7 @@ vect_analyze_loop_operations (loop_vec_info loop_vinfo)
> }
> } /* bbs */
>
> - add_stmt_costs (loop_vinfo, loop_vinfo->target_cost_data, &cost_vec);
> + add_stmt_costs (loop_vinfo->target_cost_data, &cost_vec);
>
> /* All operations in the loop are either irrelevant (deal with loop
> control, or dead), or only used outside the loop and can be moved
> @@ -2393,6 +2393,8 @@ start_over:
> LOOP_VINFO_INT_NITERS (loop_vinfo));
> }
>
> + LOOP_VINFO_TARGET_COST_DATA (loop_vinfo) = init_cost (loop_vinfo, false);
> +
> /* Analyze the alignment of the data-refs in the loop.
> Fail if a data reference is found that cannot be vectorized. */
>
> @@ -2757,9 +2759,8 @@ again:
> LOOP_VINFO_COMP_ALIAS_DDRS (loop_vinfo).release ();
> LOOP_VINFO_CHECK_UNEQUAL_ADDRS (loop_vinfo).release ();
> /* Reset target cost data. */
> - destroy_cost_data (LOOP_VINFO_TARGET_COST_DATA (loop_vinfo));
> - LOOP_VINFO_TARGET_COST_DATA (loop_vinfo)
> - = init_cost (LOOP_VINFO_LOOP (loop_vinfo), false);
> + delete LOOP_VINFO_TARGET_COST_DATA (loop_vinfo);
> + LOOP_VINFO_TARGET_COST_DATA (loop_vinfo) = nullptr;
> /* Reset accumulated rgroup information. */
> release_vec_loop_controls (&LOOP_VINFO_MASKS (loop_vinfo));
> release_vec_loop_controls (&LOOP_VINFO_LENS (loop_vinfo));
> @@ -3895,7 +3896,7 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
> int scalar_outside_cost = 0;
> int assumed_vf = vect_vf_for_cost (loop_vinfo);
> int npeel = LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo);
> - void *target_cost_data = LOOP_VINFO_TARGET_COST_DATA (loop_vinfo);
> + vector_costs *target_cost_data = LOOP_VINFO_TARGET_COST_DATA (loop_vinfo);
>
> /* Cost model disabled. */
> if (unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
> @@ -3912,7 +3913,7 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
> {
> /* FIXME: Make cost depend on complexity of individual check. */
> unsigned len = LOOP_VINFO_MAY_MISALIGN_STMTS (loop_vinfo).length ();
> - (void) add_stmt_cost (loop_vinfo, target_cost_data, len, vector_stmt,
> + (void) add_stmt_cost (target_cost_data, len, vector_stmt,
> NULL, NULL_TREE, 0, vect_prologue);
> if (dump_enabled_p ())
> dump_printf (MSG_NOTE,
> @@ -3925,12 +3926,12 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
> {
> /* FIXME: Make cost depend on complexity of individual check. */
> unsigned len = LOOP_VINFO_COMP_ALIAS_DDRS (loop_vinfo).length ();
> - (void) add_stmt_cost (loop_vinfo, target_cost_data, len, vector_stmt,
> + (void) add_stmt_cost (target_cost_data, len, vector_stmt,
> NULL, NULL_TREE, 0, vect_prologue);
> len = LOOP_VINFO_CHECK_UNEQUAL_ADDRS (loop_vinfo).length ();
> if (len)
> /* Count LEN - 1 ANDs and LEN comparisons. */
> - (void) add_stmt_cost (loop_vinfo, target_cost_data, len * 2 - 1,
> + (void) add_stmt_cost (target_cost_data, len * 2 - 1,
> scalar_stmt, NULL, NULL_TREE, 0, vect_prologue);
> len = LOOP_VINFO_LOWER_BOUNDS (loop_vinfo).length ();
> if (len)
> @@ -3941,7 +3942,7 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
> for (unsigned int i = 0; i < len; ++i)
> if (!LOOP_VINFO_LOWER_BOUNDS (loop_vinfo)[i].unsigned_p)
> nstmts += 1;
> - (void) add_stmt_cost (loop_vinfo, target_cost_data, nstmts,
> + (void) add_stmt_cost (target_cost_data, nstmts,
> scalar_stmt, NULL, NULL_TREE, 0, vect_prologue);
> }
> if (dump_enabled_p ())
> @@ -3954,7 +3955,7 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
> if (LOOP_REQUIRES_VERSIONING_FOR_NITERS (loop_vinfo))
> {
> /* FIXME: Make cost depend on complexity of individual check. */
> - (void) add_stmt_cost (loop_vinfo, target_cost_data, 1, vector_stmt,
> + (void) add_stmt_cost (target_cost_data, 1, vector_stmt,
> NULL, NULL_TREE, 0, vect_prologue);
> if (dump_enabled_p ())
> dump_printf (MSG_NOTE,
> @@ -3963,7 +3964,7 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
> }
>
> if (LOOP_REQUIRES_VERSIONING (loop_vinfo))
> - (void) add_stmt_cost (loop_vinfo, target_cost_data, 1, cond_branch_taken,
> + (void) add_stmt_cost (target_cost_data, 1, cond_branch_taken,
> NULL, NULL_TREE, 0, vect_prologue);
>
> /* Count statements in scalar loop. Using this as scalar cost for a single
> @@ -4051,7 +4052,7 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
> if (peel_iters_prologue)
> FOR_EACH_VEC_ELT (LOOP_VINFO_SCALAR_ITERATION_COST (loop_vinfo), j, si)
> {
> - (void) add_stmt_cost (loop_vinfo, target_cost_data,
> + (void) add_stmt_cost (target_cost_data,
> si->count * peel_iters_prologue, si->kind,
> si->stmt_info, si->vectype, si->misalign,
> vect_prologue);
> @@ -4061,7 +4062,7 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
> if (peel_iters_epilogue)
> FOR_EACH_VEC_ELT (LOOP_VINFO_SCALAR_ITERATION_COST (loop_vinfo), j, si)
> {
> - (void) add_stmt_cost (loop_vinfo, target_cost_data,
> + (void) add_stmt_cost (target_cost_data,
> si->count * peel_iters_epilogue, si->kind,
> si->stmt_info, si->vectype, si->misalign,
> vect_epilogue);
> @@ -4070,20 +4071,20 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
> /* Add possible cond_branch_taken/cond_branch_not_taken cost. */
>
> if (prologue_need_br_taken_cost)
> - (void) add_stmt_cost (loop_vinfo, target_cost_data, 1, cond_branch_taken,
> + (void) add_stmt_cost (target_cost_data, 1, cond_branch_taken,
> NULL, NULL_TREE, 0, vect_prologue);
>
> if (prologue_need_br_not_taken_cost)
> - (void) add_stmt_cost (loop_vinfo, target_cost_data, 1,
> + (void) add_stmt_cost (target_cost_data, 1,
> cond_branch_not_taken, NULL, NULL_TREE, 0,
> vect_prologue);
>
> if (epilogue_need_br_taken_cost)
> - (void) add_stmt_cost (loop_vinfo, target_cost_data, 1, cond_branch_taken,
> + (void) add_stmt_cost (target_cost_data, 1, cond_branch_taken,
> NULL, NULL_TREE, 0, vect_epilogue);
>
> if (epilogue_need_br_not_taken_cost)
> - (void) add_stmt_cost (loop_vinfo, target_cost_data, 1,
> + (void) add_stmt_cost (target_cost_data, 1,
> cond_branch_not_taken, NULL, NULL_TREE, 0,
> vect_epilogue);
>
> @@ -4111,9 +4112,9 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
> simpler and safer to use the worst-case cost; if this ends up
> being the tie-breaker between vectorizing or not, then it's
> probably better not to vectorize. */
> - (void) add_stmt_cost (loop_vinfo, target_cost_data, num_masks,
> + (void) add_stmt_cost (target_cost_data, num_masks,
> vector_stmt, NULL, NULL_TREE, 0, vect_prologue);
> - (void) add_stmt_cost (loop_vinfo, target_cost_data, num_masks - 1,
> + (void) add_stmt_cost (target_cost_data, num_masks - 1,
> vector_stmt, NULL, NULL_TREE, 0, vect_body);
> }
> else if (LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo))
> @@ -4163,9 +4164,9 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
> body_stmts += 3 * num_vectors;
> }
>
> - (void) add_stmt_cost (loop_vinfo, target_cost_data, prologue_stmts,
> + (void) add_stmt_cost (target_cost_data, prologue_stmts,
> scalar_stmt, NULL, NULL_TREE, 0, vect_prologue);
> - (void) add_stmt_cost (loop_vinfo, target_cost_data, body_stmts,
> + (void) add_stmt_cost (target_cost_data, body_stmts,
> scalar_stmt, NULL, NULL_TREE, 0, vect_body);
> }
>
> diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
> index 709bcb63686..a2eb20faef5 100644
> --- a/gcc/tree-vect-slp.c
> +++ b/gcc/tree-vect-slp.c
> @@ -4355,7 +4355,7 @@ vect_detect_hybrid_slp (loop_vec_info loop_vinfo)
> /* Initialize a bb_vec_info struct for the statements in BBS basic blocks. */
>
> _bb_vec_info::_bb_vec_info (vec<basic_block> _bbs, vec_info_shared *shared)
> - : vec_info (vec_info::bb, init_cost (NULL, false), shared),
> + : vec_info (vec_info::bb, shared),
> bbs (_bbs),
> roots (vNULL)
> {
> @@ -4897,7 +4897,7 @@ vect_slp_analyze_operations (vec_info *vinfo)
> instance->cost_vec = cost_vec;
> else
> {
> - add_stmt_costs (vinfo, vinfo->target_cost_data, &cost_vec);
> + add_stmt_costs (vinfo->target_cost_data, &cost_vec);
> cost_vec.release ();
> }
> }
> @@ -5337,32 +5337,30 @@ vect_bb_vectorization_profitable_p (bb_vec_info bb_vinfo,
> continue;
> }
>
> - void *scalar_target_cost_data = init_cost (NULL, true);
> + class vector_costs *scalar_target_cost_data = init_cost (bb_vinfo, true);
> do
> {
> - add_stmt_cost (bb_vinfo, scalar_target_cost_data,
> - li_scalar_costs[si].second);
> + add_stmt_cost (scalar_target_cost_data, li_scalar_costs[si].second);
> si++;
> }
> while (si < li_scalar_costs.length ()
> && li_scalar_costs[si].first == sl);
> unsigned dummy;
> finish_cost (scalar_target_cost_data, &dummy, &scalar_cost, &dummy);
> - destroy_cost_data (scalar_target_cost_data);
> + delete scalar_target_cost_data;
>
> /* Complete the target-specific vector cost calculation. */
> - void *vect_target_cost_data = init_cost (NULL, false);
> + class vector_costs *vect_target_cost_data = init_cost (bb_vinfo, false);
> do
> {
> - add_stmt_cost (bb_vinfo, vect_target_cost_data,
> - li_vector_costs[vi].second);
> + add_stmt_cost (vect_target_cost_data, li_vector_costs[vi].second);
> vi++;
> }
> while (vi < li_vector_costs.length ()
> && li_vector_costs[vi].first == vl);
> finish_cost (vect_target_cost_data, &vec_prologue_cost,
> &vec_inside_cost, &vec_epilogue_cost);
> - destroy_cost_data (vect_target_cost_data);
> + delete vect_target_cost_data;
>
> vec_outside_cost = vec_prologue_cost + vec_epilogue_cost;
>
> diff --git a/gcc/tree-vectorizer.c b/gcc/tree-vectorizer.c
> index 20daa31187d..0b3a2dd6dc0 100644
> --- a/gcc/tree-vectorizer.c
> +++ b/gcc/tree-vectorizer.c
> @@ -98,11 +98,10 @@ auto_purge_vect_location::~auto_purge_vect_location ()
> /* Dump a cost entry according to args to F. */
>
> void
> -dump_stmt_cost (FILE *f, void *data, int count, enum vect_cost_for_stmt kind,
> +dump_stmt_cost (FILE *f, int count, enum vect_cost_for_stmt kind,
> stmt_vec_info stmt_info, tree, int misalign, unsigned cost,
> enum vect_cost_model_location where)
> {
> - fprintf (f, "%p ", data);
> if (stmt_info)
> {
> print_gimple_expr (f, STMT_VINFO_STMT (stmt_info), 0, TDF_SLIM);
> @@ -457,12 +456,11 @@ shrink_simd_arrays
> /* Initialize the vec_info with kind KIND_IN and target cost data
> TARGET_COST_DATA_IN. */
>
> -vec_info::vec_info (vec_info::vec_kind kind_in, void *target_cost_data_in,
> - vec_info_shared *shared_)
> +vec_info::vec_info (vec_info::vec_kind kind_in, vec_info_shared *shared_)
> : kind (kind_in),
> shared (shared_),
> stmt_vec_info_ro (false),
> - target_cost_data (target_cost_data_in)
> + target_cost_data (nullptr)
> {
> stmt_vec_infos.create (50);
> }
> @@ -472,7 +470,7 @@ vec_info::~vec_info ()
> for (slp_instance &instance : slp_instances)
> vect_free_slp_instance (instance);
>
> - destroy_cost_data (target_cost_data);
> + delete target_cost_data;
> free_stmt_vec_infos ();
> }
>
> @@ -1694,3 +1692,60 @@ scalar_cond_masked_key::get_cond_ops_from_tree (tree t)
> this->op0 = t;
> this->op1 = build_zero_cst (TREE_TYPE (t));
> }
> +
> +/* See the comment above the declaration for details. */
> +
> +unsigned int
> +vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
> + stmt_vec_info stmt_info, tree vectype,
> + int misalign, vect_cost_model_location where)
> +{
> + unsigned int cost
> + = builtin_vectorization_cost (kind, vectype, misalign) * count;
> + return record_stmt_cost (stmt_info, where, cost);
> +}
> +
> +/* See the comment above the declaration for details. */
> +
> +void
> +vector_costs::finish_cost ()
> +{
> + gcc_assert (!m_finished);
> + m_finished = true;
> +}
> +
> +/* Record a base cost of COST units against WHERE. If STMT_INFO is
> + nonnull, use it to adjust the cost based on execution frequency
> + (where appropriate). */
> +
> +unsigned int
> +vector_costs::record_stmt_cost (stmt_vec_info stmt_info,
> + vect_cost_model_location where,
> + unsigned int cost)
> +{
> + cost = adjust_cost_for_freq (stmt_info, where, cost);
> + m_costs[where] += cost;
> + return cost;
> +}
> +
> +/* COST is the base cost we have calculated for an operation in location WHERE.
> + If STMT_INFO is nonnull, use it to adjust the cost based on execution
> + frequency (where appropriate). Return the adjusted cost. */
> +
> +unsigned int
> +vector_costs::adjust_cost_for_freq (stmt_vec_info stmt_info,
> + vect_cost_model_location where,
> + unsigned int cost)
> +{
> + /* Statements in an inner loop relative to the loop being
> + vectorized are weighted more heavily. The value here is
> + arbitrary and could potentially be improved with analysis. */
> + if (where == vect_body
> + && stmt_info
> + && stmt_in_inner_loop_p (m_vinfo, stmt_info))
> + {
> + loop_vec_info loop_vinfo = as_a<loop_vec_info> (m_vinfo);
> + cost *= LOOP_VINFO_INNER_LOOP_COST_FACTOR (loop_vinfo);
> + }
> + return cost;
> +}
> diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
> index 4aa84acff59..44afda2bc9b 100644
> --- a/gcc/tree-vectorizer.h
> +++ b/gcc/tree-vectorizer.h
> @@ -368,7 +368,7 @@ public:
> typedef hash_set<int_hash<machine_mode, E_VOIDmode, E_BLKmode> > mode_set;
> enum vec_kind { bb, loop };
>
> - vec_info (vec_kind, void *, vec_info_shared *);
> + vec_info (vec_kind, vec_info_shared *);
> ~vec_info ();
>
> stmt_vec_info add_stmt (gimple *);
> @@ -406,7 +406,7 @@ public:
> auto_vec<stmt_vec_info> grouped_stores;
>
> /* Cost data used by the target cost model. */
> - void *target_cost_data;
> + class vector_costs *target_cost_data;
>
> /* The set of vector modes used in the vectorized region. */
> mode_set used_vector_modes;
> @@ -1395,6 +1395,103 @@ struct gather_scatter_info {
> #define PURE_SLP_STMT(S) ((S)->slp_type == pure_slp)
> #define STMT_SLP_TYPE(S) (S)->slp_type
>
> +/* Contains the scalar or vector costs for a vec_info. */
> +class vector_costs
> +{
> +public:
> + vector_costs (vec_info *, bool);
> + virtual ~vector_costs () {}
> +
> + /* Update the costs in response to adding COUNT copies of a statement.
> +
> + - WHERE specifies whether the cost occurs in the loop prologue,
> + the loop body, or the loop epilogue.
> + - KIND is the kind of statement, which is always meaningful.
> + - STMT_INFO, if nonnull, describes the statement that will be
> + vectorized.
> + - VECTYPE, if nonnull, is the vector type that the vectorized
> + statement will operate on. Note that this should be used in
> + preference to STMT_VINFO_VECTYPE (STMT_INFO) since the latter
> + is not correct for SLP.
> + - for unaligned_load and unaligned_store statements, MISALIGN is
> + the byte misalignment of the load or store relative to the target's
> + preferred alignment for VECTYPE, or DR_MISALIGNMENT_UNKNOWN
> + if the misalignment is not known.
> +
> + Return the calculated cost as well as recording it. The return
> + value is used for dumping purposes. */
> + virtual unsigned int add_stmt_cost (int count, vect_cost_for_stmt kind,
> + stmt_vec_info stmt_info, tree vectype,
> + int misalign,
> + vect_cost_model_location where);
> +
> + /* Finish calculating the cost of the code. The results can be
> + read back using the functions below. */
> + virtual void finish_cost ();
> +
> + unsigned int prologue_cost () const;
> + unsigned int body_cost () const;
> + unsigned int epilogue_cost () const;
> +
> +protected:
> + unsigned int record_stmt_cost (stmt_vec_info, vect_cost_model_location,
> + unsigned int);
> + unsigned int adjust_cost_for_freq (stmt_vec_info, vect_cost_model_location,
> + unsigned int);
> +
> + /* The region of code that we're considering vectorizing. */
> + vec_info *m_vinfo;
> +
> + /* True if we're costing the scalar code, false if we're costing
> + the vector code. */
> + bool m_costing_for_scalar;
> +
> + /* The costs of the three regions, indexed by vect_cost_model_location. */
> + unsigned int m_costs[3];
> +
> + /* True if finish_cost has been called. */
> + bool m_finished;
> +};
> +
> +/* Create costs for VINFO. COSTING_FOR_SCALAR is true if the costs
> + are for scalar code, false if they are for vector code. */
> +
> +inline
> +vector_costs::vector_costs (vec_info *vinfo, bool costing_for_scalar)
> + : m_vinfo (vinfo),
> + m_costing_for_scalar (costing_for_scalar),
> + m_costs (),
> + m_finished (false)
> +{
> +}
> +
> +/* Return the cost of the prologue code (in abstract units). */
> +
> +inline unsigned int
> +vector_costs::prologue_cost () const
> +{
> + gcc_checking_assert (m_finished);
> + return m_costs[vect_prologue];
> +}
> +
> +/* Return the cost of the body code (in abstract units). */
> +
> +inline unsigned int
> +vector_costs::body_cost () const
> +{
> + gcc_checking_assert (m_finished);
> + return m_costs[vect_body];
> +}
> +
> +/* Return the cost of the epilogue code (in abstract units). */
> +
> +inline unsigned int
> +vector_costs::epilogue_cost () const
> +{
> + gcc_checking_assert (m_finished);
> + return m_costs[vect_epilogue];
> +}
> +
> #define VECT_MAX_COST 1000
>
> /* The maximum number of intermediate steps required in multi-step type
> @@ -1531,29 +1628,28 @@ int vect_get_stmt_cost (enum vect_cost_for_stmt type_of_cost)
>
> /* Alias targetm.vectorize.init_cost. */
>
> -static inline void *
> -init_cost (class loop *loop_info, bool costing_for_scalar)
> +static inline vector_costs *
> +init_cost (vec_info *vinfo, bool costing_for_scalar)
> {
> - return targetm.vectorize.init_cost (loop_info, costing_for_scalar);
> + return targetm.vectorize.create_costs (vinfo, costing_for_scalar);
> }
>
> -extern void dump_stmt_cost (FILE *, void *, int, enum vect_cost_for_stmt,
> +extern void dump_stmt_cost (FILE *, int, enum vect_cost_for_stmt,
> stmt_vec_info, tree, int, unsigned,
> enum vect_cost_model_location);
>
> /* Alias targetm.vectorize.add_stmt_cost. */
>
> static inline unsigned
> -add_stmt_cost (vec_info *vinfo, void *data, int count,
> +add_stmt_cost (vector_costs *costs, int count,
> enum vect_cost_for_stmt kind,
> stmt_vec_info stmt_info, tree vectype, int misalign,
> enum vect_cost_model_location where)
> {
> - unsigned cost = targetm.vectorize.add_stmt_cost (vinfo, data, count, kind,
> - stmt_info, vectype,
> - misalign, where);
> + unsigned cost = costs->add_stmt_cost (count, kind, stmt_info, vectype,
> + misalign, where);
> if (dump_file && (dump_flags & TDF_DETAILS))
> - dump_stmt_cost (dump_file, data, count, kind, stmt_info, vectype, misalign,
> + dump_stmt_cost (dump_file, count, kind, stmt_info, vectype, misalign,
> cost, where);
> return cost;
> }
> @@ -1561,36 +1657,31 @@ add_stmt_cost (vec_info *vinfo, void *data, int count,
> /* Alias targetm.vectorize.add_stmt_cost. */
>
> static inline unsigned
> -add_stmt_cost (vec_info *vinfo, void *data, stmt_info_for_cost *i)
> +add_stmt_cost (vector_costs *costs, stmt_info_for_cost *i)
> {
> - return add_stmt_cost (vinfo, data, i->count, i->kind, i->stmt_info,
> + return add_stmt_cost (costs, i->count, i->kind, i->stmt_info,
> i->vectype, i->misalign, i->where);
> }
>
> /* Alias targetm.vectorize.finish_cost. */
>
> static inline void
> -finish_cost (void *data, unsigned *prologue_cost,
> +finish_cost (vector_costs *costs, unsigned *prologue_cost,
> unsigned *body_cost, unsigned *epilogue_cost)
> {
> - targetm.vectorize.finish_cost (data, prologue_cost, body_cost, epilogue_cost);
> -}
> -
> -/* Alias targetm.vectorize.destroy_cost_data. */
> -
> -static inline void
> -destroy_cost_data (void *data)
> -{
> - targetm.vectorize.destroy_cost_data (data);
> + costs->finish_cost ();
> + *prologue_cost = costs->prologue_cost ();
> + *body_cost = costs->body_cost ();
> + *epilogue_cost = costs->epilogue_cost ();
> }
>
> inline void
> -add_stmt_costs (vec_info *vinfo, void *data, stmt_vector_for_cost *cost_vec)
> +add_stmt_costs (vector_costs *costs, stmt_vector_for_cost *cost_vec)
> {
> stmt_info_for_cost *cost;
> unsigned i;
> FOR_EACH_VEC_ELT (*cost_vec, i, cost)
> - add_stmt_cost (vinfo, data, cost->count, cost->kind, cost->stmt_info,
> + add_stmt_cost (costs, cost->count, cost->kind, cost->stmt_info,
> cost->vectype, cost->misalign, cost->where);
> }
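
The shape of the new interface in the header hunks above (one creation hook, virtual add_stmt_cost, and a finish_cost wrapper that reads the totals back) can be sketched outside GCC roughly as follows. All names here are hypothetical; in particular the use of std::unique_ptr is an assumption of this sketch, whereas the patch stores a plain vector_costs pointer in vec_info.

```cpp
#include <cassert>
#include <memory>

// Stand-in for vect_cost_model_location; the values index m_costs.
enum cost_location { prologue = 0, body = 1, epilogue = 2 };

// Generic cost model: one unit per statement copy.
class toy_costs
{
public:
  virtual ~toy_costs () {}

  virtual unsigned add_stmt_cost (int count, cost_location where)
  {
    return record (where, static_cast<unsigned> (count));
  }

  // May be called once; the totals are readable only afterwards.
  virtual void finish_cost ()
  {
    assert (!m_finished);
    m_finished = true;
  }

  unsigned cost_at (cost_location where) const
  {
    assert (m_finished);
    return m_costs[where];
  }

protected:
  unsigned record (cost_location where, unsigned cost)
  {
    m_costs[where] += cost;
    return cost;
  }

  unsigned m_costs[3] = {};
  bool m_finished = false;
};

// Hypothetical target model: every statement copy costs three units.
class toy_target_costs : public toy_costs
{
public:
  unsigned add_stmt_cost (int count, cost_location where) override
  {
    return record (where, static_cast<unsigned> (count) * 3);
  }
};

// Stand-in for the single remaining hook.  Destruction goes through the
// virtual destructor, so no separate destroy hook is needed.
std::unique_ptr<toy_costs> create_costs (bool use_target_model)
{
  if (use_target_model)
    return std::unique_ptr<toy_costs> (new toy_target_costs ());
  return std::unique_ptr<toy_costs> (new toy_costs ());
}

// Wrapper with the old out-parameter shape, built on the accessors.
void finish_cost (toy_costs &costs, unsigned *prologue_cost,
                  unsigned *body_cost, unsigned *epilogue_cost)
{
  costs.finish_cost ();
  *prologue_cost = costs.cost_at (prologue);
  *body_cost = costs.cost_at (body);
  *epilogue_cost = costs.cost_at (epilogue);
}
```

Callers in this sketch only ever see the base type, so swapping the cost model is a matter of what create_costs returns.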
>
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 76d99d247ae..93388ef9684 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -14523,11 +14523,15 @@ struct aarch64_sve_op_count : aarch64_vec_op_count
> };
>
> /* Information about vector code that we're in the process of costing. */
> -struct aarch64_vector_costs
> +struct aarch64_vector_costs : public vector_costs
> {
> - /* The normal latency-based costs for each region (prologue, body and
> - epilogue), indexed by vect_cost_model_location. */
> - unsigned int region[3] = {};
> + using vector_costs::vector_costs;
> +
> + unsigned int add_stmt_cost (int count, vect_cost_for_stmt kind,
> + stmt_vec_info stmt_info, tree vectype,
> + int misalign,
> + vect_cost_model_location where) override;
> + void finish_cost () override;
>
> /* True if we have performed one-time initialization based on the vec_info.
>
> @@ -14593,11 +14597,11 @@ struct aarch64_vector_costs
> hash_map<nofree_ptr_hash<_stmt_vec_info>, unsigned int> seen_loads;
> };
>
> -/* Implement TARGET_VECTORIZE_INIT_COST. */
> -void *
> -aarch64_init_cost (class loop *, bool)
> +/* Implement TARGET_VECTORIZE_CREATE_COSTS. */
> +vector_costs *
> +aarch64_vectorize_create_costs (vec_info *vinfo, bool costing_for_scalar)
> {
> - return new aarch64_vector_costs;
> + return new aarch64_vector_costs (vinfo, costing_for_scalar);
> }
>
> /* Return true if the current CPU should use the new costs defined
> @@ -15283,7 +15287,7 @@ aarch64_adjust_stmt_cost (vect_cost_for_stmt kind, stmt_vec_info stmt_info,
> }
>
> /* VINFO, COSTS, COUNT, KIND, STMT_INFO and VECTYPE are the same as for
> - TARGET_VECTORIZE_ADD_STMT_COST and they describe an operation in the
> + vector_costs::add_stmt_cost and they describe an operation in the
> body of a vector loop. Record issue information relating to the vector
> operation in OPS, where OPS is one of COSTS->scalar_ops, COSTS->advsimd_ops
> or COSTS->sve_ops; see the comments above those variables for details.
> @@ -15479,32 +15483,29 @@ aarch64_count_ops (class vec_info *vinfo, aarch64_vector_costs *costs,
> }
> }
>
> -/* Implement targetm.vectorize.add_stmt_cost. */
> -static unsigned
> -aarch64_add_stmt_cost (class vec_info *vinfo, void *data, int count,
> - enum vect_cost_for_stmt kind,
> - struct _stmt_vec_info *stmt_info, tree vectype,
> - int misalign, enum vect_cost_model_location where)
> +unsigned
> +aarch64_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
> + stmt_vec_info stmt_info, tree vectype,
> + int misalign,
> + vect_cost_model_location where)
> {
> - auto *costs = static_cast<aarch64_vector_costs *> (data);
> -
> fractional_cost stmt_cost
> = aarch64_builtin_vectorization_cost (kind, vectype, misalign);
>
> bool in_inner_loop_p = (where == vect_body
> && stmt_info
> - && stmt_in_inner_loop_p (vinfo, stmt_info));
> + && stmt_in_inner_loop_p (m_vinfo, stmt_info));
>
> /* Do one-time initialization based on the vinfo. */
> - loop_vec_info loop_vinfo = dyn_cast<loop_vec_info> (vinfo);
> - bb_vec_info bb_vinfo = dyn_cast<bb_vec_info> (vinfo);
> - if (!costs->analyzed_vinfo && aarch64_use_new_vector_costs_p ())
> + loop_vec_info loop_vinfo = dyn_cast<loop_vec_info> (m_vinfo);
> + bb_vec_info bb_vinfo = dyn_cast<bb_vec_info> (m_vinfo);
> + if (!analyzed_vinfo && aarch64_use_new_vector_costs_p ())
> {
> if (loop_vinfo)
> - aarch64_analyze_loop_vinfo (loop_vinfo, costs);
> + aarch64_analyze_loop_vinfo (loop_vinfo, this);
> else
> - aarch64_analyze_bb_vinfo (bb_vinfo, costs);
> - costs->analyzed_vinfo = true;
> + aarch64_analyze_bb_vinfo (bb_vinfo, this);
> + this->analyzed_vinfo = true;
> }
>
> /* Try to get a more accurate cost by looking at STMT_INFO instead
> @@ -15512,7 +15513,7 @@ aarch64_add_stmt_cost (class vec_info *vinfo, void *data, int count,
> if (stmt_info && aarch64_use_new_vector_costs_p ())
> {
> if (vectype && aarch64_sve_only_stmt_p (stmt_info, vectype))
> - costs->saw_sve_only_op = true;
> + this->saw_sve_only_op = true;
>
> /* If we scalarize a strided store, the vectorizer costs one
> vec_to_scalar for each element. However, we can store the first
> @@ -15521,17 +15522,17 @@ aarch64_add_stmt_cost (class vec_info *vinfo, void *data, int count,
> count -= 1;
>
> stmt_cost = aarch64_detect_scalar_stmt_subtype
> - (vinfo, kind, stmt_info, stmt_cost);
> + (m_vinfo, kind, stmt_info, stmt_cost);
>
> - if (vectype && costs->vec_flags)
> - stmt_cost = aarch64_detect_vector_stmt_subtype (vinfo, kind,
> + if (vectype && this->vec_flags)
> + stmt_cost = aarch64_detect_vector_stmt_subtype (m_vinfo, kind,
> stmt_info, vectype,
> where, stmt_cost);
> }
>
> /* Do any SVE-specific adjustments to the cost. */
> if (stmt_info && vectype && aarch64_sve_mode_p (TYPE_MODE (vectype)))
> - stmt_cost = aarch64_sve_adjust_stmt_cost (vinfo, kind, stmt_info,
> + stmt_cost = aarch64_sve_adjust_stmt_cost (m_vinfo, kind, stmt_info,
> vectype, stmt_cost);
>
> if (stmt_info && aarch64_use_new_vector_costs_p ())
> @@ -15547,36 +15548,36 @@ aarch64_add_stmt_cost (class vec_info *vinfo, void *data, int count,
> auto *issue_info = aarch64_tune_params.vec_costs->issue_info;
> if (loop_vinfo
> && issue_info
> - && costs->vec_flags
> + && this->vec_flags
> && where == vect_body
> && (!LOOP_VINFO_LOOP (loop_vinfo)->inner || in_inner_loop_p)
> && vectype
> && stmt_cost != 0)
> {
> /* Record estimates for the scalar code. */
> - aarch64_count_ops (vinfo, costs, count, kind, stmt_info, vectype,
> - 0, &costs->scalar_ops, issue_info->scalar,
> + aarch64_count_ops (m_vinfo, this, count, kind, stmt_info, vectype,
> + 0, &this->scalar_ops, issue_info->scalar,
> vect_nunits_for_cost (vectype));
>
> - if (aarch64_sve_mode_p (vinfo->vector_mode) && issue_info->sve)
> + if (aarch64_sve_mode_p (m_vinfo->vector_mode) && issue_info->sve)
> {
> /* Record estimates for a possible Advanced SIMD version
> of the SVE code. */
> - aarch64_count_ops (vinfo, costs, count, kind, stmt_info,
> - vectype, VEC_ADVSIMD, &costs->advsimd_ops,
> + aarch64_count_ops (m_vinfo, this, count, kind, stmt_info,
> + vectype, VEC_ADVSIMD, &this->advsimd_ops,
> issue_info->advsimd,
> aarch64_estimated_sve_vq ());
>
> /* Record estimates for the SVE code itself. */
> - aarch64_count_ops (vinfo, costs, count, kind, stmt_info,
> - vectype, VEC_ANY_SVE, &costs->sve_ops,
> + aarch64_count_ops (m_vinfo, this, count, kind, stmt_info,
> + vectype, VEC_ANY_SVE, &this->sve_ops,
> issue_info->sve, 1);
> }
> else
> /* Record estimates for the Advanced SIMD code. Treat SVE like
> Advanced SIMD if the CPU has no specific SVE costs. */
> - aarch64_count_ops (vinfo, costs, count, kind, stmt_info,
> - vectype, VEC_ADVSIMD, &costs->advsimd_ops,
> + aarch64_count_ops (m_vinfo, this, count, kind, stmt_info,
> + vectype, VEC_ADVSIMD, &this->advsimd_ops,
> issue_info->advsimd, 1);
> }
>
> @@ -15585,24 +15586,11 @@ aarch64_add_stmt_cost (class vec_info *vinfo, void *data, int count,
> loop. For simplicitly, we assume that one iteration of the
> Advanced SIMD loop would need the same number of statements
> as one iteration of the SVE loop. */
> - if (where == vect_body && costs->unrolled_advsimd_niters)
> - costs->unrolled_advsimd_stmts
> - += count * costs->unrolled_advsimd_niters;
> + if (where == vect_body && this->unrolled_advsimd_niters)
> + this->unrolled_advsimd_stmts
> + += count * this->unrolled_advsimd_niters;
> }
> -
> - /* Statements in an inner loop relative to the loop being
> - vectorized are weighted more heavily. The value here is
> - arbitrary and could potentially be improved with analysis. */
> - if (in_inner_loop_p)
> - {
> - gcc_assert (loop_vinfo);
> - count *= LOOP_VINFO_INNER_LOOP_COST_FACTOR (loop_vinfo); /* FIXME */
> - }
> -
> - unsigned retval = (count * stmt_cost).ceil ();
> - costs->region[where] += retval;
> -
> - return retval;
> + return record_stmt_cost (stmt_info, where, (count * stmt_cost).ceil ());
> }
>
> /* Dump information about the structure. */
> @@ -15966,27 +15954,15 @@ aarch64_adjust_body_cost (aarch64_vector_costs *costs, unsigned int body_cost)
> return body_cost;
> }
>
> -/* Implement TARGET_VECTORIZE_FINISH_COST. */
> -static void
> -aarch64_finish_cost (void *data, unsigned *prologue_cost,
> - unsigned *body_cost, unsigned *epilogue_cost)
> +void
> +aarch64_vector_costs::finish_cost ()
> {
> - auto *costs = static_cast<aarch64_vector_costs *> (data);
> - *prologue_cost = costs->region[vect_prologue];
> - *body_cost = costs->region[vect_body];
> - *epilogue_cost = costs->region[vect_epilogue];
> -
> - if (costs->is_loop
> - && costs->vec_flags
> + if (this->is_loop
> + && this->vec_flags
> && aarch64_use_new_vector_costs_p ())
> - *body_cost = aarch64_adjust_body_cost (costs, *body_cost);
> -}
> + m_costs[vect_body] = aarch64_adjust_body_cost (this, m_costs[vect_body]);
>
> -/* Implement TARGET_VECTORIZE_DESTROY_COST_DATA. */
> -static void
> -aarch64_destroy_cost_data (void *data)
> -{
> - delete static_cast<aarch64_vector_costs *> (data);
> + vector_costs::finish_cost ();
> }
>
> static void initialize_aarch64_code_model (struct gcc_options *);
> @@ -26285,17 +26261,8 @@ aarch64_libgcc_floating_mode_supported_p
> #undef TARGET_ARRAY_MODE_SUPPORTED_P
> #define TARGET_ARRAY_MODE_SUPPORTED_P aarch64_array_mode_supported_p
>
> -#undef TARGET_VECTORIZE_INIT_COST
> -#define TARGET_VECTORIZE_INIT_COST aarch64_init_cost
> -
> -#undef TARGET_VECTORIZE_ADD_STMT_COST
> -#define TARGET_VECTORIZE_ADD_STMT_COST aarch64_add_stmt_cost
> -
> -#undef TARGET_VECTORIZE_FINISH_COST
> -#define TARGET_VECTORIZE_FINISH_COST aarch64_finish_cost
> -
> -#undef TARGET_VECTORIZE_DESTROY_COST_DATA
> -#define TARGET_VECTORIZE_DESTROY_COST_DATA aarch64_destroy_cost_data
> +#undef TARGET_VECTORIZE_CREATE_COSTS
> +#define TARGET_VECTORIZE_CREATE_COSTS aarch64_vectorize_create_costs
>
> #undef TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST
> #define TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST \
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index fb656094e9e..e40ae2b9c49 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -22842,26 +22842,30 @@ ix86_noce_conversion_profitable_p (rtx_insn *seq, struct noce_if_info *if_info)
> return default_noce_conversion_profitable_p (seq, if_info);
> }
>
> -/* Implement targetm.vectorize.init_cost. */
> +/* x86-specific vector costs. */
> +class ix86_vector_costs : public vector_costs
> +{
> + using vector_costs::vector_costs;
> +
> + unsigned int add_stmt_cost (int count, vect_cost_for_stmt kind,
> + stmt_vec_info stmt_info, tree vectype,
> + int misalign,
> + vect_cost_model_location where) override;
> +};
>
> -static void *
> -ix86_init_cost (class loop *, bool)
> +/* Implement targetm.vectorize.create_costs. */
> +
> +static vector_costs *
> +ix86_vectorize_create_costs (vec_info *vinfo, bool costing_for_scalar)
> {
> - unsigned *cost = XNEWVEC (unsigned, 3);
> - cost[vect_prologue] = cost[vect_body] = cost[vect_epilogue] = 0;
> - return cost;
> + return new ix86_vector_costs (vinfo, costing_for_scalar);
> }
>
> -/* Implement targetm.vectorize.add_stmt_cost. */
> -
> -static unsigned
> -ix86_add_stmt_cost (class vec_info *vinfo, void *data, int count,
> - enum vect_cost_for_stmt kind,
> - class _stmt_vec_info *stmt_info, tree vectype,
> - int misalign,
> - enum vect_cost_model_location where)
> +unsigned
> +ix86_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
> + stmt_vec_info stmt_info, tree vectype,
> + int misalign, vect_cost_model_location where)
> {
> - unsigned *cost = (unsigned *) data;
> unsigned retval = 0;
> bool scalar_p
> = (kind == scalar_stmt || kind == scalar_load || kind == scalar_store);
> @@ -23032,15 +23036,7 @@ ix86_add_stmt_cost (class vec_info *vinfo, void *data, int count,
> /* Statements in an inner loop relative to the loop being
> vectorized are weighted more heavily. The value here is
> arbitrary and could potentially be improved with analysis. */
> - if (where == vect_body && stmt_info
> - && stmt_in_inner_loop_p (vinfo, stmt_info))
> - {
> - loop_vec_info loop_vinfo = dyn_cast<loop_vec_info> (vinfo);
> - gcc_assert (loop_vinfo);
> - count *= LOOP_VINFO_INNER_LOOP_COST_FACTOR (loop_vinfo); /* FIXME. */
> - }
> -
> - retval = (unsigned) (count * stmt_cost);
> + retval = adjust_cost_for_freq (stmt_info, where, count * stmt_cost);
>
> /* We need to multiply all vector stmt cost by 1.7 (estimated cost)
> for Silvermont as it has out of order integer pipeline and can execute
> @@ -23055,31 +23051,11 @@ ix86_add_stmt_cost (class vec_info *vinfo, void *data, int count,
> retval = (retval * 17) / 10;
> }
>
> - cost[where] += retval;
> + m_costs[where] += retval;
>
> return retval;
> }
>
> -/* Implement targetm.vectorize.finish_cost. */
> -
> -static void
> -ix86_finish_cost (void *data, unsigned *prologue_cost,
> - unsigned *body_cost, unsigned *epilogue_cost)
> -{
> - unsigned *cost = (unsigned *) data;
> - *prologue_cost = cost[vect_prologue];
> - *body_cost = cost[vect_body];
> - *epilogue_cost = cost[vect_epilogue];
> -}
> -
> -/* Implement targetm.vectorize.destroy_cost_data. */
> -
> -static void
> -ix86_destroy_cost_data (void *data)
> -{
> - free (data);
> -}
> -
> /* Validate target specific memory model bits in VAL. */
>
> static unsigned HOST_WIDE_INT
> @@ -24363,14 +24339,8 @@ ix86_libgcc_floating_mode_supported_p
> ix86_autovectorize_vector_modes
> #undef TARGET_VECTORIZE_GET_MASK_MODE
> #define TARGET_VECTORIZE_GET_MASK_MODE ix86_get_mask_mode
> -#undef TARGET_VECTORIZE_INIT_COST
> -#define TARGET_VECTORIZE_INIT_COST ix86_init_cost
> -#undef TARGET_VECTORIZE_ADD_STMT_COST
> -#define TARGET_VECTORIZE_ADD_STMT_COST ix86_add_stmt_cost
> -#undef TARGET_VECTORIZE_FINISH_COST
> -#define TARGET_VECTORIZE_FINISH_COST ix86_finish_cost
> -#undef TARGET_VECTORIZE_DESTROY_COST_DATA
> -#define TARGET_VECTORIZE_DESTROY_COST_DATA ix86_destroy_cost_data
> +#undef TARGET_VECTORIZE_CREATE_COSTS
> +#define TARGET_VECTORIZE_CREATE_COSTS ix86_vectorize_create_costs
>
> #undef TARGET_SET_CURRENT_FUNCTION
> #define TARGET_SET_CURRENT_FUNCTION ix86_set_current_function
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 01a95591a5d..2e7b3bcad7e 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -1452,14 +1452,8 @@ static const struct attribute_spec rs6000_attribute_table[] =
> #undef TARGET_VECTORIZE_PREFERRED_SIMD_MODE
> #define TARGET_VECTORIZE_PREFERRED_SIMD_MODE \
> rs6000_preferred_simd_mode
> -#undef TARGET_VECTORIZE_INIT_COST
> -#define TARGET_VECTORIZE_INIT_COST rs6000_init_cost
> -#undef TARGET_VECTORIZE_ADD_STMT_COST
> -#define TARGET_VECTORIZE_ADD_STMT_COST rs6000_add_stmt_cost
> -#undef TARGET_VECTORIZE_FINISH_COST
> -#define TARGET_VECTORIZE_FINISH_COST rs6000_finish_cost
> -#undef TARGET_VECTORIZE_DESTROY_COST_DATA
> -#define TARGET_VECTORIZE_DESTROY_COST_DATA rs6000_destroy_cost_data
> +#undef TARGET_VECTORIZE_CREATE_COSTS
> +#define TARGET_VECTORIZE_CREATE_COSTS rs6000_vectorize_create_costs
>
> #undef TARGET_LOOP_UNROLL_ADJUST
> #define TARGET_LOOP_UNROLL_ADJUST rs6000_loop_unroll_adjust
> @@ -5263,21 +5257,33 @@ rs6000_preferred_simd_mode (scalar_mode mode)
> return word_mode;
> }
>
> -struct rs6000_cost_data
> +class rs6000_cost_data : public vector_costs
> {
> - struct loop *loop_info;
> - unsigned cost[3];
> +public:
> + using vector_costs::vector_costs;
> +
> + unsigned int add_stmt_cost (int count, vect_cost_for_stmt kind,
> + stmt_vec_info stmt_info, tree vectype,
> + int misalign,
> + vect_cost_model_location where) override;
> + void finish_cost () override;
> +
> +protected:
> + void update_target_cost_per_stmt (vect_cost_for_stmt, stmt_vec_info,
> + vect_cost_model_location, int,
> + unsigned int);
> + void density_test (loop_vec_info);
> + void adjust_vect_cost_per_loop (loop_vec_info);
> +
> /* Total number of vectorized stmts (loop only). */
> - unsigned nstmts;
> + unsigned m_nstmts = 0;
> /* Total number of loads (loop only). */
> - unsigned nloads;
> + unsigned m_nloads = 0;
> /* Possible extra penalized cost on vector construction (loop only). */
> - unsigned extra_ctor_cost;
> + unsigned m_extra_ctor_cost = 0;
> /* For each vectorized loop, this var holds TRUE iff a non-memory vector
> instruction is needed by the vectorization. */
> - bool vect_nonmem;
> - /* Indicates this is costing for the scalar version of a loop or block. */
> - bool costing_for_scalar;
> + bool m_vect_nonmem = false;
> };
>
> /* Test for likely overcommitment of vector hardware resources. If a
> @@ -5286,20 +5292,19 @@ struct rs6000_cost_data
> adequately reflect delays from unavailable vector resources.
> Penalize the loop body cost for this case. */
>
> -static void
> -rs6000_density_test (rs6000_cost_data *data)
> +void
> +rs6000_cost_data::density_test (loop_vec_info loop_vinfo)
> {
> /* This density test only cares about the cost of vector version of the
> loop, so immediately return if we are passed costing for the scalar
> version (namely computing single scalar iteration cost). */
> - if (data->costing_for_scalar)
> + if (m_costing_for_scalar)
> return;
>
> - struct loop *loop = data->loop_info;
> + struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
> basic_block *bbs = get_loop_body (loop);
> int nbbs = loop->num_nodes;
> - loop_vec_info loop_vinfo = loop_vec_info_for_loop (data->loop_info);
> - int vec_cost = data->cost[vect_body], not_vec_cost = 0;
> + int vec_cost = m_costs[vect_body], not_vec_cost = 0;
>
> for (int i = 0; i < nbbs; i++)
> {
> @@ -5326,7 +5331,7 @@ rs6000_density_test (rs6000_cost_data *data)
> if (density_pct > rs6000_density_pct_threshold
> && vec_cost + not_vec_cost > rs6000_density_size_threshold)
> {
> - data->cost[vect_body] = vec_cost * (100 + rs6000_density_penalty) / 100;
> + m_costs[vect_body] = vec_cost * (100 + rs6000_density_penalty) / 100;
> if (dump_enabled_p ())
> dump_printf_loc (MSG_NOTE, vect_location,
> "density %d%%, cost %d exceeds threshold, penalizing "
> @@ -5336,10 +5341,10 @@ rs6000_density_test (rs6000_cost_data *data)
>
> /* Check whether we need to penalize the body cost to account
> for excess strided or elementwise loads. */
> - if (data->extra_ctor_cost > 0)
> + if (m_extra_ctor_cost > 0)
> {
> - gcc_assert (data->nloads <= data->nstmts);
> - unsigned int load_pct = (data->nloads * 100) / data->nstmts;
> + gcc_assert (m_nloads <= m_nstmts);
> + unsigned int load_pct = (m_nloads * 100) / m_nstmts;
>
> /* It's likely to be bounded by latency and execution resources
> from many scalar loads which are strided or elementwise loads
> @@ -5351,10 +5356,10 @@ rs6000_density_test (rs6000_cost_data *data)
> the loads.
> One typical case is the innermost loop of the hotspot of SPEC2017
> 503.bwaves_r without loop interchange. */
> - if (data->nloads > (unsigned int) rs6000_density_load_num_threshold
> + if (m_nloads > (unsigned int) rs6000_density_load_num_threshold
> && load_pct > (unsigned int) rs6000_density_load_pct_threshold)
> {
> - data->cost[vect_body] += data->extra_ctor_cost;
> + m_costs[vect_body] += m_extra_ctor_cost;
> if (dump_enabled_p ())
> dump_printf_loc (MSG_NOTE, vect_location,
> "Found %u loads and "
> @@ -5363,28 +5368,18 @@ rs6000_density_test (rs6000_cost_data *data)
> "penalizing loop body "
> "cost by extra cost %u "
> "for ctor.\n",
> - data->nloads, load_pct,
> - data->extra_ctor_cost);
> + m_nloads, load_pct,
> + m_extra_ctor_cost);
> }
> }
> }
>
> -/* Implement targetm.vectorize.init_cost. */
> +/* Implement targetm.vectorize.create_costs. */
>
> -static void *
> -rs6000_init_cost (struct loop *loop_info, bool costing_for_scalar)
> +static vector_costs *
> +rs6000_vectorize_create_costs (vec_info *vinfo, bool costing_for_scalar)
> {
> - rs6000_cost_data *data = XNEW (rs6000_cost_data);
> - data->loop_info = loop_info;
> - data->cost[vect_prologue] = 0;
> - data->cost[vect_body] = 0;
> - data->cost[vect_epilogue] = 0;
> - data->vect_nonmem = false;
> - data->nstmts = 0;
> - data->nloads = 0;
> - data->extra_ctor_cost = 0;
> - data->costing_for_scalar = costing_for_scalar;
> - return data;
> + return new rs6000_cost_data (vinfo, costing_for_scalar);
> }
>
> /* Adjust vectorization cost after calling rs6000_builtin_vectorization_cost.
> @@ -5413,13 +5408,12 @@ rs6000_adjust_vect_cost_per_stmt (enum vect_cost_for_stmt kind,
> /* Helper function for add_stmt_cost. Check each statement cost
> entry, gather information and update the target_cost fields
> accordingly. */
> -static void
> -rs6000_update_target_cost_per_stmt (rs6000_cost_data *data,
> - enum vect_cost_for_stmt kind,
> - struct _stmt_vec_info *stmt_info,
> - enum vect_cost_model_location where,
> - int stmt_cost,
> - unsigned int orig_count)
> +void
> +rs6000_cost_data::update_target_cost_per_stmt (vect_cost_for_stmt kind,
> + stmt_vec_info stmt_info,
> + vect_cost_model_location where,
> + int stmt_cost,
> + unsigned int orig_count)
> {
>
> /* Check whether we're doing something other than just a copy loop.
> @@ -5431,17 +5425,19 @@ rs6000_update_target_cost_per_stmt (rs6000_cost_data *data,
> || kind == vec_construct
> || kind == scalar_to_vec
> || (where == vect_body && kind == vector_stmt))
> - data->vect_nonmem = true;
> + m_vect_nonmem = true;
>
> /* Gather some information when we are costing the vectorized instruction
> for the statements located in a loop body. */
> - if (!data->costing_for_scalar && data->loop_info && where == vect_body)
> + if (!m_costing_for_scalar
> + && is_a<loop_vec_info> (m_vinfo)
> + && where == vect_body)
> {
> - data->nstmts += orig_count;
> + m_nstmts += orig_count;
>
> if (kind == scalar_load || kind == vector_load
> || kind == unaligned_load || kind == vector_gather_load)
> - data->nloads += orig_count;
> + m_nloads += orig_count;
>
> /* Power processors do not currently have instructions for strided
> and elementwise loads, and instead we must generate multiple
> @@ -5469,20 +5465,16 @@ rs6000_update_target_cost_per_stmt (rs6000_cost_data *data,
> const unsigned int MAX_PENALIZED_COST_FOR_CTOR = 12;
> if (extra_cost > MAX_PENALIZED_COST_FOR_CTOR)
> extra_cost = MAX_PENALIZED_COST_FOR_CTOR;
> - data->extra_ctor_cost += extra_cost;
> + m_extra_ctor_cost += extra_cost;
> }
> }
> }
>
> -/* Implement targetm.vectorize.add_stmt_cost. */
> -
> -static unsigned
> -rs6000_add_stmt_cost (class vec_info *vinfo, void *data, int count,
> - enum vect_cost_for_stmt kind,
> - struct _stmt_vec_info *stmt_info, tree vectype,
> - int misalign, enum vect_cost_model_location where)
> +unsigned
> +rs6000_cost_data::add_stmt_cost (int count, vect_cost_for_stmt kind,
> + stmt_vec_info stmt_info, tree vectype,
> + int misalign, vect_cost_model_location where)
> {
> - rs6000_cost_data *cost_data = (rs6000_cost_data*) data;
> unsigned retval = 0;
>
> if (flag_vect_cost_model)
> @@ -5494,19 +5486,11 @@ rs6000_add_stmt_cost (class vec_info *vinfo, void *data, int count,
> vectorized are weighted more heavily. The value here is
> arbitrary and could potentially be improved with analysis. */
> unsigned int orig_count = count;
> - if (where == vect_body && stmt_info
> - && stmt_in_inner_loop_p (vinfo, stmt_info))
> - {
> - loop_vec_info loop_vinfo = dyn_cast<loop_vec_info> (vinfo);
> - gcc_assert (loop_vinfo);
> - count *= LOOP_VINFO_INNER_LOOP_COST_FACTOR (loop_vinfo); /* FIXME. */
> - }
> -
> - retval = (unsigned) (count * stmt_cost);
> - cost_data->cost[where] += retval;
> + retval = adjust_cost_for_freq (stmt_info, where, count * stmt_cost);
> + m_costs[where] += retval;
>
> - rs6000_update_target_cost_per_stmt (cost_data, kind, stmt_info, where,
> - stmt_cost, orig_count);
> + update_target_cost_per_stmt (kind, stmt_info, where,
> + stmt_cost, orig_count);
> }
>
> return retval;
> @@ -5518,13 +5502,9 @@ rs6000_add_stmt_cost (class vec_info *vinfo, void *data, int count,
> vector with length by counting number of required lengths under condition
> LOOP_VINFO_FULLY_WITH_LENGTH_P. */
>
> -static void
> -rs6000_adjust_vect_cost_per_loop (rs6000_cost_data *data)
> +void
> +rs6000_cost_data::adjust_vect_cost_per_loop (loop_vec_info loop_vinfo)
> {
> - struct loop *loop = data->loop_info;
> - gcc_assert (loop);
> - loop_vec_info loop_vinfo = loop_vec_info_for_loop (loop);
> -
> if (LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo))
> {
> rgroup_controls *rgc;
> @@ -5535,49 +5515,29 @@ rs6000_adjust_vect_cost_per_loop (rs6000_cost_data *data)
> /* Each length needs one shift to fill into bits 0-7. */
> shift_cnt += num_vectors_m1 + 1;
>
> - rs6000_add_stmt_cost (loop_vinfo, (void *) data, shift_cnt, scalar_stmt,
> - NULL, NULL_TREE, 0, vect_body);
> + add_stmt_cost (shift_cnt, scalar_stmt, NULL, NULL_TREE, 0, vect_body);
> }
> }
>
> -/* Implement targetm.vectorize.finish_cost. */
> -
> -static void
> -rs6000_finish_cost (void *data, unsigned *prologue_cost,
> - unsigned *body_cost, unsigned *epilogue_cost)
> +void
> +rs6000_cost_data::finish_cost ()
> {
> - rs6000_cost_data *cost_data = (rs6000_cost_data*) data;
> -
> - if (cost_data->loop_info)
> + if (loop_vec_info loop_vinfo = dyn_cast<loop_vec_info> (m_vinfo))
> {
> - rs6000_adjust_vect_cost_per_loop (cost_data);
> - rs6000_density_test (cost_data);
> - }
> + adjust_vect_cost_per_loop (loop_vinfo);
> + density_test (loop_vinfo);
>
> - /* Don't vectorize minimum-vectorization-factor, simple copy loops
> - that require versioning for any reason. The vectorization is at
> - best a wash inside the loop, and the versioning checks make
> - profitability highly unlikely and potentially quite harmful. */
> - if (cost_data->loop_info)
> - {
> - loop_vec_info vec_info = loop_vec_info_for_loop (cost_data->loop_info);
> - if (!cost_data->vect_nonmem
> - && LOOP_VINFO_VECT_FACTOR (vec_info) == 2
> - && LOOP_REQUIRES_VERSIONING (vec_info))
> - cost_data->cost[vect_body] += 10000;
> + /* Don't vectorize minimum-vectorization-factor, simple copy loops
> + that require versioning for any reason. The vectorization is at
> + best a wash inside the loop, and the versioning checks make
> + profitability highly unlikely and potentially quite harmful. */
> + if (!m_vect_nonmem
> + && LOOP_VINFO_VECT_FACTOR (loop_vinfo) == 2
> + && LOOP_REQUIRES_VERSIONING (loop_vinfo))
> + m_costs[vect_body] += 10000;
> }
>
> - *prologue_cost = cost_data->cost[vect_prologue];
> - *body_cost = cost_data->cost[vect_body];
> - *epilogue_cost = cost_data->cost[vect_epilogue];
> -}
> -
> -/* Implement targetm.vectorize.destroy_cost_data. */
> -
> -static void
> -rs6000_destroy_cost_data (void *data)
> -{
> - free (data);
> + vector_costs::finish_cost ();
> }
>
> /* Implement targetm.loop_unroll_adjust. */
>
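For readers skimming the hunk above, the shape of the converted finish_cost can be sketched in isolation: the derived class applies its target-specific penalty and then defers to the base class, instead of copying three costs out through pointer arguments. This is only an illustrative sketch, not the patch's code. Every name below (base_costs, target_costs, cost_location, sketch_body_cost) is a hypothetical stand-in, and the flags passed to the constructor merely imitate what the real code reads from loop_vinfo.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; these are not GCC's
// real declarations.
enum cost_location { loc_prologue, loc_body, loc_epilogue };

// Base class: owns the three cost buckets so targets need not.
class base_costs
{
public:
  virtual ~base_costs () {}

  void add_cost (cost_location where, unsigned amount)
  {
    assert (where <= loc_epilogue);
    m_costs[where] += amount;
  }

  // Common finalization lives here; targets call it last.
  virtual void finish_cost () { m_finished = true; }

  unsigned body_cost () const { return m_costs[loc_body]; }
  bool finished () const { return m_finished; }

protected:
  unsigned m_costs[3] = { 0, 0, 0 };
  bool m_finished = false;
};

// Derived class: only the target-specific parts remain.
class target_costs : public base_costs
{
public:
  target_costs (unsigned vf, bool needs_versioning)
    : m_vf (vf), m_needs_versioning (needs_versioning) {}

  // Record a cost and remember whether any non-memory statement was seen.
  void add_cost (cost_location where, unsigned amount, bool nonmem)
  {
    if (nonmem)
      m_saw_nonmem = true;
    base_costs::add_cost (where, amount);
  }

  void finish_cost () override
  {
    // Penalize simple copy loops at the minimum vectorization factor
    // that also need versioning (mirrors the condition in the hunk).
    if (!m_saw_nonmem && m_vf == 2 && m_needs_versioning)
      m_costs[loc_body] += 10000;
    base_costs::finish_cost ();
  }

private:
  unsigned m_vf;
  bool m_needs_versioning;
  bool m_saw_nonmem = false;
};

// Small driver: cost a loop body of one memory statement (cost 4) and,
// optionally, one non-memory statement (cost 1).
unsigned
sketch_body_cost (unsigned vf, bool needs_versioning, bool has_nonmem)
{
  target_costs costs (vf, needs_versioning);
  costs.add_cost (loc_body, 4, /*nonmem=*/false);
  if (has_nonmem)
    costs.add_cost (loc_body, 1, /*nonmem=*/true);
  costs.finish_cost ();
  return costs.body_cost ();
}
```

With these made-up inputs, only the pure-copy loop at a factor of 2 that needs versioning picks up the 10000 penalty; the other combinations keep their plain body cost.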
--
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
Thread overview (2 messages):
2021-10-14 13:04 Richard Sandiford
2021-10-21 12:29 Richard Biener [this message]