Re: [RFC] vect: Convert cost hooks to classes

From: Richard Biener <rguenther@suse.de>
To: Richard Sandiford <richard.sandiford@arm.com>
Cc: gcc-patches@gcc.gnu.org, hubicka@ucw.cz, ubizjak@gmail.com,
	 kirill.yukhin@gmail.com, hongtao.liu@intel.com,
	dje.gcc@gmail.com,  segher@kernel.crashing.org
Subject: Re: [RFC] vect: Convert cost hooks to classes
Date: Thu, 21 Oct 2021 14:29:13 +0200 (CEST)
Message-ID: <r53843o4-83o-1q20-2556-spps5o363811@fhfr.qr>
In-Reply-To: <mpta6jbep0l.fsf@arm.com>

On Thu, 14 Oct 2021, Richard Sandiford wrote:

> The current vector cost interface has quite a bit of redundancy
> built in.  Each target that defines its own hooks has to replicate
> the basic unsigned[3] management.  Currently each target also
> duplicates the cost adjustment for inner loops.
> 
> This patch instead defines a vector_costs class for holding
> the scalar or vector cost and allows targets to subclass it.
> There is then only one costing hook: to create a new costs
> structure of the appropriate type.  Everything else can be
> virtual functions, with common concepts implemented in the
> base class rather than in each target's derivation.
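
To make the shape of that design concrete, here is a minimal standalone
sketch, not the patch's actual code.  The names costs_base,
toy_target_costs, create_costs, cost_location and demo_body_cost are
hypothetical stand-ins, the bool in_inner_loop replaces the real
stmt_vec_info lookup, and std::unique_ptr is used only to keep the
sketch short (the patch itself uses plain new/delete).  The base class
owns the three accumulators and the inner-loop weighting, and the
single "hook" is a factory.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins for the prologue/body/epilogue cost locations.
enum cost_location { loc_prologue = 0, loc_body = 1, loc_epilogue = 2 };

// Base class: owns the three accumulators and the shared inner-loop
// weighting, so that targets no longer need to duplicate either.
class costs_base
{
public:
  explicit costs_base (unsigned inner_loop_factor)
    : m_inner_loop_factor (inner_loop_factor) {}
  virtual ~costs_base () = default;

  // Default per-statement costing; a target may override this.
  virtual unsigned add_stmt_cost (int count, unsigned unit_cost,
                                  bool in_inner_loop, cost_location where)
  {
    return record_stmt_cost (in_inner_loop, where, unit_cost * count);
  }

  // Mark the costs as final; may only be called once.
  virtual void finish_cost ()
  {
    assert (!m_finished);
    m_finished = true;
  }

  unsigned cost (cost_location where) const { return m_costs[where]; }

protected:
  // Apply the frequency adjustment, then accumulate into WHERE.
  unsigned record_stmt_cost (bool in_inner_loop, cost_location where,
                             unsigned cost)
  {
    cost = adjust_cost_for_freq (in_inner_loop, where, cost);
    m_costs[where] += cost;
    return cost;
  }

  // Only loop-body statements in an inner loop are weighted up.
  unsigned adjust_cost_for_freq (bool in_inner_loop, cost_location where,
                                 unsigned cost) const
  {
    if (where == loc_body && in_inner_loop)
      cost *= m_inner_loop_factor;
    return cost;
  }

  unsigned m_costs[3] = { 0, 0, 0 };
  unsigned m_inner_loop_factor;
  bool m_finished = false;
};

// Hypothetical target subclass with a made-up rule (double every unit
// cost); it reuses the base-class accumulation and weighting.
class toy_target_costs : public costs_base
{
public:
  using costs_base::costs_base;

  unsigned add_stmt_cost (int count, unsigned unit_cost,
                          bool in_inner_loop, cost_location where) override
  {
    return record_stmt_cost (in_inner_loop, where, 2 * unit_cost * count);
  }
};

// The single "hook": create a costs object of the appropriate type.
std::unique_ptr<costs_base>
create_costs (bool use_target_model, unsigned inner_loop_factor)
{
  if (use_target_model)
    return std::make_unique<toy_target_costs> (inner_loop_factor);
  return std::make_unique<costs_base> (inner_loop_factor);
}

// Usage: cost a few statements and return the accumulated body cost.
unsigned
demo_body_cost (bool use_target_model)
{
  auto costs = create_costs (use_target_model, 50);
  costs->add_stmt_cost (3, 1, false, loc_body);     // plain body statements
  costs->add_stmt_cost (1, 1, true, loc_body);      // inner loop: weighted
  costs->add_stmt_cost (4, 1, true, loc_prologue);  // prologue: not weighted
  costs->finish_cost ();
  return costs->cost (loc_body);
}
```

With an inner-loop factor of 50, the default model accumulates 3 + 50
in the body and the toy target accumulates 6 + 100; neither has to
repeat the array management or the inner-loop adjustment.
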
> 
> This might seem like excess C++-ification, but it shaves
> ~100 LOC.  I've also got some follow-on changes that become
> significantly easier with this patch.  Maybe it could help
> with things like weighting blocks based on frequency too.
> 
> This will clash with Andre's unrolling patches.  His patches
> have priority so this patch should queue behind them.
> 
> The x86 and rs6000 parts fully convert to a self-contained class.
> The equivalent aarch64 changes are more complex, so this patch
> just does the bare minimum.  A later patch will rework the
> aarch64 bits.
> 
> Tested on aarch64-linux-gnu, arm-linux-gnueabihf, x86_64-linux-gnu
> and powerpc64le-linux-gnu.  WDYT?

I like it!  Thus OK.

I suggested something similar to Martin for the backend state in
'[PATCH 3/N] Come up with casm global state.': abstracting
varasm global state and allowing targets to override it
via the adjusted init_sections target hook.

Richard.

> Richard
> 
> 
> gcc/
> 	* target.def (targetm.vectorize.init_cost): Replace with...
> 	(targetm.vectorize.create_costs): ...this.
> 	(targetm.vectorize.add_stmt_cost): Delete.
> 	(targetm.vectorize.finish_cost): Likewise.
> 	(targetm.vectorize.destroy_cost_data): Likewise.
> 	* doc/tm.texi.in (TARGET_VECTORIZE_INIT_COST): Replace with...
> 	(TARGET_VECTORIZE_CREATE_COSTS): ...this.
> 	(TARGET_VECTORIZE_ADD_STMT_COST): Delete.
> 	(TARGET_VECTORIZE_FINISH_COST): Likewise.
> 	(TARGET_VECTORIZE_DESTROY_COST_DATA): Likewise.
> 	* doc/tm.texi: Regenerate.
> 	* tree-vectorizer.h (vec_info::vec_info): Remove target_cost_data
> 	parameter.
> 	(vec_info::target_cost_data): Change from a void * to a vector_costs *.
> 	(vector_costs): New class.
> 	(init_cost): Take a vec_info and return a vector_costs.
> 	(dump_stmt_cost): Remove data parameter.
> 	(add_stmt_cost): Replace vinfo and data parameters with a vector_costs.
> 	(add_stmt_costs): Likewise.
> 	(finish_cost): Replace data parameter with a vector_costs.
> 	(destroy_cost_data): Delete.
> 	* tree-vectorizer.c (dump_stmt_cost): Remove data argument and
> 	don't print it.
> 	(vec_info::vec_info): Remove the target_cost_data parameter and
> 	initialize the member variable to null instead.
> 	(vec_info::~vec_info): Delete target_cost_data instead of calling
> 	destroy_cost_data.
> 	(vector_costs::add_stmt_cost): New function.
> 	(vector_costs::finish_cost): Likewise.
> 	(vector_costs::record_stmt_cost): Likewise.
> 	(vector_costs::adjust_cost_for_freq): Likewise.
> 	* tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Update
> 	call to vec_info::vec_info.
> 	(vect_compute_single_scalar_iteration_cost): Update after above
> 	changes to costing interface.
> 	(vect_analyze_loop_operations): Likewise.
> 	(vect_estimate_min_profitable_iters): Likewise.
> 	(vect_analyze_loop_2): Initialize LOOP_VINFO_TARGET_COST_DATA
> 	at the start_over point, where it needs to be recreated after
> 	trying without slp.  Update retry code accordingly.
> 	* tree-vect-slp.c (_bb_vec_info::_bb_vec_info): Update call
> 	to vec_info::vec_info.
> 	(vect_slp_analyze_operation): Update after above changes to costing
> 	interface.
> 	(vect_bb_vectorization_profitable_p): Likewise.
> 	* targhooks.h (default_init_cost): Replace with...
> 	(default_vectorize_create_costs): ...this.
> 	(default_add_stmt_cost): Delete.
> 	(default_finish_cost, default_destroy_cost_data): Likewise.
> 	* targhooks.c (default_init_cost): Replace with...
> 	(default_vectorize_create_costs): ...this.
> 	(default_add_stmt_cost): Delete, moving logic to vector_costs instead.
> 	(default_finish_cost, default_destroy_cost_data): Delete.
> 	* config/aarch64/aarch64.c (aarch64_vector_costs): Inherit from
> 	vector_costs.  Add a constructor.
> 	(aarch64_init_cost): Replace with...
> 	(aarch64_vectorize_create_costs): ...this.
> 	(aarch64_add_stmt_cost): Replace with...
> 	(aarch64_vector_costs::add_stmt_cost): ...this.  Use record_stmt_cost
> 	to adjust the cost for inner loops.
> 	(aarch64_finish_cost): Replace with...
> 	(aarch64_vector_costs::finish_cost): ...this.
> 	(aarch64_destroy_cost_data): Delete.
> 	(TARGET_VECTORIZE_INIT_COST): Replace with...
> 	(TARGET_VECTORIZE_CREATE_COSTS): ...this.
> 	(TARGET_VECTORIZE_ADD_STMT_COST): Delete.
> 	(TARGET_VECTORIZE_FINISH_COST): Likewise.
> 	(TARGET_VECTORIZE_DESTROY_COST_DATA): Likewise.
> 	* config/i386/i386.c (ix86_vector_costs): New structure.
> 	(ix86_init_cost): Replace with...
> 	(ix86_vectorize_create_costs): ...this.
> 	(ix86_add_stmt_cost): Replace with...
> 	(ix86_vector_costs::add_stmt_cost): ...this.  Use adjust_cost_for_freq
> 	to adjust the cost for inner loops.
> 	(ix86_finish_cost, ix86_destroy_cost_data): Delete.
> 	(TARGET_VECTORIZE_INIT_COST): Replace with...
> 	(TARGET_VECTORIZE_CREATE_COSTS): ...this.
> 	(TARGET_VECTORIZE_ADD_STMT_COST): Delete.
> 	(TARGET_VECTORIZE_FINISH_COST): Likewise.
> 	(TARGET_VECTORIZE_DESTROY_COST_DATA): Likewise.
> 	* config/rs6000/rs6000.c (TARGET_VECTORIZE_INIT_COST): Replace with...
> 	(TARGET_VECTORIZE_CREATE_COSTS): ...this.
> 	(TARGET_VECTORIZE_ADD_STMT_COST): Delete.
> 	(TARGET_VECTORIZE_FINISH_COST): Likewise.
> 	(TARGET_VECTORIZE_DESTROY_COST_DATA): Likewise.
> 	(rs6000_cost_data): Inherit from vector_costs.
> 	Add a constructor.  Drop loop_info, cost and costing_for_scalar
> 	in favor of the corresponding vector_costs member variables.
> 	Add "m_" to the names of the remaining member variables and
> 	initialize them.
> 	(rs6000_density_test): Replace with...
> 	(rs6000_cost_data::density_test): ...this.
> 	(rs6000_init_cost): Replace with...
> 	(rs6000_vectorize_create_costs): ...this.
> 	(rs6000_update_target_cost_per_stmt): Replace with...
> 	(rs6000_cost_data::update_target_cost_per_stmt): ...this.
> 	(rs6000_add_stmt_cost): Replace with...
> 	(rs6000_cost_data::add_stmt_cost): ...this.  Use adjust_cost_for_freq
> 	to adjust the cost for inner loops.
> 	(rs6000_adjust_vect_cost_per_loop): Replace with...
> 	(rs6000_cost_data::adjust_vect_cost_per_loop): ...this.
> 	(rs6000_finish_cost): Replace with...
> 	(rs6000_cost_data::finish_cost): ...this.  Group loop code
> 	into a single if statement and pass the loop_vinfo down to
> 	subroutines.
> 	(rs6000_destroy_cost_data): Delete.
> ---
>  gcc/config/aarch64/aarch64.c | 137 ++++++++++--------------
>  gcc/config/i386/i386.c       |  76 ++++----------
>  gcc/config/rs6000/rs6000.c   | 196 ++++++++++++++---------------------
>  gcc/doc/tm.texi              |  25 +----
>  gcc/doc/tm.texi.in           |   8 +-
>  gcc/target.def               |  49 +--------
>  gcc/targhooks.c              |  61 +----------
>  gcc/targhooks.h              |   8 +-
>  gcc/tree-vect-loop.c         |  51 ++++-----
>  gcc/tree-vect-slp.c          |  18 ++--
>  gcc/tree-vectorizer.c        |  67 ++++++++++--
>  gcc/tree-vectorizer.h        | 141 ++++++++++++++++++++-----
>  12 files changed, 374 insertions(+), 463 deletions(-)
> 
> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> index 902402d7503..f2e90990d9f 100644
> --- a/gcc/doc/tm.texi
> +++ b/gcc/doc/tm.texi
> @@ -6255,7 +6255,7 @@ type @code{internal_fn}) should be considered expensive when the mask is
>  all zeros.  GCC can then try to branch around the instruction instead.
>  @end deftypefn
>  
> -@deftypefn {Target Hook} {void *} TARGET_VECTORIZE_INIT_COST (class loop *@var{loop_info}, bool @var{costing_for_scalar})
> +@deftypefn {Target Hook} {class vector_costs *} TARGET_VECTORIZE_CREATE_COSTS (vec_info *@var{vinfo}, bool @var{costing_for_scalar})
>  This hook should initialize target-specific data structures in preparation
>  for modeling the costs of vectorizing a loop or basic block.  The default
>  allocates three unsigned integers for accumulating costs for the prologue,
> @@ -6266,29 +6266,6 @@ current cost model is for the scalar version of a loop or block; otherwise
>  it is for the vector version.
>  @end deftypefn
>  
> -@deftypefn {Target Hook} unsigned TARGET_VECTORIZE_ADD_STMT_COST (class vec_info *@var{}, void *@var{data}, int @var{count}, enum vect_cost_for_stmt @var{kind}, class _stmt_vec_info *@var{stmt_info}, tree @var{vectype}, int @var{misalign}, enum vect_cost_model_location @var{where})
> -This hook should update the target-specific @var{data} in response to
> -adding @var{count} copies of the given @var{kind} of statement to a
> -loop or basic block.  The default adds the builtin vectorizer cost for
> -the copies of the statement to the accumulator specified by @var{where},
> -(the prologue, body, or epilogue) and returns the amount added.  The
> -return value should be viewed as a tentative cost that may later be
> -revised.
> -@end deftypefn
> -
> -@deftypefn {Target Hook} void TARGET_VECTORIZE_FINISH_COST (void *@var{data}, unsigned *@var{prologue_cost}, unsigned *@var{body_cost}, unsigned *@var{epilogue_cost})
> -This hook should complete calculations of the cost of vectorizing a loop
> -or basic block based on @var{data}, and return the prologue, body, and
> -epilogue costs as unsigned integers.  The default returns the value of
> -the three accumulators.
> -@end deftypefn
> -
> -@deftypefn {Target Hook} void TARGET_VECTORIZE_DESTROY_COST_DATA (void *@var{data})
> -This hook should release @var{data} and any related data structures
> -allocated by TARGET_VECTORIZE_INIT_COST.  The default releases the
> -accumulator.
> -@end deftypefn
> -
>  @deftypefn {Target Hook} tree TARGET_VECTORIZE_BUILTIN_GATHER (const_tree @var{mem_vectype}, const_tree @var{index_type}, int @var{scale})
>  Target builtin that implements vector gather operation.  @var{mem_vectype}
>  is the vector type of the load and @var{index_type} is scalar type of
> diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
> index 86352dc9bd2..738e7b8c19e 100644
> --- a/gcc/doc/tm.texi.in
> +++ b/gcc/doc/tm.texi.in
> @@ -4190,13 +4190,7 @@ address;  but often a machine-dependent strategy can generate better code.
>  
>  @hook TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE
>  
> -@hook TARGET_VECTORIZE_INIT_COST
> -
> -@hook TARGET_VECTORIZE_ADD_STMT_COST
> -
> -@hook TARGET_VECTORIZE_FINISH_COST
> -
> -@hook TARGET_VECTORIZE_DESTROY_COST_DATA
> +@hook TARGET_VECTORIZE_CREATE_COSTS
>  
>  @hook TARGET_VECTORIZE_BUILTIN_GATHER
>  
> diff --git a/gcc/target.def b/gcc/target.def
> index c5d90cace80..1baaba4cd0f 100644
> --- a/gcc/target.def
> +++ b/gcc/target.def
> @@ -2051,7 +2051,7 @@ stores.",
>  
>  /* Target function to initialize the cost model for a loop or block.  */
>  DEFHOOK
> -(init_cost,
> +(create_costs,
>   "This hook should initialize target-specific data structures in preparation\n\
>  for modeling the costs of vectorizing a loop or basic block.  The default\n\
>  allocates three unsigned integers for accumulating costs for the prologue,\n\
> @@ -2060,50 +2060,9 @@ non-NULL, it identifies the loop being vectorized; otherwise a single block\n\
>  is being vectorized.  If @var{costing_for_scalar} is true, it indicates the\n\
>  current cost model is for the scalar version of a loop or block; otherwise\n\
>  it is for the vector version.",
> - void *,
> - (class loop *loop_info, bool costing_for_scalar),
> - default_init_cost)
> -
> -/* Target function to record N statements of the given kind using the
> -   given vector type within the cost model data for the current loop or
> -    block.  */
> -DEFHOOK
> -(add_stmt_cost,
> - "This hook should update the target-specific @var{data} in response to\n\
> -adding @var{count} copies of the given @var{kind} of statement to a\n\
> -loop or basic block.  The default adds the builtin vectorizer cost for\n\
> -the copies of the statement to the accumulator specified by @var{where},\n\
> -(the prologue, body, or epilogue) and returns the amount added.  The\n\
> -return value should be viewed as a tentative cost that may later be\n\
> -revised.",
> - unsigned,
> - (class vec_info *, void *data, int count, enum vect_cost_for_stmt kind,
> -  class _stmt_vec_info *stmt_info, tree vectype, int misalign,
> -  enum vect_cost_model_location where),
> - default_add_stmt_cost)
> -
> -/* Target function to calculate the total cost of the current vectorized
> -   loop or block.  */
> -DEFHOOK
> -(finish_cost,
> - "This hook should complete calculations of the cost of vectorizing a loop\n\
> -or basic block based on @var{data}, and return the prologue, body, and\n\
> -epilogue costs as unsigned integers.  The default returns the value of\n\
> -the three accumulators.",
> - void,
> - (void *data, unsigned *prologue_cost, unsigned *body_cost,
> -  unsigned *epilogue_cost),
> - default_finish_cost)
> -
> -/* Function to delete target-specific cost modeling data.  */
> -DEFHOOK
> -(destroy_cost_data,
> - "This hook should release @var{data} and any related data structures\n\
> -allocated by TARGET_VECTORIZE_INIT_COST.  The default releases the\n\
> -accumulator.",
> - void,
> - (void *data),
> - default_destroy_cost_data)
> + class vector_costs *,
> + (vec_info *vinfo, bool costing_for_scalar),
> + default_vectorize_create_costs)
>  
>  HOOK_VECTOR_END (vectorize)
>  
> diff --git a/gcc/targhooks.c b/gcc/targhooks.c
> index cbbcedf790f..ee8798cc84b 100644
> --- a/gcc/targhooks.c
> +++ b/gcc/targhooks.c
> @@ -1474,65 +1474,10 @@ default_empty_mask_is_expensive (unsigned ifn)
>     loop body, and epilogue) for a vectorized loop or block.  So allocate an
>     array of three unsigned ints, set it to zero, and return its address.  */
>  
> -void *
> -default_init_cost (class loop *loop_info ATTRIBUTE_UNUSED,
> -		   bool costing_for_scalar ATTRIBUTE_UNUSED)
> -{
> -  unsigned *cost = XNEWVEC (unsigned, 3);
> -  cost[vect_prologue] = cost[vect_body] = cost[vect_epilogue] = 0;
> -  return cost;
> -}
> -
> -/* By default, the cost model looks up the cost of the given statement
> -   kind and mode, multiplies it by the occurrence count, accumulates
> -   it into the cost specified by WHERE, and returns the cost added.  */
> -
> -unsigned
> -default_add_stmt_cost (class vec_info *vinfo, void *data, int count,
> -		       enum vect_cost_for_stmt kind,
> -		       class _stmt_vec_info *stmt_info, tree vectype,
> -		       int misalign,
> -		       enum vect_cost_model_location where)
> -{
> -  unsigned *cost = (unsigned *) data;
> -  unsigned retval = 0;
> -  int stmt_cost = targetm.vectorize.builtin_vectorization_cost (kind, vectype,
> -								misalign);
> -   /* Statements in an inner loop relative to the loop being
> -      vectorized are weighted more heavily.  The value here is
> -      arbitrary and could potentially be improved with analysis.  */
> -  if (where == vect_body && stmt_info
> -      && stmt_in_inner_loop_p (vinfo, stmt_info))
> -    {
> -      loop_vec_info loop_vinfo = dyn_cast<loop_vec_info> (vinfo);
> -      gcc_assert (loop_vinfo);
> -      count *= LOOP_VINFO_INNER_LOOP_COST_FACTOR (loop_vinfo);
> -    }
> -
> -  retval = (unsigned) (count * stmt_cost);
> -  cost[where] += retval;
> -
> -  return retval;
> -}
> -
> -/* By default, the cost model just returns the accumulated costs.  */
> -
> -void
> -default_finish_cost (void *data, unsigned *prologue_cost,
> -		     unsigned *body_cost, unsigned *epilogue_cost)
> -{
> -  unsigned *cost = (unsigned *) data;
> -  *prologue_cost = cost[vect_prologue];
> -  *body_cost     = cost[vect_body];
> -  *epilogue_cost = cost[vect_epilogue];
> -}
> -
> -/* Free the cost data.  */
> -
> -void
> -default_destroy_cost_data (void *data)
> +vector_costs *
> +default_vectorize_create_costs (vec_info *vinfo, bool costing_for_scalar)
>  {
> -  free (data);
> +  return new vector_costs (vinfo, costing_for_scalar);
>  }
>  
>  /* Determine whether or not a pointer mode is valid. Assume defaults
> diff --git a/gcc/targhooks.h b/gcc/targhooks.h
> index 92d51992e62..64f2dc7e4a6 100644
> --- a/gcc/targhooks.h
> +++ b/gcc/targhooks.h
> @@ -118,13 +118,7 @@ extern opt_machine_mode default_vectorize_related_mode (machine_mode,
>  							poly_uint64);
>  extern opt_machine_mode default_get_mask_mode (machine_mode);
>  extern bool default_empty_mask_is_expensive (unsigned);
> -extern void *default_init_cost (class loop *, bool);
> -extern unsigned default_add_stmt_cost (class vec_info *, void *, int,
> -				       enum vect_cost_for_stmt,
> -				       class _stmt_vec_info *, tree, int,
> -				       enum vect_cost_model_location);
> -extern void default_finish_cost (void *, unsigned *, unsigned *, unsigned *);
> -extern void default_destroy_cost_data (void *);
> +extern vector_costs *default_vectorize_create_costs (vec_info *, bool);
>  
>  /* OpenACC hooks.  */
>  extern bool default_goacc_validate_dims (tree, int [], int, unsigned);
> diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
> index 961c1623f81..201000af425 100644
> --- a/gcc/tree-vect-loop.c
> +++ b/gcc/tree-vect-loop.c
> @@ -814,7 +814,7 @@ bb_in_loop_p (const_basic_block bb, const void *data)
>     stmt_vec_info structs for all the stmts in LOOP_IN.  */
>  
>  _loop_vec_info::_loop_vec_info (class loop *loop_in, vec_info_shared *shared)
> -  : vec_info (vec_info::loop, init_cost (loop_in, false), shared),
> +  : vec_info (vec_info::loop, shared),
>      loop (loop_in),
>      bbs (XCNEWVEC (basic_block, loop->num_nodes)),
>      num_itersm1 (NULL_TREE),
> @@ -1292,18 +1292,18 @@ vect_compute_single_scalar_iteration_cost (loop_vec_info loop_vinfo)
>      }
>  
>    /* Now accumulate cost.  */
> -  void *target_cost_data = init_cost (loop, true);
> +  vector_costs *target_cost_data = init_cost (loop_vinfo, true);
>    stmt_info_for_cost *si;
>    int j;
>    FOR_EACH_VEC_ELT (LOOP_VINFO_SCALAR_ITERATION_COST (loop_vinfo),
>  		    j, si)
> -    (void) add_stmt_cost (loop_vinfo, target_cost_data, si->count,
> +    (void) add_stmt_cost (target_cost_data, si->count,
>  			  si->kind, si->stmt_info, si->vectype,
>  			  si->misalign, si->where);
>    unsigned prologue_cost = 0, body_cost = 0, epilogue_cost = 0;
>    finish_cost (target_cost_data, &prologue_cost, &body_cost,
>  	       &epilogue_cost);
> -  destroy_cost_data (target_cost_data);
> +  delete target_cost_data;
>    LOOP_VINFO_SINGLE_SCALAR_ITERATION_COST (loop_vinfo)
>      = prologue_cost + body_cost + epilogue_cost;
>  }
> @@ -1783,7 +1783,7 @@ vect_analyze_loop_operations (loop_vec_info loop_vinfo)
>          }
>      } /* bbs */
>  
> -  add_stmt_costs (loop_vinfo, loop_vinfo->target_cost_data, &cost_vec);
> +  add_stmt_costs (loop_vinfo->target_cost_data, &cost_vec);
>  
>    /* All operations in the loop are either irrelevant (deal with loop
>       control, or dead), or only used outside the loop and can be moved
> @@ -2393,6 +2393,8 @@ start_over:
>  		   LOOP_VINFO_INT_NITERS (loop_vinfo));
>      }
>  
> +  LOOP_VINFO_TARGET_COST_DATA (loop_vinfo) = init_cost (loop_vinfo, false);
> +
>    /* Analyze the alignment of the data-refs in the loop.
>       Fail if a data reference is found that cannot be vectorized.  */
>  
> @@ -2757,9 +2759,8 @@ again:
>    LOOP_VINFO_COMP_ALIAS_DDRS (loop_vinfo).release ();
>    LOOP_VINFO_CHECK_UNEQUAL_ADDRS (loop_vinfo).release ();
>    /* Reset target cost data.  */
> -  destroy_cost_data (LOOP_VINFO_TARGET_COST_DATA (loop_vinfo));
> -  LOOP_VINFO_TARGET_COST_DATA (loop_vinfo)
> -    = init_cost (LOOP_VINFO_LOOP (loop_vinfo), false);
> +  delete LOOP_VINFO_TARGET_COST_DATA (loop_vinfo);
> +  LOOP_VINFO_TARGET_COST_DATA (loop_vinfo) = nullptr;
>    /* Reset accumulated rgroup information.  */
>    release_vec_loop_controls (&LOOP_VINFO_MASKS (loop_vinfo));
>    release_vec_loop_controls (&LOOP_VINFO_LENS (loop_vinfo));
> @@ -3895,7 +3896,7 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
>    int scalar_outside_cost = 0;
>    int assumed_vf = vect_vf_for_cost (loop_vinfo);
>    int npeel = LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo);
> -  void *target_cost_data = LOOP_VINFO_TARGET_COST_DATA (loop_vinfo);
> +  vector_costs *target_cost_data = LOOP_VINFO_TARGET_COST_DATA (loop_vinfo);
>  
>    /* Cost model disabled.  */
>    if (unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
> @@ -3912,7 +3913,7 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
>      {
>        /*  FIXME: Make cost depend on complexity of individual check.  */
>        unsigned len = LOOP_VINFO_MAY_MISALIGN_STMTS (loop_vinfo).length ();
> -      (void) add_stmt_cost (loop_vinfo, target_cost_data, len, vector_stmt,
> +      (void) add_stmt_cost (target_cost_data, len, vector_stmt,
>  			    NULL, NULL_TREE, 0, vect_prologue);
>        if (dump_enabled_p ())
>  	dump_printf (MSG_NOTE,
> @@ -3925,12 +3926,12 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
>      {
>        /*  FIXME: Make cost depend on complexity of individual check.  */
>        unsigned len = LOOP_VINFO_COMP_ALIAS_DDRS (loop_vinfo).length ();
> -      (void) add_stmt_cost (loop_vinfo, target_cost_data, len, vector_stmt,
> +      (void) add_stmt_cost (target_cost_data, len, vector_stmt,
>  			    NULL, NULL_TREE, 0, vect_prologue);
>        len = LOOP_VINFO_CHECK_UNEQUAL_ADDRS (loop_vinfo).length ();
>        if (len)
>  	/* Count LEN - 1 ANDs and LEN comparisons.  */
> -	(void) add_stmt_cost (loop_vinfo, target_cost_data, len * 2 - 1,
> +	(void) add_stmt_cost (target_cost_data, len * 2 - 1,
>  			      scalar_stmt, NULL, NULL_TREE, 0, vect_prologue);
>        len = LOOP_VINFO_LOWER_BOUNDS (loop_vinfo).length ();
>        if (len)
> @@ -3941,7 +3942,7 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
>  	  for (unsigned int i = 0; i < len; ++i)
>  	    if (!LOOP_VINFO_LOWER_BOUNDS (loop_vinfo)[i].unsigned_p)
>  	      nstmts += 1;
> -	  (void) add_stmt_cost (loop_vinfo, target_cost_data, nstmts,
> +	  (void) add_stmt_cost (target_cost_data, nstmts,
>  				scalar_stmt, NULL, NULL_TREE, 0, vect_prologue);
>  	}
>        if (dump_enabled_p ())
> @@ -3954,7 +3955,7 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
>    if (LOOP_REQUIRES_VERSIONING_FOR_NITERS (loop_vinfo))
>      {
>        /*  FIXME: Make cost depend on complexity of individual check.  */
> -      (void) add_stmt_cost (loop_vinfo, target_cost_data, 1, vector_stmt,
> +      (void) add_stmt_cost (target_cost_data, 1, vector_stmt,
>  			    NULL, NULL_TREE, 0, vect_prologue);
>        if (dump_enabled_p ())
>  	dump_printf (MSG_NOTE,
> @@ -3963,7 +3964,7 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
>      }
>  
>    if (LOOP_REQUIRES_VERSIONING (loop_vinfo))
> -    (void) add_stmt_cost (loop_vinfo, target_cost_data, 1, cond_branch_taken,
> +    (void) add_stmt_cost (target_cost_data, 1, cond_branch_taken,
>  			  NULL, NULL_TREE, 0, vect_prologue);
>  
>    /* Count statements in scalar loop.  Using this as scalar cost for a single
> @@ -4051,7 +4052,7 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
>    if (peel_iters_prologue)
>      FOR_EACH_VEC_ELT (LOOP_VINFO_SCALAR_ITERATION_COST (loop_vinfo), j, si)
>        {
> -	(void) add_stmt_cost (loop_vinfo, target_cost_data,
> +	(void) add_stmt_cost (target_cost_data,
>  			      si->count * peel_iters_prologue, si->kind,
>  			      si->stmt_info, si->vectype, si->misalign,
>  			      vect_prologue);
> @@ -4061,7 +4062,7 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
>    if (peel_iters_epilogue)
>      FOR_EACH_VEC_ELT (LOOP_VINFO_SCALAR_ITERATION_COST (loop_vinfo), j, si)
>        {
> -	(void) add_stmt_cost (loop_vinfo, target_cost_data,
> +	(void) add_stmt_cost (target_cost_data,
>  			      si->count * peel_iters_epilogue, si->kind,
>  			      si->stmt_info, si->vectype, si->misalign,
>  			      vect_epilogue);
> @@ -4070,20 +4071,20 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
>    /* Add possible cond_branch_taken/cond_branch_not_taken cost.  */
>  
>    if (prologue_need_br_taken_cost)
> -    (void) add_stmt_cost (loop_vinfo, target_cost_data, 1, cond_branch_taken,
> +    (void) add_stmt_cost (target_cost_data, 1, cond_branch_taken,
>  			  NULL, NULL_TREE, 0, vect_prologue);
>  
>    if (prologue_need_br_not_taken_cost)
> -    (void) add_stmt_cost (loop_vinfo, target_cost_data, 1,
> +    (void) add_stmt_cost (target_cost_data, 1,
>  			  cond_branch_not_taken, NULL, NULL_TREE, 0,
>  			  vect_prologue);
>  
>    if (epilogue_need_br_taken_cost)
> -    (void) add_stmt_cost (loop_vinfo, target_cost_data, 1, cond_branch_taken,
> +    (void) add_stmt_cost (target_cost_data, 1, cond_branch_taken,
>  			  NULL, NULL_TREE, 0, vect_epilogue);
>  
>    if (epilogue_need_br_not_taken_cost)
> -    (void) add_stmt_cost (loop_vinfo, target_cost_data, 1,
> +    (void) add_stmt_cost (target_cost_data, 1,
>  			  cond_branch_not_taken, NULL, NULL_TREE, 0,
>  			  vect_epilogue);
>  
> @@ -4111,9 +4112,9 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
>  	 simpler and safer to use the worst-case cost; if this ends up
>  	 being the tie-breaker between vectorizing or not, then it's
>  	 probably better not to vectorize.  */
> -      (void) add_stmt_cost (loop_vinfo, target_cost_data, num_masks,
> +      (void) add_stmt_cost (target_cost_data, num_masks,
>  			    vector_stmt, NULL, NULL_TREE, 0, vect_prologue);
> -      (void) add_stmt_cost (loop_vinfo, target_cost_data, num_masks - 1,
> +      (void) add_stmt_cost (target_cost_data, num_masks - 1,
>  			    vector_stmt, NULL, NULL_TREE, 0, vect_body);
>      }
>    else if (LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo))
> @@ -4163,9 +4164,9 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
>  	      body_stmts += 3 * num_vectors;
>  	  }
>  
> -      (void) add_stmt_cost (loop_vinfo, target_cost_data, prologue_stmts,
> +      (void) add_stmt_cost (target_cost_data, prologue_stmts,
>  			    scalar_stmt, NULL, NULL_TREE, 0, vect_prologue);
> -      (void) add_stmt_cost (loop_vinfo, target_cost_data, body_stmts,
> +      (void) add_stmt_cost (target_cost_data, body_stmts,
>  			    scalar_stmt, NULL, NULL_TREE, 0, vect_body);
>      }
>  
> diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
> index 709bcb63686..a2eb20faef5 100644
> --- a/gcc/tree-vect-slp.c
> +++ b/gcc/tree-vect-slp.c
> @@ -4355,7 +4355,7 @@ vect_detect_hybrid_slp (loop_vec_info loop_vinfo)
>  /* Initialize a bb_vec_info struct for the statements in BBS basic blocks.  */
>  
>  _bb_vec_info::_bb_vec_info (vec<basic_block> _bbs, vec_info_shared *shared)
> -  : vec_info (vec_info::bb, init_cost (NULL, false), shared),
> +  : vec_info (vec_info::bb, shared),
>      bbs (_bbs),
>      roots (vNULL)
>  {
> @@ -4897,7 +4897,7 @@ vect_slp_analyze_operations (vec_info *vinfo)
>  	    instance->cost_vec = cost_vec;
>  	  else
>  	    {
> -	      add_stmt_costs (vinfo, vinfo->target_cost_data, &cost_vec);
> +	      add_stmt_costs (vinfo->target_cost_data, &cost_vec);
>  	      cost_vec.release ();
>  	    }
>  	}
> @@ -5337,32 +5337,30 @@ vect_bb_vectorization_profitable_p (bb_vec_info bb_vinfo,
>  	  continue;
>  	}
>  
> -      void *scalar_target_cost_data = init_cost (NULL, true);
> +      class vector_costs *scalar_target_cost_data = init_cost (bb_vinfo, true);
>        do
>  	{
> -	  add_stmt_cost (bb_vinfo, scalar_target_cost_data,
> -			 li_scalar_costs[si].second);
> +	  add_stmt_cost (scalar_target_cost_data, li_scalar_costs[si].second);
>  	  si++;
>  	}
>        while (si < li_scalar_costs.length ()
>  	     && li_scalar_costs[si].first == sl);
>        unsigned dummy;
>        finish_cost (scalar_target_cost_data, &dummy, &scalar_cost, &dummy);
> -      destroy_cost_data (scalar_target_cost_data);
> +      delete scalar_target_cost_data;
>  
>        /* Complete the target-specific vector cost calculation.  */
> -      void *vect_target_cost_data = init_cost (NULL, false);
> +      class vector_costs *vect_target_cost_data = init_cost (bb_vinfo, false);
>        do
>  	{
> -	  add_stmt_cost (bb_vinfo, vect_target_cost_data,
> -			 li_vector_costs[vi].second);
> +	  add_stmt_cost (vect_target_cost_data, li_vector_costs[vi].second);
>  	  vi++;
>  	}
>        while (vi < li_vector_costs.length ()
>  	     && li_vector_costs[vi].first == vl);
>        finish_cost (vect_target_cost_data, &vec_prologue_cost,
>  		   &vec_inside_cost, &vec_epilogue_cost);
> -      destroy_cost_data (vect_target_cost_data);
> +      delete vect_target_cost_data;
>  
>        vec_outside_cost = vec_prologue_cost + vec_epilogue_cost;
>  
> diff --git a/gcc/tree-vectorizer.c b/gcc/tree-vectorizer.c
> index 20daa31187d..0b3a2dd6dc0 100644
> --- a/gcc/tree-vectorizer.c
> +++ b/gcc/tree-vectorizer.c
> @@ -98,11 +98,10 @@ auto_purge_vect_location::~auto_purge_vect_location ()
>  /* Dump a cost entry according to args to F.  */
>  
>  void
> -dump_stmt_cost (FILE *f, void *data, int count, enum vect_cost_for_stmt kind,
> +dump_stmt_cost (FILE *f, int count, enum vect_cost_for_stmt kind,
>  		stmt_vec_info stmt_info, tree, int misalign, unsigned cost,
>  		enum vect_cost_model_location where)
>  {
> -  fprintf (f, "%p ", data);
>    if (stmt_info)
>      {
>        print_gimple_expr (f, STMT_VINFO_STMT (stmt_info), 0, TDF_SLIM);
> @@ -457,12 +456,11 @@ shrink_simd_arrays
>  /* Initialize the vec_info with kind KIND_IN and target cost data
>     TARGET_COST_DATA_IN.  */
>  
> -vec_info::vec_info (vec_info::vec_kind kind_in, void *target_cost_data_in,
> -		    vec_info_shared *shared_)
> +vec_info::vec_info (vec_info::vec_kind kind_in, vec_info_shared *shared_)
>    : kind (kind_in),
>      shared (shared_),
>      stmt_vec_info_ro (false),
> -    target_cost_data (target_cost_data_in)
> +    target_cost_data (nullptr)
>  {
>    stmt_vec_infos.create (50);
>  }
> @@ -472,7 +470,7 @@ vec_info::~vec_info ()
>    for (slp_instance &instance : slp_instances)
>      vect_free_slp_instance (instance);
>  
> -  destroy_cost_data (target_cost_data);
> +  delete target_cost_data;
>    free_stmt_vec_infos ();
>  }
>  
> @@ -1694,3 +1692,60 @@ scalar_cond_masked_key::get_cond_ops_from_tree (tree t)
>    this->op0 = t;
>    this->op1 = build_zero_cst (TREE_TYPE (t));
>  }
> +
> +/* See the comment above the declaration for details.  */
> +
> +unsigned int
> +vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
> +			     stmt_vec_info stmt_info, tree vectype,
> +			     int misalign, vect_cost_model_location where)
> +{
> +  unsigned int cost
> +    = builtin_vectorization_cost (kind, vectype, misalign) * count;
> +  return record_stmt_cost (stmt_info, where, cost);
> +}
> +
> +/* See the comment above the declaration for details.  */
> +
> +void
> +vector_costs::finish_cost ()
> +{
> +  gcc_assert (!m_finished);
> +  m_finished = true;
> +}
> +
> +/* Record a base cost of COST units against WHERE.  If STMT_INFO is
> +   nonnull, use it to adjust the cost based on execution frequency
> +   (where appropriate).  */
> +
> +unsigned int
> +vector_costs::record_stmt_cost (stmt_vec_info stmt_info,
> +				vect_cost_model_location where,
> +				unsigned int cost)
> +{
> +  cost = adjust_cost_for_freq (stmt_info, where, cost);
> +  m_costs[where] += cost;
> +  return cost;
> +}
> +
> +/* COST is the base cost we have calculated for an operation in location WHERE.
> +   If STMT_INFO is nonnull, use it to adjust the cost based on execution
> +   frequency (where appropriate).  Return the adjusted cost.  */
> +
> +unsigned int
> +vector_costs::adjust_cost_for_freq (stmt_vec_info stmt_info,
> +				    vect_cost_model_location where,
> +				    unsigned int cost)
> +{
> +  /* Statements in an inner loop relative to the loop being
> +     vectorized are weighted more heavily.  The value here is
> +     arbitrary and could potentially be improved with analysis.  */
> +  if (where == vect_body
> +      && stmt_info
> +      && stmt_in_inner_loop_p (m_vinfo, stmt_info))
> +    {
> +      loop_vec_info loop_vinfo = as_a<loop_vec_info> (m_vinfo);
> +      cost *= LOOP_VINFO_INNER_LOOP_COST_FACTOR (loop_vinfo);
> +    }
> +  return cost;
> +}
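
To make the split between record_stmt_cost and adjust_cost_for_freq
concrete, here is a minimal stand-alone sketch of the same pattern.  It
is not the patch's code: toy_costs, toy_stmt, cost_location and the
constructor's inner_factor argument are hypothetical stand-ins for
vector_costs, stmt_vec_info, vect_cost_model_location and
LOOP_VINFO_INNER_LOOP_COST_FACTOR.

```cpp
#include <cassert>

// Simplified stand-ins for the vectorizer types; all names are hypothetical.
enum cost_location { loc_prologue = 0, loc_body = 1, loc_epilogue = 2 };

struct toy_stmt
{
  bool in_inner_loop;
};

class toy_costs
{
public:
  explicit toy_costs (unsigned int inner_factor)
    : m_inner_factor (inner_factor), m_costs () {}

  // Weight COST more heavily if it is a body cost for a statement in an
  // inner loop; otherwise return it unchanged.
  unsigned int adjust_cost_for_freq (const toy_stmt *stmt, cost_location where,
                                     unsigned int cost) const
  {
    if (where == loc_body && stmt && stmt->in_inner_loop)
      cost *= m_inner_factor;
    return cost;
  }

  // Adjust COST for frequency, add it to the region total and return
  // the adjusted value (which the caller could use for dumping).
  unsigned int record_stmt_cost (const toy_stmt *stmt, cost_location where,
                                 unsigned int cost)
  {
    assert (where >= loc_prologue && where <= loc_epilogue);
    cost = adjust_cost_for_freq (stmt, where, cost);
    m_costs[where] += cost;
    return cost;
  }

  unsigned int cost_at (cost_location where) const { return m_costs[where]; }

private:
  unsigned int m_inner_factor;
  unsigned int m_costs[3];
};
```

With an inner factor of 50, recording a body cost of 4 for an inner-loop
statement returns 200, while the same cost in the prologue, or with a
null statement, is recorded unchanged.
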
> diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
> index 4aa84acff59..44afda2bc9b 100644
> --- a/gcc/tree-vectorizer.h
> +++ b/gcc/tree-vectorizer.h
> @@ -368,7 +368,7 @@ public:
>    typedef hash_set<int_hash<machine_mode, E_VOIDmode, E_BLKmode> > mode_set;
>    enum vec_kind { bb, loop };
>  
> -  vec_info (vec_kind, void *, vec_info_shared *);
> +  vec_info (vec_kind, vec_info_shared *);
>    ~vec_info ();
>  
>    stmt_vec_info add_stmt (gimple *);
> @@ -406,7 +406,7 @@ public:
>    auto_vec<stmt_vec_info> grouped_stores;
>  
>    /* Cost data used by the target cost model.  */
> -  void *target_cost_data;
> +  class vector_costs *target_cost_data;
>  
>    /* The set of vector modes used in the vectorized region.  */
>    mode_set used_vector_modes;
> @@ -1395,6 +1395,103 @@ struct gather_scatter_info {
>  #define PURE_SLP_STMT(S)                  ((S)->slp_type == pure_slp)
>  #define STMT_SLP_TYPE(S)                   (S)->slp_type
>  
> +/* Contains the scalar or vector costs for a vec_info.  */
> +class vector_costs
> +{
> +public:
> +  vector_costs (vec_info *, bool);
> +  virtual ~vector_costs () {}
> +
> +  /* Update the costs in response to adding COUNT copies of a statement.
> +
> +     - WHERE specifies whether the cost occurs in the loop prologue,
> +       the loop body, or the loop epilogue.
> +     - KIND is the kind of statement, which is always meaningful.
> +     - STMT_INFO, if nonnull, describes the statement that will be
> +       vectorized.
> +     - VECTYPE, if nonnull, is the vector type that the vectorized
> +       statement will operate on.  Note that this should be used in
> +       preference to STMT_VINFO_VECTYPE (STMT_INFO) since the latter
> +       is not correct for SLP.
> +     - for unaligned_load and unaligned_store statements, MISALIGN is
> +       the byte misalignment of the load or store relative to the target's
> +       preferred alignment for VECTYPE, or DR_MISALIGNMENT_UNKNOWN
> +       if the misalignment is not known.
> +
> +     Return the calculated cost as well as recording it.  The return
> +     value is used for dumping purposes.  */
> +  virtual unsigned int add_stmt_cost (int count, vect_cost_for_stmt kind,
> +				      stmt_vec_info stmt_info, tree vectype,
> +				      int misalign,
> +				      vect_cost_model_location where);
> +
> +  /* Finish calculating the cost of the code.  The results can be
> +     read back using the functions below.  */
> +  virtual void finish_cost ();
> +
> +  unsigned int prologue_cost () const;
> +  unsigned int body_cost () const;
> +  unsigned int epilogue_cost () const;
> +
> +protected:
> +  unsigned int record_stmt_cost (stmt_vec_info, vect_cost_model_location,
> +				 unsigned int);
> +  unsigned int adjust_cost_for_freq (stmt_vec_info, vect_cost_model_location,
> +				     unsigned int);
> +
> +  /* The region of code that we're considering vectorizing.  */
> +  vec_info *m_vinfo;
> +
> +  /* True if we're costing the scalar code, false if we're costing
> +     the vector code.  */
> +  bool m_costing_for_scalar;
> +
> +  /* The costs of the three regions, indexed by vect_cost_model_location.  */
> +  unsigned int m_costs[3];
> +
> +  /* True if finish_cost has been called.  */
> +  bool m_finished;
> +};
> +
> +/* Create costs for VINFO.  COSTING_FOR_SCALAR is true if the costs
> +   are for scalar code, false if they are for vector code.  */
> +
> +inline
> +vector_costs::vector_costs (vec_info *vinfo, bool costing_for_scalar)
> +  : m_vinfo (vinfo),
> +    m_costing_for_scalar (costing_for_scalar),
> +    m_costs (),
> +    m_finished (false)
> +{
> +}
> +
> +/* Return the cost of the prologue code (in abstract units).  */
> +
> +inline unsigned int
> +vector_costs::prologue_cost () const
> +{
> +  gcc_checking_assert (m_finished);
> +  return m_costs[vect_prologue];
> +}
> +
> +/* Return the cost of the body code (in abstract units).  */
> +
> +inline unsigned int
> +vector_costs::body_cost () const
> +{
> +  gcc_checking_assert (m_finished);
> +  return m_costs[vect_body];
> +}
> +
> +/* Return the cost of the epilogue code (in abstract units).  */
> +
> +inline unsigned int
> +vector_costs::epilogue_cost () const
> +{
> +  gcc_checking_assert (m_finished);
> +  return m_costs[vect_epilogue];
> +}
> +
>  #define VECT_MAX_COST 1000
>  
>  /* The maximum number of intermediate steps required in multi-step type
> @@ -1531,29 +1628,28 @@ int vect_get_stmt_cost (enum vect_cost_for_stmt type_of_cost)
>  
> -/* Alias targetm.vectorize.init_cost.  */
> +/* Alias targetm.vectorize.create_costs.  */
>  
> -static inline void *
> -init_cost (class loop *loop_info, bool costing_for_scalar)
> +static inline vector_costs *
> +init_cost (vec_info *vinfo, bool costing_for_scalar)
>  {
> -  return targetm.vectorize.init_cost (loop_info, costing_for_scalar);
> +  return targetm.vectorize.create_costs (vinfo, costing_for_scalar);
>  }
>  
> -extern void dump_stmt_cost (FILE *, void *, int, enum vect_cost_for_stmt,
> +extern void dump_stmt_cost (FILE *, int, enum vect_cost_for_stmt,
>  			    stmt_vec_info, tree, int, unsigned,
>  			    enum vect_cost_model_location);
>  
> -/* Alias targetm.vectorize.add_stmt_cost.  */
> +/* Alias vector_costs::add_stmt_cost.  */
>  
>  static inline unsigned
> -add_stmt_cost (vec_info *vinfo, void *data, int count,
> +add_stmt_cost (vector_costs *costs, int count,
>  	       enum vect_cost_for_stmt kind,
>  	       stmt_vec_info stmt_info, tree vectype, int misalign,
>  	       enum vect_cost_model_location where)
>  {
> -  unsigned cost = targetm.vectorize.add_stmt_cost (vinfo, data, count, kind,
> -						   stmt_info, vectype,
> -						   misalign, where);
> +  unsigned cost = costs->add_stmt_cost (count, kind, stmt_info, vectype,
> +					misalign, where);
>    if (dump_file && (dump_flags & TDF_DETAILS))
> -    dump_stmt_cost (dump_file, data, count, kind, stmt_info, vectype, misalign,
> +    dump_stmt_cost (dump_file, count, kind, stmt_info, vectype, misalign,
>  		    cost, where);
>    return cost;
>  }
> @@ -1561,36 +1657,31 @@ add_stmt_cost (vec_info *vinfo, void *data, int count,
> -/* Alias targetm.vectorize.add_stmt_cost.  */
> +/* Alias vector_costs::add_stmt_cost.  */
>  
>  static inline unsigned
> -add_stmt_cost (vec_info *vinfo, void *data, stmt_info_for_cost *i)
> +add_stmt_cost (vector_costs *costs, stmt_info_for_cost *i)
>  {
> -  return add_stmt_cost (vinfo, data, i->count, i->kind, i->stmt_info,
> +  return add_stmt_cost (costs, i->count, i->kind, i->stmt_info,
>  			i->vectype, i->misalign, i->where);
>  }
>  
> -/* Alias targetm.vectorize.finish_cost.  */
> +/* Alias vector_costs::finish_cost.  */
>  
>  static inline void
> -finish_cost (void *data, unsigned *prologue_cost,
> +finish_cost (vector_costs *costs, unsigned *prologue_cost,
>  	     unsigned *body_cost, unsigned *epilogue_cost)
>  {
> -  targetm.vectorize.finish_cost (data, prologue_cost, body_cost, epilogue_cost);
> -}
> -
> -/* Alias targetm.vectorize.destroy_cost_data.  */
> -
> -static inline void
> -destroy_cost_data (void *data)
> -{
> -  targetm.vectorize.destroy_cost_data (data);
> +  costs->finish_cost ();
> +  *prologue_cost = costs->prologue_cost ();
> +  *body_cost = costs->body_cost ();
> +  *epilogue_cost = costs->epilogue_cost ();
>  }
>  
>  inline void
> -add_stmt_costs (vec_info *vinfo, void *data, stmt_vector_for_cost *cost_vec)
> +add_stmt_costs (vector_costs *costs, stmt_vector_for_cost *cost_vec)
>  {
>    stmt_info_for_cost *cost;
>    unsigned i;
>    FOR_EACH_VEC_ELT (*cost_vec, i, cost)
> -    add_stmt_cost (vinfo, data, cost->count, cost->kind, cost->stmt_info,
> +    add_stmt_cost (costs, cost->count, cost->kind, cost->stmt_info,
>  		   cost->vectype, cost->misalign, cost->where);
>  }
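
The resulting lifecycle (one creation hook, virtual add/finish calls,
plain delete) can be sketched outside GCC as follows.  This is only an
illustration under simplifying assumptions: sketch_costs,
sketch_create_costs and sketch_finish_cost are hypothetical names, the
region is a plain int index, and the per-statement cost is passed in
directly instead of coming from builtin_vectorization_cost.

```cpp
#include <cassert>

class sketch_costs
{
public:
  // Virtual destructor so that "delete" through a base pointer is safe.
  virtual ~sketch_costs () {}

  virtual unsigned int add_stmt_cost (int count, unsigned int unit_cost,
                                      int where)
  {
    assert (where >= 0 && where < 3);
    unsigned int cost = count * unit_cost;
    m_costs[where] += cost;
    return cost;
  }

  virtual void finish_cost ()
  {
    assert (!m_finished);
    m_finished = true;
  }

  // The totals may only be read back once finish_cost has been called.
  unsigned int prologue_cost () const { assert (m_finished); return m_costs[0]; }
  unsigned int body_cost () const { assert (m_finished); return m_costs[1]; }
  unsigned int epilogue_cost () const { assert (m_finished); return m_costs[2]; }

protected:
  unsigned int m_costs[3] = {};
  bool m_finished = false;
};

// Hypothetical stand-in for the single remaining target hook.
inline sketch_costs *
sketch_create_costs ()
{
  return new sketch_costs;
}

// Same shape as the finish_cost wrapper: finish, then read the totals back.
inline void
sketch_finish_cost (sketch_costs *costs, unsigned *prologue_cost,
                    unsigned *body_cost, unsigned *epilogue_cost)
{
  costs->finish_cost ();
  *prologue_cost = costs->prologue_cost ();
  *body_cost = costs->body_cost ();
  *epilogue_cost = costs->epilogue_cost ();
}
```

A caller would create the object, add statement costs, call
sketch_finish_cost once, and then delete the object; no separate
destroy hook is needed because the destructor is virtual.
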
>  
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 76d99d247ae..93388ef9684 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -14523,11 +14523,15 @@ struct aarch64_sve_op_count : aarch64_vec_op_count
>  };
>  
>  /* Information about vector code that we're in the process of costing.  */
> -struct aarch64_vector_costs
> +struct aarch64_vector_costs : public vector_costs
>  {
> -  /* The normal latency-based costs for each region (prologue, body and
> -     epilogue), indexed by vect_cost_model_location.  */
> -  unsigned int region[3] = {};
> +  using vector_costs::vector_costs;
> +
> +  unsigned int add_stmt_cost (int count, vect_cost_for_stmt kind,
> +			      stmt_vec_info stmt_info, tree vectype,
> +			      int misalign,
> +			      vect_cost_model_location where) override;
> +  void finish_cost () override;
>  
>    /* True if we have performed one-time initialization based on the vec_info.
>  
> @@ -14593,11 +14597,11 @@ struct aarch64_vector_costs
>    hash_map<nofree_ptr_hash<_stmt_vec_info>, unsigned int> seen_loads;
>  };
>  
> -/* Implement TARGET_VECTORIZE_INIT_COST.  */
> -void *
> -aarch64_init_cost (class loop *, bool)
> +/* Implement TARGET_VECTORIZE_CREATE_COSTS.  */
> +vector_costs *
> +aarch64_vectorize_create_costs (vec_info *vinfo, bool costing_for_scalar)
>  {
> -  return new aarch64_vector_costs;
> +  return new aarch64_vector_costs (vinfo, costing_for_scalar);
>  }
>  
>  /* Return true if the current CPU should use the new costs defined
> @@ -15283,7 +15287,7 @@ aarch64_adjust_stmt_cost (vect_cost_for_stmt kind, stmt_vec_info stmt_info,
>  }
>  
>  /* VINFO, COSTS, COUNT, KIND, STMT_INFO and VECTYPE are the same as for
> -   TARGET_VECTORIZE_ADD_STMT_COST and they describe an operation in the
> +   vector_costs::add_stmt_cost and they describe an operation in the
>     body of a vector loop.  Record issue information relating to the vector
>     operation in OPS, where OPS is one of COSTS->scalar_ops, COSTS->advsimd_ops
>     or COSTS->sve_ops; see the comments above those variables for details.
> @@ -15479,32 +15483,29 @@ aarch64_count_ops (class vec_info *vinfo, aarch64_vector_costs *costs,
>      }
>  }
>  
> -/* Implement targetm.vectorize.add_stmt_cost.  */
> -static unsigned
> -aarch64_add_stmt_cost (class vec_info *vinfo, void *data, int count,
> -		       enum vect_cost_for_stmt kind,
> -		       struct _stmt_vec_info *stmt_info, tree vectype,
> -		       int misalign, enum vect_cost_model_location where)
> +unsigned
> +aarch64_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
> +				     stmt_vec_info stmt_info, tree vectype,
> +				     int misalign,
> +				     vect_cost_model_location where)
>  {
> -  auto *costs = static_cast<aarch64_vector_costs *> (data);
> -
>    fractional_cost stmt_cost
>      = aarch64_builtin_vectorization_cost (kind, vectype, misalign);
>  
>    bool in_inner_loop_p = (where == vect_body
>  			  && stmt_info
> -			  && stmt_in_inner_loop_p (vinfo, stmt_info));
> +			  && stmt_in_inner_loop_p (m_vinfo, stmt_info));
>  
>    /* Do one-time initialization based on the vinfo.  */
> -  loop_vec_info loop_vinfo = dyn_cast<loop_vec_info> (vinfo);
> -  bb_vec_info bb_vinfo = dyn_cast<bb_vec_info> (vinfo);
> -  if (!costs->analyzed_vinfo && aarch64_use_new_vector_costs_p ())
> +  loop_vec_info loop_vinfo = dyn_cast<loop_vec_info> (m_vinfo);
> +  bb_vec_info bb_vinfo = dyn_cast<bb_vec_info> (m_vinfo);
> +  if (!analyzed_vinfo && aarch64_use_new_vector_costs_p ())
>      {
>        if (loop_vinfo)
> -	aarch64_analyze_loop_vinfo (loop_vinfo, costs);
> +	aarch64_analyze_loop_vinfo (loop_vinfo, this);
>        else
> -	aarch64_analyze_bb_vinfo (bb_vinfo, costs);
> -      costs->analyzed_vinfo = true;
> +	aarch64_analyze_bb_vinfo (bb_vinfo, this);
> +      this->analyzed_vinfo = true;
>      }
>  
>    /* Try to get a more accurate cost by looking at STMT_INFO instead
> @@ -15512,7 +15513,7 @@ aarch64_add_stmt_cost (class vec_info *vinfo, void *data, int count,
>    if (stmt_info && aarch64_use_new_vector_costs_p ())
>      {
>        if (vectype && aarch64_sve_only_stmt_p (stmt_info, vectype))
> -	costs->saw_sve_only_op = true;
> +	this->saw_sve_only_op = true;
>  
>        /* If we scalarize a strided store, the vectorizer costs one
>  	 vec_to_scalar for each element.  However, we can store the first
> @@ -15521,17 +15522,17 @@ aarch64_add_stmt_cost (class vec_info *vinfo, void *data, int count,
>  	count -= 1;
>  
>        stmt_cost = aarch64_detect_scalar_stmt_subtype
> -	(vinfo, kind, stmt_info, stmt_cost);
> +	(m_vinfo, kind, stmt_info, stmt_cost);
>  
> -      if (vectype && costs->vec_flags)
> -	stmt_cost = aarch64_detect_vector_stmt_subtype (vinfo, kind,
> +      if (vectype && this->vec_flags)
> +	stmt_cost = aarch64_detect_vector_stmt_subtype (m_vinfo, kind,
>  							stmt_info, vectype,
>  							where, stmt_cost);
>      }
>  
>    /* Do any SVE-specific adjustments to the cost.  */
>    if (stmt_info && vectype && aarch64_sve_mode_p (TYPE_MODE (vectype)))
> -    stmt_cost = aarch64_sve_adjust_stmt_cost (vinfo, kind, stmt_info,
> +    stmt_cost = aarch64_sve_adjust_stmt_cost (m_vinfo, kind, stmt_info,
>  					      vectype, stmt_cost);
>  
>    if (stmt_info && aarch64_use_new_vector_costs_p ())
> @@ -15547,36 +15548,36 @@ aarch64_add_stmt_cost (class vec_info *vinfo, void *data, int count,
>        auto *issue_info = aarch64_tune_params.vec_costs->issue_info;
>        if (loop_vinfo
>  	  && issue_info
> -	  && costs->vec_flags
> +	  && this->vec_flags
>  	  && where == vect_body
>  	  && (!LOOP_VINFO_LOOP (loop_vinfo)->inner || in_inner_loop_p)
>  	  && vectype
>  	  && stmt_cost != 0)
>  	{
>  	  /* Record estimates for the scalar code.  */
> -	  aarch64_count_ops (vinfo, costs, count, kind, stmt_info, vectype,
> -			     0, &costs->scalar_ops, issue_info->scalar,
> +	  aarch64_count_ops (m_vinfo, this, count, kind, stmt_info, vectype,
> +			     0, &this->scalar_ops, issue_info->scalar,
>  			     vect_nunits_for_cost (vectype));
>  
> -	  if (aarch64_sve_mode_p (vinfo->vector_mode) && issue_info->sve)
> +	  if (aarch64_sve_mode_p (m_vinfo->vector_mode) && issue_info->sve)
>  	    {
>  	      /* Record estimates for a possible Advanced SIMD version
>  		 of the SVE code.  */
> -	      aarch64_count_ops (vinfo, costs, count, kind, stmt_info,
> -				 vectype, VEC_ADVSIMD, &costs->advsimd_ops,
> +	      aarch64_count_ops (m_vinfo, this, count, kind, stmt_info,
> +				 vectype, VEC_ADVSIMD, &this->advsimd_ops,
>  				 issue_info->advsimd,
>  				 aarch64_estimated_sve_vq ());
>  
>  	      /* Record estimates for the SVE code itself.  */
> -	      aarch64_count_ops (vinfo, costs, count, kind, stmt_info,
> -				 vectype, VEC_ANY_SVE, &costs->sve_ops,
> +	      aarch64_count_ops (m_vinfo, this, count, kind, stmt_info,
> +				 vectype, VEC_ANY_SVE, &this->sve_ops,
>  				 issue_info->sve, 1);
>  	    }
>  	  else
>  	    /* Record estimates for the Advanced SIMD code.  Treat SVE like
>  	       Advanced SIMD if the CPU has no specific SVE costs.  */
> -	    aarch64_count_ops (vinfo, costs, count, kind, stmt_info,
> -			       vectype, VEC_ADVSIMD, &costs->advsimd_ops,
> +	    aarch64_count_ops (m_vinfo, this, count, kind, stmt_info,
> +			       vectype, VEC_ADVSIMD, &this->advsimd_ops,
>  			       issue_info->advsimd, 1);
>  	}
>  
> @@ -15585,24 +15586,11 @@ aarch64_add_stmt_cost (class vec_info *vinfo, void *data, int count,
> -	 loop.  For simplicitly, we assume that one iteration of the
> +	 loop.  For simplicity, we assume that one iteration of the
>  	 Advanced SIMD loop would need the same number of statements
>  	 as one iteration of the SVE loop.  */
> -      if (where == vect_body && costs->unrolled_advsimd_niters)
> -	costs->unrolled_advsimd_stmts
> -	  += count * costs->unrolled_advsimd_niters;
> +      if (where == vect_body && this->unrolled_advsimd_niters)
> +	this->unrolled_advsimd_stmts
> +	  += count * this->unrolled_advsimd_niters;
>      }
> -
> -  /* Statements in an inner loop relative to the loop being
> -     vectorized are weighted more heavily.  The value here is
> -     arbitrary and could potentially be improved with analysis.  */
> -  if (in_inner_loop_p)
> -    {
> -      gcc_assert (loop_vinfo);
> -      count *= LOOP_VINFO_INNER_LOOP_COST_FACTOR (loop_vinfo); /*  FIXME  */
> -    }
> -
> -  unsigned retval = (count * stmt_cost).ceil ();
> -  costs->region[where] += retval;
> -
> -  return retval;
> +  return record_stmt_cost (stmt_info, where, (count * stmt_cost).ceil ());
>  }
>  
>  /* Dump information about the structure.  */
> @@ -15966,27 +15954,15 @@ aarch64_adjust_body_cost (aarch64_vector_costs *costs, unsigned int body_cost)
>    return body_cost;
>  }
>  
> -/* Implement TARGET_VECTORIZE_FINISH_COST.  */
> -static void
> -aarch64_finish_cost (void *data, unsigned *prologue_cost,
> -		     unsigned *body_cost, unsigned *epilogue_cost)
> +void
> +aarch64_vector_costs::finish_cost ()
>  {
> -  auto *costs = static_cast<aarch64_vector_costs *> (data);
> -  *prologue_cost = costs->region[vect_prologue];
> -  *body_cost     = costs->region[vect_body];
> -  *epilogue_cost = costs->region[vect_epilogue];
> -
> -  if (costs->is_loop
> -      && costs->vec_flags
> +  if (this->is_loop
> +      && this->vec_flags
>        && aarch64_use_new_vector_costs_p ())
> -    *body_cost = aarch64_adjust_body_cost (costs, *body_cost);
> -}
> +    m_costs[vect_body] = aarch64_adjust_body_cost (this, m_costs[vect_body]);
>  
> -/* Implement TARGET_VECTORIZE_DESTROY_COST_DATA.  */
> -static void
> -aarch64_destroy_cost_data (void *data)
> -{
> -  delete static_cast<aarch64_vector_costs *> (data);
> +  vector_costs::finish_cost ();
>  }
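
The ordering in the overridden finish_cost (adjust the stored body cost
first, then call the base version) is the part worth noting.  A small
stand-alone sketch of that ordering, assuming a made-up adjustment of
+50%; base_costs_sketch, tuned_costs_sketch and adjust_body_cost are
hypothetical names, not the aarch64 code:

```cpp
#include <cassert>

class base_costs_sketch
{
public:
  virtual ~base_costs_sketch () {}

  void add (int where, unsigned int cost)
  {
    assert (where >= 0 && where < 3);
    m_costs[where] += cost;
  }

  // Marks the costs as final; may only be called once.
  virtual void finish_cost ()
  {
    assert (!m_finished);
    m_finished = true;
  }

  unsigned int body_cost () const
  {
    assert (m_finished);
    return m_costs[1];
  }

protected:
  unsigned int m_costs[3] = {};
  bool m_finished = false;
};

class tuned_costs_sketch : public base_costs_sketch
{
public:
  void finish_cost () override
  {
    // Apply the target-specific adjustment to the stored total first,
    // then let the base class mark the costs as finished.
    m_costs[1] = adjust_body_cost (m_costs[1]);
    base_costs_sketch::finish_cost ();
  }

private:
  // Hypothetical adjustment: penalize the body by 50%.
  static unsigned int adjust_body_cost (unsigned int body)
  {
    return body + body / 2;
  }
};
```

Because the adjustment is written back into the stored total, the
generic accessor returns the adjusted value without the caller knowing
which subclass it is talking to.
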
>  
>  static void initialize_aarch64_code_model (struct gcc_options *);
> @@ -26285,17 +26261,8 @@ aarch64_libgcc_floating_mode_supported_p
>  #undef TARGET_ARRAY_MODE_SUPPORTED_P
>  #define TARGET_ARRAY_MODE_SUPPORTED_P aarch64_array_mode_supported_p
>  
> -#undef TARGET_VECTORIZE_INIT_COST
> -#define TARGET_VECTORIZE_INIT_COST aarch64_init_cost
> -
> -#undef TARGET_VECTORIZE_ADD_STMT_COST
> -#define TARGET_VECTORIZE_ADD_STMT_COST aarch64_add_stmt_cost
> -
> -#undef TARGET_VECTORIZE_FINISH_COST
> -#define TARGET_VECTORIZE_FINISH_COST aarch64_finish_cost
> -
> -#undef TARGET_VECTORIZE_DESTROY_COST_DATA
> -#define TARGET_VECTORIZE_DESTROY_COST_DATA aarch64_destroy_cost_data
> +#undef TARGET_VECTORIZE_CREATE_COSTS
> +#define TARGET_VECTORIZE_CREATE_COSTS aarch64_vectorize_create_costs
>  
>  #undef TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST
>  #define TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST \
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index fb656094e9e..e40ae2b9c49 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -22842,26 +22842,30 @@ ix86_noce_conversion_profitable_p (rtx_insn *seq, struct noce_if_info *if_info)
>    return default_noce_conversion_profitable_p (seq, if_info);
>  }
>  
> -/* Implement targetm.vectorize.init_cost.  */
> +/* x86-specific vector costs.  */
> +class ix86_vector_costs : public vector_costs
> +{
> +  using vector_costs::vector_costs;
> +
> +  unsigned int add_stmt_cost (int count, vect_cost_for_stmt kind,
> +			      stmt_vec_info stmt_info, tree vectype,
> +			      int misalign,
> +			      vect_cost_model_location where) override;
> +};
>  
> -static void *
> -ix86_init_cost (class loop *, bool)
> +/* Implement targetm.vectorize.create_costs.  */
> +
> +static vector_costs *
> +ix86_vectorize_create_costs (vec_info *vinfo, bool costing_for_scalar)
>  {
> -  unsigned *cost = XNEWVEC (unsigned, 3);
> -  cost[vect_prologue] = cost[vect_body] = cost[vect_epilogue] = 0;
> -  return cost;
> +  return new ix86_vector_costs (vinfo, costing_for_scalar);
>  }
>  
> -/* Implement targetm.vectorize.add_stmt_cost.  */
> -
> -static unsigned
> -ix86_add_stmt_cost (class vec_info *vinfo, void *data, int count,
> -		    enum vect_cost_for_stmt kind,
> -		    class _stmt_vec_info *stmt_info, tree vectype,
> -		    int misalign,
> -		    enum vect_cost_model_location where)
> +unsigned
> +ix86_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
> +				  stmt_vec_info stmt_info, tree vectype,
> +				  int misalign, vect_cost_model_location where)
>  {
> -  unsigned *cost = (unsigned *) data;
>    unsigned retval = 0;
>    bool scalar_p
>      = (kind == scalar_stmt || kind == scalar_load || kind == scalar_store);
> @@ -23032,15 +23036,7 @@ ix86_add_stmt_cost (class vec_info *vinfo, void *data, int count,
>    /* Statements in an inner loop relative to the loop being
>       vectorized are weighted more heavily.  The value here is
>       arbitrary and could potentially be improved with analysis.  */
> -  if (where == vect_body && stmt_info
> -      && stmt_in_inner_loop_p (vinfo, stmt_info))
> -    {
> -      loop_vec_info loop_vinfo = dyn_cast<loop_vec_info> (vinfo);
> -      gcc_assert (loop_vinfo);
> -      count *= LOOP_VINFO_INNER_LOOP_COST_FACTOR (loop_vinfo); /* FIXME.  */
> -    }
> -
> -  retval = (unsigned) (count * stmt_cost);
> +  retval = adjust_cost_for_freq (stmt_info, where, count * stmt_cost);
>  
>    /* We need to multiply all vector stmt cost by 1.7 (estimated cost)
>       for Silvermont as it has out of order integer pipeline and can execute
> @@ -23055,31 +23051,11 @@ ix86_add_stmt_cost (class vec_info *vinfo, void *data, int count,
>  	retval = (retval * 17) / 10;
>      }
>  
> -  cost[where] += retval;
> +  m_costs[where] += retval;
>  
>    return retval;
>  }
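
Here the target calls adjust_cost_for_freq and updates m_costs itself
rather than going through record_stmt_cost, since a further scaling
step sits between the two.  A rough sketch of that arithmetic, assuming
the scaling is selected by a simple flag (the real conditions are not
shown in the hunk); scale_by_1_7, sketch_x86_cost and their parameters
are hypothetical:

```cpp
#include <cassert>

// Multiply by 1.7 using integer arithmetic, truncating toward zero.
inline unsigned int
scale_by_1_7 (unsigned int cost)
{
  return (cost * 17) / 10;
}

// Apply the inner-loop weighting first and the optional 1.7x scaling
// second, and return the value that would be added to the region total.
inline unsigned int
sketch_x86_cost (unsigned int base_cost, bool in_inner_loop,
                 unsigned int inner_factor, bool apply_scaling)
{
  assert (inner_factor > 0);
  unsigned int cost = base_cost;
  if (in_inner_loop)
    cost *= inner_factor;
  if (apply_scaling)
    cost = scale_by_1_7 (cost);
  return cost;
}
```

For example, a base cost of 10 in an inner loop with factor 50 becomes
500 after weighting and 850 after scaling; a cost of 3 scales to 5
because the division truncates.
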
>  
> -/* Implement targetm.vectorize.finish_cost.  */
> -
> -static void
> -ix86_finish_cost (void *data, unsigned *prologue_cost,
> -		  unsigned *body_cost, unsigned *epilogue_cost)
> -{
> -  unsigned *cost = (unsigned *) data;
> -  *prologue_cost = cost[vect_prologue];
> -  *body_cost     = cost[vect_body];
> -  *epilogue_cost = cost[vect_epilogue];
> -}
> -
> -/* Implement targetm.vectorize.destroy_cost_data.  */
> -
> -static void
> -ix86_destroy_cost_data (void *data)
> -{
> -  free (data);
> -}
> -
>  /* Validate target specific memory model bits in VAL. */
>  
>  static unsigned HOST_WIDE_INT
> @@ -24363,14 +24339,8 @@ ix86_libgcc_floating_mode_supported_p
>    ix86_autovectorize_vector_modes
>  #undef TARGET_VECTORIZE_GET_MASK_MODE
>  #define TARGET_VECTORIZE_GET_MASK_MODE ix86_get_mask_mode
> -#undef TARGET_VECTORIZE_INIT_COST
> -#define TARGET_VECTORIZE_INIT_COST ix86_init_cost
> -#undef TARGET_VECTORIZE_ADD_STMT_COST
> -#define TARGET_VECTORIZE_ADD_STMT_COST ix86_add_stmt_cost
> -#undef TARGET_VECTORIZE_FINISH_COST
> -#define TARGET_VECTORIZE_FINISH_COST ix86_finish_cost
> -#undef TARGET_VECTORIZE_DESTROY_COST_DATA
> -#define TARGET_VECTORIZE_DESTROY_COST_DATA ix86_destroy_cost_data
> +#undef TARGET_VECTORIZE_CREATE_COSTS
> +#define TARGET_VECTORIZE_CREATE_COSTS ix86_vectorize_create_costs
>  
>  #undef TARGET_SET_CURRENT_FUNCTION
>  #define TARGET_SET_CURRENT_FUNCTION ix86_set_current_function
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 01a95591a5d..2e7b3bcad7e 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -1452,14 +1452,8 @@ static const struct attribute_spec rs6000_attribute_table[] =
>  #undef TARGET_VECTORIZE_PREFERRED_SIMD_MODE
>  #define TARGET_VECTORIZE_PREFERRED_SIMD_MODE \
>    rs6000_preferred_simd_mode
> -#undef TARGET_VECTORIZE_INIT_COST
> -#define TARGET_VECTORIZE_INIT_COST rs6000_init_cost
> -#undef TARGET_VECTORIZE_ADD_STMT_COST
> -#define TARGET_VECTORIZE_ADD_STMT_COST rs6000_add_stmt_cost
> -#undef TARGET_VECTORIZE_FINISH_COST
> -#define TARGET_VECTORIZE_FINISH_COST rs6000_finish_cost
> -#undef TARGET_VECTORIZE_DESTROY_COST_DATA
> -#define TARGET_VECTORIZE_DESTROY_COST_DATA rs6000_destroy_cost_data
> +#undef TARGET_VECTORIZE_CREATE_COSTS
> +#define TARGET_VECTORIZE_CREATE_COSTS rs6000_vectorize_create_costs
>  
>  #undef TARGET_LOOP_UNROLL_ADJUST
>  #define TARGET_LOOP_UNROLL_ADJUST rs6000_loop_unroll_adjust
> @@ -5263,21 +5257,33 @@ rs6000_preferred_simd_mode (scalar_mode mode)
>    return word_mode;
>  }
>  
> -struct rs6000_cost_data
> +class rs6000_cost_data : public vector_costs
>  {
> -  struct loop *loop_info;
> -  unsigned cost[3];
> +public:
> +  using vector_costs::vector_costs;
> +
> +  unsigned int add_stmt_cost (int count, vect_cost_for_stmt kind,
> +			      stmt_vec_info stmt_info, tree vectype,
> +			      int misalign,
> +			      vect_cost_model_location where) override;
> +  void finish_cost () override;
> +
> +protected:
> +  void update_target_cost_per_stmt (vect_cost_for_stmt, stmt_vec_info,
> +				    vect_cost_model_location, int,
> +				    unsigned int);
> +  void density_test (loop_vec_info);
> +  void adjust_vect_cost_per_loop (loop_vec_info);
> +
>    /* Total number of vectorized stmts (loop only).  */
> -  unsigned nstmts;
> +  unsigned m_nstmts = 0;
>    /* Total number of loads (loop only).  */
> -  unsigned nloads;
> +  unsigned m_nloads = 0;
>    /* Possible extra penalized cost on vector construction (loop only).  */
> -  unsigned extra_ctor_cost;
> +  unsigned m_extra_ctor_cost = 0;
>    /* For each vectorized loop, this var holds TRUE iff a non-memory vector
>       instruction is needed by the vectorization.  */
> -  bool vect_nonmem;
> -  /* Indicates this is costing for the scalar version of a loop or block.  */
> -  bool costing_for_scalar;
> +  bool m_vect_nonmem = false;
>  };
>  
>  /* Test for likely overcommitment of vector hardware resources.  If a
> @@ -5286,20 +5292,19 @@ struct rs6000_cost_data
>     adequately reflect delays from unavailable vector resources.
>     Penalize the loop body cost for this case.  */
>  
> -static void
> -rs6000_density_test (rs6000_cost_data *data)
> +void
> +rs6000_cost_data::density_test (loop_vec_info loop_vinfo)
>  {
>    /* This density test only cares about the cost of vector version of the
>       loop, so immediately return if we are passed costing for the scalar
>       version (namely computing single scalar iteration cost).  */
> -  if (data->costing_for_scalar)
> +  if (m_costing_for_scalar)
>      return;
>  
> -  struct loop *loop = data->loop_info;
> +  struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
>    basic_block *bbs = get_loop_body (loop);
>    int nbbs = loop->num_nodes;
> -  loop_vec_info loop_vinfo = loop_vec_info_for_loop (data->loop_info);
> -  int vec_cost = data->cost[vect_body], not_vec_cost = 0;
> +  int vec_cost = m_costs[vect_body], not_vec_cost = 0;
>  
>    for (int i = 0; i < nbbs; i++)
>      {
> @@ -5326,7 +5331,7 @@ rs6000_density_test (rs6000_cost_data *data)
>    if (density_pct > rs6000_density_pct_threshold
>        && vec_cost + not_vec_cost > rs6000_density_size_threshold)
>      {
> -      data->cost[vect_body] = vec_cost * (100 + rs6000_density_penalty) / 100;
> +      m_costs[vect_body] = vec_cost * (100 + rs6000_density_penalty) / 100;
>        if (dump_enabled_p ())
>  	dump_printf_loc (MSG_NOTE, vect_location,
>  			 "density %d%%, cost %d exceeds threshold, penalizing "
> @@ -5336,10 +5341,10 @@ rs6000_density_test (rs6000_cost_data *data)
>  
>    /* Check whether we need to penalize the body cost to account
>       for excess strided or elementwise loads.  */
> -  if (data->extra_ctor_cost > 0)
> +  if (m_extra_ctor_cost > 0)
>      {
> -      gcc_assert (data->nloads <= data->nstmts);
> -      unsigned int load_pct = (data->nloads * 100) / data->nstmts;
> +      gcc_assert (m_nloads <= m_nstmts);
> +      unsigned int load_pct = (m_nloads * 100) / m_nstmts;
>  
>        /* It's likely to be bounded by latency and execution resources
>  	 from many scalar loads which are strided or elementwise loads
> @@ -5351,10 +5356,10 @@ rs6000_density_test (rs6000_cost_data *data)
>  	      the loads.
>  	 One typical case is the innermost loop of the hotspot of SPEC2017
>  	 503.bwaves_r without loop interchange.  */
> -      if (data->nloads > (unsigned int) rs6000_density_load_num_threshold
> +      if (m_nloads > (unsigned int) rs6000_density_load_num_threshold
>  	  && load_pct > (unsigned int) rs6000_density_load_pct_threshold)
>  	{
> -	  data->cost[vect_body] += data->extra_ctor_cost;
> +	  m_costs[vect_body] += m_extra_ctor_cost;
>  	  if (dump_enabled_p ())
>  	    dump_printf_loc (MSG_NOTE, vect_location,
>  			     "Found %u loads and "
> @@ -5363,28 +5368,18 @@ rs6000_density_test (rs6000_cost_data *data)
>  			     "penalizing loop body "
>  			     "cost by extra cost %u "
>  			     "for ctor.\n",
> -			     data->nloads, load_pct,
> -			     data->extra_ctor_cost);
> +			     m_nloads, load_pct,
> +			     m_extra_ctor_cost);
>  	}
>      }
>  }
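
The percentage-based body penalty can be tried out in isolation.  The
sketch below is an assumption-laden illustration, not the rs6000 code:
the density formula (vector cost as a percentage of the total) is not
visible in the hunks above and is assumed here, and
density_params_sketch, penalized_body_cost and the sample thresholds
are hypothetical.

```cpp
#include <cassert>

// Hypothetical bundle of the three tuning parameters used by the test.
struct density_params_sketch
{
  unsigned int pct_threshold;   // density (in percent) above which we penalize
  unsigned int size_threshold;  // minimum total cost before we penalize
  unsigned int penalty;         // penalty in percent of the vector cost
};

// Return the body cost after applying the density penalty, if any.
// Assumes density = vec_cost * 100 / (vec_cost + not_vec_cost).
inline unsigned int
penalized_body_cost (unsigned int vec_cost, unsigned int not_vec_cost,
                     const density_params_sketch &p)
{
  unsigned int total = vec_cost + not_vec_cost;
  assert (total > 0);
  unsigned int density_pct = (vec_cost * 100) / total;
  if (density_pct > p.pct_threshold && total > p.size_threshold)
    return vec_cost * (100 + p.penalty) / 100;
  return vec_cost;
}
```

With thresholds of 85% and 70 and a 10% penalty, a loop with vector
cost 90 and non-vector cost 10 is penalized to 99, while a small loop
(vector cost 60, non-vector cost 5) is left alone because its total
does not exceed the size threshold.
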
>  
> -/* Implement targetm.vectorize.init_cost.  */
> +/* Implement targetm.vectorize.create_costs.  */
>  
> -static void *
> -rs6000_init_cost (struct loop *loop_info, bool costing_for_scalar)
> +static vector_costs *
> +rs6000_vectorize_create_costs (vec_info *vinfo, bool costing_for_scalar)
>  {
> -  rs6000_cost_data *data = XNEW (rs6000_cost_data);
> -  data->loop_info = loop_info;
> -  data->cost[vect_prologue] = 0;
> -  data->cost[vect_body]     = 0;
> -  data->cost[vect_epilogue] = 0;
> -  data->vect_nonmem = false;
> -  data->nstmts = 0;
> -  data->nloads = 0;
> -  data->extra_ctor_cost = 0;
> -  data->costing_for_scalar = costing_for_scalar;
> -  return data;
> +  return new rs6000_cost_data (vinfo, costing_for_scalar);
>  }
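
The XNEW plus field-by-field setup collapses into a one-line "new"
because the subclass inherits the base constructor and gives its own
fields default member initializers.  A minimal sketch of that idiom,
with hypothetical names (base_sketch, target_sketch,
create_costs_sketch) and an int standing in for the vec_info pointer:

```cpp
#include <cassert>

class base_sketch
{
public:
  base_sketch (int region_id, bool costing_for_scalar)
    : m_region_id (region_id), m_costing_for_scalar (costing_for_scalar) {}
  virtual ~base_sketch () {}

  bool costing_for_scalar () const { return m_costing_for_scalar; }

protected:
  int m_region_id;
  bool m_costing_for_scalar;
};

class target_sketch : public base_sketch
{
public:
  // Reuse the base constructor; no target-specific constructor is needed.
  using base_sketch::base_sketch;

  unsigned int nstmts () const { return m_nstmts; }
  bool vect_nonmem () const { return m_vect_nonmem; }

private:
  // Default member initializers replace the manual zeroing.
  unsigned int m_nstmts = 0;
  bool m_vect_nonmem = false;
};

// Hypothetical stand-in for the creation hook.
inline base_sketch *
create_costs_sketch (int region_id, bool costing_for_scalar)
{
  target_sketch *costs = new target_sketch (region_id, costing_for_scalar);
  // The target-specific fields are already initialized at this point.
  assert (costs->nstmts () == 0 && !costs->vect_nonmem ());
  return costs;
}
```

This also removes the risk of forgetting to initialize a newly added
field, which the XNEW version would leave uninitialized.
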
>  
>  /* Adjust vectorization cost after calling rs6000_builtin_vectorization_cost.
> @@ -5413,13 +5408,12 @@ rs6000_adjust_vect_cost_per_stmt (enum vect_cost_for_stmt kind,
>  /* Helper function for add_stmt_cost.  Check each statement cost
>     entry, gather information and update the target_cost fields
>     accordingly.  */
> -static void
> -rs6000_update_target_cost_per_stmt (rs6000_cost_data *data,
> -				    enum vect_cost_for_stmt kind,
> -				    struct _stmt_vec_info *stmt_info,
> -				    enum vect_cost_model_location where,
> -				    int stmt_cost,
> -				    unsigned int orig_count)
> +void
> +rs6000_cost_data::update_target_cost_per_stmt (vect_cost_for_stmt kind,
> +					       stmt_vec_info stmt_info,
> +					       vect_cost_model_location where,
> +					       int stmt_cost,
> +					       unsigned int orig_count)
>  {
>  
>    /* Check whether we're doing something other than just a copy loop.
> @@ -5431,17 +5425,19 @@ rs6000_update_target_cost_per_stmt (rs6000_cost_data *data,
>        || kind == vec_construct
>        || kind == scalar_to_vec
>        || (where == vect_body && kind == vector_stmt))
> -    data->vect_nonmem = true;
> +    m_vect_nonmem = true;
>  
>    /* Gather some information when we are costing the vectorized instruction
>       for the statements located in a loop body.  */
> -  if (!data->costing_for_scalar && data->loop_info && where == vect_body)
> +  if (!m_costing_for_scalar
> +      && is_a<loop_vec_info> (m_vinfo)
> +      && where == vect_body)
>      {
> -      data->nstmts += orig_count;
> +      m_nstmts += orig_count;
>  
>        if (kind == scalar_load || kind == vector_load
>  	  || kind == unaligned_load || kind == vector_gather_load)
> -	data->nloads += orig_count;
> +	m_nloads += orig_count;
>  
>        /* Power processors do not currently have instructions for strided
>  	 and elementwise loads, and instead we must generate multiple
> @@ -5469,20 +5465,16 @@ rs6000_update_target_cost_per_stmt (rs6000_cost_data *data,
>  	  const unsigned int MAX_PENALIZED_COST_FOR_CTOR = 12;
>  	  if (extra_cost > MAX_PENALIZED_COST_FOR_CTOR)
>  	    extra_cost = MAX_PENALIZED_COST_FOR_CTOR;
> -	  data->extra_ctor_cost += extra_cost;
> +	  m_extra_ctor_cost += extra_cost;
>  	}
>      }
>  }
>  
> -/* Implement targetm.vectorize.add_stmt_cost.  */
> -
> -static unsigned
> -rs6000_add_stmt_cost (class vec_info *vinfo, void *data, int count,
> -		      enum vect_cost_for_stmt kind,
> -		      struct _stmt_vec_info *stmt_info, tree vectype,
> -		      int misalign, enum vect_cost_model_location where)
> +unsigned
> +rs6000_cost_data::add_stmt_cost (int count, vect_cost_for_stmt kind,
> +				 stmt_vec_info stmt_info, tree vectype,
> +				 int misalign, vect_cost_model_location where)
>  {
> -  rs6000_cost_data *cost_data = (rs6000_cost_data*) data;
>    unsigned retval = 0;
>  
>    if (flag_vect_cost_model)
> @@ -5494,19 +5486,11 @@ rs6000_add_stmt_cost (class vec_info *vinfo, void *data, int count,
>  	 vectorized are weighted more heavily.  The value here is
>  	 arbitrary and could potentially be improved with analysis.  */
>        unsigned int orig_count = count;
> -      if (where == vect_body && stmt_info
> -	  && stmt_in_inner_loop_p (vinfo, stmt_info))
> -	{
> -	  loop_vec_info loop_vinfo = dyn_cast<loop_vec_info> (vinfo);
> -	  gcc_assert (loop_vinfo);
> -	  count *= LOOP_VINFO_INNER_LOOP_COST_FACTOR (loop_vinfo); /* FIXME.  */
> -	}
> -
> -      retval = (unsigned) (count * stmt_cost);
> -      cost_data->cost[where] += retval;
> +      retval = adjust_cost_for_freq (stmt_info, where, count * stmt_cost);
> +      m_costs[where] += retval;
>  
> -      rs6000_update_target_cost_per_stmt (cost_data, kind, stmt_info, where,
> -					  stmt_cost, orig_count);
> +      update_target_cost_per_stmt (kind, stmt_info, where,
> +				   stmt_cost, orig_count);
>      }
>  
>    return retval;
> @@ -5518,13 +5502,9 @@ rs6000_add_stmt_cost (class vec_info *vinfo, void *data, int count,
>     vector with length by counting number of required lengths under condition
>     LOOP_VINFO_FULLY_WITH_LENGTH_P.  */
>  
> -static void
> -rs6000_adjust_vect_cost_per_loop (rs6000_cost_data *data)
> +void
> +rs6000_cost_data::adjust_vect_cost_per_loop (loop_vec_info loop_vinfo)
>  {
> -  struct loop *loop = data->loop_info;
> -  gcc_assert (loop);
> -  loop_vec_info loop_vinfo = loop_vec_info_for_loop (loop);
> -
>    if (LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo))
>      {
>        rgroup_controls *rgc;
> @@ -5535,49 +5515,29 @@ rs6000_adjust_vect_cost_per_loop (rs6000_cost_data *data)
>  	  /* Each length needs one shift to fill into bits 0-7.  */
>  	  shift_cnt += num_vectors_m1 + 1;
>  
> -      rs6000_add_stmt_cost (loop_vinfo, (void *) data, shift_cnt, scalar_stmt,
> -			    NULL, NULL_TREE, 0, vect_body);
> +      add_stmt_cost (shift_cnt, scalar_stmt, NULL, NULL_TREE, 0, vect_body);
>      }
>  }
>  
> -/* Implement targetm.vectorize.finish_cost.  */
> -
> -static void
> -rs6000_finish_cost (void *data, unsigned *prologue_cost,
> -		    unsigned *body_cost, unsigned *epilogue_cost)
> +void
> +rs6000_cost_data::finish_cost ()
>  {
> -  rs6000_cost_data *cost_data = (rs6000_cost_data*) data;
> -
> -  if (cost_data->loop_info)
> +  if (loop_vec_info loop_vinfo = dyn_cast<loop_vec_info> (m_vinfo))
>      {
> -      rs6000_adjust_vect_cost_per_loop (cost_data);
> -      rs6000_density_test (cost_data);
> -    }
> +      adjust_vect_cost_per_loop (loop_vinfo);
> +      density_test (loop_vinfo);
>  
> -  /* Don't vectorize minimum-vectorization-factor, simple copy loops
> -     that require versioning for any reason.  The vectorization is at
> -     best a wash inside the loop, and the versioning checks make
> -     profitability highly unlikely and potentially quite harmful.  */
> -  if (cost_data->loop_info)
> -    {
> -      loop_vec_info vec_info = loop_vec_info_for_loop (cost_data->loop_info);
> -      if (!cost_data->vect_nonmem
> -	  && LOOP_VINFO_VECT_FACTOR (vec_info) == 2
> -	  && LOOP_REQUIRES_VERSIONING (vec_info))
> -	cost_data->cost[vect_body] += 10000;
> +      /* Don't vectorize minimum-vectorization-factor, simple copy loops
> +	 that require versioning for any reason.  The vectorization is at
> +	 best a wash inside the loop, and the versioning checks make
> +	 profitability highly unlikely and potentially quite harmful.  */
> +      if (!m_vect_nonmem
> +	  && LOOP_VINFO_VECT_FACTOR (loop_vinfo) == 2
> +	  && LOOP_REQUIRES_VERSIONING (loop_vinfo))
> +	m_costs[vect_body] += 10000;
>      }
>  
> -  *prologue_cost = cost_data->cost[vect_prologue];
> -  *body_cost     = cost_data->cost[vect_body];
> -  *epilogue_cost = cost_data->cost[vect_epilogue];
> -}
> -
> -/* Implement targetm.vectorize.destroy_cost_data.  */
> -
> -static void
> -rs6000_destroy_cost_data (void *data)
> -{
> -  free (data);
> +  vector_costs::finish_cost ();
>  }
>  
>  /* Implement targetm.loop_unroll_adjust.  */
> 
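
For readers following the rs6000 hunks quoted above, the shape of the
conversion can be sketched outside GCC roughly as follows.  This is only
a minimal illustration under stated assumptions, not the patch's code:
sketch_vector_costs, sketch_target_costs, make_costs, cost_of_copy_loop
and the simplified enums are invented stand-ins for vector_costs,
rs6000_cost_data and the vectorizer's types.  The only details taken
from the patch are the three-entry cost array, the derived finish_cost
that applies the copy-loop penalty of 10000 and then calls the base
class, and the single factory that returns a new costs object.

```cpp
#include <array>
#include <cassert>
#include <memory>

// Simplified stand-ins for the vectorizer's enums (hypothetical names).
enum cost_location { loc_prologue = 0, loc_body = 1, loc_epilogue = 2 };
enum stmt_kind { kind_load, kind_store, kind_arith };

// Base class: owns the unsigned[3] bookkeeping that each target
// previously had to replicate.
class sketch_vector_costs
{
public:
  explicit sketch_vector_costs (bool is_loop) : m_is_loop (is_loop) {}
  virtual ~sketch_vector_costs () = default;

  // Record COUNT statements costing STMT_COST each at WHERE and
  // return the amount that was added.
  virtual unsigned add_stmt_cost (unsigned count, stmt_kind /*kind*/,
                                  unsigned stmt_cost, cost_location where)
  {
    assert (where <= loc_epilogue);
    unsigned added = count * stmt_cost;
    m_costs[where] += added;
    return added;
  }

  // Called once after all statements have been costed.
  virtual void finish_cost () { m_finished = true; }

  unsigned body_cost () const
  {
    assert (m_finished);
    return m_costs[loc_body];
  }

protected:
  bool m_is_loop;
  std::array<unsigned, 3> m_costs {};  // zero-initialized
  bool m_finished = false;
};

// Target subclass: keeps its extra state in members instead of behind
// a void * that every hook has to cast.
class sketch_target_costs : public sketch_vector_costs
{
public:
  sketch_target_costs (bool is_loop, unsigned vf, bool needs_versioning)
    : sketch_vector_costs (is_loop), m_vf (vf),
      m_needs_versioning (needs_versioning) {}

  unsigned add_stmt_cost (unsigned count, stmt_kind kind,
                          unsigned stmt_cost, cost_location where) override
  {
    // Note whether the body does anything other than copy memory.
    if (where == loc_body && kind == kind_arith)
      m_vect_nonmem = true;
    return sketch_vector_costs::add_stmt_cost (count, kind, stmt_cost, where);
  }

  void finish_cost () override
  {
    // Penalize simple copy loops at VF 2 that need versioning.
    if (m_is_loop && !m_vect_nonmem && m_vf == 2 && m_needs_versioning)
      m_costs[loc_body] += 10000;
    sketch_vector_costs::finish_cost ();
  }

private:
  unsigned m_vf;
  bool m_needs_versioning;
  bool m_vect_nonmem = false;
};

// The one remaining "hook": create a costs object of the right type.
std::unique_ptr<sketch_vector_costs>
make_costs (bool is_loop, unsigned vf, bool needs_versioning)
{
  return std::make_unique<sketch_target_costs> (is_loop, vf,
                                                needs_versioning);
}

// Cost a loop body consisting of one load and one store, each of cost 1.
unsigned
cost_of_copy_loop (unsigned vf, bool needs_versioning)
{
  auto costs = make_costs (true, vf, needs_versioning);
  costs->add_stmt_cost (1, kind_load, 1, loc_body);
  costs->add_stmt_cost (1, kind_store, 1, loc_body);
  costs->finish_cost ();
  return costs->body_cost ();
}
```

In this sketch a caller only ever sees the base-class interface, so
cost_of_copy_loop works unchanged whichever derived type make_costs
returns.  The real classes also take the vec_info, stmt_vec_info and
vectype arguments shown in the quoted hunks, which are omitted here.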

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
