From: Richard Sandiford <richard.sandiford@arm.com>
To: "Andre Vieira \(lists\)" <andre.simoesdiasvieira@arm.com>
Cc: "gcc-patches\@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>
Subject: Re: [aarch64] Update reg-costs to differentiate between memmove costs
Date: Wed, 16 Mar 2022 17:10:41 +0000 [thread overview]
Message-ID: <mptpmml7rhq.fsf@arm.com> (raw)
In-Reply-To: <e9f205c9-96e7-0825-11d5-d3f312d1984e@arm.com> (Andre Vieira's message of "Wed, 16 Mar 2022 14:56:57 +0000")
"Andre Vieira (lists)" <andre.simoesdiasvieira@arm.com> writes:
> This patch introduces a struct to differentiate between different
> memmove costs to enable a better modeling of memory operations. These
> have been modelled for
> -mcpu/-mtune=neoverse-v1/neoverse-n1/neoverse-n2/neoverse-512tvb, for
> all other tunings all entries are equal to the old single memmove cost
> to ensure the behaviour remains the same.
Thanks for doing this. Having the same cost for loads and stores
has been a long-standing wart.
> 2022-03-16 Tamar Christina <tamar.christina@arm.com>
> Andre Vieira <andre.simoesdiasvieira@arm.com>
>
> gcc/ChangeLog:
>
> * config/aarch64/aarch64-protos.h (struct cpu_memmov_cost): New
> struct.
> (struct tune_params): Change type of memmov_cost to use
> cpu_memmov_cost.
> * config/aarch64/aarch64.cc (aarch64_memory_move_cost): Update
> all tunings
> to use new cpu_memmov_cost struct.
>
> diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
> index f2fde35c6eb4989af8736db8fad004171c160282..5190eb8b96ea9af809a28470905b8b85ee720b09 100644
> --- a/gcc/config/aarch64/aarch64-protos.h
> +++ b/gcc/config/aarch64/aarch64-protos.h
> @@ -508,6 +508,18 @@ struct cpu_prefetch_tune
> const int default_opt_level;
> };
>
> +/* Model the costs for loads/stores for reload so that it can do more
I'd say s/reload/the register allocators/ here, since the costs affect
decisions made by IRA too.
> + accurate spill heuristics. */
> +struct cpu_memmov_cost
> +{
> + int load_int;
> + int store_int;
> + int load_fp;
> + int store_fp;
> + int load_pred;
> + int store_pred;
> +};
> +
> struct tune_params
> {
> const struct cpu_cost_table *insn_extra_cost;
> […]
> @@ -14501,12 +14633,41 @@ aarch64_register_move_cost (machine_mode mode,
> return regmove_cost->FP2FP;
> }
>
> +/* Implements TARGET_MEMORY_MOVE_COST. */
> static int
> -aarch64_memory_move_cost (machine_mode mode ATTRIBUTE_UNUSED,
> - reg_class_t rclass ATTRIBUTE_UNUSED,
> - bool in ATTRIBUTE_UNUSED)
> +aarch64_memory_move_cost (machine_mode mode, reg_class_t rclass_i, bool in)
> {
> - return aarch64_tune_params.memmov_cost;
> + enum reg_class rclass = (enum reg_class) rclass_i;
> + switch (rclass)
> + {
> + case PR_LO_REGS:
> + case PR_HI_REGS:
> + case PR_REGS:
> + return in ? aarch64_tune_params.memmov_cost.load_pred
> + : aarch64_tune_params.memmov_cost.store_pred;
> + case POINTER_AND_FP_REGS:
> + case ALL_REGS:
> + {
> + if (GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL)
> + return in ? aarch64_tune_params.memmov_cost.load_pred
> + : aarch64_tune_params.memmov_cost.store_pred;
> +
> + if (VECTOR_MODE_P (mode) || FLOAT_MODE_P (mode))
> + return in ? aarch64_tune_params.memmov_cost.load_fp
> + : aarch64_tune_params.memmov_cost.store_fp;
> +
> + return in ? aarch64_tune_params.memmov_cost.load_int
> + : aarch64_tune_params.memmov_cost.store_int;
> + }
> + case FP_LO8_REGS:
> + case FP_LO_REGS:
> + case FP_REGS:
> + return in ? aarch64_tune_params.memmov_cost.load_fp
> + : aarch64_tune_params.memmov_cost.store_fp;
> + default:
> + return in ? aarch64_tune_params.memmov_cost.load_int
> + : aarch64_tune_params.memmov_cost.store_int;
> + }
> }
It would be good to avoid listing individual subclasses if possible,
since it's easy for the list to get out of date if more subclasses
are added.
An alternative would be:
if (GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL
? reg_classes_intersect_p (rclass, PR_REGS)
: reg_class_subset_p (rclass, PR_REGS))
return (in
? aarch64_tune_params.memmov_cost.load_pred
: aarch64_tune_params.memmov_cost.store_pred);
if (VECTOR_MODE_P (mode) || FLOAT_MODE_P (mode)
? reg_classes_intersect_p (rclass, FP_REGS)
: reg_class_subset_p (rclass, FP_REGS))
return (in
? aarch64_tune_params.memmov_cost.load_fp
: aarch64_tune_params.memmov_cost.store_fp);
return (in
? aarch64_tune_params.memmov_cost.load_int
: aarch64_tune_params.memmov_cost.store_int);
OK with that change, if it works.
Thanks,
Richard
prev parent reply other threads:[~2022-03-16 17:10 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-03-16 14:56 Andre Vieira (lists)
2022-03-16 17:10 ` Richard Sandiford [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=mptpmml7rhq.fsf@arm.com \
--to=richard.sandiford@arm.com \
--cc=Kyrylo.Tkachov@arm.com \
--cc=andre.simoesdiasvieira@arm.com \
--cc=gcc-patches@gcc.gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).