public inbox for gcc-patches@gcc.gnu.org
From: Uros Bizjak <ubizjak@gmail.com>
To: Hongyu Wang <hongyu.wang@intel.com>
Cc: gcc-patches@gcc.gnu.org, hongtao.liu@intel.com
Subject: Re: [PATCH V2] i386: Inline function with default arch/tune to caller
Date: Tue, 4 Jul 2023 08:18:52 +0200
Message-ID: <CAFULd4Y7EY6r=WnpCbQko3enqFFP=qtbKakhdZakw51uUxpK+A@mail.gmail.com>
In-Reply-To: <20230704031244.1074834-1-hongyu.wang@intel.com>

On Tue, Jul 4, 2023 at 5:12 AM Hongyu Wang <hongyu.wang@intel.com> wrote:
>
> Hi,
>
> For functions with different target attributes, the current logic refuses
> to inline the callee when any arch or tune is mismatched. Relax the
> condition to allow a callee with default arch/tune to be inlined.
>
> Bootstrapped/regtested on x86-64-linux-gnu{-m32,}.
>
> Ok for trunk?
>
> gcc/ChangeLog:
>
>         * config/i386/i386.cc (ix86_can_inline_p): If callee has
>         default arch=x86-64 and tune=generic, do not block the
>         inlining to its caller.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/i386/inline_target_clones.c: New test.

OK.

In a follow-up patch, can you please document the inlining rules involving
-march and -mtune in the "x86 Function Attributes" section? Currently, the
inlining rules at the end of the "target function attribute" section do
not even mention -march and -mtune. Maybe a subsubsection "Inlining
rules" should be added (like AArch64 has) to mention that, when arch or
tune differ, only callees with default arch and tune are inlined by
default (but inlining can be forced with always_inline for different
-mtune flags).

Looking at the above, perhaps inlining of different arches can also be
forced with always_inline? This would allow developers some control of
inlining, and would not be surprising.
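
To make the rule concrete, here is a minimal sketch of the arch/tune
part of the decision, in the order the patch below applies it. It is a
simplified model, not the code in ix86_can_inline_p: the struct, its
field names and the function name are hypothetical, and it compares
strings throughout, whereas the real code compares the arch and tune
values for the non-default case. All other checks (ISA flags, fpmath,
branch cost) are left out.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the per-function target options.
   Not GCC's actual data structure.  */
struct fn_opts
{
  const char *arch_string;  /* e.g. "x86-64", "icelake-server" */
  const char *tune_string;  /* e.g. "generic", "icelake-server" */
};

/* Model of the arch/tune checks only.  Returns true when these
   checks would not block inlining CALLEE into CALLER.  */
static bool
arch_tune_allows_inline (const struct fn_opts *caller,
                         const struct fn_opts *callee,
                         bool always_inline)
{
  /* A callee with default arch and tune is never blocked here.  */
  if (strcmp (callee->arch_string, "x86-64") == 0
      && strcmp (callee->tune_string, "generic") == 0)
    return true;

  /* Otherwise the arch must match; always_inline does not help.  */
  if (strcmp (caller->arch_string, callee->arch_string) != 0)
    return false;

  /* A tune mismatch blocks inlining unless always_inline is set.  */
  if (!always_inline
      && strcmp (caller->tune_string, callee->tune_string) != 0)
    return false;

  return true;
}
```

In this model, a default callee is accepted by an icelake-server
caller, while an icelake-server callee is rejected by a default caller
even with always_inline, which is the case the question above is about.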

Thanks,
Uros.

> ---
>  gcc/config/i386/i386.cc                       | 22 +++++++++++------
>  .../gcc.target/i386/inline_target_clones.c    | 24 +++++++++++++++++++
>  2 files changed, 39 insertions(+), 7 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/inline_target_clones.c
>
> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> index 8989985700a..4741c9b5364 100644
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -605,13 +605,6 @@ ix86_can_inline_p (tree caller, tree callee)
>                != (callee_opts->x_target_flags & ~always_inline_safe_mask))
>      ret = false;
>
> -  /* See if arch, tune, etc. are the same.  */
> -  else if (caller_opts->arch != callee_opts->arch)
> -    ret = false;
> -
> -  else if (!always_inline && caller_opts->tune != callee_opts->tune)
> -    ret = false;
> -
>    else if (caller_opts->x_ix86_fpmath != callee_opts->x_ix86_fpmath
>            /* If the calle doesn't use FP expressions differences in
>               ix86_fpmath can be ignored.  We are called from FEs
> @@ -622,6 +615,21 @@ ix86_can_inline_p (tree caller, tree callee)
>                || ipa_fn_summaries->get (callee_node)->fp_expressions))
>      ret = false;
>
> +  /* At this point we cannot identify whether arch or tune setting
> +     comes from target attribute or not. So the most conservative way
> +     is to allow the callee that uses default arch and tune string to
> +     be inlined.  */
> +  else if (!strcmp (callee_opts->x_ix86_arch_string, "x86-64")
> +          && !strcmp (callee_opts->x_ix86_tune_string, "generic"))
> +    ret = true;
> +
> +  /* See if arch, tune, etc. are the same.  */
> +  else if (caller_opts->arch != callee_opts->arch)
> +    ret = false;
> +
> +  else if (!always_inline && caller_opts->tune != callee_opts->tune)
> +    ret = false;
> +
>    else if (!always_inline
>            && caller_opts->branch_cost != callee_opts->branch_cost)
>      ret = false;
> diff --git a/gcc/testsuite/gcc.target/i386/inline_target_clones.c b/gcc/testsuite/gcc.target/i386/inline_target_clones.c
> new file mode 100644
> index 00000000000..53db1600ce5
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/inline_target_clones.c
> @@ -0,0 +1,24 @@
> +/* { dg-do compile } */
> +/* { dg-require-ifunc "" } */
> +/* { dg-options "-O3 -march=x86-64" } */
> +/* { dg-final { scan-assembler-not "call\[ \t\]+callee" } } */
> +
> +float callee (float a, float b, float c, float d,
> +             float e, float f, float g, float h)
> +{
> +  return a * b + c * d + e * f + g + h + a * c + b * c
> +    + a * d + b * e + a * f + c * h +
> +    b * (a - 0.4f) * (c + h) * (b + e * d) - a / f * h;
> +}
> +
> +__attribute__((target_clones("default","arch=icelake-server")))
> +void caller (int n, float *a,
> +            float c1, float c2, float c3,
> +            float c4, float c5, float c6,
> +            float c7)
> +{
> +  for (int i = 0; i < n; i++)
> +    {
> +      a[i] = callee (a[i], c1, c2, c3, c4, c5, c6, c7);
> +    }
> +}
> --
> 2.31.1
>

