From: Uros Bizjak <ubizjak@gmail.com>
To: Hongyu Wang <wwwhhhyyy333@gmail.com>
Cc: Hongyu Wang <hongyu.wang@intel.com>,
gcc-patches@gcc.gnu.org, hongtao.liu@intel.com
Subject: Re: [PATCH V2] i386: Inline function with default arch/tune to caller
Date: Tue, 4 Jul 2023 10:57:39 +0200 [thread overview]
Message-ID: <CAFULd4YQE9SMNcnRAV1aqwX5Eyy7UZKddVUcxh=3KnBkU1_JBQ@mail.gmail.com> (raw)
In-Reply-To: <CA+OydWkN68ROW6160OfU_WOogTWwpqdrZewSdRicR7TYOR55DA@mail.gmail.com>
On Tue, Jul 4, 2023 at 10:32 AM Hongyu Wang <wwwhhhyyy333@gmail.com> wrote:
>
> > In a follow-up patch, can you please document inlining rules involving
> > -march and -mtune to "x86 Function Attributes" section? Currently, the
> > inlining rules at the end of "target function attribute" section does
> > not even mention -march and -mtune. Maybe a subsubsection "Inlining
> > rules" should be added (like AArch64 has) to mention that only default
> > arch and tune are inlined by default (but inline can be forced with
> > always_inline for different mtune flags).
>
> The document has below at the end of 'target (OPTIONS)' section
>
> On the x86, the inliner does not inline a function that has
> different target options than the caller, unless the callee
> has a subset of the target options of the caller. For example
> a function declared with 'target("sse3")' can inline a
> function with 'target("sse2")', since '-msse3' implies
> '-msse2'.
>
> Do we need to move this part to a new section and combine with -march and
> -mtune rule description to the new subsubsection?
>
> > Looking at the above, perhaps inlining of different arches can also be
> > forced with always_inline? This would allow developers some control of
> > inlining, and would not be surprising.
>
> If so, I'd like to add the always_inline change on arch to current
> patch and leave the
> document change alone in the next patch.
Yes, this is OK.
Thanks,
Uros.
>
> Uros Bizjak via Gcc-patches <gcc-patches@gcc.gnu.org> 于2023年7月4日周二 14:19写道:
> >
> > On Tue, Jul 4, 2023 at 5:12 AM Hongyu Wang <hongyu.wang@intel.com> wrote:
> > >
> > > Hi,
> > >
> > > For function with different target attributes, current logic rejects to
> > > inline the callee when any arch or tune is mismatched. Relax the
> > > condition to allow callee with default arch/tune to be inlined.
> > >
> > > Boostrapped/regtested on x86-64-linux-gnu{-m32,}.
> > >
> > > Ok for trunk?
> > >
> > > gcc/ChangeLog:
> > >
> > > * config/i386/i386.cc (ix86_can_inline_p): If callee has
> > > default arch=x86-64 and tune=generic, do not block the
> > > inlining to its caller.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > * gcc.target/i386/inline_target_clones.c: New test.
> >
> > OK.
> >
> > In a follow-up patch, can you please document inlining rules involving
> > -march and -mtune to "x86 Function Attributes" section? Currently, the
> > inlining rules at the end of "target function attribute" section does
> > not even mention -march and -mtune. Maybe a subsubsection "Inlining
> > rules" should be added (like AArch64 has) to mention that only default
> > arch and tune are inlined by default (but inline can be forced with
> > always_inline for different mtune flags).
> >
> > Looking at the above, perhaps inlining of different arches can also be
> > forced with always_inline? This would allow developers some control of
> > inlining, and would not be surprising.
> >
> > Thanks,
> > Uros.
> >
> > > ---
> > > gcc/config/i386/i386.cc | 22 +++++++++++------
> > > .../gcc.target/i386/inline_target_clones.c | 24 +++++++++++++++++++
> > > 2 files changed, 39 insertions(+), 7 deletions(-)
> > > create mode 100644 gcc/testsuite/gcc.target/i386/inline_target_clones.c
> > >
> > > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> > > index 8989985700a..4741c9b5364 100644
> > > --- a/gcc/config/i386/i386.cc
> > > +++ b/gcc/config/i386/i386.cc
> > > @@ -605,13 +605,6 @@ ix86_can_inline_p (tree caller, tree callee)
> > > != (callee_opts->x_target_flags & ~always_inline_safe_mask))
> > > ret = false;
> > >
> > > - /* See if arch, tune, etc. are the same. */
> > > - else if (caller_opts->arch != callee_opts->arch)
> > > - ret = false;
> > > -
> > > - else if (!always_inline && caller_opts->tune != callee_opts->tune)
> > > - ret = false;
> > > -
> > > else if (caller_opts->x_ix86_fpmath != callee_opts->x_ix86_fpmath
> > > /* If the calle doesn't use FP expressions differences in
> > > ix86_fpmath can be ignored. We are called from FEs
> > > @@ -622,6 +615,21 @@ ix86_can_inline_p (tree caller, tree callee)
> > > || ipa_fn_summaries->get (callee_node)->fp_expressions))
> > > ret = false;
> > >
> > > + /* At this point we cannot identify whether arch or tune setting
> > > + comes from target attribute or not. So the most conservative way
> > > + is to allow the callee that uses default arch and tune string to
> > > + be inlined. */
> > > + else if (!strcmp (callee_opts->x_ix86_arch_string, "x86-64")
> > > + && !strcmp (callee_opts->x_ix86_tune_string, "generic"))
> > > + ret = true;
> > > +
> > > + /* See if arch, tune, etc. are the same. */
> > > + else if (caller_opts->arch != callee_opts->arch)
> > > + ret = false;
> > > +
> > > + else if (!always_inline && caller_opts->tune != callee_opts->tune)
> > > + ret = false;
> > > +
> > > else if (!always_inline
> > > && caller_opts->branch_cost != callee_opts->branch_cost)
> > > ret = false;
> > > diff --git a/gcc/testsuite/gcc.target/i386/inline_target_clones.c b/gcc/testsuite/gcc.target/i386/inline_target_clones.c
> > > new file mode 100644
> > > index 00000000000..53db1600ce5
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.target/i386/inline_target_clones.c
> > > @@ -0,0 +1,24 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-require-ifunc "" } */
> > > +/* { dg-options "-O3 -march=x86-64" } */
> > > +/* { dg-final { scan-assembler-not "call\[ \t\]+callee" } } */
> > > +
> > > +float callee (float a, float b, float c, float d,
> > > + float e, float f, float g, float h)
> > > +{
> > > + return a * b + c * d + e * f + g + h + a * c + b * c
> > > + + a * d + b * e + a * f + c * h +
> > > + b * (a - 0.4f) * (c + h) * (b + e * d) - a / f * h;
> > > +}
> > > +
> > > +__attribute__((target_clones("default","arch=icelake-server")))
> > > +void caller (int n, float *a,
> > > + float c1, float c2, float c3,
> > > + float c4, float c5, float c6,
> > > + float c7)
> > > +{
> > > + for (int i = 0; i < n; i++)
> > > + {
> > > + a[i] = callee (a[i], c1, c2, c3, c4, c5, c6, c7);
> > > + }
> > > +}
> > > --
> > > 2.31.1
> > >
next prev parent reply other threads:[~2023-07-04 8:57 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-07-04 3:12 Hongyu Wang
2023-07-04 6:18 ` Uros Bizjak
2023-07-04 8:25 ` Hongyu Wang
2023-07-04 8:57 ` Uros Bizjak [this message]
2023-07-06 0:37 ` Hongyu Wang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAFULd4YQE9SMNcnRAV1aqwX5Eyy7UZKddVUcxh=3KnBkU1_JBQ@mail.gmail.com' \
--to=ubizjak@gmail.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=hongtao.liu@intel.com \
--cc=hongyu.wang@intel.com \
--cc=wwwhhhyyy333@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).