From: Hongyu Wang <wwwhhhyyy333@gmail.com>
To: Uros Bizjak <ubizjak@gmail.com>
Cc: Hongyu Wang <hongyu.wang@intel.com>,
gcc-patches@gcc.gnu.org, hongtao.liu@intel.com
Subject: Re: [PATCH] i386: Relax inline requirement for functions with different target attrs
Date: Wed, 28 Jun 2023 09:49:38 +0800
Message-ID: <CA+OydWnpBmOk7Rdk=NDE2_jKj6gksiPL1SpKWfF9a6NhYw_2Aw@mail.gmail.com>
In-Reply-To: <CAFULd4YqNezFUGVW4xkXU=qYgKdZX_U11e-yPrZps23qqwG91g@mail.gmail.com>
> I don't think this is desirable. If we inline something with different
> ISAs, we get some strange mix of ISAs when the function is inlined.
> OTOH - we already inline with mismatched tune flags if the function is
> marked with always_inline.
ix86_can_inline_p already has this check before the arch/tune comparison:

  if (((caller_opts->x_ix86_isa_flags & callee_opts->x_ix86_isa_flags)
       != callee_opts->x_ix86_isa_flags)
      || ((caller_opts->x_ix86_isa_flags2 & callee_opts->x_ix86_isa_flags2)
	  != callee_opts->x_ix86_isa_flags2))
    ret = false;

It makes sure the caller's ISA is a superset of the callee's, so the
inlined code follows the caller's ISA specification.
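To make the superset test concrete, here is a minimal standalone sketch of the same bitmask idiom. The ISA_* bit values and the helper names are hypothetical and only for illustration; they are not GCC's real OPTION_MASK_ISA_* definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical ISA bits for illustration only; these are not the
   real OPTION_MASK_ISA_* values used by GCC.  */
#define ISA_SSE2    (UINT64_C (1) << 0)
#define ISA_AVX2    (UINT64_C (1) << 1)
#define ISA_AVX512F (UINT64_C (1) << 2)

/* Return true when every ISA bit enabled for the callee is also
   enabled for the caller, i.e. the caller is a superset.  */
static bool
isa_is_superset (uint64_t caller_isa, uint64_t callee_isa)
{
  return (caller_isa & callee_isa) == callee_isa;
}

/* Small usage example (hypothetical helper).  */
static void
isa_superset_demo (void)
{
  uint64_t avx2_level = ISA_SSE2 | ISA_AVX2;
  uint64_t avx512_level = avx2_level | ISA_AVX512F;

  /* A wider caller may absorb a narrower callee...  */
  assert (isa_is_superset (avx512_level, avx2_level));
  /* ...but not the other way round.  */
  assert (!isa_is_superset (avx2_level, avx512_level));
}
```

With this shape of check, inlining is only rejected when the callee needs an ISA bit that the caller does not enable.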
I cannot give a real example where the caller's performance is harmed
after inlining. I added the PVW check because some callee may want to
limit its vector size while the caller has a larger preferred vector
size. At least with the current change we get more optimization
opportunities for different target_clones.
But I agree that the tuning setting may be a factor that affects
performance. One possible choice is that if the tune for the callee is
unspecified or default, we just inline it into the caller with the
caller's specified arch and tune.
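A rough sketch of how the two rules above could combine, assuming a simplified options struct. The PVW_* names follow the patch, but the struct, the tune_specified flag, the integer tune id and the function names are hypothetical stand-ins, not the real GCC data structures.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the preferred-vector-width setting;
   PVW_NONE means no preference was specified.  */
enum pvw { PVW_NONE, PVW_AVX128, PVW_AVX256, PVW_AVX512 };

/* Hypothetical, reduced view of the per-function target options.  */
struct target_opts
{
  enum pvw prefer_vector_width;
  bool tune_specified;	/* Hypothetical: was tune= given explicitly?  */
  int tune;		/* Hypothetical tuning id.  */
};

/* Reject only when both sides specify a width and they differ.  */
static bool
pvw_compatible (const struct target_opts *caller,
		const struct target_opts *callee)
{
  if (caller->prefer_vector_width == PVW_NONE
      || callee->prefer_vector_width == PVW_NONE)
    return true;
  return caller->prefer_vector_width == callee->prefer_vector_width;
}

/* Proposed rule: a callee with unspecified/default tune simply
   takes the caller's tuning; otherwise the tunes must match.  */
static bool
tune_compatible (const struct target_opts *caller,
		 const struct target_opts *callee)
{
  return !callee->tune_specified || callee->tune == caller->tune;
}

static bool
can_inline_sketch (const struct target_opts *caller,
		   const struct target_opts *callee)
{
  return pvw_compatible (caller, callee)
	 && tune_compatible (caller, callee);
}
```

Under these assumptions, a callee that sets neither a vector width nor a tune is always inlinable as far as these two rules are concerned, while an explicit mismatch on either one blocks inlining.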
On Tue, Jun 27, 2023 at 17:16, Uros Bizjak via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>
> On Mon, Jun 26, 2023 at 4:36 AM Hongyu Wang <hongyu.wang@intel.com> wrote:
> >
> > Hi,
> >
> > For functions with different target attributes, the current logic refuses
> > to inline the callee when any arch or tune is mismatched. Relax the
> > condition to honor just prefer_vector_width_type and other flags that
> > may cause safety issues, so the caller can get more optimization opportunities.
>
> I don't think this is desirable. If we inline something with different
> ISAs, we get some strange mix of ISAs when the function is inlined.
> OTOH - we already inline with mismatched tune flags if the function is
> marked with always_inline.
>
> Uros.
>
> > Bootstrapped/regtested on x86_64-pc-linux-gnu{-m32,}
> >
> > Ok for trunk?
> >
> > gcc/ChangeLog:
> >
> > * config/i386/i386.cc (ix86_can_inline_p): Do not check arch or
> > tune directly, just check prefer_vector_width_type and make sure
> > not to inline if they mismatch.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/i386/inline-target-attr.c: New test.
> > ---
> > gcc/config/i386/i386.cc | 11 +++++----
> > .../gcc.target/i386/inline-target-attr.c | 24 +++++++++++++++++++
> > 2 files changed, 30 insertions(+), 5 deletions(-)
> > create mode 100644 gcc/testsuite/gcc.target/i386/inline-target-attr.c
> >
> > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> > index 0761965344b..1d86384ac06 100644
> > --- a/gcc/config/i386/i386.cc
> > +++ b/gcc/config/i386/i386.cc
> > @@ -605,11 +605,12 @@ ix86_can_inline_p (tree caller, tree callee)
> > != (callee_opts->x_target_flags & ~always_inline_safe_mask))
> > ret = false;
> >
> > - /* See if arch, tune, etc. are the same. */
> > - else if (caller_opts->arch != callee_opts->arch)
> > - ret = false;
> > -
> > - else if (!always_inline && caller_opts->tune != callee_opts->tune)
> > + /* Do not inline when specified prefer-vector-width mismatched between
> > + callee and caller. */
> > + else if ((callee_opts->x_prefer_vector_width_type != PVW_NONE
> > + && caller_opts->x_prefer_vector_width_type != PVW_NONE)
> > + && callee_opts->x_prefer_vector_width_type
> > + != caller_opts->x_prefer_vector_width_type)
> > ret = false;
> >
> > else if (caller_opts->x_ix86_fpmath != callee_opts->x_ix86_fpmath
> > diff --git a/gcc/testsuite/gcc.target/i386/inline-target-attr.c b/gcc/testsuite/gcc.target/i386/inline-target-attr.c
> > new file mode 100644
> > index 00000000000..995502165f0
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/inline-target-attr.c
> > @@ -0,0 +1,24 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2" } */
> > +/* { dg-final { scan-assembler-not "call\[ \t\]callee" } } */
> > +
> > +__attribute__((target("arch=skylake")))
> > +int callee (int n)
> > +{
> > + int sum = 0;
> > + for (int i = 0; i < n; i++)
> > + {
> > + if (i % 2 == 0)
> > + sum +=i;
> > + else
> > + sum += (i - 1);
> > + }
> > + return sum + n;
> > +}
> > +
> > +__attribute__((target("arch=icelake-server")))
> > +int caller (int n)
> > +{
> > + return callee (n) + n;
> > +}
> > +
> > --
> > 2.31.1
> >