public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Hongtao Liu <crazylht@gmail.com>
To: Hongtao Liu via Gcc-patches <gcc-patches@gcc.gnu.org>,
	Jakub Jelinek <jakub@redhat.com>,
	 Uros Bizjak <ubizjak@gmail.com>,
	Hongtao Liu <crazylht@gmail.com>,
	"H. J. Lu" <hjl.tools@gmail.com>,
	 Richard Sandiford <richard.sandiford@arm.com>
Subject: Re: [PATCH] [i386] Fix _mm256_zeroupper to notify LRA that vzeroupper will kill sse registers. [PR target/82735]
Date: Tue, 18 May 2021 21:12:03 +0800	[thread overview]
Message-ID: <CAMZc-bxjHPWUTTpZih7d_y1wTQLW_6Qcva=W7rq3Awk07Dz-jw@mail.gmail.com> (raw)
In-Reply-To: <mpt7djxy98p.fsf@arm.com>

On Mon, May 17, 2021 at 5:56 PM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> Hongtao Liu via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> > On Fri, May 14, 2021 at 10:27 AM Hongtao Liu <crazylht@gmail.com> wrote:
> >>
> >> On Thu, May 13, 2021 at 7:52 PM Richard Sandiford
> >> <richard.sandiford@arm.com> wrote:
> >> >
> >> > Jakub Jelinek <jakub@redhat.com> writes:
> >> > > On Thu, May 13, 2021 at 12:32:26PM +0100, Richard Sandiford wrote:
> >> > >> Jakub Jelinek <jakub@redhat.com> writes:
> >> > >> > On Thu, May 13, 2021 at 11:43:19AM +0200, Uros Bizjak wrote:
> >> > >> >> > >   Bootstrapped and regtested on X86_64-linux-gnu{-m32,}
> >> > >> >> > >   Ok for trunk?
> >> > >> >> >
> >> > >> >> > Some time ago a support for CLOBBER_HIGH RTX was added (and later
> >> > >> >> > removed for some reason). Perhaps we could resurrect the patch for the
> >> > >> >> > purpose of ferrying 128bit modes via vzeroupper RTX?
> >> > >> >>
> >> > >> >> https://gcc.gnu.org/legacy-ml/gcc-patches/2017-11/msg01325.html
> >> > >> >
> >> > >> > https://gcc.gnu.org/legacy-ml/gcc-patches/2019-09/msg01468.html
> >> > >> > is where it got removed, CCing Richard.
> >> > >>
> >> > >> Yeah.  Initially clobber_high seemed like the best appraoch for
> >> > >> handling the tlsdesc thing, but in practice it was too difficult
> >> > >> to shoe-horn the concept in after the fact, when so much rtl
> >> > >> infrastructure wasn't prepared to deal with it.  The old support
> >> > >> didn't handle all cases and passes correctly, and handled others
> >> > >> suboptimally.
> >> > >>
> >> > >> I think it would be worth using the same approach as
> >> > >> https://gcc.gnu.org/legacy-ml/gcc-patches/2019-09/msg01466.html for
> >> > >> vzeroupper: represent the instructions as call_insns in which the
> >> > >> call has a special vzeroupper ABI.  I think that's likely to lead
> >> > >> to better code than clobber_high would (or at least, it did for tlsdesc).
> >>
> >> From an implementation perspective, I guess you're meaning we should
> >> implement TARGET_INSN_CALLEE_ABI and TARGET_FNTYPE_ABI in the i386
> >> backend.
> >>
> > When I implemented the vzeroupper pattern as call_insn and defined
> > TARGET_INSN_CALLEE_ABI for it, I got several failures. they're related
> > to 2 parts
> >
> > 1. requires_stack_frame_p return true for vzeroupper which should be false.
> > 2. in subst_stack_regs, vzeroupper shouldn't kill arguments
> >
> > I've tried a rough patch like below, it works for those failures,
> > unfortunately, I don't have an arm machine to test, so I want to ask
> > would the below change break something in the arm backend?
>
> ABI id 0 just means the default ABI.  Real calls can use other ABIs
> besides the default.  That said…
>
> > modified   gcc/reg-stack.c
> > @@ -174,6 +174,7 @@
> >  #include "reload.h"
> >  #include "tree-pass.h"
> >  #include "rtl-iter.h"
> > +#include "function-abi.h"
> >
> >  #ifdef STACK_REGS
> >
> > @@ -2385,7 +2386,7 @@ subst_stack_regs (rtx_insn *insn, stack_ptr regstack)
> >    bool control_flow_insn_deleted = false;
> >    int i;
> >
> > -  if (CALL_P (insn))
> > +  if (CALL_P (insn) && insn_callee_abi (insn).id () == 0)
> >      {
> >        int top = regstack->top;
>
> …reg-stack.c is effectively x86-specific code, so checking id 0 here
> wouldn't affect anything else.  It doesn't feel very future-proof
> though, since x86 could use ABIs other than 0 for real calls in future.
>
> AIUI the property that matters here isn't the ABI, but that the target
> of the call doesn't reference stack registers.  That can be true for
> real calls too, with -fipa-ra.
>
> > modified   gcc/shrink-wrap.c
> > @@ -58,7 +58,12 @@ requires_stack_frame_p (rtx_insn *insn,
> > HARD_REG_SET prologue_used,
> >    unsigned regno;
> >
> >    if (CALL_P (insn))
> > -    return !SIBLING_CALL_P (insn);
> > +    {
> > +      if (insn_callee_abi (insn).id() != 0)
> > + return false;
> > +      else
> > + return !SIBLING_CALL_P (insn);
> > +    }
>
> TBH I'm not sure why off-hand this function needs to treat non-sibling
> calls specially, rather than rely on normal DF information.  Calls have
> a use of the stack pointer, so we should return true for that reason:
>
>         /* The stack ptr is used (honorarily) by a CALL insn.  */
>         df_ref_record (DF_REF_BASE, collection_rec, regno_reg_rtx[i],
>                        NULL, bb, insn_info, DF_REF_REG_USE,
>                        DF_REF_CALL_STACK_USAGE | flags);
>
> I guess this is something we should suppress for fake calls though.
>
> It looks like the rtx “used” flag is unused for INSNs, so we could
> use that as a CALL_INSN flag that indicates a fake call.  We could just
> need to make:
>
>       /* For all other RTXes clear the used flag on the copy.  */
>       RTX_FLAG (copy, used) = 0;
>
> conditional on !INSN_P.
>
I got another error in

@@ -83,6 +83,9 @@ control_flow_insn_p (const rtx_insn *insn)
       return true;

     case CALL_INSN:
+      /* CALL_INSN use "used" flag to indicate it's a fake call.  */
+      if (RTX_FLAG (insn, used))
+ break;

and performance issue in

modified   gcc/final.c
@@ -4498,7 +4498,8 @@ leaf_function_p (void)
   for (insn = get_insns (); insn; insn = NEXT_INSN (insn))
     {
       if (CALL_P (insn)
-   && ! SIBLING_CALL_P (insn))
+   && ! SIBLING_CALL_P (insn)
+   && !RTX_FLAG (insn, used))
  return 0;
       if (NONJUMP_INSN_P (insn)

Also i grep CALL_P or CALL_INSN in GCC source codes, there are many
places which hold the assumption CALL_P/CALL_INSN is a real call.
Considering that vzeroupper is used a lot on the i386 backend, I'm a
bit worried that this implementation solution will be a bottomless
pit.

> Thanks,
> Richard



-- 
BR,
Hongtao

  reply	other threads:[~2021-05-18 13:07 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-13  9:23 Hongtao Liu
2021-05-13  9:40 ` Uros Bizjak
2021-05-13  9:43   ` Uros Bizjak
2021-05-13  9:54     ` Jakub Jelinek
2021-05-13 11:32       ` Richard Sandiford
2021-05-13 11:37         ` Jakub Jelinek
2021-05-13 11:52           ` Richard Sandiford
2021-05-14  2:27             ` Hongtao Liu
2021-05-17  8:44               ` Hongtao Liu
2021-05-17  9:56                 ` Richard Sandiford
2021-05-18 13:12                   ` Hongtao Liu [this message]
2021-05-18 15:18                     ` Richard Sandiford
2021-05-25  6:04                       ` Hongtao Liu
2021-05-25  6:30                         ` Hongtao Liu
2021-05-27  5:07                           ` Hongtao Liu
2021-05-27  7:05                             ` Uros Bizjak
2021-06-01  2:24                               ` Hongtao Liu
2021-06-03  6:54                               ` [PATCH 1/2] CALL_INSN may not be a real function call liuhongt
2021-06-03  6:54                                 ` [PATCH 2/2] Fix _mm256_zeroupper by representing the instructions as call_insns in which the call has a special vzeroupper ABI liuhongt
2021-06-04  2:56                                   ` Hongtao Liu
2021-06-04  6:26                                   ` Uros Bizjak
2021-06-04  6:34                                     ` Hongtao Liu
2021-06-07 19:04                                       ` [PATCH] x86: Don't compile pr82735-[345].c for x32 H.J. Lu
2021-06-04  2:55                                 ` [PATCH 1/2] CALL_INSN may not be a real function call Hongtao Liu
2021-06-04  7:50                                 ` Jakub Jelinek
2021-07-05 23:30                                 ` Segher Boessenkool
2021-07-06  0:03                                   ` Jeff Law
2021-07-06  1:49                                     ` Hongtao Liu
2021-07-07 14:55                                     ` Segher Boessenkool
2021-07-07 17:56                                       ` Jeff Law
2021-07-06  1:37                                   ` Hongtao Liu
2021-07-07  2:44                                     ` Hongtao Liu
2021-07-07  8:15                                       ` Richard Biener
2021-07-07 14:52                                         ` Segher Boessenkool
2021-07-07 15:23                                           ` Hongtao Liu
2021-07-07 23:42                                             ` Segher Boessenkool
2021-07-08  4:14                                               ` Hongtao Liu
2021-07-07 15:32                                           ` Hongtao Liu
2021-07-07 23:54                                             ` Segher Boessenkool
2021-07-09  7:20                                               ` Hongtao Liu
2021-07-07 15:52                                         ` Hongtao Liu
2021-05-27  7:20                             ` [PATCH] [i386] Fix _mm256_zeroupper to notify LRA that vzeroupper will kill sse registers. [PR target/82735] Jakub Jelinek
2021-05-27 10:50                               ` Richard Sandiford
2021-06-01  2:22                                 ` Hongtao Liu
2021-06-01  2:25                                   ` Hongtao Liu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAMZc-bxjHPWUTTpZih7d_y1wTQLW_6Qcva=W7rq3Awk07Dz-jw@mail.gmail.com' \
    --to=crazylht@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=hjl.tools@gmail.com \
    --cc=jakub@redhat.com \
    --cc=richard.sandiford@arm.com \
    --cc=ubizjak@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).