public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Hongtao Liu <crazylht@gmail.com>
To: Jakub Jelinek <jakub@redhat.com>, Uros Bizjak <ubizjak@gmail.com>,
	 Hongtao Liu <crazylht@gmail.com>,
	GCC Patches <gcc-patches@gcc.gnu.org>,
	 "H. J. Lu" <hjl.tools@gmail.com>,
	Richard Sandiford <richard.sandiford@arm.com>
Subject: Re: [PATCH] [i386] Fix _mm256_zeroupper to notify LRA that vzeroupper will kill sse registers. [PR target/82735]
Date: Mon, 17 May 2021 16:44:30 +0800	[thread overview]
Message-ID: <CAMZc-bxb_15p-tLjY-UfgP9EyFmxhXW5mMJFbu02V+QXbimoFQ@mail.gmail.com> (raw)
In-Reply-To: <CAMZc-bweYecznWAmDk5kZrHOgr2zxQqAf=9jVSizF+AFQRqL2A@mail.gmail.com>

On Fri, May 14, 2021 at 10:27 AM Hongtao Liu <crazylht@gmail.com> wrote:
>
> On Thu, May 13, 2021 at 7:52 PM Richard Sandiford
> <richard.sandiford@arm.com> wrote:
> >
> > Jakub Jelinek <jakub@redhat.com> writes:
> > > On Thu, May 13, 2021 at 12:32:26PM +0100, Richard Sandiford wrote:
> > >> Jakub Jelinek <jakub@redhat.com> writes:
> > >> > On Thu, May 13, 2021 at 11:43:19AM +0200, Uros Bizjak wrote:
> > >> >> > >   Bootstrapped and regtested on X86_64-linux-gnu{-m32,}
> > >> >> > >   Ok for trunk?
> > >> >> >
> > >> >> > Some time ago a support for CLOBBER_HIGH RTX was added (and later
> > >> >> > removed for some reason). Perhaps we could resurrect the patch for the
> > >> >> > purpose of ferrying 128bit modes via vzeroupper RTX?
> > >> >>
> > >> >> https://gcc.gnu.org/legacy-ml/gcc-patches/2017-11/msg01325.html
> > >> >
> > >> > https://gcc.gnu.org/legacy-ml/gcc-patches/2019-09/msg01468.html
> > >> > is where it got removed, CCing Richard.
> > >>
> > >> Yeah.  Initially clobber_high seemed like the best appraoch for
> > >> handling the tlsdesc thing, but in practice it was too difficult
> > >> to shoe-horn the concept in after the fact, when so much rtl
> > >> infrastructure wasn't prepared to deal with it.  The old support
> > >> didn't handle all cases and passes correctly, and handled others
> > >> suboptimally.
> > >>
> > >> I think it would be worth using the same approach as
> > >> https://gcc.gnu.org/legacy-ml/gcc-patches/2019-09/msg01466.html for
> > >> vzeroupper: represent the instructions as call_insns in which the
> > >> call has a special vzeroupper ABI.  I think that's likely to lead
> > >> to better code than clobber_high would (or at least, it did for tlsdesc).
>
> From an implementation perspective, I guess you're meaning we should
> implement TARGET_INSN_CALLEE_ABI and TARGET_FNTYPE_ABI in the i386
> backend.
>
When I implemented the vzeroupper pattern as call_insn and defined
TARGET_INSN_CALLEE_ABI for it, I got several failures. they're related
to 2 parts

1. requires_stack_frame_p return true for vzeroupper which should be false.
2. in subst_stack_regs, vzeroupper shouldn't kill arguments

I've tried a rough patch like below, it works for those failures,
unfortunately, I don't have an arm machine to test, so I want to ask
would the below change break something in the arm backend?

modified   gcc/reg-stack.c
@@ -174,6 +174,7 @@
 #include "reload.h"
 #include "tree-pass.h"
 #include "rtl-iter.h"
+#include "function-abi.h"

 #ifdef STACK_REGS

@@ -2385,7 +2386,7 @@ subst_stack_regs (rtx_insn *insn, stack_ptr regstack)
   bool control_flow_insn_deleted = false;
   int i;

-  if (CALL_P (insn))
+  if (CALL_P (insn) && insn_callee_abi (insn).id () == 0)
     {
       int top = regstack->top;

modified   gcc/shrink-wrap.c
@@ -58,7 +58,12 @@ requires_stack_frame_p (rtx_insn *insn,
HARD_REG_SET prologue_used,
   unsigned regno;

   if (CALL_P (insn))
-    return !SIBLING_CALL_P (insn);
+    {
+      if (insn_callee_abi (insn).id() != 0)
+ return false;
+      else
+ return !SIBLING_CALL_P (insn);
+    }

   /* We need a frame to get the unique CFA expected by the unwinder.  */
   if (cfun->can_throw_non_call_exceptions && can_throw_internal (insn))
> > >
> > > Perhaps a magic call_insn that is split post-reload into a normal insn
> > > with the sets then?
> >
> > I'd be tempted to treat it is a call_insn throughout.  The unspec_volatile
> > means that we can't move the instruction, so converting a call_insn to an
> > insn isn't likely to help from that point of view.  The sets are also
> > likely to be handled suboptimally compared to the more accurate register
> > information attached to the call: all code that handles calls has to be
> > prepared to deal with partial clobbers, whereas most code dealing with
> > sets will assume that the set does useful work, and that the rhs of the
> > set is live.
> >
> > Thanks,
> > Richard
> >
>
>
> --
> BR,
> Hongtao



-- 
BR,
Hongtao

  reply	other threads:[~2021-05-17  8:40 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-13  9:23 Hongtao Liu
2021-05-13  9:40 ` Uros Bizjak
2021-05-13  9:43   ` Uros Bizjak
2021-05-13  9:54     ` Jakub Jelinek
2021-05-13 11:32       ` Richard Sandiford
2021-05-13 11:37         ` Jakub Jelinek
2021-05-13 11:52           ` Richard Sandiford
2021-05-14  2:27             ` Hongtao Liu
2021-05-17  8:44               ` Hongtao Liu [this message]
2021-05-17  9:56                 ` Richard Sandiford
2021-05-18 13:12                   ` Hongtao Liu
2021-05-18 15:18                     ` Richard Sandiford
2021-05-25  6:04                       ` Hongtao Liu
2021-05-25  6:30                         ` Hongtao Liu
2021-05-27  5:07                           ` Hongtao Liu
2021-05-27  7:05                             ` Uros Bizjak
2021-06-01  2:24                               ` Hongtao Liu
2021-06-03  6:54                               ` [PATCH 1/2] CALL_INSN may not be a real function call liuhongt
2021-06-03  6:54                                 ` [PATCH 2/2] Fix _mm256_zeroupper by representing the instructions as call_insns in which the call has a special vzeroupper ABI liuhongt
2021-06-04  2:56                                   ` Hongtao Liu
2021-06-04  6:26                                   ` Uros Bizjak
2021-06-04  6:34                                     ` Hongtao Liu
2021-06-07 19:04                                       ` [PATCH] x86: Don't compile pr82735-[345].c for x32 H.J. Lu
2021-06-04  2:55                                 ` [PATCH 1/2] CALL_INSN may not be a real function call Hongtao Liu
2021-06-04  7:50                                 ` Jakub Jelinek
2021-07-05 23:30                                 ` Segher Boessenkool
2021-07-06  0:03                                   ` Jeff Law
2021-07-06  1:49                                     ` Hongtao Liu
2021-07-07 14:55                                     ` Segher Boessenkool
2021-07-07 17:56                                       ` Jeff Law
2021-07-06  1:37                                   ` Hongtao Liu
2021-07-07  2:44                                     ` Hongtao Liu
2021-07-07  8:15                                       ` Richard Biener
2021-07-07 14:52                                         ` Segher Boessenkool
2021-07-07 15:23                                           ` Hongtao Liu
2021-07-07 23:42                                             ` Segher Boessenkool
2021-07-08  4:14                                               ` Hongtao Liu
2021-07-07 15:32                                           ` Hongtao Liu
2021-07-07 23:54                                             ` Segher Boessenkool
2021-07-09  7:20                                               ` Hongtao Liu
2021-07-07 15:52                                         ` Hongtao Liu
2021-05-27  7:20                             ` [PATCH] [i386] Fix _mm256_zeroupper to notify LRA that vzeroupper will kill sse registers. [PR target/82735] Jakub Jelinek
2021-05-27 10:50                               ` Richard Sandiford
2021-06-01  2:22                                 ` Hongtao Liu
2021-06-01  2:25                                   ` Hongtao Liu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAMZc-bxb_15p-tLjY-UfgP9EyFmxhXW5mMJFbu02V+QXbimoFQ@mail.gmail.com \
    --to=crazylht@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=hjl.tools@gmail.com \
    --cc=jakub@redhat.com \
    --cc=richard.sandiford@arm.com \
    --cc=ubizjak@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).