Re: [PATCH] [i386] Recompute the frame layout less often

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: Daniel Santos <daniel.santos@pobox.com>
To: Bernd Edlinger <bernd.edlinger@hotmail.de>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>, Uros Bizjak <ubizjak@gmail.com>
Subject: Re: [PATCH] [i386] Recompute the frame layout less often
Date: Wed, 17 May 2017 02:12:00 -0000	[thread overview]
Message-ID: <ac395e89-e4ef-7450-5a9d-21e387d2d011@pobox.com> (raw)
In-Reply-To: <AM4PR0701MB21626CD7FEA8A316515C30A6E4E60@AM4PR0701MB2162.eurprd07.prod.outlook.com>

On 05/16/2017 02:52 PM, Bernd Edlinger wrote:
> I think I solved the problem with -fsplit-stack, I am not sure
> if ix86_static_chain_on_stack might change after reload due to
> final.c possibly calling targetm.calls.static_chain, but if that
> is the case, that is an already pre-existing problem.
>
> The goal of this patch is to make all decisions regarding the
> frame layout before the reload pass, and to make sure that
> the frame layout does not change unexpectedly it asserts
> that the data that goes into the decision does not change
> after reload_completed.
>
> With the attached patch -fsplit-stack and the attribute ms_hook_prologue
> is handed directly at the ix86_expand_call, because that data is
> already known before expansion.
>
> The calls_eh_return and ix86_static_chain_on_stack may become
> known at a later time, but after reload it should not change any more.
> To be sure, I added an assertion at ix86_static_chain, which the
> regression test did not trigger, neither with -m64 nor with -m32.
>
> I have bootstrapped the patch several times, and a few times I
> encounterd a segfault in the garbage collection, but it did not
> happen every time.  Currently I think that is unrelated to this patch.
>
>
> Bootstrapped and reg-tested on x86_64-pc-linux-gnu with -m64/-m32.
> Is it OK for trunk?
>
>
> Thanks
> Bernd.

With as many formatting errors as I seem to have had, I would like to 
fix those then you patch on top of that if you wouldn't mind terribly.  
While gcc uses subversion, git-blame is still very helpful (then again, 
since Uros committed it for me, I guess that's already off).


> Index: gcc/config/i386/i386.c
> ===================================================================
> --- gcc/config/i386/i386.c    (revision 248031)
> +++ gcc/config/i386/i386.c    (working copy)
> @@ -2425,7 +2425,9 @@ static int const x86_64_int_return_registers[4] =
>
>  /* Additional registers that are clobbered by SYSV calls.  */
>
> -unsigned const x86_64_ms_sysv_extra_clobbered_registers[12] =
> +#define NUM_X86_64_MS_CLOBBERED_REGS 12
> +static int const x86_64_ms_sysv_extra_clobbered_registers
> +         [NUM_X86_64_MS_CLOBBERED_REGS] =

Is there a reason you're changing this unsigned to signed int? While 
AX_REG and such are just preprocessor macros, everywhere else it seems 
that register numbers are dealt with as unsigned ints.

> @@ -2484,13 +2486,13 @@ class xlogue_layout {
>       needs to store registers based upon data in the 
> machine_function.  */
>    HOST_WIDE_INT get_stack_space_used () const
>    {
> -    const struct machine_function &m = *cfun->machine;
> -    unsigned last_reg = m.call_ms2sysv_extra_regs + MIN_REGS - 1;
> +    const struct machine_function *m = cfun->machine;
> +    unsigned last_reg = m->call_ms2sysv_extra_regs + MIN_REGS - 1;

What is the reason for this change?

>
> -    gcc_assert (m.call_ms2sysv_extra_regs <= MAX_EXTRA_REGS);
> +    gcc_assert (m->call_ms2sysv_extra_regs <= MAX_EXTRA_REGS);
>      return m_regs[last_reg].offset
> -        + (m.call_ms2sysv_pad_out ? 8 : 0)
> -        + STUB_INDEX_OFFSET;
> +       + (m->call_ms2sysv_pad_out ? 8 : 0)
> +       + STUB_INDEX_OFFSET;
>    }
>
>    /* Returns the offset for the base pointer used by the stub. */
> @@ -2532,7 +2534,7 @@ class xlogue_layout {
>    /* Lazy-inited cache of symbol names for stubs.  */
>    char 
> m_stub_names[XLOGUE_STUB_COUNT][VARIANT_COUNT][STUB_NAME_MAX_LEN];
>
> -  static const struct xlogue_layout GTY(()) 
> s_instances[XLOGUE_SET_COUNT];
> +  static const struct GTY(()) xlogue_layout 
> s_instances[XLOGUE_SET_COUNT];

Hmm, during development I originally had C-style xlogue_layout as a 
struct and later decided to make it a class and apparently forgot to 
remove the "struct" here.  None the less, it's bazaar that the GTY() 
would go in between the "struct" and the "xlogue_layout."  As I said 
before, I don't fully understand how this GTY works.  Can we just remove 
the "struct" keyword?

Also, if the way I had it was wrong, (and resulted in garbage collection 
not working right) then perhaps it was the cause of a problem I had with 
caching symbol rtx objects.  I could not get this to work because my 
cached objects would somehow become stale and I've since removed that 
code (from xlogue_layout::get_stub_rtx).  (i.e., does GTY effect 
lifespan of globals, TU statics and static C++ data members?)

>  /* Constructor for xlogue_layout.  */
> @@ -2639,11 +2643,11 @@ xlogue_layout::xlogue_layout (HOST_WIDE_INT 
> stack_
>    : m_hfp (hfp) , m_nregs (hfp ? 17 : 18),
>      m_stack_align_off_in (stack_align_off_in)
>  {
> +  HOST_WIDE_INT offset = stack_align_off_in;
> +  unsigned i, j;
> +
>    memset (m_regs, 0, sizeof (m_regs));
>    memset (m_stub_names, 0, sizeof (m_stub_names));
> -
> -  HOST_WIDE_INT offset = stack_align_off_in;
> -  unsigned i, j;
>    for (i = j = 0; i < MAX_REGS; ++i)
>      {
>        unsigned regno = REG_ORDER[i];
> @@ -2662,11 +2666,12 @@ xlogue_layout::xlogue_layout (HOST_WIDE_INT 
> stack_
>        m_regs[j].regno    = regno;
>        m_regs[j++].offset = offset - STUB_INDEX_OFFSET;
>      }
> -    gcc_assert (j == m_nregs);
> +  gcc_assert (j == m_nregs);
>  }

Aside from my formatting errors,  this would actually be incorrect per 
the GNU coding conventions 
(https://gcc.gnu.org/codingconventions.html#Variable), which is probably 
based off of Effective C++ Item 26.  Obviously, we're still dealing with 
part of this as classical C++ and the rest as if it were C, but I'm 
trying to follow C++ norms in C++ classes and member functions.

> @@ -2676,7 +2681,7 @@ xlogue_layout::xlogue_layout (HOST_WIDE_INT stack_
>      {
>        int res = snprintf (name, STUB_NAME_MAX_LEN, "__%s_%u",
>                STUB_BASE_NAMES[stub], MIN_REGS + n_extra_regs);
> -      gcc_checking_assert (res <= (int)STUB_NAME_MAX_LEN);
> +      gcc_checking_assert (res < (int)STUB_NAME_MAX_LEN);

Good catch! Thank you.

> @@ -12634,10 +12573,6 @@ ix86_hard_regno_scratch_ok (unsigned int regno)
>            && df_regs_ever_live_p (regno)));
>  }
>
> -/* Registers who's save & restore will be managed by stubs called from
> -   pro/epilogue.  */
> -static HARD_REG_SET GTY(()) stub_managed_regs;
> -
>  /* Return true if register class CL should be an additional allocno
>     class.  */
>
> @@ -12718,10 +12653,17 @@ ix86_save_reg (unsigned int regno, bool 
> maybe_eh_r
>      }
>      }
>
> -  if (ignore_outlined && cfun->machine->call_ms2sysv
> -      && in_hard_reg_set_p (stub_managed_regs, DImode, regno))
> -    return false;
> +  if (ignore_outlined && cfun->machine->call_ms2sysv)
> +    {
> +      /* Registers who's save & restore will be managed by stubs 
> called from
> +     pro/epilogue.  */
> +      HARD_REG_SET stub_managed_regs;
> +      xlogue_layout::compute_stub_managed_regs (stub_managed_regs);
>
> +      if (in_hard_reg_set_p (stub_managed_regs, DImode, regno))
> +    return false;
> +    }
> +
>    if (crtl->drap_reg
>        && regno == REGNO (crtl->drap_reg)
>        && !cfun->machine->no_drap_save_restore)

This makes no sense.  The entire purpose of stub_managed_regs is to 
cache the result of xlogue_layout::compute_stub_managed_regs() and this 
would unnecessarily repeat that calculation for each time 
ix86_save_reg() is called.  Since 
xlogue_layout::compute_stub_managed_regs() calls ix86_save_reg many 
times, this makes it even worse.Which registers are being saved 
out-of-line and inline MUST be known at the time the stack layout is 
determined.  So stub_managed_regsshould either be left a TU static or 
just moved to struct machine_function.

As an aside, I've noticed that xlogue_layout::compute_stub_managed_regs 
is calling ix86_save_reg (which isn't trivial) more often than it really 
has to, so I've refactored it.

>
> -/* Disables out-of-lined msabi to sysv pro/epilogues and emits a 
> warning if
> -   warn_once is null, or *warn_once is zero.  */
> -static void disable_call_ms2sysv_xlogues (const char *feature)
> +/* Emits a warning for unsupported msabi to sysv pro/epilogues.  */
> +static void warn_once_call_ms2sysv_xlogues (const char *feature)
>  {
> -  cfun->machine->call_ms2sysv = false;
> -  warning (OPT_mcall_ms2sysv_xlogues, "not currently compatible with 
> %s.",
> -       feature);
> +  static bool warned_once = false;
> +  if (!warned_once)
> +    {
> +      warning (0, "-mcall-ms2sysv-xlogues is not compatible with %s",
> +           feature);
> +      warned_once = true;
> +    }
>  }
>

We probably don't want to suppress all warnings across cases.  I've got 
a new version of this that takes a mask for the "warned":

static void disable_call_ms2sysv_xlogues (const char *feature, int warned_mask)
{
   static int warned = 0;
   cfun->machine->call_ms2sysv = false;

   if (!(warned & warned_mask))
     {
       warning (OPT_mcall_ms2sysv_xlogues, "not currently compatible with %s.",
	       feature);
       warned |= warned_mask;
     }
}


>  /* When using -fsplit-stack, the allocation routines set a field in
> @@ -12836,8 +12780,9 @@ ix86_builtin_setjmp_frame_value (void)
>  /* Fill structure ix86_frame about frame of currently computed 
> function.  */
>
>  static void
> -ix86_compute_frame_layout (struct ix86_frame *frame)
> +ix86_compute_frame_layout (void)
>  {
> +  struct ix86_frame *frame = &cfun->machine->frame;
>    struct machine_function *m = cfun->machine;
>    unsigned HOST_WIDE_INT stack_alignment_needed;
>    HOST_WIDE_INT offset;
> @@ -12845,46 +12790,41 @@ static void
>    HOST_WIDE_INT size = get_frame_size ();
>    HOST_WIDE_INT to_allocate;
>
> -  CLEAR_HARD_REG_SET (stub_managed_regs);
> -
>    /* m->call_ms2sysv is initially enabled in ix86_expand_call for all 
> 64-bit
>     * ms_abi functions that call a sysv function.  We now need to 
> prune away
>     * cases where it should be disabled.  */
>    if (TARGET_64BIT && m->call_ms2sysv)
> -  {
> -    gcc_assert (TARGET_64BIT_MS_ABI);
> -    gcc_assert (TARGET_CALL_MS2SYSV_XLOGUES);
> -    gcc_assert (!TARGET_SEH);
> +    {
> +      gcc_assert (TARGET_64BIT_MS_ABI);
> +      gcc_assert (TARGET_CALL_MS2SYSV_XLOGUES);
> +      gcc_assert (!TARGET_SEH);
> +      gcc_assert (TARGET_SSE);
> +      gcc_assert (!ix86_using_red_zone ());
>
> -    if (!TARGET_SSE)
> -      m->call_ms2sysv = false;
> +      if (crtl->calls_eh_return)
> +    {
> +      gcc_assert (!reload_completed);
> +      m->call_ms2sysv = false;
> +      warn_once_call_ms2sysv_xlogues ("__builtin_eh_return");
> +    }
>
> -    /* Don't break hot-patched functions.  */
> -    else if (ix86_function_ms_hook_prologue (current_function_decl))
> -      m->call_ms2sysv = false;
> +      else if (ix86_static_chain_on_stack)
> +    {
> +      gcc_assert (!reload_completed);
> +      m->call_ms2sysv = false;
> +      warn_once_call_ms2sysv_xlogues ("static call chains");
> +    }
>
> -    /* TODO: Cases not yet examined.  */
> -    else if (crtl->calls_eh_return)
> -      disable_call_ms2sysv_xlogues ("__builtin_eh_return");
> +      /* Finally, compute which registers the stub will manage.  */
> +      else
> +        {
> +      HARD_REG_SET stub_managed_regs;
> +      unsigned count = xlogue_layout
> +               ::compute_stub_managed_regs (stub_managed_regs);
> +      m->call_ms2sysv_extra_regs = count - xlogue_layout::MIN_REGS;
> +    }
> +    }
>
> -    else if (ix86_static_chain_on_stack)
> -      disable_call_ms2sysv_xlogues ("static call chains");
> -
> -    else if (ix86_using_red_zone ())
> -      disable_call_ms2sysv_xlogues ("red zones");
> -
> -    else if (flag_split_stack)
> -      disable_call_ms2sysv_xlogues ("split stack");
> -
> -    /* Finally, compute which registers the stub will manage. */
> -    else
> -      {
> -    unsigned count = xlogue_layout
> -             ::compute_stub_managed_regs (stub_managed_regs);
> -    m->call_ms2sysv_extra_regs = count - xlogue_layout::MIN_REGS;
> -      }
> -  }
> -

> @@ -29320,7 +29265,24 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx 
> call
>
>        /* Set here, but it may get cleared later.  */
>        if (TARGET_CALL_MS2SYSV_XLOGUES)
> -    cfun->machine->call_ms2sysv = true;
> +    {
> +      if (!TARGET_SSE)
> +        ;
> +
> +      /* Don't break hot-patched functions.  */
> +      else if (ix86_function_ms_hook_prologue (current_function_decl))
> +        ;
> +
> +      /* TODO: Cases not yet examined.  */
> +      else if (flag_split_stack)
> +        warn_once_call_ms2sysv_xlogues ("-fsplit-stack");
> +
> +      else
> +        {
> +          gcc_assert (!reload_completed);
> +          cfun->machine->call_ms2sysv = true;
> +        }
> +    }
>      }
>

Other than the local compute_stub_managed_regs (which is a problem), 
this looks like a very good improvement.  It clarifies conditions that 
will not change over the course of compiling a function (split stack, 
ms_hook_prologue, etc.) and those that can.

After some thought, I've decided that it's not better to use a sorry() 
to filter out -fsplit-stack combined with -march-ms2sysv-xlogues because 
it would break support for a TU that uses both via function attributes.  
Example:

void __attribute__((ms_abi))
a (void)
{
   call_a_sysv_fn ();
   /* stuff */
}

void __attribute__((optimize("-fsplit-stack")))
b(void)
{
   /* stuff */
}


So you're fixes for this are better.

Thanks,
Daniel

next prev parent reply	other threads:[~2017-05-17  1:57 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-05-14  9:11 Bernd Edlinger
2017-05-14  9:58 ` Daniel Santos
2017-05-14 10:27   ` Uros Bizjak
2017-05-14 18:11     ` Bernd Edlinger
2017-05-14 20:37       ` Daniel Santos
2017-05-15  2:23       ` Daniel Santos
2017-05-15 20:41         ` Bernd Edlinger
2017-05-15 23:52           ` Daniel Santos
2017-05-16  5:42             ` Daniel Santos
2017-05-16  8:52               ` Bernd Edlinger
2017-05-16 15:11                 ` Daniel Santos
2017-05-16 17:38               ` Ian Lance Taylor via gcc-patches
2017-05-16 20:10                 ` Bernd Edlinger
2017-05-16 21:05                   ` Bernd Edlinger
2017-05-17  2:12                   ` Daniel Santos [this message]
2017-05-17 17:43                     ` Bernd Edlinger
2017-05-18  7:05                       ` Daniel Santos
2017-05-18 13:48                     ` Bernd Edlinger
2017-05-19  3:13                       ` Daniel Santos
2017-05-22 18:32                         ` Bernd Edlinger
2017-05-23 14:34                           ` Bernd Edlinger
2017-05-24  6:49                             ` Daniel Santos
2017-06-01 16:06                             ` [PING][PATCH] " Bernd Edlinger
2017-06-01 18:18                             ` [PATCH] " Uros Bizjak
2017-05-23 23:01                           ` Daniel Santos
2017-05-16 21:35                 ` Daniel Santos
2017-05-17 18:41         ` Bernd Edlinger
2017-05-18  7:14           ` Daniel Santos
2017-05-19  5:12             ` Daniel Santos

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ac395e89-e4ef-7450-5a9d-21e387d2d011@pobox.com \
    --to=daniel.santos@pobox.com \
    --cc=bernd.edlinger@hotmail.de \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=ubizjak@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).