public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Bernd Edlinger <bernd.edlinger@hotmail.de>
To: Daniel Santos <daniel.santos@pobox.com>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>, Uros Bizjak <ubizjak@gmail.com>
Subject: Re: [PATCH] [i386] Recompute the frame layout less often
Date: Wed, 17 May 2017 17:43:00 -0000	[thread overview]
Message-ID: <AM4PR0701MB2162FAB6687910F8B4AA971CE4E70@AM4PR0701MB2162.eurprd07.prod.outlook.com> (raw)
In-Reply-To: <ac395e89-e4ef-7450-5a9d-21e387d2d011@pobox.com>

On 05/17/17 04:01, Daniel Santos wrote:
> On 05/16/2017 02:52 PM, Bernd Edlinger wrote:
>> I think I solved the problem with -fsplit-stack, I am not sure
>> if ix86_static_chain_on_stack might change after reload due to
>> final.c possibly calling targetm.calls.static_chain, but if that
>> is the case, that is an already pre-existing problem.
>>
>> The goal of this patch is to make all decisions regarding the
>> frame layout before the reload pass, and to make sure that
>> the frame layout does not change unexpectedly it asserts
>> that the data that goes into the decision does not change
>> after reload_completed.
>>
>> With the attached patch -fsplit-stack and the attribute ms_hook_prologue
>> is handed directly at the ix86_expand_call, because that data is
>> already known before expansion.
>>
>> The calls_eh_return and ix86_static_chain_on_stack may become
>> known at a later time, but after reload it should not change any more.
>> To be sure, I added an assertion at ix86_static_chain, which the
>> regression test did not trigger, neither with -m64 nor with -m32.
>>
>> I have bootstrapped the patch several times, and a few times I
>> encounterd a segfault in the garbage collection, but it did not
>> happen every time.  Currently I think that is unrelated to this patch.
>>
>>
>> Bootstrapped and reg-tested on x86_64-pc-linux-gnu with -m64/-m32.
>> Is it OK for trunk?
>>
>>
>> Thanks
>> Bernd.
>
> With as many formatting errors as I seem to have had, I would like to
> fix those then you patch on top of that if you wouldn't mind terribly.
> While gcc uses subversion, git-blame is still very helpful (then again,
> since Uros committed it for me, I guess that's already off).
>

Apologies if I ruined your patch...

>
>> Index: gcc/config/i386/i386.c
>> ===================================================================
>> --- gcc/config/i386/i386.c    (revision 248031)
>> +++ gcc/config/i386/i386.c    (working copy)
>> @@ -2425,7 +2425,9 @@ static int const x86_64_int_return_registers[4] =
>>
>>  /* Additional registers that are clobbered by SYSV calls.  */
>>
>> -unsigned const x86_64_ms_sysv_extra_clobbered_registers[12] =
>> +#define NUM_X86_64_MS_CLOBBERED_REGS 12
>> +static int const x86_64_ms_sysv_extra_clobbered_registers
>> +         [NUM_X86_64_MS_CLOBBERED_REGS] =
>
> Is there a reason you're changing this unsigned to signed int? While
> AX_REG and such are just preprocessor macros, everywhere else it seems
> that register numbers are dealt with as unsigned ints.
>

I actually there seems to be confusion about "int" vs. "unsigned int"
for regno, the advantage of int, is that it can contain -1 as a
exceptional value.  Furthermore there are 3 similar arrays just
above that also use int:

static int const x86_64_int_parameter_registers[6] =
{
   DI_REG, SI_REG, DX_REG, CX_REG, R8_REG, R9_REG
};

static int const x86_64_ms_abi_int_parameter_registers[4] =
{
   CX_REG, DX_REG, R8_REG, R9_REG
};

static int const x86_64_int_return_registers[4] =
{
   AX_REG, DX_REG, DI_REG, SI_REG
};

/* Additional registers that are clobbered by SYSV calls.  */

#define NUM_X86_64_MS_CLOBBERED_REGS 12
static int const x86_64_ms_sysv_extra_clobbered_registers
                  [NUM_X86_64_MS_CLOBBERED_REGS] =
{
   SI_REG, DI_REG,
   XMM6_REG, XMM7_REG,
   XMM8_REG, XMM9_REG, XMM10_REG, XMM11_REG,
   XMM12_REG, XMM13_REG, XMM14_REG, XMM15_REG
};

So IMHO it looked odd to have one array use a different type in the
first place.


>> @@ -2484,13 +2486,13 @@ class xlogue_layout {
>>       needs to store registers based upon data in the
>> machine_function.  */
>>    HOST_WIDE_INT get_stack_space_used () const
>>    {
>> -    const struct machine_function &m = *cfun->machine;
>> -    unsigned last_reg = m.call_ms2sysv_extra_regs + MIN_REGS - 1;
>> +    const struct machine_function *m = cfun->machine;
>> +    unsigned last_reg = m->call_ms2sysv_extra_regs + MIN_REGS - 1;
>
> What is the reason for this change?
>

Because a mixture of C and C++ (C wants "struct" machine_function)
looks ugly, and everywhere else in this module, "m" is a pointer and no
reference.

>>
>> -    gcc_assert (m.call_ms2sysv_extra_regs <= MAX_EXTRA_REGS);
>> +    gcc_assert (m->call_ms2sysv_extra_regs <= MAX_EXTRA_REGS);
>>      return m_regs[last_reg].offset
>> -        + (m.call_ms2sysv_pad_out ? 8 : 0)
>> -        + STUB_INDEX_OFFSET;
>> +       + (m->call_ms2sysv_pad_out ? 8 : 0)
>> +       + STUB_INDEX_OFFSET;
>>    }
>>
>>    /* Returns the offset for the base pointer used by the stub. */
>> @@ -2532,7 +2534,7 @@ class xlogue_layout {
>>    /* Lazy-inited cache of symbol names for stubs.  */
>>    char
>> m_stub_names[XLOGUE_STUB_COUNT][VARIANT_COUNT][STUB_NAME_MAX_LEN];
>>
>> -  static const struct xlogue_layout GTY(())
>> s_instances[XLOGUE_SET_COUNT];
>> +  static const struct GTY(()) xlogue_layout
>> s_instances[XLOGUE_SET_COUNT];
>
> Hmm, during development I originally had C-style xlogue_layout as a
> struct and later decided to make it a class and apparently forgot to
> remove the "struct" here.  None the less, it's bazaar that the GTY()
> would go in between the "struct" and the "xlogue_layout."  As I said
> before, I don't fully understand how this GTY works.  Can we just remove
> the "struct" keyword?
>
> Also, if the way I had it was wrong, (and resulted in garbage collection
> not working right) then perhaps it was the cause of a problem I had with
> caching symbol rtx objects.  I could not get this to work because my
> cached objects would somehow become stale and I've since removed that
> code (from xlogue_layout::get_stub_rtx).  (i.e., does GTY effect
> lifespan of globals, TU statics and static C++ data members?)
>

Yes, I have not noticed the "struct", and agree to remove it.

I just saw every other place where GTY is used it is directly after
"struct" or "static", so my impulse was just to follow that examples.

But neither version actually makes the class GC-able.  Apparently
this class construct is too complicated for the gengtype machinery.
So I am inclined to remove the GTY keyword completely as it gives
you only false security in GC's ability to garbage collect anything
in this class.

>>  /* Constructor for xlogue_layout.  */
>> @@ -2639,11 +2643,11 @@ xlogue_layout::xlogue_layout (HOST_WIDE_INT
>> stack_
>>    : m_hfp (hfp) , m_nregs (hfp ? 17 : 18),
>>      m_stack_align_off_in (stack_align_off_in)
>>  {
>> +  HOST_WIDE_INT offset = stack_align_off_in;
>> +  unsigned i, j;
>> +
>>    memset (m_regs, 0, sizeof (m_regs));
>>    memset (m_stub_names, 0, sizeof (m_stub_names));
>> -
>> -  HOST_WIDE_INT offset = stack_align_off_in;
>> -  unsigned i, j;
>>    for (i = j = 0; i < MAX_REGS; ++i)
>>      {
>>        unsigned regno = REG_ORDER[i];
>> @@ -2662,11 +2666,12 @@ xlogue_layout::xlogue_layout (HOST_WIDE_INT
>> stack_
>>        m_regs[j].regno    = regno;
>>        m_regs[j++].offset = offset - STUB_INDEX_OFFSET;
>>      }
>> -    gcc_assert (j == m_nregs);
>> +  gcc_assert (j == m_nregs);
>>  }
>
> Aside from my formatting errors,  this would actually be incorrect per
> the GNU coding conventions
> (https://gcc.gnu.org/codingconventions.html#Variable), which is probably
> based off of Effective C++ Item 26.  Obviously, we're still dealing with
> part of this as classical C++ and the rest as if it were C, but I'm
> trying to follow C++ norms in C++ classes and member functions.
>

Well, yes, but I doubt this creates a "cognitive burden on the
programmer", given this is just one line above.

>> @@ -2676,7 +2681,7 @@ xlogue_layout::xlogue_layout (HOST_WIDE_INT stack_
>>      {
>>        int res = snprintf (name, STUB_NAME_MAX_LEN, "__%s_%u",
>>                STUB_BASE_NAMES[stub], MIN_REGS + n_extra_regs);
>> -      gcc_checking_assert (res <= (int)STUB_NAME_MAX_LEN);
>> +      gcc_checking_assert (res < (int)STUB_NAME_MAX_LEN);
>
> Good catch! Thank you.
>
>> @@ -12634,10 +12573,6 @@ ix86_hard_regno_scratch_ok (unsigned int regno)
>>            && df_regs_ever_live_p (regno)));
>>  }
>>
>> -/* Registers who's save & restore will be managed by stubs called from
>> -   pro/epilogue.  */
>> -static HARD_REG_SET GTY(()) stub_managed_regs;
>> -
>>  /* Return true if register class CL should be an additional allocno
>>     class.  */
>>
>> @@ -12718,10 +12653,17 @@ ix86_save_reg (unsigned int regno, bool
>> maybe_eh_r
>>      }
>>      }
>>
>> -  if (ignore_outlined && cfun->machine->call_ms2sysv
>> -      && in_hard_reg_set_p (stub_managed_regs, DImode, regno))
>> -    return false;
>> +  if (ignore_outlined && cfun->machine->call_ms2sysv)
>> +    {
>> +      /* Registers who's save & restore will be managed by stubs
>> called from
>> +     pro/epilogue.  */
>> +      HARD_REG_SET stub_managed_regs;
>> +      xlogue_layout::compute_stub_managed_regs (stub_managed_regs);
>>
>> +      if (in_hard_reg_set_p (stub_managed_regs, DImode, regno))
>> +    return false;
>> +    }
>> +
>>    if (crtl->drap_reg
>>        && regno == REGNO (crtl->drap_reg)
>>        && !cfun->machine->no_drap_save_restore)
>
> This makes no sense.  The entire purpose of stub_managed_regs is to
> cache the result of xlogue_layout::compute_stub_managed_regs() and this
> would unnecessarily repeat that calculation for each time
> ix86_save_reg() is called.  Since
> xlogue_layout::compute_stub_managed_regs() calls ix86_save_reg many
> times, this makes it even worse.Which registers are being saved
> out-of-line and inline MUST be known at the time the stack layout is
> determined.  So stub_managed_regsshould either be left a TU static or
> just moved to struct machine_function.
>
> As an aside, I've noticed that xlogue_layout::compute_stub_managed_regs
> is calling ix86_save_reg (which isn't trivial) more often than it really
> has to, so I've refactored it.
>

((Actually ix86_save_reg is not an expensive function.
It relies entirely on cached information.))

As I told you, it is not good to rely on the pass-manager to run all
passes for one function in a row and moreover cfun is actually a
stack of functions, see push_cfun ().  So a static value will not
reflect the same value as cfun->machine does.

I would like to move that data to i386.h but I think
we should not add new dependencies to i386.h because it
is used everywhere.  The problem is HARD_REG_SET not being
used before in any target header file, and I don't want to be
the first one.

So yes this is not a perfect solution yet, but I still doubt that
a static value is a better solution.

>>
>> -/* Disables out-of-lined msabi to sysv pro/epilogues and emits a
>> warning if
>> -   warn_once is null, or *warn_once is zero.  */
>> -static void disable_call_ms2sysv_xlogues (const char *feature)
>> +/* Emits a warning for unsupported msabi to sysv pro/epilogues.  */
>> +static void warn_once_call_ms2sysv_xlogues (const char *feature)
>>  {
>> -  cfun->machine->call_ms2sysv = false;
>> -  warning (OPT_mcall_ms2sysv_xlogues, "not currently compatible with
>> %s.",
>> -       feature);
>> +  static bool warned_once = false;
>> +  if (!warned_once)
>> +    {
>> +      warning (0, "-mcall-ms2sysv-xlogues is not compatible with %s",
>> +           feature);
>> +      warned_once = true;
>> +    }
>>  }
>>
>
> We probably don't want to suppress all warnings across cases.  I've got
> a new version of this that takes a mask for the "warned":
>
> static void disable_call_ms2sysv_xlogues (const char *feature, int
> warned_mask)
> {
>   static int warned = 0;
>   cfun->machine->call_ms2sysv = false;
>
>   if (!(warned & warned_mask))
>     {
>       warning (OPT_mcall_ms2sysv_xlogues, "not currently compatible with
> %s.",
>            feature);
>       warned |= warned_mask;
>     }
> }
>
>

That would be a candidate for a follow-up patch.

Note, that the first parameter ought to be a warning option name, if it
is just refering to a code generation option that is IMHO mis-leading
to the user.

Thanks,
Bernd.

>>  /* When using -fsplit-stack, the allocation routines set a field in
>> @@ -12836,8 +12780,9 @@ ix86_builtin_setjmp_frame_value (void)
>>  /* Fill structure ix86_frame about frame of currently computed
>> function.  */
>>
>>  static void
>> -ix86_compute_frame_layout (struct ix86_frame *frame)
>> +ix86_compute_frame_layout (void)
>>  {
>> +  struct ix86_frame *frame = &cfun->machine->frame;
>>    struct machine_function *m = cfun->machine;
>>    unsigned HOST_WIDE_INT stack_alignment_needed;
>>    HOST_WIDE_INT offset;
>> @@ -12845,46 +12790,41 @@ static void
>>    HOST_WIDE_INT size = get_frame_size ();
>>    HOST_WIDE_INT to_allocate;
>>
>> -  CLEAR_HARD_REG_SET (stub_managed_regs);
>> -
>>    /* m->call_ms2sysv is initially enabled in ix86_expand_call for all
>> 64-bit
>>     * ms_abi functions that call a sysv function.  We now need to
>> prune away
>>     * cases where it should be disabled.  */
>>    if (TARGET_64BIT && m->call_ms2sysv)
>> -  {
>> -    gcc_assert (TARGET_64BIT_MS_ABI);
>> -    gcc_assert (TARGET_CALL_MS2SYSV_XLOGUES);
>> -    gcc_assert (!TARGET_SEH);
>> +    {
>> +      gcc_assert (TARGET_64BIT_MS_ABI);
>> +      gcc_assert (TARGET_CALL_MS2SYSV_XLOGUES);
>> +      gcc_assert (!TARGET_SEH);
>> +      gcc_assert (TARGET_SSE);
>> +      gcc_assert (!ix86_using_red_zone ());
>>
>> -    if (!TARGET_SSE)
>> -      m->call_ms2sysv = false;
>> +      if (crtl->calls_eh_return)
>> +    {
>> +      gcc_assert (!reload_completed);
>> +      m->call_ms2sysv = false;
>> +      warn_once_call_ms2sysv_xlogues ("__builtin_eh_return");
>> +    }
>>
>> -    /* Don't break hot-patched functions.  */
>> -    else if (ix86_function_ms_hook_prologue (current_function_decl))
>> -      m->call_ms2sysv = false;
>> +      else if (ix86_static_chain_on_stack)
>> +    {
>> +      gcc_assert (!reload_completed);
>> +      m->call_ms2sysv = false;
>> +      warn_once_call_ms2sysv_xlogues ("static call chains");
>> +    }
>>
>> -    /* TODO: Cases not yet examined.  */
>> -    else if (crtl->calls_eh_return)
>> -      disable_call_ms2sysv_xlogues ("__builtin_eh_return");
>> +      /* Finally, compute which registers the stub will manage.  */
>> +      else
>> +        {
>> +      HARD_REG_SET stub_managed_regs;
>> +      unsigned count = xlogue_layout
>> +               ::compute_stub_managed_regs (stub_managed_regs);
>> +      m->call_ms2sysv_extra_regs = count - xlogue_layout::MIN_REGS;
>> +    }
>> +    }
>>
>> -    else if (ix86_static_chain_on_stack)
>> -      disable_call_ms2sysv_xlogues ("static call chains");
>> -
>> -    else if (ix86_using_red_zone ())
>> -      disable_call_ms2sysv_xlogues ("red zones");
>> -
>> -    else if (flag_split_stack)
>> -      disable_call_ms2sysv_xlogues ("split stack");
>> -
>> -    /* Finally, compute which registers the stub will manage. */
>> -    else
>> -      {
>> -    unsigned count = xlogue_layout
>> -             ::compute_stub_managed_regs (stub_managed_regs);
>> -    m->call_ms2sysv_extra_regs = count - xlogue_layout::MIN_REGS;
>> -      }
>> -  }
>> -
>
>> @@ -29320,7 +29265,24 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx
>> call
>>
>>        /* Set here, but it may get cleared later.  */
>>        if (TARGET_CALL_MS2SYSV_XLOGUES)
>> -    cfun->machine->call_ms2sysv = true;
>> +    {
>> +      if (!TARGET_SSE)
>> +        ;
>> +
>> +      /* Don't break hot-patched functions.  */
>> +      else if (ix86_function_ms_hook_prologue (current_function_decl))
>> +        ;
>> +
>> +      /* TODO: Cases not yet examined.  */
>> +      else if (flag_split_stack)
>> +        warn_once_call_ms2sysv_xlogues ("-fsplit-stack");
>> +
>> +      else
>> +        {
>> +          gcc_assert (!reload_completed);
>> +          cfun->machine->call_ms2sysv = true;
>> +        }
>> +    }
>>      }
>>
>
> Other than the local compute_stub_managed_regs (which is a problem),
> this looks like a very good improvement.  It clarifies conditions that
> will not change over the course of compiling a function (split stack,
> ms_hook_prologue, etc.) and those that can.
>
> After some thought, I've decided that it's not better to use a sorry()
> to filter out -fsplit-stack combined with -march-ms2sysv-xlogues because
> it would break support for a TU that uses both via function attributes.
> Example:
>
> void __attribute__((ms_abi))
> a (void)
> {
>   call_a_sysv_fn ();
>   /* stuff */
> }
>
> void __attribute__((optimize("-fsplit-stack")))
> b(void)
> {
>   /* stuff */
> }
>
>
> So you're fixes for this are better.
>
> Thanks,
> Daniel
>
>
>

  reply	other threads:[~2017-05-17 17:41 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-05-14  9:11 Bernd Edlinger
2017-05-14  9:58 ` Daniel Santos
2017-05-14 10:27   ` Uros Bizjak
2017-05-14 18:11     ` Bernd Edlinger
2017-05-14 20:37       ` Daniel Santos
2017-05-15  2:23       ` Daniel Santos
2017-05-15 20:41         ` Bernd Edlinger
2017-05-15 23:52           ` Daniel Santos
2017-05-16  5:42             ` Daniel Santos
2017-05-16  8:52               ` Bernd Edlinger
2017-05-16 15:11                 ` Daniel Santos
2017-05-16 17:38               ` Ian Lance Taylor via gcc-patches
2017-05-16 20:10                 ` Bernd Edlinger
2017-05-16 21:05                   ` Bernd Edlinger
2017-05-17  2:12                   ` Daniel Santos
2017-05-17 17:43                     ` Bernd Edlinger [this message]
2017-05-18  7:05                       ` Daniel Santos
2017-05-18 13:48                     ` Bernd Edlinger
2017-05-19  3:13                       ` Daniel Santos
2017-05-22 18:32                         ` Bernd Edlinger
2017-05-23 14:34                           ` Bernd Edlinger
2017-05-24  6:49                             ` Daniel Santos
2017-06-01 16:06                             ` [PING][PATCH] " Bernd Edlinger
2017-06-01 18:18                             ` [PATCH] " Uros Bizjak
2017-05-23 23:01                           ` Daniel Santos
2017-05-16 21:35                 ` Daniel Santos
2017-05-17 18:41         ` Bernd Edlinger
2017-05-18  7:14           ` Daniel Santos
2017-05-19  5:12             ` Daniel Santos

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=AM4PR0701MB2162FAB6687910F8B4AA971CE4E70@AM4PR0701MB2162.eurprd07.prod.outlook.com \
    --to=bernd.edlinger@hotmail.de \
    --cc=daniel.santos@pobox.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=ubizjak@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).