From: Daniel Santos <daniel.santos@pobox.com>
To: Bernd Edlinger <bernd.edlinger@hotmail.de>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>, Uros Bizjak <ubizjak@gmail.com>
Subject: Re: [PATCH] [i386] Recompute the frame layout less often
Date: Wed, 17 May 2017 02:12:00 -0000 [thread overview]
Message-ID: <ac395e89-e4ef-7450-5a9d-21e387d2d011@pobox.com> (raw)
In-Reply-To: <AM4PR0701MB21626CD7FEA8A316515C30A6E4E60@AM4PR0701MB2162.eurprd07.prod.outlook.com>
On 05/16/2017 02:52 PM, Bernd Edlinger wrote:
> I think I solved the problem with -fsplit-stack, I am not sure
> if ix86_static_chain_on_stack might change after reload due to
> final.c possibly calling targetm.calls.static_chain, but if that
> is the case, that is an already pre-existing problem.
>
> The goal of this patch is to make all decisions regarding the
> frame layout before the reload pass, and to make sure that
> the frame layout does not change unexpectedly it asserts
> that the data that goes into the decision does not change
> after reload_completed.
>
> With the attached patch -fsplit-stack and the attribute ms_hook_prologue
> is handed directly at the ix86_expand_call, because that data is
> already known before expansion.
>
> The calls_eh_return and ix86_static_chain_on_stack may become
> known at a later time, but after reload it should not change any more.
> To be sure, I added an assertion at ix86_static_chain, which the
> regression test did not trigger, neither with -m64 nor with -m32.
>
> I have bootstrapped the patch several times, and a few times I
> encounterd a segfault in the garbage collection, but it did not
> happen every time. Currently I think that is unrelated to this patch.
>
>
> Bootstrapped and reg-tested on x86_64-pc-linux-gnu with -m64/-m32.
> Is it OK for trunk?
>
>
> Thanks
> Bernd.
With as many formatting errors as I seem to have had, I would like to
fix those then you patch on top of that if you wouldn't mind terribly.
While gcc uses subversion, git-blame is still very helpful (then again,
since Uros committed it for me, I guess that's already off).
> Index: gcc/config/i386/i386.c
> ===================================================================
> --- gcc/config/i386/i386.c (revision 248031)
> +++ gcc/config/i386/i386.c (working copy)
> @@ -2425,7 +2425,9 @@ static int const x86_64_int_return_registers[4] =
>
> /* Additional registers that are clobbered by SYSV calls. */
>
> -unsigned const x86_64_ms_sysv_extra_clobbered_registers[12] =
> +#define NUM_X86_64_MS_CLOBBERED_REGS 12
> +static int const x86_64_ms_sysv_extra_clobbered_registers
> + [NUM_X86_64_MS_CLOBBERED_REGS] =
Is there a reason you're changing this unsigned to signed int? While
AX_REG and such are just preprocessor macros, everywhere else it seems
that register numbers are dealt with as unsigned ints.
> @@ -2484,13 +2486,13 @@ class xlogue_layout {
> needs to store registers based upon data in the
> machine_function. */
> HOST_WIDE_INT get_stack_space_used () const
> {
> - const struct machine_function &m = *cfun->machine;
> - unsigned last_reg = m.call_ms2sysv_extra_regs + MIN_REGS - 1;
> + const struct machine_function *m = cfun->machine;
> + unsigned last_reg = m->call_ms2sysv_extra_regs + MIN_REGS - 1;
What is the reason for this change?
>
> - gcc_assert (m.call_ms2sysv_extra_regs <= MAX_EXTRA_REGS);
> + gcc_assert (m->call_ms2sysv_extra_regs <= MAX_EXTRA_REGS);
> return m_regs[last_reg].offset
> - + (m.call_ms2sysv_pad_out ? 8 : 0)
> - + STUB_INDEX_OFFSET;
> + + (m->call_ms2sysv_pad_out ? 8 : 0)
> + + STUB_INDEX_OFFSET;
> }
>
> /* Returns the offset for the base pointer used by the stub. */
> @@ -2532,7 +2534,7 @@ class xlogue_layout {
> /* Lazy-inited cache of symbol names for stubs. */
> char
> m_stub_names[XLOGUE_STUB_COUNT][VARIANT_COUNT][STUB_NAME_MAX_LEN];
>
> - static const struct xlogue_layout GTY(())
> s_instances[XLOGUE_SET_COUNT];
> + static const struct GTY(()) xlogue_layout
> s_instances[XLOGUE_SET_COUNT];
Hmm, during development I originally had C-style xlogue_layout as a
struct and later decided to make it a class and apparently forgot to
remove the "struct" here. None the less, it's bazaar that the GTY()
would go in between the "struct" and the "xlogue_layout." As I said
before, I don't fully understand how this GTY works. Can we just remove
the "struct" keyword?
Also, if the way I had it was wrong, (and resulted in garbage collection
not working right) then perhaps it was the cause of a problem I had with
caching symbol rtx objects. I could not get this to work because my
cached objects would somehow become stale and I've since removed that
code (from xlogue_layout::get_stub_rtx). (i.e., does GTY effect
lifespan of globals, TU statics and static C++ data members?)
> /* Constructor for xlogue_layout. */
> @@ -2639,11 +2643,11 @@ xlogue_layout::xlogue_layout (HOST_WIDE_INT
> stack_
> : m_hfp (hfp) , m_nregs (hfp ? 17 : 18),
> m_stack_align_off_in (stack_align_off_in)
> {
> + HOST_WIDE_INT offset = stack_align_off_in;
> + unsigned i, j;
> +
> memset (m_regs, 0, sizeof (m_regs));
> memset (m_stub_names, 0, sizeof (m_stub_names));
> -
> - HOST_WIDE_INT offset = stack_align_off_in;
> - unsigned i, j;
> for (i = j = 0; i < MAX_REGS; ++i)
> {
> unsigned regno = REG_ORDER[i];
> @@ -2662,11 +2666,12 @@ xlogue_layout::xlogue_layout (HOST_WIDE_INT
> stack_
> m_regs[j].regno = regno;
> m_regs[j++].offset = offset - STUB_INDEX_OFFSET;
> }
> - gcc_assert (j == m_nregs);
> + gcc_assert (j == m_nregs);
> }
Aside from my formatting errors, this would actually be incorrect per
the GNU coding conventions
(https://gcc.gnu.org/codingconventions.html#Variable), which is probably
based off of Effective C++ Item 26. Obviously, we're still dealing with
part of this as classical C++ and the rest as if it were C, but I'm
trying to follow C++ norms in C++ classes and member functions.
> @@ -2676,7 +2681,7 @@ xlogue_layout::xlogue_layout (HOST_WIDE_INT stack_
> {
> int res = snprintf (name, STUB_NAME_MAX_LEN, "__%s_%u",
> STUB_BASE_NAMES[stub], MIN_REGS + n_extra_regs);
> - gcc_checking_assert (res <= (int)STUB_NAME_MAX_LEN);
> + gcc_checking_assert (res < (int)STUB_NAME_MAX_LEN);
Good catch! Thank you.
> @@ -12634,10 +12573,6 @@ ix86_hard_regno_scratch_ok (unsigned int regno)
> && df_regs_ever_live_p (regno)));
> }
>
> -/* Registers who's save & restore will be managed by stubs called from
> - pro/epilogue. */
> -static HARD_REG_SET GTY(()) stub_managed_regs;
> -
> /* Return true if register class CL should be an additional allocno
> class. */
>
> @@ -12718,10 +12653,17 @@ ix86_save_reg (unsigned int regno, bool
> maybe_eh_r
> }
> }
>
> - if (ignore_outlined && cfun->machine->call_ms2sysv
> - && in_hard_reg_set_p (stub_managed_regs, DImode, regno))
> - return false;
> + if (ignore_outlined && cfun->machine->call_ms2sysv)
> + {
> + /* Registers who's save & restore will be managed by stubs
> called from
> + pro/epilogue. */
> + HARD_REG_SET stub_managed_regs;
> + xlogue_layout::compute_stub_managed_regs (stub_managed_regs);
>
> + if (in_hard_reg_set_p (stub_managed_regs, DImode, regno))
> + return false;
> + }
> +
> if (crtl->drap_reg
> && regno == REGNO (crtl->drap_reg)
> && !cfun->machine->no_drap_save_restore)
This makes no sense. The entire purpose of stub_managed_regs is to
cache the result of xlogue_layout::compute_stub_managed_regs() and this
would unnecessarily repeat that calculation for each time
ix86_save_reg() is called. Since
xlogue_layout::compute_stub_managed_regs() calls ix86_save_reg many
times, this makes it even worse.Which registers are being saved
out-of-line and inline MUST be known at the time the stack layout is
determined. So stub_managed_regsshould either be left a TU static or
just moved to struct machine_function.
As an aside, I've noticed that xlogue_layout::compute_stub_managed_regs
is calling ix86_save_reg (which isn't trivial) more often than it really
has to, so I've refactored it.
>
> -/* Disables out-of-lined msabi to sysv pro/epilogues and emits a
> warning if
> - warn_once is null, or *warn_once is zero. */
> -static void disable_call_ms2sysv_xlogues (const char *feature)
> +/* Emits a warning for unsupported msabi to sysv pro/epilogues. */
> +static void warn_once_call_ms2sysv_xlogues (const char *feature)
> {
> - cfun->machine->call_ms2sysv = false;
> - warning (OPT_mcall_ms2sysv_xlogues, "not currently compatible with
> %s.",
> - feature);
> + static bool warned_once = false;
> + if (!warned_once)
> + {
> + warning (0, "-mcall-ms2sysv-xlogues is not compatible with %s",
> + feature);
> + warned_once = true;
> + }
> }
>
We probably don't want to suppress all warnings across cases. I've got
a new version of this that takes a mask for the "warned":
static void disable_call_ms2sysv_xlogues (const char *feature, int warned_mask)
{
static int warned = 0;
cfun->machine->call_ms2sysv = false;
if (!(warned & warned_mask))
{
warning (OPT_mcall_ms2sysv_xlogues, "not currently compatible with %s.",
feature);
warned |= warned_mask;
}
}
> /* When using -fsplit-stack, the allocation routines set a field in
> @@ -12836,8 +12780,9 @@ ix86_builtin_setjmp_frame_value (void)
> /* Fill structure ix86_frame about frame of currently computed
> function. */
>
> static void
> -ix86_compute_frame_layout (struct ix86_frame *frame)
> +ix86_compute_frame_layout (void)
> {
> + struct ix86_frame *frame = &cfun->machine->frame;
> struct machine_function *m = cfun->machine;
> unsigned HOST_WIDE_INT stack_alignment_needed;
> HOST_WIDE_INT offset;
> @@ -12845,46 +12790,41 @@ static void
> HOST_WIDE_INT size = get_frame_size ();
> HOST_WIDE_INT to_allocate;
>
> - CLEAR_HARD_REG_SET (stub_managed_regs);
> -
> /* m->call_ms2sysv is initially enabled in ix86_expand_call for all
> 64-bit
> * ms_abi functions that call a sysv function. We now need to
> prune away
> * cases where it should be disabled. */
> if (TARGET_64BIT && m->call_ms2sysv)
> - {
> - gcc_assert (TARGET_64BIT_MS_ABI);
> - gcc_assert (TARGET_CALL_MS2SYSV_XLOGUES);
> - gcc_assert (!TARGET_SEH);
> + {
> + gcc_assert (TARGET_64BIT_MS_ABI);
> + gcc_assert (TARGET_CALL_MS2SYSV_XLOGUES);
> + gcc_assert (!TARGET_SEH);
> + gcc_assert (TARGET_SSE);
> + gcc_assert (!ix86_using_red_zone ());
>
> - if (!TARGET_SSE)
> - m->call_ms2sysv = false;
> + if (crtl->calls_eh_return)
> + {
> + gcc_assert (!reload_completed);
> + m->call_ms2sysv = false;
> + warn_once_call_ms2sysv_xlogues ("__builtin_eh_return");
> + }
>
> - /* Don't break hot-patched functions. */
> - else if (ix86_function_ms_hook_prologue (current_function_decl))
> - m->call_ms2sysv = false;
> + else if (ix86_static_chain_on_stack)
> + {
> + gcc_assert (!reload_completed);
> + m->call_ms2sysv = false;
> + warn_once_call_ms2sysv_xlogues ("static call chains");
> + }
>
> - /* TODO: Cases not yet examined. */
> - else if (crtl->calls_eh_return)
> - disable_call_ms2sysv_xlogues ("__builtin_eh_return");
> + /* Finally, compute which registers the stub will manage. */
> + else
> + {
> + HARD_REG_SET stub_managed_regs;
> + unsigned count = xlogue_layout
> + ::compute_stub_managed_regs (stub_managed_regs);
> + m->call_ms2sysv_extra_regs = count - xlogue_layout::MIN_REGS;
> + }
> + }
>
> - else if (ix86_static_chain_on_stack)
> - disable_call_ms2sysv_xlogues ("static call chains");
> -
> - else if (ix86_using_red_zone ())
> - disable_call_ms2sysv_xlogues ("red zones");
> -
> - else if (flag_split_stack)
> - disable_call_ms2sysv_xlogues ("split stack");
> -
> - /* Finally, compute which registers the stub will manage. */
> - else
> - {
> - unsigned count = xlogue_layout
> - ::compute_stub_managed_regs (stub_managed_regs);
> - m->call_ms2sysv_extra_regs = count - xlogue_layout::MIN_REGS;
> - }
> - }
> -
> @@ -29320,7 +29265,24 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx
> call
>
> /* Set here, but it may get cleared later. */
> if (TARGET_CALL_MS2SYSV_XLOGUES)
> - cfun->machine->call_ms2sysv = true;
> + {
> + if (!TARGET_SSE)
> + ;
> +
> + /* Don't break hot-patched functions. */
> + else if (ix86_function_ms_hook_prologue (current_function_decl))
> + ;
> +
> + /* TODO: Cases not yet examined. */
> + else if (flag_split_stack)
> + warn_once_call_ms2sysv_xlogues ("-fsplit-stack");
> +
> + else
> + {
> + gcc_assert (!reload_completed);
> + cfun->machine->call_ms2sysv = true;
> + }
> + }
> }
>
Other than the local compute_stub_managed_regs (which is a problem),
this looks like a very good improvement. It clarifies conditions that
will not change over the course of compiling a function (split stack,
ms_hook_prologue, etc.) and those that can.
After some thought, I've decided that it's not better to use a sorry()
to filter out -fsplit-stack combined with -march-ms2sysv-xlogues because
it would break support for a TU that uses both via function attributes.
Example:
void __attribute__((ms_abi))
a (void)
{
call_a_sysv_fn ();
/* stuff */
}
void __attribute__((optimize("-fsplit-stack")))
b(void)
{
/* stuff */
}
So you're fixes for this are better.
Thanks,
Daniel
next prev parent reply other threads:[~2017-05-17 1:57 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-05-14 9:11 Bernd Edlinger
2017-05-14 9:58 ` Daniel Santos
2017-05-14 10:27 ` Uros Bizjak
2017-05-14 18:11 ` Bernd Edlinger
2017-05-14 20:37 ` Daniel Santos
2017-05-15 2:23 ` Daniel Santos
2017-05-15 20:41 ` Bernd Edlinger
2017-05-15 23:52 ` Daniel Santos
2017-05-16 5:42 ` Daniel Santos
2017-05-16 8:52 ` Bernd Edlinger
2017-05-16 15:11 ` Daniel Santos
2017-05-16 17:38 ` Ian Lance Taylor via gcc-patches
2017-05-16 20:10 ` Bernd Edlinger
2017-05-16 21:05 ` Bernd Edlinger
2017-05-17 2:12 ` Daniel Santos [this message]
2017-05-17 17:43 ` Bernd Edlinger
2017-05-18 7:05 ` Daniel Santos
2017-05-18 13:48 ` Bernd Edlinger
2017-05-19 3:13 ` Daniel Santos
2017-05-22 18:32 ` Bernd Edlinger
2017-05-23 14:34 ` Bernd Edlinger
2017-05-24 6:49 ` Daniel Santos
2017-06-01 16:06 ` [PING][PATCH] " Bernd Edlinger
2017-06-01 18:18 ` [PATCH] " Uros Bizjak
2017-05-23 23:01 ` Daniel Santos
2017-05-16 21:35 ` Daniel Santos
2017-05-17 18:41 ` Bernd Edlinger
2017-05-18 7:14 ` Daniel Santos
2017-05-19 5:12 ` Daniel Santos
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ac395e89-e4ef-7450-5a9d-21e387d2d011@pobox.com \
--to=daniel.santos@pobox.com \
--cc=bernd.edlinger@hotmail.de \
--cc=gcc-patches@gcc.gnu.org \
--cc=ubizjak@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).