public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Andrew Pinski <pinskia@gmail.com>
To: Kyrill Tkachov <kyrylo.tkachov@foss.arm.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>,
	Marcus Shawcroft <marcus.shawcroft@arm.com>,
		Richard Earnshaw <Richard.Earnshaw@arm.com>,
	James Greenhalgh <james.greenhalgh@arm.com>,
		Segher Boessenkool <segher@kernel.crashing.org>
Subject: Re: [PATCH][AArch64] Separate shrink wrapping hooks implementation
Date: Thu, 10 Nov 2016 22:42:00 -0000	[thread overview]
Message-ID: <CA+=Sn1mZgmeU0VVfTi7HqWOWrZtoN=_43BJqgR9iTfs0Eq34zw@mail.gmail.com> (raw)
In-Reply-To: <5824836B.5030302@foss.arm.com>

On Thu, Nov 10, 2016 at 6:25 AM, Kyrill Tkachov
<kyrylo.tkachov@foss.arm.com> wrote:
> Hi all,
>
> This patch implements the new separate shrink-wrapping hooks for aarch64.
> In separate shrink wrapping (as I understand it) we consider each register
> save/restore as
> a 'component' that can be performed independently of the other save/restores
> in the prologue/epilogue
> and can be moved outside the prologue/epilogue and instead performed only in
> the basic blocks where it's
> actually needed. This allows us to avoid saving and restoring registers on
> execution paths where a register
> might not be needed.
>
> In the most general form a 'component' can be any operation that the
> prologue/epilogue performs, for example
> stack adjustment. But in this patch we only consider callee-saved register
> save/restores as components.
> The code is in many ways similar to the powerpc implementation of the hooks.
>
> The hooks implemented are:
> * TARGET_SHRINK_WRAP_GET_SEPARATE_COMPONENTS: Returns a bitmap containing a
> bit for each register that should
> be considered a 'component' i.e. its save/restore should be separated from
> the prologue and epilogue and placed
> at the basic block where it's needed.
>
> * TARGET_SHRINK_WRAP_COMPONENTS_FOR_BB: Determine for a given basic block
> which 'component' registers it needs.
> This is determined through dataflow. If a component register is in the
> IN,GEN or KILL sets for the basic block
> it's considered as needed and marked as such in the bitmap.
>
> * TARGET_SHRINK_WRAP_EMIT_PROLOGUE_COMPONENTS and
> TARGET_SHRINK_WRAP_EMIT_EPILOGUE_COMPONENTS: Given a bitmap
> of component registers emits the save or restore code for them.
>
> * TARGET_SHRINK_WRAP_SET_HANDLED_COMPONENTS: Given a bitmap of component
> registers record in the backend that
> the register is shrink-wrapped using this approach and that the normal
> prologue and epilogue expansion code
> should not emit code for them. This is done similarly to powerpc by defining
> a bool array in machine_function
> where we record whether each register is separately shrink-wrapped.  The
> prologue and epilogue expansion code
> (through aarch64_save_callee_saves and aarch64_restore_callee_saves) is
> updated to not emit save/restores for
> these registers if they appear in that array.
>
> Our prologue and epilogue code has a lot of intricate logic to perform stack
> adjustments using the writeback
> forms of the load/store instructions. Separately shrink-wrapping those
> registers marked for writeback
> (cfun->machine->frame.wb_candidate1 and cfun->machine->frame.wb_candidate2)
> broke that codegen and I had to
> emit an explicit stack adjustment instruction that created ugly
> prologue/epilogue sequences. So this patch
> is conservative and doesn't allow shrink-wrapping of the registers marked
> for writeback. Maybe in the future
> we can relax it (for example allow wrapping of one of the two writeback
> registers if the writeback amount
> can be encoded in a single-register writeback store/load) but given the
> development stage of GCC I thought
> I'd play it safe.
>
> I ran SPEC2006 on a Cortex-A72. Overall scores were neutral but there were
> some interesting swings.
> 458.sjeng     +1.45%
> 471.omnetpp   +2.19%
> 445.gobmk     -2.01%
>
> On SPECFP:
> 453.povray    +7.00%


Wow, this looks really good.  Thank you for implementing this.  If I
get some time I am going to try it out on other processors than A72
but I doubt I have time any time soon.

Thanks,
Andrew

>
> I'll be re-running the benchmarks with Segher's recent patch [1] to see if
> they fix the regression
> and if it does I think this can go in.
>
> [1] https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00889.html
>
> Bootstrapped and tested on aarch64-none-linux-gnu.
>
> Thanks,
> Kyrill
>
> 2016-11-10  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>
>     * config/aarch64/aarch64.h (machine_function): Add
>     reg_is_wrapped_separately field.
>     * config/aarch64/aarch64.c (emit_set_insn): Change return type to
>     rtx_insn *.
>     (aarch64_save_callee_saves): Don't save registers that are wrapped
>     separately.
>     (aarch64_restore_callee_saves): Don't restore registers that are
>     wrapped separately.
>     (offset_9bit_signed_unscaled_p, offset_12bit_unsigned_scaled_p,
>     aarch64_offset_7bit_signed_scaled_p): Move earlier in the file.
>     (aarch64_get_separate_components): New function.
>     (aarch64_components_for_bb): Likewise.
>     (aarch64_disqualify_components): Likewise.
>     (aarch64_emit_prologue_components): Likewise.
>     (aarch64_emit_epilogue_components): Likewise.
>     (aarch64_set_handled_components): Likewise.
>     (TARGET_SHRINK_WRAP_GET_SEPARATE_COMPONENTS,
>     TARGET_SHRINK_WRAP_COMPONENTS_FOR_BB,
>     TARGET_SHRINK_WRAP_DISQUALIFY_COMPONENTS,
>     TARGET_SHRINK_WRAP_EMIT_PROLOGUE_COMPONENTS,
>     TARGET_SHRINK_WRAP_EMIT_EPILOGUE_COMPONENTS,
>     TARGET_SHRINK_WRAP_SET_HANDLED_COMPONENTS): Define.

  parent reply	other threads:[~2016-11-10 22:42 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-11-10 14:26 Kyrill Tkachov
2016-11-10 16:33 ` Segher Boessenkool
2016-11-10 16:45   ` Kyrill Tkachov
2016-11-10 22:42 ` Andrew Pinski [this message]
2016-11-10 23:39   ` Segher Boessenkool
2016-11-11 10:18     ` Kyrill Tkachov
2016-11-11 15:31       ` Kyrill Tkachov
2016-11-14 14:25         ` Kyrill Tkachov
2016-11-17 14:22           ` Kyrill Tkachov
2016-11-17 14:44             ` Segher Boessenkool
2016-11-17 14:55               ` Kyrill Tkachov
2016-11-17 15:02                 ` Segher Boessenkool
2016-11-17 15:06                   ` Kyrill Tkachov
2016-11-17 16:50                 ` Kyrill Tkachov
2016-11-17 17:46                   ` Segher Boessenkool
2016-11-18  9:29                     ` Kyrill Tkachov
2016-11-18 12:50                       ` Segher Boessenkool
2016-11-22 16:47                         ` Kyrill Tkachov
2016-11-29 10:58           ` James Greenhalgh
2016-11-29 11:18             ` Kyrill Tkachov
2016-11-29 11:32               ` Kyrill Tkachov
2016-11-29 11:37                 ` James Greenhalgh
2016-11-29 20:29             ` Segher Boessenkool
2016-11-30 14:08               ` Kyrill Tkachov
2016-12-02 17:09                 ` James Greenhalgh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CA+=Sn1mZgmeU0VVfTi7HqWOWrZtoN=_43BJqgR9iTfs0Eq34zw@mail.gmail.com' \
    --to=pinskia@gmail.com \
    --cc=Richard.Earnshaw@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=james.greenhalgh@arm.com \
    --cc=kyrylo.tkachov@foss.arm.com \
    --cc=marcus.shawcroft@arm.com \
    --cc=segher@kernel.crashing.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).