public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: "Richard Earnshaw (lists)" <Richard.Earnshaw@arm.com>
To: Bernd Edlinger <bernd.edlinger@hotmail.de>,
	Jeff Law <law@redhat.com>,
	"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Cc: Richard Biener <rguenther@suse.de>,
	Jakub Jelinek <jakub@redhat.com>,
	Richard Sandiford <rdsandiford@googlemail.com>,
	Ramana Radhakrishnan <ramana.radhakrishnan@arm.com>,
	Nick Clifton <nickc@redhat.com>
Subject: Re: [PATCH, ARM] Add a new target hook to compute the frame layout
Date: Fri, 05 Aug 2016 14:06:00 -0000	[thread overview]
Message-ID: <4b971f30-b5e2-7960-aab4-a873f3ed433e@arm.com> (raw)
In-Reply-To: <AM4PR0701MB2162397071611D67B20D8599E4180@AM4PR0701MB2162.eurprd07.prod.outlook.com>

On 05/08/16 13:49, Bernd Edlinger wrote:
> On 08/05/16 11:29, Richard Earnshaw (lists) wrote:
>> On 04/08/16 22:16, Bernd Edlinger wrote:
>>> Hi,
>>>
>>> this patch introduces a new target hook that allows the target's
>>> INITIAL_ELIMINATION_OFFSET function to use cached values instead of
>>> re-computing the frame layout every time.
>>>
>>> I have updated the documentation a bit and hope it is clearer this time.
>>>
>>> It still needs a review by ARM port maintainers.
>>>
>>> If the ARM port maintainers find this patch useful, that would be fine.
>>>
>>
>> I need to look into this more, but my first thought was that the
>> documentation is confusing, or there is a real problem in this patch.
>>
> 
> Thanks for your quick response.
> 
> The documentation is actually the most difficult part for me.
> 
>> As I understand it the frame has to be re-laid out each time the
>> contents of the frame changes (an extra register becomes live or another
>> spill slot is needed).  So saying that it is laid out once can't be
>> right if (as I read it initially) you mean 'once per function' since I
>> think it needs to be 'once for each time the frame contents changes'.
>>
>> Of course, things might be a bit different with LRA compared with
>> reload, but I strongly suspect this is never going to be an 'exactly
>> once per function' operation.
>>
> 
> Right.  It will be done 2 or 3 times for each function.
> LRA and reload behave identical in that respect.
> 
> But each time reload changes something in the input data the
> INITIAL_EMIMINATION_OFFSET is called several times, and the results
> have to be consistent in each iteration.
> 
> The frame layout function has no way to know if the frame layout
> is expected to change or not.
> 
> Many targets use reload_completed as an indication when the frame layout
> may not change at all, but that is only an approximation.
> 
>> Can you clarify your meaning in the documentation please?
>>
> 
> I meant 'once' in the sense of 'once for each time the frame contents 
> change'.
> 
> Thus I'd change that sentence to:
> 
> "This target hook allows the target to compute the frame layout once for
> each time the frame contents change and make use of the cached frame
> layout in @code{INITIAL_ELIMINATION_OFFSET} instead of re-computing it
> on every invocation.  This is particularly useful for targets that have
> an expensive frame layout function.  Implementing this callback is
> optional."
> 

Thanks, that's pretty much what I expected would be the case.

Could I suggest:

This target hook is called once each time the frame layout needs to be
recalculated.  The calculations can be cached by the target and can then
be used by @code{INITIAL_ELIMINATION_OFFSET} instead of re-computing the
layout on every invocation of that hook.  This is particularly useful
for targets that have an expensive frame layout function.  Implementing
this callback is optional.

R.

> 
> Thanks
> Bernd.
> 
> 
>> R.
>>
>>>
>>> Thanks
>>> Bernd.
>>>
>>> On 06/21/16 23:29, Jeff Law wrote:
>>>> On 06/16/2016 08:47 AM, Bernd Edlinger wrote:
>>>>> Hi!
>>>>>
>>>>>
>>>>> By the design of the target hook INITIAL_ELIMINATION_OFFSET
>>>>> it is necessary to call this function several times with
>>>>> different register combinations.
>>>>> Most targets use a cached data structure that describes the
>>>>> exact frame layout of the current function.
>>>>>
>>>>> It is safe to skip the computation when reload_completed = true,
>>>>> and most targets do that already.
>>>>>
>>>>> However while reload is doing its work, it is not clear when to
>>>>> do the computation and when not.  This results in unnecessary
>>>>> work.  Computing the frame layout can be a simple function or an
>>>>> arbitrarily complex one, that walks all instructions of the current
>>>>> function for instance, which is more or less the common case.
>>>>>
>>>>>
>>>>> This patch adds a new optional target hook that can be used
>>>>> by the target to factor the INITIAL_ELIMINATION_OFFSET-hook
>>>>> into a O(n) computation part, and a O(1) result function.
>>>>>
>>>>> The patch implements a compute_frame_layout target hook just
>>>>> for ARM in the moment, to show the principle.
>>>>> Other targets may also implement that hook, if it seems appropriate.
>>>>>
>>>>>
>>>>> Boot-strapped and reg-tested on arm-linux-gnueabihf.
>>>>> OK for trunk?
>>>>>
>>>>>
>>>>> Thanks
>>>>> Bernd.
>>>>>
>>>>>
>>>>> changelog-frame-layout.txt
>>>>>
>>>>>
>>>>> 2016-06-16  Bernd Edlinger  <bernd.edlinger@hotmail.de>
>>>>>
>>>>>      * target.def (compute_frame_layout): New optional target hook.
>>>>>      * doc/tm.texi.in (TARGET_COMPUTE_FRAME_LAYOUT): Add hook.
>>>>>      * doc/tm.texi (TARGET_COMPUTE_FRAME_LAYOUT): Add documentation.
>>>>>      * lra-eliminations.c (update_reg_eliminate): Call
>>>>> compute_frame_layout
>>>>>      target hook.
>>>>>      * reload1.c (verify_initial_elim_offsets): Likewise.
>>>>>      * config/arm/arm.c (TARGET_COMPUTE_FRAME_LAYOUT): Define.
>>>>>      (use_simple_return_p): Call arm_compute_frame_layout if needed.
>>>>>      (arm_get_frame_offsets): Split up into this ...
>>>>>      (arm_compute_frame_layout): ... and this function.
>>>> The ARM maintainers would need to chime in on the ARM specific changes
>>>> though.
>>>>
>>>>
>>>>
>>>>> Index: gcc/target.def
>>>>> ===================================================================
>>>>> --- gcc/target.def    (Revision 233176)
>>>>> +++ gcc/target.def    (Arbeitskopie)
>>>>> @@ -5245,8 +5245,19 @@ five otherwise.  This is best for most machines.",
>>>>>    unsigned int, (void),
>>>>>    default_case_values_threshold)
>>>>>
>>>>> -/* Retutn true if a function must have and use a frame pointer.  */
>>>> s/Retutn/Return
>>>>
>>>>> +/* Optional callback to advise the target to compute the frame
>>>>> layout.  */
>>>>>   DEFHOOK
>>>>> +(compute_frame_layout,
>>>>> + "This target hook is called immediately before reload wants to call\n\
>>>>> +@code{INITIAL_ELIMINATION_OFFSET} and allows the target to cache the
>>>>> frame\n\
>>>>> +layout instead of re-computing it on every invocation.  This is
>>>>> particularly\n\
>>>>> +useful for targets that have an O(n) frame layout function.
>>>>> Implementing\n\
>>>>> +this callback is optional.",
>>>>> + void, (void),
>>>>> + hook_void_void)
>>>> So the docs say "immediately before", but that's not actually reality in
>>>> lra-eliminations.  I think you can just say "This target hook is called
>>>> before reload or lra-eliminations calls
>>>> @code{INITIAL_ELIMINATION_OFFSET} and allows ..."
>>>>
>>>>
>>>> How does this macro interact with INITIAL_FRAME_POINTER_OFFSET?
>>>>
>>>> I'm OK with this conceptually.  I think you need a minor doc update and
>>>> OK from the ARM maintainers before it can be installed though.
>>>>
>>>> jeff
>>>>
>>>> changelog-frame-layout.txt
>>>>
>>>>
>>>> 2016-06-16  Bernd Edlinger  <bernd.edlinger@hotmail.de>
>>>>
>>>> 	* target.def (compute_frame_layout): New optional target hook.
>>>> 	* doc/tm.texi.in (TARGET_COMPUTE_FRAME_LAYOUT): Add hook.
>>>> 	* doc/tm.texi (TARGET_COMPUTE_FRAME_LAYOUT): Add documentation.
>>>> 	* lra-eliminations.c (update_reg_eliminate): Call compute_frame_layout
>>>> 	target hook.
>>>> 	* reload1.c (verify_initial_elim_offsets): Likewise.
>>>> 	* config/arm/arm.c (TARGET_COMPUTE_FRAME_LAYOUT): Define.
>>>> 	(use_simple_return_p): Call arm_compute_frame_layout if needed.
>>>> 	(arm_get_frame_offsets): Split up into this ...
>>>> 	(arm_compute_frame_layout): ... and this function.
>>>>
>>>> patch-frame-layout.diff
>>>>
>>>>
>>>> Index: gcc/config/arm/arm.c
>>>> ===================================================================
>>>> --- gcc/config/arm/arm.c	(revision 239144)
>>>> +++ gcc/config/arm/arm.c	(working copy)
>>>> @@ -81,6 +81,7 @@ static bool arm_const_not_ok_for_debug_p (rtx);
>>>>   static bool arm_needs_doubleword_align (machine_mode, const_tree);
>>>>   static int arm_compute_static_chain_stack_bytes (void);
>>>>   static arm_stack_offsets *arm_get_frame_offsets (void);
>>>> +static void arm_compute_frame_layout (void);
>>>>   static void arm_add_gc_roots (void);
>>>>   static int arm_gen_constant (enum rtx_code, machine_mode, rtx,
>>>>   			     unsigned HOST_WIDE_INT, rtx, rtx, int, int);
>>>> @@ -663,6 +664,9 @@ static const struct attribute_spec arm_attribute_t
>>>>   #undef TARGET_SCALAR_MODE_SUPPORTED_P
>>>>   #define TARGET_SCALAR_MODE_SUPPORTED_P arm_scalar_mode_supported_p
>>>>
>>>> +#undef TARGET_COMPUTE_FRAME_LAYOUT
>>>> +#define TARGET_COMPUTE_FRAME_LAYOUT arm_compute_frame_layout
>>>> +
>>>>   #undef TARGET_FRAME_POINTER_REQUIRED
>>>>   #define TARGET_FRAME_POINTER_REQUIRED arm_frame_pointer_required
>>>>
>>>> @@ -3880,6 +3884,10 @@ use_simple_return_p (void)
>>>>   {
>>>>     arm_stack_offsets *offsets;
>>>>
>>>> +  /* Note this function can be called before or after reload.  */
>>>> +  if (!reload_completed)
>>>> +    arm_compute_frame_layout ();
>>>> +
>>>>     offsets = arm_get_frame_offsets ();
>>>>     return offsets->outgoing_args != 0;
>>>>   }
>>>> @@ -19370,7 +19378,7 @@ arm_compute_static_chain_stack_bytes (void)
>>>>
>>>>   /* Compute a bit mask of which registers need to be
>>>>      saved on the stack for the current function.
>>>> -   This is used by arm_get_frame_offsets, which may add extra registers.  */
>>>> +   This is used by arm_compute_frame_layout, which may add extra registers.  */
>>>>
>>>>   static unsigned long
>>>>   arm_compute_save_reg_mask (void)
>>>> @@ -20926,12 +20934,25 @@ any_sibcall_could_use_r3 (void)
>>>>     alignment.  */
>>>>
>>>>
>>>> +/* Return cached stack offsets.  */
>>>> +
>>>> +static arm_stack_offsets *
>>>> +arm_get_frame_offsets (void)
>>>> +{
>>>> +  struct arm_stack_offsets *offsets;
>>>> +
>>>> +  offsets = &cfun->machine->stack_offsets;
>>>> +
>>>> +  return offsets;
>>>> +}
>>>> +
>>>> +
>>>>   /* Calculate stack offsets.  These are used to calculate register elimination
>>>>      offsets and in prologue/epilogue code.  Also calculates which registers
>>>>      should be saved.  */
>>>>
>>>> -static arm_stack_offsets *
>>>> -arm_get_frame_offsets (void)
>>>> +static void
>>>> +arm_compute_frame_layout (void)
>>>>   {
>>>>     struct arm_stack_offsets *offsets;
>>>>     unsigned long func_type;
>>>> @@ -20943,19 +20964,6 @@ any_sibcall_could_use_r3 (void)
>>>>
>>>>     offsets = &cfun->machine->stack_offsets;
>>>>
>>>> -  /* We need to know if we are a leaf function.  Unfortunately, it
>>>> -     is possible to be called after start_sequence has been called,
>>>> -     which causes get_insns to return the insns for the sequence,
>>>> -     not the function, which will cause leaf_function_p to return
>>>> -     the incorrect result.
>>>> -
>>>> -     to know about leaf functions once reload has completed, and the
>>>> -     frame size cannot be changed after that time, so we can safely
>>>> -     use the cached value.  */
>>>> -
>>>> -  if (reload_completed)
>>>> -    return offsets;
>>>> -
>>>>     /* Initially this is the size of the local variables.  It will translated
>>>>        into an offset once we have determined the size of preceding data.  */
>>>>     frame_size = ROUND_UP_WORD (get_frame_size ());
>>>> @@ -21022,7 +21030,7 @@ any_sibcall_could_use_r3 (void)
>>>>       {
>>>>         offsets->outgoing_args = offsets->soft_frame;
>>>>         offsets->locals_base = offsets->soft_frame;
>>>> -      return offsets;
>>>> +      return;
>>>>       }
>>>>
>>>>     /* Ensure SFP has the correct alignment.  */
>>>> @@ -21098,8 +21106,6 @@ any_sibcall_could_use_r3 (void)
>>>>   	offsets->outgoing_args += 4;
>>>>         gcc_assert (!(offsets->outgoing_args & 7));
>>>>       }
>>>> -
>>>> -  return offsets;
>>>>   }
>>>>
>>>>
>>>> Index: gcc/doc/tm.texi
>>>> ===================================================================
>>>> --- gcc/doc/tm.texi	(revision 239144)
>>>> +++ gcc/doc/tm.texi	(working copy)
>>>> @@ -3693,6 +3693,14 @@ registers.  This macro must be defined if @code{EL
>>>>   defined.
>>>>   @end defmac
>>>>
>>>> +@deftypefn {Target Hook} void TARGET_COMPUTE_FRAME_LAYOUT (void)
>>>> +This target hook allows the target to compute the frame layout once and
>>>> +make use of the cached frame layout in @code{INITIAL_ELIMINATION_OFFSET}
>>>> +instead of re-computing it on every invocation.  This is particularly
>>>> +useful for targets that have an expensive frame layout function.
>>>> +Implementing this callback is optional.
>>>> +@end deftypefn
>>>> +
>>>>   @node Stack Arguments
>>>>   @subsection Passing Function Arguments on the Stack
>>>>   @cindex arguments on stack
>>>> Index: gcc/doc/tm.texi.in
>>>> ===================================================================
>>>> --- gcc/doc/tm.texi.in	(revision 239144)
>>>> +++ gcc/doc/tm.texi.in	(working copy)
>>>> @@ -3227,6 +3227,8 @@ registers.  This macro must be defined if @code{EL
>>>>   defined.
>>>>   @end defmac
>>>>
>>>> +@hook TARGET_COMPUTE_FRAME_LAYOUT
>>>> +
>>>>   @node Stack Arguments
>>>>   @subsection Passing Function Arguments on the Stack
>>>>   @cindex arguments on stack
>>>> Index: gcc/lra-eliminations.c
>>>> ===================================================================
>>>> --- gcc/lra-eliminations.c	(revision 239144)
>>>> +++ gcc/lra-eliminations.c	(working copy)
>>>> @@ -1203,6 +1203,10 @@ update_reg_eliminate (bitmap insns_with_changed_of
>>>>     struct lra_elim_table *ep, *ep1;
>>>>     HARD_REG_SET temp_hard_reg_set;
>>>>
>>>> +#ifdef ELIMINABLE_REGS
>>>> +  targetm.compute_frame_layout ();
>>>> +#endif
>>>> +
>>>>     /* Clear self elimination offsets.  */
>>>>     for (ep = reg_eliminate; ep < &reg_eliminate[NUM_ELIMINABLE_REGS]; ep++)
>>>>       self_elim_offsets[ep->from] = 0;
>>>> Index: gcc/reload1.c
>>>> ===================================================================
>>>> --- gcc/reload1.c	(revision 239144)
>>>> +++ gcc/reload1.c	(working copy)
>>>> @@ -3831,6 +3831,7 @@ verify_initial_elim_offsets (void)
>>>>     {
>>>>      struct elim_table *ep;
>>>>
>>>> +   targetm.compute_frame_layout ();
>>>>      for (ep = reg_eliminate; ep < &reg_eliminate[NUM_ELIMINABLE_REGS]; ep++)
>>>>        {
>>>>          INITIAL_ELIMINATION_OFFSET (ep->from, ep->to, t);
>>>> @@ -3855,6 +3856,7 @@ set_initial_elim_offsets (void)
>>>>     struct elim_table *ep = reg_eliminate;
>>>>
>>>>   #ifdef ELIMINABLE_REGS
>>>> +  targetm.compute_frame_layout ();
>>>>     for (; ep < &reg_eliminate[NUM_ELIMINABLE_REGS]; ep++)
>>>>       {
>>>>         INITIAL_ELIMINATION_OFFSET (ep->from, ep->to, ep->initial_offset);
>>>> Index: gcc/target.def
>>>> ===================================================================
>>>> --- gcc/target.def	(revision 239144)
>>>> +++ gcc/target.def	(working copy)
>>>> @@ -5269,8 +5269,19 @@ five otherwise.  This is best for most machines.",
>>>>    unsigned int, (void),
>>>>    default_case_values_threshold)
>>>>
>>>> -/* Retutn true if a function must have and use a frame pointer.  */
>>>> +/* Optional callback to advise the target to compute the frame layout.  */
>>>>   DEFHOOK
>>>> +(compute_frame_layout,
>>>> + "This target hook allows the target to compute the frame layout once and\n\
>>>> +make use of the cached frame layout in @code{INITIAL_ELIMINATION_OFFSET}\n\
>>>> +instead of re-computing it on every invocation.  This is particularly\n\
>>>> +useful for targets that have an expensive frame layout function.\n\
>>>> +Implementing this callback is optional.",
>>>> + void, (void),
>>>> + hook_void_void)
>>>> +
>>>> +/* Return true if a function must have and use a frame pointer.  */
>>>> +DEFHOOK
>>>>   (frame_pointer_required,
>>>>    "This target hook should return @code{true} if a function must have and use\n\
>>>>   a frame pointer.  This target hook is called in the reload pass.  If its return\n\
>>
> 

  reply	other threads:[~2016-08-05 14:06 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-06-16 14:47 [PATCH] " Bernd Edlinger
2016-06-21 21:30 ` Jeff Law
2016-06-22  5:12   ` Bernd Edlinger
2016-06-22  7:21     ` AW: " Bernd Edlinger
2016-06-22 18:34       ` Jeff Law
2016-06-22 18:49     ` Jeff Law
2016-06-22 20:25       ` Bernd Edlinger
2016-08-04 21:16   ` [PATCH, ARM] " Bernd Edlinger
2016-08-05  9:29     ` Richard Earnshaw (lists)
2016-08-05 12:49       ` Bernd Edlinger
2016-08-05 14:06         ` Richard Earnshaw (lists) [this message]
2016-08-05 15:23           ` Bernd Edlinger
2016-09-05 16:45           ` Bernd Edlinger
2017-05-05 15:02             ` Richard Earnshaw (lists)
2017-05-06 18:16               ` Bernd Edlinger

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4b971f30-b5e2-7960-aab4-a873f3ed433e@arm.com \
    --to=richard.earnshaw@arm.com \
    --cc=bernd.edlinger@hotmail.de \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jakub@redhat.com \
    --cc=law@redhat.com \
    --cc=nickc@redhat.com \
    --cc=ramana.radhakrishnan@arm.com \
    --cc=rdsandiford@googlemail.com \
    --cc=rguenther@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).