public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
From: Torbjorn SVENSSON <torbjorn.svensson@foss.st.com>
To: Luis Machado <luis.machado@arm.com>, <gdb-patches@sourceware.org>
Cc: <vanekt@volny.cz>
Subject: Re: [PATCH 3/3] gdb/arm: PR 29738 Cache value for stack pointers for dwarf2 frames
Date: Sat, 12 Nov 2022 20:54:47 +0100	[thread overview]
Message-ID: <5f819016-ec7e-7d4c-0ddc-86ac12f34cc7@foss.st.com> (raw)
In-Reply-To: <79392f32-a64e-7bcf-e4a3-b2795917580b@arm.com>

Hi Luis,

Thanks for taking another look at this.
My comments below will be slightly overlaping what Tomas already 
replied, but I think it makes more sense to reply to this email than to 
Tomas reply as my comments is on your mail.

On 2022-11-11 10:15, Luis Machado wrote:
> Hi Torbjörn,
> 
> On 11/4/22 14:44, Torbjörn SVENSSON wrote:
>> Without this patch, the number of calls to arm_dwarf2_prev_register
>> would grow in a too rapid way when the number of frames increase.
>>
>> Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
>> ---
>>   gdb/arm-tdep.c | 141 +++++++++++++++++++++++--------------------------
>>   1 file changed, 66 insertions(+), 75 deletions(-)
>>
>> diff --git a/gdb/arm-tdep.c b/gdb/arm-tdep.c
>> index c011b2aa973..a6fb660bcbc 100644
>> --- a/gdb/arm-tdep.c
>> +++ b/gdb/arm-tdep.c
>> @@ -3992,78 +3992,6 @@ arm_dwarf2_prev_register (frame_info_ptr 
>> this_frame, void **this_cache,
>>         cpsr = reconstruct_t_bit (gdbarch, lr, cpsr);
>>         return frame_unwind_got_constant (this_frame, regnum, cpsr);
>>       }
>> -  else if (arm_is_alternative_sp_register (tdep, regnum))
>> -    {
>> -      /* Handle the alternative SP registers on Cortex-M.  */
>> -      bool override_with_sp_value = false;
>> -      CORE_ADDR val;
>> -
>> -      if (tdep->have_sec_ext)
>> -    {
>> -      CORE_ADDR sp
>> -        = get_frame_register_unsigned (this_frame, ARM_SP_REGNUM);
>> -      CORE_ADDR msp_s
>> -        = get_frame_register_unsigned (this_frame,
>> -                       tdep->m_profile_msp_s_regnum);
>> -      CORE_ADDR msp_ns
>> -        = get_frame_register_unsigned (this_frame,
>> -                       tdep->m_profile_msp_ns_regnum);
>> -      CORE_ADDR psp_s
>> -        = get_frame_register_unsigned (this_frame,
>> -                       tdep->m_profile_psp_s_regnum);
>> -      CORE_ADDR psp_ns
>> -        = get_frame_register_unsigned (this_frame,
>> -                       tdep->m_profile_psp_ns_regnum);
>> -
>> -      bool is_msp = (regnum == tdep->m_profile_msp_regnum)
>> -        && (msp_s == sp || msp_ns == sp);
>> -      bool is_msp_s = (regnum == tdep->m_profile_msp_s_regnum)
>> -        && (msp_s == sp);
>> -      bool is_msp_ns = (regnum == tdep->m_profile_msp_ns_regnum)
>> -        && (msp_ns == sp);
>> -      bool is_psp = (regnum == tdep->m_profile_psp_regnum)
>> -        && (psp_s == sp || psp_ns == sp);
>> -      bool is_psp_s = (regnum == tdep->m_profile_psp_s_regnum)
>> -        && (psp_s == sp);
>> -      bool is_psp_ns = (regnum == tdep->m_profile_psp_ns_regnum)
>> -        && (psp_ns == sp);
>> -
>> -      override_with_sp_value = is_msp || is_msp_s || is_msp_ns
>> -        || is_psp || is_psp_s || is_psp_ns;
>> -
>> -    }
>> -      else if (tdep->is_m)
>> -    {
>> -      CORE_ADDR sp
>> -        = get_frame_register_unsigned (this_frame, ARM_SP_REGNUM);
>> -      CORE_ADDR msp
>> -        = get_frame_register_unsigned (this_frame,
>> -                       tdep->m_profile_msp_regnum);
>> -      CORE_ADDR psp
>> -        = get_frame_register_unsigned (this_frame,
>> -                       tdep->m_profile_psp_regnum);
>> -
>> -      bool is_msp = (regnum == tdep->m_profile_msp_regnum) && (sp == 
>> msp);
>> -      bool is_psp = (regnum == tdep->m_profile_psp_regnum) && (sp == 
>> psp);
>> -
>> -      override_with_sp_value = is_msp || is_psp;
>> -    }
>> -
>> -      if (override_with_sp_value)
>> -    {
>> -      /* Use value of SP from previous frame.  */
>> -      frame_info_ptr prev_frame = get_prev_frame (this_frame);
>> -      if (prev_frame)
>> -        val = get_frame_register_unsigned (prev_frame, ARM_SP_REGNUM);
>> -      else
>> -        val = get_frame_base (this_frame);
>> -    }
>> -      else
>> -    /* Use value for the register from previous frame.  */
>> -    val = get_frame_register_unsigned (this_frame, regnum);
>> -
>> -      return frame_unwind_got_constant (this_frame, regnum, val);
>> -    }
>>     internal_error (_("Unexpected register %d"), regnum);
>>   }
>> @@ -5202,9 +5130,72 @@ arm_dwarf2_frame_init_reg (struct gdbarch 
>> *gdbarch, int regnum,
>>       reg->how = DWARF2_FRAME_REG_CFA;
>>     else if (arm_is_alternative_sp_register (tdep, regnum))
>>       {
>> -      /* Handle the alternative SP registers on Cortex-M.  */
>> -      reg->how = DWARF2_FRAME_REG_FN;
>> -      reg->loc.fn = arm_dwarf2_prev_register;
>> +      /* Identify what stack pointers that are synced with sp.  */
>> +      bool override_with_sp_value = false;
>> +
>> +      if (tdep->have_sec_ext)
>> +    {
>> +      CORE_ADDR sp
>> +        = get_frame_register_unsigned (this_frame, ARM_SP_REGNUM);
>> +
>> +      CORE_ADDR msp_s
>> +        = get_frame_register_unsigned (this_frame,
>> +                       tdep->m_profile_msp_s_regnum);
>> +      CORE_ADDR msp_ns
>> +        = get_frame_register_unsigned (this_frame,
>> +                       tdep->m_profile_msp_ns_regnum);
>> +      CORE_ADDR psp_s
>> +        = get_frame_register_unsigned (this_frame,
>> +                       tdep->m_profile_psp_s_regnum);
>> +      CORE_ADDR psp_ns
>> +        = get_frame_register_unsigned (this_frame,
>> +                       tdep->m_profile_psp_ns_regnum);
>> +
>> +      bool is_msp = (regnum == tdep->m_profile_msp_regnum)
>> +        && (msp_s == sp || msp_ns == sp);
>> +      bool is_msp_s = (regnum == tdep->m_profile_msp_s_regnum)
>> +        && (msp_s == sp);
>> +      bool is_msp_ns = (regnum == tdep->m_profile_msp_ns_regnum)
>> +        && (msp_ns == sp);
>> +      bool is_psp = (regnum == tdep->m_profile_psp_regnum)
>> +        && (psp_s == sp || psp_ns == sp);
>> +      bool is_psp_s = (regnum == tdep->m_profile_psp_s_regnum)
>> +        && (psp_s == sp);
>> +      bool is_psp_ns = (regnum == tdep->m_profile_psp_ns_regnum)
>> +        && (psp_ns == sp);
>> +
>> +      override_with_sp_value = is_msp || is_msp_s || is_msp_ns
>> +        || is_psp || is_psp_s || is_psp_ns;
>> +
>> +    }
>> +      else if (tdep->is_m)
>> +    {
>> +      CORE_ADDR sp
>> +        = get_frame_register_unsigned (this_frame, ARM_SP_REGNUM);
>> +
>> +      CORE_ADDR msp
>> +        = get_frame_register_unsigned (this_frame,
>> +                       tdep->m_profile_msp_regnum);
>> +      CORE_ADDR psp
>> +        = get_frame_register_unsigned (this_frame,
>> +                       tdep->m_profile_psp_regnum);
>> +
>> +      bool is_msp = (regnum == tdep->m_profile_msp_regnum) && (sp == 
>> msp);
>> +      bool is_psp = (regnum == tdep->m_profile_psp_regnum) && (sp == 
>> psp);
>> +
>> +      override_with_sp_value = is_msp || is_psp;
>> +    }
>> +
>> +      if (override_with_sp_value)
>> +    {
>> +      /* Use the CFA value for this stack pointer register.  */
>> +      reg->how = DWARF2_FRAME_REG_CFA;
>> +    }
>> +      else
>> +    {
>> +      /* This frame does not have any update for this stack pointer.  */
>> +      reg->how = DWARF2_FRAME_REG_SAME_VALUE;
>> +    }
>>       }
>>   }
> 
> Although it does make sense to cache the values to address this issue, I 
> don't think this approach is appropriate.
> 
> The init_reg () hook is supposed to initialize the rules. Even if 
> unwinding works from this particular function, I don't think it is safe 
> to do so. >
> If we attempt to point at a register whose rule hasn't been initialize 
> yet, we may run into issues. I'm not sure if that is technically 
> possible, I haven't exercised this in practice.

As I said on IRC; where are currently only fetching values for the inner 
frame and not the current frame, so I think the risk that some rule has 
not yet been initialized is slim to none. :)
It can still be a bad way to go though...

> I think we need to take a step back and re-assess how we're determining 
> the active sp register. From what I can tell, it is just a value check. 
> Since we need to check at least 6 values, it gets fairly expensive the 
> more frames you have in the debugging session.

So, to lets give everyone some background here to have some context.
For Arm Cortex-M33 with security extension (TrustZone), there are 4 
stack pointer registers. msp_s, mps_ns, psp_s and psp_ns. In addition to 
these, there is the 'sp' register that is always in sync with one of the 
previous mentioned 4 registers. Internally in GDB, there is only one 
stack pointer and its register is sp. When sp is updated, for some 
reason, in GDB during the unwind, then the matching of the 4 other 
registers should have the same modification as they are in sync in the core.
To further complicate things, there are 2 other registers, msp and psp, 
that may (or may not) exist on the target. If they exist, then they 
should be kept in sync with their _ns or _s registers depending on 
securty context. When the core is in secure context, msp, psp and sp are 
always alias for one of msp_s or psp_s. When the core is in non-secure 
context, msp, psp and sp are always alias for one of msp_ns or psp_ns.
So, all in all; On Cortex-M33 with TrustZone, there are 7 stack pointers 
where 3 of them are "aliases" for one of the basic 4 registers.
For all Cortex-M without TrustZone (as far as I know anyway), the stack 
pointers are reduced to sp, msp and psp. In this case, sp is to be 
considered as an "alias" for either msp or psp.

Regardless of the Cortex-M type or existence of TrustZone, the core 
always works on what is called 'sp', so it's a matter of updating what 
this 'sp' is alias for when doing interrupts or similar.

This description might be simplified or not 100% acurate from a core 
perspective, but it's my view of how the hardware works. If anyone has a 
better picture to draw, please help me get a more clear view of how to 
look at all these registers from a core perspective. :)

> 
> One idea that comes to mind is to introduce a pseudo-register that 
> stores the active sp register number (say active_sp). It may not need a 
> name if it is only going to be used for unwinding purposes.

Yes, it would reduce the number of registers to fetch, but it would 
still be more than what I think is acceptable.

For a non-TrustZone core, I suppose it would be:
* sp: just the ARM_SP_REGNUM register.
* msp: active_sp and then either the register that active_sp points to 
or the msp register number.
* psp: active_sp and then either the register that active_sp points to 
or the psp register number.

For a TrustZone core, it would be worse:
* sp: just the ARM_SP_REGNUM register.
* msp_s: active_sp and then the register that active_sp points to or the 
msp_s register number.
* msp_ns: active_sp and then the register that active_sp points to or 
the msp_ns register number.
* psp_s: active_sp and then the register that active_sp points to or the 
psp_s register number.
* psp_ns: active_sp and then the register that active_sp points to or 
the psp_ns register number.
* msp: active_sp and then the register that active_sp points to or the 
msp register number.
* psp: active_sp and then the register that active_sp points to or the 
psp register number.


So, in best case, it's 1 register lookup, but in worse case it's 2 
register lookups, and this is per fetch of a stack pointer.
To fetch the values once for a frame, it would be 5 register fetches for 
non-TrustZone and 13 register fetches for TrustZone. This is still lower 
than the original implementation, but is it acceptable with that large 
number of fetches of register content per frame?

> 
> In the sentinel frame, we can easily determine this value by doing quick 
> checks with all the alternate sp registers. The other frames would query 
> this pseudo-register to know what value they must use for the alternate 
> sp registers.
> 
> I think this will reduce the number of alternate sp value fetches 
> considerably, and thus will address this performance issue.

Yes, it would lower the number, but it would still be rather high as 
fetching a single register from the inner frame would result in 1 or 2 
fetches from the more inner frame and so on, so it's stil growing fast.

> 
> For the alternate sp registers, we can still use the 
> arm_dwarf2_prev_register function, but now we'll fetch the active_sp 
> pseudo_register to help us decide what value to return.
> 
> On other unwinders of different frame types (prologue, exception etc), 
> we can adjust the value of active_sp accordingly if the active stack 
> pointer changed.

Sure, it would mean that the identification of what sp is alias for is 
already available by the inner frame, but I don't see that much 
difference compare to the code in this patch, except that this porposed 
algorithm would be fetching a lot more registers from the inner frame 
each time a stack pointer is to be returned.
Performance wise, I'd say that your solution is somehwere in between 
what is currently merged in master and this patch.

> 
> Does that make sense?

I think there are only 2 viable solutions for dwarf2 frames:
* eager cache (that's what I did in this patch)
* lazy cache (this is what Tomas suggested in his RFC series)
Both have their pros and cons.

I don't know if I'm still stuck in the loop of what I've provided or if 
there is a better alternative that I can't see.

Kind regards,
Torbjörn

  parent reply	other threads:[~2022-11-12 19:54 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-04 14:44 [PATCH 0/3] gdb/arm: Fixes for Cortex-M stack unwinding Torbjörn SVENSSON
2022-11-04 14:44 ` [PATCH 1/3] gdb/arm: Update active msp/psp when switching stack Torbjörn SVENSSON
2022-11-04 14:44 ` [PATCH 2/3] gdb/arm: Ensure that stack pointers are in sync Torbjörn SVENSSON
2022-11-04 17:31   ` Tomas Vanek
2022-11-07 17:27     ` Torbjorn SVENSSON
2022-11-04 14:44 ` [PATCH 3/3] gdb/arm: PR 29738 Cache value for stack pointers for dwarf2 frames Torbjörn SVENSSON
2022-11-11  9:15   ` Luis Machado
2022-11-11 10:27     ` Tomas Vanek
2022-11-12 19:54     ` Torbjorn SVENSSON [this message]
2022-11-14 10:46       ` Luis Machado

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5f819016-ec7e-7d4c-0ddc-86ac12f34cc7@foss.st.com \
    --to=torbjorn.svensson@foss.st.com \
    --cc=gdb-patches@sourceware.org \
    --cc=luis.machado@arm.com \
    --cc=vanekt@volny.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).