From: Torbjorn SVENSSON <torbjorn.svensson@foss.st.com>
To: Luis Machado <luis.machado@arm.com>, <gdb-patches@sourceware.org>
Cc: <vanekt@volny.cz>
Subject: Re: [PATCH 3/3] gdb/arm: PR 29738 Cache value for stack pointers for dwarf2 frames
Date: Sat, 12 Nov 2022 20:54:47 +0100 [thread overview]
Message-ID: <5f819016-ec7e-7d4c-0ddc-86ac12f34cc7@foss.st.com> (raw)
In-Reply-To: <79392f32-a64e-7bcf-e4a3-b2795917580b@arm.com>
Hi Luis,
Thanks for taking another look at this.
My comments below will be slightly overlaping what Tomas already
replied, but I think it makes more sense to reply to this email than to
Tomas reply as my comments is on your mail.
On 2022-11-11 10:15, Luis Machado wrote:
> Hi Torbjörn,
>
> On 11/4/22 14:44, Torbjörn SVENSSON wrote:
>> Without this patch, the number of calls to arm_dwarf2_prev_register
>> would grow in a too rapid way when the number of frames increase.
>>
>> Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
>> ---
>> gdb/arm-tdep.c | 141 +++++++++++++++++++++++--------------------------
>> 1 file changed, 66 insertions(+), 75 deletions(-)
>>
>> diff --git a/gdb/arm-tdep.c b/gdb/arm-tdep.c
>> index c011b2aa973..a6fb660bcbc 100644
>> --- a/gdb/arm-tdep.c
>> +++ b/gdb/arm-tdep.c
>> @@ -3992,78 +3992,6 @@ arm_dwarf2_prev_register (frame_info_ptr
>> this_frame, void **this_cache,
>> cpsr = reconstruct_t_bit (gdbarch, lr, cpsr);
>> return frame_unwind_got_constant (this_frame, regnum, cpsr);
>> }
>> - else if (arm_is_alternative_sp_register (tdep, regnum))
>> - {
>> - /* Handle the alternative SP registers on Cortex-M. */
>> - bool override_with_sp_value = false;
>> - CORE_ADDR val;
>> -
>> - if (tdep->have_sec_ext)
>> - {
>> - CORE_ADDR sp
>> - = get_frame_register_unsigned (this_frame, ARM_SP_REGNUM);
>> - CORE_ADDR msp_s
>> - = get_frame_register_unsigned (this_frame,
>> - tdep->m_profile_msp_s_regnum);
>> - CORE_ADDR msp_ns
>> - = get_frame_register_unsigned (this_frame,
>> - tdep->m_profile_msp_ns_regnum);
>> - CORE_ADDR psp_s
>> - = get_frame_register_unsigned (this_frame,
>> - tdep->m_profile_psp_s_regnum);
>> - CORE_ADDR psp_ns
>> - = get_frame_register_unsigned (this_frame,
>> - tdep->m_profile_psp_ns_regnum);
>> -
>> - bool is_msp = (regnum == tdep->m_profile_msp_regnum)
>> - && (msp_s == sp || msp_ns == sp);
>> - bool is_msp_s = (regnum == tdep->m_profile_msp_s_regnum)
>> - && (msp_s == sp);
>> - bool is_msp_ns = (regnum == tdep->m_profile_msp_ns_regnum)
>> - && (msp_ns == sp);
>> - bool is_psp = (regnum == tdep->m_profile_psp_regnum)
>> - && (psp_s == sp || psp_ns == sp);
>> - bool is_psp_s = (regnum == tdep->m_profile_psp_s_regnum)
>> - && (psp_s == sp);
>> - bool is_psp_ns = (regnum == tdep->m_profile_psp_ns_regnum)
>> - && (psp_ns == sp);
>> -
>> - override_with_sp_value = is_msp || is_msp_s || is_msp_ns
>> - || is_psp || is_psp_s || is_psp_ns;
>> -
>> - }
>> - else if (tdep->is_m)
>> - {
>> - CORE_ADDR sp
>> - = get_frame_register_unsigned (this_frame, ARM_SP_REGNUM);
>> - CORE_ADDR msp
>> - = get_frame_register_unsigned (this_frame,
>> - tdep->m_profile_msp_regnum);
>> - CORE_ADDR psp
>> - = get_frame_register_unsigned (this_frame,
>> - tdep->m_profile_psp_regnum);
>> -
>> - bool is_msp = (regnum == tdep->m_profile_msp_regnum) && (sp ==
>> msp);
>> - bool is_psp = (regnum == tdep->m_profile_psp_regnum) && (sp ==
>> psp);
>> -
>> - override_with_sp_value = is_msp || is_psp;
>> - }
>> -
>> - if (override_with_sp_value)
>> - {
>> - /* Use value of SP from previous frame. */
>> - frame_info_ptr prev_frame = get_prev_frame (this_frame);
>> - if (prev_frame)
>> - val = get_frame_register_unsigned (prev_frame, ARM_SP_REGNUM);
>> - else
>> - val = get_frame_base (this_frame);
>> - }
>> - else
>> - /* Use value for the register from previous frame. */
>> - val = get_frame_register_unsigned (this_frame, regnum);
>> -
>> - return frame_unwind_got_constant (this_frame, regnum, val);
>> - }
>> internal_error (_("Unexpected register %d"), regnum);
>> }
>> @@ -5202,9 +5130,72 @@ arm_dwarf2_frame_init_reg (struct gdbarch
>> *gdbarch, int regnum,
>> reg->how = DWARF2_FRAME_REG_CFA;
>> else if (arm_is_alternative_sp_register (tdep, regnum))
>> {
>> - /* Handle the alternative SP registers on Cortex-M. */
>> - reg->how = DWARF2_FRAME_REG_FN;
>> - reg->loc.fn = arm_dwarf2_prev_register;
>> + /* Identify what stack pointers that are synced with sp. */
>> + bool override_with_sp_value = false;
>> +
>> + if (tdep->have_sec_ext)
>> + {
>> + CORE_ADDR sp
>> + = get_frame_register_unsigned (this_frame, ARM_SP_REGNUM);
>> +
>> + CORE_ADDR msp_s
>> + = get_frame_register_unsigned (this_frame,
>> + tdep->m_profile_msp_s_regnum);
>> + CORE_ADDR msp_ns
>> + = get_frame_register_unsigned (this_frame,
>> + tdep->m_profile_msp_ns_regnum);
>> + CORE_ADDR psp_s
>> + = get_frame_register_unsigned (this_frame,
>> + tdep->m_profile_psp_s_regnum);
>> + CORE_ADDR psp_ns
>> + = get_frame_register_unsigned (this_frame,
>> + tdep->m_profile_psp_ns_regnum);
>> +
>> + bool is_msp = (regnum == tdep->m_profile_msp_regnum)
>> + && (msp_s == sp || msp_ns == sp);
>> + bool is_msp_s = (regnum == tdep->m_profile_msp_s_regnum)
>> + && (msp_s == sp);
>> + bool is_msp_ns = (regnum == tdep->m_profile_msp_ns_regnum)
>> + && (msp_ns == sp);
>> + bool is_psp = (regnum == tdep->m_profile_psp_regnum)
>> + && (psp_s == sp || psp_ns == sp);
>> + bool is_psp_s = (regnum == tdep->m_profile_psp_s_regnum)
>> + && (psp_s == sp);
>> + bool is_psp_ns = (regnum == tdep->m_profile_psp_ns_regnum)
>> + && (psp_ns == sp);
>> +
>> + override_with_sp_value = is_msp || is_msp_s || is_msp_ns
>> + || is_psp || is_psp_s || is_psp_ns;
>> +
>> + }
>> + else if (tdep->is_m)
>> + {
>> + CORE_ADDR sp
>> + = get_frame_register_unsigned (this_frame, ARM_SP_REGNUM);
>> +
>> + CORE_ADDR msp
>> + = get_frame_register_unsigned (this_frame,
>> + tdep->m_profile_msp_regnum);
>> + CORE_ADDR psp
>> + = get_frame_register_unsigned (this_frame,
>> + tdep->m_profile_psp_regnum);
>> +
>> + bool is_msp = (regnum == tdep->m_profile_msp_regnum) && (sp ==
>> msp);
>> + bool is_psp = (regnum == tdep->m_profile_psp_regnum) && (sp ==
>> psp);
>> +
>> + override_with_sp_value = is_msp || is_psp;
>> + }
>> +
>> + if (override_with_sp_value)
>> + {
>> + /* Use the CFA value for this stack pointer register. */
>> + reg->how = DWARF2_FRAME_REG_CFA;
>> + }
>> + else
>> + {
>> + /* This frame does not have any update for this stack pointer. */
>> + reg->how = DWARF2_FRAME_REG_SAME_VALUE;
>> + }
>> }
>> }
>
> Although it does make sense to cache the values to address this issue, I
> don't think this approach is appropriate.
>
> The init_reg () hook is supposed to initialize the rules. Even if
> unwinding works from this particular function, I don't think it is safe
> to do so. >
> If we attempt to point at a register whose rule hasn't been initialize
> yet, we may run into issues. I'm not sure if that is technically
> possible, I haven't exercised this in practice.
As I said on IRC; where are currently only fetching values for the inner
frame and not the current frame, so I think the risk that some rule has
not yet been initialized is slim to none. :)
It can still be a bad way to go though...
> I think we need to take a step back and re-assess how we're determining
> the active sp register. From what I can tell, it is just a value check.
> Since we need to check at least 6 values, it gets fairly expensive the
> more frames you have in the debugging session.
So, to lets give everyone some background here to have some context.
For Arm Cortex-M33 with security extension (TrustZone), there are 4
stack pointer registers. msp_s, mps_ns, psp_s and psp_ns. In addition to
these, there is the 'sp' register that is always in sync with one of the
previous mentioned 4 registers. Internally in GDB, there is only one
stack pointer and its register is sp. When sp is updated, for some
reason, in GDB during the unwind, then the matching of the 4 other
registers should have the same modification as they are in sync in the core.
To further complicate things, there are 2 other registers, msp and psp,
that may (or may not) exist on the target. If they exist, then they
should be kept in sync with their _ns or _s registers depending on
securty context. When the core is in secure context, msp, psp and sp are
always alias for one of msp_s or psp_s. When the core is in non-secure
context, msp, psp and sp are always alias for one of msp_ns or psp_ns.
So, all in all; On Cortex-M33 with TrustZone, there are 7 stack pointers
where 3 of them are "aliases" for one of the basic 4 registers.
For all Cortex-M without TrustZone (as far as I know anyway), the stack
pointers are reduced to sp, msp and psp. In this case, sp is to be
considered as an "alias" for either msp or psp.
Regardless of the Cortex-M type or existence of TrustZone, the core
always works on what is called 'sp', so it's a matter of updating what
this 'sp' is alias for when doing interrupts or similar.
This description might be simplified or not 100% acurate from a core
perspective, but it's my view of how the hardware works. If anyone has a
better picture to draw, please help me get a more clear view of how to
look at all these registers from a core perspective. :)
>
> One idea that comes to mind is to introduce a pseudo-register that
> stores the active sp register number (say active_sp). It may not need a
> name if it is only going to be used for unwinding purposes.
Yes, it would reduce the number of registers to fetch, but it would
still be more than what I think is acceptable.
For a non-TrustZone core, I suppose it would be:
* sp: just the ARM_SP_REGNUM register.
* msp: active_sp and then either the register that active_sp points to
or the msp register number.
* psp: active_sp and then either the register that active_sp points to
or the psp register number.
For a TrustZone core, it would be worse:
* sp: just the ARM_SP_REGNUM register.
* msp_s: active_sp and then the register that active_sp points to or the
msp_s register number.
* msp_ns: active_sp and then the register that active_sp points to or
the msp_ns register number.
* psp_s: active_sp and then the register that active_sp points to or the
psp_s register number.
* psp_ns: active_sp and then the register that active_sp points to or
the psp_ns register number.
* msp: active_sp and then the register that active_sp points to or the
msp register number.
* psp: active_sp and then the register that active_sp points to or the
psp register number.
So, in best case, it's 1 register lookup, but in worse case it's 2
register lookups, and this is per fetch of a stack pointer.
To fetch the values once for a frame, it would be 5 register fetches for
non-TrustZone and 13 register fetches for TrustZone. This is still lower
than the original implementation, but is it acceptable with that large
number of fetches of register content per frame?
>
> In the sentinel frame, we can easily determine this value by doing quick
> checks with all the alternate sp registers. The other frames would query
> this pseudo-register to know what value they must use for the alternate
> sp registers.
>
> I think this will reduce the number of alternate sp value fetches
> considerably, and thus will address this performance issue.
Yes, it would lower the number, but it would still be rather high as
fetching a single register from the inner frame would result in 1 or 2
fetches from the more inner frame and so on, so it's stil growing fast.
>
> For the alternate sp registers, we can still use the
> arm_dwarf2_prev_register function, but now we'll fetch the active_sp
> pseudo_register to help us decide what value to return.
>
> On other unwinders of different frame types (prologue, exception etc),
> we can adjust the value of active_sp accordingly if the active stack
> pointer changed.
Sure, it would mean that the identification of what sp is alias for is
already available by the inner frame, but I don't see that much
difference compare to the code in this patch, except that this porposed
algorithm would be fetching a lot more registers from the inner frame
each time a stack pointer is to be returned.
Performance wise, I'd say that your solution is somehwere in between
what is currently merged in master and this patch.
>
> Does that make sense?
I think there are only 2 viable solutions for dwarf2 frames:
* eager cache (that's what I did in this patch)
* lazy cache (this is what Tomas suggested in his RFC series)
Both have their pros and cons.
I don't know if I'm still stuck in the loop of what I've provided or if
there is a better alternative that I can't see.
Kind regards,
Torbjörn
next prev parent reply other threads:[~2022-11-12 19:54 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-11-04 14:44 [PATCH 0/3] gdb/arm: Fixes for Cortex-M stack unwinding Torbjörn SVENSSON
2022-11-04 14:44 ` [PATCH 1/3] gdb/arm: Update active msp/psp when switching stack Torbjörn SVENSSON
2022-11-04 14:44 ` [PATCH 2/3] gdb/arm: Ensure that stack pointers are in sync Torbjörn SVENSSON
2022-11-04 17:31 ` Tomas Vanek
2022-11-07 17:27 ` Torbjorn SVENSSON
2022-11-04 14:44 ` [PATCH 3/3] gdb/arm: PR 29738 Cache value for stack pointers for dwarf2 frames Torbjörn SVENSSON
2022-11-11 9:15 ` Luis Machado
2022-11-11 10:27 ` Tomas Vanek
2022-11-12 19:54 ` Torbjorn SVENSSON [this message]
2022-11-14 10:46 ` Luis Machado
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5f819016-ec7e-7d4c-0ddc-86ac12f34cc7@foss.st.com \
--to=torbjorn.svensson@foss.st.com \
--cc=gdb-patches@sourceware.org \
--cc=luis.machado@arm.com \
--cc=vanekt@volny.cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).