From: Andrew Burgess <aburgess@redhat.com>
To: Bernd Edlinger <bernd.edlinger@hotmail.de>, gdb-patches@sourceware.org
Subject: Re: [PATCHv5] gdb: fix handling of DW_AT_entry_pc of inlined subroutines
Date: Tue, 05 Nov 2024 11:25:44 +0000 [thread overview]
Message-ID: <87h68lsz0n.fsf@redhat.com> (raw)
In-Reply-To: <DU2PR08MB10263176E1B7E4BD051B7B0FEE4562@DU2PR08MB10263.eurprd08.prod.outlook.com>
Bernd Edlinger <bernd.edlinger@hotmail.de> writes:
> On 10/31/24 16:00, Andrew Burgess wrote:
>> In v5:
>>
>> - If DW_AT_entry_pc is outside of a block's start()..end() range,
>> then the DW_AT_entry_pc is now ignored.
>>
>> - The entry-pc is stored as an offset from block::start() within the
>> block class, this means we no longer need to worry about special
>> address values, i.e. and entry-pc of zero is fine. This also
>> means that we no longer need to relocate the entry-pc in
>> objfile_relocate1. Credit for this idea to Bernd.
>>
>> In v4:
>>
>> - I still have one failed being reported from Linaro CI,
>> gdb.stabs/gdb11479.exp. I've been unable to reproduce this
>> failure. However, on re-reading this patch I did notice that, if
>> the m_entry_pc field was not set then we would still end up giving
>> it a value during relocation, this would result in m_entry_pc
>> holding an address which had been relocated twice.
>>
>> Fixed this by moving the block's address relocation logic into the
>> block class, and only relocating the m_entry_pc variable if it
>> actually has been assigned a value.
>>
>> In v3:
>>
>> - Entry PC values are now clamped to the block's start/end range.
>> This was causing failures in the self-tests when GDB was compiled
>> with GCC using default optimisation, as GCC emits an (invalid?)
>> entry-pc which is outside the blocks start/end range.
>>
>> In v2:
>>
>> - Don't use std::optional<CORE_ADDR> for block::m_entry_pc, instead
>> just use CORE_ADDR. This does mean that targets where 0 is a
>> valid code address might run into problems, but those targets
>> likely have plenty of other problems, and I'd rather land this fix
>> than get hung up on this edge case.
>>
>> In the future it might be possible to remove block::m_start and
>> block::m_end as Klaus suggested, but this doesn't look like a
>> simple task. If we did drop those fields then it might be
>> possible to space a word for some flag bits which would allow us
>> to remove the 0 address special case.
>>
>> - Fixes to the DWARF assembler testcase to handle compiling with
>> '-pie', the Linaro CI exposed these issues. The changes are
>> pretty minor, mostly using label names rather than addresses when
>> building the DWARF.
>>
>> ---
>>
>> The entry PC for a DIE, e.g. an inline function, might not be the base
>> address of the DIE. Currently though, in block::entry_pc(), GDB
>> always returns the base address (low-pc or the first address of the
>> first range) as the entry PC.
>>
>> This commit extends the block class to carry the entry PC as a
>> separate member variable. Then the DWARF reader is extended to read
>> and set the entry PC for the block. Now in block::entry_pc(), if the
>> entry PC has been set, then this is the value returned.
>>
>> I have not removed the old code in block::entry_pc(). Not every DIE
>> will set the entry-pc, but GDB still needs to have an entry-pc for
>> every block, so the existing logic supplies the entry-pc for any block
>> where the entry-pc was not set.
>>
>> The DWARF-5 spec for reading the entry PC is a super-set of the spec
>> as found in DWARF-4. For example, if there is no DW_AT_entry_pc then
>> DWARF-4 says to use DW_AT_low_pc while DWARF-5 says to use the base
>> address, which is DW_AT_low_pc or the first address in the first range
>> specified by DW_AT_ranges if there is no DW_AT_low_pc.
>>
>> I have taken the approach of just implementing the DWARF-5 spec for
>> everyone. There doesn't seem to be any benefit to deliberately
>> ignoring a ranges based entry PC value for DWARF-4. If some naughty
>> compiler has emitted that, then lets use it.
>>
>> Similarly, DWARF-4 says that DW_AT_entry_pc is an address. DWARF-5
>> allows an address or a constant, where the constant is an offset from
>> the base address. I allow both approaches for all DWARF versions.
>> There doesn't seem to be any downsides to this approach.
>>
>> I ran into an issue when testing this patch where GCC would have the
>> DW_AT_entry_pc point to an empty range. When GDB parses the ranges
>> any empty ranges are ignored. As a consequence, the entry-pc appears
>> to be outside the address range of a block.
>>
>> The empty range problem is certainly something that we can, and should
>> address, but that is not the focus of this patch, so for now I'm
>> ignoring that problem. What I have done is added a check: if the
>> DW_AT_entry_pc is outside the range of a block then the entry-pc is
>> ignored, GDB will then fall-back to its default algorithm for
>> computing the entry-pc.
>>
>> If/when in the future we address the empty range problem, these
>> DW_AT_entry_pc attributes will suddenly become valid and GDB will
>> start using them. Until then, GDB continues to operate as it always
>> had.
>>
>> An early version of this patch stored the entry-pc within the block
>> like this:
>>
>> std::optional<CORE_ADDR> m_entry_pc;
>>
>> However, a concern was raised that this, on a 64-bit host, effectively
>> increases the size of block by 16-bytes (8-bytes for the CORE_ADDR,
>> and 8-bytes for the std::optional's bool plus padding).
>>
>> If we remove the std::optional part and just use a CORE_ADDR then we
>> need to have a "special" address to indicate if m_entry_pc is in use
>> or not. I don't really like using special addresses; different
>> targets can access different address ranges, even zero is a valid
>> address on some targets.
>>
>> However, Bernd Edlinger suggested storing the entry-pc as an offset,
>> and I think that will resolve my concerns. So, we store the entry-pc
>> as an offset from the block::start() value. If this offset is zero
>> then the entry-pc is block::start(), or was unset and we should just
>> use the default algorithm. If the entry-pc was set, and is not equal
>> to block::start() then the offset will be non-zero, and we can
>> recalculate the entry-pc when needed.
>>
>> With this done, on a 64-bit host, block only needs to increase by
>> 8-bytes.
>>
>> The inline-entry.exp test was originally contributed by Bernd here:
>>
>> https://inbox.sourceware.org/gdb-patches/AS1PR01MB94659E4D9B3F4A6006CC605FE4922@AS1PR01MB9465.eurprd01.prod.exchangelabs.com
>>
>> though I have made some edits, making more use of lib/gdb.exp
>> functions, making the gdb_test output patterns a little tighter, and
>> updating the test to run with Clang. I also moved the test to
>> gdb.opt/ as that seemed like a better home for it.
>>
>> Co-Authored-By: Bernd Edlinger <bernd.edlinger@hotmail.de>
>> ---
>> gdb/block.h | 47 ++-
>> gdb/dwarf2/read.c | 102 ++++-
>> gdb/testsuite/gdb.dwarf2/dw2-entry-pc.c | 51 +++
>> gdb/testsuite/gdb.dwarf2/dw2-entry-pc.exp | 487 ++++++++++++++++++++++
>> gdb/testsuite/gdb.opt/inline-entry.c | 41 ++
>> gdb/testsuite/gdb.opt/inline-entry.exp | 51 +++
>> 6 files changed, 766 insertions(+), 13 deletions(-)
>> create mode 100644 gdb/testsuite/gdb.dwarf2/dw2-entry-pc.c
>> create mode 100644 gdb/testsuite/gdb.dwarf2/dw2-entry-pc.exp
>> create mode 100644 gdb/testsuite/gdb.opt/inline-entry.c
>> create mode 100644 gdb/testsuite/gdb.opt/inline-entry.exp
>>
>> diff --git a/gdb/block.h b/gdb/block.h
>> index c3babad52f3..e284d67f550 100644
>> --- a/gdb/block.h
>> +++ b/gdb/block.h
>> @@ -179,27 +179,37 @@ struct block : public allocate_on_obstack<block>
>>
>> /* Return the "entry PC" of this block.
>>
>> - The entry PC is the lowest (start) address for the block when all addresses
>> - within the block are contiguous. If non-contiguous, then use the start
>> - address for the first range in the block.
>> -
>> - At the moment, this almost matches what DWARF specifies as the entry
>> - pc. (The missing bit is support for DW_AT_entry_pc which should be
>> - preferred over range data and the low_pc.)
>> + If the entry PC has been set to a specific value then this is
>> + returned. Otherwise, the entry PC is the lowest (start) address for
>> + the block when all addresses within the block are contiguous. If
>> + non-contiguous, then use the start address for the first range in the
>> + block.
>>
>> - Once support for DW_AT_entry_pc is added, I expect that an entry_pc
>> - field will be added to one of these data structures. Once that's done,
>> - the entry_pc field can be set from the dwarf reader (and other readers
>> - too). ENTRY_PC can then be redefined to be less DWARF-centric. */
>> + This almost matches what DWARF specifies as the entry pc, except that
>> + the final case, using the first address of the first range, is a GDB
>> + extension. However, the DWARF reader sets the specific entry PC
>> + wherever possible, so this non-standard fallback case is only used as
>> + a last resort. */
>>
>> CORE_ADDR entry_pc () const
>> {
>> - if (this->is_contiguous ())
>> + if (m_entry_pc_offset != 0)
>> + return this->start () + m_entry_pc_offset;
>> + else if (this->is_contiguous ())
>> return this->start ();
>> else
>> return this->ranges ()[0].start ();
>> }
>
> I wonder if it can happen that this->start is different
> than ranges[0].start, because if that is possible,
> then it would not work to set_entry_pc(this->start)
> because the offset will be 0 and the else path will
> pick the ranges[0],start.
>
> Actually I thought of something like this:
> if (this->is_contiguous ())
> pc = this->start ();
> else
> pc = this->ranges ()[0].start ();
> pc += m_entry_pc_offset;
> return pc;
>
> so without any additional branches,
> is there a reason why that cannot work?
You're correct. I've adopted this approach in v6 which I just posted.
It is possible, and valid, for a block's sub-ranges to not be in
ascending address order. In this case the start() can be lower than the
start of the first range.
>
>>
>> + /* Set this block's entry-pc to ADDR, which must lie between start() and
>> + end(). */
>> +
>> + void set_entry_pc (CORE_ADDR addr)
>> + {
>> + gdb_assert (addr >= this->start () && addr < this->end ());
>
> note that my patch series part 3/3 has an example where
> the inline funcion has an empty instruction range, I wonder
> if this assertion will break, maybe addr <= this->end ?
I ran the tests from the full optimised debug series. Obviously there's
plenty of failures at this point, but there are no assertions triggered.
Using 'addr <= this->end' would, I think not be correct, as the block's
end() address is not inclusive. That is, if 'addr == this->end ()' then
the entry-pc would be the first address outside the block's range, which
is clearly not correct.
> So, when this function is called, are the ranges already set,
> or will that happen later?
The ranges are already set by the time a block's entry-pc is set.
Thanks,
Andrew
next prev parent reply other threads:[~2024-11-05 11:25 UTC|newest]
Thread overview: 41+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-16 15:47 [PATCH] " Andrew Burgess
2024-10-17 20:03 ` Tom Tromey
2024-10-18 10:06 ` Andrew Burgess
2024-10-18 13:57 ` Andrew Burgess
2024-10-18 10:26 ` Gerlicher, Klaus
2024-10-18 13:55 ` Andrew Burgess
2024-10-18 13:53 ` [PATCHv2] " Andrew Burgess
2024-10-28 13:45 ` [PATCHv3] " Andrew Burgess
2024-10-29 14:49 ` [PATCHv4] " Andrew Burgess
2024-10-29 15:28 ` Bernd Edlinger
2024-10-31 10:57 ` Andrew Burgess
2024-10-31 14:01 ` Bernd Edlinger
2024-10-31 14:56 ` Andrew Burgess
2024-10-29 16:34 ` Bernd Edlinger
2024-10-31 10:59 ` Andrew Burgess
2024-10-31 15:00 ` [PATCHv5] " Andrew Burgess
2024-11-01 18:13 ` Tom Tromey
2024-11-01 20:27 ` Bernd Edlinger
2024-11-05 11:25 ` Andrew Burgess [this message]
2024-11-05 15:26 ` Bernd Edlinger
2024-11-05 16:52 ` Andrew Burgess
2024-11-05 19:57 ` Bernd Edlinger
2024-11-05 11:21 ` [PATCHv6] " Andrew Burgess
2024-11-13 13:49 ` Andrew Burgess
2024-11-13 16:59 ` Andrew Burgess
2024-11-14 9:20 ` Tom de Vries
2024-11-14 19:33 ` Andrew Burgess
2024-11-15 8:50 ` Tom de Vries
2024-11-15 10:53 ` Bernd Edlinger
2024-11-15 14:00 ` Andrew Burgess
2024-11-15 14:30 ` Tom de Vries
2024-11-15 16:46 ` Andrew Burgess
2024-11-15 19:24 ` Andrew Burgess
2024-11-17 23:52 ` Bernd Edlinger
2024-11-19 9:29 ` Andrew Burgess
2024-11-15 17:00 ` Bernd Edlinger
2024-11-15 13:45 ` Andrew Burgess
2024-10-29 15:29 ` [PATCHv3] " Sam James
2024-10-31 9:48 ` Andrew Burgess
2024-10-18 14:24 ` [PATCH] " Bernd Edlinger
2024-10-28 13:26 ` Andrew Burgess
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87h68lsz0n.fsf@redhat.com \
--to=aburgess@redhat.com \
--cc=bernd.edlinger@hotmail.de \
--cc=gdb-patches@sourceware.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).