From: Simon Marchi <simon.marchi@polymtl.ca>
To: Mark Wielaard <mark@klomp.org>
Cc: "gdb@sourceware.org" <gdb@sourceware.org>,
gcc@gcc.gnu.org,
"gdb-patches@sourceware.org" <gdb-patches@sourceware.org>
Subject: Re: Split DWARF and rnglists, gcc vs clang
Date: Fri, 13 Nov 2020 09:45:24 -0500
Message-ID: <3a747adc-af41-7492-de36-b357e5429fff@polymtl.ca>
In-Reply-To: <20201113001143.GA2654@wildebeest.org>
On 2020-11-12 7:11 p.m., Mark Wielaard wrote:
> Hi Simon,
>
> On Thu, Nov 05, 2020 at 11:11:43PM -0500, Simon Marchi wrote:
>> I'm currently squashing some bugs related to .debug_rnglists in GDB, and
>> I happened to notice that clang and gcc do different things when
>> generating rnglists with split DWARF. I'd like to know if the two
>> behaviors are acceptable, and therefore if we need to make GDB accept
>> both. Or maybe one of them is not doing things correctly and would need
>> to be fixed.
>>
>> clang generates a .debug_rnglists.dwo section in the .dwo file. Any
>> DW_FORM_rnglistx attribute in the DWO refers to that section. That
>> section is not shared with any other object, so DW_AT_rnglists_base is
>> never involved for these attributes. Note that there might still be a
>> DW_AT_rnglists_base on the DW_TAG_skeleton_unit, in the linked file,
>> used if the skeleton itself has an attribute of form DW_FORM_rnglistx.
>> This rnglist would be found in a .debug_rnglists section in the linked
>> file, shared with the other units of the linked file.
>>
>> gcc generates a single .debug_rnglists in the linked file and no
>> .debug_rnglists.dwo in the DWO files. So when an attribute has form
>> DW_FORM_rnglistx in a DWO file, I presume we need to do the lookup in
>> the .debug_rnglists section in the linked file, using the
>> DW_AT_rnglists_base attribute found in the corresponding skeleton unit.
>> This looks vaguely similar to how it was done pre-DWARF 5, with
>> DW_AT_GNU_ranges_base.
>>
>> So, is gcc wrong here? I don't see anything in the DWARF 5 spec
>> prohibiting what gcc does, but clang's way of doing it sounds
>> more in line with the intent of what's described in the DWARF 5 spec.
>> So I wonder if it's maybe an oversight or a misunderstanding between the
>> two compilers.
>
> I think I would have asked the question the other way around :) The
> spec explicitly describes rnglists_base (and loclists_base) as a way
> to reference ranges (loclists) through the index table, so that the
> only relocation you need is in the (skeleton) DIE.
I presume you are referring to this non-normative text in section 2.17.3?

    This range list representation, the rnglist class, and the related
    DW_AT_rnglists_base attribute are new in DWARF Version 5. Together
    they eliminate most or all of the object language relocations
    previously needed for range lists.

What I understand from this is that the rnglist class and the
DW_AT_rnglists_base attribute help reduce the number of relocations in
the non-split case (they remove the need for relocations from
DW_AT_ranges attribute values in .debug_info to .debug_rnglists). I
don't read it as saying anything about where to put the rnglist
data in the split-unit case.
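
To make the two layouts concrete, here is a rough sketch of how a
consumer might resolve a DW_FORM_rnglistx index in each case. It is not
GDB's actual code. The function names are hypothetical, it assumes
32-bit DWARF (4-byte offsets, 12-byte header) and little-endian data,
and it assumes that a .debug_rnglists.dwo section holds a single
contribution whose offsets table starts right after the header.

```python
import struct

OFFSET_SIZE = 4   # assumption: 32-bit DWARF format
HEADER_SIZE = 12  # unit_length(4) + version(2) + address_size(1)
                  # + segment_selector_size(1) + offset_entry_count(4)


def rnglistx_to_offset(section, base, index):
    """Return the section offset of range list number `index`.

    `base` is the offset of the offsets table; the table entries are
    themselves relative to `base`.
    """
    (relative,) = struct.unpack_from("<I", section, base + index * OFFSET_SIZE)
    return base + relative


def locate_rnglist(index, dwo_rnglists, linked_rnglists, skeleton_rnglists_base):
    """Pick the section and offset for a DW_FORM_rnglistx found in a DWO unit."""
    if dwo_rnglists is not None:
        # clang-style layout: the .dwo file carries its own section, so
        # DW_AT_rnglists_base is not involved (assumed base: end of header).
        return ".debug_rnglists.dwo", rnglistx_to_offset(
            dwo_rnglists, HEADER_SIZE, index)
    # gcc-style layout: look in the linked file's shared section, using
    # the DW_AT_rnglists_base value from the corresponding skeleton unit.
    return ".debug_rnglists", rnglistx_to_offset(
        linked_rnglists, skeleton_rnglists_base, index)
```

The only difference between the two branches is which section is read
and where the base comes from, which is exactly the point the spec
leaves us guessing about.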
> But the rnglists
> (loclists) themselves can still use relocations. A large part of them
> is non-shared addresses, so using indexes (into the .debug_addr
> addr_base) would simply be extra overhead. The relocations will
> disappear once linked, but the index tables won't.
>
> As an alternative, if you like to minimize the amount of debug data in
> the main object file, the spec also describes how to put a whole
> .debug_rnglists.dwo (or .debug_loclists.dwo) in the split dwarf
> file. Then you cannot use all entry encodings and do need to use an
> .debug_addr index to refer to any addresses in that case. So the
> relocations are still there, you just refer to them through an extra
> index indirection.
>
> So I believe both encodings are valid according to the spec. It just
> depends on what you are optimizing for, small main object file size or
> smallest encoding with least number of indirections.
So, if I understand correctly, gcc's way of doing things (putting all
the rnglists in a common .debug_rnglists section) reduces the overall
size of debug info since the rnglists can use the direct addressing
rnglists entries (e.g. DW_RLE_start_end) rather than the indirect ones
(e.g. DW_RLE_startx_endx). But this comes at the expense of a lot of
relocations in the rnglists themselves, since they refer to addresses
directly.
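
As a back-of-the-envelope illustration of that trade-off, the sketch
below encodes the same ranges both ways and counts bytes and
relocations. It assumes 8-byte addresses, little-endian data, and that
none of the addresses is already present in .debug_addr (the
"non-shared" case Mark describes). The function names are hypothetical,
and the addresses are written as final values for readability, whereas
a real object file would hold relocated placeholders.

```python
DW_RLE_end_of_list = 0x00
DW_RLE_startx_endx = 0x02
DW_RLE_start_end = 0x06
ADDR_SIZE = 8  # assumption: 64-bit target


def uleb128(value):
    """Encode a non-negative integer as unsigned LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def encode_direct(ranges):
    """Direct entries: addresses are stored inline in the range list."""
    body = bytearray()
    for start, end in ranges:
        body.append(DW_RLE_start_end)
        body += start.to_bytes(ADDR_SIZE, "little")
        body += end.to_bytes(ADDR_SIZE, "little")
    body.append(DW_RLE_end_of_list)
    relocations = 2 * len(ranges)  # one per inline address
    return bytes(body), relocations


def encode_indirect(ranges, first_addr_index=0):
    """Indirect entries: the list holds indices, .debug_addr holds addresses."""
    body = bytearray()
    addr_table = bytearray()
    index = first_addr_index
    for start, end in ranges:
        body.append(DW_RLE_startx_endx)
        body += uleb128(index) + uleb128(index + 1)
        addr_table += start.to_bytes(ADDR_SIZE, "little")
        addr_table += end.to_bytes(ADDR_SIZE, "little")
        index += 2
    body.append(DW_RLE_end_of_list)
    relocations = 2 * len(ranges)  # one per new .debug_addr entry
    return bytes(body), bytes(addr_table), relocations
```

For two ranges, the direct form is 35 bytes with 4 relocations in the
range list itself, while the indirect form is a 7-byte list plus 32
bytes of .debug_addr, still with 4 relocations, now against
.debug_addr. Under these assumptions the indirect form moves the
relocations out of the range list rather than removing them.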
I thought that the main point of split-units was to reduce the number of
relocations processed by the linker and data moved around by the linker,
to reduce link time and provide a better edit-build-debug cycle. Is
that the case?

Anyway, regardless of the intent, the spec should ideally be clear about
that so we don't have to guess.
> P.S. I am really interested in these interpretations of DWARF, but I
> don't really follow the gdb implementation details very much. Could we
> maybe move discussions like these from the -patches list to the main
> gdb (or gcc) mailing list?
Sure, I added gdb@ and gcc@. I also left gdb-patches@ so that it's
possible to follow the discussion there.
Simon