public inbox for gdb@sourceware.org
From: Simon Marchi <simon.marchi@polymtl.ca>
To: Mark Wielaard <mark@klomp.org>
Cc: "gdb@sourceware.org" <gdb@sourceware.org>,
	gcc@gcc.gnu.org,
	"gdb-patches@sourceware.org" <gdb-patches@sourceware.org>
Subject: Re: Split DWARF and rnglists, gcc vs clang
Date: Fri, 13 Nov 2020 09:45:24 -0500
Message-ID: <3a747adc-af41-7492-de36-b357e5429fff@polymtl.ca>
In-Reply-To: <20201113001143.GA2654@wildebeest.org>

On 2020-11-12 7:11 p.m., Mark Wielaard wrote:
> Hi Simon,
>
> On Thu, Nov 05, 2020 at 11:11:43PM -0500, Simon Marchi wrote:
>> I'm currently squashing some bugs related to .debug_rnglists in GDB, and
>> I happened to notice that clang and gcc do different things when
>> generating rnglists with split DWARF.  I'd like to know if the two
>> behaviors are acceptable, and therefore if we need to make GDB accept
>> both.  Or maybe one of them is not doing things correctly and would need
>> to be fixed.
>>
>> clang generates a .debug_rnglists.dwo section in the .dwo file.  Any
>> DW_FORM_rnglistx attribute in the DWO refers to that section.  That
>> section is not shared with any other object, so DW_AT_rnglists_base is
>> never involved for these attributes.  Note that there might still be a
>> DW_AT_rnglists_base on the DW_TAG_skeleton_unit, in the linked file,
>> used if the skeleton itself has an attribute of form DW_FORM_rnglistx.
>> This rnglist would be found in a .debug_rnglists section in the linked
>> file, shared with the other units of the linked file.
>>
>> gcc generates a single .debug_rnglists in the linked file and no
>> .debug_rnglists.dwo in the DWO files.  So when an attribute has form
>> DW_FORM_rnglistx in a DWO file, I presume we need to do the lookup in
>> the .debug_rnglists section in the linked file, using the
>> DW_AT_rnglists_base attribute found in the corresponding skeleton unit.
>> This looks vaguely similar to how it was done pre-DWARF 5, with
>> DW_AT_GNU_ranges_base.
>>
>> So, is gcc wrong here?  I don't see anything in the DWARF 5 spec
>> prohibiting to do it like gcc does, but clang's way of doing it sounds
>> more in-line with the intent of what's described in the DWARF 5 spec.
>> So I wonder if it's maybe an oversight or a misunderstanding between the
>> two compilers.
>
> I think I would have asked the question the other way around :) The
> spec explicitly describes rnglists_base (and loclists_base) as a way
> to reference ranges (loclists) through the index table, so that the
> only relocation you need is in the (skeleton) DIE.

I presume you are referring to this non-normative text in section 2.17.3?

    This range list representation, the rnglist class, and the related
    DW_AT_rnglists_base attribute are new in DWARF Version 5. Together
    they eliminate most or all of the object language relocations
    previously needed for range lists.

What I understand from this is that the rnglist class and the
DW_AT_rnglists_base attribute help reduce the number of relocations in
the non-split case (they remove the need for relocations from
DW_AT_ranges attribute values in .debug_info to .debug_rnglists).  I
don't read it as saying anything about where to put the rnglist data in
the split-unit case.
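
To make the lookup concrete, here is a minimal sketch of how a
DW_FORM_rnglistx index could be turned into a section offset.  It
assumes 32-bit DWARF 5 and little-endian data; resolve_rnglistx and
build_demo_section are hypothetical helpers for illustration, not GDB
code.  With clang's layout the base would be the end of the header of
.debug_rnglists.dwo; with gcc's layout it would be the
DW_AT_rnglists_base value taken from the skeleton unit.

```python
import struct

# Size of a 32-bit DWARF 5 .debug_rnglists header: unit_length (4),
# version (2), address_size (1), segment_selector_size (1),
# offset_entry_count (4).
RNGLISTS_HEADER_SIZE_32 = 12


def resolve_rnglistx(section: bytes, rnglists_base: int, index: int) -> int:
    """Return the section offset of range list number `index`.

    `rnglists_base` is the offset of the first entry of the offsets
    table.  Each table entry is relative to that same location.
    Assumption: 32-bit DWARF (4-byte offsets), little-endian.
    """
    (relative,) = struct.unpack_from("<I", section, rnglists_base + 4 * index)
    return rnglists_base + relative


def build_demo_section() -> bytes:
    """Build a tiny synthetic contribution holding two range lists."""
    # Two DW_RLE_offset_pair (0x04) entries, each followed by
    # DW_RLE_end_of_list (0x00).  The operands are made-up values.
    list0 = bytes([0x04, 0x00, 0x10, 0x00])
    list1 = bytes([0x04, 0x20, 0x30, 0x00])
    # The offsets table has two 4-byte entries, so the first list
    # starts 8 bytes after the base and the second 12 bytes after it.
    offsets = struct.pack("<II", 8, 12)
    body = offsets + list0 + list1
    # unit_length counts everything after the unit_length field itself.
    unit_length = (RNGLISTS_HEADER_SIZE_32 - 4) + len(body)
    header = struct.pack("<IHBBI", unit_length, 5, 8, 0, 2)
    return header + body


if __name__ == "__main__":
    section = build_demo_section()
    for i in range(2):
        off = resolve_rnglistx(section, RNGLISTS_HEADER_SIZE_32, i)
        print(f"rnglist index {i} -> section offset {off:#x}")
```

The only thing that differs between the two compilers in this sketch is
which section and which base get passed in; the index arithmetic is the
same.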

> But the rnglists
> (loclists) themselves can still use relocations. A large part of them
> is non-shared addresses, so using indexes (into the .debug_addr
> addr_base) would simply be extra overhead. The relocations will
> disappear once linked, but the index tables won't.
>
> As an alternative, if you like to minimize the amount of debug data in
> the main object file, the spec also describes how to put a whole
> .debug_rnglists.dwo (or .debug_loclists.dwo) in the split dwarf
> file. Then you cannot use all entry encodings and do need to use an
> .debug_addr index to refer to any addresses in that case. So the
> relocations are still there, you just refer to them through an extra
> index indirection.
>
> So I believe both encodings are valid according to the spec. It just
> depends on what you are optimizing for, small main object file size or
> smallest encoding with least number of indirections.

So, if I understand correctly, gcc's way of doing things (putting all
the rnglists in a common .debug_rnglists section) reduces the overall
size of the debug info, since the rnglists can use the direct-address
entry kinds (e.g. DW_RLE_start_end) rather than the indexed ones
(e.g. DW_RLE_startx_endx).  But this comes at the expense of a lot of
relocations in the rnglists themselves, since they refer to addresses
directly.
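
As a rough illustration of that trade-off, the sketch below encodes the
same range both ways and compares the sizes.  The opcode values are the
DWARF 5 ones; the helper names, the 8-byte address size, and the example
addresses and indices are assumptions made for the example.

```python
DW_RLE_startx_endx = 0x02  # operands: two ULEB128 indices into .debug_addr
DW_RLE_start_end = 0x06    # operands: two full target addresses


def uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_start_end(start: int, end: int, address_size: int = 8) -> bytes:
    """Direct form: both addresses live in the range list itself,
    so each one needs a relocation in the rnglists section."""
    return (bytes([DW_RLE_start_end])
            + start.to_bytes(address_size, "little")
            + end.to_bytes(address_size, "little"))


def encode_startx_endx(start_index: int, end_index: int) -> bytes:
    """Indexed form: the range list holds only .debug_addr indices,
    so the relocations sit on the .debug_addr entries instead."""
    return (bytes([DW_RLE_startx_endx])
            + uleb128(start_index)
            + uleb128(end_index))


if __name__ == "__main__":
    direct = encode_start_end(0x401000, 0x401040)
    indexed = encode_startx_endx(0, 1)
    print("direct entry: ", len(direct), "bytes, 2 relocations in the rnglist")
    print("indexed entry:", len(indexed), "bytes, plus 2 .debug_addr slots")
```

The indexed entry is smaller on its own, but each index needs an
address-sized slot in .debug_addr unless that slot is already there for
some other attribute, which is the extra indirection Mark describes.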

I thought that the main point of split-units was to reduce the number of
relocations processed by the linker and data moved around by the linker,
to reduce link time and provide a better edit-build-debug cycle.  Is
that the case?

Anyway, regardless of the intent, the spec should ideally be clear about
that so we don't have to guess.

> P.S. I am really interested in these interpretations of DWARF, but I
> don't really follow the gdb implementation details very much. Could we
> maybe move discussions like these from the -patches list to the main
> gdb (or gcc) mailinglist?

Sure, I added gdb@ and gcc@.  I also left gdb-patches@ so that it's
possible to follow the discussion there.

Simon
