public inbox for dwz@sourceware.org
* build-ids, .debug_sup and other IDs (Was: [PATCH] Handle DWARF 5 separate debug sections)
       [not found] <20210221231810.1062175-1-tom@tromey.com>
@ 2021-02-24 15:07 ` Mark Wielaard
  2021-02-24 17:00   ` Nick Clifton
  2021-02-24 20:11   ` build-ids, .debug_sup and other IDs Tom Tromey
  0 siblings, 2 replies; 13+ messages in thread
From: Mark Wielaard @ 2021-02-24 15:07 UTC (permalink / raw)
  To: Tom Tromey; +Cc: gdb-patches, elfutils-devel, dwz

Hi,

I am adding the elfutils and dwz mailing lists to CC because I think
you stumbled upon a silent assumption debuginfod makes, which would
be good to make explicit (or maybe change?).

Context is that dwz 0.15 (still not released yet) can now produce
standardized DWARF Supplementary Files using a .debug_sup section
when using --dwarf-5 -m multifile. See this bug report:
https://sourceware.org/bugzilla/show_bug.cgi?id=27440

The full patch to support that in gdb is here:
https://sourceware.org/pipermail/gdb-patches/2021-February/176508.html

But what I would like to discuss is this part of Tom's email:

On Sun, Feb 21, 2021 at 04:18:10PM -0700, Tom Tromey wrote:
> DWARF 5 standardized the .gnu_debugaltlink section that dwz emits in
> multi-file mode.  This is handled via some new forms, and a new
> .debug_sup section.
> 
> This patch adds support for this to gdb.  It is largely
> straightforward, I think, though one oddity is that I chose not to
> have this code search the system build-id directories for the
> supplementary file.  My feeling was that, while it makes sense for a
> distro to unify the build-id concept with the hash stored in the
> .debug_sup section, there's no intrinsic need to do so.

Tom is correct. Technically a supplemental (alt) file id isn't a
build-id. But it does smell like one. It is a large globally unique
identifier that can be expressed as a hex string. And the GNU extension
defining alt files for DWARF < 5 (still the default with DWARF 5,
because no consumer supports .debug_sup yet) will put
it in a .note.gnu.build-id NOTE section as GNU_BUILD_ID.

This has the nice side-effect that a distro will most likely make
it available as /usr/lib/debug/.build-id/xx/yyyy.debug file. And
that "build-id" is how you request it from debuginfod (you request
/buildid/BUILDID/debuginfo).
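
The two lookups described above can be sketched roughly as follows. This
is only an illustration of the path and URL layout mentioned in this
message; the function names and the server URL in the usage note are
made up for the example and are not part of any tool.

```python
from pathlib import PurePosixPath


def build_id_debug_path(build_id_hex, root="/usr/lib/debug"):
    """Map a hex ID to <root>/.build-id/xx/yyyy.debug (xx = first byte)."""
    hex_id = build_id_hex.lower()
    valid = len(hex_id) >= 4 and len(hex_id) % 2 == 0 and all(
        c in "0123456789abcdef" for c in hex_id
    )
    if not valid:
        raise ValueError("expected an even-length hex string of at least two bytes")
    return str(PurePosixPath(root) / ".build-id" / hex_id[:2] / (hex_id[2:] + ".debug"))


def debuginfod_url(server, build_id_hex, artifact="debuginfo"):
    """Build a /buildid/BUILDID/<artifact> request URL for a given server."""
    return f"{server.rstrip('/')}/buildid/{build_id_hex.lower()}/{artifact}"
```

For example, build_id_debug_path("abcd1234") gives
/usr/lib/debug/.build-id/ab/cd1234.debug, and
debuginfod_url("https://debuginfod.example.org", "abcd1234") gives the
matching /buildid/abcd1234/debuginfo URL on that (hypothetical) server.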

Now technically that is cheating and confusing two concepts,
the build-id and the supplemental file id. But I was personally assuming
we would extend it also to other things like dwo IDs (which are
again almost identical globally unique identifiers for files).
There, one question would be whether you would get the .dwo file or the
.dwp file in which it was embedded (I would vote for the second).

One consequence of conflating all these IDs is that it isn't immediately
clear what a debuginfod request for /buildid/BUILDID/executable should
return (probably nothing). Or whether /buildid/BUILDID/source/SOURCE/FILE
requests also work (or should work) for IDs other than build-ids.

Any opinions on whether we should split these concepts out and introduce
separate request mechanisms per ID-kind, or simply assume a globally
unique id is globally unique and we just clarify what it means to use
a build-id, sup_checksum or dwo_id together with a request for an
executable, debuginfo or source/file?

Thanks,

Mark


* Re: build-ids, .debug_sup and other IDs (Was: [PATCH] Handle DWARF 5 separate debug sections)
  2021-02-24 15:07 ` build-ids, .debug_sup and other IDs (Was: [PATCH] Handle DWARF 5 separate debug sections) Mark Wielaard
@ 2021-02-24 17:00   ` Nick Clifton
  2021-02-24 17:21     ` Mark Wielaard
  2021-02-24 20:11   ` build-ids, .debug_sup and other IDs Tom Tromey
  1 sibling, 1 reply; 13+ messages in thread
From: Nick Clifton @ 2021-02-24 17:00 UTC (permalink / raw)
  To: Mark Wielaard, Tom Tromey; +Cc: dwz, elfutils-devel, gdb-patches

Hi Mark,

> Context is that dwz 0.15 (still not released yet) can now produce
> standardized DWARF Supplementary Files using a .debug_sup section
> when using --dwarf-5 -m multifile. See this bug report:
> https://sourceware.org/bugzilla/show_bug.cgi?id=27440

Is there somewhere that I can lay my hands on a file containing a
.debug_sup section and its corresponding supplementary file ?  I
would like to test the binutils to make sure that they can support
them too.

Cheers
   Nick



* Re: build-ids, .debug_sup and other IDs (Was: [PATCH] Handle DWARF 5 separate debug sections)
  2021-02-24 17:00   ` Nick Clifton
@ 2021-02-24 17:21     ` Mark Wielaard
  2021-02-25 17:52       ` Nick Clifton
  0 siblings, 1 reply; 13+ messages in thread
From: Mark Wielaard @ 2021-02-24 17:21 UTC (permalink / raw)
  To: Nick Clifton; +Cc: Tom Tromey, dwz, elfutils-devel, gdb-patches

Hi Nick,

On Wed, Feb 24, 2021 at 05:00:14PM +0000, Nick Clifton wrote:
> > Context is that dwz 0.15 (still not released yet) can now produce
> > standardized DWARF Supplementary Files using a .debug_sup section
> > when using --dwarf-5 -m multifile. See this bug report:
> > https://sourceware.org/bugzilla/show_bug.cgi?id=27440
> 
> Is there somewhere that I can lay my hands on a file containing a
> .debug_sup section and its corresponding supplementary file ?  I
> would like to test the binutils to make sure that they can support
> them too.

Currently you need dwz git trunk (hopefully we will do a real release
by Monday?). The following will get you a .sup file for dwz itself:

$ git clone git://sourceware.org/git/dwz.git
$ cd dwz
$ ./configure
$ make
$ cp dwz one
$ cp dwz two
$ dwz --dwarf-5 -m sup one two

If you already process .gnu_debugaltlink then processing the
.debug_sup shouldn't be too hard. Just don't expect there to be a
.note.gnu.build-id. Also note that instead of DW_FORM_GNU_ref_alt and
DW_FORM_GNU_strp_alt the one and two files will contain
DW_FORM_ref_sup4 and DW_FORM_strp_sup.

The formal description of the .debug_sup section can be found in
section 7.3.6 DWARF Supplementary Object Files
http://dwarfstd.org/doc/DWARF5.pdf
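
As a rough guide to what a consumer has to read, here is a minimal
sketch of decoding the raw contents of a .debug_sup section, assuming
the field order given in that section of the standard (a 2-byte version,
a 1-byte is_supplementary flag, a NUL-terminated file name, a ULEB128
checksum length, then the checksum bytes). The function names and the
little-endian default are choices made for this example, and getting the
section bytes out of the ELF file is left out.

```python
def read_uleb128(data, pos):
    """Decode an unsigned LEB128 value at data[pos:]; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def parse_debug_sup(data, little_endian=True):
    """Parse the raw bytes of a .debug_sup section into a dict."""
    version = int.from_bytes(data[0:2], "little" if little_endian else "big")
    is_supplementary = bool(data[2])
    name_end = data.index(b"\x00", 3)  # file name is NUL-terminated
    filename = data[3:name_end].decode("utf-8")
    length, pos = read_uleb128(data, name_end + 1)
    checksum = data[pos:pos + length]
    if len(checksum) != length:
        raise ValueError("truncated sup_checksum")
    return {
        "version": version,
        "is_supplementary": is_supplementary,
        "sup_filename": filename,
        "sup_checksum": checksum.hex(),
    }
```

In the referring files (one and two above) is_supplementary should be 0
and the file name should point at the supplementary file; in the
supplementary file itself the flag should be 1 and the name empty.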

Cheers,

Mark


* Re: build-ids, .debug_sup and other IDs
  2021-02-24 15:07 ` build-ids, .debug_sup and other IDs (Was: [PATCH] Handle DWARF 5 separate debug sections) Mark Wielaard
  2021-02-24 17:00   ` Nick Clifton
@ 2021-02-24 20:11   ` Tom Tromey
  2021-02-25 16:42     ` Frank Ch. Eigler
  1 sibling, 1 reply; 13+ messages in thread
From: Tom Tromey @ 2021-02-24 20:11 UTC (permalink / raw)
  To: Mark Wielaard; +Cc: Tom Tromey, dwz, elfutils-devel, gdb-patches

>>>>> "Mark" == Mark Wielaard <mark@klomp.org> writes:

>> This patch adds support for this to gdb.  It is largely
>> straightforward, I think, though one oddity is that I chose not to
>> have this code search the system build-id directories for the
>> supplementary file.  My feeling was that, while it makes sense for a
>> distro to unify the build-id concept with the hash stored in the
>> .debug_sup section, there's no intrinsic need to do so.

Mark> Any opinions on whether we should split these concepts out and introduce
Mark> separate request mechanisms per ID-kind, or simply assume a globally
Mark> unique id is globally unique and we just clarify what it means to use
Mark> a build-id, sup_checksum or dwo_id together with a request for an
Mark> executable, debuginfo or source/file?

FWIW I looked a little at unifying these.  For example,
bfdopncls.c:bfd_get_alt_debug_link_info could look at both the build-id
and .debug_sup.

But, this seemed a bit weird.  What if both appear and they are
different?  Then a single API isn't so great -- you want to check the ID
corresponding to whatever was in the original file.
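
To make that concrete, here is a small sketch of the check, under the
assumption that the consumer has already extracted whichever IDs the alt
file carries. The dictionary keys and the function name are labels
invented for this example; this is not the BFD or gdb code.

```python
# Which ID the alt file is expected to carry for each kind of link found in
# the referring file. The kind names are labels made up for this sketch.
ID_FOR_LINK = {
    "gnu_debugaltlink": "build_id",  # GNU extension: ID is in .note.gnu.build-id
    "debug_sup": "sup_checksum",     # DWARF 5: ID is in .debug_sup
}


def alt_file_matches(link_kind, wanted_id, alt_file_ids):
    """Compare only the ID that corresponds to the link in the referrer."""
    found = alt_file_ids.get(ID_FOR_LINK[link_kind])
    return found is not None and found == wanted_id
```

With an alt file that carries both a build-id and a different
sup_checksum, a referrer using .debug_sup is matched against the
sup_checksum only, and a referrer using .gnu_debugaltlink against the
build-id only, so the two IDs never need to agree.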

Probably I should have stuck some of the new code into BFD though.
It's not too late to do that at least.

I suppose a distro can ensure that the IDs are always equal.  Maybe
debuginfod could also just require this.

Tom


* Re: build-ids, .debug_sup and other IDs
  2021-02-24 20:11   ` build-ids, .debug_sup and other IDs Tom Tromey
@ 2021-02-25 16:42     ` Frank Ch. Eigler
  2021-02-25 16:48       ` Jakub Jelinek
  2021-03-02 22:04       ` Tom Tromey
  0 siblings, 2 replies; 13+ messages in thread
From: Frank Ch. Eigler @ 2021-02-25 16:42 UTC (permalink / raw)
  To: Tom Tromey; +Cc: Mark Wielaard, dwz, elfutils-devel, gdb-patches

Hi -

> FWIW I looked a little at unifying these.  For example,
> bfdopncls.c:bfd_get_alt_debug_link_info could look at both the build-id
> and .debug_sup.
> 
> But, this seemed a bit weird.  What if both appear and they are
> different?  Then a single API isn't so great -- you want to check the ID
> corresponding to whatever was in the original file.

If both appear and are different, can we characterize the elf file as
malformed?  Does our current tooling produce such files?  If it's an
abnormality (requires special elf manipulation or whatever), we could
avoid bending backward for this case, and tune the tools to the
Normal.

> [...]
> I suppose a distro can ensure that the IDs are always equal.

Via debugedit for example?

> Maybe debuginfod could also just require this.

Or debuginfod could export the content under -both- IDs, if there were
two valid candidates, and just go with the flow.  Let the clients
choose which ID they prefer to look up by.


- FChE



* Re: build-ids, .debug_sup and other IDs
  2021-02-25 16:42     ` Frank Ch. Eigler
@ 2021-02-25 16:48       ` Jakub Jelinek
  2021-02-25 17:04         ` Frank Ch. Eigler
  2021-03-02 22:04       ` Tom Tromey
  1 sibling, 1 reply; 13+ messages in thread
From: Jakub Jelinek @ 2021-02-25 16:48 UTC (permalink / raw)
  To: Frank Ch. Eigler
  Cc: Tom Tromey, Mark Wielaard, dwz, elfutils-devel, gdb-patches

On Thu, Feb 25, 2021 at 11:42:45AM -0500, Frank Ch. Eigler via Dwz wrote:
> > FWIW I looked a little at unifying these.  For example,
> > bfdopncls.c:bfd_get_alt_debug_link_info could look at both the build-id
> > and .debug_sup.
> > 
> > But, this seemed a bit weird.  What if both appear and they are
> > different?  Then a single API isn't so great -- you want to check the ID
> > corresponding to whatever was in the original file.
> 
> If both appear and are different, can we characterize the elf file as
> malformed?

Unsure, the DWARF spec only talks about .debug_sup, the NOTE is a GNU
extension.

> Does our current tooling produce such files?  If it's an

dwz without --dwarf-5 produces .gnu_debugaltlink in the referrers and
.note.gnu.build-id in the supplemental object file.
For dwz --dwarf-5, if it produced a .note.gnu.build-id, it would produce
the same one, but I thought that if I produced that, then consumers could
keep using that instead of .debug_sup which is the only thing defined
in the standard, so in the end dwz --dwarf-5 only produces .debug_sup
on both the referrers side and on the side of supplemental object file
as DWARF specifies.

	Jakub



* Re: build-ids, .debug_sup and other IDs
  2021-02-25 16:48       ` Jakub Jelinek
@ 2021-02-25 17:04         ` Frank Ch. Eigler
  2021-03-02 22:05           ` Tom Tromey
  0 siblings, 1 reply; 13+ messages in thread
From: Frank Ch. Eigler @ 2021-02-25 17:04 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Tom Tromey, Mark Wielaard, dwz, elfutils-devel, gdb-patches

Hi -

> For dwz --dwarf-5, if it produced a .note.gnu.build-id, it would produce
> the same one, but I thought that if I produced that, then consumers could
> keep using that instead of .debug_sup which is the only thing defined
> in the standard, so in the end dwz --dwarf-5 only produces .debug_sup
> on both the referrers side and on the side of supplemental object file
> as DWARF specifies.

Right, but build-ids are still in normal binaries -- just not the
dwz-commonized files created by "dwz --dwarf-5"?  So our toolchains
still process build-ids routinely for all the other uses.  By omitting
the build-id on the dwz-generated files, we're forcing a flag day on
all our consumer tools.  (Does dwz'd dwarf5 even work on gdb
etc. now?)  ISTM tool backward compatibility is more important, so
I would suggest dwz generate -both- identifiers.

- FChE



* Re: build-ids, .debug_sup and other IDs (Was: [PATCH] Handle DWARF 5 separate debug sections)
  2021-02-24 17:21     ` Mark Wielaard
@ 2021-02-25 17:52       ` Nick Clifton
  2021-06-14  5:52         ` Matt Schulte
  0 siblings, 1 reply; 13+ messages in thread
From: Nick Clifton @ 2021-02-25 17:52 UTC (permalink / raw)
  To: Mark Wielaard; +Cc: Tom Tromey, dwz, elfutils-devel, gdb-patches

Hi Mark,

> $ git clone git://sourceware.org/git/dwz.git
> $ cd dwz
> $ ./configure
> $ make
> $ cp dwz one
> $ cp dwz two
> $ dwz --dwarf-5 -m sup one two

Thanks.  Using those files as a guide I have added some initial support
for displaying and following .debug_sup sections to readelf and objdump.

Cheers
   Nick



* Re: build-ids, .debug_sup and other IDs
  2021-02-25 16:42     ` Frank Ch. Eigler
  2021-02-25 16:48       ` Jakub Jelinek
@ 2021-03-02 22:04       ` Tom Tromey
  1 sibling, 0 replies; 13+ messages in thread
From: Tom Tromey @ 2021-03-02 22:04 UTC (permalink / raw)
  To: Frank Ch. Eigler
  Cc: Tom Tromey, Mark Wielaard, dwz, elfutils-devel, gdb-patches

>> But, this seemed a bit weird.  What if both appear and they are
>> different?  Then a single API isn't so great -- you want to check the ID
>> corresponding to whatever was in the original file.

Frank> If both appear and are different, can we characterize the elf file as
Frank> malformed?

Not really; nothing specifies that these must be the same.

Frank> Or debuginfod could export the content under -both- IDs, if there were
Frank> two valid candidates, and just go with the flow.  Let the clients
Frank> choose which ID they prefer to look up by.

There's a namespace problem here.  You could, in theory, have executable
A with build id AAAA, and also executable B with debug_sup id also AAAA.
This could be fixed with some kind of query parameter.  It would be easy
on the gdb side to supply this information.
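
A toy model of the problem and of the "query parameter" fix, purely as an
illustration: the class, the kind labels and the paths are invented for
this sketch and say nothing about how debuginfod stores its index.

```python
class IdIndex:
    """Toy index keyed by (kind, hex id) rather than by hex id alone."""

    def __init__(self):
        self._entries = {}

    def add(self, kind, hex_id, path):
        self._entries.setdefault((kind, hex_id.lower()), []).append(path)

    def lookup(self, hex_id, kind=None):
        hex_id = hex_id.lower()
        if kind is not None:
            # The client said which namespace it means: no ambiguity.
            return list(self._entries.get((kind, hex_id), []))
        # No kind given: every namespace is searched, so two unrelated
        # files that share an ID value are both returned.
        return [
            path
            for (_, entry_id), paths in self._entries.items()
            if entry_id == hex_id
            for path in paths
        ]


index = IdIndex()
index.add("build-id", "AAAA", "/srv/debug/A.debug")
index.add("sup-checksum", "AAAA", "/srv/debug/B.sup")
```

Here index.lookup("AAAA") returns both files, while
index.lookup("AAAA", kind="sup-checksum") returns only B's supplementary
file.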

Tom


* Re: build-ids, .debug_sup and other IDs
  2021-02-25 17:04         ` Frank Ch. Eigler
@ 2021-03-02 22:05           ` Tom Tromey
  0 siblings, 0 replies; 13+ messages in thread
From: Tom Tromey @ 2021-03-02 22:05 UTC (permalink / raw)
  To: Frank Ch. Eigler
  Cc: Jakub Jelinek, Tom Tromey, Mark Wielaard, dwz, elfutils-devel,
	gdb-patches

Frank> (Does dwz'd dwarf5 even work on gdb
Frank> etc. now?)

It doesn't; this thread started because I sent a patch to change gdb to
read .debug_sup.  This hasn't landed yet.

Tom


* Re: build-ids, .debug_sup and other IDs (Was: [PATCH] Handle DWARF 5 separate debug sections)
  2021-02-25 17:52       ` Nick Clifton
@ 2021-06-14  5:52         ` Matt Schulte
  2021-06-14 12:49           ` Frank Ch. Eigler
  0 siblings, 1 reply; 13+ messages in thread
From: Matt Schulte @ 2021-06-14  5:52 UTC (permalink / raw)
  To: Nick Clifton; +Cc: Mark Wielaard, dwz, Tom Tromey, elfutils-devel, gdb-patches

Hi Mark,

My apologies for bringing this up so late. I was just re-reading this thread
while looking at how to find a .dwp for a given binary.

> But I was personally assuming we would extend it also to other things like
> dwo IDs (which are again almost identical globally unique identifiers for
> files)

I'm concerned about using dwo IDs to index debuginfod. They are only 64 bits and
there will be many more dwo IDs than build ids or supplemental file ids since
there is one per compile unit.  Assuming dwo IDs are randomly distributed, once we
have ~600,000,000 dwo IDs we have a 1% chance of a collision. ~600,000,000 =
sqrt(2 * 2^64 * 0.01) (I think I did that math right but forgive me if not).
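
The estimate can be checked with the usual birthday-bound approximation.
This is a sketch that assumes IDs are uniformly random; the function
names are made up for the example.

```python
import math


def ids_for_collision_probability(bits, p):
    """Approximate number of random `bits`-bit IDs giving collision chance p.

    Uses n ~= sqrt(2 * 2^bits * ln(1 / (1 - p))), which for small p is
    close to the sqrt(2 * 2^bits * p) form used above.
    """
    return math.sqrt(2 * 2**bits * math.log(1 / (1 - p)))


def collision_probability(bits, n):
    """Approximate chance of at least one collision among n random IDs."""
    return -math.expm1(-n * (n - 1) / (2 * 2**bits))
```

ids_for_collision_probability(64, 0.01) comes out at a little over
600 million, and collision_probability(64, 600_000_000) at just under
1%, so the figure above holds under these assumptions.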

Maybe that's an ok number? (I tried to estimate the number of compile units in
one distro's release, but do not have a good way of doing that quickly)

What about using `/buildid/BUILDID/dwp` instead? This is not a perfect solution,
since (currently) no one puts the build-id into the *.dwp file, but it does get
around this collision problem.

-Matt


* Re: build-ids, .debug_sup and other IDs (Was: [PATCH] Handle DWARF 5 separate debug sections)
  2021-06-14  5:52         ` Matt Schulte
@ 2021-06-14 12:49           ` Frank Ch. Eigler
  2021-06-14 15:18             ` Matt Schulte
  0 siblings, 1 reply; 13+ messages in thread
From: Frank Ch. Eigler @ 2021-06-14 12:49 UTC (permalink / raw)
  To: Matt Schulte
  Cc: Nick Clifton, elfutils-devel, Mark Wielaard, dwz, Tom Tromey,
	gdb-patches

Hi -

> I'm concerned about using dwo IDs to index debuginfod. They are only
> 64 bits and there will be many more dwo IDs than build ids or
> supplemental file ids [...]

AIUI, -gsplit-dwarf is more suitable for development/scratch builds
than for distro binaries.  If distros agree, then I would not expect
.dwo files to show up in distro-wide debuginfod services, but rather
within developers' own build trees.  Then debuginfod indexing
collisions would only be a risk within a particular local set of trees
(if serviced by a local debuginfod), rather than distro wide or wider.

> What about using `/buildid/BUILDID/dwp` instead? This is not a
> perfect solution, since (currently) no one puts the build-id into
> the *.dwp file, but it does get around this collision problem.

The hypothetical problem is collision between dwo/dwp files, not
between dwo/dwp and normal buildid dwarf files, right?  In that case,
are you talking about two levels of indexing (buildid of final linked
binary + dwo_id)?  That would resemble the indexing work required from
debuginfod to match up binaries with their source files, plus binaries
with dwz supplemental files.

- FChE



* Re: build-ids, .debug_sup and other IDs (Was: [PATCH] Handle DWARF 5 separate debug sections)
  2021-06-14 12:49           ` Frank Ch. Eigler
@ 2021-06-14 15:18             ` Matt Schulte
  0 siblings, 0 replies; 13+ messages in thread
From: Matt Schulte @ 2021-06-14 15:18 UTC (permalink / raw)
  To: Frank Ch. Eigler
  Cc: Nick Clifton, elfutils-devel, Mark Wielaard, dwz, Tom Tromey,
	gdb-patches

Thanks for the thoughts!

> AIUI, -gsplit-dwarf is more suitable for development/scratch builds
> than for distro binaries.  If distros agree, then I would not expect
> .dwo files to show up in distro-wide debuginfod services, but rather
> within developers' own build trees.

That's a good point. My concerns are only valid if distros decide to
start building packages using -gsplit-dwarf and dwp to package up
the .dwo files into one .dwp file.

I also agree that split dwarfs (split dwarves?) are more suitable for
local builds than for distro builds. The one advantage I can think
of that split dwarfs offer distro binaries is a faster build for
larger packages (since dwp does not do all the relocations the
linker would normally do). But I don't know enough about building
packages to say what will happen in the future.

> The hypothetical problem is collision between dwo/dwp files, not
> between dwo/dwp and normal buildid dwarf files, right?

That's correct.

> In that case, are you talking about two levels of indexing (buildid
> of final linked binary + dwo_id)?

I was suggesting one level of indexing. The buildid of the final linked
binary would be used to reference the dwp file directly. This solution
would not work for individual dwo files. For individual dwo files we
could still use the dwo_id as they should only be for local builds.

-Matt

