public inbox for elfutils@sourceware.org
From: Mark Wielaard <mark@klomp.org>
To: "Frank Ch. Eigler" <fche@redhat.com>,
	Romain GEISSLER <romain.geissler@amadeus.com>
Cc: "elfutils-devel@sourceware.org" <elfutils-devel@sourceware.org>,
	 Francois RIGAULT <francois.rigault@amadeus.com>
Subject: Re: Performance issue with systemd-coredump and container process linking 2000 shared libraries.
Date: Thu, 22 Jun 2023 12:59:12 +0200
Message-ID: <9a242e2d1fda9a3eb9972278e8c624f95431ab7f.camel@klomp.org>
In-Reply-To: <20230622024044.GI5772@redhat.com>

Hi Frank,

On Wed, 2023-06-21 at 22:40 -0400, Frank Ch. Eigler wrote:
> For an application that processes these elf/dwarf files sequentially,
> queries for each synthetic solib are going to result in 2000 https-404
> transactions, sans debuginfod caching.  If you're lucky (reusing a
> dwfl object), elfutils may be able to reuse a long-lived https
> connection to a server, otherwise a new https connection might have to
> be spun up for each.  But even with reuse, we're talking about 2000
> pingponging messages.  That will take a handful of minutes of elapsed
> time just by itself.
> 
> If the calling code made these queries in parallel batches, it would
> be much faster overall.

I have been thinking about this too, but I don't know of a good solution
that doesn't negate the (iterative) lazy model that Dwfl uses.

libdwfl tries to do the least amount of work possible so that you don't
"pay" for looking up extra information for Dwfl_Modules (libraries) you
don't care about. So for the use case of getting a backtrace of a
particular thread in that core file (Dwfl) you only fetch/load the
register notes and stack memory from the core file. Only when you
translate those stack addresses to symbols does it try to fetch/load
the symbol tables, and then only for the modules associated with those
addresses, which might involve resolving build-ids to Elf or
separate Dwarf debug files. This is normally an iterative process and
for something like generating a backtrace it often involves just a
handful of Dwfl_Modules (libraries), not all 2000.
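
A minimal sketch of the lazy model described above, for illustration
only. It does not use the libdwfl API: struct module, lookup_module and
backtrace_fetches are hypothetical names, and the "fetch" is just a
counter standing in for loading a symbol table or asking debuginfod.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a Dwfl_Module: an address range plus a
   symbol table that is only "fetched" the first time it is needed.  */
struct module {
  uint64_t start, end;     /* [start, end) address range */
  bool symtab_loaded;      /* set on first symbol lookup */
};

/* Counts the expensive fetches (think: debuginfod round trips).  */
static unsigned fetch_count;

/* Find the module covering ADDR and load its symbol table on first
   use.  Returns the module, or NULL if no module covers ADDR.  */
static struct module *
lookup_module (struct module *mods, size_t nmods, uint64_t addr)
{
  assert (mods != NULL || nmods == 0);
  for (size_t i = 0; i < nmods; i++)
    if (addr >= mods[i].start && addr < mods[i].end)
      {
        if (!mods[i].symtab_loaded)
          {
            mods[i].symtab_loaded = true;   /* pay once, on demand */
            fetch_count++;
          }
        return &mods[i];
      }
  return NULL;
}

/* Resolve every frame address of one backtrace and return how many
   fetches that backtrace triggered.  */
static unsigned
backtrace_fetches (struct module *mods, size_t nmods,
                   const uint64_t *frames, size_t nframes)
{
  fetch_count = 0;
  for (size_t i = 0; i < nframes; i++)
    (void) lookup_module (mods, nmods, frames[i]);
  return fetch_count;
}
```

With 2000 modules and a backtrace whose frames fall into three of them,
backtrace_fetches reports three fetches; the other 1997 modules are
never touched.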

In this case this falls down a bit since the application creates a Dwfl
from a core file and then requests information (the elf file) from all
Dwfl_Modules, so it can get at the package note description for each of
them. As Romain noted it would be really nice if elfutils/libdwfl had a
way to get at the package note description (just like it has for
getting the build-id, which is another loaded ELF note). Such a
function would know it doesn't need to get the full ELF file.
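
To illustrate why the full ELF file is not needed, here is a rough
sketch of scanning a buffer of already-loaded ELF notes for the package
note. It is not an existing elfutils function: find_package_note is a
hypothetical name, and the code assumes host byte order and 4-byte note
alignment. The owner name "FDO" and note type 0xcafe1a7e are the values
used by the systemd package metadata convention.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FDO_PACKAGING_METADATA 0xcafe1a7eu  /* package note type */

/* Round N up to the 4-byte alignment used between note fields.  */
static uint64_t align4 (uint64_t n) { return (n + 3) & ~(uint64_t) 3; }

/* Scan BUF (LEN bytes of ELF notes; host byte order and 4-byte
   alignment are assumed) for the "FDO" package metadata note.
   Returns a pointer to the description and stores its length in
   *DESC_LEN, or returns NULL if the note is absent or truncated.  */
static const char *
find_package_note (const unsigned char *buf, size_t len, size_t *desc_len)
{
  assert (buf != NULL && desc_len != NULL);
  size_t off = 0;
  while (len - off >= 12)
    {
      uint32_t hdr[3];   /* n_namesz, n_descsz, n_type */
      memcpy (hdr, buf + off, sizeof hdr);
      off += sizeof hdr;

      uint64_t name_sz = align4 (hdr[0]);
      uint64_t desc_sz = align4 (hdr[1]);
      uint64_t left = len - off;
      if (name_sz > left || desc_sz > left - name_sz)
        return NULL;     /* truncated note */

      const unsigned char *name = buf + off;
      const unsigned char *desc = name + name_sz;
      if (hdr[2] == FDO_PACKAGING_METADATA
          && hdr[0] == 4 && memcmp (name, "FDO", 4) == 0)
        {
          *desc_len = hdr[1];
          return (const char *) desc;
        }
      off += (size_t) (name_sz + desc_sz);
    }
  return NULL;
}
```

The description of that note is a small JSON document, so a caller
would hand the returned bytes to a JSON parser rather than fetching and
opening the whole ELF file.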

Maybe another solution would be a "get me everything for this Dwfl,
all symbol tables, all elf files, all separate Dwarf debug files, etc."
function so an application can "pay upfront" for not having to fetch
each item lazily? Such a function could then do a "parallel/batched
fetch" through debuginfod.
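
As a back-of-the-envelope sketch of what batching could buy, the
function below models elapsed time as "number of waves times round-trip
latency". The name elapsed_ms, the 100 ms round trip and the batch size
of 50 used in the example are assumptions for illustration, not
measurements; the model ignores server load and bandwidth.

```c
#include <assert.h>

/* Elapsed time, in milliseconds, for N_QUERIES round trips when up to
   BATCH of them are in flight at once.  RTT_MS is the latency of one
   round trip; BATCH == 1 models the sequential case.  */
static unsigned long
elapsed_ms (unsigned long n_queries, unsigned long batch,
            unsigned long rtt_ms)
{
  assert (batch > 0);
  unsigned long waves = (n_queries + batch - 1) / batch;  /* ceiling */
  return waves * rtt_ms;
}
```

Under those assumptions, elapsed_ms (2000, 1, 100) gives 200000 ms
(a bit over three minutes) for the sequential case, while
elapsed_ms (2000, 50, 100) gives 4000 ms.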

Cheers,

Mark
