public inbox for elfutils@sourceware.org
From: "Frank Ch. Eigler" <fche@redhat.com>
To: Milian Wolff <mail@milianw.de>
Cc: elfutils-devel@sourceware.org
Subject: Re: caching failed lookups of debuginfo?
Date: Fri, 8 Apr 2022 16:05:27 -0400	[thread overview]
Message-ID: <20220408200527.GC23295@redhat.com> (raw)
In-Reply-To: <4448277.fIUe8AKecr@milian-workstation>

Hi -

> another debuginfod-related question, unrelated to the other thread I
> started earlier today. In a work branch I have ported my heaptrack profiler
> over to elfutils. I then ran the analyzer that uses elfutils (and thus
> debuginfod internally via dwfl) on a recorded data file to have it download
> all the debug info files it can find.

Nice.
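
For anyone reproducing that setup: the debuginfod side of a dwfl-based tool
is driven entirely by environment variables, so no code changes are needed
to turn downloads on. A minimal sketch (the tool name is a placeholder, and
pick whatever server URLs apply to you):

```
# sketch only: point DEBUGINFOD_URLS at the server(s) you trust
export DEBUGINFOD_URLS="https://debuginfod.archlinux.org/"
export DEBUGINFOD_PROGRESS=1   # print download progress to stderr
export DEBUGINFOD_VERBOSE=1    # log every URL the client tries
./your-dwfl-based-analyzer recorded-data-file
```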


> These negative lookups are not cached, meaning that rerunning the same
> process using dwfl and debuginfod on the same data always incurs a
> significant slowdown, as we look again and again for something that's not
> there. Each lookup takes roughly 200ms before I learn the data is not on
> the server.

That's not correct as of elfutils 0.184 (commit 5f72c51a7e5c0), with some
more recent tweaks in commit 7d64173fb11c6.
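
Since those changes, a failed (404) lookup is remembered in the client-side
cache, and the retention interval can be tuned through a cache_miss_s file
there. Roughly like this, assuming the default cache location (the file name
and the 600s default are worth double-checking against your installed
debuginfod docs):

```
# sketch, assuming the default client cache location
cache="${DEBUGINFOD_CACHE_PATH:-${XDG_CACHE_HOME:-$HOME/.cache}/debuginfod_client}"
cat "$cache/cache_miss_s"          # seconds a failed lookup is remembered (default 600)
echo 3600 > "$cache/cache_miss_s"  # e.g. remember misses for an hour

# quick check with the standalone client: the second query for a missing
# build-id should come back immediately from the local miss cache
debuginfod-find debuginfo 85766e9d8458b16e9c7ce6e07c712c02b8471dbc || true
debuginfod-find debuginfo 85766e9d8458b16e9c7ce6e07c712c02b8471dbc || true
```

With that in place, rerunning the analysis within the interval should skip
the ~200ms round trips for build-ids the server has already said it does
not have.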

- FChE



> What's worse, I'm seeing multiple lookups for the same buildid *within the 
> same process*. I.e.:
> 
> ```
> export DEBUGINFOD_VERBOSE=1
> ./heaptrack_interpret ... |& egrep "^url 0 https" | sort | uniq -c | sort
> ...
>       6 url 0 https://debuginfod.archlinux.org/buildid/7f4b16b4b407cbae2d7118d6f99610e29a18a56a/debuginfo
>       8 url 0 https://debuginfod.archlinux.org/buildid/c09c6f50f6bcec73c64a0b4be77eadb8f7202410/debuginfo
>      14 url 0 https://debuginfod.archlinux.org/buildid/85766e9d8458b16e9c7ce6e07c712c02b8471dbc/debuginfo
> ```
> 
> Here, we are paying roughly `14 * 0.2s = 2.8s` just for a single library.
> 
> Can we find a way to improve this situation generically? I would
> personally even be OK with caching the 404 error case locally for some time
> (say, one hour or one day or ...). Then at least we would pay this cost at
> most once per library, and not multiple times, and rerunning the analysis a
> second time would become much faster again.
> 
> Was there a deliberate decision against caching negative server-side lookups?
> 
> Thanks
> -- 
> Milian Wolff
> mail@milianw.de
> http://milianw.de




Thread overview: 11+ messages
2022-04-08 19:58 Milian Wolff
2022-04-08 20:05 ` Frank Ch. Eigler [this message]
2022-04-08 20:45   ` Milian Wolff
2022-04-08 20:59     ` Mark Wielaard
2022-04-08 21:08       ` Milian Wolff
2022-04-08 21:34         ` Aaron Merey
2022-04-08 21:56           ` Milian Wolff
2022-04-08 22:21             ` Mark Wielaard
2022-04-08 22:23             ` Milian Wolff
2022-04-08 22:40               ` Mark Wielaard
2022-04-08 22:54                 ` Aaron Merey
