public inbox for elfutils@sourceware.org
 help / color / mirror / Atom feed
* caching failed lookups of debuginfo?
@ 2022-04-08 19:58 Milian Wolff
  2022-04-08 20:05 ` Frank Ch. Eigler
  0 siblings, 1 reply; 11+ messages in thread
From: Milian Wolff @ 2022-04-08 19:58 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 3425 bytes --]

Hey all,

another debuginfod related question, but unrelated to the other thread I 
started earlier today. In a work branch I have ported my heaptrack profiler 
over to elfutils. I have then run the analyzer that uses elfutils (and thus 
debuginfod internally via dwfl) on a recorded data file to have it download 
all the debug info files it can find.

Now, on consecutive runs, nothing should be downloaded anymore. But I notice 
that there is still a huge slowdown incurred when enabled `DEBUGINFOD_URLS`:

With DEBUGINFOD_URLS set:
```
perf stat ./heaptrack_interpret ...
          5,765.36 msec task-clock                #    0.707 CPUs utilized          
...
       8.154873275 seconds time elapsed

       5.477158000 seconds user
       0.271662000 seconds sys
```

Without DEBUGINFOD_URLS set:
```
perf stat ./heaptrack_interpret ...

          2,460.09 msec task-clock                #    0.999 CPUs utilized          
...
       2.461661784 seconds time elapsed

       2.334429000 seconds user
       0.119698000 seconds sys
```

Note the significantly increased wall-time in the first case. The second case 
(without debuginfod) has basically zero off-CPU time and significantly reduced 
on-CPU time too.

Looking at the off-CPU time flame graph, and enabling verbose debuginfod 
output lets me come to the conclusion, that the issue above is due to some 
libraries simply not having any debug information uploaded. I.e. I see dozens 
of reports such as:

```
debuginfod_find_debuginfo 2e245c2bf12f95fd8ab79b3a4be99524677cbd70
server urls "https://debuginfod.archlinux.org/"
checking build-id
checking cache dir /home/milian/.debuginfod_client_cache
using timeout 90
init server 0 https://debuginfod.archlinux.org/buildid
url 0 https://debuginfod.archlinux.org/buildid/
2e245c2bf12f95fd8ab79b3a4be99524677cbd70/debuginfo
query 1 urls in parallel
Downloading |
server response HTTP response code said error
url 0 The requested URL returned error: 404
not found No such file or directory (err=-2)
```

These negative lookups are not cached. Meaning rerunning the same process 
using dwfl and debuginfod on the same data would always incur a significant 
slowdown, as we would again and again try to look for something that's not 
there. The lookups take roughly ~200ms for me to realize the data is not on 
the server.

What's worse, I'm seeing multiple lookups for the same buildid *within the 
same process*. I.e.:

```
export DEBUGINFOD_VERBOSE=1
./heaptrack_interpret ... |& egrep "^url 0 https" | sort | uniq -c | sort
...
      6 url 0 https://debuginfod.archlinux.org/buildid/
7f4b16b4b407cbae2d7118d6f99610e29a18a56a/debuginfo
      8 url 0 https://debuginfod.archlinux.org/buildid/
c09c6f50f6bcec73c64a0b4be77eadb8f7202410/debuginfo
     14 url 0 https://debuginfod.archlinux.org/buildid/
85766e9d8458b16e9c7ce6e07c712c02b8471dbc/debuginfo
```

Here, we are paying roughly `14 * 0.2s = 2.8s` just for a single library.

Can we find a way to improve this situation somehow generically? I would 
personally even be OK with caching the 404 error case locally for some time 
(say, one hour or one day or ...). Then at least we would at most pay this 
cost once per library, and not multiple times. And rerunning the analysis a 
second time would become much faster again.

Was there a deliberate decision against caching negative server side lookups?

Thanks
-- 
Milian Wolff
mail@milianw.de
http://milianw.de

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2022-04-08 22:54 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-08 19:58 caching failed lookups of debuginfo? Milian Wolff
2022-04-08 20:05 ` Frank Ch. Eigler
2022-04-08 20:45   ` Milian Wolff
2022-04-08 20:59     ` Mark Wielaard
2022-04-08 21:08       ` Milian Wolff
2022-04-08 21:34         ` Aaron Merey
2022-04-08 21:56           ` Milian Wolff
2022-04-08 22:21             ` Mark Wielaard
2022-04-08 22:23             ` Milian Wolff
2022-04-08 22:40               ` Mark Wielaard
2022-04-08 22:54                 ` Aaron Merey

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).