* [Bug debuginfod/30221] Negative cache should differentiate failure types
2023-03-11 1:32 [Bug debuginfod/30221] New: Negative cache should differentiate failure types vi at endrift dot com
@ 2023-03-13 16:40 ` fche at redhat dot com
2023-03-14 1:47 ` vi at endrift dot com
` (12 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: fche at redhat dot com @ 2023-03-13 16:40 UTC (permalink / raw)
To: elfutils-devel
https://sourceware.org/bugzilla/show_bug.cgi?id=30221
Frank Ch. Eigler <fche at redhat dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |fche at redhat dot com
--- Comment #1 from Frank Ch. Eigler <fche at redhat dot com> ---
Even a 404 error may be transient, as a server may just not have gotten around
to indexing new content yet. Other transient errors may persist awhile. I
don't know of any unambiguous winning policy here.
As to the question of how, if such a policy were formulated, the results
could be represented in the filesystem: xattrs, yeah maybe. But even
simpler would be to have the code set the mtime or ctime of the 0-length file
to a cause-related artificial timestamp that will inform the "cache_miss_s"
expiry calculations.
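
A minimal sketch of the timestamp idea described above, for illustration
only. It is not elfutils code: the 600-second value for cache_miss_s, the
cause names, and the per-cause retention periods are all assumptions. The
marker's mtime is shifted so that an unchanged "younger than cache_miss_s"
check expires it after the cause-specific period. It uses mtime because
that is the timestamp userspace can set directly.

```python
import os
import time

CACHE_MISS_S = 600  # assumed retry interval in seconds, for illustration

# Hypothetical per-cause retention periods; not part of elfutils.
RETENTION_S = {
    "not_found": 6000,  # e.g. HTTP 404: keep ten times longer
    "timeout": 600,     # likely transient: keep for the normal interval
}


def record_miss(path, cause, now=None):
    """Create a 0-length marker whose mtime encodes the cause's retention."""
    now = time.time() if now is None else now
    with open(path, "w"):
        pass
    # Shift mtime so the ordinary check in miss_is_fresh() expires the
    # marker RETENTION_S[cause] seconds after `now`.
    fake_mtime = now + RETENTION_S[cause] - CACHE_MISS_S
    os.utime(path, (fake_mtime, fake_mtime))


def miss_is_fresh(path, now=None):
    """Ordinary expiry test: the marker is valid while younger than CACHE_MISS_S."""
    now = time.time() if now is None else now
    try:
        mtime = os.stat(path).st_mtime
    except FileNotFoundError:
        return False
    return now - mtime <= CACHE_MISS_S
```

The expiry check itself knows nothing about causes; all cause-specific
policy lives in the one place that writes the marker.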
--
You are receiving this mail because:
You are on the CC list for the bug.
* [Bug debuginfod/30221] Negative cache should differentiate failure types
From: vi at endrift dot com @ 2023-03-14 1:47 UTC (permalink / raw)
To: elfutils-devel
https://sourceware.org/bugzilla/show_bug.cgi?id=30221
--- Comment #2 from Vicki Pfau <vi at endrift dot com> ---
404 and the like *may* be transient, but the fact of the matter is that *most*
of the time it won't be. And it's a cache, not a definitive answer saying this
will never exist. Having a 404 cache for 10x the amount of time as a Ctrl-C
would be a benefit to users 99% of the time, if not more. You don't need to
overgeneralize to a surefire 100% of the time for something that's already
"soft" like a cache. I'm already dealing with gdb taking well over 30 seconds
to start running a program with a bunch of shared object dependencies that
aren't in debuginfod...only to have to do that again in 10 minutes because
there's no way for the cache to say "this probably won't appear in the short
term." Setting cache_miss_s higher works, but is a workaround.
Using an artificial timestamp to fake out the cache_miss_s expiry is a hack.
There's no other way of describing it. You're trying to wedge down additional
information to a dumber system instead of making the system smarter if you go
for that approach. Your filesystem representation works for the small, simple
case you have here, but it won't scale if you try and extend the system with
any metadata at all. You have one inode per negative cache file instead of one
entry in, e.g. a SQLite database, which you can add additional columns to.
xattrs are still a bit of a kludge but at least aren't trying to spoof
information to fool a system unaware of complexity existing.
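
For comparison, a rough sketch of the database alternative mentioned
above: one row per negative cache entry, with the failure cause and expiry
held in ordinary columns. The table name, column names, and helper
functions are hypothetical and only illustrate the shape of the idea; they
do not describe how the debuginfod client cache is actually stored.

```python
import sqlite3


def open_cache(db_path=":memory:"):
    """Open (or create) a hypothetical negative-cache database."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS negative_cache (
               buildid    TEXT NOT NULL,
               artifact   TEXT NOT NULL,
               cause      TEXT NOT NULL,
               expires_at REAL NOT NULL,
               PRIMARY KEY (buildid, artifact)
           )"""
    )
    return conn


def record_miss(conn, buildid, artifact, cause, retention_s, now):
    """Store or refresh one miss, with a cause-specific retention period."""
    conn.execute(
        "INSERT OR REPLACE INTO negative_cache VALUES (?, ?, ?, ?)",
        (buildid, artifact, cause, now + retention_s),
    )
    conn.commit()


def lookup_miss(conn, buildid, artifact, now):
    """Return the recorded cause if an unexpired miss exists, else None."""
    row = conn.execute(
        "SELECT cause FROM negative_cache "
        "WHERE buildid = ? AND artifact = ? AND expires_at > ?",
        (buildid, artifact, now),
    ).fetchone()
    return row[0] if row else None
```

Adding further metadata later would mean adding a column rather than
overloading an existing filesystem field.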
* [Bug debuginfod/30221] Negative cache should differentiate failure types
From: vi at endrift dot com @ 2023-03-17 1:00 UTC (permalink / raw)
To: elfutils-devel
https://sourceware.org/bugzilla/show_bug.cgi?id=30221
--- Comment #3 from Vicki Pfau <vi at endrift dot com> ---
I have a proof of concept patch that I can attach here or submit to the mailing
list if you think the xattrs approach is a good way to go. Alternatively, a
metadata directory could be added under each buildid for per-file info, which
would work in the absence of functional xattrs, but be slightly more complex.
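
To make the metadata-directory alternative concrete, here is one possible
shape, sketched under assumptions: the ".meta" directory name, the JSON
encoding, and the helper names are all hypothetical and are not taken from
the proof-of-concept patch. Each cached file gets a small sidecar file
under its buildid directory, so no xattr support is needed.

```python
import json
import os


def meta_path(cache_dir, buildid, artifact):
    """Hypothetical layout: <cache_dir>/<buildid>/.meta/<artifact>.json"""
    return os.path.join(cache_dir, buildid, ".meta", artifact + ".json")


def write_meta(cache_dir, buildid, artifact, info):
    """Write per-file metadata via a temporary file and an atomic rename."""
    path = meta_path(cache_dir, buildid, artifact)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(info, f)
    os.replace(tmp, path)


def read_meta(cache_dir, buildid, artifact):
    """Return the stored metadata dict, or None if there is none."""
    try:
        with open(meta_path(cache_dir, buildid, artifact)) as f:
            return json.load(f)
    except FileNotFoundError:
        return None
```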
* [Bug debuginfod/30221] Negative cache should differentiate failure types
From: fche at redhat dot com @ 2023-03-17 1:08 UTC (permalink / raw)
To: elfutils-devel
https://sourceware.org/bugzilla/show_bug.cgi?id=30221
--- Comment #4 from Frank Ch. Eigler <fche at redhat dot com> ---
(In reply to Vicki Pfau from comment #3)
> I have a proof of concept patch that I can attach here or submit to the
> mailing list if you think the xattrs approach is a good way to go.
> Alternatively, a metadata directory could be added under each buildid for
> per-file info, which would work in the absence of functional xattrs, but be
> slightly more complex.
Have you considered the idea of encoding the retention deadline in the boring
inode mtime or ctime?
* [Bug debuginfod/30221] Negative cache should differentiate failure types
From: vi at endrift dot com @ 2023-03-17 1:16 UTC (permalink / raw)
To: elfutils-devel
https://sourceware.org/bugzilla/show_bug.cgi?id=30221
--- Comment #5 from Vicki Pfau <vi at endrift dot com> ---
I have a proof of concept patch that I can attach here or submit to the mailing
list if you think the xattrs approach is a good way to go. Alternatively, a
metadata directory could be added under each buildid for per-file info, which
would work in the absence of functional xattrs, but be slightly more complex.
(In reply to Frank Ch. Eigler from comment #4)
> (In reply to Vicki Pfau from comment #3)
> > I have a proof of concept patch that I can attach here or submit to the
> > mailing list if you think the xattrs approach is a good way to go.
> > Alternatively, a metadata directory could be added under each buildid for
> > per-file info, which would work in the absence of functional xattrs, but be
> > slightly more complex.
>
> Have you considered the idea of encoding the retention deadline in the
> boring inode mtime or ctime?
I did, and in comment 2 I already explained why I think it's a bad idea.
* [Bug debuginfod/30221] Negative cache should differentiate failure types
From: fche at redhat dot com @ 2023-03-17 1:20 UTC (permalink / raw)
To: elfutils-devel
https://sourceware.org/bugzilla/show_bug.cgi?id=30221
--- Comment #6 from Frank Ch. Eigler <fche at redhat dot com> ---
(In reply to Vicki Pfau from comment #2)
> 404 and the like *may* be transient, but the fact of the matter is that
> *most* of the time it won't be. And it's a cache, not a definitive answer
> saying this will never exist. Having a 404 cache for 10x the amount of time
> as a Ctrl-C
I don't understand - a ctrl-C should not result in a cached artifact at all.
If that's happening, we should fix that.
> I'm already dealing with gdb taking well
> over 30 seconds to start running a program with a bunch of shared object
> dependencies that aren't in debuginfod...
Uncached misses from debuginfod tend to take on the order of milliseconds,
much less than seconds. Do you have a trace of what's happening?
(DEBUGINFOD_VERBOSE=1 or something like that?)
> [...] because there's no way for the cache to say "this probably won't
> appear in the short term." Setting cache_miss_s higher works, but is a
> workaround.
That workaround is precisely the parameter for the quantity you seek.
> Your filesystem representation works
> for the small, simple case you have here, but it won't scale if you try and
> extend the system with any metadata at all.
That's fine. We can revisit when rationale exists for more metadata.
* [Bug debuginfod/30221] Negative cache should differentiate failure types
From: vi at endrift dot com @ 2023-03-17 1:28 UTC (permalink / raw)
To: elfutils-devel
https://sourceware.org/bugzilla/show_bug.cgi?id=30221
--- Comment #7 from Vicki Pfau <vi at endrift dot com> ---
I have a proof of concept patch that I can attach here or submit to the mailing
list if you think the xattrs approach is a good way to go. Alternatively, a
metadata directory could be added under each buildid for per-file info, which
would work in the absence of functional xattrs, but be slightly more complex.
(In reply to Frank Ch. Eigler from comment #4)
> (In reply to Vicki Pfau from comment #3)
> > I have a proof of concept patch that I can attach here or submit to the
> > mailing list if you think the xattrs approach is a good way to go.
> > Alternatively, a metadata directory could be added under each buildid for
> > per-file info, which would work in the absence of functional xattrs, but be
> > slightly more complex.
>
> Have you considered the idea of encoding the retention deadline in the
> boring inode mtime or ctime?
I did, and in comment 2 I already explained why I think it's a bad idea.
(In reply to Frank Ch. Eigler from comment #6)
> (In reply to Vicki Pfau from comment #2)
> > 404 and the like *may* be transient, but the fact of the matter is that
> > *most* of the time it won't be. And it's a cache, not a definitive answer
> > saying this will never exist. Having a 404 cache for 10x the amount of time
> > as a Ctrl-C
>
> I don't understand - a ctrl-C should not result in a cached artifact at all.
> If that's happening, we should fix that.
Okay, that is kinda weird then. I'm seeing it in gdb--perhaps it's a gdb issue
then.
> > I'm already dealing with gdb taking well
> > over 30 seconds to start running a program with a bunch of shared object
> > dependencies that aren't in debuginfod...
>
> Uncached misses from debuginfod tend to take on the order of milliseconds,
> much less than seconds. Do you have a trace of what's happening?
> (DEBUGINFOD_VERBOSE=1 or something like that?)
The issue appears to be the debuginfod server taking a not-insignificant amount
of time per request (500ms - 2s I'd estimate) to report the absence of an
associated artifact. Perhaps this is just an issue with how the server is
configured. I'm using the elfutils server, but I've seen the same issue on
Arch's server (the distro I'm using). It's worth noting too that some users
will undoubtedly have higher latency. A way to asynchronously initiate requests
so you can have multiple going at once would be great to try and alleviate this
somewhat, but it doesn't look like there's a way to do this yet.
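
As a rough illustration of what overlapping the lookups would buy, the
sketch below issues several blocking queries from a thread pool. It is
illustrative only: fake_query is a stand-in that merely sleeps to simulate
a slow miss, and nothing here uses the real libdebuginfod API.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_query(buildid, latency_s=0.1):
    """Stand-in for one blocking lookup that ends in a miss (returns None)."""
    time.sleep(latency_s)
    return None


def query_all(buildids, query=fake_query, max_workers=8):
    """Run one query per build ID concurrently; map each ID to its result."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs IDs with results.
        return dict(zip(buildids, pool.map(query, buildids)))
```

With N misses of similar latency, total wall time approaches that of the
slowest single request rather than the sum of all of them, as long as the
pool has enough workers.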
> > [...] because there's no way for the cache to say "this probably won't
> > appear in the short term." Setting cache_miss_s higher works, but is a
> > workaround.
>
> That workaround is precisely the parameter for the quantity you seek.
Assuming the Ctrl-C issue I mentioned above is resolved, you could well be
right. It's definitely the biggest source of the "transient" issues I
mentioned, though things like timeouts might still qualify.
> > Your filesystem representation works
> > for the small, simple case you have here, but it won't scale if you try and
> > extend the system with any metadata at all.
>
> That's fine. We can revisit when rationale exists for more metadata.
Sounds good.
* [Bug debuginfod/30221] Negative cache should differentiate failure types
From: vi at endrift dot com @ 2023-03-17 1:30 UTC (permalink / raw)
To: elfutils-devel
https://sourceware.org/bugzilla/show_bug.cgi?id=30221
--- Comment #8 from Vicki Pfau <vi at endrift dot com> ---
Apologies for the double-post of the first part of that comment. I reloaded the
page and apparently hitting the reply button didn't clear the comment at the
top and I didn't notice until I replied.
* [Bug debuginfod/30221] Negative cache should differentiate failure types
From: amerey at redhat dot com @ 2023-03-17 16:16 UTC (permalink / raw)
To: elfutils-devel
https://sourceware.org/bugzilla/show_bug.cgi?id=30221
Aaron Merey <amerey at redhat dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |amerey at redhat dot com
--- Comment #9 from Aaron Merey <amerey at redhat dot com> ---
(In reply to Vicki Pfau from comment #7)
> I did, and in comment 2 I already explained why I think it's a bad idea.
> (In reply to Frank Ch. Eigler from comment #6)
> > (In reply to Vicki Pfau from comment #2)
> > > 404 and the like *may* be transient, but the fact of the matter is that
> > > *most* of the time it won't be. And it's a cache, not a definitive answer
> > > saying this will never exist. Having a 404 cache for 10x the amount of time
> > > as a Ctrl-C
> >
> > I don't understand - a ctrl-C should not result in a cached artifact at all.
> > If that's happening, we should fix that.
>
> Okay, that is kinda weird then. I'm seeing it in gdb--perhaps it's a gdb
> issue then.
The issue was in libdebuginfod itself. I merged a fix for this:
https://sourceware.org/pipermail/elfutils-devel/2023q1/006050.html
> > > I'm already dealing with gdb taking well
> > > over 30 seconds to start running a program with a bunch of shared object
> > > dependencies that aren't in debuginfod...
> >
> > Uncached misses from debuginfod tend to take on the order of milliseconds,
> > much less than seconds. Do you have a trace of what's happening?
> > (DEBUGINFOD_VERBOSE=1 or something like that?)
>
> The issue appears to be the debuginfod server taking a not-insignificant
> amount of time per request (500ms - 2s I'd estimate) to report the absence
> of an associated artifact. Perhaps this is just an issue with how the server
> is configured. I'm using the elfutils server, but I've seen the same issue
> on Arch's server (the distro I'm using). It's worth noting too that some
> users will undoubtedly have higher latency. A way to asynchronously initiate
> requests so you can have multiple going at once would be great to try and
> alleviate this somewhat, but it doesn't look like there's a way to do this
> yet.
There has been some discussion about gdb downloading from debuginfod in
background worker threads. I would like to get this feature added eventually.
* [Bug debuginfod/30221] Negative cache should differentiate failure types
From: amerey at redhat dot com @ 2023-03-17 16:53 UTC (permalink / raw)
To: elfutils-devel
https://sourceware.org/bugzilla/show_bug.cgi?id=30221
--- Comment #10 from Aaron Merey <amerey at redhat dot com> ---
(In reply to Vicki Pfau from comment #7)
> The issue appears to be the debuginfod server taking a not-insignificant
> amount of time per request (500ms - 2s I'd estimate) to report the absence
> of an associated artifact.
Long-lived TCP connections to debuginfod servers were added to GDB 11.1. Before
that we'd set up and tear down a connection for each query which added
unnecessary latency. So if you are using an older version of GDB this could
explain some of the delay.
* [Bug debuginfod/30221] Negative cache should differentiate failure types
From: vi at endrift dot com @ 2023-03-17 23:39 UTC (permalink / raw)
To: elfutils-devel
https://sourceware.org/bugzilla/show_bug.cgi?id=30221
--- Comment #11 from Vicki Pfau <vi at endrift dot com> ---
I am using 11.1, but I think part of the problem is that Arch adopted
debuginfod relatively recently and hasn't backfilled packages. I updated my
packages yesterday and it took forever to start gdb today, but I think it was
actually downloading most of those packages so it shouldn't leave negative
cache this time. I don't know how debuginfod federation works either, but I'd
absolutely believe that Arch's server is just slow for one reason or another,
and somehow that's causing issues even though I'm querying the sourceware one
directly.
When I updated gdb a few days ago I did notice that the way information was
presented to the user changed, and it did seem faster, but I'm unsure if that
was just placebo effect due to the fact that it was telling me more information
now.
* [Bug debuginfod/30221] Negative cache should differentiate failure types
From: fche at redhat dot com @ 2023-03-24 11:22 UTC (permalink / raw)
To: elfutils-devel
https://sourceware.org/bugzilla/show_bug.cgi?id=30221
--- Comment #12 from Frank Ch. Eigler <fche at redhat dot com> ---
There was a wild performance regression in sqlite 3.41 that archlinux's
debuginfod server got hit with. This was identified and corrected yesterday.
(It had nothing to do with caching.)
https://sqlite.org/forum/forumpost/a284a63124
* [Bug debuginfod/30221] Negative cache should differentiate failure types
From: mark at klomp dot org @ 2023-04-08 13:29 UTC (permalink / raw)
To: elfutils-devel
https://sourceware.org/bugzilla/show_bug.cgi?id=30221
Mark Wielaard <mark at klomp dot org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |mark at klomp dot org
--- Comment #13 from Mark Wielaard <mark at klomp dot org> ---
Has this issue been fixed with the fix from comment #9?
https://sourceware.org/cgit/elfutils/commit/?id=5527216460c6131527c27b06dada015b67525966
And/or was it caused by the sqlite performance regression mentioned in comment
#12?
* [Bug debuginfod/30221] Negative cache should differentiate failure types
From: fche at redhat dot com @ 2023-04-21 2:04 UTC (permalink / raw)
To: elfutils-devel
https://sourceware.org/bugzilla/show_bug.cgi?id=30221
Frank Ch. Eigler <fche at redhat dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |RESOLVED
Resolution|--- |WORKSFORME
--- Comment #14 from Frank Ch. Eigler <fche at redhat dot com> ---
We believe the current code behaves better with respect to aborted downloads.
Thank you for your report.