public inbox for elfutils@sourceware.org
* [Bug debuginfod/27531] New: Support retry of failed downloads
@ 2021-03-06  7:46 sergiodj at sergiodj dot net
From: sergiodj at sergiodj dot net @ 2021-03-06  7:46 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=27531

            Bug ID: 27531
           Summary: Support retry of failed downloads
           Product: elfutils
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: debuginfod
          Assignee: unassigned at sourceware dot org
          Reporter: sergiodj at sergiodj dot net
                CC: elfutils-devel at sourceware dot org
  Target Milestone: ---

While running Debian's debuginfod service, I have received a bunch of
complaints from people who are living far from where the server is located
(Europe) and are getting a few timeouts when GDB attempts to download the
debuginfo files.

It would be nice if we could configure debuginfod-client to retry N times
before giving up.  I thought about opening this bug against GDB, but I think it
makes more sense to implement this in the debuginfod client itself so that
everyone can benefit from it.

-- 
You are receiving this mail because:
You are on the CC list for the bug.


* [Bug debuginfod/27531] Support retry of failed downloads
From: fche at redhat dot com @ 2021-03-06 18:08 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=27531

Frank Ch. Eigler <fche at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fche at redhat dot com
             Status|NEW                         |WAITING

--- Comment #1 from Frank Ch. Eigler <fche at redhat dot com> ---
Can you collect more information about the nature of the timeouts and
what diagnostics if any libcurl/debuginfod-client returned?  The default
timeout imposed by the debuginfod-client code is 90s to start returning
file content, as governed by the $DEBUGINFOD_TIMEOUT environment variable.
Would these folks like us to retry *beyond* that timeout?

Or is the timeout at some earlier stage?  We'd need the diagnostics or
packet trace or something like that.

* [Bug debuginfod/27531] Support retry of failed downloads
From: fche at redhat dot com @ 2021-03-07  2:34 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=27531

--- Comment #2 from Frank Ch. Eigler <fche at redhat dot com> ---
BTW, setting DEBUGINFOD_VERBOSE=1 in the environment can assist with
collecting client-side diagnostics.

* [Bug debuginfod/27531] Support retry of failed downloads
From: sergiodj at sergiodj dot net @ 2021-03-07 20:38 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=27531

--- Comment #3 from Sergio Durigan Junior <sergiodj at sergiodj dot net> ---
(In reply to Frank Ch. Eigler from comment #1)
> Can you collect more information about the nature of the timeouts and
> what diagnostics if any libcurl/debuginfod-client returned?  The default
> timeout imposed by the debuginfod-client code is 90s to start returning
> file content, as governed by the $DEBUGINFOD_TIMEOUT environment variable.
> Would these folks like us to retry *beyond* that timeout?

First of all: yes, I can try to collect more info from them.  I've just emailed
one of the guys who brought this to my attention, and asked him if he can
provide more info and leave a comment here.

With that out of the way, I think it's worth mentioning that the retry idea did
not come from the reports I've received.  It is something that I figured would
be nice to have in order to mitigate the issues.

We are talking about two distinct (albeit related) things here: having a big
timeout does not necessarily mean that the download will succeed, therefore
having the possibility of retrying makes a lot of sense.

> Or is the timeout at some earlier stage?  We'd need the diagnostics or
> packet trace or something like that.

I understand the intention to address the timeout problem, but there are things
out of debuginfod's control here.  One of the emails I received was from a
person who lives in China and therefore has to cope with their sub-optimal
international links.  As far as I understood from what they explained, a
connection between China and Europe is prone to suffer from the instability
that was reported.

* [Bug debuginfod/27531] Support retry of failed downloads
From: fche at redhat dot com @ 2021-03-07 22:04 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=27531

--- Comment #4 from Frank Ch. Eigler <fche at redhat dot com> ---

> having a big timeout does not necessarily mean that the download will
> succeed, therefore having the possibility of retrying makes a lot of sense.

OK, retrying for an outright aborted connection within the overall timeout
limit is something we could do.  Retrying AFTER a timeout, probably not
obvious; we'd need yet another timeout to limit the post-timeout timeout,
yo dawg.
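
A minimal sketch of the first option (retrying an aborted connection while
staying inside one overall time budget), written in Python purely for
illustration.  The names fetch_within_deadline and ConnectionAborted are
hypothetical and are not part of debuginfod-client; the clock is injectable
only so the behaviour can be checked without waiting.

```python
import time


class ConnectionAborted(Exception):
    """Hypothetical marker for a connection that dropped before completing."""


def fetch_within_deadline(fetch, timeout_s, clock=time.monotonic):
    """Call fetch() until it succeeds or the overall deadline has passed.

    Returns (result, attempts).  Re-raises ConnectionAborted once the
    time budget is used up, so no second timeout is needed.
    """
    deadline = clock() + timeout_s
    attempts = 0
    while True:
        attempts += 1
        try:
            return fetch(), attempts
        except ConnectionAborted:
            # Retry only while the single overall budget still has time left.
            if clock() >= deadline:
                raise
```

Because every retry is charged against the same deadline, the total wait
never exceeds the configured timeout by more than one attempt.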

* [Bug debuginfod/27531] Support retry of failed downloads
From: zsj950618 at gmail dot com @ 2021-03-14 17:14 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=27531

Shengjing Zhu <zsj950618 at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |zsj950618 at gmail dot com

--- Comment #5 from Shengjing Zhu <zsj950618 at gmail dot com> ---
Created attachment 13308
  --> https://sourceware.org/bugzilla/attachment.cgi?id=13308&action=edit
debuginfod verbose output

* [Bug debuginfod/27531] Support retry of failed downloads
From: zsj950618 at gmail dot com @ 2021-03-14 17:16 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=27531

--- Comment #6 from Shengjing Zhu <zsj950618 at gmail dot com> ---
I'm the one who complained about the "Timer expired" error when using
https://debuginfod.debian.net. Sergio pointed me to this issue a week ago.

The problem is that I have a bad connection to the server. The log, collected
with DEBUGINFOD_VERBOSE=1, is attached.

url 0 https://debuginfod.debian.net/buildid/75703faf54dc0d3014ce34c9ce110eb6fe217818/debuginfo
query 1 urls in parallel
Downloading separate debug info for /lib/x86_64-linux-gnu/libapt-private.so.0.0...
committed to url 0
server response Timeout was reached
url 0 Operation too slow. Less than 102400 bytes/sec transferred the last 90 seconds
not found Timer expired (err=-62)
Download failed: Timer expired.  Continuing without debug info for /lib/x86_64-linux-gnu/libapt-private.so.0.0.


$ curl -v https://debuginfod.debian.net/buildid/75703faf54dc0d3014ce34c9ce110eb6fe217818/debuginfo -o /dev/null
<skip>
{ [5 bytes data]
100 2700k  100 2700k    0     0  20137      0  0:02:17  0:02:17 --:--:-- 24550
* Connection #0 to host debuginfod.debian.net left intact
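
A rough back-of-the-envelope check of the figures above, written in Python
for illustration only.  It assumes that curl's "2700k" means 2700 KiB and
that the client gives up when the rate stays below the 102400 bytes/sec
threshold from the verbose log for the 90 second window; the constant and
function names are made up for this sketch.

```python
# Figures copied from the curl and DEBUGINFOD_VERBOSE output above.
SIZE_BYTES = 2700 * 1024      # assumption: curl's "2700k" is 2700 KiB
AVG_RATE = 20137              # bytes/sec, curl's reported average speed
LOW_SPEED_LIMIT = 102400      # bytes/sec, from "Operation too slow"
LOW_SPEED_TIME = 90           # seconds, the window named in the log


def transfer_seconds(size_bytes, rate):
    """Time needed to move size_bytes at a constant rate."""
    return size_bytes / rate


needed = transfer_seconds(SIZE_BYTES, AVG_RATE)
too_slow = AVG_RATE < LOW_SPEED_LIMIT
print(f"needed {needed:.0f}s, window {LOW_SPEED_TIME}s, too slow: {too_slow}")
```

The estimate comes out at roughly 137 seconds, which agrees with curl's
0:02:17 and is well past the 90 second window, at an average rate about a
fifth of the threshold.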

* [Bug debuginfod/27531] Support retry of failed downloads
From: fche at redhat dot com @ 2021-03-14 17:29 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=27531

--- Comment #7 from Frank Ch. Eigler <fche at redhat dot com> ---
OK, it looks like four separate instances of "Timer expired (err=-62)"
errors, indicating that the system did wait the 90 seconds (each time).
Doing a retry on our own initiative after this would be possible but
would cause more elapsed time.

Have you considered setting $DEBUGINFOD_TIMEOUT to a larger value, 
like 180 or 300?
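
One way to try this from a script, sketched in Python under some
assumptions: the helper names debuginfod_env and find_debuginfo are
hypothetical, debuginfod-find must be installed, and $DEBUGINFOD_URLS is
assumed to be set already in the calling environment.

```python
import os
import shutil
import subprocess


def debuginfod_env(timeout_s=300, verbose=True, base=None):
    """Return a copy of the environment with a larger client timeout."""
    env = dict(os.environ if base is None else base)
    env["DEBUGINFOD_TIMEOUT"] = str(timeout_s)
    if verbose:
        # Same switch suggested earlier in the thread for diagnostics.
        env["DEBUGINFOD_VERBOSE"] = "1"
    return env


def find_debuginfo(build_id, **kwargs):
    """Run `debuginfod-find debuginfo BUILDID`; None if the tool is missing."""
    tool = shutil.which("debuginfod-find")
    if tool is None:
        return None
    return subprocess.run(
        [tool, "debuginfo", build_id],
        env=debuginfod_env(**kwargs),
        capture_output=True,
        text=True,
    )
```

For example, find_debuginfo("75703faf54dc0d3014ce34c9ce110eb6fe217818",
timeout_s=180) would repeat the failing query from the log with the timeout
doubled.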

* [Bug debuginfod/27531] Support retry of failed downloads
From: zsj950618 at gmail dot com @ 2021-03-14 18:02 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=27531

--- Comment #8 from Shengjing Zhu <zsj950618 at gmail dot com> ---
Created attachment 13309
  --> https://sourceware.org/bugzilla/attachment.cgi?id=13309&action=edit
large timeout output

* [Bug debuginfod/27531] Support retry of failed downloads
From: zsj950618 at gmail dot com @ 2021-03-14 18:03 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=27531

--- Comment #9 from Shengjing Zhu <zsj950618 at gmail dot com> ---
With a larger timeout it works. I have attached the log with timestamps.

* [Bug debuginfod/27531] Support retry of failed downloads
From: fche at redhat dot com @ 2021-03-14 22:58 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=27531

--- Comment #10 from Frank Ch. Eigler <fche at redhat dot com> ---
OK, for your use case, a retry possibly would not help, but an enlarged timeout
does.  Interesting.  Should we keep this RFE open or close it?

* [Bug debuginfod/27531] Support retry of failed downloads
From: sergiodj at sergiodj dot net @ 2021-03-14 23:07 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=27531

--- Comment #11 from Sergio Durigan Junior <sergiodj at sergiodj dot net> ---
Up to you.  I think a retry mechanism is an interesting thing to have, even if
it's not used much nor enabled by default.  But I understand there are higher
priorities now.

* [Bug debuginfod/27531] Support retry of failed downloads
From: fche at redhat dot com @ 2021-03-18 16:54 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=27531

Frank Ch. Eigler <fche at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at sourceware dot org   |alizhang at redhat dot com
             Status|WAITING                     |NEW

--- Comment #12 from Frank Ch. Eigler <fche at redhat dot com> ---
Would suggest implementing this via a new $DEBUGINFOD_RETRY=number environment
variable, defaulting to 0.  If greater than or equal to 1, after an
*intermittent error* type curl result (not a final one like a 200 success or
a 404 not found), retry the $DEBUGINFOD_URLS loop up to $number times.
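
The proposal could be sketched roughly as follows.  This is Python for
illustration, not the C client: the exception classes and the query_urls
helper are hypothetical, the URLs are tried one after another rather than in
parallel, and $DEBUGINFOD_RETRY is the name proposed in this comment, which
may differ from what is eventually implemented.

```python
import os


class NotFound(Exception):
    """Final answer: the server does not have the file (e.g. HTTP 404)."""


class Intermittent(Exception):
    """Transient failure such as a timeout or a dropped connection."""


def query_urls(urls, fetch, environ=None):
    """Try every URL; repeat the whole loop only after intermittent errors."""
    environ = os.environ if environ is None else environ
    # Proposed default is 0, meaning no retries at all.
    retries = max(0, int(environ.get("DEBUGINFOD_RETRY", "0")))
    for _ in range(retries + 1):
        saw_intermittent = False
        for url in urls:
            try:
                return fetch(url)          # success is final
            except NotFound:
                continue                   # 404 is final for this server
            except Intermittent:
                saw_intermittent = True    # worth another pass
        if not saw_intermittent:
            raise NotFound("no server has the file")
    raise Intermittent("retries exhausted")
```

Keeping "not found" out of the retry path means a missing build-id still
fails fast, while only flaky connections pay for the extra passes.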

* [Bug debuginfod/27531] Support retry of failed downloads
From: mark at klomp dot org @ 2021-07-09 13:58 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=27531

Mark Wielaard <mark at klomp dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
                 CC|                            |mark at klomp dot org
             Status|NEW                         |RESOLVED

--- Comment #13 from Mark Wielaard <mark at klomp dot org> ---
commit 60117fb6b2006e1ef282fee48eae7646622d1667
Author: Alice Zhang <alizhang@redhat.com>
Date:   Tue Jul 6 16:12:43 2021 -0400

    PR27531: retry within default retry_limit will be supported.

    In debuginfod-client.c (debuginfod_query_server), insert a
    goto statement for jumping back to the beginning of curl
    handles set up if query fails and a non ENOENT error is returned.

    Also introduced DEBUGINFOD_RETRY_LIMIT_ENV_VAR and default
    DEBUGINFOD_RETRY_LIMIT (which is 2).

    Corresponding test has been added to tests/run-debuginfod-find.sh

    Signed-off-by: Alice Zhang <alizhang@redhat.com>
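
The control flow described in the commit message can be pictured with a
small Python sketch.  It is not the C code from debuginfod-client.c: a loop
stands in for the goto, the environment variable string
"DEBUGINFOD_RETRY_LIMIT" is an assumption (the commit only names the macro
DEBUGINFOD_RETRY_LIMIT_ENV_VAR), and whether the limit counts total or extra
attempts is not stated, so this sketch counts extra attempts.

```python
import errno
import os

DEFAULT_RETRY_LIMIT = 2  # default value stated in the commit message


def retry_limit(environ=None, var="DEBUGINFOD_RETRY_LIMIT"):
    """Read the limit from the environment (variable name is assumed)."""
    environ = os.environ if environ is None else environ
    try:
        return int(environ[var])
    except (KeyError, ValueError):
        return DEFAULT_RETRY_LIMIT


def query_server(attempt_once, limit):
    """attempt_once() returns >= 0 on success or a negative errno value."""
    retries = 0
    while True:
        rc = attempt_once()
        # Success and "not found" (-ENOENT) are final; so is hitting the limit.
        if rc >= 0 or rc == -errno.ENOENT or retries >= limit:
            return rc
        retries += 1  # stands in for the goto back to curl handle setup
```

With this shape, the "Timer expired" failure from the earlier log would be
retried, while a build-id the server does not know still returns at once.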
