* [Bug debuginfod/29696] New: intermittent libmicrohttpd assertion failures related to socket fd closing
From: fche at redhat dot com @ 2022-10-17 14:15 UTC (permalink / raw)
To: elfutils-devel
https://sourceware.org/bugzilla/show_bug.cgi?id=29696
Bug ID: 29696
Summary: intermittent libmicrohttpd assertion failures related
to socket fd closing
Product: elfutils
Version: unspecified
Status: NEW
Severity: normal
Priority: P2
Component: debuginfod
Assignee: unassigned at sourceware dot org
Reporter: fche at redhat dot com
CC: elfutils-devel at sourceware dot org
Target Milestone: ---
In a range of libmicrohttpd versions, up to and including
libmicrohttpd-0.9.75-3.fc36.x86_64, debuginfod occasionally crashes with
messages like:
https://builder.sourceware.org/testrun/920819ee86861130393e12933821c5b544afeee4?filename=tests%2Frun-debuginfod-federation-metrics.sh.log#line1669
Fatal error in GNU libmicrohttpd daemon.c:3831: Failed to remove FD from epoll set.
Even without MHD_USE_EPOLL, a nearly identical message can come from a
different code path.
--
You are receiving this mail because:
You are on the CC list for the bug.
* [Bug debuginfod/29696] intermittent libmicrohttpd assertion failures related to socket fd closing
From: rgoldber at redhat dot com @ 2023-06-16 13:45 UTC (permalink / raw)
To: elfutils-devel
https://sourceware.org/bugzilla/show_bug.cgi?id=29696
Ryan Goldberg <rgoldber at redhat dot com> changed:
           What    |Removed                         |Added
----------------------------------------------------------------------------
             Status|NEW                             |ASSIGNED
           Assignee|unassigned at sourceware dot org|rgoldber at redhat dot com
                 CC|                                |rgoldber at redhat dot com
--- Comment #1 from Ryan Goldberg <rgoldber at redhat dot com> ---
Created attachment 14933
--> https://sourceware.org/bugzilla/attachment.cgi?id=14933&action=edit
Patch for 29696
The debuginfod cache config code was using fdopen and then calling both fclose
on the file stream and close on the original file descriptor. Since fdopen does
not dup the fd, fclose already closes it. This led to a race condition: if that
fd number had been reused in the meantime (by microhttpd), the second close
would prematurely close their socket, leading to the above issue.
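A minimal sketch of the ownership rule described above, for illustration only. This is not the patch from attachment 14933; the helper name read_long_once and its error convention are hypothetical. It assumes a config file holding a single integer.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not the elfutils code): read one long from PATH.
   Returns 0 on success, or a negative errno value on failure.  */
static int
read_long_once (const char *path, long *value)
{
  int fd = open (path, O_RDONLY);
  if (fd < 0)
    return -errno;

  FILE *f = fdopen (fd, "r");
  if (f == NULL)
    {
      int err = errno;
      close (fd);		/* fdopen failed, so we still own fd.  */
      return -err;
    }

  int rc = (fscanf (f, "%ld", value) == 1) ? 0 : -EINVAL;

  /* The stream now owns fd: fclose closes it.  Calling close (fd)
     after this point would be the double close from the bug.  */
  if (fclose (f) != 0 && rc == 0)
    rc = -errno;
  return rc;
}
```

The point of the sketch is that exactly one of close or fclose runs on any path: close only when fdopen failed, fclose otherwise.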
* [Bug debuginfod/29696] intermittent libmicrohttpd assertion failures related to socket fd closing
From: mark at klomp dot org @ 2023-06-16 13:53 UTC (permalink / raw)
To: elfutils-devel
https://sourceware.org/bugzilla/show_bug.cgi?id=29696
Mark Wielaard <mark at klomp dot org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mark at klomp dot org
--- Comment #2 from Mark Wielaard <mark at klomp dot org> ---
Very nice find. How did you catch this, btw?
Are there any tools that help find such "double closes"?
If not, maybe we can teach valgrind --track-fds=yes about it. It can already
track fd leaks, so it shouldn't be too hard to make it also detect double/bad
closes.
The patch seems obviously correct to me. Nice to now log close () failures,
which should help catch similar issues early.
Small nitpick: the "}else{" is a bit of a style break with the rest of the
code, which would say:

    ...
  }
else
  {
    ...
* [Bug debuginfod/29696] intermittent libmicrohttpd assertion failures related to socket fd closing
From: rgoldber at redhat dot com @ 2023-06-16 14:42 UTC (permalink / raw)
To: elfutils-devel
https://sourceware.org/bugzilla/show_bug.cgi?id=29696
--- Comment #3 from Ryan Goldberg <rgoldber at redhat dot com> ---
I noticed that the issue was happening in run-debuginfod-federation-metrics.sh,
so to reproduce it I was playing with sending lots of requests to a federation
of servers. The issue only occurred on the downstream, so it was a client issue.
This made it pretty quick to replicate, so I could go through
debuginfod_query_server and see how far down I could put an early exit before
seeing the race condition. That narrowed it down to debuginfod_config_cache.
From there I noticed the double close and it was smooth sailing.
I'm not sure about tooling around the double close, but is it possible to know
that something is a double close if the fd may just have been reused? In this
case, for instance, the close won't fail; we're just closing someone else's
open, good-to-go fd. FWIW, I looked through elfutils at least and we don't use
fdopen with a double close anywhere else.
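To illustrate why the stray close does not fail, here is a small single-threaded sketch (the function name is hypothetical and /dev/null stands in for both the config file and the libmicrohttpd socket). It relies on the POSIX rule that open returns the lowest-numbered unused descriptor, so the number freed by fclose is handed straight to the next opener.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Returns 1 if a stale close () silently closed an unrelated, newer
   descriptor, 0 if it did not, and -1 on setup failure.  */
static int
stale_close_hits_reused_fd (void)
{
  int fd = open ("/dev/null", O_RDONLY);
  if (fd < 0)
    return -1;
  FILE *f = fdopen (fd, "r");
  if (f == NULL)
    {
      close (fd);
      return -1;
    }
  fclose (f);			/* Closes fd as well.  */

  /* Stand-in for another thread opening a socket at this moment.  */
  int other = open ("/dev/null", O_RDONLY);
  if (other < 0)
    return -1;
  int reused = (other == fd);

  /* The buggy second close: it succeeds if the number was reused,
     because it now refers to the other, perfectly valid descriptor.  */
  int rc = close (fd);

  /* Check whether the innocent descriptor is now invalid.  */
  int other_dead = (fcntl (other, F_GETFD) == -1);
  if (!other_dead)
    close (other);
  return reused && rc == 0 && other_dead;
}
```

In a multi-threaded server the reuse only happens when another thread opens a descriptor inside the window between the two closes, which fits the failures being intermittent.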
* [Bug debuginfod/29696] intermittent libmicrohttpd assertion failures related to socket fd closing
From: fche at redhat dot com @ 2023-06-16 15:23 UTC (permalink / raw)
To: elfutils-devel
https://sourceware.org/bugzilla/show_bug.cgi?id=29696
Frank Ch. Eigler <fche at redhat dot com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED
--- Comment #4 from Frank Ch. Eigler <fche at redhat dot com> ---
commit 938a52c22ee915ff2cea813edd5da66bc8184885
Author: Ryan Goldberg <rgoldber@redhat.com>
Date: Fri Jun 16 10:20:04 2023 -0400
debuginfod: PR29696: Removed secondary fd close in cache config causing a
race condition