public inbox for elfutils@sourceware.org
* [Bug debuginfod/29696] New: intermittent libmicrohttpd assertion failures related to socket fd closing
@ 2022-10-17 14:15 fche at redhat dot com
  2023-06-16 13:45 ` [Bug debuginfod/29696] " rgoldber at redhat dot com
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: fche at redhat dot com @ 2022-10-17 14:15 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=29696

            Bug ID: 29696
           Summary: intermittent libmicrohttpd assertion failures related
                    to socket fd closing
           Product: elfutils
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: debuginfod
          Assignee: unassigned at sourceware dot org
          Reporter: fche at redhat dot com
                CC: elfutils-devel at sourceware dot org
  Target Milestone: ---

In a range of libmicrohttpd versions, up to and including
libmicrohttpd-0.9.75-3.fc36.x86_64, debuginfod occasionally crashes with
messages like:

https://builder.sourceware.org/testrun/920819ee86861130393e12933821c5b544afeee4?filename=tests%2Frun-debuginfod-federation-metrics.sh.log#line1669

Fatal error in GNU libmicrohttpd daemon.c:3831: Failed to remove FD from epoll
set.

Even without MHD_USE_EPOLL, a nearly identical message can come from a
different code path.

-- 
You are receiving this mail because:
You are on the CC list for the bug.


* [Bug debuginfod/29696] intermittent libmicrohttpd assertion failures related to socket fd closing
  2022-10-17 14:15 [Bug debuginfod/29696] New: intermittent libmicrohttpd assertion failures related to socket fd closing fche at redhat dot com
@ 2023-06-16 13:45 ` rgoldber at redhat dot com
  2023-06-16 13:53 ` mark at klomp dot org
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: rgoldber at redhat dot com @ 2023-06-16 13:45 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=29696

Ryan Goldberg <rgoldber at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at sourceware dot org   |rgoldber at redhat dot com
                 CC|                            |rgoldber at redhat dot com

--- Comment #1 from Ryan Goldberg <rgoldber at redhat dot com> ---
Created attachment 14933
  --> https://sourceware.org/bugzilla/attachment.cgi?id=14933&action=edit
Patch for 29696

The debuginfod cache config code was using fdopen and then calling both fclose
on the file stream and close on the original file descriptor. Since fdopen does
not dup the fd, fclose already closes it, and the extra close raced with other
threads: if that fd number had been reused in the meantime (by microhttpd),
we'd end up prematurely closing their socket, leading to the above issue.


* [Bug debuginfod/29696] intermittent libmicrohttpd assertion failures related to socket fd closing
  2022-10-17 14:15 [Bug debuginfod/29696] New: intermittent libmicrohttpd assertion failures related to socket fd closing fche at redhat dot com
  2023-06-16 13:45 ` [Bug debuginfod/29696] " rgoldber at redhat dot com
@ 2023-06-16 13:53 ` mark at klomp dot org
  2023-06-16 14:42 ` rgoldber at redhat dot com
  2023-06-16 15:23 ` fche at redhat dot com
  3 siblings, 0 replies; 5+ messages in thread
From: mark at klomp dot org @ 2023-06-16 13:53 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=29696

Mark Wielaard <mark at klomp dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mark at klomp dot org

--- Comment #2 from Mark Wielaard <mark at klomp dot org> ---
Very nice find. How did you catch this btw?
Are there any tools that help find such "double closes"?
If not maybe we can teach valgrind --track-fds=yes about it, which already can
track fd leaks, so it shouldn't be too hard to make it also detect double/bad
closes.

The patch seems obviously correct to me. Nice to now log close () failures,
which should help catch similar issues early.

Small nitpick. The "}else{" is a bit of a style break with the rest of the
code, which would say:
...
  }
else
  {
...


* [Bug debuginfod/29696] intermittent libmicrohttpd assertion failures related to socket fd closing
  2022-10-17 14:15 [Bug debuginfod/29696] New: intermittent libmicrohttpd assertion failures related to socket fd closing fche at redhat dot com
  2023-06-16 13:45 ` [Bug debuginfod/29696] " rgoldber at redhat dot com
  2023-06-16 13:53 ` mark at klomp dot org
@ 2023-06-16 14:42 ` rgoldber at redhat dot com
  2023-06-16 15:23 ` fche at redhat dot com
  3 siblings, 0 replies; 5+ messages in thread
From: rgoldber at redhat dot com @ 2023-06-16 14:42 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=29696

--- Comment #3 from Ryan Goldberg <rgoldber at redhat dot com> ---
I noticed that the issue was happening in run-debuginfod-federation-metrics.sh,
so to reproduce it I sent lots of requests to a federation of servers. The
issue only occurred on the downstream, so it was a client issue. This made it
pretty quick to replicate, so I could go through debuginfod_query_server and
see how far down I could put an early exit before seeing the race condition.
That narrowed it down to debuginfod_config_cache. From there I noticed the
double close and it was smooth sailing.

I'm not sure about tooling around the double close, but is it possible to know
that something is a double close if the fd may just have been reused? In this
case, for instance, the close won't fail: we're just closing someone else's
open, good-to-go fd. fwiw I looked through the rest of elfutils and didn't find
another fdopen paired with a double close.
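
The point that a stale close need not fail can be shown with a small
single-threaded sketch. It assumes POSIX descriptor allocation (open returns
the lowest unused number) and that /dev/null is readable; the function name
stale_close_hits_reused_fd is made up for the example. In debuginfod the
reuse happened across threads, which is what made the failure intermittent.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if a stale close () succeeded and took down a descriptor
   that a later open () had been given, 0 if that did not happen, and
   -1 if the setup itself failed.  */
static int
stale_close_hits_reused_fd (void)
{
  int first = open ("/dev/null", O_RDONLY);
  if (first < 0)
    return -1;
  if (close (first) != 0)       /* the legitimate close */
    return -1;

  /* Stands in for another owner (such as a socket) being handed the
     same, now free, descriptor number.  */
  int second = open ("/dev/null", O_RDONLY);
  if (second < 0)
    return -1;
  if (second != first)
    {
      close (second);
      return 0;
    }

  /* The stale second close: it returns 0 because the number is valid
     again, so checking the return value alone cannot flag it.  */
  if (close (first) != 0)
    {
      close (second);
      return 0;
    }

  /* The new owner's descriptor is now gone.  */
  errno = 0;
  if (fcntl (second, F_GETFD) == -1 && errno == EBADF)
    return 1;
  return 0;
}
```

A tool would therefore have to track which owner opened each descriptor
number, rather than rely on close reporting an error.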


* [Bug debuginfod/29696] intermittent libmicrohttpd assertion failures related to socket fd closing
  2022-10-17 14:15 [Bug debuginfod/29696] New: intermittent libmicrohttpd assertion failures related to socket fd closing fche at redhat dot com
                   ` (2 preceding siblings ...)
  2023-06-16 14:42 ` rgoldber at redhat dot com
@ 2023-06-16 15:23 ` fche at redhat dot com
  3 siblings, 0 replies; 5+ messages in thread
From: fche at redhat dot com @ 2023-06-16 15:23 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=29696

Frank Ch. Eigler <fche at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #4 from Frank Ch. Eigler <fche at redhat dot com> ---
commit 938a52c22ee915ff2cea813edd5da66bc8184885
Author: Ryan Goldberg <rgoldber@redhat.com>
Date:   Fri Jun 16 10:20:04 2023 -0400

    debuginfod: PR29696: Removed secondary fd close in cache config causing a race condition

