public inbox for elfutils@sourceware.org
* [Bug debuginfod/29976] New: webapi connection pool eats all file handles
@ 2023-01-09 18:14 ross at burtonini dot com
  2023-01-09 19:02 ` [Bug debuginfod/29976] " fche at redhat dot com
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: ross at burtonini dot com @ 2023-01-09 18:14 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=29976

            Bug ID: 29976
           Summary: webapi connection pool eats all file handles
           Product: elfutils
           Version: unspecified
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: debuginfod
          Assignee: unassigned at sourceware dot org
          Reporter: ross at burtonini dot com
                CC: elfutils-devel at sourceware dot org
  Target Milestone: ---

If I start debuginfod without any concurrency limits, it fails at startup:

[Mon Jan  9 17:40:14 2023] (2356243/2356243): libmicrohttpd error: Failed to
create worker inter-thread communication channel: Too many open files

My machine has 256 cores, and stracing debuginfod shows that it fails to open
more files after creating 510 epoll fds (twice):

epoll_create1(EPOLL_CLOEXEC)            = 1021
epoll_ctl(1021, EPOLL_CTL_ADD, 3, {events=EPOLLIN, data={u32=4027013664,
u64=187651148175904}}) = 0
epoll_ctl(1021, EPOLL_CTL_ADD, 1020, {events=EPOLLIN, data={u32=2965961632,
u64=281473647704992}}) = 0
mmap(NULL, 8454144, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) =
0xfff6b97b0000
mprotect(0xfff6b97c0000, 8388608, PROT_READ|PROT_WRITE) = 0
rt_sigprocmask(SIG_BLOCK, ~[], [], 8)   = 0
clone(child_stack=0xfff6b9fbea00,
flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID,
parent_tid=[2361982], tls=0xfff6b9fbf880, child_tidptr=0xfff6b9fbf210) =
2361982
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
eventfd2(0, EFD_CLOEXEC|EFD_NONBLOCK)   = 1022
epoll_create1(EPOLL_CLOEXEC)            = 1023
epoll_ctl(1023, EPOLL_CTL_ADD, 3, {events=EPOLLIN, data={u32=4027014456,
u64=187651148176696}}) = 0
epoll_ctl(1023, EPOLL_CTL_ADD, 1022, {events=EPOLLIN, data={u32=2965961632,
u64=281473647704992}}) = 0
mmap(NULL, 8454144, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) =
0xfff6b8fa0000
mprotect(0xfff6b8fb0000, 8388608, PROT_READ|PROT_WRITE) = 0
rt_sigprocmask(SIG_BLOCK, ~[], [], 8)   = 0
clone(child_stack=0xfff6b97aea00,
flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID,
parent_tid=[2361983], tls=0xfff6b97af880, child_tidptr=0xfff6b97af210) =
2361983
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
eventfd2(0, EFD_CLOEXEC|EFD_NONBLOCK)   = -1 EMFILE (Too many open files)

ulimit -n is 1024. Do I really need more just to start debuginfod if I have 256
cores?  As the number of web connections is 2x threads and each connection
appears to use two fds, maybe I do.

Should the connection pool have a hard limit when using the default? I doubt
512 incoming connections would be usual, and if that is needed then the user
can specify -C.
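
The file-descriptor arithmetic in this report can be sketched as follows. This is only an illustration, not debuginfod code: the function names are hypothetical, and the baseline of 4 already-open descriptors is an assumption inferred from the trace, where the per-worker eventfd/epoll pairs appear to occupy fds 4 through 1023.

```python
def fds_needed(cores, threads_per_core=2, fds_per_worker=2, baseline_fds=4):
    """Descriptors needed to start the whole default pool.

    threads_per_core=2 and fds_per_worker=2 follow the report (pool of
    2 x cores, one eventfd plus one epoll fd per worker); baseline_fds=4
    is an assumption (stdin, stdout, stderr, listening socket).
    """
    return baseline_fds + cores * threads_per_core * fds_per_worker


def max_workers(soft_limit, fds_per_worker=2, baseline_fds=4):
    """Largest number of workers that fits under a given `ulimit -n`."""
    return max(0, (soft_limit - baseline_fds) // fds_per_worker)


if __name__ == "__main__":
    print(fds_needed(256))    # descriptors wanted on a 256-core machine
    print(max_workers(1024))  # workers that fit under ulimit -n 1024
```

Under these assumptions a 256-core machine wants slightly more than 1024 descriptors, and a limit of 1024 leaves room for 510 workers, which matches the count seen in the strace output.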

-- 
You are receiving this mail because:
You are on the CC list for the bug.

* [Bug debuginfod/29976] webapi connection pool eats all file handles
  2023-01-09 18:14 [Bug debuginfod/29976] New: webapi connection pool eats all file handles ross at burtonini dot com
@ 2023-01-09 19:02 ` fche at redhat dot com
  2023-01-09 19:56 ` ross at burtonini dot com
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: fche at redhat dot com @ 2023-01-09 19:02 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=29976

Frank Ch. Eigler <fche at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fche at redhat dot com

--- Comment #1 from Frank Ch. Eigler <fche at redhat dot com> ---
What sets "ulimit -n -> 1000" in your case?

* [Bug debuginfod/29976] webapi connection pool eats all file handles
  2023-01-09 18:14 [Bug debuginfod/29976] New: webapi connection pool eats all file handles ross at burtonini dot com
  2023-01-09 19:02 ` [Bug debuginfod/29976] " fche at redhat dot com
@ 2023-01-09 19:56 ` ross at burtonini dot com
  2023-01-09 19:59 ` ross at burtonini dot com
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: ross at burtonini dot com @ 2023-01-09 19:56 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=29976

--- Comment #2 from Ross Burton <ross at burtonini dot com> ---
Honestly, no idea.  It appears to be the default on Ubuntu.

* [Bug debuginfod/29976] webapi connection pool eats all file handles
  2023-01-09 18:14 [Bug debuginfod/29976] New: webapi connection pool eats all file handles ross at burtonini dot com
  2023-01-09 19:02 ` [Bug debuginfod/29976] " fche at redhat dot com
  2023-01-09 19:56 ` ross at burtonini dot com
@ 2023-01-09 19:59 ` ross at burtonini dot com
  2023-01-09 20:05 ` fche at redhat dot com
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: ross at burtonini dot com @ 2023-01-09 19:59 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=29976

--- Comment #3 from Ross Burton <ross at burtonini dot com> ---
Yes, kernel defaults: 1024 soft, 4096 hard.

I *can* change it to 4096 but there's still the point that:

1) debugging the failure case isn't trivial
2) cores*2 threads in the connection pool probably doesn't scale linearly
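
For reference, raising the soft limit up to the hard limit can also be done from inside a process rather than with ulimit. The sketch below is a generic illustration using Python's standard `resource` module; the helper name and the default target of 4096 (taken from the hard limit quoted above) are assumptions, and this is not something debuginfod is known to do.

```python
import resource


def raise_nofile_soft_limit(target=4096):
    """Raise the soft RLIMIT_NOFILE towards `target`, never past the hard limit.

    Returns the soft limit in effect afterwards. Hypothetical helper,
    shown only to illustrate the soft/hard distinction.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return soft  # nothing to raise
    # An unprivileged process may raise its soft limit only up to the hard one.
    wanted = target if hard == resource.RLIM_INFINITY else min(target, hard)
    if wanted > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (wanted, hard))
        soft = wanted
    return soft


if __name__ == "__main__":
    print("soft limit now:", raise_nofile_soft_limit())
```

This only works around the limit; it does not address either of the two points above.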

* [Bug debuginfod/29976] webapi connection pool eats all file handles
  2023-01-09 18:14 [Bug debuginfod/29976] New: webapi connection pool eats all file handles ross at burtonini dot com
                   ` (2 preceding siblings ...)
  2023-01-09 19:59 ` ross at burtonini dot com
@ 2023-01-09 20:05 ` fche at redhat dot com
  2023-01-09 20:20 ` ross at burtonini dot com
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: fche at redhat dot com @ 2023-01-09 20:05 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=29976

--- Comment #4 from Frank Ch. Eigler <fche at redhat dot com> ---
I assume "debuginfod -C $num -d $num" still works for you, in this battle of
distro/site defaults.

* [Bug debuginfod/29976] webapi connection pool eats all file handles
  2023-01-09 18:14 [Bug debuginfod/29976] New: webapi connection pool eats all file handles ross at burtonini dot com
                   ` (3 preceding siblings ...)
  2023-01-09 20:05 ` fche at redhat dot com
@ 2023-01-09 20:20 ` ross at burtonini dot com
  2023-01-10 23:04 ` fche at redhat dot com
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: ross at burtonini dot com @ 2023-01-09 20:20 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=29976

--- Comment #5 from Ross Burton <ross at burtonini dot com> ---
Yes.

My use case is a test that uses debuginfod, so it works everywhere and as it
only has to service a few requests I'm just passing -C2 -c2.

* [Bug debuginfod/29976] webapi connection pool eats all file handles
  2023-01-09 18:14 [Bug debuginfod/29976] New: webapi connection pool eats all file handles ross at burtonini dot com
                   ` (4 preceding siblings ...)
  2023-01-09 20:20 ` ross at burtonini dot com
@ 2023-01-10 23:04 ` fche at redhat dot com
  2023-01-11 11:44 ` ross at burtonini dot com
  2023-01-11 15:34 ` fche at redhat dot com
  7 siblings, 0 replies; 9+ messages in thread
From: fche at redhat dot com @ 2023-01-10 23:04 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=29976

--- Comment #6 from Frank Ch. Eigler <fche at redhat dot com> ---
Please check out commit 7399e3bd7eb72d045 on elfutils.git for a test patch.

* [Bug debuginfod/29976] webapi connection pool eats all file handles
  2023-01-09 18:14 [Bug debuginfod/29976] New: webapi connection pool eats all file handles ross at burtonini dot com
                   ` (5 preceding siblings ...)
  2023-01-10 23:04 ` fche at redhat dot com
@ 2023-01-11 11:44 ` ross at burtonini dot com
  2023-01-11 15:34 ` fche at redhat dot com
  7 siblings, 0 replies; 9+ messages in thread
From: ross at burtonini dot com @ 2023-01-11 11:44 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=29976

--- Comment #7 from Ross Burton <ross at burtonini dot com> ---
Looks good to me!

* [Bug debuginfod/29976] webapi connection pool eats all file handles
  2023-01-09 18:14 [Bug debuginfod/29976] New: webapi connection pool eats all file handles ross at burtonini dot com
                   ` (6 preceding siblings ...)
  2023-01-11 11:44 ` ross at burtonini dot com
@ 2023-01-11 15:34 ` fche at redhat dot com
  7 siblings, 0 replies; 9+ messages in thread
From: fche at redhat dot com @ 2023-01-11 15:34 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=29976

Frank Ch. Eigler <fche at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #8 from Frank Ch. Eigler <fche at redhat dot com> ---

Pushed to master as dcb40f9caa7ca30

Author: Frank Ch. Eigler <fche@redhat.com>
Date:   Tue Jan 10 17:59:35 2023 -0500

    debuginfod PR29975 & PR29976: decrease default concurrency

    ... based on rlimit (ulimit -n NUM)
    ... based on cpu-affinity (taskset -c A,B,C,D ...)

    Signed-off-by: Frank Ch. Eigler <fche@redhat.com>
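
The idea named in the commit message, deriving the default concurrency from the fd limit and the CPU affinity mask, might look roughly like the sketch below. This is not the code from the commit: the formula, the function names, and the `reserve` of 24 descriptors are all assumptions chosen for illustration.

```python
import os
import resource


def default_concurrency(soft_limit, usable_cpus, reserve=24, fds_per_worker=2):
    """Pick a pool size bounded by both CPUs and file descriptors.

    `reserve` (descriptors kept free for everything else) and the
    2-fds-per-worker cost are illustrative assumptions.
    """
    by_cpu = 2 * usable_cpus
    by_fds = (soft_limit - reserve) // fds_per_worker
    return max(1, min(by_cpu, by_fds))


def detect_default_concurrency():
    """Apply the heuristic to the current process's limits and affinity."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        soft = 1 << 20  # arbitrary large stand-in for "unlimited"
    try:
        # Honours restrictions such as `taskset -c A,B,C,D` (Linux only).
        cpus = len(os.sched_getaffinity(0))
    except AttributeError:
        cpus = os.cpu_count() or 1
    return default_concurrency(soft, cpus)


if __name__ == "__main__":
    print(detect_default_concurrency())
```

With these assumed numbers, a 256-core machine at `ulimit -n 1024` would get a pool of 500 rather than 512, so startup no longer exhausts the descriptor table.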
