public inbox for elfutils@sourceware.org
 help / color / mirror / Atom feed
From: "Frank Ch. Eigler" <fche@redhat.com>
To: Mark Wielaard <mark@klomp.org>
Cc: elfutils-devel@sourceware.org
Subject: Re: patch rfc: PR28661: debuginfod thread-pool
Date: Thu, 9 Dec 2021 19:10:52 -0500	[thread overview]
Message-ID: <20211210001052.GA25363@redhat.com> (raw)
In-Reply-To: <addfc6a7ff0de038b49b4b1f0a61698ee5bf3ceb.camel@klomp.org>

Hi, Mark -

> So I would recommend to simply add a testcase that just uses no-option, 
> -C and -C 256 (or take any arbitrary number, 1 might be an interesting
> corner case) and see that if you have 4 (8, 16, ...?) debuginfod-find
> (or curl metric) processes doing some parallel queries works as
> expected.

OK, here's the current version, with a test, and with one more small
improvement to the error message printer (to add timestamp & batching).


Author: Frank Ch. Eigler <fche@redhat.com>
Date:   Wed Dec 8 22:41:47 2021 -0500

    PR28661: debuginfo connection thread pool support
    
    Add an option -C, which activates libmicrohttpd's thread-pool mode for
    handling incoming http connections.  Add libmicrohttpd error-logging
    callback function so as to receive indication of its internal errors,
    and relay counts to our metrics.  Some of these internal errors tipped
    us off to a microhttpd bug that thread pooling works around.  Document
    in debuginfod.8 page.  Hand-tested against "ulimit -u NNN" shells, and
    with a less strenuous new test case.
    
    Signed-off-by: Frank Ch. Eigler <fche@redhat.com>

diff --git a/NEWS b/NEWS
index 490932ae4ef9..6be58866f100 100644
--- a/NEWS
+++ b/NEWS
@@ -1,3 +1,8 @@
+Version 0.187 after 0.186
+
+debuginfod: Support -C option for connection thread pooling.
+
+
 Version 0.186
 
 debuginfod-client: Default $DEBUGINFOD_URLS is computed from drop-in files
diff --git a/debuginfod/ChangeLog b/debuginfod/ChangeLog
index df373201d5c6..2642ef5eeaf1 100644
--- a/debuginfod/ChangeLog
+++ b/debuginfod/ChangeLog
@@ -3,6 +3,14 @@
 	* debuginfod.cxx (database_stats_report): Don't format clog
 	using 'right' and 'setw(20)'.
 
+2021-12-08  Frank Ch. Eigler  <fche@redhat.com>
+
+	* debuginfod.cxx (connection_pool): New global.
+	(parse_opt): Parse & check -C option to set it.
+	(error_cb): New callback for libmicrohttpd error counting.
+	(main): Activate MHD_OPTION_THREAD_POOL_SIZE if appropriate.
+	Activate error_cb.
+
 2021-12-04  Mark Wielaard  <mark@klomp.org>
 
 	* debuginfod.cxx (main): Call debuginfod_pool_groom before exit.
diff --git a/debuginfod/debuginfod.cxx b/debuginfod/debuginfod.cxx
index 887e4f5abd44..bb8e8e102896 100644
--- a/debuginfod/debuginfod.cxx
+++ b/debuginfod/debuginfod.cxx
@@ -353,7 +353,9 @@ static const struct argp_option options[] =
    { "rescan-time", 't', "SECONDS", 0, "Number of seconds to wait between rescans, 0=disable.", 0 },
    { "groom-time", 'g', "SECONDS", 0, "Number of seconds to wait between database grooming, 0=disable.", 0 },
    { "maxigroom", 'G', NULL, 0, "Run a complete database groom/shrink pass at startup.", 0 },
-   { "concurrency", 'c', "NUM", 0, "Limit scanning thread concurrency to NUM.", 0 },
+   { "concurrency", 'c', "NUM", 0, "Limit scanning thread concurrency to NUM, default=#CPUs.", 0 },
+   { "connection-pool", 'C', "NUM", OPTION_ARG_OPTIONAL,
+     "Use webapi connection pool with NUM threads, default=unlim.", 0 },
    { "include", 'I', "REGEX", 0, "Include files matching REGEX, default=all.", 0 },
    { "exclude", 'X', "REGEX", 0, "Exclude files matching REGEX, default=none.", 0 },
    { "port", 'p', "NUM", 0, "HTTP port to listen on, default 8002.", 0 },
@@ -412,6 +414,7 @@ static unsigned rescan_s = 300;
 static unsigned groom_s = 86400;
 static bool maxigroom = false;
 static unsigned concurrency = std::thread::hardware_concurrency() ?: 1;
+static int connection_pool = 0;
 static set<string> source_paths;
 static bool scan_files = false;
 static map<string,string> scan_archives;
@@ -558,6 +561,16 @@ parse_opt (int key, char *arg,
       concurrency = (unsigned) atoi(arg);
       if (concurrency < 1) concurrency = 1;
       break;
+    case 'C':
+      if (arg)
+        {
+          connection_pool = atoi(arg);
+          if (connection_pool < 2)
+            argp_failure(state, 1, EINVAL, "-C NUM minimum 2");
+        }
+      else // arg not given
+        connection_pool = std::thread::hardware_concurrency() * 2 ?: 2;
+      break;
     case 'I':
       // NB: no problem with unconditional free here - an earlier failed regcomp would exit program
       if (passive_p)
@@ -1368,6 +1381,7 @@ class libarchive_fdcache
       if (verbose > 3)
         obatched(clog) << "fdcache interned a=" << a << " b=" << b
                        << " fd=" << fd << " mb=" << mb << " front=" << front_p << endl;
+      
       set_metrics();
     }
 
@@ -3708,7 +3722,15 @@ sigusr2_handler (int /* sig */)
 }
 
 
-
+static void // error logging callback from libmicrohttpd internals
+error_cb (void *arg, const char *fmt, va_list ap)
+{
+  (void) arg;
+  inc_metric("error_count","libmicrohttpd",fmt);
+  char errmsg[512];
+  (void) vsnprintf (errmsg, sizeof(errmsg), fmt, ap); // ok if slightly truncated
+  obatched(cerr) << "libmicrohttpd error: " << errmsg; // MHD_DLOG calls already include \n
+}
 
 
 // A user-defined sqlite function, to score the sharedness of the
@@ -3853,7 +3875,7 @@ main (int argc, char *argv[])
 
   // Start httpd server threads.  Separate pool for IPv4 and IPv6, in
   // case the host only has one protocol stack.
-  MHD_Daemon *d4 = MHD_start_daemon (MHD_USE_THREAD_PER_CONNECTION
+  MHD_Daemon *d4 = MHD_start_daemon ((connection_pool ? 0 : MHD_USE_THREAD_PER_CONNECTION)
 #if MHD_VERSION >= 0x00095300
                                      | MHD_USE_INTERNAL_POLLING_THREAD
 #else
@@ -3863,8 +3885,11 @@ main (int argc, char *argv[])
                                      http_port,
                                      NULL, NULL, /* default accept policy */
                                      handler_cb, NULL, /* handler callback */
+                                     MHD_OPTION_EXTERNAL_LOGGER, error_cb, NULL,
+                                     (connection_pool ? MHD_OPTION_THREAD_POOL_SIZE : MHD_OPTION_END),
+                                     (connection_pool ? (int)connection_pool : MHD_OPTION_END),
                                      MHD_OPTION_END);
-  MHD_Daemon *d6 = MHD_start_daemon (MHD_USE_THREAD_PER_CONNECTION
+  MHD_Daemon *d6 = MHD_start_daemon ((connection_pool ? 0 : MHD_USE_THREAD_PER_CONNECTION)
 #if MHD_VERSION >= 0x00095300
                                      | MHD_USE_INTERNAL_POLLING_THREAD
 #else
@@ -3875,6 +3900,9 @@ main (int argc, char *argv[])
                                      http_port,
                                      NULL, NULL, /* default accept policy */
                                      handler_cb, NULL, /* handler callback */
+                                     MHD_OPTION_EXTERNAL_LOGGER, error_cb, NULL,
+                                     (connection_pool ? MHD_OPTION_THREAD_POOL_SIZE : MHD_OPTION_END),
+                                     (connection_pool ? (int)connection_pool : MHD_OPTION_END),
                                      MHD_OPTION_END);
 
   if (d4 == NULL && d6 == NULL) // neither ipv4 nor ipv6? boo
@@ -3928,6 +3956,8 @@ main (int argc, char *argv[])
 
   if (! passive_p)
     obatched(clog) << "search concurrency " << concurrency << endl;
+  obatched(clog) << "webapi connection pool " << connection_pool
+                 << (connection_pool ? "" : " (unlimited)") << endl;
   if (! passive_p)
     obatched(clog) << "rescan time " << rescan_s << endl;
   obatched(clog) << "fdcache fds " << fdcache_fds << endl;
diff --git a/doc/ChangeLog b/doc/ChangeLog
index 7a73fa107bf3..f25bcd2ee2b3 100644
--- a/doc/ChangeLog
+++ b/doc/ChangeLog
@@ -1,3 +1,8 @@
+2021-12-08  Frank Ch. Eigler  <fche@redhat.com>
+
+	PR28661
+	* debuginfod.8 (-C): Document new flag.
+
 2021-11-05  Frank Ch. Eigler  <fche@redhat.com>
 
 	PR28430
diff --git a/doc/debuginfod.8 b/doc/debuginfod.8
index 1e56f6568322..ee8e4078e5b5 100644
--- a/doc/debuginfod.8
+++ b/doc/debuginfod.8
@@ -205,6 +205,23 @@ This important for controlling CPU-intensive operations like parsing
 an ELF file and especially decompressing archives.  The default is the
 number of processors on the system; the minimum is 1.
 
+.TP
+.B "\-C" "\-C=NUM" "\-\-connection\-pool" "\-\-connection\-pool=NUM"
+Set the size of the pool of threads serving webapi queries.  The
+following table summarizes the interpretaton of this option and its
+optional NUM parameter.
+.TS
+l l.
+no option	clone new thread for every request, no fixed pool
+\-C	use a fixed thread pool sized automatically
+\-C=NUM	use a fixed thread pool sized NUM, minimum 2
+.TE
+
+The first mode is useful for friendly bursty traffic.  The second mode
+is a simple and safe configuration based on the number of processors.
+The third mode is suitable for tuned load-limiting configurations
+facing unruly traffic.
+
 .TP
 .B "\-L"
 Traverse symbolic links encountered during traversal of the PATHs,
diff --git a/tests/ChangeLog b/tests/ChangeLog
index 76cb974a43c1..82061c6e9a58 100644
--- a/tests/ChangeLog
+++ b/tests/ChangeLog
@@ -1,3 +1,9 @@
+2021-12-09  Frank Ch. Eigler  <fche@redhat.com>
+
+	* debuginfod-subr.sh (xfail): New proc.
+	* run-debuginfod-webapi-concurrency.sh: New test for -C.
+	* Makefile.am: List it.
+
 2021-12-04  Mark Wielaard  <mark@klomp.org>
 
 	* Makefile.am (EXTRA_NLIST_CFLAGS): New variable depends on
diff --git a/tests/Makefile.am b/tests/Makefile.am
index 8acaeaf2a3a6..b2da2c83e15a 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -242,7 +242,8 @@ TESTS += run-debuginfod-dlopen.sh \
          run-debuginfod-percent-escape.sh \
          run-debuginfod-x-forwarded-for.sh \
          run-debuginfod-response-headers.sh \
-         run-debuginfod-extraction-passive.sh
+         run-debuginfod-extraction-passive.sh \
+	 run-debuginfod-webapi-concurrency.sh
 endif
 endif
 
@@ -539,6 +540,7 @@ EXTRA_DIST = run-arextract.sh run-arsymtest.sh run-ar.sh \
              run-debuginfod-percent-escape.sh \
 	     run-debuginfod-response-headers.sh \
              run-debuginfod-extraction-passive.sh \
+             run-debuginfod-webapi-concurrency.sh \
 	     debuginfod-rpms/fedora30/hello2-1.0-2.src.rpm \
 	     debuginfod-rpms/fedora30/hello2-1.0-2.x86_64.rpm \
 	     debuginfod-rpms/fedora30/hello2-debuginfo-1.0-2.x86_64.rpm \
diff --git a/tests/debuginfod-subr.sh b/tests/debuginfod-subr.sh
index 59033f35980e..0b59b5b87e55 100755
--- a/tests/debuginfod-subr.sh
+++ b/tests/debuginfod-subr.sh
@@ -158,3 +158,13 @@ PORT1=0
 PORT2=0
 PID1=0
 PID2=0
+
+
+# run $1 as a sh -c command, invert result code
+xfail() {
+    if sh -c "$1"; then
+        false
+    else
+        true
+    fi
+}
diff --git a/tests/run-debuginfod-webapi-concurrency.sh b/tests/run-debuginfod-webapi-concurrency.sh
new file mode 100755
index 000000000000..513f834f40d5
--- /dev/null
+++ b/tests/run-debuginfod-webapi-concurrency.sh
@@ -0,0 +1,60 @@
+#!/usr/bin/env bash
+#
+# Copyright (C) 2021 Red Hat, Inc.
+# This file is part of elfutils.
+#
+# This file is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# elfutils is distributed in the hope that it will be useful, but
+# WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+. $srcdir/debuginfod-subr.sh
+
+# for test case debugging, uncomment:
+set -x
+
+mkdir Z
+# This variable is essential and ensures no time-race for claiming ports occurs
+# set base to a unique multiple of 100 not used in any other 'run-debuginfod-*' test
+base=12000
+get_ports
+
+export DEBUGINFOD_CACHE_PATH=${PWD}/.client_cache
+
+cp -rvp ${abs_srcdir}/debuginfod-tars Z
+tempfiles Z
+
+
+for Cnum in "" "-C" "-C10" "-C100"
+do
+    env LD_LIBRARY_PATH=$ldpath ${abs_builddir}/../debuginfod/debuginfod $VERBOSE "$Cnum" -d :memory: -Z .tar.xz -Z .tar.bz2=bzcat -p $PORT1 -t0 -g0 -v --fdcache-fds=0 --fdcache-prefetch-fds=0 Z >> vlog$PORT1 2>&1 &
+    PID1=$!
+    tempfiles vlog$PORT1
+    errfiles vlog$PORT1
+    
+    wait_ready $PORT1 'ready' 1
+    # Wait for server to finish indexing
+    wait_ready $PORT1 'thread_work_total{role="traverse"}' 1
+    wait_ready $PORT1 'thread_work_pending{role="scan"}' 0
+    wait_ready $PORT1 'thread_busy{role="scan"}' 0
+
+    # Do a bunch of lookups in parallel
+    for jobs in `seq 400`; do
+        curl -s http://localhost:$PORT1/buildid/cee13b2ea505a7f37bd20d271c6bc7e5f8d2dfcb/debuginfo > /dev/null &
+    done
+    (sleep 5; curl -s http://localhost:$PORT1/metrics | egrep 'error|transfers'; kill $PID1) &
+    wait
+    PID1=0
+done
+
+xfail "grep Server.reached.connection vlog$PORT1" # PR18661
+
+exit 0


  parent reply	other threads:[~2021-12-10  0:10 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-09  3:55 Frank Ch. Eigler
2021-12-09 16:47 ` Mark Wielaard
2021-12-09 16:54   ` Frank Ch. Eigler
2021-12-10  0:10   ` Frank Ch. Eigler [this message]
2021-12-10 17:44     ` Mark Wielaard

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211210001052.GA25363@redhat.com \
    --to=fche@redhat.com \
    --cc=elfutils-devel@sourceware.org \
    --cc=mark@klomp.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).