From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [63.128.21.124]) by sourceware.org (Postfix) with ESMTP id A4C853861038 for ; Sat, 31 Oct 2020 14:00:18 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org A4C853861038 Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-466-y0L87Y65Pv-piE0u8o9wVA-1; Sat, 31 Oct 2020 10:00:14 -0400 X-MC-Unique: y0L87Y65Pv-piE0u8o9wVA-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id E10FE1006702 for ; Sat, 31 Oct 2020 14:00:13 +0000 (UTC) Received: from redhat.com (ovpn-112-64.phx2.redhat.com [10.3.112.64]) by smtp.corp.redhat.com (Postfix) with ESMTPS id B01AD5B4D1 for ; Sat, 31 Oct 2020 14:00:13 +0000 (UTC) Received: from fche by redhat.com with local (Exim 4.94) (envelope-from ) id 1kYrQS-0005nM-GU for elfutils-devel@sourceware.org; Sat, 31 Oct 2020 10:00:12 -0400 Date: Sat, 31 Oct 2020 10:00:12 -0400 From: "Frank Ch. Eigler" To: elfutils-devel@sourceware.org Subject: obv [PATCH] debuginfod acceleration, metrics Message-ID: <20201031140012.GA22258@redhat.com> MIME-Version: 1.0 User-Agent: Mutt/1.12.0 (2019-05-25) X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-Spam-Status: No, score=-12.2 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_BADIPHTTP, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H5, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: elfutils-devel@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Elfutils-devel mailing list List-Unsubscribe: , List-Archive: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 31 Oct 2020 14:00:21 -0000 Hi - Committing as obvious enough. :-) Author: Frank Ch. Eigler Date: Sat Oct 31 09:48:56 2020 -0400 debuginfod: Accelerate traversal, shutdowns, improve metrics Added new metrics for scanning that allow estimation of its reading bandwidth. Accelerated responsivity to SIGINT shutdown during archive-scanning phase, which previously insisted on completely processing the current archive. Noted in systemd service file that in the worst case, it might still take a long time. Accelerated traversals by moving regex -I/-X handling to apply to file names only (as always documented), so directory traversal metrics are accurate regardless of their name. Signed-off-by: Frank Ch. Eigler diff --git a/config/ChangeLog b/config/ChangeLog index d280511d95ea..e5c41bee234e 100644 --- a/config/ChangeLog +++ b/config/ChangeLog @@ -1,3 +1,7 @@ +2020-10-31 Frank Ch. Eigler + + * debuginfod.service: Bump up TimeoutStopSec. + 2020-10-30 Frank Ch. Eigler * elfutils.spec.in: Fix debuginfod config/state file attributes diff --git a/config/debuginfod.service b/config/debuginfod.service index 8fca343fb70e..b64d8cb92324 100644 --- a/config/debuginfod.service +++ b/config/debuginfod.service @@ -9,7 +9,8 @@ User=debuginfod Group=debuginfod #CacheDirectory=debuginfod ExecStart=/usr/bin/debuginfod -d /var/cache/debuginfod/debuginfod.sqlite -p $DEBUGINFOD_PORT $DEBUGINFOD_VERBOSE $DEBUGINFOD_PRAGMAS $DEBUGINFOD_PATHS -TimeoutStopSec=10 +# Stopping can take a long time if scanning of large archives is in progress +TimeoutStopSec=60 PrivateTmp=yes [Install] diff --git a/debuginfod/ChangeLog b/debuginfod/ChangeLog index ec791c68d460..34363e7e0139 100644 --- a/debuginfod/ChangeLog +++ b/debuginfod/ChangeLog @@ -1,8 +1,16 @@ +2020-10-31 Frank Ch. Eigler + + * debuginfod.cxx (scan_source_file, scan_archive_file): Add new scanned_bytes_total, + scanned_files_total metrics. + (archive_classify): Exit early if interrupted. + (scan_source_paths): Perform realpath/regex checks only on FTS_F files. + Tweak metrics. + 2020-10-30 Frank Ch. Eigler PR26775 cont'd. - (thread_main_scanner): Ensure control doesn't leave - infinite loop until program exit, even if SIGUSR2. + * debuginfod.cxx (thread_main_scanner): Ensure control doesn't + leave infinite loop until program exit, even if SIGUSR2. (scan_source_paths): Have traverser clean scanq on SIGUSR2. Emit additional traversed_total metrics. (groom): Emit additional groomed_total metrics. diff --git a/debuginfod/debuginfod.cxx b/debuginfod/debuginfod.cxx index 1020a6405978..3085f388f5c1 100644 --- a/debuginfod/debuginfod.cxx +++ b/debuginfod/debuginfod.cxx @@ -1405,7 +1405,8 @@ handle_buildid_r_match (bool internal_req_p, throw libc_exception (errno, "cannot create temporary file"); // NB: don't unlink (tmppath), as fdcache will take charge of it. - rc = archive_read_data_into_fd (a, fd); + // NB: this can take many uninterruptible seconds for a huge file + rc = archive_read_data_into_fd (a, fd); if (rc != ARCHIVE_OK) // e.g. ENOSPC! { close (fd); @@ -2228,7 +2229,9 @@ scan_source_file (const string& rps, const stat_t& st, elf_classify (fd, executable_p, debuginfo_p, buildid, sourcefiles); else throw libc_exception(errno, string("open ") + rps); - inc_metric ("scanned_total","source","file"); + add_metric ("scanned_bytes_total","source","file", + st.st_size); + inc_metric ("scanned_files_total","source","file"); } // NB: we catch exceptions here too, so that we can // cache the corrupt-elf case (!executable_p && @@ -2411,9 +2414,12 @@ archive_classify (const string& rps, string& archive_extension, if (verbose > 3) obatched(clog) << "libarchive scanning " << rps << endl; - while(1) // parse cpio archive entries + while(1) // parse archive entries { - try + if (interrupted) + break; + + try { struct archive_entry *e; rc = archive_read_next_header (a, &e); @@ -2602,7 +2608,9 @@ scan_archive_file (const string& rps, const stat_t& st, st.st_mtime, my_fts_executable, my_fts_debuginfo, my_fts_sref, my_fts_sdef, my_fts_sref_complete_p); - inc_metric ("scanned_total","source",archive_extension + " archive"); + add_metric ("scanned_bytes_total","source",archive_extension + " archive", + st.st_size); + inc_metric ("scanned_files_total","source",archive_extension + " archive"); add_metric("found_debuginfo_total","source",archive_extension + " archive", my_fts_debuginfo); add_metric("found_executable_total","source",archive_extension + " archive", @@ -2808,35 +2816,39 @@ scan_source_paths() if (verbose > 2) obatched(clog) << "fts traversing " << f->fts_path << endl; - /* Found a file. Convert it to an absolute path, so - the buildid database does not have relative path - names that are unresolvable from a subsequent run - in a different cwd. */ - char *rp = realpath(f->fts_path, NULL); - if (rp == NULL) - continue; // ignore dangling symlink or such - string rps = string(rp); - free (rp); - - bool ri = !regexec (&file_include_regex, rps.c_str(), 0, 0, 0); - bool rx = !regexec (&file_exclude_regex, rps.c_str(), 0, 0, 0); - if (!ri || rx) - { - if (verbose > 3) - obatched(clog) << "fts skipped by regex " << (!ri ? "I" : "") << (rx ? "X" : "") << endl; - fts_regex ++; - if (!ri) - inc_metric("traversed_total","type","regex-skipped-I"); - if (rx) - inc_metric("traversed_total","type","regex-skipped-X"); - continue; - } - switch (f->fts_info) { case FTS_F: - scanq.push_back (make_pair(rps, *f->fts_statp)); - inc_metric("traversed_total","type","file"); + { + /* Found a file. Convert it to an absolute path, so + the buildid database does not have relative path + names that are unresolvable from a subsequent run + in a different cwd. */ + char *rp = realpath(f->fts_path, NULL); + if (rp == NULL) + continue; // ignore dangling symlink or such + string rps = string(rp); + free (rp); + + bool ri = !regexec (&file_include_regex, rps.c_str(), 0, 0, 0); + bool rx = !regexec (&file_exclude_regex, rps.c_str(), 0, 0, 0); + if (!ri || rx) + { + if (verbose > 3) + obatched(clog) << "fts skipped by regex " + << (!ri ? "I" : "") << (rx ? "X" : "") << endl; + fts_regex ++; + if (!ri) + inc_metric("traversed_total","type","file-skipped-I"); + if (rx) + inc_metric("traversed_total","type","file-skipped-X"); + } + else + { + scanq.push_back (make_pair(rps, *f->fts_statp)); + inc_metric("traversed_total","type","file"); + } + } break; case FTS_ERR: @@ -2849,6 +2861,10 @@ scan_source_paths() inc_metric("traversed_total","type","error"); break; + case FTS_SL: // ignore, but count because debuginfod -L would traverse these + inc_metric("traversed_total","type","symlink"); + break; + case FTS_D: // ignore inc_metric("traversed_total","type","directory"); break; diff --git a/tests/ChangeLog b/tests/ChangeLog index 13abf89ac5dd..b46d73592c4c 100644 --- a/tests/ChangeLog +++ b/tests/ChangeLog @@ -1,3 +1,7 @@ +2020-10-31 Frank Ch. Eigler + + * run-debuginfod-find.sh: Modify for tweaked/new metrics. + 2020-10-30 Frank Ch. Eigler PR26775 diff --git a/tests/run-debuginfod-find.sh b/tests/run-debuginfod-find.sh index 0d1fcf1fb829..48dbc7d4edf4 100755 --- a/tests/run-debuginfod-find.sh +++ b/tests/run-debuginfod-find.sh @@ -246,11 +246,11 @@ cp -rvp ${abs_srcdir}/debuginfod-tars Z kill -USR1 $PID1 # All rpms need to be in the index, except the dummy permission-000 one rpms=$(find R -name \*rpm | grep -v nothing | wc -l) -wait_ready $PORT1 'scanned_total{source=".rpm archive"}' $rpms +wait_ready $PORT1 'scanned_files_total{source=".rpm archive"}' $rpms txz=$(find Z -name \*tar.xz | wc -l) -wait_ready $PORT1 'scanned_total{source=".tar.xz archive"}' $txz +wait_ready $PORT1 'scanned_files_total{source=".tar.xz archive"}' $txz tb2=$(find Z -name \*tar.bz2 | wc -l) -wait_ready $PORT1 'scanned_total{source=".tar.bz2 archive"}' $tb2 +wait_ready $PORT1 'scanned_files_total{source=".tar.bz2 archive"}' $tb2 kill -USR1 $PID1 # two hits of SIGUSR1 may be needed to resolve .debug->dwz->srefs # Expect all source files found in the rpms (they are all called hello.c :) @@ -419,9 +419,9 @@ if type bsdtar 2>/dev/null; then kill -USR1 $PID2 # All debs need to be in the index debs=$(find D -name \*.deb | wc -l) - wait_ready $PORT2 'scanned_total{source=".deb archive"}' `expr $debs` + wait_ready $PORT2 'scanned_files_total{source=".deb archive"}' `expr $debs` ddebs=$(find D -name \*.ddeb | wc -l) - wait_ready $PORT2 'scanned_total{source=".ddeb archive"}' `expr $ddebs` + wait_ready $PORT2 'scanned_files_total{source=".ddeb archive"}' `expr $ddebs` # ubuntu archive_test f17a29b5a25bd4960531d82aa6b07c8abe84fa66 "" "" @@ -478,6 +478,8 @@ curl -s http://127.0.0.1:$PORT1/metrics | grep 'http_responses_transfer_bytes_co curl -s http://127.0.0.1:$PORT1/metrics | grep 'http_responses_transfer_bytes_sum' curl -s http://127.0.0.1:$PORT1/metrics | grep 'fdcache_' curl -s http://127.0.0.1:$PORT1/metrics | grep 'error_count' +curl -s http://127.0.0.1:$PORT1/metrics | grep 'traversed_total' +curl -s http://127.0.0.1:$PORT1/metrics | grep 'scanned_bytes_total' # And generate a few errors into the second debuginfod's logs, for analysis just below curl -s http://127.0.0.1:$PORT2/badapi > /dev/null || true