From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from gnu.wildebeest.org (wildebeest.demon.nl [212.238.236.112]) by sourceware.org (Postfix) with ESMTPS id C8FBB3858034 for ; Sun, 12 Sep 2021 19:08:35 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org C8FBB3858034 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=klomp.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=klomp.org Received: from reform (deer0x0b.wildebeest.org [172.31.17.141]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by gnu.wildebeest.org (Postfix) with ESMTPSA id E2B4230006CF; Sun, 12 Sep 2021 21:08:33 +0200 (CEST) Received: by reform (Postfix, from userid 1000) id 7BB642E806E8; Sun, 12 Sep 2021 21:08:33 +0200 (CEST) Date: Sun, 12 Sep 2021 21:08:33 +0200 From: Mark Wielaard To: Noah Sanci Cc: elfutils-devel@sourceware.org Subject: Re: [Bug debuginfod/27277] Describe retrieved files when verbose Message-ID: References: <20210805165402.GD4195@redhat.com> <35f2073dfeed8f008d42a78bf60b7efcf13164eb.camel@klomp.org> <20210806185459.GE4195@redhat.com> <9ac621fee207ef233873c40843b3d34ced9019cc.camel@klomp.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Spam-Status: No, score=-8.9 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_BADIPHTTP, KAM_DMARC_STATUS, KAM_SHORT, NUMERIC_HTTP_ADDR, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: elfutils-devel@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Elfutils-devel mailing list List-Unsubscribe: , List-Archive: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 12 Sep 2021 19:08:38 -0000 Hi Noah, On Fri, Sep 10, 2021 at 02:22:00PM -0400, Noah Sanci via Elfutils-devel wrote: > From 979f19eb4fd7a35ace4ddafed103922559b93120 Mon Sep 17 00:00:00 2001 > From: Noah Sanci > Date: Wed, 28 Jul 2021 14:46:05 -0400 > Subject: [PATCH 2/2] debuginfod: PR27277 - Describe retrieved files when > verbose > > Allow users, with enough verbosity, to print the HTTP response headers > upon retrieving a file. These files may include several custome http > response headers such as X-DEBUGINFOD-FILE, X-DEBUGINFOD-SIZE, and > X-DEBUGINFOD-ARCHIVE. These headers are added from the daemon, in > debuginfod.cxx. > run-debuginfod-fd-prefetch-caches.sh was updated so that it doesn't > trip and fail as previously greping for a value that should yield zero > caused an error. I think this part should be in this patch. > Previously, target_handle was used to gather CURLE_OK reponses. Under > some conditions, target_handle was NULL when we wanted it to point to > the handle. This could cause some failuers. instead msg->easy_handle > is used, which ensures the correct handle is used. Thanks for including this explanation. What were the "some conditions"? > +2021-08-02 Noah Sanci > + > + PR27277 > + * debuginfod-client.c (struct debuginfod_client): New field > + winning_headers. > + (struct handle_data): New field response_data. > + (header_callback): Store received headers in response_data. > + (debuginfod_query_server): Activate CURLOPT_HEADERFUNCTION. > + Save winning response_data. > + (debuginfod_end): free client winning headers. > + * debuginfod.cxx (handle_buildid_f_match): remove X-DEBUGINFOD-FILE > + path. Add X-DEBUGINFOD-FILE and X-DEBUGINFOD-SIZE headers. > + (handle_buildid_r_match): remove X-DEBUGINFOD-FILE path. Add > + X-DEBUGINFOD-FILE, X-DEBUGINFOD-SIZE > + headers, and X-ARCHIVE headers. > + > 2021-07-26 Noah Sanci > > PR27982 > diff --git a/debuginfod/debuginfod-client.c b/debuginfod/debuginfod-client.c > index d41723ce..9c16e7a3 100644 > --- a/debuginfod/debuginfod-client.c > +++ b/debuginfod/debuginfod-client.c > @@ -127,6 +127,7 @@ struct debuginfod_client > timeout or other info gotten from environment variables, the > handle data, etc. So those don't have to be reparsed and > recreated on each request. */ > + char * winning_headers; > }; > > /* The cache_clean_interval_s file within the debuginfod cache specifies > @@ -183,6 +184,8 @@ struct handle_data > to the cache. Used to ensure that a file is not downloaded from > multiple servers unnecessarily. */ > CURL **target_handle; > + /* Response http headers for this client handle, sent from the server */ > + char *response_data; > }; I think it will be convenient to also add a size_t response_data_size field. See below. > +static size_t > +header_callback (char * buffer, size_t size, size_t numitems, void * userdata) > +{ > + if (size != 1) > + return 0; > + /* Temporary buffer for realloc */ > + char *temp = NULL; > + size_t userlen = 0; > + if (*(char**)userdata == NULL) > + { > + temp = malloc(numitems+1); > + if (temp == NULL) > + return 0; > + memset(temp, '\0', numitems+1); > + } > + else > + { > + userlen = strlen(*(char**)userdata); > + temp = realloc(*(char**)userdata, userlen + numitems + 1); > + if (temp == NULL) > + return 0; > + } > + strncat(temp, buffer, numitems); > + *(char**)userdata = temp; > + return numitems; > +} I found https://curl.se/libcurl/c/CURLOPT_HEADERFUNCTION.html Maybe you could add that as comment for future readers. In the above userdata points to data[i].response_data. Which means the length/size needs to recalculated each time. Also the data is added with strncat which needs to walk the whole buffer again. If the stuct handle_data had also a size field then most of the above recalculations of the size are unnecessary and since we then know the (old) end of response_data we can simply memcpy the new data to the end (of course we need to make sure to add a zero terminator, but that can be done with one byte wrote instead of doing a memset of the whole buffer). > /* Query each of the server URLs found in $DEBUGINFOD_URLS for the file > with the specified build-id, type (debuginfo, executable or source) > and filename. filename may be NULL. If found, return a file > @@ -936,10 +966,13 @@ debuginfod_query_server (debuginfod_client *c, > curl_easy_setopt (data[i].handle, CURLOPT_LOW_SPEED_LIMIT, > 100 * 1024L); > } > + data[i].response_data = NULL; > curl_easy_setopt(data[i].handle, CURLOPT_FILETIME, (long) 1); > curl_easy_setopt(data[i].handle, CURLOPT_FOLLOWLOCATION, (long) 1); > curl_easy_setopt(data[i].handle, CURLOPT_FAILONERROR, (long) 1); > curl_easy_setopt(data[i].handle, CURLOPT_NOSIGNAL, (long) 1); > + curl_easy_setopt(data[i].handle, CURLOPT_HEADERFUNCTION, header_callback); > + curl_easy_setopt(data[i].handle, CURLOPT_HEADERDATA, (void *) &(data[i].response_data)); So if we wanted to do the suggestion of using a size field then it should be initialized to zero here and the last call should be &data[i]. > #if LIBCURL_VERSION_NUM >= 0x072a00 /* 7.42.0 */ > curl_easy_setopt(data[i].handle, CURLOPT_PATH_AS_IS, (long) 1); > #else > @@ -961,6 +994,7 @@ debuginfod_query_server (debuginfod_client *c, > int committed_to = -1; > bool verbose_reported = false; > struct timespec start_time, cur_time; > + c->winning_headers = NULL; > if ( maxtime > 0 && clock_gettime(CLOCK_MONOTONIC_RAW, &start_time) == -1) > { > rc = errno; > @@ -995,8 +1029,18 @@ debuginfod_query_server (debuginfod_client *c, > if (data[i].handle != target_handle) > curl_multi_remove_handle(curlm, data[i].handle); > else > - committed_to = i; > - } > + { > + committed_to = i; > + if (c->winning_headers == NULL) The indenting here is off because of the mixing of spaces and tabs. > +2021-08-04 Noah Sanci > + > + PR27277 > + * debuginfod-find.1: Increasing verbosity describes the downloaded > + file. > + * debuginfod.8: Describe X-DEBUGINFOD-FILE, X-DEBUGINFOD-SIZE, and > + X-DEBUGINFOD-ARCHIVE. Thanks for updating the docs. Looks good. > diff --git a/tests/ChangeLog b/tests/ChangeLog > index 1abe5456..23aeec4a 100644 > --- a/tests/ChangeLog > +++ b/tests/ChangeLog > @@ -106,11 +106,12 @@ > run-debuginfod-query-retry.sh, > run-debuginfod-regex.sh, > run-debuginfod-sizetime.sh, > - run-debuginfod-tmp-home.sh, and > - run-debuginfod-writable.sh > - run-debuginfod-x-forwarded-for.sh > - * tests/Makefile.am: Added the above new files to the test suite > - * tests/test-subr.sh: Added some general functions for above tests > + run-debuginfod-tmp-home.sh, > + run-debuginfod-writable.sh, and > + run-debuginfod-x-forwarded-for.sh. > + All of the above functions now use a 'base' variable to avoid races > + * Makefile.am: Added the above new files to the test suite > + * test-subr.sh: Added some general functions for above tests These changes seem unrelated to this patch. > +2021-08-02 Noah Sanci > + > + PR27277 > + * run-debuginfod-response-headers.sh: Add a test to ensure that file descriptions > + are accurate for files outside or within archives. > + * Makefile.am: Add run-debuginfod-response-headers.sh to TESTS. It also needs to be added to EXTRA_DIST. > diff --git a/tests/run-debuginfod-fd-prefetch-caches.sh b/tests/run-debuginfod-fd-prefetch-caches.sh > index 61fee9e9..bee88c1e 100755 > --- a/tests/run-debuginfod-fd-prefetch-caches.sh > +++ b/tests/run-debuginfod-fd-prefetch-caches.sh > @@ -51,9 +51,9 @@ grep 'prefetch fds ' vlog$PORT1 #$PREFETCH_FDS > grep 'prefetch mbs ' vlog$PORT1 #$PREFETCH_MBS > # search the vlog to find what metric counts should be and check the correct metrics > # were incrimented > -wait_ready $PORT1 'fdcache_op_count{op="enqueue"}' $( grep -c 'interned.*front=1' vlog$PORT1 ) > -wait_ready $PORT1 'fdcache_op_count{op="evict"}' $( grep -c 'evicted a=.*' vlog$PORT1 ) > -wait_ready $PORT1 'fdcache_op_count{op="prefetch_enqueue"}' $( grep -c 'interned.*front=0' vlog$PORT1 ) > +wait_ready $PORT1 'fdcache_op_count{op="enqueue"}' $( grep -c 'interned.*front=1' vlog$PORT1 || true) > +wait_ready $PORT1 'fdcache_op_count{op="evict"}' $( grep -c 'evicted a=.*' vlog$PORT1 || true ) > +wait_ready $PORT1 'fdcache_op_count{op="prefetch_enqueue"}' $( grep -c 'interned.*front=0' vlog$PORT1 || true ) > wait_ready $PORT1 'fdcache_op_count{op="prefetch_evict"}' $( grep -c 'evicted from prefetch a=.*front=0' vlog$PORT1 || true ) > > kill $PID1 This is an unrelated change. > diff --git a/tests/run-debuginfod-response-headers.sh b/tests/run-debuginfod-response-headers.sh > new file mode 100755 > index 00000000..a458ca1b > --- /dev/null > +++ b/tests/run-debuginfod-response-headers.sh > @@ -0,0 +1,118 @@ > +#!/usr/bin/env bash > +# > +# Copyright (C) 2019-2021 Red Hat, Inc. > +# This file is part of elfutils. > +# > +# This file is free software; you can redistribute it and/or modify > +# it under the terms of the GNU General Public License as published by > +# the Free Software Foundation; either version 3 of the License, or > +# (at your option) any later version. > +# > +# elfutils is distributed in the hope that it will be useful, but > +# WITHOUT ANY WARRANTY; without even the implied warranty of > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > +# GNU General Public License for more details. > +# > +# You should have received a copy of the GNU General Public License > +# along with this program. If not, see . > + > +. $srcdir/test-subr.sh # includes set -e > + > +# for test case debugging, uncomment: > +set -x > + > +DB=${PWD}/.debuginfod_tmp.sqlite > +tempfiles $DB > +export DEBUGINFOD_CACHE_PATH=${PWD}/.client_cache > + > +# This variable is essential and ensures no time-race for claiming ports occurs > +# set base to a unique multiple of 100 not used in any other 'run-debuginfod-*' test > +base=9500 > +get_ports > +mkdir F R > + > +env LD_LIBRARY_PATH=$ldpath DEBUGINFOD_URLS= ${abs_builddir}/../debuginfod/debuginfod $VERBOSE -F -R -d $DB -p $PORT1 -t0 -g0 -v R F > vlog$PORT1 2>&1 & > +PID1=$! > +tempfiles vlog$PORT1 > +errfiles vlog$PORT1 > +# Server must become ready > +wait_ready $PORT1 'ready' 1 > +export DEBUGINFOD_URLS=http://127.0.0.1:$PORT1/ # or without trailing / > +# Check thread comm names > +ps -q $PID1 -e -L -o '%p %c %a' | grep groom > +ps -q $PID1 -e -L -o '%p %c %a' | grep scan > +ps -q $PID1 -e -L -o '%p %c %a' | grep traverse > + > +# We use -t0 and -g0 here to turn off time-based scanning & grooming. > +# For testing purposes, we just sic SIGUSR1 / SIGUSR2 at the process. > + > +######################################################################## > + > +# Compile a simple program, strip its debuginfo and save the build-id. > +# Also move the debuginfo into another directory so that elfutils > +# cannot find it without debuginfod. > +echo "int main() { return 0; }" > ${PWD}/prog.c > +tempfiles prog.c > +# Create a subdirectory to confound source path names > +mkdir foobar > +gcc -Wl,--build-id -g -o prog ${PWD}/foobar///./../prog.c > +testrun ${abs_top_builddir}/src/strip -g -f prog.debug ${PWD}/prog > +BUILDID=`env LD_LIBRARY_PATH=$ldpath ${abs_builddir}/../src/readelf \ > + -a prog | grep 'Build ID' | cut -d ' ' -f 7` > + > +mv prog F > +mv prog.debug F > +kill -USR1 $PID1 > +# Wait till both files are in the index. > +wait_ready $PORT1 'thread_work_total{role="traverse"}' 1 > +wait_ready $PORT1 'thread_work_pending{role="scan"}' 0 > +wait_ready $PORT1 'thread_busy{role="scan"}' 0 > + > +cp -rvp ${abs_srcdir}/debuginfod-rpms R > +if [ "$zstd" = "false" ]; then # nuke the zstd fedora 31 ones > + rm -vrf R/debuginfod-rpms/fedora31 > +fi > + > +kill -USR1 $PID1 > +# Wait till both files are in the index and scan/index fully finished > +wait_ready $PORT1 'thread_work_total{role="traverse"}' 2 > +wait_ready $PORT1 'thread_work_pending{role="scan"}' 0 > +wait_ready $PORT1 'thread_busy{role="scan"}' 0 > +# All rpms need to be in the index, except the dummy permission-000 one > +rpms=$(find R -name \*rpm | grep -v nothing | wc -l) > +wait_ready $PORT1 'scanned_files_total{source=".rpm archive"}' $rpms > +kill -USR1 $PID1 # two hits of SIGUSR1 may be needed to resolve .debug->dwz->srefs > +# Wait till both files are in the index and scan/index fully finished > +wait_ready $PORT1 'thread_work_total{role="traverse"}' 3 > +wait_ready $PORT1 'thread_work_pending{role="scan"}' 0 > +wait_ready $PORT1 'thread_busy{role="scan"}' 0 Is it really necessary to add all this if this is just a test to check the new headers are sent? > +######################################################################## > +## PR27277 > +# Make a simple request to the debuginfod server and check debuginfod-find's vlog to see if > +# the custom HTTP headers are received. > +rm -rf $DEBUGINFOD_CACHE_PATH > +env DEBUGINFOD_URLS="http://127.0.0.1:"$PORT1 LD_LIBRARY_PATH=$ldpath ${abs_top_builddir}/debuginfod/debuginfod-find\ > + -vvv executable F/prog > vlog-find$PORT1.1 2>&1 > +tempfiles vlog-find$PORT1.1 > +grep 'Content-Length: ' vlog-find$PORT1.1 > +grep 'Connection: ' vlog-find$PORT1.1 > +grep 'Cache-Control: ' vlog-find$PORT1.1 > +grep 'X-DEBUGINFOD-FILE: ' vlog-find$PORT1.1 > +grep 'X-DEBUGINFOD-SIZE: ' vlog-find$PORT1.1 > + > +# Check to see if an executable file located in an archive prints the file's description and archive > +env DEBUGINFOD_URLS="http://127.0.0.1:"$PORT1 LD_LIBRARY_PATH=$ldpath ${abs_top_builddir}/debuginfod/debuginfod-find\ > + -vvv executable c36708a78618d597dee15d0dc989f093ca5f9120 > vlog-find$PORT1.2 2>&1 > +tempfiles vlog-find$PORT1.2 > +grep 'Content-Length: ' vlog-find$PORT1.2 > +grep 'Connection: ' vlog-find$PORT1.2 > +grep 'Cache-Control: ' vlog-find$PORT1.2 > +grep 'X-DEBUGINFOD-FILE: ' vlog-find$PORT1.2 > +grep 'X-DEBUGINFOD-SIZE: ' vlog-find$PORT1.2 > +grep 'X-DEBUGINFOD-ARCHIVE: ' vlog-find$PORT1.2 > + > +kill $PID1 > +wait $PID1 > +PID1=0 > +exit 0 > diff --git a/tests/test-subr.sh b/tests/test-subr.sh > index 41e95e31..95dcbb26 100644 > --- a/tests/test-subr.sh > +++ b/tests/test-subr.sh > @@ -43,9 +43,9 @@ remove_files= > exit_cleanup() > { > rm -rf ${PWD}/.client_cache* > - rm -f $remove_files; > + rm -f $remove_files; > ls -a ${PWD} > - cd ..; > + cd ..; > rmdir $test_dir > } > trap exit_cleanup 0 > @@ -228,7 +228,7 @@ cleanup() > { > if [ $PID1 -ne 0 ]; then kill $PID1; wait $PID1; fi > if [ $PID2 -ne 0 ]; then kill $PID2; wait $PID2; fi > - rm -rf ${PWD}/.client_cache F R D L Z ${PWD}/foobar ${PWD}/mocktree > + rm -rf ${PWD}/.client_cache F R D L Z ${PWD}/foobar ${PWD}/mocktree > exit_cleanup > } > trap cleanup 0 1 2 3 5 9 15 > @@ -316,12 +316,12 @@ archive_test() { > > get_ports() { > while true; do > - PORT1=`expr '(' $RANDOM % 1000 ')' + 9000` > + PORT1=`expr '(' $RANDOM % 100 ')' + $base` > ss -atn | fgrep ":$PORT1" || break > done > # Some tests will use two servers, so assign the second var > while true; do > - PORT2=`expr '(' $RANDOM % 1000 ')' + 9000` > + PORT2=`expr '(' $RANDOM % 100 ')' + $base` > ss -atn | fgrep ":$PORT2" && $PORT1 -ne $PORT2 || break > done These changes also seem unrelated. Cheers, Mark