From: Mark Wielaard <mark@klomp.org>
To: Noah Sanci <nsanci@redhat.com>
Cc: elfutils-devel@sourceware.org
Subject: Re: [Bug debuginfod/27277] Describe retrieved files when verbose
Date: Sun, 12 Sep 2021 21:08:33 +0200 [thread overview]
Message-ID: <YT5QMUw8KbKkKi5X@wildebeest.org> (raw)
In-Reply-To: <CAJXA7qgA+ov2qBqqUX5MgeQM7aRX2OMGEDp2N+wssa6teDeuGg@mail.gmail.com>
Hi Noah,
On Fri, Sep 10, 2021 at 02:22:00PM -0400, Noah Sanci via Elfutils-devel wrote:
> From 979f19eb4fd7a35ace4ddafed103922559b93120 Mon Sep 17 00:00:00 2001
> From: Noah Sanci <nsanci@redhat.com>
> Date: Wed, 28 Jul 2021 14:46:05 -0400
> Subject: [PATCH 2/2] debuginfod: PR27277 - Describe retrieved files when
> verbose
>
> Allow users, with enough verbosity, to print the HTTP response headers
> upon retrieving a file. These files may include several custome http
> response headers such as X-DEBUGINFOD-FILE, X-DEBUGINFOD-SIZE, and
> X-DEBUGINFOD-ARCHIVE. These headers are added from the daemon, in
> debuginfod.cxx.
> run-debuginfod-fd-prefetch-caches.sh was updated so that it doesn't
> trip and fail as previously greping for a value that should yield zero
> caused an error.
I think this part should be in this patch.
> Previously, target_handle was used to gather CURLE_OK reponses. Under
> some conditions, target_handle was NULL when we wanted it to point to
> the handle. This could cause some failuers. instead msg->easy_handle
> is used, which ensures the correct handle is used.
Thanks for including this explanation. What were the "some conditions"?
> +2021-08-02 Noah Sanci <nsanci@redhat.com>
> +
> + PR27277
> + * debuginfod-client.c (struct debuginfod_client): New field
> + winning_headers.
> + (struct handle_data): New field response_data.
> + (header_callback): Store received headers in response_data.
> + (debuginfod_query_server): Activate CURLOPT_HEADERFUNCTION.
> + Save winning response_data.
> + (debuginfod_end): free client winning headers.
> + * debuginfod.cxx (handle_buildid_f_match): remove X-DEBUGINFOD-FILE
> + path. Add X-DEBUGINFOD-FILE and X-DEBUGINFOD-SIZE headers.
> + (handle_buildid_r_match): remove X-DEBUGINFOD-FILE path. Add
> + X-DEBUGINFOD-FILE, X-DEBUGINFOD-SIZE
> + headers, and X-ARCHIVE headers.
> +
> 2021-07-26 Noah Sanci <nsanci@redhat.com>
>
> PR27982
> diff --git a/debuginfod/debuginfod-client.c b/debuginfod/debuginfod-client.c
> index d41723ce..9c16e7a3 100644
> --- a/debuginfod/debuginfod-client.c
> +++ b/debuginfod/debuginfod-client.c
> @@ -127,6 +127,7 @@ struct debuginfod_client
> timeout or other info gotten from environment variables, the
> handle data, etc. So those don't have to be reparsed and
> recreated on each request. */
> + char * winning_headers;
> };
>
> /* The cache_clean_interval_s file within the debuginfod cache specifies
> @@ -183,6 +184,8 @@ struct handle_data
> to the cache. Used to ensure that a file is not downloaded from
> multiple servers unnecessarily. */
> CURL **target_handle;
> + /* Response http headers for this client handle, sent from the server */
> + char *response_data;
> };
I think it will be convenient to also add a size_t response_data_size
field. See below.
> +static size_t
> +header_callback (char * buffer, size_t size, size_t numitems, void * userdata)
> +{
> + if (size != 1)
> + return 0;
> + /* Temporary buffer for realloc */
> + char *temp = NULL;
> + size_t userlen = 0;
> + if (*(char**)userdata == NULL)
> + {
> + temp = malloc(numitems+1);
> + if (temp == NULL)
> + return 0;
> + memset(temp, '\0', numitems+1);
> + }
> + else
> + {
> + userlen = strlen(*(char**)userdata);
> + temp = realloc(*(char**)userdata, userlen + numitems + 1);
> + if (temp == NULL)
> + return 0;
> + }
> + strncat(temp, buffer, numitems);
> + *(char**)userdata = temp;
> + return numitems;
> +}
I found https://curl.se/libcurl/c/CURLOPT_HEADERFUNCTION.html
Maybe you could add that as comment for future readers.
In the above userdata points to data[i].response_data. Which means the
length/size needs to recalculated each time. Also the data is added
with strncat which needs to walk the whole buffer again.
If the stuct handle_data had also a size field then most of the above
recalculations of the size are unnecessary and since we then know the
(old) end of response_data we can simply memcpy the new data to the
end (of course we need to make sure to add a zero terminator, but that
can be done with one byte wrote instead of doing a memset of the whole
buffer).
> /* Query each of the server URLs found in $DEBUGINFOD_URLS for the file
> with the specified build-id, type (debuginfo, executable or source)
> and filename. filename may be NULL. If found, return a file
> @@ -936,10 +966,13 @@ debuginfod_query_server (debuginfod_client *c,
> curl_easy_setopt (data[i].handle, CURLOPT_LOW_SPEED_LIMIT,
> 100 * 1024L);
> }
> + data[i].response_data = NULL;
> curl_easy_setopt(data[i].handle, CURLOPT_FILETIME, (long) 1);
> curl_easy_setopt(data[i].handle, CURLOPT_FOLLOWLOCATION, (long) 1);
> curl_easy_setopt(data[i].handle, CURLOPT_FAILONERROR, (long) 1);
> curl_easy_setopt(data[i].handle, CURLOPT_NOSIGNAL, (long) 1);
> + curl_easy_setopt(data[i].handle, CURLOPT_HEADERFUNCTION, header_callback);
> + curl_easy_setopt(data[i].handle, CURLOPT_HEADERDATA, (void *) &(data[i].response_data));
So if we wanted to do the suggestion of using a size field then it
should be initialized to zero here and the last call should be
&data[i].
> #if LIBCURL_VERSION_NUM >= 0x072a00 /* 7.42.0 */
> curl_easy_setopt(data[i].handle, CURLOPT_PATH_AS_IS, (long) 1);
> #else
> @@ -961,6 +994,7 @@ debuginfod_query_server (debuginfod_client *c,
> int committed_to = -1;
> bool verbose_reported = false;
> struct timespec start_time, cur_time;
> + c->winning_headers = NULL;
> if ( maxtime > 0 && clock_gettime(CLOCK_MONOTONIC_RAW, &start_time) == -1)
> {
> rc = errno;
> @@ -995,8 +1029,18 @@ debuginfod_query_server (debuginfod_client *c,
> if (data[i].handle != target_handle)
> curl_multi_remove_handle(curlm, data[i].handle);
> else
> - committed_to = i;
> - }
> + {
> + committed_to = i;
> + if (c->winning_headers == NULL)
The indenting here is off because of the mixing of spaces and tabs.
> +2021-08-04 Noah Sanci <nsanci@redhat.com>
> +
> + PR27277
> + * debuginfod-find.1: Increasing verbosity describes the downloaded
> + file.
> + * debuginfod.8: Describe X-DEBUGINFOD-FILE, X-DEBUGINFOD-SIZE, and
> + X-DEBUGINFOD-ARCHIVE.
Thanks for updating the docs. Looks good.
> diff --git a/tests/ChangeLog b/tests/ChangeLog
> index 1abe5456..23aeec4a 100644
> --- a/tests/ChangeLog
> +++ b/tests/ChangeLog
> @@ -106,11 +106,12 @@
> run-debuginfod-query-retry.sh,
> run-debuginfod-regex.sh,
> run-debuginfod-sizetime.sh,
> - run-debuginfod-tmp-home.sh, and
> - run-debuginfod-writable.sh
> - run-debuginfod-x-forwarded-for.sh
> - * tests/Makefile.am: Added the above new files to the test suite
> - * tests/test-subr.sh: Added some general functions for above tests
> + run-debuginfod-tmp-home.sh,
> + run-debuginfod-writable.sh, and
> + run-debuginfod-x-forwarded-for.sh.
> + All of the above functions now use a 'base' variable to avoid races
> + * Makefile.am: Added the above new files to the test suite
> + * test-subr.sh: Added some general functions for above tests
These changes seem unrelated to this patch.
> +2021-08-02 Noah Sanci <nsanci@redhat.com>
> +
> + PR27277
> + * run-debuginfod-response-headers.sh: Add a test to ensure that file descriptions
> + are accurate for files outside or within archives.
> + * Makefile.am: Add run-debuginfod-response-headers.sh to TESTS.
It also needs to be added to EXTRA_DIST.
> diff --git a/tests/run-debuginfod-fd-prefetch-caches.sh b/tests/run-debuginfod-fd-prefetch-caches.sh
> index 61fee9e9..bee88c1e 100755
> --- a/tests/run-debuginfod-fd-prefetch-caches.sh
> +++ b/tests/run-debuginfod-fd-prefetch-caches.sh
> @@ -51,9 +51,9 @@ grep 'prefetch fds ' vlog$PORT1 #$PREFETCH_FDS
> grep 'prefetch mbs ' vlog$PORT1 #$PREFETCH_MBS
> # search the vlog to find what metric counts should be and check the correct metrics
> # were incrimented
> -wait_ready $PORT1 'fdcache_op_count{op="enqueue"}' $( grep -c 'interned.*front=1' vlog$PORT1 )
> -wait_ready $PORT1 'fdcache_op_count{op="evict"}' $( grep -c 'evicted a=.*' vlog$PORT1 )
> -wait_ready $PORT1 'fdcache_op_count{op="prefetch_enqueue"}' $( grep -c 'interned.*front=0' vlog$PORT1 )
> +wait_ready $PORT1 'fdcache_op_count{op="enqueue"}' $( grep -c 'interned.*front=1' vlog$PORT1 || true)
> +wait_ready $PORT1 'fdcache_op_count{op="evict"}' $( grep -c 'evicted a=.*' vlog$PORT1 || true )
> +wait_ready $PORT1 'fdcache_op_count{op="prefetch_enqueue"}' $( grep -c 'interned.*front=0' vlog$PORT1 || true )
> wait_ready $PORT1 'fdcache_op_count{op="prefetch_evict"}' $( grep -c 'evicted from prefetch a=.*front=0' vlog$PORT1 || true )
>
> kill $PID1
This is an unrelated change.
> diff --git a/tests/run-debuginfod-response-headers.sh b/tests/run-debuginfod-response-headers.sh
> new file mode 100755
> index 00000000..a458ca1b
> --- /dev/null
> +++ b/tests/run-debuginfod-response-headers.sh
> @@ -0,0 +1,118 @@
> +#!/usr/bin/env bash
> +#
> +# Copyright (C) 2019-2021 Red Hat, Inc.
> +# This file is part of elfutils.
> +#
> +# This file is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# elfutils is distributed in the hope that it will be useful, but
> +# WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program. If not, see <http://www.gnu.org/licenses/>.
> +
> +. $srcdir/test-subr.sh # includes set -e
> +
> +# for test case debugging, uncomment:
> +set -x
> +
> +DB=${PWD}/.debuginfod_tmp.sqlite
> +tempfiles $DB
> +export DEBUGINFOD_CACHE_PATH=${PWD}/.client_cache
> +
> +# This variable is essential and ensures no time-race for claiming ports occurs
> +# set base to a unique multiple of 100 not used in any other 'run-debuginfod-*' test
> +base=9500
> +get_ports
> +mkdir F R
> +
> +env LD_LIBRARY_PATH=$ldpath DEBUGINFOD_URLS= ${abs_builddir}/../debuginfod/debuginfod $VERBOSE -F -R -d $DB -p $PORT1 -t0 -g0 -v R F > vlog$PORT1 2>&1 &
> +PID1=$!
> +tempfiles vlog$PORT1
> +errfiles vlog$PORT1
> +# Server must become ready
> +wait_ready $PORT1 'ready' 1
> +export DEBUGINFOD_URLS=http://127.0.0.1:$PORT1/ # or without trailing /
> +# Check thread comm names
> +ps -q $PID1 -e -L -o '%p %c %a' | grep groom
> +ps -q $PID1 -e -L -o '%p %c %a' | grep scan
> +ps -q $PID1 -e -L -o '%p %c %a' | grep traverse
> +
> +# We use -t0 and -g0 here to turn off time-based scanning & grooming.
> +# For testing purposes, we just sic SIGUSR1 / SIGUSR2 at the process.
> +
> +########################################################################
> +
> +# Compile a simple program, strip its debuginfo and save the build-id.
> +# Also move the debuginfo into another directory so that elfutils
> +# cannot find it without debuginfod.
> +echo "int main() { return 0; }" > ${PWD}/prog.c
> +tempfiles prog.c
> +# Create a subdirectory to confound source path names
> +mkdir foobar
> +gcc -Wl,--build-id -g -o prog ${PWD}/foobar///./../prog.c
> +testrun ${abs_top_builddir}/src/strip -g -f prog.debug ${PWD}/prog
> +BUILDID=`env LD_LIBRARY_PATH=$ldpath ${abs_builddir}/../src/readelf \
> + -a prog | grep 'Build ID' | cut -d ' ' -f 7`
> +
> +mv prog F
> +mv prog.debug F
> +kill -USR1 $PID1
> +# Wait till both files are in the index.
> +wait_ready $PORT1 'thread_work_total{role="traverse"}' 1
> +wait_ready $PORT1 'thread_work_pending{role="scan"}' 0
> +wait_ready $PORT1 'thread_busy{role="scan"}' 0
> +
> +cp -rvp ${abs_srcdir}/debuginfod-rpms R
> +if [ "$zstd" = "false" ]; then # nuke the zstd fedora 31 ones
> + rm -vrf R/debuginfod-rpms/fedora31
> +fi
> +
> +kill -USR1 $PID1
> +# Wait till both files are in the index and scan/index fully finished
> +wait_ready $PORT1 'thread_work_total{role="traverse"}' 2
> +wait_ready $PORT1 'thread_work_pending{role="scan"}' 0
> +wait_ready $PORT1 'thread_busy{role="scan"}' 0
> +# All rpms need to be in the index, except the dummy permission-000 one
> +rpms=$(find R -name \*rpm | grep -v nothing | wc -l)
> +wait_ready $PORT1 'scanned_files_total{source=".rpm archive"}' $rpms
> +kill -USR1 $PID1 # two hits of SIGUSR1 may be needed to resolve .debug->dwz->srefs
> +# Wait till both files are in the index and scan/index fully finished
> +wait_ready $PORT1 'thread_work_total{role="traverse"}' 3
> +wait_ready $PORT1 'thread_work_pending{role="scan"}' 0
> +wait_ready $PORT1 'thread_busy{role="scan"}' 0
Is it really necessary to add all this if this is just a test to check
the new headers are sent?
> +########################################################################
> +## PR27277
> +# Make a simple request to the debuginfod server and check debuginfod-find's vlog to see if
> +# the custom HTTP headers are received.
> +rm -rf $DEBUGINFOD_CACHE_PATH
> +env DEBUGINFOD_URLS="http://127.0.0.1:"$PORT1 LD_LIBRARY_PATH=$ldpath ${abs_top_builddir}/debuginfod/debuginfod-find\
> + -vvv executable F/prog > vlog-find$PORT1.1 2>&1
> +tempfiles vlog-find$PORT1.1
> +grep 'Content-Length: ' vlog-find$PORT1.1
> +grep 'Connection: ' vlog-find$PORT1.1
> +grep 'Cache-Control: ' vlog-find$PORT1.1
> +grep 'X-DEBUGINFOD-FILE: ' vlog-find$PORT1.1
> +grep 'X-DEBUGINFOD-SIZE: ' vlog-find$PORT1.1
> +
> +# Check to see if an executable file located in an archive prints the file's description and archive
> +env DEBUGINFOD_URLS="http://127.0.0.1:"$PORT1 LD_LIBRARY_PATH=$ldpath ${abs_top_builddir}/debuginfod/debuginfod-find\
> + -vvv executable c36708a78618d597dee15d0dc989f093ca5f9120 > vlog-find$PORT1.2 2>&1
> +tempfiles vlog-find$PORT1.2
> +grep 'Content-Length: ' vlog-find$PORT1.2
> +grep 'Connection: ' vlog-find$PORT1.2
> +grep 'Cache-Control: ' vlog-find$PORT1.2
> +grep 'X-DEBUGINFOD-FILE: ' vlog-find$PORT1.2
> +grep 'X-DEBUGINFOD-SIZE: ' vlog-find$PORT1.2
> +grep 'X-DEBUGINFOD-ARCHIVE: ' vlog-find$PORT1.2
> +
> +kill $PID1
> +wait $PID1
> +PID1=0
> +exit 0
> diff --git a/tests/test-subr.sh b/tests/test-subr.sh
> index 41e95e31..95dcbb26 100644
> --- a/tests/test-subr.sh
> +++ b/tests/test-subr.sh
> @@ -43,9 +43,9 @@ remove_files=
> exit_cleanup()
> {
> rm -rf ${PWD}/.client_cache*
> - rm -f $remove_files;
> + rm -f $remove_files;
> ls -a ${PWD}
> - cd ..;
> + cd ..;
> rmdir $test_dir
> }
> trap exit_cleanup 0
> @@ -228,7 +228,7 @@ cleanup()
> {
> if [ $PID1 -ne 0 ]; then kill $PID1; wait $PID1; fi
> if [ $PID2 -ne 0 ]; then kill $PID2; wait $PID2; fi
> - rm -rf ${PWD}/.client_cache F R D L Z ${PWD}/foobar ${PWD}/mocktree
> + rm -rf ${PWD}/.client_cache F R D L Z ${PWD}/foobar ${PWD}/mocktree
> exit_cleanup
> }
> trap cleanup 0 1 2 3 5 9 15
> @@ -316,12 +316,12 @@ archive_test() {
>
> get_ports() {
> while true; do
> - PORT1=`expr '(' $RANDOM % 1000 ')' + 9000`
> + PORT1=`expr '(' $RANDOM % 100 ')' + $base`
> ss -atn | fgrep ":$PORT1" || break
> done
> # Some tests will use two servers, so assign the second var
> while true; do
> - PORT2=`expr '(' $RANDOM % 1000 ')' + 9000`
> + PORT2=`expr '(' $RANDOM % 100 ')' + $base`
> ss -atn | fgrep ":$PORT2" && $PORT1 -ne $PORT2 || break
> done
These changes also seem unrelated.
Cheers,
Mark
next prev parent reply other threads:[~2021-09-12 19:08 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-08-04 18:54 Noah Sanci
2021-08-05 15:13 ` Mark Wielaard
2021-08-05 16:54 ` Frank Ch. Eigler
2021-08-06 10:04 ` Mark Wielaard
2021-08-06 18:54 ` Frank Ch. Eigler
2021-08-09 9:25 ` Mark Wielaard
2021-08-23 15:11 ` Noah Sanci
2021-08-24 8:18 ` Mark Wielaard
2021-08-27 18:38 ` Noah Sanci
2021-09-08 20:56 ` Mark Wielaard
2021-09-10 18:22 ` Noah Sanci
2021-09-12 19:08 ` Mark Wielaard [this message]
2021-09-13 20:07 ` Noah Sanci
2021-09-16 10:50 ` Mark Wielaard
2021-09-22 20:33 ` Frank Ch. Eigler
2021-09-29 14:55 ` Mark Wielaard
2021-09-29 21:28 ` Frank Ch. Eigler
2021-10-05 14:28 ` Mark Wielaard
2022-07-14 15:32 ` Noah Sanci
2022-08-04 13:12 ` Mark Wielaard
[not found] ` <CAJXA7qg09YkxK-NRQ31Hem0+54Us=jYC5+1siPSbHangx=SCow@mail.gmail.com>
2022-08-08 14:35 ` Mark Wielaard
2021-08-25 18:08 Noah Sanci
2021-09-08 15:01 ` Mark Wielaard
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YT5QMUw8KbKkKi5X@wildebeest.org \
--to=mark@klomp.org \
--cc=elfutils-devel@sourceware.org \
--cc=nsanci@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).