public inbox for elfutils@sourceware.org
 help / color / mirror / Atom feed
From: Mark Wielaard <mark@klomp.org>
To: Noah Sanci <nsanci@redhat.com>
Cc: elfutils-devel@sourceware.org
Subject: Re: [Bug debuginfod/27277] Describe retrieved files when verbose
Date: Sun, 12 Sep 2021 21:08:33 +0200	[thread overview]
Message-ID: <YT5QMUw8KbKkKi5X@wildebeest.org> (raw)
In-Reply-To: <CAJXA7qgA+ov2qBqqUX5MgeQM7aRX2OMGEDp2N+wssa6teDeuGg@mail.gmail.com>

Hi Noah,

On Fri, Sep 10, 2021 at 02:22:00PM -0400, Noah Sanci via Elfutils-devel wrote:
> From 979f19eb4fd7a35ace4ddafed103922559b93120 Mon Sep 17 00:00:00 2001
> From: Noah Sanci <nsanci@redhat.com>
> Date: Wed, 28 Jul 2021 14:46:05 -0400
> Subject: [PATCH 2/2] debuginfod: PR27277 - Describe retrieved files when
>  verbose
> 
> Allow users, with enough verbosity, to print the HTTP response headers
> upon retrieving a file. These files may include several custome http
> response headers such as X-DEBUGINFOD-FILE, X-DEBUGINFOD-SIZE, and
> X-DEBUGINFOD-ARCHIVE. These headers are added from the daemon, in
> debuginfod.cxx.

> run-debuginfod-fd-prefetch-caches.sh was updated so that it doesn't
> trip and fail as previously greping for a value that should yield zero
> caused an error.

I think this part should be in this patch.

> Previously, target_handle was used to gather CURLE_OK reponses. Under
> some conditions, target_handle was NULL when we wanted it to point to
> the handle. This could cause some failuers. instead msg->easy_handle
> is used, which ensures the correct handle is used.

Thanks for including this explanation. What were the "some conditions"?

> +2021-08-02  Noah Sanci  <nsanci@redhat.com>
> +
> +	PR27277
> +	* debuginfod-client.c (struct debuginfod_client): New field
> +	winning_headers.
> +	(struct handle_data): New field response_data.
> +	(header_callback): Store received headers in response_data.
> +	(debuginfod_query_server): Activate CURLOPT_HEADERFUNCTION.
> +	Save winning response_data.
> +	(debuginfod_end): free client winning headers.
> +	* debuginfod.cxx (handle_buildid_f_match): remove X-DEBUGINFOD-FILE
> +	path. Add X-DEBUGINFOD-FILE and X-DEBUGINFOD-SIZE headers.
> +	(handle_buildid_r_match): remove X-DEBUGINFOD-FILE path. Add
> +	X-DEBUGINFOD-FILE, X-DEBUGINFOD-SIZE
> +	headers, and X-ARCHIVE headers.
> +
>  2021-07-26  Noah Sanci  <nsanci@redhat.com>
>  
>  	PR27982
> diff --git a/debuginfod/debuginfod-client.c b/debuginfod/debuginfod-client.c
> index d41723ce..9c16e7a3 100644
> --- a/debuginfod/debuginfod-client.c
> +++ b/debuginfod/debuginfod-client.c
> @@ -127,6 +127,7 @@ struct debuginfod_client
>       timeout or other info gotten from environment variables, the
>       handle data, etc. So those don't have to be reparsed and
>       recreated on each request.  */
> +  char * winning_headers;
>  };
>  
>  /* The cache_clean_interval_s file within the debuginfod cache specifies
> @@ -183,6 +184,8 @@ struct handle_data
>       to the cache. Used to ensure that a file is not downloaded from
>       multiple servers unnecessarily.  */
>    CURL **target_handle;
> +  /* Response http headers for this client handle, sent from the server */
> +  char *response_data;
>  };

I think it will be convenient to also add a size_t response_data_size
field.  See below.

> +static size_t
> +header_callback (char * buffer, size_t size, size_t numitems, void * userdata)
> +{
> +  if (size != 1)
> +    return 0;
> +  /* Temporary buffer for realloc */
> +  char *temp = NULL;
> +  size_t userlen = 0;
> +  if (*(char**)userdata == NULL)
> +    {
> +      temp = malloc(numitems+1);
> +      if (temp == NULL)
> +        return 0;
> +      memset(temp, '\0', numitems+1);
> +    }
> +  else
> +    {
> +      userlen = strlen(*(char**)userdata);
> +      temp = realloc(*(char**)userdata, userlen + numitems + 1);
> +      if (temp == NULL)
> +       return 0;
> +    }
> +  strncat(temp, buffer, numitems);
> +  *(char**)userdata = temp;
> +  return numitems;
> +}

I found https://curl.se/libcurl/c/CURLOPT_HEADERFUNCTION.html
Maybe you could add that as comment for future readers.

In the above userdata points to data[i].response_data. Which means the
length/size needs to recalculated each time. Also the data is added
with strncat which needs to walk the whole buffer again.

If the stuct handle_data had also a size field then most of the above
recalculations of the size are unnecessary and since we then know the
(old) end of response_data we can simply memcpy the new data to the
end (of course we need to make sure to add a zero terminator, but that
can be done with one byte wrote instead of doing a memset of the whole
buffer).

>  /* Query each of the server URLs found in $DEBUGINFOD_URLS for the file
>     with the specified build-id, type (debuginfo, executable or source)
>     and filename. filename may be NULL. If found, return a file
> @@ -936,10 +966,13 @@ debuginfod_query_server (debuginfod_client *c,
>  	  curl_easy_setopt (data[i].handle, CURLOPT_LOW_SPEED_LIMIT,
>  			    100 * 1024L);
>  	}
> +      data[i].response_data = NULL;
>        curl_easy_setopt(data[i].handle, CURLOPT_FILETIME, (long) 1);
>        curl_easy_setopt(data[i].handle, CURLOPT_FOLLOWLOCATION, (long) 1);
>        curl_easy_setopt(data[i].handle, CURLOPT_FAILONERROR, (long) 1);
>        curl_easy_setopt(data[i].handle, CURLOPT_NOSIGNAL, (long) 1);
> +      curl_easy_setopt(data[i].handle, CURLOPT_HEADERFUNCTION, header_callback);
> +      curl_easy_setopt(data[i].handle, CURLOPT_HEADERDATA, (void *) &(data[i].response_data));

So if we wanted to do the suggestion of using a size field then it
should be initialized to zero here and the last call should be
&data[i].

>  #if LIBCURL_VERSION_NUM >= 0x072a00 /* 7.42.0 */
>        curl_easy_setopt(data[i].handle, CURLOPT_PATH_AS_IS, (long) 1);
>  #else
> @@ -961,6 +994,7 @@ debuginfod_query_server (debuginfod_client *c,
>    int committed_to = -1;
>    bool verbose_reported = false;
>    struct timespec start_time, cur_time;
> +  c->winning_headers = NULL;
>    if ( maxtime > 0 && clock_gettime(CLOCK_MONOTONIC_RAW, &start_time) == -1)
>      {
>        rc = errno;
> @@ -995,8 +1029,18 @@ debuginfod_query_server (debuginfod_client *c,
>  	    if (data[i].handle != target_handle)
>  	      curl_multi_remove_handle(curlm, data[i].handle);
>  	    else
> -	      committed_to = i;
> -	}
> +              {
> +	        committed_to = i;
> +                if (c->winning_headers == NULL)

The indenting here is off because of the mixing of spaces and tabs.

> +2021-08-04  Noah Sanci  <nsanci@redhat.com>
> +
> +	PR27277
> +	* debuginfod-find.1: Increasing verbosity describes the downloaded
> +	file.
> +	* debuginfod.8: Describe X-DEBUGINFOD-FILE, X-DEBUGINFOD-SIZE, and
> +	X-DEBUGINFOD-ARCHIVE.

Thanks for updating the docs. Looks good.

> diff --git a/tests/ChangeLog b/tests/ChangeLog
> index 1abe5456..23aeec4a 100644
> --- a/tests/ChangeLog
> +++ b/tests/ChangeLog
> @@ -106,11 +106,12 @@
>  	run-debuginfod-query-retry.sh,
>  	run-debuginfod-regex.sh,
>  	run-debuginfod-sizetime.sh,
> -	run-debuginfod-tmp-home.sh, and
> -	run-debuginfod-writable.sh
> -	run-debuginfod-x-forwarded-for.sh
> -	* tests/Makefile.am: Added the above new files to the test suite
> -	* tests/test-subr.sh: Added some general functions for above tests
> +	run-debuginfod-tmp-home.sh,
> +	run-debuginfod-writable.sh, and
> +	run-debuginfod-x-forwarded-for.sh.
> +	All of the above functions now use a 'base' variable to avoid races
> +	* Makefile.am: Added the above new files to the test suite
> +	* test-subr.sh: Added some general functions for above tests

These changes seem unrelated to this patch.

> +2021-08-02  Noah Sanci  <nsanci@redhat.com>
> +
> +	PR27277
> +	* run-debuginfod-response-headers.sh: Add a test to ensure that file descriptions
> +	are accurate for files outside or within archives.
> +	* Makefile.am: Add run-debuginfod-response-headers.sh to TESTS.

It also needs to be added to EXTRA_DIST.

> diff --git a/tests/run-debuginfod-fd-prefetch-caches.sh b/tests/run-debuginfod-fd-prefetch-caches.sh
> index 61fee9e9..bee88c1e 100755
> --- a/tests/run-debuginfod-fd-prefetch-caches.sh
> +++ b/tests/run-debuginfod-fd-prefetch-caches.sh
> @@ -51,9 +51,9 @@ grep 'prefetch fds ' vlog$PORT1 #$PREFETCH_FDS
>  grep 'prefetch mbs ' vlog$PORT1 #$PREFETCH_MBS
>  # search the vlog to find what metric counts should be and check the correct metrics
>  # were incrimented
> -wait_ready $PORT1 'fdcache_op_count{op="enqueue"}' $( grep -c 'interned.*front=1' vlog$PORT1 )
> -wait_ready $PORT1 'fdcache_op_count{op="evict"}' $( grep -c 'evicted a=.*' vlog$PORT1 )
> -wait_ready $PORT1 'fdcache_op_count{op="prefetch_enqueue"}' $( grep -c 'interned.*front=0' vlog$PORT1 )
> +wait_ready $PORT1 'fdcache_op_count{op="enqueue"}' $( grep -c 'interned.*front=1' vlog$PORT1 || true)
> +wait_ready $PORT1 'fdcache_op_count{op="evict"}' $( grep -c 'evicted a=.*' vlog$PORT1 || true )
> +wait_ready $PORT1 'fdcache_op_count{op="prefetch_enqueue"}' $( grep -c 'interned.*front=0' vlog$PORT1 || true )
>  wait_ready $PORT1 'fdcache_op_count{op="prefetch_evict"}' $( grep -c 'evicted from prefetch a=.*front=0' vlog$PORT1 || true )
>  
>  kill $PID1

This is an unrelated change.

> diff --git a/tests/run-debuginfod-response-headers.sh b/tests/run-debuginfod-response-headers.sh
> new file mode 100755
> index 00000000..a458ca1b
> --- /dev/null
> +++ b/tests/run-debuginfod-response-headers.sh
> @@ -0,0 +1,118 @@
> +#!/usr/bin/env bash
> +#
> +# Copyright (C) 2019-2021 Red Hat, Inc.
> +# This file is part of elfutils.
> +#
> +# This file is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# elfutils is distributed in the hope that it will be useful, but
> +# WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +
> +. $srcdir/test-subr.sh  # includes set -e
> +
> +# for test case debugging, uncomment:
> +set -x
> +
> +DB=${PWD}/.debuginfod_tmp.sqlite
> +tempfiles $DB
> +export DEBUGINFOD_CACHE_PATH=${PWD}/.client_cache
> +
> +# This variable is essential and ensures no time-race for claiming ports occurs
> +# set base to a unique multiple of 100 not used in any other 'run-debuginfod-*' test
> +base=9500
> +get_ports
> +mkdir F R
> +
> +env LD_LIBRARY_PATH=$ldpath DEBUGINFOD_URLS= ${abs_builddir}/../debuginfod/debuginfod $VERBOSE -F -R -d $DB -p $PORT1 -t0 -g0 -v R F > vlog$PORT1 2>&1 &
> +PID1=$!
> +tempfiles vlog$PORT1
> +errfiles vlog$PORT1
> +# Server must become ready
> +wait_ready $PORT1 'ready' 1
> +export DEBUGINFOD_URLS=http://127.0.0.1:$PORT1/   # or without trailing /
> +# Check thread comm names
> +ps -q $PID1 -e -L -o '%p %c %a' | grep groom
> +ps -q $PID1 -e -L -o '%p %c %a' | grep scan
> +ps -q $PID1 -e -L -o '%p %c %a' | grep traverse
> +
> +# We use -t0 and -g0 here to turn off time-based scanning & grooming.
> +# For testing purposes, we just sic SIGUSR1 / SIGUSR2 at the process.
> +
> +########################################################################
> +
> +# Compile a simple program, strip its debuginfo and save the build-id.
> +# Also move the debuginfo into another directory so that elfutils
> +# cannot find it without debuginfod.
> +echo "int main() { return 0; }" > ${PWD}/prog.c
> +tempfiles prog.c
> +# Create a subdirectory to confound source path names
> +mkdir foobar
> +gcc -Wl,--build-id -g -o prog ${PWD}/foobar///./../prog.c
> +testrun ${abs_top_builddir}/src/strip -g -f prog.debug ${PWD}/prog
> +BUILDID=`env LD_LIBRARY_PATH=$ldpath ${abs_builddir}/../src/readelf \
> +          -a prog | grep 'Build ID' | cut -d ' ' -f 7`
> +
> +mv prog F
> +mv prog.debug F
> +kill -USR1 $PID1
> +# Wait till both files are in the index.
> +wait_ready $PORT1 'thread_work_total{role="traverse"}' 1
> +wait_ready $PORT1 'thread_work_pending{role="scan"}' 0
> +wait_ready $PORT1 'thread_busy{role="scan"}' 0
> +
> +cp -rvp ${abs_srcdir}/debuginfod-rpms R
> +if [ "$zstd" = "false" ]; then  # nuke the zstd fedora 31 ones
> +    rm -vrf R/debuginfod-rpms/fedora31
> +fi
> +
> +kill -USR1 $PID1
> +# Wait till both files are in the index and scan/index fully finished
> +wait_ready $PORT1 'thread_work_total{role="traverse"}' 2
> +wait_ready $PORT1 'thread_work_pending{role="scan"}' 0
> +wait_ready $PORT1 'thread_busy{role="scan"}' 0
> +# All rpms need to be in the index, except the dummy permission-000 one
> +rpms=$(find R -name \*rpm | grep -v nothing | wc -l)
> +wait_ready $PORT1 'scanned_files_total{source=".rpm archive"}' $rpms
> +kill -USR1 $PID1  # two hits of SIGUSR1 may be needed to resolve .debug->dwz->srefs
> +# Wait till both files are in the index and scan/index fully finished
> +wait_ready $PORT1 'thread_work_total{role="traverse"}' 3
> +wait_ready $PORT1 'thread_work_pending{role="scan"}' 0
> +wait_ready $PORT1 'thread_busy{role="scan"}' 0

Is it really necessary to add all this if this is just a test to check
the new headers are sent?

> +########################################################################
> +## PR27277
> +# Make a simple request to the debuginfod server and check debuginfod-find's vlog to see if
> +# the custom HTTP headers are received.
> +rm -rf $DEBUGINFOD_CACHE_PATH
> +env DEBUGINFOD_URLS="http://127.0.0.1:"$PORT1 LD_LIBRARY_PATH=$ldpath ${abs_top_builddir}/debuginfod/debuginfod-find\
> +    -vvv executable F/prog > vlog-find$PORT1.1 2>&1
> +tempfiles vlog-find$PORT1.1
> +grep 'Content-Length: ' vlog-find$PORT1.1
> +grep 'Connection: ' vlog-find$PORT1.1
> +grep 'Cache-Control: ' vlog-find$PORT1.1
> +grep 'X-DEBUGINFOD-FILE: ' vlog-find$PORT1.1
> +grep 'X-DEBUGINFOD-SIZE: ' vlog-find$PORT1.1
> +
> +# Check to see if an executable file located in an archive prints the file's description and archive
> +env DEBUGINFOD_URLS="http://127.0.0.1:"$PORT1 LD_LIBRARY_PATH=$ldpath ${abs_top_builddir}/debuginfod/debuginfod-find\
> +    -vvv executable c36708a78618d597dee15d0dc989f093ca5f9120 > vlog-find$PORT1.2 2>&1
> +tempfiles vlog-find$PORT1.2
> +grep 'Content-Length: ' vlog-find$PORT1.2
> +grep 'Connection: ' vlog-find$PORT1.2
> +grep 'Cache-Control: ' vlog-find$PORT1.2
> +grep 'X-DEBUGINFOD-FILE: ' vlog-find$PORT1.2
> +grep 'X-DEBUGINFOD-SIZE: ' vlog-find$PORT1.2
> +grep 'X-DEBUGINFOD-ARCHIVE: ' vlog-find$PORT1.2
> +
> +kill $PID1
> +wait $PID1
> +PID1=0
> +exit 0


> diff --git a/tests/test-subr.sh b/tests/test-subr.sh
> index 41e95e31..95dcbb26 100644
> --- a/tests/test-subr.sh
> +++ b/tests/test-subr.sh
> @@ -43,9 +43,9 @@ remove_files=
>  exit_cleanup()
>  {
>    rm -rf ${PWD}/.client_cache*
> -  rm -f $remove_files; 
> +  rm -f $remove_files;
>    ls -a ${PWD}
> -  cd ..; 
> +  cd ..;
>    rmdir $test_dir
>  }
>  trap exit_cleanup 0
> @@ -228,7 +228,7 @@ cleanup()
>  {
>    if [ $PID1 -ne 0 ]; then kill $PID1; wait $PID1; fi
>    if [ $PID2 -ne 0 ]; then kill $PID2; wait $PID2; fi
> -  rm -rf ${PWD}/.client_cache F R D L Z ${PWD}/foobar ${PWD}/mocktree 
> +  rm -rf ${PWD}/.client_cache F R D L Z ${PWD}/foobar ${PWD}/mocktree
>    exit_cleanup
>  }
>  trap cleanup 0 1 2 3 5 9 15
> @@ -316,12 +316,12 @@ archive_test() {
>  
>  get_ports() {
>    while true; do
> -    PORT1=`expr '(' $RANDOM % 1000 ')' + 9000`
> +    PORT1=`expr '(' $RANDOM % 100 ')' + $base`
>      ss -atn | fgrep ":$PORT1" || break
>    done
>  # Some tests will use two servers, so assign the second var
>    while true; do
> -    PORT2=`expr '(' $RANDOM % 1000 ')' + 9000`
> +    PORT2=`expr '(' $RANDOM % 100 ')' + $base`
>      ss -atn | fgrep ":$PORT2" && $PORT1 -ne $PORT2 || break
>    done

These changes also seem unrelated.

Cheers,

Mark


  reply	other threads:[~2021-09-12 19:08 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-04 18:54 Noah Sanci
2021-08-05 15:13 ` Mark Wielaard
2021-08-05 16:54   ` Frank Ch. Eigler
2021-08-06 10:04     ` Mark Wielaard
2021-08-06 18:54       ` Frank Ch. Eigler
2021-08-09  9:25         ` Mark Wielaard
2021-08-23 15:11           ` Noah Sanci
2021-08-24  8:18             ` Mark Wielaard
2021-08-27 18:38               ` Noah Sanci
2021-09-08 20:56                 ` Mark Wielaard
2021-09-10 18:22                   ` Noah Sanci
2021-09-12 19:08                     ` Mark Wielaard [this message]
2021-09-13 20:07                       ` Noah Sanci
2021-09-16 10:50                         ` Mark Wielaard
2021-09-22 20:33           ` Frank Ch. Eigler
2021-09-29 14:55             ` Mark Wielaard
2021-09-29 21:28               ` Frank Ch. Eigler
2021-10-05 14:28                 ` Mark Wielaard
2022-07-14 15:32                   ` Noah Sanci
2022-08-04 13:12                     ` Mark Wielaard
     [not found]                       ` <CAJXA7qg09YkxK-NRQ31Hem0+54Us=jYC5+1siPSbHangx=SCow@mail.gmail.com>
2022-08-08 14:35                         ` Mark Wielaard
2021-08-25 18:08 Noah Sanci
2021-09-08 15:01 ` Mark Wielaard

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YT5QMUw8KbKkKi5X@wildebeest.org \
    --to=mark@klomp.org \
    --cc=elfutils-devel@sourceware.org \
    --cc=nsanci@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).