public inbox for elfutils@sourceware.org
 help / color / mirror / Atom feed
From: Mark Wielaard <mark@klomp.org>
To: Noah Sanci <nsanci@redhat.com>, elfutils-devel@sourceware.org
Subject: Re: [Bug debuginfod/27277] Describe retrieved files when verbose
Date: Wed, 08 Sep 2021 17:01:38 +0200	[thread overview]
Message-ID: <43ca882b5c80053fd53d9d9a5b5be6eb5fe8f30c.camel@klomp.org> (raw)
In-Reply-To: <CAJXA7qgt4oq+PpQsu4wCrruxvdE4GOp4eBzKWr0eYWj67cc24A@mail.gmail.com>

Hi Noah,

On Wed, 2021-08-25 at 14:08 -0400, Noah Sanci via Elfutils-devel wrote:
> There appear to exist use cases that intend to simply check for the
> existence of content in a debuginfod server, without actually
> downloading it.  In HTTP land, the HEAD operation is the natural
> expression of this.
> Instead of implementing a HEAD/describe option, allow users, with
> enough
> verbosity, to print the HTTP response headers upon retrieving a file.
> E.g output:
> 
> HTTP/1.1 200 OK
> Connection: Keep-Alive
> Content-Length: 2428240
> Cache-Control: public
> Last-Modified: Sat, 15 May 2021 20:49:51 GMT
> Content-Type: application/octet-stream
> Date: Tue, 03 Aug 2021 18:50:36 GMT
> 
> Note: the new test is mostly compatible with the nsanci/test-fix
> commit. When nsanci/test-fix merges with master I can resubmit the
> patch with the test properly rebased and adjusted for maximum
> efficiency.

Please do resubmit the test now that we have separate debuginfod tests.

> The HEAD functionality will (likely) be put into a new PR for
> existing checking.

Thanks. That is https://sourceware.org/bugzilla/show_bug.cgi?id=28284

> From 6351258d707337b69563d4be8effbb30fc42f784 Mon Sep 17 00:00:00
> 2001
> From: Noah Sanci <nsanci@redhat.com>
> Date: Wed, 28 Jul 2021 14:46:05 -0400
> Subject: [PATCH] debuginfod: PR27277 - Describe retrieved files when
> verbose
> 
> There appear to exist use cases that intend to simply check for the
> existence of content in a debuginfod server, without actually
> downloading it.  In HTTP land, the HEAD operation is the natural
> expression of this.
> Instead of implementing a HEAD/describe option,

Maybe drop this first part, it describes what is not implemented in
this patch.

>  allow users, with enough
> verbosity, to print the HTTP response headers upon retrieving a file.
> E.g output:
> 
> HTTP/1.1 200 OK
> Connection: Keep-Alive
> Content-Length: 2428240
> Cache-Control: public
> Last-Modified: Sat, 15 May 2021 20:49:51 GMT
> Content-Type: application/octet-stream
> Date: Tue, 03 Aug 2021 18:50:36 GMT
> 
> https://sourceware.org/bugzilla/show_bug.cgi?id=27277
> 
> Signed-off-by: Noah Sanci <nsanci@redhat.com>
> [...]
> diff --git a/debuginfod/ChangeLog b/debuginfod/ChangeLog
> index 530f7dc7..cb9e50c7 100644
> --- a/debuginfod/ChangeLog
> +++ b/debuginfod/ChangeLog
> @@ -4,6 +4,17 @@
>  	* debuginfod.cxx (handler_cb): Fix after_you unique_set key
>  	to the entire incoming URL.
>  
> +2021-08-02  Noah Sanci  <nsanci@redhat.com>
> +
> +	PR27277
> +	* debuginfod-client.c (struct debuginfod_client): New field 
> +	winning_headers.

It seems technically we don't need winning_headers at this point. But
it will be useful when you implement the other PR.

> +	(struct handle_data): New field response_data.
> +	(header_callback): Store received headers in response_data.
> +	(debuginfod_query_server): Activate CURLOPT_HEADERFUNCTION.
> +	Save winning response_data.
> +	(debuginfod_end): free client winning headers.
> +
> [...]
> +static size_t
> +header_callback (char * buffer, size_t size, size_t numitems, void * userdata)
> +{
> +  if (size != 1)
> +    return 0;
> +  /* Temporary buffer for realloc */
> +  char *temp = NULL;
> +  size_t userlen = 0;
> +  if (*(char**)userdata == NULL)
> +    {
> +      temp = malloc(numitems+1);
> +      if (temp == NULL)
> +        return 0;
> +      memset(temp, '\0', numitems+1);
> +    }
> +  else
> +    {
> +      userlen = strlen(*(char**)userdata);
> +      temp = realloc(*(char**)userdata, userlen + numitems + 1);
> +      if (temp == NULL)
> +       return 0;
> +    }
> +  strncat(temp, buffer, numitems);
> +  *(char**)userdata = temp;
> +  return numitems;
> +}

I am slightly confused about this function.
Is the idea that this is given (part of) the headers and allocates
memory for userdata to store the parts it wants/needs?

If so, do we need to filter the headers here for those that interest
us, or is that simply all headers?

Do we need to protect against using too much memory?
Currently we keep reallocing till we run out of memory, maybe set some
predetermined limit? (64K of headers should be enough for anybody...)

Why do you clear the temp memory the first time, but don't clear the
end of the realloc data? Is the clearing of the whole block necessary
given that the temp data is overwritten immediately with the buffer
contents?

>  /* Query each of the server URLs found in $DEBUGINFOD_URLS for the file
>     with the specified build-id, type (debuginfo, executable or source)
>     and filename. filename may be NULL. If found, return a file
> @@ -936,10 +966,13 @@ debuginfod_query_server (debuginfod_client *c,
>  	  curl_easy_setopt (data[i].handle, CURLOPT_LOW_SPEED_LIMIT,
>  			    100 * 1024L);
>  	}
> +      data[i].response_data = NULL;
>        curl_easy_setopt(data[i].handle, CURLOPT_FILETIME, (long) 1);
>        curl_easy_setopt(data[i].handle, CURLOPT_FOLLOWLOCATION, (long) 1);
>        curl_easy_setopt(data[i].handle, CURLOPT_FAILONERROR, (long) 1);
>        curl_easy_setopt(data[i].handle, CURLOPT_NOSIGNAL, (long) 1);
> +      curl_easy_setopt(data[i].handle, CURLOPT_HEADERFUNCTION, header_callback);
> +      curl_easy_setopt(data[i].handle, CURLOPT_HEADERDATA, (void *) &(data[i].response_data));

Aha, ok, so this is how the headers are gotten and finally assigned to
response_data..

> @@ -1161,10 +1205,10 @@ debuginfod_query_server (debuginfod_client *c,
>                  {
>                    char *effective_url = NULL;
>                    long resp_code = 500;
> -                  CURLcode ok1 = curl_easy_getinfo (target_handle,
> +                  CURLcode ok1 = curl_easy_getinfo (msg->easy_handle,
>  						    CURLINFO_EFFECTIVE_URL,
>  						    &effective_url);
> -                  CURLcode ok2 = curl_easy_getinfo (target_handle,
> +                  CURLcode ok2 = curl_easy_getinfo (msg->easy_handle,
>  						    CURLINFO_RESPONSE_CODE,
>  						    &resp_code);
>                    if(ok1 == CURLE_OK && ok2 == CURLE_OK && effective_url)

Why this change?

> +2021-08-04  Noah Sanci  <nsanci@redhat.com>
> +
> +	PR27277
> +	* debuginfod-find.1: Increasing verbosity describes the downloaded
> +	file.


Thanks for updating the docs.

I haven't reviewed the test case yet.

Cheers,

Mark

  reply	other threads:[~2021-09-08 15:01 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-25 18:08 Noah Sanci
2021-09-08 15:01 ` Mark Wielaard [this message]
  -- strict thread matches above, loose matches on Subject: below --
2021-08-04 18:54 Noah Sanci
2021-08-05 15:13 ` Mark Wielaard
2021-08-05 16:54   ` Frank Ch. Eigler
2021-08-06 10:04     ` Mark Wielaard
2021-08-06 18:54       ` Frank Ch. Eigler
2021-08-09  9:25         ` Mark Wielaard
2021-08-23 15:11           ` Noah Sanci
2021-08-24  8:18             ` Mark Wielaard
2021-08-27 18:38               ` Noah Sanci
2021-09-08 20:56                 ` Mark Wielaard
2021-09-10 18:22                   ` Noah Sanci
2021-09-12 19:08                     ` Mark Wielaard
2021-09-13 20:07                       ` Noah Sanci
2021-09-16 10:50                         ` Mark Wielaard
2021-09-22 20:33           ` Frank Ch. Eigler
2021-09-29 14:55             ` Mark Wielaard
2021-09-29 21:28               ` Frank Ch. Eigler
2021-10-05 14:28                 ` Mark Wielaard
2022-07-14 15:32                   ` Noah Sanci
2022-08-04 13:12                     ` Mark Wielaard
     [not found]                       ` <CAJXA7qg09YkxK-NRQ31Hem0+54Us=jYC5+1siPSbHangx=SCow@mail.gmail.com>
2022-08-08 14:35                         ` Mark Wielaard

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=43ca882b5c80053fd53d9d9a5b5be6eb5fe8f30c.camel@klomp.org \
    --to=mark@klomp.org \
    --cc=elfutils-devel@sourceware.org \
    --cc=nsanci@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).