From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from gnu.wildebeest.org (wildebeest.demon.nl [212.238.236.112]) by sourceware.org (Postfix) with ESMTPS id 0B146385842C for ; Wed, 8 Sep 2021 15:01:42 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 0B146385842C Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=klomp.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=klomp.org Received: from tarox.wildebeest.org (83-87-18-245.cable.dynamic.v4.ziggo.nl [83.87.18.245]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by gnu.wildebeest.org (Postfix) with ESMTPSA id AC3DC300BC88; Wed, 8 Sep 2021 17:01:40 +0200 (CEST) Received: by tarox.wildebeest.org (Postfix, from userid 1000) id 27B5D4D40ADD; Wed, 8 Sep 2021 17:01:39 +0200 (CEST) Message-ID: <43ca882b5c80053fd53d9d9a5b5be6eb5fe8f30c.camel@klomp.org> Subject: Re: [Bug debuginfod/27277] Describe retrieved files when verbose From: Mark Wielaard To: Noah Sanci , elfutils-devel@sourceware.org Date: Wed, 08 Sep 2021 17:01:38 +0200 In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Mailer: Evolution 3.28.5 (3.28.5-10.el7) Mime-Version: 1.0 X-Spam-Status: No, score=-10.1 required=5.0 tests=BAYES_00, GIT_PATCH_0, JMQ_SPF_NEUTRAL, KAM_DMARC_STATUS, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: elfutils-devel@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Elfutils-devel mailing list List-Unsubscribe: , List-Archive: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 08 Sep 2021 15:01:44 -0000 Hi Noah, On Wed, 2021-08-25 at 14:08 -0400, Noah Sanci via Elfutils-devel wrote: > There appear to exist use cases that intend to simply check for the > existence of content in a debuginfod server, without actually > downloading it. In HTTP land, the HEAD operation is the natural > expression of this. > Instead of implementing a HEAD/describe option, allow users, with > enough > verbosity, to print the HTTP response headers upon retrieving a file. > E.g output: >=20 > HTTP/1.1 200 OK > Connection: Keep-Alive > Content-Length: 2428240 > Cache-Control: public > Last-Modified: Sat, 15 May 2021 20:49:51 GMT > Content-Type: application/octet-stream > Date: Tue, 03 Aug 2021 18:50:36 GMT >=20 > Note: the new test is mostly compatible with the nsanci/test-fix > commit. When nsanci/test-fix merges with master I can resubmit the > patch with the test properly rebased and adjusted for maximum > efficiency. Please do resubmit the test now that we have separate debuginfod tests. > The HEAD functionality will (likely) be put into a new PR for > existing checking. Thanks. That is https://sourceware.org/bugzilla/show_bug.cgi?id=3D28284 > From 6351258d707337b69563d4be8effbb30fc42f784 Mon Sep 17 00:00:00 > 2001 > From: Noah Sanci > Date: Wed, 28 Jul 2021 14:46:05 -0400 > Subject: [PATCH] debuginfod: PR27277 - Describe retrieved files when > verbose >=20 > There appear to exist use cases that intend to simply check for the > existence of content in a debuginfod server, without actually > downloading it. In HTTP land, the HEAD operation is the natural > expression of this. > Instead of implementing a HEAD/describe option, Maybe drop this first part, it describes what is not implemented in this patch. > allow users, with enough > verbosity, to print the HTTP response headers upon retrieving a file. > E.g output: >=20 > HTTP/1.1 200 OK > Connection: Keep-Alive > Content-Length: 2428240 > Cache-Control: public > Last-Modified: Sat, 15 May 2021 20:49:51 GMT > Content-Type: application/octet-stream > Date: Tue, 03 Aug 2021 18:50:36 GMT >=20 > https://sourceware.org/bugzilla/show_bug.cgi?id=3D27277 >=20 > Signed-off-by: Noah Sanci > [...] > diff --git a/debuginfod/ChangeLog b/debuginfod/ChangeLog > index 530f7dc7..cb9e50c7 100644 > --- a/debuginfod/ChangeLog > +++ b/debuginfod/ChangeLog > @@ -4,6 +4,17 @@ > * debuginfod.cxx (handler_cb): Fix after_you unique_set key > to the entire incoming URL. > =20 > +2021-08-02 Noah Sanci > + > + PR27277 > + * debuginfod-client.c (struct debuginfod_client): New field=20 > + winning_headers. It seems technically we don't need winning_headers at this point. But it will be useful when you implement the other PR. > + (struct handle_data): New field response_data. > + (header_callback): Store received headers in response_data. > + (debuginfod_query_server): Activate CURLOPT_HEADERFUNCTION. > + Save winning response_data. > + (debuginfod_end): free client winning headers. > + > [...] > +static size_t > +header_callback (char * buffer, size_t size, size_t numitems, void * use= rdata) > +{ > + if (size !=3D 1) > + return 0; > + /* Temporary buffer for realloc */ > + char *temp =3D NULL; > + size_t userlen =3D 0; > + if (*(char**)userdata =3D=3D NULL) > + { > + temp =3D malloc(numitems+1); > + if (temp =3D=3D NULL) > + return 0; > + memset(temp, '\0', numitems+1); > + } > + else > + { > + userlen =3D strlen(*(char**)userdata); > + temp =3D realloc(*(char**)userdata, userlen + numitems + 1); > + if (temp =3D=3D NULL) > + return 0; > + } > + strncat(temp, buffer, numitems); > + *(char**)userdata =3D temp; > + return numitems; > +} I am slightly confused about this function. Is the idea that this is given (part of) the headers and allocates memory for userdata to store the parts it wants/needs? If so, do we need to filter the headers here for those that interest us, or is that simply all headers? Do we need to protect against using too much memory? Currently we keep reallocing till we run out of memory, maybe set some predetermined limit? (64K of headers should be enough for anybody...) Why do you clear the temp memory the first time, but don't clear the end of the realloc data? Is the clearing of the whole block necessary given that the temp data is overwritten immediately with the buffer contents? > /* Query each of the server URLs found in $DEBUGINFOD_URLS for the file > with the specified build-id, type (debuginfo, executable or source) > and filename. filename may be NULL. If found, return a file > @@ -936,10 +966,13 @@ debuginfod_query_server (debuginfod_client *c, > curl_easy_setopt (data[i].handle, CURLOPT_LOW_SPEED_LIMIT, > 100 * 1024L); > } > + data[i].response_data =3D NULL; > curl_easy_setopt(data[i].handle, CURLOPT_FILETIME, (long) 1); > curl_easy_setopt(data[i].handle, CURLOPT_FOLLOWLOCATION, (long) 1)= ; > curl_easy_setopt(data[i].handle, CURLOPT_FAILONERROR, (long) 1); > curl_easy_setopt(data[i].handle, CURLOPT_NOSIGNAL, (long) 1); > + curl_easy_setopt(data[i].handle, CURLOPT_HEADERFUNCTION, header_ca= llback); > + curl_easy_setopt(data[i].handle, CURLOPT_HEADERDATA, (void *) &(da= ta[i].response_data)); Aha, ok, so this is how the headers are gotten and finally assigned to response_data.. > @@ -1161,10 +1205,10 @@ debuginfod_query_server (debuginfod_client *c, > { > char *effective_url =3D NULL; > long resp_code =3D 500; > - CURLcode ok1 =3D curl_easy_getinfo (target_handle, > + CURLcode ok1 =3D curl_easy_getinfo (msg->easy_handle, > CURLINFO_EFFECTIVE_URL, > &effective_url); > - CURLcode ok2 =3D curl_easy_getinfo (target_handle, > + CURLcode ok2 =3D curl_easy_getinfo (msg->easy_handle, > CURLINFO_RESPONSE_CODE, > &resp_code); > if(ok1 =3D=3D CURLE_OK && ok2 =3D=3D CURLE_OK && effec= tive_url) Why this change? > +2021-08-04 Noah Sanci > + > + PR27277 > + * debuginfod-find.1: Increasing verbosity describes the downloaded > + file. Thanks for updating the docs. I haven't reviewed the test case yet. Cheers, Mark