From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from gnu.wildebeest.org (wildebeest.demon.nl [212.238.236.112]) by sourceware.org (Postfix) with ESMTPS id E60673858D34 for ; Sun, 12 Sep 2021 17:24:48 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org E60673858D34 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=klomp.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=klomp.org Received: from reform (deer0x0b.wildebeest.org [172.31.17.141]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by gnu.wildebeest.org (Postfix) with ESMTPSA id B669130006CF; Sun, 12 Sep 2021 19:24:46 +0200 (CEST) Received: by reform (Postfix, from userid 1000) id 3F8F82E82172; Sun, 12 Sep 2021 19:24:46 +0200 (CEST) Date: Sun, 12 Sep 2021 19:24:46 +0200 From: Mark Wielaard To: Noah Sanci Cc: elfutils-devel@sourceware.org Subject: Re: [Bug debuginfod/28034] client-side %-escape url characters Message-ID: References: <20210826210213.GM416@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Spam-Status: No, score=-9.9 required=5.0 tests=BAYES_00, GIT_PATCH_0, JMQ_SPF_NEUTRAL, KAM_BADIPHTTP, KAM_DMARC_STATUS, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: elfutils-devel@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Elfutils-devel mailing list List-Unsubscribe: , List-Archive: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 12 Sep 2021 17:24:50 -0000 Hi Noah, On Thu, Sep 09, 2021 at 01:28:21PM -0400, Noah Sanci via Elfutils-devel wrote: > The attached patch %-escapes debuginfod url characters, then unescapes only > '/' characters. Previously characters such as '+' were not escaped and caused > improper escaping further on in handler_cb. > https://sourceware.org/bugzilla/show_bug.cgi?id=28034. > > On Wed, Sep 8, 2021 at 9:38 AM Mark Wielaard wrote: > > /* Initialize each handle. */ > > for (int i = 0; i < num_urls; i++) > > > > So you only need to escape once. You of course then need to make sure > > the escaped_string is freed after the loop. > Added > > > We already check that the first char is a '/'. It seems silly to curl > > escape that one and then unescape it again. So maybe curl_easy_escape > > (data[i].handle, filename + 1, 0) and then change the snprintf pattern > > to "%s/%s/%s/%s"? > > ^ the slash got readded here. > Added > > > The strlen inside the while loop can also be done outside and then > > calculated instead of running strlen on the tail every time. > Added Thanks. I do have one more performance nitpick (sorry). > + char *escaped_string; > + if (filename) > + escaped_string = curl_easy_escape(&target_handle, filename+1, 0); The escaped_string is created outside the loop and reused each time (good). But... > + > /* Initialize each handle. */ > for (int i = 0; i < num_urls; i++) > { > @@ -904,16 +908,23 @@ debuginfod_query_server (debuginfod_client *c, > if (filename) /* must start with / */ > { > /* PR28034 escape characters in completed url to %hh format. */ > - char *escaped_string; > - escaped_string = curl_easy_escape(data[i].handle, filename, 0); > + char *loc = escaped_string; > if (!escaped_string) > { > rc = -ENOMEM; > goto out2; > } This check, and... > + > + size_t escaped_strlen = strlen(escaped_string); > + while ((loc = strstr(loc, "%2F"))) > + { > + loc[0] = '/'; > + // pull the string back after replacement > + memmove(loc+1, loc+3,escaped_strlen - (loc - escaped_string + 2) ); > + escaped_strlen -=2; > + } the manipulation of the escaped_string, could both also be done outside the loop, since they always do the same thing. > snprintf(data[i].url, PATH_MAX, "%s/%s/%s/%s", server_url, > build_id_bytes, type, escaped_string); > - curl_free(escaped_string); > } > else > snprintf(data[i].url, PATH_MAX, "%s/%s/%s", server_url, build_id_bytes, type); > @@ -953,6 +964,7 @@ debuginfod_query_server (debuginfod_client *c, > curl_multi_add_handle(curlm, data[i].handle); > } > > + if (filename) curl_free(escaped_string); And released after the loop. Good. > /* Query servers in parallel. */ > if (vfd >= 0) > dprintf (vfd, "query %d urls in parallel\n", num_urls); > diff --git a/tests/ChangeLog b/tests/ChangeLog > index 85dca442..b84f420c 100644 > --- a/tests/ChangeLog > +++ b/tests/ChangeLog > @@ -132,11 +132,8 @@ > 2021-07-16 Noah Sanci > > PR28034 > - * run-debuginfod-find.sh: Added a test ensuring files with % > - escapable characters in their paths are accessible. The test > - itself is changing the name of a binary known previously as prog to > - p+r%o$g. General operations such as accessing p+r%o$g acts as the > - test for %-escape checking. > + * run-debuginfod-percent-escape.sh: Added a test ensuring files with % > + escapable characters in their paths are accessible. I think you should simply add a new ChangeLog entry at the top instead of changing an old existing one. And please do mention the Makefile.am (TESTS) and (EXTRA_DIST) addition. > 2021-07-21 Noah Sanci > > diff --git a/tests/Makefile.am b/tests/Makefile.am > index c586422e..ee17d3b1 100644 > --- a/tests/Makefile.am > +++ b/tests/Makefile.am > @@ -231,6 +231,7 @@ TESTS += run-debuginfod-dlopen.sh \ > run-debuginfod-federation-sqlite.sh \ > run-debuginfod-federation-link.sh \ > run-debuginfod-federation-metrics.sh \ > + run-debuginfod-percent-escape.sh \ > run-debuginfod-x-forwarded-for.sh > endif > endif > @@ -515,6 +516,7 @@ EXTRA_DIST = run-arextract.sh run-arsymtest.sh run-ar.sh \ > run-debuginfod-archive-groom.sh \ > run-debuginfod-archive-rename.sh \ > run-debuginfod-archive-test.sh \ > + run-debuginfod-percent-escape.sh \ > debuginfod-rpms/fedora30/hello2-1.0-2.src.rpm \ > debuginfod-rpms/fedora30/hello2-1.0-2.x86_64.rpm \ > debuginfod-rpms/fedora30/hello2-debuginfo-1.0-2.x86_64.rpm \ > +env DEBUGINFOD_CACHE_PATH=${PWD}/.client_cache DEBUGINFOD_URLS="http://127.0.0.1:$PORT1" \ > + LD_LIBRARY_PATH=$ldpath ${abs_top_builddir}/debuginfod/debuginfod-find -vvv source F/p++r\$\#o^^g ${abs_builddir}/F/p++r\$\#o^^g.c > vlog1 2>&1 || true > +tempfiles vlog1 > +grep 'F/p%2B%2Br%24%23o%5E%5Eg.Ac' vlog1 OK, that is certainly a file name with lots of unexpected characters :) Thanks, Mark