* [Bug debuginfod/28034] client-side %-escape url characters @ 2021-08-26 19:27 Noah Sanci 2021-08-26 21:02 ` Frank Ch. Eigler 0 siblings, 1 reply; 11+ messages in thread From: Noah Sanci @ 2021-08-26 19:27 UTC (permalink / raw) To: elfutils-devel [-- Attachment #1: Type: text/plain, Size: 781 bytes --] Hello, When requesting some source files, some URL-inconvenient chars sometimes pop up. Example from f33 libstdc++: /buildid/44d8485cb75512c2ca5c8f70afbd475cae30af4f/source/usr/src/debug/ gcc-10.3.1-1.fc33.x86_64/obj-x86_64-redhat-linux/x86_64-redhat-linux/ libstdc++-v3/src/c++11/../../../../../libstdc++-v3/src/c++11/ condition_variable.cc As this URL is passed into debuginfod's handler_cb, it appears that the + signs are helpfully unescaped to spaces by libmicrohttpd, which 'course breaks everything. In order to ensure the server properly parses urls such as this one, %-escape characters on the client side so that the correct url is preserved and properly processed on the server side. This patch preserves '/'s in the url. Please find the patch attached. -Noah Sanci [-- Attachment #2: 0001-debuginfod-PR28034-client-side-escape-url-characters.patch --] [-- Type: text/x-patch, Size: 3869 bytes --] From 0bc0f3f04b1c058fe8495bd76c962ec9d3c901dd Mon Sep 17 00:00:00 2001 From: Noah Sanci <nsanci@redhat.com> Date: Fri, 16 Jul 2021 15:16:20 -0400 Subject: [PATCH] debuginfod: PR28034 - client-side %-escape url characters When requesting some source files, some URL-inconvenient chars sometimes pop up. Example from f33 libstdc++: /buildid/44d8485cb75512c2ca5c8f70afbd475cae30af4f/source/usr/src/debug/ gcc-10.3.1-1.fc33.x86_64/obj-x86_64-redhat-linux/x86_64-redhat-linux/ libstdc++-v3/src/c++11/../../../../../libstdc++-v3/src/c++11/ condition_variable.cc As this URL is passed into debuginfod's handler_cb, it appears that the + signs are helpfully unescaped to spaces by libmicrohttpd, which 'course breaks everything. In order to ensure the server properly parses urls such as this one, %-escape characters on the client side so that the correct url is preserved and properly processed on the server side. https://sourceware.org/bugzilla/show_bug.cgi?id=28034 Signed-off-by: Noah Sanci <nsanci@redhat.com> --- debuginfod/debuginfod-client.c | 52 ++++++++++++++++++++++++++++------ 1 file changed, 44 insertions(+), 8 deletions(-) diff --git a/debuginfod/debuginfod-client.c b/debuginfod/debuginfod-client.c index 7d4b220f..ed5943f3 100644 --- a/debuginfod/debuginfod-client.c +++ b/debuginfod/debuginfod-client.c @@ -904,16 +904,52 @@ debuginfod_query_server (debuginfod_client *c, if (filename) /* must start with / */ { /* PR28034 escape characters in completed url to %hh format. */ - char *escaped_string; - escaped_string = curl_easy_escape(data[i].handle, filename, 0); - if (!escaped_string) + char escaped_string[PATH_MAX] = {'\0'}; + char *loc = (char *) filename; + char *loc2; + char *tmp; + for(size_t j = 0; j < strlen(filename); ++j) { - rc = -ENOMEM; - goto out2; + loc2 = strstr(loc, "/"); + // If the first character is a '/' + if ( (unsigned long) loc2 - (unsigned long) loc <= 0) + { + strcat(escaped_string, "/"); + loc ++; + } + // If we have reached the end of the path, there won't be a / + // but we still need to process one more string (the file's actual name) + else if (loc2 == NULL) + { + tmp = curl_easy_escape(data[i].handle, loc, 0); + if (!tmp) + { + rc = -ENOMEM; + goto out1; + } + strcat(escaped_string, tmp); + curl_free(tmp); + break; + } + // The default case, when there is a string surrounded by '/'s + else + { + // The third argument to curl_easy_escape isolated the string between the + // '/'s + tmp = curl_easy_escape(data[i].handle, loc, (unsigned long) loc2 - (unsigned long)loc); + if (!tmp) + { + rc = -ENOMEM; + goto out1; + } + strcat(escaped_string, tmp); + strcat(escaped_string, "/"); + curl_free(tmp); + loc = loc2+1; + } } - snprintf(data[i].url, PATH_MAX, "%s/%s/%s/%s", server_url, - build_id_bytes, type, escaped_string); - curl_free(escaped_string); + snprintf(data[i].url, PATH_MAX, "%s/%s/%s%s", server_url, + build_id_bytes, type, escaped_string); } else snprintf(data[i].url, PATH_MAX, "%s/%s/%s", server_url, build_id_bytes, type); -- 2.31.1 ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Bug debuginfod/28034] client-side %-escape url characters 2021-08-26 19:27 [Bug debuginfod/28034] client-side %-escape url characters Noah Sanci @ 2021-08-26 21:02 ` Frank Ch. Eigler 2021-08-27 14:44 ` Noah Sanci 0 siblings, 1 reply; 11+ messages in thread From: Frank Ch. Eigler @ 2021-08-26 21:02 UTC (permalink / raw) To: Noah Sanci; +Cc: elfutils-devel Hi - > /* PR28034 escape characters in completed url to %hh format. */ > - char *escaped_string; > - escaped_string = curl_easy_escape(data[i].handle, filename, 0); > - if (!escaped_string) > + char escaped_string[PATH_MAX] = {'\0'}; > + char *loc = (char *) filename; > + char *loc2; > + char *tmp; > + for(size_t j = 0; j < strlen(filename); ++j) > { > - rc = -ENOMEM; > - goto out2; > + loc2 = strstr(loc, "/"); > + // If the first character is a '/' > [...] Holy cow that's a lot of work to do it this way. A couple of alternatives: - ditch curl_easy_escape :-( and use a malloc(strlen(x)*3) byte-by-byte copy from source string into destination if not [a-zA-Z0-9/.~] then %-escape or: - keep curl_easy_escape and postprocess byte-by-byte examine the result of curl_easy_escape - if seeing a "%2F", replace the % with a / and memmove the rest of the string 2 bytes ahead It shouldn't need strtok or strstr or a lot of logic or stuff like that really. - FChE ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Bug debuginfod/28034] client-side %-escape url characters 2021-08-26 21:02 ` Frank Ch. Eigler @ 2021-08-27 14:44 ` Noah Sanci 2021-08-27 15:07 ` Noah Sanci 0 siblings, 1 reply; 11+ messages in thread From: Noah Sanci @ 2021-08-27 14:44 UTC (permalink / raw) To: Frank Ch. Eigler; +Cc: elfutils-devel [-- Attachment #1: Type: text/plain, Size: 1382 bytes --] Hello, Here is an updated patch, using memmove. Much smaller. Thanks for the suggestions, Noah Sanci On Thu, Aug 26, 2021 at 5:02 PM Frank Ch. Eigler <fche@redhat.com> wrote: > > Hi - > > > /* PR28034 escape characters in completed url to %hh format. */ > > - char *escaped_string; > > - escaped_string = curl_easy_escape(data[i].handle, filename, 0); > > - if (!escaped_string) > > + char escaped_string[PATH_MAX] = {'\0'}; > > + char *loc = (char *) filename; > > + char *loc2; > > + char *tmp; > > + for(size_t j = 0; j < strlen(filename); ++j) > > { > > - rc = -ENOMEM; > > - goto out2; > > + loc2 = strstr(loc, "/"); > > + // If the first character is a '/' > > [...] > > Holy cow that's a lot of work to do it this way. > A couple of alternatives: > > - ditch curl_easy_escape :-( and use a > malloc(strlen(x)*3) > byte-by-byte copy from source string into destination > if not [a-zA-Z0-9/.~] then %-escape > > or: > - keep curl_easy_escape and postprocess > byte-by-byte examine the result of curl_easy_escape > - if seeing a "%2F", replace the % with a / and memmove the > rest of the string 2 bytes ahead > > It shouldn't need strtok or strstr or a lot of logic or stuff like > that really. > > - FChE > [-- Attachment #2: 0001-debuginfod-PR28034-client-side-escape-url-characters.patch --] [-- Type: text/x-patch, Size: 2407 bytes --] From de7e50955dba711aeee33196bf2bfea3c47696f7 Mon Sep 17 00:00:00 2001 From: Noah Sanci <nsanci@redhat.com> Date: Fri, 16 Jul 2021 15:16:20 -0400 Subject: [PATCH] debuginfod: PR28034 - client-side %-escape url characters When requesting some source files, some URL-inconvenient chars sometimes pop up. Example from f33 libstdc++: /buildid/44d8485cb75512c2ca5c8f70afbd475cae30af4f/source/usr/src/debug/ gcc-10.3.1-1.fc33.x86_64/obj-x86_64-redhat-linux/x86_64-redhat-linux/ libstdc++-v3/src/c++11/../../../../../libstdc++-v3/src/c++11/ condition_variable.cc As this URL is passed into debuginfod's handler_cb, it appears that the + signs are helpfully unescaped to spaces by libmicrohttpd, which 'course breaks everything. In order to ensure the server properly parses urls such as this one, %-escape characters on the client side so that the correct url is preserved and properly processed on the server side. https://sourceware.org/bugzilla/show_bug.cgi?id=28034 Signed-off-by: Noah Sanci <nsanci@redhat.com> --- debuginfod/debuginfod-client.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/debuginfod/debuginfod-client.c b/debuginfod/debuginfod-client.c index 7d4b220f..eb49b583 100644 --- a/debuginfod/debuginfod-client.c +++ b/debuginfod/debuginfod-client.c @@ -905,13 +905,25 @@ debuginfod_query_server (debuginfod_client *c, { /* PR28034 escape characters in completed url to %hh format. */ char *escaped_string; + char *loc; escaped_string = curl_easy_escape(data[i].handle, filename, 0); if (!escaped_string) { rc = -ENOMEM; goto out2; } - snprintf(data[i].url, PATH_MAX, "%s/%s/%s/%s", server_url, + + loc = strstr(escaped_string, "%2F"); + if (loc != NULL) + do + { + loc[0] = '/'; + // pull the string back after replacement + memmove(loc+1,loc+3,strlen(loc+3)); + escaped_string[strlen(escaped_string) - 1] = '\0'; + escaped_string[strlen(escaped_string) - 1] = '\0'; + } while( (loc = strstr(loc, "%2F")) ); + snprintf(data[i].url, PATH_MAX, "%s/%s/%s%s", server_url, build_id_bytes, type, escaped_string); curl_free(escaped_string); } -- 2.31.1 ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Bug debuginfod/28034] client-side %-escape url characters 2021-08-27 14:44 ` Noah Sanci @ 2021-08-27 15:07 ` Noah Sanci 2021-08-27 15:30 ` Noah Sanci 0 siblings, 1 reply; 11+ messages in thread From: Noah Sanci @ 2021-08-27 15:07 UTC (permalink / raw) To: Frank Ch. Eigler; +Cc: elfutils-devel [-- Attachment #1: Type: text/plain, Size: 1615 bytes --] Hello Update 2, no longer append nulls unnecessarily. -Noah Sanci On Fri, Aug 27, 2021 at 10:44 AM Noah Sanci <nsanci@redhat.com> wrote: > > Hello, > > Here is an updated patch, using memmove. Much smaller. > > Thanks for the suggestions, > > Noah Sanci > > > On Thu, Aug 26, 2021 at 5:02 PM Frank Ch. Eigler <fche@redhat.com> wrote: > > > > Hi - > > > > > /* PR28034 escape characters in completed url to %hh format. */ > > > - char *escaped_string; > > > - escaped_string = curl_easy_escape(data[i].handle, filename, 0); > > > - if (!escaped_string) > > > + char escaped_string[PATH_MAX] = {'\0'}; > > > + char *loc = (char *) filename; > > > + char *loc2; > > > + char *tmp; > > > + for(size_t j = 0; j < strlen(filename); ++j) > > > { > > > - rc = -ENOMEM; > > > - goto out2; > > > + loc2 = strstr(loc, "/"); > > > + // If the first character is a '/' > > > [...] > > > > Holy cow that's a lot of work to do it this way. > > A couple of alternatives: > > > > - ditch curl_easy_escape :-( and use a > > malloc(strlen(x)*3) > > byte-by-byte copy from source string into destination > > if not [a-zA-Z0-9/.~] then %-escape > > > > or: > > - keep curl_easy_escape and postprocess > > byte-by-byte examine the result of curl_easy_escape > > - if seeing a "%2F", replace the % with a / and memmove the > > rest of the string 2 bytes ahead > > > > It shouldn't need strtok or strstr or a lot of logic or stuff like > > that really. > > > > - FChE > > [-- Attachment #2: 0001-debuginfod-PR28034-client-side-escape-url-characters.patch --] [-- Type: text/x-patch, Size: 2268 bytes --] From f5c7c00c76b200675556a0ecc6bd8a5fdc7a30ea Mon Sep 17 00:00:00 2001 From: Noah Sanci <nsanci@redhat.com> Date: Fri, 16 Jul 2021 15:16:20 -0400 Subject: [PATCH] debuginfod: PR28034 - client-side %-escape url characters When requesting some source files, some URL-inconvenient chars sometimes pop up. Example from f33 libstdc++: /buildid/44d8485cb75512c2ca5c8f70afbd475cae30af4f/source/usr/src/debug/ gcc-10.3.1-1.fc33.x86_64/obj-x86_64-redhat-linux/x86_64-redhat-linux/ libstdc++-v3/src/c++11/../../../../../libstdc++-v3/src/c++11/ condition_variable.cc As this URL is passed into debuginfod's handler_cb, it appears that the + signs are helpfully unescaped to spaces by libmicrohttpd, which 'course breaks everything. In order to ensure the server properly parses urls such as this one, %-escape characters on the client side so that the correct url is preserved and properly processed on the server side. https://sourceware.org/bugzilla/show_bug.cgi?id=28034 Signed-off-by: Noah Sanci <nsanci@redhat.com> --- debuginfod/debuginfod-client.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/debuginfod/debuginfod-client.c b/debuginfod/debuginfod-client.c index 7d4b220f..6db82f79 100644 --- a/debuginfod/debuginfod-client.c +++ b/debuginfod/debuginfod-client.c @@ -905,13 +905,23 @@ debuginfod_query_server (debuginfod_client *c, { /* PR28034 escape characters in completed url to %hh format. */ char *escaped_string; + char *loc; escaped_string = curl_easy_escape(data[i].handle, filename, 0); if (!escaped_string) { rc = -ENOMEM; goto out2; } - snprintf(data[i].url, PATH_MAX, "%s/%s/%s/%s", server_url, + + loc = strstr(escaped_string, "%2F"); + if (loc != NULL) + while( (loc = strstr(loc, "%2F")) ) + { + loc[0] = '/'; + // pull the string back after replacement + memmove(loc+1,loc+3,strlen(loc+3)+1); + } + snprintf(data[i].url, PATH_MAX, "%s/%s/%s%s", server_url, build_id_bytes, type, escaped_string); curl_free(escaped_string); } -- 2.31.1 ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Bug debuginfod/28034] client-side %-escape url characters 2021-08-27 15:07 ` Noah Sanci @ 2021-08-27 15:30 ` Noah Sanci 2021-09-08 13:38 ` Mark Wielaard 0 siblings, 1 reply; 11+ messages in thread From: Noah Sanci @ 2021-08-27 15:30 UTC (permalink / raw) To: Frank Ch. Eigler; +Cc: elfutils-devel [-- Attachment #1: Type: text/plain, Size: 1864 bytes --] Hello, Next update, removed redundant if statement. -Noah Sanci On Fri, Aug 27, 2021 at 11:07 AM Noah Sanci <nsanci@redhat.com> wrote: > > Hello > > Update 2, no longer append nulls unnecessarily. > > -Noah Sanci > > On Fri, Aug 27, 2021 at 10:44 AM Noah Sanci <nsanci@redhat.com> wrote: > > > > Hello, > > > > Here is an updated patch, using memmove. Much smaller. > > > > Thanks for the suggestions, > > > > Noah Sanci > > > > > > On Thu, Aug 26, 2021 at 5:02 PM Frank Ch. Eigler <fche@redhat.com> wrote: > > > > > > Hi - > > > > > > > /* PR28034 escape characters in completed url to %hh format. */ > > > > - char *escaped_string; > > > > - escaped_string = curl_easy_escape(data[i].handle, filename, 0); > > > > - if (!escaped_string) > > > > + char escaped_string[PATH_MAX] = {'\0'}; > > > > + char *loc = (char *) filename; > > > > + char *loc2; > > > > + char *tmp; > > > > + for(size_t j = 0; j < strlen(filename); ++j) > > > > { > > > > - rc = -ENOMEM; > > > > - goto out2; > > > > + loc2 = strstr(loc, "/"); > > > > + // If the first character is a '/' > > > > [...] > > > > > > Holy cow that's a lot of work to do it this way. > > > A couple of alternatives: > > > > > > - ditch curl_easy_escape :-( and use a > > > malloc(strlen(x)*3) > > > byte-by-byte copy from source string into destination > > > if not [a-zA-Z0-9/.~] then %-escape > > > > > > or: > > > - keep curl_easy_escape and postprocess > > > byte-by-byte examine the result of curl_easy_escape > > > - if seeing a "%2F", replace the % with a / and memmove the > > > rest of the string 2 bytes ahead > > > > > > It shouldn't need strtok or strstr or a lot of logic or stuff like > > > that really. > > > > > > - FChE > > > [-- Attachment #2: 0001-debuginfod-PR28034-client-side-escape-url-characters.patch --] [-- Type: text/x-patch, Size: 2188 bytes --] From a27c9a82546b040fbb8e5263003fd8fb33f1f1e0 Mon Sep 17 00:00:00 2001 From: Noah Sanci <nsanci@redhat.com> Date: Fri, 16 Jul 2021 15:16:20 -0400 Subject: [PATCH] debuginfod: PR28034 - client-side %-escape url characters When requesting some source files, some URL-inconvenient chars sometimes pop up. Example from f33 libstdc++: /buildid/44d8485cb75512c2ca5c8f70afbd475cae30af4f/source/usr/src/debug/ gcc-10.3.1-1.fc33.x86_64/obj-x86_64-redhat-linux/x86_64-redhat-linux/ libstdc++-v3/src/c++11/../../../../../libstdc++-v3/src/c++11/ condition_variable.cc As this URL is passed into debuginfod's handler_cb, it appears that the + signs are helpfully unescaped to spaces by libmicrohttpd, which 'course breaks everything. In order to ensure the server properly parses urls such as this one, %-escape characters on the client side so that the correct url is preserved and properly processed on the server side. https://sourceware.org/bugzilla/show_bug.cgi?id=28034 Signed-off-by: Noah Sanci <nsanci@redhat.com> --- debuginfod/debuginfod-client.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/debuginfod/debuginfod-client.c b/debuginfod/debuginfod-client.c index 7d4b220f..73960424 100644 --- a/debuginfod/debuginfod-client.c +++ b/debuginfod/debuginfod-client.c @@ -906,12 +906,20 @@ debuginfod_query_server (debuginfod_client *c, /* PR28034 escape characters in completed url to %hh format. */ char *escaped_string; escaped_string = curl_easy_escape(data[i].handle, filename, 0); + char *loc = escaped_string; if (!escaped_string) { rc = -ENOMEM; goto out2; } - snprintf(data[i].url, PATH_MAX, "%s/%s/%s/%s", server_url, + + while( (loc = strstr(loc, "%2F")) ) + { + loc[0] = '/'; + // pull the string back after replacement + memmove(loc+1,loc+3,strlen(loc+3)+1); + } + snprintf(data[i].url, PATH_MAX, "%s/%s/%s%s", server_url, build_id_bytes, type, escaped_string); curl_free(escaped_string); } -- 2.31.1 ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Bug debuginfod/28034] client-side %-escape url characters 2021-08-27 15:30 ` Noah Sanci @ 2021-09-08 13:38 ` Mark Wielaard 2021-09-09 17:28 ` Noah Sanci 0 siblings, 1 reply; 11+ messages in thread From: Mark Wielaard @ 2021-09-08 13:38 UTC (permalink / raw) To: Noah Sanci, Frank Ch. Eigler; +Cc: elfutils-devel Hi Noah, This looks good, but the description is not really what the patch does. It is part of the PR28034 fix, but repeating that problem statement doesn't really make clear what this patch is about. Could explain in the commit message that curl_easy_escape () also escapes '/' to '%2F' and that we want to reverse that back to '/'? Also some (tiny) optimization hints. This can be done outside/before the loop /* Initialize each handle. */ for (int i = 0; i < num_urls; i++) So you only need to escape once. You of course then need to make sure the escaped_string is freed after the loop. We already check that the first char is a '/'. It seems silly to curl escape that one and then unescape it again. So maybe curl_easy_escape (data[i].handle, filename + 1, 0) and then change the snprintf pattern to "%s/%s/%s/%s"? ^ the slash got readded here. The strlen inside the while loop can also be done outside and then calculated instead of running strlen on the tail every time. size_t escaped_strlen = strlen (escaped_string); while( (loc = strstr(loc, "%2F")) ) { loc[0] = '/'; // pull the string back after replacement escaped_strlen -= (loc + 3) - escaped_string); memmove(loc+1, loc+3, escaped_strlen + 1); } (Note, not even compiled, so I might be wrong) Lastly I assume there are already testcases that cover this functionality? Just wanting to know how you tested it. Thanks, Mark ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Bug debuginfod/28034] client-side %-escape url characters 2021-09-08 13:38 ` Mark Wielaard @ 2021-09-09 17:28 ` Noah Sanci 2021-09-12 17:24 ` Mark Wielaard 0 siblings, 1 reply; 11+ messages in thread From: Noah Sanci @ 2021-09-09 17:28 UTC (permalink / raw) To: Mark Wielaard; +Cc: Frank Ch. Eigler, elfutils-devel [-- Attachment #1: Type: text/plain, Size: 1251 bytes --] Hello, The attached patch %-escapes debuginfod url characters, then unescapes only '/' characters. Previously characters such as '+' were not escaped and caused improper escaping further on in handler_cb. https://sourceware.org/bugzilla/show_bug.cgi?id=28034. On Wed, Sep 8, 2021 at 9:38 AM Mark Wielaard <mark@klomp.org> wrote: > /* Initialize each handle. */ > for (int i = 0; i < num_urls; i++) > > So you only need to escape once. You of course then need to make sure > the escaped_string is freed after the loop. Added > We already check that the first char is a '/'. It seems silly to curl > escape that one and then unescape it again. So maybe curl_easy_escape > (data[i].handle, filename + 1, 0) and then change the snprintf pattern > to "%s/%s/%s/%s"? > ^ the slash got readded here. Added > The strlen inside the while loop can also be done outside and then > calculated instead of running strlen on the tail every time. Added > Lastly I assume there are already testcases that cover this > functionality? Just wanting to know how you tested it. Previously, with run-debuginfod-find.sh the test was embedded within other tests. Now the test is independent and has been added to the list of TESTS. Regards, Noah Sanci [-- Attachment #2: 0001-debuginfod-PR28034-Percent-escape-debuginfod-urls.patch --] [-- Type: text/x-patch, Size: 7527 bytes --] From a080fd3feb79d601d86c54cf479562d6d5bed395 Mon Sep 17 00:00:00 2001 From: Noah Sanci <nsanci@redhat.com> Date: Thu, 9 Sep 2021 13:10:33 -0400 Subject: [PATCH] debuginfod: PR28034 - Percent escape debuginfod urls When requesting some source files, some URL-inconvenient chars sometimes pop up, including but not limited to '+', '^', and '&'. Example from f33 libstdc++: /buildid/44d8485cb75512c2ca5c8f70afbd475cae30af4f/source/usr/src/debug /gcc-10.3.1-1.fc33.x86_64/obj-x86_64-redhat-linux/x86_64-redhat-linux/ libstdc++-v3/src/c++11/../../../../../libstdc++-v3/src/c++11/ condition_variable.cc As this URL is passed into debuginfod's handler_cb, it appears that the + signs are helpfully unescaped to spaces by libmicrohttpd, which 'course breaks everything. To combat this, before querying the debuginfod daemon, clients now % escape the source filename. This converts many alphanumeric characters into their %-code format, including '/' to %2F. We want to preserve the '/' in the url, so after conversion replace %2Fs with a '/'. https://sourceware.org/bugzilla/show_bug.cgi?id=28034 Signed-off-by: Noah Sanci <nsanci@redhat.com> --- debuginfod/debuginfod-client.c | 18 ++++++-- tests/ChangeLog | 7 +-- tests/Makefile.am | 2 + tests/run-debuginfod-percent-escape.sh | 60 ++++++++++++++++++++++++++ 4 files changed, 79 insertions(+), 8 deletions(-) create mode 100755 tests/run-debuginfod-percent-escape.sh diff --git a/debuginfod/debuginfod-client.c b/debuginfod/debuginfod-client.c index d41723ce..77ca844b 100644 --- a/debuginfod/debuginfod-client.c +++ b/debuginfod/debuginfod-client.c @@ -883,6 +883,10 @@ debuginfod_query_server (debuginfod_client *c, data[i].errbuf[0] = '\0'; } + char *escaped_string; + if (filename) + escaped_string = curl_easy_escape(&target_handle, filename+1, 0); + /* Initialize each handle. */ for (int i = 0; i < num_urls; i++) { @@ -904,16 +908,23 @@ debuginfod_query_server (debuginfod_client *c, if (filename) /* must start with / */ { /* PR28034 escape characters in completed url to %hh format. */ - char *escaped_string; - escaped_string = curl_easy_escape(data[i].handle, filename, 0); + char *loc = escaped_string; if (!escaped_string) { rc = -ENOMEM; goto out2; } + + size_t escaped_strlen = strlen(escaped_string); + while ((loc = strstr(loc, "%2F"))) + { + loc[0] = '/'; + // pull the string back after replacement + memmove(loc+1, loc+3,escaped_strlen - (loc - escaped_string + 2) ); + escaped_strlen -=2; + } snprintf(data[i].url, PATH_MAX, "%s/%s/%s/%s", server_url, build_id_bytes, type, escaped_string); - curl_free(escaped_string); } else snprintf(data[i].url, PATH_MAX, "%s/%s/%s", server_url, build_id_bytes, type); @@ -953,6 +964,7 @@ debuginfod_query_server (debuginfod_client *c, curl_multi_add_handle(curlm, data[i].handle); } + if (filename) curl_free(escaped_string); /* Query servers in parallel. */ if (vfd >= 0) dprintf (vfd, "query %d urls in parallel\n", num_urls); diff --git a/tests/ChangeLog b/tests/ChangeLog index 85dca442..b84f420c 100644 --- a/tests/ChangeLog +++ b/tests/ChangeLog @@ -132,11 +132,8 @@ 2021-07-16 Noah Sanci <nsanci@redhat.com> PR28034 - * run-debuginfod-find.sh: Added a test ensuring files with % - escapable characters in their paths are accessible. The test - itself is changing the name of a binary known previously as prog to - p+r%o$g. General operations such as accessing p+r%o$g acts as the - test for %-escape checking. + * run-debuginfod-percent-escape.sh: Added a test ensuring files with % + escapable characters in their paths are accessible. 2021-07-21 Noah Sanci <nsanci@redhat.com> diff --git a/tests/Makefile.am b/tests/Makefile.am index c586422e..ee17d3b1 100644 --- a/tests/Makefile.am +++ b/tests/Makefile.am @@ -231,6 +231,7 @@ TESTS += run-debuginfod-dlopen.sh \ run-debuginfod-federation-sqlite.sh \ run-debuginfod-federation-link.sh \ run-debuginfod-federation-metrics.sh \ + run-debuginfod-percent-escape.sh \ run-debuginfod-x-forwarded-for.sh endif endif @@ -515,6 +516,7 @@ EXTRA_DIST = run-arextract.sh run-arsymtest.sh run-ar.sh \ run-debuginfod-archive-groom.sh \ run-debuginfod-archive-rename.sh \ run-debuginfod-archive-test.sh \ + run-debuginfod-percent-escape.sh \ debuginfod-rpms/fedora30/hello2-1.0-2.src.rpm \ debuginfod-rpms/fedora30/hello2-1.0-2.x86_64.rpm \ debuginfod-rpms/fedora30/hello2-debuginfo-1.0-2.x86_64.rpm \ diff --git a/tests/run-debuginfod-percent-escape.sh b/tests/run-debuginfod-percent-escape.sh new file mode 100755 index 00000000..f7d8dc66 --- /dev/null +++ b/tests/run-debuginfod-percent-escape.sh @@ -0,0 +1,60 @@ +#!/usr/bin/env bash +# +# Copyright (C) 2019-2021 Red Hat, Inc. +# This file is part of elfutils. +# +# This file is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3 of the License, or +# (at your option) any later version. +# +# elfutils is distributed in the hope that it will be useful, but +# WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see <http://www.gnu.org/licenses/>. + +. $srcdir/debuginfod-subr.sh # includes set -e +# for test case debugging, uncomment: +set -x +unset VALGRIND_CMD +# This variable is essential and ensures no time-race for claiming ports occurs +# set base to a unique multiple of 100 not used in any other 'run-debuginfod-*' test +base=10000 +get_ports +DB=${PWD}/.debuginfod_tmp.sqlite +tempfiles $DB +mkdir F +env LD_LIBRARY_PATH=$ldpath ${abs_builddir}/../debuginfod/debuginfod $VERBOSE \ + -F -R -d $DB -p $PORT1 -t0 -g0 -v R ${PWD}/F > vlog$PORT1 2>&1 & +PID1=$! +tempfiles vlog$PORT1 +errfiles vlog$PORT1 +# Server must become ready +wait_ready $PORT1 'ready' 1 +# Be patient when run on a busy machine things might take a bit. + +# Build a non-stripped binary +echo "int main() { return 0; }" > ${PWD}/F/p++r\$\#o^^g.c +gcc -Wl,--build-id -g -o ${PWD}/F/p++r\$\#o^^g ${PWD}/F/p++r\$\#o^^g.c +BUILDID=`env LD_LIBRARY_PATH=$ldpath ${abs_builddir}/../src/readelf \ + -a ${PWD}/F/p++r\\$\#o^^g | grep 'Build ID' | cut -d ' ' -f 7` +tempfiles ${PWD}/F/p++r\$\#o^^g.c ${PWD}/F/p++r\$\#o^^g +kill -USR1 $PID1 +# Now there should be 1 files in the index +wait_ready $PORT1 'thread_work_total{role="traverse"}' 1 +wait_ready $PORT1 'thread_work_pending{role="scan"}' 0 +wait_ready $PORT1 'thread_busy{role="scan"}' 0 +rm -rf $DEBUGINFOD_CACHE_PATH # clean it from previous tests +ls F +env DEBUGINFOD_CACHE_PATH=${PWD}/.client_cache DEBUGINFOD_URLS="http://127.0.0.1:$PORT1" \ + LD_LIBRARY_PATH=$ldpath ${abs_top_builddir}/debuginfod/debuginfod-find -vvv source F/p++r\$\#o^^g ${abs_builddir}/F/p++r\$\#o^^g.c > vlog1 2>&1 || true +tempfiles vlog1 +grep 'F/p%2B%2Br%24%23o%5E%5Eg.c' vlog1 + +kill $PID1 +wait $PID1 +PID1=0 +exit 0 -- 2.31.1 ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Bug debuginfod/28034] client-side %-escape url characters 2021-09-09 17:28 ` Noah Sanci @ 2021-09-12 17:24 ` Mark Wielaard 2021-09-13 16:20 ` Noah Sanci 0 siblings, 1 reply; 11+ messages in thread From: Mark Wielaard @ 2021-09-12 17:24 UTC (permalink / raw) To: Noah Sanci; +Cc: elfutils-devel Hi Noah, On Thu, Sep 09, 2021 at 01:28:21PM -0400, Noah Sanci via Elfutils-devel wrote: > The attached patch %-escapes debuginfod url characters, then unescapes only > '/' characters. Previously characters such as '+' were not escaped and caused > improper escaping further on in handler_cb. > https://sourceware.org/bugzilla/show_bug.cgi?id=28034. > > On Wed, Sep 8, 2021 at 9:38 AM Mark Wielaard <mark@klomp.org> wrote: > > /* Initialize each handle. */ > > for (int i = 0; i < num_urls; i++) > > > > So you only need to escape once. You of course then need to make sure > > the escaped_string is freed after the loop. > Added > > > We already check that the first char is a '/'. It seems silly to curl > > escape that one and then unescape it again. So maybe curl_easy_escape > > (data[i].handle, filename + 1, 0) and then change the snprintf pattern > > to "%s/%s/%s/%s"? > > ^ the slash got readded here. > Added > > > The strlen inside the while loop can also be done outside and then > > calculated instead of running strlen on the tail every time. > Added Thanks. I do have one more performance nitpick (sorry). > + char *escaped_string; > + if (filename) > + escaped_string = curl_easy_escape(&target_handle, filename+1, 0); The escaped_string is created outside the loop and reused each time (good). But... > + > /* Initialize each handle. */ > for (int i = 0; i < num_urls; i++) > { > @@ -904,16 +908,23 @@ debuginfod_query_server (debuginfod_client *c, > if (filename) /* must start with / */ > { > /* PR28034 escape characters in completed url to %hh format. */ > - char *escaped_string; > - escaped_string = curl_easy_escape(data[i].handle, filename, 0); > + char *loc = escaped_string; > if (!escaped_string) > { > rc = -ENOMEM; > goto out2; > } This check, and... > + > + size_t escaped_strlen = strlen(escaped_string); > + while ((loc = strstr(loc, "%2F"))) > + { > + loc[0] = '/'; > + // pull the string back after replacement > + memmove(loc+1, loc+3,escaped_strlen - (loc - escaped_string + 2) ); > + escaped_strlen -=2; > + } the manipulation of the escaped_string, could both also be done outside the loop, since they always do the same thing. > snprintf(data[i].url, PATH_MAX, "%s/%s/%s/%s", server_url, > build_id_bytes, type, escaped_string); > - curl_free(escaped_string); > } > else > snprintf(data[i].url, PATH_MAX, "%s/%s/%s", server_url, build_id_bytes, type); > @@ -953,6 +964,7 @@ debuginfod_query_server (debuginfod_client *c, > curl_multi_add_handle(curlm, data[i].handle); > } > > + if (filename) curl_free(escaped_string); And released after the loop. Good. > /* Query servers in parallel. */ > if (vfd >= 0) > dprintf (vfd, "query %d urls in parallel\n", num_urls); > diff --git a/tests/ChangeLog b/tests/ChangeLog > index 85dca442..b84f420c 100644 > --- a/tests/ChangeLog > +++ b/tests/ChangeLog > @@ -132,11 +132,8 @@ > 2021-07-16 Noah Sanci <nsanci@redhat.com> > > PR28034 > - * run-debuginfod-find.sh: Added a test ensuring files with % > - escapable characters in their paths are accessible. The test > - itself is changing the name of a binary known previously as prog to > - p+r%o$g. General operations such as accessing p+r%o$g acts as the > - test for %-escape checking. > + * run-debuginfod-percent-escape.sh: Added a test ensuring files with % > + escapable characters in their paths are accessible. I think you should simply add a new ChangeLog entry at the top instead of changing an old existing one. And please do mention the Makefile.am (TESTS) and (EXTRA_DIST) addition. > 2021-07-21 Noah Sanci <nsanci@redhat.com> > > diff --git a/tests/Makefile.am b/tests/Makefile.am > index c586422e..ee17d3b1 100644 > --- a/tests/Makefile.am > +++ b/tests/Makefile.am > @@ -231,6 +231,7 @@ TESTS += run-debuginfod-dlopen.sh \ > run-debuginfod-federation-sqlite.sh \ > run-debuginfod-federation-link.sh \ > run-debuginfod-federation-metrics.sh \ > + run-debuginfod-percent-escape.sh \ > run-debuginfod-x-forwarded-for.sh > endif > endif > @@ -515,6 +516,7 @@ EXTRA_DIST = run-arextract.sh run-arsymtest.sh run-ar.sh \ > run-debuginfod-archive-groom.sh \ > run-debuginfod-archive-rename.sh \ > run-debuginfod-archive-test.sh \ > + run-debuginfod-percent-escape.sh \ > debuginfod-rpms/fedora30/hello2-1.0-2.src.rpm \ > debuginfod-rpms/fedora30/hello2-1.0-2.x86_64.rpm \ > debuginfod-rpms/fedora30/hello2-debuginfo-1.0-2.x86_64.rpm \ > +env DEBUGINFOD_CACHE_PATH=${PWD}/.client_cache DEBUGINFOD_URLS="http://127.0.0.1:$PORT1" \ > + LD_LIBRARY_PATH=$ldpath ${abs_top_builddir}/debuginfod/debuginfod-find -vvv source F/p++r\$\#o^^g ${abs_builddir}/F/p++r\$\#o^^g.c > vlog1 2>&1 || true > +tempfiles vlog1 > +grep 'F/p%2B%2Br%24%23o%5E%5Eg.Ac' vlog1 OK, that is certainly a file name with lots of unexpected characters :) Thanks, Mark ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Bug debuginfod/28034] client-side %-escape url characters 2021-09-12 17:24 ` Mark Wielaard @ 2021-09-13 16:20 ` Noah Sanci 2021-09-13 18:11 ` Noah Sanci 0 siblings, 1 reply; 11+ messages in thread From: Noah Sanci @ 2021-09-13 16:20 UTC (permalink / raw) To: Mark Wielaard, elfutils-devel [-- Attachment #1: Type: text/plain, Size: 1747 bytes --] Hello, On Sun, Sep 12, 2021 at 1:24 PM Mark Wielaard <mark@klomp.org> wrote: > The escaped_string is created outside the loop and reused each time > (good). But... > > + > > /* Initialize each handle. */ > > for (int i = 0; i < num_urls; i++) > > { > > @@ -904,16 +908,23 @@ debuginfod_query_server (debuginfod_client *c, > > if (filename) /* must start with / */ > > { > > /* PR28034 escape characters in completed url to %hh format. */ > > - char *escaped_string; > > - escaped_string = curl_easy_escape(data[i].handle, filename, 0); > > + char *loc = escaped_string; > > if (!escaped_string) > > { > > rc = -ENOMEM; > > goto out2; > > } > > This check, and... > > > + > > + size_t escaped_strlen = strlen(escaped_string); > > + while ((loc = strstr(loc, "%2F"))) > > + { > > + loc[0] = '/'; > > + // pull the string back after replacement > > + memmove(loc+1, loc+3,escaped_strlen - (loc - escaped_string + 2) ); > > + escaped_strlen -=2; > > + } > > the manipulation of the escaped_string, could both also be done > outside the loop, since they always do the same thing. Fixed. > > I think you should simply add a new ChangeLog entry at the top instead > of changing an old existing one. And please do mention the Makefile.am > (TESTS) and (EXTRA_DIST) addition. Should be good. > OK, that is certainly a file name with lots of unexpected characters :) Gotta be sure :) If there are any more efficiency changes, ChangeLog changes, or other changes, don't be afraid to reach out again. Thanks, Noah Sanci [-- Attachment #2: 0001-debuginfod-PR28034-Percent-escape-debuginfod-urls.patch --] [-- Type: text/x-patch, Size: 8522 bytes --] From fa9104c7435503d08308e18d4e1a82792f83fc56 Mon Sep 17 00:00:00 2001 From: Noah Sanci <nsanci@redhat.com> Date: Thu, 9 Sep 2021 13:10:33 -0400 Subject: [PATCH] debuginfod: PR28034 - Percent escape debuginfod urls When requesting some source files, some URL-inconvenient chars sometimes pop up, including but not limited to '+', '^', and '&'. Example from f33 libstdc++: /buildid/44d8485cb75512c2ca5c8f70afbd475cae30af4f/source/usr/src/debug /gcc-10.3.1-1.fc33.x86_64/obj-x86_64-redhat-linux/x86_64-redhat-linux/ libstdc++-v3/src/c++11/../../../../../libstdc++-v3/src/c++11/ condition_variable.cc As this URL is passed into debuginfod's handler_cb, it appears that the + signs are helpfully unescaped to spaces by libmicrohttpd, which 'course breaks everything. To combat this, before querying the debuginfod daemon, clients now % escape the source filename. This converts many alphanumeric characters into their %-code format, including '/' to %2F. We want to preserve the '/' in the url, so after conversion replace %2Fs with a '/'. https://sourceware.org/bugzilla/show_bug.cgi?id=28034 Signed-off-by: Noah Sanci <nsanci@redhat.com> --- debuginfod/ChangeLog | 4 ++ debuginfod/debuginfod-client.c | 29 +++++++++---- tests/ChangeLog | 12 +++--- tests/Makefile.am | 2 + tests/run-debuginfod-percent-escape.sh | 60 ++++++++++++++++++++++++++ 5 files changed, 94 insertions(+), 13 deletions(-) create mode 100755 tests/run-debuginfod-percent-escape.sh diff --git a/debuginfod/ChangeLog b/debuginfod/ChangeLog index 1173f9cd..b8eaa88c 100644 --- a/debuginfod/ChangeLog +++ b/debuginfod/ChangeLog @@ -2,6 +2,10 @@ * debuginfod.cxx (libarchive_fdcache::lookup): Add endl after obatched(clog) line. +2021-09-13 Noah Sanci <nsanci@redhat.com> + + * debuginfod-client.c (debuginfod_query_server): Removed constant + operations from a loop. curl_free memory. 2021-09-06 Dmitry V. Levin <ldv@altlinux.org> diff --git a/debuginfod/debuginfod-client.c b/debuginfod/debuginfod-client.c index d41723ce..f55e1489 100644 --- a/debuginfod/debuginfod-client.c +++ b/debuginfod/debuginfod-client.c @@ -883,6 +883,26 @@ debuginfod_query_server (debuginfod_client *c, data[i].errbuf[0] = '\0'; } + char *escaped_string = NULL; + size_t escaped_strlen = 0; + if (filename) + { + escaped_string = curl_easy_escape(&target_handle, filename+1, 0); + if (!escaped_string) + { + rc = -ENOMEM; + goto out2; + } + char *loc = escaped_string; + escaped_strlen = strlen(escaped_string) - 2 + (size_t)escaped_string; + while ((loc = strstr(loc, "%2F"))) + { + loc[0] = '/'; + // pull the string back after replacement + memmove(loc+1, loc+3,escaped_strlen - (size_t)loc ); + escaped_strlen -=2; + } + } /* Initialize each handle. */ for (int i = 0; i < num_urls; i++) { @@ -904,16 +924,8 @@ debuginfod_query_server (debuginfod_client *c, if (filename) /* must start with / */ { /* PR28034 escape characters in completed url to %hh format. */ - char *escaped_string; - escaped_string = curl_easy_escape(data[i].handle, filename, 0); - if (!escaped_string) - { - rc = -ENOMEM; - goto out2; - } snprintf(data[i].url, PATH_MAX, "%s/%s/%s/%s", server_url, build_id_bytes, type, escaped_string); - curl_free(escaped_string); } else snprintf(data[i].url, PATH_MAX, "%s/%s/%s", server_url, build_id_bytes, type); @@ -953,6 +965,7 @@ debuginfod_query_server (debuginfod_client *c, curl_multi_add_handle(curlm, data[i].handle); } + if (filename) curl_free(escaped_string); /* Query servers in parallel. */ if (vfd >= 0) dprintf (vfd, "query %d urls in parallel\n", num_urls); diff --git a/tests/ChangeLog b/tests/ChangeLog index 1154686a..3f219320 100644 --- a/tests/ChangeLog +++ b/tests/ChangeLog @@ -30,6 +30,11 @@ * run-debuginfod-federation-metrics.sh: Likewise. * run-debuginfod-federation-sqlite.sh: Likewise. +2021-09-13 Noah Sanci <nsanci@redhat.com> + + * Makefile.am: added run-debuginfod-percent-escape.sh to TESTS and + EXTRA_DIST. + 2021-09-06 Dmitry V. Levin <ldv@altlinux.org> * elfcopy.c (copy_elf): Remove cast of malloc return value. @@ -164,11 +169,8 @@ 2021-07-16 Noah Sanci <nsanci@redhat.com> PR28034 - * run-debuginfod-find.sh: Added a test ensuring files with % - escapable characters in their paths are accessible. The test - itself is changing the name of a binary known previously as prog to - p+r%o$g. General operations such as accessing p+r%o$g acts as the - test for %-escape checking. + * run-debuginfod-percent-escape.sh: Added a test ensuring files with % + escapable characters in their paths are accessible. 2021-07-21 Noah Sanci <nsanci@redhat.com> diff --git a/tests/Makefile.am b/tests/Makefile.am index 22942733..43c34ce6 100644 --- a/tests/Makefile.am +++ b/tests/Makefile.am @@ -231,6 +231,7 @@ TESTS += run-debuginfod-dlopen.sh \ run-debuginfod-federation-sqlite.sh \ run-debuginfod-federation-link.sh \ run-debuginfod-federation-metrics.sh \ + run-debuginfod-percent-escape.sh \ run-debuginfod-x-forwarded-for.sh endif endif @@ -524,6 +525,7 @@ EXTRA_DIST = run-arextract.sh run-arsymtest.sh run-ar.sh \ run-debuginfod-archive-groom.sh \ run-debuginfod-archive-rename.sh \ run-debuginfod-archive-test.sh \ + run-debuginfod-percent-escape.sh \ debuginfod-rpms/fedora30/hello2-1.0-2.src.rpm \ debuginfod-rpms/fedora30/hello2-1.0-2.x86_64.rpm \ debuginfod-rpms/fedora30/hello2-debuginfo-1.0-2.x86_64.rpm \ diff --git a/tests/run-debuginfod-percent-escape.sh b/tests/run-debuginfod-percent-escape.sh new file mode 100755 index 00000000..f7d8dc66 --- /dev/null +++ b/tests/run-debuginfod-percent-escape.sh @@ -0,0 +1,60 @@ +#!/usr/bin/env bash +# +# Copyright (C) 2019-2021 Red Hat, Inc. +# This file is part of elfutils. +# +# This file is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3 of the License, or +# (at your option) any later version. +# +# elfutils is distributed in the hope that it will be useful, but +# WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see <http://www.gnu.org/licenses/>. + +. $srcdir/debuginfod-subr.sh # includes set -e +# for test case debugging, uncomment: +set -x +unset VALGRIND_CMD +# This variable is essential and ensures no time-race for claiming ports occurs +# set base to a unique multiple of 100 not used in any other 'run-debuginfod-*' test +base=10000 +get_ports +DB=${PWD}/.debuginfod_tmp.sqlite +tempfiles $DB +mkdir F +env LD_LIBRARY_PATH=$ldpath ${abs_builddir}/../debuginfod/debuginfod $VERBOSE \ + -F -R -d $DB -p $PORT1 -t0 -g0 -v R ${PWD}/F > vlog$PORT1 2>&1 & +PID1=$! +tempfiles vlog$PORT1 +errfiles vlog$PORT1 +# Server must become ready +wait_ready $PORT1 'ready' 1 +# Be patient when run on a busy machine things might take a bit. + +# Build a non-stripped binary +echo "int main() { return 0; }" > ${PWD}/F/p++r\$\#o^^g.c +gcc -Wl,--build-id -g -o ${PWD}/F/p++r\$\#o^^g ${PWD}/F/p++r\$\#o^^g.c +BUILDID=`env LD_LIBRARY_PATH=$ldpath ${abs_builddir}/../src/readelf \ + -a ${PWD}/F/p++r\\$\#o^^g | grep 'Build ID' | cut -d ' ' -f 7` +tempfiles ${PWD}/F/p++r\$\#o^^g.c ${PWD}/F/p++r\$\#o^^g +kill -USR1 $PID1 +# Now there should be 1 files in the index +wait_ready $PORT1 'thread_work_total{role="traverse"}' 1 +wait_ready $PORT1 'thread_work_pending{role="scan"}' 0 +wait_ready $PORT1 'thread_busy{role="scan"}' 0 +rm -rf $DEBUGINFOD_CACHE_PATH # clean it from previous tests +ls F +env DEBUGINFOD_CACHE_PATH=${PWD}/.client_cache DEBUGINFOD_URLS="http://127.0.0.1:$PORT1" \ + LD_LIBRARY_PATH=$ldpath ${abs_top_builddir}/debuginfod/debuginfod-find -vvv source F/p++r\$\#o^^g ${abs_builddir}/F/p++r\$\#o^^g.c > vlog1 2>&1 || true +tempfiles vlog1 +grep 'F/p%2B%2Br%24%23o%5E%5Eg.c' vlog1 + +kill $PID1 +wait $PID1 +PID1=0 +exit 0 -- 2.31.1 ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Bug debuginfod/28034] client-side %-escape url characters 2021-09-13 16:20 ` Noah Sanci @ 2021-09-13 18:11 ` Noah Sanci 2021-09-16 10:35 ` Mark Wielaard 0 siblings, 1 reply; 11+ messages in thread From: Noah Sanci @ 2021-09-13 18:11 UTC (permalink / raw) To: Mark Wielaard, elfutils-devel [-- Attachment #1: Type: text/plain, Size: 104 bytes --] Hello Quick arithmetic change to the original patch with an updated commit message. Best, Noah Sanci [-- Attachment #2: 0001-debuginfod-PR28034-No-longer-escape-and-loop-efficie.patch --] [-- Type: text/x-patch, Size: 8628 bytes --] From cb2a3957d017f40e1edb35ed8f8fd3b9dee1c6be Mon Sep 17 00:00:00 2001 From: Noah Sanci <nsanci@redhat.com> Date: Thu, 9 Sep 2021 13:10:33 -0400 Subject: [PATCH] debuginfod: PR28034 - No longer escape '/', and loop efficiency Previously, urls containing '/', so most urls, would escape '/' to %2F, which is undesirable for use in other libraries which may escape differently. This patch escapes the '/' and replaces all of them ensuring there are no %2Fs sent. Some inefficiencies within the code were fixed, such as changing constant operations of a while loop within a for loop to a while loop outside of a for loop. Also strlen is no longer used within the loop, simplifying the interior operations to mere arithmetic. https://sourceware.org/bugzilla/show_bug.cgi?id=28034 Signed-off-by: Noah Sanci <nsanci@redhat.com> --- debuginfod/ChangeLog | 4 ++ debuginfod/debuginfod-client.c | 35 +++++++++++---- tests/ChangeLog | 12 +++--- tests/Makefile.am | 2 + tests/run-debuginfod-percent-escape.sh | 60 ++++++++++++++++++++++++++ 5 files changed, 100 insertions(+), 13 deletions(-) create mode 100755 tests/run-debuginfod-percent-escape.sh diff --git a/debuginfod/ChangeLog b/debuginfod/ChangeLog index 1173f9cd..b8eaa88c 100644 --- a/debuginfod/ChangeLog +++ b/debuginfod/ChangeLog @@ -2,6 +2,10 @@ * debuginfod.cxx (libarchive_fdcache::lookup): Add endl after obatched(clog) line. +2021-09-13 Noah Sanci <nsanci@redhat.com> + + * debuginfod-client.c (debuginfod_query_server): Removed constant + operations from a loop. curl_free memory. 2021-09-06 Dmitry V. Levin <ldv@altlinux.org> diff --git a/debuginfod/debuginfod-client.c b/debuginfod/debuginfod-client.c index d41723ce..8a1c68d5 100644 --- a/debuginfod/debuginfod-client.c +++ b/debuginfod/debuginfod-client.c @@ -883,6 +883,32 @@ debuginfod_query_server (debuginfod_client *c, data[i].errbuf[0] = '\0'; } + char *escaped_string = NULL; + size_t escaped_strlen = 0; + if (filename) + { + escaped_string = curl_easy_escape(&target_handle, filename+1, 0); + if (!escaped_string) + { + rc = -ENOMEM; + goto out2; + } + char *loc = escaped_string; + escaped_strlen = strlen(escaped_string); + while ((loc = strstr(loc, "%2F"))) + { + loc[0] = '/'; + //pull the string back after replacement + // loc-escaped_string finds the distance from the origin to the new location + // - 2 accounts for the 2F which remain and don't need to be measured. + // The two above subtracted from escaped_strlen yields the remaining characters + // in the string which we want to pull back + memmove(loc+1, loc+3,escaped_strlen - (loc-escaped_string) - 2); + //Because the 2F was overwritten in the memmove (as desired) escaped_strlen is + // now two shorter. + escaped_strlen -= 2; + } + } /* Initialize each handle. */ for (int i = 0; i < num_urls; i++) { @@ -904,16 +930,8 @@ debuginfod_query_server (debuginfod_client *c, if (filename) /* must start with / */ { /* PR28034 escape characters in completed url to %hh format. */ - char *escaped_string; - escaped_string = curl_easy_escape(data[i].handle, filename, 0); - if (!escaped_string) - { - rc = -ENOMEM; - goto out2; - } snprintf(data[i].url, PATH_MAX, "%s/%s/%s/%s", server_url, build_id_bytes, type, escaped_string); - curl_free(escaped_string); } else snprintf(data[i].url, PATH_MAX, "%s/%s/%s", server_url, build_id_bytes, type); @@ -953,6 +971,7 @@ debuginfod_query_server (debuginfod_client *c, curl_multi_add_handle(curlm, data[i].handle); } + if (filename) curl_free(escaped_string); /* Query servers in parallel. */ if (vfd >= 0) dprintf (vfd, "query %d urls in parallel\n", num_urls); diff --git a/tests/ChangeLog b/tests/ChangeLog index 1154686a..3f219320 100644 --- a/tests/ChangeLog +++ b/tests/ChangeLog @@ -30,6 +30,11 @@ * run-debuginfod-federation-metrics.sh: Likewise. * run-debuginfod-federation-sqlite.sh: Likewise. +2021-09-13 Noah Sanci <nsanci@redhat.com> + + * Makefile.am: added run-debuginfod-percent-escape.sh to TESTS and + EXTRA_DIST. + 2021-09-06 Dmitry V. Levin <ldv@altlinux.org> * elfcopy.c (copy_elf): Remove cast of malloc return value. @@ -164,11 +169,8 @@ 2021-07-16 Noah Sanci <nsanci@redhat.com> PR28034 - * run-debuginfod-find.sh: Added a test ensuring files with % - escapable characters in their paths are accessible. The test - itself is changing the name of a binary known previously as prog to - p+r%o$g. General operations such as accessing p+r%o$g acts as the - test for %-escape checking. + * run-debuginfod-percent-escape.sh: Added a test ensuring files with % + escapable characters in their paths are accessible. 2021-07-21 Noah Sanci <nsanci@redhat.com> diff --git a/tests/Makefile.am b/tests/Makefile.am index 22942733..43c34ce6 100644 --- a/tests/Makefile.am +++ b/tests/Makefile.am @@ -231,6 +231,7 @@ TESTS += run-debuginfod-dlopen.sh \ run-debuginfod-federation-sqlite.sh \ run-debuginfod-federation-link.sh \ run-debuginfod-federation-metrics.sh \ + run-debuginfod-percent-escape.sh \ run-debuginfod-x-forwarded-for.sh endif endif @@ -524,6 +525,7 @@ EXTRA_DIST = run-arextract.sh run-arsymtest.sh run-ar.sh \ run-debuginfod-archive-groom.sh \ run-debuginfod-archive-rename.sh \ run-debuginfod-archive-test.sh \ + run-debuginfod-percent-escape.sh \ debuginfod-rpms/fedora30/hello2-1.0-2.src.rpm \ debuginfod-rpms/fedora30/hello2-1.0-2.x86_64.rpm \ debuginfod-rpms/fedora30/hello2-debuginfo-1.0-2.x86_64.rpm \ diff --git a/tests/run-debuginfod-percent-escape.sh b/tests/run-debuginfod-percent-escape.sh new file mode 100755 index 00000000..f7d8dc66 --- /dev/null +++ b/tests/run-debuginfod-percent-escape.sh @@ -0,0 +1,60 @@ +#!/usr/bin/env bash +# +# Copyright (C) 2019-2021 Red Hat, Inc. +# This file is part of elfutils. +# +# This file is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3 of the License, or +# (at your option) any later version. +# +# elfutils is distributed in the hope that it will be useful, but +# WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see <http://www.gnu.org/licenses/>. + +. $srcdir/debuginfod-subr.sh # includes set -e +# for test case debugging, uncomment: +set -x +unset VALGRIND_CMD +# This variable is essential and ensures no time-race for claiming ports occurs +# set base to a unique multiple of 100 not used in any other 'run-debuginfod-*' test +base=10000 +get_ports +DB=${PWD}/.debuginfod_tmp.sqlite +tempfiles $DB +mkdir F +env LD_LIBRARY_PATH=$ldpath ${abs_builddir}/../debuginfod/debuginfod $VERBOSE \ + -F -R -d $DB -p $PORT1 -t0 -g0 -v R ${PWD}/F > vlog$PORT1 2>&1 & +PID1=$! +tempfiles vlog$PORT1 +errfiles vlog$PORT1 +# Server must become ready +wait_ready $PORT1 'ready' 1 +# Be patient when run on a busy machine things might take a bit. + +# Build a non-stripped binary +echo "int main() { return 0; }" > ${PWD}/F/p++r\$\#o^^g.c +gcc -Wl,--build-id -g -o ${PWD}/F/p++r\$\#o^^g ${PWD}/F/p++r\$\#o^^g.c +BUILDID=`env LD_LIBRARY_PATH=$ldpath ${abs_builddir}/../src/readelf \ + -a ${PWD}/F/p++r\\$\#o^^g | grep 'Build ID' | cut -d ' ' -f 7` +tempfiles ${PWD}/F/p++r\$\#o^^g.c ${PWD}/F/p++r\$\#o^^g +kill -USR1 $PID1 +# Now there should be 1 files in the index +wait_ready $PORT1 'thread_work_total{role="traverse"}' 1 +wait_ready $PORT1 'thread_work_pending{role="scan"}' 0 +wait_ready $PORT1 'thread_busy{role="scan"}' 0 +rm -rf $DEBUGINFOD_CACHE_PATH # clean it from previous tests +ls F +env DEBUGINFOD_CACHE_PATH=${PWD}/.client_cache DEBUGINFOD_URLS="http://127.0.0.1:$PORT1" \ + LD_LIBRARY_PATH=$ldpath ${abs_top_builddir}/debuginfod/debuginfod-find -vvv source F/p++r\$\#o^^g ${abs_builddir}/F/p++r\$\#o^^g.c > vlog1 2>&1 || true +tempfiles vlog1 +grep 'F/p%2B%2Br%24%23o%5E%5Eg.c' vlog1 + +kill $PID1 +wait $PID1 +PID1=0 +exit 0 -- 2.31.1 ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Bug debuginfod/28034] client-side %-escape url characters 2021-09-13 18:11 ` Noah Sanci @ 2021-09-16 10:35 ` Mark Wielaard 0 siblings, 0 replies; 11+ messages in thread From: Mark Wielaard @ 2021-09-16 10:35 UTC (permalink / raw) To: Noah Sanci, elfutils-devel Hi Noah, On Mon, 2021-09-13 at 14:11 -0400, Noah Sanci via Elfutils-devel wrote: > Quick arithmetic change to the original patch with an updated commit > message. Thanks for commenting this, it is easy to make an off-by-one (or two) error. > Previously, urls containing '/', so most urls, would escape '/' to %2F, > which is undesirable for use in other libraries which may escape > differently. This patch escapes the '/' and replaces all of them > ensuring there are no %2Fs sent. > Some inefficiencies within the code were fixed, such as changing constant > operations of a while loop within a for loop to a while loop outside of > a for loop. Also strlen is no longer used within the loop, simplifying > the interior operations to mere arithmetic. > > https://sourceware.org/bugzilla/show_bug.cgi?id=28034 Looks good. Could you push it to the master branch? (Please do rebase first, so we keep a linear history) Thanks, Mark ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2021-09-16 10:35 UTC | newest] Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-08-26 19:27 [Bug debuginfod/28034] client-side %-escape url characters Noah Sanci 2021-08-26 21:02 ` Frank Ch. Eigler 2021-08-27 14:44 ` Noah Sanci 2021-08-27 15:07 ` Noah Sanci 2021-08-27 15:30 ` Noah Sanci 2021-09-08 13:38 ` Mark Wielaard 2021-09-09 17:28 ` Noah Sanci 2021-09-12 17:24 ` Mark Wielaard 2021-09-13 16:20 ` Noah Sanci 2021-09-13 18:11 ` Noah Sanci 2021-09-16 10:35 ` Mark Wielaard
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).