From: Noah Sanci <nsanci@redhat.com>
To: Mark Wielaard <mark@klomp.org>
Cc: elfutils-devel@sourceware.org, Frank Eigler <fche@redhat.com>
Subject: Re: [Bug debuginfod/28034] %-escape url characters
Date: Wed, 21 Jul 2021 14:18:15 -0400 [thread overview]
Message-ID: <CAJXA7qiHxBxryT67msvo2e256LYj9syWKx=2LxFErZL81KDkQw@mail.gmail.com> (raw)
In-Reply-To: <YPcM03PSfYzbaq6Z@wildebeest.org>
[-- Attachment #1: Type: text/plain, Size: 5851 bytes --]
Hello,
> On Mon, Jul 19, 2021 at 10:53:17AM -0400, Noah Sanci via Elfutils-devel wrote:
> > When requesting some source files, some URL-inconvenient chars
> > sometimes pop up. Example from f33 libstdc++:
> > /buildid/44d8485cb75512c2ca5c8f70afbd475cae30af4f/source/usr/src/debug/
> > gcc-10.3.1-1.fc33.x86_64/obj-x86_64-redhat-linux/x86_64-redhat-linux/
> > libstdc++-v3/src/c++11/../../../../../libstdc++-v3/src/c++11/
> > condition_variable.cc
> > As this URL is passed into debuginfod's handler_cb, it appears that the
> > + signs are helpfully unescaped to spaces by libmicrohttpd, which
> > 'course breaks everything. We need to suppress this HTTP URL
> > processing step. Also worth checking that %HEX decoding is also
> > suppressed.
>
> I find this description slightly confusing. It ends with "Also worth
> checking..." but that is actually what is done in this patch. The part
> before that about what debuginfod/microhttpd does on the server side
> is interesting, but not really what this patch is about (just why this
> patch is necessary, but it seems necessary on the client side always
> whatever server is used).
Thank you for bringing this to my attention. Fixed
> curl_easy_escape () can return NULL on failure.
>
Fixed.
> > +Note: the client should %-escape characters in /SOURCE/FILE that are not shown as "unreserved" in section 2.3 of RFC3986.
Please read the new note just to ensure that it's reasonable, as my
updated note specifies some characters that will be percent escaped.
Otherwise fixed.
> > @@ -149,18 +150,18 @@ ps -q $PID1 -e -L -o '%p %c %a' | grep traverse
> > # Compile a simple program, strip its debuginfo and save the build-id.
> > # Also move the debuginfo into another directory so that elfutils
> > # cannot find it without debuginfod.
> > -echo "int main() { return 0; }" > ${PWD}/prog.c
> > -tempfiles prog.c
> > +echo "int main() { return 0; }" > ${PWD}/p+r%o\$g.c
> > +tempfiles p+r%o\$g.c
> > # Create a subdirectory to confound source path names
> > mkdir foobar
> > -gcc -Wl,--build-id -g -o prog ${PWD}/foobar///./../prog.c
> > -testrun ${abs_top_builddir}/src/strip -g -f prog.debug ${PWD}/prog
> > +gcc -Wl,--build-id -g -o p+r%o\$g ${PWD}/foobar///./../p+r%o\$g.c
> > +testrun ${abs_top_builddir}/src/strip -g -f p+r%o\$g.debug ${PWD}/p+r%o\$g
> > BUILDID=`env LD_LIBRARY_PATH=$ldpath ${abs_builddir}/../src/readelf \
> > - -a prog | grep 'Build ID' | cut -d ' ' -f 7`
> > + -a p+r%o\\$g | grep 'Build ID' | cut -d ' ' -f 7`
> >
> > wait_ready $PORT1 'thread_work_total{role="traverse"}' 1
> > -mv prog F
> > -mv prog.debug F
> > +mv p+r%o\$g F
> > +mv p+r%o\$g.debug F
> > kill -USR1 $PID1
> > # Wait till both files are in the index.
> > wait_ready $PORT1 'thread_work_total{role="traverse"}' 2
> > @@ -171,7 +172,7 @@ wait_ready $PORT1 'thread_busy{role="scan"}' 0
> >
> > # Test whether elfutils, via the debuginfod client library dlopen hooks,
> > # is able to fetch debuginfo from the local debuginfod.
> > -testrun ${abs_builddir}/debuginfod_build_id_find -e F/prog 1
> > +testrun ${abs_builddir}/debuginfod_build_id_find -e F/p+r%o\$g 1
> >
> > ########################################################################
> >
> > @@ -213,22 +214,22 @@ fi
> > # Test whether debuginfod-find is able to fetch those files.
> > rm -rf $DEBUGINFOD_CACHE_PATH # clean it from previous tests
> > filename=`testrun ${abs_top_builddir}/debuginfod/debuginfod-find debuginfo $BUILDID`
> > -cmp $filename F/prog.debug
> > +cmp $filename F/p+r%o\$g.debug
> > if [ -w $filename ]; then
> > echo "cache file writable, boo"
> > err
> > fi
> >
> > -filename=`testrun ${abs_top_builddir}/debuginfod/debuginfod-find executable F/prog`
> > -cmp $filename F/prog
> > +filename=`testrun ${abs_top_builddir}/debuginfod/debuginfod-find executable F/p+r%o\\$g`
> > +cmp $filename F/p+r%o\$g
> >
> > # raw source filename
> > -filename=`testrun ${abs_top_builddir}/debuginfod/debuginfod-find source $BUILDID ${PWD}/foobar///./../prog.c`
> > -cmp $filename ${PWD}/prog.c
> > +filename=`testrun ${abs_top_builddir}/debuginfod/debuginfod-find source $BUILDID ${PWD}/foobar///./../p+r%o\\$g.c`
> > +cmp $filename ${PWD}/p+r%o\$g.c
> >
> > # and also the canonicalized one
> > -filename=`testrun ${abs_top_builddir}/debuginfod/debuginfod-find source $BUILDID ${PWD}/prog.c`
> > -cmp $filename ${PWD}/prog.c
> > +filename=`testrun ${abs_top_builddir}/debuginfod/debuginfod-find source $BUILDID ${PWD}/p+r%o\\$g.c`
> > +cmp $filename ${PWD}/p+r%o\$g.c
> >
> >
> > ########################################################################
> > @@ -672,7 +673,7 @@ touch -d '1970-01-01' $DEBUGINFOD_CACHE_PATH/00000000 # old enough to guarantee
> > echo 0 > $DEBUGINFOD_CACHE_PATH/cache_clean_interval_s
> > echo 0 > $DEBUGINFOD_CACHE_PATH/max_unused_age_s
> >
> > -testrun ${abs_builddir}/debuginfod_build_id_find -e F/prog 1
> > +testrun ${abs_builddir}/debuginfod_build_id_find -e F/p+r%o\$g 1
> >
> > testrun ${abs_top_builddir}/debuginfod/debuginfod-find debuginfo $BUILDID2 && false || true
> >
> > @@ -725,6 +726,7 @@ PID4=$!
> > wait_ready $PORT3 'ready' 1
> > tempfiles vlog$PORT3
> > errfiles vlog$PORT3
> > +tempfiles $DB.backup*
>
> ???
>
Hopefully the updated changelog message clarifies these changes.
> > kill -USR2 $PID4
> > wait_ready $PORT3 'thread_work_total{role="groom"}' 1
> > @@ -738,6 +740,7 @@ wait_ready $PORT3 'groom{statistic="files scanned (#)"}' 0
> > wait_ready $PORT3 'groom{statistic="files scanned (mb)"}' 0
> >
> > kill $PID4
> > +PID4=0
>
> And this?
> Might need to be
>
> wait $PID
> PID4=0
The small test suite fixes have been removed from the script, however
I would like to update the test suite ASAP. Can I just commit a new
branch titled something like 'elfutils-test-fixes' without a specific
pr?
Noah
[-- Attachment #2: 0001-debuginfod-PR28034-client-side-escape-url-characters.patch --]
[-- Type: text/x-patch, Size: 7532 bytes --]
From 26d39eee2ecc030ac7e488977092f62afd72e903 Mon Sep 17 00:00:00 2001
From: Noah Sanci <nsanci@redhat.com>
Date: Fri, 16 Jul 2021 15:16:20 -0400
Subject: [PATCH] debuginfod: PR28034 - client-side %-escape url characters
When requesting some source files, some URL-inconvenient chars
sometimes pop up. Example from f33 libstdc++:
/buildid/44d8485cb75512c2ca5c8f70afbd475cae30af4f/source/usr/src/debug/
gcc-10.3.1-1.fc33.x86_64/obj-x86_64-redhat-linux/x86_64-redhat-linux/
libstdc++-v3/src/c++11/../../../../../libstdc++-v3/src/c++11/
condition_variable.cc
As this URL is passed into debuginfod's handler_cb, it appears that the
+ signs are helpfully unescaped to spaces by libmicrohttpd, which
'course breaks everything.
In order to ensure the server properly parses urls such as this one,
%-escape characters on the client side so that the correct url
is preserved and properly processed on the server side.
https://sourceware.org/bugzilla/show_bug.cgi?id=28034
Signed-off-by: Noah Sanci <nsanci@redhat.com>
a
---
debuginfod/ChangeLog | 6 ++++++
debuginfod/debuginfod-client.c | 17 ++++++++++++++---
doc/debuginfod.8 | 4 ++++
tests/ChangeLog | 9 +++++++++
tests/run-debuginfod-find.sh | 32 ++++++++++++++++----------------
5 files changed, 49 insertions(+), 19 deletions(-)
diff --git a/debuginfod/ChangeLog b/debuginfod/ChangeLog
index e71436ca..aad35a4e 100644
--- a/debuginfod/ChangeLog
+++ b/debuginfod/ChangeLog
@@ -1,3 +1,9 @@
+2021-07-16 Noah Sanci <nsanci@redhat.com>
+
+ PR28034
+ * debuginfod-client.c (debuginfod_query_server): % escape filename
+ so the completed url is processed properly.
+
2021-07-06 Alice Zhang <alizhang@redhat.com>
PR27531
diff --git a/debuginfod/debuginfod-client.c b/debuginfod/debuginfod-client.c
index f12f609c..64936acd 100644
--- a/debuginfod/debuginfod-client.c
+++ b/debuginfod/debuginfod-client.c
@@ -831,9 +831,20 @@ debuginfod_query_server (debuginfod_client *c,
else
slashbuildid = "/buildid";
- if (filename) /* must start with / */
- snprintf(data[i].url, PATH_MAX, "%s%s/%s/%s%s", server_url,
- slashbuildid, build_id_bytes, type, filename);
+ if (filename)/* must start with / */
+ {
+ /* PR28034 escape characters in completed url to %hh format. */
+ char *escaped_string;
+ escaped_string = curl_easy_escape(data[i].handle, filename, 0);
+ if (!escaped_string)
+ {
+ rc = ENOMEM;
+ goto out1;
+ }
+ snprintf(data[i].url, PATH_MAX, "%s%s/%s/%s%s", server_url,
+ slashbuildid, build_id_bytes, type, escaped_string);
+ curl_free(escaped_string);
+ }
else
snprintf(data[i].url, PATH_MAX, "%s%s/%s/%s", server_url,
slashbuildid, build_id_bytes, type);
diff --git a/doc/debuginfod.8 b/doc/debuginfod.8
index 1adf703a..ee8f76ae 100644
--- a/doc/debuginfod.8
+++ b/doc/debuginfod.8
@@ -299,6 +299,10 @@ l l.
\../bar/foo.c AT_comp_dir=/zoo/ /buildid/BUILDID/source/zoo//../bar/foo.c
.TE
+Note: the client should %-escape characters in /SOURCE/FILE that are
+not shown as "unreserved" in section 2.3 of RFC3986. Some characters
+that will be escaped include "+", "\\", "$", "!", the 'space' character,
+and ";". RFC3986 includes a more comprehensive list of these characters.
.SS /metrics
This endpoint returns a Prometheus formatted text/plain dump of a
diff --git a/tests/ChangeLog b/tests/ChangeLog
index 1460b422..0840e7af 100644
--- a/tests/ChangeLog
+++ b/tests/ChangeLog
@@ -1,3 +1,12 @@
+2021-07-16 Noah Sanci <nsanci@redhat.com>
+
+ PR28034
+ * run-debuginfod-find.sh: Added a test ensuring files with %
+ escapable characters in their paths are accessible. The test
+ itself is changing the name of a binary known previously as prog to
+ p+r%o$g. General operations such as accessing p+r%o$g acts as the
+ test for %-escape checking.
+
2021-07-06 Alice Zhang <alizhang@redhat.com>
PR27531
diff --git a/tests/run-debuginfod-find.sh b/tests/run-debuginfod-find.sh
index 73bbe65b..ecf3195e 100755
--- a/tests/run-debuginfod-find.sh
+++ b/tests/run-debuginfod-find.sh
@@ -149,18 +149,18 @@ ps -q $PID1 -e -L -o '%p %c %a' | grep traverse
# Compile a simple program, strip its debuginfo and save the build-id.
# Also move the debuginfo into another directory so that elfutils
# cannot find it without debuginfod.
-echo "int main() { return 0; }" > ${PWD}/prog.c
-tempfiles prog.c
+echo "int main() { return 0; }" > ${PWD}/p+r%o\$g.c
+tempfiles p+r%o\$g.c
# Create a subdirectory to confound source path names
mkdir foobar
-gcc -Wl,--build-id -g -o prog ${PWD}/foobar///./../prog.c
-testrun ${abs_top_builddir}/src/strip -g -f prog.debug ${PWD}/prog
+gcc -Wl,--build-id -g -o p+r%o\$g ${PWD}/foobar///./../p+r%o\$g.c
+testrun ${abs_top_builddir}/src/strip -g -f p+r%o\$g.debug ${PWD}/p+r%o\$g
BUILDID=`env LD_LIBRARY_PATH=$ldpath ${abs_builddir}/../src/readelf \
- -a prog | grep 'Build ID' | cut -d ' ' -f 7`
+ -a p+r%o\\$g | grep 'Build ID' | cut -d ' ' -f 7`
wait_ready $PORT1 'thread_work_total{role="traverse"}' 1
-mv prog F
-mv prog.debug F
+mv p+r%o\$g F
+mv p+r%o\$g.debug F
kill -USR1 $PID1
# Wait till both files are in the index.
wait_ready $PORT1 'thread_work_total{role="traverse"}' 2
@@ -171,7 +171,7 @@ wait_ready $PORT1 'thread_busy{role="scan"}' 0
# Test whether elfutils, via the debuginfod client library dlopen hooks,
# is able to fetch debuginfo from the local debuginfod.
-testrun ${abs_builddir}/debuginfod_build_id_find -e F/prog 1
+testrun ${abs_builddir}/debuginfod_build_id_find -e F/p+r%o\$g 1
########################################################################
@@ -213,22 +213,22 @@ fi
# Test whether debuginfod-find is able to fetch those files.
rm -rf $DEBUGINFOD_CACHE_PATH # clean it from previous tests
filename=`testrun ${abs_top_builddir}/debuginfod/debuginfod-find debuginfo $BUILDID`
-cmp $filename F/prog.debug
+cmp $filename F/p+r%o\$g.debug
if [ -w $filename ]; then
echo "cache file writable, boo"
err
fi
-filename=`testrun ${abs_top_builddir}/debuginfod/debuginfod-find executable F/prog`
-cmp $filename F/prog
+filename=`testrun ${abs_top_builddir}/debuginfod/debuginfod-find executable F/p+r%o\\$g`
+cmp $filename F/p+r%o\$g
# raw source filename
-filename=`testrun ${abs_top_builddir}/debuginfod/debuginfod-find source $BUILDID ${PWD}/foobar///./../prog.c`
-cmp $filename ${PWD}/prog.c
+filename=`testrun ${abs_top_builddir}/debuginfod/debuginfod-find source $BUILDID ${PWD}/foobar///./../p+r%o\\$g.c`
+cmp $filename ${PWD}/p+r%o\$g.c
# and also the canonicalized one
-filename=`testrun ${abs_top_builddir}/debuginfod/debuginfod-find source $BUILDID ${PWD}/prog.c`
-cmp $filename ${PWD}/prog.c
+filename=`testrun ${abs_top_builddir}/debuginfod/debuginfod-find source $BUILDID ${PWD}/p+r%o\\$g.c`
+cmp $filename ${PWD}/p+r%o\$g.c
########################################################################
@@ -672,7 +672,7 @@ touch -d '1970-01-01' $DEBUGINFOD_CACHE_PATH/00000000 # old enough to guarantee
echo 0 > $DEBUGINFOD_CACHE_PATH/cache_clean_interval_s
echo 0 > $DEBUGINFOD_CACHE_PATH/max_unused_age_s
-testrun ${abs_builddir}/debuginfod_build_id_find -e F/prog 1
+testrun ${abs_builddir}/debuginfod_build_id_find -e F/p+r%o\$g 1
testrun ${abs_top_builddir}/debuginfod/debuginfod-find debuginfo $BUILDID2 && false || true
--
2.31.1
next prev parent reply other threads:[~2021-07-21 18:18 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-07-19 14:53 Noah Sanci
2021-07-20 17:50 ` Mark Wielaard
2021-07-20 18:13 ` Frank Ch. Eigler
2021-07-20 19:08 ` Mark Wielaard
2021-07-21 18:18 ` Noah Sanci [this message]
2021-07-21 19:44 ` Noah Sanci
2021-07-22 12:29 ` Mark Wielaard
2021-07-22 12:28 ` Mark Wielaard
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAJXA7qiHxBxryT67msvo2e256LYj9syWKx=2LxFErZL81KDkQw@mail.gmail.com' \
--to=nsanci@redhat.com \
--cc=elfutils-devel@sourceware.org \
--cc=fche@redhat.com \
--cc=mark@klomp.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).