public inbox for elfutils@sourceware.org
 help / color / mirror / Atom feed
From: Noah Sanci <nsanci@redhat.com>
To: Mark Wielaard <mark@klomp.org>
Cc: elfutils-devel@sourceware.org
Subject: Re: [Bug debuginfod/27982] added DEBUGINFOD_MAXSIZE and DEBUGINFOD_MAXTIME
Date: Fri, 30 Jul 2021 09:39:56 -0400	[thread overview]
Message-ID: <CAJXA7qieVV5Q6WRReSew8wGcy9JOywmiDqNAbq2dYKAdFnKZog@mail.gmail.com> (raw)
In-Reply-To: <d3fdec3c9005a0b3f9f55891b07e4b8ef205317a.camel@klomp.org>

[-- Attachment #1: Type: text/plain, Size: 1907 bytes --]

Hello,

Here is the real patch :).

Thanks,
Noah

On Fri, Jul 30, 2021 at 7:11 AM Mark Wielaard <mark@klomp.org> wrote:
>
> Hi Noah,
>
> On Thu, 2021-07-29 at 16:29 -0400, Noah Sanci wrote:
> > Why have MAXTIME default to LONG_MAX? Which is long, but different
> > > on
> > > different arches 32/64bit. If MAXSIZE == 0 means infinite, why not make
> > >  MAXTIME=0 the same for consistency?
> >
> > Fixed.
> >
> > > The bug suggests to also check the Content-Length header on reciept (in
> > > case the server didn't support or respect the X-DEBUGINFOD-MAXSIZE
> > > header. Did you try to implement that?
> >
> > Fixed.
> >
> > > I believe various error handling goto out1 should be goto out2 (after
> > > your duplicate urls patch). Could you double check that?
> >
> > Fixed.
> >
> > > diff --git a/tests/run-debuginfod-find.sh b/tests/run-debuginfod-find.sh
> > > index feec5ddf..8ed7743d 100755
> > > --- a/tests/run-debuginfod-find.sh
> > > +++ b/tests/run-debuginfod-find.sh
> > > @@ -187,7 +187,7 @@ tempfiles find-vlog$PORT1
> > >  # wait for the server to fail the same number of times the query is retried.
> > >  wait_ready $PORT1 'http_responses_after_you_milliseconds_count{code="406"}' 1
> > >  # quickly ensure all reporting is functional
> > > -grep 'serving file '${PWD}'/F/p+r%o\$g.debug' vlog$PORT1
> > > +grep 'serving file '$(realpath ${PWD})'/F/p+r%o\$g.debug' vlog$PORT1
> > >  grep 'File too large' vlog$PORT1
> > >  grep 'using max size 1B' find-vlog$PORT1
> > >  if [ -f ${DEBUGINFOD_CACHE_PATH}/${BUILDID} ]; then
> >
> > Fixed.
>
> Great. Thanks. Although you don't just have to do things the way I like
> them. Feel free to push back and tell me you feel the solution you
> chose is better than what I am suggesting. It really is a conversation.
>
> > Find all the fixes in the attached patch.
>
> Looks like you attached the wrong patch (url-duplicate)
>
> Cheers,
>
> Mark
>

[-- Attachment #2: 0001-debuginfod-PR27982-added-DEBUGINFOD_MAXSIZE-and-DEBU.patch --]
[-- Type: text/x-patch, Size: 13148 bytes --]

From b7e28a6a1c5e2664cdaeff7e7f5ac39aebfc12d0 Mon Sep 17 00:00:00 2001
From: Noah Sanci <nsanci@redhat.com>
Date: Mon, 26 Jul 2021 13:29:11 -0400
Subject: [PATCH] debuginfod: PR27982 - added DEBUGINFOD_MAXSIZE and
 DEBUGINFOD_MAXTIME

DEBUGINFOD_TIMEOUT is a good way to catch servers that are too slow to
*start* transmitting a file.  But we have no way of limiting total
download time or space.  A user might prefer to have his debugger fetch
only quick & small files, and make do without the bigger ones.  Some
transitive dependencies of e.g. gnome programs are huge: 3GB of LLVM
debuginfo, 1GB of webkitgtk, etc. etc.
DEBUGINFOD_MAXSIZE and DEBUGINFOD_MAXTIME were added to dictate the
max download size and time of a debuginfod client. DEBUGINFOD_MAXSIZE
is handled server-side and is sent using the http header:
X-DEBUGINFOD-MAXSIZE. The client side then checks to ensure this maxsize
has been respected.

https://sourceware.org/bugzilla/show_bug.cgi?id=27982

Signed-off-by: Noah Sanci <nsanci@redhat.com>
---
 debuginfod/ChangeLog           | 12 +++++
 debuginfod/debuginfod-client.c | 99 ++++++++++++++++++++++++++++++++--
 debuginfod/debuginfod.cxx      | 13 +++++
 debuginfod/debuginfod.h.in     |  2 +
 doc/ChangeLog                  |  6 +++
 doc/debuginfod-find.1          | 15 ++++++
 tests/ChangeLog                |  7 +++
 tests/run-debuginfod-find.sh   | 26 +++++++++
 8 files changed, 177 insertions(+), 3 deletions(-)

diff --git a/debuginfod/ChangeLog b/debuginfod/ChangeLog
index 81eb56f9..9e82d78d 100644
--- a/debuginfod/ChangeLog
+++ b/debuginfod/ChangeLog
@@ -1,3 +1,15 @@
+2021-07-26  Noah Sanci  <nsanci@redhat.com>
+
+	PR27982
+	* debuginfod-client.c (globals): added default_maxsize and
+	default_maxtime.
+	(debuginfod_query_server): Added DEBUGINFOD_MAXSIZE and
+	DEBUGINFOD_MAXTIME envvar processing.
+	* debuginfod.cxx (handler_cb): If the requested file exceeds
+	maxsize return code 406.
+	* debuginfod.h.in: Added DEBUGINFOD_MAXSIZE_ENV_VAR and
+	DEBUGINFOD_MAXTIME_ENV_VAR.
+
 2021-07-16  Noah Sanci  <nsanci@redhat.com>
 
 	PR28034
diff --git a/debuginfod/debuginfod-client.c b/debuginfod/debuginfod-client.c
index a70f6d79..7d4b220f 100644
--- a/debuginfod/debuginfod-client.c
+++ b/debuginfod/debuginfod-client.c
@@ -555,6 +555,39 @@ debuginfod_query_server (debuginfod_client *c,
   free (c->url);
   c->url = NULL;
 
+  /* PR 27982: Add max size if DEBUGINFOD_MAXSIZE is set. */
+  long maxsize = 0;
+  const char *maxsize_envvar;
+  maxsize_envvar = getenv(DEBUGINFOD_MAXSIZE_ENV_VAR);
+  if (maxsize_envvar != NULL)
+    maxsize = atol (maxsize_envvar);
+
+  /* PR 27982: Add max time if DEBUGINFOD_MAXTIME is set. */
+  long maxtime = 0;
+  const char *maxtime_envvar;
+  maxtime_envvar = getenv(DEBUGINFOD_MAXTIME_ENV_VAR);
+  if (maxtime_envvar != NULL)
+    maxtime = atol (maxtime_envvar);
+  if (maxtime && vfd >= 0)
+    dprintf(vfd, "using max time %lds\n", maxtime);
+
+  /* Maxsize is valid*/
+  if (maxsize > 0)
+    {
+      if (vfd)
+        dprintf (vfd, "using max size %ldB\n", maxsize);
+      char *size_header = NULL;
+      rc = asprintf (&size_header, "X-DEBUGINFOD-MAXSIZE: %ld", maxsize);
+      if (rc < 0)
+        {
+          rc = -ENOMEM;
+          goto out;
+        }
+      rc = debuginfod_add_http_header(c, size_header);
+      free(size_header);
+      if (rc < 0)
+        goto out;
+    }
   add_default_headers(c);
 
   /* Copy lowercase hex representation of build_id into buf.  */
@@ -833,7 +866,7 @@ debuginfod_query_server (debuginfod_client *c,
   if (data == NULL)
     {
       rc = -ENOMEM;
-      goto out0;
+      goto out1;
     }
 
   /* thereafter, goto out1 on error.  */
@@ -864,7 +897,7 @@ debuginfod_query_server (debuginfod_client *c,
       if (data[i].handle == NULL)
         {
           rc = -ENETUNREACH;
-          goto out2;
+          goto out1;
         }
       data[i].client = c;
 
@@ -876,7 +909,7 @@ debuginfod_query_server (debuginfod_client *c,
           if (!escaped_string)
             {
               rc = -ENOMEM;
-              goto out1;
+              goto out2;
             }
           snprintf(data[i].url, PATH_MAX, "%s/%s/%s/%s", server_url,
                    build_id_bytes, type, escaped_string);
@@ -927,8 +960,31 @@ debuginfod_query_server (debuginfod_client *c,
   long loops = 0;
   int committed_to = -1;
   bool verbose_reported = false;
+  struct timespec start_time, cur_time;
+  if ( maxtime > 0 && clock_gettime(CLOCK_MONOTONIC_RAW, &start_time) == -1)
+    {
+      rc = errno;
+      goto out2;
+    }
+  long delta = 0;
   do
     {
+      /* Check to see how long querying is taking. */
+      if (maxtime > 0)
+        {
+          if (clock_gettime(CLOCK_MONOTONIC_RAW, &cur_time) == -1)
+            {
+              rc = errno;
+              goto out2;
+            }
+          delta = cur_time.tv_sec - start_time.tv_sec;
+          if ( delta >  maxtime)
+            {
+              dprintf(vfd, "Timeout with max time=%lds and transfer time=%lds\n", maxtime, delta );
+              rc = -ETIME;
+              goto out2;
+            }
+        }
       /* Wait 1 second, the minimum DEBUGINFOD_TIMEOUT.  */
       curl_multi_wait(curlm, NULL, 0, 1000, NULL);
 
@@ -1010,6 +1066,29 @@ debuginfod_query_server (debuginfod_client *c,
           if ((*c->progressfn) (c, pa, pb))
             break;
         }
+      /* Check to see if we are downloading something which exceeds maxsize, if set.*/
+      if (maxsize > 0 && target_handle)
+        {
+          long dl_size = 0;
+#ifdef CURLINFO_SIZE_DOWNLOAD_T
+          curl_off_t download_size_t;
+          if (curl_easy_getinfo(target_handle, CURLINFO_CONTENT_LENGTH_DOWNLOAD_T,
+                                                    &download_size_t) == CURLE_OK)
+            dl_size = download_size_t >= (double)(LONG_MAX+1UL) ? LONG_MAX : (long)download_size_t;
+#else
+          double download_size;
+          if (curl_easy_getinfo(target_handle, CURLINFO_CONTENT_LENGTH_DOWNLOAD,
+                                                    &download_size) == CURLE_OK)
+            dl_size = download_size >= (double)(LONG_MAX+1UL) ? LONG_MAX : (long)download_size;
+#endif
+            if (dl_size > maxsize)
+              {
+                if (vfd >=0)
+                  dprintf(vfd, "Content-Length too large.\n");
+                rc = -EFBIG;
+                goto out2;
+              }
+        }
     } while (still_running);
 
   /* Check whether a query was successful. If so, assign its handle
@@ -1043,6 +1122,8 @@ debuginfod_query_server (debuginfod_client *c,
 
           if (msg->data.result != CURLE_OK)
             {
+              long resp_code;
+              CURLcode ok0;
               /* Unsuccessful query, determine error code.  */
               switch (msg->data.result)
                 {
@@ -1057,6 +1138,16 @@ debuginfod_query_server (debuginfod_client *c,
                 case CURLE_SEND_ERROR: rc = -ECONNRESET; break;
                 case CURLE_RECV_ERROR: rc = -ECONNRESET; break;
                 case CURLE_OPERATION_TIMEDOUT: rc = -ETIME; break;
+                case CURLE_HTTP_RETURNED_ERROR:
+                  ok0 = curl_easy_getinfo (msg->easy_handle,
+                                          CURLINFO_RESPONSE_CODE,
+				          &resp_code);
+                  /* 406 signals that the requested file was too large */
+                  if ( ok0 == CURLE_OK && resp_code == 406)
+                    rc = -EFBIG;
+                  else
+                    rc = -ENOENT;
+                  break;
                 default: rc = -ENOENT; break;
                 }
             }
@@ -1129,6 +1220,8 @@ debuginfod_query_server (debuginfod_client *c,
       if (efd >= 0)
         close(efd);
     }
+  else if (rc == -EFBIG)
+    goto out2;
 
   /* If the verified_handle is NULL and rc != -ENOENT, the query fails with
    * an error code other than 404, then do several retry within the retry_limit.
diff --git a/debuginfod/debuginfod.cxx b/debuginfod/debuginfod.cxx
index 503edba7..4ddd9255 100644
--- a/debuginfod/debuginfod.cxx
+++ b/debuginfod/debuginfod.cxx
@@ -2102,6 +2102,13 @@ handler_cb (void * /*cls*/,
     }
   *ptr = NULL;                     /* reset when done */
   
+  const char *maxsize_string = MHD_lookup_connection_value(connection, MHD_HEADER_KIND, "X-DEBUGINFOD-MAXSIZE");
+  long maxsize = 0;
+  if (maxsize_string != NULL && maxsize_string[0] != '\0')
+    maxsize = atol(maxsize_string);
+  else
+    maxsize = 0;
+
 #if MHD_VERSION >= 0x00097002
   enum MHD_Result rc;
 #else
@@ -2190,6 +2197,12 @@ handler_cb (void * /*cls*/,
       if (r == 0)
         throw reportable_exception("internal error, missing response");
 
+      if (maxsize > 0 && http_size > maxsize)
+        {
+          MHD_destroy_response(r);
+          throw reportable_exception(406, "File too large, max size=" + std::to_string(maxsize));
+        }
+
       rc = MHD_queue_response (connection, MHD_HTTP_OK, r);
       http_code = MHD_HTTP_OK;
       MHD_destroy_response (r);
diff --git a/debuginfod/debuginfod.h.in b/debuginfod/debuginfod.h.in
index 282e523d..c358df4d 100644
--- a/debuginfod/debuginfod.h.in
+++ b/debuginfod/debuginfod.h.in
@@ -36,6 +36,8 @@
 #define DEBUGINFOD_PROGRESS_ENV_VAR "DEBUGINFOD_PROGRESS"
 #define DEBUGINFOD_VERBOSE_ENV_VAR "DEBUGINFOD_VERBOSE"
 #define DEBUGINFOD_RETRY_LIMIT_ENV_VAR "DEBUGINFOD_RETRY_LIMIT"
+#define DEBUGINFOD_MAXSIZE_ENV_VAR "DEBUGINFOD_MAXSIZE"
+#define DEBUGINFOD_MAXTIME_ENV_VAR "DEBUGINFOD_MAXTIME"
 
 /* The libdebuginfod soname.  */
 #define DEBUGINFOD_SONAME "@LIBDEBUGINFOD_SONAME@"
diff --git a/doc/ChangeLog b/doc/ChangeLog
index 05fcd23d..1822fc6b 100644
--- a/doc/ChangeLog
+++ b/doc/ChangeLog
@@ -1,3 +1,9 @@
+2021-07-26  Noah Sanci <nsanci@redhat.com>
+
+	PR27982
+	* debuginfod-find.1: Document DEBUGINFOD_MAXTIME
+	and DEBUGINFOD_MAXSIZE.
+
 2021-04-23  Frank Ch. Eigler <fche@redhat.com>
 
 	PR27701
diff --git a/doc/debuginfod-find.1 b/doc/debuginfod-find.1
index 12d4ec2d..482a8ae7 100644
--- a/doc/debuginfod-find.1
+++ b/doc/debuginfod-find.1
@@ -147,6 +147,21 @@ is reexecuted.  Cache management parameters may be set by files under
 this directory: see the \fBdebuginfod_find_debuginfo(3)\fP man page
 for details.  The default is $HOME/.debuginfod_client_cache.
 
+.TP 21
+.B DEBUGINFOD_MAXTIME
+This environment variable dictates how long the client will wait to
+download a file found on a server in seconds. It is best used to ensure
+that a file is downloaded quickly or be rejected. The default is
+0 (infinite time).
+
+.TP 21
+.B DEBUGINFOD_MAXSIZE
+This environment variable dictates the maximum size of a file to
+download in bytes. This is best used if the user would like to ensure
+only small files are downloaded. A value of 0 causes no consideration
+for size, and the client may attempt to download a file of any size.
+The default is 0 (infinite size).
+
 .SH "FILES"
 .LP
 .PD .1v
diff --git a/tests/ChangeLog b/tests/ChangeLog
index dba750e4..34666609 100644
--- a/tests/ChangeLog
+++ b/tests/ChangeLog
@@ -1,3 +1,10 @@
+2021-07-26  Noah Sanci  <nsanci@redhat.com>
+
+	PR27982
+	* run-debuginfod-find.sh: Added a test to ensure that
+	DEBUGINFOD_MAXSIZE and DEBUGINFOD_MAXTIME work properly
+	by searching server and client logs for prompts.
+
 2021-07-16  Noah Sanci  <nsanci@redhat.com>
 
 	PR28034
diff --git a/tests/run-debuginfod-find.sh b/tests/run-debuginfod-find.sh
index 44e16242..991d1dc5 100755
--- a/tests/run-debuginfod-find.sh
+++ b/tests/run-debuginfod-find.sh
@@ -177,7 +177,33 @@ wait_ready $PORT1 'thread_busy{role="scan"}' 0
 testrun ${abs_builddir}/debuginfod_build_id_find -e F/p+r%o\$g 1
 
 ########################################################################
+## PR27892
+# Ensure DEBUGINFOD_MAXSIZE is functional and sends back the correct http
+# code
+env LD_LIBRARY_PATH=$ldpath DEBUGINFOD_RETRY_LIMIT=1 DEBUGINFOD_URLS="http://127.0.0.1:$PORT1/" DEBUGINFOD_MAXSIZE=1 \
+    ${abs_top_builddir}/debuginfod/debuginfod-find -v debuginfo F/p+r%o\$g.debug 2> find-vlog$PORT1 || true
+tempfiles find-vlog$PORT1
+# wait for the server to fail the same number of times the query is retried.
+wait_ready $PORT1 'http_responses_after_you_milliseconds_count{code="406"}' 1
+# ensure all reporting is functional
+grep 'serving file '$(realpath ${PWD})'/F/p+r%o\$g.debug' vlog$PORT1
+grep 'File too large' vlog$PORT1
+grep 'using max size 1B' find-vlog$PORT1
+if [ -f ${DEBUGINFOD_CACHE_PATH}/${BUILDID} ]; then
+  echo "File cached after maxsize check"
+  err
+fi
 
+# Ensure DEBUGINFOD_MAXTIME is functional
+env LD_LIBRARY_PATH=$ldpath DEBUGINFOD_URLS="http://127.0.0.1:8002/" DEBUGINFOD_MAXTIME=1 \
+    ${abs_top_builddir}/debuginfod/debuginfod-find -v debuginfo F/p+r%o\$g.debug 2> find-vlog$PORT1 || true
+grep 'using max time' find-vlog$PORT1
+# Ensure p+r%o\$g.debug is NOT cached
+if [ -f ${DEBUGINFOD_CACHE_PATH}/${BUILDID} ]; then
+  echo "File cached after maxtime check"
+  err
+fi
+########################################################################
 # PR25628
 rm -rf $DEBUGINFOD_CACHE_PATH # clean it from previous tests
 
-- 
2.31.1


  reply	other threads:[~2021-07-30 13:40 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-26 18:56 Noah Sanci
2021-07-29 13:36 ` Mark Wielaard
2021-07-29 13:58   ` Frank Ch. Eigler
2021-07-29 20:29   ` Noah Sanci
2021-07-30 11:11     ` Mark Wielaard
2021-07-30 13:39       ` Noah Sanci [this message]
2021-08-04 13:39         ` Mark Wielaard

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAJXA7qieVV5Q6WRReSew8wGcy9JOywmiDqNAbq2dYKAdFnKZog@mail.gmail.com \
    --to=nsanci@redhat.com \
    --cc=elfutils-devel@sourceware.org \
    --cc=mark@klomp.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).