public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
* [PATCH] PR 16286: Reading python value as string beyond declared size
@ 2013-12-02 23:15 Doug Evans
  2013-12-03 20:29 ` Pedro Alves
  0 siblings, 1 reply; 5+ messages in thread
From: Doug Evans @ 2013-12-02 23:15 UTC (permalink / raw)
  To: gdb-patches, brobecker, saugustine

Hi.

The patch for 16196 inadvertently broke 16286.
https://sourceware.org/bugzilla/show_bug.cgi?id=16196
https://sourceware.org/bugzilla/show_bug.cgi?id=16286

I think the fix to 16196 is correct: Why ignore fetchlimit if length > 0?

However, it's not the behaviour c_get_string is expecting.

This patch fixes things by removing fetchlimit when length > 0
when calling read_string in c_get_string.

One could want c_get_string to, say, only ignore the declared size if it's 1,
since that's how the variable-length-array-at-end-of-struct idiom is
commonly used.
I don't have a strong opinion, I just went for preserving
the previous behaviour.

c_get_string is the LA_GET_STRING method,
which is only called from py-value.c,
so I think this patch is reasonable and safe.
We can augment and/or add a new method should the need ever arise.

I will check this in if there are no objections.
I think this is a blocker for 7.7.

2013-12-02  Doug Evans  <dje@google.com>

	PR 16286
	* c-lang.c (c_get_string): Ignore the declared size of the object
	if a specific length is requested.

	testsuite/
	* gdb.python/py-value.c: #include stdlib.h, string.h.
	(str): New struct.
	(main): New local xstr.
	* gdb.python/py-value.exp (test_value_in_inferior): Add test to
	fetch a value as a string with a length beyond the declared length
	of the array.

diff --git a/gdb/c-lang.c b/gdb/c-lang.c
index eecc76d..8d25893 100644
--- a/gdb/c-lang.c
+++ b/gdb/c-lang.c
@@ -227,9 +227,13 @@ c_printstr (struct ui_file *stream, struct type *type,
    until a null character of the appropriate width is found, otherwise
    the string is read to the length of characters specified.  The size
    of a character is determined by the length of the target type of
-   the pointer or array.  If VALUE is an array with a known length,
-   the function will not read past the end of the array.  On
-   completion, *LENGTH will be set to the size of the string read in
+   the pointer or array.
+
+   If VALUE is an array with a known length, and *LENGTH is -1,
+   the function will not read past the end of the array.  However, any
+   declared size of the array is ignored if *LENGTH > 0.
+
+   On completion, *LENGTH will be set to the size of the string read in
    characters.  (If a length of -1 is specified, the length returned
    will not include the null character).  CHARSET is always set to the
    target charset.  */
@@ -309,6 +313,21 @@ c_get_string (struct value *value, gdb_byte **buffer,
     {
       CORE_ADDR addr = value_as_address (value);
 
+      /* Prior to the fix for PR 16196 read_string would ignore fetchlimit
+	 if length > 0.  The old "broken" behaviour is the behaviour we want:
+	 The caller may want to fetch 100 bytes from a variable length array
+	 implemented using the common idiom of having an array of length 1 at
+	 the end of a struct.  In this case we want to ignore the declared
+	 size of the array.  However, it's counterintuitive to implement that
+	 behaviour in read_string: what does fetchlimit otherwise mean if
+	 length > 0.  Therefore we implement the behaviour we want here:
+	 If *length > 0, don't specify a fetchlimit.  This preserves the
+	 previous behaviour.  We could move this check above where we know
+	 whether the array is declared with a fixed size, but we only want
+	 to apply this behaviour when calling read_string.  PR 16286.  */
+      if (*length > 0)
+	fetchlimit = UINT_MAX;
+
       err = read_string (addr, *length, width, fetchlimit,
 			 byte_order, buffer, length);
       if (err)
diff --git a/gdb/testsuite/gdb.python/py-value.c b/gdb/testsuite/gdb.python/py-value.c
index 0c94e64..4f8c27b 100644
--- a/gdb/testsuite/gdb.python/py-value.c
+++ b/gdb/testsuite/gdb.python/py-value.c
@@ -16,6 +16,8 @@
    along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
 
 #include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
 
 struct s
 {
@@ -39,6 +41,13 @@ typedef struct s *PTR;
 
 enum e evalue = TWO;
 
+struct str
+{
+  int length;
+  /* Variable length.  */
+  char text[1];
+};
+
 #ifdef __cplusplus
 
 struct Base {
@@ -86,6 +95,8 @@ main (int argc, char *argv[])
   int i = 2;
   int *ptr_i = &i;
   const char *sn = 0;
+  struct str *xstr;
+
   s.a = 3;
   s.b = 5;
   u.a = 7;
@@ -96,6 +107,12 @@ main (int argc, char *argv[])
   ptr_ref(ptr_i);
 #endif
 
+#define STR_LENGTH 100
+  xstr = (struct str *) malloc (sizeof (*xstr) + STR_LENGTH);
+  xstr->length = STR_LENGTH;
+  memset (xstr->text, 'x', STR_LENGTH);
+#undef STR_LENGTH
+
   save_argv = argv;      /* break to inspect struct and union */
   return 0;
 }
diff --git a/gdb/testsuite/gdb.python/py-value.exp b/gdb/testsuite/gdb.python/py-value.exp
index 43de063..a052104 100644
--- a/gdb/testsuite/gdb.python/py-value.exp
+++ b/gdb/testsuite/gdb.python/py-value.exp
@@ -291,6 +291,12 @@ proc test_value_in_inferior {} {
   # For the purposes of this test, use repr()
   gdb_py_test_silent_cmd "python nullst = nullst.string (length = 9)" "get string beyond null" 1
   gdb_test "python print (repr(nullst))" "u?'divide\\\\x00et'"
+
+  # Test fetching a string longer than its declared (in C) size.
+  # PR 16286
+  gdb_py_test_silent_cmd "python xstr = gdb.parse_and_eval('xstr')" "get xstr" 1
+  gdb_test "python print xstr\['text'\].string (length = xstr\['length'\])" "x{100}" \
+    "read string beyond declared size"
 }
 
 proc test_lazy_strings {} {

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] PR 16286: Reading python value as string beyond declared size
  2013-12-02 23:15 [PATCH] PR 16286: Reading python value as string beyond declared size Doug Evans
@ 2013-12-03 20:29 ` Pedro Alves
  2013-12-03 23:01   ` Doug Evans
  0 siblings, 1 reply; 5+ messages in thread
From: Pedro Alves @ 2013-12-03 20:29 UTC (permalink / raw)
  To: Doug Evans; +Cc: gdb-patches, brobecker, saugustine

On 12/02/2013 11:14 PM, Doug Evans wrote:
> +      if (*length > 0)
> +	fetchlimit = UINT_MAX;

Shouldn't this be:

      if (*length > 0)
	fetchlimit = *length;

?  That is, if the caller specified a limit, why do we do over it?

Couldn't this new check be merge above where we compute
fetchlimit to begin with?  With the comment there adjusted to
something like:

+      /* If have an explicit requested length, use that as fetchlimit.
+        Otherwise, if we know the size of the array, we can use it as
+        a limit on the number of characters to be fetched.  */



BTW, it looks like the not_lval/lval_internalvar path can
blindly read beyond the value's contents buffer, if *length
is bigger than the value's contents buffer size:

  /* If the string lives in GDB's memory instead of the inferior's,
     then we just need to copy it to BUFFER.  Also, since such strings
     are arrays with known size, FETCHLIMIT will hold the size of the
     array.  */
  if ((VALUE_LVAL (value) == not_lval
       || VALUE_LVAL (value) == lval_internalvar)
      && fetchlimit != UINT_MAX)
    {
      int i;
      const gdb_byte *contents = value_contents (value);

      /* If a length is specified, use that.  */
      if (*length >= 0)
	i  = *length;
        ^^^^^^^^^^^^^
      else
 	/* Otherwise, look for a null character.  */
 	for (i = 0; i < fetchlimit; i++)
	  if (extract_unsigned_integer (contents + i * width,
					width, byte_order) == 0)
 	    break;

      /* I is now either a user-defined length, the number of non-null
 	 characters, or FETCHLIMIT.  */
      *length = i * width;
      *buffer = xmalloc (*length);
      memcpy (*buffer, contents, *length);
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-- 
Pedro Alves

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] PR 16286: Reading python value as string beyond declared size
  2013-12-03 20:29 ` Pedro Alves
@ 2013-12-03 23:01   ` Doug Evans
  2013-12-04 11:47     ` Pedro Alves
  0 siblings, 1 reply; 5+ messages in thread
From: Doug Evans @ 2013-12-03 23:01 UTC (permalink / raw)
  To: Pedro Alves; +Cc: gdb-patches, Joel Brobecker, Sterling Augustine

On Tue, Dec 3, 2013 at 12:29 PM, Pedro Alves <palves@redhat.com> wrote:
> On 12/02/2013 11:14 PM, Doug Evans wrote:
>> +      if (*length > 0)
>> +     fetchlimit = UINT_MAX;
>
> Shouldn't this be:
>
>       if (*length > 0)
>         fetchlimit = *length;
>
> ?  That is, if the caller specified a limit, why do we do over it?

read_string will take min (len, fetchlimit), and I saw no value in
passing fetchlimit = *length.

> Couldn't this new check be merge above where we compute
> fetchlimit to begin with?  With the comment there adjusted to
> something like:
>
> +      /* If have an explicit requested length, use that as fetchlimit.
> +        Otherwise, if we know the size of the array, we can use it as
> +        a limit on the number of characters to be fetched.  */

I want to keep fetchlimit and *length separate at this point.
They have different meanings/purposes.

> BTW, it looks like the not_lval/lval_internalvar path can
> blindly read beyond the value's contents buffer, if *length
> is bigger than the value's contents buffer size:
>
>   /* If the string lives in GDB's memory instead of the inferior's,
>      then we just need to copy it to BUFFER.  Also, since such strings
>      are arrays with known size, FETCHLIMIT will hold the size of the
>      array.  */
>   if ((VALUE_LVAL (value) == not_lval
>        || VALUE_LVAL (value) == lval_internalvar)
>       && fetchlimit != UINT_MAX)
>     {
>       int i;
>       const gdb_byte *contents = value_contents (value);
>
>       /* If a length is specified, use that.  */
>       if (*length >= 0)
>         i  = *length;
>         ^^^^^^^^^^^^^
>       else
>         /* Otherwise, look for a null character.  */
>         for (i = 0; i < fetchlimit; i++)
>           if (extract_unsigned_integer (contents + i * width,
>                                         width, byte_order) == 0)
>             break;
>
>       /* I is now either a user-defined length, the number of non-null
>          characters, or FETCHLIMIT.  */
>       *length = i * width;
>       *buffer = xmalloc (*length);
>       memcpy (*buffer, contents, *length);
>       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

It didn't look right to me either, but I was leaving digging deeper
for another pass.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] PR 16286: Reading python value as string beyond declared size
  2013-12-03 23:01   ` Doug Evans
@ 2013-12-04 11:47     ` Pedro Alves
  2013-12-11  0:25       ` Doug Evans
  0 siblings, 1 reply; 5+ messages in thread
From: Pedro Alves @ 2013-12-04 11:47 UTC (permalink / raw)
  To: Doug Evans; +Cc: gdb-patches, Joel Brobecker, Sterling Augustine

On 12/03/2013 11:01 PM, Doug Evans wrote:
> On Tue, Dec 3, 2013 at 12:29 PM, Pedro Alves <palves@redhat.com> wrote:
>> On 12/02/2013 11:14 PM, Doug Evans wrote:
>>> +      if (*length > 0)
>>> +     fetchlimit = UINT_MAX;
>>
>> Shouldn't this be:
>>
>>       if (*length > 0)
>>         fetchlimit = *length;
>>
>> ?  That is, if the caller specified a limit, why do we do over it?
> 
> read_string will take min (len, fetchlimit), and I saw no value in
> passing fetchlimit = *length.

Ah, I see now.  Thanks.

>> BTW, it looks like the not_lval/lval_internalvar path can
>> blindly read beyond the value's contents buffer, if *length
>> is bigger than the value's contents buffer size:
> It didn't look right to me either, but I was leaving digging deeper
> for another pass.

OK.  TBC, I wasn't requesting that'd be fixed in this patch, only
for confirmation that I wasn't missing something.

-- 
Pedro Alves

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] PR 16286: Reading python value as string beyond declared size
  2013-12-04 11:47     ` Pedro Alves
@ 2013-12-11  0:25       ` Doug Evans
  0 siblings, 0 replies; 5+ messages in thread
From: Doug Evans @ 2013-12-11  0:25 UTC (permalink / raw)
  To: Pedro Alves; +Cc: gdb-patches, Joel Brobecker, Sterling Augustine

Pedro Alves writes:
 > On 12/03/2013 11:01 PM, Doug Evans wrote:
 > > On Tue, Dec 3, 2013 at 12:29 PM, Pedro Alves <palves@redhat.com> wrote:
 > >> On 12/02/2013 11:14 PM, Doug Evans wrote:
 > >>> +      if (*length > 0)
 > >>> +     fetchlimit = UINT_MAX;
 > >>
 > >> Shouldn't this be:
 > >>
 > >>       if (*length > 0)
 > >>         fetchlimit = *length;
 > >>
 > >> ?  That is, if the caller specified a limit, why do we do over it?
 > > 
 > > read_string will take min (len, fetchlimit), and I saw no value in
 > > passing fetchlimit = *length.
 > 
 > Ah, I see now.  Thanks.

Thanks.  Committed.

 > >> BTW, it looks like the not_lval/lval_internalvar path can
 > >> blindly read beyond the value's contents buffer, if *length
 > >> is bigger than the value's contents buffer size:
 > > It didn't look right to me either, but I was leaving digging deeper
 > > for another pass.
 > 
 > OK.  TBC, I wasn't requesting that'd be fixed in this patch, only
 > for confirmation that I wasn't missing something.

I filed https://sourceware.org/bugzilla/show_bug.cgi?id=16313

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2013-12-11  0:25 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-12-02 23:15 [PATCH] PR 16286: Reading python value as string beyond declared size Doug Evans
2013-12-03 20:29 ` Pedro Alves
2013-12-03 23:01   ` Doug Evans
2013-12-04 11:47     ` Pedro Alves
2013-12-11  0:25       ` Doug Evans

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).