public inbox for gdb-prs@sourceware.org
help / color / mirror / Atom feed
* [Bug python/21699] printf "%s", $_as_string(...)   can include characters from the child
       [not found] <bug-21699-4717@http.sourceware.org/bugzilla/>
@ 2023-06-05 12:25 ` cvs-commit at gcc dot gnu.org
  2023-06-05 12:27 ` aburgess at redhat dot com
  1 sibling, 0 replies; 2+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-06-05 12:25 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=21699

--- Comment #3 from cvs-commit at gcc dot gnu.org <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Andrew Burgess <aburgess@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=baab375361c365afee2577c94cbbd3fdd443d6da

commit baab375361c365afee2577c94cbbd3fdd443d6da
Author: Andrew Burgess <aburgess@redhat.com>
Date:   Tue Jul 13 14:44:27 2021 -0400

    gdb: building inferior strings from within GDB

    History Of This Patch
    =====================

    This commit aims to address PR gdb/21699.  There have now been a
    couple of attempts to fix this issue.  Simon originally posted two
    patches back in 2021:

      https://sourceware.org/pipermail/gdb-patches/2021-July/180894.html
      https://sourceware.org/pipermail/gdb-patches/2021-July/180896.html

    Before Pedro then posted a version of his own:

      https://sourceware.org/pipermail/gdb-patches/2021-July/180970.html

    After this the conversation halted.  Then in 2023 I (Andrew) also took
    a look at this bug and posted two versions:

      https://sourceware.org/pipermail/gdb-patches/2023-April/198570.html
      https://sourceware.org/pipermail/gdb-patches/2023-April/198680.html

    The approach taken in my first patch was pretty similar to what Simon
    originally posted back in 2021.  My second attempt was only a slight
    variation on the first.

    Pedro then pointed out his older patch, and so we arrive at this
    patch.  The GDB changes here are mostly Pedro's work, but updated by
    me (Andrew), any mistakes are mine.

    The tests here are a combinations of everyone's work, and the commit
    message is new, but copies bits from everyone's earlier work.

    Problem Description
    ===================

    Bug PR gdb/21699 makes the observation that using $_as_string with
    GDB's printf can cause GDB to print unexpected data from the
    inferior.  The reproducer is pretty simple:

      #include <stddef.h>
      static char arena[100];

      /* Override malloc() so value_coerce_to_target() gets a known
         pointer, and we know we"ll see an error if $_as_string() gives
         a string that isn't null terminated. */
      void
      *malloc (size_t size)
      {
          memset (arena, 'x', sizeof (arena));
          if (size > sizeof (arena))
              return NULL;
          return arena;
      }

      int
      main ()
      {
        return 0;
      }

    And then in a GDB session:

      $ gdb -q test
      Reading symbols from /tmp/test...
      (gdb) start
      Temporary breakpoint 1 at 0x4004c8: file test.c, line 17.
      Starting program: /tmp/test

      Temporary breakpoint 1, main () at test.c:17
      17        return 0;
      (gdb) printf "%s\n", $_as_string("hello")
      "hello"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
      (gdb) quit

    The problem above is caused by how value_cstring is used within
    py-value.c, but once we understand the issue then it turns out that
    value_cstring is used in an unexpected way in many places within GDB.

    Within py-value.c we have a null-terminated C-style string.  We then
    pass a pointer to this string, along with the length of this
    string (so not including the null-character) to value_cstring.

    In value_cstring GDB allocates an array value of the given character
    type, and copies in requested number of characters.  However
    value_cstring does not add a null-character of its own.  This means
    that the value created by calling value_cstring is only
    null-terminated if the null-character is included in the passed in
    length.  In py-value.c this is not the case, and indeed, in most uses
    of value_cstring, this is not the case.

    When GDB tries to print one of these strings the value contents are
    pushed to the inferior, and then read back as a C-style string, that
    is, GDB reads inferior memory until it finds a null-terminator.  For
    the py-value.c case, no null-terminator is pushed into the inferior,
    so GDB will continue reading inferior memory until a null-terminator
    is found, with unpredictable results.

    Patch Description
    =================

    The first thing this patch does is better define what the arguments
    for the two function value_cstring and value_string should represent.
    The comments in the header file are updated to describe whether the
    length argument should, or should not, include a null-character.
    Also, the data argument is changed to type gdb_byte.  The functions as
    they currently exist will handle wide-characters, in which case more
    than one 'char' would be needed for each character.  As such using
    gdb_byte seems to make more sense.

    To avoid adding casts throughout GDB, I've also added an overload that
    still takes a 'char *', but asserts that the character type being used
    is of size '1'.

    The value_cstring function is now responsible for adding a null
    character at the end of the string value it creates.

    However, once we start looking at how value_cstring is used, we
    realise there's another, related, problem.  Not every language's
    strings are null terminated.  Fortran and Ada strings, for example,
    are just an array of characters, GDB already has the function
    value_string which can be used to create such values.

    Consider this example using current GDB:

      (gdb) set language ada
      (gdb) p $_gdb_setting("arch")
      $1 = (97, 117, 116, 111)
      (gdb) ptype $
      type = array (1 .. 4) of char
      (gdb) p $_gdb_maint_setting("test-settings string")
      $2 = (0)
      (gdb) ptype $
      type = array (1 .. 1) of char

    This shows two problems, first, the $_gdb_setting and
    $_gdb_maint_setting functions are calling value_cstring using the
    builtin_char character, rather than a language appropriate type.  In
    the first call, the 'arch' case, the value_cstring call doesn't
    include the null character, so the returned array only contains the
    expected characters.  But, in the $_gdb_maint_setting example we do
    end up including the null-character, even though this is not expected
    for Ada strings.

    This commit adds a new language method language_defn::value_string,
    this function takes a pointer and length and creates a language
    appropriate value that represents the string.  For C, C++, etc this
    will be a null-terminated string (by calling value_cstring), and for
    Fortran and Ada this can be a bounded array of characters with no null
    terminator.  Additionally, this new language_defn::value_string
    function is responsible for selecting a language appropriate character
    type.

    After this commit the only calls to value_cstring are from the C
    expression evaluator and from the default language_defn::value_string.

    And the only calls to value_string are from Fortan, Ada, and ObjectC
    related code.

    Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=21699

    Co-Authored-By: Simon Marchi <simon.marchi@efficios.com>
    Co-Authored-By: Andrew Burgess <aburgess@redhat.com>
    Co-Authored-By: Pedro Alves <pedro@palves.net>
    Approved-By: Simon Marchi <simon.marchi@efficios.com>

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 2+ messages in thread

* [Bug python/21699] printf "%s", $_as_string(...)   can include characters from the child
       [not found] <bug-21699-4717@http.sourceware.org/bugzilla/>
  2023-06-05 12:25 ` [Bug python/21699] printf "%s", $_as_string(...) can include characters from the child cvs-commit at gcc dot gnu.org
@ 2023-06-05 12:27 ` aburgess at redhat dot com
  1 sibling, 0 replies; 2+ messages in thread
From: aburgess at redhat dot com @ 2023-06-05 12:27 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=21699

Andrew Burgess <aburgess at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|UNCONFIRMED                 |RESOLVED
                 CC|                            |aburgess at redhat dot com

--- Comment #4 from Andrew Burgess <aburgess at redhat dot com> ---
This issue should now be resolved.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2023-06-05 12:27 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <bug-21699-4717@http.sourceware.org/bugzilla/>
2023-06-05 12:25 ` [Bug python/21699] printf "%s", $_as_string(...) can include characters from the child cvs-commit at gcc dot gnu.org
2023-06-05 12:27 ` aburgess at redhat dot com

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).