public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
From: Simon Marchi <simon.marchi@efficios.com>
To: gdb-patches@sourceware.org
Subject: [PATCH] gdb: consider null terminator for length arguments of value_cstring calls
Date: Tue, 13 Jul 2021 11:31:42 -0400	[thread overview]
Message-ID: <20210713153142.3957666-1-simon.marchi@efficios.com> (raw)

From: Simon Marchi <simon.marchi@polymtl.ca>

This patch started when I looked at this bit in cli/cli-cmds.c:

    if (*(char **) cmd->var)
      return value_cstring (*(char **) cmd->var, strlen (*(char **) cmd->var),
			    builtin_type (gdbarch)->builtin_char);
    else
      return value_cstring ("", 1,
			    builtin_type (gdbarch)->builtin_char);

I found it odd that the first value_cstring call passes a length that
does not consider the null terminator of the C string, but second does.
I want to clarify the documentation of value_cstring and fix the one
that is wrong.

Debugging a little bit, I found that the first call is the wrong one.
Doing:

  (gdb) set args foo
  (gdb) print $_gdb_setting("args")
  $1 = "foo"

creates a struct value of code TYPE_CODE_ARRAY of size 3, which doesn't
have a null terminator.  That does not create a valid C string.  It is
however printed correctly by GDB, because the printing code makes sure
not to read past the value's length.

A way to expose the bug, as implemented in the new test, is to make an
inferior function call with this value, using a function that reads the
string until the null terminator.  That allocates some memory in the
inferior for the string and copies it.  If the null terminator is
missing, the function will read past the allocated buffer, possibly
causing a crash.  In practice, the memory following the buffer must
pretty often be zeroed-out, because I didn't see a crash just with this.
But if we compile the inferior program with AddressSanitizer, then the
out-of-bounds access causes a crash systematically.

It would be nice to find a test that does not require compiling the
inferior program with AddressSanitizer, if somebody has a better idea,
but at least I think it's better than no test.

I found a few calls to value_cstring that were wrong, I fixed them
all, all exercised by the test.

The change in guile/scm-math.c maybe requires a bit of explanation.
According to the doc of gdbscm_scm_to_string, if don't pass a length
pointer, we get back a null-terminated string.  If we pass a length
pointer, we get a non-null-terminated string, but we get the length
separately.  Trying to pass "len + 1" to value_cstring would lead to GDB
reading past the allocated buffer, that is exactly of length `len`.  So
instead, don't pass a length pointer and use strlen on the result.

gdb.base/settings.exp and gdb.python/py-mi.exp required changes in some
expected outputs, because the array type created by $_gdb_setting_str
is now one element larger, to account for the null terminator.  I don't
think this change in user-visible behavior is a problem.

Change-Id: If8dd13cce55c70521e34e7c360139064b4e87496
---
 gdb/cli/cli-cmds.c                            |   8 +-
 gdb/guile/scm-math.c                          |   6 +-
 gdb/python/py-value.c                         |   2 +-
 gdb/testsuite/gdb.base/infcall-gdb-setting.c  |  42 ++++
 .../gdb.base/infcall-gdb-setting.exp          | 204 ++++++++++++++++++
 gdb/testsuite/gdb.base/settings.exp           |   2 +-
 gdb/testsuite/gdb.python/py-mi.exp            |   2 +-
 gdb/valops.c                                  |  11 +-
 gdb/value.c                                   |   2 +-
 gdb/value.h                                   |  14 ++
 10 files changed, 274 insertions(+), 19 deletions(-)
 create mode 100644 gdb/testsuite/gdb.base/infcall-gdb-setting.c
 create mode 100644 gdb/testsuite/gdb.base/infcall-gdb-setting.exp

diff --git a/gdb/cli/cli-cmds.c b/gdb/cli/cli-cmds.c
index 56ae12a0c19..f6fc5ab854f 100644
--- a/gdb/cli/cli-cmds.c
+++ b/gdb/cli/cli-cmds.c
@@ -2146,7 +2146,8 @@ value_from_setting (const cmd_list_element *cmd, struct gdbarch *gdbarch)
     case var_filename:
     case var_enum:
       if (*(char **) cmd->var)
-	return value_cstring (*(char **) cmd->var, strlen (*(char **) cmd->var),
+	return value_cstring (*(char **) cmd->var,
+			      strlen (*(char **) cmd->var) + 1,
 			      builtin_type (gdbarch)->builtin_char);
       else
 	return value_cstring ("", 1,
@@ -2198,7 +2199,7 @@ str_value_from_setting (const cmd_list_element *cmd, struct gdbarch *gdbarch)
       {
 	std::string cmd_val = get_setshow_command_value_string (cmd);
 
-	return value_cstring (cmd_val.c_str (), cmd_val.size (),
+	return value_cstring (cmd_val.c_str (), cmd_val.size () + 1,
 			      builtin_type (gdbarch)->builtin_char);
       }
 
@@ -2212,7 +2213,8 @@ str_value_from_setting (const cmd_list_element *cmd, struct gdbarch *gdbarch)
 	 escaping quotes.  So, we directly use the cmd->var string value,
 	 similarly to the value_from_setting code for these cases.  */
       if (*(char **) cmd->var)
-	return value_cstring (*(char **) cmd->var, strlen (*(char **) cmd->var),
+	return value_cstring (*(char **) cmd->var,
+			      strlen (*(char **) cmd->var) + 1,
 			      builtin_type (gdbarch)->builtin_char);
       else
 	return value_cstring ("", 1,
diff --git a/gdb/guile/scm-math.c b/gdb/guile/scm-math.c
index d9fd6718196..8b7fe9b69ec 100644
--- a/gdb/guile/scm-math.c
+++ b/gdb/guile/scm-math.c
@@ -776,8 +776,6 @@ vlscm_convert_typed_value_from_scheme (const char *func_name,
 	}
       else if (scm_is_string (obj))
 	{
-	  size_t len;
-
 	  if (type != NULL)
 	    {
 	      except_scm = gdbscm_make_misc_error (func_name, type_arg_pos,
@@ -789,12 +787,12 @@ vlscm_convert_typed_value_from_scheme (const char *func_name,
 	    {
 	      /* TODO: Provide option to specify conversion strategy.  */
 	      gdb::unique_xmalloc_ptr<char> s
-		= gdbscm_scm_to_string (obj, &len,
+		= gdbscm_scm_to_string (obj, nullptr,
 					target_charset (gdbarch),
 					0 /*non-strict*/,
 					&except_scm);
 	      if (s != NULL)
-		value = value_cstring (s.get (), len,
+		value = value_cstring (s.get (), strlen (s.get ()) + 1,
 				       language_string_char_type (language,
 								  gdbarch));
 	      else
diff --git a/gdb/python/py-value.c b/gdb/python/py-value.c
index 8df8a15f8d6..bc0f88b7a16 100644
--- a/gdb/python/py-value.c
+++ b/gdb/python/py-value.c
@@ -1907,7 +1907,7 @@ convert_value_from_python (PyObject *obj)
 	  gdb::unique_xmalloc_ptr<char> s
 	    = python_string_to_target_string (obj);
 	  if (s != NULL)
-	    value = value_cstring (s.get (), strlen (s.get ()),
+	    value = value_cstring (s.get (), strlen (s.get ()) + 1,
 				   builtin_type_pychar);
 	}
       else if (PyObject_TypeCheck (obj, &value_object_type))
diff --git a/gdb/testsuite/gdb.base/infcall-gdb-setting.c b/gdb/testsuite/gdb.base/infcall-gdb-setting.c
new file mode 100644
index 00000000000..f8f00063783
--- /dev/null
+++ b/gdb/testsuite/gdb.base/infcall-gdb-setting.c
@@ -0,0 +1,42 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2021 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+static char *buf;
+
+static char *
+print_str (const char *s)
+{
+  free (buf);
+  buf = strdup (s);
+  return buf;
+}
+
+static void
+end (void)
+{}
+
+int
+main (void)
+{
+  print_str ("unused");
+  end ();
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.base/infcall-gdb-setting.exp b/gdb/testsuite/gdb.base/infcall-gdb-setting.exp
new file mode 100644
index 00000000000..fd016c645a3
--- /dev/null
+++ b/gdb/testsuite/gdb.base/infcall-gdb-setting.exp
@@ -0,0 +1,204 @@
+# Copyright 2021 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Test that string values are correctly allocated inside GDB when doing
+# various operations that yield a string object.
+#
+# The issue that lead to this test was missing NULL terminator in the C-string
+# values.  The way to expose this is:
+#
+#  - Compile a test program with AddressSanitizer, with a function that takes a
+#    string as a parameter and does something with it that requires reading
+#    until the NULL terminator (e.g. strdups it).
+#  - Call this function by hand, passing the string value to test
+#
+# This requires GDB to copy the string value to the inferior, allocating
+# as many bytes of new memory as the struct value requires.  If the NULL
+# terminator is missing, the function will read past the allocated buffer,
+# triggering an ASan failure, failing the test.
+#
+# The "infcall" portion is really just there to test and expose bugs in the
+# value-creation code.
+
+load_lib "trace-support.exp"
+load_lib "gdb-guile.exp"
+
+standard_testfile
+
+if {[build_executable "failed to prepare" \
+     $testfile $srcfile \
+     {debug additional_flags=-fsanitize=address ldflags=-fsanitize=address}]} {
+    return
+}
+
+# Test with string values made by $_gdb_setting & co.
+
+proc_with_prefix test_setting { } {
+    clean_restart $::binfile
+
+    if {![runto_main]} {
+	fail "could not run to main"
+	return
+    }
+
+    # This is an internal GDB implementation detail, but the variable backing a
+    # string setting starts as nullptr (unless explicitly initialized at startup).
+    # When assigning an empty value, the variable then points to an empty string.
+    # Test both cases, as it triggers different code paths (in addition to a
+    # non-empty value).
+    #
+    # Use "set trace-user" and "maintenance set test-settings string" as they are
+    # both not initialized at startup.
+    gdb_test "print print_str(\$_gdb_setting(\"trace-user\"))" \
+	" = $::hex \"\"" \
+	"print \$_gdb_setting nullptr"
+    gdb_test "print print_str(\$_gdb_setting_str(\"trace-user\"))" \
+	" = $::hex \"\"" \
+	"print \$_gdb_setting_str nullptr"
+
+    gdb_test "set trace-user"
+    gdb_test "print print_str(\$_gdb_setting(\"trace-user\"))" \
+	" = $::hex \"\"" \
+	"print \$_gdb_setting empty"
+    gdb_test "print print_str(\$_gdb_setting_str(\"trace-user\"))" \
+	" = $::hex \"\"" \
+	"print \$_gdb_setting_str empty"
+
+    gdb_test "set trace-user poulet"
+    gdb_test "print print_str(\$_gdb_setting(\"trace-user\"))" \
+	" = $::hex \"poulet\"" \
+	"print \$_gdb_setting non-empty"
+    gdb_test "print print_str(\$_gdb_setting_str(\"trace-user\"))" \
+	" = $::hex \"poulet\"" \
+	"print \$_gdb_setting_str non-empty"
+
+    gdb_test "print print_str(\$_gdb_maint_setting(\"test-settings string\"))" \
+	" = $::hex \"\"" \
+	"print \$_gdb_maint_setting nullptr"
+    gdb_test "print print_str(\$_gdb_maint_setting_str(\"test-settings string\"))" \
+	" = $::hex \"\"" \
+	"print \$_gdb_maint_setting_str nullptr"
+
+    gdb_test "maintenance set test-settings string"
+    gdb_test "print print_str(\$_gdb_maint_setting(\"test-settings string\"))" \
+	" = $::hex \"\"" \
+	"print \$_gdb_maint_setting empty"
+    gdb_test "print print_str(\$_gdb_maint_setting_str(\"test-settings string\"))" \
+	" = $::hex \"\"" \
+	"print \$_gdb_maint_setting_str empty"
+
+    gdb_test "maintenance set test-settings string perchaude"
+    gdb_test "print print_str(\$_gdb_maint_setting(\"test-settings string\"))" \
+	" = $::hex \"perchaude\"" \
+	"print \$_gdb_maint_setting non-empty"
+    gdb_test "print print_str(\$_gdb_maint_setting_str(\"test-settings string\"))" \
+	" = $::hex \"perchaude\"" \
+	"print \$_gdb_maint_setting_str non-empty"
+
+    # Test with a non-string setting, this tests yet another code path.
+    gdb_test "print print_str(\$_gdb_setting_str(\"remotetimeout\"))" \
+	" = $::hex \"2\"" \
+	"print \$_gdb_setting integer"
+}
+
+# Test with a string value created by gdb.Value in Python.
+
+proc_with_prefix test_python_value { } {
+    clean_restart $::binfile
+
+    if {![runto_main]} {
+	fail "could not run to main"
+	return
+    }
+
+    if {[skip_python_tests]} {
+	untested "skipping test_python_value"
+	return
+    }
+
+    gdb_test_no_output "python gdb.set_convenience_variable(\"foo\", \"bar\")" \
+	"set convenience var"
+    gdb_test "print print_str(\$foo)" \
+	" = $::hex \"bar\"" \
+	"print from Python value"
+}
+
+# Test with a string value created by make-value in Guile.
+
+proc_with_prefix test_guile_value { } {
+    clean_restart $::binfile
+
+    if {![runto_main]} {
+	fail "could not run to main"
+	return
+    }
+
+    if {[skip_guile_tests]} {
+	untested "skipping test_guile_value"
+	return
+    }
+
+    gdb_test_no_output "guile (use-modules (gdb))"
+    gdb_test_multiple "guile (history-append! (make-value \"foo\"))" "make value" {
+	-re -wrap "($::decimal)" {
+	    set histnum $expect_out(1,string)
+	}
+    }
+
+    gdb_test "print print_str(\$$histnum)" \
+        " = $::hex \"foo\"" \
+        "print from Guile value"
+}
+
+# Test with a string value coming from a string internal var.  The only internal
+# vars of this type, at the time of writing, are $trace_func and $trace_file.
+# They both require inspecting a trace frame.  So if the target is capable start
+# tracing, record one trace frame, and use $trace_func.
+
+proc_with_prefix test_internal_var { } {
+    if {![gdb_trace_common_supports_arch]} {
+	unsupported "arch does not support trace"
+	return
+    }
+
+    clean_restart $::binfile
+
+    if {![runto_main]} {
+	fail "could not run to main"
+	return
+    }
+
+    if {![gdb_target_supports_trace]} {
+	unsupported "target does not support trace"
+	return
+    }
+
+    gdb_test "break end" "Breakpoint $::decimal at $::hex.*"
+    gdb_test "trace print_str" "Tracepoint $::decimal at $::hex.*"
+    gdb_test_no_output "tstart"
+    gdb_test "continue" "Breakpoint $::decimal, end.*"
+    gdb_test_no_output "tstop"
+    gdb_test "tfind" "Found trace frame 0, tracepoint $::decimal.*"
+    gdb_test_no_output "set \$convenience = \$trace_func"
+    gdb_test "tfind none" "No longer looking at any trace frame.*"
+    gdb_test "print print_str(\$convenience)" \
+	" = $::hex \"print_str\"" \
+	"print internal var string"
+}
+
+test_setting
+test_python_value
+test_guile_value
+test_internal_var
diff --git a/gdb/testsuite/gdb.base/settings.exp b/gdb/testsuite/gdb.base/settings.exp
index 64257aaae5a..0a594685d6b 100644
--- a/gdb/testsuite/gdb.base/settings.exp
+++ b/gdb/testsuite/gdb.base/settings.exp
@@ -498,7 +498,7 @@ proc_with_prefix test-enum {} {
     gdb_test_no_output "$set_cmd zzz"
     show_setting "$show_cmd" "zzz" 0 "yyy"
 
-    check_type "test-settings enum" "type = char \\\[3\\\]"
+    check_type "test-settings enum" "type = char \\\[4\\\]"
 
     test_gdb_complete_multiple "$set_cmd " "" "" {
 	"xxx"
diff --git a/gdb/testsuite/gdb.python/py-mi.exp b/gdb/testsuite/gdb.python/py-mi.exp
index 6f0c1c65e62..b81ee9acd74 100644
--- a/gdb/testsuite/gdb.python/py-mi.exp
+++ b/gdb/testsuite/gdb.python/py-mi.exp
@@ -289,7 +289,7 @@ mi_create_dynamic_varobj nstype2 nstype2 1 \
   "create nstype2 varobj"
 
 mi_list_varobj_children nstype2 {
-    { {nstype2.<error at 0>} {<error at 0>} 6 {char \[6\]} }
+    { {nstype2.<error at 0>} {<error at 0>} 7 {char \[7\]} }
 } "list children after setting exception flag"
 
 mi_create_varobj me me \
diff --git a/gdb/valops.c b/gdb/valops.c
index bd547923496..f67612e87b0 100644
--- a/gdb/valops.c
+++ b/gdb/valops.c
@@ -1747,6 +1747,8 @@ value_array (int lowbound, int highbound, struct value **elemvec)
   return val;
 }
 
+/* See value.h.  */
+
 struct value *
 value_cstring (const char *ptr, ssize_t len, struct type *char_type)
 {
@@ -1761,14 +1763,7 @@ value_cstring (const char *ptr, ssize_t len, struct type *char_type)
   return val;
 }
 
-/* Create a value for a string constant by allocating space in the
-   inferior, copying the data into that space, and returning the
-   address with type TYPE_CODE_STRING.  PTR points to the string
-   constant data; LEN is number of characters.
-
-   Note that string types are like array of char types with a lower
-   bound of zero and an upper bound of LEN - 1.  Also note that the
-   string may contain embedded null bytes.  */
+/* See value.h.  */
 
 struct value *
 value_string (const char *ptr, ssize_t len, struct type *char_type)
diff --git a/gdb/value.c b/gdb/value.c
index 6a07495d32b..db34d2e5762 100644
--- a/gdb/value.c
+++ b/gdb/value.c
@@ -2188,7 +2188,7 @@ value_of_internalvar (struct gdbarch *gdbarch, struct internalvar *var)
       break;
 
     case INTERNALVAR_STRING:
-      val = value_cstring (var->u.string, strlen (var->u.string),
+      val = value_cstring (var->u.string, strlen (var->u.string) + 1,
 			   builtin_type (gdbarch)->builtin_char);
       break;
 
diff --git a/gdb/value.h b/gdb/value.h
index 379cddafbe7..24392437ed0 100644
--- a/gdb/value.h
+++ b/gdb/value.h
@@ -782,8 +782,22 @@ class scoped_value_mark
   const struct value *m_value;
 };
 
+/* Create a not_lval value with type TYPE_CODE_ARRAY.
+
+   PTR points to the string data; LEN is the length of the string including the
+   NULL terminator.  */
+
 extern struct value *value_cstring (const char *ptr, ssize_t len,
 				    struct type *char_type);
+
+/* Create a not_lval value with type TYPE_CODE_STRING.
+
+   PTR points to the string data; LEN is number of characters.
+
+   Note that string types are like array of char types with a lower
+   bound of zero and an upper bound of LEN - 1.  Also note that the
+   string may contain embedded null bytes.  */
+
 extern struct value *value_string (const char *ptr, ssize_t len,
 				   struct type *char_type);
 
-- 
2.32.0


             reply	other threads:[~2021-07-13 15:31 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-13 15:31 Simon Marchi [this message]
2021-07-13 17:07 ` Pedro Alves
2021-07-13 18:44   ` Simon Marchi
2021-07-14 10:25     ` Pedro Alves
2021-07-14 13:50       ` Simon Marchi
2021-07-15 14:29         ` Pedro Alves
2021-07-23 19:37           ` Simon Marchi
2021-07-21 15:21       ` Philippe Waroquiers

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210713153142.3957666-1-simon.marchi@efficios.com \
    --to=simon.marchi@efficios.com \
    --cc=gdb-patches@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).