From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id 88FC83858D39 for ; Mon, 3 Apr 2023 21:49:06 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 88FC83858D39 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680558546; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=L8Y4IAvh0f/alpBK33TMNQRpP7Irk3MH2kA2ZvGiFrA=; b=Sg/r4i3UMwYiI5/axkUn6ldlAh+UWgvXC35yRpyME8MaG2r+GSdAhWXpd3BbrSUazacYJv a68umrxuSimkXuzSm5+vk00b4Oc1JIxe3pM7cP7g1fp0clhNnR0vlKLKT8kxDYgQXZbcHw 3JQ19kQZjBUUFFYur8IlOVAmPUTERAY= Received: from mail-qt1-f198.google.com (mail-qt1-f198.google.com [209.85.160.198]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-631-HiZXY9YGMd6_EYoDn9Ezrg-1; Mon, 03 Apr 2023 17:49:05 -0400 X-MC-Unique: HiZXY9YGMd6_EYoDn9Ezrg-1 Received: by mail-qt1-f198.google.com with SMTP id h6-20020ac85846000000b003e3c23d562aso20739763qth.1 for ; Mon, 03 Apr 2023 14:49:05 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1680558544; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=L8Y4IAvh0f/alpBK33TMNQRpP7Irk3MH2kA2ZvGiFrA=; b=UEUe/j7u1zSQKWyliy5iICA+UBjVk5xHPr1gD7mTPIR4yvGrOpt8QoKVFhQoamRQ6k H+YjMpWPSS6Gcxjc5S0D/DvYV2T7FRyBUXJmXltZyJ1H3Vd3/OGerX1VnhCcvKKVv/yE vy9nOKMesRKCHn4kU10qjHCXZeyedyPBgjtPE8Wh5Ss6PbBCLNapxvfbUFESwQOW/9MA iiLT0e5I+8GHfnn54AIkHjCPUdELNpC+0HCvGSkIpo2l7LNYn2sPEU3FXLVUc4ahd1/v tjpdd4Rl1ozK16Rmkl0cGCvltwzwvZP6TfB2ZXROY0SHGdAgeCIXthKsJqdIf+gX/cms R3DA== X-Gm-Message-State: AO0yUKX8f7Sg1/FAWj27UoID4mU1Kt5Ip+oIKZ7FWxbrwZbWMMlasGGc rcHy15EQXqggxlqaO5UyUsHTcjpZTotxTw66eLpPqVN1e3qEaipdatps0F3GEnGPrmsFokU+vFy cwJKwzBp8NQgwJYfGUmAXCMW+CrGa/Z6FthULjCfMvzKfRLvTA7QLPH2ljeFjZlF1rgQ/YrZqPN UGL7mm+g== X-Received: by 2002:a05:622a:11cc:b0:3da:7226:7539 with SMTP id n12-20020a05622a11cc00b003da72267539mr65010369qtk.66.1680558544201; Mon, 03 Apr 2023 14:49:04 -0700 (PDT) X-Google-Smtp-Source: AK7set9zf2qR485tbnkX+IKTr9bqr4985Sue6mREhnhBv2gAXdlLIvhI/Hly/kiod/+PUu6A+LBF6g== X-Received: by 2002:a05:622a:11cc:b0:3da:7226:7539 with SMTP id n12-20020a05622a11cc00b003da72267539mr65010328qtk.66.1680558543515; Mon, 03 Apr 2023 14:49:03 -0700 (PDT) Received: from localhost (95.72.115.87.dyn.plus.net. [87.115.72.95]) by smtp.gmail.com with ESMTPSA id s17-20020a05620a255100b0074868aed8easm3093639qko.71.2023.04.03.14.49.02 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 03 Apr 2023 14:49:03 -0700 (PDT) From: Andrew Burgess To: gdb-patches@sourceware.org Cc: Andrew Burgess Subject: [PATCH] gdb: fix missing null-character when using value_cstring Date: Mon, 3 Apr 2023 22:49:00 +0100 Message-Id: X-Mailer: git-send-email 2.25.4 MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: 8bit Content-Type: text/plain; charset="US-ASCII"; x-default=true X-Spam-Status: No, score=-11.7 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,KAM_SHORT,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: In PR gdb/21699 an issue was reported with the $_as_string convenience function. It was observed that the string returned by this function, when pushed into the inferior, was not null terminated. This was causing problems when using the result with GDB's printf command, as this command relies on the string having been pushed into the inferior and being null terminated. The bug includes a simple reproducer: #include static char arena[51] = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"; /* Override malloc() so value_coerce_to_target() gets a known pointer, and we know we"ll see an error if $_as_string() gives a string that isn't NULL terminated. */ void *malloc (size_t size) { if (size > sizeof (arena)) return NULL; return arena; } int main () { return 0; } Then use this in a GDB session like this: $ gdb -q test Reading symbols from /tmp/test... (gdb) start Temporary breakpoint 1 at 0x4004c8: file test.c, line 17. Starting program: /tmp/test Temporary breakpoint 1, main () at test.c:17 17 return 0; (gdb) printf "%s\n", $_as_string("hello") "hello"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx (gdb) quit The problem is with the GDB function value_cstring, or at least, how this function is being used. When $_as_string is called we enter fnpy_call (python/py-function.c), this calls into Python code. The Python code returns a result, which will be a Python string, and then we call convert_value_from_python to convert the Python string to a GDB value. In convert_value_from_python (python/py-value.c) we enter the gdbpy_is_string case (as the result is a string), then we call python_string_to_target_string, which returns a null terminated C string. Next we then make this call: value = value_cstring (s.get (), strlen (s.get ()), builtin_type_pychar); This passes the pointer to the first character 's.get ()' and the length of the string 'strlen (s.get ())', however, this length does not include the trailing null character. If we look at value_cstring (valops.c) we see that an array is created using the passed in length, and characters are copied into the newly allocated array value. This means we do not copy the strings trailing null character, nor does value_cstring add a trailing null. Finally, when we use this array value with printf, GDB pushed the array to the inferior, which mallocs some space based on the array length, and then copies the array content to the inferior. As the array doesn't include a trailing null, non is written into the inferior. So what we place into the inferior is not a C string, but is actually an array of characters. When printf tries to print the value it starts at the address of the first character and continues until it reaches a null. When that will be is undefined, so we may end up printing random garbage. Now, ignore for a moment that the whole push an array to the inferior just so we can fetch it in order to print it is clearly crazy. That's a problem for another day I think. The important question here is: should value_cstring include a null character or not. Given the function name include 'cstring', which I'm assuming means C style string, I think that we should be including a trailing null. Given that, I see two possibilities, either value_cstring can always add a trailing null, or value_cstring can assert that there is a trailing null, and the caller is responsible for making sure that the passed in length includes the null character. Given we're always passing from a C style string to begin with the question is really, should the length being passed to value_cstring include the null, or not include the null? The only place where we currently include the null in the passed length is from c_string_operation::evaluate. Every other use of value_cstring passes the length excluding the null. I was tempted to adjust c_string_operation::evaluate to exclude the null, and then have value_cstring add a trailing null. However, this does mean that if, in the future, a use is introduced that incorrectly includes the trailing null in the passed length, then we are unlikely to spot immediately - we'd instead create an array with two null characters at the end. Alternatively, if we change the requirements of value_cstring so that we require the passed length includes the trailing null, then we can assert that this is indeed the case within value_cstring. Any incorrect uses in the future will be quickly spotted. So that's what I did, c_string_operation::evaluate is left unchanged, but every other use of value_cstring is adjusted with a '+ 1' so that we include the null within the length being passed. I've added a header comment to value_cstring (value.h) to describe the requirements. Upon testing there were two tests that failed after this fix, gdb.base/settings.exp and gdb.python/py-mi.exp. In both of these cases we end up asking for the type of a character array allocated through value_cstring. The length of this array has now increased by one. Here's the previous behaviour: (gdb) set args abc (gdb) p $_gdb_setting("args") $1 = "abc" (gdb) ptype $1 type = char [3] (gdb) And here's the modified behaviour: (gdb) set args abc (gdb) p $_gdb_setting("args") $1 = "abc" (gdb) ptype $1 type = char [4] (gdb) Notice the type of $1 changed from an array of length 3 to an array of length 4. However, I don't think this is a bad thing, consider: char lit[] = "zzz"; int main() { return 0; } And in GDB: (gdb) ptype lit type = char [4] (gdb) The null character is considered part of the array, so I think the new behaviour makes sense. I've added some new tests that try to exercise this issue in a few different ways. At the end of the day though it's all just variations of the same thing, create a value by calling through value_cstring, then force GDB to push the value to the inferior, and then treat that as a C style string and see the lack of null character cause problems. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=21699 --- gdb/cli/cli-cmds.c | 17 ++++-- gdb/guile/scm-math.c | 16 ++++-- gdb/python/py-value.c | 8 ++- gdb/testsuite/gdb.base/cstring-exprs.c | 51 ++++++++++++++++++ gdb/testsuite/gdb.base/cstring-exprs.exp | 68 ++++++++++++++++++++++++ gdb/testsuite/gdb.base/settings.exp | 4 +- gdb/testsuite/gdb.python/py-mi.exp | 2 +- gdb/valops.c | 4 ++ gdb/value.c | 3 +- gdb/value.h | 4 ++ 10 files changed, 164 insertions(+), 13 deletions(-) create mode 100644 gdb/testsuite/gdb.base/cstring-exprs.c create mode 100644 gdb/testsuite/gdb.base/cstring-exprs.exp diff --git a/gdb/cli/cli-cmds.c b/gdb/cli/cli-cmds.c index 3b1c6a9f4bd..55cca84e7e5 100644 --- a/gdb/cli/cli-cmds.c +++ b/gdb/cli/cli-cmds.c @@ -2307,8 +2307,11 @@ value_from_setting (const setting &var, struct gdbarch *gdbarch) } if (len > 0) - return value_cstring (value, len, - builtin_type (gdbarch)->builtin_char); + { + /* The +1 here ensures we include the trailing null character. */ + return value_cstring (value, len + 1, + builtin_type (gdbarch)->builtin_char); + } else return value_cstring ("", 1, builtin_type (gdbarch)->builtin_char); @@ -2363,7 +2366,8 @@ str_value_from_setting (const setting &var, struct gdbarch *gdbarch) { std::string cmd_val = get_setshow_command_value_string (var); - return value_cstring (cmd_val.c_str (), cmd_val.size (), + /* The +1 ensures we include the trailing null character. */ + return value_cstring (cmd_val.c_str (), cmd_val.size () + 1, builtin_type (gdbarch)->builtin_char); } @@ -2392,8 +2396,11 @@ str_value_from_setting (const setting &var, struct gdbarch *gdbarch) } if (len > 0) - return value_cstring (value, len, - builtin_type (gdbarch)->builtin_char); + { + /* The +1 here ensures we include the trailing null character. */ + return value_cstring (value, len + 1, + builtin_type (gdbarch)->builtin_char); + } else return value_cstring ("", 1, builtin_type (gdbarch)->builtin_char); diff --git a/gdb/guile/scm-math.c b/gdb/guile/scm-math.c index 49fe93b97f8..c7dcfc85f11 100644 --- a/gdb/guile/scm-math.c +++ b/gdb/guile/scm-math.c @@ -803,9 +803,19 @@ vlscm_convert_typed_value_from_scheme (const char *func_name, 0 /*non-strict*/, &except_scm); if (s != NULL) - value = value_cstring (s.get (), len, - language_string_char_type (language, - gdbarch)); + { + /* The gdbscm_scm_to_string call doesn't guarantee to add + a tailing null-character, so we need to add our own. */ + gdb::unique_xmalloc_ptr tmp ((char *) xmalloc (len + 1)); + memcpy (tmp.get (), s.get (), len); + *(tmp.get () + len) = '\0'; + + /* The +1 here ensures we include the trailing null + character. */ + value = value_cstring (tmp.get (), len + 1, + language_string_char_type (language, + gdbarch)); + } else value = NULL; } diff --git a/gdb/python/py-value.c b/gdb/python/py-value.c index 65384c781bc..59b879e5ebc 100644 --- a/gdb/python/py-value.c +++ b/gdb/python/py-value.c @@ -1881,8 +1881,12 @@ convert_value_from_python (PyObject *obj) gdb::unique_xmalloc_ptr s = python_string_to_target_string (obj); if (s != NULL) - value = value_cstring (s.get (), strlen (s.get ()), - builtin_type_pychar); + { + /* The +1 here ensures we include the trailing null + character. */ + value = value_cstring (s.get (), strlen (s.get ()) + 1, + builtin_type_pychar); + } } else if (PyObject_TypeCheck (obj, &value_object_type)) value = ((value_object *) obj)->value->copy (); diff --git a/gdb/testsuite/gdb.base/cstring-exprs.c b/gdb/testsuite/gdb.base/cstring-exprs.c new file mode 100644 index 00000000000..8135edd97d4 --- /dev/null +++ b/gdb/testsuite/gdb.base/cstring-exprs.c @@ -0,0 +1,51 @@ +/* This testcase is part of GDB, the GNU debugger. + + Copyright 2023 Free Software Foundation, Inc. + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . */ + +#include +#include + +/* A memory area used as the malloc memory buffer. */ + +static char arena[] = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"; + +/* Override malloc(). When GDB tries to push strings into the inferior we + always return the same pointer to arena. This does mean we can't have + multiple strings in use at the same time, but that's fine for our basic + testing, and this is simpler than using dlsym. */ + +void +*malloc (size_t size) +{ + memset (arena, 'X', sizeof (arena)); + if (size > sizeof (arena)) + return NULL; + return arena; +} + +/* This function is called from GDB. */ + +void +take_string (const char *str) +{ + /* Nothing. */ +} + +int +main (void) +{ + return 0; +} diff --git a/gdb/testsuite/gdb.base/cstring-exprs.exp b/gdb/testsuite/gdb.base/cstring-exprs.exp new file mode 100644 index 00000000000..fc2f96affa0 --- /dev/null +++ b/gdb/testsuite/gdb.base/cstring-exprs.exp @@ -0,0 +1,68 @@ +# Copyright 2023 Free Software Foundation, Inc. + +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see . + +# Test different ways that we can cause GDB to call the value_string +# function. This function should create a C style string, i.e. the +# string should include a null terminating character. +# +# If the value created by value_cstring is pushed into the inferior +# and the null terminating character is missing, then we can get +# unexpected results. + +standard_testfile + +if { [prepare_for_testing "failed to prepare" $testfile $srcfile] } { + return -1 +} + +if {![runto_main]} { + return 0 +} + +if [allow_python_tests] { + # The $_as_string convenience function is implemented in Python. + gdb_test {printf "%s\n", $_as_string("aabbcc")} "\"aabbcc\"" + + # Create a gdb.Value object for a string. Take its address (which + # forces it into the inferior), and then print the address as a + # string. + gdb_test_no_output {python addr = gdb.Value("xxyyzz").address} + gdb_test {python gdb.execute("x/1s 0x%x" % addr)} \ + "$hex :\\s+\"xxyyzz\"" + + # Call an inferior function through a gdb.Value object, pass a + # string to the inferior function and ensure it arrives correctly. + gdb_test "p/x take_string" " = $hex.*" + gdb_test_no_output "python func_ptr = gdb.history (0)" \ + "place address of take_string into Python variable" + gdb_test "python func_value = func_ptr.dereference()" "" + gdb_breakpoint "take_string" + gdb_test {python result = func_value("qqaazz")} \ + "Breakpoint $decimal, take_string \\(str=$hex \"qqaazz\"\\) at .*" + gdb_test "continue" "Continuing\\." +} + +# Use printf on a string parsed by the C expression parser. +gdb_test {printf "%s\n", "ddeeff"} "ddeeff" + +# Parse a string in the C expression parser, force it into the +# inferior by taking its address, then print it as a string. +gdb_test {x/1s &"gghhii"} "$hex :\\s+\"gghhii\"" + +# Use $_gdb_setting_str and $_gdb_maint_setting_str to create a string +# value, and then print using printf, which forces the string into the +# inferior. +gdb_test {printf "%s\n", $_gdb_setting_str("arch")} "auto" +gdb_test {printf "%s\n", $_gdb_maint_setting_str("bfd-sharing")} "on" diff --git a/gdb/testsuite/gdb.base/settings.exp b/gdb/testsuite/gdb.base/settings.exp index e54203924ec..9dd3ee174c7 100644 --- a/gdb/testsuite/gdb.base/settings.exp +++ b/gdb/testsuite/gdb.base/settings.exp @@ -504,7 +504,9 @@ proc_with_prefix test-enum {} { gdb_test_no_output "$set_cmd zzz" show_setting "$show_cmd" "zzz" 0 "yyy" - check_type "test-settings enum" "type = char \\\[3\\\]" + # A string literal includes the trailing null character, hence the + # array size of four here. + check_type "test-settings enum" "type = char \\\[4\\\]" test_gdb_complete_multiple "$set_cmd " "" "" { "xxx" diff --git a/gdb/testsuite/gdb.python/py-mi.exp b/gdb/testsuite/gdb.python/py-mi.exp index 97b8a42f715..c65c83553f1 100644 --- a/gdb/testsuite/gdb.python/py-mi.exp +++ b/gdb/testsuite/gdb.python/py-mi.exp @@ -288,7 +288,7 @@ mi_create_dynamic_varobj nstype2 nstype2 ".*" 1 \ "create nstype2 varobj" mi_list_varobj_children nstype2 { - { {nstype2.} {} 6 {char \[6\]} } + { {nstype2.} {} 7 {char \[7\]} } } "list children after setting exception flag" mi_create_varobj me me \ diff --git a/gdb/valops.c b/gdb/valops.c index d002c9dca9b..305885cc9c6 100644 --- a/gdb/valops.c +++ b/gdb/valops.c @@ -1738,9 +1738,13 @@ value_array (int lowbound, int highbound, struct value **elemvec) return val; } +/* See value.h. */ + struct value * value_cstring (const char *ptr, ssize_t len, struct type *char_type) { + gdb_assert (*(ptr + (len - 1)) == '\0'); + struct value *val; int lowbound = current_language->string_lower_bound (); ssize_t highbound = len / char_type->length (); diff --git a/gdb/value.c b/gdb/value.c index eab1933b256..4f1e0d6c53e 100644 --- a/gdb/value.c +++ b/gdb/value.c @@ -2044,7 +2044,8 @@ value_of_internalvar (struct gdbarch *gdbarch, struct internalvar *var) break; case INTERNALVAR_STRING: - val = value_cstring (var->u.string, strlen (var->u.string), + /* The +1 ensures we include the null character. */ + val = value_cstring (var->u.string, strlen (var->u.string) + 1, builtin_type (gdbarch)->builtin_char); break; diff --git a/gdb/value.h b/gdb/value.h index e13f92c397b..a23adda3c65 100644 --- a/gdb/value.h +++ b/gdb/value.h @@ -1182,6 +1182,10 @@ class scoped_value_mark const struct value *m_value; }; +/* Create an array value, with element type CHAR_TYPE and length LEN. The + array represents a C style string. The content for the value is copied + from PTR. The last array element must be a null character. */ + extern struct value *value_cstring (const char *ptr, ssize_t len, struct type *char_type); extern struct value *value_string (const char *ptr, ssize_t len, base-commit: 60a13bbcdfb0ce008a77563cea0c34c830d7b170 -- 2.25.4