public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
From: Andrew Burgess <aburgess@redhat.com>
To: "abdul.b.ijaz" <abijaz@ecsmtp.iul.intel.com>, gdb-patches@sourceware.org
Cc: abdul.b.ijaz@intel.com, tom@tromey.com, nils-christian.kempke@intel.com
Subject: Re: [PATCH v3 1/1] gdb: Fix arrays of variable length strings for FORTRAN
Date: Fri, 13 May 2022 18:20:23 +0100	[thread overview]
Message-ID: <87r14xjqnc.fsf@redhat.com> (raw)
In-Reply-To: <20220504200529.11857-2-abdul.b.ijaz@intel.com>

"abdul.b.ijaz" <abijaz@ecsmtp.iul.intel.com> writes:

> GDB is not able to print arrays of FORTRAN variable length strings
> when dwarf info contains DW_AT_string_length only regarding the string
> length information.  So handing of dynamic array is updated to handle
> such cases.
>
> Suppose we have
>
> subroutine vla_array (arr1, arr2)
>   character (len=*):: arr1 (:)
>   character (len=5):: arr2 (:)
>
>   print *, arr1 ! break-here
>   print *, arr2
> end subroutine vla_array
>
> The "print arr1" and "print arr2" command at the "break-here" line
> gives the following output:
>
> (gdb) print arr1
> $1 = (<error reading variable>
> (gdb) print arr2
> $2 = ('abcde', 'abcde', 'abcde')
> (gdb) ptype arr1
> type = character*(*) (5)
> (gdb) ptype arr2
> type = character*5 (3)
>
> So GDB is able to print the array of string with static length but for
> variable length it fail to do so.  For this case now improve handling of
> the TYPE_CODE_STRING code in dynamic array length resolving function to set
> the static length of strings as well when only DW_AT_string_length is given
> in the dwarf info.
>
> Dwarf info using Intel® Fortran Compiler for such case contains following:
>  <1><fd>: Abbrev Number: 12 (DW_TAG_string_type)
>     <fe>   DW_AT_name        : (indirect string, offset: 0xd2): .str.ARR1
>     <102>   DW_AT_string_length: 3 byte block: 97 23 8 	(DW_OP_push_object_address; DW_OP_plus_uconst: 8)
>
> After fixing it and compiling using Intel® Fortran Compiler now gdb shows
> following:
>
> (gdb) p arr1
> $1 = ('abddefghij', 'abddefghij', 'abddefghij', 'abddefghij', 'abddefghij')
> (gdb) p arr2
> $2 = ('abcde', 'abcde', 'abcde')
> (gdb) ptype arr1
> type = character*10 (5)
> (gdb) ptype arr2
> type = character*5 (3)
>
> Fixing above issue introduce regression in gdb.fortran/mixed-lang-stack.exp
> test in case of "set language c" or "set language c++" so fix function
> c_value_print_inner accordingly.

I'd like to see some additional words here describing _why_ the changes
above broke mixed-lang-stack.exp, and why the change you made fixed it,
something like (reword as you see fit):

  - Before this change resolve_dynamic_array_or_string was called for
    all TYPE_CODE_ARRAY and TYPE_CODE_STRING types, but, in the end,
    this function always called create_array_type_with_stride, which
    creates a TYPE_CODE_ARRAY type.

  - After this change resolve_dynamic_array_or_string now calls
    create_array_type_with_stride or create_string_type, so if the
    incoming dynamic type is a TYPE_CODE_STRING then we'll get back a
    TYPE_CODE_STRING type.

  - In gdb.fortran/mixed-lang-stack.exp the test forces the language to
    C/C++ and print a Fortran string value.  The string value is a
    dynamic type with code TYPE_CODE_STRING.

  - Before this commit the dynamic type resolution would always convert
    this to a TYPE_CODE_ARRAY of characters, which the C value printing
    could handle.

  - After this commit we now get a TYPE_CODE_STRING, which neither the C
    value printing, or the generic value printing code can support.

  - And so, I've added support for TYPE_CODE_STRING to the C value
    printing, strings are handled just like arrays.

It's just, when I look at the change, I could see what you'd changed,
but it too me longer than I'd like to figure out why the change was
needed :/

I also noticed that this patch causes this failure:

  ptype s
  type = character*3
  (gdb) FAIL: gdb.opt/fortran-string.exp: ptype s

Though looking at the test script, I think the expected results were not
correct; we were looking for 'character (3)' while what we're getting
now seems better, I propose the following additional patch:

## START ##

diff --git a/gdb/testsuite/gdb.opt/fortran-string.exp b/gdb/testsuite/gdb.opt/fortran-string.exp
index 4c633356160..785fe5eace0 100644
--- a/gdb/testsuite/gdb.opt/fortran-string.exp
+++ b/gdb/testsuite/gdb.opt/fortran-string.exp
@@ -34,5 +34,5 @@ if ![runto f] then {
 
 gdb_test_no_output "set print frame-arguments all"
 gdb_test "frame" ".*s='foo'.*"
-gdb_test "ptype s" "type = character \\(3\\)"
+gdb_test "ptype s" "type = character\\*3"
 gdb_test "p s" "\\$\[0-9\]* = 'foo'"

## END ##

>                                    Test gdb.fortran/mixed-lang-stack.exp
> cover this change.
>
> gdb/ChangeLog:
> 2022-05-04  Abdul Basit Ijaz  <abdul.b.ijaz@intel.com>
>
> 	* gdbtypes.c (resolve_dynamic_array_or_string): Improve handling
> 	of TYPE_CODE_STRING code to use return value of create_string_type
> 	outcome for this case.
> 	* c-valprint.c (c_value_print_inner): Handle String type code
> 	in the same way as the Array type code to fix the regression
> 	in gdb.fortran/mixed-lang-stack.exp test.
>
> gdb/testsuite/ChangeLog:
> 2022-05-04  Abdul Basit Ijaz  <abdul.b.ijaz@intel.com>
>
> 	* gdb.fortran/vla-array.f90: New fie.
> 	* gdb.fortran/vla-array.exp: New fie.
>
> 2022-05-04 Abdul Basit Ijaz <abdul.b.ijaz@intel.com>
> ---
>  gdb/c-valprint.c                        |  1 +
>  gdb/gdbtypes.c                          | 19 +++++++--
>  gdb/testsuite/gdb.fortran/vla-array.exp | 57 +++++++++++++++++++++++++
>  gdb/testsuite/gdb.fortran/vla-array.f90 | 44 +++++++++++++++++++
>  4 files changed, 117 insertions(+), 4 deletions(-)
>  create mode 100644 gdb/testsuite/gdb.fortran/vla-array.exp
>  create mode 100644 gdb/testsuite/gdb.fortran/vla-array.f90
>
> diff --git a/gdb/c-valprint.c b/gdb/c-valprint.c
> index bd445588ca..21a4f539cb 100644
> --- a/gdb/c-valprint.c
> +++ b/gdb/c-valprint.c
> @@ -427,6 +427,7 @@ c_value_print_inner (struct value *val, struct ui_file *stream, int recurse,
>    switch (type->code ())
>      {
>      case TYPE_CODE_ARRAY:
> +    case TYPE_CODE_STRING:
>        c_value_print_array (val, stream, recurse, options);
>        break;
>  
> diff --git a/gdb/gdbtypes.c b/gdb/gdbtypes.c
> index 2a51372a03..3f36245e64 100644
> --- a/gdb/gdbtypes.c
> +++ b/gdb/gdbtypes.c
> @@ -2341,16 +2341,24 @@ resolve_dynamic_array_or_string_1 (struct type *type,
>      = resolve_dynamic_range (range_type, addr_stack, rank, resolve_p);
>  
>    ary_dim = check_typedef (TYPE_TARGET_TYPE (type));
> -  if (ary_dim != NULL && ary_dim->code () == TYPE_CODE_ARRAY)
> +  if (ary_dim != NULL && (ary_dim->code () == TYPE_CODE_ARRAY
> +			  || ary_dim->code () == TYPE_CODE_STRING))
>      {
>        ary_dim = copy_type (ary_dim);
>        elt_type = resolve_dynamic_array_or_string_1 (ary_dim, addr_stack,
> -						    rank - 1, resolve_p);
> +						    (ary_dim->code ()
> +						     == TYPE_CODE_STRING ? rank
> +						     : rank - 1), resolve_p);

So I went back and forth on this change.

My initial though was that we should not change the rank calculation
here, but instead make a change in resolve_dynamic_array_or_string like
this:

## START ##

diff --git a/gdb/gdbtypes.c b/gdb/gdbtypes.c
index 3f36245e64a..ae09e1d9530 100644
--- a/gdb/gdbtypes.c
+++ b/gdb/gdbtypes.c
@@ -2346,9 +2346,7 @@ resolve_dynamic_array_or_string_1 (struct type *type,
     {
       ary_dim = copy_type (ary_dim);
       elt_type = resolve_dynamic_array_or_string_1 (ary_dim, addr_stack,
-						    (ary_dim->code ()
-						     == TYPE_CODE_STRING ? rank
-						     : rank - 1), resolve_p);
+						    rank - 1, resolve_p);
     }
   else
     elt_type = TYPE_TARGET_TYPE (type);
@@ -2460,7 +2458,8 @@ resolve_dynamic_array_or_string (struct type *type,
       rank = 1;
 
       for (struct type *tmp_type = check_typedef (TYPE_TARGET_TYPE (type));
-	   tmp_type->code () == TYPE_CODE_ARRAY;
+	   (tmp_type->code () == TYPE_CODE_ARRAY
+	    || tmp_type->code () == TYPE_CODE_STRING);
 	   tmp_type = check_typedef (TYPE_TARGET_TYPE (tmp_type)))
 	++rank;
     }

## END ##

But then I thought a little harder and realised that you were right all
along and that my proposal was not a good idea.  My change works fine
for arrays with known rank, and indeed, with my change, your test runs
just fine.

But, what about an array with assumed rank, that contains variable
length strings, something like:

  subroutine vla_array_func (array)
    character (len=*):: array (..)
    print *, rank(array)
  end subroutine vla_array_func

In this case we're going to end up passing through the dynamic rank
resolution path, and the rank we obtain is going to be the rank of
the array, and will not include the string, while your change would
handle this just fine.

[ NOTE: I couldn't get any test using the above Fortran code to do
  anything sensible under GDB, though from what I could see the problem
  was that the generated DWARF was not good enough, rather than it being
  a GDB problem.  Still, I would have liked to add a test that shows my
  pactch failing... ]

The reason I've mentioned this rollercoaster journey at all is because I
still think your change might not be the right way to go.  Notice the
the prepatched code, it basically goes:

  if ( /* Is the target type another array dimension?  */ )
    /* Resolve dynamic type for next array rank.  */
  else
    /* This must be the array element type, don't resolve it.  */

Your patch changes this, so the new logic goes like this (I've
deliberately restructured things to make it clearer):

  if ( /* Is the target type another array dimension?  */ )
    /* Resolve dynamic type for next array rank.  */
  else
    {
      if ( /* Is the element type a string?  */ )
        /* Resolve the dynamic type of the string.  */
      else
        /* Don't resolve the dynamic type of any othe element type.  */
    }

I noticed that you chose to special case the string handling, and I
wondered why?

What I think we should be doing here is not resolving the dynamic type
of the array elements, after all, each array element could, in theory,
have a different dynamic type, right?  So it doesn't really make sense
to resolve the element type just once.

So, I tried removing the string special case handling, just to see what
would happen.

.... and, stuff breaks.

What's going on?

Well, when printing the array, we do try to resolve the dynamic type for
each array element, which is what I was expecting, and which, I had
hoped would give us the results we wanted ... but this wasn't working.

The reason lies in value_from_contents_and_address.  In here you'll
notice we resolve the dynamic type, and then set the value's address
from the data location.

Remember, a dynamic object has two addresses of interest, an address for
the desriptor, the object address, which is always valid, and then a
data location address, where the actual value contents live, which is
only valid some of the time.

In value_from_contents_and_address, we throw out the object address, and
replace it with the data location address.  This is because GDB's values
only really have a concept of a single address.

What this means is that, once we've fetched the dynamic array object,
resolved its type, and converted it into a value object, we no longer
have access to its descriptor address.  What this means is that, if to
resolve the types of the elements we need the descriptor address, then
we're kind of stuck!  What actually happens is that we end up using the
wrong address for resolving the dynamic string type.

So, how to fix this?

I can think of two solutions of the top of my head, first would be to
have the value objects carry around both addresses, this would allow us
to continue to access the correct address later on when we need to.

The other solution would be to have some mechanism to understand that
the string type refers to the object address of the containing array,
and so we would know that we should resolve the string type once when we
resolve the array, and not defer the resolution until we read each
element.

This second approach is nice in that it would save us resolving the
dynamic type multiple times, plus, in theory I guess, it just relies on
us spotting some properties of the DWARF, and tagging the type in some
way... I don't know what that might be yet though.

As for this patch, I'm tempted to not block this work right now, but I
would like to see the code restructured to highlight the fact that there
is an issue here, maybe this patch (which applies on top of your work):

### START ###

diff --git a/gdb/gdbtypes.c b/gdb/gdbtypes.c
index 3f36245e64a..97c0f6ea2aa 100644
--- a/gdb/gdbtypes.c
+++ b/gdb/gdbtypes.c
@@ -2341,17 +2341,45 @@ resolve_dynamic_array_or_string_1 (struct type *type,
     = resolve_dynamic_range (range_type, addr_stack, rank, resolve_p);
 
   ary_dim = check_typedef (TYPE_TARGET_TYPE (type));
-  if (ary_dim != NULL && (ary_dim->code () == TYPE_CODE_ARRAY
-			  || ary_dim->code () == TYPE_CODE_STRING))
+  if (ary_dim != NULL && ary_dim->code () == TYPE_CODE_ARRAY)
     {
       ary_dim = copy_type (ary_dim);
       elt_type = resolve_dynamic_array_or_string_1 (ary_dim, addr_stack,
-						    (ary_dim->code ()
-						     == TYPE_CODE_STRING ? rank
-						     : rank - 1), resolve_p);
+						    rank - 1, resolve_p);
     }
   else
-    elt_type = TYPE_TARGET_TYPE (type);
+    {
+      /* The following special case for TYPE_CODE_STRING should not be
+	 needed, ideally we would defer resolving the dynamic type of the
+	 array elements until needed later, and indeed, the resolved type
+	 of each array element might be different, so attempting to resolve
+	 the type here makes no sense.
+
+	 However, in Fortran, for arrays of strings, each element must be
+	 the same type, as such, the DWARF for the string length relies on
+	 the object address of the array itself.
+
+	 The problem here is that, when we create value's from the dynamic
+	 array type, we resolve the data location, and use that as the
+	 value address, this completely discards the original value
+	 address, and it is this original value address that is the
+	 descriptor for the dynamic array, the very address that the DWARF
+	 needs us to push in order to resolve the dynamic string length.
+
+	 What this means then, is that, given the current state of GDB, if
+	 we don't resolve the string length now, then we will have lost
+	 access to the address of the dynamic object descriptor, and so we
+	 will not be able to resolve the dynamic string later.
+
+	 For now then, we special case TYPE_CODE_STRING on behalf of
+	 Fortran, and hope that this doesn't cause problems for anyone
+	 else.  */
+      if (ary_dim->code () == TYPE_CODE_STRING)
+	elt_type = resolve_dynamic_type_internal (TYPE_TARGET_TYPE (type),
+						  addr_stack, 0);
+      else
+	elt_type = TYPE_TARGET_TYPE (type);
+    }
 
   prop = type->dyn_prop (DYN_PROP_BYTE_STRIDE);
   if (prop != nullptr && type->code () == TYPE_CODE_STRING)

### END ###

I look forward to hearing your thoughts,

Thanks,
Andrew


>      }
>    else
>      elt_type = TYPE_TARGET_TYPE (type);
>  
>    prop = type->dyn_prop (DYN_PROP_BYTE_STRIDE);
> +  if (prop != nullptr && type->code () == TYPE_CODE_STRING)
> +    {
> +       prop = nullptr;
> +       warning (_("byte stride property ignored on string type"));
> +    }
>    if (prop != NULL && resolve_p)
>      {
>        if (dwarf2_evaluate_property (prop, NULL, addr_stack, &value))
> @@ -2370,8 +2378,11 @@ resolve_dynamic_array_or_string_1 (struct type *type,
>    else
>      bit_stride = TYPE_FIELD_BITSIZE (type, 0);
>  
> -  return create_array_type_with_stride (type, elt_type, range_type, NULL,
> -					bit_stride);
> +  if (type->code () == TYPE_CODE_STRING)
> +    return create_string_type (type, elt_type, range_type);
> +  else
> +    return create_array_type_with_stride (type, elt_type, range_type, NULL,
> +					  bit_stride);
>  }
>  
>  /* Resolve an array or string type with dynamic properties, return a new
> diff --git a/gdb/testsuite/gdb.fortran/vla-array.exp b/gdb/testsuite/gdb.fortran/vla-array.exp
> new file mode 100644
> index 0000000000..f9cd9f5e74
> --- /dev/null
> +++ b/gdb/testsuite/gdb.fortran/vla-array.exp
> @@ -0,0 +1,57 @@
> +# Copyright 2022 Free Software Foundation, Inc.
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +
> +standard_testfile ".f90"
> +load_lib "fortran.exp"
> +
> +if {[skip_fortran_tests]} { return -1 }
> +
> +if { [prepare_for_testing "failed to prepare" ${testfile} ${srcfile} \
> +     {debug f90 quiet}] } {
> +    return -1
> +}
> +
> +if ![fortran_runto_main] {
> +    untested "could not run to main"
> +    return -1
> +}
> +
> +# Try to access vla string / vla string array / string array values.
> +gdb_breakpoint [gdb_get_line_number "arr_vla1-print"]
> +gdb_continue_to_breakpoint "arr_vla1-print"
> +
> +# GFortran does not emit DW_TAG_string_type for array of variable length
> +# string.
> +if [test_compiler_info "gcc*"] { setup_xfail *-*-* gcc/101826 }
> +gdb_test "print arr_vla1"  \
> +    " = \\\('vlaaryvlaary', 'vlaaryvlaary', 'vlaaryvlaary', 'vlaaryvlaary', 'vlaaryvlaary'\\\)"  \
> +    "print vla string array"
> +
> +if [test_compiler_info "gcc*"] { setup_xfail *-*-* gcc/101826 }
> +gdb_test "ptype arr_vla1"  \
> +    "type = character\\*12 \\(5\\)"  \
> +    "print variable length string array type"
> +gdb_test "print arr_vla2"  \
> +    " = 'vlaary'"  \
> +    "print variable length string"
> +gdb_test "ptype arr_vla2"  \
> +    "type = character\\*6"  \
> +    "print variable length string type"
> +gdb_test "print arr2"  \
> +    " = \\\('vlaaryvla', 'vlaaryvla', 'vlaaryvla'\\\)"  \
> +    "print string array"
> +gdb_test "ptype arr2"  \
> +    "type = character\\*9 \\(3\\)"  \
> +    "print string array type"
> diff --git a/gdb/testsuite/gdb.fortran/vla-array.f90 b/gdb/testsuite/gdb.fortran/vla-array.f90
> new file mode 100644
> index 0000000000..f64c3d64b1
> --- /dev/null
> +++ b/gdb/testsuite/gdb.fortran/vla-array.f90
> @@ -0,0 +1,44 @@
> +! Copyright 2022 Free Software Foundation, Inc.
> +!
> +! This program is free software; you can redistribute it and/or modify
> +! it under the terms of the GNU General Public License as published by
> +! the Free Software Foundation; either version 3 of the License, or
> +! (at your option) any later version.
> +!
> +! This program is distributed in the hope that it will be useful,
> +! but WITHOUT ANY WARRANTY; without even the implied warranty of
> +! MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +! GNU General Public License for more details.
> +!
> +! You should have received a copy of the GNU General Public License
> +! along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +
> +subroutine vla_array_func (arr_vla1, arr_vla2, arr2)
> +  character (len=*):: arr_vla1 (:)
> +  character (len=*):: arr_vla2
> +  character (len=9):: arr2 (:)
> +
> +  print *, arr_vla1    ! arr_vla1-print
> +  print *, arr_vla2
> +  print *, arr2
> +end subroutine vla_array_func
> +
> +program vla_array_main
> +interface
> +  subroutine vla_array_func (arr_vla1, arr_vla2, arr2)
> +    character (len=*):: arr_vla1 (:)
> +    character (len=*):: arr_vla2
> +    character (len=9):: arr2 (:)
> +  end subroutine vla_array_func
> +end interface
> +  character (len=9) :: arr1 (3)
> +  character (len=6) :: arr2
> +  character (len=12) :: arr3 (5)
> +
> +  arr1 = 'vlaaryvla'
> +  arr2 = 'vlaary'
> +  arr3 = 'vlaaryvlaary'
> +
> +  call vla_array_func (arr3, arr2, arr1)
> +
> +end program vla_array_main
> -- 
> 2.31.1


      reply	other threads:[~2022-05-13 17:20 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-04 20:05 [PATCH v3 0/1] Fix variable length strings array " abdul.b.ijaz
2022-05-04 20:05 ` [PATCH v3 1/1] gdb: Fix arrays of variable length strings " abdul.b.ijaz
2022-05-13 17:20   ` Andrew Burgess [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87r14xjqnc.fsf@redhat.com \
    --to=aburgess@redhat.com \
    --cc=abdul.b.ijaz@intel.com \
    --cc=abijaz@ecsmtp.iul.intel.com \
    --cc=gdb-patches@sourceware.org \
    --cc=nils-christian.kempke@intel.com \
    --cc=tom@tromey.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).