[PATCH v1] gdb/DAP Introduce new methods to to Pretty Printers

public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed

* [PATCH v1] gdb/DAP Introduce new methods to to Pretty Printers
@ 2023-06-25  9:21 Simon Farre
  2023-06-25 10:22 ` Eli Zaretskii
  2023-07-13 20:38 ` Tom Tromey
  0 siblings, 2 replies; 9+ messages in thread
From: Simon Farre @ 2023-06-25  9:21 UTC (permalink / raw)
  To: gdb-patches; +Cc: tom, eliz, Simon Farre

This patch introduces the `num_children` method to the NoOpArray-pretty
printer as well as fixing a bug where some types accidentally uses
NoOpScalar pretty printers. It also cleans up the retrieval of variables
during a "variables" request (or an "evaluate" request) reducing the
complexity.

Caching of variables would be fine if we
actually kept variables around during step/continues that stops inside
the same frame we're currently at - this is how I did it for my
DA-implementation Midas, to be able to be faster than most other DA:s;
meaning, I kept the `gdb.Value`s around until we've left the frame they
were created in. However, I think this breaks with the
DAP-specification that states that variableReferences should only ever
be valid "during a stop". So unless we actually build some
"remembering mechanism" that keeps variableReferences and invalidates
them only once we figure out we're not in the same frame anymore,
caching the values (children) is not important. Regardless, that kind of
functionality would rely on mechanisms other than the ones I've removed
here.

Going forward, Pretty Printers of "array type" should be expected (though not required)
to provide the `num_children` method, in order to be able to provide a
performant DAP-experience. They should also provide `children_range`,
a method meant to retrieve a sub-range of children of a Pretty Printer
wrapped variable without having to iterate over all prior elements of
the range. This method is appropriate on pretty printers that wrap
std::vector, arrays, etc. Even though these methods are not required,
there's no good reason to not implement them, because it provides an
absolutely astronomical performance difference. I've provided the method
for the NoOpArrayPrinter.

After hacking the std-library pretty printer for std::vector to provide
these two methods, servicing one "variables" request of the "Locals"
scope where a std::vector<std::string> of 10k elements exist, went from
35 seconds (!!!) down to 0.4 ms. For a 10k element of std::vector<int>
it went from 100 ms down to the same 0.4ms, also. The same went for
retrieving a sub-range of elements 9800-9900; it went from 35 seconds
(due to having to iterate over _all_ prior children) down to 350ms, for
std::vector<std::string> - the still-very-high time count is because of
std::string's Pretty Printer implementation which is incredibly slow.
For the std::vector<int>, retrieving that same sub-range was in the ~10
ms range.

These two new optional methods have been added to documentation as well
as an example that mimicks the layout of `std::vector`, to show an
example of how they can be implemented.
---
 gdb/doc/python.texi              | 52 ++++++++++++++++++++++++
 gdb/python/lib/gdb/dap/scopes.py | 70 ++++++++++++++++++++++++--------
 gdb/python/lib/gdb/dap/varref.py | 68 +++++++++----------------------
 gdb/python/lib/gdb/printing.py   | 42 +++++++++++++++++--
 4 files changed, 162 insertions(+), 70 deletions(-)

diff --git a/gdb/doc/python.texi b/gdb/doc/python.texi
index 9cb97a9597d..61a7e49d1df 100644
--- a/gdb/doc/python.texi
+++ b/gdb/doc/python.texi
@@ -1707,6 +1707,23 @@ Children may be hidden from display based on the value of @samp{set
 print max-depth} (@pxref{Print Settings}).
 @end defun
 
+@defun pretty_printer.children_range (self, start, end)
+This method is optional but @value{GDBN} will call this method to
+be able to more effeciently compute a sub-range of children
+of the pretty-printers value.  If the pretty printer is for an
+array-like type, it is recommended that this method is implemented
+as the DAP interpreter will make extended use of it.  For an example on how
+to best implement this method @pxref{Writing a Pretty-Printer}.  Not
+implementing this and @code{num_children} can have drastic effects on
+the performance of a DAP-interpreter debug session.
+@end defun
+
+@defun pretty_printer.num_children (self)
+@value{GDBN} can call this method to query how many children this pretty
+printer will return from a @code{pretty_printer.children ()} call. This method
+is used by the DAP interpreter to be able to display paginated results.
+@end defun
+
 @defun pretty_printer.display_hint (self)
 The CLI may call this method and use its result to change the
 formatting of a value.  The result will also be supplied to an MI
@@ -1965,10 +1982,43 @@ These are the types we are going to pretty-print:
 @smallexample
 struct foo @{ int a, b; @};
 struct bar @{ struct foo x, y; @};
+struct vec @{ foo* start; foo* current; foo* end; @}
+
 @end smallexample
 
 Here are the printers:
 
+@smallexample
+class vecPrinter:
+    """Print a vec of foo objects."""
+
+    def __init__(self, val):
+        self.val = val
+
+    def to_string(self):
+        return ""
+
+    def num_children(self):
+        first = self.val["start"]
+        last = self.val["current"]
+        return last - first
+
+    def children_range(self, start, end):
+        ptr = self.val["start"]
+        for i in range(start, end):
+            yield (str(i), (ptr + i).dereference())
+
+    def children(self):
+        ptr = self.val["start"]
+        size = int(self.num_children())
+        for i in range(0, size):
+            yield (str(i), (ptr + i).dereference())
+
+    def display_hint(self):
+        return "array"
+
+@end smallexample
+
 @smallexample
 class fooPrinter:
     """Print a foo object."""
@@ -2003,6 +2053,7 @@ def build_pretty_printer():
         "my_library")
     pp.add_printer('foo', '^foo$', fooPrinter)
     pp.add_printer('bar', '^bar$', barPrinter)
+    pp.add_printer('vec', '^vec$', barPrinter)
     return pp
 @end smallexample
 
@@ -2025,6 +2076,7 @@ my_library.so:
   my_library
     foo
     bar
+    vec
 @end smallexample
 
 @node Type Printing API
diff --git a/gdb/python/lib/gdb/dap/scopes.py b/gdb/python/lib/gdb/dap/scopes.py
index 9b80dd9ce80..10d9a5a6b97 100644
--- a/gdb/python/lib/gdb/dap/scopes.py
+++ b/gdb/python/lib/gdb/dap/scopes.py
@@ -49,48 +49,76 @@ def _block_vars(block):
     return (args, locs)
 
 
+class ScopePrinter:
+    def __init__(self, frame, variables):
+        self.variables = variables
+        self.frame = frame
+
+    def to_string(self):
+        return ""
+
+    @in_gdb_thread
+    def children(self):
+        for sym in self.variables:
+            if sym.needs_frame:
+                val = sym.value(self.frame)
+            else:
+                val = sym.value()
+            yield ((sym.print_name, val))
+
+    def num_children(self):
+        return len(self.variables)
+
+
 class ScopeReference(BaseReference):
-    def __init__(self, name, hint, frame, var_list):
+    def __init__(self, name, hint, frame, printer):
         super().__init__(name)
         self.hint = hint
         self.frame = frame
         self.func = frame.function()
-        self.var_list = var_list
+        self.printer = printer
 
     def to_object(self):
         result = super().to_object()
         result["presentationHint"] = self.hint
         # How would we know?
         result["expensive"] = False
-        result["namedVariables"] = len(self.var_list)
+        result["namedVariables"] = self.printer.num_children()
         if self.func is not None:
             result["line"] = self.func.line
             # FIXME construct a Source object
         return result
 
     def child_count(self):
-        return len(self.var_list)
+        return self.printer.num_children()
+
+
+class RegisterFilePrinter:
+    def __init__(self, frame, registers):
+        self.registers = registers
+        self.frame = frame
+
+    def to_string(self):
+        return ""
 
     @in_gdb_thread
-    def fetch_one_child(self, idx):
-        sym = self.var_list[idx]
-        if sym.needs_frame:
-            val = sym.value(self.frame)
-        else:
-            val = sym.value()
-        return (sym.print_name, val)
+    def children(self):
+        for reg in self.registers:
+            yield (reg.name, self.frame.read_register(reg))
+
+    def num_children(self):
+        return len(self.registers)
 
 
 class RegisterReference(ScopeReference):
     def __init__(self, name, frame):
         super().__init__(
-            name, "registers", frame, list(frame.architecture().registers())
+            name,
+            "registers",
+            frame,
+            RegisterFilePrinter(frame, list(frame.architecture().registers())),
         )
 
-    @in_gdb_thread
-    def fetch_one_child(self, idx):
-        return (self.var_list[idx].name, self.frame.read_register(self.var_list[idx]))
-
 
 # Helper function to create a DAP scopes for a given frame ID.
 @in_gdb_thread
@@ -101,9 +129,15 @@ def _get_scope(id):
     if block is not None:
         (args, locs) = _block_vars(block)
         if args:
-            scopes.append(ScopeReference("Arguments", "arguments", frame, args))
+            scopes.append(
+                ScopeReference(
+                    "Arguments", "arguments", frame, ScopePrinter(frame, args)
+                )
+            )
         if locs:
-            scopes.append(ScopeReference("Locals", "locals", frame, locs))
+            scopes.append(
+                ScopeReference("Locals", "locals", frame, ScopePrinter(frame, locs))
+            )
     scopes.append(RegisterReference("Registers", frame))
     return [x.to_object() for x in scopes]
 
diff --git a/gdb/python/lib/gdb/dap/varref.py b/gdb/python/lib/gdb/dap/varref.py
index 213151fd3d3..bc73db7f45d 100644
--- a/gdb/python/lib/gdb/dap/varref.py
+++ b/gdb/python/lib/gdb/dap/varref.py
@@ -17,7 +17,7 @@ import gdb
 from .startup import in_gdb_thread
 from .server import client_bool_capability
 from abc import abstractmethod
-
+from itertools import islice
 
 # A list of all the variable references created during this pause.
 all_variables = []
@@ -73,19 +73,9 @@ class BaseReference:
         """Call this to declare that this variable or scope has no children."""
         self.ref = 0
 
-    @abstractmethod
-    def fetch_one_child(self, index):
-        """Fetch one child of this variable.
-
-        INDEX is the index of the child to fetch.
-        This should return a tuple of the form (NAME, VALUE), where
-        NAME is the name of the variable, and VALUE is a gdb.Value."""
-        return
-
     @abstractmethod
     def child_count(self):
-        """Return the number of children of this variable."""
-        return
+        return None
 
     @in_gdb_thread
     def fetch_children(self, start, count):
@@ -95,15 +85,19 @@ class BaseReference:
         COUNT is the number to return, with 0 meaning return all."""
         if count == 0:
             count = self.child_count()
-        if self.children is None:
-            self.children = [None] * self.child_count()
         result = []
-        for idx in range(start, start + count):
-            if self.children[idx] is None:
-                (name, value) = self.fetch_one_child(idx)
-                self.children[idx] = VariableReference(name, value)
-            result.append(self.children[idx])
-        return result
+        stop = start + count if count is not None else None
+
+        if hasattr(self.printer, "children_range"):
+            # Performant path
+            for (name, value) in self.printer.children_range(start, start + count):
+                result.append(VariableReference(name, value))
+            return result
+        else:
+            # The extremely slow path, if type is a "array-like"
+            for (name, value) in islice(self.printer.children(), start, stop):
+                result.append(VariableReference(name, value))
+            return result
 
 
 class VariableReference(BaseReference):
@@ -124,35 +118,17 @@ class VariableReference(BaseReference):
         self.child_cache = None
         if not hasattr(self.printer, "children"):
             self.no_children()
-            self.count = None
-        else:
-            self.count = -1
-
-    def cache_children(self):
-        if self.child_cache is None:
-            # This discards all laziness.  This could be improved
-            # slightly by lazily evaluating children, but because this
-            # code also generally needs to know the number of
-            # children, it probably wouldn't help much.  A real fix
-            # would require an update to gdb's pretty-printer protocol
-            # (though of course that is probably also inadvisable).
-            self.child_cache = list(self.printer.children())
-        return self.child_cache
 
     def child_count(self):
-        if self.count is None:
-            return None
-        if self.count == -1:
-            if hasattr(self.printer, "num_children"):
-                num_children = self.printer.num_children
-            else:
-                num_children = len(self.cache_children())
-            self.count = num_children
-        return self.count
+        if hasattr(self.printer, "num_children"):
+            return self.printer.num_children()
+        return None
 
     def to_object(self):
         result = super().to_object()
-        result[self.result_name] = self.printer.to_string()
+        result[self.result_name] = (
+            str(self.value) if self.ref == 0 else str(self.printer.to_string())
+        )
         num_children = self.child_count()
         if num_children is not None:
             if (
@@ -170,10 +146,6 @@ class VariableReference(BaseReference):
             result["type"] = str(self.value.type)
         return result
 
-    @in_gdb_thread
-    def fetch_one_child(self, idx):
-        return self.cache_children()[idx]
-
 
 @in_gdb_thread
 def find_variable(ref):
diff --git a/gdb/python/lib/gdb/printing.py b/gdb/python/lib/gdb/printing.py
index 353427d000a..a5ff5a6d96c 100644
--- a/gdb/python/lib/gdb/printing.py
+++ b/gdb/python/lib/gdb/printing.py
@@ -278,6 +278,9 @@ class NoOpScalarPrinter:
     def to_string(self):
         return self.value.format_string(raw=True)
 
+    def num_children(self):
+        return 0
+
 
 class NoOpArrayPrinter:
     """A no-op pretty printer that wraps an array value."""
@@ -287,9 +290,6 @@ class NoOpArrayPrinter:
         (low, high) = self.value.type.range()
         self.low = low
         self.high = high
-        # This is a convenience to the DAP code and perhaps other
-        # users.
-        self.num_children = high - low + 1
 
     def to_string(self):
         return ""
@@ -301,6 +301,13 @@ class NoOpArrayPrinter:
         for i in range(self.low, self.high + 1):
             yield (i, self.value[i])
 
+    def num_children(self):
+        return self.high - self.low + 1
+
+    def children_range(self, start, end):
+        for i in range(start, end):
+            yield (i, self.value[start])
+
 
 class NoOpStructPrinter:
     """A no-op pretty printer that wraps a struct or union value."""
@@ -316,6 +323,33 @@ class NoOpStructPrinter:
             if field.name is not None:
                 yield (field.name, self.value[field])
 
+    def num_children(self):
+        return len(
+            [
+                x
+                for x in self.value.type.fields()
+                if x.name is not None and hasattr(x, "bitpos")
+            ]
+        )
+
+
+def is_structured_type(type):
+    """Attempts to find out if TYPE points to something that's structured,
+    like a struct or a class.  Unfortunately type codes are not represented
+    as bitfields.  This has the unfortunate side effect of e.g.
+    TYPE_CODE_TYPEDEF not being the same as TYPE_CODE_STRUCT, even if that
+    typedef resolves to a struct."""
+    try:
+        fields = [
+            x for x in type.fields() if x.name is not None and hasattr(x, "bitpos")
+        ]
+        return len(fields) > 0 or type.code in (
+            gdb.TYPE_CODE_STRUCT,
+            gdb.TYPE_CODE_UNION,
+        )
+    except TypeError:
+        return False
+
 
 def make_visualizer(value):
     """Given a gdb.Value, wrap it in a pretty-printer.
@@ -331,7 +365,7 @@ def make_visualizer(value):
         result = gdb.printing.NoOpArrayPrinter(value)
         (low, high) = value.type.range()
         result.n_children = high - low + 1
-    elif value.type.code in (gdb.TYPE_CODE_STRUCT, gdb.TYPE_CODE_UNION):
+    elif is_structured_type(value.type):
         result = gdb.printing.NoOpStructPrinter(value)
     else:
         result = gdb.printing.NoOpScalarPrinter(value)
-- 
2.41.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v1] gdb/DAP Introduce new methods to to Pretty Printers
  2023-06-25  9:21 [PATCH v1] gdb/DAP Introduce new methods to to Pretty Printers Simon Farre
@ 2023-06-25 10:22 ` Eli Zaretskii
  2023-07-13 20:38 ` Tom Tromey
  1 sibling, 0 replies; 9+ messages in thread
From: Eli Zaretskii @ 2023-06-25 10:22 UTC (permalink / raw)
  To: Simon Farre; +Cc: gdb-patches, tom

> From: Simon Farre <simon.farre.cx@gmail.com>
> Cc: tom@tromey.com,
> 	eliz@gnu.org,
> 	Simon Farre <simon.farre.cx@gmail.com>
> Date: Sun, 25 Jun 2023 11:21:16 +0200
> 
> diff --git a/gdb/doc/python.texi b/gdb/doc/python.texi
> index 9cb97a9597d..61a7e49d1df 100644
> --- a/gdb/doc/python.texi
> +++ b/gdb/doc/python.texi
> @@ -1707,6 +1707,23 @@ Children may be hidden from display based on the value of @samp{set
>  print max-depth} (@pxref{Print Settings}).
>  @end defun
>  
> +@defun pretty_printer.children_range (self, start, end)
> +This method is optional but @value{GDBN} will call this method to
> +be able to more effeciently compute a sub-range of children
> +of the pretty-printers value.  If the pretty printer is for an
> +array-like type, it is recommended that this method is implemented
> +as the DAP interpreter will make extended use of it.  For an example on how
> +to best implement this method @pxref{Writing a Pretty-Printer}.  Not
> +implementing this and @code{num_children} can have drastic effects on
> +the performance of a DAP-interpreter debug session.
> +@end defun

Shouldn't this tell what will happen if this method is not
implemented?  The text side-steps that, but I think we should says
something about that.

> +@defun pretty_printer.num_children (self)
> +@value{GDBN} can call this method to query how many children this pretty
> +printer will return from a @code{pretty_printer.children ()} call. This method

That should be just @code{pretty_printer.children}, without the
parentheses.  Also, please leave two spaces between sentences.

Thanks.

Reviewed-By: Eli Zaretskii <eliz@gnu.org>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v1] gdb/DAP Introduce new methods to to Pretty Printers
  2023-06-25  9:21 [PATCH v1] gdb/DAP Introduce new methods to to Pretty Printers Simon Farre
  2023-06-25 10:22 ` Eli Zaretskii
@ 2023-07-13 20:38 ` Tom Tromey
  2023-07-17 20:52   ` Simon Farre
  2023-07-17 21:53   ` Matt Rice
  1 sibling, 2 replies; 9+ messages in thread
From: Tom Tromey @ 2023-07-13 20:38 UTC (permalink / raw)
  To: Simon Farre via Gdb-patches; +Cc: Simon Farre, tromey

>>>>> "Simon" == Simon Farre via Gdb-patches <gdb-patches@sourceware.org> writes:

Sorry about the delay on replying to this.

I have a few comments to make, most of them not about the patch but
rather about how we ought to proceed.

Simon> This patch introduces the `num_children` method to the NoOpArray-pretty
Simon> printer as well as fixing a bug where some types accidentally uses
Simon> NoOpScalar pretty printers. It also cleans up the retrieval of variables
Simon> during a "variables" request (or an "evaluate" request) reducing the
Simon> complexity.

I am not really sure we can/should do this.

The reason stems from a mistake we made early on when adding Python to
gdb -- when adding pretty-printers, we weren't very careful about
letting people know that we might add more methods to the printers.

This means existing code is free to use any name on the printer objects,
except the few we documented.  I'm not sure num_children or
children_range are used anywhere -- but they might well be.

Of course it's not very hard to fix this in printers.  However, it's bad
to ship a new gdb that breaks people's setup.  We've tried pretty hard
to avoid this.

Maybe one way out would be to define a new way to do pretty-printers,
with a compatibility shim.

Simon> Caching of variables would be fine if we
Simon> actually kept variables around during step/continues that stops inside
Simon> the same frame we're currently at

I don't see how to do this, since any aspect of them can change even
within the same frame.  Preserving the gdb.Value won't really work,
since it won't update.

Another issue in this same area is how to handle types that have special
cases when printing.

An example here is an Ada unconstrained array, or a Rust slice -- these
are both array-like objects that are implemented as a structure holding
a pointer and a length (more or less, Ada can be complicated).

These are exposed to Python as struct types, but we'd like to present
them as array types over DAP -- which is what is done for the CLI (via
special cases in the printing code) and also for MI (via varobj, though
I haven't actually implemented this for Rust yet).

I had a couple of ideas here.

One idea is to rewrite the DAP layer to simply use varobj via
gdb.execute_mi.  There are some downsides here, though.

Mainly, your proposed new methods wouldn't work automatically -- they'd
require hacking on the unfortunately-super-complicated varobj code to
make use of them.

Maybe adding watchpoint support here would also be difficult.  We would
no longer be seeing gdb.Values at all.

Another idea would be to try to fix this in the core, by adding some new
kind of array-like typecode and/or type attribute.  The idea here would
be to try to remove the special cases so that Python could see that
these things are "really" arrays.

I like this idea, since it seems more principled, but I fear it might be
pretty difficult.

There's a couple other kind-of-related issues.

One is that Rust tuples have field names like "__5" rather than "5".
Planning to ignore this one, probably.  This could be fixed in the DWARF
reader but the fix would have some downsides (in the obscure case of
"set lang c" and then trying to poke at the underlying representation).

Another is that an Ada string is just an array of character; this also
has a special case in printing, but I think changing NoOpArrayPrinter to
understand this seems basically reasonable.

Simon> +def is_structured_type(type):
Simon> +    """Attempts to find out if TYPE points to something that's structured,
Simon> +    like a struct or a class.  Unfortunately type codes are not represented
Simon> +    as bitfields.  This has the unfortunate side effect of e.g.
Simon> +    TYPE_CODE_TYPEDEF not being the same as TYPE_CODE_STRUCT, even if that
Simon> +    typedef resolves to a struct."""

I don't fully understand the text here, and it probably doesn't really
belong in the documentation for this function.

Normally what you'd do here instead is strip_typedefs:

Simon> @@ -331,7 +365,7 @@ def make_visualizer(value):
Simon>          result = gdb.printing.NoOpArrayPrinter(value)
Simon>          (low, high) = value.type.range()
Simon>          result.n_children = high - low + 1
Simon> -    elif value.type.code in (gdb.TYPE_CODE_STRUCT, gdb.TYPE_CODE_UNION):
Simon> +    elif is_structured_type(value.type):
Simon>          result = gdb.printing.NoOpStructPrinter(value)

... this code changed on master due to exactly this bug, and now does:

        ty = value.type.strip_typedefs()
...
        elif ty.code in (gdb.TYPE_CODE_STRUCT, gdb.TYPE_CODE_UNION):

Tom

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v1] gdb/DAP Introduce new methods to to Pretty Printers
  2023-07-13 20:38 ` Tom Tromey
@ 2023-07-17 20:52   ` Simon Farre
  2023-07-19 14:41     ` Tom Tromey
  2023-07-17 21:53   ` Matt Rice
  1 sibling, 1 reply; 9+ messages in thread
From: Simon Farre @ 2023-07-17 20:52 UTC (permalink / raw)
  To: Tom Tromey; +Cc: gdb-patches

Hi Tom, thanks for your input!
> The reason stems from a mistake we made early on when adding Python to
> gdb -- when adding pretty-printers, we weren't very careful about
> letting people know that we might add more methods to the printers.
>
> This means existing code is free to use any name on the printer objects,
> except the few we documented.  I'm not sure num_children or
> children_range are used anywhere -- but they might well be.
>
> Of course it's not very hard to fix this in printers.  However, it's bad
> to ship a new gdb that breaks people's setup.  We've tried pretty hard
> to avoid this.
We could go the c++ standard library route here and define silly (guard) 
names for everything we do, going forward.

So;
def __children_range, def __num_children.

If it's good for C++ standard library developers, it can be good for us 
as well ;).  Even though it's a tried and true approach
it's not one that I'm much fond of, but we could do that.
> These are exposed to Python as struct types, but we'd like to present
> them as array types over DAP -- which is what is done for the CLI (via
> special cases in the printing code) and also for MI (via varobj, though
> I haven't actually implemented this for Rust yet).
I don't understand this part, I wrote a the documentation example, that 
behaves like an dynamic array,
where pretty printer makes use of it's intimate knowledge of the length, 
because I specifically had languages
like Rust in mind (or any other type of dynamic data).

Or do you mean that it should automatically understand when
it's a built in type, in the language? That would be nice. But would 
that not introduce scope issues, like at what point do we stop?
Wouldn't we have to 1st class-implement every language that has a novel 
idea of representing (its own) standard data types? Or are for instance
Rust slices represented specifically as a special type in DWARF that we 
can recognize?

I think with the Python approach we leave more room for flexibility, 
than we do closing the PP stuff behind "MI-doors" as it were.
I also have a pet peeve against MI, so I *really* don't like that, just 
based on personal preference.

Besides, going the MI approach, excludes anyone who actually wants to 
write a (performant)
pretty printer for their own container, doesn't it? I don't think we can 
accept 100ms response times for trivial data,
when pretty printers are involved (and if libstdc++ pp is involved, it 
crashes and burns, making debugging using DAP impossible
as previously discussed).

Either way, having a "smart pretty printer" sounds good, for the use 
cases you've described, Ada strings,
Rust slices, etc - but my biggest question is how would we extend that 
to arbitrary client code?

Never mind about the caching-stuff I said, that can be left to any 
DA-implementers that wrap GDB's DAP,
if they choose to live on the wild side. I think closing the gaping 
performance hole is of much more importance.

Simon

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v1] gdb/DAP Introduce new methods to to Pretty Printers
  2023-07-13 20:38 ` Tom Tromey
  2023-07-17 20:52   ` Simon Farre
@ 2023-07-17 21:53   ` Matt Rice
  2023-07-18 18:52     ` Tom Tromey
  1 sibling, 1 reply; 9+ messages in thread
From: Matt Rice @ 2023-07-17 21:53 UTC (permalink / raw)
  To: Tom Tromey; +Cc: Simon Farre via Gdb-patches, Simon Farre

On Thu, Jul 13, 2023 at 8:38 PM Tom Tromey via Gdb-patches
<gdb-patches@sourceware.org> wrote:
>
> Maybe one way out would be to define a new way to do pretty-printers,
> with a compatibility shim.

FWIW, there has been a bit of spit-balling within the rust community
about leveraging Debug trait implementations
within pretty printers, where what seemed to me like the obvious way
to leverage these was to compile Debug impls to wasm,
and write a python wrapper which calls into the wasm.  I haven't
thought or looked into it too much to have considered any hurdles of
this approach really, but it would be nice if languages with wasm
toolchains could define/compile pretty printers in their own language,
since often they already exist for the purpose of runtime logging...
Seemed worth bringing up if we're spit-balling here too.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v1] gdb/DAP Introduce new methods to to Pretty Printers
  2023-07-17 21:53   ` Matt Rice
@ 2023-07-18 18:52     ` Tom Tromey
  2023-07-20  9:47       ` Simon Farre
  0 siblings, 1 reply; 9+ messages in thread
From: Tom Tromey @ 2023-07-18 18:52 UTC (permalink / raw)
  To: Matt Rice; +Cc: Tom Tromey, Simon Farre via Gdb-patches, Simon Farre

>>>>> "Matt" == Matt Rice <ratmice@gmail.com> writes:

Matt> On Thu, Jul 13, 2023 at 8:38 PM Tom Tromey via Gdb-patches
Matt> <gdb-patches@sourceware.org> wrote:
>> 
>> Maybe one way out would be to define a new way to do pretty-printers,
>> with a compatibility shim.

Matt> FWIW, there has been a bit of spit-balling within the rust community
Matt> about leveraging Debug trait implementations
Matt> within pretty printers, where what seemed to me like the obvious way
Matt> to leverage these was to compile Debug impls to wasm,
Matt> and write a python wrapper which calls into the wasm.  I haven't
Matt> thought or looked into it too much to have considered any hurdles of
Matt> this approach really, but it would be nice if languages with wasm
Matt> toolchains could define/compile pretty printers in their own language,
Matt> since often they already exist for the purpose of runtime logging...
Matt> Seemed worth bringing up if we're spit-balling here too.

Ages ago I looked into this a little, the idea being to compile selected
functions to Python using the gdb API.  In the end though it seemed too
hard for the value it would deliver.

Anyway, the wasm approach seems fine -- they solved all the toolchain
problems, which is nice.  The main issue is that, IIUC, debug traits
just emit a string representation -- but IMO the debugger really wants a
more key-value / array-like representation.

This seems like something that could be implemented "on the side", like
the way that gdb-natvis exists.

Tom

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v1] gdb/DAP Introduce new methods to to Pretty Printers
  2023-07-17 20:52   ` Simon Farre
@ 2023-07-19 14:41     ` Tom Tromey
  0 siblings, 0 replies; 9+ messages in thread
From: Tom Tromey @ 2023-07-19 14:41 UTC (permalink / raw)
  To: Simon Farre; +Cc: Tom Tromey, gdb-patches

Simon> We could go the c++ standard library route here and define silly
Simon> (guard) names for everything we do, going forward.

gdb didn't reserve those, either.

Probably we should just add a second registration approach, one that
allows for future upgrades to the API.  It's just more work.

For DAP we could instead go the other route, and change the fallback
printers to use a fuller API, and then use a shim for existing printers.

>> These are exposed to Python as struct types, but we'd like to present
>> them as array types over DAP -- which is what is done for the CLI (via
>> special cases in the printing code) and also for MI (via varobj, though
>> I haven't actually implemented this for Rust yet).

Simon> I don't understand this part, I wrote a the documentation example,
Simon> that behaves like an dynamic array,
Simon> where pretty printer makes use of it's intimate knowledge of the
Simon> length, because I specifically had languages
Simon> like Rust in mind (or any other type of dynamic data).

Simon> Or do you mean that it should automatically understand when
Simon> it's a built in type, in the language? That would be nice. But would
Simon> that not introduce scope issues, like at what point do we stop?

gdb already does this.  I think the decisions are made per language, but
the general rule is that anything that can be created by the compiler is
normally supported by gdb, but anything in a library is expected to be
described by DWARF.

In Rust this mostly amounts to handling slices, but Ada has more
constructs that need special handling.

Simon> Wouldn't we have to 1st class-implement every language that has a
Simon> novel idea of representing (its own) standard data types? Or are for
Simon> instance
Simon> Rust slices represented specifically as a special type in DWARF that
Simon> we can recognize?

Normally the problem cases here are ones where there is a difference
between their representation in DWARF (e.g., a Rust slice or an Ada
unconstrained array are really struct types) and their meaning in the
language (these are both array-like constructs).

It's important that gdb have access to both of these things.  For
example, for a while GNAT could emit these kinds of arrays using
complicated location expressions -- but that makes it impossible to
create a new instance, because "undoing" the expressions is hard.

Simon> Either way, having a "smart pretty printer" sounds good, for the use
Simon> cases you've described, Ada strings,
Simon> Rust slices, etc - but my biggest question is how would we extend that
Simon> to arbitrary client code?

My thinking for the route forward is:

1. Add a new pretty-printer API that is clearly extensible and change
   the core to use it.  Old pretty-printers can be wrapped in shims.

2. Add support for these array-like objects to gdb.Value or maybe
   gdb.Type.  The idea here is to let Python be able to access the same
   things that gdb already knows.

Tom

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v1] gdb/DAP Introduce new methods to to Pretty Printers
  2023-07-18 18:52     ` Tom Tromey
@ 2023-07-20  9:47       ` Simon Farre
  2023-07-21  0:27         ` Matt Rice
  0 siblings, 1 reply; 9+ messages in thread
From: Simon Farre @ 2023-07-20  9:47 UTC (permalink / raw)
  To: Tom Tromey, Matt Rice; +Cc: gdb-patches

> Matt> FWIW, there has been a bit of spit-balling within the rust community
> Matt> about leveraging Debug trait implementations
> Matt> within pretty printers, where what seemed to me like the obvious way
> Matt> to leverage these was to compile Debug impls to wasm,
> Matt> and write a python wrapper which calls into the wasm.  I haven't
> Matt> thought or looked into it too much to have considered any hurdles of
> Matt> this approach really, but it would be nice if languages with wasm
> Matt> toolchains could define/compile pretty printers in their own language,
> Matt> since often they already exist for the purpose of runtime logging...
> Matt> Seemed worth bringing up if we're spit-balling here too.
>
> Ages ago I looked into this a little, the idea being to compile selected
> functions to Python using the gdb API.  In the end though it seemed too
> hard for the value it would deliver.
>
> Anyway, the wasm approach seems fine -- they solved all the toolchain
> problems, which is nice.  The main issue is that, IIUC, debug traits
> just emit a string representation -- but IMO the debugger really wants a
> more key-value / array-like representation.
>
> This seems like something that could be implemented "on the side", like
> the way that gdb-natvis exists.
>
> Tom
Right, I would agree with Tom here. Having it just represented as a 
string is sort of the "problem" we're in and even though a lot of 
people  (maybe even a majority) still use CLI/TUI, I'm not particularly 
sure demanding new users going forward should be forced to accept such a 
user interface, instead of having it as a choice. A GUI and a CLI/TUI 
have to make substantially different design choices, because a CLI is 
always a REPL, so each individual operation can be somewhat costly, 
because it's going to feel sort of fast for the end user anyway, whereas 
a GUI can see large swathes of the program at every instant - making 
each and every one of those operations build up to a total cost pretty 
quickly.

So, being able to query the types "directly", which means, sub objects, 
ranges etc is something that's valuable. GDB-CLI provides something like 
this with simple C-like types (the foo@10 expression). So, I think that 
excludes the #[derive(DebugWasm)] (ish) approach, unless it fulfills the 
requirements I've talked about here. I do like the idea that it would 
mean not having to write pp's for everything though.

Are there any lessons we can learn from LLDB here? How do they do it?

Then again, what's good about GDB/Python (and therefore the DAP interp) 
is that it allows for some customizability by the DA, so for instance, I 
would drive my DA to enforce the num_children/children_range approach, 
since the point of my DA was performance. So I think, no matter where we 
land, it's probably good if we could let the DA-implementer have room to 
make the tradeoffs it desires. It's also why I think it'd be good if the 
DAP interpreter isn't too reliant on MI because I believe MI was meant 
to solve a different problem (remote, vs a unified GUI & remote-protocol).

Simon

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v1] gdb/DAP Introduce new methods to to Pretty Printers
  2023-07-20  9:47       ` Simon Farre
@ 2023-07-21  0:27         ` Matt Rice
  0 siblings, 0 replies; 9+ messages in thread
From: Matt Rice @ 2023-07-21  0:27 UTC (permalink / raw)
  To: Simon Farre; +Cc: Tom Tromey, gdb-patches

On Thu, Jul 20, 2023 at 9:47 AM Simon Farre <simon.farre.cx@gmail.com> wrote:
>
>
> > Matt> FWIW, there has been a bit of spit-balling within the rust community
> > Matt> about leveraging Debug trait implementations
> > Matt> within pretty printers, where what seemed to me like the obvious way
> > Matt> to leverage these was to compile Debug impls to wasm,
> > Matt> and write a python wrapper which calls into the wasm.  I haven't
> > Matt> thought or looked into it too much to have considered any hurdles of
> > Matt> this approach really, but it would be nice if languages with wasm
> > Matt> toolchains could define/compile pretty printers in their own language,
> > Matt> since often they already exist for the purpose of runtime logging...
> > Matt> Seemed worth bringing up if we're spit-balling here too.
> >
> > Ages ago I looked into this a little, the idea being to compile selected
> > functions to Python using the gdb API.  In the end though it seemed too
> > hard for the value it would deliver.
> >
> > Anyway, the wasm approach seems fine -- they solved all the toolchain
> > problems, which is nice.  The main issue is that, IIUC, debug traits
> > just emit a string representation -- but IMO the debugger really wants a
> > more key-value / array-like representation.
> >
> > This seems like something that could be implemented "on the side", like
> > the way that gdb-natvis exists.
> >
> > Tom
> Right, I would agree with Tom here. Having it just represented as a
> string is sort of the "problem" we're in and even though a lot of
> people  (maybe even a majority) still use CLI/TUI, I'm not particularly
> sure demanding new users going forward should be forced to accept such a
> user interface, instead of having it as a choice. A GUI and a CLI/TUI
> have to make substantially different design choices, because a CLI is
> always a REPL, so each individual operation can be somewhat costly,
> because it's going to feel sort of fast for the end user anyway, whereas
> a GUI can see large swathes of the program at every instant - making
> each and every one of those operations build up to a total cost pretty
> quickly.
>
> So, being able to query the types "directly", which means, sub objects,
> ranges etc is something that's valuable. GDB-CLI provides something like
> this with simple C-like types (the foo@10 expression). So, I think that
> excludes the #[derive(DebugWasm)] (ish) approach, unless it fulfills the
> requirements I've talked about here. I do like the idea that it would
> mean not having to write pp's for everything though.

Yeah the #[derive(Debug)] example was largely given as a pragmatic example,
In that there is no current trait which matches an pretty-printer like API.
Similarish traits would be Serialize/Deserialize, and
https://crates.io/crates/cc-traits crate for collections,
(not in std).  But nothing really that enumerates only over the stuff
you want to see.

It seemed okayish to me primarily for a proof of concept, since it
exists, is common and could reach
a lot of developers without much burden.

One thing your replies make me wonder about is the 'selecting pretty printers',
In the manual it specifies a search order in which these are
controlled (falling back to 'raw-text' somewhat implicitly).
I primarily use CLI, but in these debugger interfaces, would it be
worthwhile considering
giving pretty printers a 'sort'. So that we could register multiple
pretty-printers 'formatted-text', 'array-like',
'struct-like', 'map-like'.  This kind of selection mechanism with
priority based on the sort of pretty-printer would perhaps work
better for languages with trait/type-class like languages like rust
and haskell.  Since that tends to be how they implement
enumeration.  Part of what makes this difficult is the orphan rules,
stating you either need to specify the trait, or the type implementing
the trait.  So unless the type provider or trait provider implements a
`DebugWasm` trait it is hard to add it in a way the debugger could
leverage.
So adding/implementing enumeration-like mechanisms for pretty printing
to these types of languages really has quite a few open questions.
But in my mind `pretty-printer for objects that implement trait X
where trait X is specified by the pretty printer`.
whether it be `Debug`, `Collection<Item=T> where T: Debug`, or `Serialize`,

Additionally it would provide data for some form of a populated
drop-down selection box, where users could select the pretty-printer
That would seem to at least allow developers to leverage existing
traits while not limiting things to formatted-text.

However I could just be excessively complicating things here, but just
spitballing ways that we can leverage existing things like
formatted-text,
while eventually being able to grow into a struct/array/map-like
representation which I agree is desirable...
Anyhow, this is probably beyond long enough.

> Are there any lessons we can learn from LLDB here? How do they do it?

*shrug*, I don't have any experience with it.

> Then again, what's good about GDB/Python (and therefore the DAP interp)
> is that it allows for some customizability by the DA, so for instance, I
> would drive my DA to enforce the num_children/children_range approach,
> since the point of my DA was performance. So I think, no matter where we
> land, it's probably good if we could let the DA-implementer have room to
> make the tradeoffs it desires. It's also why I think it'd be good if the
> DAP interpreter isn't too reliant on MI because I believe MI was meant
> to solve a different problem (remote, vs a unified GUI & remote-protocol).
>
> Simon

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2023-07-21  0:28 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-06-25  9:21 [PATCH v1] gdb/DAP Introduce new methods to to Pretty Printers Simon Farre
2023-06-25 10:22 ` Eli Zaretskii
2023-07-13 20:38 ` Tom Tromey
2023-07-17 20:52   ` Simon Farre
2023-07-19 14:41     ` Tom Tromey
2023-07-17 21:53   ` Matt Rice
2023-07-18 18:52     ` Tom Tromey
2023-07-20  9:47       ` Simon Farre
2023-07-21  0:27         ` Matt Rice

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).