Python API - pretty printing complex types

public inbox for gdb@sourceware.org

* Python API - pretty printing complex types
@ 2011-03-09  0:43 Andrew Oakley
  2011-03-09  8:06 ` Joachim Protze
       [not found] ` <201103090954.49355.andre.poenitz@nokia.com>
  0 siblings, 2 replies; 9+ messages in thread
From: Andrew Oakley @ 2011-03-09  0:43 UTC (permalink / raw)
  To: gdb

I'm having difficulty writing pretty printers for some more complex
types and was wondering if anybody had any suggestions.  I think an
example is the easiest way of describing my problem.

I've got some C code that looks something like the following:

struct value_type { ... };

struct container {
        int interesting_field1;
        int interesting_field2;

        size_t values_length1;
        struct value_type * values1;

        size_t values_length2;
        struct value_type * values2;
};

I have a pretty printer for 'struct value_type' already.  I want to
write a pretty printer for 'struct container'.  This struct contains two
lists of 'struct value_type', however the fact that there are two
lists is an implementation detail that we rarely care about.

Ideally my pretty printer would output something like this:

container = {
  interesting_field1 = 42,
  interesting_field2 = 0,
  members = {
        { value1 },
        { value2 },
        { value3 }
  }
}

The displayhint for container is 'map' and has fields called
'interesting_field1', 'interesting_field2' and 'members'.  The value
for 'members' is something that has a displayhint of 'array'.

The problem is that I don't know how to get 'members' printed
correctly.  It looks like the children member of the pretty printer for
'struct container' must return (string, gdb.Value) tuples, but I don't
have a unique type to return that gdb can use to find the next
pretty printer.  

For the simpler case of only having a single list of values I considered
returning a value of type 'struct value_type[length]', however there
does not seem to be any way to construct this type or any way to get
the length of the array type if we did manage to construct it.  Perhaps
these are worth adding to the API as gdb does seem to handle these
types internally.

If API improvements are needed to do this I can have a go at writing
the code (the current src/gdb/python code seems fairly easy to
understand).

-- 
Andrew Oakley


* Re: Python API - pretty printing complex types
  2011-03-09  0:43 Python API - pretty printing complex types Andrew Oakley
@ 2011-03-09  8:06 ` Joachim Protze
  2011-03-10 21:08   ` Tom Tromey
       [not found] ` <201103090954.49355.andre.poenitz@nokia.com>
  1 sibling, 1 reply; 9+ messages in thread
From: Joachim Protze @ 2011-03-09  8:06 UTC (permalink / raw)
  To: Andrew Oakley; +Cc: gdb

My first approach uses the array method of gdb.Type, which is undocumented 
in the online docs and which I wrapped in a handy function -- the 
straightforward way.
For the second approach you have to put a typedef into your source -- 
the more flexible way for complex situations.

On 09.03.2011 01:46, Andrew Oakley wrote:
> struct value_type { ... };
>
> struct container {
>          int interesting_field1;
>          int interesting_field2;
>
>          size_t values_length1;
>          struct value_type * values1;
>
>          size_t values_length2;
>          struct value_type * values2;
> };
>
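def cast_pointer_to_array(pointer, length):
     # Type.array() takes an inclusive upper bound, hence length - 1.
     elem_type = pointer.dereference().type
     array_type = elem_type.array(int(length) - 1)
     return pointer.cast(array_type.pointer()).dereference()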

class container_printer:
     [...]
     def children(self):
         yield ("interesting_field1", self.val["interesting_field1"])
         yield ("interesting_field2", self.val["interesting_field2"])
         yield ("members1", cast_pointer_to_array(
             self.val["values1"], self.val["values_length1"]))
         yield ("members2", cast_pointer_to_array(
             self.val["values2"], self.val["values_length2"]))

     def display_hint (self):
         return "struct"

This way you get two arrays of member values.
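
The length-1 in cast_pointer_to_array follows from the gdb manual's
description of gdb.Type.array: with one argument, that argument is the
inclusive upper bound and the lower bound is zero.  The manual also
describes gdb.Type.range(), which returns the low and high bounds of an
array type and may answer the question of how to get the length back,
assuming a gdb version that provides it.  The arithmetic can be sketched
in plain Python; the helper names below are hypothetical and nothing here
calls into gdb.

```python
def inclusive_bounds(length):
    """Return the (low, high) bounds of a zero-based array with `length` elements."""
    if length < 0:
        raise ValueError("array length must not be negative")
    # The upper bound is inclusive, so an array of 3 elements ends at index 2.
    return (0, length - 1)


def length_from_bounds(low, high):
    """Recover the element count from inclusive (low, high) bounds."""
    return high - low + 1


low, high = inclusive_bounds(3)
print(low, high)                      # 0 2
print(length_from_bounds(low, high))  # 3
```

How gdb treats an empty list is not covered in this thread, so a printer
may want to skip the cast entirely when the length is zero.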
> Ideally my pretty printer would output something like this:
>
> container = {
>    interesting_field1 = 42,
>    interesting_field2 = 0,
>    members = {
>          { value1 },
>          { value2 },
>          { value3 }
>    }
> }
>
To get one single array use the second approach:


typedef struct container container_helper_type;

class container_printer:
     [...]
     def children(self):
         yield ("interesting_field1", self.val["interesting_field1"])
         yield ("interesting_field2", self.val["interesting_field2"])
         yield ("members", 
self.val.cast(gdb.lookup_type("container_helper_type")))

     def display_hint (self):
         return "struct"

class container_helper_type_printer:
     [...]
     def children(self):
         for i in range(int(self.val["values_length1"])):
             yield ("", self.val["values1"][i])
         for i in range(int(self.val["values_length2"])):
             yield ("", self.val["values2"][i])

     def display_hint (self):
         return "array"
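
The core of the helper printer is just the concatenation of the two
lists into one stream of children.  A minimal sketch of that step, using
plain Python sequences in place of gdb values; the function name
merged_children and the "[index]" child names are assumptions for
illustration (the code above yields empty names instead).

```python
from itertools import chain


def merged_children(first, second):
    """Yield (name, value) pairs presenting two sequences as one logical array."""
    # chain() walks the first sequence, then the second, without copying either.
    for index, value in enumerate(chain(first, second)):
        yield ("[%d]" % index, value)


print(list(merged_children(["value1", "value2"], ["value3"])))
# [('[0]', 'value1'), ('[1]', 'value2'), ('[2]', 'value3')]
```

Numbering the children across both lists keeps the split into values1
and values2 hidden from whoever reads the output.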


* Re: Python API - pretty printing complex types
       [not found] ` <201103090954.49355.andre.poenitz@nokia.com>
@ 2011-03-09 19:28   ` Andrew Oakley
  2011-03-10  9:07     ` André Pönitz
  2011-03-10 21:11     ` Tom Tromey
  0 siblings, 2 replies; 9+ messages in thread
From: Andrew Oakley @ 2011-03-09 19:28 UTC (permalink / raw)
  To: André Pönitz, Joachim Protze; +Cc: gdb

On Wed, 09 Mar 2011 09:07:09 +0100
Joachim Protze <joachim.protze@wh2.tu-dresden.de> wrote:

> My first approach makes use of the undocumented (online-doc) 
> array-method of gdb.Type, that i put in a handy function -- the
> straight forward way.

OK, I was looking for that but couldn't find a way to do it.  I see
this is now documented which is nice.  It still doesn't handle more
complex cases though.

On Wed, 9 Mar 2011 09:54:49 +0100
André Pönitz <andre.poenitz@nokia.com> wrote:

> On Wednesday 09 March 2011 01:46:19 ext Andrew Oakley wrote:
> > [...] If API improvements are needed to do this I can have a go at 
> > writing the code (the current src/gdb/python code seems fairly 
> > easy to understand).
> 
> My 2 ct: The most dearly missing feature in the "official" pretty
> printer API is the possibility to create multi-level displays with
> "phony groups" and a flexible way to steer the "expansion state" of
> such groups.

I don't know about "expansion state"; it feels like it is GDB's job to
manage that rather than the pretty printers themselves.  

I think it would be nice to be able to return pretty printers from the
children iterator of another pretty printer. This would allow "phony
groups" to be created - simply return another pretty printer for the
group and it will get printed in the usual fashion.  

I'm not sure what the best way to go about this is.  The problem is
finding out if the value returned should be handled as a pretty printer
or if it was something else.  I don't code much in python so I don't
really know what the usual API conventions are.  These are the options
I've thought of:

1. Use a different display_hint if pretty printers will be returned.
This is awkward if some of the children are simple types and some are
more complicated.  I'm not sure how it would interact with GDB/MI.

2. Create a new base class that pretty printers inherit from.  This
doesn't really feel like "the python way" - duck typing seems to be
common.  

3. Assume that objects that are not gdb.Value instances are pretty
printers.  This prevents any further extensions from working in the
same way.

4. Assume that objects with a to_string member are pretty printers.
This is the only required member of pretty printers and seems more in
line with other python libraries.  We probably want to check for this
after checking if the object was a gdb.Value, both for performance
reasons and to ensure no existing code has any nasty surprises.

I think option 4 is the best choice here and I'm happy to write a patch
to do this if there is some agreement that it is a reasonable decision
(and therefore might actually get committed).  
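
To make option 4 concrete, here is a rough sketch of the check order
described above: test for a value first, then fall back to duck typing
on to_string.  This is not gdb's implementation (the real check would
live in gdb's own code); the Value class is a stand-in for gdb.Value so
the sketch runs outside gdb, and classify_child and MembersPrinter are
hypothetical names.

```python
class Value:
    """Stand-in for gdb.Value so the sketch runs outside gdb."""

    def __init__(self, payload):
        self.payload = payload


def classify_child(child):
    """Decide how an object returned from children() would be handled."""
    # Check for a value first: this is the common case and keeps
    # existing printers behaving exactly as before.
    if isinstance(child, Value):
        return "value"
    # Duck typing: anything with a to_string member is treated as a printer.
    if hasattr(child, "to_string"):
        return "printer"
    return "other"


class MembersPrinter:
    """Hypothetical printer for a phony 'members' group."""

    def to_string(self):
        return "members"

    def display_hint(self):
        return "array"


print(classify_child(Value(42)))         # value
print(classify_child(MembersPrinter()))  # printer
print(classify_child("plain string"))    # other
```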

-- 
Andrew Oakley


* Re: Python API - pretty printing complex types
  2011-03-09 19:28   ` Andrew Oakley
@ 2011-03-10  9:07     ` André Pönitz
  2011-03-10 21:25       ` Tom Tromey
  2011-03-10 21:11     ` Tom Tromey
  1 sibling, 1 reply; 9+ messages in thread
From: André Pönitz @ 2011-03-10  9:07 UTC (permalink / raw)
  To: ext Andrew Oakley; +Cc: Joachim Protze, gdb

On Wednesday 09 March 2011 20:30:51 ext Andrew Oakley wrote:
> On Wed, 09 Mar 2011 09:07:09 +0100
> Joachim Protze <joachim.protze@wh2.tu-dresden.de> wrote:
> 
> > My first approach makes use of the undocumented (online-doc) 
> > array-method of gdb.Type, that i put in a handy function -- the
> > straight forward way.
> 
> OK, I was looking for that but couldn't find a way to do it.  I see
> this is now documented which is nice.  It still doesn't handle more
> complex cases though.
> 
> On Wed, 9 Mar 2011 09:54:49 +0100
> André Pönitz <andre.poenitz@nokia.com> wrote:
> 
> > On Wednesday 09 March 2011 01:46:19 ext Andrew Oakley wrote:
> > > [...] If API improvements are needed to do this I can have a go at 
> > > writing the code (the current src/gdb/python code seems fairly 
> > > easy to understand).
> > 
> > My 2 ct: The most dearly missing feature in the "official" pretty
> > printer API is the possibility to create multi-level displays with
> > "phony groups" and a flexible way to steer the "expansion state" of
> > such groups.
> 
> I don't know about "expansion state" it feels like it is GDBs job to
> manage that rather than the pretty printers themselves. 

I'd say it's a frontend's job to maintain the expansion state and
communicate that to gdb when asking for "expanded data".

> I think it would be nice to be able to return pretty printers from the
> children iterator of another pretty printer. This would allow "phony
> groups" to be created - simply return another pretty printer for the
> group and it will get printed in the usual fashion.  

That's perhaps a possibility. Right now I feed a few parameters like
expansion state and individual formatting requests to a fat script that
creates the full visible hierarchy in one go. That "naturally" solves the 
"phony group" issue, as each dumper can create as many levels as
it wishes, and allows for the omission of repeated data (like the type of
child nodes in a std::vector etc).

> I'm not sure what the best way to go about this is.  The problem is
> finding out if the value returned should be handled as a pretty printer
> or if it was something else.  I don't code much in python so I don't
> really know what the usual API conventions are.  These are the options
> I've thought of:
> 
> 1. Use a different display_hint if pretty printers will be returned.
> This is awkward if some of the children are simple types and some are
> more complicated.  I'm not sure how it would interact with GDB/MI.

> 2. Create a new base class that pretty printers inherit from.  This
> doesn't really feel like "the python way" - duck typing seems to be
> common.  

[I haven't done any python before using python with gdb, but duck 
typing certainly feels natural here]

> 3. Assume that objects that are not gdb.Value instances are pretty
> printers.  This prevents any further extensions from working in the
> same way.
> 
> 4. Assume that objects with a to_string member are pretty printers.
> This is the only required member of pretty printers and seems more in
> line with other python libraries.  We probably want to check for this
> after checking if the object was a gdb.Value, both for performance
> reasons and to ensure no existing code has an nasty surprises.
> 
> I think option 4 is the best choice here and I'm happy to write a patch
> to do this if there is some agreement that it is a reasonable decision
> (and therefore might actually get committed).  

It certainly looks like a step in the right direction. The missing "phony
levels" effectively prevented me from using the "official" pretty printing
approach in the past. 

It would be perfect if the pretty-printers-can-return-pretty-printers 
approach would also allow to (easily) feed  the pretty printers with 
per-value individual data. I found this pretty useful for "patchwork"
applications, that cannot easily use global settings for everything.
[In some cases you would like to do things like "display char * as
Latin1, but in some cases it's UTF-8, sometimes it's a \0-separated and
 \0\0-terminated 'list' of strings,  and sometimes really only a pointer
to a single char". Or you have some numerical data in an array that you'd
like to run through xplot as "pretty printer", but you don't want to 
invoke that on every value of type std::vector<double>. Things like that.]

Andre'


* Re: Python API - pretty printing complex types
  2011-03-09  8:06 ` Joachim Protze
@ 2011-03-10 21:08   ` Tom Tromey
  0 siblings, 0 replies; 9+ messages in thread
From: Tom Tromey @ 2011-03-10 21:08 UTC (permalink / raw)
  To: Joachim Protze; +Cc: Andrew Oakley, gdb

>>>>> "Joachim" == Joachim Protze <joachim.protze@wh2.tu-dresden.de> writes:

Joachim> My first approach makes use of the undocumented (online-doc)
Joachim> array-method of gdb.Type, that i put in a handy function -- the
Joachim> straight forward way.

I looked, and this is documented in CVS.

We try pretty hard to document the whole Python API.  If you run across
something missing, or if you find something unclear or under-documented,
please report it in bugzilla.

Joachim> For the second approach you have to put a typedef into your source -- 
Joachim> the more flexible way for complex situations.

Nice trick!

I do think we should do something not needing tricks in the source.

Tom


* Re: Python API - pretty printing complex types
  2011-03-09 19:28   ` Andrew Oakley
  2011-03-10  9:07     ` André Pönitz
@ 2011-03-10 21:11     ` Tom Tromey
  1 sibling, 0 replies; 9+ messages in thread
From: Tom Tromey @ 2011-03-10 21:11 UTC (permalink / raw)
  To: Andrew Oakley; +Cc: André Pönitz, Joachim Protze, gdb

>>>>> "Andrew" == Andrew Oakley <andrew@ado.is-a-geek.net> writes:

Andrew> I think it would be nice to be able to return pretty printers from the
Andrew> children iterator of another pretty printer. This would allow "phony
Andrew> groups" to be created - simply return another pretty printer for the
Andrew> group and it will get printed in the usual fashion.  

Yes, good idea.

Andrew> 3. Assume that objects that are not gdb.Value instances are pretty
Andrew> printers.  This prevents any further extensions from working in the
Andrew> same way.

Andrew> 4. Assume that objects with a to_string member are pretty printers.
Andrew> This is the only required member of pretty printers and seems more in
Andrew> line with other python libraries.  We probably want to check for this
Andrew> after checking if the object was a gdb.Value, both for performance
Andrew> reasons and to ensure no existing code has an nasty surprises.

Andrew> I think option 4 is the best choice here and I'm happy to write a patch
Andrew> to do this if there is some agreement that it is a reasonable decision
Andrew> (and therefore might actually get committed).  

I think either #3 or #4 would be fine.  I would approve a clean
(well, "clean-enough" :-) implementation of either.

If you do plan to implement this, contact me off-list so we can get the
paperwork stuff started.

If you don't plan to do it, please file it in bugzilla.

Tom


* Re: Python API - pretty printing complex types
  2011-03-10  9:07     ` André Pönitz
@ 2011-03-10 21:25       ` Tom Tromey
  2011-03-11  7:41         ` Joachim Protze
  2011-03-11 11:25         ` André Pönitz
  0 siblings, 2 replies; 9+ messages in thread
From: Tom Tromey @ 2011-03-10 21:25 UTC (permalink / raw)
  To: André Pönitz; +Cc: ext Andrew Oakley, Joachim Protze, gdb

>>>>> "André" == André Pönitz <andre.poenitz@nokia.com> writes:

>> I don't know about "expansion state" it feels like it is GDBs job to
>> manage that rather than the pretty printers themselves. 

André> I'd say it's a frontend's job to maintain the expansion state and
André> communicate that to gdb when asking for "expanded data".

If I understand correctly, then I think the varobj stuff does this ok.
It is the front end's choice whether to fetch varobj children, and how
many to fetch.  Front ends do have to do a little dance to make the
"windowing" work out right, but it isn't too bad.

Also, the pretty-printer API was designed so that a printer can be
written to compute data lazily, to avoid over-fetching.  There are still
some wrinkles here, I think strings still don't work completely
properly.  I think there is still a PR open about this.
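
As an illustration of the lazy style, a printer can write children() as
a generator so that each element is only read when the consumer asks for
it.  The sketch below runs outside gdb: LazyListPrinter and fetch are
hypothetical names, and fetch stands in for whatever read of the
inferior a real printer would perform.

```python
from itertools import islice


class LazyListPrinter:
    """Hypothetical printer whose children are produced on demand."""

    def __init__(self, fetch, length):
        self.fetch = fetch    # callable: index -> element (stands in for a target read)
        self.length = length

    def to_string(self):
        return "list of %d" % self.length

    def children(self):
        # A generator: nothing is fetched until the consumer iterates.
        for index in range(self.length):
            yield ("[%d]" % index, self.fetch(index))


reads = []


def fetch(index):
    reads.append(index)       # record every simulated read
    return index * 10


printer = LazyListPrinter(fetch, 1000)
first_three = list(islice(printer.children(), 3))
print(first_three)  # [('[0]', 0), ('[1]', 10), ('[2]', 20)]
print(len(reads))   # 3
```

Only three of the thousand elements are read, which is the property a
front end relies on when it fetches children in windows.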

I know you aren't using varobj.  We could probably expose more of the
pretty-printer stuff to pure Python if you'd find that helpful...
though I suspect just the existing gdb.default_visualizer is enough.

André> It would be perfect if the pretty-printers-can-return-pretty-printers 
André> approach would also allow to (easily) feed  the pretty printers with 
André> per-value individual data. I found this pretty useful for "patchwork"
André> applications, that cannot easily use global settings for everything.
André> [In some cases you would like to do things like "display char * as
André> Latin1, but in some cases it's UTF-8, sometimes it's a \0-separated and
André>  \0\0-terminated 'list' of strings,  and sometimes really only a pointer
André> to a single char". Or you have some numerical data in an array that you'd
André> like to run through xplot as "pretty printer", but you don't want to 
André> invoke that on every value of type std::vector<double>. Things like that.]

varobj lets you assign a printer to a specific varobj, but I'm not sure
if anything uses this, and it probably only makes sense if there is
prior coordination with the front end.

Handling this via sub-pretty-printers for (e.g.) specific fields in
known structures seems reasonable.  But I don't know a fully general way
to handle this, like if the user wants "print some_global_string" to
automatically know to use a different encoding.

Tom


* Re: Python API - pretty printing complex types
  2011-03-10 21:25       ` Tom Tromey
@ 2011-03-11  7:41         ` Joachim Protze
  2011-03-11 11:25         ` André Pönitz
  1 sibling, 0 replies; 9+ messages in thread
From: Joachim Protze @ 2011-03-11  7:41 UTC (permalink / raw)
  To: gdb

On 10.03.2011 22:25, Tom Tromey wrote:
> André>  It would be perfect if the pretty-printers-can-return-pretty-printers
> André>  approach would also allow to (easily) feed  the pretty printers with
> André>  per-value individual data. I found this pretty useful for "patchwork"
> André>  applications, that cannot easily use global settings for everything.
> André>  [In some cases you would like to do things like "display char * as
> André>  Latin1, but in some cases it's UTF-8, sometimes it's a \0-separated and
> André>   \0\0-terminated 'list' of strings,  and sometimes really only a pointer
> André>  to a single char". Or you have some numerical data in an array that you'd
> André>  like to run through xplot as "pretty printer", but you don't want to
> André>  invoke that on every value of type std::vector<double>. Things like that.]
>
> varobj lets you assign a printer to a specific varobj, but I'm not sure
> if anything uses this, and it probably only makes sense if there is
> prior coordination with the front end.
>
> Handling this via sub-pretty-printers for (e.g.) specific fields in
> known structures seems reasonable.  But I don't know a fully general way
> to handle this, like if the user wants "print some_global_string" to
> automatically know to use a different encoding.

In my case, I can derive context knowledge for a pointer/array from its 
usage in the source code. The structure behind the pointer is a quite 
irregular tree (alternating datatypes). While printing a node 
of the tree, I know where to find the children of the node, 
how to walk down the tree and how to display leaves. For the inner nodes 
and leaves I cannot derive this context knowledge directly from the source 
code, since only the root's address is found there.
At the moment I can only return the address of the node from the 
pretty-printer's children function. At the same time I create the child's 
pretty-printer with the context knowledge, store the pp-object in a dict 
of (address, pretty-printer), and fetch it when my pp-lookup-function gets 
a request for one of the addresses in the dict.
My preferred way would be to return the pretty-printer for the child from 
the children() method.
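
The workaround with the dict can be sketched roughly as follows.  This
is plain Python with hypothetical names (PrinterRegistry, NodePrinter);
addresses are ordinary integers here, whereas in gdb they would come
from the node values.

```python
class PrinterRegistry:
    """Maps node addresses to printers that were prepared with context."""

    def __init__(self):
        self._by_address = {}

    def remember(self, address, printer):
        # Called from the parent's children() before yielding the address.
        self._by_address[int(address)] = printer

    def lookup(self, address):
        # Called from the lookup function; None means "no special printer".
        return self._by_address.get(int(address))


class NodePrinter:
    """Hypothetical printer that carries context handed down by its parent."""

    def __init__(self, address, kind):
        self.address = address
        self.kind = kind

    def to_string(self):
        return "%s node at 0x%x" % (self.kind, self.address)


registry = PrinterRegistry()
registry.remember(0x1000, NodePrinter(0x1000, "leaf"))

print(registry.lookup(0x1000).to_string())  # leaf node at 0x1000
print(registry.lookup(0x2000))              # None
```

A real version would probably also need to drop stale entries once the
program runs again, since addresses can be reused.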

Another nice feature would be a user-defined command that acts like a 
pretty-printer, namely returns (string, pp-object/value), while gdb manages 
the formatted output. This way you could define your own commands to create 
various output formats or to do non-standard typecasts. In my case 
the datatype is stored in a structure and I would like to do something like:
myprint (datatypestructure)address

The output would be the same as if my pretty-printer derives the 
connection of datatypestructure + address from sourcecode.

- Joachim


* Re: Python API - pretty printing complex types
  2011-03-10 21:25       ` Tom Tromey
  2011-03-11  7:41         ` Joachim Protze
@ 2011-03-11 11:25         ` André Pönitz
  1 sibling, 0 replies; 9+ messages in thread
From: André Pönitz @ 2011-03-11 11:25 UTC (permalink / raw)
  To: ext Tom Tromey; +Cc: ext Andrew Oakley, Joachim Protze, gdb

On Thursday 10 March 2011 22:25:20 ext Tom Tromey wrote:
> >>>>> "André" == André Pönitz <andre.poenitz@nokia.com> writes:
> 
> >> I don't know about "expansion state" it feels like it is GDBs job to
> >> manage that rather than the pretty printers themselves. 
> 
> André> I'd say it's a frontend's job to maintain the expansion state and
> André> communicate that to gdb when asking for "expanded data".
> 
> If I understand correctly, then I think the varobj stuff does this ok.
> It is the front end's choice whether to fetch varobj children, and how
> many to fetch.  Front ends do have to do a little dance to make the
> "windowing" work out right, but it isn't too bad.
>
> Also, the pretty-printer API was designed so that a printer can be
> written to compute data lazily, to avoid over-fetching.  There are still
> some wrinkles here, I think strings still don't work completely
> properly.  I think there is still a PR open about this.
> 
> I know you aren't using varobj.  We could probably expose more of the
> pretty-printer stuff to pure Python if you'd find that helpful...
> though I suspect just the existing gdb.default_visualizer is enough.

I readily believe that varobj works OK for "normal C" cases, but after spending
quite some time on a varobj-based general solution (still used on Mac, btw)
I came to the conclusion that either I do not understand the concept on a 
very fundamental level, or the varobj approach conceptually cannot work for
"phony" data, i.e. data that is entirely artificial (like grouping of data 
members of some struct) or the result of inferior calls ("getter" style 
functions).

Unfortunately, that kind of data is a large (and the most interesting) part of
what I need to display. E.g. object properties that are formally only accessible 
by getter functions, and even if most of them are backed by real data in some
memory "somewhere", that does not have to be the case when the result is 
constructed on-the-fly. How do I create a varobj representing such a result?

What kind of mechanism makes -var-update do the "right thing", i.e. notify
me that the result of an inferior call "changed" and possibly needs 
to be re-computed because the context that led to its original computation
changed?

I really think this cannot work with gdb's varobj architecture, nor
with some incremental improvement on top of it. But since it is easy to prove
me wrong by pointing to any other gdb frontend that successfully handles this
use case, the question should be answerable.

> André> It would be perfect if the pretty-printers-can-return-pretty-printers 
> André> approach would also allow to (easily) feed  the pretty printers with 
> André> per-value individual data. I found this pretty useful for "patchwork"
> André> applications, that cannot easily use global settings for everything.
> André> [In some cases you would like to do things like "display char * as
> André> Latin1, but in some cases it's UTF-8, sometimes it's a \0-separated and
> André>  \0\0-terminated 'list' of strings,  and sometimes really only a pointer
> André> to a single char". Or you have some numerical data in an array that you'd
> André> like to run through xplot as "pretty printer", but you don't want to 
> André> invoke that on every value of type std::vector<double>. Things like that.]
> 
> varobj lets you assign a printer to a specific varobj, but I'm not sure
> if anything uses this, and it probably only makes sense if there is
> prior coordination with the front end.

That might work; pushing per-item configuration data from the frontend before
'-var-update' calls or such would be OK. I am not convinced, however, that it
would help. Changing the display might easily affect the child nodes (see the
above example of plain Latin1 vs. a \0-separated and \0\0-terminated list of 
strings) and therefore the varobjs representing these nodes. I have a gut
feeling that this will cause trouble.

> Handling this via sub-pretty-printers for (e.g.) specific fields in
> known structures seems reasonable.  But I don't know a fully general way
> to handle this, like if the user wants "print some_global_string" to
> automatically know to use a different encoding.

One could do some guesswork based on the actual string data. But according
to user feedback it's sufficient to have a reasonable first approximation
(like Latin1 or the system locale) and provide options to change encoding
globally and per-item. And that's straightforward to do with the "fat script"
approach.
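
A rough sketch of the "default plus per-item override" idea, in plain
Python working on raw bytes.  The names (decode_bytes, split_string_list,
encoding_for, overrides) are made up for illustration and are not taken
from the actual script.

```python
DEFAULT_ENCODING = "latin-1"


def encoding_for(item, overrides, default=DEFAULT_ENCODING):
    """Pick the per-item encoding if one was set, else the global default."""
    return overrides.get(item, default)


def decode_bytes(raw, encoding=DEFAULT_ENCODING):
    """Decode raw bytes, replacing undecodable bytes instead of failing."""
    return raw.decode(encoding, errors="replace")


def split_string_list(raw, encoding=DEFAULT_ENCODING):
    """Split a NUL-separated, double-NUL-terminated buffer into strings."""
    end = raw.find(b"\0\0")
    body = raw if end < 0 else raw[:end]   # ignore anything after the terminator
    return [decode_bytes(part, encoding) for part in body.split(b"\0") if part]


overrides = {"name": "utf-8"}              # per-item setting chosen by the user
raw_name = b"P\xc3\xb6nitz"                # UTF-8 bytes for "Pönitz"

print(decode_bytes(raw_name, encoding_for("name", overrides)))   # Pönitz
print(decode_bytes(raw_name, encoding_for("other", overrides)))  # PÃ¶nitz
print(split_string_list(b"one\0two\0\0junk"))                    # ['one', 'two']
```

The second line of output shows what the Latin1 first approximation does
to UTF-8 data, which is exactly the case the per-item override is for.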

Andre'
