Python and structured output from breakpoint

public inbox for gdb@sourceware.org

* Python and structured output from breakpoint_ops
@ 2011-10-07 15:16 Tom Tromey
  2011-10-07 15:40 ` Tom Tromey
  0 siblings, 1 reply; 6+ messages in thread
From: Tom Tromey @ 2011-10-07 15:16 UTC
  To: GDB Development

[-- Attachment #1: Type: text/plain, Size: 213 bytes --]

Phil and I have been discussing exposing the print_* breakpoint_ops
methods to Python, and we thought we'd take the discussion public.

Here is his latest, which quotes my latest; I'll send a reply to this
note.

[-- Attachment #2: Type: message/rfc822, Size: 6294 bytes --]

From: Phil Muldoon <pmuldoon@redhat.com>
To: Tom Tromey <tromey@redhat.com>
Subject: Re: structured output
Date: Wed, 05 Oct 2011 13:03:06 +0100
Message-ID: <m3zkhf7iit.fsf@redhat.com>

Tom Tromey <tromey@redhat.com> writes:

> The basic problem is that a "print" method on a breakpoint can be called
> in either an MI or CLI context.  Inside gdb, we use ui_out to deal with
> this, writing ui_out calls which either print text for the CLI or
> structured output for MI.

Let me say up-front I do not disagree with the idea of structured data,
just that the current APIs limit the information we can display to a
point where the return in functionality is minimal.

Take the "print_one" operation.  This is called when "info breakpoints" is
called.

A note on terminology, given the way GDB tries to make the breakpoint
operations OO in a C world: the "always-called" function, which calls
the function pointer for the breakpoint operation, I will call "the
parent".  Similarly, the function called via the function pointer (our
operations) I will call "the child".

So in print_one's case, the parent does not call the child function
until it has already populated all of the fields before address.  So
there are only two fields left to fill: "address", and "what".  They
have to be a long and a string.  Example:

class MyBreakpoint (gdb.Breakpoint):

    def print_one (self, address):
        return (address, "Our Breakpoint")

The output from "info breakpoints" from the above example:

Num     Type           Disp Enb Address            What
1       breakpoint     keep y   0x0000000000400532 Our Breakpoint

So you can see "Num, Type, Disp and Enb" have been populated before our
child function is called.  In Python at the moment we do in fact return
an iterable for this function; it just so happens we limit it to a tuple
(or None).  Given that all one can return in the above context is
essentially a long and a string, I'm not sure what value we can add by
letting the user return anything else.  We would have to convert any
list within the tuple to a string anyway.  I'd prefer users just do this
themselves in Python.
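
To make the parent/child split concrete, here is a minimal sketch in
plain Python that runs outside GDB.  FakeBreakpoint, parent_print_one
and format_row are hypothetical stand-ins invented for illustration,
not GDB's actual code or API; the column widths are only an assumption
chosen to resemble the table above.  The point is just that the parent
has filled Num, Type, Disp and Enb before the child is consulted, so the
child can contribute only an address and a string.

```python
class FakeBreakpoint:
    """Hypothetical stand-in for a gdb.Breakpoint subclass (not the real API)."""

    def __init__(self, number, address):
        self.number = number
        self.address = address

    def print_one(self, address):
        # The "child": it may only supply the Address and What columns.
        return (address, "Our Breakpoint")


def parent_print_one(bp):
    """The "parent": fills the fixed columns, then asks the child for the rest."""
    row = {"Num": str(bp.number), "Type": "breakpoint",
           "Disp": "keep", "Enb": "y"}
    result = bp.print_one(bp.address)
    if result is None:
        # The child declined; fall back to the breakpoint's own address.
        address, what = bp.address, ""
    else:
        address, what = result
    row["Address"] = "0x%016x" % address
    row["What"] = str(what)
    return row


def format_row(row):
    # Assumed column widths, picked to resemble "info breakpoints".
    return "%-7s %-14s %-4s %-3s %-18s %s" % (
        row["Num"], row["Type"], row["Disp"], row["Enb"],
        row["Address"], row["What"])


print(format_row(parent_print_one(FakeBreakpoint(1, 0x400532))))
```

Whatever the child returns is flattened into those two cells, which is
why a nested list would have to become a string at this point.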

The same goes for print_stop_action.  This tells GDB what to print (via
an enum) when the breakpoint stops.  Example:

class MyBreakpoint (gdb.Breakpoint):

    def print_stop_action (self):
        return ("Our Breakpoint, ", gdb.PRINT_SRC_AND_LOC)

When GDB stops the inferior at a breakpoint defined by
this class, GDB will print:

Our Breakpoint, foofunc (argc=1, argv=0xff) somewhere/some-file.c:10

So again we are limited to a string and a constant.  I guess we could,
if the user passed a list within a list, call ui_out_list there.  But to
me, you will only ever want this output on one line (in fact, it may be
a requirement, I am not too sure).

There seems to be more room to maneuver with print_mention and
print_one_detail.  They are currently implemented as pure strings.  But
again, I believe both should be (and really, I want both to be)
implemented as a single string.  print_mention is called when a
breakpoint is created.  Is there an example of what kind of structured
output we could use here?

print_one_detail is an optional detail line below each entry for "info
breakpoints".  This has to be limited to a single line, to remain
consistent with "info breakpoints" output.  In fact, if you look at the MI
command -break-list, it just maps to info break and captures that
output.  Maybe that conversation is what Jan was talking about when there
is an explicit mention that any field change has to be made by Vlad?

If we are talking about refactoring breakpoint_ops themselves to allow
far broader latitude in what they output (allowing print_one to output
to all of the fields for example), or adding new fields, then that is
another kettle of fish.

What do you think?

Cheers

Phil

> We want to mirror this capability in the Python API.  There are a lot of
> possible ways to do this.  Here's one.
>
> Instead of a "print" method on a Python breakpoint object, have a
> "describe" method that returns an iterable object (e.g., usually a
> list).  The C code will iterate over all the elements in the sequence.
> If a given element is a string, it is passed to ui_out_text.  Other
> types of elements would then be available and be converted to other
> kinds of ui_out calls.  I didn't work out all the details, but e.g., a
> Python list could be converted to a ui_out_list.
>
>
> Tom
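
The "describe" idea quoted above could be sketched roughly as follows.
This is plain Python that runs outside GDB, not the proposed C
implementation: CliOut, MiOut and emit_description are hypothetical
names, and the toy MI-like rendering is only an assumption about how a
structured consumer might look, not real MI syntax for breakpoints.  It
shows one "describe" result being rendered two ways depending on the
output context.

```python
class CliOut:
    """Toy CLI renderer: concatenates plain text."""

    def __init__(self):
        self.parts = []

    def emit_text(self, text):
        self.parts.append(text)

    def emit_list(self, items):
        self.parts.append(", ".join(str(item) for item in items))

    def result(self):
        return "".join(self.parts)


class MiOut:
    """Toy structured renderer (MI-like, but not real MI syntax)."""

    def __init__(self):
        self.parts = []

    def emit_text(self, text):
        self.parts.append('"%s"' % text)

    def emit_list(self, items):
        self.parts.append("[" + ",".join('"%s"' % item for item in items) + "]")

    def result(self):
        return ",".join(self.parts)


def emit_description(elements, out):
    """Walk the iterable: strings become text, lists become list output."""
    for element in elements:
        if isinstance(element, str):
            out.emit_text(element)
        elif isinstance(element, (list, tuple)):
            out.emit_list(element)
        else:
            raise TypeError("unsupported element: %r" % (element,))
    return out.result()


class MyBreakpoint:
    """Hypothetical breakpoint; a real one would derive from gdb.Breakpoint."""

    def describe(self):
        return ["watching signals ", ["SIGINT", "SIGTERM"]]


bp = MyBreakpoint()
print(emit_description(bp.describe(), CliOut()))
print(emit_description(bp.describe(), MiOut()))
```

The script author never chooses between CLI and MI; only the renderer
differs.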

[-- Attachment #3: Type: text/plain, Size: 6 bytes --]

Tom


* Re: Python and structured output from breakpoint_ops
  2011-10-07 15:16 Python and structured output from breakpoint_ops Tom Tromey
@ 2011-10-07 15:40 ` Tom Tromey
  2011-10-07 16:04   ` Pedro Alves
  2011-10-10  9:14   ` Phil Muldoon
  0 siblings, 2 replies; 6+ messages in thread
From: Tom Tromey @ 2011-10-07 15:40 UTC
  To: GDB Development

Phil> So in print_one's case, the parent does not call the child function
Phil> until it has already populated all of the fields before address.  So
Phil> there are only two fields left to fill: "address", and "what".  They
Phil> have to be a long and a string.
[...]
Phil> Given that all one can return in the above context is
Phil> essentially a long and a string, I'm not sure what value we can add by
Phil> letting the user return anything else? We would have to convert any list
Phil> within the tuple to a string anyway.  I'd prefer users just do this
Phil> themselves in Python.

Phil> The same goes for print_stop_action.  This tells GDB what to print (via
Phil> an enum) when the breakpoint stops.

Phil> So again we are limited to a string and a constant.  I guess we could,
Phil> if the user passed a list within a list, call ui_out_list there.  But to
Phil> me, you will only ever want this output on one line (in fact, it may be
Phil> a requirement, I am not too sure).

I don't think it has to be.

Phil> There seems to be more room to maneuver with print_mention, and
Phil> print_one_detail.  They are currently implemented as pure strings.  But
Phil> again, both I believe (and really, I want) to be implemented as a single
Phil> string.  print_mention is called when a breakpoint is created.  Is there
Phil> an example of what kind of structured output we could use here?

I think I was hoping that we could unify some of the print methods.  It
seems strange to have 4 different methods to print more or less the same
basic information.

This might mean constraining the output a little bit in order to provide
a simpler API.  I think that would be good, but that is just my opinion;
however, if it turned out to be too limiting we could always extend the
options later.

Even if all the methods can't be unified it seems that at least
print_one and print_one_detail could be.

Phil> print_one_detail is an optional detail line below each entry for "info
Phil> breakpoints".  This has to be limited to a single line, to remain
Phil> consistent with "info breakpoints" output.

It seems like it could have multiple lines, just nothing does this yet.

This is a good example of where structured output is useful: right now
the code has to know how to format the continuation lines (e.g., start
with a tab) -- but it seems like it would be better not to bake this
into Python scripts everywhere, in case we want to change the "info
break" formatting in the future.  Some kind of structured result would
let us do this.
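
As a small illustration of that point, here is a sketch in plain Python
(render_details and DETAIL_PREFIX are hypothetical names, and the tab
prefix is only an assumption about the current "info break" layout).
The script hands back bare detail lines and a single renderer owns the
continuation-line formatting, so changing the layout later touches one
place rather than every script.

```python
# Assumed continuation-line prefix; the only place the layout is encoded.
DETAIL_PREFIX = "\t"


def render_details(detail_lines, prefix=DETAIL_PREFIX):
    """Format each structured detail line as one continuation line."""
    return "".join(prefix + line + "\n" for line in detail_lines)


# A script returns structure (a list of lines), never pre-formatted text.
details = ["stop only in thread 2", "hit 3 times"]
print(render_details(details), end="")
# A later layout change needs no script changes:
print(render_details(details, prefix="    "), end="")
```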

Phil> In fact, if you look at the mi command -break-list, it just maps
Phil> to info break and captures that output.  Maybe that conversation
Phil> is what Jan was talking about when there is an explicit mention
Phil> that any field change has to be made by Vlad?

I wouldn't worry about the field name thing in this discussion.  We're
already talking about extensions to gdb from third parties, nothing in
the core.

However, another important thing is that the print_* methods work from
both MI and the CLI.  Otherwise, -break-list is going to print garbage
when someone installs one of these Python-created breakpoints.

Tom


* Re: Python and structured output from breakpoint_ops
  2011-10-07 15:40 ` Tom Tromey
@ 2011-10-07 16:04   ` Pedro Alves
  2011-10-10  9:22     ` Phil Muldoon
  2011-10-10  9:14   ` Phil Muldoon
  1 sibling, 1 reply; 6+ messages in thread
From: Pedro Alves @ 2011-10-07 16:04 UTC
  To: gdb; +Cc: Tom Tromey

On Friday 07 October 2011 16:39:59, Tom Tromey wrote:

> Phil> So again we are limited to a string and a constant.  I guess we could,
> Phil> if the user passed a list within a list, call ui_out_list there.  But to
> Phil> me, you will only ever want this output on one line (in fact, it may be
> Phil> a requirement, I am not too sure).
> 
> I don't think it has to be.
> 
> Phil> There seems to be more room to maneuver with print_mention, and
> Phil> print_one_detail.  They are currently implemented as pure strings.  But
> Phil> again, both I believe (and really, I want) to be implemented as a single
> Phil> string.  print_mention is called when a breakpoint is created.  Is there
> Phil> an example of what kind of structured output we could use here?
> 
> I think I was hoping that we could unify some of the print methods.  It
> seems strange to have 4 different methods to print more or less the same
> basic information.

I still think we should clean up the breakpoint printing machinery before
exporting it to python.  These methods were not converted to
breakpoint_ops yet.  By only considering a single string, you're leaving
out breakpoints with multiple locations.  And those will become even more
important with Tom's linespec/multi-location rework.

> This might mean constraining the output a little bit in order to provide
> a simpler API.  I think that would be good, but that is just my opinion;
> however, if it turned out to be too limiting we could always extend the
> options later.
> 
> Even if all the methods can't be unified it seems that at least
> print_one and print_one_detail could be.
> 
> Phil> print_one_detail is an optional detail line below each entry for "info
> Phil> breakpoints".  This has to be limited to a single line, to remain
> Phil> consistent with "info breakpoints" output.
> 
> It seems like it could have multiple lines, just nothing does this yet.

Yeah.  Random catchpoints are likely to want it.

> This is a good example of where structured output is useful: right now
> the code has to know how to format the continuation lines (e.g., start
> with a tab) -- but it seems like it would be better not to bake this
> into Python scripts everywhere, in case we want to change the "info
> break" formatting in the future.  Some kind of structured result would
> let us do this.

Yeah.  I'd like that direction.  It'd allow for smarter column/cell
wrapping too.

> Phil> In fact, if you look at the mi command -break-list, it just maps
> Phil> to info break and captures that output.  Maybe that conversation
> Phil> is what Jan was talking about when there is an explicit mention
> Phil> that any field change has to be made by Vlad?

The thing is that the fields that are output aren't constrained at all
by the "address" / "what" columns you see in the CLI.  Look at all
the "ui_out_*" calls.  It seems quite reasonable to me to be able to
output random fields from python too, so you could implement new
breakpoint/catchpoints in python and forward whatever necessary info
to the frontend through MI.

-- 
Pedro Alves


* Re: Python and structured output from breakpoint_ops
  2011-10-07 15:40 ` Tom Tromey
  2011-10-07 16:04   ` Pedro Alves
@ 2011-10-10  9:14   ` Phil Muldoon
  1 sibling, 0 replies; 6+ messages in thread
From: Phil Muldoon @ 2011-10-10  9:14 UTC
  To: Tom Tromey; +Cc: GDB Development

Tom Tromey <tromey@redhat.com> writes:

> Phil> So in print_one's case, the parent does not call the child function
> Phil> until it has already populated all of the fields before address.  So
> Phil> there are only two fields left to fill: "address", and "what".  They
> Phil> have to be a long and a string.
> [...]
> Phil> Given that all one can return in the above context is
> Phil> essentially a long and a string, I'm not sure what value we can add by
> Phil> letting the user return anything else? We would have to convert any list
> Phil> within the tuple to a string anyway.  I'd prefer users just do this
> Phil> themselves in Python.
>
> Phil> The same goes for print_stop_action.  This tells GDB what to print (via
> Phil> an enum) when the breakpoint stops.
>
> Phil> So again we are limited to a string and a constant.  I guess we could,
> Phil> if the user passed a list within a list, call ui_out_list there.  But to
> Phil> me, you will only ever want this output on one line (in fact, it may be
> Phil> a requirement, I am not too sure).
>
> I don't think it has to be.
>
> Phil> There seems to be more room to maneuver with print_mention, and
> Phil> print_one_detail.  They are currently implemented as pure strings.  But
> Phil> again, both I believe (and really, I want) to be implemented as a single
> Phil> string.  print_mention is called when a breakpoint is created.  Is there
> Phil> an example of what kind of structured output we could use here?
>
> I think I was hoping that we could unify some of the print methods.  It
> seems strange to have 4 different methods to print more or less the same
> basic information.

Yeah I have no problem with that at all.  In fact, I totally agree there
should not be a one-to-one mapping of the internal -> external APIs.
They make sense to GDB internally, but not externally.

>
> This might mean constraining the output a little bit in order to provide
> a simpler API.  I think that would be good, but that is just my opinion;
> however, if it turned out to be too limiting we could always extend the
> options later.
>
> Even if all the methods can't be unified it seems that at least
> print_one and print_one_detail could be.

Right, so in this context "describe_breakpoint" could be the name
there.  Similarly, something like "describe_new_breakpoint" for mention.
>
> Phil> print_one_detail is an optional detail line below each entry for "info
> Phil> breakpoints".  This has to be limited to a single line, to remain
> Phil> consistent with "info breakpoints" output.
>
> It seems like it could have multiple lines, just nothing does this
> yet.

My caution here is that we have three interfaces to describe breakpoints
too: CLI, Annotations, and MI (1 and 2?).  Even though annotations seem
barely used, Emacs uses them.  Maybe this isn't an issue at all.

>
> This is a good example of where structured output is useful: right now
> the code has to know how to format the continuation lines (e.g., start
> with a tab) -- but it seems like it would be better not to bake this
> into Python scripts everywhere, in case we want to change the "info
> break" formatting in the future.  Some kind of structured result would
> let us do this.

I'm not sure what you mean?  There are two fields, What and Address, so
in that narrow context there will not be a continuation.  The old
print_one_detail, yeah, I can see that.  Also there seems to be just a
free-flow standard to where the detail line starts.  Some start at the
second column, some at the beginning (at least, when I last looked).  
>
> Phil> In fact, if you look at the mi command -break-list, it just maps
> Phil> to info break and captures that output.  Maybe that conversation
> Phil> is what Jan was talking about when there is an explicit mention
> Phil> that any field change has to be made by Vlad?
>
> I wouldn't worry about the field name thing in this discussion.  We're
> already talking about extensions to gdb from third parties, nothing in
> the core.
>
> However, another important thing is that the print_* methods work from
> both MI and the CLI.  Otherwise, -break-list is going to print garbage
> when someone installs one of these Python-created breakpoints.

Right, see above, including annotations.  As they stand right now, in my
branch, they do work for all three of the interfaces.  My question is:
how do we write this API (with structured output)?  Do we completely
dissociate the API from the GDB internals, and ask for our own kind of
custom output, then just slot the various bits of detail into the
various fields?  What are these APIs?  What kind of structured output?

Cheers,

Phil


* Re: Python and structured output from breakpoint_ops
  2011-10-07 16:04   ` Pedro Alves
@ 2011-10-10  9:22     ` Phil Muldoon
  2011-10-10 18:47       ` Pedro Alves
  0 siblings, 1 reply; 6+ messages in thread
From: Phil Muldoon @ 2011-10-10  9:22 UTC
  To: Pedro Alves; +Cc: gdb, Tom Tromey

Pedro Alves <pedro@codesourcery.com> writes:

> On Friday 07 October 2011 16:39:59, Tom Tromey wrote:
>
>> Phil> So again we are limited to a string and a constant.  I guess we could,
>> Phil> if the user passed a list within a list, call ui_out_list there.  But to
>> Phil> me, you will only ever want this output on one line (in fact, it may be
>> Phil> a requirement, I am not too sure).
>> 
>> I don't think it has to be.
>> 
>> Phil> There seems to be more room to maneuver with print_mention, and
>> Phil> print_one_detail.  They are currently implemented as pure strings.  But
>> Phil> again, both I believe (and really, I want) to be implemented as a single
>> Phil> string.  print_mention is called when a breakpoint is created.  Is there
>> Phil> an example of what kind of structured output we could use here?
>> 
>> I think I was hoping that we could unify some of the print methods.  It
>> seems strange to have 4 different methods to print more or less the same
>> basic information.
>
> I still think we should clean up the breakpoint printing machinery before
> exporting it to python.  These methods were not converted to
> breakpoint_ops yet.  By only considering a single string, you're leaving
> out breakpoints with multiple locations.  And those will become even more
> important with Tom's linespec/multi-location rework.

I've no problem with this as long as we have a plan in place, when we
think it will be released, etc.  So far you have done an excellent
refactor internally, but what are the future plans?  When do we plan to
have them in place?  The usual tricky question ;)

I guess I am asking what you mean by clean-ups in this context?  


>> This might mean constraining the output a little bit in order to provide
>> a simpler API.  I think that would be good, but that is just my opinion;
>> however, if it turned out to be too limiting we could always extend the
>> options later.
>> 
>> Even if all the methods can't be unified it seems that at least
>> print_one and print_one_detail could be.
>> 
>> Phil> print_one_detail is an optional detail line below each entry for "info
>> Phil> breakpoints".  This has to be limited to a single line, to remain
>> Phil> consistent with "info breakpoints" output.
>> 
>> It seems like it could have multiple lines, just nothing does this yet.
>
> Yeah.  Random catchpoints are likely to want it.

In a deeper context, fully implementing catchpoint creation in Python
seems quite tricky.  Many of the catchpoint APIs seem to need to know
about deep internal GDB state.  Do we want to expose those decisions
coupled with that information externally?  We made a promise with the
Python API that it will be stable.  I've not really thought about this
too much yet; there might be a clean answer just around the corner.


>> Phil> In fact, if you look at the mi command -break-list, it just maps
>> Phil> to info break and captures that output.  Maybe that conversation
>> Phil> is what Jan was talking about when there is an explicit mention
>> Phil> that any field change has to be made by Vlad?
>
> The thing is that the fields that are output aren't constrained at all
> by the "address" / "what" columns you see in the CLI.  Look at all
> the "ui_out_*" calls.  It seems quite reasonable to me to be able to
> output random fields from python too, so you could implement new
> breakpoint/catchpoints in python and forward whatever necessary info
> to the frontend through MI.

Doing that from Python would be a good idea, I agree.  We could have a
field:data structure for the user to output whatever they wish, and MI
could be taught that, beyond the usual fields it expects, there may be
"extra" fields which it can either ignore or print.  I'm not sure why
explicit field creation needs express approval from Vlad.  Are MI
clients parsing only the fields they expect?  Do they depend on the
order of fields?
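
A field:data structure of that kind might look something like the
sketch below.  It is plain Python that runs outside GDB; KNOWN_FIELDS,
render_mi and render_cli are hypothetical, the field names are only
loosely modeled on what MI breakpoint records tend to contain, and the
"event-count" field is invented purely as an example of an extra field.
The structured renderer forwards every field, while the CLI renderer
prints only the columns it already knows about.

```python
# Assumed set of "usual" fields, in the order a consumer expects them.
KNOWN_FIELDS = ("number", "type", "disp", "enabled", "addr", "what")


def render_mi(fields):
    """MI-like rendering: known fields first, then extras in sorted order."""
    extras = sorted(key for key in fields if key not in KNOWN_FIELDS)
    ordered = [key for key in KNOWN_FIELDS if key in fields] + extras
    body = ",".join('%s="%s"' % (key, fields[key]) for key in ordered)
    return "bkpt={" + body + "}"


def render_cli(fields):
    """CLI rendering: extra fields are simply ignored."""
    return " ".join(str(fields[key]) for key in KNOWN_FIELDS if key in fields)


# What a Python-defined catchpoint might hand back (hypothetical data).
fields = {"number": "1", "what": "my-special-event", "event-count": "3"}
print(render_mi(fields))
print(render_cli(fields))
```

Because extras are appended after the known fields, a client that only
reads the fields it expects is unaffected by them in this sketch.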

Cheers

Phil


* Re: Python and structured output from breakpoint_ops
  2011-10-10  9:22     ` Phil Muldoon
@ 2011-10-10 18:47       ` Pedro Alves
  0 siblings, 0 replies; 6+ messages in thread
From: Pedro Alves @ 2011-10-10 18:47 UTC
  To: pmuldoon; +Cc: gdb, Tom Tromey

On Monday 10 October 2011 10:22:20, Phil Muldoon wrote:
> Pedro Alves <pedro@codesourcery.com> writes:
> 
> > I still think we should clean up the breakpoint printing machinery before
> > exporting it to python.  These methods were not converted to
> > breakpoint_ops yet.  By only considering a single string, you're leaving
> > out breakpoints with multiple locations.  And those will become even more
> > important with Tom's linespec/multi-location rework.
> 
> I've no problem with this as long as we have a plan in place, when we
> think it will be released, etc.  Right now (you) did an excellent
> refactor internally, but what are the future plans?  

The next step is to make breakpoint_ops->print_one work
with regular breakpoints.  print_one_breakpoint / 
print_one_breakpoint_location were never converted to breakpoint_ops.
This is not a case of the internal abstractions being too
detailed/internal to want to expose to python.  Rather it's a case
of the internal abstraction not being good even for GDB's own internals!
If we fix this (pick print_one_breakpoint / print_one_breakpoint_location
apart in a way that the core breakpoint print code doesn't know about
specific breakpoint types), then you win a good python abstraction
as a co/by-product. IOW, or from a different angle, if you come up with
a nice python abstraction for this, there's no reason that the core
wouldn't want the same nice abstraction too.  But only by cleaning up
the core can you know you _have_ a good abstraction.

> When do we plan to have them in place?  The usual tricky question ;)

Ah, if days had infinite hours... :-)  I don't have time presently
to work on that myself until next January.

> I guess I am asking what you mean by clean-ups in this context?  

See above.

> >> It seems like it could have multiple lines, just nothing does this yet.
> >
> > Yeah.  Random catchpoints are likely to want it.
> 
> In a deeper context, fully implementing catchpoint creation in Python
> seems quite tricky.  Many of the catchpoint APIs seem to need to know
> about deep internal GDB state.  Do we want to expose those decisions
> coupled with that information externally?  We made a promise with the
> Python API that it will be stable.  I've not really thought about this
> too much yet; there might be a clean answer just around the corner.

Catchpoints that I'd find useful to write in python would for example
be things like putting a breakpoint in a special routine in your
special domain-specific or embedded OS runtime -- "catch my-special-event".
You'd want to hide the fact that that's implemented by placing a
breakpoint, and the support is all there (I believe).

> >> Phil> In fact, if you look at the mi command -break-list, it just maps
> >> Phil> to info break and captures that output.  Maybe that conversation
> >> Phil> is what Jan was talking about when there is an explicit mention
> >> Phil> that any field change has to be made by Vlad?
> >
> > The thing is that the fields that are output aren't constrained at all
> > by the "address" / "what" columns you see in the CLI.  Look at all
> > the "ui_out_*" calls.  It seems quite reasonable to me to be able to
> > output random fields from python too, so you could implement new
> > breakpoint/catchpoints in python and forward whatever necessary info
> > to the frontend through MI.
> 
> Doing that from Python would be a good idea, I agree.  We could have a
> field:data structure for the user to output whatever they wish, and MI
> could be taught to learn, beyond the usual fields it expects, there are
> "extra" fields: ignore them or print them.  I'm not sure why the
> explicit field creations needs express approval from Vlad.  Are MI
> clients parsing expected only fields? Order of fields?

I'm not really sure I understand what you're asking.

-- 
Pedro Alves
