public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
From: Andrew Burgess <aburgess@redhat.com>
To: Simon Marchi <simark@simark.ca>, gdb-patches@sourceware.org
Cc: Andrew Burgess <andrew.burgess@embecosm.com>
Subject: Re: [PATCHv4 1/5] gdb: add new base class to gdb_disassembler
Date: Thu, 05 May 2022 18:39:50 +0100	[thread overview]
Message-ID: <87czgrnajd.fsf@redhat.com> (raw)
In-Reply-To: <0a8f29f9-a8f9-e178-6c89-90ff38be7596@simark.ca>

Simon Marchi <simark@simark.ca> writes:

>> From: Andrew Burgess <andrew.burgess@embecosm.com>
>> 
>> The motivation for this change is an upcoming Python disassembler API
>> that I would like to add.  As part of that change I need to create a
>> new disassembler like class that contains a disassemble_info and a
>> gdbarch.  The management of these two objects is identical to how we
>> manage these objects within gdb_disassembler, so it might be tempting
>> for my new class to inherit from gdb_disassembler.
>> 
>> The problem however, is that gdb_disassembler has a tight connection
>> between its constructor, and its print_insn method.  In the
>> constructor the ui_file* that is passed in is replaced with a member
>> variable string_file*, and then in print_insn, the contents of the
>> member variable string_file are printed to the original ui_file*.
>> 
>> What this means is that the gdb_disassembler class has a tight
>> coupling between its constructor and print_insn; the class just isn't
>> intended to be used in a situation where print_insn is not going to be
>> called, which is how my (upcoming) sub-class would need to operate.
>> 
>> My solution then, is to separate out the management of the
>> disassemble_info and gdbarch into a new gdb_disassemble_info class,
>> and make this class a parent of gdb_disassembler.
>> 
>> In arm-tdep.c and mips-tdep.c, where we used to cast the
>> disassemble_info->application_data to a gdb_disassembler, we can now
>> cast to a gdb_disassemble_info as we only need to access the gdbarch
>> information.
>> 
>> Now, my new Python disassembler sub-class will still want to print
>> things to an output stream, and so we will want access to the
>> dis_asm_fprintf functionality for printing.
>> 
>> However, rather than move this printing code into the
>> gdb_disassemble_info base class, I have added yet another level of
>> hierarchy, a gdb_printing_disassembler, thus the class structure is
>> now:
>> 
>>   struct gdb_disassemble_info {};
>>   struct gdb_printing_disassembler : public gdb_disassemble_info {};
>>   struct gdb_disassembler : public gdb_printing_disassembler {};
>
> I can't explain it very well, but it seems a little strange to make the
> other classes inherit from gdb_disassemble_info.  From the name (and the
> code), I understand that gdb_disassemble_info purely holds the data
> necessary for the disassemblers to do their job.  So this is not really
> an IS-A relationship, but more a HAS-A.  So I would think composition
> would be more natural (i.e. gdb_printing_disassembler would have a field
> of type gdb_disassemble_info).
>
> But with the callbacks we pass to struct disassemble_info (such as
> fprintf_func and fprintf_styled_func), what is data and what is behavior
> gets a little blurry.
>
> It doesn't matter much in the end, I'm fine with what you have.  And it
> can always be refactored later.

Thanks.  I took another look at this code and started by simplifying
things so that we had just gdb_disassemble_info and gdb_disassembler.
Instead of having gdb_disassembler inherit from gdb_disassemble_info,
gdb_disassembler contained a gdb_disassemble_info.

Then I pulled the rest of the patches in on top and tried to get things
working again.  In the end I liked the new code a lot less that what I
have here.

The problem I ran into is that it is the gdb_disassemble_info that is
held in the disassemble_info::application_data field.  It is important
(I think) that this is the case, as we want code like gdb_print_insn_arm
(arm-tdep.c) to be able to make use of the application_data without
knowing precisely which disassembler class triggered the disassembly
call, e.g. we could be calling from a Python disassembler (after patch
#3 in this series), or from the "normal" disassembler (e.g. after an x/i
command), or we could be calling from the gdb_disassembler_test
disassembler (from the selftest files).  In all cases the
application_data is always a sub-class of gdb_disassemble_info.

So, when gdb_disassembler no longer inherits from the base type, but
instead contains a base type then what do we place in the
application_data field?  I think the only choice is to still place a
gdb_disassemble_info in that field, but this means that we end up
needing a mechanism to get from the application_data back to the actual
disassembler instance, here's why:

Currently my proposal is something like this (after patch #3):

  struct gdb_disassemble_info
  struct gdb_printing_disassembler : public gdb_disassemble_info
  struct gdb_disassembler : public gdb_printing_disassembler
  struct gdbpy_disassembler : public gdb_printing_disassembler

Each of these types are sub-classes of gdb_disassemble_info, and so,
callbacks like gdb_print_insn_arm are fine to cast the application_data
to the gdb_disassemble_info type and make use of it.

But then we have other callbacks,
e.g. gdb_disassembler::dis_asm_print_address and (from patch #3)
gdbpy_disassembler::read_memory_func, these callbacks are disassembler
specific.  In these we don't cast the application_data to
gdb_disassemble_info, but to the specific disassembler type,
gdb_disassembler and gdbpy_disassembler in the two examples I listed
above.  But there's another example in
gdb_disassembler_test::read_memory and ther might be more about.

The neat thing (I think) about using inheritance here is that this all
"just works", a callback defined within a particular disassembler can
assume that disassembler type.  A more generic callback can just assume
the generic gdb_disassemble_info type.

So after I tried restructuring things like this:

  struct gdb_disassemble_info
  struct gdb_disassembler
  struct gdbpy_disassembler

Where both gdb_disassembler and gdbpy_disassembler contain a
gdb_disassemble_info.  I still ended up placing the gdb_disassemble_info
into the application_data field.  However, now in the disassembler
specific callbacks how do I get back to the correct disassembler type?

I could give the gdb_disassemble_info a generic void* field, and then
place any pointer I want in there, but, I really don't want to do that.
Passing generic pointers around is almost never necessary (I claim) when
we have inheritance.  So what I did instead was this:

  struct gdb_disassemble_info
  struct gdb_disassembler
  struct gdbpy_disassemble_info : public gdb_disassemble_info
  struct gdbpy_disassembler

Now gdb_disassembler contains a gdb_disassemble_info, and
gdbpy_disassembler contains a gdbpy_disassemble_info.  The callback
problems is solved; generic callbacks can still treat the
application_data as a gdb_disassemble_info, but the disassembler
specific callbacks will know it's a particular sub-class of
gdb_disassemble_info (e.g. gdbpy_disassemble_info), and can cast the
application_data to that type.  The gdbpy_disassemble_info then holds a
pointer to the gdbpy_disassembler, and all is good.

Of course, the above is only a sub-set of what's needed, the
disassembler self-tests end up needing another gdb_disassemble_info
sub-class, and maybe there would be more needed if I'd actually taken
this patch to completion?

It feels like all I've done is move complexity out of disasm.h and
forced the complexity onto each user of the disassembler, which isn't
great.  The current approach might not be great, but actually using the
disassembler is pretty straight forward.

The other problem with the above is knowing what state to keep in
e.g. gdbpy_disassembler and what to move into gdbpy_disassemble_info?  I
played around with this a bit, but to some degree I felt like I was
forcing myself to keep gdbpy_disassembler around, it almost felt more
natural to move everything into gdbpy_disassemble_info, in which case,
aren't I almost back where I started?  Except I've moved the code out of
disasm.h and to the site of each user?

I know you said you'd be willing to see my original patch go in and
refactor later - and I am of course happy to see the code reworked later
- but I wanted to see if I could get something that you'd be happy with.

Anyway, if you have any suggestions I willing to take another pass at
this code.

>
>> +gdb_disassemble_info::gdb_disassemble_info
>> +  (struct gdbarch *gdbarch, struct ui_file *stream,
>> +   read_memory_ftype read_memory_func, memory_error_ftype memory_error_func,
>> +   print_address_ftype print_address_func, fprintf_ftype fprintf_func,
>> +   fprintf_styled_ftype fprintf_styled_func)
>> +    : m_gdbarch (gdbarch)
>>  {
>> -  init_disassemble_info (&m_di, &m_buffer, dis_asm_fprintf,
>> -			 dis_asm_styled_fprintf);
>> +  gdb_assert (fprintf_func != nullptr);
>> +  gdb_assert (fprintf_styled_func != nullptr);
>> +  init_disassemble_info (&m_di, stream, fprintf_func,
>> +			 fprintf_styled_func);
>>    m_di.flavour = bfd_target_unknown_flavour;
>> -  m_di.memory_error_func = dis_asm_memory_error;
>> -  m_di.print_address_func = dis_asm_print_address;
>> -  /* NOTE: cagney/2003-04-28: The original code, from the old Insight
>> -     disassembler had a local optimization here.  By default it would
>> -     access the executable file, instead of the target memory (there
>> -     was a growing list of exceptions though).  Unfortunately, the
>> -     heuristic was flawed.  Commands like "disassemble &variable"
>> -     didn't work as they relied on the access going to the target.
>> -     Further, it has been superseeded by trust-read-only-sections
>> -     (although that should be superseeded by target_trust..._p()).  */
>> -  m_di.read_memory_func = read_memory_func;
>> +  if (memory_error_func != nullptr)
>> +    m_di.memory_error_func = memory_error_func;
>> +  if (print_address_func != nullptr)
>> +    m_di.print_address_func = print_address_func;
>> +  if (read_memory_func != nullptr)
>> +    m_di.read_memory_func = read_memory_func;
>
> Are the nullptr checks needed?  Are these fields initially nullptr, or
> they have some other value?

Yes they are needed.  The three fields protected here are given non-null
values by the call to init_disassemble_info, we only want to replace
these defaults with a user supplied value if the user supplied value is
not nullptr.

Placing nullptr into these fields will cause the disassembler to crash!

This was documented in the header file on gdb_disassemble_info, but I've
added an extra comment in the code here to make it clear what's going on.

>
>
>> +struct gdb_disassemble_info
>> +{
>> +  DISABLE_COPY_AND_ASSIGN (gdb_disassemble_info);
>>  
>> -  /* Return the gdbarch of gdb_disassembler.  */
>> +  /* Return the gdbarch we are disassembing for.  */
>
> disassembing -> disassembling

Fixed.

I've working on updating patch #3 and will repost this series once
that's done.

Thanks,
Andrew


  parent reply	other threads:[~2022-05-05 17:39 UTC|newest]

Thread overview: 80+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-13 21:59 [PATCH 0/5] Add Python API for the disassembler Andrew Burgess
2021-10-13 21:59 ` [PATCH 1/5] gdb: make disassembler fprintf callback a static member function Andrew Burgess
2021-10-20 20:40   ` Tom Tromey
2021-10-22 12:51     ` Andrew Burgess
2021-10-13 21:59 ` [PATCH 2/5] gdb/python: new gdb.architecture_names function Andrew Burgess
2021-10-14  6:52   ` Eli Zaretskii
2021-10-22 12:51     ` Andrew Burgess
2021-10-20 20:40   ` Tom Tromey
2021-10-22 13:02   ` Simon Marchi
2021-10-22 17:34     ` Andrew Burgess
2021-10-22 18:42       ` Simon Marchi
2021-10-13 21:59 ` [PATCH 3/5] gdb/python: move gdb.Membuf support into a new file Andrew Burgess
2021-10-20 20:42   ` Tom Tromey
2021-10-22 12:52     ` Andrew Burgess
2021-10-13 21:59 ` [PATCH 4/5] gdb: add extension language print_insn hook Andrew Burgess
2021-10-20 21:06   ` Tom Tromey
2021-10-13 21:59 ` [PATCH 5/5] gdb/python: implement the print_insn extension language hook Andrew Burgess
2021-10-14  7:12   ` Eli Zaretskii
2021-10-22 17:47     ` Andrew Burgess
2021-10-22 18:33       ` Eli Zaretskii
2021-10-22 13:30   ` Simon Marchi
2022-03-23 22:41 ` [PATCHv2 0/3] Add Python API for the disassembler Andrew Burgess
2022-03-23 22:41   ` [PATCHv2 1/3] gdb: add new base class to gdb_disassembler Andrew Burgess
2022-03-23 22:41   ` [PATCHv2 2/3] gdb: add extension language print_insn hook Andrew Burgess
2022-03-23 22:41   ` [PATCHv2 3/3] gdb/python: implement the print_insn extension language hook Andrew Burgess
2022-03-24  7:10     ` Eli Zaretskii
2022-03-24 19:51       ` Andrew Burgess
2022-04-04 22:19   ` [PATCHv3 0/6] Add Python API for the disassembler Andrew Burgess
2022-04-04 22:19     ` [PATCHv3 1/6] gdb: move gdb_disassembly_flag into a new disasm-flags.h file Andrew Burgess
2022-04-05 14:32       ` Tom Tromey
2022-04-06 12:18         ` Andrew Burgess
2022-04-04 22:19     ` [PATCHv3 2/6] gdb: add new base class to gdb_disassembler Andrew Burgess
2022-04-04 22:19     ` [PATCHv3 3/6] gdb: add extension language print_insn hook Andrew Burgess
2022-04-04 22:19     ` [PATCHv3 4/6] gdb/python: implement the print_insn extension language hook Andrew Burgess
2022-04-05 12:04       ` Eli Zaretskii
2022-04-04 22:19     ` [PATCHv3 5/6] gdb: refactor the non-printing disassemblers Andrew Burgess
2022-04-04 22:19     ` [PATCHv3 6/6] gdb: unify two dis_asm_read_memory functions in disasm.c Andrew Burgess
2022-04-25  9:15     ` [PATCHv4 0/5] Add Python API for the disassembler Andrew Burgess
2022-04-25  9:15       ` [PATCHv4 1/5] gdb: add new base class to gdb_disassembler Andrew Burgess
2022-05-03 13:34         ` Simon Marchi
2022-05-03 16:13           ` Andrew Burgess
2022-05-05 17:39           ` Andrew Burgess [this message]
2022-04-25  9:15       ` [PATCHv4 2/5] gdb: add extension language print_insn hook Andrew Burgess
2022-05-03 13:42         ` Simon Marchi
2022-04-25  9:15       ` [PATCHv4 3/5] gdb/python: implement the print_insn extension language hook Andrew Burgess
2022-04-25 11:26         ` Eli Zaretskii
2022-05-03 14:55         ` Simon Marchi
2022-05-05 18:17           ` Andrew Burgess
2022-05-24  1:16             ` Simon Marchi
2022-05-24  8:30               ` Andrew Burgess
2022-05-25 10:37                 ` Andrew Burgess
2022-04-25  9:15       ` [PATCHv4 4/5] gdb: refactor the non-printing disassemblers Andrew Burgess
2022-04-25  9:15       ` [PATCHv4 5/5] gdb: unify two dis_asm_read_memory functions in disasm.c Andrew Burgess
2022-05-03 10:12       ` [PATCHv4 0/5] Add Python API for the disassembler Andrew Burgess
2022-05-06 17:17       ` [PATCHv5 " Andrew Burgess
2022-05-06 17:17         ` [PATCHv5 1/5] gdb: add new base class to gdb_disassembler Andrew Burgess
2022-05-06 17:17         ` [PATCHv5 2/5] gdb: add extension language print_insn hook Andrew Burgess
2022-05-06 17:17         ` [PATCHv5 3/5] gdb/python: implement the print_insn extension language hook Andrew Burgess
2022-05-06 18:11           ` Eli Zaretskii
2022-05-18 10:08             ` Andrew Burgess
2022-05-18 12:08               ` Eli Zaretskii
2022-05-23  8:59                 ` Andrew Burgess
2022-05-23 11:23                   ` Eli Zaretskii
2022-05-06 17:17         ` [PATCHv5 4/5] gdb: refactor the non-printing disassemblers Andrew Burgess
2022-05-06 17:17         ` [PATCHv5 5/5] gdb: unify two dis_asm_read_memory functions in disasm.c Andrew Burgess
2022-05-25 10:49         ` [PATCHv6 0/6] Add Python API for the disassembler Andrew Burgess
2022-05-25 10:49           ` [PATCHv6 1/6] gdb/python: convert gdbpy_err_fetch to use gdbpy_ref Andrew Burgess
2022-05-25 10:49           ` [PATCHv6 2/6] gdb: add new base class to gdb_disassembler Andrew Burgess
2022-05-25 10:49           ` [PATCHv6 3/6] gdb: add extension language print_insn hook Andrew Burgess
2022-05-25 10:49           ` [PATCHv6 4/6] gdb/python: implement the print_insn extension language hook Andrew Burgess
2022-05-25 13:32             ` Eli Zaretskii
2022-05-25 10:49           ` [PATCHv6 5/6] gdb: refactor the non-printing disassemblers Andrew Burgess
2022-05-25 10:49           ` [PATCHv6 6/6] gdb: unify two dis_asm_read_memory functions in disasm.c Andrew Burgess
2022-06-15  9:04           ` [PUSHED 0/6] Add Python API for the disassembler Andrew Burgess
2022-06-15  9:04             ` [PUSHED 1/6] gdb/python: convert gdbpy_err_fetch to use gdbpy_ref Andrew Burgess
2022-06-15  9:04             ` [PUSHED 2/6] gdb: add new base class to gdb_disassembler Andrew Burgess
2022-06-15  9:04             ` [PUSHED 3/6] gdb: add extension language print_insn hook Andrew Burgess
2022-06-15  9:04             ` [PUSHED 4/6] gdb/python: implement the print_insn extension language hook Andrew Burgess
2022-06-15  9:04             ` [PUSHED 5/6] gdb: refactor the non-printing disassemblers Andrew Burgess
2022-06-15  9:04             ` [PUSHED 6/6] gdb: unify two dis_asm_read_memory functions in disasm.c Andrew Burgess

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87czgrnajd.fsf@redhat.com \
    --to=aburgess@redhat.com \
    --cc=andrew.burgess@embecosm.com \
    --cc=gdb-patches@sourceware.org \
    --cc=simark@simark.ca \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).