Re: Questions on XML register descriptions

From: Luis Machado <luis.machado@arm.com>
To: "H. Peter Anvin" <hpa@zytor.com>, gdb@sourceware.org
Subject: Re: Questions on XML register descriptions
Date: Fri, 29 Dec 2023 13:41:23 +0000
Message-ID: <a6348a4c-6b2b-40ab-a03f-b6f4c448d255@arm.com>
In-Reply-To: <8d4f8356-d7ee-4b84-ba8a-346207bed3e8@zytor.com>

On 12/28/23 22:09, H. Peter Anvin wrote:
> Hi and thanks for responding!
> 
> So more concrete background: the simulator is for a family of Z80 systems developed between 1978 and 1983. Z80 of course has both memory and I/O spaces, and on top of that, the later models in the series (and even the first ones with add-ons) had to do paging tricks to deal with the limitations of a 16-bit address space.
> 
> This is not visible to the CPU, so sizeof(void *) == 2.
> 

Thanks for the info. In the case of Z80, it looks like gdb already has support for it, but I've never used it. You can check what's available in the gdb/z80-tdep.c file.

From skimming through it, I found the following:

  /* Number of bytes used for address:
      2 bytes for all Z80 family
      3 bytes for eZ80 CPUs operating in ADL mode */

So it looks like gdb knows at least *some* of it. I don't know how functional the port is at this point.

> On 12/27/23 03:28, Luis Machado wrote:
>>>
>>> 1. Some registers can't be written to, and some registers may affect other registers. What, if anything, is the best way to handle that?
>>
>> There isn't a single right way. You could teach gdb about those registers explicitly (not ideal) or you could make the remote just ignore or error out when gdb tries to modify such a register.
>>
>> As for side-effects (changing a register has an effect on a different register), this might be best implemented via p/P packets. With those gdb can write out one specific register and then it re-reads the rest of the registers.
>>
>> In such a case, you get the opportunity to apply the side-effects to other registers in time for sending the updated register buffer block.
>>
>> This is a bit odd with the g/G packets, as it gets confusing having to write all registers at once, and you might not know what values have changed or not.
>>
> 
> This is of course true, but it is gdb that selects between the g/G and p/P packets, not the remote (which is what I control).
> 

That's true, but if the remote advertises support for the p/P packets, I think things will work better. So you have at least some control over what gdb will do.
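
To make that a bit more concrete, here is a rough sketch of how a stub could handle the P packet, apply side effects, and then answer p with the updated values. The register names and numbering, the read-only "id" register and the E01/E02 error codes are all made up for illustration; this is not the layout gdb's Z80 port uses.

```python
# Hypothetical numbering: 0 = a, 1 = f, 2 = af (alias of a:f), 3 = id (read-only).
NAMES = ["a", "f", "af", "id"]
SIZES = [1, 1, 2, 1]  # size of each register in bytes


class Stub:
    def __init__(self):
        self.regs = {"a": 0, "f": 0, "af": 0, "id": 0x5A}

    def _apply_side_effects(self, written):
        # Keep the af pair and its two halves consistent after any write.
        if written == "af":
            self.regs["a"] = self.regs["af"] >> 8
            self.regs["f"] = self.regs["af"] & 0xFF
        elif written in ("a", "f"):
            self.regs["af"] = (self.regs["a"] << 8) | self.regs["f"]

    def handle(self, packet):
        if packet.startswith("P"):
            # 'P n=r': n is the register number in hex, r the value as
            # hex bytes in target byte order (little-endian assumed here).
            number, _, data = packet[1:].partition("=")
            index = int(number, 16)
            name = NAMES[index]
            if name == "id":
                return "E01"  # refuse writes to the read-only register
            raw = bytes.fromhex(data)
            if len(raw) != SIZES[index]:
                return "E02"  # value has the wrong width
            self.regs[name] = int.from_bytes(raw, "little")
            self._apply_side_effects(name)
            return "OK"
        if packet.startswith("p"):
            index = int(packet[1:], 16)
            value = self.regs[NAMES[index]]
            return value.to_bytes(SIZES[index], "little").hex()
        return ""  # an empty reply means the packet is not supported
```

With this, a write such as "P2=3412" sets af to 0x1234, and a following "p0" already reflects the new value of a.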

> Now, what I *can* do is to limit the return size of the g packet. If that means gdb will use pP packets to access other registers if/when requested, then that should deal with the problem.
> 

Ok, I went to check this again, as I had forgotten what gdb did. It looks like you're right. From the code comment about handling the g packet reply from the remote:

  /* If this is smaller than we guessed the 'g' packet would be,
     update our records.  A 'g' reply that doesn't include a register's
     value implies either that the register is not available, or that
     the 'p' packet must be used.  */

So it looks like you can force gdb to fall back to the p packet by keeping the g reply packet to a minimal size.

>>>
>>> 2. How do the G/g commands interact with the XML target descriptions? Do they apply specifically to group="general", or are there other rules?
>>
>> My recollection of it is that G/g will request all of the reported XML registers with non-zero sizes. If p/P is also supported, gdb will attempt to fetch things using g and write things using P.
>>
> 
> That solves a lot of problems, although I expect that there might still be issues with the register cache... but that is a second-order problem if only a bit annoying.
> 
>> I vaguely recall something about gdb not requesting some register in g, but I'm not sure if that's still a thing.
> 
> So gdb doesn't specify anything about what g is supposed to return. The documentation only says "general registers", which *sounds* like group="general", but I have no real idea.
> 
> The description of register groups says that "some register groups" have built-in meaning to gdb, but I cannot find anywhere *where* those groups and their associated meanings are described, except that "general", "float" and "vector" are described (but not their specific semantics), and "all" is mentioned elsewhere (the naming of which is kind of obvious).

Yeah, sorry about that. The documentation can be a bit vague in some areas. Though not documented that way, from the code's side gdb lets the remote target choose what to return in the g packet. Of course the remote must honor the order of registers in the g packet, but otherwise it can decide how big/small to make the g packet reply.

This is useful in case you have many system registers. The g packet in those cases would get fairly big, and you don't want to keep sending system register contents back and forth if they are not important.
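
As a rough illustration of a short g reply, the sketch below hex-encodes only the first few registers, in target description order, and leaves the rest to be fetched with p. The helper name, the little-endian encoding and the example values are assumptions for the sake of the example.

```python
def build_g_reply(values, sizes, count):
    """Hex-encode only the first `count` registers, in description order."""
    chunks = []
    for value, size in zip(values[:count], sizes[:count]):
        # Each register is sent as hex bytes in target byte order.
        chunks.append(value.to_bytes(size, "little").hex())
    return "".join(chunks)
```

For example, build_g_reply([0x12, 0x34, 0xBEEF, 0x5A], [1, 1, 2, 1], 3) leaves out the fourth register, so gdb would have to treat it as unavailable or ask for it with p.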

As for the groups, they mostly affect how gdb displays the registers. For instance, "info reg" will not show floating point/vector/system registers IIRC. I don't think it has an influence on the remote protocol.

> 
> There is an option "save-restore" in the register description; it is unclear to me how it interacts with the g/G packets, but it could at least avoid the problem of gdb trying to restore registers that cannot be restored or would be ambiguous.
> 

save-restore controls whether gdb is required to save those registers in case it wants to make a manual function call (say, with a "call" command). In that case, gdb will save the register context before calling the function and will restore it afterwards.

If a register is not writable, it shouldn't have save-restore enabled.
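
A small sketch of what that could look like in the target description, wrapped in Python so it can be checked. The feature name and the "bankstat" register are hypothetical; the save-restore attribute itself defaults to "yes" when it is omitted.

```python
import xml.etree.ElementTree as ET

# Hypothetical feature and register names, for illustration only.
FEATURE_XML = """\
<feature name="org.example.z80.sim">
  <reg name="pc" bitsize="16" type="code_ptr"/>
  <reg name="bankstat" bitsize="8" save-restore="no"/>
</feature>
"""


def registers_to_save(xml_text):
    """List the registers gdb would save around an inferior function call."""
    root = ET.fromstring(xml_text)
    return [
        reg.get("name")
        for reg in root.iter("reg")
        if reg.get("save-restore", "yes") == "yes"
    ]
```

Here only "pc" would be saved and restored; "bankstat" is left alone.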

>>>
>>> 3. Is there any concept at all of different address spaces in gdb? Alternatively, is there any way to tell gdb that it should communicate addresses wider than the pointer type of the base architecture (which would allow a Python helper script to define a convenience API)?
>>
>> Yes, but the architecture-specific / os-specific layers need to be taught about it. Ideally the compiler would also establish a way of communicating this information to debuggers via the generated DWARF info. See DW_AT_address_class in the DWARF standard.
>>
>> Are we talking about short pointers here? Say, a 16-bit pointer in a 32-bit address space?
>>
> 
> The compiler communicating it isn't essential here as I'm not trying to expose things that are visible to the compiler. What I want to do is to be able to use gdb to examine/access non-CPU-visible address spaces, like the physical RAM even when it is paged out, or the I/O address space.
> 
> The problem becomes that the type of the pointer is lost if it is simply converted to an integer, and casting it back to a pointer causes it to be truncated back to 16 bits.

Yeah, so this is the type of knowledge gdb must have in order to handle things correctly. Technically you could handle things correctly from the remote's side by inspecting the memory read/write packets, determining what the address space should be and what the pointer size might be, and then fetching the data from whichever address space it belongs to.
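
Something along these lines on the remote's side, as a sketch only. It assumes gdb can be made to send addresses wider than 16 bits in the m packet, which is exactly the open question here. The 0x01000000 I/O base is borrowed from the "io" example in your script; the table layout and the function name are made up.

```python
# Top byte of the address selects the space. 0x01 matches the "io" base
# of 0x01000000; treating 0x00 as the CPU-visible space is an assumption.
SPACES = {0x00: "cpu", 0x01: "io"}


def parse_memory_read(packet):
    """Split an 'm addr,length' packet into (space, offset, length)."""
    if not packet.startswith("m"):
        raise ValueError("not a memory read packet")
    addr_text, _, length_text = packet[1:].partition(",")
    address = int(addr_text, 16)
    length = int(length_text, 16)
    space = SPACES.get(address >> 24)
    if space is None:
        return None  # unknown space; the stub would reply with an error
    return space, address & 0xFFFFFF, length
```

So "m1000010,4" would be a 4-byte read at offset 0x10 of the I/O space, while "m8000,2" stays in the normal CPU-visible space.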

> 
> I tried doing a Python script like this, but it ran into the above problem (plus the connection problem mentioned below):
> 
> 

What particular error do you get, out of curiosity?

> #!/usr/bin/python
> 
> import gdb
> 
> class Addrspace (gdb.Function):
>     """Convert an address or a pointer to the named address space."""
>     def __init__ (self, name, base, mask):
>         self.name = name
>         self.mask = mask
>         self.base = base
>         arch = gdb.selected_inferior().architecture()
>         bits = 8
>         while (mask|base) >> bits:
>             bits <<= 1
>         self.itype = arch.integer_type(bits, False)
>         super (Addrspace, self).__init__ (name)
> 
>     def invoke (self, val):
>         type = val.type
>         return ((val.cast(self.itype) & self.mask) \
>          + self.base).cast(type)
> 
> _io = Addrspace ("io", 0x01000000, 0xffff)
> 
> 
>>> 4. There doesn't seem to be a way to trigger a Python function when a connection is *established*, which is as far as I understand the point at which the XML register description is downloaded from the remote, and definitely the first point at which register values can be examined. Am I missing something obvious?
>>>
>>
>> The Python API is still growing. If this is needed, I'd make the case to the gdb community. Patches are always welcome.
> 
> 
> 
> 
>>> 5. I presume the byte codes used for the Z commands are the same as the agent bytecode in Appendix F, even though the latter specifically seems to be referring to tracepoints?
>>
>> Yes, the byte codes used for evaluating breakpoint conditions are the same as the ones used for tracepoints. The documentation could make that a bit more explicit.
>>
>>>
>>> Many thanks,
>>>
>>>      -hpa
>>
