Self-describing targets - a more concrete proposal

public inbox for gdb@sourceware.org

* Self-describing targets - a more concrete proposal
@ 2006-03-29 21:26 Daniel Jacobowitz
  2006-03-29 21:49 ` Eli Zaretskii
  0 siblings, 1 reply; 9+ messages in thread
From: Daniel Jacobowitz @ 2006-03-29 21:26 UTC (permalink / raw)
  To: gdb

I've posted about this work several times now, including:

  http://sourceware.org/ml/gdb/2005-05/msg00074.html
  http://sourceware.org/ml/gdb/2005-05/msg00171.html
  http://sourceware.org/ml/gdb/2006-01/msg00257.html
  http://sourceware.org/ml/gdb/2006-03/msg00031.html

It is in a much more concrete stage of development than it's ever been
before; almost everything described in the proposed documentation
actually works now, and I am generally happy with the format
of the descriptions.  You can find the code on
gdb-csl-available-20060303-branch in the CVS repository.  It's
currently only wired up for ARM, and there aren't any sample stub
implementations yet; those will follow.

I'd particularly like to thank Paul Brook for some valuable
suggestions, and Jim Blandy for both suggestions and turning my messy
notes into the coherent Texinfo below (and for using "Self-Describing",
which I think is the phrase I'd been fumbling toward for a while).

I would appreciate comments on the sample and documentation; while
there are a lot of things left on my to-do list for this project,
most of them are nice to have rather than important.  What's in CVS
is enough to be very, very useful.  If other GDB developers like
the path it's taking, I'd prefer to do future development on it
in mainline, instead of on a branch.

Here's a small, but useful, sample description.  Then, below it, Jim's
documentation - which includes details on what the description means. 
This description matches the current layout of the ARM register cache.

In target.xml:

<?xml version="1.0"?>
<!DOCTYPE target SYSTEM "gdb-target.dtd">
<target>
  <xi:include href="arm-core.xml"/>
  <xi:include href="arm-fpa.xml"/>
  <feature-set>
    <feature-ref name="org.gnu.gdb.arm.core" base-regnum="0"/>
    <feature-ref name="org.gnu.gdb.arm.fpa" base-regnum="16"/>
  </feature-set>
</target>

In arm-core.xml:

<?xml version="1.0"?>
<!DOCTYPE feature SYSTEM "gdb-target.dtd">
<feature name="org.gnu.gdb.arm.core">
  <reg name="r0" bitsize="32"/>
  <reg name="r1" bitsize="32"/>
  <reg name="r2" bitsize="32"/>
  <reg name="r3" bitsize="32"/>
  <reg name="r4" bitsize="32"/>
  <reg name="r5" bitsize="32"/>
  <reg name="r6" bitsize="32"/>
  <reg name="r7" bitsize="32"/>
  <reg name="r8" bitsize="32"/>
  <reg name="r9" bitsize="32"/>
  <reg name="r10" bitsize="32"/>
  <reg name="r11" bitsize="32"/>
  <reg name="r12" bitsize="32"/>
  <reg name="r13" bitsize="32"/>
  <reg name="r14" bitsize="32"/>
  <reg name="r15" bitsize="32"/>

  <!-- The CPSR is register 25, rather than register 16, because
       the FPA registers historically were placed between the PC
       and the CPSR in the "g" packet.  -->
  <reg name="cpsr" bitsize="32" regnum="25"/>
</feature>

In arm-fpa.xml:

<?xml version="1.0"?>
<!DOCTYPE feature SYSTEM "gdb-target.dtd">
<feature name="org.gnu.gdb.arm.fpa">
  <reg name="f0" bitsize="96" type="float"/>
  <reg name="f1" bitsize="96" type="float"/>
  <reg name="f2" bitsize="96" type="float"/>
  <reg name="f3" bitsize="96" type="float"/>
  <reg name="f4" bitsize="96" type="float"/>
  <reg name="f5" bitsize="96" type="float"/>
  <reg name="f6" bitsize="96" type="float"/>
  <reg name="f7" bitsize="96" type="float"/>

  <reg name="fps" bitsize="32"/>
</feature>

The documentation:

Appendix F Self-Describing Targets
**********************************

One of the challenges of using GDB to debug embedded systems is that
there are so many minor variants of each processor architecture in use.
It is common practice for vendors to start with a standard processor
core -- ARM, PowerPC, or MIPS, for example -- and then make changes to
adapt it to a particular market niche.  Some architectures have
hundreds of variants, available from dozens of vendors.  This leads to
a number of problems:

   * With so many different customized processors, it is difficult for
     the GDB maintainers to keep up with the changes.

   * Since individual variants may have short lifetimes or limited
     audiences, it may not be worthwhile to carry information about
     every variant in the GDB source tree.

   * When GDB does support the architecture of the embedded system at
     hand, the task of finding the correct architecture name to give the
     `set architecture' command can be error-prone.

   To address these problems, the GDB remote protocol allows a target
system to not only identify itself to GDB, but to actually describe its
own features.  This lets GDB support processor variants it has never
seen before -- to the extent that the descriptions are accurate, and
that GDB understands them.

F.1 Retrieving Self-Descriptions
================================

GDB retrieves a target's self-description via the remote protocol using
a `qPart' request (*note the `qPart' request: qPart request.) of the
form:
     qPart:features:read:ANNEX:OFFSET,LENGTH
   where ANNEX is the string `target.xml'.  The OFFSET and LENGTH
parameters are the offset into the description and the number of bytes
to transfer, as for other `qPart' requests.
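   As an illustration only, here is a minimal Python sketch of how a
client might build such a request.  The function names are hypothetical.
It assumes that OFFSET and LENGTH are sent as hexadecimal numbers, as
is usual for remote protocol packets, and it wraps the payload in the
standard `$payload#checksum' envelope.

```python
def qpart_features_read(annex, offset, length):
    """Build the payload of a qPart features read request.

    Assumption: OFFSET and LENGTH are hexadecimal without a prefix,
    as is usual for numbers in remote protocol packets.
    """
    return "qPart:features:read:%s:%x,%x" % (annex, offset, length)


def frame_packet(payload):
    """Wrap a payload in the remote protocol envelope.

    The checksum is the sum of the payload bytes modulo 256,
    written as two hexadecimal digits.
    """
    checksum = sum(payload.encode("ascii")) % 256
    return "$%s#%02x" % (payload, checksum)


if __name__ == "__main__":
    # Ask for the first 0x100 bytes of the top-level annex.
    print(frame_packet(qpart_features_read("target.xml", 0, 0x100)))
```

   A client would repeat the request with increasing offsets until the
whole annex has been read; the reply format is not covered here.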

   The `target.xml' annex contains an XML document describing the
target's features; its form is described in *Note Self-Description
Format::.

   Feature descriptions may be split into several annexes, which GDB
retrieves and assembles into a complete description.  An annex may use
XML Inclusions (http://www.w3.org/TR/xinclude/) to incorporate other
annexes, much as a C header file refers to other headers using
`#include'.  GDB first retrieves `target.xml', and then makes further
`qPart' requests as needed to retrieve the annexes referred to by any
`xi:include' elements it finds.  Naturally, annexes brought in by
`xi:include' may use `xi:include' themselves.

   To reduce protocol overhead, a target may supply a special annex
named `CHECKSUMS' that provides 160-bit SHA1 checksum values for the
annexes it has available.  The `CHECKSUMS' annex contains a series of
newline-terminated lines, each of which contains a 40-digit hexadecimal
checksum, two spaces, and the name of an annex with the given checksum.
Here is an example `CHECKSUMS' annex:
     68c94ffc34f8ad2d7bfae3f5a6b996409211c1b1  target.xml
     0e8e850b0580fbaaa0872326cb1b8ad6adda9b0d  mmu.xml
     00f22e5f971ccec05c2acce98caf8cff4343c8cf  fpu.xml
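
   One way a stub author might generate such an annex is sketched
below in Python, using the standard `hashlib' module.  The function
name and the dictionary of annex contents are hypothetical; only the
line format (digest, two spaces, annex name, newline) comes from the
description above.

```python
import hashlib


def checksums_annex(annexes):
    """Return the text of a CHECKSUMS annex.

    `annexes' maps annex names to their contents as bytes.  Each
    output line is a 40-digit SHA-1 digest, two spaces, the annex
    name, and a newline.
    """
    lines = []
    for name, content in annexes.items():
        digest = hashlib.sha1(content).hexdigest()
        lines.append("%s  %s\n" % (digest, name))
    return "".join(lines)


if __name__ == "__main__":
    # Placeholder contents; a real stub would hash the actual annexes.
    print(checksums_annex({"target.xml": b"<target/>"}), end="")
```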

   GDB uses these checksums to avoid retrieving a given annex more than
once.  When GDB retrieves an annex, it caches its contents locally.
Then, each time GDB thinks the target architecture may have changed
(say, after making a new remote protocol connection, or after starting
a new child process using the extended remote protocol), it retrieves
the `CHECKSUMS' annex afresh.  If the checksums show that a particular
annex's contents are the same on the target and in GDB's cache, GDB
avoids fetching it again.  If none of the annexes have changed, GDB
needs to retrieve only the `CHECKSUMS' annex.

   `CHECKSUMS' need not provide a checksum for every annex available;
if a given annex is not mentioned, GDB will try to retrieve it each
time it thinks the target architecture may have changed.  The target
need not provide any `CHECKSUMS' annex at all; this is equivalent to an
empty `CHECKSUMS' annex.
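
   The caching rule can be sketched as follows.  This is a Python
illustration of the behavior described above, not GDB's
implementation; the function names and the in-memory cache (a
dictionary from annex name to bytes) are assumptions.

```python
import hashlib


def parse_checksums(text):
    """Parse a CHECKSUMS annex into {annex name: hex digest}."""
    table = {}
    for line in text.splitlines():
        if not line:
            continue
        # Each line is: digest, two spaces, annex name.
        digest, name = line.split("  ", 1)
        table[name] = digest.lower()
    return table


def needs_fetch(name, target_checksums, cache):
    """Decide whether an annex must be retrieved from the target.

    An annex is fetched when the target lists no checksum for it,
    when it is not cached, or when the cached copy's SHA-1 differs.
    """
    expected = target_checksums.get(name)
    if expected is None or name not in cache:
        return True
    return hashlib.sha1(cache[name]).hexdigest() != expected
```

   With an empty or missing `CHECKSUMS' annex, parse_checksums
returns an empty table, so every annex is fetched each time.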

F.2 Self-Description Format
===========================

A target description annex is an XML (http://www.w3.org/XML/) document
which complies with the Document Type Definition provided in the GDB
sources in `gdb/features/gdb-target.dtd'.  This means you can use
generally available tools like `xmllint' to check that your feature
descriptions are well-formed and valid.  However, to help people
unfamiliar with XML write descriptions for their targets, we also
describe the grammar here.

   At the moment, target descriptions can only describe register sets,
to be accessed via the remote protocol `g', `G', `p' and `P' requests.
We hope to extend the format to include other kinds of information,
like memory maps.

   Here is a simple sample target description:
     <?xml version="1.0"?>
     <!DOCTYPE target SYSTEM "gdb-target.dtd">
     <target>
       <feature name="bar">
         <reg name="s0" bitsize="32"/>
         <reg name="s1" bitsize="32" type="float"/>
       </feature>

       <feature-set>
         <feature-ref name="bar" base-regnum="1"/>
       </feature-set>
     </target>
   This describes a simple target feature set which only contains two
registers, named `s0' (a 32-bit integer register) and `s1' (a 32-bit
floating point register).

   A target description has the overall form:
     <?xml version="1.0"?>
     <!DOCTYPE target SYSTEM "gdb-target.dtd">
     <target>
       FEATURE...
       FEATURE-SET
     </target>
   The description is generally insensitive to whitespace and line
breaks, as is usual for XML documents.  The ellipsis (`...') after
FEATURE indicates that FEATURE may appear zero or more times.

   Each FEATURE names and describes a single feature of the target; at
the moment, features can only describe register sets.  The FEATURE-SET
cites particular features by name, pulling together a complete
description of the target.  A FEATURE has the form:
     <feature name="NAME">
       REG...
     </feature>
   This defines a feature named NAME; each feature's name must be
unique across the description.

   Each REG has the form:
     <reg name="NAME"
          bitsize="BITSIZE"
          [regnum="REGNUM"]
          [readonly="READONLY"]
          [save-restore="SAVE-RESTORE"]
          [type="TYPE"]
          [group="GROUP"]/>
   Items in [brackets] are optional.  The components are as follows:

NAME
     The register's name; it must be unique within the target
     description.

BITSIZE
     The register's size, in bits.

REGNUM
     The register's number.  If omitted, a register's number is one
     greater than that of the previous register; the first register's
     number defaults to zero.  These register numbers are relative to
     the `base-regnum' attribute of the `feature-ref' element that
     cites the feature, described below.

READONLY
     Whether the register is read-only or not; this must be either
     `yes' or `no'.  The default is `no'.

SAVE-RESTORE
     Whether the register should be preserved across inferior function
     calls; this must be either `yes' or `no'.  The default is `yes'.

TYPE
     The type of the register.  At the moment, TYPE must be either
     `int' or `float'.  The default is `int'.

GROUP
     The register group to which this register belongs.  At the moment,
     GROUP must be one of `general', `float', or `vector'.  If no GROUP
     is specified, GDB will select a register group based on the
     register's type.

   A FEATURE-SET binds together a set of features to describe a
complete target.  There can be only one FEATURE-SET in a target.  Each
FEATURE-SET has the form:
     <feature-set>
       FEATURE-REF...
     </feature-set>
   where each FEATURE-REF has the form:
     <feature-ref name="NAME" [base-regnum="N"]/>
   This means that the target includes the feature named NAME.  If the
`base-regnum' is present, that means that registers in the given
feature are numbered starting with N, until overridden by an explicit
register number.
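
   To make the numbering rules concrete, here is a small Python sketch
that assigns register numbers to the sample description shown earlier.
It reflects one reading of the rules above and is not GDB's parser:
the function name is hypothetical, a missing `base-regnum' is assumed
to mean zero, and the XML declaration and DOCTYPE are omitted for
brevity.

```python
import xml.etree.ElementTree as ET

SAMPLE = """\
<target>
  <feature name="bar">
    <reg name="s0" bitsize="32"/>
    <reg name="s1" bitsize="32" type="float"/>
  </feature>
  <feature-set>
    <feature-ref name="bar" base-regnum="1"/>
  </feature-set>
</target>
"""


def assign_register_numbers(xml_text):
    """Return {register name: register number} for a description."""
    root = ET.fromstring(xml_text)
    features = {f.get("name"): f for f in root.findall("feature")}
    numbers = {}
    for ref in root.findall("feature-set/feature-ref"):
        # Assumption: an absent base-regnum means zero.
        base = int(ref.get("base-regnum", "0"))
        next_relative = 0  # the first register defaults to zero
        for reg in features[ref.get("name")].findall("reg"):
            # An explicit regnum overrides the running count.
            relative = int(reg.get("regnum", next_relative))
            numbers[reg.get("name")] = base + relative
            next_relative = relative + 1
    return numbers


if __name__ == "__main__":
    print(assign_register_numbers(SAMPLE))
```

   For the sample, `s0' is numbered 1 and `s1' is numbered 2, because
the feature is cited with a `base-regnum' of 1.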

   It can sometimes be valuable to split a target description up into
several different annexes, either for organizational purposes, or to
allow GDB to cache portions of the description that change rarely.  To
make this possible, you can replace any feature description with an
inclusion directive of the form:
     <xi:include href="ANNEX"/>
   When GDB encounters an element of this form, it will retrieve the
annex named ANNEX (or use its cached copy), and replace the inclusion
directive with the contents of that annex.
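
   The substitution can be sketched in Python as shown below.  This is
an illustration under stated assumptions, not GDB's code: the function
name and the `fetch_annex' callback are hypothetical, the depth limit
is an arbitrary safety measure, and the sample declares the XInclude
namespace explicitly on the root element because Python's parser
requires the `xi' prefix to be bound.

```python
import xml.etree.ElementTree as ET

XI_INCLUDE = "{http://www.w3.org/2001/XInclude}include"


def expand_includes(elem, fetch_annex, depth=0):
    """Replace each xi:include child with the root of the named annex.

    `fetch_annex' takes an annex name and returns its XML text.
    Included annexes are expanded recursively.
    """
    if depth > 8:
        raise ValueError("xi:include nesting too deep")
    for i, child in enumerate(list(elem)):
        if child.tag == XI_INCLUDE:
            included = ET.fromstring(fetch_annex(child.get("href")))
            expand_includes(included, fetch_annex, depth + 1)
            included.tail = child.tail
            elem[i] = included  # same position, so indexes stay valid
        else:
            expand_includes(child, fetch_annex, depth)
    return elem


# Hypothetical annex contents standing in for data read from a target.
ANNEXES = {
    "target.xml": '<target xmlns:xi="http://www.w3.org/2001/XInclude">'
                  '<xi:include href="core.xml"/>'
                  '<feature-set><feature-ref name="core"/></feature-set>'
                  '</target>',
    "core.xml": '<feature name="core"><reg name="r0" bitsize="32"/></feature>',
}

if __name__ == "__main__":
    root = expand_includes(ET.fromstring(ANNEXES["target.xml"]),
                           ANNEXES.__getitem__)
    print(ET.tostring(root, encoding="unicode"))
```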

-- 
Daniel Jacobowitz
CodeSourcery


* Re: Self-describing targets - a more concrete proposal
  2006-03-29 21:26 Self-describing targets - a more concrete proposal Daniel Jacobowitz
@ 2006-03-29 21:49 ` Eli Zaretskii
  2006-03-29 22:27   ` Jim Blandy
  0 siblings, 1 reply; 9+ messages in thread
From: Eli Zaretskii @ 2006-03-29 21:49 UTC (permalink / raw)
  To: gdb

> Date: Wed, 29 Mar 2006 11:16:25 -0500
> From: Daniel Jacobowitz <drow@false.org>
> 
> I would appreciate comments on the sample and documentation

Comments on the documentation are below.  Note that I needed to guess
what was in the Texinfo source, since you posted the Info output, so I
could have guessed wrong, and my comments might thus be off the target.

> GDB retrieves a target's self-description via the remote protocol using
> a `qPart' request (*note the `qPart' request: qPart request.) of the form:

This cross-reference looks awkward.  I'm guessing that Jim used a
2-argument form of a @pxref here.  But the second arg is redundant
here because it is a substring of the 1st.  Am I missing some valid
reason for using the second argument?

>      qPart:features:read:ANNEX:OFFSET,LENGTH
>    where ANNEX is the string `target.xml'.  The OFFSET and LENGTH

The last line should have a @noindent before it.

> parameters are the offset into the description and the number of bytes
> to transfer, as for other `qPart' requests.
> 
>    The `target.xml' annex contains an XML document describing the
> target's features; its form is described in *Note Self-Description
> Format::.

There's something I don't understand here: is "target.xml" a literal
fixed string that will _always_ appear in the above packet?  If it is,
why do we need to mention its name?

>    To reduce protocol overhead, a target may supply a special annex
> named `CHECKSUMS' that provides 160-bit SHA1 checksum values for the
> annexes it has available.  The `CHECKSUMS' annex contains a series of
> newline-terminated lines, each of which contains a 40-digit hexadecimal
> checksum, two spaces, and the name of an annex with the given checksum.
> Here is an example `CHECKSUMS' annex:
>      68c94ffc34f8ad2d7bfae3f5a6b996409211c1b1  target.xml
>      0e8e850b0580fbaaa0872326cb1b8ad6adda9b0d  mmu.xml
>      00f22e5f971ccec05c2acce98caf8cff4343c8cf  fpu.xml

Shouldn't we document how to generate a checksum for a file?

Other than that, looks fine to me.


* Re: Self-describing targets - a more concrete proposal
  2006-03-29 21:49 ` Eli Zaretskii
@ 2006-03-29 22:27   ` Jim Blandy
  2006-03-29 22:32     ` Daniel Jacobowitz
  0 siblings, 1 reply; 9+ messages in thread
From: Jim Blandy @ 2006-03-29 22:27 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gdb

On 3/29/06, Eli Zaretskii <eliz@gnu.org> wrote:
> > Date: Wed, 29 Mar 2006 11:16:25 -0500
> > From: Daniel Jacobowitz <drow@false.org>
> >
> > I would appreciate comments on the sample and documentation
>
> Comments on the documentation are below.  Note that I needed to guess
> what was in the Texinfo source, since you posted the Info output, so I
> could have guessed wrong, and my comments might thus be off the target.

Thanks very much.  We wanted to get comments on the actual design
itself, so we posted legible text instead of a texinfo patch.  When it
comes down to posting the final patch, we'll certainly include the
patch to gdb.texinfo in the usual way.

> > GDB retrieves a target's self-description via the remote protocol using
> > a `qPart' request (*note the `qPart' request: qPart request.) of the form:
>
> This cross-reference looks awkward.  I'm guessing that Jim used a
> 2-argument form of a @pxref here.  But the second arg is redundant
> here because it is a substring of the 1st.  Am I missing some valid
> reason for using the second argument?

It does look awkward.  I wanted to use @code for qPart.  But I'm not
sure it's worth it; I've simplified it and it seems okay.

> >      qPart:features:read:ANNEX:OFFSET,LENGTH
> >    where ANNEX is the string `target.xml'.  The OFFSET and LENGTH
>
> The last line should have a @noindent before it.

Done.

> > parameters are the offset into the description and the number of bytes
> > to transfer, as for other `qPart' requests.
> >
> >    The `target.xml' annex contains an XML document describing the
> > target's features; its form is described in *Note Self-Description
> > Format::.
>
> There's something I don't understand here: is "target.xml" a literal
> fixed string that will _always_ appear in the above packet?  If it is,
> why do we need to mention its name?

We talk about GDB retrieving other annexes (annices?) later; that
request is used for all of them.  I'll try to rephrase this.

> >    To reduce protocol overhead, a target may supply a special annex
> > named `CHECKSUMS' that provides 160-bit SHA1 checksum values for the
> > annexes it has available.  The `CHECKSUMS' annex contains a series of
> > newline-terminated lines, each of which contains a 40-digit hexadecimal
> > checksum, two spaces, and the name of an annex with the given checksum.
> > Here is an example `CHECKSUMS' annex:
> >      68c94ffc34f8ad2d7bfae3f5a6b996409211c1b1  target.xml
> >      0e8e850b0580fbaaa0872326cb1b8ad6adda9b0d  mmu.xml
> >      00f22e5f971ccec05c2acce98caf8cff4343c8cf  fpu.xml
>
> Shouldn't we document how to generate a checksum for a file?

SHA-1 is the name of the specific hash function that must be used. 
I'll clear this up.

> Other than that, looks fine to me.

As always, thanks for the review!


* Re: Self-describing targets - a more concrete proposal
  2006-03-29 22:27   ` Jim Blandy
@ 2006-03-29 22:32     ` Daniel Jacobowitz
  2006-03-29 22:36       ` Rob Nikander
  0 siblings, 1 reply; 9+ messages in thread
From: Daniel Jacobowitz @ 2006-03-29 22:32 UTC (permalink / raw)
  To: Jim Blandy; +Cc: Eli Zaretskii, gdb

On Wed, Mar 29, 2006 at 01:26:36PM -0800, Jim Blandy wrote:
> > >    To reduce protocol overhead, a target may supply a special annex
> > > named `CHECKSUMS' that provides 160-bit SHA1 checksum values for the
> > > annexes it has available.  The `CHECKSUMS' annex contains a series of
> > > newline-terminated lines, each of which contains a 40-digit hexadecimal
> > > checksum, two spaces, and the name of an annex with the given checksum.
> > > Here is an example `CHECKSUMS' annex:
> > >      68c94ffc34f8ad2d7bfae3f5a6b996409211c1b1  target.xml
> > >      0e8e850b0580fbaaa0872326cb1b8ad6adda9b0d  mmu.xml
> > >      00f22e5f971ccec05c2acce98caf8cff4343c8cf  fpu.xml
> >
> > Shouldn't we document how to generate a checksum for a file?
> 
> SHA-1 is the name of the specific hash function that must be used. 
> I'll clear this up.

I think what Eli would like us to mention here would be "GNU Coreutils
provides an sha1sum utility which produces output in this format". 
Makes sense to me.  GNU coreutils is fairly portable, and some other
platforms probably provide SHA-1 utilities, so I don't think folks will
have a big problem generating these; we could even build sha1sum with
GDB if it would be helpful, but I'd rather not import that much of
coreutils if I can avoid it.

-- 
Daniel Jacobowitz
CodeSourcery


* Re: Self-describing targets - a more concrete proposal
  2006-03-29 22:32     ` Daniel Jacobowitz
@ 2006-03-29 22:36       ` Rob Nikander
  2006-03-29 22:41         ` Paul Koning
  0 siblings, 1 reply; 9+ messages in thread
From: Rob Nikander @ 2006-03-29 22:36 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: Jim Blandy, Eli Zaretskii, gdb


On Mar 29, 2006, at 4:49 PM, Daniel Jacobowitz wrote:
> Makes sense to me.  GNU coreutils is fairly portable, and some other
> platforms probably provide SHA-1 utilities, so I don't think folks  
> will
> have a big problem generating these; we could even build sha1sum with

FWIW, just the other day I had to copy some files to a Linux system  
from my Mac OS X laptop, because OS X seems to lack a sha1sum  
utility.  Both systems had md5.

Rob


* Re: Self-describing targets - a more concrete proposal
  2006-03-29 22:36       ` Rob Nikander
@ 2006-03-29 22:41         ` Paul Koning
  2006-03-29 23:38           ` Daniel Jacobowitz
  0 siblings, 1 reply; 9+ messages in thread
From: Paul Koning @ 2006-03-29 22:41 UTC (permalink / raw)
  To: rob; +Cc: gdb

>>>>> "Rob" == Rob Nikander <rob@encodia.biz> writes:

 Rob> On Mar 29, 2006, at 4:49 PM, Daniel Jacobowitz wrote:
 >> Makes sense to me.  GNU coreutils is fairly portable, and some
 >> other platforms probably provide SHA-1 utilities, so I don't think
 >> folks will have a big problem generating these; we could even
 >> build sha1sum with

 Rob> FWIW, just the other day I had to copy some files to a Linux
 Rob> system from my Mac OS X laptop, because OS X seems to lack a
 Rob> sha1sum utility.  Both systems had md5.

"apropos sha1" shows that Mac OS X can do SHA-1 digests:
	 openssh dgst -sha1 filename

(The same command with -md5 does md5 checksums, and the answer matches
md5sum, as well it should.)

	paul


* Re: Self-describing targets - a more concrete proposal
  2006-03-29 22:41         ` Paul Koning
@ 2006-03-29 23:38           ` Daniel Jacobowitz
  2006-03-30  0:15             ` Joseph S. Myers
  2006-03-30  7:11             ` Eli Zaretskii
  0 siblings, 2 replies; 9+ messages in thread
From: Daniel Jacobowitz @ 2006-03-29 23:38 UTC (permalink / raw)
  To: Paul Koning; +Cc: rob, gdb

On Wed, Mar 29, 2006 at 05:36:06PM -0500, Paul Koning wrote:
> "apropos sha1" shows that Mac OS X can do SHA-1 digests:
> 	 openssh dgst -sha1 filename

OpenSSL, you mean.  Nice trick.  The output format is different, but
that is not a big problem - we should probably mention both in the
manual.

-- 
Daniel Jacobowitz
CodeSourcery


* Re: Self-describing targets - a more concrete proposal
  2006-03-29 23:38           ` Daniel Jacobowitz
@ 2006-03-30  0:15             ` Joseph S. Myers
  2006-03-30  7:11             ` Eli Zaretskii
  1 sibling, 0 replies; 9+ messages in thread
From: Joseph S. Myers @ 2006-03-30  0:15 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: Paul Koning, rob, gdb

On Wed, 29 Mar 2006, Daniel Jacobowitz wrote:

> On Wed, Mar 29, 2006 at 05:36:06PM -0500, Paul Koning wrote:
> > "apropos sha1" shows that Mac OS X can do SHA-1 digests:
> > 	 openssh dgst -sha1 filename
> 
> OpenSSL, you mean.  Nice trick.  The output format is different, but
> that is not a big problem - we should probably mention both in the
> manual.

"gpg --print-md sha1 filename" can also be used for this purpose.

-- 
Joseph S. Myers
joseph@codesourcery.com


* Re: Self-describing targets - a more concrete proposal
  2006-03-29 23:38           ` Daniel Jacobowitz
  2006-03-30  0:15             ` Joseph S. Myers
@ 2006-03-30  7:11             ` Eli Zaretskii
  1 sibling, 0 replies; 9+ messages in thread
From: Eli Zaretskii @ 2006-03-30  7:11 UTC (permalink / raw)
  To: Paul Koning, rob; +Cc: gdb

> Date: Wed, 29 Mar 2006 17:41:14 -0500
> From: Daniel Jacobowitz <drow@false.org>
> Cc: rob@encodia.biz, gdb@sourceware.org
> 
> OpenSSL, you mean.  Nice trick.  The output format is different, but
> that is not a big problem - we should probably mention both in the
> manual.

Yes.
