SDTs with data types and argument names

public inbox for systemtap@sourceware.org

* SDTs with data types and argument names
@ 2019-12-19  3:00 Craig Ringer
  2019-12-20  4:13 ` Craig Ringer
  2020-01-09 18:46 ` Frank Ch. Eigler
  0 siblings, 2 replies; 6+ messages in thread
From: Craig Ringer @ 2019-12-19  3:00 UTC (permalink / raw)
  To: systemtap

SystemTap has inherited the dtrace decision to give SDTs anonymous
arguments of type 'long' and generic names like arg1, arg2, etc.

This makes sense if you're trying to be DTrace compatible, but I don't
think stap is really trying to be very dtrace-like at runtime.

It'd be great to expose the probe argument names and their data types to
systemtap when SDTs are generated from a probes.d file. It'd make sense to
offer the same capability to applications that define probes directly with
STAP_PROBE(...) etc in their own builds too.

The goal is to let you write

probe process("myapp").mark("some__tracepoint")
{
    printf("hit some__tracepoint(%s, %d)\n",
        user_string($useful_firstarg_name),
        $some_secondarg->somemember->somelongmember);
}

and display useful arg names and types in `stap -L` too.

Saving the argument names looks relatively simple in most cases. Define an
additional set of macros in the usual STAP_PROBE2() etc style like the
following pseudoishcode:

    #define STAP_PROBE2_ARGAMES(provider, probename, argname1, argname2) \
        const char *__stap_argnames_##provider##_##probename[2] \
            __attribute__ ((unused)) \
            __attribute__ ((section (".probes"))) \
            = { #argname1, #argname2 };

i.e. generate some constant data with the argument names in a global array
we can look up, based on the provider and probe name, when compiling the
tapscript.
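
A minimal, compilable sketch of that idea follows. It is an illustration
only: the DEMO_PROBE2_ARGNAMES macro, the demo_* symbol names and the lookup
helper are hypothetical and not part of <sys/sdt.h>, and the array is left in
the default data section rather than ".probes". It shows the stringized names
landing in an array whose symbol encodes the provider and probe, and how a
consumer could map a name back to an argument position.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical macro (not part of <sys/sdt.h>): stringize two argument
 * names into a constant array whose symbol name encodes provider and probe.
 * "used" asks the compiler to keep the array even if nothing references it. */
#define DEMO_PROBE2_ARGNAMES(provider, probename, argname1, argname2) \
    const char *const demo_argnames_##provider##_##probename[2] \
        __attribute__((used)) = { #argname1, #argname2 }

/* Invoked at global scope, outside any probe call site. */
DEMO_PROBE2_ARGNAMES(myprovider, myprobe, foo, something);

/* Compile-time check: one array entry per probe argument. */
static_assert(sizeof(demo_argnames_myprovider_myprobe) == 2 * sizeof(char *),
              "one entry per probe argument");

/* Return the 0-based position of the argument called name, or -1 if the
 * probe has no argument with that name. */
int demo_myprobe_argindex(const char *name)
{
    for (int i = 0; i < 2; i++)
        if (strcmp(demo_argnames_myprovider_myprobe[i], name) == 0)
            return i;
    return -1;
}
```

A translator could use a lookup of this shape to resolve $foo to the first
probe argument while still accepting $arg1.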

The 'dtrace' script could emit these automatically into the generated
probes.h and the compiler would de-duplicate them at link-time. But it'd be
cleaner if they were embedded into the .o optionally generated by the
dtrace script.

A nearly identical approach could be used to give systemtap access to the
textual datatype names for probes declared in probes.d. Or we could even
use gcc's __typeof__ to derive them.

Applications that wanted to expose type and arg info for probes would have
to do so explicitly by invoking STAP_PROBEn_ARGAMES(...) and
STAP_PROBEn_ARGTYPES(...) with the names and types of the probe somewhere
in global scope, outside the probe callsite. Which is a bit inconvenient,
but not that hard.

That is, unless there's some way we can escape the function scope in which
the STAP_PROBEn(...) macro is invoked and define global symbols. I've asked
for ideas about that here: https://stackoverflow.com/q/59402666/398670 .
If that's possible then ideally I'd like to use the gcc __typeof__ operator
to autogenerate typed symbols for each argument or to generate a
const array of type names. Also to have a variant that uses token pasting
to derive the argument names automatically, though we'd still need a
variant that lets you specify them explicitly for when the args are
expressions not simple variable name tokens.
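
As a rough sketch of the "typed symbol per argument" idea, assuming gcc or a
compatible compiler: the DEMO_PROBE_ARGTYPE macro, the demo_argtype_*
variables and both structs below are made up for illustration. The macro
declares a static pointer whose pointee type is taken from the argument
expression with __typeof__, and token pasting supplies the argument name.
Whether a tool can then recover the type from that symbol depends on
debuginfo being present.

```c
#include <assert.h>
#include <stddef.h>

/* Made-up application types standing in for real probe arguments. */
struct something { int somevalue; };
struct thing { const char *foo; };

/* Hypothetical helper: declare a static pointer whose pointee type is the
 * type of the probe argument expression. The variable is never read; it
 * only attaches the argument's type to a named symbol. */
#define DEMO_PROBE_ARGTYPE(argname, expr) \
    static __typeof__(expr) *demo_argtype_##argname __attribute__((used))

/* Stand-in for a function containing a probe call site. Returns 1 when
 * both captured types are the ones we expect, 0 otherwise. */
int demo_probe_site(struct thing *thing, struct something *s)
{
    assert(thing != NULL && s != NULL);

    DEMO_PROBE_ARGTYPE(foo, thing->foo);   /* pointee: const char * */
    DEMO_PROBE_ARGTYPE(something, s);      /* pointee: struct something * */

    /* _Generic selects on the static type, so this is decided at compile
     * time; nothing about the arguments is evaluated at run time. */
    return _Generic(demo_argtype_foo, const char **: 1, default: 0)
        && _Generic(demo_argtype_something, struct something **: 1, default: 0);
}
```

The names here are given explicitly, which is the variant needed when an
argument is an expression such as thing->foo rather than a plain variable.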

So my hope is it'll be possible to write

    STAP_PROBE2(myprovider, myprobe, thing->foo, "foo",
                get_something(), "something");

and have stap record the supplied argnames and infer the typeinfo then
record that too, so it can look it up during tapscript translation. Instead
of

    x = user_string($arg1)
    y = @cast($arg2, "SomethingType@my.c", "/path/to/myprogram")->somevalue

you'd be able to write

    x = user_string($foo)
    y = $something->somevalue

and perhaps even more importantly, `stap -L` could show useful type and
argname info for probes so they'd serve as documentation of sorts.

-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 2ndQuadrant - PostgreSQL Solutions for the Enterprise


* Re: SDTs with data types and argument names
  2019-12-19  3:00 SDTs with data types and argument names Craig Ringer
@ 2019-12-20  4:13 ` Craig Ringer
  2020-01-09 18:46 ` Frank Ch. Eigler
  1 sibling, 0 replies; 6+ messages in thread
From: Craig Ringer @ 2019-12-20  4:13 UTC (permalink / raw)
  To: systemtap

On Thu, 19 Dec 2019 at 11:00, Craig Ringer <craig@2ndquadrant.com> wrote:

>
> That is, unless there's some way we can escape the function scope in which
> the STAP_PROBEn(...) macro is invoked and define global symbols. I've asked
> for ideas about that here: https://stackoverflow.com/q/59402666/398670 .
>

Looks like that's quite practical using the existing .pushsection and
.popsection features used in the existing <sys/sdt.h>. If building without
__ASSEMBLER__ we would treat STAP_PROBEn_ARGINFO(...) the same as
STAP_PROBEn(...) i.e. not generate arg info. But we could still emit it for
probes defined via a probes.d .

Will try to find time to draft a patch. My first foray into asm and custom
ELF sections...
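
A rough sketch of the .pushsection / .popsection approach, assuming a GNU
toolchain on an ELF target such as Linux. The section name demo_argnames,
the "provider:probe:names" record layout and every DEMO_/demo_ identifier
are invented for this example and are not what <sys/sdt.h> emits. The point
is only that inline asm inside a function body can place data in a separate
section, and that the linker-provided __start_/__stop_ symbols let other
code walk it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* GNU ld defines these bounds for any output section whose name is a
 * valid C identifier. */
extern const char __start_demo_argnames[];
extern const char __stop_demo_argnames[];

/* Hypothetical macro: from inside a function, switch to the demo_argnames
 * section, emit one NUL-terminated "provider:probe:names" record, and
 * switch back. No instructions are added at the call site. */
#define DEMO_RECORD_ARGNAMES(provider, probe, names) \
    __asm__ volatile ( \
        ".pushsection demo_argnames,\"a\",\"progbits\"\n\t" \
        ".asciz \"" #provider ":" #probe ":" names "\"\n\t" \
        ".popsection")

/* Stand-in for application code that contains a probe call site. */
int demo_do_work(int foo)
{
    DEMO_RECORD_ARGNAMES(myprovider, myprobe, "foo,something");
    return foo + 1;
}

/* Walk the records in the section and return the argument-name list that
 * follows the given "provider:probe:" key, or NULL if there is none. */
const char *demo_lookup_argnames(const char *key)
{
    assert(key != NULL);
    size_t keylen = strlen(key);
    const char *p = __start_demo_argnames;
    while (p < __stop_demo_argnames) {
        if (strncmp(p, key, keylen) == 0)
            return p + keylen;
        p += strlen(p) + 1;   /* skip to the next record */
    }
    return NULL;
}
```

If the compiler inlines or clones the enclosing function, the record may be
emitted more than once; the lookup above simply returns the first match.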


* Re: SDTs with data types and argument names
  2019-12-19  3:00 SDTs with data types and argument names Craig Ringer
  2019-12-20  4:13 ` Craig Ringer
@ 2020-01-09 18:46 ` Frank Ch. Eigler
  2020-01-13  5:28   ` Craig Ringer
  1 sibling, 1 reply; 6+ messages in thread
From: Frank Ch. Eigler @ 2020-01-09 18:46 UTC (permalink / raw)
  To: Craig Ringer; +Cc: systemtap


Hi -

> It'd be great to capture the probe argument names and their data types to
> systemtap when SDTs are generated from a probes.d file. It'd make sense to
> expose this capability for when probes are defined with STAP_PROBE(...) etc
> in their own builds too.

Yeah.  I believe there was a kernel-bpf-oriented group last year, who
were speculating about extending sdt.h in a similarly motivated way.


> The goal is to let you write
>
> probe process("myapp").mark("some__tracepoint")
> {
>     printf("hit some__tracepoint(%s, %d)\n",
>         user_string(useful_firstarg_name),
>         some_secondarg->somemember->somelongmember);
> }
> and display useful arg names and types in `stap -L` too.

Note that one point of the sdt.h structure was to make the executables
self-sufficient with respect to extracting this data, even if there is
no debuginfo available.  Adding type names can only work if that
debuginfo is available after all, or else if it's synthetically
generated via @cast("<foo.h>") type constructs.


> Saving the argument names looks relatively simple in most cases. Define an
> additional set of macros in the usual STAP_PROBE2() etc style like the
> following pseudoishcode:
>
>     STAP_PROBE2_ARGAMES(provider, probename, argname1, argname2) \
>         const char "__stap_argnames_" ## provider ## "_" ## probename ##
> [2][] \
>               = { #argname1, #argname2 } \
>         __attribute__ ((unused)) \
>         __attribute__ ((section (".probes")));
>
> i.e generate some constant data with the probe names in a global array we
> can look up when compiling the tapscript based on the provider and probe
> name.

Yeah, that's a sensible way of doing it, without creating a new note
format or anything.  It's important that the section be marked with
attributes that will force it to be pulled into the main executable
via the usual linker scripts.
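
One way this could look from C, sketched under assumptions: gcc on an ELF
target, a made-up section name demo_names, and hypothetical DEMO_/demo_
identifiers throughout. The "used" attribute stops the compiler from
discarding the otherwise unreferenced arrays; it does not by itself protect
the section from linker garbage collection, which would need separate
handling.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Bounds of the demo_names output section, provided by GNU ld because the
 * section name is a valid C identifier. */
extern const char __start_demo_names[];
extern const char __stop_demo_names[];

/* Hypothetical macro: place a comma-separated argument-name string in the
 * demo_names section and mark it "used" so the compiler keeps it. */
#define DEMO_ARGNAMES(provider, probename, names) \
    static const char demo_names_##provider##_##probename[] \
        __attribute__((used, section("demo_names"))) = names

DEMO_ARGNAMES(myprovider, myprobe, "foo,something");
DEMO_ARGNAMES(myprovider, otherprobe, "count");

/* Return 1 if needle is stored as a whole NUL-terminated record in the
 * demo_names section of this executable, 0 otherwise. */
int demo_names_contains(const char *needle)
{
    assert(needle != NULL);
    const char *p = __start_demo_names;
    while (p < __stop_demo_names) {
        if (strcmp(p, needle) == 0)
            return 1;
        p += strlen(p) + 1;   /* next record; also steps over zero padding */
    }
    return 0;
}
```

Both strings end up in the linked executable even though no C code refers
to the arrays by name, which is the property the section needs.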

> [...]
> So my hope is it'll be possible to write
>
>     STAP_PROBE2(myprovider, myprobe, thing->foo, "foo", get_something(),
> "something");
>
> and have stap record the supplied argnames and infer the typeinfo then
> record that too, so it can look it up during tapscript translation.

(FWIW, I wouldn't consider it a failure if the typeinfo has to be
manually added.)


- FChE


* Re: SDTs with data types and argument names
  2020-01-09 18:46 ` Frank Ch. Eigler
@ 2020-01-13  5:28   ` Craig Ringer
  2020-01-13 20:54     ` Frank Ch. Eigler
  0 siblings, 1 reply; 6+ messages in thread
From: Craig Ringer @ 2020-01-13  5:28 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: systemtap

On Fri, 10 Jan 2020 at 02:46, Frank Ch. Eigler <fche@redhat.com> wrote:

> > It'd be great to capture the probe argument names and their data types to
> > systemtap when SDTs are generated from a probes.d file. It'd make sense to
> > expose this capability for when probes are defined with STAP_PROBE(...) etc
> > in their own builds too.
>
> Yeah.  I believe there was a kernel-bpf-oriented group last year, who
> were speculating extending sdt.h in a similarly motivated way.
>

Good to know. Any idea who may've been involved? It'd be good to
collaborate and not duplicate work or explore a dead-end already followed.

> > The goal is to let you write
> >
> > probe process("myapp").mark("some__tracepoint")
> > {
> >     printf("hit some__tracepoint(%s, %d)\n",
> >         user_string(useful_firstarg_name),
> >         some_secondarg->somemember->somelongmember);
> > }
> > and display useful arg names and types in `stap -L` too.
>
> Note that one point of the sdt.h structure was to make the executables
> self-sufficient with respect to extracting this data, even if there is
> no debuginfo available.  Adding type names can only work if that
> debuginfo is available after all, or else if it's synthetically
> generated via @cast("<foo.h>") type constructs.
>

Indeed. And the latter option is hairy for complex and portable software:
you must get exactly the right header version, but you must also ensure you
have any number of preprocessor macros etc set precisely the same. There
can be header inclusion order considerations and more. I'm very reluctant
to use the automated header processing features.

Without debuginfo we'd still get useful argument names, which would IMO be
exceedingly useful. stap could expose them as $theArgName and still expose
them as $arg1 etc for backward compatibility, so that wouldn't upset anyone.
It might also let stap handle narrower integer types better. And *if*
debuginfo was present, it could allow the user to traverse structs etc via
$theArgName->member->foo.

I don't know of any way to ask gcc/gdb/binutils/etc to retain a subset of
debuginfo in an executable when it's being stripped, and I doubt that'd be
popular or accepted anyway. Where would you stop? In many cases the
immediate struct would be of little value without type info for its member
types and their member types and so on. So I realise that it's no
substitute for debuginfo, and doesn't make it possible to get full
functionality without it.

What it _should_ do is put static probes on an equal footing with DWARF
probes when debuginfo is present. Right now they're inferior in quite a
number of ways: no argument names, no argument types without explicit and
verbose casting, representations in monitor mode are hex statement
positions not probe names, and more.

> > Saving the argument names looks relatively simple in most cases. Define an
> > additional set of macros in the usual STAP_PROBE2() etc style like the
> > following pseudoishcode:
> >
> >     STAP_PROBE2_ARGAMES(provider, probename, argname1, argname2) \
> >         const char "__stap_argnames_" ## provider ## "_" ## probename ##
> > [2][] \
> >               = { #argname1, #argname2 } \
> >         __attribute__ ((unused)) \
> >         __attribute__ ((section (".probes")));
> >
> > i.e generate some constant data with the probe names in a global array we
> > can look up when compiling the tapscript based on the provider and probe
> > name.
>
> Yeah, that's a sensible way of doing it, without creating a new note
> format or anything.  It's important that the section be marked with
> attributes that will force it to be pulled into the main executable
> via the usual linker scripts.
>

I'll look into that.

This won't be something I can leap to do in a hurry as I have to fit it in
bits and pieces around main deliverables. I'm sure you know the feeling.
But I'm keen to work on it when I get the chance.

-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 2ndQuadrant - PostgreSQL Solutions for the Enterprise


* Re: SDTs with data types and argument names
  2020-01-13  5:28   ` Craig Ringer
@ 2020-01-13 20:54     ` Frank Ch. Eigler
  2020-01-13 21:08       ` Frank Ch. Eigler
  0 siblings, 1 reply; 6+ messages in thread
From: Frank Ch. Eigler @ 2020-01-13 20:54 UTC (permalink / raw)
  To: Craig Ringer; +Cc: systemtap

Hi -

> > Yeah.  I believe there was a kernel-bpf-oriented group last year, who
> > were speculating extending sdt.h in a similarly motivated way.
> 
> Good to know. Any idea who may've been involved? It'd be good to
> collaborate and not duplicate work or explore a dead-end already followed.

https://web.archive.org/web/20190528152614/http://vger.kernel.org/lpc-bpf2018.html#session-11

"enhancing user defined tracepoints"

(h/t serhei) (btw, where did vger itself go???  did it merge with
Decker and disappear into another dimension?)


> Indeed. And the latter option is hairy for complex and portable software:
> you must get exactly the right header version, but you must also ensure you
> have any number of preprocessor macros etc set precisely the same. There
> can be header inclusion order considerations and more. I'm very reluctant
> to use the automated header processing features.

Those provisos are all valid, yet it turns out to be useful & capable
a lot of the time.  If there is a "-devel" level packaged set of
headers, they should be well enough engineered to let this work.


> [...]  I don't know of any way to ask gcc/gdb/binutils/etc to retain
> a subset of debuginfo in an executable when it's being stripped, and
> I doubt that'd be popular or accepted anyway. [...]

See "BTF" and "CTF" for two efforts to keep some wee subsets of
debuginfo on the installation medium.  And see
debuginfod.systemtap.org :-) for a distribution vehicle for full
mainstream debuginfo.


- FChE


* Re: SDTs with data types and argument names
  2020-01-13 20:54     ` Frank Ch. Eigler
@ 2020-01-13 21:08       ` Frank Ch. Eigler
  0 siblings, 0 replies; 6+ messages in thread
From: Frank Ch. Eigler @ 2020-01-13 21:08 UTC (permalink / raw)
  To: Craig Ringer, systemtap

Hi -

> > Good to know. Any idea who may've been involved? It'd be good to
> > collaborate and not duplicate work or explore a dead-end already followed.
> 
> https://web.archive.org/web/20190528152614/http://vger.kernel.org/lpc-bpf2018.html#session-11
> 
> "enhancing user defined tracepoints"

I believe this was also:

https://www.linuxplumbersconf.org/event/2/contributions/123/

- FChE
