From: Craig Ringer <craig@2ndquadrant.com>
To: systemtap@sourceware.org
Subject: Newbie Notes
Date: Mon, 28 Oct 2019 07:19:00 -0000
Message-ID: <CAMsr+YEm42ERBdSH22=G9X84GvSGiV0ExPkaULyUyGvFqvLAQw@mail.gmail.com>

Hi all

I've spent the last week or so getting into SystemTap, which was on my
"to learn" list for quite some time. I'm amazed by how powerful it is.
I also know that feedback from people who are learning my own software
is useful, so I thought I'd share some of the challenges and
roadblocks I experienced along the way and some suggestions for docs
etc.

I also have some open questions I'd love some suggestions on. I'll
lead with them, then get into my notes on things I stumbled over when
learning. For context, I'm writing probes against PostgreSQL, which is
a multi-processing fork() based server, so I'm pretty much always
interested in the state of many processes and their interactions.

Q1: Can I make a probe conditional per-pid in the "if" clause? What
*can* I do there?
----

Say I have some associative array "interesting_procs" and I don't want
to see "handle_some_event()" calls for any other PIDs.

I expected

  probe process("myapp").function("handle_some_event")
      if (pid() in interesting_procs) {
    ....
  }

to work, but pid() isn't allowed in a probe condition. So you can't
seem to use globals to toggle probes on a finer scope than
whole-script-globally.

The docs don't really explain what you can and cannot do in a probe if
condition.

Q2: Can I "return" or "exit" from a probe in a handler body or probe
alias handler body?
----

Assuming there's no good answer to Q1, is there any way to "return"
early from a probe body? Especially from a probe alias handler, so the
alias can cause the main probe body to be skipped entirely?

I know I can set a variable in the alias body, then test it in the
main probe body, but that's ugly and cumbersome. Is there a better
way?

e.g. if I have:

probe myproc.something = process("myproc").function("something") {
    is_interesting = (@var("foo"));
}

probe myproc.something {
    if (is_interesting) {
        printf("XXX....");
    }
}

I'd like to be able to write this instead:

probe myproc.something = process("myproc").function("something") {
  if (@var("foo") == 0)
    return;
}

probe myproc.something {
    printf("XXX....");
}

Especially since the unused variables warnings stap emits mean that
probe alias handlers that define variables tend to create huge amounts
of warnings spam if some users of the alias don't care about all the
variables.

Q3: Some way to mark individual variables to suppress unused warnings?
---------------

In probe alias bodies it's often useful to define a set of variables
for use by the probe(s) that use the alias. But stap likes to spam
warnings about these if not all of them are used by all probes.

-w is a very heavy hammer to use for this as it hides many legitimate
issues too.

Is there a way to suppress unused warnings only for alias bodies? Or
preferably, mark individual variables as "don't warn if unused"?

Q4: tapscript backtraces from runtime faults, errors, etc, macro
expanded-from for errors at runtime?
--------------

Say I have some macro @MY_MACRO(x) or some function my_function(x).

I do something in the macro that raises an error() or fault.

The error from stap only reports the line number of the macro or function.

Is there any way to see where it was called from too? Or, for macro
expansions, to show the "expanded from" text for runtime errors like
are produced in compile-time errors?

Without this, tracking down a fault arising from some widely used
macro or function is ... painful.

For functions using ubacktrace() etc sometimes helps; it doesn't tell
you anything about the stap state but it offers hints about what probe
might've called the function. But that's a pretty backwards way of
doing it.

Q5: is there any way to get the len()/size() of a non-statistical array?
-------

Say I have a non-statistical associative array "arr". Can I get a
membership count with anything other than

    n = 0;
    foreach ([x] in arr) { n++; };
    printf("length is %d\n", n);

Q6: is there any way to get the executable-path of the current pid() in a probe
--------

Just that.

Q7: Is there a string-concatenation operator that works at parse-time
with macro-expansion?
--------

Say I want to use

   @define MY_PROG_BASEPATH %( @1 %)
   @define MY_PROG(prog) %( @MY_PROG_BASEPATH @prog %)

Q8: nicer way to warn() or error() with params than sprintf?
--------

If I have a warn(), error() or assert() I want to include the values
of variables in, is there a nicer way to do it than assert(condition,
sprintf(...)) ?

Especially for assert.

Q9: Can I iterate over a distinct-values slice of an array?
-------

Say I have an array arr[x,y,z]. I want to iterate over all distinct
"x" without repeats.

There are no local associative array vars.

AFAICS I have to

last_x = 0;
foreach ([x+,y,z] in arr) {
  if (x != last_x) {
    printf("found distinct x: %d\n", x);
    last_x = x;
  }
}

which is kinda clunky, especially as I don't have much of a way to
wrap it up in a generic/reusable way.

Is there a better way?

Q10: How does the interpretation of probe bodies and function bodies differ?
-------

From what I've been able to tell, functions get expanded into probe
bodies, somewhat macro-like.

How do functions and macros differ? What can be done in probe bodies
but not functions? There are clearly some differences since as my
prior mail notes, functions resolve @cast and @var differently.

Q11: How to "fire" an "event" from a tapset?
--------

Say I have some complex condition or event that doesn't lend itself
well to being written as a probe alias. Especially with the matters in
Q1, Q2 and Q3 making it hard to present the main handler for the alias
with a consistent interface where it's only invoked at the right
times, and always with the same vars defined.

Is using aliases the wrong approach? If you want users to be able to
say "run this handler when event y occurs", and event Y may happen 6
different ways with different relevant local variables etc each time,
how do you do it?

Can a tapscript "fire" a synthetic probe? And put it in a context
where it can access a process's globals etc?

Q12: How to make a probe or probe alias show up or not show up in --monitor
--------

--monitor mode seems to have some kind of magic for deciding what
probes it shows hits for.

So far I've only seen it produce output for SDT markers, not function
probes, function return probes, etc.

How does it find out when a probe is hit and count it? Is there any
way to suppress a probe as a "hit" within the body, given that per Q1
the probe conditional expression seems pretty limited.

I didn't find much at all about --monitor mode yet.

Q13: Easy way to handle enums?
--------

Assume that project X has complex headers that don't lend themselves
well for inclusion in an embedded-C tapset, you're more than a bit
nervous about doing so for a large/complex project, you want to handle
different versions of the project without the user having to dig out
the right headers, etc etc.

Is there any better way to define individual enums in a tapset script
than translating

enum MyProgramEnum {
   XX,
   YY,
   ZZ
};

manually to

@define MyProgramEnum_XX %( 0 %)
@define MyProgramEnum_YY %( 1 %)
@define MyProgramEnum_ZZ %( 2 %)

I'm aware of the @cast header-inclusion support, but it makes me
exceedingly nervous to use it. I'd probably land up using gdb script
to generate faked-up headers with just the enum definitions or
something, but that's pretty horrible too...

Q14: Handling differences in target program version
------------

For the sake of an example, say that PostgreSQL version 11 has an
argument / global variable "foo", and in PostgreSQL 12 it's gone,
replaced with "foo_defined" and "foo_value". Details don't matter,
point is that one symbol vanished, two different ones replace it.

Is there any way to handle these version differences by making probes
conditional on target expressions/values? E.g. "use this probe if
@var("pg_version_num") > 120000, otherwise this other probe" ?

Or is it necessary to use %( conditionals %) and have the user specify
the inputs as commandline arguments?

AFAICS you can't try{}catch{} a @var expansion, etc. as they're
compile-time issues not runtime faults.

Q15: variable/dynamic probes
----------

Say I want to probe a bunch of different binaries within some basedir
I want to pass as a script argument (or better yet, resolve from the
PATH of a target binary).

It seems I can't just

    alias pg.backend = process(@1 "/bin/myproc");

I can:

   @define PGBIN %( @1 %)
   alias pg.backend = process(@PGBIN)

but because string literal auto-concatenation doesn't happen for
macro-expanded strings, I cannot then:

   @define PGBASEDIR %( @1 %)
   alias pg.backend = process(@PGBASEDIR "/bin/postgres")

so I'm yet to find a good way to handle things like paths for process
probes. Ideas?

Is this not possible without some kind of not-currently-extant
operator for parse-time concatenation of macro-expanded strings?

New user experience, docs suggestions
===============================

Document stringifying numbers
----------

Concatenation with . doesn't stringify numbers.

Document that you just use sprintf to do it, or you may want to use
the "println" statement that automatically handles concatenation and
stringification.
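
A minimal sketch of the sprintf approach, assuming nothing beyond the
core language; the variable names are only illustrative:

```systemtap
# Build a string from a number with sprintf(), then concatenate it
# with the "." operator. Using "." directly on n would be a type error.
probe begin {
  n = 42
  msg = "answer: " . sprintf("%d", n)
  println(msg)
  exit()
}
```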

Array wildcards, iterate-with-return need to be documented in BG
-----------

I found systemtap infinitely easier to use once I stumbled across the
array wildcard support. This would've been immensely useful early on
and I think it should be highlighted in the beginners' guide:

private arr;

arr[pid(),"foo"] = 42;

foreach (val = [pid, ph] in arr) {
}

foreach (val = [pid, ph] in arr[pid(),*]) {
}

foreach (val = [pid, ph] in arr[*,"foo"]) {
}

if ([*,*] in arr) {
  // true if array non-empty
}

if ([pid(),*] in arr) {
  // true if any entry for pid() in array
}

delete arr[pid(), *];

delete arr[*,*];

Document that $1 ... $n and $# are usable in preprocessor
---------

I tried to write code that used $# to decide whether $2, etc were
valid and got very confused that stap kept spamming warnings.

It seems you're supposed to use these in conditionals? Like

        myvar = %( $# >2 %? @3 %: "" %)

or equivalent conditionally-compiled blocks.

This could use some docs where script-args are mentioned and in the
conditional compilation section, also maybe a hint when stap warns of
unused arguments during compilation.

Document that probe definitions cannot reference variables
-----------

When getting started, it's surprising that you cannot

    pgbasedir = "/path/to/postgres";
    alias pg.backend = process($pgbasedir . "/bin/postgres")

and it might be nice to cover what you can and cannot do in probe arguments.

No process-local / thread-local vars
------------

Any non-probe-scoped variable, whether "global" or "private", is
global to the whole tapscript. There's AFAICS no way to instance such
variables per-process or per-thread automatically. So the beginners
guide should mention this and suggest the pattern:

    some_global[pid()] = "value";

and

   if ( pid() in some_global )

and

   my_some_global = some_global[pid()];

and warn that users should add a probe that deletes the value when the
pid exits so that the array doesn't fill up, e.g.

probe process("myproc").end {
  delete some_global[pid()];
}
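
Put together, the pattern might look something like the sketch below.
The function names "handle_some_event" and "other_event" are
hypothetical placeholders, not real symbols in any program:

```systemtap
# Per-process state emulated with a global array keyed by pid().
global some_global

# Record a value for the current process (hypothetical function name).
probe process("myproc").function("handle_some_event") {
  some_global[pid()] = "value"
}

# Only act for processes that have an entry (hypothetical function name).
probe process("myproc").function("other_event") {
  if (pid() in some_global)
    printf("%d: %s\n", pid(), some_global[pid()])
}

# Drop the entry when the process exits so the array doesn't fill up.
probe process("myproc").end {
  delete some_global[pid()]
}
```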

Of course having syntax to wrap this or even better, having lockless
thread-local/process-local variables, would be way better.

SDT probe data types and argument names:
-----------

SDT probes have arguments $arg1... $argn, and no data type information
is preserved. That's exceedingly unfortunate as they're otherwise a
super useful tool. Is it feasible to do anything like embed the param
names and types in the probes ELF section or an extension section of
it, for reading by stap?

Alternately, though not as good, would be for the 'dtrace' tool to
generate a tapset with aliases and @casts from the .d file. I may
write a script for this; if I do, I'll post it here.

String access: user_string, user_string_warn
-------------

It'd be good to make these more prominent, most importantly in the
target variables section of the beginners guide
https://sourceware.org/systemtap/SystemTap_Beginners_Guide/targetvariables.html

Also warn about the @var caveat my immediately prior mail mentioned,
re needing absolute paths for modules and not doing PATH-expansion
when using @var from a function().

Target array access
-------------

It'd be nice if the docs mentioned how to access target array
variables or array-valued struct members. It's fairly obvious for
locals:

    $targetvar[index];

    $targetstruct->arraymember[index];

For @var and @cast it works similarly, but will result in confusing
failures to resolve the variable or type if you do it wrong. Parens
are allowed and seem to be useful (needed?). E.g.

    (@var("globalarray"))[idx]

    (@var("globalstruct")->arraymember)[idx]

Deref'ing scalar as array produces confusing symbol not found error
-------

If you attempt to

    @var("globalstruct")[0]

stap will report that it cannot find the symbol "globalstruct" even
though it's right there. The real error seems to be that it can't cope
with the attempt to dereference it as an array when it's a scalar.

e.g. if globalstruct is

struct globalstructtype {
     my_type varlen_array_member[0];
};

struct globalstructtype *globalstruct;

then forgetting to reference the member and using

    @var("globalstruct")[0]

produces a symbol resolution error for globalstruct, which is confusing.

Typecasting
-----------

Same warning for @cast as for @var re functions, module paths.

Warn that @cast to a pointer-hiding typedef will produce confusing
results, strange faults, etc.

Queue stats
-------------

tapset crosslinking
---------------

It'd be nice to see more x-linking of important tapset info in the
main docs and beginner docs, e.g. linking to

-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 2ndQuadrant - PostgreSQL Solutions for the Enterprise
