public inbox for systemtap@sourceware.org
 help / color / mirror / Atom feed
From: Craig Ringer <craig@2ndquadrant.com>
To: "Frank Ch. Eigler" <fche@redhat.com>
Cc: systemtap@sourceware.org
Subject: Re: Newbie Notes
Date: Thu, 31 Oct 2019 14:01:00 -0000	[thread overview]
Message-ID: <CAMsr+YEJUuncM_XB2WM=gk9ja2MeaqYBj6swzjG=CnmmcPQtWw@mail.gmail.com> (raw)
In-Reply-To: <87bltyf0zp.fsf@redhat.com>

On Thu, 31 Oct 2019 at 04:10, Frank Ch. Eigler <fche@redhat.com> wrote:

>
>
> > Q1: Can I make a probe conditional per-pid in the "if" clause? What
> > *can* I do there?
>
> The [man stap] page covers probe conditionals this way:
>
>        Probes may be decorated with an arming condition, consisting of a
>        simple boolean expres‐sion on read-only global script variables.
>        While disarmed (inactive, condition evaluates to false), some
>        probe types reduce or eliminate their run-time overheads.  When
>        an arming condition evaluates to true, probes will be soon
>        re-armed, and their probe handlers will start getting called as
>        the events fire.  [...]
>
> The key is that these conditions are -arming conditions-.  They cannot
> possibly use contextual values such as pid(), because in order to even
> evaluate that condition, the probe would have to be armed & running!  In
> contrast, "boolean expressions on read-only global variables" may be
> evaluated anywhere: namely at the conclusion of all the probes that
> actually modify those variables.  Then those probes can enqueue an
> arm/disarm operation.
>

OK, that makes sense. I read over the manpage a few times, but I must've
missed that. My apologies.


> > Q2: Can I "return" or "exit" from a probe in a handler body or probe
> > alias handler body?
>
> "next", as in awk.  This too is in [man stap].
>

Ah! Thanks. Right there in "STATEMENTS" too.

Also in https://sourceware.org/systemtap/langref/6_Statement_types.html .

Chalk one up to me being blind. My apologies. It might be handy to mention
the keyword "return" or "exit" there for search convenience for similarly
reading-ability-impaired folks.


> Unused variables defined in aliases should not generate any warnings; or
> maybe you're using "stap -u" ?
>

Nope, I noticed it with regular "stap". I'll re-check and see if I can tell
what was going on and come back with details...

OK. Take this script, which will work on anything you have lying around
with simple adaptations:

    probe pg = process("postgres")
    {
        v1 = pid();
        v2 = execname();
        v3 = "availble for future callers";
    }

    probe pg.function("PostgresMain")
    {
        printf("my pid is %d and name is %s\n", v1, v2);
    }

    probe pg.function("PostmasterMain")
    {
        printf("I don't use those variables\n");
    }

Since the variables defined here are in the prelude handler of the alias I
wouldn't expect warnings for them, but on either stap 4.1 or
4.2-git-cadeeaa25 I get:

    sudo PATH=/usr/pgsql-11/bin:$PATH ./stap -v warnings.stp

    Pass 1: parsed user script and 476 library scripts using
201060virt/78756res/9736shr/69012data kb, in 150usr/30sys/176real ms.
    WARNING: Eliding assignment to 'v3': operator '=' at warnings.stp:5:12
     source:         v3 = "availble for future callers";
                        ^
    WARNING: Eliding assignment to 'v1': operator '=' at :3:12
     source:         v1 = pid();
                        ^
    WARNING: Eliding assignment to 'v2': operator '=' at :4:12
     source:         v2 = execname();
                        ^
    WARNING: Eliding assignment to 'v3': operator '=' at :5:12
     source:         v3 = "availble for future callers";
                        ^
    WARNING: Eliding side-effect-free expression : identifier 'v3' at :5:9
     source:         v3 = "availble for future callers";
                     ^
    WARNING: Eliding side-effect-free expression : identifier 'v1' at :3:9
     source:         v1 = pid();
                     ^
    WARNING: Eliding side-effect-free expression : identifier 'v2' at :4:9
     source:         v2 = execname();
                     ^
    WARNING: Eliding side-effect-free expression : identifier 'v3' at :5:9
     source:         v3 = "availble for future callers";
                     ^

I'm inclined to suspect a severe case of PEBKAC is at work here, but it
wouldn't be the first time I've found that when I speak up about something
that confuses me it's confused others in the past too.


> > Q4: tapscript backtraces from runtime faults, errors, etc, macro
> > expanded-from for errors at runtime?
> > --------------
> >
> > Say I have some macro @MY_MACRO(x) or some function my_function(x).
> > I do something in the macro that raises an error() or fault.
> > The error from stap only reports the line number of the macro or
> function.
>
> Aha.  Hmm, this has come up before, but I don't think we recorded it
> properly as an RFE (request-for-enhancement) BZ.  It shouldn't be too
> hard, given that we do track statement-by-statement where we are, in the
> context->last_stmt variable.  We would have to make that nestable, by
> storing it inside the context->locals[LEVEL] struct instead.  It's a bit
> of work but not that much.  Might you be interested in giving
> implementing it a try?
>

Definitely interested. I'll have to check with work about getting the time
since I'm currently rather short of outside-work coding time, but I think
stap looks like a tool we could really benefit from.

(I'm also getting into full-stack tracing with Jaeger and now OpenCensus,
but that's much more high level and heavyweight, so both have a role).


> @define @length1(a,v) %( @v = 0; foreach (_x in @a) { @v ++ } %)
> @define @length2(a,v) %( @v = 0; foreach (_x,_y in @a) { @v ++ } %)
>

Thanks. Was mostly checking if I was missing something, a-la "next".


> > Q6: is there any way to get the executable-path of the current pid()
> > in a probe Just that.
>
> task_execname(task_current())   [man tapset::task]
>

Since posting, I found function::execname in tapset::context .

It's not totally clear to me how that differs from
using task_execname(task_current()), if at all.

It'd be good to mention execname() in the beginners' guide alongside pid()
and to see-also between those funcs.


>
> > Q7: Is there a string-concatenation operator that works at parse-time
> > with macro-expansion?
> > Say I want to use
> >
> >    @define MY_PROG_BASEPATH %{ @1 %}
> >    @define MY_PROG(prog) %{ @MY_PROG_BASEPATH @prog %}
>
> Same rules as in C: adjacent string literals concatenate during parse.
>


The issue is specifically within @var and @cast expansion. It's done early,
and doesn't appear to benefit from implicit string concatenation at all,
macro-expanded or otherwise. Per my other post.

This works:

  @define PGBIN %( @1 "/bin/postgres" %)
  probe begin { printf("%s\n", @PGBIN); }

so does this:

    @define PGBIN %( @1 "/bin/postgres" %)
    probe process(@PGBIN).function("PostgresMain") { printf("%s\n",
@PGBIN); }

but not this (where $1 = /usr/pgsql-11/ ):

        @define PGBIN %( @1 "bin/postgres" %)
        function get_pgver:long() {
            return @var("server_version_num@guc.c", @PGBIN);
        }
        probe process(@PGBIN).function("PostgresMain") {
            printf("%s", @PGBIN, get_pgver());
        }

producing:

    parse error: expected ')'
        saw: string 'bin/postgres' at concat5.stp:1:29
         source:         @define PGBIN %( @1  ... bin/postgres ...
"bin/postgres" %)
                              ^
        in expansion of macro: operator '@PGBIN' at concat5.stp:3:53
         source:             return @var("server_version_num@guc.c",
@PGBIN);
                                     ^

    1 parse error.
    Pass 1: parse failed.  [man error::pass1]

however if I substitute the @var line with

            return @var("server_version_num@guc.c",
"/usr/pgsql-11/bin/postgres");

it works.

I haven't worked out exactly why yet, but it looks likely to be a matter of
order-of-processing. I wonder if @var @cast etc need to be processed in a
pass *after* user defined macros.


> We don't have variadic macros, and variadic functions don't quite
> propagate string literals well enough for this to work:
>
> function warn(msg,fmt,var1) { warn(sprint("%s " fmt, msg, var1)) }
> function warn(msg,fmt,var1,var2) { warn(sprint("%s " fmt, msg, var1,
> var2)) }
>

No worries, just figured it was worth checking.


> > Q9: Can I iterate over a distinct-values slice of an array?
> > -------
> > Say I have an array arr[x,y,z]. I want to iterate over all distinct
> > "x" without repeats.
> > There are no local associative array vars.
>
> Yeah, it'd require about the same amount of temporary storage as if you
> spelled out the pair of operations consecutively:
>
>     foreach ([x,y,z] in array)  array2[x] = 2
>     foreach ([unique] in array2) /* bob is unique */ ;
>

Right - but the docs say global array access requires locking and there are
no local arrays. Isn't that a performance/overhead concern?

It's not a big deal as the wonderful sorted-array-traversal operations let
you just notice when there's a distinct key being visited. Just a touch
verbose.

Temporary arrays are a problem from a memory allocation perspective:
> we just don't like doing it at run time, and arrays are too big to
> put into the contect struct.
>

Makes sense, especially when there's all that concern about contiguous
memory, fragmentation etc when working in the kernel.


>  Q10: How does the interpretation of probe bodies and function bodies
> differ?
> > -------
> >
> > From what I've been able to tell, functions get expanded into probe
> > bodies, somewhat macro-like.
>
> They could be, but actually they're translated to distinct blobs of C,
> which are called/shared from multiple caller probes as needed.
>

Huh. I assumed they were expanded because it looked like when I called the
same function from multiple probes, any warnings raised by the function got
emitted repeatedly. I may be mistaken.

It would be nice to document this alongside @cast and @var as it's very
confusing as a new user to move something into a function and have its
behaviour change suddenly.

Come to think of it, wouldn't it be pretty easy to use the focus_on_module
stuff to let functions be annotated with a default target module, or to
define blocks with target modules? Think

    function get_application_name:string() default_module("postgres")
    {
        return @var("application_name@guc.c");
    }

or

    function get_application_name:string()
    {
        @default_module("postgres") {
            return @var("application_name@guc.c");
        }
    }

Pointless in the simple case above, but not in more complex deployments.

OTOH, the lead time for anything like this to reach production systems is
so long it's questionable whether it's worth bothering with as any scripts
will need workarounds in the meantime.



> > What can be done in probe bodies but not functions? There are clearly
> > some differences since as my prior mail notes, functions resolve @cast
> > and @var differently.
>
> That's becuase reusable functions don't have a natural context to
> resolve types/$variables within.  Probes do - the probe point itself.
>

Makes sense given what you said above; my assumption about functions was
incorrect. And there'd be no way to infer the context from the call site
without specializing the function into callsite-specific variants, which
would be a nice feature to have but clearly not one that's present now.


> > Q11: How to "fire" an "event" from a tapset?
> > --------
> > [...]
> > Is using aliases the wrong approach? If you want users to be able to
> > say "run this handler when event y occurs", and event Y may happen 6
> > different ways with different relevant local variables etc each time,
> > how do you do it?
>
> You may be overthinking it: just say     probe y  { ... }
> declaratively.  If you want to turn it on or off, use arming
> conditionals or normal conditionals.
>

Won't work for me because I'm working in a complex multi-process context
and I can't arm/disarm probes per-process.

But I can just have my alias handler do

   if (foo) { next; }

which is fine.


> > Can a tapscript "fire" a synthetic probe? And put it in a context
> > where it can access a process's globals etc?
>
> Nope, we cannot do something as messy as gdb's inferior-function-calls.
>

Yeah. No ptrace(), non-invasive, process is actively executing, working in
kernel space, all that.

I was thinking more of something like you have with associative arrays,
synthetic probes on timer() events etc. Register an event (internally a
global) at compile time. Fire it to trigger the handler.

But it's not really necessary if probe aliases aren't meant to warn on
unused vars and I can "next" from a probe alias handler. I'll confirm it
works as expected in my test case...

> Q12: How to make a probe or probe alias show up or not show up in
> --monitor
> > --------
> > --monitor mode seems to have some kind of magic for deciding what
> > probes it shows hits for.
> > So far I've only seen it produce output for SDT markers, not function
> > probes, function return probes, etc.
>
> They should all show up, once they start getting hits.
>

Weird, they don't here. Will install latest stap and look into it; while
I've studied the latest git sources I've still been testing against

     Systemtap translator/driver (version 4.1/0.177/0.176, rpm 4.1-2.fc30)





> > How does it find out when a probe is hit and count it? Is there any
> > way to suppress a probe as a "hit" within the body, given that per Q1
> > the probe conditional expression seems pretty limited.
>
> When you ask "how ...." - you can gird your loins and look at the
> output of "stap -p3 ....".  It'll show you exactly how. :-)
>

Thanks. I've spent a while browsing that output already. I guess I may've
been unclear; I'm talking about how --monitor decides what is / isn't a
hit. Does using "next" in a probe alias body suppress it? etc.

As a heavy user of perf's "perf top", stap's --monitor is of great interest
to me.

I hadn't had time to read the monitor code before I posted this. I had a
look now and AFAICS it's injecting code into the tapscript that probes
procfs "monitor_status" to generate some json summary output about globals,
probes, etc from a probe done in ().

It calls __private___monitor_data_function_probes for each probe, which is
defined in gen_monitor_data() in a bit of a wall of inline code as
cstrings. That's embedded C (?) that looks up struct stap_probes which is
defined by translate_pass() to create the macro STAP_PROBE_INIT(...) .

I got a bit lost in the C++ around struct probe and struct statement at
that point, but some guesswork took me to common_probe_entryfn_prologue and
STP_TIMING which uses skipped_count(), but that seems to be session-level
not per-probe. STP_ALIBI seems to generate a per-probe atomic_inc(...) but
I didn't track that back to where it's defined or why.



> > I didn't find much at all about --monitor mode yet.
>
> Yeah, it's not something we made a big fuss about.  [man stap].
>
>
> > Q13: Easy way to handle enums?
> > --------
> > [...]
> > enum MyProgramEnum {
> >    XX,
> >    YY,
> >    ZZ
> > };
> >
> > manually to
> >
> > %define MyProgramEnum_XX %( 0 %)
> > %define MyProgramEnum_XX %( 1 %)
> > %define MyProgramEnum_XX %( 2 %)
>
> Have you seen   @const()?  It's basically a wrapper for embedded-C.
>

Yeah, I did see that, but you can't use embedded C normal in user tap
scripts, only tapsets or guru mode, and as I mentioned I'm more than a tad
nervous about trying to include/parse complex and portable headers like
those in postgres. Plus that way you need the headers on hand.

Not a biggie, it'd just be convenient to be able to copy the enums.


> Some dwarf magic should let us find enum declarations in scope of a
> probe, this sounds like a good RFE.
>

Huh, good point. I was only thinking parse/compile time, didn't occur to me
it could be done at runtime. @enum(...) and if using -ggdb3 maybe
even @macro(...) for simple integers. But I can't volunteer for that one,
so it's likely to be an idle-for-a-long-time RFE unless I can get some
allocated time for stap.


> > Q14: Handling differences in target program version
> > ------------
> > For the sake of an example, say that PostgreSQL version 11 has an
> > argument / global variable "foo", and in PostgreSQL 12 it's gone,
> > replaced with "foo_defined" and "foo_value". Details don't matter,
> > point is that one symbol vanished, two different ones replace it.
> >
> > Is there any way to handle these version differences by making probes
> > conditional on target expressions/values? E.g. "use this probe if
> > @var("pg_version_num") > 120000, otherwise this other probe" ?
>
> A probe handler can use @defined() and macro wrappers to adapt to
> the presence or absence of context variables.  See our old pal
> [man stap] and its friends [man stapprobes].  Many tapsets use
> it too.
>
> PR13009 would let probe point conditionals include @defined() tests
> (which normally resolve to literals early during translation).
>
>
> > Q15: variable/dynamic probes
> > ----------
> >
> > Say I want to probe a bunch of different binaries within some basedir
> > I want to pass as a script argument (or better yet, resolve from the
> > PATH of a target binary).
> >
> > It seems I can't just
> >     alias pg.backend = process(@1 "/bin/myproc");
>
> This works for me:
>
> stap -e '
>       probe pg.backend = process(@1 "/bin/myproc") {}
>       probe pg.backend.function("foo") {}
> ' /directory/prefix
>
>
> > but because string literal auto-concatentation doesn't happen for
> > macro-expanded strings
>
> If so, this is a bug.
>
>
> > New user experience, docs suggestions
> > ===============================
> >
> >
> > Document stringifying numbers
> > ----------
> >
> > Concatenation with . doesn't stringify numbers.
>
> Yup, because "." forms an important signal to the type inferences
> that its operands are strings.
>

Which makes sense. There are many extremely useful hints in systemtap
already and this might be a candidate for another though:

    $ sudo stap -e 'probe begin { a = 1; b = "b" . a; printf("b is %s\n",
b); }'
    semantic error: type mismatch (string): identifier 'a' at <input>:1:32
            source: probe begin { a = 1; b = "b" . a; printf("b is %s\n",
b); }
                                                   ^

    semantic error: type was first inferred here (long): identifier 'a' at
:1:15
            source: probe begin { a = 1; b = "b" . a; printf("b is %s\n",
b); }

for operator(.) like "operator '.' takes string operands, do you need
sprint() or sprintf()? see [function::sprint], [function::sprintf]"

?

Unsure how simple it is to determine the operator (.) is the cause of the
type mismatch though and frankly it's probably easier to just add a note
alongside the . operator in the docs pointing to sprint() and sprintf() for
coercing integers to strings.

A note on the print/sprint entry in the manpage that mentions the keyword
"integer" wouldn't hurt either. I'll see if I can put together a docs PR
soon.



> > Array wildcards, iterate-with-return need to be documented in BG
> > -----------
> >
> > I found systemtap infinitely easier to use once I stumbled across the
> > array wildcard support. This would've been immensely useful early on
> > and I think it should be highlighted in the beginners' guide:
> > [...]
>
> I think we're suffering to some extent because we have too much
> documentation - and not all of it is updated when some new feature
> gets added.


Right, I get that. You've pointed out places where things were documented
and I didn't find them because I was doing targeted searches and reading
key sections, having not read it cover-to-cover yet.

I'm prone to writing too much documentation that people don't read myself.


> > Document that $1 ... $n and $# are usable in preprocessor
> > ---------
> >
> > I tried to write code that used $# to decide whether $2, etc were
> > valid and got very confused that stap kept spamming warnings.  [...]
>
> OK.  There is also the tapset/argv.stp file that defines a boringly
> C-like argc & argv[] global variable pair.
>

Huh, handy. That's good to know :) . TBH I probably would've assumed the
argv tapset was for target arguments anyway had I seen it ; it's worth
x-ref'ing where $1 @1 etc are doc'd in man stap.


> > Document that probe definitions cannot reference variables
> > -----------
> >
> > When getting started, it's surprising that you cannot
> >
> >     pgbasedir = "/path/to/postgres";
> >     alias pg.backend = process($pgbasedir . "/bin/postgres")
> > and it might be nice to cover what you can and cannot do in probe
> arguments.
>
> Probe points must be literals, as they must be evaluated at translate
> time.  Therefore they cannot vary.
>

Yeah, I get that now :) . This is one of those "confused me at the
beginning" things.

I didn't know about macros yet.

> No process-local / thread-local vars
> > ------------
> > Any non-probe-scoped variable, whether "global" or "private", is
> > global to the whole tapscript. There's AFAICS no way to instance such
> > variables per-process or per-thread automatically. So the beginners
> > guide should mention this and suggest the pattern:  [...]
>
> See the "this.stp" tapset for syntactic sugar for "hidden array
> indexed with tid()", like:
>
>      probe tp_syscall.read {
>          @this1 = buf_uaddr
>      }
>      probe tp_syscall.read.return {
>          buf_uaddr = @this1
>          printf("%s", user_string(buf_uaddr))
>      }
>

Yeah, I found that, but there's no equivalent for pid() ....

ugh, and I'm so stupid it hurts. For non-threaded processes pid()
corresponds 1:! to tid() . Sigh.


> > SDT probe data types and argument names:
> > -----------
> >
> > SDT probes have arguments $arg1... $argn, and no data type information
> > is preserved. That's exceedingly unfortunate as they're otherwise a
> > super useful tool. Is it feasible to do anything like embed the param
> > names and types in the probes ELF section or an extension section of
> > it, for reading by stap?
>
> Tough as from a C macro call given values, we can't get type names as
> literals that we can just include into the .sdt elf note.  Will think
> about it.
>

Right, not saying it's easy ;)

DWARF with -ggdb3 has the macro definition, but probably not in a form
convenient to map through the layers of expansion back to the original
argument types. I think it's just the raw definition.

I was thinking about it mostly for SDTs defined by probes.d, which already
syntatically accepts c-alike probe definitions, it just ignores the
argument names and types completely AFAICS. In this case the dtrace script
could generate a second macro alongside the DTRACE_PROBE2 or whatever like
a

    STAP_PROBE_ARG_NAMES2("somearg", "someotherarg");
    STAP_PROBE_ARG_TYPES2("const char *", "uint32_t");

which can presumably be inserted into a custom ELF section in much the same
manner as the probe points themselves are generated.

Whether it's useful to do so is another matter, though, as it'd require
other tools to agree upon and adopt the scheme. Also, this way the type
information would only be strings, which limits options for type checking
and for doing anything useful with it in the face of typedefs etc.

Dude, thanks a lot for all this info.  It's worth virtual gold to hear
> from users with fresh eyes.


I hoped so. I know it can also be frustrating because to you it's "just
right there dammit!" - but new users face information overload, are trying
to adapt to new terminology and structures, and are usually under time
pressure. So it's not just whether it's documented, but where and how.

I wish I could do better at that myself. Sometimes I feel like the only
user of some of the docs I write is ... myself.


> For those cases where some missing feature
> or a bug was identified (as opposed to RTFManPage), it would be great to
> get them into the queue (sourceware.org/bugzilla), and we can most
> certainly take patches, even little ones like your preferred wordings
> for the documentation nits.
>
> If you'd like me to open bugzilla's for my impression of them, can do,
> but doing it yourself would get you into the system and get the request
> & sample-desire worded just right.
>

Can do. I'll try to prep the odd patch as well.

-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 2ndQuadrant - PostgreSQL Solutions for the Enterprise

  reply	other threads:[~2019-10-31 14:01 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-28  7:19 Craig Ringer
2019-10-28  8:55 ` Craig Ringer
2019-10-30 20:10 ` Frank Ch. Eigler
2019-10-31 14:01   ` Craig Ringer [this message]
2019-11-07 18:52     ` Frank Ch. Eigler
2019-11-08  4:15       ` Craig Ringer
2019-11-08 12:03         ` Frank Ch. Eigler
2019-11-10  5:05           ` Craig Ringer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAMsr+YEJUuncM_XB2WM=gk9ja2MeaqYBj6swzjG=CnmmcPQtWw@mail.gmail.com' \
    --to=craig@2ndquadrant.com \
    --cc=fche@redhat.com \
    --cc=systemtap@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).