From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <systemtap-return-26716-listarch-systemtap=sources.redhat.com@sourceware.org>
Received: (qmail 91359 invoked by alias); 30 Oct 2019 20:10:47 -0000
Mailing-List: contact systemtap-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <systemtap.sourceware.org>
List-Subscribe: <mailto:systemtap-subscribe@sourceware.org>
List-Post: <mailto:systemtap@sourceware.org>
List-Help: <mailto:systemtap-help@sourceware.org>, <http://sourceware.org/lists.html#faqs>
Sender: systemtap-owner@sourceware.org
Received: (qmail 91164 invoked by uid 89); 30 Oct 2019 20:10:46 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=-5.2 required=5.0 tests=AWL,BAYES_00,GIT_PATCH_2 autolearn=ham version=3.3.1 spammy=dude, suffering, Easy, armed
X-HELO: us-smtp-1.mimecast.com
Received: from us-smtp-delivery-1.mimecast.com (HELO us-smtp-1.mimecast.com) (205.139.110.120) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 30 Oct 2019 20:10:43 +0000
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;	s=mimecast20190719; t=1572466241;	h=from:from:reply-to:subject:subject:date:date:message-id:message-id:	 to:to:cc:cc:mime-version:mime-version:content-type:content-type:	 content-transfer-encoding:content-transfer-encoding:	 in-reply-to:in-reply-to:references:references;	bh=N77SrYbFDaLTfznFd9C5Scoqv+jnDud40FQSaa3VBi0=;	b=dXPj4eMXFpODJoy6xxiaTrG0DAg293eWe6ZC8rZ9rr8k/QTz3cHgmZY9KSBcoVpcDFGR61	VsGJjVJ0qzGjbi7NIaixUZq0iuuGco+80qvlKEisMCa/y5tqcaUtck88sv/nv2qFuLr4fl	c4d1l4Npy02oGYLMgO9OyfhhSOmszmw=
Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-206-rd__qAdpNKmiiELw7-q4gw-1; Wed, 30 Oct 2019 16:10:38 -0400
Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12])	(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))	(No client certificate requested)	by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 349FB2AD;	Wed, 30 Oct 2019 20:10:37 +0000 (UTC)
Received: from redhat.com (ovpn-116-53.phx2.redhat.com [10.3.116.53])	by smtp.corp.redhat.com (Postfix) with ESMTPS id DA94360BE0;	Wed, 30 Oct 2019 20:10:36 +0000 (UTC)
Received: from [127.0.0.1] (helo=vm-rhel7)	by redhat.com with esmtp (Exim 4.92)	(envelope-from <fche@redhat.com>)	id 1iPuId-0004St-CR; Wed, 30 Oct 2019 16:10:35 -0400
From: fche@redhat.com (Frank Ch. Eigler)
To: Craig Ringer <craig@2ndquadrant.com>
Cc: systemtap@sourceware.org
Subject: Re: Newbie Notes
References: <CAMsr+YEm42ERBdSH22=G9X84GvSGiV0ExPkaULyUyGvFqvLAQw@mail.gmail.com>
Date: Wed, 30 Oct 2019 20:10:00 -0000
In-Reply-To: <CAMsr+YEm42ERBdSH22=G9X84GvSGiV0ExPkaULyUyGvFqvLAQw@mail.gmail.com>	(Craig Ringer's message of "Mon, 28 Oct 2019 15:19:14 +0800")
Message-ID: <87bltyf0zp.fsf@redhat.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)
MIME-Version: 1.0
X-Mimecast-Spam-Score: 0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
X-SW-Source: 2019-q4/txt/msg00018.txt.bz2


Hi, Craig -

First of all, thanks for your kind words on lwn.


craig wrote:

> [...] I thought I'd share some of the challenges and
> roadblocks I experienced along the way and some suggestions for docs
> etc.

Great.

> Q1: Can I make a probe conditional per-pid in the "if" clause? What
> *can* I do there?

The [man stap] page covers probe conditionals this way:

       Probes may be decorated with an arming condition, consisting of a
       simple boolean expression on read-only global script variables.
       While disarmed (inactive, condition evaluates to false), some
       probe types reduce or eliminate their run-time overheads.  When
       an arming condition evaluates to true, probes will be soon
       re-armed, and their probe handlers will start getting called as
       the events fire.  [...]

The key is that these conditions are -arming conditions-.  They cannot
possibly use contextual values such as pid(), because in order to even
evaluate that condition, the probe would have to be armed & running!  In
contrast, "boolean expressions on read-only global variables" may be
evaluated anywhere: namely at the conclusion of all the probes that
actually modify those variables.  Then those probes can enqueue an
arm/disarm operation.
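
To make the distinction concrete, here is a minimal sketch.  The global
"enabled" and the 5-second timer are illustrative choices of mine, not
anything from the man page.  The per-pid test has to live in the handler
body, where pid() is available:

```systemtap
global enabled = 0

# Some other probe modifies the global that the condition reads.
probe timer.s(5) { enabled = !enabled }

# Arming condition: may only look at global script variables.
probe syscall.read if (enabled) {
  # Contextual filtering happens here, once the probe is already running.
  if (pid() != target()) next
  printf("%s read fd %d\n", execname(), fd)
}
```

Run it as something like "stap -x PID script.stp" so that target() has a
value to compare against.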


> Q2: Can I "return" or "exit" from a probe in a handler body or probe
> alias handler body?

"next", as in awk.  This too is in [man stap].


> [...]
> I know I can set a variable in the alias body, then test it in the
> main probe body, but that's ugly and cumbersome. Is there a better
> way?
> [...]
> I'd like to be able to  instead
>
> probe myproc.something = probe process("myproc").function("something") {
>   if (@var("foo") == 0)
>     return;
> };

Say "next" instead of "return", and bob is your uncle.
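
Spelled out, that might look like the sketch here.  "myproc",
"something" and "foo" are the hypothetical names from your example, and
I'm assuming a prologue-style (=) alias, whose body runs ahead of the
handler of whichever probe uses it:

```systemtap
probe myproc.something = process("myproc").function("something") {
  # Abandon the whole handler, including the body of any probe that
  # uses this alias, when foo is zero.
  if (@var("foo") == 0)
    next
}

probe myproc.something {
  println("something() called with nonzero foo")
}
```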


> Especially since the unused variables warnings stap emits mean that
> probe alias handlers that define variables tend to create huge amounts
> of warnings spam if some users of the alias don't care about all the
> variables.

Unused variables defined in aliases should not generate any warnings; or
maybe you're using "stap -u" ?


> Q3: Some way to mark individual variables to suppress unused warnings?
> ---------------
>
> In probe alias bodies it's often useful to define a set of variables
> for use by the probe(s) that use the alias. But stap likes to spam
> warnings about these if not all of them are used by all probes.

That's not normal behaviour.  One point of aliases is to define a whole
menu of potentially useful variables, which a probe may quietly choose
from.  If that's not working, please report.


> Q4: tapscript backtraces from runtime faults, errors, etc, macro
> expanded-from for errors at runtime?
> --------------
>
> Say I have some macro @MY_MACRO(x) or some function my_function(x).
> I do something in the macro that raises an error() or fault.
> The error from stap only reports the line number of the macro or function.

Aha.  Hmm, this has come up before, but I don't think we recorded it
properly as an RFE (request-for-enhancement) BZ.  It shouldn't be too
hard, given that we do track statement-by-statement where we are, in the
context->last_stmt variable.  We would have to make that nestable, by
storing it inside the context->locals[LEVEL] struct instead.  It's a bit
of work but not that much.  Might you be interested in giving the
implementation a try?


> Q5: is there any way to get the len()/size() of a non-statistical array?

There should be.  As you realize, statistics arrays are more complicated
(since they are per-cpu, until a merging mandated by a foreach()
iteration).

Until the language/runtime has an optimized version, you can still
at least make the call sites small via macros such as:

@define length1(a,v) %( @v = 0; foreach (_x in @a) { @v ++ } %)
@define length2(a,v) %( @v = 0; foreach ([_x,_y] in @a) { @v ++ } %)
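
A call site would then look roughly like this sketch.  It assumes the
macro is declared as "@define NAME(params)" and invoked as "@NAME(...)";
the array "readers" and the local "n" are names made up for the
illustration:

```systemtap
@define length1(a,v) %( @v = 0; foreach (_x in @a) { @v ++ } %)

global readers

probe syscall.read { readers[execname()] ++ }

probe timer.s(10) {
  # Expands to: n = 0; foreach (_x in readers) { n ++ }
  @length1(readers, n)
  printf("%d distinct process names called read\n", n)
  exit()
}
```

The cost is still a full walk of the array on every use, so this only
keeps the script tidy; it does not make the count cheap.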


> Q6: is there any way to get the executable-path of the current pid()
> in a probe Just that.

task_execname(task_current())   [man tapset::task]


> Q7: Is there a string-concatenation operator that works at parse-time
> with macro-expansion?
> Say I want to use
>
>    @define MY_PROG_BASEPATH %{ @1 %}
>    @define MY_PROG(prog) %{ @MY_PROG_BASEPATH @prog %}

Same rules as in C: adjacent string literals concatenate during parse.


> Q8: nicer way to warn() or error() with params than sprintf?
> --------
> If I have a warn(), error() or assert() I want to include the values
> of variables in, is there a nicer way to do it than assert(condition,
> sprintf(...)) ?

We don't have variadic macros, and variadic functions don't quite
propagate string literals well enough for this to work:

function warn(msg,fmt,var1) { warn(sprint("%s " fmt, msg, var1)) }
function warn(msg,fmt,var1,var2) { warn(sprint("%s " fmt, msg, var1, var2)) }


> Especially for assert.
>
> Q9: Can I iterate over a distinct-values slice of an array?
> -------
> Say I have an array arr[x,y,z]. I want to iterate over all distinct
> "x" without repeats.
> There are no local associative array vars.

Yeah, it'd require about the same amount of temporary storage as if you
spelled out the pair of operations consecutively:

    foreach ([x,y,z] in array)  array2[x] = 2
    foreach ([unique] in array2) /* bob is unique */ ;
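
As a complete script, the two-pass idea could be sketched like so.  The
sample data and the names "arr" and "xs" are purely illustrative:

```systemtap
global arr, xs

probe begin {
  # Sample data, only here so the script is self-contained.
  arr[1,10,100] = 1
  arr[1,20,200] = 1
  arr[2,10,100] = 1

  # Pass 1: collapse every tuple onto its first index.
  foreach ([x,y,z] in arr)
    xs[x] = 1

  # Pass 2: each distinct x now appears exactly once; "+" sorts ascending.
  foreach ([x+] in xs)
    printf("distinct x: %d\n", x)

  delete xs
  exit()
}
```

Note that "xs" has to be a global, so it takes up space for the life of
the script even though it is only used as scratch storage.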

> AFAICS I have to
>
> last_x = 0;
> for (arr = [x+,y,z] in arr) {
>   if (x != last_x) {
>     printf("found distinct x: %d\n", x);
>     last_x = x;
>   }
> }

That works too but ...

Temporary arrays are a problem from a memory allocation perspective:
we just don't like doing it at run time, and arrays are too big to
put into the context struct.


> Q10: How does the interpretation of probe bodies and function bodies differ?
> -------
>
> From what I've been able to tell, functions get expanded into probe
> bodies, somewhat macro-like.

They could be, but actually they're translated to distinct blobs of C,
which are called/shared from multiple caller probes as needed.

> How do functions and macros differ?

Macros are parse-time expansions (so they disappear by -p1); functions
are ... well like functions elsewhere.

> What can be done in probe bodies but not functions? There are clearly
> some differences since as my prior mail notes, functions resolve @cast
> and @var differently.

That's because reusable functions don't have a natural context to
resolve types/$variables within.  Probes do - the probe point itself.
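
A small sketch of the usual workaround: resolve the $variable in the
probe, pass it to the function as a plain number, and name the type and
module explicitly in the @cast.  The kernel function wake_up_process and
its parameter "p" are assumptions about the running kernel; they may be
named differently or be unavailable on a given build:

```systemtap
# A function has no probe point of its own, so the type and the module
# to look it up in are spelled out in the @cast.
function task_tgid:long (t:long) {
  return @cast(t, "task_struct", "kernel")->tgid
}

probe kernel.function("wake_up_process") {
  # $p resolves here because the probe point supplies the context.
  printf("waking tgid %d\n", task_tgid($p))
}
```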


> Q11: How to "fire" an "event" from a tapset?
> --------
> [...]
> Is using aliases the wrong approach? If you want users to be able to
> say "run this handler when event y occurs", and event Y may happen 6
> different ways with different relevant local variables etc each time,
> how do you do it?

You may be overthinking it: just say     probe y  { ... }
declaratively.  If you want to turn it on or off, use arming
conditionals or normal conditionals.


> Can a tapscript "fire" a synthetic probe? And put it in a context
> where it can access a process's globals etc?

Nope, we cannot do something as messy as gdb's inferior-function-calls.


> Q12: How to make a probe or probe alias show up or not show up in --monitor
> --------
> --monitor mode seems to have some kind of magic for deciding what
> probes it shows hits for.
> So far I've only seen it produce output for SDT markers, not function
> probes, function return probes, etc.

They should all show up, once they start getting hits.

> How does it find out when a probe is hit and count it? Is there any
> way to suppress a probe as a "hit" within the body, given that per Q1
> the probe conditional expression seems pretty limited.

When you ask "how ...." - you can gird your loins and look at the
output of "stap -p3 ....".  It'll show you exactly how. :-)

> I didn't find much at all about --monitor mode yet.

Yeah, it's not something we made a big fuss about.  [man stap].


> Q13: Easy way to handle enums?
> --------
> [...]
> enum MyProgramEnum {
>    XX,
>    YY,
>    ZZ
> };
>
> manually to
>
> %define MyProgramEnum_XX %( 0 %)
> %define MyProgramEnum_YY %( 1 %)
> %define MyProgramEnum_ZZ %( 2 %)

Have you seen   @const()?  It's basically a wrapper for embedded-C.
[man stap]

Some dwarf magic should let us find enum declarations in scope of a
probe, this sounds like a good RFE.
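
For reference, a minimal @const() sketch.  I'm using a kernel constant
here, since the value is picked up by the C compiler that builds the
probe module, so the header has to be one that compiler can see; a
user-space program's headers generally are not.  As far as I recall a
plain script also needs guru mode (stap -g) for this:

```systemtap
%{
#include <linux/sched.h>
%}

probe begin {
  # The name is resolved by the C compiler, not by the stap translator.
  printf("TASK_INTERRUPTIBLE = %d\n", @const("TASK_INTERRUPTIBLE"))
  exit()
}
```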


> Q14: Handling differences in target program version
> ------------
> For the sake of an example, say that PostgreSQL version 11 has an
> argument / global variable "foo", and in PostgreSQL 12 it's gone,
> replaced with "foo_defined" and "foo_value". Details don't matter,
> point is that one symbol vanished, two different ones replace it.
>
> Is there any way to handle these version differences by making probes
> conditional on target expressions/values? E.g. "use this probe if
> @var("pg_version_num") > 120000, otherwise this other probe" ?

A probe handler can use @defined() and macro wrappers to adapt to
the presence or absence of context variables.  See our old pal
[man stap] and its friends [man stapprobes].  Many tapsets use
it too.

PR13009 would let probe point conditionals include @defined() tests
(which normally resolve to literals early during translation).
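
At the handler level, a sketch of the @defined() approach using the
hypothetical names from your example ("foo" in the old version,
"foo_value" in the new one; the process name and function name are
likewise made up):

```systemtap
probe process("postgres").function("something") {
  # @defined() becomes a literal 0 or 1 during translation, so the
  # branch naming a missing variable is dropped before it can fail.
  foo = @defined($foo) ? $foo : $foo_value
  printf("foo = %d\n", foo)
}
```

If neither variable exists in the target, the fallback branch still
fails to resolve, so a third default (a literal, say) may be wanted at
the end of the chain.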


> Q15: variable/dynamic probes
> ----------
>
> Say I want to probe a bunch of different binaries within some basedir
> I want to pass as a script argument (or better yet, resolve from the
> PATH of a target binary).
>
> It seems I can't just
>     alias pg.backend = process(@1 "/bin/myproc");

This works for me:

stap -e '
      probe pg.backend = process(@1 "/bin/myproc") {}
      probe pg.backend.function("foo") {}
' /directory/prefix


> but because string literal auto-concatenation doesn't happen for
> macro-expanded strings

If so, this is a bug.


> New user experience, docs suggestions
> ===============================
>
>
> Document stringifying numbers
> ----------
>
> Concatenation with . doesn't stringify numbers.

Yup, because "." forms an important signal to the type inferences
that its operands are strings.


> Document that you just use sprintf to do it, or you may want to use
> the "println" statement that automatically handles concatenation and
> stringification.

Sure.
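
For the docs, something along the lines of this sketch would probably
do (the variable name and value are arbitrary):

```systemtap
probe begin {
  n = 42
  # "." needs strings on both sides, so convert explicitly ...
  println("n is " . sprint(n))
  # ... or let a format string do the conversion ...
  printf("n is %d\n", n)
  # ... or hand print/println several values of mixed type.
  println("n is ", n)
  exit()
}
```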


> Array wildcards, iterate-with-return need to be documented in BG
> -----------
>
> I found systemtap infinitely easier to use once I stumbled across the
> array wildcard support. This would've been immensely useful early on
> and I think it should be highlighted in the beginners' guide:
> [...]

I think we're suffering to some extent because we have too much
documentation - and not all of it is updated when some new feature
gets added.  [man stap] and [man stapprobes] are the masters where
everything should be included; the more prosey pdfs are secondary
and we often seem to neglect them.


> Document that $1 ... $n and $# are usable in preprocessor
> ---------
>
> I tried to write code that used $# to decide whether $2, etc were
> valid and got very confused that stap kept spamming warnings.  [...]

OK.  There is also the tapset/argv.stp file that defines a boringly
C-like argc & argv[] global variable pair.  Hm, it lacks the markup
needed to plop it into the generated man pages along with other
[man -k tapset::].


> Document that probe definitions cannot reference variables
> -----------
>
> When getting started, it's surprising that you cannot
>
>     pgbasedir = "/path/to/postgres";
>     alias pg.backend = process($pgbasedir . "/bin/postgres")
> and it might be nice to cover what you can and cannot do in probe arguments.

Probe points must be literals, as they must be evaluated at translate
time.  Therefore they cannot vary.


> No process-local / thread-local vars
> ------------
> Any non-probe-scoped variable, whether "global" or "private", is
> global to the whole tapscript. There's AFAICS no way to instance such
> variables per-process or per-thread automatically. So the beginners
> guide should mention this and suggest the pattern:  [...]

See the "this.stp" tapset for syntactic sugar for "hidden array
indexed with tid()", like:

     probe tp_syscall.read {
         @this1 = buf_uaddr
     }
     probe tp_syscall.read.return {
         buf_uaddr = @this1
         printf("%s", user_string(buf_uaddr))
     }
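
Written out by hand, the pattern that sugar stands for is roughly the
following.  This is a sketch of the idea using the ordinary syscall
tapset, not the literal expansion of the macros; the array name
"buf_addr" is made up:

```systemtap
global buf_addr   # indexed by tid(): one slot per thread

probe syscall.read {
  buf_addr[tid()] = buf_uaddr
}

probe syscall.read.return {
  if (!(tid() in buf_addr)) next   # entry was not seen by this script
  printf("%s", user_string(buf_addr[tid()]))
  delete buf_addr[tid()]           # free the slot for this thread
}
```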


> SDT probe data types and argument names:
> -----------
>
> SDT probes have arguments $arg1... $argn, and no data type information
> is preserved. That's exceedingly unfortunate as they're otherwise a
> super useful tool. Is it feasible to do anything like embed the param
> names and types in the probes ELF section or an extension section of
> it, for reading by stap?

Tough: from a C macro call that is handed values, we can't get the type
names as literals that we could just include in the .sdt ELF note.  Will
think about it.


> String access: user_string, user_string_warn
> -------------
> [...]

> Target array access
> -------------
> It'd be nice if the docs mentioned how to access target array
> variables or array-valued struct members. It's fairly obvious for
> locals:
>
>     $targetvar[index];
>     $targetstruct->arraymember[index];
>
> For @var and @cast it works similarly, but will result in confusing
> failures to resolve the variable or type if you do it wrong. Parens
> are allowed and seem to be useful (needed?). E.g.
>     (@var("globalarray"))[idx]
>     (@var("globalstruct")->arraymember)[idx]

OK.

> Deref'ing scalar as array produces confusing symbol not found error

This is a diagnostic quality bug.


> Typecasting
> -----------
>
> Same warning for @cast as for @var re functions, module paths.
>
> Warn that @cast to a pointer-hiding typedef will produce confusing
> results, strange faults, etc.
>
>
> Queue stats
> -------------
>
>
> tapset crosslinking
> ---------------
>
> It'd be nice to see more x-linking of important tapset info in the
> main docs and beginner docs, e.g. linking to


Dude, thanks a lot for all this info.  It's worth virtual gold to hear
from users with fresh eyes.  For those cases where some missing feature
or a bug was identified (as opposed to RTFManPage), it would be great to
get them into the queue (sourceware.org/bugzilla), and we can most
certainly take patches, even little ones like your preferred wordings
for the documentation nits.

If you'd like me to open bugzillas for my impression of them, can do,
but doing it yourself would get you into the system and get the request
& sample-desire worded just right.


- FChE