RE: Accessing target variables in return probes (was Re: src ChangeLog tapsets.cxx)

public inbox for systemtap@sourceware.org
 help / color / mirror / Atom feed

* RE: Accessing target variables in return probes (was Re: src ChangeLog tapsets.cxx)
@ 2006-11-22  7:40 Stone, Joshua I
  2006-12-14 18:27 ` David Smith
  0 siblings, 1 reply; 6+ messages in thread
From: Stone, Joshua I @ 2006-11-22  7:40 UTC (permalink / raw)
  To: David Smith; +Cc: Systemtap List

On Tuesday, November 21, 2006 11:56 AM, David Smith wrote:
> Stone, Joshua I wrote:
>> * The tvar_ctr needs to be per-thread, else you get a race condition.
>> Consider the simple case where two threads run the same function,
>> interleaving counter access.  (i.e., T1 enters, T2 enters, T1
>> returns, T2 returns).  The threads will end up reading each other's
>> data! 
> 
> Sigh.  I can't believe I missed this since I sat down and wrote out a
> similar situation.  Actually the threads will end up getting zeros
> (the default value).

Ah, yes, didn't quite map that out right in my head -- it will keep the
threads' data separate.  Zero values and mangled counters are still a
problem, of course.

> To do per-thread counters, I think I'll need another map to store the
> counter value for that thread.  So, accessing cached parameter would
> look something like:
> 
> tvar_cached_data[tid(), tvar_ctr[tid()]]
> 
> Does that look like it would work?  Got any better ideas?

That should work just fine.

>> * As a later optimization, consider merging duplicate target-var
>> references into a single temp call.  For example, 'probe
>> syscall.open.return { if ($flags) print($flags) }' right now
>> duplicates the effort for each $flags reference.
> 
> Currently the code to get the actual parameter value gets merged to a
> single function.  You are correct that the parameter caching map calls
> get duplicated.

The dwarf function only exists once, but the new entry probe, the
temporary maps, and the read from the maps are all duplicated.

> This actually might be considered a requirement, since if you have two
> different return probes for the same function that access the same
> cached parameter, we want them to be separate (so the counters are
> handled correctly).

For distinct probes, sure, it's much easier to keep them separate.  We
don't want the headache of separate probes trying to coordinate what
order they access it in, or which one should delete and decrement.  But
when the reference is duplicated within a single probe, it seems like it
would be easy to read the value from the same place, especially if
you're going to read it into a local at the top of the probe anyway.

This sort of issue may crop up more often if scripts get more elaborate
with probe aliasing -- like this intentionally convoluted example:

/* a custom tapset that defines an interesting probe point */
probe sysopen_with_flags = syscall.open.return {
	if (!$flags) next
}

/* a script file that uses it. */
probe sysopen_with_flags {
	printf("sysopen flags:%d = %d\n", $flags, $return)
}

> I'm thinking that it might be possible to inject the code at the
> *beginning* of the function, with the help of a temporary variable.
> Something like this:
> 
> probe kernel.function("foo").return
> {
> { /* start of injected code */
>     tvar_cache_tmp = tvar_cached_data[tid(), tvar_ctr[tid()]]
>     delete tvar_cached_data[tid(), tvar_ctr[tid()]--]
>    /* end of injected code */
> }
> ... rest of original probe ...
> }
> 
> then use 'tvar_cache_tmp' everywhere the target variable was accessed.

That looks good.  You can also cache tid() into a local instead of
paying for the function call four times.

You can probably use some internal bookkeeping to track which variables
you've injected code for, and just reuse the tvar_cached_temp if it's
already there.  This solves the multiple-reference problem above.

> Josh, do you (or anyone else) know if it would be a good idea to
> delete the counter map value when it reaches 0?

No need -- setting a value to zero removes it from the map.

>> * A potential bug: due to the way string targets are read, only the
>> string pointer is being saved for the return probe.  The
>> 'kernel/user_string' call happens in the return probe.  In most cases
>> this is probably ok, but there might be some functions where that
>> string is manipulated during the course of execution.  The cleanest
>> way I can think to solve this is to extend the language to
>> automatically treat a target like a string -- probably using
>> '@mystring' like we do with positional parameters.  Behind the
>> scenes we can look at the pointer range to figure whether to use
>> kernel_string or user_string. 
> 
> Hmm, I wonder if we could use a function for that and force users to
> use it everywhere.  It would be safer and easier.

A function for automatically choosing user_string or kernel_string?
Yeah, that would be a good thing to have.

I still like the idea of a '@var' syntax for target strings too...

Josh

(David will get this twice, as I forgot to CC the list the first time.
Sorry!)

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Accessing target variables in return probes (was Re: src ChangeLog  tapsets.cxx)
  2006-11-22  7:40 Accessing target variables in return probes (was Re: src ChangeLog tapsets.cxx) Stone, Joshua I
@ 2006-12-14 18:27 ` David Smith
  2006-12-14 23:17   ` Frank Ch. Eigler
  0 siblings, 1 reply; 6+ messages in thread
From: David Smith @ 2006-12-14 18:27 UTC (permalink / raw)
  To: Stone, Joshua I; +Cc: Systemtap List

Here's an update on this functionality (accessing target variables in 
return probes).

- When caching target variables for use in a return probe, the counters 
are per-thread.  This should fix almost all** problems with threads 
getting incorrect data or zeros.

- The code has been optimized so that multiple references to the same 
target variable in a probe is handled properly (both instances will 
refer to the same temporary variable that was initialized with a value 
from the cache).

- The code has been optimized so that only one entry probe is generated 
per return probe no matter how many target variables are referenced in 
the return probe.


** Currently, the only known problem when accessing target variables in 
a return probe is the problem of skipped return probes and a recursive 
probed kernel function.  There is no way to know if a return probe has 
been skipped, so in the case of probing a recursive kernel function, the 
cached values can get out of sync.  Valid values will be returned, just 
from the wrong call.

Stone, Joshua I wrote:
> On Tuesday, November 21, 2006 11:56 AM, David Smith wrote:
>> Stone, Joshua I wrote:
>>> * The tvar_ctr needs to be per-thread, else you get a race condition.
>>> Consider the simple case where two threads run the same function,
>>> interleaving counter access.  (i.e., T1 enters, T2 enters, T1
>>> returns, T2 returns).  The threads will end up reading each other's
>>> data! 
>> Sigh.  I can't believe I missed this since I sat down and wrote out a
>> similar situation.  Actually the threads will end up getting zeros
>> (the default value).
> 
> Ah, yes, didn't quite map that out right in my head -- it will keep the
> threads' data separate.  Zero values and mangled counters are still a
> problem, of course.
> 
>> To do per-thread counters, I think I'll need another map to store the
>> counter value for that thread.  So, accessing cached parameter would
>> look something like:
>>
>> tvar_cached_data[tid(), tvar_ctr[tid()]]
>>
>> Does that look like it would work?  Got any better ideas?
> 
> That should work just fine.
> 
>>> * As a later optimization, consider merging duplicate target-var
>>> references into a single temp call.  For example, 'probe
>>> syscall.open.return { if ($flags) print($flags) }' right now
>>> duplicates the effort for each $flags reference.
>> Currently the code to get the actual parameter value gets merged to a
>> single function.  You are correct that the parameter caching map calls
>> get duplicated.
> 
> The dwarf function only exists once, but the new entry probe, the
> temporary maps, and the read from the maps are all duplicated.
> 
>> This actually might be considered a requirement, since if you have two
>> different return probes for the same function that access the same
>> cached parameter, we want them to be separate (so the counters are
>> handled correctly).
> 
> For distinct probes, sure, it's much easier to keep them separate.  We
> don't want the headache of separate probes trying to coordinate what
> order they access it in, or which one should delete and decrement.  But
> when the reference is duplicated within a single probe, it seems like it
> would be easy to read the value from the same place, especially if
> you're going to read it into a local at the top of the probe anyway.
> 
> This sort of issue may crop up more often if scripts get more elaborate
> with probe aliasing -- like this intentionally convoluted example:
> 
> /* a custom tapset that defines an interesting probe point */
> probe sysopen_with_flags = syscall.open.return {
> 	if (!$flags) next
> }
> 
> /* a script file that uses it. */
> probe sysopen_with_flags {
> 	printf("sysopen flags:%d = %d\n", $flags, $return)
> }
> 
>> I'm thinking that it might be possible to inject the code at the
>> *beginning* of the function, with the help of a temporary variable.
>> Something like this:
>>
>> probe kernel.function("foo").return
>> {
>> { /* start of injected code */
>>     tvar_cache_tmp = tvar_cached_data[tid(), tvar_ctr[tid()]]
>>     delete tvar_cached_data[tid(), tvar_ctr[tid()]--]
>>    /* end of injected code */
>> }
>> ... rest of original probe ...
>> }
>>
>> then use 'tvar_cache_tmp' everywhere the target variable was accessed.
> 
> That looks good.  You can also cache tid() into a local instead of
> paying for the function call four times.
> 
> You can probably use some internal bookkeeping to track which variables
> you've injected code for, and just reuse the tvar_cached_temp if it's
> already there.  This solves the multiple-reference problem above.
> 
>> Josh, do you (or anyone else) know if it would be a good idea to
>> delete the counter map value when it reaches 0?
> 
> No need -- setting a value to zero removes it from the map.
> 
>>> * A potential bug: due to the way string targets are read, only the
>>> string pointer is being saved for the return probe.  The
>>> 'kernel/user_string' call happens in the return probe.  In most cases
>>> this is probably ok, but there might be some functions where that
>>> string is manipulated during the course of execution.  The cleanest
>>> way I can think to solve this is to extend the language to
>>> automatically treat a target like a string -- probably using
>>> '@mystring' like we do with positional parameters.  Behind the
>>> scenes we can look at the pointer range to figure whether to use
>>> kernel_string or user_string. 
>> Hmm, I wonder if we could use a function for that and force users to
>> use it everywhere.  It would be safer and easier.
> 
> A function for automatically choosing user_string or kernel_string?
> Yeah, that would be a good thing to have.
> 
> I still like the idea of a '@var' syntax for target strings too...
> 
> 
> Josh
> 
> 
> (David will get this twice, as I forgot to CC the list the first time.
> Sorry!)


-- 
David Smith
dsmith@redhat.com
Red Hat
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Accessing target variables in return probes (was Re: src ChangeLog  tapsets.cxx)
  2006-12-14 18:27 ` David Smith
@ 2006-12-14 23:17   ` Frank Ch. Eigler
  0 siblings, 0 replies; 6+ messages in thread
From: Frank Ch. Eigler @ 2006-12-14 23:17 UTC (permalink / raw)
  To: David Smith; +Cc: Stone, Joshua I, Systemtap List


dsmith wrote:

> [...]  There is no way to know if a return probe has been skipped,
> so in the case of probing a recursive kernel function, the cached
> values can get out of sync.  [...]

In case I didn't put this idea on paper already: one way may be to
also store the thread's current stack pointer somewhere in the
tracking data stream.  If we realize that our "current" stack
pointer is higher than that of the highest-nesting-numbered
saved data record, we could unroll to recover synchronization.

In other words, for each synthetic entry point, also save
   n = (nesting[tid()] ++)
   thread_entry_sp[tid(), n] = regs_sp() # add to tapset if needed
   saved_data_value[tid(), n] = $data

and in the return probe handler,

   first=1
   while (first || (thread_entry_sp[tid(), n] < current_sp) {
     first=0   # a candidate for do/while, n'est ce pas?
     n = (-- nesting[tid()])
     current_sp = regs_sp()
     restored_data = saved_data_value[tid(), n];
     delete saved_data_value[...]
     delete thread_entry_sp[...]
   }

Anyway, just a sketch.  Do you get the idea?


- FChE

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: Accessing target variables in return probes (was Re: src ChangeLog  tapsets.cxx)
@ 2006-12-15  0:41 Stone, Joshua I
  2006-12-18 13:20 ` Frank Ch. Eigler
  0 siblings, 1 reply; 6+ messages in thread
From: Stone, Joshua I @ 2006-12-15  0:41 UTC (permalink / raw)
  To: fche, David Smith; +Cc: Systemtap List

On Thursday, December 14, 2006 12:09 PM, fche@redhat.com wrote:
> In case I didn't put this idea on paper already: one way may be to
> also store the thread's current stack pointer somewhere in the
> tracking data stream.  If we realize that our "current" stack
> pointer is higher than that of the highest-nesting-numbered
> saved data record, we could unroll to recover synchronization.

Nice idea!  Have you prototyped this at all?

We also discussed the possibility of extending kretprobes with a
pre_handler -- a handler that is called during entry, but only if the
return-probe instance was successfully allocated.  I looked into it, and
it seems like a very minor change:

(fake patch example, in arch/x86_64/kernel/kprobes.c)
/* Called with kretprobe_lock held */
void __kprobes arch_prepare_kretprobe(struct kretprobe *rp,
                      struct pt_regs *regs)
{
    unsigned long *sara = (unsigned long *)regs->rsp;
    struct kretprobe_instance *ri;

    if ((ri = get_free_rp_inst(rp)) != NULL) {
+       if (!(rp->pre_handler && rp->pre_handler(rp, regs))) {
            ri->rp = rp;
            ri->task = current;
            ri->ret_addr = (kprobe_opcode_t *) *sara;

            /* Replace the return addr with trampoline addr */
            *sara = (unsigned long) &kretprobe_trampoline;
            add_rp_inst(ri);
+       }
    } else {
        rp->nmissed++;
    }
}
(Repeat for other archs, and add pre_handler to struct kretprobe...)

If we push for a patch like this to kprobes, then we can use it when the
kretprobe pre_handler is available, or else use your method for older
kernels.

Josh

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Accessing target variables in return probes (was Re: src ChangeLog  tapsets.cxx)
  2006-12-15  0:41 Stone, Joshua I
@ 2006-12-18 13:20 ` Frank Ch. Eigler
  0 siblings, 0 replies; 6+ messages in thread
From: Frank Ch. Eigler @ 2006-12-18 13:20 UTC (permalink / raw)
  To: Stone, Joshua I; +Cc: systemtap

Hi, Josh -

> > [stack-pointer-sensitive kretprobe skip detection]
> [...]  Have you prototyped this at all?

Nope.  I figure one might simulate it with an explicit .return probe
that periodically skips processing.

> We also discussed the possibility of extending kretprobes with a
> pre_handler -- a handler that is called during entry, but only if
> the return-probe instance was successfully allocated.  [...]

If that were the only condition under which data for return & entry
probes might become unsynchronized, that could be sufficient.  It
would be nice if the early prologue debugging information were good
enough to extract parameters at that point.

Perhaps a way around this is to have the translator test the debugging
info at the entry point address.  If it fails for some reason, then
fall back to prologue analysis.  Depending on the reliability of this
heuristic, this could apply to all .function() probes.

- FChE

^ permalink raw reply	[flat|nested] 6+ messages in thread

[parent not found: <C56DB814FAA30B418C75310AC4BB279DF731A9@scsmsx413.amr.corp.intel.com>]

* Accessing target variables in return probes (was Re: src ChangeLog  tapsets.cxx)
       [not found] <C56DB814FAA30B418C75310AC4BB279DF731A9@scsmsx413.amr.corp.intel.com>
@ 2006-11-21 21:28 ` David Smith
  0 siblings, 0 replies; 6+ messages in thread
From: David Smith @ 2006-11-21 21:28 UTC (permalink / raw)
  To: Stone, Joshua I; +Cc: Systemtap List

Stone, Joshua I wrote:
> On Monday, November 20, 2006 2:45 PM, dsmith@sourceware.org wrote:
>> Log message:
>> 	2006-11-20  David Smith  <dsmith@redhat.com>
>>
>> 	* tapsets.cxx
>> 	(dwarf_var_expanding_copy_visitor::visit_target_symbol):  BZ
>> 	1382.  Target variables can be accessed in return probes.  A new
>> 	function entry probe is generated that saves the target
> variables
>> 	so that they can be accesssed in the return probe.
> 
> This looks very cool!  It will be nice to have complicated functionality
> like this that "just works" for the end-user.
> 
> I don't know if this is still a work in progress, so I apologize if I'm
> critiquing prematurely.  If nothing else, this is just me jotting down
> the thoughts in my head.

Well, I thought it was done, but after reading your email I realize it 
isn't...

> Here's a few things that I see from inspection:
> 
> * The tvar_ctr needs to be per-thread, else you get a race condition.
> Consider the simple case where two threads run the same function,
> interleaving counter access.  (i.e., T1 enters, T2 enters, T1 returns,
> T2 returns).  The threads will end up reading each other's data!

Sigh.  I can't believe I missed this since I sat down and wrote out a 
similar situation.  Actually the threads will end up getting zeros (the 
default value).   Since the cached parameter map uses 'tid()' as an 
index, the following will happen:

tvar_cached_data[t1, 0] = $tvar  (thread 1 entry)
tvar_cached_data[t2, 1] = $tvar  (thread 2 entry)
x = tvar_cached_data[t1, 1]  (thread 1 exit - no data at [t1, 1])
x = tvar_cached_data[t2, 0]  (thread 2 exit - no data at [t2, 0])

To do per-thread counters, I think I'll need another map to store the 
counter value for that thread.  So, accessing cached parameter would 
look something like:

tvar_cached_data[tid(), tvar_ctr[tid()]]

Does that look like it would work?  Got any better ideas?

> * The counter gets decremented multiple times if someone accesses a
> target variable within a loop body.

Gack.  I had tested the same parameter getting accessed multiple times, 
but not one parameter getting accessed in a loop.

See follow-up thought below.

> * The data is still hanging around in the maps after the probe is done,
> and the maps are limited in the number of entries.  In manually written
> code I would 'delete' from the map after reading it, but I can see how
> injecting a delete might be challenging.  It might be easier if the map
> runtime offered a read-and-delete call.

Because of the above problem of the counter getting decremented multiple 
times, I'm going to have to figure out how to inject code (that gets 
called once!) to decrement the counter.  I might as well add a "delete" 
there too.

> * As a later optimization, consider merging duplicate target-var
> references into a single temp call.  For example, 'probe
> syscall.open.return { if ($flags) print($flags) }' right now duplicates
> the effort for each $flags reference.

Currently the code to get the actual parameter value gets merged to a 
single function.  You are correct that the parameter caching map calls 
get duplicated.

This actually might be considered a requirement, since if you have two 
different return probes for the same function that access the same 
cached parameter, we want them to be separate (so the counters are 
handled correctly).

> * For the previous three, it would probably be easier if you can inject
> the cleanup code at the end of the return probe.  Do the map read at the
> user's reference site(s), and then do a delete and ctr-- at the end.
> (Beware code that uses 'next' to skip out early...)

I'm thinking that it might be possible to inject the code at the 
*beginning* of the function, with the help of a temporary variable. 
Something like this:

probe kernel.function("foo").return
{
{ /* start of injected code */
    tvar_cache_tmp = tvar_cached_data[tid(), tvar_ctr[tid()]]
    delete tvar_cached_data[tid(), tvar_ctr[tid()]--]
   /* end of injected code */
}
... rest of original probe ...
}

then use 'tvar_cache_tmp' everywhere the target variable was accessed.

Josh, do you (or anyone else) know if it would be a good idea to delete 
the counter map value when it reaches 0?

> * A potential bug: due to the way string targets are read, only the
> string pointer is being saved for the return probe.  The
> 'kernel/user_string' call happens in the return probe.  In most cases
> this is probably ok, but there might be some functions where that string
> is manipulated during the course of execution.  The cleanest way I can
> think to solve this is to extend the language to automatically treat a
> target like a string -- probably using '@mystring' like we do with
> positional parameters.  Behind the scenes we can look at the pointer
> range to figure whether to use kernel_string or user_string.

Hmm, I wonder if we could use a function for that and force users to use 
it everywhere.  It would be safer and easier.

> Thanks for letting me brain-dump...

No, thank you for pointing out the numerous flaws in this before it 
started getting used.

-- 
David Smith
dsmith@redhat.com
Red Hat
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2006-12-17 19:08 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-11-22  7:40 Accessing target variables in return probes (was Re: src ChangeLog tapsets.cxx) Stone, Joshua I
2006-12-14 18:27 ` David Smith
2006-12-14 23:17   ` Frank Ch. Eigler
  -- strict thread matches above, loose matches on Subject: below --
2006-12-15  0:41 Stone, Joshua I
2006-12-18 13:20 ` Frank Ch. Eigler
     [not found] <C56DB814FAA30B418C75310AC4BB279DF731A9@scsmsx413.amr.corp.intel.com>
2006-11-21 21:28 ` David Smith

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).