From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <systemtap-return-5132-listarch-systemtap=sources.redhat.com@sourceware.org>
Received: (qmail 14653 invoked by alias); 14 Dec 2006 17:16:08 -0000
Received: (qmail 14642 invoked by uid 22791); 14 Dec 2006 17:16:07 -0000
X-Spam-Status: No, hits=-2.7 required=5.0 	tests=AWL,BAYES_00,SPF_HELO_PASS,SPF_PASS
X-Spam-Check-By: sourceware.org
Received: from mx1.redhat.com (HELO mx1.redhat.com) (66.187.233.31)     by sourceware.org (qpsmtpd/0.31) with ESMTP; Thu, 14 Dec 2006 17:15:59 +0000
Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) 	by mx1.redhat.com (8.12.11.20060308/8.12.11) with ESMTP id kBEHFvPt016837; 	Thu, 14 Dec 2006 12:15:57 -0500
Received: from pobox.hsv.redhat.com (pobox.hsv.redhat.com [172.16.16.12]) 	by int-mx1.corp.redhat.com (8.13.1/8.13.1) with ESMTP id kBEHFpwR011316; 	Thu, 14 Dec 2006 12:15:52 -0500
Received: from [172.16.17.170] (dhcp-170.hsv.redhat.com [172.16.17.170]) 	by pobox.hsv.redhat.com (8.12.11.20060308/8.12.11) with ESMTP id kBEHFppK017811; 	Thu, 14 Dec 2006 12:15:51 -0500
Message-ID: <458186C8.7010208@redhat.com>
Date: Thu, 14 Dec 2006 18:27:00 -0000
From: David Smith <dsmith@redhat.com>
User-Agent: Thunderbird 1.5.0.8 (X11/20061107)
MIME-Version: 1.0
To: "Stone, Joshua I" <joshua.i.stone@intel.com>
CC: Systemtap List <systemtap@sources.redhat.com>
Subject: Re: Accessing target variables in return probes (was Re: src ChangeLog  tapsets.cxx)
References: <C56DB814FAA30B418C75310AC4BB279DF737FA@scsmsx413.amr.corp.intel.com>
In-Reply-To: <C56DB814FAA30B418C75310AC4BB279DF737FA@scsmsx413.amr.corp.intel.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-IsSubscribed: yes
Mailing-List: contact systemtap-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <systemtap.sourceware.org>
List-Subscribe: <mailto:systemtap-subscribe@sourceware.org>
List-Post: <mailto:systemtap@sourceware.org>
List-Help: <mailto:systemtap-help@sourceware.org>, <http://sourceware.org/lists.html#faqs>
Sender: systemtap-owner@sourceware.org
X-SW-Source: 2006-q4/txt/msg00674.txt.bz2

Here's an update on this functionality (accessing target variables in 
return probes).

- When caching target variables for use in a return probe, the counters 
are per-thread.  This should fix almost all** problems with threads 
getting incorrect data or zeros.

- The code has been optimized so that multiple references to the same 
target variable in a probe is handled properly (both instances will 
refer to the same temporary variable that was initialized with a value 
from the cache).

- The code has been optimized so that only one entry probe is generated 
per return probe no matter how many target variables are referenced in 
the return probe.


** Currently, the only known problem when accessing target variables in 
a return probe is the problem of skipped return probes and a recursive 
probed kernel function.  There is no way to know if a return probe has 
been skipped, so in the case of probing a recursive kernel function, the 
cached values can get out of sync.  Valid values will be returned, just 
from the wrong call.

Stone, Joshua I wrote:
> On Tuesday, November 21, 2006 11:56 AM, David Smith wrote:
>> Stone, Joshua I wrote:
>>> * The tvar_ctr needs to be per-thread, else you get a race condition.
>>> Consider the simple case where two threads run the same function,
>>> interleaving counter access.  (i.e., T1 enters, T2 enters, T1
>>> returns, T2 returns).  The threads will end up reading each other's
>>> data! 
>> Sigh.  I can't believe I missed this since I sat down and wrote out a
>> similar situation.  Actually the threads will end up getting zeros
>> (the default value).
> 
> Ah, yes, didn't quite map that out right in my head -- it will keep the
> threads' data separate.  Zero values and mangled counters are still a
> problem, of course.
> 
>> To do per-thread counters, I think I'll need another map to store the
>> counter value for that thread.  So, accessing cached parameter would
>> look something like:
>>
>> tvar_cached_data[tid(), tvar_ctr[tid()]]
>>
>> Does that look like it would work?  Got any better ideas?
> 
> That should work just fine.
> 
>>> * As a later optimization, consider merging duplicate target-var
>>> references into a single temp call.  For example, 'probe
>>> syscall.open.return { if ($flags) print($flags) }' right now
>>> duplicates the effort for each $flags reference.
>> Currently the code to get the actual parameter value gets merged to a
>> single function.  You are correct that the parameter caching map calls
>> get duplicated.
> 
> The dwarf function only exists once, but the new entry probe, the
> temporary maps, and the read from the maps are all duplicated.
> 
>> This actually might be considered a requirement, since if you have two
>> different return probes for the same function that access the same
>> cached parameter, we want them to be separate (so the counters are
>> handled correctly).
> 
> For distinct probes, sure, it's much easier to keep them separate.  We
> don't want the headache of separate probes trying to coordinate what
> order they access it in, or which one should delete and decrement.  But
> when the reference is duplicated within a single probe, it seems like it
> would be easy to read the value from the same place, especially if
> you're going to read it into a local at the top of the probe anyway.
> 
> This sort of issue may crop up more often if scripts get more elaborate
> with probe aliasing -- like this intentionally convoluted example:
> 
> /* a custom tapset that defines an interesting probe point */
> probe sysopen_with_flags = syscall.open.return {
> 	if (!$flags) next
> }
> 
> /* a script file that uses it. */
> probe sysopen_with_flags {
> 	printf("sysopen flags:%d = %d\n", $flags, $return)
> }
> 
>> I'm thinking that it might be possible to inject the code at the
>> *beginning* of the function, with the help of a temporary variable.
>> Something like this:
>>
>> probe kernel.function("foo").return
>> {
>> { /* start of injected code */
>>     tvar_cache_tmp = tvar_cached_data[tid(), tvar_ctr[tid()]]
>>     delete tvar_cached_data[tid(), tvar_ctr[tid()]--]
>>    /* end of injected code */
>> }
>> ... rest of original probe ...
>> }
>>
>> then use 'tvar_cache_tmp' everywhere the target variable was accessed.
> 
> That looks good.  You can also cache tid() into a local instead of
> paying for the function call four times.
> 
> You can probably use some internal bookkeeping to track which variables
> you've injected code for, and just reuse the tvar_cached_temp if it's
> already there.  This solves the multiple-reference problem above.
> 
>> Josh, do you (or anyone else) know if it would be a good idea to
>> delete the counter map value when it reaches 0?
> 
> No need -- setting a value to zero removes it from the map.
> 
>>> * A potential bug: due to the way string targets are read, only the
>>> string pointer is being saved for the return probe.  The
>>> 'kernel/user_string' call happens in the return probe.  In most cases
>>> this is probably ok, but there might be some functions where that
>>> string is manipulated during the course of execution.  The cleanest
>>> way I can think to solve this is to extend the language to
>>> automatically treat a target like a string -- probably using
>>> '@mystring' like we do with positional parameters.  Behind the
>>> scenes we can look at the pointer range to figure whether to use
>>> kernel_string or user_string. 
>> Hmm, I wonder if we could use a function for that and force users to
>> use it everywhere.  It would be safer and easier.
> 
> A function for automatically choosing user_string or kernel_string?
> Yeah, that would be a good thing to have.
> 
> I still like the idea of a '@var' syntax for target strings too...
> 
> 
> Josh
> 
> 
> (David will get this twice, as I forgot to CC the list the first time.
> Sorry!)


-- 
David Smith
dsmith@redhat.com
Red Hat
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)