* measuring accept queueing time (revisited)
@ 2012-11-30 3:06 Eric Wong
2012-11-30 17:32 ` Frank Ch. Eigler
0 siblings, 1 reply; 6+ messages in thread
From: Eric Wong @ 2012-11-30 3:06 UTC (permalink / raw)
To: systemtap
Hi all, I'm wondering if there's an updated/better version of
the tapset posted here:
http://sourceware.org/ml/systemtap/2006-q1/msg00385.html
I can probably figure it out on my own, but would like to avoid
doing so if somebody's already done the work :)
In userspace, I've been relying on tcpi_last_data_recv from the TCP_INFO
getsockopt(). but I'd rather not have to modify/restart every TCP
server to get that info.
I'd also like to be able to get the same accept queueing time for
Unix sockets, too.
Thanks for reading.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: measuring accept queueing time (revisited)
2012-11-30 3:06 measuring accept queueing time (revisited) Eric Wong
@ 2012-11-30 17:32 ` Frank Ch. Eigler
2012-11-30 22:35 ` Eric Wong
0 siblings, 1 reply; 6+ messages in thread
From: Frank Ch. Eigler @ 2012-11-30 17:32 UTC (permalink / raw)
To: Eric Wong; +Cc: systemtap
Hi, Eric -
normalperson (we'll just see about that :-) wrote:
> Hi all, I'm wondering if there's an updated/better version of
> the [script] posted here:
>
> http://sourceware.org/ml/systemtap/2006-q1/msg00385.html
>
> I can probably figure it out on my own, but would like to avoid
> doing so if somebody's already done the work :)
Yeah, some changes since then:
- data structures are more easily traversed by pure
script code -> and @cast type operators
- the kernel has useful tracepoints compiled in for more events,
**but** this does not currently seem to include any interesting
tcp/network/socket stuff
- the translator has gotten better placing probes
- the compiler has gotten better at generating debug data to let
us find data
So these days, I'd try to go for intercepting the inline functions
probe kernel.function("sk_acceptq_added") {
/* record $sk, timestamp */
}
probe kernel.function("sk_acceptq_removed") {
/* match $sk */
/* record elapsed time, execname(), pid() */
}
or their (source-code-reading-determined) callers.
And perhaps try to raise a flag on lkml to get some tracepoints into
the tcp/ip layer pretty please.
> [...] I'd also like to be able to get the same accept queueing time
> for Unix sockets, too.
There doesn't seem to be much queuing logic in UNIX sockets, for the
connect/accept path at least.
- FChE
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: measuring accept queueing time (revisited)
2012-11-30 17:32 ` Frank Ch. Eigler
@ 2012-11-30 22:35 ` Eric Wong
2012-11-30 23:37 ` Josh Stone
0 siblings, 1 reply; 6+ messages in thread
From: Eric Wong @ 2012-11-30 22:35 UTC (permalink / raw)
To: Frank Ch. Eigler; +Cc: systemtap
"Frank Ch. Eigler" <fche@redhat.com> wrote:
> normalperson (we'll just see about that :-) wrote:
>
> > Hi all, I'm wondering if there's an updated/better version of
> > the [script] posted here:
> >
> > http://sourceware.org/ml/systemtap/2006-q1/msg00385.html
> >
> > I can probably figure it out on my own, but would like to avoid
> > doing so if somebody's already done the work :)
>
> Yeah, some changes since then:
> - data structures are more easily traversed by pure
> script code -> and @cast type operators
> - the kernel has useful tracepoints compiled in for more events,
> **but** this does not currently seem to include any interesting
> tcp/network/socket stuff
> - the translator has gotten better placing probes
> - the compiler has gotten better at generating debug data to let
> us find data
>
> So these days, I'd try to go for intercepting the inline functions
>
> probe kernel.function("sk_acceptq_added") {
> /* record $sk, timestamp */
> }
> probe kernel.function("sk_acceptq_removed") {
> /* match $sk */
> /* record elapsed time, execname(), pid() */
> }
Thank you! With that hint, I came up with the following for TCP:
-------------------------------- 8<----------------------------------
global start
probe kernel.function("sk_acceptq_added") {
start[$sk] = cpu_clock_ms(0)
}
probe kernel.function("sk_acceptq_removed") {
started = start[$sk]
if (started) {
delete start[$sk]
diff = cpu_clock_ms(0) - started
printf("%s[%d] %ld\n", execname(), pid(), diff)
}
}
-------------------------------- 8<----------------------------------
> or their (source-code-reading-determined) callers.
>
> And perhaps try to raise a flag on lkml to get some tracepoints into
> the tcp/ip layer pretty please.
Will do if needed (not sure what to even ask for at this point, but
I'll try to figure it out).
> > [...] I'd also like to be able to get the same accept queueing time
> > for Unix sockets, too.
>
> There doesn't seem to be much queuing logic in UNIX sockets, for the
> connect/accept path at least.
I managed to come up with the following for UNIX sockets. It could
probably be done better, but I'm not sure why my probe for unix_accept()
does not work.
Thus, I'm probing skb_recv_datagram() instead, but that might cast
too big of a net (I expect it to be slower, but haven't benched).
-------------------------------- 8<----------------------------------
global start
/*
* 3.6.8 only has two unix_create1() callers:
* - unix_stream_connect() - we care about this
* - unix_create() - we don't care about this
*/
probe kernel.function("unix_create1").return {
start[$return] = cpu_clock_ms(0)
}
/*
* we only care about unix_create1() called from unix_stream_connect(),
* so avoid wasting space here.
* Maybe future (or past) kernels had more unix_create1() callers,
* that would be bad...
*/
probe kernel.function("unix_create").return {
delete start[$return]
}
/* XXX: This doesn't seem to work, so I'm probing skb_recv_diagram instead */
probe kernel.function("unix_accept").return {
if ($return != 0)
next
sk = $newsock->sk
/* state: 3 == SS_CONNECTED, but this shows 1 (SS_UNCONNECTED) ... */
/* printf("accept %d %lx %lx\n", $newsock->state, $newsock, sk); */
started = start[sk]
if (started) {
delete start[sk]
diff = cpu_clock_ms(0) - started
printf("%s[%d] %ld\n", execname(), pid(), diff)
}
}
probe kernel.function("skb_recv_datagram").return {
if (!$return)
next
sk = $return->sk
started = start[sk]
if (started) {
delete start[sk]
diff = cpu_clock_ms(0) - started
printf("%s[%d] %ld\n", execname(), pid(), diff)
}
}
-------------------------------- 8<----------------------------------
I'm using an unpatched Linux 3.6.8 from the stable branch
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: measuring accept queueing time (revisited)
2012-11-30 22:35 ` Eric Wong
@ 2012-11-30 23:37 ` Josh Stone
2012-12-01 0:13 ` Eric Wong
2012-12-02 19:48 ` Turgis, Frederic
0 siblings, 2 replies; 6+ messages in thread
From: Josh Stone @ 2012-11-30 23:37 UTC (permalink / raw)
To: systemtap
On 11/30/2012 02:35 PM, Eric Wong wrote:
> /* XXX: This doesn't seem to work, so I'm probing skb_recv_diagram instead */
> probe kernel.function("unix_accept").return {
> if ($return != 0)
> next
> sk = $newsock->sk
> /* state: 3 == SS_CONNECTED, but this shows 1 (SS_UNCONNECTED) ... */
> /* printf("accept %d %lx %lx\n", $newsock->state, $newsock, sk); */
This is probably an instance of one common confusion -- in .return
probes, all instances of $var, $var->member, etc. are captured when the
function is *entered* (except for $return, of course). The context in
which the return trampoline actually runs is just not amenable to
decoding DWARF from the function we just left.
So it makes twisted sense to see SS_UNCONNECTED from the entry of
unix_accept, but you want to see the SS_CONNECTED when it returns.
One way to cheat this is to go ahead and grab the implicit entry value
for $newsock, and then use a dynamic @cast to get the member you really
want at return time. Something like:
@cast($newsock, "struct socket")->state
You probably need the same for $newsock->sk too.
We had an old bug arguing about whether delaying member access should be
standard, which I can't find now. Suffice to say we never agreed to
change this behavior, so here we are. PR14437 would give a mechanism to
be explicit about it though, like @entry($newsock)->state.
http://sourceware.org/bugzilla/show_bug.cgi?id=14437
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: measuring accept queueing time (revisited)
2012-11-30 23:37 ` Josh Stone
@ 2012-12-01 0:13 ` Eric Wong
2012-12-02 19:48 ` Turgis, Frederic
1 sibling, 0 replies; 6+ messages in thread
From: Eric Wong @ 2012-12-01 0:13 UTC (permalink / raw)
To: systemtap
Josh Stone <jistone@redhat.com> wrote:
> On 11/30/2012 02:35 PM, Eric Wong wrote:
> > /* XXX: This doesn't seem to work, so I'm probing skb_recv_diagram instead */
> > probe kernel.function("unix_accept").return {
> > if ($return != 0)
> > next
> > sk = $newsock->sk
> > /* state: 3 == SS_CONNECTED, but this shows 1 (SS_UNCONNECTED) ... */
> > /* printf("accept %d %lx %lx\n", $newsock->state, $newsock, sk); */
>
> This is probably an instance of one common confusion -- in .return
> probes, all instances of $var, $var->member, etc. are captured when the
> function is *entered* (except for $return, of course). The context in
> which the return trampoline actually runs is just not amenable to
> decoding DWARF from the function we just left.
>
> So it makes twisted sense to see SS_UNCONNECTED from the entry of
> unix_accept, but you want to see the SS_CONNECTED when it returns.
>
> One way to cheat this is to go ahead and grab the implicit entry value
> for $newsock, and then use a dynamic @cast to get the member you really
> want at return time. Something like:
> @cast($newsock, "struct socket")->state
>
> You probably need the same for $newsock->sk too.
Ah, thank you! I was very confused by sk==0 and state==SS_UNCONNECTED
but your explanation makes sense.
> We had an old bug arguing about whether delaying member access should be
> standard, which I can't find now. Suffice to say we never agreed to
> change this behavior, so here we are. PR14437 would give a mechanism to
> be explicit about it though, like @entry($newsock)->state.
> http://sourceware.org/bugzilla/show_bug.cgi?id=14437
That's the one surprising/confusing thing I've found about the language
so far. Otherwise, I'm very happy at how easy and powerful this is!
I think I'll just declare start with '%' for wrapping and not worry
about future/past callers of unix_create1(). I may not even need to
probe unix_create() at all with wrapping...
Here's my updated unix_accept.stp:
-------------------------------- 8< ------------------------------
/* wrap in case there are more unix_create1() callers in other versions */
global start%
/*
* 3.6.8 only has two unix_create1() callers:
* - unix_stream_connect() - we care about this
* - unix_create() - we don't care about this
*/
probe kernel.function("unix_create1").return {
start[$return] = cpu_clock_ms(0)
}
/*
* we only care about unix_create1() called from unix_stream_connect(),
* so avoid wasting space here.
*/
probe kernel.function("unix_create").return {
delete start[$return]
}
probe kernel.function("unix_accept").return {
if ($return != 0)
next
sk = @cast($newsock, "struct socket")->sk
started = start[sk]
if (started) {
delete start[sk]
diff = cpu_clock_ms(0) - started
printf("%s[%d] %ld\n", execname(), pid(), diff)
}
}
-------------------------------- 8< ------------------------------
^ permalink raw reply [flat|nested] 6+ messages in thread
* RE: measuring accept queueing time (revisited)
2012-11-30 23:37 ` Josh Stone
2012-12-01 0:13 ` Eric Wong
@ 2012-12-02 19:48 ` Turgis, Frederic
1 sibling, 0 replies; 6+ messages in thread
From: Turgis, Frederic @ 2012-12-02 19:48 UTC (permalink / raw)
To: systemtap; +Cc: Josh Stone
>One way to cheat this is to go ahead and grab the implicit entry value
>for $newsock, and then use a dynamic @cast to get the member you really
>want at return time. Something like:
> @cast($newsock, "struct socket")->state
>
As a small addition to Josh comments: http://sourceware.org/ml/systemtap/2012-q1/msg00025.html summarizes few experiments, there seems to be also the possibility to control arrays in return probe.
But http://sourceware.org/bugzilla/show_bug.cgi?id=14437 is clearly an answer to this.
Regards
Fred
OMAP Platform Business Unit - System Platform Engineering - Platform & Product Entitlement
Texas Instruments France SA, 821 Avenue Jack Kilby, 06270 Villeneuve Loubet. 036 420 040 R.C.S Antibes. Capital de EUR 12.654.784
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2012-12-02 19:48 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-11-30 3:06 measuring accept queueing time (revisited) Eric Wong
2012-11-30 17:32 ` Frank Ch. Eigler
2012-11-30 22:35 ` Eric Wong
2012-11-30 23:37 ` Josh Stone
2012-12-01 0:13 ` Eric Wong
2012-12-02 19:48 ` Turgis, Frederic
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).