public inbox for systemtap@sourceware.org
 help / color / mirror / Atom feed
* measuring accept queueing time (revisited)
@ 2012-11-30  3:06 Eric Wong
  2012-11-30 17:32 ` Frank Ch. Eigler
  0 siblings, 1 reply; 6+ messages in thread
From: Eric Wong @ 2012-11-30  3:06 UTC (permalink / raw)
  To: systemtap

Hi all, I'm wondering if there's an updated/better version of
the tapset posted here:

  http://sourceware.org/ml/systemtap/2006-q1/msg00385.html

I can probably figure it out on my own, but would like to avoid
doing so if somebody's already done the work :)

In userspace, I've been relying on tcpi_last_data_recv from the TCP_INFO
getsockopt().  but I'd rather not have to modify/restart every TCP
server to get that info.

I'd also like to be able to get the same accept queueing time for
Unix sockets, too.

Thanks for reading.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: measuring accept queueing time (revisited)
  2012-11-30  3:06 measuring accept queueing time (revisited) Eric Wong
@ 2012-11-30 17:32 ` Frank Ch. Eigler
  2012-11-30 22:35   ` Eric Wong
  0 siblings, 1 reply; 6+ messages in thread
From: Frank Ch. Eigler @ 2012-11-30 17:32 UTC (permalink / raw)
  To: Eric Wong; +Cc: systemtap


Hi, Eric -


normalperson (we'll just see about that :-) wrote:

> Hi all, I'm wondering if there's an updated/better version of
> the [script] posted here:
>
>   http://sourceware.org/ml/systemtap/2006-q1/msg00385.html
>
> I can probably figure it out on my own, but would like to avoid
> doing so if somebody's already done the work :)

Yeah, some changes since then:
- data structures are more easily traversed by pure
  script code -> and @cast type operators
- the kernel has useful tracepoints compiled in for more events,
  **but** this does not currently seem to include any interesting
  tcp/network/socket stuff
- the translator has gotten better placing probes
- the compiler has gotten better at generating debug data to let
  us find data 

So these days, I'd try to go for intercepting the inline functions

   probe kernel.function("sk_acceptq_added") {
      /* record $sk, timestamp */
   }
   probe kernel.function("sk_acceptq_removed") { 
      /* match $sk  */
      /* record elapsed time, execname(), pid() */
   }

or their (source-code-reading-determined) callers.

And perhaps try to raise a flag on lkml to get some tracepoints into
the tcp/ip layer pretty please.


> [...]  I'd also like to be able to get the same accept queueing time
> for Unix sockets, too.

There doesn't seem to be much queuing logic in UNIX sockets, for the
connect/accept path at least.


- FChE

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: measuring accept queueing time (revisited)
  2012-11-30 17:32 ` Frank Ch. Eigler
@ 2012-11-30 22:35   ` Eric Wong
  2012-11-30 23:37     ` Josh Stone
  0 siblings, 1 reply; 6+ messages in thread
From: Eric Wong @ 2012-11-30 22:35 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: systemtap

"Frank Ch. Eigler" <fche@redhat.com> wrote:
> normalperson (we'll just see about that :-) wrote:
> 
> > Hi all, I'm wondering if there's an updated/better version of
> > the [script] posted here:
> >
> >   http://sourceware.org/ml/systemtap/2006-q1/msg00385.html
> >
> > I can probably figure it out on my own, but would like to avoid
> > doing so if somebody's already done the work :)
> 
> Yeah, some changes since then:
> - data structures are more easily traversed by pure
>   script code -> and @cast type operators
> - the kernel has useful tracepoints compiled in for more events,
>   **but** this does not currently seem to include any interesting
>   tcp/network/socket stuff
> - the translator has gotten better placing probes
> - the compiler has gotten better at generating debug data to let
>   us find data 
> 
> So these days, I'd try to go for intercepting the inline functions
> 
>    probe kernel.function("sk_acceptq_added") {
>       /* record $sk, timestamp */
>    }
>    probe kernel.function("sk_acceptq_removed") { 
>       /* match $sk  */
>       /* record elapsed time, execname(), pid() */
>    }

Thank you!  With that hint, I came up with the following for TCP:
-------------------------------- 8<----------------------------------
global start

probe kernel.function("sk_acceptq_added") {
	start[$sk] = cpu_clock_ms(0)
}

probe kernel.function("sk_acceptq_removed") {
	started = start[$sk]
	if (started) {
		delete start[$sk]
		diff = cpu_clock_ms(0) - started
		printf("%s[%d] %ld\n", execname(), pid(), diff)
	}
}
-------------------------------- 8<----------------------------------

> or their (source-code-reading-determined) callers.
> 
> And perhaps try to raise a flag on lkml to get some tracepoints into
> the tcp/ip layer pretty please.

Will do if needed (not sure what to even ask for at this point, but
I'll try to figure it out).

> > [...]  I'd also like to be able to get the same accept queueing time
> > for Unix sockets, too.
> 
> There doesn't seem to be much queuing logic in UNIX sockets, for the
> connect/accept path at least.

I managed to come up with the following for UNIX sockets.  It could
probably be done better, but I'm not sure why my probe for unix_accept()
does not work.

Thus, I'm probing skb_recv_datagram() instead, but that might cast
too big of a net (I expect it to be slower, but haven't benched).

-------------------------------- 8<----------------------------------
global start

/*
 * 3.6.8 only has two unix_create1() callers:
 * - unix_stream_connect() - we care about this
 * - unix_create()         - we don't care about this
 */
probe kernel.function("unix_create1").return {
	start[$return] = cpu_clock_ms(0)
}

/*
 * we only care about unix_create1() called from unix_stream_connect(),
 * so avoid wasting space here.
 * Maybe future (or past) kernels had more unix_create1() callers,
 * that would be bad...
 */
probe kernel.function("unix_create").return {
	delete start[$return]
}

/* XXX: This doesn't seem to work, so I'm probing skb_recv_diagram instead */
probe kernel.function("unix_accept").return {
	if ($return != 0)
		next
	sk = $newsock->sk
	/* state: 3 == SS_CONNECTED, but this shows 1 (SS_UNCONNECTED) ... */
	/* printf("accept %d %lx %lx\n", $newsock->state, $newsock, sk); */
	started = start[sk]
	if (started) {
		delete start[sk]
		diff = cpu_clock_ms(0) - started
		printf("%s[%d] %ld\n", execname(), pid(), diff)
	}
}

probe kernel.function("skb_recv_datagram").return {
	if (!$return)
		next
	sk = $return->sk
	started = start[sk]
	if (started) {
		delete start[sk]
		diff = cpu_clock_ms(0) - started
		printf("%s[%d] %ld\n", execname(), pid(), diff)
	}
}
-------------------------------- 8<----------------------------------
I'm using an unpatched Linux 3.6.8 from the stable branch

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: measuring accept queueing time (revisited)
  2012-11-30 22:35   ` Eric Wong
@ 2012-11-30 23:37     ` Josh Stone
  2012-12-01  0:13       ` Eric Wong
  2012-12-02 19:48       ` Turgis, Frederic
  0 siblings, 2 replies; 6+ messages in thread
From: Josh Stone @ 2012-11-30 23:37 UTC (permalink / raw)
  To: systemtap

On 11/30/2012 02:35 PM, Eric Wong wrote:
> /* XXX: This doesn't seem to work, so I'm probing skb_recv_diagram instead */
> probe kernel.function("unix_accept").return {
> 	if ($return != 0)
> 		next
> 	sk = $newsock->sk
> 	/* state: 3 == SS_CONNECTED, but this shows 1 (SS_UNCONNECTED) ... */
> 	/* printf("accept %d %lx %lx\n", $newsock->state, $newsock, sk); */

This is probably an instance of one common confusion -- in .return
probes, all instances of $var, $var->member, etc. are captured when the
function is *entered* (except for $return, of course).  The context in
which the return trampoline actually runs is just not amenable to
decoding DWARF from the function we just left.

So it makes twisted sense to see SS_UNCONNECTED from the entry of
unix_accept, but you want to see the SS_CONNECTED when it returns.

One way to cheat this is to go ahead and grab the implicit entry value
for $newsock, and then use a dynamic @cast to get the member you really
want at return time.  Something like:
  @cast($newsock, "struct socket")->state

You probably need the same for $newsock->sk too.

We had an old bug arguing about whether delaying member access should be
standard, which I can't find now.  Suffice to say we never agreed to
change this behavior, so here we are.  PR14437 would give a mechanism to
be explicit about it though, like @entry($newsock)->state.
  http://sourceware.org/bugzilla/show_bug.cgi?id=14437

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: measuring accept queueing time (revisited)
  2012-11-30 23:37     ` Josh Stone
@ 2012-12-01  0:13       ` Eric Wong
  2012-12-02 19:48       ` Turgis, Frederic
  1 sibling, 0 replies; 6+ messages in thread
From: Eric Wong @ 2012-12-01  0:13 UTC (permalink / raw)
  To: systemtap

Josh Stone <jistone@redhat.com> wrote:
> On 11/30/2012 02:35 PM, Eric Wong wrote:
> > /* XXX: This doesn't seem to work, so I'm probing skb_recv_diagram instead */
> > probe kernel.function("unix_accept").return {
> > 	if ($return != 0)
> > 		next
> > 	sk = $newsock->sk
> > 	/* state: 3 == SS_CONNECTED, but this shows 1 (SS_UNCONNECTED) ... */
> > 	/* printf("accept %d %lx %lx\n", $newsock->state, $newsock, sk); */
> 
> This is probably an instance of one common confusion -- in .return
> probes, all instances of $var, $var->member, etc. are captured when the
> function is *entered* (except for $return, of course).  The context in
> which the return trampoline actually runs is just not amenable to
> decoding DWARF from the function we just left.
> 
> So it makes twisted sense to see SS_UNCONNECTED from the entry of
> unix_accept, but you want to see the SS_CONNECTED when it returns.
> 
> One way to cheat this is to go ahead and grab the implicit entry value
> for $newsock, and then use a dynamic @cast to get the member you really
> want at return time.  Something like:
>   @cast($newsock, "struct socket")->state
> 
> You probably need the same for $newsock->sk too.

Ah, thank you!  I was very confused by sk==0 and state==SS_UNCONNECTED
but your explanation makes sense.

> We had an old bug arguing about whether delaying member access should be
> standard, which I can't find now.  Suffice to say we never agreed to
> change this behavior, so here we are.  PR14437 would give a mechanism to
> be explicit about it though, like @entry($newsock)->state.
>  http://sourceware.org/bugzilla/show_bug.cgi?id=14437

That's the one surprising/confusing thing I've found about the language
so far.  Otherwise, I'm very happy at how easy and powerful this is!

I think I'll just declare start with '%' for wrapping and not worry
about future/past callers of unix_create1().  I may not even need to
probe unix_create() at all with wrapping...

Here's my updated unix_accept.stp:
-------------------------------- 8< ------------------------------
/* wrap in case there are more unix_create1() callers in other versions */
global start%

/*
 * 3.6.8 only has two unix_create1() callers:
 * - unix_stream_connect() - we care about this
 * - unix_create()         - we don't care about this
 */
probe kernel.function("unix_create1").return {
	start[$return] = cpu_clock_ms(0)
}

/*
 * we only care about unix_create1() called from unix_stream_connect(),
 * so avoid wasting space here.
 */
probe kernel.function("unix_create").return {
	delete start[$return]
}

probe kernel.function("unix_accept").return {
	if ($return != 0)
		next
	sk = @cast($newsock, "struct socket")->sk
	started = start[sk]
	if (started) {
		delete start[sk]
		diff = cpu_clock_ms(0) - started
		printf("%s[%d] %ld\n", execname(), pid(), diff)
	}
}
-------------------------------- 8< ------------------------------

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: measuring accept queueing time (revisited)
  2012-11-30 23:37     ` Josh Stone
  2012-12-01  0:13       ` Eric Wong
@ 2012-12-02 19:48       ` Turgis, Frederic
  1 sibling, 0 replies; 6+ messages in thread
From: Turgis, Frederic @ 2012-12-02 19:48 UTC (permalink / raw)
  To: systemtap; +Cc: Josh Stone

>One way to cheat this is to go ahead and grab the implicit entry value
>for $newsock, and then use a dynamic @cast to get the member you really
>want at return time.  Something like:
>  @cast($newsock, "struct socket")->state
>

As a small addition to Josh comments: http://sourceware.org/ml/systemtap/2012-q1/msg00025.html summarizes few experiments, there seems to be also the possibility to control arrays in return probe.
But http://sourceware.org/bugzilla/show_bug.cgi?id=14437 is clearly an answer to this.


Regards
Fred

OMAP Platform Business Unit - System Platform Engineering - Platform & Product Entitlement


Texas Instruments France SA, 821 Avenue Jack Kilby, 06270 Villeneuve Loubet. 036 420 040 R.C.S Antibes. Capital de EUR 12.654.784


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2012-12-02 19:48 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-11-30  3:06 measuring accept queueing time (revisited) Eric Wong
2012-11-30 17:32 ` Frank Ch. Eigler
2012-11-30 22:35   ` Eric Wong
2012-11-30 23:37     ` Josh Stone
2012-12-01  0:13       ` Eric Wong
2012-12-02 19:48       ` Turgis, Frederic

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).