Subject: Example script to look at delays handling network RX packets
From: William Cohen
Date: 2010-10-27 20:04 UTC
  To: SystemTAP; +Cc: Andrew Gospodarek, Neil Horman

[-- Attachment #1: Type: text/plain, Size: 1616 bytes --]

Hi All,

This past week I was talking with Andy Gospodarek about a script that could help diagnose why received packets were being dropped by the kernel for a gigabit ethernet card. The network card uses a circular ring buffer to hold information for the received packets. If the packets are small, excessive latency in the kernel's handling could lead to older entries in the circular ring buffer being overwritten and some rx packets being lost. This issue can be observed by looking at hardware counters on the network card. If the dropped packets are TCP, applications are not going to notice because the packets are automatically retransmitted.
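(On most drivers the per-device drop counters can be inspected with "ethtool -S <interface>"; the exact counter names, e.g. rx_fifo_errors, vary from driver to driver.)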

The attached script, napi_poll_rx.stp, attempts to provide more information about the latencies between different places in the code. It gathers statistics on the time between when an rx operation is scheduled and when napi polling starts, the time between successive napi_poll operations, and the number of napi_poll operations per rx operation.

I would like to incorporate this into the systemtap examples. Some questions related to the current script:

-Would it be possible to have a tracepoint for __netif_rx_schedule to reduce overhead and eliminate the need for debuginfo? (See the sketch after this list for checking which tracepoints a kernel already exports.)
-Would it make sense to remove cpu() from the associative arrays and just key off the $napi pointer, to allow
	netif_rx_schedule/napi_poll pairs on different processors to match up?
	Or could multiple processors be using the same $napi pointer at the same time?
-Are there other things that the script could be measuring to give a better understanding of latency issues?
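
For the first question, the probe points a given kernel already exports can be listed with stap -l; the wildcard patterns below are only a guess at the relevant tracepoint names:

	stap -l 'kernel.trace("*napi*")'
	stap -l 'kernel.trace("net*")'

If nothing suitable shows up, the function probe plus debuginfo used in the attached script remains the fallback.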

Any feedback on the script would be appreciated.

-Will


[-- Attachment #2: napi_poll_rx.stp --]
[-- Type: text/plain, Size: 2653 bytes --]

/* This script is designed to answer the following questions:
1) The amount of time between the network card scheduling rx handling and
	the first napi poll that services it (rx_to_napi)
2) Statistics on the napi polling operations between rx scheduling events:
	a) number of napi poll ops per rx scheduling op (napi_per_rx)
	b) min, average, and max time between successive polls (napi_to_napi)
3) Time between the last napi polling operation and the next network card
	rx scheduling (napi_to_rx)
*/
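
/* Usage: run "stap napi_poll_rx.stp", let it run while the rx traffic of
   interest is flowing, then press Ctrl-C; the "probe end" handler at the
   bottom prints the collected statistics. */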

global rx_time, napi_time, napi_count
global rx_to_napi, napi_to_rx, napi_to_napi, napi_per_rx

probe kernel.function("__netif_rx_schedule") {
  t = gettimeofday_us(); p = cpu();
  /* key on the probe's device argument; this must pair up with the key
     used in the napi_poll probe for the cross-probe lookups to match */
  dev = $dev;
  /* remember when rx handling was scheduled for this device/cpu */
  rx_time[dev,p] = t;
  nt = napi_time[dev,p];
  if (nt) {
    /* time from the last napi poll to this rx schedule, and the number
       of napi polls seen since the previous rx schedule */
    napi_to_rx[dev,p] <<< t - nt;
    napi_per_rx[dev,p] <<< napi_count[dev,p];
    delete napi_time[dev,p]
    delete napi_count[dev,p]
  }
}


probe kernel.trace("napi_poll") {
  t = gettimeofday_us(); p = cpu();
  dev = $napi
  pt = napi_time[dev,p];
  /* Check to see if this was the first napi poll after an rx schedule */
  ni = rx_time[dev,p];
  if (ni) {
    delete rx_time[dev,p];
    rx_to_napi[dev,p] <<< t - ni;
  } else {
    /* not the first poll; record the time since the previous poll */
    if (pt)
      napi_to_napi[dev,p] <<< t - pt;
  }
  napi_time[dev,p] = t;
  napi_count[dev,p]++
}
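
/* Illustrative timeline for one device/cpu pair (times in us, made up):
	t=100  __netif_rx_schedule  -> rx_time = 100
	t=130  napi_poll (first)    -> rx_to_napi <<< 30
	t=150  napi_poll            -> napi_to_napi <<< 20
	t=400  __netif_rx_schedule  -> napi_to_rx <<< 250, napi_per_rx <<< 2
*/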

probe end {
  /* print out the results */
  printf("\nrx_to_napi\n");
  printf("%18s %4s %12s %12s %12s %12s\n", "device",
         "cpu", "count", "min(us)", "avg(us)", "max(us)");
  foreach([dev+, p] in rx_to_napi)
    printf("%18p %4d %12d %12d %12d %12d\n",
         dev, p, @count(rx_to_napi[dev, p]),
         @min(rx_to_napi[dev, p]), @avg(rx_to_napi[dev, p]),
         @max(rx_to_napi[dev, p]));

  printf("\nnapi_to_napi\n");
  printf("%18s %4s %12s %12s %12s %12s\n", "device",
         "cpu", "count", "min(us)", "avg(us)", "max(us)");
  foreach([dev+, p] in napi_to_napi)
    printf("%18p %4d %12d %12d %12d %12d\n",
         dev, p, @count(napi_to_napi[dev, p]),
         @min(napi_to_napi[dev, p]), @avg(napi_to_napi[dev, p]),
         @max(napi_to_napi[dev, p]));

  printf("\nlast_napi_to_rx\n");
  printf("%18s %4s %12s %12s %12s %12s\n", "device",
         "cpu", "count", "min(us)", "avg(us)", "max(us)");
  foreach([dev+, p] in napi_to_rx)
    printf("%18p %4d %12d %12d %12d %12d\n",
         dev, p, @count(napi_to_rx[dev, p]),
         @min(napi_to_rx[dev, p]), @avg(napi_to_rx[dev, p]),
         @max(napi_to_rx[dev, p]));

  printf("\nnapi_per_rx\n");
  printf("%18s %4s %12s %12s %12s %12s\n", "device",
         "cpu", "samples", "min", "avg", "max");
  foreach([dev+, p] in napi_per_rx)
    printf("%18p %4d %12d %12d %12d %12d\n",
         dev, p, @count(napi_per_rx[dev, p]),
         @min(napi_per_rx[dev, p]), @avg(napi_per_rx[dev, p]),
         @max(napi_per_rx[dev, p]));
}
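
/* A possible extension (a sketch, not part of the measurements described
   above): @hist_log prints a log2 histogram of an aggregate, showing the
   latency distribution rather than just min/avg/max.  Multiple "probe end"
   handlers are allowed; this one reuses the rx_to_napi data collected
   above. */
probe end {
  printf("\nrx_to_napi distribution (us)\n");
  foreach([dev+, p] in rx_to_napi) {
    printf("device %p cpu %d\n", dev, p);
    print(@hist_log(rx_to_napi[dev, p]));
  }
}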

