From: William Cohen
Date: Mon, 27 Feb 2006 21:06:00 -0000
To: SystemTAP
Subject: Systemtap benchmarking doc draft

Here are some thoughts on benchmarking SystemTap. As Brad Chen would
say, "got any tomatoes?" There are certainly things that need to be
refined in this document; it is definitely a work in progress. Probably
the most useful feedback would be on the metrics. Do they contain
useful information? Should some be dropped? Should others be added?

-Will

[attached file: stap_benchmarking]

Systemtap Benchmarking
Feb 27, 2006

1. OVERVIEW

Instrumentation that collects measurements perturbs the system being
measured, and the question is often how much perturbation is
introduced. Kernel developers in particular want to know the cost of
the instrumentation, so they can judge whether it will add unacceptable
overhead to the system or perturb it so much that the measurements
become inaccurate. This document outlines a suggested set of metrics
for measuring SystemTap's performance, a framework for collecting the
data, and a way of presenting the results.
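As a rough illustration of the kind of perturbation in question (this
is only a sketch, not one of the proposed benchmarks), a user could
gauge probe overhead by timing a repeatable workload with and without a
nearly empty probe active. The probed function (vfs_read), the
workload, and the script name below are arbitrary choices made for the
example.

    # count_reads.stp: a nearly empty probe; its only work is a counter
    # increment, so the slowdown it causes approximates raw probe overhead
    global n
    probe kernel.function("vfs_read") { n++ }
    probe end { printf("vfs_read fired %d times\n", n) }

    $ time make -j2                       # baseline run of the workload
    $ time stap -c "make -j2" count_reads.stp   # workload with the probe
                                                # active; includes stap's
                                                # own startup cost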
2. METRICS

The selected metrics should give insight into the cost and overhead of
common operations in SystemTap and its supporting software. SystemTap
users should be able to gauge how much the instrumentation will affect
the system, and the metrics also allow the SystemTap developers to
monitor SystemTap for performance regressions.

Kernel micro-metrics (kprobe, kretprobe, and djprobes costs):
    insertion
    removal
    minimal probe firing
    colocated probes
    maximum number of probes active
    space requirement per probe

SystemTap language micro-metrics (costs of various operations):
    cost of associative array operations

Kernel instrumentation limits:
    number of probes that can be registered at a time
    number of kretprobes active

SystemTap limits:
    maximum elements in an associative array
    number of actions/steps in a probe

Instrumentation process costs:
    latency from the time stap is started to the time the
      instrumentation is running
    latency to shut down the instrumentation
    profile of where time is spent during the instrumentation process
    size of the instrumentation kernel modules

3. BENCHMARK DATA COLLECTION

For some of the benchmarks, data could be collected from the existing
testsuite scripts. However, some additional benchmarks will be required
to exercise particular aspects of the system. A variety of mechanisms
will be used to collect the benchmark information. In some cases
additional options will need to be passed to the systemtap translator
to report the latency of its different phases.

3.1 USING EXISTING TESTSUITE INFRASTRUCTURE

There are existing testsuites for systemtap and its kernel support.
Currently these are functional tests that determine whether some aspect
of the system is working correctly. However, many of the systemtap
scripts in them are representative of what a user might write. The
testsuite scripts could provide code to pass through the translator to
measure things such as the amount of time required to compile and
install a script. Additional performance tests will need to be written;
it would be useful to fold these into the existing testsuites to expand
the set of tests run and to provide stress testing.

3.2 LATENCY MEASUREMENTS

Latency is one of the most visible metrics for the user of SystemTap:
how long does it take for the various phases of SystemTap to complete
their work and get the instrumentation collecting data? Recently the
SystemTap translator was modified to produce timing information with
the "-v" option, as in the example below:

$ stap -v -p4 testsuite/buildok/thirteen.stp
Pass 1: parsed user script and 10 library script(s) in 180usr/10sys/289real ms.
Pass 2: analyzed script: 1 probe(s), 3 function(s), 0 global(s) in 570usr/30sys/631real ms.
Pass 3: translated to C into "/tmp/stapZrONwM/stap_2173.c" in 190usr/90sys/302real ms.
Pass 4: compiled C into "stap_2173.ko" in 5520usr/840sys/6125real ms.

It will be a simple task to parse this output to obtain the cost of the
various translation phases.

3.3 PROCESSOR UTILIZATION

The latency measurements provide a coarse-grained view of how long each
phase takes. Profiling with OProfile will provide some insight into
whether there are any hot spots in SystemTap or the associated code on
the system. On x86 and x86_64, OProfile can sample most code, including
the kernel trap handlers.

3.4 MICROBENCHMARKS

The microbenchmarks are listed in section 2. These small-scale tests
will need to be added to the testsuite to measure the cost of specific
operations; currently, the testsuite does not include tests that
measure these. A sketch of one such test follows.
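Below is a minimal sketch of what one of these microbenchmarks might
look like, here for the associative-array cost listed in section 2. The
iteration counts, the use of gettimeofday_us() for timing, and the
MAXACTION override are assumptions made for the example rather than a
settled design.

    # array_cost.stp: time a fixed number of associative-array writes.
    # The loop exceeds the default per-probe action limit, so it would
    # be run with something like: stap -DMAXACTION=100000 array_cost.stp
    global a
    probe begin {
        t0 = gettimeofday_us()
        for (i = 0; i < 10000; i++)
            a[i % 100] = i              # repeated associative-array writes
        t1 = gettimeofday_us()
        printf("10000 array writes took %d us\n", t1 - t0)
        exit()
    }

Dividing the elapsed time by the iteration count gives a rough
per-operation cost, and the same pattern could be repeated for reads,
deletes, and the other operations named in section 2.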
4. ARCHIVING THE MEASUREMENTS

There are always going to be cases of "your mileage may vary," where
the measurements do not apply exactly to a particular situation.
However, it would be useful to have this information publicly available
for reference, conceivably on a SystemTap webpage on sourceware.org, so
people can get a feel for the cost on various systems. Something like
the search for SPEC CPU2000 Results [SPEC2006] would be nice, but it
probably would be more work than we can justify at this time. Tracking
performance to identify regressions, as is done for GCC performance
[Novillo2006] or code size (CSiBE) [CSiBE2006], might be more
practical.

REFERENCES

[CSiBE2006] GCC Code-Size Benchmark Environment (CSiBE), Feb 2006.
    http://www.inf.u-szeged.hu/csibe

[Novillo2006] Novillo, Diego, Performance Tracking for GCC, Feb 2006.
    http://people.redhat.com/dnovillo/spec2000/

[SPEC2006] SPEC CPU2000 Results, Feb 2006.
    http://www.spec.org/cpu2000/results/